I don’t like Lavender. I think humans should always be in the loop, and I’d like to see analysts take far more care with kill orders.
That said, any organization might deploy something like this if it’s 90% accurate. Assuming it even is (I doubt it), I think any fair evaluation of such a technology must ask:
What is the accuracy of inexperienced humans in the same position, rushing through reviews during a blitz invasion? And what about reviewers who do have battle experience? (I’m assuming most won’t.)
Is the system better or worse than those humans, and how often?
Do the system’s strengths and weaknesses let us attach confidence scores to its predictions, so we know which ones need extra review? Can we also escalate review when the expected number of deaths is high? (A rough sketch of what I mean follows below.)
That’s how I’d start a review of this tech. If anyone is building military AI, I also ask that you please include methods to highlight likely corner cases and high-stakes situations. Then someone’s human instincts might kick in, and they might spot and resolve a problem even in the heat of war.
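To make the confidence-score idea concrete, here’s a minimal sketch in Python of what confidence-gated review triage could look like. The names, thresholds, and fields are all made up for illustration; nothing here reflects how Lavender or any real targeting system actually works.

```python
from dataclasses import dataclass

@dataclass
class TargetPrediction:
    target_id: str
    confidence: float          # model's confidence the target is valid, 0.0-1.0 (hypothetical field)
    expected_casualties: int   # estimated deaths if the strike proceeds (hypothetical field)

# Illustrative thresholds only -- not values from any real system.
CONFIDENCE_FLOOR = 0.95        # below this, a human analyst must review
CASUALTY_ESCALATION = 5        # at or above this, escalate regardless of confidence

def triage(pred: TargetPrediction) -> str:
    """Return the level of human review a prediction should receive."""
    if pred.expected_casualties >= CASUALTY_ESCALATION:
        return "senior_review"      # high stakes: always escalate, whatever the confidence
    if pred.confidence < CONFIDENCE_FLOOR:
        return "standard_review"    # low confidence: a human checks it before anything happens
    return "spot_check"             # high confidence, low stakes: sampled audits only

# Example usage
predictions = [
    TargetPrediction("t1", confidence=0.99, expected_casualties=1),
    TargetPrediction("t2", confidence=0.80, expected_casualties=2),
    TargetPrediction("t3", confidence=0.99, expected_casualties=12),
]
for p in predictions:
    print(p.target_id, triage(p))
```

The point of the sketch is just that high-stakes cases get escalated no matter how confident the model is, and low-confidence cases always get a human look.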