As I also write in my story, this push raises alarms from some AI safety experts about whether large language models are fit to analyze subtle pieces of intelligence in situations with high geopolitical stakes. It also accelerates the US toward a world where AI is not just analyzing military data but suggesting actions—for example, generating lists of targets. Proponents say this promises greater accuracy and fewer civilian deaths, but many human rights groups argue the opposite.
With that in mind, here are three open questions to keep your eye on as the US military, and others around the world, bring generative AI to more parts of the so-called “kill chain.”
What are the limits of “human in the loop”?
Talk to as many defense-tech companies as I have and you’ll hear one phrase repeated quite often: “human in the loop.” It means that the AI is responsible for particular tasks, and humans are there to check its work. It’s meant to be a safeguard against the most dismal scenarios—AI wrongfully ordering a deadly strike, for example—but also against more trivial mishaps. Implicit in this idea is an admission that AI will make mistakes, and a promise that humans will catch them.
But the complexity of AI systems, which pull from thousands of pieces of data, makes that a herculean task for humans, says Heidy Khlaaf, who is chief AI scientist at the AI Now Institute, a research organization, and previously led safety audits for AI-powered systems.
“‘Human in the loop’ is not always a meaningful mitigation,” she says. When an AI model relies on thousands of data points to draw conclusions, “it wouldn’t really be possible for a human to sift through that amount of information to determine if the AI output was erroneous.” As AI systems rely on more and more data, this problem scales up.
Is AI making it easier or harder to know what should be classified?
In the Cold War era of US military intelligence, information was captured through covert means, written up into reports by experts in Washington, and then stamped “Top Secret,” with access restricted to those with proper clearances. The age of big data, and now the advent of generative AI to analyze that data, is upending the old paradigm in lots of ways.
One specific problem is called classification by compilation. Imagine that hundreds of unclassified documents each contain separate details of a military system. Someone who managed to piece those details together could reveal information that, taken as a whole, would be classified. For years, it was reasonable to assume that no human could connect the dots, but this is exactly the sort of thing that large language models excel at.
With the mountain of data growing each day, and AI constantly creating new analyses, “I don’t think anyone’s come up with great answers for what the appropriate classification of all these products should be,” says Chris Mouton, a senior engineer at RAND, who recently tested how well suited generative AI is for intelligence analysis. Underclassifying is a US security concern, but lawmakers have also criticized the Pentagon for overclassifying information.