The Pentagon is testing artificial intelligence models to see which are most favoured by 25 of the department’s “power users,” as the US military races to find alternatives to Anthropic PBC’s Claude, according to a senior defence official.
Tests began at the start of March, according to the senior official, three days after Defense Secretary Pete Hegseth declared Anthropic a supply-chain risk over the company’s insistence on guardrails for its technology and moved to drop it as a provider of AI tools. The company is now battling the designation in court, saying it could cost it billions of dollars in revenue.
The US military relies heavily on Anthropic’s Claude for a digital mission control platform known as Maven Smart System for its classified operations against Iran. Claude has gained fans within the department for its ease of use and performance, although the full extent of its deployment in AI targeting isn’t public.
Following the split with Anthropic, the Pentagon gave itself six months to wind down its use of the company’s products. Earlier this month, the Defense Department announced new agreements with several other companies to use their AI tools on classified networks as it seeks to develop multiple new model suppliers.
There have been signs of de-escalation between the White House and Anthropic, particularly in the wake of the release of Mythos, a new powerful model from Anthropic that has threatened to upend existing cybersecurity defences. But Pentagon officials show no signs of wanting to mend fences with the company.
Emil Michael, the US undersecretary of defense for research and engineering, said Thursday in an interview with Bloomberg Television that negotiations with Anthropic remain on ice because of the company’s legal challenge to the supply-chain risk designation and that the Pentagon is prepared to move on to other vendors.
Although experts have argued that Mythos far outstrips other companies’ capabilities, Michael said that he expects new model releases from rivals to deliver “similar capabilities” to Anthropic’s cutting-edge models every month or two. Despite other parts of the US government now using Anthropic models, Michael argued that the company’s ideological stance risks not being compatible with the Defense Department’s mandate.
As part of the new testing effort, models from OpenAI and Alphabet Inc’s Google are among those being examined in a digital platform separate from the Maven Smart System, according to the senior Pentagon official, who requested anonymity to detail new department initiatives that aren’t yet public.
The effort relies on so-called power users of artificial intelligence in each of five military theatre commands around the world. It seeks to determine which models are most effective at supporting military operations, the official added, describing power users as operators who are already heavy consumers of the Defense Department’s existing AI tools.
Initial tests show that the models respond differently to the same queries or so-called prompts, and that varying the prompts across models has helped get the best out of them, said the official, who declined to share metrics or interim rankings. The Pentagon may consider making the final evaluation public, the official said.
While some users initially expected to prefer Anthropic, the Pentagon hasn’t received the kind of pushback it had expected from power users over trying other models, the official said, without detailing the tests or specific feedback about any of the models. The Pentagon wants to provide military operators access to multiple models, so no single company is in a favoured position. In addition, the Pentagon may determine that specific models work more effectively in different scenarios, the official added.
Earlier this year, the Pentagon released a memo pushing to speed up adoption of AI across the US military. Every conflict from now on will become more AI-enabled, the official said, adding that the department doesn’t want to make wrong decisions in targeting. The technology is particularly effective for repetitive tasks and introducing efficiency, the person said.
The official declined to address whether AI played any role in the strike against a primary school in Iran, citing an ongoing investigation.
Multiple human rights groups have warned of risks associated with bringing AI to warfare given its unpredictability in combination with its capacity for speed and scale. They have questioned whether it’s possible to use AI in a way that seeks to limit the harm of armed conflict, including for civilians, the wounded and those who surrender. – Bloomberg
