Researchers pitted the world's most advanced AI models against each other in military conflict simulations. The results should give everyone pause: tactical nuclear weapons were deployed in 95% of scenarios, with strategic strikes in three cases.
These aren't video games. They're stress tests of AI systems being seriously considered for real defense applications.
Before we hand AI the keys to military decision-making, we should probably ask why our smartest models keep choosing nuclear annihilation.
The Experiment
According to research published this week, a team pitted GPT-5.2, Claude Sonnet 4, and Gemini 3 Flash against each other in conflict scenarios. The models were given military situations and asked to make strategic decisions.
Out of 21 matches, at least one model deployed a tactical nuclear weapon in 20 of them. In three scenarios, the escalation went all the way to strategic nuclear strikes—the kind that end civilizations, not battles.
The technology is working exactly as designed—which is precisely the problem.
Why AI Keeps Pressing the Red Button
Here's the thing about large language models: they're really good at optimization within defined parameters. Give them a conflict scenario where "winning" is the objective, and they'll find the most efficient path to victory.
Tactical nuclear weapons are, from a purely game-theoretic perspective, really efficient. They deliver overwhelming force, eliminate uncertainty, and guarantee decisive outcomes. If your only metrics are "win the engagement" and "minimize your casualties," nukes start looking pretty attractive.
