A new study shows that in simulated geopolitical crises, advanced artificial intelligence models are far more willing than humans to use nuclear weapons, and lack the strong reservations that human decision-makers usually display. The research was led by Kenneth Payne, a scholar at King's College London, UK, who pitted three leading large language models (GPT-5.2, Claude Sonnet 4 and Gemini 3 Flash) against each other in a series of war games to examine how they behave in high-stakes contests.

The scenarios covered highly tense international confrontations such as border conflicts, competition for scarce resources, and existential crises threatening the survival of a regime. The researchers designed an "escalation ladder" from which the models chose an action at each turn, with options including diplomatic protest, limited use of force, compromise, and even complete surrender, up to launching a full-scale strategic nuclear war. Across all experiments, the three AIs played a total of 21 games, accumulating 329 decision-making rounds and generating about 780,000 words of text explaining their decisions, which gave the researchers a large body of material for analyzing their reasoning.

The results disturbed the researchers: in 95 percent of the simulations, at least one model used a tactical nuclear weapon. Payne pointed out that, compared with the long-standing real-world "nuclear taboo," these AI models clearly do not show the same psychological and moral restraint. More striking still, no matter how badly the battlefield situation turned against them, the models almost never chose to fully meet the opponent's demands or to surrender; at their most conciliatory, they only reduced the level of violence in stages rather than abandoning the confrontation altogether.

The study also found that AI makes mistakes under simulated conditions such as the "fog of war." In 86 percent of the conflicts, a model's own reasoning called for only a lower level of escalation, but errors of judgment or execution caused the situation to escalate unexpectedly into a more intense confrontation. In other words, even under purely algorithmic control, misjudgments and "accidental escalation" still occur frequently, which could have fatal consequences in the real world.

James Johnson of the University of Aberdeen in the UK called the findings "troubling" from a nuclear risk perspective. He worries that while most human leaders tend to show a degree of restraint and deliberation in real high-risk decisions, AI systems competing against each other may keep amplifying each other's responses, with the "robots" on both sides pushing the situation to the brink of disaster.

The research matters because many countries are already experimenting with artificial intelligence in war games and military planning. Zhao Tong of Princeton University pointed out that today's major powers already use AI in war games, but that it remains unclear how far countries have integrated this kind of AI decision support into their actual military decision-making. He estimates that, at least where nuclear weapons are concerned, countries will remain quite cautious under normal circumstances and are unlikely to let AI directly participate in, let alone dominate, judgments about nuclear use.

Payne shares a similar view. He said that in reality, "no one would actually hand over the key to the launch of nuclear missiles to a machine and then let it decide alone." Zhao Tong cautioned, however, that when decision time is extremely compressed, for example when missile flight times are very short and commanders must make life-and-death decisions within minutes, the military may be more inclined to rely on AI for rapid assessments and options, which creates room for AI to enter the process at critical points.

Zhao Tong also suggested that the models' "belligerence" in simulations may not stem only from their lacking the fear and emotional burden that humans feel when faced with the "red button." He believes the deeper problem is that these models may not truly understand the meaning of "stakes" as humans do, and struggle to translate abstract loss figures into an intuitive sense of real deaths and societal collapse. This structural flaw, the lack of a human understanding of the stakes, may be one of the main reasons the models so frequently choose nuclear escalation.

The findings also prompt a re-examination of "mutually assured destruction" (MAD), the core principle that has kept nuclear deterrence stable for decades. According to this principle, no rational leader will be the first to launch a large-scale nuclear strike, because the opponent will inevitably respond with an equal or even more violent nuclear counterattack, leading to the destruction of both parties and perhaps of human civilization. Johnson said it is unclear whether MAD's logic would still hold if AI were involved in such contests.

The research shows that once a model deploys tactical nuclear weapons in a simulation, the adversary model attempts to de-escalate only about 18 percent of the time. This means that in most cases, the AI does not treat the opponent's nuclear use as a "final warning" that forces it to stop, but prefers to keep escalating or to maintain a high-intensity confrontation. Johnson believes this may "strengthen deterrence" to a certain extent because the AI's threats appear more "credible," but it may also alter the time window in which leaders perceive threats and make decisions, quietly increasing the risk of misjudgment and loss of control. He emphasized that AI itself may not directly "press the button" for nuclear war, but it may profoundly shape the perceptions and time pressure involved, and these factors will ultimately affect whether human leaders believe they have "no choice."

In part, the findings also reflect how limited technical transparency and public communication remain around military applications of AI, especially where nuclear risk is concerned, even as the field moves rapidly toward the center of real policy and security agendas.