DeepMind confirms: Objections make GPT-4o easily give up the correct answer

LLMs are too eager to please! Question an answer at random, and even a model as powerful as GPT-4o may change its mind on the spot. Now a new study by Google DeepMind and the University of London finds that this behavior may not be flattery but a lack of self-confidence.

Beyond that, the team found that large language models such as GPT-4o and Gemma 3 show two conflicting behaviors: they are "stubborn", yet they waver as soon as they are questioned.

Put simply, the research explains why large models are sometimes confident and sometimes self-doubting. It comes down to two points: first, they tend to believe their initial answer is right, and second, they give other parties' objections too much weight.

When a large model stands by its answer, its behavior is consistent with human cognition: people typically defend their own opinions.

However, when the model is overly sensitive to objections, wavers, and switches to another answer, it goes against the human tendency to support one's own opinions.

Let’s take a look at the specific experimental process.

Large models are overly sensitive to adverse opinions

The researchers took advantage of a property of LLMs: a model's confidence can be measured without it retaining any memory of its initial judgment. They selected representative large models such as Gemma 3, GPT-4o, and o1-preview and designed a two-round question-answering experiment.

The first round is the initial answer: the answering LLM is given a binary-choice question, and a fictional advice LLM then provides feedback on its answer.

The second round covers receiving the advice and making the final decision: the advice LLM's feedback is presented to the answering LLM, which then makes its final choice, either sticking to its initial answer or changing it based on the advice.

The researchers set three key attributes for the advice LLM's feedback:

Advice stance: agree, disagree, or neutral. Agreeing or disagreeing advice supports or rejects the answering LLM's answer; neutral advice only provides additional relevant information.
Accuracy label: each piece of advice carries a stated accuracy ranging from 50% (chance level) to 100% (fully reliable), in increments of 10%.
Presentation: the advice is shown to the answering LLM in a clear, standardized format so that the model can read and understand it accurately, avoiding decision bias caused by poorly communicated information.

The key experimental variable is whether the answering LLM can see its own initial answer.

The researchers set up two conditions, initial answer shown and initial answer hidden, and observed the LLM's final decision in each.
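The design described above can be sketched as a grid of conditions plus a second-round prompt. This is only an illustrative sketch: the names `Condition`, `all_conditions`, and `second_round_prompt`, the prompt wording, and the assumption that every stance, accuracy label, and visibility setting is fully crossed are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from itertools import product

STANCES = ("agree", "disagree", "neutral")
ACCURACY_LABELS = tuple(range(50, 101, 10))  # 50%, 60%, ..., 100%


@dataclass(frozen=True)
class Condition:
    stance: str            # stance of the advice toward the initial answer
    accuracy_pct: int      # stated accuracy label attached to the advice
    initial_visible: bool  # whether the answering LLM sees its own first answer


def all_conditions():
    """Enumerate every combination (assumes a full cross, which is a guess)."""
    return [
        Condition(stance, acc, visible)
        for stance, acc, visible in product(STANCES, ACCURACY_LABELS, (True, False))
    ]


def second_round_prompt(question, options, initial_answer, cond):
    """Build a hypothetical second-round prompt for a binary-choice question."""
    other = options[1] if initial_answer == options[0] else options[0]
    lines = [f"Question: {question}", f"Options: {options[0]} or {options[1]}"]
    if cond.initial_visible:
        lines.append(f"Your initial answer: {initial_answer}")
    if cond.stance == "agree":
        advice = f"The answer is {initial_answer}."
    elif cond.stance == "disagree":
        advice = f"The answer is {other}."
    else:
        advice = "(additional relevant information, no stated position)"
    lines.append(
        f"Advice from another model (stated accuracy {cond.accuracy_pct}%): {advice}"
    )
    lines.append("Give your final answer.")
    return "\n".join(lines)
```

For example, `second_round_prompt("Which is larger?", ("A", "B"), "A", Condition("disagree", 70, False))` yields a prompt that omits the initial answer and carries advice pointing to "B".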

The results show that when the LLM can see its initial answer, it tends not to change it.

This is somewhat similar to human decision-making: once a choice is made, people subconsciously defend their point of view and do not change it easily, even when new information arrives.

However, when the initial answer is hidden, the LLM becomes more likely to change its answer.
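The comparison between the two conditions boils down to a change-of-mind rate per condition. A minimal sketch, assuming each trial is logged as a record with the hypothetical fields `initial_visible`, `initial`, and `final` (the article does not describe the study's actual data format):

```python
from collections import defaultdict


def change_rates(trials):
    """Return, per visibility condition, the fraction of trials in which
    the final answer differs from the initial answer."""
    changed = defaultdict(int)
    total = defaultdict(int)
    for trial in trials:
        cond = "visible" if trial["initial_visible"] else "hidden"
        total[cond] += 1
        if trial["final"] != trial["initial"]:
            changed[cond] += 1
    return {cond: changed[cond] / total[cond] for cond in total}


# Placeholder records for illustration only; these are NOT results from the study.
example_trials = [
    {"initial_visible": True, "initial": "A", "final": "A"},
    {"initial_visible": True, "initial": "A", "final": "B"},
    {"initial_visible": False, "initial": "A", "final": "B"},
    {"initial_visible": False, "initial": "B", "final": "A"},
]
print(change_rates(example_trials))
```

A higher rate under "hidden" than under "visible" in real logs would correspond to the pattern the article describes.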

The models overweight opposing advice, and their sensitivity goes well beyond what is reasonable. Even when the objection is incorrect, they "doubt themselves" and end up readily abandoning an initial answer that was correct.
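One way to make "reasonable" concrete is an idealized Bayesian update for a binary choice. The sketch below is an illustrative baseline, not necessarily the analysis used in the study. It assumes the advice's accuracy label can be taken at face value and that the advice is independent of the model's own answer; the function name `bayesian_posterior` is hypothetical.

```python
def bayesian_posterior(prior_correct, advice_accuracy, advice_agrees):
    """Posterior probability that the initial answer is correct after advice.

    prior_correct:   confidence in the initial answer before advice (0..1)
    advice_accuracy: stated accuracy of the advice (0..1), taken at face value
    advice_agrees:   True if the advice supports the initial answer
    """
    if not (0.0 <= prior_correct <= 1.0 and 0.0 <= advice_accuracy <= 1.0):
        raise ValueError("probabilities must lie in [0, 1]")
    # In a binary choice, accurate advice agrees with a correct answer
    # and disagrees with a wrong one.
    like_if_correct = advice_accuracy if advice_agrees else 1.0 - advice_accuracy
    like_if_wrong = 1.0 - advice_accuracy if advice_agrees else advice_accuracy
    numerator = prior_correct * like_if_correct
    denominator = numerator + (1.0 - prior_correct) * like_if_wrong
    if denominator == 0.0:
        raise ValueError("advice is impossible under the given prior")
    return numerator / denominator
```

Under these assumptions, disagreeing advice labeled 50% accurate carries no information and leaves the confidence unchanged, and a model that is 80% confident should still favor its initial answer after disagreeing advice labeled 70% accurate (the posterior stays above 0.5). Switching in such cases is the kind of oversensitivity the article describes.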

This departs from human cognition: people are usually not swayed by information that looks false at first glance.

In other words, when a large model has a memory of its own answer, it is usually very confident in itself.

But without that memory, a model may lack confidence and fail to stand by its own opinion the way humans do.

Why do large models have "soft ears"?

In response to these results, the researchers suggest several possible reasons for the large models' wavering.

At the training level, for example, reinforcement learning from human feedback (RLHF) makes the model cater excessively to external input. It becomes overly sensitive to opposing information while lacking independent judgment about how reliable that information is.

In terms of decision-making logic, the model's answers rely not on logical reasoning but on statistical pattern matching over massive amounts of text. The high-frequency correlation between objection signals and corrected answers leaves models vulnerable to superficial objections and unable to verify for themselves that the initial answer was correct.

In terms of the memory mechanism, path dependence strengthens "stubbornness" when the initial answer is visible. When the initial answer is hidden, the model loses its anchor, the opposing advice becomes the dominant signal, and the model is easily shaken.

To sum up, the "soft ears" of large language models, meaning how easily they are talked out of an answer, result from excessive catering to external feedback during training, reliance on pattern matching instead of logical reasoning when making decisions, and a memory mechanism that lacks the support of deep reasoning.

This trait may make a model easy to derail in multi-turn dialogue: opposing information that appears later, even if it is wrong, can lead the model away from the correct conclusion.

It seems we need to be strategic when using LLMs.

Paper address: https://www.arxiv.org/abs/2507.03120

https://venturebeat.com/ai/google-study-shows-llms-abandon-correct-answers-under-pressure-threatening-multi-turn-ai-systems/