OpenClaw decided to implement the nuclear bomb option and emptied important user mailboxes

In the field of artificial intelligence, alignment refers to making the goals, behaviors, and decisions of artificial intelligence systems truly consistent with the true intentions, values, and long-term interests of humans or designers/users, rather than just superficially obedient or completing literal tasks.

The protagonist of this article is Summer Yue, head of alignment at Meta Super Artificial Intelligence Laboratory. She made a mistake while using the OpenClaw AI robot, which caused her entire Gmail mailbox history to be cleared.

Summer Yue also admitted that artificial intelligence alignment researchers are not immune to misalignment problems, so when you use various AIs, especially OpenClaw, you must confirm various instructions to prevent overturning.

Event background:

Summer Yue used the OpenClaw AI robot to build a workflow. This workflow has been running smoothly in a test environment for several weeks without any failures. The instructions of this workflow are to check the mailbox and recommend which emails can be archived or deleted, but do not perform any actions before human confirmation.

After weeks of running smoothly without glitches, Summer Yue felt confident that the workflow was fully operational, so she deployed the workflow to her primary Gmail mailbox to execute the action.

Memory loss causes command errors:

There are relatively few emails in the test mailbox, but there are a large number of emails in the main mailbox. When processing emails, the OpenClaw AI robot triggers the context compression mechanism built into the framework. In order to prevent overly long conversations from overwhelming the context window of the model, this mechanism will automatically summarize and discard early messages.

When the robot was processing Summer Yue's main mailbox, the massive amount of emails overwhelmed the context, causing it to automatically compress and lose part of its memory. During this process, the robot continued to identify and process emails before February 15, 2026 according to the previous process.

However, the robot (should I say the model) believes that the most efficient cleaning solution is the nuclear bomb option - directly clearing all emails, and even plans to continue cleaning cycles until all emails are cleared. Due to the loss of human confirmed instructions, the robot completes the email clearing task completely autonomously.

How to interrupt the instruction if an error is found?

The biggest problem with new products or products that everyone is not familiar with is that various misoperations may occur. For example, Summer Yue found that the robot had sent a large number of instructions asking the robot to stop operating when clearing emails.

But the problem is that the OpenClaw AI robot's run will not stop by default, and messages sent by users also need to be queued for processing. That is, new messages sent by users can only be processed after the previous task is completed.

During the process of clearing emails, Summer Yue sent multiple commands in the hope that the robot would stop operating, but to no avail. In the end, she could only run to the Mac Mini and manually kill all processes to stop the robot.

In this case, the command that the user needs to execute is actually /stop. This command can forcefully interrupt the command being executed by the robot. Simply sending text messages is useless, which is the message queuing problem mentioned earlier.

Summary afterwards:

Summer Yue later posted a post mocking herself:: To be honest, this is a rookie mistake. Alignment researchers are not immune to misalignment problems because they are overconfident after running tests for several weeks without incident.

Other netizens saw this and laughed at themselves. Even professional alignment researchers would be upset. If ordinary users hand over their real wallets, mailboxes, calendars and other highly private content to AI, how big of a risk will it be?