On the afternoon of May 29, many users discovered that DeepSeek now caps how many times a response can be regenerated or a message edited. After several consecutive edits or regenerations, the page shows a prompt saying the limit has been reached. Some users reported that in normal conversations the limit is hit after 3 to 6 regenerations, while in expert mode there may be only 3 attempts. The cap on message edits is generally 6.

So far DeepSeek has neither made an official announcement nor published a fixed quota table, but the change has triggered heated discussion in the community. After all, DeepSeek has a large number of loyal users, myself included. We occasionally run into busy servers and page crashes, which everyone can understand, but quietly adding restrictions without notice makes people anxious.
The API is not affected at all, so this is most likely an infrastructure problem, a familiar story.

01
A temporary rate limit caused by a compute shortage?
Regarding this restriction, the "semi-official account" Baiqiang on Xiaohongshu said: Don't panic, this is temporary.

According to the account, the sudden limits on "modify message" and "regenerate" are not DeepSeek doing so-called "negative optimization". They are a temporary measure taken because the pressure on computing power became too great.
The number of DeepSeek users has grown rapidly during this period, and request pressure on the app became especially noticeable from the afternoon of May 29. To make sure the most basic text conversations keep working, the team could only restrict high-frequency operations such as "modify message" and "regenerate" first.
In the user interface, "regenerate" is just a button click and "modify message" is just a change to the original question. For the server, however, neither is a simple refresh; each one is a new inference request. Every time the user clicks regenerate, the model must reprocess the context and generate the answer again. The same is true for modifying a message: as soon as the original question changes, the model has to answer again based on the new input.
Therefore, when a large number of users click repeatedly at the same time and treat "regenerate" as an unlimited gacha pull, these requests put heavy pressure on the servers.
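To make that cost concrete, here is a rough back-of-the-envelope sketch in Python. The function name and the token counts are illustrative assumptions, not DeepSeek figures, and the sketch ignores optimizations such as prompt caching that a real serving stack may use, so it is closer to an upper bound than a measurement.

```python
def inference_tokens(context_tokens: int, answer_tokens: int, regenerations: int) -> int:
    """Total tokens handled for one question plus `regenerations` retries.

    Assumes every attempt reprocesses the full context and generates a
    fresh answer of the same length (a simplification).
    """
    attempts = 1 + regenerations  # the original answer plus each retry
    return attempts * (context_tokens + answer_tokens)


# Hypothetical conversation: 2,000 tokens of context, 500-token answers.
single = inference_tokens(2000, 500, regenerations=0)
with_retries = inference_tokens(2000, 500, regenerations=5)
print(single, with_retries)
```

Under these assumptions, five regenerations turn one request's worth of work into six, which is why a button that looks free to the user is not free to the server.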
This can be viewed together with some other recent changes at DeepSeek. The removal of file upload from expert mode, the shutdown of the smart search function, and the occasional "server busy" message all come down to the same cause: a shortage of computing resources. Overall service pressure has become high enough that trade-offs have to be made.

DeepSeek is this easy to use, so the underlying infrastructure needs to keep up.
For cases where "modify message" or "regenerate" hits the limit, Baiqiang's advice is not to keep clicking rapidly. Stop first and wait 15 to 30 minutes before trying again. According to the account, in most cases the restriction lifts automatically after waiting; if you click repeatedly and quickly, the system may flag the traffic as abnormally high-frequency requests, which makes the restriction last longer.
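The behavior the account describes resembles a quota with a cooldown that grows when you retry too early. The sketch below is purely hypothetical: the class, the cap of 6, the 15-minute base wait, and the 5-minute penalty are my own assumptions for illustration, and nothing here reflects DeepSeek's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class CooldownLimiter:
    """Hypothetical per-user limiter; not DeepSeek's real mechanism."""
    max_actions: int = 6           # assumed cap, borrowed from user reports
    cooldown_s: float = 15 * 60    # assumed base wait (15 minutes)
    penalty_s: float = 5 * 60      # assumed extra wait per retry while blocked
    used: int = 0
    blocked_until: float = 0.0

    def try_action(self, now: float) -> bool:
        """Return True if an edit/regenerate is allowed at time `now` (seconds)."""
        if now < self.blocked_until:
            # Retrying while blocked pushes the unblock time further out.
            self.blocked_until += self.penalty_s
            return False
        if self.blocked_until:
            # The cooldown has elapsed: restore the full quota.
            self.used = 0
            self.blocked_until = 0.0
        if self.used >= self.max_actions:
            # Quota exhausted: start the cooldown and reject this attempt.
            self.blocked_until = now + self.cooldown_s
            return False
        self.used += 1
        return True
```

In a model like this, the advice to stop and wait follows directly: every click during the cooldown only moves the unblock time later.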
Baiqiang also mentioned that Huawei's new Ascend supernode cards are being deployed and are expected to come online in the second half of the year. Computing power should then expand significantly, and these temporary restrictions will most likely be lifted.
However, DeepSeek has not yet issued an official announcement on this matter. The exact limits, the recovery time, and any specific changes after the compute expansion in the second half of the year all still await official confirmation.
02
DeepSeek is not an isolated case
In fact, DeepSeek is not the first AI company to do this.
When compute for large-model products is tight, users surge, or load peaks, the common responses are rate limiting, service downgrades, queuing, or separate caps on certain high-consumption features.
ChatGPT has always had message limits, and even paid users may hit usage caps during periods of high demand. After free users exhaust their quota for the advanced model, they are switched to a lighter model so they can keep using the service.
This can be understood as a kind of "service downgrade": use is not forbidden, but not everyone can have unlimited access to the most expensive and resource-intensive capabilities.
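The fallback described above can be sketched in a few lines of Python. The tier names and the quota value are placeholders I made up for illustration; they are not any vendor's real model names or limits.

```python
def pick_model(used_today: int, advanced_quota: int) -> str:
    """Choose a (hypothetical) model tier for the user's next message.

    While the user is under quota they get the advanced tier; after that
    they are downgraded to the light tier instead of being blocked.
    """
    return "advanced" if used_today < advanced_quota else "light"


# Assumed quota of 10 advanced messages per day.
print(pick_model(used_today=3, advanced_quota=10))
print(pick_model(used_today=10, advanced_quota=10))
```

The point of the design is that the second call still returns a usable model: the service degrades rather than refuses.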

Claude is similar. Anthropic sets usage budgets for different users, and the limits for high-frequency scenarios such as Claude Code and the API are also adjusted as capacity changes. When compute is abundant, quotas can be raised; when demand pressure grows, the restrictions become more noticeable.
On May 6, Anthropic also published a post saying that, with a new compute partnership and increased capacity, it had raised the usage limits for Claude Code and the Claude API. Conversely, this shows that usage limits are tied directly to compute capacity: they tighten when compute is scarce and relax once it expands.

However, Anthropic has now set more detailed usage limits for different subscription tiers, and high-consumption scenarios such as Claude Code are moving closer to token-metered billing.
Image and video generation products are an even more typical case. The image generation features of Sora and Gemini, as well as other AI video tools, have all seen generation counts tightened, queue times lengthened, and free quotas reduced when demand surged.
It can be said that "every inference has a cost" has become an unavoidable reality for AI products.
Some time ago, Doubao's move to start charging triggered a round of discussion, and "Doubao, expensive and hard to use" briefly became a trending topic on Weibo. That case differs from DeepSeek's feature restrictions, but the logic behind the user reaction is the same: everyone is used to AI products being cheap and easy to use. Once a platform starts charging or restricting certain features, a backlash comes easily.
It is actually very common for AI companies to make basic capabilities free, charge for complex capabilities, set quotas for high-cost features, and rate-limit temporarily during peak periods.
DeepSeek's limits on "regenerate" and "modify message" are therefore not unusual in the AI industry. What it restricts is not the chat entry point or the model itself, only operations that users tend to click frequently and that consume inference resources on every click.
Basic conversation must be preserved as far as possible, because it is the bottom line for users being able to use the product at all. Features such as regeneration, repeated edits, file uploads, web search, long context, and multimodal generation are all more likely to be limited or downgraded when pressure is high.
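One way to picture this trade-off is priority-based load shedding, sketched below in Python. The feature names, the thresholds, and above all the order in which features are switched off are my own assumptions for illustration; no vendor has published such a table.

```python
# Hypothetical thresholds: a feature is switched off once system load
# (0.0 to 1.0) reaches its value. Basic chat is never shed.
SHED_AT = {
    "multimodal_generation": 0.70,
    "long_context": 0.75,
    "web_search": 0.80,
    "file_upload": 0.85,
    "regenerate": 0.90,
    "modify_message": 0.90,
}


def enabled_features(load: float) -> list[str]:
    """Return the features still available at the given load level."""
    features = ["basic_chat"]  # always preserved
    features += [name for name, limit in SHED_AT.items() if load < limit]
    return features


print(enabled_features(0.50))  # everything is on
print(enabled_features(0.95))  # only basic chat survives
```

Whatever the real ordering is, the structure is the same: the optional, expensive features go first so that the core conversation stays up.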
I feel that the focus of this controversy is not "how much to limit" but "how to limit".
If DeepSeek had explained earlier that this is a temporary rate limit, how long recovery will take, and which operations are affected, users would most likely have been more accepting. But when features suddenly become unavailable without an announcement, of course everyone immediately wonders whether functionality is being cut back and whether charges are coming.
As AI products shift from early-adopter tools to everyday tools, users care more and more about stability and transparency. The limits themselves are understandable, but it is best that users do not first learn about them from a pop-up window.
DeepSeek should know that when usage grows so much that it has to be limited, it also means many users are waiting to hear from it.
Even a brief note of explanation would do.