Industry analysis firm SemiAnalysis recently stress-tested the tiered subscription plans of OpenAI and Anthropic and found a large compute subsidy hidden behind the affordable monthly fees. The firm bought subscriptions at several tiers from both companies, ran heavy workloads such as long-running coding sessions and AI agents until it hit the weekly usage limits, and then priced that usage at public API rates to estimate its theoretical cost.


The calculations show that a fully used US$200 "ChatGPT Pro 20x" subscription from OpenAI corresponds to as much as roughly US$14,000 in API charges. Anthropic's "Claude Max 20x" plan, priced the same, approaches a theoretical token cost of about US$8,000 under extreme usage. A small number of heavy users is therefore enough to wipe out the already thin profit margins of the subscription model.
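To put the two figures side by side, the short Python sketch below divides the API-priced usage by the monthly fee. It uses only the numbers quoted above; the variable and function names are illustrative, and the result measures usage at public API list prices, not what the usage actually costs the provider to serve.

```python
# Figures quoted in the article; the API-equivalent values are approximate upper bounds.
PLANS = {
    "ChatGPT Pro 20x": {"monthly_fee": 200, "api_equivalent": 14_000},
    "Claude Max 20x": {"monthly_fee": 200, "api_equivalent": 8_000},
}


def subsidy_multiple(monthly_fee, api_equivalent):
    """Return how many times the monthly fee the API-priced usage is worth."""
    return api_equivalent / monthly_fee


# Map each plan name to its multiple (illustrative helper, not from the report).
multiples = {name: subsidy_multiple(**plan) for name, plan in PLANS.items()}

for name, multiple in multiples.items():
    print(f"{name}: about {multiple:.0f}x the subscription price")
```

On these numbers, a maxed-out subscriber consumes usage worth about 70 times the fee on OpenAI's plan and about 40 times on Anthropic's.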

According to SemiAnalysis, this is one reason large-model companies pay such close attention to "utilization". For Anthropic, on tiers such as Claude Pro and Claude Max 5x, the company roughly breaks even when actual usage reaches about 20%. OpenAI's margins are thinner still: on ChatGPT Plus and ChatGPT Pro 5x subscriptions, once utilization exceeds about 11.4%, the company starts losing money on that user.

On the higher-priced top-tier plans, the economics tighten further. The report says Anthropic's gross margin on high-end subscriptions is close to zero at a utilization rate of about 10%, while OpenAI's turns negative at about 5.7%. In other words, users do not need to be extremely heavy users for these subscriptions to flip from "profit-making products" to "loss-making products".
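The break-even percentages can be turned into rough dollar figures under a simple assumption that is not in the report: serving cost grows linearly with utilization. The sketch below also assumes the top-tier thresholds apply to the US$200 plans. Function names are hypothetical, and the output is a back-of-the-envelope estimate rather than a SemiAnalysis result.

```python
def implied_full_cost(monthly_fee, break_even_utilization):
    """Serving cost at 100% utilization implied by a break-even point.

    Assumes cost = utilization * full_cost, so break-even is fee / full_cost.
    """
    return monthly_fee / break_even_utilization


def gross_margin(utilization, break_even_utilization):
    """Gross margin as a fraction of the fee under the same linear assumption."""
    return 1.0 - utilization / break_even_utilization


# Break-even utilization for the top tiers, as quoted in the article.
TOP_TIER_BREAK_EVEN = {"Anthropic": 0.10, "OpenAI": 0.057}
MONTHLY_FEE = 200  # US$, assumed to be the tier these thresholds refer to

for vendor, break_even in TOP_TIER_BREAK_EVEN.items():
    full_cost = implied_full_cost(MONTHLY_FEE, break_even)
    margin_at_20 = gross_margin(0.20, break_even)
    print(f"{vendor}: implied full-usage cost ${full_cost:,.0f}, "
          f"margin at 20% utilization {margin_at_20:.0%}")
```

Under this model the implied serving cost of a maxed-out plan is far below the API-priced figure, which is consistent with API list prices including a markup over the provider's own cost.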

Against this backdrop, deciding how to raise prices or restrict access has become a difficult problem for vendors. The fixed monthly subscription is a key reason products such as ChatGPT and Claude spread so quickly, and tightening quotas or raising thresholds could weaken user growth. In the current large-model "arms race", model capability and availability remain among the most important competitive assets, which makes it harder for companies to change course.

At the same time, changes in how AI is actually used are driving up cost pressure. The report notes that new workflows built on multi-step "agent" systems that call tools autonomously can consume up to a thousand times as many tokens as a traditional single-turn conversation. This high-intensity usage pattern has forced some large enterprises to re-examine how widely they open up AI tools internally and how they control the cost.

According to reports, companies such as Microsoft, Meta, and Amazon have scaled back earlier efforts to encourage large-scale employee trials and internal promotion of AI tools because their internal bills were growing so quickly. In one widely publicized case, a company that had set no restrictions on employees' use of Claude burned through $500 million on Anthropic's services in a single month, prompting emergency intervention by management.


Under pressure from both costs and real demand, more and more enterprises are adopting finer-grained model routing strategies. One approach is to send complex, high-value problems to expensive "frontier models" while delegating routine office work and basic question answering to cheaper models. By splitting tasks this way, some companies can cut overall AI costs by as much as 95%, according to research cited by The Wall Street Journal. Vishal Misra, an associate dean at Columbia University, pointed out that companies do not always need top-tier large models that "understand quantum gravity". Many open source models are sufficient for everyday needs, which will also squeeze the premium that high-priced closed models can command.
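The arithmetic behind such savings can be sketched as a blended price. The example below is only an illustration: the per-million-token prices are hypothetical placeholders, not any vendor's actual rates, and it assumes tokens are interchangeable between the two models.

```python
def blended_cost_per_million(frontier_price, cheap_price, cheap_share):
    """Average price per million tokens when `cheap_share` of tokens go to the cheap model."""
    if not 0.0 <= cheap_share <= 1.0:
        raise ValueError("cheap_share must be between 0 and 1")
    return (1.0 - cheap_share) * frontier_price + cheap_share * cheap_price


def routing_savings(frontier_price, cheap_price, cheap_share):
    """Fractional saving versus sending every token to the frontier model."""
    blended = blended_cost_per_million(frontier_price, cheap_price, cheap_share)
    return 1.0 - blended / frontier_price


# Hypothetical prices in US$ per million tokens, chosen only for illustration.
FRONTIER_PRICE = 10.00
CHEAP_PRICE = 0.25

for share in (0.50, 0.90, 0.98):
    saving = routing_savings(FRONTIER_PRICE, CHEAP_PRICE, share)
    print(f"{share:.0%} routed to the cheap model -> {saving:.1%} saved")
```

With these placeholder prices, a saving near 95% requires routing roughly 97% or more of tokens to the cheaper model, which suggests that figures of that size depend on most of a company's workload being routine.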

Some AI startups have made more radical moves. Flo Crivello, founder and CEO of AI assistant startup Lindy, said the company has switched 100% of its traffic to DeepSeek V4, migrating away from Anthropic's models entirely. In the company's evaluation, DeepSeek V4 was comparable in capability to Claude Sonnet at a fraction of the cost, and the migration has reportedly saved the company millions of dollars.

Others choose to build their own systems based on open source models, combining internal data with their own infrastructure in exchange for a more controllable long-term cost structure. Although this path requires a higher initial investment, it helps reduce dependence on third-party cloud AI vendors and allows enterprises to have more granular control over inference costs, data security, and performance optimization. In specific vertical scenarios, internal models that have been fine-tuned may even outperform general-purpose cutting-edge models.

In the medium to long term, the industry generally expects some costs to fall gradually as infrastructure expands, hardware evolves, and models iterate. SemiAnalysis predicts that the mid-to-high-end capability level represented by the current Opus 4.8 could eventually be offered profitably at about US$20 per month, thanks to more mature technology and more efficient compute. That judgment does not extend to the most cutting-edge top-tier models, which will remain expensive to run for the foreseeable future and are more likely to be sold through API billing, tiered feature unlocking, and similar mechanisms than bundled into a single mass-market subscription.

Until then, AI service providers still have to strike a difficult balance. On the one hand, users want the most powerful AI capabilities possible at a low and predictable monthly fee; on the other hand, the compute and infrastructure behind those capabilities remain expensive and highly sensitive to usage intensity. OpenAI CEO Sam Altman has also publicly acknowledged that token costs are becoming an increasingly serious problem, and said the company is working to optimize its products and architecture so that users get "more value with less expenditure" when using ChatGPT.