OpenAI has grown increasingly optimistic about revenue from consumers and enterprises and has significantly raised its revenue forecast for the next five years. The shadow hanging over these forecasts, however, is that cloud server costs keep rising, and they are growing faster than revenue.

This pressure was particularly evident last year: OpenAI's gross margin fell to 33% from 40% in 2024, well short of its own target of 46%. Chief rival Anthropic also recently raised its revenue forecast, but said in December that it expected a gross margin of 40% for 2025. While that is a huge improvement on its gross margin of -94% in 2024, it is still 10 percentage points below Anthropic's previous target.

For both companies, one of the main reasons for the margin squeeze is soaring inference costs, meaning the amount they pay cloud providers for computing power when users interact with chatbots or call models through APIs.

OpenAI's inference costs quadrupled last year to $8.4 billion, above the $6.6 billion it had forecast last summer. The company told potential investors that higher-than-expected demand for its services forced it to acquire server capacity at a higher cost.

Cloud service providers typically charge much more for servers rented on demand at short notice than for capacity reserved in advance.

Meanwhile, Anthropic's inference costs are expected to more than triple to $2.7 billion in 2025, also higher than the company had previously expected, for reasons that are unclear.

Notably, both companies' gross margins worsened even as more people used their services. The average price of renting computing power actually declined over the year, and both companies have said they have found ways to run large models more efficiently.

One reason for OpenAI's declining gross margins may be the cost of running ChatGPT for a large number of free users. As I reported, only about 5% of OpenAI’s 910 million weekly users are paying customers. The company's financial projections show that nearly half of its total inference costs last year, $3.9 billion, was spent supporting free users, compared with $4.5 billion for paid users.
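The split quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the figures in this article; the function name `cost_share` and the constant names are illustrative, and the results are a rough consistency check rather than numbers from OpenAI's projections.

```python
# Figures quoted in the article.
FREE_USER_COST_BN = 3.9     # inference cost for free users, $ billions
PAID_USER_COST_BN = 4.5     # inference cost for paying users, $ billions
WEEKLY_USERS_MN = 910       # weekly users, millions
PAYING_FRACTION = 0.05      # "about 5%" are paying customers


def cost_share(part: float, *others: float) -> float:
    """Return `part` as a fraction of the total of all arguments."""
    total = part + sum(others)
    return part / total


total_bn = FREE_USER_COST_BN + PAID_USER_COST_BN
free_share = cost_share(FREE_USER_COST_BN, PAID_USER_COST_BN)
paying_users_mn = WEEKLY_USERS_MN * PAYING_FRACTION

print(f"Total inference cost: ${total_bn:.1f} billion")
print(f"Share spent on free users: {free_share:.1%}")
print(f"Implied paying users: about {paying_users_mn:.1f} million")
```

The two line items sum to the $8.4 billion total reported earlier, and free users account for roughly 46% of it, which is consistent with "nearly half."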

Another factor is the type of AI these companies are running. Last year, OpenAI launched the video generation model Sora, which requires far more computing power than plain text queries. The reasoning model launched at the same time also requires more computing power to generate answers than traditional large language models.

According to people familiar with the matter, the company is also letting users try a number of new, compute-intensive features, including the GPT-4o model that is popular for generating Ghibli-style images, before introducing usage limits.

However, there are bright spots.

OpenAI has become significantly more efficient at serving paying users: its compute margin (the share of revenue remaining after the cost of running models for those users) reached about 70% in October, up from about 52% at the end of 2024 and about 35% in January 2024.
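To make the definition concrete, the sketch below treats compute margin as revenue minus inference cost, divided by revenue. That formula is an assumption based on the parenthetical definition above, and the function names are illustrative. The article reports only the margins, not the underlying revenue or cost figures, so the example works from the margins alone.

```python
def compute_margin(revenue: float, inference_cost: float) -> float:
    """Fraction of revenue left after paying for inference (assumed definition)."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return (revenue - inference_cost) / revenue


def cost_per_revenue_dollar(margin: float) -> float:
    """Inference cost implied for each dollar of revenue at a given margin."""
    return 1.0 - margin


# Compute margins for paying users, as quoted in the article.
quoted_margins = [
    ("January 2024", 0.35),
    ("end of 2024", 0.52),
    ("October", 0.70),
]

for label, margin in quoted_margins:
    cost = cost_per_revenue_dollar(margin)
    print(f"{label}: about ${cost:.2f} of compute per $1 of revenue")
```

Read this way, the compute needed to earn a dollar from paying users fell from about 65 cents in January 2024 to about 30 cents in October.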

OpenAI plans to generate more revenue from free users, mainly through advertising and e-commerce, and to convert more of them into subscribers. For example, in January the company launched an ad-supported ChatGPT subscription tier, priced globally at approximately US$5 to US$8 per month.

This year, OpenAI expects about 66% of its $14.1 billion in total inference costs to go toward serving paying customers. By the end of 2030, it expects 94% of approximately $85 billion in inference costs to support paying users. That same year, its gross margin is expected to rise to 67%.
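The projected shares translate into dollar amounts as follows. This is a rough sketch using only the totals and percentages quoted above; the helper `split_cost` is illustrative, and because the 2030 total is itself approximate, the outputs should be read as ballpark figures.

```python
def split_cost(total_bn: float, paid_share: float) -> tuple[float, float]:
    """Split a total cost (in $ billions) into (paid, free) portions."""
    if not 0.0 <= paid_share <= 1.0:
        raise ValueError("paid_share must be between 0 and 1")
    paid = total_bn * paid_share
    return paid, total_bn - paid


# (total inference cost in $ billions, share serving paying users), as quoted.
projections = {
    "this year": (14.1, 0.66),
    "2030": (85.0, 0.94),
}

for period, (total_bn, paid_share) in projections.items():
    paid, free = split_cost(total_bn, paid_share)
    print(f"{period}: paying users ~${paid:.1f}B, free users ~${free:.1f}B")
```

On these figures, the cost of serving free users stays roughly flat (about $4.8 billion this year versus about $5.1 billion in 2030), while the portion spent on paying users grows from about $9.3 billion to about $79.9 billion.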

Still, the recent financial performance of OpenAI and Anthropic raises questions about whether they can achieve their goal of gross margins above 60% by the end of 2030 — a level that would be on par with some of the top publicly traded software companies today.

For now, OpenAI and Anthropic appear to have little trouble getting investors to shoulder these costs. Ultimately, though, they must prove that the revenue their users generate is enough to cover the cost of serving them.