According to the Financial Times, citing people familiar with the matter, Google warned Meta around March this year that it could no longer meet Meta's large-scale compute and capacity needs for its Gemini models, forcing the social media giant to cut back its usage and delay multiple internal AI projects. The restrictions remain in place. Meta has asked employees to "watch closely" their consumption of AI tokens and to be more restrained with model inputs, outputs and overall usage. This stands in sharp contrast to the company's stance over the past year, when it vigorously promoted AI internally and even "mandated" its use in certain scenarios.

The report noted that although Meta has invested heavily in building its own open-source Llama models in recent years, and CEO Mark Zuckerberg has repeatedly declared that AI will become the company's next-generation core platform, Meta in fact relies heavily on Google's Gemini in many key parts of its business. According to people familiar with the matter, Meta uses Gemini extensively for customer service, advertiser chatbots, code generation, removal of suspicious or harmful content, and fraud detection. Gemini was chosen as the preferred internal solution precisely because it outperforms Meta's own models. Anthropic's Claude is also in contention and is used in some parts of the business.

Google's tightening of supply affects not only Meta but also other Google Cloud and Gemini customers. Meta stands out because its demand is far higher than that of comparable customers. Unlike Google, Microsoft and Amazon, Meta does not operate its own cloud computing business, which means that beyond its in-house AI systems it must buy external compute and model services from competitors. As internal demand expands rapidly, this structural dependence becomes an even bigger problem.

To cope with soaring demand for AI, Google has continued to increase its investment in data centers and dedicated hardware in recent years. Quarterly revenue at its cloud business has exceeded 20 billion U.S. dollars, and its backlog of unfulfilled orders is close to 460 billion U.S. dollars, indicating that overall market demand for compute far exceeds existing capacity. Google said its first-party models process more than 16 billion tokens per minute through direct API calls, an increase of about 60% from the previous quarter. This indirectly confirms that, as large models move into commercialization, compute and capacity are becoming key bottleneck resources.

Meta is trying to solve the same problem by a different route. On the one hand, the company is expanding its own data centers. On the other, it is working with Broadcom to develop custom MTIA accelerator chips, hoping to gradually reduce its dependence on cloud service and model providers such as Google. After the setback in its bet on the metaverse, Meta urgently needs to establish a "next platform" narrative in AI. Having its access limited because of over-reliance on external models has also exposed the gaps in its infrastructure and compute planning, and the urgency of closing them.