After years of producing chips that serve both artificial intelligence model training and inference, Google is splitting the two tasks across separate processors, its latest move to compete with Nvidia in AI hardware. Google said on Wednesday that the change will come with its eighth-generation tensor processing unit (TPU), with both chips set to launch later this year.

Amin Vahdat, Google's senior vice president and chief technology officer for AI and infrastructure, said in a blog post: "With the rise of AI agents, we believe the industry will benefit from chips purpose-built for training and for deployment."
In March, Nvidia previewed an upcoming chip that would let models respond quickly to user questions, drawing on technology from its $20 billion acquisition of chip startup Groq. Google is a major Nvidia customer, but it also offers TPUs as an alternative to companies using its cloud services.
Most of the world's top technology companies are developing AI-specific semiconductors to maximize computing efficiency and serve particular applications. Apple has for years built its Neural Engine AI components into iPhone chips; Microsoft released its second-generation AI chip in January; and last week Meta announced that it is working with Broadcom to develop a range of AI processors.
Google is a pioneer of the trend. It began running AI models on self-designed chips in 2015 and started renting them to cloud customers in 2018. Amazon Web Services launched the Inferentia chip for processing AI requests in 2018 and the Trainium processor for training AI models in 2020.
Analysts at the investment firm D.A. Davidson estimated last September that the TPU business combined with the Google DeepMind AI team was worth approximately US$900 billion.
For now, no technology giant can replace Nvidia, and Google did not compare the new chips' performance against the AI chip leader's products. It did say, however, that the new training chip delivers 2.8 times the performance of the seventh-generation Ironwood TPU released last November at the same price, while the inference chip's performance rises by 80%.
Nvidia said its upcoming Groq 3 LPU hardware will use large amounts of static random-access memory (SRAM), a technology also used by AI chipmaker Cerebras, which filed to go public earlier this month. Google's new inference chip, codenamed TPU8i, is likewise built around SRAM: each chip carries 384MB on board, three times the capacity of the Ironwood TPU.
Sundar Pichai, CEO of Google parent Alphabet, wrote in a blog post that the chip architecture is designed to "achieve massive throughput and low latency in a cost-effective manner to meet the needs of running millions of agents simultaneously."
Adoption of Google's AI chips is expanding. Google said Citadel Securities has built quantitative research software on Google TPUs, and all 17 U.S. Department of Energy national laboratories use AI co-scientist software developed on the chips; AI company Anthropic has also committed to using several gigawatts of Google TPU computing capacity.