Yesterday, Arm announced significant progress in its Arm Total Design program. Launched a year ago, the program aims to accelerate the development of custom chips for data centers by fostering collaboration among industry partners. The ecosystem has grown to nearly 30 participating companies, with AlcorMicro, Egis, PUFSecurity and SEMIFIVE as the most recent additions.
In a noteworthy development, Arm, Samsung Foundry, ADTechnology and Rebellions are collaborating on an AI CPU chiplet platform. The platform targets cloud, HPC and AI/ML workloads, combining Rebellions’ AI accelerator with ADTechnology’s compute chiplet, implemented on Samsung Foundry’s 2nm Gate-All-Around (GAA) process technology.
The platform is expected to deliver significant efficiency gains for generative AI workloads. For an LLM such as Llama 3.1 with 405 billion parameters, it is estimated to offer a 2-3x efficiency advantage over standard CPU designs.
Arm’s approach emphasizes the role of the CPU in supporting the full AI stack, including tasks such as data preprocessing, orchestration and retrieval-augmented generation (RAG). The company’s Compute Subsystems (CSS) are designed to meet these requirements, providing a foundation on which partners can build diverse chiplet solutions.
Several companies, including AlcorMicro and Alphawave, have announced plans to develop CSS-powered chips for a variety of AI and high-performance computing applications. The program also focuses on software readiness, ensuring that major frameworks and operating systems are compatible with Arm-based systems. Recent efforts include the introduction of Arm Kleidi technology, which optimizes CPU-based inference for open-source projects such as PyTorch and llama.cpp.
It is worth noting that, according to Google, most AI inference workloads run on CPUs, so it makes sense to build the most efficient and best-performing CPU possible for AI.