After years of hard work, CPUs based on the Arm architecture have grown substantially in the server market and have been adopted by many customers. Hyperscale cloud service providers such as Amazon Web Services (AWS), Alibaba, and Microsoft have all chosen to work with Arm on their self-developed CPUs. Why is this?
"The answer is very simple. By working with Arm, they can build and optimize solutions based on their own use cases and infrastructure," Mohamed Awad, Arm's senior vice president and general manager of the infrastructure division, said at the Arm Tech Symposia 2023 annual technology conference.
Like the hyperscale cloud service providers, NVIDIA, one of the most important AI chip providers, also values the customizability of Arm server CPUs.
NVIDIA's powerful GH200 superchip contains 72 Arm Neoverse cores. Coupled with NVIDIA's GPU, the GH200 can deliver 10 times the AI performance of systems based on the x86 architecture.
To meet the customization needs of more customers building infrastructure, Arm has two other important initiatives.
Why are Arm Neoverse CPUs preferred?
The GH200 Grace Hopper superchip platform was released by NVIDIA in May this year and is designed to handle massive generative AI workloads. The NVIDIA DGX GH200 supercomputer, which combines 256 GH200 superchips, raises AI performance to an astonishing exaflop level (one quintillion, or 10^18, calculations per second).
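To put the exaflop figure in perspective, here is a minimal arithmetic sketch. It assumes the system delivers exactly one exaflop and that the work is split evenly across the 256 superchips. Both are simplifications made for illustration, not figures from NVIDIA, and the names in the code are hypothetical.

```python
# Rough per-superchip share of the quoted system-level AI performance.
# Assumptions (not from the article): exactly 1 exaflop in total,
# divided evenly across all superchips.

EXAFLOP = 10**18        # one quintillion calculations per second
NUM_SUPERCHIPS = 256    # GH200 superchips in one DGX GH200


def per_chip_flops(total_flops: int, chips: int) -> float:
    """Return the even share of total_flops carried by each chip."""
    return total_flops / chips


per_chip = per_chip_flops(EXAFLOP, NUM_SUPERCHIPS)
print(f"{per_chip / 1e15:.2f} petaflops per superchip")
```

Under those assumptions, each superchip would account for roughly 3.9 petaflops of AI compute.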
The key to such powerful AI performance lies in the change of system architecture.
Traditional system architecture in infrastructure
In the traditional server system architecture, a general-purpose, off-the-shelf CPU (the host CPU) sits at the center: memory is attached to it, and it manages multiple accelerators connected over PCIe.
"This traditional architecture was the only architecture available on the market in the past," Mohamed Awad pointed out. "The problem with this architecture is that the interface between the general-purpose, off-the-shelf CPU and the accelerators directly limits the final performance of the product. Because all accelerators must access additional memory through this off-the-shelf CPU, memory coherence cannot be achieved, the performance of the accelerators cannot be fully utilized, and the requirements of generative AI cannot be well supported."
Facing new application requirements, modern system architecture has emerged in the infrastructure field
The GH200 superchip changes the traditional architecture. Through NVLink, each CPU is connected directly to its own accelerator, achieving strong memory coherence. One of the keys to this design is a customizable CPU. With this architecture, NVIDIA can get the most out of the GPU and maximize performance for actual scenarios and use cases.
"Only by understanding the final use case and designing the CPU according to the usage scenario can we achieve better efficiency and achieve the best performance of the product." Mohamed Awad further said, "NVIDIA and Arm leveraged the flexibility brought by Arm technology to design the chips they need to further optimize the system, while making full use of Arm's powerful software ecosystem."
The next question is whether the architecture proposed by NVIDIA will become mainstream in the era of generative AI.
"It is still too early to judge whether pairing a CPU with a GPU as the accelerator will be the main trend in the future, or the only trend," Mohamed Awad told Leifeng.com. "We are in the era of accelerated computing. In future architectures, whatever the method of coupling, there will be an accelerator next to every general-purpose CPU. What makes Arm unique is that it can help partners build customized CPUs from scratch according to their needs, and connect the CPU and the accelerator well."
Because x86 is offered only as standard CPU chips, an Arm CPU was the best choice for the GH200 superchip platform. This is also the key to Arm Neoverse's popularity.
In other words, standardized CPUs cannot meet the customized needs of infrastructure, and customization has become Arm’s trump card in the server market.
Customization: Arm's "trump card" in the server market
In August this year, Arm launched Arm Neoverse Compute Subsystems (CSS), which enable the Arm ecosystem to create specialized chips at lower cost, with less risk, and in less time.
The first-generation CSS product, Arm Neoverse CSS N2, integrates the Neoverse N2 platform and comes as a verified configuration optimized for power, performance, and area (PPA).
"Neoverse CSS can help our partners further reduce investment, make our solutions more accessible to the entire ecosystem, and shorten the time to market of partners' products," Mohamed Awad said.
Leifeng.com learned that some Arm customers have saved up to 80 person-years of engineering time by using Neoverse CSS. Other customers using Neoverse CSS took only 13 months to go from concept to tape-out.
Microsoft's recently released Cobalt 100 CPU is also based on Neoverse CSS.
"Arm Neoverse has many customers in the Chinese market, especially in the infrastructure field, and its growth has been very strong over the past three or four years," said Zou Ting, global vice president of Arm's China business. "Arm also actively participates in local ecosystems and open source software communities for data centers and cloud computing, including the OpenAnolis (Dragon Lizard) community, to help these communities better integrate into the Arm global ecosystem."
Mohamed Awad also emphasized that China is one of Arm's most important markets. Total shipments of Arm-based chips from Chinese partners have reached 30 billion. Arm has nearly 400 technology licensees in China, and this number continues to rise every month.
Arm's global ecosystem is also key to meeting customers' differentiated needs. Building on Neoverse CSS, Arm has launched Arm Total Design, which draws on the strength of the ecosystem to simplify the development of customized chips and make delivery easier.
With Arm Total Design, ASIC design companies can quickly start design projects and offer their design solutions to customers who need them; IP suppliers can pre-integrate, pre-verify, and pre-optimize advanced IP for Neoverse CSS; EDA partners can seamlessly support the most advanced tools and flows to simplify SoC design; commercial firmware solutions can begin development well before chip tape-out; and Neoverse CSS designs will be specially optimized to take full advantage of leading process nodes.
Clearly, in an era when infrastructure is pursuing differentiation, Arm Neoverse CSS and Arm Total Design are currently the best options for meeting differentiated needs.
It is also worth noting that Arm has transformed into a computing platform company. Today, Arm Total Compute Solutions, the Arm Neoverse platform, Arm Corstone, and the SOAFEE computing platform are widely used in mobile, infrastructure, Internet of Things, automotive, and other fields.