In October 2024, AMD and Intel jointly established the x86 Ecosystem Advisory Group (EAG) to bring industry leaders together to advance the future of the x86 computing architecture. At its founding, the EAG announced four core technologies: FRED, AVX10, ChkTag, and ACE. AMD and Intel have now jointly released the ACE white paper, formally introducing this instruction set, billed as the "x86 standard matrix acceleration architecture," to the developer community.

The core goal of ACE is straightforward: to improve the matrix multiplication performance of x86 chips by orders of magnitude.

Matrix multiplication is the fundamental computational building block of neural networks and large language models. Existing SIMD instruction sets such as AVX10 can perform matrix operations, but they face clear bottlenecks in compute density and scalability.
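To make the claim concrete, a minimal NumPy sketch (illustrative only; shapes and names are our own) of why matrix multiplication dominates AI workloads: a dense neural-network layer is, at its core, one matmul.

```python
import numpy as np

# A dense layer reduces to a matrix multiplication:
# activations (batch x in_features) @ weights (in_features x out_features).
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))   # batch of 4 input vectors
w = rng.standard_normal((8, 3))   # layer weight matrix
y = x @ w                         # the matmul that hardware must accelerate
print(y.shape)  # (4, 3)
```

Transformer attention and MLP blocks in large language models repeat this pattern at much larger scale, which is why matmul throughput is the target ACE aims at.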

By introducing a matrix acceleration mechanism built on outer-product operations, ACE achieves 16 times the compute density of an equivalent AVX10 multiply-accumulate operation while consuming the same input vectors.
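The outer-product formulation can be sketched in NumPy (a conceptual illustration of the math, not the ACE ISA itself). Each step consumes one column of A and one row of B, the same two input vectors a vector FMA would consume, but produces a full rank-1 block of partial products rather than a single vector of them, which is the source of the density advantage the white paper cites.

```python
import numpy as np

def matmul_outer(A, B):
    """Matrix multiply as a sum of rank-1 (outer-product) updates."""
    m, k = A.shape
    k2, n = B.shape
    assert k == k2
    C = np.zeros((m, n))
    for t in range(k):
        # One column of A and one row of B yield m*n partial products,
        # versus a single vector's worth for a SIMD multiply-accumulate
        # on the same pair of input vectors.
        C += np.outer(A[:, t], B[t, :])
    return C

A = np.arange(6.0).reshape(2, 3)
B = np.arange(12.0).reshape(3, 4)
assert np.allclose(matmul_outer(A, B), A @ B)
```

A matrix engine that performs such rank-1 updates in hardware amortizes each vector load across many multiply-accumulates, which is how the density gain is achieved without widening the input datapath.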

In terms of data format support, ACE natively covers the precision formats currently mainstream in AI, including INT8, OCP FP8, OCP MXFP8, OCP MXINT8, and BF16.

As an extension of AVX10, ACE's software ecosystem enablement is already underway. Low-level deep learning and HPC libraries, Python scientific computing libraries such as NumPy and SciPy, and mainstream machine learning frameworks such as PyTorch and TensorFlow have all begun integration work.

AMD and Intel emphasize in the white paper that ACE is designed for low friction and broad coverage: from notebooks to supercomputers, developers should not need to rewrite code for different hardware platforms.

This stands in sharp contrast to approaches that migrate AI computation to dedicated accelerators, which typically incur additional code porting and migration costs.