Amazon's cloud computing arm, AWS, announced on Tuesday the launch of a new product called "AI Factories", which lets large enterprises and government agencies run Amazon's artificial intelligence systems in their own data centers. Customers provide the data center space and power, while AWS deploys the full AI infrastructure and handles its operation and maintenance. The system can also be connected to existing AWS cloud services.


The model is aimed at organizations that are highly sensitive to data sovereignty and do not want to upload their data to third-party model providers or share underlying hardware with other customers.

The "AI factory" in the product name comes from NVIDIA's existing hardware system of the same name, which integrates the full set of components needed to run large models, such as GPU chips and networking equipment. The AWS AI Factory is in fact a close collaboration with NVIDIA. Both companies confirmed that the offering will combine their technologies to give customers an integrated AI compute and platform stack.

In terms of configuration, customers can choose NVIDIA's latest-generation Blackwell GPUs or Amazon's new in-house Trainium3 training chip, depending on their compute and cost requirements. Beyond compute, the system integrates AWS's own networking, storage, database, and security components. It can also connect directly to Amazon Bedrock, a model selection and hosting service, and to SageMaker, a platform for building and training models, forming a unified environment from the underlying hardware up to the development tools.

AWS is not the only cloud computing giant betting on "AI factories." As early as October, Microsoft showed off AI infrastructure based on NVIDIA's AI Factory technology, which it uses to host OpenAI and other workloads in data centers around the world, and it built a new generation of "AI super factory" data centers in Wisconsin and Georgia around this architecture. At the time, however, Microsoft did not emphasize deploying similar systems directly in customers' data centers as on-premises private clouds. It focused instead on its ability to build out its own cloud infrastructure.

On data sovereignty, Microsoft has previously announced plans to build data centers and cloud services locally in various countries, and has offered several localization options, including "Azure Local", in which Microsoft supplies and manages hardware that is deployed at customer sites to meet regulatory and compliance requirements. AWS, by contrast, is now explicitly proposing a model in which the customer owns the data center and AWS "moves the AI factory in". This further blurs the line between traditional enterprise private data centers and the public cloud, and brings hybrid cloud and enterprise-hosted facilities back to the center of the industry stage.

To outside observers, the generative AI wave is pushing the major cloud providers to invest once again in enterprise-owned data centers and hybrid cloud architectures, which recalls the era more than a decade ago when the enterprise data center was king. As hardware suppliers such as NVIDIA become more closely tied to the cloud giants, localized and sovereign deployments built around "AI factories" are likely to become one of the key options for large organizations planning their AI infrastructure over the next few years.