New copper cold plate technology could cut data center cooling energy consumption by more than 90%

As the artificial intelligence boom drives up electricity demand in data centers, a research team at the University of Illinois at Urbana-Champaign has developed a 3D-printed pure copper cold plate that is expected to cut the share of data center power spent on cooling from about 30% of total consumption today to about 1.1%. The researchers estimate that if the technology were fully deployed in hyperscale data centers, overall cooling-related energy consumption would fall by more than 90%, approaching the efficiency limit that current thermal engineering can achieve.

According to the International Energy Agency, global data center electricity consumption will reach 485 terawatt-hours in 2025, of which approximately 30% is used for cooling; that share alone already exceeds Sweden's annual electricity consumption. At the same time, the rapid growth of generative artificial intelligence has led the industry to consider building data centers in space to obtain a more direct supply of solar energy. The irony is that about one-third of this enormous electricity bill has nothing to do with computing itself: it is spent on carrying away the heat into which the chips convert their electrical energy.

Take Nvidia's GB200 chip as an example: a single chip draws 1,200 watts, or about 28.8 kilowatt-hours per day, which is close to the average daily electricity consumption of an American household. Because of unavoidable Joule heating, almost all of those 1,200 watts end up as heat, which is theoretically enough to heat more than 50 glasses of water in just one hour. Today thousands, or even hundreds of thousands, of these chips are packed densely into racks. Without any cooling, the 220,000 GPUs and 300 megawatts of power in xAI's Colossus 1 data center alone would be enough to heat approximately 785,000 square feet of space to about 1,200 degrees Celsius in one hour, hotter than magma. Cooling has therefore become an unavoidable, even make-or-break, part of data center operations.
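The per-chip figures above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the researchers' calculation: only the 1,200-watt figure comes from the article, while the glass size (250 mL), the temperature rise (80 K, roughly room temperature to boiling), and the assumption of zero heat loss are illustrative choices.

```python
# Only CHIP_POWER_W comes from the article; everything else is an assumption.
CHIP_POWER_W = 1200.0
WATER_SPECIFIC_HEAT_J_PER_KG_K = 4186.0  # textbook value for liquid water
GLASS_MASS_KG = 0.25                     # assumed 250 mL glass
TEMPERATURE_RISE_K = 80.0                # assumed rise, about 20 C to 100 C


def daily_energy_kwh(power_w: float, hours: float = 24.0) -> float:
    """Energy drawn at constant power over the given number of hours."""
    return power_w * hours / 1000.0


def glasses_heated(power_w: float, seconds: float) -> float:
    """Glasses of water raised by TEMPERATURE_RISE_K, assuming no heat loss."""
    energy_j = power_w * seconds
    mass_kg = energy_j / (WATER_SPECIFIC_HEAT_J_PER_KG_K * TEMPERATURE_RISE_K)
    return mass_kg / GLASS_MASS_KG


energy_kwh = daily_energy_kwh(CHIP_POWER_W)
glasses = glasses_heated(CHIP_POWER_W, 3600.0)
print(f"Daily energy: {energy_kwh:.1f} kWh")
print(f"Glasses heated in one hour: {glasses:.1f}")
```

Under these assumptions the daily energy comes to 28.8 kilowatt-hours and one hour of heat output warms roughly 51 glasses, in line with the article's "more than 50". A different glass size or temperature rise would shift the second number.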

Behnood Bazmi, a mechanical engineer and the first author of the paper, said, "Cooling is the bottleneck of current chip design. By bridging the gap between computing design and manufacturing capabilities, our solution provides a new path for liquid cooling of more energy-efficient chips and various electronic equipment."

For a long time, data centers have relied mainly on air cooling: metal heat sinks are mounted on CPUs and GPUs, thin fins enlarge the heat exchange area, and high-power fans provide forced convection. Driving such a large air-handling system consumes a great deal of power in its own right, and traditional air cooling is increasingly unable to keep up with the sharply rising heat flux of the new generation of AI accelerator chips.

The industry is therefore accelerating its shift to direct-to-chip liquid cooling, in which a metal "cold plate" is mounted on top of the processor and coolant flows through tiny internal channels to carry heat away from the chip quickly. Conventional cold plates have been in use for a long time, but the design of their internal fins and flow channels generally prioritizes ease of machining. The fin geometries are mostly rectangular or cylindrical, and the plates are mostly made of aluminum alloy or stainless steel, which makes it difficult to achieve both maximum heat exchange performance and low flow resistance.

The University of Illinois team's innovation lies in two key areas: material and fin structure. The researchers used topology optimization, a class of mathematical optimization algorithms, to redesign the internal microstructure of the cold plate. The design evolved from traditional square and cylindrical pins into more complex, jagged, sharp-edged three-dimensional shapes that maximize heat transfer area and thermal performance while keeping flow resistance in check. Because such complex structures are almost impossible to produce economically with traditional processes, the team turned to electrochemical additive manufacturing (ECAM), which builds the desired shape directly, layer by layer. For the material, they chose pure copper, which has excellent thermal conductivity but is extremely difficult to shape finely with conventional 3D printing.

According to mechanical engineer Nenad Miljkovic, the corresponding author of the paper, ECAM can form pure copper into features as small as 30 to 50 microns, smaller than the diameter of a human hair. Experimental results show that, compared with conventional commercial cold plates, the topology-optimized pure copper cold plate improves liquid cooling performance by up to about 32% while reducing the system pressure drop by up to 68%. A lower pressure drop means that far less pump power is needed to circulate the same flow of coolant. Together, the two improvements yield significant overall energy savings.
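The link between pressure drop and pump power can be illustrated with the ideal hydraulic power relation, power = pressure drop × volumetric flow rate. The sketch below is only an illustration: the baseline pressure drop and the flow rate are hypothetical values, pump efficiency is ignored, and the only figure taken from the article is the "up to 68%" reduction in pressure drop.

```python
def hydraulic_pump_power_w(pressure_drop_pa: float, flow_rate_m3_per_s: float) -> float:
    """Ideal hydraulic power needed to push coolant through a pressure drop."""
    return pressure_drop_pa * flow_rate_m3_per_s


# Hypothetical operating point, chosen only for illustration.
BASELINE_PRESSURE_DROP_PA = 20_000.0
FLOW_RATE_M3_PER_S = 2.0e-5  # about 1.2 litres per minute

# Reported in the article as an upper bound ("up to 68%").
PRESSURE_DROP_REDUCTION = 0.68

baseline_w = hydraulic_pump_power_w(BASELINE_PRESSURE_DROP_PA, FLOW_RATE_M3_PER_S)
optimized_w = hydraulic_pump_power_w(
    BASELINE_PRESSURE_DROP_PA * (1.0 - PRESSURE_DROP_REDUCTION),
    FLOW_RATE_M3_PER_S,
)
saving_fraction = 1.0 - optimized_w / baseline_w

print(f"Baseline pump power:  {baseline_w:.3f} W")
print(f"Optimized pump power: {optimized_w:.3f} W")
print(f"Pump power saved:     {saving_fraction:.0%}")
```

At a fixed flow rate, the ideal pump power falls in direct proportion to the pressure drop, so a 68% lower pressure drop corresponds to 68% less hydraulic power regardless of the baseline values chosen. Real pumps have efficiency curves that this sketch does not model.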

The research team also modeled the effect at the level of a whole data center. In the current scenario, where air cooling still dominates, a data center with an installed capacity of 1 gigawatt may require approximately 550 megawatts of additional power for cooling infrastructure alone. With the optimized liquid cooling solution the team proposes, the cooling power of a facility of the same size is expected to fall to approximately 11 megawatts. In other words, while still removing the extreme heat generated by large-scale AI hardware, cooling's share of energy consumption is expected to shrink from roughly 30% to 35% today to approximately 1.1%, an overall reduction of more than 95%.
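These facility-level figures can be reproduced with simple ratios. The following is a rough sketch rather than the team's model: it assumes that cooling's "share" means cooling power divided by the sum of IT power and cooling power, and it ignores every other overhead. The three power figures are the ones quoted in the article.

```python
# Figures quoted in the article, in megawatts.
IT_POWER_MW = 1000.0
AIR_COOLING_MW = 550.0
LIQUID_COOLING_MW = 11.0


def cooling_share(cooling_mw: float, it_mw: float) -> float:
    """Cooling power as a fraction of IT plus cooling power (assumed definition)."""
    return cooling_mw / (it_mw + cooling_mw)


air_share = cooling_share(AIR_COOLING_MW, IT_POWER_MW)
liquid_share = cooling_share(LIQUID_COOLING_MW, IT_POWER_MW)
cooling_reduction = 1.0 - LIQUID_COOLING_MW / AIR_COOLING_MW

print(f"Air cooling share:    {air_share:.1%}")
print(f"Liquid cooling share: {liquid_share:.1%}")
print(f"Cooling power cut:    {cooling_reduction:.0%}")
```

Under this definition the air-cooled share is about 35%, the liquid-cooled share is about 1.1%, and cooling power drops by 98%, consistent with the article's "more than 95%".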

If these model predictions can be reproduced in real hyperscale deployments, the impact on data center energy efficiency would be revolutionary. According to the research team's estimate, the system could help a data center achieve a power usage effectiveness (PUE) of about 1.011, meaning that almost every watt drawn from the grid goes directly to computing rather than to auxiliary functions such as cooling, transmission and distribution losses, or lighting. For comparison, most of the world's most advanced hyperscale data centers have PUEs between 1.1 and 1.3, while a theoretically "perfect" data center would have a PUE of 1.0, with no energy spent on cooling or supporting infrastructure.
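PUE is defined as total facility power divided by the power delivered to IT equipment. The sketch below shows how a figure of about 1.011 follows from the article's numbers, on the assumption (made here, not stated as the team's method) that the 11 megawatts of cooling is the only overhead on top of 1 gigawatt of IT load.

```python
def power_usage_effectiveness(it_power_mw: float, overhead_mw: float) -> float:
    """PUE = total facility power / IT equipment power."""
    if it_power_mw <= 0:
        raise ValueError("IT power must be positive")
    return (it_power_mw + overhead_mw) / it_power_mw


# Article figures: 1 GW of IT load, 550 MW (air) or 11 MW (liquid) of cooling.
# Treating cooling as the only overhead is an assumption of this sketch.
pue_air = power_usage_effectiveness(1000.0, 550.0)
pue_liquid = power_usage_effectiveness(1000.0, 11.0)

print(f"PUE with air cooling:    {pue_air:.3f}")
print(f"PUE with liquid cooling: {pue_liquid:.3f}")
```

With cooling as the sole overhead, the air-cooled case gives a PUE of 1.55 and the liquid-cooled case gives 1.011. Any additional overhead, such as power distribution losses or lighting, would push a real facility's PUE higher.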

The research team acknowledges that its figures for whole-data-center energy consumption are still model projections and are not based on on-site measurements at real gigawatt-scale data centers. Even so, if the technology performs as expected in large-scale deployments, it could significantly reduce one of the largest and most overlooked hidden energy costs behind the current AI boom: data center cooling. The researchers believe that this approach of combining design optimization with advanced manufacturing is not limited to data centers and can be extended to a wider range of electronic equipment, and even to other engineering fields that require efficient thermal management.