ARPA-E Talks ‘Moonshot’ Program to Revolutionize Cooling
At DCW 2024, the program director of the DOE’s Advanced Research Projects Agency-Energy expounded on the radical COOLERCHIPS program and how it could change data center cooling forever.
April 24, 2024
The U.S. Department of Energy’s Advanced Research Projects Agency-Energy (ARPA-E) announced federal funding for a great many vendors and academic institutions in September of 2022 as part of the COOLERCHIPS program. The goal? To develop transformational, highly efficient and reliable cooling technologies that reduce total cooling energy expenditure to less than 5% of a typical data center’s IT load at any time and at any U.S. location for a high-density compute system. That’s a tall order.
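To make the 5% target concrete, the check can be sketched in a few lines of Python. This is only an illustration: the function names and the 1,000 kW facility figures are hypothetical and do not come from ARPA-E.

```python
def cooling_fraction(cooling_kw: float, it_load_kw: float) -> float:
    """Return cooling power as a fraction of IT load."""
    if it_load_kw <= 0:
        raise ValueError("IT load must be positive")
    return cooling_kw / it_load_kw


def meets_target(cooling_kw: float, it_load_kw: float, target: float = 0.05) -> bool:
    """True if cooling energy is strictly below the target share of IT load."""
    return cooling_fraction(cooling_kw, it_load_kw) < target


# Hypothetical facility: 1,000 kW of IT load with 40 kW spent on cooling (4%).
print(meets_target(40.0, 1000.0))
```

Under these assumed numbers the facility would pass, while one spending 50 kW on cooling for the same IT load would not, since the goal is stated as less than 5%.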
Eighteen months later, where do we stand? Peter de Bock, program director for ARPA-E, provided an update at the 2024 Data Center World conference.
“We have projects ongoing on single and two-phase immersion cooling and direct-to-chip (DtC) cooling, as well as others and we intend to have a proof of concept by the first half of 2026,” he said. “By 2030, the country with the most efficient, powerful and lower-TCO data centers will be at a major advantage.”
Solving the Collaboration and Funding Tangle
De Bock describes this as a moonshot program. Most projects being funded involve multiple participants to push technology far beyond current limits.
“The government pays the bill to get around challenges such as which vendor pays for innovation,” said De Bock.
But which COOLERCHIPS technologies and projects will ultimately achieve the program goals of dramatically reducing the thermal resistance of heat rejection and allowing coolants to run at temperatures less than 10°C different from the operating temperatures of the latest generation of chips? De Bock didn’t give much away during his talk. He noted that progress had been made in developing potential solutions that can effectively cool data center densities of 80 kW/m³ and greater. Work continues to ensure the winning approaches provide low total cost of ownership (TCO) without compromising data center reliability and availability.
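The link between thermal resistance and the 10°C figure follows from the standard definition of thermal resistance as temperature difference divided by heat flow. A minimal sketch, assuming a hypothetical 500 W chip (a number chosen for illustration, not one cited in the talk):

```python
def max_thermal_resistance(delta_t_c: float, heat_w: float) -> float:
    """Largest chip-to-coolant thermal resistance (K/W) that keeps the
    temperature difference within delta_t_c while removing heat_w watts."""
    if heat_w <= 0:
        raise ValueError("Heat load must be positive")
    return delta_t_c / heat_w


# Hypothetical example: a 500 W chip with a 10 degree C chip-to-coolant budget.
print(max_thermal_resistance(10.0, 500.0))
```

With these assumed inputs, the whole path from chip to coolant would need to stay at or below 0.02 K/W, and the budget tightens as chip power rises.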
Some of the ideas under development include better 3D flow manipulation of cold plates to efficiently transfer heat. Cold plate materials are also receiving attention. Silicon, in combination with other materials, might be a better option than copper and aluminum. For example, HP has developed a silicon-based printer head (working with NVIDIA).
Another Intel COOLERCHIPS project is attempting to take immersion cooling to the next level. One area under investigation is fluid life, with the aim of extending the interval between immersion fluid replacements from every six months to as long as five years.
Additionally, one issue being investigated is raising the temperatures of data centers. Keeping air temperatures low enough for maintenance personnel to work comfortably requires a lot of cooling and power.
“Reimagined data center architectures may enable us to not have humans in the same room as computers so we can run hotter data centers and lower energy demands,” said De Bock. “Over the next 18 months, we will see what ideas pan out.”
He revealed that COOLERCHIPS is actually an acronym, standing for Cooling Operations Optimized for Leaps in Energy, Reliability and Carbon Hyper-efficiency for Information Processing Systems.
ARPA-E expects 90% of the projects to fail to meet their objectives. But 10% success will probably transform the industry. Participants are encouraged to try radical and unproven techniques and designs. ASHRAE and other standards are largely being ignored. De Bock’s view is that trying to adhere to standards would bog the program down. In any case, the winning projects are likely to break existing standards and occasion the development of new standards.
At Data Center World 2024, De Bock championed evaporative cooling as an efficient way to cool the data center – provided water is available. He noted energy savings of about 60%, but added that once temperatures hit 55℃, other cooling methods are required.
Four COOLERCHIPS Tracks
The COOLERCHIPS program has four distinct tracks:
1. Components pertaining to the secondary cooling loop that transfers heat from the servers to the facility water or primary cooling loop.
2. Cooling systems for modular and edge data centers that encompass the secondary and primary cooling loops, which transfer heat from facility water to the ambient.
“We are looking at the all-in-one edge module concept with units offering low latency, minimal-to-no water usage, and the ability to serve high compute densities,” said De Bock.
3. Data center cooling system software that will include the ability to model energy efficiency, reliability, CO2 footprint, and cost simultaneously.
“We lack tools for energy, CO2 and cost modeling and some teams are focused on that,” said De Bock. “Any solutions devised within the program will have to prove they can achieve 99.2% uptime at a minimum.”
4. Support facilities for testing new technologies developed under the first two tracks.
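The 99.2% minimum uptime quoted above can be translated into an annual downtime allowance. The sketch below is illustrative only; the function name is hypothetical, and it assumes a non-leap year of 8,760 hours.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, assuming a non-leap year


def allowed_downtime_hours(uptime_percent: float) -> float:
    """Hours per year a system may be down at the given uptime percentage."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("Uptime must be between 0 and 100 percent")
    return (100.0 - uptime_percent) / 100.0 * HOURS_PER_YEAR


# The minimum uptime quoted for COOLERCHIPS solutions.
print(allowed_downtime_hours(99.2))
```

At 99.2%, that works out to roughly 70 hours of permitted downtime per year.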
ARPA-E continues to look for participants in its programs. It is also allowing participants to exceed the original parameters. Some are being encouraged to propose higher targets for cooling efficiency, temperatures, and densities and apply for more funding.
“All cooling methods have their place – including immersion, DtC, air cooling and others,” said De Bock. “The ideal outcome would be to arrive at modular units that can be manufactured in the millions to achieve the economies of scale of the automotive industry.”
Market Realities
Rakesh Radhakrishnan, Technology-to-Market Advisor at the U.S. Department of Energy (DOE), added perspective on the commercialization of COOLERCHIPS solutions.
“We increase funding for those that have the best commercialization potential,” he said. “This is done to take these projects to POC inside a data center.”
Thus, there is initial ARPA-E funding to foster research and innovation. The best candidates receive a further cash injection for scale up and POC. The most viable technologies are handed over to the vendors and the investment community to take them to market.
With work ongoing on so many fronts simultaneously, coupled with the volume of innovation apparent within the vendor community, one or more winning cooling technologies is sure to emerge. The need is urgent.
Shen Wang, principal analyst at Omdia, noted that the die size of CPUs has increased 100x since the 1970s. Since 2000, processor size is up 7.6x and power consumption has risen by 4.6x.
“Innovation is needed for racks above 150 kW,” said Wang. “Will it be immersion, DtC or a combination of immersion and air cooling? Time will tell. But the areas of greatest need right now are to be able to cool precisely and efficiently.”