Has the Data Center Staffing Crisis Stifled Cooling Innovation?
At a time when spending is tighter, data centers are finding ways to optimize the resources they have. Maybe that’s putting the liquid cooling era on hold.
As society scrambles to lock away the memory of the last pandemic, it’s only natural for the data center industry to recalibrate itself in search of “the new normal.” But have conditions settled to the point where we can statistically declare what “normal” should be? The latest global data center survey results from Uptime Institute suggest that, even though people may be more immune to the disease, the data center economy remains in a state of flux, pushing innovation toward a standstill.
The most telling indicator may be survey respondents’ power usage effectiveness (PUE) levels, which Uptime has tracked annually since 2007. Through the pandemic, average annual PUEs stayed statistically flat, which is understandable given the economic conditions at that time. For 2023, however, instead of resuming its downward trend, average PUE appears to be ticking upward to 1.58, back where it was five years ago.
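For readers less familiar with the metric, PUE is the ratio of total facility energy to the energy delivered to IT equipment, so a PUE of 1.58 means roughly 0.58 units of overhead (cooling, power distribution, lighting) for every unit of IT load. Here is a minimal Python sketch of that arithmetic. The meter readings are hypothetical, chosen only to reproduce the 1.58 figure, and are not Uptime data.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power usage effectiveness: total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


# Hypothetical annual meter readings (assumption, not survey data).
it_kwh = 10_000_000.0
total_kwh = 15_800_000.0

ratio = pue(total_kwh, it_kwh)
overhead_kwh = total_kwh - it_kwh  # energy spent on everything except IT load

print(f"PUE = {ratio:.2f}")
print(f"Overhead = {overhead_kwh:,.0f} kWh")
```

A facility that spent nothing on overhead would score exactly 1.0, which is why vendors quote figures such as 1.1 as a near-ideal target.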
“One of the reasons [PUE] has been pretty steady-state, and hasn’t continued to drop,” explained Uptime Chief Technology Officer Chris Brown in a recent company webinar, “is that data center owners and operators typically are not monitoring and managing to a single variable, such as PUE. They’re managing an entire system.
Steady-State Theory
“One way to get PUE to drop further with the existing technology,” Brown continued, “is to have active systems actually modulate equipment, and reset supply temperatures based upon rack inlet temperatures, where the hot spots [and] cold spots are on the data floor, as well as the weather. The issue is that requires significant capital investment. It’s going to also increase operating costs, because the people required to maintain that equipment are more costly than people who would just operate direct-expansion air conditioners on a daily basis. It can also induce risk, because you have an active system that, if it goes haywire, could shut down equipment rather than starting equipment.”
Another very significant factor playing into operators’ design decisions, Brown noted, is the increasing efficiencies of conventional cooling methods such as direct-expansion (DX). The gains to be realized by moving to direct liquid cooling (DLC) may be smaller than they were prior to the pandemic. One possible reason, at least theoretically, may be the urgent need for operators to optimize their existing buildouts, now that their CapEx and OpEx have both shrunk.
Could this perfect storm of DX efficiency improvements, converging with the rising cost of skilled human power, have a dampening effect on cooling innovation — one that could leave PUE levels flat, if not gently rising, at least for the near future? Data Center Knowledge put this question to Chris Brown.
Two decades ago, prior to the advent of digital scroll compressors, Brown told us, as much as five times the electrical power was required for a DX system to deliver one ton of cooling load (12,000 BTU/hr) as for a chilled water system. Naturally that gap has shrunk somewhat, but in recent years, it’s become slight enough to garner attention.
“What we’re seeing is, more data centers are using direct expansion cooling systems,” said Brown, “because it’s much easier to operate those systems than chilled water systems. You can operate with a smaller staff that’s not necessarily highly skilled in hydraulics and hydronics. You just need access to an HVAC technician, if there’s a problem. So we’re seeing data center operators and owners ... using a little more power than with the chilled water systems. Then on the OpEx side, with the staffing and the risk associated with it, they’re helping themselves out because it’s much easier to operate, thus they run much less risk of human error.”
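The two-decade-old gap Brown describes can be put in rough numbers. The sketch below converts one ton of cooling to kilowatts of heat removed, then derives a coefficient of performance (COP) from an electrical draw per ton. The 0.6 kW-per-ton chilled water figure is a hypothetical placeholder, not a value from Brown or Uptime. Only the 12,000 BTU/hr definition and the "five times" ratio come from the text above.

```python
BTU_PER_HR_PER_TON = 12_000        # one ton of cooling load, as cited above
WATTS_PER_BTU_PER_HR = 0.29307107  # standard unit conversion


def ton_to_kw_thermal(tons: float) -> float:
    """Heat removed, in kW, for a given cooling load in tons."""
    return tons * BTU_PER_HR_PER_TON * WATTS_PER_BTU_PER_HR / 1000.0


def cop(kw_electrical_per_ton: float) -> float:
    """Coefficient of performance: heat removed per unit of electricity used."""
    if kw_electrical_per_ton <= 0:
        raise ValueError("electrical draw per ton must be positive")
    return ton_to_kw_thermal(1.0) / kw_electrical_per_ton


# Hypothetical chilled water efficiency (assumption for illustration only).
chilled_water_kw_per_ton = 0.6
# "As much as five times the electrical power" for older DX systems.
legacy_dx_kw_per_ton = 5 * chilled_water_kw_per_ton

print(f"1 ton = {ton_to_kw_thermal(1.0):.2f} kW of heat removed")
print(f"Chilled water COP: {cop(chilled_water_kw_per_ton):.2f}")
print(f"Legacy DX COP:     {cop(legacy_dx_kw_per_ton):.2f}")
```

As the electrical draw per ton of a DX system approaches that of chilled water, the two COP values converge, and staffing and operating risk start to carry more weight in the decision.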
The Sustainability Factor
Responses to Uptime’s 2023 survey from some 572 data center operators worldwide suggest their uptake of higher efficiency cooling options, including direct liquid cooling, has noticeably slowed. About 56 percent of respondents told Uptime their highest density cabinets (40 kW and above) continue to utilize perimeter cooling systems — fairly ordinary air conditioners.
As Uptime research analyst Jacqueline Davis noted during the webinar, DLC manufacturers continue to tout feasible PUE levels descending below even 1.1, given ideal operating conditions. At the same time, DLC continues to promise sustainable rack densities above 30 kW.
But as the real world stands now, noted Davis, rack densities remain relatively flat. Probably as a result, DLC rollouts are happening in limited space — rarely the full floor.
“It’s a story of mixed infrastructure,” said Davis. “A lot of those are very limited rollouts at present, and a lot of those are sharing data hall space with air cooling. Until we see a really large rollout of direct liquid cooling, we’re not going to see an impact, for a little while, on the industry average PUE at large.”
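Davis’s point about mixed infrastructure can be illustrated with a load-weighted average. The sketch below is a simplification: it assumes each cooling zone’s overhead scales only with its own IT load, it treats the 1.1 vendor figure and the 1.58 survey average as per-zone PUEs, and the 1,000 kW hall size is hypothetical.

```python
def blended_pue(zones):
    """Load-weighted PUE for a list of (it_load_kw, zone_pue) pairs."""
    total_it_kw = sum(it_kw for it_kw, _ in zones)
    if total_it_kw <= 0:
        raise ValueError("total IT load must be positive")
    total_facility_kw = sum(it_kw * zone_pue for it_kw, zone_pue in zones)
    return total_facility_kw / total_it_kw


HALL_IT_KW = 1000.0  # hypothetical data hall IT load
DLC_PUE = 1.1        # vendor-quoted figure under ideal conditions
AIR_PUE = 1.58       # 2023 survey average, used here as a stand-in

for dlc_share in (0.0, 0.1, 0.5, 1.0):
    zones = [
        (HALL_IT_KW * dlc_share, DLC_PUE),
        (HALL_IT_KW * (1.0 - dlc_share), AIR_PUE),
    ]
    print(f"DLC share {dlc_share:.0%}: blended PUE {blended_pue(zones):.3f}")
```

Under these assumptions, moving a tenth of the load to liquid cooling shifts the blended figure by only about 0.05, which is consistent with limited rollouts barely registering in an industry-wide average.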
So what are the signals really telling us? Have the staffing shortage, the high cost of skills, the rising costs of maintenance and cooling, plus — while we’re at it — global warming, all conspired to stifle data center design innovation? Or can we just as easily make the opposite case: Are these factors all compelling operators to drive innovation in design, through investments in incremental efficiency gains and system redundancy?
For a fresh perspective, we put the question to Omdia’s head of cloud and data center research, Vladimir Galabov.
“I have not heard of global warming being reported as a driver for higher cooling costs or maintenance issues,” wrote Galabov in a note to Data Center Knowledge. “Cooling is something data center operators have optimized continuously for decades and in the current wave they’re investing for two reasons: 1) AI computing requires highly configured servers which are so densely packed with processors that they require cooling innovation to enable fully populated racks, and maximize computing per square foot. 2) Lower the power consumption of data centers by reducing the number of mechanical components like fans. Lower power consumption means less costs and better sustainability credentials.”
In its UPS Market Update last June, to which Galabov contributed, Omdia did report a modest 4 percent reduction in annual forecast server shipments for 2023 — its first such forecast contraction since the 2007 global financial crisis. But Omdia attributes this shake-up to a different mix of factors than Uptime’s survey indicates:
Companies refocusing their priorities to make way for servers more proficient in training AI models
The availability of lower-cost cloud computing capacity, to help companies get by as they reassess their long-term demand strategies
A wave of new sustainability goals that incorporate Scope 3 emissions — the carbon impact of activities brought on as a result of data center activity, as opposed to directly attributable to it.
Galabov says Omdia actually hasn’t seen the continuation of supply chain issues that Uptime has reported. Still, he added, “I think investment is holding as steady as possible, given the significant increases of the cost of capital with the rise of interest rates globally.”
But that doesn’t really settle the main question: Are these factors driving innovation or stifling it? “This is a bit of a targeted question,” admitted Galabov. “It’s difficult to choose either option 1 or 2.”
The N Factor
If the efficiency improvements for DLC over DX were still 5x, maybe it would be worth training staff (and, of course, paying them more) to do harder jobs that incur greater risk. But the pressure the pandemic placed on operators to optimize their use of available resources may have altered the equation.
Consider the supply chain constraints data centers still face, which, according to Uptime’s clients, never relented. Some major operators may be deploying zero redundancy (a state referred to simply as “N”) because replacement parts are scarce. Uptime’s Chris Brown talked of some clients seeking panel boards that are back-ordered for 28 weeks, and replacement engine generators on hold for 12 months or longer.
It’s a situation that’s compelling some operators that do have CapEx available to invest in a complete N+1 backup system now, regardless of how long it may take to build. And because it’s a backup, it probably won’t have DLC, but rather a DX system with whatever efficiency they can muster for the time being.
Will we see investments in innovation, particularly in design, cooling, and power efficiency, slow down if not decline over the next few years — let’s say, until 2025 — as long as operators prioritize stockpiling systems and parts, and minimizing labor costs, over driving the next waves in rack density improvements, cooling, and power efficiency?
“For small companies, an investment in additional systems to protect oneself from supply chain shortages could encourage them not to upgrade soon,” Brown responded, in a follow-up note to Data Center Knowledge. “But I doubt that, since most will invest in additional redundancy to shore up older systems, and they will still, at some point, want to upgrade those older systems. In fact, adding a system is a common first step to upgrading an older system that will take time to complete, so they are not running on N for an extended period of time.”
Cautioning us that his views were straight-up speculation, Brown went on: As long as supply chain shortages continue, secondary markets for replacement parts could flourish. Brown noted one example of UPS manufacturers purchasing back parts being replaced, so they can be refurbished and resold as spares.
“If overall average rack densities really started to climb rapidly — which I doubt will happen,” wrote Brown, “everyone will work hard and invest to support the business at hand. But as long as rack densities do not climb rapidly, the current technology is doing the job and probably does not make a compelling business case for most to dramatically change their cooling and power system approaches.”
It’s a mix of signals from the data center market that’s driving a mix of interpretations. The market could be at a standstill, or it could be dead center in the eye of a storm. Here’s what’s clear: Until we know what “normal” has become, we may not have a handle on our bearings.