Google Unveils Advanced Liquid Cooling for TPUs at Hot Chips 2025, Highlighting Scalability, Efficiency, and Reliability in Datacenter AI Infrastructure
Liquid cooling is emerging as a critical solution in modern datacenters, driven by the escalating power and heat output of advanced AI chips. At Hot Chips 2025, Google shared insights into its large-scale liquid cooling infrastructure designed specifically for its Tensor Processing Units (TPUs), highlighting how the technology has become essential for managing the thermal demands of cutting-edge machine learning hardware.

Google began experimenting with liquid cooling for TPUs in 2018 and has since evolved its approach to support datacenter-wide cooling. Unlike traditional server-level cooling, Google's current system uses liquid cooling loops that span entire racks. Each rack contains six Coolant Distribution Units (CDUs), which function similarly to the radiator-and-pump combo in enthusiast water-cooling setups. The CDUs use flexible hoses and quick-disconnect couplings to simplify maintenance and relax alignment tolerances. With five CDUs active at any time, the sixth can be serviced without interrupting operations, enabling true zero-downtime maintenance.

The CDUs transfer heat between the server-side coolant and the facility's chilled water supply, keeping the two liquids completely separate. The coolant then flows through manifolds to reach the TPU servers. In the current design, TPU chips are connected in series within each loop, so chips later in the chain receive warmer coolant. Cooling capacity is therefore sized to handle the thermal load of the final chip in the loop.

To improve thermal performance, Google adopted a split-flow cold plate design, which outperforms traditional straight-through configurations. This is complemented by an architectural shift in TPUv4: moving from a lidded package to a bare-die setup, similar to the delidding practice common among PC enthusiasts. The change boosts heat transfer efficiency, a necessity given that TPUv4 consumes 1.6 times more power than its predecessor.
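The consequence of plumbing chips in series can be sketched with basic thermodynamics: each chip raises the coolant temperature by ΔT = P / (ṁ·c_p), so the last chip in the chain sees the warmest inlet and sets the sizing requirement. The sketch below uses hypothetical figures (500 W per chip, 0.1 kg/s of water, 30 °C supply); none of these numbers come from Google's presentation.

```python
# Illustrative sketch: coolant temperature rise along a series-connected
# cold-plate loop. All numbers are assumptions, not Google's figures.

CP_WATER = 4186.0  # specific heat of water, J/(kg*K)

def inlet_temps(supply_temp_c, flow_kg_s, chip_powers_w):
    """Coolant inlet temperature seen by each chip in a series loop.

    Each chip raises the coolant by dT = P / (m_dot * c_p), so chips
    later in the chain receive progressively warmer coolant.
    """
    temps = []
    t = supply_temp_c
    for p in chip_powers_w:
        temps.append(t)
        t += p / (flow_kg_s * CP_WATER)
    return temps

# Four hypothetical 500 W TPUs sharing one loop at 0.1 kg/s:
temps = inlet_temps(supply_temp_c=30.0, flow_kg_s=0.1,
                    chip_powers_w=[500.0] * 4)
for i, t in enumerate(temps, 1):
    print(f"chip {i}: inlet {t:.1f} C")
```

Under these assumptions each chip warms the loop by roughly 1.2 K, so the fourth chip's cold plate must be specified for coolant about 3.6 K hotter than the supply, which is exactly why capacity is sized to the final chip.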
Beyond better heat removal, liquid cooling reduces the overall energy needed for cooling. Google reports that the power used by its liquid cooling pumps is less than 5% of the fan power required in comparable air-cooled systems. While enthusiast water-cooled PCs still rely on fans to exhaust heat from radiators, datacenter systems use water-to-water heat exchange, shifting the bulk of cooling power to pumps. This makes the approach far more efficient at scale, though the benefit matters less in individual PC builds, where absolute power demands are much lower.

Maintenance remains a major challenge. Both datacenters and enthusiast systems face risks like leaks and microbial growth. Google addresses these with rigorous component testing, leak detection alerts, scheduled maintenance, filtration, and standardized response protocols. These measures allow a large team to react quickly and consistently, far beyond what is typical in DIY setups.

The trend is clear: liquid cooling is no longer just for high-end PCs. At Hot Chips 2025, multiple vendors showcased liquid-cooled systems. Nvidia displayed its GB300 server with visible external cooling connections and flexible tubing, while still incorporating fans. Rebellions AI, a South Korean AI accelerator startup, demonstrated its REBEL Quad chip with a water block and chiller setup, though the final product will use air cooling in a PCIe form factor.

These developments signal that liquid cooling is now a foundational element of datacenter design, especially as AI workloads continue to push hardware to thermal limits. With the AI boom showing no signs of slowing, efficient, scalable, and reliable cooling solutions like those presented by Google will be essential for the future of computing.
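To make the "<5% of fan power" claim concrete, a back-of-envelope sketch is shown below. The rack-level fan power figure is a made-up example, not a number from Google; only the 5% ratio comes from the article.

```python
# Back-of-envelope sketch of the pump-vs-fan cooling overhead.
# The 2 kW fan power per rack is a hypothetical figure for illustration;
# the 0.05 ratio reflects the "<5% of fan power" claim.

def cooling_overhead(fan_power_w, pump_fraction=0.05):
    """Return (pump_power_w, savings_w) if liquid-cooling pumps draw
    at most `pump_fraction` of the fan power they replace."""
    pump_power = fan_power_w * pump_fraction
    return pump_power, fan_power_w - pump_power

# A hypothetical rack spending 2 kW on fans, converted to pumped liquid:
pump_w, saved_w = cooling_overhead(2000.0)
print(f"pumps: {pump_w:.0f} W, saved: {saved_w:.0f} W per rack")
```

Multiplied across thousands of racks, even this simple ratio shows why the pump-based approach dominates at datacenter scale while mattering little for a single PC.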