Liquid Cooling for AI GPU Servers — DLC, Immersion, and Data Center Requirements

High-density GPU servers for AI training and inference, particularly those using NVIDIA H100 SXM5, H200 SXM5, and B200 SXM5, exceed the heat-removal capacity of conventional data center air cooling. A single 8-GPU H100 SXM5 server (Supermicro SYS-821GE-TNHR, Dell XE9680) draws 10.2–11.2 kW at full load. A rack of eight such servers reaches 80–90 kW, far beyond the 8–15 kW per rack that typical precision air conditioning (CRAC/CRAH) is provisioned to remove. B200-based systems go further, with fully populated racks reaching 120+ kW. Liquid cooling is effectively mandatory for GPU cluster deployments at these rack densities.

Why Air Cooling Fails at GPU Densities

Conventional hot-aisle/cold-aisle air cooling in data centers is designed for standard rack densities of 3–10 kW. Air-cooled 1U/2U dual-CPU servers, network switches, and storage arrays typically draw 0.2–2 kW per rack unit. Even a densely packed 42U rack of 20 compute servers at 1.5 kW each totals 30 kW, which is about the most that air cooling with containment and high-capacity CRAC units can handle.

GPU AI servers break this assumption. Each NVIDIA H100 SXM5 GPU has a 700 W TDP, so the eight GPUs alone dissipate 5.6 kW; with CPUs, memory, NVMe, and networking, a single 8-GPU server in a 6U to 8U air-cooled chassis totals roughly 10–11 kW. Four such servers in one rack reach 40–44 kW. Eight servers reach 80–90 kW, a density that in practice requires compact liquid-cooled chassis or racks taller than 42U, since eight 6U to 8U systems need 48–64U. Conventional air cooling cannot remove this heat: the volume of air that can be moved across hot components limits heat extraction to approximately 20–30 kW per rack for standard cooling infrastructure, even with high-pressure precision air conditioning.
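The airflow limit can be made concrete with the sensible-heat relation Q = ρ · V · c_p · ΔT. The sketch below is illustrative only: the air properties are textbook values near room temperature, the 15 K inlet-to-exhaust temperature rise and the 10.6 kW per-server load are assumptions chosen for the example, and the function names are hypothetical.

```python
# Illustrative estimate of the airflow needed to carry a rack's heat away as warm air.
# Assumptions (not from any vendor spec): dry air near room temperature and a
# fixed inlet-to-exhaust temperature rise chosen by the caller.

AIR_DENSITY_KG_M3 = 1.2        # approximate density of air at ~20 °C, sea level
AIR_CP_J_PER_KG_K = 1005.0     # approximate specific heat of air
M3S_TO_CFM = 2118.88           # 1 m^3/s expressed in cubic feet per minute


def rack_heat_kw(servers: int, kw_per_server: float) -> float:
    """Total rack heat load, treating all electrical power as heat."""
    return servers * kw_per_server


def required_airflow_m3s(heat_kw: float, delta_t_k: float) -> float:
    """Volumetric airflow from Q = rho * V * cp * dT, solved for V."""
    if delta_t_k <= 0:
        raise ValueError("delta_t_k must be positive")
    return heat_kw * 1000.0 / (AIR_DENSITY_KG_M3 * AIR_CP_J_PER_KG_K * delta_t_k)


if __name__ == "__main__":
    for servers in (1, 4, 8):
        heat = rack_heat_kw(servers, 10.6)   # 10.6 kW: midpoint of the 10–11 kW range
        flow = required_airflow_m3s(heat, delta_t_k=15.0)
        print(f"{servers} servers: {heat:.1f} kW, "
              f"{flow:.2f} m^3/s ({flow * M3S_TO_CFM:,.0f} CFM)")
```

Under these assumptions an 8-server rack needs roughly 4.7 m³/s of air (close to 10,000 CFM) through a single rack footprint, which illustrates why the practical ceiling for air sits far below 80–90 kW.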

Direct Liquid Cooling (DLC)

Direct Liquid Cooling (also called direct-to-chip cooling or cold plate cooling) is the most widely deployed liquid cooling method for enterprise GPU servers. In DLC, a metal cold plate is mounted directly on the GPU die (and optionally on CPUs, memory, and VRMs). Coolant — typically water or a water-glycol mixture — flows through the cold plate, absorbs heat from the component, and carries it to a heat exchanger connected to the facility's chilled water loop.

DLC does not replace air cooling entirely — fans and air circulation remain inside the server chassis to cool components that are not directly on the liquid loop (PCIe retimers, NVMe drives, board-level components). DLC servers are typically described as "liquid + air hybrid" cooling systems. The percentage of heat removed by liquid varies by design: high-density DLC servers route 60–90% of total server heat to the liquid loop, with the remainder exhausted as hot air to the room.
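The hybrid split matters for facility planning, because the heat that is not captured by the liquid loop still has to be removed by room air. The following sketch applies the 60–90% range quoted above to an example rack; the 10.6 kW server load, the eight servers per rack, and the function name are assumptions for illustration.

```python
# Illustrative split of server heat between the liquid loop and room air.
# The 60–90% liquid capture range comes from the text above; the 10.6 kW server
# load and 8 servers per rack are example inputs, not measured values.


def split_heat(server_kw: float, liquid_fraction: float) -> tuple[float, float]:
    """Return (liquid_kw, air_kw) for one server."""
    if not 0.0 <= liquid_fraction <= 1.0:
        raise ValueError("liquid_fraction must be between 0 and 1")
    liquid_kw = server_kw * liquid_fraction
    return liquid_kw, server_kw - liquid_kw


if __name__ == "__main__":
    servers_per_rack = 8
    for fraction in (0.60, 0.75, 0.90):
        liquid_kw, air_kw = split_heat(10.6, fraction)
        print(f"{fraction:.0%} to liquid: "
              f"{liquid_kw * servers_per_rack:.1f} kW liquid, "
              f"{air_kw * servers_per_rack:.1f} kW air per rack")
```

With these example inputs, a rack at 60% liquid capture still exhausts about 34 kW to the room, while a rack at 90% capture exhausts about 8.5 kW, so the capture fraction of the chosen server design largely determines how much air cooling must remain in place.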

NVIDIA B200 DLC Requirements

NVIDIA B200 GPUs are designed for DLC-first deployment. Each B200 has a 1,000 W TDP, more than two H100 PCIe GPUs (350 W each) combined. NVIDIA's design guidelines for high-density Blackwell systems (GB200 NVL72 and liquid-cooled 8-GPU B200 servers from OEMs such as Supermicro) call for direct liquid cooling. GB200 NVL72 is a rack-scale design with no air-cooled configuration; a CDU (Coolant Distribution Unit) delivers coolant directly to the NVL72 rack. Air-cooled 8-GPU B200 servers such as NVIDIA DGX B200 do exist, but not at the rack densities described here.

NVIDIA H100/H200 DLC Options

H100 SXM5 and H200 SXM5 servers are available in both air-cooled and DLC configurations depending on the OEM platform:

Supermicro SYS-821GE-TNHR: standard air-cooled 8U 8-GPU H100/H200 SXM5 server; suitable for data centers with 20–25 kW per rack capacity and supplemental rear-door heat exchangers
Supermicro SYS-821GE-TNHR (DLC variant): same platform with direct liquid cooling cold plates on GPUs; required for deployments targeting 80+ kW per rack density
Dell XE9680: available in air-cooled and DLC configurations
NVIDIA DGX H100: air-cooled by default; NVIDIA offers DGX H100 DLC variants for high-density cluster deployments

Coolant Distribution Units (CDUs)

A Coolant Distribution Unit is the facility-side component of a DLC deployment. It connects the facility's chilled water supply (or building cooling loop) to the server-side liquid cooling loops in the rack and exchanges heat between the two, keeping them hydraulically separate because the facility water may differ in temperature, pressure, or water quality from what the server-side loop requires. CDUs are sized by total kW capacity: a CDU serving a rack of 8× H100 DLC servers needs to handle up to approximately 80–90 kW of heat removal.

CDUs are installed at the rack row or per-rack depending on deployment design. Manifolds distribute coolant from the CDU to individual servers in the rack via flexible hoses and quick-disconnect fittings that allow server removal without draining the loop.
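The kW rating of a CDU translates into a coolant flow requirement through Q = ṁ · c_p · ΔT. The sketch below is a rough estimate under stated assumptions: plain water properties, a supply-to-return temperature rise chosen by the caller (10 K in the usage note), and a hypothetical function name. A real sizing exercise would use the CDU and server vendors' data.

```python
# Illustrative coolant flow estimate for CDU sizing, from Q = m_dot * cp * dT.
# Assumptions: plain water properties and a caller-chosen supply-to-return
# temperature rise; water-glycol mixtures have lower specific heat and need more flow.

WATER_CP_J_PER_KG_K = 4186.0   # approximate specific heat of water
WATER_DENSITY_KG_L = 0.998     # approximate density of water near 20 °C


def required_flow_l_per_min(heat_kw: float, delta_t_k: float) -> float:
    """Coolant flow in litres per minute needed to absorb heat_kw at a given rise."""
    if delta_t_k <= 0:
        raise ValueError("delta_t_k must be positive")
    mass_flow_kg_s = heat_kw * 1000.0 / (WATER_CP_J_PER_KG_K * delta_t_k)
    return mass_flow_kg_s / WATER_DENSITY_KG_L * 60.0


if __name__ == "__main__":
    for heat_kw in (80.0, 85.0, 90.0):
        flow = required_flow_l_per_min(heat_kw, delta_t_k=10.0)
        print(f"{heat_kw:.0f} kW at 10 K rise: {flow:.0f} L/min")
```

For an 85 kW rack and an assumed 10 K rise this gives roughly 122 L/min; halving the temperature rise doubles the required flow, which is why the allowed ΔT is an important input when selecting manifolds and hoses.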

Rear-Door Heat Exchangers (RDHx)

A rear-door heat exchanger (RDHx) is a passive liquid-cooled panel mounted on the rear door of a standard server rack. Hot air exhausted from servers at the rear of the rack flows through the RDHx, where it is cooled by chilled water flowing through the exchanger, and exits as cool air rather than hot air. RDHx is a retrofit-friendly approach — it can be installed on existing racks without modifying the servers themselves — but it only removes heat from the air exhaust stream and does not provide the same thermal performance as direct liquid cooling on the GPU components.

Rear-door heat exchangers are typically suitable for rack densities up to 30–40 kW and are commonly used with air-cooled H100 PCIe GPU servers (which can reach 10–20 kW per rack depending on GPU count) or with air-cooled H100 SXM5 deployments at moderate rack density. They are not adequate for B200 or full-rack H100 SXM5 deployments.

Immersion Cooling

Immersion cooling submerges entire servers or server components in a dielectric fluid (non-conductive liquid) that absorbs heat directly from all components without requiring airflow or cold plates. Two types are in enterprise deployment:

Single-phase immersion: Servers are submerged in a dielectric fluid (engineered fluids from 3M, Shell, or similar) that remains liquid at operating temperature. The fluid circulates and carries heat to a heat exchanger connected to the facility cooling loop. Single-phase immersion is appropriate for high-density AI GPU servers and cools every submerged component, not only those fitted with cold plates as in DLC.
Two-phase immersion: A fluorocarbon fluid with a low boiling point vaporizes on contact with hot components, carries the heat as vapor to a condenser where it returns to liquid, and falls back to the bath. Two-phase immersion provides extremely efficient heat transfer and very high thermal uniformity across all components but requires more specialized facility infrastructure and fluids.

Immersion cooling is deployed in hyperscale AI clusters and specialized data centers. It is not yet mainstream for enterprise GPU deployments due to the significant facility and operational changes required compared to DLC or RDHx retrofits.

Data Center Requirements for GPU AI Servers

Organizations deploying GPU AI servers need to evaluate their data center's cooling infrastructure before procurement:

Air-cooled H100 PCIe servers (4–8 GPUs): 15–25 kW per rack. Standard CRAC units with hot-aisle/cold-aisle containment can handle this, or add RDHx for supplemental cooling.
Air-cooled H100/H200 SXM5 servers (8 GPUs per server): 10–11 kW per server, so a fully populated rack would reach 80–90 kW, which air cooling cannot remove. Plan for two to three servers per rack (20–33 kW) with per-rack RDHx or in-row cooling, or transition to DLC for higher density.
DLC H100/H200 SXM5 servers: Chilled water supply to the rack, a CDU per row or per rack, and facility supply water typically in the 14–22°C range. The facility must have chilled water drops at the rack rows and space for CDU installation.
B200/GB200 NVL72: Requires purpose-built DLC infrastructure. GB200 NVL72 is a 72-GPU rack-scale system with fully integrated cooling delivered as a complete rack solution; facility must support 120+ kW per rack with dedicated chilled water supply.
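The tiers above can be summarized as a first-pass lookup from rack heat load to cooling approach. This is a sketch only: the thresholds are the approximate figures quoted in this article treated as rules of thumb, and the constant and function names are hypothetical. It is not a substitute for a facility assessment.

```python
# Illustrative first-pass mapping from rack heat load to a cooling approach.
# Thresholds are the approximate figures quoted in this article, treated as
# rules of thumb; a real facility assessment would replace them.

AIR_CONTAINMENT_MAX_KW = 25.0   # upper end quoted for air cooling with containment
RDHX_MAX_KW = 40.0              # upper end quoted for rear-door heat exchangers


def suggest_cooling(rack_kw: float) -> str:
    """Return a coarse cooling suggestion for a given rack heat load in kW."""
    if rack_kw < 0:
        raise ValueError("rack_kw must be non-negative")
    if rack_kw <= AIR_CONTAINMENT_MAX_KW:
        return "air cooling with hot-aisle/cold-aisle containment"
    if rack_kw <= RDHX_MAX_KW:
        return "air cooling plus rear-door heat exchanger"
    return "direct liquid cooling with CDU"


if __name__ == "__main__":
    for rack_kw in (20.0, 33.0, 85.0, 120.0):
        print(f"{rack_kw:.0f} kW per rack: {suggest_cooling(rack_kw)}")
```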

Haink and GPU Server Cooling

Haink supplies air-cooled and DLC variants of H100/H200 SXM5 GPU servers (Supermicro, Dell, HPE platforms) in Hong Kong, Dubai, and Mainland China. For high-density GPU deployments, Haink can advise on cooling requirements, CDU sizing, and which server configurations are compatible with the facility's existing cooling infrastructure before procurement.

Frequently Asked Questions

Does H100 require liquid cooling?

H100 PCIe (350 W TDP) can be deployed in standard air-cooled servers without modifications. H100 SXM5 (700 W TDP) in 8-GPU servers can be air-cooled with high-efficiency chassis fans, but with standard CRAC cooling the rack is limited to approximately one or two servers. Rear-door heat exchangers raise that limit to roughly 30–40 kW per rack; full-rack H100 SXM5 deployments at 80+ kW require direct liquid cooling. H200 SXM5 has similar requirements to H100 SXM5. B200 (1,000 W TDP) is designed for direct liquid cooling: GB200 NVL72 has no air-cooled configuration, and air-cooled 8-GPU B200 servers are limited to low rack densities.
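The servers-per-rack limit follows from dividing the rack's cooling budget by the per-server heat load. A minimal sketch, assuming a 10.6 kW server (the midpoint of the 10–11 kW range) and using this article's approximate cooling budgets; the function name is hypothetical.

```python
import math

# Illustrative count of servers that fit within a rack's cooling budget.
# The cooling budgets below are the approximate figures quoted in this article;
# 10.6 kW per server is an example midpoint of the 10–11 kW range.


def max_servers_per_rack(cooling_kw: float, server_kw: float) -> int:
    """Largest whole number of servers whose combined heat fits the budget."""
    if server_kw <= 0:
        raise ValueError("server_kw must be positive")
    return max(0, math.floor(cooling_kw / server_kw))


if __name__ == "__main__":
    budgets = {
        "standard CRAC (15 kW)": 15.0,
        "air with containment (25 kW)": 25.0,
        "with RDHx (40 kW)": 40.0,
        "DLC rack (90 kW)": 90.0,
    }
    for label, kw in budgets.items():
        print(f"{label}: {max_servers_per_rack(kw, 10.6)} server(s)")
```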

What is a CDU in data center cooling?

A Coolant Distribution Unit (CDU) is the facility-side heat exchanger that connects a building's chilled water or cooling water loop to the server-side liquid cooling loop in direct liquid cooling deployments. The CDU regulates temperature, pressure, and flow rate of coolant delivered to the servers, and removes heat from the server-side coolant by exchanging it with the facility's chilled water supply. CDUs are typically sized in kW of heat removal capacity — a CDU handling a rack of 8× H100 DLC servers needs to remove approximately 80–90 kW continuously.

Can I add liquid cooling to existing air-cooled servers?

Not directly. DLC cold plates are integrated into the server chassis design and cannot be retrofitted onto servers designed for air cooling. The practical approach for organizations whose air-cooled GPU servers exceed what the room's air cooling can remove is to add rear-door heat exchangers (RDHx) to the existing racks; this requires only chilled water drops at the rack and RDHx installation, with no changes to the servers. For new GPU server procurement, organizations with chilled water infrastructure available should specify DLC server variants from the start.

What chilled water temperature is needed for GPU server DLC?

Direct liquid cooling for GPU AI servers typically operates with facility-side chilled water supply temperatures of 14–22°C. Higher supply water temperatures (warm water cooling) reduce data center cooling energy consumption but require GPUs to operate closer to their thermal limits; most DLC GPU server deployments use 14–18°C supply water for reliable thermal headroom. The CDU regulates the exact temperature delivered to server cold plates within the required range regardless of facility supply temperature variations.