GPU clusters — from 4 nodes to a data hall

Everything a training or inference cluster needs, on one bill of materials: GPU nodes, fabric, storage, power and cooling — staged delivery with the compliance work done.

GPU cluster hardware

Cluster projects fail on the seams: GPUs arrive without fabric, cooling specs surface too late, customs holds a rack hostage. We quote the whole stack together and ship it in deployment order.

Cluster bill of materials

Layer | Options | Notes
Compute nodes | H200 NVL PCIe nodes; HGX B200/B300 systems | PCIe path avoids allocation queues
Fabric | InfiniBand NDR; 400G Ethernet (QSFP-DD) | Cut-length cable schedules included
Storage | NetApp AFF / PowerScale for datasets & checkpoints | Sized to GPU count and IO profile
Head & service nodes | R660/DL360 class for scheduling, login, monitoring | Typically from stock
Power | Intelligent PDUs, high-wattage PSUs, UPS | Per-rack power budgets calculated
Cooling | Rear-door HX for PCIe pods; DLC for HGX density | Decision tree in our cooling guide

Typical cluster shapes

Inference pod: 4× H200 NVL nodes + 400G fabric, serving 70–180B models.
→ from ~$350k all-in

Fine-tuning cluster: 8–16 nodes with InfiniBand and shared flash storage.
→ quoted per allocation

Training cluster: HGX B200/B300 with DLC, multi-rack fabric.
→ project-quoted, staged delivery

Sovereign / air-gapped: Private cluster with no external dependencies.
→ compliance documentation included
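
As a rough illustration of how the 70–180B inference range maps to GPU memory, the sketch below estimates the memory needed for model weights alone and the minimum number of H200 NVL GPUs that would hold them. It assumes 141 GB of HBM per H200 NVL GPU (NVIDIA's published figure). The function names and the 0.7 headroom fraction (the share of each GPU reserved for weights, with the rest left for KV cache and activations) are hypothetical choices for this example, not a sizing rule from this page. GPUs per node are not stated here, so the result is a GPU count, not a node count.

```python
import math

# Assumption: 141 GB HBM per H200 NVL GPU (published spec).
H200_NVL_MEMORY_GB = 141


def weight_memory_gb(params_billion: float, bytes_per_param: float) -> float:
    """Memory for model weights only, in GB (1 GB = 1e9 bytes).

    1e9 parameters at N bytes each is N GB, so the billions cancel.
    """
    return params_billion * bytes_per_param


def min_gpus_for_weights(
    params_billion: float,
    bytes_per_param: float,
    headroom: float = 0.7,  # hypothetical: fraction of GPU memory given to weights
) -> int:
    """Smallest GPU count whose weight budget covers the model weights."""
    usable_per_gpu = H200_NVL_MEMORY_GB * headroom
    needed = weight_memory_gb(params_billion, bytes_per_param)
    return math.ceil(needed / usable_per_gpu)


if __name__ == "__main__":
    # FP16/BF16 weights use 2 bytes per parameter; FP8 uses 1.
    for size in (70, 180):
        for label, width in (("FP16", 2), ("FP8", 1)):
            gb = weight_memory_gb(size, width)
            gpus = min_gpus_for_weights(size, width)
            print(f"{size}B {label}: {gb:.0f} GB weights, at least {gpus} GPUs")
```

Under these assumptions a 70B model in FP16 needs about 140 GB for weights and a 180B model about 360 GB. Real sizing also depends on context length, batch size and the serving framework, which this sketch ignores.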

Availability and pricing anchors

from ~$350k: 4-node H200 NVL inference pod, all-in
1 BOM: compute, fabric, storage, power, cooling
Staged: racks ship in deployment order
Export-screened: dual-use compliance per destination

Stock rotates daily, so positions are listed as "typically available" and confirmed per request, usually within one business day.

Export compliance. NVIDIA H200/H100/B-series GPUs are US export-controlled dual-use items (ECCN 3A090). Haink supplies them only after end-user and destination screening under US EAR and OFAC rules, and declines any order to a restricted destination or end use. Hong Kong and Mainland China destinations are treated as controlled under current US rules; orders are quoted accordingly.

Frequently asked questions

How much does a GPU cluster cost?

A 4-node H200 NVL inference pod with fabric and storage lands around $350k all-in; HGX training clusters run from high six figures depending on GPU allocation and scale. We quote the complete BOM, not just the GPUs.

PCIe or HGX for our first cluster?

PCIe (H200 NVL) deploys in weeks from stock and serves most inference and fine-tuning workloads. HGX earns its premium for large-scale training with NVLink-heavy communication. We model both against your workload.

Do you handle the fabric and cabling design?

Yes — InfiniBand or 400G Ethernet topologies with port-level BOMs and cut-length cable schedules, so nothing arrives missing.

Can you deliver a cluster to a hard-to-ship country?

That is our lane: export screening, DDP delivery, customs clearance and staged shipping from Hong Kong and Dubai hubs.

Who installs it?

Our integration partners install at major destinations, or we deliver rack-ready kits with elevations and schedules for your own team to install.

Scoping a cluster?

We return pricing, availability and delivered lead time within one business day.

sales@haink.org