GPU Cluster Deployment — Hardware, Fabric & Delivery

Cluster bill of materials

Layer	Options	Notes
Compute nodes	H200 NVL PCIe nodes; HGX B200/B300 systems	PCIe path avoids allocation queues
Fabric	InfiniBand NDR; 400G Ethernet (QSFP-DD)	Cut-length cable schedules included
Storage	NetApp AFF / PowerScale for datasets & checkpoints	Sized to GPU count and IO profile
Head & service nodes	R660/DL360 class for scheduling, login, monitoring	Typically from stock
Power	Intelligent PDUs, high-wattage PSUs, UPS	Per-rack power budgets calculated
Cooling	Rear-door HX for PCIe pods; DLC for HGX density	Decision tree in our cooling guide
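
The "per-rack power budgets" note above amounts to summing peak draw per rack and comparing it against usable PDU capacity. A minimal sketch of that arithmetic follows. Every wattage, device count, the PDU capacity and the 80% derating factor are hypothetical placeholders, not figures from this page or from any vendor datasheet.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    watts: float  # hypothetical peak draw per unit, in watts
    count: int    # units of this device in the rack


def rack_power_kw(devices: list[Device]) -> float:
    """Total peak draw of one rack, in kW."""
    return sum(d.watts * d.count for d in devices) / 1000.0


def fits_budget(devices: list[Device], pdu_capacity_kw: float,
                derating: float = 0.8) -> bool:
    """True if peak draw stays within the derated PDU capacity."""
    return rack_power_kw(devices) <= pdu_capacity_kw * derating


# Placeholder rack: all numbers are illustrative assumptions.
rack = [
    Device("gpu-node", 5000, 4),
    Device("switch", 1500, 2),
    Device("head-node", 800, 1),
]
print(rack_power_kw(rack), fits_budget(rack, pdu_capacity_kw=34.5))
```

Real budgets would use measured or datasheet draw for the exact configuration and the facility's own derating policy.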

Typical cluster shapes

Inference pod: 4× H200 NVL nodes + 400G fabric, serving 70–180B-parameter models.

→ from ~$350k all-in

Fine-tuning cluster: 8–16 nodes with InfiniBand and shared flash storage.

→ quoted per allocation

Training cluster: HGX B200/B300 with DLC and multi-rack fabric.

→ project-quoted, staged delivery

Sovereign / air-gapped: private cluster with no external dependencies.

→ compliance documentation included
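
As a rough check on the 70–180B model range quoted for the inference pod, weight memory scales as parameter count times bytes per parameter. The sketch below is illustrative only: the function name and precision table are assumptions, and it counts weights alone. KV cache, activations and runtime overhead add to the real footprint.

```python
# Bytes per parameter for common weight precisions.
# Assumption: weights only; KV cache, activations and overhead are not counted.
BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "int4": 0.5}


def weight_memory_gb(params_billions: float, precision: str = "fp16") -> float:
    """Approximate weight footprint in GB (1 GB = 1e9 bytes)."""
    # (params_billions * 1e9 params) * bytes_per_param / 1e9 bytes per GB
    return params_billions * BYTES_PER_PARAM[precision]


for size in (70, 180):
    for precision in ("fp16", "fp8"):
        gb = weight_memory_gb(size, precision)
        print(f"{size}B @ {precision}: ~{gb:.0f} GB of weights")
```

At FP16 this gives roughly 140 GB for a 70B model and 360 GB for a 180B model, before any cache or overhead, which is why larger models are typically sharded across several GPUs or served at lower precision.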

Availability and pricing anchors

From ~$350k: 4-node H200 NVL inference pod, all-in

1 BOM: compute, fabric, storage, power, cooling

Staged: racks ship in deployment order

Export-screened: dual-use compliance per destination

Stock rotates daily, so positions are listed as "typically available" and confirmed per request, usually within one business day.

Export compliance. NVIDIA H200/H100/B-series GPUs are US export-controlled dual-use items (ECCN 3A090). Haink supplies them only after end-user and destination screening under US EAR and OFAC rules, and declines any order to a restricted destination or end use. Hong Kong and Mainland China destinations are treated as controlled under current US rules; orders are quoted accordingly.

Frequently asked questions

How much does a GPU cluster cost?

A 4-node H200 NVL inference pod with fabric and storage lands around $350k all-in; HGX training clusters run from high six figures depending on GPU allocation and scale. We quote the complete BOM, not just the GPUs.

PCIe or HGX for our first cluster?

PCIe (H200 NVL) deploys in weeks from stock and serves most inference and fine-tuning. HGX earns its premium for large-scale training with NVLink-heavy communication. We model both against your workload.

Do you handle the fabric and cabling design?

Yes. We design InfiniBand or 400G Ethernet topologies with port-level BOMs and cut-length cable schedules, so nothing is missing on arrival.

Can you deliver a cluster to a hard-to-ship country?

That is our lane: export screening, DDP delivery, customs clearance and staged shipping from Hong Kong and Dubai hubs.

Who installs it?

Integration partners at major destinations, or we deliver rack-ready kits with elevations and schedules for your team.


Running AI on this infrastructure? Haink also builds the LLM & ML software that runs on it — model, pipeline and GPUs under one contract.

Scoping a cluster?

Pricing, availability and delivered lead time within one business day.