Validated hardware blueprints you can budget from: an inference pod, a fine-tuning cluster and a starter stack. Each comes with a bill of materials, GPU options and indicative pricing. We tailor every blueprint to your workload and screen every order for compliance.
The workhorse for serving large models privately: four PCIe GPU nodes on a 400G fabric with shared flash storage. PCIe H200 NVL avoids HGX allocation queues, so the pod deploys in weeks. It is ideal for production inference and high-concurrency serving.
| Component | Specification | Notes |
|---|---|---|
| GPU nodes | 4× server, 2× H200 NVL 141 GB each | Dell R760xa / HPE DL380a class |
| Fabric | 400G QSFP-DD or InfiniBand | cut-length schedule included |
| Storage | NetApp AFF / Dell PowerScale | sized to model + cache |
| Head/service | R660 / DL360 class | scheduling, login, monitoring |
| Power/cooling | intelligent PDUs, rear-door heat exchanger | per-rack budget calculated |
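
As a quick sanity check on the table above, the pod's aggregate GPU count and GPU memory follow directly from the node specification. The sketch below is illustrative only: the `PodSpec` class and its field names are hypothetical, and the figures are simply the ones listed in the table (4 nodes, 2× H200 NVL per node, 141 GB per GPU).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PodSpec:
    """Hypothetical helper holding the inference pod figures from the table."""
    nodes: int = 4             # GPU servers in the pod
    gpus_per_node: int = 2     # H200 NVL cards per server
    gpu_memory_gb: int = 141   # memory per H200 NVL card, in GB

    @property
    def gpu_count(self) -> int:
        # Total GPUs across all nodes
        return self.nodes * self.gpus_per_node

    @property
    def total_gpu_memory_gb(self) -> int:
        # Aggregate GPU memory across the pod
        return self.gpu_count * self.gpu_memory_gb


pod = PodSpec()
print(f"{pod.gpu_count} GPUs, {pod.total_gpu_memory_gb} GB aggregate GPU memory")
```

This gives 8 GPUs and 1,128 GB of aggregate GPU memory. How much of that is usable for a given model depends on how the serving software shards it across nodes, which this sketch does not model.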
For fine-tuning and pre-training: eight HGX 8-GPU systems on a non-blocking InfiniBand NDR spine, with high-throughput flash for datasets and checkpoints, and direct liquid cooling for 60 kW+ racks. HGX is allocation-bound, so we quote honest lead times and can bridge with H200 NVL nodes in the meantime.
| Component | Specification | Notes |
|---|---|---|
| Compute | 8× HGX 8-GPU systems (B200/B300) | on allocation, quoted realistically |
| Fabric | InfiniBand NDR, ConnectX-7/8 | non-blocking topology |
| Storage | parallel flash, multi-GB/s | datasets + fast checkpoints |
| Cooling | DLC manifolds + CDUs | 60 kW+ rack density |
| Power | high-density PDUs, UPS | N+1 options |
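
To show how the cluster figures combine, here is a rough sketch of the GPU count and a rack-count estimate against a 60 kW rack budget. The per-system power draw (`ASSUMED_SYSTEM_KW`) is a placeholder assumption for illustration, not a vendor figure or a number from this page; substitute the measured or quoted value for your configuration. The `racks_needed` helper is likewise hypothetical.

```python
import math

HGX_SYSTEMS = 8        # from the table: 8x HGX systems
GPUS_PER_SYSTEM = 8    # each HGX system carries 8 GPUs


def racks_needed(systems: int, system_kw: float, rack_budget_kw: float) -> int:
    """Racks required if each rack may draw at most rack_budget_kw."""
    if system_kw <= 0 or rack_budget_kw < system_kw:
        raise ValueError("rack budget must cover at least one system")
    systems_per_rack = math.floor(rack_budget_kw / system_kw)
    return math.ceil(systems / systems_per_rack)


total_gpus = HGX_SYSTEMS * GPUS_PER_SYSTEM

# ASSUMPTION: placeholder per-system draw in kW, not a published specification.
ASSUMED_SYSTEM_KW = 14.0
racks = racks_needed(HGX_SYSTEMS, ASSUMED_SYSTEM_KW, rack_budget_kw=60.0)

print(f"{total_gpus} GPUs across {racks} racks (at the assumed power draw)")
```

The estimate covers compute systems only. Switches, storage and cooling distribution units also draw power and take rack space, so a real per-rack budget would include them.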
The smallest credible private-AI footprint: a DGX Spark for development, one H200 NVL inference node for launch, and an NVMe-heavy node for RAG and vector search. The same software stack scales to the pod and cluster above as you grow.
| Component | Specification | Notes |
|---|---|---|
| Development | NVIDIA DGX Spark | 128 GB unified, desktop |
| Inference | 1× node, 2× H200 NVL | 70B-class production serving |
| RAG / vector | NVMe CPU node | embeddings, retrieval, cache |
| Networking | 25/100G switching | Catalyst / Nexus |
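
The "70B-class" note on the inference node can be checked with simple arithmetic. The sketch below assumes 16-bit weights (2 bytes per parameter) and decimal gigabytes; it counts weights only and ignores KV cache, activations and runtime overhead, all of which consume part of the remaining headroom. The function name `weights_gb` is hypothetical.

```python
def weights_gb(params_billion: float, bytes_per_param: float) -> float:
    """Approximate weight footprint in GB (1 GB = 1e9 bytes)."""
    # 1e9 parameters * bytes per parameter / 1e9 bytes per GB
    return params_billion * bytes_per_param


GPU_MEMORY_GB = 141   # per H200 NVL, from the table
GPUS = 2              # GPUs in the starter inference node

available_gb = GPU_MEMORY_GB * GPUS
fp16_weights_gb = weights_gb(70, 2)   # ASSUMPTION: 70B parameters at 16-bit precision
headroom_gb = available_gb - fp16_weights_gb

print(f"weights {fp16_weights_gb:.0f} GB of {available_gb} GB, "
      f"{headroom_gb:.0f} GB left for KV cache and overhead")
```

Under these assumptions the weights take about 140 GB of the node's 282 GB. Longer context windows and higher concurrency increase KV cache demand, so the practical limit depends on the workload.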
Send us your workloads and destination, and we return a tailored BOM, firm pricing and compliant lead times within one business day.