Microsoft delivers the first at-scale production cluster of NVIDIA GB300 NVL72 systems, with more than 4,600 NVIDIA Blackwell Ultra GPUs connected via the next-generation NVIDIA InfiniBand network. This cluster is the first of many, as we scale to hundreds of thousands of Blackwell Ultra GPUs deployed across Microsoft's AI datacenters globally, reflecting our continued commitment to redefining AI infrastructure and our collaboration with NVIDIA. These large-scale clusters of Blackwell Ultra GPUs will enable model training in weeks instead of months while delivering high throughput for inference workloads. We are also unlocking bigger, more powerful models, and will be the first to support training models with hundreds of trillions of parameters.
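For a sense of why that parameter count forces this kind of scale, here is a back-of-the-envelope sketch. The arithmetic is ours, not Azure's: it assumes the widely cited ~16 bytes-per-parameter rule of thumb for mixed-precision Adam training (FP16 weights and gradients plus FP32 master weights and two optimizer moments), and uses the 37TB-per-rack fast-memory figure detailed later in this post.

```python
# Back-of-the-envelope training-memory estimate for a hypothetical
# 100-trillion-parameter model. Assumptions are ours, not Azure's:
# ~16 bytes/parameter for mixed-precision Adam state, and the 37 TB
# of fast memory per GB300 NVL72 rack quoted below.

PARAMS = 100e12               # 100 trillion parameters
BYTES_PER_PARAM = 16          # mixed-precision Adam rule of thumb
RACK_FAST_MEMORY_TB = 37      # per GB300 NVL72 rack
GPUS_PER_RACK = 72

state_tb = PARAMS * BYTES_PER_PARAM / 1e12
racks = state_tb / RACK_FAST_MEMORY_TB
print(f"Model + optimizer state: {state_tb:,.0f} TB")
print(f"Racks for state alone:   {racks:,.0f} (~{racks * GPUS_PER_RACK:,.0f} GPUs)")
```

Even before counting activations and KV cache, the state alone spans dozens of racks, which is why a single training job has to stretch across thousands of GPUs over the scale-out fabric.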
This was made possible through collaboration across hardware, systems, supply chain, facilities, and many other disciplines, as well as with NVIDIA.
Microsoft Azure's launch of the NVIDIA GB300 NVL72 supercluster is an exciting step in the advancement of frontier AI. This co-engineered system delivers the world's first at-scale GB300 production cluster, providing the supercomputing engine needed for OpenAI to serve multitrillion-parameter models. This sets the definitive new standard for accelerated computing.
Ian Buck, Vice President of Hyperscale and High-performance Computing at NVIDIA
From NVIDIA GB200 to GB300: A new standard in AI performance
Earlier this year, Azure introduced ND GB200 v6 virtual machines (VMs), accelerated by NVIDIA's Blackwell architecture. These quickly became the backbone of some of the most demanding AI workloads in the industry, including for organizations like OpenAI and Microsoft, which already use massive clusters of GB200 NVL72 on Azure to train and deploy frontier models.
Now, with ND GB300 v6 VMs, Azure is raising the bar again. These VMs are optimized for reasoning models, agentic AI systems, and multimodal generative AI. Built on a rack-scale system, each rack hosts 18 VMs with a total of 72 GPUs (a quick sanity check of these numbers follows the list):
- 72 NVIDIA Blackwell Ultra GPUs (with 36 NVIDIA Grace CPUs).
- 800 gigabits per second (Gbps) per GPU of cross-rack scale-out bandwidth via next-generation NVIDIA Quantum-X800 InfiniBand (2x GB200 NVL72).
- 130 terabytes (TB) per second of NVIDIA NVLink bandwidth within the rack.
- 37TB of fast memory.
- Up to 1,440 petaflops (PFLOPS) of FP4 Tensor Core performance.
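The rack totals decompose cleanly into per-VM and per-GPU figures. The breakdown below is our own arithmetic on the published numbers, not an additional Azure specification:

```python
# Sanity-checking the ND GB300 v6 rack-level numbers above. The per-VM
# and per-GPU breakdowns are inferred by us from the published rack
# totals (dense vs. sparse FP4 is not specified here).

VMS_PER_RACK = 18
GPUS_PER_RACK = 72
RACK_FP4_PFLOPS = 1_440           # "up to" figure from the list above
RACK_NVLINK_TB_S = 130
SCALE_OUT_GBPS_PER_GPU = 800      # Quantum-X800 InfiniBand, per GPU

print(f"GPUs per VM:        {GPUS_PER_RACK // VMS_PER_RACK}")                       # 4
print(f"FP4 per GPU:        {RACK_FP4_PFLOPS / GPUS_PER_RACK:.0f} PFLOPS")          # 20
print(f"NVLink per GPU:     {RACK_NVLINK_TB_S / GPUS_PER_RACK * 1000:.0f} GB/s")    # ~1,806
print(f"Scale-out per rack: {GPUS_PER_RACK * SCALE_OUT_GBPS_PER_GPU / 1000:.1f} Tbps")
```

The ~1.8TB/s of NVLink bandwidth per GPU matches fifth-generation NVLink, and each rack presents 57.6 Tbps of aggregate scale-out bandwidth to the InfiniBand fabric.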

Building for AI supercomputing at scale
Building infrastructure for frontier AI requires us to reimagine every layer of the stack (compute, memory, networking, datacenters, cooling, and power) as a unified system. The ND GB300 v6 VMs are a clear illustration of this transformation, the product of years of collaboration across silicon, systems, and software.
At the rack level, NVLink and NVSwitch reduce memory and bandwidth constraints, enabling up to 130TB per second of intra-rack data transfer across 37TB of total fast memory. Each rack becomes a tightly coupled unit, delivering higher inference throughput at reduced latencies on larger models and longer context windows, and empowering agentic and multimodal AI systems to be more responsive and scalable than ever.
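For intuition on how tightly coupled a rack is, here is a rough sketch using only the figures quoted above; it is our own simplification, assuming transfers run at the full aggregate NVLink rate with no protocol overhead or contention:

```python
# Intuition for intra-rack coupling, using only the figures quoted above.
# Simplifying assumption (ours): transfers run at the full aggregate
# NVLink rate, ignoring protocol overhead and contention.

NVLINK_TB_S = 130        # aggregate intra-rack NVLink bandwidth
FAST_MEMORY_TB = 37      # total fast memory per GB300 NVL72 rack
GPUS_PER_RACK = 72

# Reshuffling the entire fast-memory pool across the rack:
print(f"Full 37TB reshuffle: {FAST_MEMORY_TB / NVLINK_TB_S:.2f} s")

# Per-GPU view: ~1.8 TB/s each, so exchanging a hypothetical 10 GB
# activation or KV-cache shard costs only a few milliseconds.
per_gpu_tb_s = NVLINK_TB_S / GPUS_PER_RACK
print(f"Per-GPU NVLink:      {per_gpu_tb_s * 1000:.0f} GB/s")
print(f"10 GB shard swap:    {0.010 / per_gpu_tb_s * 1000:.1f} ms")
```

At these rates, a model sharded across all 72 GPUs behaves almost as if it lived in one large, shared memory.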
To scale beyond the rack, Azure deploys a full fat-tree, non-blocking architecture using NVIDIA Quantum-X800 InfiniBand, the fastest networking fabric available today. This ensures that customers can efficiently scale training of ultra-large models to tens of thousands of GPUs with minimal communication overhead, delivering better end-to-end training throughput. Reduced synchronization overhead also translates into maximum GPU utilization, helping researchers iterate faster and at lower cost despite the compute-hungry nature of AI training workloads. Azure's co-engineered stack, including custom protocols, collective libraries, and in-network computing, ensures the network is highly reliable and fully utilized by applications. Features like NVIDIA SHARP accelerate collective operations and double effective bandwidth by performing math in the switch, making large-scale training and inference more efficient and reliable.
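The "double effective bandwidth" claim follows from the arithmetic of all-reduce itself. The sketch below is our own simplified cost model of that comparison, not NVIDIA's implementation:

```python
# Why in-network reduction (NVIDIA SHARP) roughly doubles effective
# all-reduce bandwidth. Simplified cost model (ours): a ring all-reduce
# pushes ~2x the buffer through each GPU's NIC (reduce-scatter plus
# all-gather), while switch-side reduction pushes it through only once
# (send up the reduction tree, receive the reduced result back).

def ring_allreduce_bytes_per_gpu(buffer_bytes: int, n_gpus: int) -> float:
    """Bytes each GPU sends in a ring all-reduce: 2*(N-1)/N of the buffer."""
    return 2 * (n_gpus - 1) / n_gpus * buffer_bytes

def sharp_allreduce_bytes_per_gpu(buffer_bytes: int) -> float:
    """Bytes each GPU sends with in-switch reduction: one copy of the buffer."""
    return float(buffer_bytes)

GIB = 1024**3
bucket = 1 * GIB      # a hypothetical 1 GiB gradient bucket
n_gpus = 4_608        # roughly the >4,600-GPU cluster described above

ring = ring_allreduce_bytes_per_gpu(bucket, n_gpus)
sharp = sharp_allreduce_bytes_per_gpu(bucket)
print(f"Ring:  {ring / GIB:.2f} GiB on the wire per GPU")
print(f"SHARP: {sharp / GIB:.2f} GiB per GPU (~{ring / sharp:.1f}x less)")
```

Halving the bytes each NIC must move per synchronization step is what lets the same physical fabric behave as if it had twice the bandwidth during large collective operations.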
Azure's advanced cooling systems use standalone heat-exchanger units and facility cooling to minimize water usage while maintaining thermal stability for dense, high-performance clusters like GB300 NVL72. We also continue to develop and deploy new power-distribution models capable of supporting the high energy density and dynamic load balancing required by the ND GB300 v6 VM class of GPU clusters.
Further, our reengineered software stacks for storage, orchestration, and scheduling are optimized to fully utilize compute, networking, storage, and datacenter infrastructure at supercomputing scale, delivering unprecedented levels of performance at high efficiency to our customers.

Looking ahead
Microsoft has invested in AI infrastructure for years, enabling rapid adoption of and transition to the latest technology. That is also why Azure is uniquely positioned to deliver GB300 NVL72 infrastructure at production scale and at a rapid pace, meeting the demands of frontier AI today.
As Azure continues to ramp up its worldwide GB300 deployments, customers can expect to train and deploy new models in a fraction of the time compared to previous generations. The ND GB300 v6 VMs are poised to become the new standard for AI infrastructure, and Azure is proud to lead the way, supporting customers as they advance frontier AI development.
Stay tuned for more updates and performance benchmarks as Azure expands production deployment of NVIDIA GB300 NVL72 globally.

