Microsoft Azure Launches World’s First NVIDIA GB300 NVL72 Supercomputing Cluster for OpenAI

Microsoft has unveiled its first at-scale deployment of NVIDIA GB300 NVL72 systems, a major milestone in the race to power the next generation of artificial intelligence. The company announced the launch of the NDv6 GB300 VM series on Microsoft Azure, featuring the industry's first production-ready cluster of GB300 NVL72 systems, the rack-scale machines NVIDIA bills as AI "factories." Built in partnership with NVIDIA, the supercomputer-scale cluster is designed for the most demanding AI workloads, particularly those powered by OpenAI's frontier models.

The system consists of more than 4,600 NVIDIA Blackwell Ultra GPUs housed in liquid-cooled, rack-scale GB300 NVL72 units, each packing 72 Blackwell Ultra GPUs and 36 NVIDIA Grace CPUs. Each rack, exposed as a single virtual machine, delivers 37 terabytes of high-speed memory and 1.44 exaflops of FP4 Tensor Core performance, enough to support reasoning models, agentic AI systems, and multimodal generative AI with hundreds of trillions of parameters. The racks are linked by NVIDIA's Quantum-X800 InfiniBand networking platform, which provides 800 Gb/s of bandwidth per GPU and enables seamless communication across the entire cluster.

At the heart of the architecture is the fifth-generation NVIDIA NVLink Switch fabric, which delivers 130 TB/s of all-to-all bandwidth within each rack, effectively turning the rack into a single unified accelerator with a shared memory pool; this is critical for massive, memory-intensive AI models. For inter-rack communication, the Quantum-X800 platform adds features such as adaptive routing, telemetry-based congestion control, and SHARP v4 in-network computing to sustain performance and reduce latency during large-scale training and inference.

Microsoft emphasized that the deployment reflects years of deep collaboration with NVIDIA, combining hardware innovation with custom engineering across power, cooling, and software stacks. The result is data center infrastructure optimized specifically for frontier AI, with plans to scale to hundreds of thousands of Blackwell Ultra GPUs across Azure's global network of more than 300 data centers in 34 countries.

The announcement comes at a pivotal moment. Just days earlier, OpenAI announced major data center deals with both NVIDIA and AMD, with estimates putting its 2025 infrastructure commitments at more than $1 trillion. Despite this, Microsoft is positioning Azure as the current leader in ready-to-deploy AI infrastructure. CEO Satya Nadella shared a video of the new system on X (formerly Twitter), calling it the "first of many" such AI factories and emphasizing Azure's readiness to meet the demands of next-generation AI.

The platform's performance has already been validated in benchmarks. In MLPerf Inference v5.1, the GB300 NVL72 delivered up to five times higher per-GPU throughput on the 671-billion-parameter DeepSeek-R1 model than the previous Hopper architecture, and it also led on newer tests such as Llama 3.1 405B, underscoring its strength on frontier-scale inference.

Microsoft CTO Kevin Scott is set to discuss these advancements at TechCrunch Disrupt in San Francisco, signaling that more details on Azure's AI expansion are on the way. The move underscores Microsoft's strategy of leading in AI infrastructure not only through partnerships but through full-stack innovation, from chips and networking to software and cooling.
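
For readers who want to sanity-check the scale, here is a minimal back-of-envelope sketch in Python based on the per-rack figures quoted above. The exact GPU total ("over 4,600") is assumed here to be 64 full racks, and the transfer-time comparison is purely illustrative; it ignores protocol overheads, topology, and real workload behavior.

```python
# Back-of-envelope arithmetic for the figures quoted in the article.
# Assumption (not from the announcement): the "over 4,600" GPUs are
# treated as exactly 64 full GB300 NVL72 racks.

GPUS_PER_RACK = 72              # Blackwell Ultra GPUs per GB300 NVL72 rack
GRACE_CPUS_PER_RACK = 36        # NVIDIA Grace CPUs per rack
FAST_MEMORY_TB_PER_RACK = 37    # pooled high-speed memory per rack (TB)
FP4_EXAFLOPS_PER_RACK = 1.44    # FP4 Tensor Core peak per rack / VM
NVLINK_ALL_TO_ALL_TBPS = 130    # intra-rack NVLink Switch bandwidth (TB/s)
IB_PER_GPU_GBPS = 800           # Quantum-X800 InfiniBand per GPU (Gb/s)

racks = 4_608 // GPUS_PER_RACK  # assumed deployment size: 64 racks

print(f"racks:               {racks}")
print(f"Grace CPUs:           {racks * GRACE_CPUS_PER_RACK:,}")
print(f"fast memory total:    {racks * FAST_MEMORY_TB_PER_RACK:,} TB")
print(f"aggregate FP4 peak:   {racks * FP4_EXAFLOPS_PER_RACK:.1f} EF")

# Illustrative transfer time for a hypothetical 1 TB shard of weights or
# activations, comparing the intra-rack NVLink fabric with the aggregate
# inter-rack InfiniBand bandwidth of one rack's 72 GPU links.
shard_tb = 1.0
intra_rack_s = shard_tb / NVLINK_ALL_TO_ALL_TBPS
inter_rack_s = (shard_tb * 8_000) / (IB_PER_GPU_GBPS * GPUS_PER_RACK)
print(f"1 TB shard, intra-rack NVLink: {intra_rack_s * 1e3:.1f} ms")
print(f"1 TB shard, rack-to-rack IB:   {inter_rack_s * 1e3:.1f} ms")
```

The gap between the two transfer times is the point of the rack-scale design: within a rack, NVLink makes the 72 GPUs behave like one accelerator with a shared memory pool, so the NVLink-connected rack, not the individual GPU, becomes the basic building block of the cluster.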
While OpenAI builds its own data centers, Microsoft is leveraging its existing global footprint and deep tech integration to offer immediate, scalable, and high-performance AI infrastructure. This deployment sets a new benchmark for AI readiness and could shape how enterprises and AI developers access and deploy large-scale models in the years ahead.
