Microsoft Unveils Second-Gen Maia 200 AI Chip to Strengthen Cloud and Inference Performance
Microsoft has unveiled the second generation of its custom artificial intelligence chip, the Maia 200, as part of a broader strategy to strengthen its cloud computing business and reduce its reliance on industry-leading processors from Nvidia. The chip, announced by Scott Guthrie, Microsoft’s executive vice president for cloud and AI, marks a significant step in the company’s in-house hardware development; Guthrie described the Maia 200 as the most efficient inference system Microsoft has ever deployed.

While the company’s first AI chip, the Maia 100, was developed in 2021 but never offered to cloud customers, the Maia 200 will eventually be available for broader use. Developers, academics, AI research labs, and contributors to open-source AI projects can now apply for early access to a software development kit to begin testing and building on the new platform.

The chip will be used by Microsoft’s superintelligence team, led by Mustafa Suleyman, as well as by key products such as Microsoft 365 Copilot, the AI-powered productivity add-on, and Microsoft Foundry, a service that helps developers build applications on top of large AI models.

The move comes amid surging demand for AI infrastructure from generative AI startups such as Anthropic and OpenAI, as well as from enterprises developing AI agents and other advanced applications. Data center operators are under pressure to scale computing power while managing energy consumption and costs.

Microsoft is rolling out the Maia 200 in its U.S. Central region first, with deployment to follow in the U.S. West 3 region and additional data center locations after that. The chips are built on Taiwan Semiconductor Manufacturing Company’s cutting-edge 3-nanometer process. Each server houses four Maia 200 chips linked via Ethernet cables rather than the InfiniBand standard that Nvidia has long dominated through its 2020 acquisition of Mellanox.
Microsoft claims the Maia 200 delivers 30% higher performance than competing chips at the same price point. It also features more high-bandwidth memory than either Amazon Web Services’ third-generation Trainium chip or Google’s seventh-generation Tensor Processing Unit. By linking up to 6,144 Maia 200 chips in a single system, Microsoft says it can achieve high throughput while reducing energy usage and lowering total cost of ownership. The company previously demonstrated in 2023 that its GitHub Copilot coding assistant could run efficiently on the Maia 100, an early signal that its custom silicon could handle real-world AI workloads.
