HyperAI


Nvidia Unveils Rubin Architecture: Next-Gen AI Chips Boost Speed and Efficiency Amid Growing Infrastructure Demand

At the Consumer Electronics Show, Nvidia CEO Jensen Huang unveiled the company’s new Rubin computing architecture, describing it as the cutting edge of AI hardware. The architecture is now in full production and is expected to scale significantly in the second half of the year. “Vera Rubin is designed to address this fundamental challenge that we have: The amount of computation necessary for AI is skyrocketing,” Huang told the audience. “Today, I can tell you that Vera Rubin is in full production.”

The Rubin architecture, first announced in 2024, marks the latest milestone in Nvidia’s rapid hardware innovation cycle, which has propelled the company to become the most valuable corporation in the world. It will succeed the Blackwell architecture, which itself replaced the Hopper and Lovelace generations.

Rubin chips are already being integrated into systems across the industry, with major cloud providers and AI labs securing early access. Key partners include Anthropic, OpenAI, and Amazon Web Services. The architecture will also power HPE’s Blue Lion supercomputer and the upcoming Doudna supercomputer at Lawrence Berkeley National Laboratory.

Named after pioneering astronomer Vera Florence Cooper Rubin, the architecture is built around six interconnected chips working in unison. At its core is the Rubin GPU, but the design also introduces major upgrades to storage and interconnectivity. The Bluefield system has been enhanced to improve data movement, while the NVLink interconnect technology has been further optimized for higher bandwidth and lower latency. Additionally, Nvidia introduced a new Vera CPU specifically engineered for agentic reasoning—complex, goal-driven AI workflows.

Nvidia’s senior director of AI infrastructure solutions, Dion Harris, highlighted the growing demands on memory systems, particularly for the key-value (KV) cache, which stores intermediate attention state during AI model inference. “As you start to enable new types of workflows, like agentic AI or long-term tasks, that puts a lot of stress and requirements on your KV cache,” Harris said during a press call. “So we’ve introduced a new tier of storage that connects externally to the compute device, which allows you to scale your storage pool much more efficiently.”

Performance gains are substantial. According to Nvidia’s internal benchmarks, the Rubin architecture delivers 3.5 times faster training performance and 5 times faster inference speeds compared to Blackwell. It can achieve up to 50 petaflops of compute power and supports 8 times more inference compute per watt, significantly improving energy efficiency.

These advancements come amid fierce competition in the AI infrastructure space, as cloud providers and AI labs race to secure hardware and the data center capacity needed to run next-generation models. On an October 2025 earnings call, Huang projected that between $3 trillion and $4 trillion will be invested in AI infrastructure over the next five years, underscoring the scale of the industry’s transformation.
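To see why agentic and long-running workloads stress the KV cache, it helps to recall what the cache does: during autoregressive decoding, a transformer stores the attention keys and values of every previous token so they are not recomputed for each new token, at the cost of memory that grows linearly with sequence length. The sketch below is a generic, NumPy-based illustration of that mechanism; the class name, shapes, and toy attention are illustrative assumptions, not Nvidia's implementation.

```python
import numpy as np

# Minimal sketch of a per-head key-value (KV) cache. During autoregressive
# decoding, each new token's attention key and value are appended, so earlier
# tokens never need to be recomputed. Purely illustrative; not Nvidia's design.

class KVCache:
    def __init__(self, head_dim: int):
        self.keys = np.empty((0, head_dim))    # one row per cached token
        self.values = np.empty((0, head_dim))

    def append(self, k: np.ndarray, v: np.ndarray) -> None:
        """Add the newest token's key/value row to the cache."""
        self.keys = np.vstack([self.keys, k])
        self.values = np.vstack([self.values, v])

    def attend(self, q: np.ndarray) -> np.ndarray:
        """Scaled dot-product attention of one query over all cached tokens."""
        scores = self.keys @ q / np.sqrt(q.shape[-1])
        weights = np.exp(scores - scores.max())   # numerically stable softmax
        weights /= weights.sum()
        return weights @ self.values

# Simulate decoding three tokens: each step appends to the cache and attends
# over everything cached so far.
cache = KVCache(head_dim=4)
rng = np.random.default_rng(0)
for _ in range(3):
    k, v, q = rng.normal(size=(3, 4))
    cache.append(k, v)
    out = cache.attend(q)

print(cache.keys.shape)  # (3, 4): one cached row per generated token
```

The cache trades memory for compute: each step avoids recomputing earlier keys and values, but the cached state grows with every token, which is exactly the pressure long-running agentic tasks create and why an external, scalable storage tier for this state is attractive.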
