
NVIDIA Unveils Nemotron 3 Family of Open AI Models for Efficient, Transparent Agentic AI Development

NVIDIA has unveiled the Nemotron 3 family of open models: a suite of AI models, data, and tools designed to advance transparent, efficient, and specialized agentic AI development across industries. The lineup spans three model sizes (Nano, Super, and Ultra), each built on a hybrid latent mixture-of-experts (MoE) architecture that improves performance, scalability, and cost efficiency for multi-agent systems.

As organizations move beyond single-model chatbots toward complex, collaborative multi-agent workflows, communication overhead, context drift, and high inference costs have become major bottlenecks. Nemotron 3 addresses these issues with a design that enables faster, more accurate long-horizon reasoning while maintaining transparency, which is key to building trust in AI systems that automate critical business processes.

Jensen Huang, CEO of NVIDIA, emphasized the importance of open innovation, stating that Nemotron turns advanced AI into an open platform and empowers developers to build agentic systems at scale with full visibility and control. The models also support NVIDIA's broader sovereign AI strategy, allowing organizations in Europe, South Korea, and beyond to build AI systems aligned with their own data, regulations, and values.

Early adopters, including Accenture, Cadence, CrowdStrike, Cursor, Deloitte, EY, Oracle Cloud Infrastructure, Palantir, Perplexity, ServiceNow, Siemens, Synopsys, and Zoom, are already integrating Nemotron 3 into AI workflows across manufacturing, cybersecurity, software development, media, and communications. ServiceNow CEO Bill McDermott highlighted the partnership, saying that pairing Nemotron 3 with ServiceNow's intelligent automation will set a new standard for speed, accuracy, and efficiency in enterprise AI.
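To ground the mixture-of-experts idea behind this architecture, the following is a minimal sketch of top-k expert routing in Python. It illustrates the general MoE technique only: a learned router scores each token against every expert and dispatches it to just the top-k, so most expert weights stay idle per token. It makes no claim about Nemotron 3's actual router, expert count, or hybrid latent design.

```python
# Conceptual sketch of MoE top-k routing (illustrative only, not NVIDIA's
# actual Nemotron 3 implementation).
import math

def softmax(scores):
    """Numerically stable softmax over a list of router scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_token(router_scores, k=2):
    """Pick the top-k experts for one token and renormalize their weights."""
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

# Example: 8 experts, one token; the two highest-scoring experts receive it.
scores = [0.1, 2.0, -0.5, 1.2, 0.0, 0.3, -1.0, 0.8]
print(route_token(scores))  # experts 1 and 3 receive the token
```

Because only k of the experts run per token, total parameter count can grow while per-token compute stays roughly constant, which is the efficiency property the article attributes to the MoE design.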
Perplexity CEO Aravind Srinivas noted that the company's agent router can now route tasks between high-performing proprietary models and Nemotron 3 Ultra, balancing performance and cost.

Nemotron 3 Nano, available now, is the most compute-efficient model in the family and is well suited to software debugging, content summarization, and information retrieval. It features a 1-million-token context window and uses the hybrid MoE architecture to deliver up to 4x higher token throughput than its predecessor, Nemotron 2 Nano, while reducing reasoning-token generation by up to 60%. Independent benchmarking by Artificial Analysis ranks it as the most open and efficient model of its size, with top-tier accuracy.

Nemotron 3 Super is optimized for low-latency, multi-agent collaboration on complex tasks, while Nemotron 3 Ultra serves as a high-level reasoning engine for deep research and strategic planning. Both Super and Ultra are trained with NVIDIA's 4-bit NVFP4 format on the Blackwell architecture, cutting memory use and accelerating training without sacrificing accuracy.

NVIDIA also released a comprehensive set of open tools and data, including three trillion tokens of pretraining, post-training, and reinforcement learning data, as well as the Nemotron Agentic Safety Dataset to help teams evaluate and improve agent safety. The open-source NeMo Gym and NeMo RL libraries provide training environments and post-training foundations, while NeMo Evaluator supports model validation. All are available on GitHub and Hugging Face.

Nemotron 3 Nano is already accessible on Hugging Face and through inference providers including Baseten, DeepInfra, Fireworks, FriendliAI, OpenRouter, and Together AI. It is also available on AWS via Amazon Bedrock, with support coming soon on Google Cloud, CoreWeave, Crusoe, Microsoft Foundry, Nebius, Nscale, and Yotta. Enterprise users can deploy it as an NVIDIA NIM microservice for secure, scalable, private operation on NVIDIA-accelerated infrastructure.
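As an intuition for the kind of 4-bit training format mentioned above, the sketch below quantizes a block of weights to the eight non-negative magnitudes representable by an E2M1 4-bit float, using one shared scale per block. This is a simplified approximation of block-scaled 4-bit floating point in general; NVFP4's exact scale encoding and training kernels are not reproduced here.

```python
# Illustrative block-scaled 4-bit float quantization (simplified; not
# NVIDIA's exact NVFP4 format or kernels).

# The 8 non-negative values representable by an E2M1 4-bit float.
E2M1 = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def quantize_block(values):
    """Map a block of floats to (scale, codes): sign + nearest E2M1 magnitude."""
    amax = max(abs(v) for v in values) or 1.0
    scale = amax / 6.0  # map the block's largest magnitude onto E2M1's max (6.0)
    codes = []
    for v in values:
        mag = abs(v) / scale
        idx = min(range(len(E2M1)), key=lambda i: abs(E2M1[i] - mag))
        codes.append((v < 0, idx))
    return scale, codes

def dequantize_block(scale, codes):
    """Reconstruct approximate floats from the shared scale and 4-bit codes."""
    return [(-1.0 if neg else 1.0) * E2M1[idx] * scale for neg, idx in codes]

block = [0.02, -0.11, 0.35, 0.6, -0.48, 0.07, 0.0, -0.25]
scale, codes = quantize_block(block)
approx = dequantize_block(scale, codes)
```

Each weight collapses to 4 bits plus a small amortized cost for the per-block scale, which is where the memory and bandwidth savings during training come from.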
Nemotron 3 Super and Ultra are expected to be available in the first half of 2026.
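Since several of the inference providers listed above expose OpenAI-compatible chat-completions endpoints, calling Nemotron 3 Nano for a task like content summarization can be sketched as below. The model identifier is a placeholder (an assumption, not a confirmed value); check your provider's model catalog for the exact id it lists.

```python
# Hedged sketch of an OpenAI-compatible chat-completions request body, as
# accepted by inference providers such as OpenRouter or Together AI.
import json

def build_chat_request(model, user_prompt, max_tokens=256):
    """Build an OpenAI-style chat-completions payload as a JSON string."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_prompt}],
        "max_tokens": max_tokens,
    }
    return json.dumps(payload)

# Placeholder model id -- substitute whatever id your provider lists.
body = build_chat_request("nvidia/nemotron-3-nano", "Summarize this changelog.")
```

The resulting JSON string would be POSTed to the provider's chat-completions endpoint with an API key; the body shape is the same across OpenAI-compatible providers, so only the base URL and model id change.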
