NVIDIA unveils open Nemotron 3 Super for agentic reasoning
NVIDIA has released Nemotron 3 Super, an open-source hybrid-architecture model designed for the demands of multi-agent systems. With 120 billion total parameters and 12 billion active parameters, the model aims to balance reasoning depth with computational efficiency and to address the "context explosion" and "cognitive tax" that agents face during long-running tasks.
Nemotron 3 Super uses a hybrid Mamba-Transformer mixture-of-experts (MoE) architecture. The Mamba layers process sequences in linear time and, together with a native one-million-token context window, help agents retain long-term memory and stay consistent with their goals. Transformer attention layers are interleaved throughout to provide precise retrieval of critical facts from large volumes of information. The model also introduces "latent MoE," which compresses the embedding space so that four times as many experts can be used at the same cost, allowing finer-grained task specialization. Combined with Multi-Token Prediction (MTP), the model improves logical reasoning and generation speed in both training and inference, and speeds up structured tool-calling tasks by up to three times.
For training, the model was pre-trained natively in NVFP4, NVIDIA's proprietary 4-bit floating-point format, which reduces GPU memory consumption while maintaining high accuracy. The training pipeline spans pre-training, supervised fine-tuning, and reinforcement learning across multiple environments, which supports robust performance in complex workflows. On the PinchBench evaluation suite, Nemotron 3 Super scored 85.6%, placing it among the leading open-source models in its class.
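The parameter figures and the "four times more experts at equivalent cost" claim can be made concrete with some back-of-the-envelope arithmetic. The sketch below is illustrative only: it assumes each expert is a plain two-matrix feed-forward block, ignores the cost of the router and of projecting into and out of the compressed space, and uses hypothetical layer sizes (4096, 2048, 64 experts) that are not taken from NVIDIA's release.

```python
def active_fraction(total_params: float, active_params: float) -> float:
    """Share of the model's weights that are active, given total and active counts."""
    return active_params / total_params


def expert_params(width: int, ffn_width: int) -> int:
    """Weights in one simplified expert: an up-projection plus a down-projection."""
    return 2 * width * ffn_width


def experts_at_equal_cost(width: int, ffn_width: int,
                          n_experts: int, compression: int) -> int:
    """Number of experts that fit in the same weight budget when experts
    operate on a latent of size width // compression (an assumed model of
    'latent MoE', not NVIDIA's published design)."""
    budget = n_experts * expert_params(width, ffn_width)
    latent_width = width // compression
    return budget // expert_params(latent_width, ffn_width)


if __name__ == "__main__":
    # 12 billion active out of 120 billion total, as quoted in the article.
    print(f"active share: {active_fraction(120e9, 12e9):.0%}")
    # Hypothetical sizes: 64 full-width experts vs. experts on a 4x smaller latent.
    print("experts at equal cost:",
          experts_at_equal_cost(width=4096, ffn_width=2048,
                                n_experts=64, compression=4))
```

Under these simplifying assumptions, shrinking the expert input width by a factor of four shrinks each expert's weight count by the same factor, which is why the same budget accommodates four times as many experts.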
The model is fully open-sourced, including weights, datasets, and training recipes, so developers can deploy it freely on local hardware or in the cloud. NVIDIA provides fine-tuning guides and deployment toolkits compatible with agent frameworks such as OpenClaw for experimentation and benchmarking. The release of Nemotron 3 Super marks a new phase for open-source multi-agent AI, offering an efficient and reliable reasoning engine for high-value applications ranging from software engineering to cybersecurity.
