NVIDIA Unveils Advanced Nemotron and Cosmos Reasoning Models to Power Next-Gen AI Agents and Robotics

3 days ago

AI agents are expected to generate up to $450 billion in revenue gains and cost savings by 2028, according to Capgemini. As enterprises accelerate the development of intelligent agents, they are turning to advanced reasoning models to improve performance across enterprise and physical AI applications. At SIGGRAPH, NVIDIA unveiled significant upgrades to its Nemotron and Cosmos Reasoning Model families, which are intended to help organizations build smarter, more autonomous systems. Companies including CrowdStrike, Uber, Zoom, Magna, NetApp, and Amdocs are already using these models.
The new NVIDIA Nemotron Nano 2 and Llama Nemotron Super 1.5 models deliver the highest accuracy in their respective size categories for scientific reasoning, math, coding, tool-calling, instruction-following, and chat. These models enable AI agents to explore broader solution paths, make faster decisions, and deliver better results within strict time constraints.
Nemotron models are designed as the cognitive core of AI agents, providing strong reasoning capabilities while being optimized for efficiency. The latest versions feature a hybrid architecture, compact quantized formats, and a configurable thinking budget that reduces reasoning costs by up to 60% without sacrificing performance. Nemotron Nano 2 offers up to six times higher token-generation throughput than comparable models, while Llama Nemotron Super 1.5 achieves top-tier reasoning accuracy and is now available in NVFP4 format, delivering up to six times higher throughput on NVIDIA B200 GPUs compared to H100 GPUs.
NVIDIA also released the Llama Nemotron VLM Dataset v1, a new open training dataset with 3 million samples for optical character recognition, visual question answering, and image captioning. This dataset powers the Llama 3.1 Nemotron Nano VL 8B model and supports the development of more accurate vision-language models.
To improve decision-making, AI agents rely on retrieval-augmented generation to access up-to-date information. The newly released Llama 3.2 NeMo Retriever embedding model leads three visual document retrieval benchmarks (ViDoRe V1, ViDoRe V2, and MTEB VisualDocumentRetrieval), boosting agent accuracy. When combined with the AI-Q NVIDIA Blueprint, a deep research agent built using these models ranks No. 1 on the DeepResearch Bench for open and portable agents. NVIDIA NeMo and NIM microservices support the full lifecycle of AI agent development, from training and deployment to monitoring and optimization.
On the physical AI front, NVIDIA introduced Cosmos Reason, a 7-billion-parameter open reasoning vision-language model (VLM) designed to give robots and visual AI agents a deeper understanding of the physical world. Unlike traditional VLMs, Cosmos Reason incorporates structured reasoning to grasp concepts like physics, object permanence, and space-time relationships. It serves as a reasoning backbone for robot vision-language-action (VLA) models, enhances training data curation, and enables spatial-temporal understanding in real-world environments.
Cosmos Reason is being used across industries to advance autonomous systems. Uber is applying it to analyze autonomous vehicle behavior and summarize complex visual scenarios such as pedestrians crossing highways. Magna is integrating it into its City Delivery Platform to help delivery vehicles adapt quickly to new urban environments. Centific uses it to improve video intelligence for safety monitoring, while VAST uses it for real-time urban intelligence and incident detection. Ambient.ai is using Cosmos Reason’s physics-aware reasoning to automate safety checks, such as detecting missing personal protective equipment in industrial settings. NVIDIA’s own robotics team employs it for data filtering and as the “System 2” reasoning engine behind next-generation VLA models like NVIDIA Isaac GR00T NX.
The models will soon be available as NVIDIA NIM microservices, enabling secure, scalable deployment on any NVIDIA-accelerated infrastructure. They will also be accessible via Amazon Bedrock, Amazon SageMaker AI, Azure AI Foundry, Oracle Data Science Platform, and Google Vertex AI. Developers can try Cosmos Reason at build.nvidia.com or download it from Hugging Face and GitHub. Nemotron Nano 2 and Llama Nemotron Super 1.5 (NVFP4) will be available for download soon. Additional resources, including the Llama Nemotron VLM Dataset v1, are available on Hugging Face. For more insights, watch the NVIDIA Research special address at SIGGRAPH and learn how graphics, simulation, and AI are transforming industrial digitalization.