Nemotron-Pretraining-Dataset-sample: Sampled Dataset
Nemotron-Pretraining-Dataset-sample is a compact sampled version of the Nemotron pretraining dataset released by NVIDIA in 2025. The accompanying paper is "NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model".
The dataset contains 10 representative subsets drawn from different components of the full SFT and pretraining corpus. They cover high-quality question-answering data, extracted content focused on mathematics, code metadata, and SFT-style instruction data, which makes the sample suitable for inspection and quick experiments.