Nemotron-Post-Training-Dataset-v2 Post-training Dataset

  • Date: 2 months ago
  • Size: 36.78 GB
  • Organization: NVIDIA
  • Paper URL: 2508.14444
  • License: CC BY 4.0

Nemotron-Post-Training-Dataset-v2 was released by NVIDIA in 2025 as an extension of its existing post-training corpus. It expands the SFT and RL data to five target languages (Spanish, French, German, Italian, and Japanese) and covers mathematics, code, STEM (science, technology, engineering, and mathematics), dialogue, and other scenarios, with the aim of improving a model's reasoning and instruction-following capabilities. It also supports metadata-based filtering and provides examples of typical subsets. The dataset supports the release of, and alignment research on, the Nemotron-Nano-9B-v2 series and is one of that series' public post-training corpora, which makes it easier for users to reproduce experiments and build on them. The associated paper is "NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model".

Samples are distributed with metadata and can be filtered:

  • Filtered download: samples can be quickly filtered and downloaded by metadata such as category, language, and source model
  • Categories and sizes: math (239,467); code (175,000); stem (355,000); chat (627,720)
  • Multi-language coverage: ja, de, it, es, fr
  • Source: synthesized from multiple large models (such as DeepSeek-R1-0528 and the Qwen 2.5 and Qwen 3 series)
  • Annotation format: some samples provide two responses, one with reasoning on and one with reasoning off; the reasoning trace is in English
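
The metadata filtering described above can be sketched in a few lines of Python. This is a minimal illustration, not the dataset's own tooling: the field names `category`, `language`, and `source_model`, the helper names `filter_samples` and `count_by`, and the toy records are all assumptions made for the example, so check the README shipped with the dataset for the actual schema before adapting it.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, Optional

Sample = Dict[str, str]

# Toy records for illustration only. The keys "category", "language" and
# "source_model" are assumed names, not taken from the dataset's schema.
TOY_SAMPLES = [
    {"id": "a", "category": "math", "language": "ja", "source_model": "DeepSeek-R1-0528"},
    {"id": "b", "category": "code", "language": "de", "source_model": "DeepSeek-R1-0528"},
    {"id": "c", "category": "math", "language": "fr", "source_model": "hypothetical-model"},
    {"id": "d", "category": "chat", "language": "ja", "source_model": "hypothetical-model"},
]


def filter_samples(
    samples: Iterable[Sample],
    category: Optional[str] = None,
    language: Optional[str] = None,
    source_model: Optional[str] = None,
) -> Iterator[Sample]:
    """Yield samples whose metadata matches every criterion that is not None."""
    wanted = {"category": category, "language": language, "source_model": source_model}
    active = {key: value for key, value in wanted.items() if value is not None}
    for sample in samples:
        if all(sample.get(key) == value for key, value in active.items()):
            yield sample


def count_by(samples: Iterable[Sample], key: str) -> Counter:
    """Count samples per value of one metadata field."""
    return Counter(sample.get(key, "unknown") for sample in samples)


if __name__ == "__main__":
    ja_math = list(filter_samples(TOY_SAMPLES, category="math", language="ja"))
    print([sample["id"] for sample in ja_math])
    print(count_by(TOY_SAMPLES, "category"))
```

Because `filter_samples` is a generator, it can be applied to records streamed from disk without loading the full 36.78 GB archive into memory.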

Nemotron-Post-Training-Dataset-v2.torrent
Seeding: 1, Downloading: 0, Completed: 25, Total Downloads: 75
  • Nemotron-Post-Training-Dataset-v2/
    • README.md (1.94 KB)
    • README.txt (3.88 KB)
    • data/
      • Nemotron-Post-Training-Dataset-v2.zip (36.78 GB)
