HyperAI

Llama-Nemotron Reasoning Dataset

Date: a month ago
Size: 26.85 GB
Organization: NVIDIA
Publish URL: huggingface.co


This is a high-quality, multi-domain reasoning dataset released by NVIDIA in 2025. The associated paper is "Llama-Nemotron: Efficient Reasoning Models". The dataset is intended to improve the performance of large language models on tasks such as mathematics, code, scientific reasoning, and instruction following, and to help the Llama-3.1/3.3-Nemotron series of models reason more efficiently.

The dataset contains approximately 22.06 million math samples and approximately 10.1 million code samples; the remainder covers science and instruction following. The data was generated using multiple models, including Llama-3.3-70B-Instruct, DeepSeek-R1, and Qwen-2.5, and covers a range of reasoning styles and problem-solving paths suited to large-scale model training.

Llama-Nemotron.torrent
Seeding: 1, Downloading: 0, Completed: 6, Total Downloads: 10
  • Llama-Nemotron/
    • README.md (1.4 KB)
    • README.txt (2.8 KB)
    • data/
      • Llama-Nemotron.zip (26.85 GB)
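
Once the torrent has finished downloading, it can be useful to check what the archive holds before extracting all 26.85 GB. The following is a minimal sketch using only Python's standard `zipfile` module. The path mirrors the file listing above and assumes the torrent was saved in the current directory. The helper name `summarize_archive` is hypothetical, and no assumption is made about the file names or formats inside the archive.

```python
import zipfile
from pathlib import Path


def summarize_archive(zip_path):
    """Return (member_name, uncompressed_bytes) for each file in the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return [
            (info.filename, info.file_size)
            for info in archive.infolist()
            if not info.is_dir()  # skip directory entries
        ]


if __name__ == "__main__":
    # Path taken from the torrent listing; adjust it to your download location.
    archive_path = Path("Llama-Nemotron/data/Llama-Nemotron.zip")
    if archive_path.exists():
        for name, size in summarize_archive(archive_path):
            print(f"{size / 1e9:8.2f} GB  {name}")
    else:
        print(f"{archive_path} not found; download the torrent first.")
```

Reading the archive's central directory this way does not decompress any data, so it stays fast even for a large file.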