HyperAI

OpenR1-Math-220k Mathematical Reasoning Dataset

Date: 2 months ago

Size: 3.51 GB

Organization:

Publish URL: huggingface.co

License: Apache 2.0

OpenR1-Math-220k is a large-scale mathematical reasoning dataset released by the Open R1 team in 2025 to fill the gap left by the unreleased synthetic data behind DeepSeek R1. The dataset contains 220,000 high-quality mathematical problems with their reasoning traces, drawn from 800,000 reasoning traces generated by DeepSeek R1.

The dataset is divided into two subsets:

  • default (94k problems): the subset that performs best when used for supervised fine-tuning (SFT).
  • extended (131k problems): adds further NuminaMath 1.5 data sources, such as cn_k12, and so provides more reasoning traces.
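
As a rough illustration of how the two subsets might be selected in practice, the following sketch wraps the Hugging Face `datasets` loader. The repository id `open-r1/OpenR1-Math-220k`, the configuration names `default` and `extended`, and the `train` split are assumptions not stated on this page and should be checked against the dataset card on huggingface.co; `load_subset` is a hypothetical helper name.

```python
# Approximate subset sizes as given in the description above.
SUBSETS = {"default": 94_000, "extended": 131_000}

# Assumed repository id; verify it on huggingface.co before use.
REPO_ID = "open-r1/OpenR1-Math-220k"


def load_subset(name: str = "default"):
    """Load one subset of the dataset (downloads data on first call)."""
    if name not in SUBSETS:
        raise ValueError(
            f"unknown subset {name!r}; expected one of {sorted(SUBSETS)}"
        )
    # Imported lazily so this module can be imported without `datasets` installed.
    from datasets import load_dataset

    # The configuration name and the "train" split are assumptions.
    return load_dataset(REPO_ID, name, split="train")


if __name__ == "__main__":
    ds = load_subset("default")
    print(ds)
```

Validating the subset name before calling the loader keeps a typo from triggering a download attempt for a configuration that does not exist.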
OpenR1-Math-220k.torrent
  • OpenR1-Math-220k/
    • README.md (1.29 KB)
    • README.txt (2.58 KB)
    • data/
      • OpenR1-Math-220k.zip (3.51 GB)
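
Once the torrent has finished downloading, the archive under `data/` can be inspected and unpacked with Python's standard `zipfile` module. This is a minimal sketch: the internal layout of `OpenR1-Math-220k.zip` is not documented on this page, so the code only lists whatever members it finds, and the helper names `list_archive` and `extract_archive` are hypothetical.

```python
import zipfile
from pathlib import Path


def list_archive(zip_path):
    """Return (member name, uncompressed size in bytes) for each file in the archive."""
    with zipfile.ZipFile(zip_path) as zf:
        return [
            (info.filename, info.file_size)
            for info in zf.infolist()
            if not info.is_dir()
        ]


def extract_archive(zip_path, dest):
    """Extract every member of the archive into dest and return dest as a Path."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(dest)
    return dest


if __name__ == "__main__":
    # Path taken from the file tree above, relative to the download directory.
    archive = Path("OpenR1-Math-220k") / "data" / "OpenR1-Math-220k.zip"
    for name, size in list_archive(archive):
        print(f"{size:>14,d}  {name}")
    extract_archive(archive, "OpenR1-Math-220k/extracted")
```

Listing the members first is a cheap way to confirm the download is intact before extracting several gigabytes.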