HyperAI

OpenMathReasoning Mathematical Reasoning Dataset

Date: 2 months ago

Size: 40.89 GB

Organization: NVIDIA

Publish URL: huggingface.co

OpenMathReasoning is a large-scale, high-quality dataset for mathematical reasoning, released by NVIDIA in 2025. It accompanies the paper "AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset" and was used to train the OpenMath-Nemotron series of models, which achieve strong results on mathematical reasoning tasks.

The dataset carries fine-grained annotations, including problem type labels, detailed solution steps, and difficulty level classification. Drawn from professional mathematics sources and online communities, the data supports research into the mathematical reasoning process and the optimization of problem-solving models, with applications such as intelligent math tutoring systems, math competition tools, and automated scientific computing.

The dataset contains 540K unique math problems sourced from the Art of Problem Solving (AoPS) forums, along with:

  • 3.2M long chain-of-thought (CoT) solutions
  • 1.7M long tool-integrated reasoning (TIR) solutions
  • 566K samples that select the most promising solution from many candidates (GenSelect)
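
To make the relative weight of the three sample types easier to see, the sketch below totals the figures quoted in the list and prints each type's share. The counts are the rounded values from this page, not exact row counts, and the names `SOLUTION_COUNTS` and `composition` are illustrative only.

```python
# Rounded sample counts as quoted in the list above (not exact row counts).
SOLUTION_COUNTS = {
    "CoT": 3_200_000,       # long chain-of-thought solutions
    "TIR": 1_700_000,       # long tool-integrated reasoning solutions
    "GenSelect": 566_000,   # solution-selection samples
}


def composition(counts):
    """Return (total, shares), where shares maps each name to its fraction of the total."""
    total = sum(counts.values())
    shares = {name: n / total for name, n in counts.items()}
    return total, shares


total, shares = composition(SOLUTION_COUNTS)
print(f"total samples: {total:,}")
for name, share in shares.items():
    print(f"{name:>9}: {share:6.1%}")
```

With these rounded figures the three types sum to about 5.47M samples, with CoT solutions making up the majority.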

OpenMathReasoning.torrent
  • OpenMathReasoning/
    • README.md (1.78 KB)
    • README.txt (3.55 KB)
    • data/
      • OpenMathReasoning.zip (40.89 GB)
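
Before extracting a roughly 40 GB archive, it can help to look at what it contains. The following is a minimal sketch using Python's standard `zipfile` module. It assumes the torrent was downloaded into the current directory with the layout shown in the file listing above; the internal contents of the archive are not documented on this page, so the script only enumerates whatever entries it finds. The helper names `list_archive` and `human_size` are hypothetical.

```python
import zipfile
from pathlib import Path

# Assumed location, following the file listing above; adjust to your download directory.
ARCHIVE = Path("OpenMathReasoning") / "data" / "OpenMathReasoning.zip"


def list_archive(path):
    """Return (filename, uncompressed_size_in_bytes) for every entry in a zip file."""
    with zipfile.ZipFile(path) as zf:
        return [(info.filename, info.file_size) for info in zf.infolist()]


def human_size(n_bytes):
    """Format a byte count using binary units, capped at GB."""
    size = float(n_bytes)
    for unit in ("B", "KB", "MB", "GB"):
        if size < 1024 or unit == "GB":
            return f"{size:.2f} {unit}"
        size /= 1024


if ARCHIVE.exists():
    # Reads only the zip's central directory; nothing is extracted.
    for name, size in list_archive(ARCHIVE):
        print(f"{human_size(size):>12}  {name}")
else:
    print(f"archive not found at {ARCHIVE}")
```

Listing entries reads only the zip's central directory, so it is fast even for a large archive and does not need extra disk space.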