NuminaMath-CoT Mathematics Competition Problem Dataset
License: CC BY-NC-SA 3.0
This dataset was released by AI-MO in 2024 and contains more than 860k math competition problem-solution pairs, each with a solution written in the Chain of Thought (CoT) reasoning format. Its sources include Chinese high school math exercises as well as problems from American and International Mathematical Olympiad competitions. The data was collected mainly from online exam paper PDFs and math discussion forums. The processing steps are: (a) OCR of the original PDFs, (b) segmentation into problem-solution pairs, (c) translation into English, (d) realignment of the solutions into the CoT reasoning format, and (e) formatting of the final answer.
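To illustrate what step (e) means for a consumer of the data, here is a minimal sketch of reading the final answer out of one problem-solution pair. It rests on assumptions not stated above: the field names `problem` and `solution`, the sample record itself, and the convention that the final answer is wrapped in `\boxed{...}` are all hypothetical here, and `extract_boxed` is a helper written for this example, not part of any dataset tooling.

```python
from typing import Optional


def extract_boxed(solution: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in a solution, or None."""
    marker = r"\boxed{"
    start = solution.rfind(marker)
    if start == -1:
        return None  # no boxed answer present
    i = start + len(marker)
    depth = 1  # we are inside the opening brace of \boxed{
    out = []
    while i < len(solution):
        ch = solution[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(out)
        out.append(ch)
        i += 1
    return None  # braces never balanced


# Hypothetical record; the field names are assumptions for illustration only.
record = {
    "problem": "Solve $2x = 1$.",
    "solution": r"Dividing both sides by 2 gives $x = \boxed{\frac{1}{2}}$.",
}

final_answer = extract_boxed(record["solution"])
```

Brace counting is used instead of a simple regular expression because answers such as fractions contain nested braces that a non-greedy pattern would cut short.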