HyperAI

DeepMath-103K Mathematical Reasoning Dataset

Date

15 days ago

Organization

Shanghai Jiao Tong University

Publish URL

huggingface.co

DeepMath-103K is a large-scale dataset for training and evaluating mathematical reasoning models, jointly released by Tencent and Shanghai Jiao Tong University in 2025. The accompanying paper is "DeepMath-103K: A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning".

The dataset concentrates on math problems at difficulty levels 5-9, covering algebra, calculus, number theory, geometry, probability, discrete mathematics, and other fields, with an emphasis on challenging, complex reasoning. It has also been decontaminated in detail against common benchmarks using semantic matching, to minimize test set leakage and promote fair model evaluation.
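To make the idea of decontamination concrete, here is a minimal sketch of filtering training problems against a benchmark. It is not the authors' pipeline: it swaps in a simple token-overlap (Jaccard) score as a stand-in for semantic matching, and the function names, the 0.8 threshold, and the toy problems are all assumptions made for illustration.

```python
import re


def tokens(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def decontaminate(candidates, benchmark, threshold=0.8):
    """Keep only candidates whose similarity to every benchmark problem
    is below the threshold (hypothetical value, chosen for illustration)."""
    bench_tokens = [tokens(q) for q in benchmark]
    kept = []
    for question in candidates:
        q_tokens = tokens(question)
        if all(jaccard(q_tokens, b) < threshold for b in bench_tokens):
            kept.append(question)
    return kept


# Hypothetical toy data, not taken from DeepMath-103K or any real benchmark.
benchmark = ["Find the sum of all positive divisors of 28."]
candidates = [
    "Find the sum of all positive divisors of 28",   # near-duplicate, dropped
    "Evaluate the integral of x^2 from 0 to 3.",     # unrelated, kept
]

clean = decontaminate(candidates, benchmark)
print(clean)
```

A lexical score like this only catches near-verbatim overlap. Matching paraphrased problems, which is what semantic matching targets, would require a semantic similarity measure in place of `jaccard`.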

Hierarchical classification of math topics covered by DeepMath-103K