We-Math2.0-Standard Visual Mathematical Reasoning Benchmark Dataset
Date
Size
Publish URL
Paper URL
License
Non-commercial use
We-Math2.0-Standard is a standard dataset for visual mathematical reasoning, released in 2025 by Beijing University of Posts and Telecommunications, Tencent, and Tsinghua University. The accompanying paper is "WE-MATH 2.0: A Versatile MathBook System for Incentivizing Visual Mathematical Reasoning". The dataset aims to provide a diagnosable, explainable, and comparable basis for evaluation.
The dataset builds a unified label space around 1,819 precisely defined knowledge principles. Each question is explicitly annotated with the principle it relies on and is rigorously curated, which gives broad and balanced coverage overall and, in particular, strengthens mathematical subfields and question types that were previously underrepresented. The dataset adopts a dual expansion design:
- First, multiple images per question test the integration and alignment of visual evidence from several sources;
- Second, multiple questions per image test multi-principle transfer and conceptual flexibility within the same visual context.
Each example consists of an image and a text stem, accompanied by annotations of the knowledge principles the question relies on and the standard answer.
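To make the example layout and the dual expansion design concrete, here is a minimal Python sketch. It is an illustration only: the `Example` class, its field names, and the helpers `group_by_image` and `principle_counts` are hypothetical and do not reflect the released file schema, which should be checked against the actual dataset files.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Example:
    """Hypothetical record layout; field names are assumptions, not the released schema."""
    question_id: str
    images: List[str]      # one or more image paths (multiple images per question)
    stem: str              # text stem of the question
    principles: List[str]  # knowledge principles the question relies on
    answer: str            # standard answer


def group_by_image(examples: List[Example]) -> Dict[str, List[str]]:
    """Map each image to the ids of the questions that use it (multiple questions per image)."""
    groups = defaultdict(list)
    for ex in examples:
        for image in ex.images:
            groups[image].append(ex.question_id)
    return dict(groups)


def principle_counts(examples: List[Example]) -> Counter:
    """Count how many questions are annotated with each knowledge principle."""
    counts = Counter()
    for ex in examples:
        counts.update(ex.principles)
    return counts
```

Grouping by image surfaces the questions that share one visual context, and the principle counts give a quick check of how evenly the annotated principles are covered in whatever subset is loaded.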
