We-Math2.0-Standard Visual Mathematical Reasoning Benchmark Dataset

Date

3 months ago

Size

369.86 MB

Organization

Tsinghua University
Tencent

Paper URL

2508.10433

License

Non-Commercial

We-Math2.0-Standard is a standard dataset for visual mathematical reasoning, released in 2025 by Beijing University of Posts and Telecommunications, Tencent, and Tsinghua University. It accompanies the paper "WE-MATH 2.0: A Versatile MathBook System for Incentivizing Visual Mathematical Reasoning" and aims to provide a diagnosable, explainable, and comparable basis for evaluation.

The dataset builds a unified label space around 1,819 precisely defined knowledge principles. Each question is explicitly annotated with the principle it relies on and is rigorously curated, which gives broad, balanced coverage overall and strengthens mathematical subfields and question types that were previously underrepresented. The dataset adopts a dual expansion design:

  • First, multiple images per question test the integration and alignment of visual evidence from several sources;
  • Second, multiple questions per image test multi-principle transfer and conceptual flexibility within the same visual context.

Each example consists of an image and a text stem, accompanied by annotations of the knowledge principles the question relies on and its standard answer.
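To make the example structure and the dual expansion design concrete, here is a minimal Python sketch. The class name, field names, and helper functions are all hypothetical; the page does not document the actual file schema, so check the README files shipped with the dataset before relying on any of them.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Example:
    # Hypothetical schema: these field names are assumptions, not the
    # dataset's documented format.
    question_id: str
    image_paths: List[str]  # one image, or several for multi-image questions
    stem: str               # the question text
    answer: str             # the standard answer
    principles: List[str] = field(default_factory=list)  # knowledge principle labels


def is_multi_image(example: Example) -> bool:
    """True when a question draws on more than one image."""
    return len(example.image_paths) > 1


def questions_per_image(examples: List[Example]) -> Dict[str, List[str]]:
    """Map each image path to the IDs of the questions that use it."""
    by_image = defaultdict(list)
    for example in examples:
        for path in example.image_paths:
            by_image[path].append(example.question_id)
    return dict(by_image)
```

Under these assumptions, images that map to more than one question ID correspond to the "multiple questions per image" expansion, and examples for which `is_multi_image` returns true correspond to the "multiple images per question" expansion.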

Dataset Overview

We-Mathv2-Standard.torrent
  • We-Mathv2-Standard/
    • README.md (1.82 KB)
    • README.txt (3.65 KB)
    • data/
      • We-Math2.0-Standard.zip (369.86 MB)
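Since the page does not describe what the archive contains, a quick way to get oriented after downloading is to count its members by file extension without extracting anything. The sketch below uses only the Python standard library; the function name is hypothetical, and the path you pass in depends on where you saved the file.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def summarize_archive(zip_path):
    """Count the files in a zip archive by lowercase extension, without extracting."""
    with zipfile.ZipFile(zip_path) as archive:
        names = [info.filename for info in archive.infolist() if not info.is_dir()]
    # Files with no extension are grouped under the "(none)" key.
    return Counter(PurePosixPath(name).suffix.lower() or "(none)" for name in names)
```

For instance, `summarize_archive("data/We-Math2.0-Standard.zip")` would return a counter of extensions, which shows how many image files and annotation files are present before you commit to unpacking the full archive.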
