Eurus-2-RL-Data Mathematical Programming Problem Training Dataset

Eurus-2-RL-Data is a high-quality reinforcement learning training dataset of mathematical and programming problems. It accompanies the blog post "Process Reinforcement through Implicit Rewards". The math problems are partly derived from NuminaMath-CoT and range in difficulty from Chinese high school mathematics to the International Mathematical Olympiad. The programming problems are competition-level questions drawn from several sources, including APPS, CodeContests, TACO, and Codeforces.

To ensure data quality, Eurus-2-RL-Data was rigorously cleaned and filtered. Math problems were screened with advanced reasoning models such as Qwen-QwQ to remove questions that were unsolvable, mismatched with their answers, or paired with wrong answers, and multiple-choice questions were converted into open-ended ones. For programming problems, the cleaning mainly consisted of removing duplicates. After this processing, the dataset contains about 455k math problems and 27k programming problems. Its main application is reinforcement learning training for models that solve competition-level math and programming problems.
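The description above says that duplicate programming problems were removed but does not say how duplicates were detected. The following is a minimal sketch of one plausible approach, assuming exact matching on the problem statement after normalizing case and whitespace. The field names `prompt` and `source`, the helper functions, and the sample records are hypothetical and are not taken from the dataset.

```python
import hashlib
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting
    differences do not hide duplicates."""
    return re.sub(r"\s+", " ", text).strip().lower()


def deduplicate(problems: list[dict]) -> list[dict]:
    """Keep the first occurrence of each distinct problem statement.

    Assumption: each record stores its statement under a hypothetical
    "prompt" key; the real dataset schema may differ.
    """
    seen: set[str] = set()
    unique = []
    for problem in problems:
        key = normalize(problem["prompt"]).encode("utf-8")
        digest = hashlib.sha256(key).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(problem)
    return unique


# Hypothetical sample records, for illustration only.
sample = [
    {"source": "APPS", "prompt": "Given an array, print its maximum element."},
    {"source": "TACO", "prompt": "Given an array,   print its MAXIMUM element."},
    {"source": "Codeforces", "prompt": "Count the inversions in a permutation."},
]

deduped = deduplicate(sample)
print(len(deduped), [p["source"] for p in deduped])
```

Exact matching after normalization only catches near-verbatim copies. Problems that are reworded across platforms would need a fuzzier comparison, which this sketch does not attempt.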

Citation

```latex
@article{yuan2024implicitprm,
title={Free Process Rewards without Process Labels},
author={Lifan Yuan and Wendi Li and Huayu Chen and Ganqu Cui and Ning Ding and Kaiyan Zhang and Bowen Zhou and Zhiyuan Liu and Hao Peng},
journal={arXiv preprint arXiv:2412.01981},
year={2024}
}
```
Eurus-2-RL-Data.torrent
  • Eurus-2-RL-Data/
    • README.md
      1.82 KB
    • README.txt
      3.64 KB
    • data/
      • Eurus-2-RL-Data.zip
        1.16 GB
