- Date: a year ago
- Size: 1.16 GB
- Paper URL: curvy-check-498.notion.site
- Tags: LLM, Mathematics, Language, Reinforcement Learning, Model Training

Eurus-2-RL-Data is a high-quality dataset for reinforcement learning (RL) training on mathematics and programming problems. It accompanies the blog post "Process Reinforcement through Implicit Rewards".

The math problems are partly derived from NuminaMath-CoT and range from Chinese high school mathematics to International Mathematical Olympiad problems. The programming problems come from several platforms, including APPS, CodeContests, TACO, and Codeforces, and are mostly at programming-competition level.

To ensure data quality, the dataset was cleaned and filtered. Math problems were screened with reasoning models such as Qwen-QwQ to remove questions that were unsolvable, mismatched, or had wrong answers, and multiple-choice questions were converted into open-ended ones. Programming problems were mainly deduplicated. After these steps, the dataset contains about 455k math problems and 27k programming problems. It is intended mainly for RL training of models on competition-level math and programming problems.
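The cleaning steps described above could look roughly like the following sketch. It is not the authors' pipeline: the `Problem` record, its field names, and the exact-match comparison are all assumptions made for illustration, and the reasoning-model call is replaced by a precomputed `verifier_answer` field. The conversion of multiple-choice questions is not shown.

```python
import hashlib
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Problem:
    # Hypothetical record layout; the real dataset schema may differ.
    prompt: str
    answer: str
    kind: str  # "math" or "code"
    # Assumed: answer produced by a reasoning model, None if it could not solve it.
    verifier_answer: Optional[str] = None


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different copies compare equal."""
    return re.sub(r"\s+", " ", text.strip().lower())


def dedupe(problems: List[Problem]) -> List[Problem]:
    """Keep the first occurrence of each normalized prompt."""
    seen = set()
    kept = []
    for p in problems:
        key = hashlib.sha256(normalize(p.prompt).encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            kept.append(p)
    return kept


def keep_math(p: Problem) -> bool:
    """Keep a math problem only if the verifier solved it and matches the reference."""
    if p.verifier_answer is None:
        return False  # treated as unsolvable
    return normalize(p.verifier_answer) == normalize(p.answer)


def clean(problems: List[Problem]) -> List[Problem]:
    """Filter math problems by verifier agreement and deduplicate code problems."""
    math = [p for p in problems if p.kind == "math" and keep_math(p)]
    code = dedupe([p for p in problems if p.kind == "code"])
    return math + code
```

Exact string matching is a deliberately simple stand-in here; comparing mathematical answers in practice usually needs a symbolic or numeric equivalence check.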

Citation

```latex
@article{yuan2024implicitprm,
  title={Free Process Rewards without Process Labels},
  author={Lifan Yuan and Wendi Li and Huayu Chen and Ganqu Cui and Ning Ding and Kaiyan Zhang and Bowen Zhou and Zhiyuan Liu and Hao Peng},
  journal={arXiv preprint arXiv:2412.01981},
  year={2024}
}
```

Eurus-2-RL-Data.torrent

Seeding: 0 · Downloading: 3 · Completed: 174 · Total downloads: 236

Eurus-2-RL-Data/
- README.md (1.82 KB)
- README.txt (3.64 KB)

This dataset is contributed by community users and is intended for educational and informational purposes only. If any content involves copyright infringement, please contact us at [email protected] for prompt review and removal.
