
REASONING GYM: Reasoning Environments for Reinforcement Learning with Verifiable Rewards

Stojanovski, Zafir ; Stanley, Oliver ; Sharratt, Joe ; Jones, Richard ; Adefioye, Abdulhakeem ; Kaddour, Jean ; Köpf, Andreas
Published: 6/3/2025
Abstract

We introduce Reasoning Gym (RG), a library of reasoning environments for reinforcement learning with verifiable rewards. It provides over 100 data generators and verifiers spanning multiple domains including algebra, arithmetic, computation, cognition, geometry, graph theory, logic, and various common games. Its key innovation is the ability to generate virtually infinite training data with adjustable complexity, unlike most previous reasoning datasets, which are typically fixed. This procedural generation approach allows for continuous evaluation across varying difficulty levels. Our experimental results demonstrate the efficacy of RG in both evaluating and reinforcement learning of reasoning models.
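To make the idea of a procedural generator paired with a verifier concrete, here is a minimal Python sketch. It does not use the Reasoning Gym API. The names `Task`, `generate_task`, `verify`, and `make_dataset`, and the `num_terms` difficulty knob, are all hypothetical and chosen only for illustration. It assumes a toy arithmetic task in which difficulty is the number of terms in a sum and the reward is an exact-match check.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Task:
    question: str
    answer: str


def generate_task(rng: random.Random, num_terms: int) -> Task:
    """Hypothetical generator: build a sum of `num_terms` random integers."""
    terms = [rng.randint(1, 99) for _ in range(num_terms)]
    question = " + ".join(str(t) for t in terms) + " = ?"
    return Task(question=question, answer=str(sum(terms)))


def verify(task: Task, model_answer: str) -> float:
    """Hypothetical verifier: reward 1.0 for an exact match, otherwise 0.0."""
    return 1.0 if model_answer.strip() == task.answer else 0.0


def make_dataset(seed: int, size: int, num_terms: int) -> List[Task]:
    """Generate `size` tasks; the same seed always yields the same tasks."""
    rng = random.Random(seed)
    return [generate_task(rng, num_terms) for _ in range(size)]


# Same generator, two difficulty levels (assumed knob: number of terms).
easy = make_dataset(seed=0, size=3, num_terms=2)
hard = make_dataset(seed=0, size=3, num_terms=6)
```

Because tasks are drawn from a seeded generator rather than a fixed file, any number of fresh examples can be produced at any difficulty, and every answer can be scored automatically. This is the property the abstract contrasts with fixed reasoning datasets.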