REASONING GYM: Reasoning Environments for Reinforcement Learning with Verifiable Rewards

Stojanovski, Zafir ; Stanley, Oliver ; Sharratt, Joe ; Jones, Richard ; Adefioye, Abdulhakeem ; Kaddour, Jean ; Köpf, Andreas
Published: 6/3/2025
Abstract

We introduce Reasoning Gym (RG), a library of reasoning environments for reinforcement learning with verifiable rewards. It provides over 100 data generators and verifiers spanning multiple domains, including algebra, arithmetic, computation, cognition, geometry, graph theory, logic, and various common games. Its key innovation is the ability to generate virtually infinite training data with adjustable complexity, unlike most previous reasoning datasets, which are typically fixed. This procedural generation approach allows for continuous evaluation across varying difficulty levels. Our experimental results demonstrate the efficacy of RG for both evaluating reasoning models and training them with reinforcement learning.
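
To make the idea of a procedural generator paired with a verifier concrete, here is a minimal Python sketch. It is an illustration under stated assumptions and not the Reasoning Gym API: the names `Task`, `generate_task`, and `verify`, the integer `difficulty` knob, and the exact-match scoring rule are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class Task:
    question: str
    answer: str


def generate_task(seed: int, difficulty: int) -> Task:
    """Hypothetical generator: an addition problem whose size grows with difficulty."""
    # A seeded generator makes each (seed, difficulty) pair reproducible,
    # so new seeds yield an effectively unbounded stream of distinct tasks.
    rng = random.Random(seed)
    n_terms = difficulty + 1        # more terms at higher difficulty
    max_value = 10 ** difficulty    # larger operands at higher difficulty
    terms = [rng.randint(1, max_value) for _ in range(n_terms)]
    question = " + ".join(str(t) for t in terms) + " = ?"
    return Task(question=question, answer=str(sum(terms)))


def verify(task: Task, model_answer: str) -> float:
    """Hypothetical verifier: reward 1.0 for an exact match, otherwise 0.0."""
    return 1.0 if model_answer.strip() == task.answer else 0.0
```

Because the reference answer is computed at generation time, the reward can be checked automatically, and raising `difficulty` gives harder instances of the same task family without collecting new data.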