SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond

Recent advances such as OpenAI-o1 and DeepSeek R1 have demonstrated the potential of Reinforcement Learning (RL) to enhance reasoning abilities in Large Language Models (LLMs). While open-source replication efforts have primarily focused on mathematical and coding domains, methods and resources for developing general reasoning capabilities remain underexplored. This gap is partly due to the challenge of collecting diverse and verifiable reasoning data suitable for RL. We hypothesize that logical reasoning is critical for developing general reasoning capabilities, as logic forms a fundamental building block of reasoning. In this work, we present SynLogic, a data synthesis framework and dataset that generates diverse logical reasoning data at scale, encompassing 35 diverse logical reasoning tasks. The SynLogic approach enables controlled synthesis of data with adjustable difficulty and quantity. Importantly, all examples can be verified by simple rules, making them ideally suited for RL with verifiable rewards. In our experiments, we validate the effectiveness of RL training on the SynLogic dataset based on 7B and 32B models. SynLogic leads to state-of-the-art logical reasoning performance among open-source datasets, surpassing DeepSeek-R1-Distill-Qwen-32B by 6 points on BBEH. Furthermore, mixing SynLogic data with mathematical and coding tasks improves the training efficiency of these domains and significantly enhances reasoning generalization. Notably, our mixed training model outperforms DeepSeek-R1-Zero-Qwen-32B across multiple benchmarks. These findings position SynLogic as a valuable resource for advancing the broader reasoning capabilities of LLMs. We open-source both the data synthesis pipeline and the SynLogic dataset at https://github.com/MiniMax-AI/SynLogic.
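
To illustrate what "verified by simple rules" can mean in an RL-with-verifiable-rewards setup, the following is a minimal sketch of a rule-based verifier that assigns a binary reward for a Sudoku-style logic task. This is not the SynLogic implementation; the task choice, the function name verify_sudoku, and the binary reward convention are assumptions made purely for illustration.

```python
from typing import List

def verify_sudoku(puzzle: List[List[int]], response: List[List[int]]) -> float:
    """Illustrative rule-based verifier (not the SynLogic code).

    Returns 1.0 if `response` is a valid completion of `puzzle`
    (0 marks an empty cell in the puzzle), else 0.0.
    """
    # Shape check: the response must be a 9x9 grid.
    if len(response) != 9 or any(len(row) != 9 for row in response):
        return 0.0

    # The response must preserve all given clues.
    for r in range(9):
        for c in range(9):
            if puzzle[r][c] != 0 and response[r][c] != puzzle[r][c]:
                return 0.0

    # Every row, column, and 3x3 box must contain the digits 1-9 exactly once.
    groups = []
    groups += [response[r] for r in range(9)]                          # rows
    groups += [[response[r][c] for r in range(9)] for c in range(9)]   # columns
    groups += [
        [response[br + r][bc + c] for r in range(3) for c in range(3)]
        for br in range(0, 9, 3) for bc in range(0, 9, 3)
    ]                                                                   # boxes
    return 1.0 if all(sorted(g) == list(range(1, 10)) for g in groups) else 0.0
```

Because such checks are deterministic and cheap, a reward of this form can be computed for every sampled response during RL training without a learned reward model.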