HyperAI
14 hours ago

Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving

Luoxin Chen, Jinming Gu, Liankai Huang, Wenhao Huang, Zhicheng Jiang, Allan Jie, Xiaoran Jin, Xing Jin, Chenggang Li, Kaijing Ma, Cheng Ren, Jiawei Shen, Wenlei Shi, Tong Sun, He Sun, Jiahui Wang, Siran Wang, Zhihong Wang, Chenrui Wei, Shufa Wei, Yonghui Wu, Yuchen Wu, Yihang Xia, Huajian Xin, Fan Yang, Huaiyuan Ying, Hongyi Yuan, Zheng Yuan, Tianyang Zhan, Chi Zhang, Yue Zhang, Ge Zhang, Tianyun Zhao, Jianqiu Zhao, Yichi Zhou, Thomas Hanwen Zhu
Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving
Abstract

LLMs have demonstrated strong mathematical reasoning abilities by leveragingreinforcement learning with long chain-of-thought, yet they continue tostruggle with theorem proving due to the lack of clear supervision signals whensolely using natural language. Dedicated domain-specific languages like Leanprovide clear supervision via formal verification of proofs, enabling effectivetraining through reinforcement learning. In this work, we proposeSeed-Prover, a lemma-style whole-proof reasoning model. Seed-Provercan iteratively refine its proof based on Lean feedback, proved lemmas, andself-summarization. To solve IMO-level contest problems, we design threetest-time inference strategies that enable both deep and broad reasoning.Seed-Prover proves 78.1% of formalized past IMO problems, saturates MiniF2F,and achieves over 50\% on PutnamBench, outperforming the previousstate-of-the-art by a large margin. To address the lack of geometry support inLean, we introduce a geometry reasoning engine Seed-Geometry, whichoutperforms previous formal geometry engines. We use these two systems toparticipate in IMO 2025 and fully prove 5 out of 6 problems. This workrepresents a significant advancement in automated mathematical reasoning,demonstrating the effectiveness of formal verification with longchain-of-thought reasoning.