6 months ago

Luoxin Chen Jinming Gu Liankai Huang Wenhao Huang Zhicheng Jiang Allan Jie Xiaoran Jin Xing Jin Chenggang Li Kaijing Ma

Abstract

LLMs have demonstrated strong mathematical reasoning abilities by leveragingreinforcement learning with long chain-of-thought, yet they continue tostruggle with theorem proving due to the lack of clear supervision signals whensolely using natural language. Dedicated domain-specific languages like Leanprovide clear supervision via formal verification of proofs, enabling effectivetraining through reinforcement learning. In this work, we proposeSeed-Prover, a lemma-style whole-proof reasoning model. Seed-Provercan iteratively refine its proof based on Lean feedback, proved lemmas, andself-summarization. To solve IMO-level contest problems, we design threetest-time inference strategies that enable both deep and broad reasoning.Seed-Prover proves 78.1% of formalized past IMO problems, saturates MiniF2F,and achieves over 50% on PutnamBench, outperforming the previousstate-of-the-art by a large margin. To address the lack of geometry support inLean, we introduce a geometry reasoning engine Seed-Geometry, whichoutperforms previous formal geometry engines. We use these two systems toparticipate in IMO 2025 and fully prove 5 out of 6 problems. This workrepresents a significant advancement in automated mathematical reasoning,demonstrating the effectiveness of formal verification with longchain-of-thought reasoning.

Source PDF View Code

Build AI with AI

From idea to launch — accelerate your AI development with free AI co-coding, out-of-the-box environment and best price of GPUs.

AI Co-coding

Ready-to-use GPUs

Best Pricing

Get Started View Pricing

HyperAI Newsletters

Subscribe to our latest updates

We will deliver the latest updates of the week to your inbox at nine o'clock every Monday morning

6 months ago

Luoxin Chen Jinming Gu Liankai Huang Wenhao Huang Zhicheng Jiang Allan Jie Xiaoran Jin Xing Jin Chenggang Li Kaijing Ma

Abstract

Source PDF View Code

Build AI with AI

From idea to launch — accelerate your AI development with free AI co-coding, out-of-the-box environment and best price of GPUs.

AI Co-coding

Ready-to-use GPUs

Best Pricing

Get Started View Pricing

HyperAI Newsletters

Subscribe to our latest updates

We will deliver the latest updates of the week to your inbox at nine o'clock every Monday morning

6 months ago

Luoxin Chen Jinming Gu Liankai Huang Wenhao Huang Zhicheng Jiang Allan Jie Xiaoran Jin Xing Jin Chenggang Li Kaijing Ma

Abstract

Source PDF View Code

Build AI with AI

From idea to launch — accelerate your AI development with free AI co-coding, out-of-the-box environment and best price of GPUs.

AI Co-coding

Ready-to-use GPUs

Best Pricing

Get Started View Pricing

HyperAI Newsletters

Subscribe to our latest updates

We will deliver the latest updates of the week to your inbox at nine o'clock every Monday morning

Command Palette

Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving

Luoxin Chen Jinming Gu Liankai Huang Wenhao Huang Zhicheng Jiang Allan Jie Xiaoran Jin Xing Jin Chenggang Li Kaijing Ma26 more

Abstract

Build AI with AI

HyperAI Newsletters

Command Palette

Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving

Luoxin Chen Jinming Gu Liankai Huang Wenhao Huang Zhicheng Jiang Allan Jie Xiaoran Jin Xing Jin Chenggang Li Kaijing Ma26 more

Abstract

Build AI with AI

HyperAI Newsletters

Command Palette

Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving

Luoxin Chen Jinming Gu Liankai Huang Wenhao Huang Zhicheng Jiang Allan Jie Xiaoran Jin Xing Jin Chenggang Li Kaijing Ma26 more

Abstract

Build AI with AI

HyperAI Newsletters

Luoxin Chen Jinming Gu Liankai Huang Wenhao Huang Zhicheng Jiang Allan Jie Xiaoran Jin Xing Jin Chenggang Li Kaijing Ma

Luoxin Chen Jinming Gu Liankai Huang Wenhao Huang Zhicheng Jiang Allan Jie Xiaoran Jin Xing Jin Chenggang Li Kaijing Ma

Luoxin Chen Jinming Gu Liankai Huang Wenhao Huang Zhicheng Jiang Allan Jie Xiaoran Jin Xing Jin Chenggang Li Kaijing Ma