HyperAI
HyperAI
Home
Console
Docs
News
Papers
Tutorials
Datasets
Wiki
SOTA
LLM Models
GPU Leaderboard
Events
Search
About
Terms of Service
Privacy Policy
English
HyperAI
HyperAI
Toggle Sidebar
Search the site…
⌘
K
Command Palette
Search for a command to run...
Console
Home
SOTA
Automated Theorem Proving
Automated Theorem Proving On Minif2F Test
Automated Theorem Proving On Minif2F Test
Metrics
ITP
Pass@32
cumulative
Results
Performance results of various models on this benchmark
Columns
Model Name
ITP
Pass@32
cumulative
Paper Title
Subgoal-XL
Isabelle
39.3
56.1
SubgoalXL: Subgoal-based Expert Learning for Theorem Proving
Evariste
Lean
-
41
HyperTree Proof Search for Neural Theorem Proving
Lean GPT-f
Lean
29.2
29.2
MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics
Lean Expert Iteration
Lean
34.5
36.6
Formal Mathematics Statement Curriculum Learning
GPT-f
Metamath
-
36.6
HyperTree Proof Search for Neural Theorem Proving
Evariste-1d
Lean
-
38.9
HyperTree Proof Search for Neural Theorem Proving
LLEMMA-7b
Lean
26.2
26.2
Llemma: An Open Language Model For Mathematics
Decomposing the Enigma
Isabelle
-
45.5
Decomposing the Enigma: Subgoal-based Demonstration Learning for Formal Theorem Proving
COPRA + GPT-3.5
Lean
-
11.9
An In-Context Learning Agent for Formal Theorem-Proving
MMOS-DeepSeekMath-7B
Lean
-
28.3
An Empirical Study of Data Ability Boundary in LLMs' Math Reasoning
DeepSeek-Prover-V1.5
Lean
50.0
63.5
DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search
Lean tidy
Lean
-
18
MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics
Metamath GPT-f
Metamath
-
1.6
MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics
COPRA + GPT-4
Lean
-
23.3
An In-Context Learning Agent for Formal Theorem-Proving
Thor + expert iteration on autoformalised theorems
Isabelle
-
35.2
-
LLEMMA-34b
Lean
25.8
25.8
Llemma: An Open Language Model For Mathematics
ReProver
Lean
26.5
26.5
-
DeepSeek-Prover
Lean
-
52.0
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data
Sledgehammer
Isabelle
-
10.4
Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers
DSP (540B Minerva informal)
Isabelle
-
38.9
Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs
0 of 28 row(s) selected.
Previous
Next