HyperAI超神经
首页
资讯
最新论文
教程
数据集
百科
SOTA
LLM 模型天梯
GPU 天梯
顶会
开源项目
全站搜索
关于
中文
跟随系统
HyperAI超神经
Toggle sidebar
全站搜索…
⌘
K
登录
登录
首页
SOTA
Automated Theorem Proving
Automated Theorem Proving On Minif2F Test
Automated Theorem Proving On Minif2F Test
评估指标
ITP
Pass@32
cumulative
评测结果
各个模型在此基准测试上的表现结果
Columns
模型名称
ITP
Pass@32
cumulative
Paper Title
Repository
Subgoal-XL
Isabelle
39.3
56.1
SubgoalXL: Subgoal-based Expert Learning for Theorem Proving
-
Evariste
Lean
-
41
HyperTree Proof Search for Neural Theorem Proving
-
Lean GPT-f
Lean
29.2
29.2
MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics
Lean Expert Iteration
Lean
34.5
36.6
Formal Mathematics Statement Curriculum Learning
-
GPT-f
Metamath
-
36.6
HyperTree Proof Search for Neural Theorem Proving
-
Evariste-1d
Lean
-
38.9
HyperTree Proof Search for Neural Theorem Proving
-
LLEMMA-7b
Lean
26.2
26.2
Llemma: An Open Language Model For Mathematics
-
Decomposing the Enigma
Isabelle
-
45.5
Decomposing the Enigma: Subgoal-based Demonstration Learning for Formal Theorem Proving
COPRA + GPT-3.5
Lean
-
11.9
An In-Context Learning Agent for Formal Theorem-Proving
MMOS-DeepSeekMath-7B
Lean
-
28.3
An Empirical Study of Data Ability Boundary in LLMs' Math Reasoning
DeepSeek-Prover-V1.5
Lean
50.0
63.5
DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search
Lean tidy
Lean
-
18
MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics
Metamath GPT-f
Metamath
-
1.6
MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics
COPRA + GPT-4
Lean
-
23.3
An In-Context Learning Agent for Formal Theorem-Proving
Thor + expert iteration on autoformalised theorems
Isabelle
-
35.2
-
-
LLEMMA-34b
Lean
25.8
25.8
Llemma: An Open Language Model For Mathematics
-
ReProver
Lean
26.5
26.5
-
-
DeepSeek-Prover
Lean
-
52.0
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data
-
Sledgehammer
Isabelle
-
10.4
Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers
-
DSP (540B Minerva informal)
Isabelle
-
38.9
Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs
-
0 of 28 row(s) selected.
Previous
Next