HyperAI초신경

Automated Theorem Proving On Minif2F Valid

평가 지표

Pass@64

평가 결과

이 벤치마크에서 각 모델의 성능 결과

		Paper Title
Evariste	58.6	HyperTree Proof Search for Neural Theorem Proving
Evariste-7d	47.5	HyperTree Proof Search for Neural Theorem Proving
GPT-f	47.3	HyperTree Proof Search for Neural Theorem Proving
Evariste-1d	46.7	HyperTree Proof Search for Neural Theorem Proving
Metamath GPT-f	-	MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics
Lean GPT-f	-	MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics
Lean tidy	-	MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics
DSP (62B Minerva informal)	-	Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs
Lyra + GPT-4	-	Lyra: Orchestrating Dual Correction in Automated Theorem Proving
LEGO-Prover ChatGPT	-	LEGO-Prover: Neural Theorem Proving with Growing Libraries

0 of 10 row(s) selected.

Automated Theorem Proving On Minif2F Valid | SOTA | HyperAI초신경