Command Palette
Search for a command to run...
Automated Theorem Proving On Minif2F Valid
評価指標
Pass@64
評価結果
このベンチマークにおける各モデルのパフォーマンス結果
| Paper Title | ||
|---|---|---|
| Evariste | 58.6 | HyperTree Proof Search for Neural Theorem Proving |
| Evariste-7d | 47.5 | HyperTree Proof Search for Neural Theorem Proving |
| GPT-f | 47.3 | HyperTree Proof Search for Neural Theorem Proving |
| Evariste-1d | 46.7 | HyperTree Proof Search for Neural Theorem Proving |
| Metamath GPT-f | - | MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics |
| Lean GPT-f | - | MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics |
| Lean tidy | - | MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics |
| DSP (62B Minerva informal) | - | Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs |
| Lyra + GPT-4 | - | Lyra: Orchestrating Dual Correction in Automated Theorem Proving |
| LEGO-Prover ChatGPT | - | LEGO-Prover: Neural Theorem Proving with Growing Libraries |
0 of 10 row(s) selected.