Automated Theorem Proving on miniF2F-valid
Evaluation Metric
Pass@64
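
As a rough illustration of how a Pass@k figure such as Pass@64 is commonly computed, the sketch below counts a problem as solved if any of its first k proof attempts succeeds, then reports the solved share as a percentage. This is an assumption about the metric's definition, not something this page states: the function name `pass_at_k` and the input layout (one list of per-attempt booleans per problem) are hypothetical, and individual papers may define or average the metric differently.

```python
from typing import Sequence


def pass_at_k(attempt_results: Sequence[Sequence[bool]], k: int = 64) -> float:
    """Percentage of problems with at least one success in the first k attempts.

    attempt_results[i][j] is True if attempt j on problem i produced a
    verified proof (hypothetical input layout, assumed for illustration).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    if not attempt_results:
        raise ValueError("need at least one problem")
    # A problem counts as solved if any of its first k attempts succeeded.
    solved = sum(1 for attempts in attempt_results if any(attempts[:k]))
    return 100.0 * solved / len(attempt_results)


# Toy data: four problems, two of which are proved within the attempt budget.
toy_results = [
    [False, True],
    [False, False],
    [True],
    [False] * 64,
]
score_64 = pass_at_k(toy_results, k=64)
score_1 = pass_at_k(toy_results, k=1)
```

With a budget of one attempt only the third toy problem counts as solved, so the score drops, which is why the attempt budget is always reported alongside the number.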
Evaluation Results
Performance of each model on this benchmark
Comparison Table
Model Name | Pass@64 |
---|---|
hypertree-proof-search-for-neural-theorem | 47.3 |
hypertree-proof-search-for-neural-theorem | 46.7 |
minif2f-a-cross-system-benchmark-for-formal | - |
minif2f-a-cross-system-benchmark-for-formal | - |
hypertree-proof-search-for-neural-theorem | 47.5 |
hypertree-proof-search-for-neural-theorem | 58.6 |
minif2f-a-cross-system-benchmark-for-formal | - |
draft-sketch-and-prove-guiding-formal-theorem | - |
lyra-orchestrating-dual-correction-in | - |
lego-prover-neural-theorem-proving-with | - |