HyperAI

Natural Questions on TheoremQA

Metrics

Accuracy

Results

Performance results of various models on this benchmark

Comparison table
Model name | Accuracy
theoremqa-a-theorem-driven-question-answering | 52.4
theoremqa-a-theorem-driven-question-answering | 43.8
dart-math-difficulty-aware-rejection-tuning-1 | 15.4
theoremqa-a-theorem-driven-question-answering | 35.6
dart-math-difficulty-aware-rejection-tuning-1 | 27.4
theoremqa-a-theorem-driven-question-answering | 24.9
theoremqa-a-theorem-driven-question-answering | 25.9
dart-math-difficulty-aware-rejection-tuning-1 | 16.4
dart-math-difficulty-aware-rejection-tuning-1 | 28.2
dart-math-difficulty-aware-rejection-tuning-1 | 32.2
theoremqa-a-theorem-driven-question-answering | 30.2
theoremqa-a-theorem-driven-question-answering | 23.6
theoremqa-a-theorem-driven-question-answering | 22.8
theoremqa-a-theorem-driven-question-answering | 21.0
dart-math-difficulty-aware-rejection-tuning-1 | 19.4
dart-math-difficulty-aware-rejection-tuning-1 | 17.0
theoremqa-a-theorem-driven-question-answering | 23.9
dart-math-difficulty-aware-rejection-tuning-1 | 32.5
theoremqa-a-theorem-driven-question-answering | 31.8