HyperAI超神経

Natural Questions on TheoremQA

Evaluation metrics

Accuracy

Evaluation results

Performance of each model on this benchmark

Comparison table

Model name                                       Accuracy
theoremqa-a-theorem-driven-question-answering    52.4
theoremqa-a-theorem-driven-question-answering    43.8
dart-math-difficulty-aware-rejection-tuning-1    15.4
theoremqa-a-theorem-driven-question-answering    35.6
dart-math-difficulty-aware-rejection-tuning-1    27.4
theoremqa-a-theorem-driven-question-answering    24.9
theoremqa-a-theorem-driven-question-answering    25.9
dart-math-difficulty-aware-rejection-tuning-1    16.4
dart-math-difficulty-aware-rejection-tuning-1    28.2
dart-math-difficulty-aware-rejection-tuning-1    32.2
theoremqa-a-theorem-driven-question-answering    30.2
theoremqa-a-theorem-driven-question-answering    23.6
theoremqa-a-theorem-driven-question-answering    22.8
theoremqa-a-theorem-driven-question-answering    21.0
dart-math-difficulty-aware-rejection-tuning-1    19.4
dart-math-difficulty-aware-rejection-tuning-1    17.0
theoremqa-a-theorem-driven-question-answering    23.9
dart-math-difficulty-aware-rejection-tuning-1    32.5
theoremqa-a-theorem-driven-question-answering    31.8