HyperAI超神経

Question Answering on OpenBookQA

Metrics

Accuracy

Results

Performance results of each model on this benchmark

Comparison table
Model name                                     Accuracy (%)
unifiedqa-crossing-format-boundaries-with-a    87.2
lamini-lm-a-diverse-herd-of-distilled-models   31.2
clues-before-answers-generation-enhanced       89.8
bloomberggpt-a-large-language-model-for        47.2
Model 5                                        95.9
large-language-models-can-self-improve         84.4
large-language-models-can-self-improve         86.4
grapeqa-graph-augmentation-and-pruning-to      90
mixlora-enhancing-large-language-models-fine   81.6
mixture-of-subspaces-in-low-rank-adaptation    86.8
can-a-suit-of-armor-conduct-electricity-a-new  56.3
Model 12                                       95.2
large-language-models-can-self-improve         94.4
language-models-are-few-shot-learners          65.4
grapeqa-graph-augmentation-and-pruning-to      82
lamini-lm-a-diverse-herd-of-distilled-models   34
Model 17                                       87.6
grapeqa-graph-augmentation-and-pruning-to      66.2
lamini-lm-a-diverse-herd-of-distilled-models   36
mixlora-enhancing-large-language-models-fine   84.8
bloomberggpt-a-large-language-model-for        51.6
mixlora-enhancing-large-language-models-fine   83
palm-2-technical-report-1                      58.5
Model 24                                       91.3
can-a-suit-of-armor-conduct-electricity-a-new  55.8
bloomberggpt-a-large-language-model-for        44.2
large-language-models-can-self-improve         93
bloomberggpt-a-large-language-model-for        58.0
lamini-lm-a-diverse-herd-of-distilled-models   32.8
large-language-models-can-self-improve         92
lamini-lm-a-diverse-herd-of-distilled-models   32
can-a-suit-of-armor-conduct-electricity-a-new  25
fusing-context-into-knowledge-graph-for        83.2
careful-selection-of-knowledge-to-solve-open   72
lamini-lm-a-diverse-herd-of-distilled-models   39.8
qa-gnn-reasoning-with-language-models-and      77.8
qa-gnn-reasoning-with-language-models-and      82.8
fusing-context-into-knowledge-graph-for        82.4
qa-gnn-reasoning-with-language-models-and      82.8
palm-2-technical-report-1                      56.2
palm-2-technical-report-1                      57.4
Model 42                                       94.2
large-language-models-can-self-improve         90
can-a-suit-of-armor-conduct-electricity-a-new  76.9
gnn-is-a-counter-revisiting-gnn-for-question   87.4