Question Answering on OpenBookQA
Evaluation Metrics
Accuracy
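As a rough illustration of the metric, accuracy on a multiple-choice benchmark such as OpenBookQA is the share of questions for which the predicted option matches the answer key, expressed here as a percentage. The sketch below assumes that definition; the function name `accuracy` and the sample labels are hypothetical and do not come from any listed model.

```python
def accuracy(predictions, gold):
    """Return the percentage of predictions that match the gold labels."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty set")
    # Count the questions where the predicted option equals the answer key.
    correct = sum(p == g for p, g in zip(predictions, gold))
    return 100.0 * correct / len(gold)


# Hypothetical answer keys for five four-way multiple-choice questions.
gold = ["A", "C", "B", "D", "A"]
predictions = ["A", "C", "D", "D", "B"]
print(f"{accuracy(predictions, gold):.1f}")  # 3 of 5 correct
```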
Results
Performance results of each model on this benchmark
Comparison Table
Model | Accuracy (%) |
---|---|
unifiedqa-crossing-format-boundaries-with-a | 87.2 |
lamini-lm-a-diverse-herd-of-distilled-models | 31.2 |
clues-before-answers-generation-enhanced | 89.8 |
bloomberggpt-a-large-language-model-for | 47.2 |
Model 5 | 95.9 |
large-language-models-can-self-improve | 84.4 |
large-language-models-can-self-improve | 86.4 |
grapeqa-graph-augmentation-and-pruning-to | 90 |
mixlora-enhancing-large-language-models-fine | 81.6 |
mixture-of-subspaces-in-low-rank-adaptation | 86.8 |
can-a-suit-of-armor-conduct-electricity-a-new | 56.3 |
Model 12 | 95.2 |
large-language-models-can-self-improve | 94.4 |
language-models-are-few-shot-learners | 65.4 |
grapeqa-graph-augmentation-and-pruning-to | 82 |
lamini-lm-a-diverse-herd-of-distilled-models | 34 |
Model 17 | 87.6 |
grapeqa-graph-augmentation-and-pruning-to | 66.2 |
lamini-lm-a-diverse-herd-of-distilled-models | 36 |
mixlora-enhancing-large-language-models-fine | 84.8 |
bloomberggpt-a-large-language-model-for | 51.6 |
mixlora-enhancing-large-language-models-fine | 83 |
palm-2-technical-report-1 | 58.5 |
Model 24 | 91.3 |
can-a-suit-of-armor-conduct-electricity-a-new | 55.8 |
bloomberggpt-a-large-language-model-for | 44.2 |
large-language-models-can-self-improve | 93 |
bloomberggpt-a-large-language-model-for | 58.0 |
lamini-lm-a-diverse-herd-of-distilled-models | 32.8 |
large-language-models-can-self-improve | 92 |
lamini-lm-a-diverse-herd-of-distilled-models | 32 |
can-a-suit-of-armor-conduct-electricity-a-new | 25 |
fusing-context-into-knowledge-graph-for | 83.2 |
careful-selection-of-knowledge-to-solve-open | 72 |
lamini-lm-a-diverse-herd-of-distilled-models | 39.8 |
qa-gnn-reasoning-with-language-models-and | 77.8 |
qa-gnn-reasoning-with-language-models-and | 82.8 |
fusing-context-into-knowledge-graph-for | 82.4 |
qa-gnn-reasoning-with-language-models-and | 82.8 |
palm-2-technical-report-1 | 56.2 |
palm-2-technical-report-1 | 57.4 |
Model 42 | 94.2 |
large-language-models-can-self-improve | 90 |
can-a-suit-of-armor-conduct-electricity-a-new | 76.9 |
gnn-is-a-counter-revisiting-gnn-for-question | 87.4 |