
Question Answering on BoolQ

Evaluation Metric

Accuracy
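
On BoolQ, accuracy is simply the fraction of yes/no questions a model answers correctly. A minimal sketch of the metric (the function name and list inputs are illustrative, not taken from this page):

```python
def boolq_accuracy(predictions, labels):
    """Fraction of yes/no predictions that match the gold answers."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    return sum(p == g for p, g in zip(predictions, labels)) / len(labels)

# 3 of 4 answers match the gold labels, so accuracy is 0.75.
print(boolq_accuracy([True, False, True, True],
                     [True, False, False, True]))
```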

Evaluation Results

Performance results of each model on this benchmark.

Comparison Table
| Model Name | Accuracy (%) |
| --- | --- |
| hierarchical-prompting-taxonomy-a-universal | 99.87 |
| finetuned-language-models-are-zero-shot | 84.6 |
| llama-2-open-foundation-and-fine-tuned-chat | 81.7 |
| hungry-hungry-hippos-towards-language | 59.6 |
| ask-me-anything-a-simple-strategy-for | 64.9 |
| unifying-language-learning-paradigms | 63.1 |
| hungry-hungry-hippos-towards-language | 60.6 |
| bloomberggpt-a-large-language-model-for | 74.6 |
| hierarchical-prompting-taxonomy-a-universal | 99.419 |
| boolq-exploring-the-surprising-difficulty-of | 72.87 |
| llama-2-open-foundation-and-fine-tuned-chat | 83.7 |
| hungry-hungry-hippos-towards-language | 61.7 |
| hyena-hierarchy-towards-larger-convolutional | 51.8 |
| mixlora-enhancing-large-language-models-fine | 72.7 |
| scaling-language-models-methods-analysis-1 | 79.3 |
| palm-2-technical-report-1 | 88.1 |
| toward-efficient-language-model-pretraining | 90.5 |
| opt-iml-scaling-language-model-instruction | 71.4 |
| mixlora-enhancing-large-language-models-fine | 75 |
| opt-iml-scaling-language-model-instruction | 61.5 |
| exploring-the-limits-of-transfer-learning | 76.4 |
| llama-2-open-foundation-and-fine-tuned-chat | 77.4 |
| toward-efficient-language-model-pretraining | 92 |
| bloomberggpt-a-large-language-model-for | 46.4 |
| boolq-exploring-the-surprising-difficulty-of | 75.57 |
| exploring-the-limits-of-transfer-learning | 81.4 |
| llama-open-and-efficient-foundation-language-1 | 76.5 |
| llama-open-and-efficient-foundation-language-1 | 85.3 |
| alexatm-20b-few-shot-learning-using-a-large | 69.4 |
| exploring-the-limits-of-transfer-learning | 91.2 |
| mixture-of-subspaces-in-low-rank-adaptation | 74.6 |
| muppet-massive-multi-task-representations | 83.8 |
| muppet-massive-multi-task-representations | 87.5 |
| unifying-language-learning-paradigms | 90.8 |
| opt-iml-scaling-language-model-instruction | 60.1 |
| llama-2-open-foundation-and-fine-tuned-chat | 85 |
| llama-open-and-efficient-foundation-language-1 | 83.1 |
| finetuned-language-models-are-zero-shot | 82.9 |
| n-grammer-augmenting-transformers-with-latent-1 | 65 |
| language-models-are-few-shot-learners | 76.4 |
| boolq-exploring-the-surprising-difficulty-of | 71.41 |
| ask-me-anything-a-simple-strategy-for | 66.5 |
| finetuned-language-models-are-zero-shot | 86.3 |
| palm-2-technical-report-1 | 90.9 |
| designing-effective-sparse-expert-models | 88.6 |
| shakti-a-2-5-billion-parameter-small-language | 61.1 |
| hungry-hungry-hippos-towards-language | 56.1 |
| hungry-hungry-hippos-towards-language | 56.1 |
| opt-iml-scaling-language-model-instruction | 66.9 |
| opt-iml-scaling-language-model-instruction | 60.5 |
| palm-scaling-language-modeling-with-pathways-1 | 92.2 |
| boolq-exploring-the-surprising-difficulty-of | 80.4 |
| deberta-decoding-enhanced-bert-with | 90.4 |
| designing-effective-sparse-expert-models | 92.4 |
| boolq-exploring-the-surprising-difficulty-of | 62.17 |
| palm-2-technical-report-1 | 88.6 |
| entailment-as-few-shot-learner | 86.0 |
| mixlora-enhancing-large-language-models-fine | 77.1 |
| language-models-are-few-shot-learners | 60.5 |
| bloomberggpt-a-large-language-model-for | 57.5 |
| exploring-the-limits-of-transfer-learning | 85.4 |
| ask-me-anything-a-simple-strategy-for | 67.2 |
| bloomberggpt-a-large-language-model-for | 52.9 |
| opt-iml-scaling-language-model-instruction | 64 |
| llama-open-and-efficient-foundation-language-1 | 78.1 |
| training-compute-optimal-large-language | 83.7 |
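
To score a model against this leaderboard, the BoolQ validation split can be loaded from the Hugging Face Hub via the `boolq` dataset, whose examples carry `question`, `passage`, and boolean `answer` fields. The always-yes classifier below is a hypothetical stand-in for a real model; it roughly reproduces the 62.17 majority-class baseline from the BoolQ paper listed above:

```python
from datasets import load_dataset

# BoolQ validation split; each example has "question", "passage",
# and a boolean "answer" field.
dataset = load_dataset("boolq", split="validation")

def predict(example):
    # Hypothetical stand-in for a real model: always answer "yes" (True),
    # the majority class on BoolQ.
    return True

correct = sum(predict(ex) == ex["answer"] for ex in dataset)
print(f"Accuracy: {100 * correct / len(dataset):.2f}%")  # ~62%, the majority-class baseline
```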