Question Answering on Social IQa
Metrics
Accuracy
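Accuracy here is the fraction of questions answered correctly. A minimal sketch of the computation, assuming predictions and gold labels are given as parallel lists of answer choices (the function name and data are illustrative, not from the benchmark's official scorer):

```python
def accuracy(predictions, labels):
    """Fraction of examples where the predicted answer matches the gold answer."""
    assert len(predictions) == len(labels), "prediction/label counts must match"
    correct = sum(p == g for p, g in zip(predictions, labels))
    return correct / len(labels)

# Social IQa is three-way multiple choice, so random guessing
# scores about 33.3% accuracy — the floor visible in the table below.
preds = ["A", "B", "C", "A"]
golds = ["A", "B", "B", "A"]
print(accuracy(preds, golds))  # → 0.75
```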
Results
Performance results of various models on this benchmark.
Comparison table
Model name | Accuracy (%) |
---|---|
llama-open-and-efficient-foundation-language-1 | 50.4 |
llama-open-and-efficient-foundation-language-1 | 48.9 |
unifiedqa-crossing-format-boundaries-with-a | 79.8 |
training-compute-optimal-large-language | 51.3 |
llama-open-and-efficient-foundation-language-1 | 52.3 |
two-is-better-than-many-binary-classification | 80.2 |
roberta-a-robustly-optimized-bert-pretraining | 76.7 |
two-is-better-than-many-binary-classification | 79.9 |
mixlora-enhancing-large-language-models-fine | 78.8 |
task-compass-scaling-multi-task-pre-training | 82.2 |
mixture-of-subspaces-in-low-rank-adaptation | 81.0 |
socialiqa-commonsense-reasoning-about-social | 33.3 |
scaling-language-models-methods-analysis-1 | 50.6 |
socialiqa-commonsense-reasoning-about-social | 63.1 |
task-compass-scaling-multi-task-pre-training | 79.6 |
socialiqa-commonsense-reasoning-about-social | 64.5 |
socialiqa-commonsense-reasoning-about-social | 63.0 |
mixlora-enhancing-large-language-models-fine | 82.5 |
llama-open-and-efficient-foundation-language-1 | 50.4 |
task-compass-scaling-multi-task-pre-training | 81.7 |
textbooks-are-all-you-need-ii-phi-1-5 | 52.6 |
mixlora-enhancing-large-language-models-fine | 78.0 |
textbooks-are-all-you-need-ii-phi-1-5 | 53.0 |
unicorn-on-rainbow-a-universal-commonsense | 83.2 |