
Common Sense Reasoning On Winogrande

Metrics

Accuracy
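Accuracy here is the fraction of Winogrande's two-choice pronoun-resolution problems a model answers correctly, reported as a percentage. A minimal sketch of the computation (function name and example values are illustrative, not taken from any particular evaluation harness):

```python
def accuracy(predictions, golds):
    # Fraction of examples where the predicted option ("1" or "2")
    # matches the gold option, reported as a percentage.
    assert len(predictions) == len(golds)
    correct = sum(p == g for p, g in zip(predictions, golds))
    return 100.0 * correct / len(golds)

# e.g. 2 of 3 correct -> 66.7
print(accuracy(["1", "2", "2"], ["1", "2", "1"]))
```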

Results

Performance results of various models on this benchmark

Comparison Table
Model Name | Accuracy
task-compass-scaling-multi-task-pre-training | 90.5
finetuned-language-models-are-zero-shot | 72.8
unifiedqa-crossing-format-boundaries-with-a | 73.3
efficient-language-modeling-with-sparse-all | 53.4
language-models-are-few-shot-learners | 57.4
pythia-a-suite-for-analyzing-large-language | 66.6
back-to-square-one-bias-detection-training | 54.9
designing-effective-sparse-expert-models | 81.7
textbooks-are-all-you-need-ii-phi-1-5 | 74.0
llama-open-and-efficient-foundation-language-1 | 73.0
bloomberggpt-a-large-language-model-for | 66.1
mixtral-of-experts | 74.2
guess-the-instruction-making-language-models | 58.56
gpt-4-technical-report-1 | 87.5
efficient-language-modeling-with-sparse-all | 51
exploring-the-benefits-of-training-expert | 61.60
pythia-a-suite-for-analyzing-large-language | 63.9
winogrande-an-adversarial-winograd-schema | 51.9
winogrande-an-adversarial-winograd-schema | 50
palm-2-technical-report-1 | 77.9
efficient-language-modeling-with-sparse-all | 51.7
the-claude-3-model-family-opus-sonnet-haiku | 88.5
bloomberggpt-a-large-language-model-for | 64.1
efficient-language-modeling-with-sparse-all | 54.3
llama-open-and-efficient-foundation-language-1 | 76.0
unifiedqa-crossing-format-boundaries-with-a | 89.4
efficient-language-modeling-with-sparse-all | 51.1
winogrande-an-adversarial-winograd-schema | 64.9
lamini-lm-a-diverse-herd-of-distilled-models | 59.9
palm-scaling-language-modeling-with-pathways-1 | 77.0
the-claude-3-model-family-opus-sonnet-haiku | 75.1
task-compass-scaling-multi-task-pre-training | 89.6
winogrande-an-adversarial-winograd-schema | 79.1
lamini-lm-a-diverse-herd-of-distilled-models | 56
winogrande-an-adversarial-winograd-schema | 51
back-to-square-one-bias-detection-training | 55.6
mixlora-enhancing-large-language-models-fine | 82.1
palm-2-technical-report-1 | 79.2
back-to-square-one-bias-detection-training | 53.1
palm-scaling-language-modeling-with-pathways-1 | 81.1
pythia-a-suite-for-analyzing-large-language | 59.4
branch-train-mix-mixing-expert-llms-into-a | 70.6
bloomberggpt-a-large-language-model-for | 67
llama-open-and-efficient-foundation-language-1 | 70.1
mixture-of-subspaces-in-low-rank-adaptation | 85.8
lamini-lm-a-diverse-herd-of-distilled-models | 54.9
training-compute-optimal-large-language | 74.9
designing-effective-sparse-expert-models | 96.1
the-cot-collection-improving-zero-shot-and | 57.5
llama-open-and-efficient-foundation-language-1 | 77.0
task-compass-scaling-multi-task-pre-training | 87
lamini-lm-a-diverse-herd-of-distilled-models | 55.2
scaling-language-models-methods-analysis-1 | 70.1
back-to-square-one-bias-detection-training | 56.3
the-claude-3-model-family-opus-sonnet-haiku | 74.2
back-to-square-one-bias-detection-training | 50
winogrande-an-adversarial-winograd-schema | 58.9
pythia-a-suite-for-analyzing-large-language | 60.9
lamini-lm-a-diverse-herd-of-distilled-models | 58.3
lamini-lm-a-diverse-herd-of-distilled-models | 56
unicorn-on-rainbow-a-universal-commonsense | 91.3
palm-2-technical-report-1 | 83.0
language-models-are-few-shot-learners | 70.2
back-to-square-one-bias-detection-training | 52.8
knowledge-in-context-towards-knowledgeable | 55.30
mixtral-of-experts | 77.2
gpt-4-technical-report-1 | 81.6
Model 68 | 70.8
parameter-efficient-sparsity-crafting-from | 80.9
back-to-square-one-bias-detection-training | 58.7
bloomberggpt-a-large-language-model-for | 60.6
finetuned-language-models-are-zero-shot | 71.2
mistral-7b | 75.3
palm-scaling-language-modeling-with-pathways-1 | 77.0
mixlora-enhancing-large-language-models-fine | 86.3
g-daug-generative-data-augmentation-for | 71.4
mixlora-enhancing-large-language-models-fine | 76.8
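For context, scores like those above are typically computed on the Winogrande validation split. A minimal evaluation-loop sketch, assuming the allenai/winogrande dataset on the Hugging Face Hub; choose_option is a hypothetical stand-in for whatever model is being scored:

```python
from datasets import load_dataset

# winogrande_xl is the standard full-size configuration; leaderboard
# numbers are usually reported on its validation split.
ds = load_dataset("allenai/winogrande", "winogrande_xl", split="validation")

def choose_option(sentence: str, option1: str, option2: str) -> str:
    # Hypothetical stand-in for a real model: score the sentence with
    # each candidate filling the blank and return "1" or "2". As a
    # trivial baseline, always pick the first option (~50% by design,
    # since Winogrande is balanced across the two choices).
    return "1"

correct = 0
for ex in ds:
    pred = choose_option(ex["sentence"], ex["option1"], ex["option2"])
    correct += pred == ex["answer"]  # gold label is the string "1" or "2"

print(f"accuracy = {100.0 * correct / len(ds):.1f}")
```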