Coreference Resolution On Winograd Schema
Metrics
Accuracy
Results
Performance results of various models on this benchmark
Comparison table
Model name | Accuracy (%) |
---|---|
winogrande-an-adversarial-winograd-schema | 57.1 |
back-to-square-one-bias-detection-training | 50 |
unsupervised-deep-structured-semantic-models | 57.1 |
a-simple-method-for-commonsense-reasoning | 57.9 |
deberta-decoding-enhanced-bert-with | 95.9 |
commonsense-knowledge-enhanced-embeddings-for | 58.3 |
winogrande-an-adversarial-winograd-schema | 90.1 |
lamini-lm-a-diverse-herd-of-distilled-models | 66.7 |
a-surprisingly-robust-trick-for-winograd | 62.3 |
a-surprisingly-robust-trick-for-winograd | 72.5 |
unsupervised-deep-structured-semantic-models | 62.4 |
back-to-square-one-bias-detection-training | 73.9 |
toward-efficient-language-model-pretraining | 97.3 |
attention-is-not-all-you-need-for-commonsense | 52.8 |
palm-scaling-language-modeling-with-pathways-1 | 86.3 |
hungry-hungry-hippos-towards-language | 43.3 |
language-models-are-few-shot-learners | 80.1 |
lamini-lm-a-diverse-herd-of-distilled-models | 73.3 |
lamini-lm-a-diverse-herd-of-distilled-models | 64.1 |
on-the-evaluation-of-common-sense-reasoning | 64.5 |
g-daug-generative-data-augmentation-for | 80 |
pythia-a-suite-for-analyzing-large-language | 36.5 |
ask-me-anything-a-simple-strategy-for | 36.5 |
on-the-evaluation-of-common-sense-reasoning | 55.7 |
alexatm-20b-few-shot-learning-using-a-large | 68.3 |
ask-me-anything-a-simple-strategy-for | 74.7 |
winogrande-an-adversarial-winograd-schema | 52.8 |
palm-scaling-language-modeling-with-pathways-1 | 89.1 |
on-the-evaluation-of-common-sense-reasoning | 61.5 |
palm-2-technical-report-1 | 88.1 |
a-surprisingly-robust-trick-for-winograd | 71.4 |
attention-is-not-all-you-need-for-commonsense | 52 |
on-generalization-in-coreference-resolution | 60.1 |
lamini-lm-a-diverse-herd-of-distilled-models | 69.6 |
unsupervised-deep-structured-semantic-models | 54.5 |
a-simple-method-for-commonsense-reasoning | 63.7 |
socialiqa-commonsense-reasoning-about-social | 67 |
hungry-hungry-hippos-towards-language | 61.5 |
on-the-evaluation-of-common-sense-reasoning | 69.2 |
palm-2-technical-report-1 | 84.6 |
back-to-square-one-bias-detection-training | 63 |
language-models-are-unsupervised-multitask | 70.7 |
designing-effective-sparse-expert-models | 96.6 |
bert-pre-training-of-deep-bidirectional | 62.0 |
exploring-the-benefits-of-training-expert | 62.21 |
pythia-a-suite-for-analyzing-large-language | 36.5 |
designing-effective-sparse-expert-models | 93.3 |
back-to-square-one-bias-detection-training | 55.4 |
hungry-hungry-hippos-towards-language | 63.5 |
knowledge-in-context-towards-knowledgeable | 65.40 |
unifying-language-learning-paradigms | 98.1 |
back-to-square-one-bias-detection-training | 78.8 |
finetuned-language-models-are-zero-shot | 86.5 |
attention-is-all-you-need | 54.1 |
toward-efficient-language-model-pretraining | 98.6 |
finetuned-language-models-are-zero-shot | 80.8 |
ask-me-anything-a-simple-strategy-for | 77.9 |
palm-2-technical-report-1 | 86.9 |
palm-scaling-language-modeling-with-pathways-1 | 100 |
a-simple-method-for-commonsense-reasoning | 62.6 |
on-generalization-in-coreference-resolution | 59.4 |
unsupervised-deep-structured-semantic-models | 63.0 |
a-surprisingly-robust-trick-for-winograd | 70.3 |
lamini-lm-a-diverse-herd-of-distilled-models | 59 |
pythia-a-suite-for-analyzing-large-language | 54.8 |
a-hybrid-neural-network-model-for-commonsense | 75.1 |
palm-scaling-language-modeling-with-pathways-1 | 89.5 |
winogrande-an-adversarial-winograd-schema | 83.1 |
the-cot-collection-improving-zero-shot-and | 66 |
n-grammer-augmenting-transformers-with-latent-1 | 68.3 |
unifying-language-learning-paradigms | 79.9 |
pythia-a-suite-for-analyzing-large-language | 38.5 |
exploring-the-limits-of-transfer-learning | 93.8 |
a-knowledge-hunting-framework-for-common | 57.1 |
back-to-square-one-bias-detection-training | 61.4 |
back-to-square-one-bias-detection-training | 56.5 |
guess-the-instruction-making-language-models | 58.37 |
socialiqa-commonsense-reasoning-about-social | 72.5 |
unsupervised-deep-structured-semantic-models | 59.2 |
scaling-instruction-finetuned-language-models | 89.82 |
tttttackling-winogrande-schemas | 84.6 |
attention-is-not-all-you-need-for-commonsense | 60.3 |
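
Several identifiers appear more than once in the table with different scores. The table does not say why; one plausible reading (an assumption, not stated in the source) is that each row is a separate model variant or configuration reported in the same paper. Under that assumption, a minimal Python sketch for reading the pipe-delimited rows and keeping the highest accuracy per identifier could look like the following. The function names `parse_rows` and `best_per_entry` are hypothetical, and the sample contains only three rows copied from the table.

```python
def parse_rows(text):
    """Parse 'name | score |' lines into (name, score) pairs.

    The header row and the '---|---|' separator are skipped because
    their second cell is not a number.
    """
    rows = []
    for line in text.splitlines():
        # Split on the pipe delimiter and drop empty cells (trailing '|').
        cells = [c.strip() for c in line.split("|")]
        cells = [c for c in cells if c]
        if len(cells) != 2:
            continue
        name, raw = cells
        try:
            score = float(raw)
        except ValueError:
            continue  # header or separator row
        rows.append((name, score))
    return rows


def best_per_entry(rows):
    """Keep the highest score seen for each identifier (assumed grouping)."""
    best = {}
    for name, score in rows:
        if name not in best or score > best[name]:
            best[name] = score
    return best


# Three rows taken verbatim from the table above.
sample = """Model name | Accuracy |
---|---|
winogrande-an-adversarial-winograd-schema | 57.1 |
winogrande-an-adversarial-winograd-schema | 90.1 |
deberta-decoding-enhanced-bert-with | 95.9 |
"""

rows = parse_rows(sample)
best = best_per_entry(rows)
for name, score in sorted(best.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score}")
```

Taking the maximum is only one possible aggregation; if the repeated rows correspond to different evaluation settings, comparing them directly may not be meaningful.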