Common Sense Reasoning on SWAG
Metrics
Test
Results
Performance results of various models on this benchmark
Comparison table
Model name | Test |
---|---|
roberta-a-robustly-optimized-bert-pretraining | 89.9 |
swag-a-large-scale-adversarial-dataset-for | 52.7 |
swag-a-large-scale-adversarial-dataset-for | 59.2 |
bert-pre-training-of-deep-bidirectional | 86.3 |
deberta-decoding-enhanced-bert-with | 90.8 |