HyperAI

Common Sense Reasoning on SWAG

Metrics

Test

Results

Performance results of various models on this benchmark

Comparison table

Model name                                     Test
roberta-a-robustly-optimized-bert-pretraining  89.9
swag-a-large-scale-adversarial-dataset-for     52.7
swag-a-large-scale-adversarial-dataset-for     59.2
bert-pre-training-of-deep-bidirectional        86.3
deberta-decoding-enhanced-bert-with            90.8