Common Sense Reasoning on SWAG
Evaluation Metric
Test
Evaluation Results
Performance of each model on this benchmark:
Model | Test | Paper Title | Repository |
---|---|---|---|
RoBERTa | 89.9 | RoBERTa: A Robustly Optimized BERT Pretraining Approach | |
ESIM + GloVe | 52.7 | SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference | |
ESIM + ELMo | 59.2 | SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference | |
BERT-LARGE | 86.3 | BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | |
DeBERTa-large | 90.8 | DeBERTa: Decoding-enhanced BERT with Disentangled Attention | |
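For context, the Test score above is accuracy on SWAG's 4-way multiple-choice task: given a context sentence, the model must pick the most plausible continuation among four candidates. Below is a minimal sketch of how such a model is typically framed for this task, assuming the Hugging Face Transformers library (not referenced by this leaderboard); the checkpoint name and the example item are illustrative, and the multiple-choice head would need fine-tuning on SWAG's training split before its Test accuracy is meaningful.

```python
# Sketch: framing a SWAG-style item as 4-way multiple choice.
# Assumptions: Hugging Face Transformers; "roberta-large" is an untuned base
# checkpoint, so its freshly initialized choice head gives arbitrary outputs
# until fine-tuned on SWAG's training split.
import torch
from transformers import AutoTokenizer, AutoModelForMultipleChoice

model_name = "roberta-large"  # illustrative checkpoint, not a leaderboard entry
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMultipleChoice.from_pretrained(model_name)

# A hypothetical SWAG-style item: one shared context, four candidate endings.
context = "The chef places the pan on the stove."
endings = [
    "He turns on the burner.",
    "He dives into the pool.",
    "He files his tax return.",
    "He paints the ceiling.",
]

# Pair the shared context with each candidate ending.
enc = tokenizer([context] * len(endings), endings,
                return_tensors="pt", padding=True)
# Multiple-choice models expect shape (batch, num_choices, seq_len).
inputs = {k: v.unsqueeze(0) for k, v in enc.items()}

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, 4), one score per ending
print("predicted ending:", endings[logits.argmax(dim=-1).item()])
```

Test accuracy on the leaderboard is then the fraction of held-out items for which the argmax choice matches the gold ending.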