
Common Sense Reasoning on ReCoRD

Evaluation Metrics

EM
F1
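
ReCoRD uses the SQuAD-style evaluation protocol: EM (exact match) checks whether the normalized prediction is identical to a gold entity mention, and F1 measures token overlap between prediction and reference, with both scores taken as the maximum over all valid gold mentions. The sketch below illustrates that computation; the function names are illustrative and this is not the official evaluation script.

```python
import re
import string
from collections import Counter

def normalize(text: str) -> str:
    """Lowercase, drop punctuation and articles, collapse whitespace (SQuAD-style)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in set(string.punctuation))
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction: str, reference: str) -> float:
    """EM is 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))

def f1(prediction: str, reference: str) -> float:
    """Token-level F1: harmonic mean of precision and recall over shared tokens."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    common = Counter(pred_tokens) & Counter(ref_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

def score(prediction: str, references: list[str]) -> tuple[float, float]:
    """ReCoRD allows multiple valid entity mentions; take the max over references."""
    em = max(exact_match(prediction, ref) for ref in references)
    f = max(f1(prediction, ref) for ref in references)
    return em, f

# Example: the prediction matches one of the gold entity mentions exactly.
print(score("Barack Obama", ["Obama", "Barack Obama"]))  # (1.0, 1.0)
```

Corpus-level EM and F1, as reported in the table below, are the averages of these per-example scores over the evaluation set.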

Evaluation Results

Performance of each model on this benchmark:

| Model Name | EM | F1 | Paper Title | Repository |
| --- | --- | --- | --- | --- |
| XLNet + MTL + Verifier (single model) | 81.460 | 82.664 | - | - |
| LUKE-Graph | 91.2 | 91.5 | LUKE-Graph: A Transformer-based Approach with Gated Relational Graph Attention for Cloze-style Reading Comprehension | - |
| FLAN 137B (zero-shot) | 72.5 | - | Finetuned Language Models Are Zero-Shot Learners | - |
| DocQA + ELMo | 45.4 | 46.7 | ReCoRD: Bridging the Gap between Human and Machine Commonsense Reading Comprehension | - |
| CSRLM (single model) | 81.780 | 82.584 | - | - |
| FLAN 137B (prompt-tuned) | 85.1 | - | Finetuned Language Models Are Zero-Shot Learners | - |
| ST-MoE-L 4.1B (fine-tuned) | 88.9 | - | ST-MoE: Designing Stable and Transferable Sparse Expert Models | - |
| T5-XXL 11B (fine-tuned) | 93.4 | - | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | - |
| GraphBert-NELL (single) | 59.410 | 61.515 | - | - |
| DeBERTa-1.5B | 94.1 | 94.5 | DeBERTa: Decoding-enhanced BERT with Disentangled Attention | - |
| Switch Transformer 9B | 79.9 | - | Efficient Language Modeling with Sparse all-MLP | - |
| PaLM 540B (fine-tuned) | 94.0 | 94.6 | PaLM: Scaling Language Modeling with Pathways | - |
| T5-11B | - | 94.1 | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | - |
| Base Layers 10B (0-shot) | 60.7 | - | Efficient Language Modeling with Sparse all-MLP | - |
| Vega v2 6B (fine-tuned) | 93.9 | 94.4 | Toward Efficient Language Model Pretraining and Downstream Adaptation via Self-Evolution: A Case Study on SuperGLUE | - |
| XLNet + MTL + Verifier (ensemble) | 83.090 | 83.737 | - | - |
| GPT-3 175B (one-shot) | - | 90.2 | Large Language Models are Zero-Shot Reasoners | - |
| Gshard 9B | 72.4 | - | Efficient Language Modeling with Sparse all-MLP | - |
| DCReader+BERT (single model) | 69.490 | 71.138 | - | - |
| GPT-3 Large 760M (0-shot) | 82.1 | - | Language Models are Few-Shot Learners | - |