Question Answering on SQuAD1.1

Evaluation Results

Performance results of each model on this benchmark

Model	EM	F1	Paper Title	Repository
SAN (ensemble model)	79.608	86.496	Stochastic Answer Networks for Machine Reading Comprehension
S^3-Net (single model)	71.908	81.023	-	-
RQA (single model)	55.827	65.467	Harvesting and Refining Question-Answer Pairs for Unsupervised QA
PQMN (single model)	68.331	77.783	-	-
BERT - 3 Layers	77.7	85.8	Information Theoretic Representation Distillation
RuBERT	-	84.6	Adaptation of Deep Bidirectional Multilingual Transformers for Russian Language
BERT-uncased (single model)	84.926	91.932	-	-
{ANNA} (single model)	90.622	95.719	-	-
BISAN (single model)	85.314	91.756	-	-
Conductor-net (single model)	74.405	82.742	Phase Conductor on Multi-layered Attentions for Machine Comprehension	-
KACTEIL-MRC(GF-Net+) (single model)	78.664	85.780	-	-
BERT-Large 32k batch size with AdamW	-	91.58	A Large Batch Optimizer Reality Check: Traditional, Generic Optimizers Suffice Across Batch Sizes	-
FusionNet (single model)	75.968	83.900	FusionNet: Fusing via Fully-Aware Attention with Application to Machine Comprehension
WD (single model)	84.402	90.561	-	-
DyREX	-	91.01	DyREx: Dynamic Query Representation for Extractive Question Answering
WAHnGREA	0.000	0.000	-	-
S^3-Net (ensemble)	74.121	82.342	-	-
RaSoR + TR (single model)	75.789	83.261	Contextualized Word Representations for Reading Comprehension
RQA+IDR (single model)	61.145	71.389	Harvesting and Refining Question-Answer Pairs for Unsupervised QA
MEMEN (single model)	78.234	85.344	MEMEN: Multi-layer Embedding with Memory Networks for Machine Comprehension	-
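
The EM and F1 columns above are not defined on this page. As a rough guide, the following is a minimal sketch of how exact match and token-level F1 are conventionally computed for SQuAD-style extractive answers: text is normalized (lowercased, punctuation and English articles removed, whitespace collapsed), then compared as a whole string (EM) or as a bag of tokens (F1). The function names here are illustrative, and the sketch assumes the leaderboard values are these per-question scores averaged over the evaluation set and scaled to 0-100; it is not the exact script used to produce any row of the table.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized strings are identical, otherwise 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall after normalization."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def best_over_references(metric, prediction: str, references: list) -> float:
    """Score a prediction against several gold answers and keep the best score."""
    return max(metric(prediction, ref) for ref in references)
```

For example, `best_over_references(f1_score, "Eiffel Tower in Paris", ["Eiffel Tower"])` gives partial credit for the two overlapping tokens, while `exact_match` on the same pair gives no credit. This is why F1 is never lower than EM in any row of the table.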
