| Model | Test Acc. (%) | Train Acc. (%) | Parameters | Paper / Source | Code |
| --- | --- | --- | --- | --- | --- |
| RoBERTa-large + self-explaining layer | 92.3 | ? | 355m+ | Self-Explaining Structures Improve NLP Models | - |
| Distance-based Self-Attention Network | 86.3 | 89.6 | 4.7m | Distance-based Self-Attention Network for Natural Language Inference | - |
| Stacked Bi-LSTMs (shortcut connections, max-pooling, attention) | 84.4 | - | - | Combining Similarity Features and Deep Representation Learning for Stance Detection in the Context of Checking Fake News | - |
| 300D Gumbel TreeLSTM encoders | 85.6 | 91.2 | 2.9m | Learning to Compose Task-Specific Tree Structures | - |
| SJRC (BERT-Large + SRL) | 91.3 | 95.7 | 308m | Explicit Contextual Semantics for Text Comprehension | - |
| 1024D GRU encoders w/ unsupervised 'skip-thoughts' pre-training | 81.4 | 98.8 | 15m | Order-Embeddings of Images and Language | - |
| 200D decomposable attention model with intra-sentence attention | 86.8 | 90.5 | 580k | A Decomposable Attention Model for Natural Language Inference | - |
| 600D (300+300) BiLSTM encoders with intra-attention and symbolic preproc. | 85.0 | 85.9 | 2.8m | Learning Natural Language Inference using Bidirectional LSTM model and Inner-Attention | - |
| 600D BiLSTM with generalized pooling | 86.6 | 94.9 | 65m | Enhancing Sentence Embedding with Generalized Pooling | - |
| Enhanced Sequential Inference Model (ESIM; Chen et al., 2017a) | 88.0 | - | - | Enhanced LSTM for Natural Language Inference | - |
| 300D Reinforced Self-Attention Network | 86.3 | 92.6 | 3.1m | Reinforced Self-Attention Network: a Hybrid of Hard and Soft Attention for Sequence Modeling | - |
| 600D (300+300) Deep Gated Attn. BiLSTM encoders | 85.5 | 90.5 | 12m | Recurrent Neural Network-Based Sentence Encoder with Gated Attention for Natural Language Inference | - |
| 300D Residual stacked encoders | 85.7 | 89.8 | 9.7m | Shortcut-Stacked Sentence Encoders for Multi-Domain Inference | - |
| ESIM + ELMo Ensemble | 89.3 | 92.1 | 40m | Deep contextualized word representations | - |
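
For reference, the sketch below shows how test-accuracy figures like those in the table are typically computed: run an NLI classifier over the SNLI test split and score predictions against the gold labels. This is a minimal, illustrative example, not the evaluation code of any paper listed above; the checkpoint name (`roberta-large-mnli`) and its output-label order are assumptions, so check the model card of whatever model you actually evaluate.

```python
# Minimal SNLI test-set evaluation sketch (assumptions: checkpoint name and
# its label order; verify against the model card before relying on results).
import torch
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL = "roberta-large-mnli"  # assumed example checkpoint, not from the table
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL).eval()

# SNLI gold labels: 0 = entailment, 1 = neutral, 2 = contradiction;
# examples without annotator consensus carry label -1 and are skipped.
test = load_dataset("snli", split="test").filter(lambda ex: ex["label"] != -1)

# Assumed mapping from this checkpoint's output order
# (contradiction, neutral, entailment) to SNLI's label ids.
to_snli = {0: 2, 1: 1, 2: 0}

correct = 0
with torch.no_grad():
    for ex in test:  # one example at a time for clarity; batch in practice
        enc = tokenizer(ex["premise"], ex["hypothesis"],
                        truncation=True, return_tensors="pt")
        pred = model(**enc).logits.argmax(dim=-1).item()
        correct += int(to_snli[pred] == ex["label"])

print(f"SNLI test accuracy: {correct / len(test):.3f}")
```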