Semantic Textual Similarity on MRPC

Results

Performance results of various models on this benchmark

Model Name	F1	Paper Title
BigBird	91.5	Big Bird: Transformers for Longer Sequences
T5-3B	92.5	Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
MobileBERT	-	MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices
BERT-Base	-	Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tuning
Charformer-Tall	91.4	Charformer: Fast Character Transformers via Gradient-based Subword Tokenization
RoBERTa-large 355M + Entailment as Few-shot Learner	91.0	Entailment as Few-Shot Learner
Nyströmformer	88.1	Nyströmformer: A Nyström-Based Algorithm for Approximating Self-Attention
SMART-BERT	-	SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization
RoBERTa-large 355M (MLP quantized vector-wise, fine-tuned)	-	LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
FNet-Large	-	FNet: Mixing Tokens with Fourier Transforms
SqueezeBERT	-	SqueezeBERT: What can computer vision teach NLP about efficient neural networks?
XLNet (single model)	-	XLNet: Generalized Autoregressive Pretraining for Language Understanding
SMART	-	SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization
T5-Large	92.4	Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
TinyBERT-6 67M	-	TinyBERT: Distilling BERT for Natural Language Understanding
TinyBERT-4 14.5M	-	TinyBERT: Distilling BERT for Natural Language Understanding
DistilBERT 66M	-	DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
ERNIE 2.0 Base	-	ERNIE 2.0: A Continual Pre-training Framework for Language Understanding
T5-Small	89.7	Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Q8BERT (Zafrir et al., 2019)	-	Q8BERT: Quantized 8Bit BERT
