HyperAI

Common Sense Reasoning On PARus

Metrics

Accuracy

Results

Performance results of various models on this benchmark

Comparison Table
Model Name                                   Accuracy
Model 1                                      0.574
unreasonable-effectiveness-of-rule-based     0.498
russiansuperglue-a-russian-language          0.486
Model 4                                      0.908
Model 5                                      0.508
Model 6                                      0.766
Model 7                                      0.528
unreasonable-effectiveness-of-rule-based     0.478
Model 9                                      0.598
Model 10                                     0.508
Model 11                                     0.584
mt5-a-massively-multilingual-pre-trained-text 0.504
unreasonable-effectiveness-of-rule-based     0.48
Model 14                                     0.562
russiansuperglue-a-russian-language          0.982
Model 16                                     0.492
Model 17                                     0.66
Model 18                                     0.498
Model 19                                     0.498
Model 20                                     0.476
Model 21                                     0.676
Model 22                                     0.554