Common Sense Reasoning on RWSD

Accuracy

Evaluation Results

Performance of each model on this benchmark

Model Name	Accuracy	Paper Title	Repository
RuBERT conversational	0.669	-	-
ruRoberta-large finetune	0.571	-	-
ruT5-large-finetune	0.669	-	-
Baseline TF-IDF1.1	0.662	RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark
Human Benchmark	0.84	RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark
RuGPT3Large	0.636	-	-
RuGPT3XL few-shot	0.649	-	-
Golden Transformer	0.545	-	-
Multilingual Bert	0.669	-	-
SBERT_Large_mt_ru_finetuning	0.675	-	-
MT5 Large	0.669	mT5: A massively multilingual pre-trained text-to-text transformer
RuGPT3Medium	0.669	-	-
heuristic majority	0.669	Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks	-
YaLM 1.0B few-shot	0.669	-	-
Random weighted	0.597	Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks	-
RuGPT3Small	0.669	-	-
SBERT_Large	0.662	-	-
ruBert-large finetune	0.669	-	-
ruBert-base finetune	0.669	-	-
majority_class	0.669	Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks	-
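
Many entries in the table report the same accuracy as the majority_class row (0.669). The sketch below ranks a hand-copied subset of the rows and labels each one relative to that baseline. The function name `compare_to_baseline` and the `tol` parameter are illustrative assumptions and are not part of the benchmark's tooling.

```python
# Subset of rows copied by hand from the table above: (model name, accuracy).
ROWS = [
    ("Human Benchmark", 0.84),
    ("SBERT_Large_mt_ru_finetuning", 0.675),
    ("RuBERT conversational", 0.669),
    ("majority_class", 0.669),
    ("Baseline TF-IDF1.1", 0.662),
    ("Golden Transformer", 0.545),
]


def compare_to_baseline(rows, baseline_name="majority_class", tol=1e-9):
    """Sort rows by accuracy (descending) and label each relative to the baseline.

    Hypothetical helper for illustration only.
    """
    scores = dict(rows)
    baseline = scores[baseline_name]
    # sorted() is stable, so tied rows keep their original relative order.
    ranked = sorted(rows, key=lambda row: row[1], reverse=True)
    result = []
    for name, acc in ranked:
        if abs(acc - baseline) <= tol:
            label = "ties baseline"
        elif acc > baseline:
            label = "above baseline"
        else:
            label = "below baseline"
        result.append((name, acc, label))
    return result


for name, acc, label in compare_to_baseline(ROWS):
    print(f"{name:32s} {acc:.3f}  {label}")
```

Extending `ROWS` with the remaining table entries applies the same comparison to the full leaderboard.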
