
Reading Comprehension on MuSeRC

Evaluation Metrics

Average F1
EM
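The MuSeRC task pairs each question with several candidate answers, each labeled correct or incorrect. As a rough illustration of how the two leaderboard metrics can be computed, here is a minimal Python sketch. The data layout (gold and predicted labels stored as dictionaries keyed by question ID) and the function names are illustrative assumptions, not the benchmark's official scoring script; the sketch also implements the pooled per-answer F1 variant, since the leaderboard does not state whether "Average F1" is pooled over all answer candidates or macro-averaged per question.

```python
from typing import Dict, List


def average_f1(gold: Dict[str, List[int]], pred: Dict[str, List[int]]) -> float:
    """Binary F1 over all answer candidates, pooled across questions."""
    tp = fp = fn = 0
    for qid, gold_labels in gold.items():
        for g, p in zip(gold_labels, pred[qid]):
            if p == 1 and g == 1:
                tp += 1
            elif p == 1 and g == 0:
                fp += 1
            elif p == 0 and g == 1:
                fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def exact_match(gold: Dict[str, List[int]], pred: Dict[str, List[int]]) -> float:
    """Fraction of questions whose full set of answer labels is predicted correctly."""
    correct = sum(1 for qid in gold if gold[qid] == pred[qid])
    return correct / len(gold)


if __name__ == "__main__":
    # Two toy questions with binary labels per candidate answer.
    gold = {"q1": [1, 0, 1], "q2": [0, 1]}
    pred = {"q1": [1, 0, 0], "q2": [0, 1]}
    print(f"Average F1: {average_f1(gold, pred):.3f}")  # 0.800
    print(f"EM: {exact_match(gold, pred):.3f}")         # 0.500
```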

Evaluation Results

Performance of each model on this benchmark:

| Model | Average F1 | EM | Paper Title |
| --- | --- | --- | --- |
| Golden Transformer | 0.941 | 0.819 | - |
| MT5 Large | 0.844 | 0.543 | mT5: A massively multilingual pre-trained text-to-text transformer |
| ruRoberta-large finetune | 0.83 | 0.561 | - |
| ruT5-large-finetune | 0.815 | 0.537 | - |
| Human Benchmark | 0.806 | 0.42 | RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark |
| ruT5-base-finetune | 0.769 | 0.446 | - |
| ruBert-large finetune | 0.76 | 0.427 | - |
| ruBert-base finetune | 0.742 | 0.399 | - |
| RuGPT3XL few-shot | 0.74 | 0.546 | - |
| RuGPT3Large | 0.729 | 0.333 | - |
| RuBERT plain | 0.711 | 0.324 | - |
| RuGPT3Medium | 0.706 | 0.308 | - |
| RuBERT conversational | 0.687 | 0.278 | - |
| YaLM 1.0B few-shot | 0.673 | 0.364 | - |
| heuristic majority | 0.671 | 0.237 | Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks |
| RuGPT3Small | 0.653 | 0.221 | - |
| SBERT_Large | 0.646 | 0.327 | - |
| SBERT_Large_mt_ru_finetuning | 0.642 | 0.319 | - |
| Multilingual Bert | 0.639 | 0.239 | - |
| Baseline TF-IDF1.1 | 0.587 | 0.242 | RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark |