
Reading Comprehension on MuSeRC

Metrics

Average F1
EM
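
The leaderboard reports two scores per model: Average F1 over the answer labels and exact match (EM) at the question level. The exact aggregation conventions are not spelled out on this page; the sketch below is a minimal, assumed implementation in which each MuSeRC candidate answer carries a binary label, F1 is micro-averaged over all candidate answers, and a question counts toward EM only if every one of its candidate answers is labeled correctly. The function name `muserc_scores` and the input keys `question_id`, `prediction`, and `label` are hypothetical, not part of any official evaluation script.

```python
from collections import defaultdict


def muserc_scores(examples):
    """Compute micro-averaged answer F1 and question-level exact match.

    `examples` is a list of dicts with hypothetical keys:
      question_id: identifier of the question the candidate answer belongs to
      prediction:  model's binary label for the candidate answer (0 or 1)
      label:       gold binary label for the candidate answer (0 or 1)
    """
    tp = fp = fn = 0
    by_question = defaultdict(list)

    for ex in examples:
        by_question[ex["question_id"]].append((ex["prediction"], ex["label"]))
        if ex["prediction"] == 1 and ex["label"] == 1:
            tp += 1
        elif ex["prediction"] == 1 and ex["label"] == 0:
            fp += 1
        elif ex["prediction"] == 0 and ex["label"] == 1:
            fn += 1

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0

    # EM: a question is correct only if all of its candidate answers are labeled correctly.
    em = sum(
        all(pred == gold for pred, gold in answers)
        for answers in by_question.values()
    ) / len(by_question)

    return {"Average F1": f1, "EM": em}


if __name__ == "__main__":
    demo = [
        {"question_id": "q1", "prediction": 1, "label": 1},
        {"question_id": "q1", "prediction": 0, "label": 1},
        {"question_id": "q2", "prediction": 1, "label": 1},
        {"question_id": "q2", "prediction": 0, "label": 0},
    ]
    print(muserc_scores(demo))  # {'Average F1': 0.8, 'EM': 0.5}
```

This illustrates why EM is always lower than F1 on this task: a single mislabeled candidate answer costs the whole question under EM but only one label under F1.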

Results

Performance results of various models on this benchmark

| Model Name | Average F1 | EM | Paper Title | Repository |
|---|---|---|---|---|
| Baseline TF-IDF1.1 | 0.587 | 0.242 | RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark | - |
| Human Benchmark | 0.806 | 0.42 | RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark | - |
| RuBERT conversational | 0.687 | 0.278 | - | - |
| ruBert-large finetune | 0.76 | 0.427 | - | - |
| Random weighted | 0.45 | 0.071 | Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks | - |
| ruT5-base-finetune | 0.769 | 0.446 | - | - |
| SBERT_Large | 0.646 | 0.327 | - | - |
| MT5 Large | 0.844 | 0.543 | mT5: A massively multilingual pre-trained text-to-text transformer | - |
| ruRoberta-large finetune | 0.83 | 0.561 | - | - |
| ruBert-base finetune | 0.742 | 0.399 | - | - |
| RuGPT3Large | 0.729 | 0.333 | - | - |
| YaLM 1.0B few-shot | 0.673 | 0.364 | - | - |
| RuGPT3Medium | 0.706 | 0.308 | - | - |
| SBERT_Large_mt_ru_finetuning | 0.642 | 0.319 | - | - |
| Golden Transformer | 0.941 | 0.819 | - | - |
| Multilingual Bert | 0.639 | 0.239 | - | - |
| ruT5-large-finetune | 0.815 | 0.537 | - | - |
| RuGPT3Small | 0.653 | 0.221 | - | - |
| heuristic majority | 0.671 | 0.237 | Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks | - |
| RuGPT3XL few-shot | 0.74 | 0.546 | - | - |