Common Sense Reasoning on PARus
Metrics
Accuracy
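Accuracy here is the fraction of PARus items (each a premise with two candidate alternatives) for which the model selects the correct alternative. A minimal sketch of the computation, assuming predictions and gold labels are given as 0/1 choice indices (names are illustrative, not from any specific evaluation library):

```python
def parus_accuracy(predictions, gold_labels):
    """Fraction of two-choice PARus items answered correctly.

    predictions, gold_labels: sequences of 0/1 choice indices
    (0 = first alternative, 1 = second alternative).
    """
    if len(predictions) != len(gold_labels):
        raise ValueError("predictions and gold_labels must have the same length")
    correct = sum(int(p == g) for p, g in zip(predictions, gold_labels))
    return correct / len(gold_labels)


# Example: 3 of 4 items correct -> accuracy 0.75
print(parus_accuracy([0, 1, 1, 0], [0, 1, 0, 0]))
```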
Results
Performance of various models on this benchmark
Model Name | Accuracy | Paper Title | Repository |
---|---|---|---|
RuBERT plain | 0.574 | - | - |
majority_class | 0.498 | Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks | - |
Baseline TF-IDF1.1 | 0.486 | RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark | - |
Golden Transformer | 0.908 | - | - |
ruRoberta-large finetune | 0.508 | - | - |
YaLM 1.0B few-shot | 0.766 | - | - |
Multilingual Bert | 0.528 | - | - |
heuristic majority | 0.478 | Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks | - |
RuGPT3Medium | 0.598 | - | - |
RuBERT conversational | 0.508 | - | - |
RuGPT3Large | 0.584 | - | - |
MT5 Large | 0.504 | mT5: A massively multilingual pre-trained text-to-text transformer | - |
Random weighted | 0.48 | Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks | - |
RuGPT3Small | 0.562 | - | - |
Human Benchmark | 0.982 | RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark | - |
ruBert-large finetune | 0.492 | - | - |
ruT5-large-finetune | 0.66 | - | - |
SBERT_Large | 0.498 | - | - |
SBERT_Large_mt_ru_finetuning | 0.498 | - | - |
ruBert-base finetune | 0.476 | - | - |