Common Sense Reasoning on PARus
Metrics
Accuracy
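The Accuracy metric here is the plain fraction of correctly answered examples. Below is a minimal sketch in Python; the function and toy data are illustrative, not the benchmark's official evaluation code:

```python
from typing import Sequence

def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of examples where the predicted choice matches the gold label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must be the same length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

# PARus is a two-choice (COPA-style) task, so gold labels are 0 or 1.
# Hypothetical toy example: 3 of 4 predictions correct -> 0.75.
print(accuracy([0, 1, 1, 0], [0, 1, 0, 0]))  # 0.75
```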
Results
Performance results of various models on this benchmark
| Model | Accuracy | Paper Title |
|---|---|---|
| Human Benchmark | 0.982 | RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark |
| Golden Transformer | 0.908 | - |
| YaLM 1.0B few-shot | 0.766 | - |
| RuGPT3XL few-shot | 0.676 | - |
| ruT5-large-finetune | 0.66 | - |
| RuGPT3Medium | 0.598 | - |
| RuGPT3Large | 0.584 | - |
| RuBERT plain | 0.574 | - |
| RuGPT3Small | 0.562 | - |
| ruT5-base-finetune | 0.554 | - |
| Multilingual Bert | 0.528 | - |
| ruRoberta-large finetune | 0.508 | - |
| RuBERT conversational | 0.508 | - |
| MT5 Large | 0.504 | mT5: A massively multilingual pre-trained text-to-text transformer |
| majority_class | 0.498 | Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks |
| SBERT_Large | 0.498 | - |
| SBERT_Large_mt_ru_finetuning | 0.498 | - |
| ruBert-large finetune | 0.492 | - |
| Baseline TF-IDF1.1 | 0.486 | RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark |
| Random weighted | 0.48 | Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks |