Natural Language Inference On Terra
Evaluation Metric
Accuracy
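The leaderboard below reports plain accuracy. As a minimal sketch (the label names are assumed for illustration; TERRa is a binary textual-entailment task), accuracy is simply the fraction of examples whose predicted label matches the gold label:

```python
from typing import Sequence

def accuracy(predictions: Sequence[str], labels: Sequence[str]) -> float:
    """Fraction of examples whose predicted label matches the gold label."""
    assert len(predictions) == len(labels)
    correct = sum(p == g for p, g in zip(predictions, labels))
    return correct / len(labels)

# Label names are hypothetical, used only to illustrate the computation.
print(accuracy(
    ["entailment", "not_entailment", "entailment"],
    ["entailment", "entailment", "entailment"],
))  # 0.666...
```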
Evaluation Results
Performance of each model on this benchmark
Model Name | Accuracy | Paper Title | Repository |
---|---|---|---|
YaLM 1.0B few-shot | 0.605 | - | - |
RuGPT3Small | 0.488 | - | - |
Multilingual Bert | 0.617 | - | - |
ruBert-base finetune | 0.703 | - | - |
SBERT_Large_mt_ru_finetuning | 0.637 | - | - |
RuBERT plain | 0.642 | - | - |
MT5 Large | 0.561 | mT5: A massively multilingual pre-trained text-to-text transformer | - |
RuGPT3XL few-shot | 0.573 | - | - |
Random weighted | 0.483 | Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks | - |
ruBert-large finetune | 0.704 | - | - |
SBERT_Large | 0.637 | - | - |
Golden Transformer | 0.871 | - | - |
RuBERT conversational | 0.64 | - | - |
ruRoberta-large finetune | 0.801 | - | - |
ruT5-large-finetune | 0.747 | - | - |
heuristic majority | 0.549 | Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks | - |
ruT5-base-finetune | 0.692 | - | - |
Human Benchmark | 0.92 | RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark | - |
majority_class | 0.513 | Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks | - |
RuGPT3Large | 0.654 | - | - |
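For reference, the sketch below loads the TERRa data and scores a trivial majority-class baseline with plain accuracy. The Hugging Face dataset identifier `russian_super_glue` and the `terra` config name are assumptions and may differ in your environment; this is not the evaluation setup used by the entries above.

```python
# Sketch only: the dataset id "russian_super_glue" and the "terra" config
# are assumptions about the Hugging Face Hub and may need adjusting.
from collections import Counter
from datasets import load_dataset

terra = load_dataset("russian_super_glue", "terra")
val = terra["validation"]

# Majority-class baseline: predict the most frequent training label for
# every validation example, then compute plain accuracy.
majority_label = Counter(terra["train"]["label"]).most_common(1)[0][0]
predictions = [majority_label] * len(val)
correct = sum(p == g for p, g in zip(predictions, val["label"]))
print(f"majority-class accuracy: {correct / len(val):.3f}")
```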