HyperAI

Common Sense Reasoning On Parus

Metrics

Accuracy
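Accuracy here is simply the fraction of examples for which the model picks the correct alternative (PARus is a binary-choice task). As a minimal illustrative sketch, not part of the benchmark tooling, the metric can be computed from gold labels and predictions like this:

```python
def accuracy(gold, pred):
    """Fraction of predictions that match the gold labels."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have equal length")
    correct = sum(g == p for g, p in zip(gold, pred))
    return correct / len(gold)

# PARus labels are 0 or 1 (which of two alternatives is correct).
print(accuracy([0, 1, 1, 0], [0, 1, 0, 0]))  # 0.75
```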

Results

Performance results of various models on this benchmark

Comparison table

Model name                                        Accuracy
Model 1                                           0.574
unreasonable-effectiveness-of-rule-based          0.498
russiansuperglue-a-russian-language               0.486
Model 4                                           0.908
Model 5                                           0.508
Model 6                                           0.766
Model 7                                           0.528
unreasonable-effectiveness-of-rule-based          0.478
Model 9                                           0.598
Model 10                                          0.508
Model 11                                          0.584
mt5-a-massively-multilingual-pre-trained-text     0.504
unreasonable-effectiveness-of-rule-based          0.48
Model 14                                          0.562
russiansuperglue-a-russian-language               0.982
Model 16                                          0.492
Model 17                                          0.66
Model 18                                          0.498
Model 19                                          0.498
Model 20                                          0.476
Model 21                                          0.676
Model 22                                          0.554