
Reading Comprehension on MuSeRC

Metrics

Average F1
EM
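
The leaderboard below is scored with this metric pair, which matches the MultiRC-style definition commonly used for MuSeRC: F1 is computed over all individual answer options, and EM is the fraction of questions whose answer options are all labeled correctly. The following sketch illustrates that computation; the function name, data layout, and the MultiRC-style definitions are assumptions for illustration, not the benchmark's official scorer.

```python
# Minimal sketch of MuSeRC/MultiRC-style metrics, assuming binary labels per
# answer option, grouped by question. Names and data layout are illustrative.
from typing import Dict, List, Tuple


def muserc_metrics(
    predictions: Dict[Tuple[str, str], List[int]],
    gold: Dict[Tuple[str, str], List[int]],
) -> Dict[str, float]:
    """Compute Average F1 (over all answer options) and EM (per question).

    Keys are (paragraph_id, question_id); values are 0/1 labels, one per
    candidate answer for that question.
    """
    tp = fp = fn = 0
    exact = 0
    for key, gold_labels in gold.items():
        pred_labels = predictions[key]
        tp += sum(p == 1 and g == 1 for p, g in zip(pred_labels, gold_labels))
        fp += sum(p == 1 and g == 0 for p, g in zip(pred_labels, gold_labels))
        fn += sum(p == 0 and g == 1 for p, g in zip(pred_labels, gold_labels))
        # EM counts a question only if every option is labeled correctly.
        exact += int(pred_labels == gold_labels)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    em = exact / len(gold) if gold else 0.0
    return {"average_f1": f1, "em": em}


if __name__ == "__main__":
    gold = {("p1", "q1"): [1, 0, 1], ("p1", "q2"): [0, 1]}
    pred = {("p1", "q1"): [1, 0, 0], ("p1", "q2"): [0, 1]}
    print(muserc_metrics(pred, gold))  # average_f1 ≈ 0.8, em = 0.5
```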

Results

Performance results of various models on this benchmark

Comparison Table

Model name                                    | Average F1 | EM
russiansuperglue-a-russian-language           | 0.587      | 0.242
russiansuperglue-a-russian-language           | 0.806      | 0.42
Model 3                                       | 0.687      | 0.278
Model 4                                       | 0.76       | 0.427
unreasonable-effectiveness-of-rule-based      | 0.45       | 0.071
Model 6                                       | 0.769      | 0.446
Model 7                                       | 0.646      | 0.327
mt5-a-massively-multilingual-pre-trained-text | 0.844      | 0.543
Model 9                                       | 0.83       | 0.561
Model 10                                      | 0.742      | 0.399
Model 11                                      | 0.729      | 0.333
Model 12                                      | 0.673      | 0.364
Model 13                                      | 0.706      | 0.308
Model 14                                      | 0.642      | 0.319
Model 15                                      | 0.941      | 0.819
Model 16                                      | 0.639      | 0.239
Model 17                                      | 0.815      | 0.537
Model 18                                      | 0.653      | 0.221
unreasonable-effectiveness-of-rule-based      | 0.671      | 0.237
Model 20                                      | 0.74       | 0.546
unreasonable-effectiveness-of-rule-based      | 0.0        | 0.0
Model 22                                      | 0.711      | 0.324