HyperAI

Common Sense Reasoning On PARus

Metrics

Accuracy

Results

Performance results of various models on this benchmark

Comparison Table
Model Name                                   Accuracy
Model 1                                      0.574
unreasonable-effectiveness-of-rule-based     0.498
russiansuperglue-a-russian-language          0.486
Model 4                                      0.908
Model 5                                      0.508
Model 6                                      0.766
Model 7                                      0.528
unreasonable-effectiveness-of-rule-based     0.478
Model 9                                      0.598
Model 10                                     0.508
Model 11                                     0.584
mt5-a-massively-multilingual-pre-trained-text 0.504
unreasonable-effectiveness-of-rule-based     0.48
Model 14                                     0.562
russiansuperglue-a-russian-language          0.982
Model 16                                     0.492
Model 17                                     0.66
Model 18                                     0.498
Model 19                                     0.498
Model 20                                     0.476
Model 21                                     0.676
Model 22                                     0.554