HyperAI

Question Answering on HotpotQA

Metrics

ANS-EM
ANS-F1
JOINT-EM
JOINT-F1
SUP-EM
SUP-F1
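
The page lists the metric names without defining them. As a rough illustration, the sketch below follows the definitions commonly used for HotpotQA: exact match (EM) and token-level F1 for the answer (ANS), set-based EM and F1 for the supporting facts (SUP), and joint scores (JOINT) built from the product of the answer and supporting-fact precision and recall. The function names are hypothetical, and the text normalization (lowercasing and whitespace splitting) is a simplifying assumption; an official evaluation script may normalize differently.

```python
from collections import Counter


def answer_scores(prediction: str, gold: str) -> tuple[float, float, float, float]:
    """Return (em, precision, recall, f1) for one predicted/gold answer pair.

    Assumption: normalization is only lowercasing plus whitespace tokenization.
    """
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()
    em = float(pred_tokens == gold_tokens)
    # Multiset intersection counts tokens shared by prediction and gold.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return em, 0.0, 0.0, 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return em, precision, recall, 2 * precision * recall / (precision + recall)


def support_scores(predicted: set, gold: set) -> tuple[float, float, float, float]:
    """Return (em, precision, recall, f1) over sets of supporting facts.

    Each fact is assumed to be a hashable item such as (title, sentence_index).
    """
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return float(predicted == gold), precision, recall, f1


def joint_scores(ans: tuple, sup: tuple) -> tuple[float, float]:
    """Return (joint_em, joint_f1) from the two score tuples above."""
    ans_em, ans_p, ans_r, _ = ans
    sup_em, sup_p, sup_r, _ = sup
    joint_p = ans_p * sup_p
    joint_r = ans_r * sup_r
    denom = joint_p + joint_r
    joint_f1 = 2 * joint_p * joint_r / denom if denom > 0 else 0.0
    return ans_em * sup_em, joint_f1


# Hypothetical example: correct answer, one of two supporting facts correct.
ans = answer_scores("Barack Obama", "barack obama")
sup = support_scores({("A", 0), ("B", 1)}, {("A", 0), ("B", 2)})
joint_em, joint_f1 = joint_scores(ans, sup)
```

Under these definitions a system needs both the answer and every supporting fact right to score on JOINT-EM, which is consistent with the joint columns in the table being lower than the corresponding ANS and SUP columns. Leaderboard figures are averages of such per-question scores over the evaluation set.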

Results

Performance results of various models on this benchmark

Comparison table
| Model name | ANS-EM | ANS-F1 | JOINT-EM | JOINT-F1 | SUP-EM | SUP-F1 |
| --- | --- | --- | --- | --- | --- | --- |
| Model 1 | 0.598 | 0.727 | 0.345 | 0.602 | 0.480 | 0.749 |
| Model 2 | 0.284 | 0.386 | 0.086 | 0.245 | 0.147 | 0.472 |
| Model 3 | 0.307 | 0.402 | 0.000 | 0.000 | 0.000 | 0.000 |
| Model 4 | 0.354 | 0.463 | 0.000 | 0.255 | 0.001 | 0.432 |
| Model 5 | 0.299 | 0.391 | 0.083 | 0.258 | 0.132 | 0.497 |
| hopretriever-retrieve-hops-over-wikipedia-to | 0.671 | 0.799 | 0.432 | 0.706 | 0.574 | 0.835 |
| Model 7 | 0.394 | 0.514 | 0.133 | 0.370 | 0.242 | 0.585 |
| Model 8 | 0.598 | 0.727 | 0.345 | 0.602 | 0.480 | 0.749 |
| Model 9 | 0.335 | 0.427 | 0.110 | 0.284 | 0.156 | 0.493 |
| answering-complex-open-domain-questions | 0.379 | 0.486 | 0.180 | 0.391 | 0.307 | 0.642 |
| Model 11 | 0.608 | 0.739 | 0.380 | 0.639 | 0.531 | 0.793 |
| Model 12 | 0.617 | 0.746 | 0.368 | 0.629 | 0.500 | 0.772 |
| Model 13 | 0.482 | 0.613 | 0.306 | 0.530 | 0.483 | 0.739 |
| Model 14 | 0.074 | 0.121 | 0.000 | 0.011 | 0.000 | 0.078 |
| Model 15 | 0.433 | 0.538 | 0.145 | 0.391 | 0.219 | 0.596 |
| chain-of-skills-a-configurable-model-for-open | 0.674 | 0.801 | 0.457 | 0.717 | 0.613 | 0.853 |
| beam-retrieval-general-end-to-end-retrieval | 0.727 | 0.850 | 0.505 | 0.775 | 0.663 | 0.901 |
| Model 18 | 0.588 | 0.717 | 0.293 | 0.568 | 0.416 | 0.725 |
| Model 19 | 0.601 | 0.730 | 0.359 | 0.617 | 0.500 | 0.769 |
| Model 20 | 0.671 | 0.799 | 0.431 | 0.698 | 0.572 | 0.826 |
| Model 21 | 0.662 | 0.793 | 0.420 | 0.700 | 0.573 | 0.840 |
| hotpotqa-a-dataset-for-diverse-explainable | 0.589 | 0.716 | 0.345 | 0.598 | 0.480 | 0.757 |
| Model 23 | 0.560 | 0.689 | 0.292 | 0.553 | 0.441 | 0.730 |
| dynamically-fused-graph-network-for-multi-hop | - | - | - | 0.5982 | - | - |
| Model 25 | 0.369 | 0.460 | 0.115 | 0.291 | 0.153 | 0.468 |
| Model 26 | 0.597 | 0.714 | 0.379 | 0.623 | 0.510 | 0.774 |
| Model 27 | 0.490 | 0.608 | 0.271 | 0.496 | 0.417 | 0.700 |
| Model 28 | 0.529 | 0.648 | 0.312 | 0.548 | 0.428 | 0.720 |
| multi-hop-reading-comprehension-through | 0.300 | 0.407 | 0.000 | 0.000 | 0.000 | 0.000 |
| Model 30 | 0.601 | 0.730 | 0.350 | 0.609 | 0.485 | 0.759 |
| Model 31 | 0.603 | 0.731 | 0.359 | 0.617 | 0.499 | 0.768 |
| Model 32 | 0.273 | 0.365 | 0.074 | 0.236 | 0.122 | 0.488 |
| big-bird-transformers-for-longer-sequences | - | 0.755 | - | 0.736 | - | 0.891 |
| Model 34 | 0.418 | 0.531 | 0.170 | 0.392 | 0.263 | 0.573 |
| transformer-xh-multi-evidence-reasoning-with | 0.516 | 0.641 | 0.261 | 0.513 | 0.409 | 0.714 |
| Model 36 | 0.581 | 0.710 | 0.000 | 0.000 | 0.000 | 0.000 |
| Model 37 | 0.579 | 0.699 | 0.372 | 0.607 | 0.510 | 0.768 |
| answering-while-summarizing-multi-task | 0.287 | 0.381 | 0.087 | 0.231 | 0.142 | 0.444 |
| Model 39 | 0.596 | 0.724 | 0.345 | 0.601 | 0.479 | 0.748 |
| Model 40 | 0.648 | 0.778 | 0.410 | 0.678 | 0.561 | 0.818 |
| hierarchical-graph-network-for-multi-hop | 0.567 | 0.692 | 0.356 | 0.599 | 0.500 | 0.764 |
| Model 42 | 0.646 | 0.778 | 0.411 | 0.670 | 0.557 | 0.812 |
| Model 43 | 0.581 | 0.711 | 0.000 | 0.000 | 0.000 | 0.000 |
| multi-hop-paragraph-retrieval-for-open-domain | 0.306 | 0.403 | 0.109 | 0.270 | 0.167 | 0.473 |
| Model 45 | 0.617 | 0.746 | 0.368 | 0.629 | 0.500 | 0.772 |
| Model 46 | 0.358 | 0.453 | 0.115 | 0.304 | 0.160 | 0.512 |
| Model 47 | 0.615 | 0.746 | 0.362 | 0.624 | 0.503 | 0.772 |
| a-simple-yet-strong-pipeline-for-hotpotqa | 0.555 | 0.675 | 0.329 | 0.562 | 0.456 | 0.730 |
| learning-to-retrieve-reasoning-paths-over-1 | 0.600 | 0.730 | 0.354 | 0.612 | 0.491 | 0.764 |
| Model 50 | 0.360 | 0.474 | 0.000 | 0.000 | 0.000 | 0.000 |
| Model 51 | 0.300 | 0.407 | 0.000 | 0.000 | 0.000 | 0.000 |
| retrieve-rerank-read-then-iterate-answering | 0.663 | 0.791 | 0.428 | 0.696 | 0.569 | 0.832 |
| Model 53 | 0.475 | 0.606 | 0.049 | 0.334 | 0.076 | 0.448 |
| Model 54 | 0.655 | 0.786 | 0.409 | 0.689 | 0.559 | 0.831 |
| retrieve-rerank-read-then-iterate-answering | 0.657 | 0.782 | 0.421 | 0.686 | 0.559 | 0.821 |
| Model 56 | 0.421 | 0.517 | 0.247 | 0.429 | 0.371 | 0.598 |
| Model 57 | 0.604 | 0.732 | 0.380 | 0.629 | 0.520 | 0.771 |
| Model 58 | 0.620 | 0.753 | 0.354 | 0.630 | 0.499 | 0.778 |
| multi-paragraph-reasoning-with-knowledge | 0.277 | 0.372 | 0.070 | 0.247 | 0.127 | 0.472 |
| answering-complex-open-domain-questions-with | 0.623 | 0.753 | 0.418 | 0.666 | 0.575 | 0.809 |
| Model 61 | 0.080 | 0.221 | 0.000 | 0.000 | 0.000 | 0.000 |
| Model 62 | 0.670 | 0.795 | 0.444 | 0.708 | 0.594 | 0.843 |
| Model 63 | 0.630 | 0.754 | 0.404 | 0.662 | 0.546 | 0.800 |
| cognitive-graph-for-multi-hop-reading | 0.371 | 0.489 | 0.124 | 0.349 | 0.228 | 0.577 |
| Model 65 | 0.289 | 0.391 | 0.041 | 0.209 | 0.080 | 0.406 |
| hotpotqa-a-dataset-for-diverse-explainable | 0.240 | 0.329 | 0.019 | 0.162 | 0.039 | 0.377 |
| ddrqa-dynamic-document-reranking-for-open | 0.625 | 0.759 | 0.360 | 0.639 | 0.510 | 0.789 |
| revealing-the-importance-of-semantic | 0.453 | 0.573 | 0.251 | 0.476 | 0.387 | 0.708 |
| Model 69 | 0.582 | 0.709 | 0.310 | 0.569 | 0.429 | 0.713 |
| adaptive-information-seeking-for-open-domain | 0.675 | 0.805 | 0.449 | 0.720 | 0.612 | 0.860 |
| Model 71 | 0.236 | 0.320 | 0.033 | 0.175 | 0.056 | 0.400 |
| Model 72 | 0.523 | 0.648 | 0.330 | 0.561 | 0.490 | 0.747 |