HyperAI초신경

Question Answering on HotpotQA

Evaluation Metrics

ANS-EM
ANS-F1
JOINT-EM
JOINT-F1
SUP-EM
SUP-F1
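
The six metrics above pair an exact-match (EM) score and an F1 score for answers (ANS), supporting facts (SUP), and both together (JOINT). The following is a minimal sketch of how such scores could be computed for a single example, not the benchmark's official scorer. It assumes whitespace tokenization and lowercase-only normalization, and it assumes the joint scores multiply the answer and supporting-fact precision, recall, and EM values. The function names `exact_match`, `precision_recall_f1`, and `joint_scores` are hypothetical.

```python
from collections import Counter


def exact_match(prediction: str, gold: str) -> float:
    """Return 1.0 if the two strings match after simple normalization, else 0.0.

    Assumption: normalization here is only strip + lowercase.
    """
    return float(prediction.strip().lower() == gold.strip().lower())


def precision_recall_f1(pred_items, gold_items):
    """Precision, recall, and F1 over two collections of items.

    Items can be answer tokens (ANS) or supporting-fact identifiers (SUP).
    """
    pred_counts = Counter(pred_items)
    gold_counts = Counter(gold_items)
    # Multiset intersection counts how many predicted items are also gold.
    overlap = sum((pred_counts & gold_counts).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(gold_counts.values())
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def joint_scores(ans_em, ans_p, ans_r, sup_em, sup_p, sup_r):
    """Combine answer and supporting-fact scores into joint EM and joint F1.

    Assumption: joint precision/recall are products of the two components.
    """
    joint_p = ans_p * sup_p
    joint_r = ans_r * sup_r
    if joint_p + joint_r > 0:
        joint_f1 = 2 * joint_p * joint_r / (joint_p + joint_r)
    else:
        joint_f1 = 0.0
    joint_em = ans_em * sup_em
    return joint_em, joint_f1


if __name__ == "__main__":
    ans_p, ans_r, ans_f1 = precision_recall_f1(
        "the eiffel tower".split(), "eiffel tower".split()
    )
    print(ans_p, ans_r, ans_f1)
```

Under these assumptions a model can score well on ANS-F1 and still get JOINT-EM of 0 on an example when the predicted supporting facts are not an exact match, which is consistent with rows in the results table that show non-zero answer scores next to zero joint scores.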

Evaluation Results

Performance results of each model on this benchmark

Comparison Table
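| Model Name | ANS-EM | ANS-F1 | JOINT-EM | JOINT-F1 | SUP-EM | SUP-F1 |
| --- | --- | --- | --- | --- | --- | --- |
| Model 1 | 0.598 | 0.727 | 0.345 | 0.602 | 0.480 | 0.749 |
| Model 2 | 0.284 | 0.386 | 0.086 | 0.245 | 0.147 | 0.472 |
| Model 3 | 0.307 | 0.402 | 0.000 | 0.000 | 0.000 | 0.000 |
| Model 4 | 0.354 | 0.463 | 0.000 | 0.255 | 0.001 | 0.432 |
| Model 5 | 0.299 | 0.391 | 0.083 | 0.258 | 0.132 | 0.497 |
| hopretriever-retrieve-hops-over-wikipedia-to | 0.671 | 0.799 | 0.432 | 0.706 | 0.574 | 0.835 |
| Model 7 | 0.394 | 0.514 | 0.133 | 0.370 | 0.242 | 0.585 |
| Model 8 | 0.598 | 0.727 | 0.345 | 0.602 | 0.480 | 0.749 |
| Model 9 | 0.335 | 0.427 | 0.110 | 0.284 | 0.156 | 0.493 |
| answering-complex-open-domain-questions | 0.379 | 0.486 | 0.180 | 0.391 | 0.307 | 0.642 |
| Model 11 | 0.608 | 0.739 | 0.380 | 0.639 | 0.531 | 0.793 |
| Model 12 | 0.617 | 0.746 | 0.368 | 0.629 | 0.500 | 0.772 |
| Model 13 | 0.482 | 0.613 | 0.306 | 0.530 | 0.483 | 0.739 |
| Model 14 | 0.074 | 0.121 | 0.000 | 0.011 | 0.000 | 0.078 |
| Model 15 | 0.433 | 0.538 | 0.145 | 0.391 | 0.219 | 0.596 |
| chain-of-skills-a-configurable-model-for-open | 0.674 | 0.801 | 0.457 | 0.717 | 0.613 | 0.853 |
| beam-retrieval-general-end-to-end-retrieval | 0.727 | 0.850 | 0.505 | 0.775 | 0.663 | 0.901 |
| Model 18 | 0.588 | 0.717 | 0.293 | 0.568 | 0.416 | 0.725 |
| Model 19 | 0.601 | 0.730 | 0.359 | 0.617 | 0.500 | 0.769 |
| Model 20 | 0.671 | 0.799 | 0.431 | 0.698 | 0.572 | 0.826 |
| Model 21 | 0.662 | 0.793 | 0.420 | 0.700 | 0.573 | 0.840 |
| hotpotqa-a-dataset-for-diverse-explainable | 0.589 | 0.716 | 0.345 | 0.598 | 0.480 | 0.757 |
| Model 23 | 0.560 | 0.689 | 0.292 | 0.553 | 0.441 | 0.730 |
| dynamically-fused-graph-network-for-multi-hop | - | - | - | 0.5982 | - | - |
| Model 25 | 0.369 | 0.460 | 0.115 | 0.291 | 0.153 | 0.468 |
| Model 26 | 0.597 | 0.714 | 0.379 | 0.623 | 0.510 | 0.774 |
| Model 27 | 0.490 | 0.608 | 0.271 | 0.496 | 0.417 | 0.700 |
| Model 28 | 0.529 | 0.648 | 0.312 | 0.548 | 0.428 | 0.720 |
| multi-hop-reading-comprehension-through | 0.300 | 0.407 | 0.000 | 0.000 | 0.000 | 0.000 |
| Model 30 | 0.601 | 0.730 | 0.350 | 0.609 | 0.485 | 0.759 |
| Model 31 | 0.603 | 0.731 | 0.359 | 0.617 | 0.499 | 0.768 |
| Model 32 | 0.273 | 0.365 | 0.074 | 0.236 | 0.122 | 0.488 |
| big-bird-transformers-for-longer-sequences | - | 0.755 | - | 0.736 | - | 0.891 |
| Model 34 | 0.418 | 0.531 | 0.170 | 0.392 | 0.263 | 0.573 |
| transformer-xh-multi-evidence-reasoning-with | 0.516 | 0.641 | 0.261 | 0.513 | 0.409 | 0.714 |
| Model 36 | 0.581 | 0.710 | 0.000 | 0.000 | 0.000 | 0.000 |
| Model 37 | 0.579 | 0.699 | 0.372 | 0.607 | 0.510 | 0.768 |
| answering-while-summarizing-multi-task | 0.287 | 0.381 | 0.087 | 0.231 | 0.142 | 0.444 |
| Model 39 | 0.596 | 0.724 | 0.345 | 0.601 | 0.479 | 0.748 |
| Model 40 | 0.648 | 0.778 | 0.410 | 0.678 | 0.561 | 0.818 |
| hierarchical-graph-network-for-multi-hop | 0.567 | 0.692 | 0.356 | 0.599 | 0.500 | 0.764 |
| Model 42 | 0.646 | 0.778 | 0.411 | 0.670 | 0.557 | 0.812 |
| Model 43 | 0.581 | 0.711 | 0.000 | 0.000 | 0.000 | 0.000 |
| multi-hop-paragraph-retrieval-for-open-domain | 0.306 | 0.403 | 0.109 | 0.270 | 0.167 | 0.473 |
| Model 45 | 0.617 | 0.746 | 0.368 | 0.629 | 0.500 | 0.772 |
| Model 46 | 0.358 | 0.453 | 0.115 | 0.304 | 0.160 | 0.512 |
| Model 47 | 0.615 | 0.746 | 0.362 | 0.624 | 0.503 | 0.772 |
| a-simple-yet-strong-pipeline-for-hotpotqa | 0.555 | 0.675 | 0.329 | 0.562 | 0.456 | 0.730 |
| learning-to-retrieve-reasoning-paths-over-1 | 0.600 | 0.730 | 0.354 | 0.612 | 0.491 | 0.764 |
| Model 50 | 0.360 | 0.474 | 0.000 | 0.000 | 0.000 | 0.000 |
| Model 51 | 0.300 | 0.407 | 0.000 | 0.000 | 0.000 | 0.000 |
| retrieve-rerank-read-then-iterate-answering | 0.663 | 0.791 | 0.428 | 0.696 | 0.569 | 0.832 |
| Model 53 | 0.475 | 0.606 | 0.049 | 0.334 | 0.076 | 0.448 |
| Model 54 | 0.655 | 0.786 | 0.409 | 0.689 | 0.559 | 0.831 |
| retrieve-rerank-read-then-iterate-answering | 0.657 | 0.782 | 0.421 | 0.686 | 0.559 | 0.821 |
| Model 56 | 0.421 | 0.517 | 0.247 | 0.429 | 0.371 | 0.598 |
| Model 57 | 0.604 | 0.732 | 0.380 | 0.629 | 0.520 | 0.771 |
| Model 58 | 0.620 | 0.753 | 0.354 | 0.630 | 0.499 | 0.778 |
| multi-paragraph-reasoning-with-knowledge | 0.277 | 0.372 | 0.070 | 0.247 | 0.127 | 0.472 |
| answering-complex-open-domain-questions-with | 0.623 | 0.753 | 0.418 | 0.666 | 0.575 | 0.809 |
| Model 61 | 0.080 | 0.221 | 0.000 | 0.000 | 0.000 | 0.000 |
| Model 62 | 0.670 | 0.795 | 0.444 | 0.708 | 0.594 | 0.843 |
| Model 63 | 0.630 | 0.754 | 0.404 | 0.662 | 0.546 | 0.800 |
| cognitive-graph-for-multi-hop-reading | 0.371 | 0.489 | 0.124 | 0.349 | 0.228 | 0.577 |
| Model 65 | 0.289 | 0.391 | 0.041 | 0.209 | 0.080 | 0.406 |
| hotpotqa-a-dataset-for-diverse-explainable | 0.240 | 0.329 | 0.019 | 0.162 | 0.039 | 0.377 |
| ddrqa-dynamic-document-reranking-for-open | 0.625 | 0.759 | 0.360 | 0.639 | 0.510 | 0.789 |
| revealing-the-importance-of-semantic | 0.453 | 0.573 | 0.251 | 0.476 | 0.387 | 0.708 |
| Model 69 | 0.582 | 0.709 | 0.310 | 0.569 | 0.429 | 0.713 |
| adaptive-information-seeking-for-open-domain | 0.675 | 0.805 | 0.449 | 0.720 | 0.612 | 0.860 |
| Model 71 | 0.236 | 0.320 | 0.033 | 0.175 | 0.056 | 0.400 |
| Model 72 | 0.523 | 0.648 | 0.330 | 0.561 | 0.490 | 0.747 |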