Question Answering on ConditionalQA
Evaluation Metrics
Conditional (answers)
Conditional (w/ conditions)
Overall (answers)
Overall (w/ conditions)
Evaluation Results
Performance of each model on this benchmark
Comparison Table
Model name | Conditional (answers) | Conditional (w/ conditions) | Overall (answers) | Overall (w/ conditions) |
---|---|---|---|---|
etc-encoding-long-and-structured-data-in | 39.4 / 41.8 | 2.5 / 3.4 | 35.6 / 39.8 | 26.9 / 30.8 |
leveraging-passage-retrieval-with-generative | 45.2 / 49.7 | 4.7 / 5.8 | 44.4 / 50.8 | 35.0 / 40.6 |
end-to-end-multihop-retrieval-for | 42.0 / 46.4 | 3.1 / 3.8 | 40.6 / 45.2 | 31.9 / 36.0 |