HyperAI초신경

Question Answering on KILT ELI5

Evaluation Metrics

F1
Rouge-L
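
As a rough illustration of what these two metrics measure, the sketch below computes a token-level F1 and a Rouge-L F-score for a single prediction/reference pair. It is a minimal sketch, not the official scorer for this benchmark: the helper names (`tokenize`, `token_f1`, `lcs_length`, `rouge_l_f1`) are hypothetical, and the whitespace-only, lowercase tokenization is a simplifying assumption. Leaderboard values such as those in the table are presumably these scores averaged over the test set and scaled to 0-100.

```python
from collections import Counter


def tokenize(text):
    # Simplifying assumption: lowercase and split on whitespace only.
    return text.lower().split()


def token_f1(prediction, reference):
    """Bag-of-words F1: harmonic mean of token precision and recall."""
    pred, ref = tokenize(prediction), tokenize(reference)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(prediction, reference):
    """Rouge-L F-score: like token F1, but overlap is the LCS length,
    so word order matters."""
    pred, ref = tokenize(prediction), tokenize(reference)
    lcs = lcs_length(pred, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(pred)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    reference = "the sky is blue"
    prediction = "blue is the sky"
    print(token_f1(prediction, reference))    # same words, any order
    print(rouge_l_f1(prediction, reference))  # penalizes the reordering
```

The two metrics differ in how they treat word order: token F1 ignores it entirely, while Rouge-L only credits tokens that appear in the same relative order in both texts, which is why a model can score differently on the two columns.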

Evaluation Results

Performance of each model on this benchmark

Comparison Table
Model name                                       F1      Rouge-L
kilt-a-benchmark-for-knowledge-intensive         17.88   17.41
kilt-a-benchmark-for-knowledge-intensive         16.1    19.08
an-efficient-memory-augmented-transformer-for    19.03   20.91
read-before-generate-faithful-long-form          24.53   27.13
knowledge-infused-decoding-1                     -       26.3
hurdles-to-progress-in-long-form-question        23.1    23.4
kilt-a-benchmark-for-knowledge-intensive         14.51   14.05