HyperAI

Question Answering on KILT: ELI5

Metrics

F1
Rouge-L
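
As a rough illustration of the two metrics listed above, the following is a minimal sketch of token-level F1 and ROUGE-L (the F-measure based on the longest common subsequence). It is not the benchmark's official scorer: the `tokenize` helper, the lowercase-and-whitespace tokenization, and the function names are assumptions made for this example, and an official evaluation script may normalize text differently (for instance by stripping punctuation or articles) and so produce different numbers.

```python
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Assumption: lowercase and split on whitespace. Official scorers may
    # apply further normalization (punctuation, articles, stemming).
    return text.lower().split()


def token_f1(prediction: str, reference: str) -> float:
    """Bag-of-tokens F1: harmonic mean of token precision and recall."""
    pred, ref = tokenize(prediction), tokenize(reference)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both strings.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def lcs_length(a: list[str], b: list[str]) -> int:
    """Length of the longest common subsequence, using row-by-row DP."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(prediction: str, reference: str) -> float:
    """ROUGE-L F-measure: like token F1, but overlap must preserve order."""
    pred, ref = tokenize(prediction), tokenize(reference)
    lcs = lcs_length(pred, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(pred)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    pred = "the cat sat on the mat"
    ref = "the cat lay on the mat"
    print(token_f1(pred, ref), rouge_l_f1(pred, ref))
```

The two metrics differ only in how overlap is counted: F1 ignores word order, while ROUGE-L rewards tokens that appear in the same order in both texts. Both functions return values between 0 and 1.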

Results

Performance results of various models on this benchmark

Comparison Table
Model Name                                     F1     Rouge-L
kilt-a-benchmark-for-knowledge-intensive       17.88  17.41
kilt-a-benchmark-for-knowledge-intensive       16.1   19.08
an-efficient-memory-augmented-transformer-for  19.03  20.91
read-before-generate-faithful-long-form        24.53  27.13
knowledge-infused-decoding-1                   -      26.3
hurdles-to-progress-in-long-form-question      23.1   23.4
kilt-a-benchmark-for-knowledge-intensive       14.51  14.05