HyperAI

Question Answering on FEVER

Metrics

EM (exact match)

Results

Performance results of various models on this benchmark

Comparison Table
Model Name                                    EM
measuring-and-narrowing-the-compositionality  64.2
chain-of-action-faithful-and-multimodal       54.2
language-models-are-unsupervised-multitask    50
chain-of-action-faithful-and-multimodal       64.2
chain-of-action-faithful-and-multimodal       50
chain-of-action-faithful-and-multimodal       68.9
dspy-compiling-declarative-language-model     62.2
chain-of-action-faithful-and-multimodal       62.2