HyperAI

Question Answering on FEVER

Metrics

EM
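
EM is commonly read as exact match: the percentage of predictions that equal a reference answer after light normalization. The page does not say how this leaderboard normalizes answers, so the following is only a minimal sketch. The SQuAD-style normalization (lowercasing, removing punctuation and English articles) and the helper names `normalize`, `exact_match`, and `em_score` are assumptions made for illustration.

```python
import re
import string
from typing import Sequence


def normalize(text: str) -> str:
    """Assumed normalization: lowercase, drop punctuation and articles, squeeze spaces."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, references: Sequence[str]) -> bool:
    """True if the normalized prediction equals any normalized reference."""
    pred = normalize(prediction)
    return any(pred == normalize(ref) for ref in references)


def em_score(predictions: Sequence[str], references: Sequence[Sequence[str]]) -> float:
    """Exact-match score as a percentage over all examples (hypothetical helper)."""
    if not predictions:
        return 0.0
    hits = sum(exact_match(p, refs) for p, refs in zip(predictions, references))
    return 100.0 * hits / len(predictions)
```

Under these assumptions, a prediction of "refutes." counts as a match for the reference "REFUTES", because case and punctuation are removed before comparison.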

Results

Performance results of various models on this benchmark

Comparison table
Model name                                    EM
measuring-and-narrowing-the-compositionality  64.2
chain-of-action-faithful-and-multimodal       54.2
language-models-are-unsupervised-multitask    50
chain-of-action-faithful-and-multimodal       64.2
chain-of-action-faithful-and-multimodal       50
chain-of-action-faithful-and-multimodal       68.9
dspy-compiling-declarative-language-model     62.2
chain-of-action-faithful-and-multimodal       62.2