HyperAI超神経

Question Answering on FEVER

Evaluation metrics

EM
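
EM is the usual abbreviation for exact match: a prediction scores only if it equals a reference answer as a string. This page does not say how the answers were normalized, so the following is only a minimal sketch that assumes SQuAD-style normalization (lowercasing, removing punctuation and English articles, collapsing whitespace). The function names `normalize`, `exact_match`, and `em_score` are hypothetical and are not taken from any of the listed papers.

```python
import re
import string


def normalize(text: str) -> str:
    """Assumed SQuAD-style normalization; the benchmark's real rules may differ."""
    text = text.lower()
    # Drop ASCII punctuation characters.
    text = "".join(ch for ch in text if ch not in string.punctuation)
    # Replace English articles with a space.
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    # Collapse runs of whitespace and trim the ends.
    return " ".join(text.split())


def exact_match(prediction: str, references: list[str]) -> bool:
    """True if the normalized prediction equals any normalized reference."""
    pred = normalize(prediction)
    return any(pred == normalize(ref) for ref in references)


def em_score(predictions: list[str], references: list[list[str]]) -> float:
    """Percentage of predictions that exactly match one of their references."""
    if not predictions:
        return 0.0
    hits = sum(exact_match(p, refs) for p, refs in zip(predictions, references))
    return 100.0 * hits / len(predictions)


if __name__ == "__main__":
    # Illustrative inputs only; these are not results from the leaderboard.
    preds = ["SUPPORTS", "refutes.", "not enough info"]
    golds = [["SUPPORTS"], ["REFUTES"], ["SUPPORTS"]]
    print(round(em_score(preds, golds), 1))
```

Under these assumptions, the scores in the table would be read as the percentage of examples answered exactly right.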

Evaluation results

Performance results of each model on this benchmark

Comparison table

Model name                                      EM
measuring-and-narrowing-the-compositionality    64.2
chain-of-action-faithful-and-multimodal         54.2
language-models-are-unsupervised-multitask      50
chain-of-action-faithful-and-multimodal         64.2
chain-of-action-faithful-and-multimodal         50
chain-of-action-faithful-and-multimodal         68.9
dspy-compiling-declarative-language-model       62.2
chain-of-action-faithful-and-multimodal         62.2