HyperAI초신경

Question Answering on TriviaQA

Evaluation Metrics

EM
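
EM here is read as exact match: the percentage of questions for which the predicted answer equals one of the reference answers. The page does not say how answers are normalized before comparison, so the sketch below assumes the common convention of lowercasing, stripping punctuation and English articles, and collapsing whitespace. The function names are illustrative and are not taken from this page or from any specific evaluation script.

```python
import re
import string


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: this normalization is a common convention, not something
    specified by the leaderboard itself.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold_answers: list[str]) -> bool:
    """Return True if the prediction matches any reference after normalization."""
    pred = normalize_answer(prediction)
    return any(pred == normalize_answer(gold) for gold in gold_answers)


def em_score(predictions: list[str], references: list[list[str]]) -> float:
    """Percentage (0 to 100) of predictions that exactly match a reference."""
    if not predictions:
        return 0.0
    hits = sum(exact_match(p, refs) for p, refs in zip(predictions, references))
    return 100.0 * hits / len(predictions)
```

Under these assumptions, a system that answers one of two questions correctly scores 50.0, which is the scale the EM values on this page appear to use.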

Evaluation Results

Performance of each model on this benchmark

Comparison Table
Model name | EM
llama-open-and-efficient-foundation-language-1 | 73.0
breaking-the-ceiling-of-the-llm-community-by | 79.29
mistral-7b | 69.9
end-to-end-training-of-multi-document-reader | 71.4
glam-efficient-scaling-of-language-models | 75.8
rankrag-unifying-context-ranking-with | 82.9
ra-dit-retrieval-augmented-dual-instruction | 75.4
search-o1-agentic-search-enhanced-large | -
llama-open-and-efficient-foundation-language-1 | 71.6
linkbert-pretraining-language-models-with | -
rankrag-unifying-context-ranking-with | 86.5
fie-building-a-global-probability-space-by | 72.6
reasonbert-pre-trained-to-reason-with-distant | -
dyrex-dynamic-query-representation-for | -
reinforced-mnemonic-reader-for-machine | 46.94
model-card-and-evaluations-for-claude-models | 87.5
big-bird-transformers-for-longer-sequences | -
gpt-4-technical-report-1 | 84.8
palm-2-technical-report-1 | 75.2
model-card-and-evaluations-for-claude-models | 78.9
chatqa-building-gpt-4-level-conversational-qa | 69.0
glam-efficient-scaling-of-language-models | 75.8
replug-retrieval-augmented-black-box-language | 76.8
understand-what-llm-needs-dual-preference | -
branch-train-mix-mixing-expert-llms-into-a | 57.1
shakti-a-2-5-billion-parameter-small-language | 58.2
palm-scaling-language-modeling-with-pathways-1 | 76.9
spanbert-improving-pre-training-by | -
llama-2-open-foundation-and-fine-tuned-chat | 85
memen-multi-layer-embedding-with-memory | 43.16
rankrag-unifying-context-ranking-with | 72.6
unitedqa-a-hybrid-approach-for-open-domain | -
model-card-and-evaluations-for-claude-models | 86.7
palm-scaling-language-modeling-with-pathways-1 | 81.4
language-models-are-few-shot-learners | 71.2
llama-open-and-efficient-foundation-language-1 | 72.6
reasonbert-pre-trained-to-reason-with-distant | -
chatqa-building-gpt-4-level-conversational-qa | 81.0
llama-open-and-efficient-foundation-language-1 | 68.2
simple-and-effective-multi-paragraph-reading | 66.37
chatqa-building-gpt-4-level-conversational-qa | 85.6
replug-retrieval-augmented-black-box-language | 77.3
glam-efficient-scaling-of-language-models | 71.3
palm-2-technical-report-1 | 81.7
distilling-knowledge-from-reader-to-retriever-1 | 72.1
Model 46 | 87
dense-passage-retrieval-for-open-domain | 56.8
finetuned-language-models-are-zero-shot | 56.7
dynamic-integration-of-background-knowledge | 50.56
leveraging-passage-retrieval-with-generative | 67.6
palm-scaling-language-modeling-with-pathways-1 | 81.4
mention-memory-incorporating-textual-1 | 65.8
190600300 | 45
retrieval-augmented-generation-for-knowledge | 56.1
memoreader-large-scale-reading-comprehension | 67.21
palm-2-technical-report-1 | 86.1