Question Answering On Natural Questions
Metrics
EM
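EM stands for exact match: a prediction counts as correct only if, after normalization, it equals one of the reference answers. The page does not say which normalization the listed papers used, so the following is only a minimal sketch that assumes the common SQuAD-style steps (lowercasing, stripping punctuation and English articles, collapsing whitespace). The function names `normalize_answer`, `exact_match`, and `em_score` are illustrative, not taken from any of the listed papers.

```python
import re
import string


def normalize_answer(text: str) -> str:
    """Assumed SQuAD-style normalization; individual papers may differ."""
    text = text.lower()
    # Drop ASCII punctuation characters.
    text = "".join(ch for ch in text if ch not in string.punctuation)
    # Replace English articles with a space.
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    # Collapse runs of whitespace and trim the ends.
    return " ".join(text.split())


def exact_match(prediction: str, references: list[str]) -> bool:
    """True if the normalized prediction equals any normalized reference."""
    pred = normalize_answer(prediction)
    return any(pred == normalize_answer(ref) for ref in references)


def em_score(predictions: list[str], references: list[list[str]]) -> float:
    """Percentage of predictions that exactly match a reference answer."""
    if not predictions:
        raise ValueError("predictions must not be empty")
    hits = sum(exact_match(p, refs) for p, refs in zip(predictions, references))
    return 100.0 * hits / len(predictions)
```

Under these assumptions, the EM values in the table are percentages of questions answered exactly, so a score of 50.0 means that half of the evaluated questions were matched.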
Results
Performance results of various models on this benchmark
Comparison table
Model name | EM |
---|---|
rankrag-unifying-context-ranking-with | 50.0 |
palm-2-technical-report-1 | 25.3 |
Model 3 | 26.07 |
search-o1-agentic-search-enhanced-large | 34 |
few-shot-learning-with-retrieval-augmented | 42.4 |
llama-2-open-foundation-and-fine-tuned-chat | 33.0 |
mistral-7b | 28.8 |
chatqa-building-gpt-4-level-conversational-qa | 47.0 |
llama-open-and-efficient-foundation-language-1 | 35.0 |
glam-efficient-scaling-of-language-models | 26.3 |
few-shot-learning-with-retrieval-augmented | 64.0 |
replug-retrieval-augmented-black-box-language | 44.7 |
rankrag-unifying-context-ranking-with | 46.1 |
retrieval-as-attention-end-to-end-learning-of | 54.7 |
llama-open-and-efficient-foundation-language-1 | 39.9 |
rankrag-unifying-context-ranking-with | 54.2 |
dense-passage-retrieval-for-open-domain | 41.5 |
leveraging-passage-retrieval-with-generative | 54.7 |
scaling-language-models-methods-analysis-1 | 28.2 |
few-shot-learning-with-retrieval-augmented | 60.4 |
llama-open-and-efficient-foundation-language-1 | 24.9 |
palm-scaling-language-modeling-with-pathways-1 | 21.2 |
ask-me-anything-a-simple-strategy-for | 19.6 |
chatqa-building-gpt-4-level-conversational-qa | 42.7 |
ask-me-anything-a-simple-strategy-for | 13.7 |
leveraging-passage-retrieval-with-generative | 51.4 |
palm-2-technical-report-1 | 32.0 |
language-models-are-few-shot-learners | 29.9 |
palm-scaling-language-modeling-with-pathways-1 | 29.3 |
fie-building-a-global-probability-space-by | 58.4 |
llama-open-and-efficient-foundation-language-1 | 31.0 |
replug-retrieval-augmented-black-box-language | 45.5 |
realm-retrieval-augmented-language-model-pre | 40.4 |
r2-d2-a-modular-baseline-for-open-domain | 55.9 |
ask-me-anything-a-simple-strategy-for | 19.7 |
understand-what-llm-needs-dual-preference | 59.19 |
retrieval-augmented-generation-for-knowledge | 44.5 |
palm-scaling-language-modeling-with-pathways-1 | 39.6 |
glam-efficient-scaling-of-language-models | 24.7 |
glam-efficient-scaling-of-language-models | 32.5 |
end-to-end-training-of-multi-document-reader | 52.5 |
rankrag-unifying-context-ranking-with | 50.6 |
training-compute-optimal-large-language | 35.5 |
improving-language-models-by-retrieving-from | 45.5 |
blended-rag-improving-rag-retriever-augmented | 42.63 |
few-shot-learning-with-retrieval-augmented | 45.1 |
palm-2-technical-report-1 | 37.5 |