Question Answering on StrategyQA
Metric: Accuracy (%)

Performance results of various models on this benchmark:
| Model name | Accuracy (%) | Paper Title | Repository |
| --- | --- | --- | --- |
| Rethinking with retrieval (GPT-3) | 77.73 | Rethinking with Retrieval: Faithful Large Language Model Inference | - |
| SearchChain | - | Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models | - |
| PaLM 540B | 76.4 | Transcending Scaling Laws with 0.1% Extra Compute | - |
| Least-to-Most | - | Least-to-Most Prompting Enables Complex Reasoning in Large Language Models | - |
| Self-Evaluation Guided Decoding (Codex, CoT, single reasoning chain, 6-shot gen, 4-shot eval) | 77.2 | - | - |
| SearchChain | - | Search-in-the-Chain: Interactively Enhancing Large Language Models with Search for Knowledge-intensive Tasks | - |
| CoA w/o actions | - | Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models | - |
| CoA | - | Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models | - |
| Least-to-Most | - | Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models | - |
| U-PaLM 540B | 76.6 | Transcending Scaling Laws with 0.1% Extra Compute | - |
| Minerva 540B | 61.9 | Transcending Scaling Laws with 0.1% Extra Compute | - |
| PaLM 2 (few-shot, CoT, SC) | 90.4 | PaLM 2 Technical Report | - |
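For reference, the Accuracy metric on this leaderboard is plain exact-match accuracy over StrategyQA's boolean (yes/no) gold answers. Below is a minimal illustrative sketch of how such a score could be computed; the questions, predictions, and function name are hypothetical placeholders, not benchmark data or any paper's official evaluation code.

```python
# Minimal sketch of the Accuracy metric used on this leaderboard.
# StrategyQA questions have boolean (yes/no) gold answers; accuracy is
# simply the fraction of questions whose predicted answer matches the
# gold answer. All data below is an illustrative placeholder.

def accuracy(predictions: list[bool], gold: list[bool]) -> float:
    """Fraction of predictions that exactly match the gold labels."""
    assert len(predictions) == len(gold), "prediction/gold length mismatch"
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)

# Hypothetical model outputs for three StrategyQA-style yes/no questions.
gold_answers = [True, False, True]
model_answers = [True, False, False]

print(f"Accuracy: {accuracy(model_answers, gold_answers):.2%}")  # 66.67%
```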