Question Answering on StrategyQA

Metrics

Accuracy

Results

Performance results of various models on this benchmark

Model	Accuracy	Paper Title	Repository
PaLM 2 (few-shot, CoT, SC)	90.4	PaLM 2 Technical Report
Rethinking with retrieval (GPT-3)	77.73	Rethinking with Retrieval: Faithful Large Language Model Inference
Self-Evaluation Guided Decoding (Codex, CoT, single reasoning chain, 6-shot gen, 4-shot eval)	77.2	-	-
U-PaLM 540B	76.6	Transcending Scaling Laws with 0.1% Extra Compute	-
PaLM 540B	76.4	Transcending Scaling Laws with 0.1% Extra Compute	-
Minerva 540B	61.9	Transcending Scaling Laws with 0.1% Extra Compute	-
SearchChain	-	Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models
Least-to-Most	-	Least-to-Most Prompting Enables Complex Reasoning in Large Language Models
SearchChain	-	Search-in-the-Chain: Interactively Enhancing Large Language Models with Search for Knowledge-intensive Tasks
CoA w/o actions	-	Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models
CoA	-	Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models
Least-to-Most	-	Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models
