HyperAI초신경




Question Answering on StrategyQA

Evaluation Metric

Accuracy

Evaluation Results

Performance of each model on this benchmark

Model	Accuracy	Paper Title	Repository
PaLM 2 (few-shot, CoT, SC)	90.4	PaLM 2 Technical Report
Rethinking with retrieval (GPT-3)	77.73	Rethinking with Retrieval: Faithful Large Language Model Inference
Self-Evaluation Guided Decoding (Codex, CoT, single reasoning chain, 6-shot gen, 4-shot eval)	77.2	-	-
U-PaLM 540B	76.6	Transcending Scaling Laws with 0.1% Extra Compute	-
PaLM 540B	76.4	Transcending Scaling Laws with 0.1% Extra Compute	-
Minerva 540B	61.9	Transcending Scaling Laws with 0.1% Extra Compute	-
SearchChain	-	Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models
Least-to-Most	-	Least-to-Most Prompting Enables Complex Reasoning in Large Language Models
SearchChain	-	Search-in-the-Chain: Interactively Enhancing Large Language Models with Search for Knowledge-intensive Tasks
CoA w/o actions	-	Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models
CoA	-	Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models
Least-to-Most	-	Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models
