HyperAI초신경

Question Answering On Bamboogle

Evaluation Metric

Accuracy

Evaluation Results

Performance of each model on this benchmark

Model	Accuracy (%)	Paper Title	Repository
ReST meets ReAct (PaLM 2-L + Google Search)	76.1	ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent	-
MCR (code-davinci-002) + Google Search	66.5	Answering Questions by Meta-Reasoning over Multiple Chains of Thought
RALM (LLaMA2-13B + Google Search)	62.7	Making Retrieval-Augmented Language Models Robust to Irrelevant Context
Self-ask (GPT-3; davinci-002) + Google Search	60.0	Measuring and Narrowing the Compositionality Gap in Language Models
Self-ask (GPT-3; davinci-002)	57.6	Measuring and Narrowing the Compositionality Gap in Language Models
Chain-of-Thought (GPT-3; davinci-002)	46.4	Measuring and Narrowing the Compositionality Gap in Language Models
FireAct	44.0	FireAct: Toward Language Agent Fine-tuning	-
Direct Prompting (GPT-3; davinci-002)	17.6	Measuring and Narrowing the Compositionality Gap in Language Models
Google Search	0	Measuring and Narrowing the Compositionality Gap in Language Models
