HyperAI초신경

BBH

Evaluation Metrics

bbh

bbhbooleanexpressions

bbhcausaljudgement

bbhdateunderstanding

bbhdisambiguationqa

bbhdycklanguages

bbhformalfallacies

bbhgeometricshapes

bbhhyperbaton

bbhlogicaldeductionfiveobjects

bbhlogicaldeductionsevenobjects

bbhlogicaldeductionthreeobjects

bbhmovierecommendation

bbhmultisteparithmetictwo

bbhnavigate

bbhobjectcounting

bbhpenguinsinatable

bbhreasoningaboutcoloredobjects

bbhruinnames

bbhsalienttranslationerrordetection

bbhsnarks

bbhsportsunderstanding

bbhtemporalsequences

bbhtrackingshuffledobjectsfiveobjects

bbhtrackingshuffledobjectssevenobjects

bbhtrackingshuffledobjectsthreeobjects

bbhweboflies

bbhwordsorting

key

model

num

org

rank

time

Evaluation Results

Performance results of each model on this benchmark

Model Name	bbh	bbhbooleanexpressions	bbhcausaljudgement	bbhdateunderstanding	bbhdisambiguationqa	bbhdycklanguages	bbhformalfallacies	bbhgeometricshapes	bbhhyperbaton	bbhlogicaldeductionfiveobjects	bbhlogicaldeductionsevenobjects	bbhlogicaldeductionthreeobjects	bbhmovierecommendation	bbhmultisteparithmetictwo	bbhnavigate	bbhobjectcounting	bbhpenguinsinatable	bbhreasoningaboutcoloredobjects	bbhruinnames	bbhsalienttranslationerrordetection	bbhsnarks	bbhsportsunderstanding	bbhtemporalsequences	bbhtrackingshuffledobjectsfiveobjects	bbhtrackingshuffledobjectssevenobjects	bbhtrackingshuffledobjectsthreeobjects	bbhweboflies	bbhwordsorting	key	model	num	org	rank	time	Paper Title	Repository
Chat	86.7	96.4	72.2	90.0	85.6	63.2	81.2	49.6	99.2	83.6	58.8	98.4	87.2	87.6	98.8	99.6	97.3	97.6	89.2	69.6	90.4	95.2	100.0	100.0	100.0	100.0	100.0	50.8	1	GPT-4	N/A	OpenAI	1	2023/3/15	-	-
