Adam Tauman Kalai, Ofir Nachum, Santosh S. Vempala, Edwin Zhang

Abstract
Like students facing hard exam questions, large language models sometimes guess when uncertain, producing plausible yet incorrect statements instead of admitting uncertainty. Such "hallucinations" persist even in state-of-the-art systems and undermine trust. We argue that language models hallucinate because the training and evaluation procedures reward guessing over acknowledging uncertainty, and we analyze the statistical causes of hallucinations in the modern training pipeline. Hallucinations need not be mysterious -- they originate simply as errors in binary classification. If incorrect statements cannot be distinguished from facts, then hallucinations in pretrained language models will arise through natural statistical pressures. We then argue that hallucinations persist due to the way most evaluations are graded -- language models are optimized to be good test-takers, and guessing when uncertain improves test performance. This "epidemic" of penalizing uncertain responses can only be addressed through a socio-technical mitigation: modifying the scoring of existing benchmarks that are misaligned but dominate leaderboards, rather than introducing additional hallucination evaluations. This change may steer the field toward more trustworthy AI systems.
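As a minimal illustration of the test-taking argument (a sketch, not code from the paper), the snippet below compares the expected score of guessing versus abstaining under standard binary 0/1 grading; the function name `expected_score` and the optional `credit_for_idk` parameter are assumptions introduced here for illustration. It shows that for any nonzero chance of being correct, guessing weakly dominates saying "I don't know," which is the incentive the abstract attributes to current benchmark scoring.

```python
# Sketch (not from the paper): expected score of one exam question under
# binary 0/1 grading, where abstaining ("I don't know") earns 0 points
# unless some partial credit is explicitly awarded.


def expected_score(p_correct: float, abstain: bool, credit_for_idk: float = 0.0) -> float:
    """Expected score on a single question.

    p_correct: probability the model's guess is correct.
    abstain: whether the model declines to answer.
    credit_for_idk: partial credit (if any) for abstaining; 0.0 mirrors
        the common accuracy-only grading the abstract criticizes.
    """
    return credit_for_idk if abstain else p_correct


if __name__ == "__main__":
    for p in (0.1, 0.3, 0.5):
        guess = expected_score(p, abstain=False)
        idk = expected_score(p, abstain=True)
        print(f"p_correct={p:.1f}: guess={guess:.2f}  abstain={idk:.2f}")
    # For every p > 0, guess >= abstain, so a score-maximizing model never
    # benefits from admitting uncertainty under this grading scheme.
```

Raising `credit_for_idk` above the model's confidence threshold is one way a rescored benchmark could make abstention the rational choice, which is the direction of the socio-technical fix the abstract proposes.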