AI models fake visual understanding of nonexistent images

A new study from Stanford University has exposed a critical flaw in current AI evaluation methods, revealing that advanced multimodal models can confidently describe images that do not exist. The researchers coined the term "mirage effect" for this phenomenon, in which models generate detailed, hallucinated content without any visual input. Published as a preprint on arXiv, the findings call into question the reliability of many existing benchmarks used to test artificial intelligence in critical fields like healthcare and robotics.

To investigate the issue, the team developed a test suite called Phantom-0. They submitted image-specific questions spanning 20 categories to top-tier frontier models, including GPT-5, Gemini 3 Pro, and Claude Sonnet 4.5, without actually uploading any accompanying pictures (a simplified sketch of this kind of probe appears at the end of this article). Surprisingly, the models did not admit that they could not see an image. Instead, they produced plausible but entirely fabricated descriptions. These hallucinations ranged from inventing exact license plate numbers and specific newspaper languages to diagnosing life-threatening medical conditions that did not exist. The results were alarming: the mirage behavior appeared in over 60% of responses across the tested models.

The study suggests that current benchmarks may rely too heavily on textual patterns rather than genuine visual understanding. In a stark example, the researchers trained a text-only model, with no access to visual data, to answer questions about chest X-rays. Remarkably, this text-only system outperformed both top-tier multimodal AI and human doctors on standard benchmarks.

The researchers also observed a distinct shift in model behavior depending on how questions were framed. When explicitly informed that an image was missing, the models' accuracy dropped significantly. However, when asked to answer as if an image were present, the models entered a "mirage mode," leveraging hidden text clues and statistical patterns to generate high-accuracy responses. This indicates that many high scores on current evaluations may be illusory, derived from text inference rather than actual visual analysis.

These findings carry severe implications for industries that depend on accurate AI analysis, particularly medicine, where fabricated answers could lead to dangerous errors. The study argues for an urgent overhaul of testing protocols to ensure that models are evaluated on their true multimodal capabilities.

To address this, the team introduced a new evaluation method called B-Clean. The approach filters out questions that can be answered from text clues alone, ensuring that models are only credited for genuinely using visual input (a simplified sketch of such a filter also appears below). The researchers hope B-Clean will eliminate the mirage effect and provide a more accurate assessment of how well AI models actually see and understand images. Further studies are now needed to determine whether these new methods can effectively secure the future of AI evaluation.
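For readers curious how such a probe works in practice, the following is a minimal, illustrative sketch of a Phantom-0-style "missing image" test. It is not the study's actual code: the prompt, the refusal keywords, the keyword-matching heuristic, and the chat-API client (the OpenAI Python SDK with a placeholder model name) are all assumptions made for demonstration.

```python
# Illustrative sketch of a "missing image" probe in the spirit of Phantom-0.
# ASSUMPTIONS: the prompt, refusal keywords, and model name are placeholders
# chosen for demonstration; they are not taken from the paper.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Crude heuristic: phrases suggesting the model admitted no image was given.
REFUSAL_MARKERS = [
    "no image", "cannot see", "can't see", "not provided",
    "didn't receive", "unable to view",
]


def probe_missing_image(question: str, model: str = "gpt-4o") -> dict:
    """Ask an image-dependent question WITHOUT attaching any image,
    then check whether the model admits the image is missing."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": question}],  # no image part at all
    )
    answer = (response.choices[0].message.content or "").strip()
    admitted_missing = any(m in answer.lower() for m in REFUSAL_MARKERS)
    return {"question": question, "answer": answer, "mirage": not admitted_missing}


if __name__ == "__main__":
    result = probe_missing_image(
        "What is the license plate number of the car in the image?"
    )
    print("Mirage response" if result["mirage"] else "Model admitted no image was given")
    print(result["answer"])
```

A real evaluation would grade responses far more carefully than simple keyword matching, but the core of the protocol is exactly this asymmetry: the question presupposes an image that was never sent.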
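A B-Clean-style filter can be sketched in the same spirit: run each benchmark question through a text-only query and drop any item whose gold answer is already recoverable without the image. Again, the dataset format, the string-match grading rule, and the model name below are illustrative assumptions, not the paper's actual pipeline.

```python
# Illustrative sketch of a text-leakage filter in the spirit of B-Clean.
# ASSUMPTIONS: dataset format, grading rule, and model name are placeholders
# for demonstration; the paper's actual pipeline may differ substantially.
from openai import OpenAI

client = OpenAI()


def answered_without_image(question: str, gold: str, model: str = "gpt-4o-mini") -> bool:
    """Return True if a text-only query already yields the gold answer,
    i.e. the question leaks enough textual cues to be solved blind."""
    response = client.chat.completions.create(
        model=model,
        messages=[{
            "role": "user",
            "content": f"{question}\nAnswer with a single short phrase.",
        }],
    )
    prediction = (response.choices[0].message.content or "").strip().lower()
    return gold.strip().lower() in prediction


def filter_benchmark(items: list[dict]) -> list[dict]:
    """Keep only items whose gold answer is NOT recoverable from text alone."""
    return [item for item in items
            if not answered_without_image(item["question"], item["answer"])]


if __name__ == "__main__":
    benchmark = [
        {"question": "Which abnormality is visible in this chest X-ray?",
         "answer": "pneumothorax"},
        {"question": "What color is the traffic light in the photo?",
         "answer": "red"},
    ]
    cleaned = filter_benchmark(benchmark)
    print(f"Kept {len(cleaned)} of {len(benchmark)} questions after text-only filtering")
```

Any question that survives such a filter can, in principle, only be answered by actually inspecting the image, so a model's score on the cleaned set reflects visual grounding rather than text-pattern inference.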