
AI Struggles to Fully Grasp Human Concepts Like Flowers Due to Lack of Sensory Experience

Artificial intelligence (AI) tools like ChatGPT, despite their advanced capabilities and vast training datasets, still fall short of representing sensory and motor experiences as richly as humans do, according to a recent study published in Nature Human Behaviour. The research, led by Qihui Xu, a postdoctoral researcher in psychology at The Ohio State University, highlights the limitations of large language models (LLMs) in grasping the full richness of human concepts, especially those rooted in sensory and physical interactions.

The Study's Context and Methodology

Xu and her team conducted a comparative analysis of human and machine understanding of 4,442 words, ranging from tangible objects like "flower" and "hoof" to abstract concepts like "humorous" and "swing." The study employed two sets of rating norms to assess the alignment between human and LLM representations of these words: the Glasgow Norms and the Lancaster Norms. The Glasgow Norms rate words on nine dimensions, including emotional arousal, concreteness, and imageability; a flower, for instance, might be rated highly on how emotionally arousing it is and how easily it can be visualized. The Lancaster Norms, by contrast, capture how strongly concepts are linked to sensory and motor experience, such as the sense of smell or the bodily actions a word evokes.

Key Findings

When comparing human and LLM responses, the study found that AI models, including those from OpenAI (GPT-3.5 and GPT-4) and Google (PaLM and Gemini), performed well on words with no direct sensory or motor connections. However, they struggled significantly with words that involve sensory experiences or physical actions. While an LLM can approximate the concept of "smell" associated with a flower, it cannot fully capture the intricate combination of experiences humans associate with one: the aroma, the tactile silkiness of the petals, and the emotional joy it brings. This disparity arises because LLMs rely primarily on language, and sometimes images, lacking the multidimensional sensory and motor experience that humans bring to bear on their understanding.

Implications for AI-Human Interaction

The findings have broader implications for how AI and humans interact. If AI tools cannot fully comprehend human concepts, their ability to communicate effectively and empathetically with humans may be limited. As Xu points out, "If AI construes the world in a fundamentally different way from humans, it could affect how it interacts with us." Human understanding of concepts is deeply rooted in personal experience and physical interaction, creating a rich, multifaceted cognitive framework. LLMs, by contrast, build their knowledge from vast amounts of text, which, however extensive, lacks the depth and nuance provided by real-world sensory and motor input. This discrepancy means that even after consuming far more text than a human could read in a lifetime, AI still cannot fully capture certain human concepts.

Future Directions

Despite these limitations, the study suggests that LLMs are improving. Models trained on both text and images, for instance, represented vision-related concepts better than text-only models did. This improvement indicates that integrating more diverse forms of input, such as sensor data and real-world interaction via robotics, could enhance AI's ability to understand and engage with the physical world in a more human-like manner.
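To make the alignment analysis described above concrete, the sketch below shows one way a human-versus-LLM rating comparison could be set up in Python. It is a minimal illustration, not the authors' actual pipeline: the HUMAN_NORMS values are invented stand-ins for the published Glasgow ratings, and rate_word is a hypothetical placeholder for a real LLM call. The final step, rank-correlating human and model ratings on a single dimension, reflects the general idea of scoring alignment; a full analysis would repeat it across all dimensions, word lists, and models.

```python
# Illustrative sketch of a human-LLM norm-alignment check (not the study's
# actual pipeline). Assumed inputs:
#   - HUMAN_NORMS: made-up mean human ratings (1-7 scale) on one
#     Glasgow-style dimension (imageability), standing in for the real norms
#   - rate_word(): a hypothetical placeholder for prompting an LLM
import random
from scipy.stats import spearmanr

HUMAN_NORMS = {  # word -> mean human imageability rating (invented values)
    "flower": 6.6, "hoof": 5.9, "humorous": 3.1, "justice": 2.4,
}

def rate_word(word: str, dimension: str = "imageable") -> float:
    """Hypothetical placeholder for an LLM call.

    A real version would prompt a model, e.g. "On a scale of 1 (low) to
    7 (high), how imageable is the concept 'flower'? Answer with a single
    number.", and parse the numeric reply.
    """
    return random.uniform(1.0, 7.0)  # stand-in so the sketch runs end to end

words = sorted(HUMAN_NORMS)
human = [HUMAN_NORMS[w] for w in words]
model = [rate_word(w) for w in words]

# Alignment on this dimension: rank correlation between human and model ratings.
rho, p = spearmanr(human, model)
print(f"human-LLM Spearman rho on imageability: {rho:.2f} (p = {p:.3g})")
```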
Industry and Expert Insights

Industry insiders and experts are divided on the implications of this study. Some argue that while AI's current limitations in sensory and motor understanding are significant, ongoing advances in multimodal learning and sensor integration could bridge the gap. Dr. Yingying Peng, a co-author from the Hong Kong Polytechnic University, emphasizes that "the potential for AI to gain a more holistic understanding of the world is vast, and we are only scratching the surface." Meanwhile, companies like OpenAI and Google continue to invest heavily in research aimed at making their AI models more sophisticated. OpenAI's development of GPT-4, which shows improvements on sensory-related tasks, and Google's exploration of multimodal models like PaLM and Gemini highlight the industry's commitment to overcoming these challenges.

Company Profiles

The Ohio State University: Known for its interdisciplinary research and academic excellence, OSU has contributed significantly to fields like psychology and AI. Qihui Xu's work exemplifies the university's focus on the intersection of human cognition and machine learning.

OpenAI: A leading AI research organization, OpenAI aims to develop and deploy AI in ways that benefit humanity. Its models, including GPT-3.5 and GPT-4, have set new standards in natural language processing, although this study indicates areas for further improvement.

Google: A global technology leader, Google's AI division is at the forefront of developing advanced AI systems. Models like PaLM and Gemini are part of Google's effort to create more versatile and context-aware AI tools.

In conclusion, while AI has made remarkable strides in mimicking human language and understanding, it still faces significant challenges in fully representing sensory and motor experiences. The continuing evolution and integration of diverse data types, however, offer a promising path toward more comprehensive AI models.
