AI Pioneer Geoffrey Hinton Admits Trusting Chatbot More Than He Should, Highlights Ongoing Errors
Geoffrey Hinton, often referred to as the "Godfather of AI," has expressed concern about how much he trusts AI chatbots, particularly GPT-4. In a recent interview, Hinton admitted that he may be trusting these systems more than he should, even though he recognizes that they make errors. The admission carries weight given Hinton's extensive background in artificial intelligence research and development.
Hinton shared a personal anecdote to illustrate his point. He posed a riddle to GPT-4: "Sally has three brothers. Each of her brothers has two sisters. How many sisters does Sally have?" The correct answer is one: each brother's two sisters are Sally and one other girl, so Sally herself has only one sister. GPT-4, however, answered two. Hinton expressed surprise that even a sophisticated model like GPT-4 could still falter on such a straightforward logic problem. "It surprises me. It surprises me it still screws up on that," he said.
Not all versions of ChatGPT made the same error. Following the interview, several users on social media reported that newer iterations, such as GPT-4o and GPT-4.1, gave the correct answer. This suggests that the models are being steadily refined, though significant issues persist.
OpenAI, the company behind GPT-4, launched the model in 2023. From the outset, it set a high standard for large language models, demonstrating impressive capabilities across a range of tasks, including passing standardized tests such as the SAT, GRE, and bar exam. Its versatility quickly made it a benchmark for the industry. In May 2024, OpenAI introduced GPT-4o, touting it as an enhanced version that matched GPT-4's capabilities but was faster and more versatile, with improved handling of text, voice, and vision. Since then, OpenAI has released GPT-4.5 and, most recently, GPT-4.1, with each iteration aiming to address remaining limitations and improve performance.
OpenAI did not immediately respond to Business Insider's request for comment on Hinton's observations. Meanwhile, Google's Gemini 2.5 Pro currently leads the Chatbot Arena leaderboard, a platform where models are ranked based on user feedback. OpenAI's GPT-4o and GPT-4.5 follow closely behind, a sign of how competitive AI chatbot development has become.
Recent research by AI testing company Giskard has shed light on another challenge these models face. According to the study, prompting chatbots to give brief answers can increase their likelihood of "hallucination," or generating false information. Giskard found that leading models, including GPT-4o, Mistral, and Anthropic's Claude, were more prone to factual errors when asked to keep their responses short. The finding underscores the role that context and detail play in keeping AI chatbots accurate and reliable, especially in critical applications.
Hinton's concerns highlight the ongoing challenges in AI development, particularly in ensuring that these models reason logically and stay factually accurate on a consistent basis. While AI chatbots have made significant strides, their potential for errors, and the consequences of those errors in real-world use, cannot be overlooked. His caution serves as a reminder to users and developers alike that even the most advanced systems call for skepticism and careful validation.
Industry insiders agree that Hinton's observations matter for the field. They emphasize the need for transparency about AI capabilities and for continuous testing and improvement. Companies like OpenAI and Google are investing heavily in AI research, driven by its potential benefits but also by the need to address significant ethical and practical concerns. Hinton's standing as a respected figure in AI adds weight to these discussions and to calls for the industry to remain vigilant and proactive in addressing the limitations of AI systems.
Geoffrey Hinton is a British-Canadian cognitive psychologist and computer scientist widely recognized for his seminal contributions to deep learning and neural networks. His work has been fundamental to the advance of AI technologies, making him one of the field's most influential figures.
OpenAI, founded in 2015, is a leading AI research organization dedicated to developing and promoting friendly AI that benefits humanity. The company's continuous updates to its models reflect its commitment to improving AI performance and reliability.