Harvard and MIT Researchers Find AI Models Can Predict but Lack Deep Understanding of Scientific Principles
A new study by researchers at Harvard and MIT highlights a significant limitation of large language models (LLMs) in the realm of scientific understanding: while these AI models excel at making predictions, they fall short when it comes to explaining the principles that underlie those predictions. The study targets a fundamental question in AI and in the pursuit of artificial general intelligence (AGI): do current models merely predict sequences of data, or do they encode a genuine understanding of the world?

To explore this, the researchers trained a transformer-based model to predict planetary orbits, a task analogous to Johannes Kepler's historical predictions of planetary motion around the sun. The key test was whether the model, having learned to predict orbits, had also internalized the underlying Newtonian mechanics, specifically Newton's law of universal gravitation. The researchers reasoned that if the model could predict orbits accurately yet failed to encode Newton's law, this would show that it lacks a coherent model of the physical world (a simple consistency check of this kind is sketched at the end of this article). Such a result would challenge the notion that current LLMs truly grasp complex scientific concepts, a capability widely considered essential for AGI.

Orbital mechanics is a particularly apt choice because it mirrors the historical progression of scientific discovery. Just as Isaac Newton built upon Kepler's empirical laws of planetary motion to formulate the law of gravitation, the researchers wanted to see whether the AI could bridge the same gap between observation and theory.

Their findings suggest that while AI models can make accurate predictions from patterns in data, they do not possess the deeper understanding needed to make genuine scientific discoveries. This underscores a critical distinction between predicting outcomes and explaining the mechanisms behind them, a dichotomy that is central to scientific inquiry and extends well beyond AI.

The implications are significant: current AI models are not yet ready to replace human scientists in the discovery process. They remain powerful tools for pattern recognition and data analysis, but they lack the cognitive depth needed to formulate new scientific theories or principles. This insight serves as a useful checkpoint in the ongoing development of AI and AGI, pointing future research toward systems that can understand and explain rather than merely predict.
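To make the distinction between predicting orbits and encoding the force law concrete, here is a minimal, self-contained sketch of one way such a consistency check could work. It is an illustration only, not the study's actual methodology: the toy integrator below stands in for the positions a trained transformer would predict, and the probe simply checks whether the accelerations implied by a position sequence follow an inverse-square law in the distance to the central body.

```python
import numpy as np

# Stand-in for a trained sequence model: integrate a two-body orbit so the
# script is self-contained. In a real probe, these positions would come from
# the transformer's predicted trajectory instead.
def predicted_trajectory(n_steps=5000, dt=1e-3, gm=1.0):
    pos = np.array([1.0, 0.0])
    vel = np.array([0.0, 0.9])           # sub-circular speed -> elliptical orbit
    out = np.empty((n_steps, 2))
    for i in range(n_steps):
        r = np.linalg.norm(pos)
        acc = -gm * pos / r**3            # inverse-square attraction toward origin
        vel = vel + acc * dt              # symplectic Euler step
        pos = pos + vel * dt
        out[i] = pos
    return out, dt

def implied_force_law_exponent(positions, dt):
    """Estimate the exponent p in |a| ~ r**p from a position sequence,
    using central finite differences to recover the acceleration."""
    acc = (positions[2:] - 2 * positions[1:-1] + positions[:-2]) / dt**2
    r = np.linalg.norm(positions[1:-1], axis=1)
    a_mag = np.linalg.norm(acc, axis=1)
    # Fit log|a| = p * log r + c; Newtonian gravity implies p close to -2.
    p, _ = np.polyfit(np.log(r), np.log(a_mag), 1)
    return p

traj, dt = predicted_trajectory()
print("implied force-law exponent:", implied_force_law_exponent(traj, dt))
```

A model whose predictions are consistent with Newtonian gravity should yield an exponent near -2 under a check like this; a model that merely extrapolates surface patterns in the data can still predict positions well while failing such a consistency test, which is the gap the study draws attention to.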