HyperAIHyperAI

Command Palette

Search for a command to run...

AI Advances in Solving Erdős Problems: Solutions, Tools, and Caveats

The page documents various contributions of AI tools to problems originally posed by the mathematician Paul Erdős. These contributions span a range of categories, including fully AI-generated solutions, partial results, literature reviews, formalized proofs, and human-AI collaborations. The data reflects the growing role of AI in mathematical research, particularly in tackling long-standing open problems. In the first category, AI tools such as AlphaEvolve, Aristotle, and GPT-5 have been used to generate solutions or partial solutions to previously open Erdős problems. For example, AlphaEvolve has made slight improvements to known constructions, found no counterexamples, or surpassed earlier results on problems like [36], [507], and [1097]. Some attempts, however, did not match known constructions, suggesting either limitations in the AI or the need for further refinement. A second category includes cases where AI tools claimed to solve problems thought to be open, only to later be found to have rediscovered or rephrased known human proofs. For instance, AI-generated solutions to problems [333], [897], and [1026] were later matched with existing literature, including work by Erdős and Newman (1977), Wirsing (1981), and Tidor, Wang, and Yang (2016). In one case, a problem was found to be misstated, and the AI correctly identified a different, intended version. A third category involves AI applying its capabilities to already solved or partially solved problems, often producing new or alternative proofs. Tools like Aristotle, AlphaProof, and Gemini have successfully re-proved results from Baumgartner (1975), Pommerenke (1961), and others, sometimes with novel approaches. In some cases, AI failed to reproduce known results despite guidance, highlighting the challenges in automating complex mathematical reasoning. Fourth, several problems have been solved through collaboration between human mathematicians and AI. Notable examples include the work of Terence Tao and colleagues on problem [1026], where multiple AI tools were used to explore ideas and verify steps. Similarly, Mehtaab Sawhney and Mark Sellke used GPT-5 to reach a full solution, and Boris Alexeev, Stijn Cambie, and others used a suite of tools to solve a problem in collaboration. Fifth, AI has been employed in literature reviews to locate existing results. While some tools successfully found known proofs, others failed to identify them, sometimes due to misformulation, misinterpretation, or over-reliance on informal descriptions. A few cases involved AI claiming a full solution but later being found to have either misstated the problem or unknowingly reproduced existing work. Sixth, a growing number of proofs have been formalized in proof assistants like Lean using AI. Tools such as Aristotle, ChatGPT, and SeedProver have been used to convert informal proofs into machine-checkable form, with some success. However, the risk of errors due to misformalization or added axioms remains, especially when proofs are suspiciously short or simple. Finally, in a few cases, human researchers used AI for secondary support—such as generating numerical data, visualizations, or initial code—without the AI being central to the solution. For example, Tao and Teravainen used ChatGPT for numerical computations and image generation, while Gafni and Tao used it for code generation. The page includes important caveats: many problems are still under review, and claims of success should be treated with caution. The success of AI in this domain is highly variable, and the results depend heavily on problem type, available data, and the quality of human oversight. Formal verification remains a key step in ensuring correctness, but even this is not foolproof. The list is incomplete, especially regarding negative outcomes, and reflects ongoing developments rather than a final assessment.

Related Links