HyperAIHyperAI

Command Palette

Search for a command to run...

AI Collaboration Yields Research-Ready Proof: ChatGPT and Aristotle Work Together to Develop and Refine a Verified Mathematical Result

Meanwhile, with additional prompting, ChatGPT successfully adapted the mathematical argument to handle both large and small cases, ultimately generating a new result that aligned with the original intent of the question. The resulting proof was shared via a Google Drive link: https://drive.google.com/file/d/1xRw8_o2C8HwmxMDnBR5OJlxXaW7jlYbz/view?usp=sharing. While the proof contained minor inaccuracies, the AI tool Aristotle was able to automatically detect and correct these gaps, producing a formally verified proof in Lean, a language designed for mathematical rigor. Next, a third participant ran Aristotle again on the corrected Lean proof to generate a more concise version. This streamlined proof was then fed into an extended, iterative conversation with ChatGPT—available at https://chatgpt.com/share/695e7cbd-605c-8010-809b-ccba75560c76—where it was transformed into a substantially more detailed and cohesive article. The revised version not only explained the proof itself but also contextualized it within existing mathematical literature and structured it with a clearer narrative flow. The final output, accessible at https://drive.google.com/file/d/1MRQfcHhrYMfMTvlZcMC3zEK7aOrUyHiQ/view?usp=sharing, represented a significant improvement over typical AI-generated content. It no longer carried the hallmark traits of generic automation and instead read with a level of sophistication and coherence comparable to an acceptable draft for a research paper. While further refinement—particularly in precision and depth of scholarly engagement—would still be beneficial, the result demonstrated the potential of collaborative human-AI workflows in advancing mathematical research. A detailed evaluation of the final writeup can be found at https://www.erdosproblems.com/forum/thread/728#post-2852.

Related Links

<p>Meanwhile, with further prompting, ChatGPT was also able to adapt the argument to handle large ? as well as small ?, thus finally producing a new result in the spirit of the intended question <a href="https://drive.google.com/file/d/1xRw8_o2C8HwmxMDnBR5OJlxXaW7jlYbz/view?usp=sharing" target="_blank" rel="nofollow noopener" translate="no"><span class="invisible">https://</span><span class="ellipsis">drive.google.com/file/d/1xRw8_</span><span class="invisible">o2C8HwmxMDnBR5OJlxXaW7jlYbz/view?usp=sharing</span></a> . Interestingly, the proof contained some minor errors in it, but the AI tool Aristotle was able to automatically repair these gaps and produce a Lean-verified proof.</p><p>At this point, a third particiant ran Aristotle again on the existing Lean proof to provide a shorter version, which a different participant then input into a lengthy back-and-forth ChatGPT session <a href="https://chatgpt.com/share/695e7cbd-605c-8010-809b-ccba75560c76" target="_blank" rel="nofollow noopener" translate="no"><span class="invisible">https://</span><span class="ellipsis">chatgpt.com/share/695e7cbd-605</span><span class="invisible">c-8010-809b-ccba75560c76</span></a> to turn it into a much more fully fleshed article that described not just the proof itself, but its connection with prior literature and with a tighter narrative structure. This resulted in a new writeup of the proof <a href="https://drive.google.com/file/d/1MRQfcHhrYMfMTvlZcMC3zEK7aOrUyHiQ/view?usp=sharing" target="_blank" rel="nofollow noopener" translate="no"><span class="invisible">https://</span><span class="ellipsis">drive.google.com/file/d/1MRQfc</span><span class="invisible">HhrYMfMTvlZcMC3zEK7aOrUyHiQ/view?usp=sharing</span></a> that had less of the feel of a generic AI-produced document, and which I judge to be at a level of writing within ballpark of an acceptable standard for a research paper, although there is still room for further improvement. (I review this text at <a href="https://www.erdosproblems.com/forum/thread/728#post-2852" target="_blank" rel="nofollow noopener" translate="no"><span class="invisible">https://www.</span><span class="ellipsis">erdosproblems.com/forum/thread</span><span class="invisible">/728#post-2852</span></a> ). (4/5)</p>
陶哲轩陶哲轩