HyperAI超神经

AI Coding Assistant Speeds Up a Math Computation, Cutting Research Time by More Than 80%

1 day ago

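This morning, a user answered a question on MathOverflow using traditional pen-and-paper analysis. The answer could not be expressed in closed form, so he decided to approximate it by numerical simulation and asked the AI model o3-mini-high for code. Interestingly, although o3-mini-high initially declared the quantity to be infinite (it is not), it still supplied rough numerical code that approximated the desired value to one decimal place.

After realizing that the theory of Markov chains would give a more precise answer, the user asked o3-mini-high first for a theoretical formula and then for code to compute the result. o3-mini-high corrected a basic error in the user's prompt (he had written "max" instead of "min" when describing a truncation) and produced good code. The user adapted that code and posted a more numerically precise answer on MathOverflow.

Overall, o3-mini-high performed well on this task. It made one mistake at the outset, which the user corrected, and it in turn caught a mistake in the user's prompt. Code that would have taken the user roughly an hour to write on his own was generated, tested, modified, and reported in about ten minutes.

Industry observers responded positively, saying the episode shows both the strength of AI at code generation and problem solving and the potential of human-AI collaboration in scientific computing. Working together compensates for the AI's lapses in judgment while making researchers more efficient. In their view, the episode suggests that AI can assist with complex tasks and that its output improves through iterative interaction with a human.

For background, MathOverflow is a question-and-answer site for professional mathematicians, where users discuss research-level mathematical problems. o3-mini-high is OpenAI's o3-mini reasoning model run at high reasoning effort; it is offered as an option in ChatGPT and is aimed at coding, mathematics, and science questions. OpenAI is a leading company in artificial intelligence research and development.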
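The two-stage workflow described above (a rough simulation first, then an exact Markov chain computation) can be illustrated on a toy problem. The sketch below is not the MathOverflow problem and not the code o3-mini-high produced; the chain, the function names, and the parameters are all assumptions chosen for illustration. It uses a random walk on the states 0..N that moves up or down by one with equal probability, with the state truncated to that range, and asks for the expected number of steps to reach N from 0.

```python
import random


def step(state, n, rng):
    """One move of the toy chain: +1 or -1 with equal probability, kept in [0, n]."""
    if rng.random() < 0.5:
        return min(state + 1, n)  # truncate at the top with min (max would be a bug)
    return max(state - 1, 0)      # truncate at the bottom with max


def simulate_hitting_time(n, trials, seed=0):
    """Monte Carlo estimate of the expected number of steps to reach n from 0."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        state, steps = 0, 0
        while state != n:
            state = step(state, n, rng)
            steps += 1
        total += steps
    return total / trials


def exact_hitting_time(n):
    """Solve (I - Q) h = 1 over the transient states 0..n-1.

    Q holds the transition probabilities among transient states; h[i] is the
    expected number of steps to reach n when starting from state i.
    """
    # Augmented matrix [I - Q | 1]
    a = [[0.0] * (n + 1) for _ in range(n)]
    for i in range(n):
        a[i][i] += 1.0
        for target in (min(i + 1, n), max(i - 1, 0)):
            if target < n:  # moves into the absorbing state n drop out of Q
                a[i][target] -= 0.5
        a[i][n] = 1.0
    # Gauss-Jordan elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col and a[r][col] != 0.0:
                factor = a[r][col] / a[col][col]
                for c in range(col, n + 1):
                    a[r][c] -= factor * a[col][c]
    return [a[i][n] / a[i][i] for i in range(n)]


if __name__ == "__main__":
    n = 5
    print("simulated:", simulate_hitting_time(n, trials=20000))
    print("exact:    ", exact_hitting_time(n)[0])
```

For this toy chain the exact expected hitting time from state 0 works out to N(N+1), so 30 for N = 5. The simulation only approaches that value with sampling noise, which mirrors the gap between the rough first estimate and the more precise Markov chain answer in the story.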

Related Links

This morning I gave an answer to a MathOverflow question using traditional pen-and-paper analysis: https://mathoverflow.net/a/489533/766 . The answer was not in closed form, so I wanted to simulate it approximately. At this point I asked o3-mini-high for some code for this. Interestingly, it first declared that the quantity I was trying to compute was infinite (it wasn't), but nevertheless provided numerical code which did give a rough approximation to the quantity I wanted (to one decimal place): https://chatgpt.com/share/67d71204-3510-800e-8bca-11bfbf53fc3d . At that point I figured out that one should use the theory of Markov chains to get a more precise answer and asked o3-mini-high first for a theoretical formula, and then code to compute the result. Interestingly, it was able to correct a basic error in the prompt (I had written max instead of min when writing a truncation), and gave me perfectly good code, which I was able to adapt to then give a more numerically precise answer to the MO question.

So all in all a pretty good assist from o3; it made a mistake that I corrected, but I also made a mistake that it corrected, and code that would have taken perhaps an hour of my time on my own was generated, tested, modified, and reported in maybe ten minutes.