
OpenAI's Latest Model Aces International Math Olympiad, Surprising Skeptics and Raising AI Standards


OpenAI’s latest experimental model has achieved a groundbreaking milestone by outperforming top human participants at the International Math Olympiad (IMO), a globally renowned and exceptionally challenging mathematical competition. Alexander Wei, a member of OpenAI’s technical staff, announced on X (formerly Twitter) that the model secured gold-medal-level performance, solving five of six problems correctly under the same conditions as human competitors.

The IMO, which began in 1959 in Romania, is a two-day competition in which participants tackle three complex mathematical problems over four and a half hours each day. Notable past winners include mathematicians such as Grigori Perelman, who made significant contributions to geometry, and Terence Tao, a Fields Medal recipient. Earlier this year, Tao himself predicted on Lex Fridman’s podcast that AI would struggle to score well on the IMO, suggesting a focus on simpler competitions where the answers are numerical rather than requiring intricate proofs. OpenAI’s model proved him wrong, demonstrating a high level of sustained creative thinking and problem-solving ability.

Noam Brown, another OpenAI researcher, highlighted that the model’s endurance during the exam was particularly noteworthy. “IMO problems demand a new level of sustained creative thinking compared to past benchmarks,” Brown said, emphasizing that the model was capable of prolonged deliberation. CEO Sam Altman also chimed in, stating that the achievement marks significant progress toward general-purpose reinforcement learning and underscores the rapid advancement of AI technology. Altman noted that this success was once a distant dream for OpenAI, highlighting how far AI has come in the past decade. The model’s capabilities, which span both mathematical reasoning and broader cognitive tasks, set it apart from specialized systems like DeepMind’s AlphaGeometry, which is designed solely for mathematical tasks.
The implications of this achievement are profound. Last year, AI labs were still using elementary math to test models, and tech billionaire Peter Thiel predicted it would take at least three more years for AI to solve problems at the level of the US Math Olympiad. OpenAI’s model has exceeded those expectations, showcasing the rapid pace of technological development in the field.

Not everyone is convinced, however. Gary Marcus, a prominent AI critic, acknowledged the model’s performance as “genuinely impressive” but raised several questions: the specifics of the model’s training, the true scope of its general intelligence, its practical utility for the average person, and the cost per problem. Marcus also pointed out that the IMO has not independently verified these results.

Despite the skepticism, the success of OpenAI’s model at the IMO is a significant indicator of the progress being made in AI research. It highlights the model’s robust general-purpose capabilities and suggests that AI may soon be able to handle increasingly complex and creative tasks. Industry experts view this as a major leap forward in the development of more advanced and versatile AI systems, although they caution that further verification and practical applications are necessary to fully assess its impact.

OpenAI’s achievement not only demonstrates the capabilities of its latest model but also sets a new benchmark for AI performance in mathematical reasoning. The company plans to share more details about the model and its potential applications in the coming months. For now, this breakthrough underscores the potential and future direction of AI technology, which continues to evolve at an unprecedented rate.