AI clinches gold at the International Mathematical Olympiad for the first time ever!
AI Models Achieve Gold Medal-Level Success at the International Mathematical Olympiad
In an astonishing breakthrough, AI models from Google DeepMind and OpenAI have achieved gold medal-level performance at the 2025 International Mathematical Olympiad (IMO). Google DeepMind's "Gemini Deep Think" and an as-yet-unnamed OpenAI model each solved five of the six extremely challenging problems, scoring 35 of a possible 42 points, a feat that only about 10% of human competitors manage.
The AI models' success was primarily due to their general-purpose reasoning systems, which can process complex mathematical concepts expressed in natural language. OpenAI's approach involved massively scaling the computation used at test time, allowing the model to "think" longer and explore multiple reasoning paths in parallel. Both AI systems operated under exam-like conditions without internet access, code execution, or external aids, and their solutions were reviewed by former IMO gold medalists.
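To make the idea of exploring multiple reasoning paths in parallel more concrete, here is a minimal Python sketch of one well-known form of test-time compute scaling: sample several independent attempts and keep the most common answer (often called self-consistency or majority voting). This is not OpenAI's method, whose details have not been published. The names `solve_once` and `solve_with_voting`, the placeholder answer 42, and the 60% per-attempt accuracy are all hypothetical stand-ins for a real model call.

```python
import random
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

CORRECT_ANSWER = 42  # hypothetical: the toy problem's true answer


def solve_once(problem: str, seed: int) -> int:
    """Hypothetical stand-in for one sampled reasoning path.

    Returns the correct answer about 60% of the time and a
    random wrong answer (0..41) otherwise.
    """
    rng = random.Random(seed)
    if rng.random() < 0.6:
        return CORRECT_ANSWER
    return rng.randint(0, CORRECT_ANSWER - 1)


def solve_with_voting(problem: str, n_paths: int) -> int:
    """Run n_paths independent attempts in parallel and return the most common answer."""
    with ThreadPoolExecutor() as pool:
        answers = list(pool.map(lambda seed: solve_once(problem, seed), range(n_paths)))
    # Wrong answers are scattered across many values, so the correct one
    # tends to win the vote once enough paths are sampled.
    return Counter(answers).most_common(1)[0][0]
```

Calling `solve_with_voting("toy problem", 101)` spends roughly a hundred times the compute of a single attempt in exchange for a far more reliable answer. Real olympiad solutions are proofs rather than single numbers, so a practical system would need some way to compare or grade candidate solutions instead of matching integers.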
This achievement marks a significant milestone for AI. Until recently, it was considered nearly impossible for AI to solve IMO problems in real time and at a gold medal standard. The 2025 results mark a new phase in which language-based reasoning systems can carry out complex, multi-hour problem solving at a level comparable to the world's elite high-school students.
The IMO President, Gregor Dolinar, stated that additional evaluation and background information are required for this year's AI performances. Agence France-Presse (AFP) reported that the IMO could not verify how much computing power each AI model had used or whether there had been any human intervention during the calculations. However, both Google DeepMind and OpenAI have confirmed their AI models' success.
The AI industry's energy consumption remains a concern. Observing organizations have previously estimated that it could reach the level of a country like Argentina, a challenge that AI developers have yet to fully address. As developers invest in data center projects to support AI advancements, concerns about increased energy usage and potential fossil fuel consumption are rising.
The 66th annual IMO, held recently in Queensland, Australia, drew 641 young participants from 112 countries. While the AI models have achieved a remarkable feat, they still have a long way to go before they can fully replicate human cognitive abilities in mathematical problem-solving.
| Aspect | Google DeepMind | OpenAI | Human Performance |
|---|---|---|---|
| Year of achievement | 2025 | 2025 | Annually (630+ participants worldwide) |
| IMO problems solved | 5 out of 6 | 5 out of 6 | Top ~10% medalists solve 5+ problems |
| Scoring points | 35 (gold-medal level) | 35 (gold-medal level) | Gold medal requires about 35 points |
| Reasoning approach | Generalist natural language model ("Gemini Deep Think") | General natural language model with test-time compute scaling | Human reasoning with extensive mathematical training |
| Use of external tools | None (no internet, no code execution) | None (no internet, no code execution) | Human cognitive abilities only |
| Confirmation status | Officially confirmed by IMO | Self-evaluated, pending official confirmation | Naturally official |
| Significance | First official AI gold at IMO | Parallel breakthrough indicating major AI advance | Benchmark for intellectuals under 20 |
Despite the concerns about energy consumption and the limitations of AI in replicating human cognitive abilities, the achievement demonstrates the potential of AI technology. As AI continues to advance, it is likely to contribute to solving open problems in mathematics.
Taken together, the five-out-of-six results from Google DeepMind's "Gemini Deep Think" and OpenAI's unnamed model show that general-purpose reasoning systems working in natural language can sustain complex, multi-hour mathematical problem-solving at a gold-medal level.