For nearly 80 years, mathematicians have grappled with a problem that resisted every attempt at a solution. In May, an artificial intelligence (AI) model achieved a significant breakthrough on it.
OpenAI, the organization behind ChatGPT, announced that one of its internal models had resolved the planar unit distance problem, a question originally posed by the Hungarian mathematician Paul Erdős in 1946. Erdős was known for formulating numerous mathematical challenges, collectively referred to as “Erdős problems.”
While there have been previous claims of AI models resolving mathematical problems, many have faced skepticism and criticism. Nonetheless, experts assert that this particular achievement is noteworthy. According to them, had a human mathematician produced this proof, it would qualify for publication in a leading mathematics journal. Tim Gowers, a mathematician from the University of Cambridge, remarked in a commentary for OpenAI that “no previous AI-generated proof has come close” to such rigorous standards.
Thomas Bloom, a researcher at the University of Manchester who oversees the website erdosproblems.com, indicated that he had ranked the unit distance problem among his top ten Erdős problems and did not anticipate a solution would emerge soon. He emphasized that these breakthroughs are significant as they demonstrate AI’s potential to contribute meaningfully to research.
The problem is deceptively simple to state: place dots on a sheet of paper and ask which arrangement maximizes the number of pairs of dots that are exactly the same distance apart, even as the number of dots grows into the millions or trillions.
Erdős suggested that the way to achieve the greatest number of equal-distance pairs is to position the dots in a formation akin to a square grid. For many years, mathematicians believed this to be the case but lacked a formal proof. Bloom, who is among the nine mathematicians who have validated the result, said he had expected any eventual proof to confirm Erdős’ assumption.
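To make the counting concrete, here is a minimal sketch in Python. It is an illustration only, not the method used by Erdős or by the OpenAI model, and the function names are made up for this example. It builds a small square grid of dots and counts how many pairs sit at one chosen distance, using squared integer distances to avoid rounding errors.

```python
from itertools import combinations


def square_grid(side):
    """Return the dots of a side-by-side square grid as (x, y) integer pairs."""
    return [(x, y) for x in range(side) for y in range(side)]


def count_pairs_at_distance(points, dist_sq):
    """Count unordered pairs of dots whose squared distance equals dist_sq."""
    return sum(
        1
        for (x1, y1), (x2, y2) in combinations(points, 2)
        if (x1 - x2) ** 2 + (y1 - y2) ** 2 == dist_sq
    )


grid = square_grid(5)  # 25 dots
print(count_pairs_at_distance(grid, 1))  # neighbors one step apart -> 40
print(count_pairs_at_distance(grid, 5))  # "knight move" pairs, distance sqrt(5) -> 48
```

Even on this tiny grid, the chosen distance matters: 48 pairs share the distance √5, against 40 pairs at distance 1. The brute-force count checks every pair, so it is only practical for small examples.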
However, the OpenAI model did not merely confirm the square-grid theory; instead, it refuted it by uncovering a new family of arrangements that outperform the established assumption, according to OpenAI.
This discovery indicates that the model found a novel pattern, drawing from various mathematical fields to demonstrate the possibility of creating even more equidistant pairs, although this new configuration is challenging to visualize.
AI models, which once struggled with basic calculations, have evolved to solve SAT problems and tackle challenges at the Olympiad level. However, some of these claims have come under scrutiny. In October 2025, Kevin Weil, OpenAI’s former Chief Product Officer, stated on X that GPT-5 had solved ten Erdős problems, a claim he later retracted.
At that time, Bloom characterized Weil’s assertion as a “dramatic misrepresentation,” clarifying that the model had merely referenced existing literature. That episode highlights the significance of the recent result, which indicates that the AI model could “read academic papers and understand them well enough to apply them in new ways,” according to Bloom.
Shortly after OpenAI’s announcement, Google DeepMind also reported that its AI system, AlphaProof Nexus, had solved nine Erdős problems.
So, why is solving mathematical problems crucial for advancements in AI? Bloom explained that in creative writing, opinions can vary on the quality of the work. In contrast, with mathematical proofs, there is a definitive right or wrong, and those who understand the proof can reach a consensus.
Bloom noted that modern mathematics is highly specialized, leading individuals to develop deep knowledge in narrow areas. This specialization can lead to missed connections between disparate fields, while AI is not constrained by such boundaries.
The striking aspect of the Erdős result lies in how the AI model integrated knowledge from multiple domains, as noted by Sayan Ranu, a professor at the Indian Institute of Technology Delhi. He explained that the model solved a discrete geometry problem using tools from algebraic number theory, demonstrating connections that experts had previously overlooked.
By synthesizing information across a broad spectrum of literature, AI models can sometimes exhibit greater efficiency in specific areas compared to a specialist working solely within one field. They can tirelessly search for solutions.
Despite these advancements in automation, experts emphasize that human intervention remains essential for verification. Sébastien Bubeck, who is leading OpenAI’s mathematical explorations, told Scientific American that the model did not create something fundamentally new but performed like a remarkable mathematician.
Ranu cautioned that high-profile successes can be misleading, noting that while large language models (LLMs) have improved, they still occasionally make basic errors in arithmetic that do not garner media attention. He pointed out that a model achieving a significant breakthrough might simultaneously struggle with simpler calculations, indicating that while these landmark results are genuine, they do not guarantee consistent accuracy.
OpenAI relied on human mathematicians to verify and interpret the outputs, while Google DeepMind utilized coding verifiers such as Lean. Bloom explained that formal proof systems like Lean enable AI to produce code that confirms the correctness of proofs without requiring human review of each individual line; human oversight is only needed to ensure the overall accuracy.
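For readers unfamiliar with Lean, the following is a toy illustration of what a machine-checked statement looks like. It has nothing to do with the Erdős problems or with DeepMind's system; the theorem name is invented for this example. Lean accepts the file only if the proof given after `:=` actually establishes the stated claim.

```lean
-- A toy theorem: addition of natural numbers is commutative.
-- The proof simply cites a lemma from Lean's core library.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```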
So, why do these achievements matter? Bloom stated that while these results in pure mathematics may not have immediate implications for everyday life, the capability of AI to handle mathematical literature could extend to other fields, such as biology, physics, medicine, and engineering.
Ranu suggested that the ideal arrangement would be a hybrid model, where automation serves as an initial filter, with human experts stepping in as the final evaluators for more complex cases.