AI cracked an Erdős math problem. Now experts want guardrails

The model disproved a famous conjecture, raising questions about trust, credit and access

An animation of moving a circle around a plane of dots to include as many dot pairs as possible

Paul Erdős thought that the best way to arrange as many pairs of points as possible at the same distance from each other would be to use a regular grid, with the points spaced so that as many as possible fall onto circles. As you add more points, the number of pairs will increase, but only slightly, he conjectured. An AI model found a more complicated way to arrange pairs of points so that their number actually grows at a larger rate.

Kai Williams

Think about placing dots on a flat surface. You want as many pairs as possible to be separated by the same distance. For any number of dots, what is the greatest possible number of pairs that can be exactly that far apart?

The question, what mathematicians call the unit distance problem, seems simple. The answer is tricky. Eighty years ago, in 1946, the famous mathematician Paul Erdős proposed what he thought was the answer, but no one had been able to prove or disprove his conjecture. At least, not until now.
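To make the counting concrete, here is a minimal Python sketch, not anything taken from Erdős' paper or the AI's proof. It brute-forces every pair of points in a small square grid and reports which distance is repeated most often. The helper names `square_grid` and `best_repeated_distance` are made up for this illustration, and squared distances are used so the arithmetic stays in whole numbers.

```python
from collections import Counter
from itertools import combinations


def square_grid(side):
    """Hypothetical helper: the side x side block of integer grid points."""
    return [(x, y) for x in range(side) for y in range(side)]


def best_repeated_distance(points):
    """Return (squared_distance, pair_count) for the most repeated distance.

    Brute force over all pairs, so only suitable for small point sets.
    Ties are broken in favor of the smaller squared distance.
    """
    counts = Counter(
        (x1 - x2) ** 2 + (y1 - y2) ** 2
        for (x1, y1), (x2, y2) in combinations(points, 2)
    )
    return max(counts.items(), key=lambda item: (item[1], -item[0]))


if __name__ == "__main__":
    for side in (2, 3, 5):
        d2, pairs = best_repeated_distance(square_grid(side))
        print(f"{side * side} points: squared distance {d2} occurs {pairs} times")
```

On a 5-by-5 grid, the winning distance is not the spacing between neighbors but the square root of 5, the length of a chess knight's move, because eight directions on the grid produce that distance and only four produce the neighbor spacing. That is a small-scale version of picking the spacing so that many points fall onto the same circle.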

Researchers at OpenAI gave an AI model Erdős’ conjecture and walked away. When they returned, they discovered a breakthrough: The model had disproved the conjecture in a mathematical proof posted May 20 on OpenAI.com.

“It’s a beautiful piece of mathematics that has been discovered,” says Melanie Matchett Wood of Harvard University, who contributed remarks to an accompanying paper in which outside experts reviewed the AI’s result. The discovery bolsters hopes that AI can contribute to scientific understanding.

But the AI proof relied on perseverance rather than creative insight and has raised concerns about how mathematics will be done going forward. On June 2 a group of experts published a declaration calling for tight guardrails around AI in mathematical research. As of June 5, the declaration had 1,590 signatures.

A breakthrough for math, but maybe not for AI

The AI model that produced this result isn’t publicly available yet, but OpenAI says it is a general-purpose large language model trained for reasoning. It did not use any math-specific tools or software. And “we didn’t guide the model in any particular way,” says OpenAI researcher Sébastien Bubeck.

The original prompt, composed by AI, described the conjecture and instructed the model that a complete solution must either prove or disprove it. Mathematicians had believed the conjecture was true. Yet the model tried to disprove it instead.

Wood sees the result as a breakthrough for mathematics. The AI came up with a counterexample using tools from two of the oldest and most foundational mathematical fields: algebra and number theory. It seems that these areas shouldn’t have anything to do with this geometry question, Wood says. But the result “shows that tools from one part of mathematics can be applied really fruitfully in this other area of mathematics.” She thinks this result will inspire mathematicians to think of new ways to apply those same tools.

A circle of dots
The AI produced a counterexample to Erdős’ grid arrangement, which isn’t easy to visualize. It involved building a complicated grid in a high-dimensional space, then projecting it onto a flat plane. But this picture gets at the gist of the idea. It shows an arrangement of points constructed in a similar way. You can play around with it here. Credit: Kai Williams/ChatGPT, based on an idea by Will Sawin

She’s not convinced, however, that this is a breakthrough in artificial intelligence. When she read the solution, it seemed to her that the latest, publicly available AI models could have come up with it. (In fact, one researcher posted on X that he had reproduced the proof using a publicly available model.)

Mathematician Thomas Bloom of the University of Manchester in England had a similar reaction. He noted in the paper from outside experts on the achievement that it would have been “truly incredible” if the AI had managed to prove the conjecture, as that kind of solution would require creative insight.

Bubeck concedes that “this proof isn’t exactly the spark of genius that we see sometimes in mathematics.” AI still struggles to make leaps of discovery. But the tech can patiently slog through a huge number of unlikely strategies.

Who will check AI’s work?

Still, as the declaration calling for guardrails notes, AI technology also threatens our ability to produce responsible, verifiable and ethical mathematics.

For one thing, AI’s reasoning can be unreliable. In this case, the AI model’s proof happened to be relatively easy for a human expert to verify, Bloom says. But he has seen people on the internet who claim they have a solution to some open problem. These people have used AI to generate hundreds of pages of math that they can’t understand or even read. “It could be right. It could be nonsense. Who’s going to be able to check this?” Bloom says.

If mathematicians knew the probability that an AI-generated proof was correct, that would help. But as Wood notes, OpenAI does not disclose how often its internal model has failed to solve an open problem in math or, even worse, produced an incorrect solution with flawed reasoning.

OpenAI’s Bubeck says that the team ran their prompt on the Erdős conjecture through the same model multiple times, and it produced the correct solution in 50 percent of those trials. His colleague Lijie Chen says that the new model is better than current models at generating an “I cannot solve it” response when it runs into difficulty on a problem. But data to support these claims have not been released or peer-reviewed. And OpenAI will not reveal how much time the model spent working on its solution.

Wood, Bloom and those who signed the declaration have other concerns too. Right now, AI generates mathematical reasoning without showing what work inspired the ideas. That clashes with mathematicians’ standard practice of giving credit to the work that inspired a breakthrough.

“LLMs have read ALL the papers. They have read all the commentary and notes, and everything that’s online…. It’s not clear that there’s a way for [AI] to reasonably attribute the source of the ideas,” Wood says.

Access is another concern, Bloom says. If the most powerful tools are expensive and private, mathematics could become less open and democratic, and some people may question why they should learn math at all, he says.

Wood, Bloom and some other mathematicians are cautiously optimistic, however. “I do think [AI] is going to become an indispensable tool in mathematics,” Wood says.