- Polish mathematician Bartosz Naskrecki created a complex math problem for AI testing
- The problem took nearly 20 years to design, and its solution spans about 13 pages of detailed reasoning
- GPT-5.4 solved the problem correctly once in 11 attempts, surprising the mathematician
A Polish mathematician who once described artificial intelligence as “a very advanced calculator” has been left stunned after an AI model managed to solve a research-level mathematics problem he spent nearly two decades crafting. Bartosz Naskrecki, a mathematician at Adam Mickiewicz University in Poznań, had designed the complex challenge as part of the FrontierMath benchmark, a set of extremely difficult problems intended to test the limits of artificial intelligence in advanced mathematics. The problem represented years of accumulated expertise and careful design, with a detailed solution spanning around 13 dense pages of mathematical reasoning.
The challenge was deliberately constructed to be exceptionally difficult. According to Naskrecki, even highly trained mathematicians at the PhD level might require weeks just to determine a potential approach. The problems in FrontierMath cover advanced areas such as number theory, topology, combinatorics, and algebraic geometry, fields where creativity and deep reasoning are essential.
For years, he believed such tasks were beyond the reach of artificial intelligence. In earlier comments about AI systems, he had argued that while they could perform complicated calculations, they lacked the conceptual understanding required for genuine mathematical discovery. "AI is still like a very advanced calculator," he had said previously.
However, recent testing delivered an unexpected result.
In recent tests, a new model, GPT-5.4, attempted Naskrecki's problem 11 times and produced a correct solution once, a success rate of about 9%. Modest as that rate is, the quality of the solution surprised the mathematician, who found the moment both exciting and slightly unsettling.
"It finally happened - my personal move 37 or more. I am deeply impressed. The solution is very nice, clean, and feels almost human," Naskręcki (@nasqret) wrote on X on March 5, 2026, after reviewing the result (https://t.co/Enwz7dPYkL).
"While testing new models in the last few weeks, I felt this coming, but it's an eerie feeling to see an algorithm solve a task one has curated for about 20 years. But at least I have gained a tool that understands my idea on par with the top experts in the field. And I am now working on a completely new level. My singularity has just happened… and there is life on the other side, off to infinity," he added.
The development reflects the rapid improvement of AI reasoning systems. In recent months, advanced models have shown increasing capability in tackling research-grade mathematical problems that were previously thought to be decades away from automation.
However, experts say that AI is still far from replacing mathematicians. Many problems remain unsolved, and the systems often require multiple attempts, significant computing power, and human verification.