The question in the headline is no longer hypothetical. Google has recently stated that an advanced version of its Gemini Deep Think AI model has produced research-level mathematics, including a paper generated without human intervention in the reasoning process. The development raises fresh questions about authorship and the role of artificial intelligence in formal scientific research.
From competition to research
The claims are outlined in two recent papers by Google’s research teams, which describe how Gemini Deep Think has been used to address professional research problems in mathematics, physics, and computer science.
An earlier version of the system achieved gold-medal-standard performance at the International Mathematical Olympiad in mid-2025. A later version reported similar results at the International Collegiate Programming Contest. Researchers say the system has since moved beyond competition-style questions into more open-ended scientific work.
Building a research agent
Research mathematics differs from Olympiad problems because it often requires deep knowledge of existing academic literature. To support this, Google built a research agent, internally called Aletheia, powered by Gemini Deep Think. The agent generates possible solutions, checks them using a built-in language-based verifier, and revises them through an iterative process. It is also designed to admit when it cannot solve a problem, which researchers say helps reduce incorrect or fabricated results.
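The generate–verify–revise loop described above can be sketched in a few lines. This is a hypothetical illustration only: the function names, the callback interface, and the toy "verifier" below are assumptions for the sake of the example, not Google's actual Aletheia implementation.

```python
# Hypothetical sketch of an iterative generate-verify-revise agent loop.
# `generate` and `verify` stand in for model calls; both are assumptions.

def solve_with_revision(problem, generate, verify, max_rounds=3):
    """Propose a solution, check it, and revise using verifier feedback.

    Returns the accepted solution, or None to signal that the agent
    'admits' it cannot solve the problem rather than fabricating one.
    """
    feedback = None
    for _ in range(max_rounds):
        candidate = generate(problem, feedback)
        ok, feedback = verify(problem, candidate)
        if ok:
            return candidate
    return None  # explicit admission of failure, not a guess


# Toy stand-ins: the first attempt is wrong on purpose, so the loop
# demonstrates one round of feedback-driven revision.
def toy_generate(problem, feedback):
    return problem["n"] + 1 if feedback is None else problem["n"] ** 2

def toy_verify(problem, candidate):
    ok = candidate == problem["n"] ** 2
    return ok, None if ok else "candidate does not equal n squared"

print(solve_with_revision({"n": 4}, toy_generate, toy_verify))  # → 16
```

The key design point the sketch captures is the explicit `None` return: building "I cannot solve this" into the interface is what researchers say helps reduce fabricated results.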
The system uses Google Search and web browsing tools to examine published research and avoid inaccurate citations. According to the company, Gemini Deep Think scores up to 90% on the IMO-ProofBench Advanced benchmark when given additional computing resources, and internal tests suggest its performance continues to improve on problems beyond Olympiad level.
Levels of AI involvement
Google describes varying levels of AI involvement in the research. One paper, referred to as ‘Feng26’, was generated without human reasoning involvement and calculates structure constants in arithmetic geometry. Other projects involved human-AI collaboration, including work on mathematical bounds for particle systems and an autonomous review of 700 open problems, where the system reportedly produced solutions to four open questions.
A second paper describes similar work in physics and computer science. Researchers outline collaborative methods such as an ‘Advisor’ model, in which humans guide the system through repeated cycles of proof and revision. They also describe a technique called balanced prompting, which asks the system to attempt both proof and refutation to reduce confirmation bias. The model was used to assist in reviewing theoretical computer science papers for the STOC (Symposium on Theory of Computing) 2026 conference.
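The balanced-prompting idea, as described, amounts to posing the same claim twice and comparing the outcomes. The sketch below is an illustrative assumption of how such a check might be wired up; `ask_model` and the toy model are hypothetical stand-ins, not the technique's published implementation.

```python
# Illustrative sketch of "balanced prompting": ask for both a proof and
# a refutation of the same claim to counteract confirmation bias.
# `ask_model` is a hypothetical stand-in for a real model call.

def balanced_prompt(claim, ask_model):
    """Return 'proved', 'refuted', or 'undecided' for a claim.

    'undecided' covers both conflicting evidence (the model "succeeds"
    at proof and refutation) and no evidence at all.
    """
    proof = ask_model(f"Prove: {claim}")
    refutation = ask_model(f"Refute: {claim}")
    if proof and not refutation:
        return "proved"
    if refutation and not proof:
        return "refuted"
    return "undecided"  # conflicting or absent evidence; defer to a human


# Toy model that only "reasons" about parity of a leading integer.
def toy_model(prompt):
    claim = prompt.split(": ", 1)[1]
    n = int(claim.split()[0])
    is_even = (n % 2 == 0)
    if prompt.startswith("Prove"):
        return "n = 2k" if is_even else None
    return "n = 2k + 1" if not is_even else None

print(balanced_prompt("4 is even", toy_model))  # → proved
print(balanced_prompt("3 is even", toy_model))  # → refuted
```

Requiring the refutation attempt to fail before accepting a proof is the bias-reducing step: a claim is only reported one way when the opposite attempt came up empty.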
Reported results across fields
Reported results include progress on long-standing algorithmic problems such as Max-Cut and the Steiner Tree problem; a counterexample to a 2015 conjecture in online submodular optimisation; an analysis of a machine-learning optimisation method that explains its adaptive penalty mechanism; an extension of an economic Revelation Principle to continuous real numbers; and a new analytical method for calculating gravitational radiation from cosmic strings using Gegenbauer polynomials.
Google states that these findings are of publishable quality rather than major breakthroughs, and that they are being submitted through standard academic channels. The company presents the work as part of a broader change in how scientific research may be carried out, arguing that large foundation models combined with structured reasoning tools can assist with literature review, verification, and technical problem solving.
Who is the mathematician?
The development signals a shift from AI systems solving structured competition problems to contributing to formal research. It also raises practical and philosophical questions about credit, responsibility, and the future role of human researchers in mathematics and science.
If an AI system can independently produce results that meet the standards of academic publication, the question posed in the headline becomes more than rhetorical. Who is recognised as the mathematician in such cases: the system that generated the reasoning, the engineers who built it, or the researchers who validated and submitted the work?
In at least one instance described by Google, a mathematics paper was generated without human involvement in the reasoning process itself. While researchers reviewed and communicated the findings, the logical steps leading to the result were produced autonomously by the system. That distinction matters. It moves AI from being a drafting assistant or computational aid to being the primary source of formal argument within the paper.
As AI tools take on a greater share of technical exploration and verification, the balance between human insight and machine computation may continue to shift in ways we cannot yet anticipate. What remains unclear is whether future research will centre on human intuition supported by machines, or on machine-generated reasoning guided and interpreted by humans.