1 day ago

Tony Feng, Trieu Trinh, Garrett Bingham, Jiwon Kang, Shengtong Zhang, Sang-hyun Kim, Kevin Barreto, Carl Schildkraut, Junehyuk Jung, Jaehyeon Seo

Abstract

We present a case study in semi-autonomous mathematics discovery, using Gemini to systematically evaluate 700 conjectures labeled “Open” in Bloom’s Erdős Problems database. We employ a hybrid methodology: AI-driven natural language verification to narrow the search space, followed by human expert evaluation to assess correctness and novelty. We address 13 problems that were initially marked “Open” in the database: 5 through seemingly novel autonomous solutions, and 8 through the identification of earlier solutions in the existing literature. Our findings suggest that the “Open” status of these problems owed more to obscurity than to intrinsic difficulty. We also identify and discuss the issues raised by applying AI to mathematical conjectures at scale, highlighting the difficulty of locating the relevant literature and the risk of “subconscious plagiarism” by AI. We reflect on the lessons learned from these AI-assisted efforts on the Erdős Problems.

One-sentence Summary

Researchers from Google and collaborators use Gemini to evaluate 700 open Erdős conjectures and address 13 of them, through novel AI solutions or rediscovered literature, which suggests that “open” status often reflects obscurity rather than difficulty; they also caution against the risk of subconscious plagiarism by AI in large-scale mathematical discovery.

Key Contributions

We applied Gemini to evaluate 700 'Open' conjectures from Bloom’s Erdős Problems database using a hybrid AI-human workflow, narrowing the search via natural language verification before expert validation, and resolved 13 problems—5 with novel autonomous solutions and 8 by rediscovering prior literature.
Our results indicate that the 'Open' status of these problems often stems from obscurity rather than intrinsic difficulty, as AI efficiently surfaced overlooked or trivially resolved conjectures, challenging assumptions about problem hardness in the database.
We document key challenges in scaling AI for mathematical discovery, including the difficulty of comprehensive literature retrieval and the risk of AI inadvertently reproducing known results without attribution, which we term “subconscious plagiarism.”

Introduction

The authors leverage Gemini to evaluate 700 “Open” conjectures from Bloom’s Erdős Problems database, using AI-driven natural language verification to filter candidates before human expert review. This approach matters because it scales AI-assisted discovery in math, where expert evaluation is bottlenecked by time and scarcity. Prior work faced challenges in reliably verifying correctness at scale and identifying whether solutions already existed in the literature — often leading to redundant or misleading claims. The authors’ main contribution is resolving 13 problems: 5 with seemingly novel autonomous solutions and 8 by uncovering prior literature, revealing that many “open” problems were simply obscure, not hard. They also highlight systemic issues like AI’s risk of subconscious plagiarism and the difficulty of literature retrieval — problems formal verification cannot solve.

Method

The authors use a geometric incidence framework to establish a lower bound on the growth rate of $\alpha_k$, a constant controlling the $k$-th smallest number of distinct distances determined by a point in a set of $n$ points in the plane: with the points ordered as described below, $R(x_k) < \alpha_k n^{1/2}$. The core of the method is to construct a family of circles from a carefully selected subset of $k$ points and then apply a known incidence bound, which forces $\alpha_k$ to grow at least as fast as $k^{1/4}$.

The construction begins with a set $P_n = \{x_1, \ldots, x_n\} \subset \mathbb{R}^2$ ordered so that $R(x_i)$, the number of distinct distances from $x_i$ to the other points of $P_n$, is non-decreasing in $i$. The first $k$ points, $S = \{x_1, \ldots, x_k\}$, define a family of circles $\mathcal{C} = \bigcup_{i=1}^{k} \{ \Gamma(x_i, r) : r \in D_i \}$, where $\Gamma(x_i, r)$ is the circle of radius $r$ centered at $x_i$ and $D_i$ is the set of distinct distances from $x_i$ to the rest of $P_n$. Because of the ordering, $|D_i| = R(x_i) \le R(x_k) < \alpha_k n^{1/2}$ for every $i \le k$, so the total number of circles satisfies $|\mathcal{C}| < k \alpha_k n^{1/2}$.
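The construction can be tried out on a small point set. The following is a minimal Python sketch, not the authors' code: the function names (`distance_sets`, `circle_family`, `count_incidences`) and the 4×4 lattice example are illustrative assumptions, and squared distances stand in for distances so that all comparisons stay exact in integer arithmetic.

```python
from itertools import product

def squared_distance(p, q):
    """Exact squared Euclidean distance between two integer points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def distance_sets(points):
    """For each point, the set D_i of distinct (squared) distances to the others."""
    return [
        {squared_distance(p, q) for j, q in enumerate(points) if j != i}
        for i, p in enumerate(points)
    ]

def circle_family(points, k):
    """Pick the k points with the fewest distinct distances and build their circles.

    Each circle is stored as a (center, squared_radius) pair.
    """
    dists = distance_sets(points)
    order = sorted(range(len(points)), key=lambda i: len(dists[i]))  # R non-decreasing
    centers = order[:k]
    circles = {(points[i], r2) for i in centers for r2 in dists[i]}
    return centers, circles

def count_incidences(points, circles):
    """Number of (point, circle) pairs with the point lying on the circle."""
    return sum(
        1
        for p in points
        for center, r2 in circles
        if squared_distance(p, center) == r2
    )

# Hypothetical example: the 16 points of a 4x4 integer lattice, with k = 3 centers.
grid = list(product(range(4), repeat=2))
n, k = len(grid), 3
centers, circles = circle_family(grid, k)
incidences = count_incidences(grid, circles)
print(len(circles), incidences, (n - k) * k)
```

Each point other than a given center lies on exactly one circle around that center, so the incidence count is never below $(n-k)k$, which is the lower bound the argument relies on.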

The key insight is that every point $p \in P_n \setminus S$ lies on at least $k$ circles in $\mathcal{C}$, one for each center in $S$. This yields a lower bound of $(n - k)k$ on the number of incidences between $P_n$ and $\mathcal{C}$. The Pach–Sharir incidence bound (Theorem 1) is then applied with the parameters appropriate for circles, namely $3$ degrees of freedom and multiplicity type $s=2$ (the theorem's own parameter, also written $k$ and set to $3$, is unrelated to the number $k$ of centers). The resulting upper bound on incidences is $c(3,2) \left( n^{3/5} |\mathcal{C}|^{4/5} + n + |\mathcal{C}| \right)$. Substituting the bound on $|\mathcal{C}|$, dividing by $n$, and taking the limit as $n \to \infty$ yields $k \leq C(\alpha_k^{4/5} k^{4/5} + 1)$, which implies $\alpha_k = \Omega(k^{1/4})$.
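The substitution step can be written out as follows. This is a reconstruction from the bounds stated above, not a quotation from the paper; the symbol $I(P_n, \mathcal{C})$ for the number of incidences is introduced here for convenience, and $\alpha_k$ is assumed not to depend on $n$.

$$(n-k)\,k \;\le\; I(P_n, \mathcal{C}) \;\le\; c(3,2)\left( n^{3/5}\,|\mathcal{C}|^{4/5} + n + |\mathcal{C}| \right).$$

With $|\mathcal{C}| < k\,\alpha_k\, n^{1/2}$, the leading term becomes

$$n^{3/5}\left(k\,\alpha_k\, n^{1/2}\right)^{4/5} = (k\,\alpha_k)^{4/5}\, n^{3/5 + 2/5} = (k\,\alpha_k)^{4/5}\, n.$$

Dividing both sides by $n$ and letting $n \to \infty$ with $k$ fixed:

$$\left(1 - \tfrac{k}{n}\right) k \;\le\; c(3,2)\left( (k\,\alpha_k)^{4/5} + 1 + k\,\alpha_k\, n^{-1/2} \right) \quad\Longrightarrow\quad k \;\le\; c(3,2)\left( (k\,\alpha_k)^{4/5} + 1 \right).$$

For $k \ge 2\,c(3,2)$ the additive constant can be absorbed, since then $k - c(3,2) \ge k/2$:

$$\tfrac{k}{2} \;\le\; c(3,2)\,(k\,\alpha_k)^{4/5} \quad\Longrightarrow\quad \alpha_k^{4/5} \;\ge\; \frac{k^{1/5}}{2\,c(3,2)} \quad\Longrightarrow\quad \alpha_k \;\ge\; \left(2\,c(3,2)\right)^{-5/4} k^{1/4}.$$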

This method hinges on the interplay between geometric construction and combinatorial incidence bounds, transforming a problem about distance distributions into one about curve-point incidences, and then leveraging asymptotic analysis to extract the desired growth rate.

Experiment

Aletheia, a math research agent built on Gemini Deep Think, evaluated 700 open Erdős problems, yielding 63 technically correct solutions, but only 13 meaningfully addressed Erdős’s intended problem statements; 50 correct solutions were mathematically trivial due to misinterpretation, and 12 were ambiguous.
Human evaluation revealed that 68.5% of 200 graded responses were fundamentally flawed, highlighting the high cost of verifying AI output, including debugging errors and checking for literature overlap or unintentional plagiarism.
Among 13 meaningfully correct solutions, 5 were autonomously novel (Erdős-652, 654, 935, 1040, 1051), though none rose to the level of a standalone research paper; Erdős-1051 was later expanded into a collaborative paper.
Aletheia also identified 8 problems already solved in the literature (e.g., Erdős-333, 591, 705), and rediscovered 3 solutions independently (e.g., Erdős-397, 659, 1089), raising concerns about subconscious plagiarism from training data.
Final validation revealed that some “solved” problems had flawed or ambiguous formulations, such as Erdős-75, where the listed problem was a misstatement of Erdős’s original intent, underscoring the need for precise problem framing in AI-driven research.
The effort emphasizes that while AI can assist in mathematical discovery, its outputs require rigorous human vetting, and claims of “accelerating science” must account for the hidden labor of verification and contextual accuracy.

The authors used a specialized AI agent to generate solutions for 700 open Erdős problems, then evaluated 200 candidate responses with human mathematicians. Results show that while 31.5% of responses were technically correct, only 6.5% meaningfully addressed the intended mathematical problem, highlighting the gap between syntactic correctness and semantic relevance in AI-generated mathematics. The evaluation underscores the need for rigorous human oversight and contextual understanding when assessing AI contributions to mathematical research.
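The percentages quoted above follow from the raw counts. The short Python check below is an illustrative sketch that assumes all three percentages share the same denominator of 200 graded responses, which is what the reported figures imply; the variable names are not from the paper.

```python
# Counts as reported in the summary above.
graded = 200                 # candidate responses graded by human mathematicians
technically_correct = 63     # technically correct solutions
meaningfully_correct = 13    # solutions addressing Erdős's intended problem

# Derived counts (assumption: every graded response is either flawed or technically correct).
fundamentally_flawed = graded - technically_correct
trivial_or_misread = technically_correct - meaningfully_correct

def percent(count, total=graded):
    """Share of the graded responses, in percent."""
    return 100 * count / total

print(f"flawed:               {fundamentally_flawed} ({percent(fundamentally_flawed)}%)")
print(f"technically correct:  {technically_correct} ({percent(technically_correct)}%)")
print(f"meaningfully correct: {meaningfully_correct} ({percent(meaningfully_correct)}%)")
print(f"correct but trivial:  {trivial_or_misread}")
```

Under that assumption the counts reproduce the 68.5%, 31.5%, and 6.5% figures, and the 50 trivial solutions are the difference between the 63 technically correct and the 13 meaningful ones.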

Semi-Autonomous Mathematics Discovery with Gemini: A Case Study on the Erdős Problems

Tony Feng, Trieu Trinh, Garrett Bingham, Jiwon Kang, Shengtong Zhang, Sang-hyun Kim, Kevin Barreto, Carl Schildkraut, Junehyuk Jung, Jaehyeon Seo, and 14 more
