AI-Generated Peer Reviews Flood Major Machine Learning Conference, Sparking Trust Concerns
The International Conference on Learning Representations (ICLR), a major AI conference, is facing widespread concern after an analysis found that a significant portion of its peer reviews were likely written entirely by artificial intelligence. The findings, released by Pangram Labs, a company specializing in AI text detection, indicate that about 21% of the 75,800 peer reviews submitted for ICLR 2026 were fully generated by large language models (LLMs), and that more than half of all reviews showed clear signs of AI involvement.

The issue came to light when researchers, including Graham Neubig of Carnegie Mellon University, reported receiving unusually verbose, generic, and sometimes inaccurate reviews. These included requests for statistical analyses that are not standard in machine learning papers, as well as hallucinated citations and incorrect references to data. Suspecting AI involvement, Neubig offered a reward for a tool that could verify his suspicions. Max Spero, CEO of Pangram Labs, responded by scanning all 19,490 submitted papers and their associated reviews in just 12 hours.

Pangram's AI detection model flagged 15,899 peer reviews as fully AI-generated. The analysis also found that 199 manuscripts (1%) were fully AI-written and that 9% of submissions contained more than half AI-generated text. Although 61% of submissions were mostly human-written, the high rate of AI use in both reviews and papers has raised alarms across the research community.

The results have prompted strong reactions. Desmond Elliott, a computer scientist at the University of Copenhagen, said one of the three reviews his team received was so off-target and riddled with errors that his PhD student suspected it was AI-generated; Pangram's analysis confirmed that the review was fully AI-written. The review gave the paper a borderline rating, which Elliott found deeply frustrating, especially because it may have influenced the final decision.

Bharath Hariharan, senior programme chair for ICLR 2026, acknowledged the scale of the problem, calling it the first time the conference has faced an AI issue of this magnitude. The organizers have committed to using automated tools to enforce the conference's AI policies and to ensure that future submissions and reviews comply with ethical standards.

The findings highlight a growing challenge in academic publishing: the increasing use of AI in peer review risks undermining the credibility and quality of scientific evaluation. Researchers now face the difficult task of verifying whether the feedback they receive is trustworthy and whether their own work is being judged fairly. As AI tools become more accessible, the need for robust detection methods and clear guidelines on AI use in research is more urgent than ever.
