Analysis Uncovers Significant AI Use in Peer Reviews for Leading AI Conference
The International Conference on Learning Representations (ICLR), a premier venue for machine learning research, has become the focal point of discussions around artificial intelligence in academic evaluation. A detailed analysis by Pangram Labs of ICLR 2026 submissions and reviews found that 21 percent of the peer reviews, 15,899 of roughly 75,800, were fully generated by AI systems. More than half of all reviews showed some level of AI involvement, ranging from editing assistance to complete generation.
This finding highlights a rapid shift in how scholarly feedback is produced, particularly in fast-moving fields like artificial intelligence. The analysis covered 19,490 paper submissions alongside the extensive review pool, providing one of the largest-scale looks at AI patterns in conference peer review to date.
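The headline share can be checked directly from the counts quoted above. The following is a minimal Python sketch that uses only the figures reported in this article; the variable names are illustrative, and the total review count is itself a rounded figure.

```python
# Figures as quoted in the article; total_reviews is an approximate count.
fully_ai_reviews = 15_899
total_reviews = 75_800
submissions = 19_490

# Share of reviews flagged as fully AI-generated.
share = fully_ai_reviews / total_reviews

# Average number of reviews per submitted paper.
reviews_per_paper = total_reviews / submissions

print(f"Fully AI-generated share: {share:.1%}")
print(f"Reviews per submission:   {reviews_per_paper:.1f}")
```

The share comes out at about 21 percent, matching the reported figure, and the counts imply a little under four reviews per submission.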
Methodology Behind the Pangram Labs Examination
Pangram Labs applied its EditLens detection model to the full set of ICLR materials. The tool distinguishes between fully human-written content, lightly edited AI text, moderately assisted material, heavily AI-influenced writing, and entirely AI-generated reviews. Validation on earlier ICLR cycles, such as 2022 reviews, showed very low false positive rates, with almost all content classified as human-authored in that baseline year.
The study tracked trends over time, revealing minimal AI detection before 2022, followed by steady growth. By 2025, around 20 percent of sampled ICLR reviews met criteria for full AI generation in related analyses. Paper submissions themselves remained predominantly human-written, with 61 percent classified as mostly human-produced.
Background on ICLR and Traditional Peer Review Expectations
ICLR serves as a key gathering for researchers in representation learning, deep learning, and related areas. The conference typically receives thousands of submissions each year, with acceptance rates often in the 20 to 30 percent range. Peer review forms the cornerstone of quality control, where experts volunteer time to assess novelty, technical soundness, clarity, and broader impact.
Reviewers traditionally read full manuscripts, supplementary materials, and figures before drafting detailed feedback. They consider the target venue's standards and cross-reference claims against existing literature. This process demands substantial domain expertise and careful attention, especially for complex technical work.
Growth of AI Tools in Scholarly Communication
Large language models have transformed writing assistance across disciplines since their widespread availability. Researchers increasingly use these tools for drafting, editing, summarizing literature, and even generating initial outlines. While guidelines from many publishers permit limited AI assistance with disclosure, full generation of reviews raises distinct concerns about accountability and depth.
Community discussions on platforms like X have amplified awareness. Posts from researchers such as Graham Neubig highlighted public detection results for ICLR papers and reviews, sparking widespread conversation about the implications. Other voices noted correlations between high AI content and lower review scores, as well as clustering of AI-generated feedback in embedding spaces.
Community Reactions and Emerging Concerns
News coverage in outlets such as Nature described the situation as a controversy, with academics expressing unease on social media about the volume of AI-produced reviews. Some authors reported receiving feedback that appeared verbose yet lacked substantive engagement with specific technical details or author rebuttals.
Stakeholders worry that AI-generated reviews may prioritize generic observations over nuanced critique. They can miss critical flaws in experimental design, statistical validity, or reproducibility. At the same time, the volume of submissions at top conferences strains the pool of available human reviewers, creating pressure that may encourage shortcuts.
Impacts on Research Integrity and Quality
When reviews lack genuine human judgment, the gatekeeping function of peer review weakens. Authors may receive less actionable guidance on improving their work, while flawed papers could advance further in the process. The study also noted that AI reviews often cluster together, potentially amplifying certain perspectives or overlooking diverse viewpoints.
Beyond individual papers, widespread AI use could erode trust in the scholarly record. Fields like machine learning move quickly, and reliable evaluation helps separate promising advances from incremental or questionable claims. Related examinations of other venues, including Nature Communications, have shown lower but still notable rates of AI involvement in reviews from 2025.
Perspectives from Authors, Reviewers, and Organizers
Authors value thorough, context-aware feedback that identifies both strengths and limitations. Many express frustration when reviews appear formulaic or request analyses already present in the manuscript. Reviewers, often juggling multiple commitments, seek ways to maintain high standards without burnout.
Conference organizers face the challenge of scaling processes while upholding integrity. Some have begun exploring policy updates, though enforcement remains complex given the difficulty of proving AI use in every case. Detection tools provide indicators but are not infallible, particularly as models improve.
Existing Guidelines and Emerging Policy Responses
Major publishers and conferences have issued statements on AI in scholarly work. Common themes include requirements for disclosure when AI assists in writing, prohibitions on AI as an author, and emphasis on human accountability for content accuracy. ICLR and similar venues continue to refine their approaches in light of recent data.
Some analyses suggest best practices such as using AI only after a reviewer has formed an independent opinion, then employing tools to polish language or structure. This hybrid approach preserves human insight while leveraging efficiency gains.
Detection Technologies and Their Role
Tools like the one developed by Pangram Labs analyze linguistic patterns, consistency, and statistical signatures to flag potential AI content. They perform well on historical data but require ongoing refinement as generation techniques evolve. Complementary approaches include watermarking, provenance tracking, and community norms around transparency.
Researchers are advised to treat detection results as signals rather than definitive proof, combining them with manual review of the feedback's substance and alignment with the paper's details.
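One way to picture "signal, not proof" is a triage step in which a detector label only decides whether a human takes a closer look. The sketch below is hypothetical throughout: the `AILevel` scale loosely mirrors the five categories described earlier, while the `Review` fields, the threshold, and the outcome strings are assumptions made for illustration, not Pangram's API or any conference's policy.

```python
from dataclasses import dataclass
from enum import IntEnum


class AILevel(IntEnum):
    """Hypothetical ordering of the five categories described in the article."""
    HUMAN = 0
    LIGHT_EDIT = 1
    MODERATE = 2
    HEAVY = 3
    FULLY_AI = 4


@dataclass
class Review:
    review_id: str
    level: AILevel                # label from some detector (assumed input)
    engages_with_paper: bool      # judged by a human reader, not by the detector


def triage(review: Review, threshold: AILevel = AILevel.HEAVY) -> str:
    """Return a follow-up action; the detector label alone never yields a verdict."""
    if review.level < threshold:
        return "no action"
    # A flagged review that still engages with the paper's details is only noted.
    if review.engages_with_paper:
        return "note only"
    # Flagged and generic: queue for a manual check of the review's substance.
    return "manual check"


print(triage(Review("r1", AILevel.FULLY_AI, engages_with_paper=False)))
```

In this sketch the strongest outcome is a manual check, which keeps the final judgment with a human reader, as the advice above suggests.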
Practical Recommendations for the Academic Community
Reviewers can strengthen their contributions by reading materials thoroughly first, documenting their reasoning, and disclosing any AI assistance used in drafting. Authors benefit from seeking multiple human perspectives on their work before submission and preparing clear rebuttals that address the concerns raised in both AI-flagged and human-written reviews.
Institutions and professional societies might consider training programs on responsible AI use in research workflows. Conferences could pilot structured review templates or additional quality checks during high-volume periods.
Broader Implications for Higher Education and Career Pathways
The shift affects not only publishing but also training for early-career researchers. Graduate programs and postdoctoral positions increasingly emphasize skills in critical evaluation and clear communication. Understanding AI capabilities and limitations becomes an essential competency alongside traditional research methods.
Resources focused on academic career development, such as guidance on publishing strategies and professional networking, can help scholars navigate these changes effectively. The situation underscores the ongoing value of human expertise in maintaining rigorous standards across disciplines.
Future Outlook for Peer Review Processes
As AI capabilities advance, the academic community will likely experiment with new models. These may include AI-assisted triage for initial screening, followed by deeper human review, or platforms that facilitate verified human-AI collaboration. The goal remains preserving the integrity that makes peer review a trusted mechanism for advancing knowledge.
Continued monitoring through studies like the Pangram analysis and open dialogue among researchers will shape responsible adoption. The experience at ICLR serves as a case study for other fields facing similar pressures from rising submission volumes and technological change.
