Pangram Study Reveals 21% of ICLR Peer Reviews Fully AI-Generated

Analysis Uncovers Significant AI Use in Peer Reviews for Leading AI Conference

The International Conference on Learning Representations (ICLR), a premier venue for machine learning research, has become a focal point of debate over artificial intelligence in academic evaluation. A Pangram Labs analysis of ICLR 2026 submissions and reviews found that 21 percent of the peer reviews, 15,899 of roughly 75,800, were fully generated by AI systems. More than half of all reviews showed some level of AI involvement, ranging from editing assistance to complete generation.

This finding highlights a rapid shift in how scholarly feedback is produced, particularly in fast-moving fields like artificial intelligence. The analysis covered about 19,490 paper submissions alongside the review pool, making it one of the largest examinations of AI use in conference peer review to date.
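As a quick check on the figures quoted above, the short sketch below recomputes the share of fully AI-generated reviews and the average number of reviews per submission. It uses only the counts reported in this article; the variable names are illustrative, and the review total is itself an approximate figure.

```python
# Counts as reported in the article (the review total is approximate).
fully_ai_reviews = 15_899
total_reviews = 75_800
total_submissions = 19_490

# Share of reviews flagged as fully AI-generated, as a percentage.
fully_ai_share_pct = 100 * fully_ai_reviews / total_reviews

# Average number of reviews written per submitted paper.
reviews_per_paper = total_reviews / total_submissions

print(f"Fully AI-generated share: {fully_ai_share_pct:.1f}%")
print(f"Reviews per submission:   {reviews_per_paper:.2f}")
```

The computed share rounds to the 21 percent headline figure, and the ratio works out to just under four reviews per paper.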

Methodology Behind the Pangram Labs Examination

Pangram Labs applied its EditLens detection model to the full set of ICLR materials. The tool sorts text into five levels: fully human-written, lightly AI-edited, moderately AI-edited, heavily AI-edited, and entirely AI-generated. Validation on earlier ICLR cycles, such as the 2022 reviews, showed very low false positive rates, with almost all content from that baseline year classified as human-authored.
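To illustrate how a five-level labeling scheme like the one described above turns into the two headline numbers (the "fully AI-generated" share and the "some AI involvement" share), here is a minimal sketch. The `AiLevel` names, the `summarize` helper, and the ten toy labels are all hypothetical; they are not Pangram's API or data.

```python
from collections import Counter
from enum import IntEnum


class AiLevel(IntEnum):
    """Hypothetical label names for a five-level human-to-AI scale."""
    FULLY_HUMAN = 0
    LIGHT_AI = 1
    MODERATE_AI = 2
    HEAVY_AI = 3
    FULLY_AI = 4


def summarize(labels):
    """Return the fully-AI share and the any-AI share for a list of labels."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    total = len(labels)
    # "Some AI involvement" means any level above fully human.
    any_ai = sum(n for level, n in counts.items() if level > AiLevel.FULLY_HUMAN)
    return {
        "fully_ai_share": counts[AiLevel.FULLY_AI] / total,
        "any_ai_share": any_ai / total,
    }


# Toy data for illustration only; not drawn from the study.
toy_labels = (
    [AiLevel.FULLY_HUMAN] * 4
    + [AiLevel.LIGHT_AI] * 2
    + [AiLevel.MODERATE_AI]
    + [AiLevel.HEAVY_AI]
    + [AiLevel.FULLY_AI] * 2
)
summary = summarize(toy_labels)
print(summary)
```

The point of the sketch is only that the two reported percentages answer different questions: one counts the top level alone, the other counts every level except fully human.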

The study tracked trends over time, finding little detectable AI use before 2022, followed by steady growth. In related analyses, around 20 percent of sampled ICLR reviews from 2025 met the criteria for full AI generation. Paper submissions themselves remained predominantly human-written, with 61 percent classified as mostly human-produced.

Background on ICLR and Traditional Peer Review Expectations

ICLR serves as a key gathering for researchers in representation learning, deep learning, and related areas. The conference typically receives thousands of submissions each year, with acceptance rates often in the 20 to 30 percent range. Peer review forms the cornerstone of quality control, where experts volunteer time to assess novelty, technical soundness, clarity, and broader impact.

Reviewers traditionally read full manuscripts, supplementary materials, and figures before drafting detailed feedback. They consider the target venue's standards and cross-reference claims against existing literature. This process demands substantial domain expertise and careful attention, especially for complex technical work.

Growth of AI Tools in Scholarly Communication

Large language models have transformed writing assistance across disciplines since they became widely available. Researchers increasingly use these tools for drafting, editing, summarizing literature, and generating initial outlines. While guidelines from many publishers permit limited AI assistance with disclosure, full generation of reviews raises distinct concerns about accountability and depth.

Community discussions on platforms like X have amplified awareness. Posts from researchers such as Graham Neubig highlighted public detection results for ICLR papers and reviews, sparking widespread conversation about the implications. Other voices noted correlations between high AI content and lower review scores, as well as clustering of AI-generated feedback in embedding spaces.

Community Reactions and Emerging Concerns

News coverage in outlets such as Nature described the situation as a controversy, with academics expressing unease on social media about the volume of AI-produced reviews. Some authors reported receiving reviews that were verbose yet lacked substantive engagement with specific technical details or with author rebuttals.

Stakeholders worry that AI-generated reviews may prioritize generic observations over nuanced critique. They can miss critical flaws in experimental design, statistical validity, or reproducibility. At the same time, the volume of submissions at top conferences strains the pool of available human reviewers, creating pressure that may encourage shortcuts.

Impacts on Research Integrity and Quality

When reviews lack genuine human judgment, the gatekeeping function of peer review weakens. Authors may receive less actionable guidance on improving their work, while flawed papers could advance further in the process. The study also noted that AI reviews often cluster together, potentially amplifying certain perspectives or overlooking diverse viewpoints.

Beyond individual papers, widespread AI use could erode trust in the scholarly record. Fields like machine learning move quickly, and reliable evaluation helps separate promising advances from incremental or questionable claims. Related examinations of other venues, including Nature Communications, have shown lower but still notable rates of AI involvement in reviews from 2025.

Perspectives from Authors, Reviewers, and Organizers

Authors value thorough, context-aware feedback that identifies both strengths and limitations. Many express frustration when reviews appear formulaic or request analyses already present in the manuscript. Reviewers, often juggling multiple commitments, seek ways to maintain high standards without burnout.

Conference organizers face the challenge of scaling processes while upholding integrity. Some have begun exploring policy updates, though enforcement remains complex given the difficulty of proving AI use in every case. Detection tools provide indicators but are not infallible, particularly as models improve.

Existing Guidelines and Emerging Policy Responses

Major publishers and conferences have issued statements on AI in scholarly work. Common themes include requirements for disclosure when AI assists in writing, prohibitions on AI as an author, and emphasis on human accountability for content accuracy. ICLR and similar venues continue to refine their approaches in light of recent data.

Some analyses suggest best practices such as using AI only after a reviewer has formed an independent opinion, then employing tools to polish language or structure. This hybrid approach preserves human insight while leveraging efficiency gains.

Detection Technologies and Their Role

Tools like the one developed by Pangram Labs analyze linguistic patterns, consistency, and statistical signatures to flag potential AI content. They perform well on historical data but require ongoing refinement as generation techniques evolve. Complementary approaches include watermarking, provenance tracking, and community norms around transparency.

Researchers are advised to treat detection results as signals rather than definitive proof, combining them with manual review of the feedback's substance and alignment with the paper's details.
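One way to see why a flag is a signal rather than proof is Bayes' rule: the chance that a flagged review is really AI-generated depends on the base rate and on the detector's error rates. The sketch below is illustrative only. The prevalence is the 21 percent figure from this article, but the detection rate and false positive rate are assumed placeholder values, not published EditLens numbers.

```python
def posterior_ai_given_flag(prevalence, true_positive_rate, false_positive_rate):
    """P(review is AI-generated | detector flags it), via Bayes' rule."""
    flagged_and_ai = true_positive_rate * prevalence
    flagged_and_human = false_positive_rate * (1 - prevalence)
    return flagged_and_ai / (flagged_and_ai + flagged_and_human)


# Prevalence taken from the article; both error rates are ASSUMED for illustration.
prevalence = 0.21
assumed_tpr = 0.95  # hypothetical detection rate
assumed_fpr = 0.01  # hypothetical false positive rate

posterior = posterior_ai_given_flag(prevalence, assumed_tpr, assumed_fpr)
print(f"P(AI | flagged) = {posterior:.3f}")

# The same detector applied where AI use is rare gives a much weaker signal.
rare_posterior = posterior_ai_given_flag(0.01, assumed_tpr, assumed_fpr)
print(f"P(AI | flagged), 1% base rate = {rare_posterior:.3f}")
```

Under these assumed rates a flag is strong evidence when about a fifth of reviews are AI-generated, but the same flag is far less conclusive when the base rate is low, which is one reason to pair detection output with a manual read of the review.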

Practical Recommendations for the Academic Community

Reviewers can strengthen their contributions by reading materials thoroughly first, documenting their reasoning, and disclosing any AI assistance used in drafting. Authors benefit from seeking multiple human perspectives on their work before submission and preparing clear rebuttals that address the concerns raised in every review, whether it appears human-written or AI-generated.

Institutions and professional societies might consider training programs on responsible AI use in research workflows. Conferences could pilot structured review templates or additional quality checks during high-volume periods.

Broader Implications for Higher Education and Career Pathways

The shift affects not only publishing but also training for early-career researchers. Graduate programs and postdoctoral positions increasingly emphasize skills in critical evaluation and clear communication. Understanding AI capabilities and limitations becomes an essential competency alongside traditional research methods.

Resources focused on academic career development, such as guidance on publishing strategies and professional networking, can help scholars navigate these changes effectively. The situation underscores the ongoing value of human expertise in maintaining rigorous standards across disciplines.

Future Outlook for Peer Review Processes

As AI capabilities advance, the academic community will likely experiment with new models. These may include AI-assisted triage for initial screening, followed by deeper human review, or platforms that facilitate verified human-AI collaboration. The goal remains preserving the integrity that makes peer review a trusted mechanism for advancing knowledge.

Continued monitoring through studies like the Pangram analysis and open dialogue among researchers will shape responsible adoption. The experience at ICLR serves as a case study for other fields facing similar pressures from rising submission volumes and technological change.

About the author

Gabrielle Ryan

Academic Jobs In House Author

Frequently Asked Questions

📊What exactly did the Pangram study find about ICLR reviews?

The analysis determined that 21 percent of the roughly 75,800 peer reviews for ICLR 2026, or 15,899 reviews, were fully AI-generated. More than half showed some form of AI assistance.

🔍How was AI content detected in the reviews?

Pangram Labs used its EditLens model, which classifies content based on linguistic patterns and has been validated on prior ICLR cycles with low false positive rates.

📄Were the submitted papers themselves mostly AI-written?

No. The study found that paper submissions remained largely human-written, with 61 percent classified as mostly human-produced.

⚠️What concerns have researchers raised about AI-generated reviews?

Concerns center on reduced depth, generic feedback, weaker engagement with technical details, and potential erosion of trust in the peer review process.

🌍How does this compare to other venues?

Related work on Nature Communications reviews from 2025 showed around 12 percent AI-generated content, indicating the issue extends beyond ICLR but varies by field and venue.

📋What guidelines exist for AI use in peer review?

Publishers generally require disclosure of AI assistance, prohibit listing AI as an author, and emphasize that humans remain accountable for accuracy and judgments.

🛠️Can detection tools reliably identify AI reviews?

Current tools provide useful signals but are not perfect. They work best when combined with human assessment of review substance and alignment with the manuscript.

✍️What should reviewers do to use AI responsibly?

Form an independent opinion after reading the full materials, then use AI only for polishing language or structure if needed, while disclosing any assistance.

🎓How might this affect early-career researchers?

It underscores the need for strong critical evaluation skills and awareness of AI capabilities, which are becoming essential alongside traditional research training.

🔮What is the outlook for peer review in coming years?

Expect continued experimentation with hybrid human-AI models, policy refinements, and emphasis on transparency to maintain the rigor of scholarly evaluation.