AI’s Growing Role as Scientific Peer Reviewer | Stanford HAI
Date
March 25, 2026
Topics
Sciences (Social, Health, Biological, Physical)
[Illustration: a book turning into a computer with medical and science concepts]

Stanford computer scientist James Zou is exploring how AI can accelerate scientific research and peer review. His finding: AI excels at spotting gaps, but judgment calls still need humans.

James Zou is a computer scientist at Stanford University who has been exploring how large language models (LLMs) can assist scientific peer review — and more broadly how AI agents might accelerate research. It is a provocative topic in the scientific community and an important one to wrestle with as AI’s capabilities grow.

In one recent large-scale randomized experiment, Zou and collaborators provided AI assistance to human reviewers across roughly 20,000 reviews to assess its impact on review quality. Separately, Zou helped organize the Agents4Science conference, an experimental “sandbox” for studying AI’s role as both scientific author and reviewer.

James Zou: "AI is strongest on objective, checkable inconsistencies and technical issues and weaker on subjective judgments about the novelty or significance of the research."

Overall, he sees AI’s value in finding errors or gaps in research, data, and analysis but notes its limitations in truly human tasks like judgment about the relative significance of research.

We talked to Zou about what these efforts suggest for the future of AI in scientific publication.

How does AI contribute to the peer-review process?

Zou: There is tremendous interest in using AI, especially language models, to support research and peer review and to speed up the scientific process. A key advantage is that AI can act like a rapid, always-available critic — a sort of pre-submission review before scientists officially submit a paper for publication. AI can be quite good at assessing drafts for gaps and limitations, so researchers can address them preemptively. This can improve the quality of first drafts submitted for publication and reduce the back-and-forth later. And on the reviewer side, the pressure is real: As submissions grow, human reviewers are increasingly overburdened, which can lead to lower-quality reviews and frustration for authors.

Where is AI strongest and weakest as a peer reviewer?

Well, it is still early and unsettled. So far, besides spotting gaps and limitations, AI can be quite good at the more objective, verifiable aspects of review. For instance: "This number in table one does not match the number reported in the text," or "This equation doesn’t match the other equation." I would say, AI is strongest on objective, checkable inconsistencies and technical issues and weaker on subjective judgments about the novelty or significance of the research. Some of its subjective assessments can even border on sycophancy, actually.
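The objective checks Zou describes, such as verifying that a number in a table matches the number reported in the text, are easy to illustrate. Below is a toy sketch of that kind of numeric consistency check; the function name, data, and mismatch are hypothetical, not drawn from Zou's experiments:

```python
import re

def find_numeric_mismatches(table_values, text):
    """Flag table values that never appear in the prose.

    A toy version of the objective, verifiable check an AI reviewer
    can automate: confirm that each number reported in a table is
    also stated consistently in the text.
    """
    # Extract every number mentioned in the text (integers or decimals).
    text_numbers = {float(n) for n in re.findall(r"-?\d+(?:\.\d+)?", text)}
    # Report table entries with no matching number anywhere in the prose.
    return {label: value for label, value in table_values.items()
            if float(value) not in text_numbers}

# Hypothetical draft: Table 1 reports accuracy 0.91, but the text says 0.89.
table_one = {"accuracy": 0.91, "n_samples": 20000}
draft_text = "Across 20000 reviews, the model reached an accuracy of 0.89."
print(find_numeric_mismatches(table_one, draft_text))
# → {'accuracy': 0.91}
```

A real LLM reviewer performs this kind of cross-check fuzzily over prose, units, and formatting variants; the point is that such checks are verifiable in a way that judgments of novelty or significance are not.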

What do you think is an appropriate working relationship between scientists and AI?

As in other fields, AI should support and inform — not fully replace — human decision-making. A human, or team of humans, must make the final editorial decisions and scientists must stand behind the work. AI can offer comments on early drafts, point out omissions, and suggest improvements in the writing and the research — but the scientists must remain accountable for incorporating and synthesizing feedback from the AI and human reviewers. Perhaps as AI improves, this might evolve, but for now I think that’s reasonable practice. In our Agents4Science conference we made AI’s submissions and reviews publicly available to create a corpus for the community of scientists to review. We even had a winner of the Nobel Prize in Economics independently assess one AI-led paper. He wrote in his review of the paper: “This is actually technically very well done.”

What are the scientists' responsibilities regarding transparency into AI’s role in their work?

Scientists have to be up-front about how and where AI has assisted in the research itself and in the writing and review of papers. They should acknowledge exactly how AI was involved and what tools were used. It comes down to accountability: a clear chain of responsibility, with final decisions still made by humans.

What has been the reaction to your work in the scientific community?

There is tremendous interest and curiosity in how AI can improve peer review. Our International Conference on Learning Representations (ICLR) experiment showed that AI feedback improved review quality and reviewer engagement. Our Agents4Science conference also received more than 300 AI-led research submissions from 28 different countries. Following on this, many conferences and journals are now exploring using LLMs to assist the review process.

How are you using AI in your work today?

I use it for day-to-day research tasks. AI, for example, helps us write code. I also use AI almost like a pre-submission review process before I officially submit a paper — to identify gaps and limitations and suggest improvements we can address long before human peer review begins.

Where is this research taking you next?

We’ll be hosting additional conferences for AI agents with the goal of establishing evidence and norms to shape how AI is used in science in the future. The field needs more careful testing. And as AI becomes a routine scientific collaborator — writer, coder, critic — the scientific community will have to keep refining which roles belong to machines, which belong to people, and how to make this relationship both useful and trustworthy. AI’s role in science is only going to grow, and the scientific community should work together to shape that future collaboration.

Contributor(s)
Andrew Myers

Related News

How a HAI Seed Grant Helped Launch a Disease-Fighting AI Platform
Dylan Walsh
Mar 03, 2026
News

Stanford scientists in Senegal hunting for schistosomiasis—a parasitic disease infecting 200+ million people worldwide—used AI to transform local field work into satellite-powered disease mapping.


From Privacy to ‘Glass Box’ AI, Stanford Students Are Targeting Real-World Problems
Nikki Goth Itoi
Feb 27, 2026
News

An Amazon-backed fellowship will support 10 Stanford PhD students whose work explores everything from how we communicate to understanding disease and protecting our data.


AI Can’t Do Physics Well – And That’s a Roadblock to Autonomy
Andrew Myers
Jan 26, 2026
News

QuantiPhy is a new benchmark and training framework that evaluates whether AI can numerically reason about physical properties in video images. QuantiPhy reveals that today’s models struggle with basic estimates of size, speed, and distance but offers a way forward.
