A Framework to Report AI’s Flaws

Date: April 28, 2025
Topics: Ethics, Equity, Inclusion; Generative AI; Privacy, Safety, Security

Pointing to "white-hat" hacking, AI policy experts recommend a new system of third-party reporting and tracking of AI’s flaws.

As the users of today’s artificial intelligence tools know, AI often hallucinates — offering erroneous information as fact — or reveals issues in its data or training. But even the best-intentioned user can find it difficult to notify AI developers of flaws in their models and harder still to get them addressed.

Now, a team of computer scientists and policy experts at Stanford University, the Massachusetts Institute of Technology, and a dozen other institutions is proposing a new way for outside, third-party users to report flaws and track whether and how AI developers address them.

In the paper “In-House Evaluation Is Not Enough: Towards Robust Third-Party Flaw Disclosure for General-Purpose AI,” published on preprint server arXiv, the research team proposes a broad framework for responsible third-party discovery and disclosure of AI’s flaws — and for developers to report their efforts to address them.

“We’re at a moment where these AI systems are being deployed to hundreds of millions of people at a time. But the infrastructure to identify and fix flaws at AI companies lags far behind other fields, like cybersecurity and software development,” says co-first author Shayne Longpre, a PhD student at MIT.

White-hat Culture

In those more-mature fields, Longpre says, there is a robust culture of reporting and remediation of flaws — even so-called “bug bounties” in which well-intentioned external parties are paid to find and report flaws.

Such a collaborative culture has not yet developed in the AI field. Companies have to date relied on in-house teams or pre-approved contractors to identify flaws, but these approaches have proved inadequate to surface the breadth and complexity of real-world risks.

Flaws in AI are different from the security gaps of cybersecurity or the bugs of the software industry. They can range from the aforementioned hallucinations to problems in the data, such as racial bias in medical imaging. Often, these flaws can only be discovered once models are live and in use by millions of users. Who better to find and surface these flaws than neutral third parties who simply want AI that works, Longpre asks.

Three Recommendations

Against that backdrop, the scholars propose a three-part recommendation.

First, they offer a standardized AI flaw report template (see below) governed by a set of good-faith guidelines. In this regard, the authors draw inspiration from cybersecurity’s hacking culture that encourages well-meaning parties to actively search for flaws while adhering to clear rules of engagement, including doing no harm to users, safeguarding privacy, and reporting responsibly.

A flaw report card contains common elements of disclosure from software security, used to improve reproducibility of flaws and triage among them.
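The paper specifies the template's exact fields; purely to illustrate the idea, such a report could be captured as a small machine-readable record. The sketch below is an assumption-laden mock-up (the field names are illustrative, not the authors' schema), showing how a standardized report stays easy to reproduce and triage:

```python
# Illustrative mock-up of a standardized AI flaw report; the field names are
# assumptions for this sketch, not the template defined in the paper.
from dataclasses import dataclass, field, asdict
from datetime import date
import json

@dataclass
class FlawReport:
    reporter: str             # who found the flaw (can be anonymized)
    system: str               # affected model or product, with version
    description: str          # what the flaw is and why it matters
    reproduction_steps: list  # inputs or prompts needed to reproduce it
    severity: str             # e.g. "low", "medium", "high"
    potential_harms: list = field(default_factory=list)
    date_reported: str = date.today().isoformat()

report = FlawReport(
    reporter="good-faith external researcher",
    system="example-model v1.2",
    description="Model presents fabricated citations as fact for medical queries.",
    reproduction_steps=[
        "Ask for peer-reviewed sources on a niche medical topic.",
        "Check whether the cited papers actually exist.",
    ],
    severity="high",
    potential_harms=["misinformation", "harm to users making health decisions"],
)

# A structured record like this can be serialized and triaged automatically.
print(json.dumps(asdict(report), indent=2))
```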

Second, the authors encourage companies to offer legal protections for outside research. Many AI companies use legal restrictions to dissuade outsiders from probing, reverse engineering, or testing their models. The authors call for legal and technical safe harbors to protect good-faith evaluators from lawsuits or punitive account bans. Such frameworks work well in high-stakes cybersecurity settings, says co-first author Ruth Appel, a Stanford Impact Labs postdoctoral fellow.

Third, the authors call for a “Disclosure Coordination Center,” a sort of public clearinghouse of known flaws and developers’ efforts to address them. This is perhaps the most ambitious of the three proposals. Flaws in AI are often “transferable,” Appel explains: a flaw in one system can often be found in other systems trained on similar data or with similar architectures. A Disclosure Coordination Center would standardize and streamline communication across the AI industry and provide a measure of public accountability.
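As a thought experiment of how such a clearinghouse might exploit transferability, imagine it keeping a registry of which developers' systems share a given training corpus or architecture and notifying all of them when one report names that shared component. The routing sketch below is speculative and not drawn from the paper:

```python
# Speculative sketch of "transferable flaw" routing at a coordination center,
# not a design from the paper: one report is forwarded to every developer
# whose system shares the affected training data or architecture.

# Hypothetical registry of shared components and the developers using them.
REGISTRY = {
    "web-scrape-corpus-2024": {"DeveloperA", "DeveloperB"},
    "decoder-only-transformer": {"DeveloperA", "DeveloperC"},
}

def affected_developers(report: dict) -> set:
    """Developers whose systems share a component named in the report."""
    return {
        developer
        for component in report["shared_components"]
        for developer in REGISTRY.get(component, set())
    }

example_report = {
    "flaw_id": "FLAW-0001",
    "summary": "Racial bias in medical-imaging descriptions",
    "shared_components": ["web-scrape-corpus-2024"],
}

# Prints both developers training on that corpus (set order may vary).
print(affected_developers(example_report))
```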

“Our goal is to make flaw reporting more systematic and coordinated, which will help users and model developers alike,” says senior author Percy Liang, associate professor of computer science.

Early Adopters

Under the current approach, which is really more a stopgap process that developed organically, flaws are either emailed discreetly to the developer, never to be seen again, or blasted on social media in what amounts to public shaming. Neither approach ensures the problem gets fixed, Longpre says.

“There was a time when software companies didn’t want to hear about security bugs either,” Longpre adds. “But we learned that sunlight is the best disinfectant.”

Appel is optimistic about adoption. The team plans to build a prototype website for submitting standardized flaw reports and is in talks with partners about piloting the Disclosure Coordination Center concept.

“We need companies to change their policies, for researchers to start using these reporting tools, and for the ecosystem to invest in shared processes and infrastructure,” says Longpre. “This can’t just be a framework on paper — it has to become common and accepted practice.”

Contributor(s)
Andrew Myers
