A Framework to Report AI’s Flaws

Date: April 28, 2025
Topics: Ethics, Equity, Inclusion; Generative AI; Privacy, Safety, Security

Pointing to "white-hat" hacking, AI policy experts recommend a new system of third-party reporting and tracking of AI’s flaws.

As the users of today’s artificial intelligence tools know, AI often hallucinates — offering erroneous information as fact — or reveals issues in its data or training. But even the best-intentioned user can find it difficult to notify AI developers of flaws in their models and harder still to get them addressed.

Now, a team of computer scientists and policy experts at Stanford University, Massachusetts Institute of Technology, and a dozen other institutions is proposing a new way for third-party users to report flaws and to track whether and how AI developers address them.

In the paper “In-House Evaluation Is Not Enough: Towards Robust Third-Party Flaw Disclosure for General-Purpose AI,” published on the preprint server arXiv, the research team proposes a broad framework for responsible third-party discovery and disclosure of AI’s flaws, as well as for developers to report their efforts to address them.

“We’re at a moment where these AI systems are being deployed to hundreds of millions of people at a time. But the infrastructure to identify and fix flaws at AI companies lags far behind other fields, like cybersecurity and software development,” says co-first author Shayne Longpre, a PhD student at MIT.

White-hat Culture

In those more mature fields, Longpre says, there is a robust culture of reporting and remediating flaws, including so-called “bug bounties” in which well-intentioned external parties are paid to find and report flaws.

Such a collaborative culture has not yet developed in the AI field. Companies have so far relied on in-house teams or pre-approved contractors to identify flaws, but these approaches have proved inadequate to surface the breadth and complexity of real-world risks.

Flaws in AI are different from the security gaps of cybersecurity or the bugs of the software industry. They can range from the aforementioned hallucinations to problems in the data, such as racial bias in medical imaging. Often these flaws can be discovered only once models are live and in use by millions of people. Who better to find and surface them, Longpre asks, than neutral third parties who simply want AI that works?

Three Recommendations

Against that backdrop, the scholars make three recommendations.

First, they offer a standardized AI flaw report template (see below) governed by a set of good-faith guidelines. In this regard, the authors draw inspiration from cybersecurity’s hacking culture that encourages well-meaning parties to actively search for flaws while adhering to clear rules of engagement, including doing no harm to users, safeguarding privacy, and reporting responsibly.

A flaw report card contains common elements of disclosure drawn from software security, intended to improve the reproducibility of flaws and to help triage among them.
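
To make this concrete, below is a minimal sketch, in Python, of what a structured flaw report could look like. The field names are illustrative assumptions for this article, not the actual template proposed in the paper.

    from dataclasses import dataclass, field
    from datetime import date
    from typing import Optional

    @dataclass
    class FlawReport:
        # All field names here are illustrative assumptions, not the paper's template.
        reporter: str                  # who found the flaw (may be pseudonymous)
        system: str                    # affected model or product
        version: str                   # model or version identifier, to aid reproducibility
        description: str               # what went wrong, in plain language
        reproduction_steps: list[str]  # prompts or inputs that trigger the flaw
        severity: str                  # e.g. "low", "medium", "high"
        potential_impact: str          # who could be harmed and how
        suggested_fix: Optional[str] = None
        report_date: date = field(default_factory=date.today)

    # Example: a report for a hallucinated citation, one of the flaw types noted above.
    report = FlawReport(
        reporter="good-faith external researcher",
        system="hypothetical-llm",
        version="2025-04",
        description="Model presents a fabricated citation as established fact.",
        reproduction_steps=["Ask for sources on a niche topic", "Observe the invented reference"],
        severity="medium",
        potential_impact="Users may rely on references that do not exist.",
    )

Standardized fields along these lines are what make reports reproducible and easy to triage, which is the purpose of the template.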

Second, the authors encourage companies to offer legal protections for research. Many AI companies use legal restrictions to dissuade outsiders from probing, reverse engineering, or testing their models. The authors call for legal and technical safe harbors to protect good-faith evaluators from lawsuits or punitive account bans. Such frameworks work well in high-stakes cybersecurity settings, says co-first author Ruth Appel, a Stanford Impact Labs postdoctoral fellow.

Third, the authors call for a “Disclosure Coordination Center,” a sort of public clearinghouse of known flaws and developers’ efforts to address them. This is perhaps the most ambitious proposal among the three. Flaws in AI are often “transferable,” Appel explains. That is, flaws in one system can also be found in others trained on similar data or with similar architectures. A Disclosure Coordination Center would standardize and streamline communication across the AI industry and provide a certain measure of public accountability.
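
As a rough illustration of the clearinghouse idea, the hypothetical sketch below shows how a coordination center might forward a single report to every developer whose systems could share the flaw. The grouping of systems by shared data or architecture, and all the names used, are assumptions made here for illustration, not details from the proposal.

    # Hypothetical sketch of transferable-flaw routing. Grouping systems into
    # "families" that share training data or architecture is an assumption for illustration.
    AFFECTED_FAMILIES = {
        "shared-web-corpus": ["developer-a", "developer-b", "developer-c"],
        "vision-architecture-x": ["developer-d"],
    }

    def route_report(report_id: str, family: str) -> list[str]:
        """Notify every developer whose systems may share this flaw, and return the list."""
        recipients = AFFECTED_FAMILIES.get(family, [])
        for developer in recipients:
            print(f"Forwarding flaw report {report_id} to {developer}")
        return recipients

    route_report("FR-0001", "shared-web-corpus")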

“Our goal is to make flaw reporting more systematic and coordinated, which will help users and model developers alike,” says senior author Percy Liang, associate professor of computer science.

Early Adopters

Under the current approach, really a stopgap process that developed organically, flaws are either emailed discreetly to the developer, never to be seen again, or blasted on social media in what amounts to public shaming. Neither approach ensures the problem gets fixed, Longpre says.

“There was a time when software companies didn’t want to hear about security bugs either,” Longpre adds. “But we learned that sunlight is the best disinfectant.”

Appel is optimistic about adoption. The team will be building a prototype website for submitting standardized flaw reports and is in talks with partners about piloting the Disclosure Coordination Center concept.

“We need companies to change their policies, for researchers to start using these reporting tools, and for the ecosystem to invest in shared processes and infrastructure,” says Longpre. “This can’t just be a framework on paper — it has to become common and accepted practice.”

Contributor(s)
Andrew Myers

Related News

Exploring the Dangers of AI in Mental Health Care
Sarah Wells
Jun 11, 2025
News

A new Stanford study reveals that AI therapy chatbots may not only lack effectiveness compared to human therapists but could also contribute to harmful stigma and dangerous responses.

Europe's Innovation Pivot: Can the EU Lead the Next Wave of AI?
Daniel Zhang
Jun 04, 2025
News

With its AI Continent Action Plan, the EU aims to reinvent its innovation model. European Commission Executive Vice-President for Tech Sovereignty, Security and Democracy Henna Virkkunen outlines its ambition.

Struggling DNA Testing Firm 23andMe To Be Bought For $256m
BBC
May 19, 2025
Media Mention

Stanford HAI Policy Fellow Jennifer King speaks about the data privacy implications of 23andMe's purchase by Regeneron.
