News

Stanford Launches AI Audit Challenge

July 11, 2022

AI systems must be evaluated for legal compliance, in particular with laws protecting people from illegal discrimination. This challenge seeks to broaden the set of tools available to those who want to analyze and regulate these systems.

Over the past decade, researchers have warned of the flaws and vulnerabilities inherent to certain AI systems, such as the difficulty of evaluating their safety, legality and effectiveness, and their potential discriminatory effects. Increasingly, we see these harms materializing and affecting people in real-life settings: Last year, a New Jersey man was wrongly accused of shoplifting and trying to hit an officer with a car because of an incorrect facial recognition match. A drug addiction risk algorithm was recently found to have a disparate impact on women. Graduate researchers discovered that facial recognition systems deployed by the private sector displayed significant biases. More recently, researchers have also focused their attention on how large-scale language models often capture undesirable societal biases.

Unfortunately, it remains very difficult for regulators, journalists, policymakers and the third sector more widely to evaluate these algorithms and test them for potential discriminatory impacts. As one Wired article notes, the proprietary nature of many deployed algorithms means that "there's no way to look under the hood to inspect them for errors or biases." In 2020, a group of respected academics and practitioners published a paper calling for better tools to audit algorithmic systems, noting that the AI development process is opaque and that too many barriers prevent third parties from verifying the claims made by developers.

Challenge and Cash Prizes 

That’s why Stanford’s Cyber Policy Center and the Stanford Institute for Human-Centered Artificial Intelligence have launched a challenge with prizes of up to $25,000 to encourage developers to create better and more usable approaches to auditing AI systems. 

Visit the AI Audit Challenge Website


This challenge is generously funded by the Rockefeller Foundation and will focus on tools to assess whether deployed AI systems illegally discriminate against protected categories. For example, can we analyze how well a computer vision system performs when confronted with pictures of people from different demographic backgrounds? Can we grade the output of a natural language processing system asked to produce content on different religions? 
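To make the first question concrete, a disaggregated evaluation simply measures a model's accuracy separately for each demographic group and compares the results. Here is a minimal sketch in Python; the predictions, labels and group annotations are hypothetical placeholders, not data from any real system:

```python
from collections import defaultdict

def accuracy_by_group(predictions, labels, groups):
    """Compute classification accuracy separately for each demographic group.

    The three arguments are parallel lists: groups[i] is the
    (hypothetical) demographic annotation for example i.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for pred, label, group in zip(predictions, labels, groups):
        total[group] += 1
        if pred == label:
            correct[group] += 1
    return {g: correct[g] / total[g] for g in total}

# Hypothetical audit data: model outputs vs. ground truth, annotated by group.
preds  = ["cat", "dog", "dog", "cat", "dog", "cat"]
labels = ["cat", "dog", "cat", "cat", "dog", "dog"]
groups = ["A",   "A",   "B",   "B",   "A",   "B"]

print(accuracy_by_group(preds, labels, groups))  # {'A': 1.0, 'B': 0.33...}
```

A large gap between groups is exactly the kind of signal an audit tool would surface, though a real audit would also have to reason carefully about sample sizes, label quality and how the groups were defined.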

Why Auditing Matters

Being able to audit AI systems comes with great benefits: It allows public officials or journalists to verify the statements companies make about the efficacy of their algorithms, thereby reducing the risk of fraud and misrepresentation. It fosters competition on the quality and accuracy of AI systems. It could also allow governments to establish high-level objectives without being overly prescriptive about the means of achieving them. Being able to detect and evaluate the potential harm caused by various algorithmic applications is crucial to the democratic governance of AI systems.

The problem is that auditing algorithms is very difficult. AI systems are not simply a few lines of code; they are complex sociotechnical systems consisting of a mixture of technical choices and social practices. Context matters greatly, and what is acceptable in one setting might not be in another – for example, an algorithm used in a medical or social welfare setting will require far more scrutiny than an algorithm used to generate music. But even the core technical parts, such as the algorithm, the compute and the training sets, remain very difficult to scrutinize properly – even more so when a proprietary AI system is deployed. Datasets used in machine learning are also frequently incomplete and unrepresentative of different population groups, but looking under the hood is, in practice, nearly impossible.

We believe there is an urgent need for better tools to test these algorithms. The risk of harmful algorithmic systems is well known; now is the time to act and build the toolkits that will empower policymakers, activists and white hat hackers of the future. It is with this backdrop in mind that we decided to launch an AI Audit Challenge, with an initial focus on tools to detect bias and discrimination in particular.
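One concrete, widely used screening heuristic for the kind of discrimination the challenge targets is the four-fifths rule from U.S. employment law: if a protected group's rate of favorable outcomes falls below 80% of the most favored group's rate, the disparity warrants scrutiny. A minimal sketch, assuming binary decisions and hypothetical group labels:

```python
def disparate_impact_ratio(decisions, groups, positive=1):
    """Ratio of the lowest group's favorable-outcome rate to the highest.

    decisions: model outcomes (1 = favorable, 0 = unfavorable)
    groups:    parallel list of (hypothetical) protected-group labels
    Under the four-fifths rule, a ratio below 0.8 flags possible
    adverse impact and calls for closer examination.
    """
    rates = {}
    for g in set(groups):
        outcomes = [d for d, grp in zip(decisions, groups) if grp == g]
        rates[g] = sum(o == positive for o in outcomes) / len(outcomes)
    return min(rates.values()) / max(rates.values()), rates

# Hypothetical loan-approval decisions annotated by group.
decisions = [1, 1, 0, 1, 0, 0, 1, 0]
groups    = ["A", "A", "A", "A", "B", "B", "B", "B"]

ratio, rates = disparate_impact_ratio(decisions, groups)
print(rates)           # {'A': 0.75, 'B': 0.25}
print(f"{ratio:.2f}")  # 0.33 -- well below the 0.8 threshold
```

A heuristic like this is only a first-pass screen; it says nothing about why the disparity exists, which is where the richer audit tools this challenge seeks come in.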

The Challenge Process

We invite submissions examining both open-source models later integrated into commercial products (such as BERT and YOLO) and deployed systems in use by the public and private sectors (such as COMPAS, GPT-3 and POL-INTEL), to better understand how these systems handle protected characteristics and classes and to identify indirect discrimination.
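As a very rough illustration of what probing an open-source model like BERT can look like, the sketch below uses the Hugging Face transformers library to compare the model's most likely completions across two gendered framings. This is only an illustrative probe, not a submission template or an endorsed audit method:

```python
# Requires: pip install transformers torch
from transformers import pipeline

# Load BERT as a fill-mask model; [MASK] is its mask token.
unmasker = pipeline("fill-mask", model="bert-base-uncased")

for prompt in ("The man worked as a [MASK].",
               "The woman worked as a [MASK]."):
    completions = unmasker(prompt, top_k=5)
    words = [c["token_str"] for c in completions]
    print(f"{prompt:40s} -> {words}")

# Systematic differences between the two completion lists (e.g.,
# occupations skewed by gender) are one crude signal of the societal
# biases described above; rigorous audits need far stronger protocols.
```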

Submissions will be evaluated by a jury including Mozilla fellow Deborah Raji, Montreal AI Ethics Institute founder Abhishek Gupta and DeepMind senior research scientist William Isaac. Participants will also have the opportunity to iterate on their work through workshops and to receive advice and support from an advisory board whose members include Professor Safiya Noble of UCLA and former U.S. Ambassador to the United Nations Eileen Donahoe. Two first-place winners will receive $25,000, with additional awards for second and third place.

We believe that to get the most out of AI systems, they must first and foremost respect civil rights law, and also be safe, high quality and trustworthy. With this challenge, we hope to catalyze and build on the larger body of work concerned with interrogating these systems in order to create pragmatic policy, regulatory and governance approaches.

Learn more and submit your entry at the AI Audit Challenge Website.

Stanford HAI’s mission is to advance AI research, education, policy and practice to improve the human condition. Learn more.

Contributors
Marietje Schaake and Jack Clark
