Evaluating Facial Recognition Technology: A Protocol for Performance Assessment in New Domains | Stanford HAI

Policy White Paper

Date: November 01, 2020
Topics: Computer Vision | Regulation, Policy, Governance
Read Paper
Abstract

This white paper provides research-based, scientifically grounded recommendations for how to give context to calls for testing the operational accuracy of facial recognition technology.

Preface

In May 2020, Stanford’s Institute for Human-Centered Artificial Intelligence (HAI) convened a half-day workshop to address the question of facial recognition technology (FRT) performance in new domains. The workshop included leading computer scientists, legal scholars, and representatives from industry, government, and civil society (listed in the Appendix). Given the limited time, the workshop’s goal was circumscribed: to examine the operational performance of FRT in new domains. While participants brought many perspectives to the workshop, there was general consensus that (a) the wide range of emerging applications of FRT presented substantial uncertainty about the performance of FRT in new domains, and (b) much more work was required to facilitate rigorous assessments of that performance. This White Paper is the result of the deliberations that ensued from the workshop.

FRT raises profound questions about the role of technology in society. The complex ethical and normative concerns about FRT’s impact on privacy, speech, racial equity, and the power of the state are worthy of serious debate, but beyond the limited scope of this White Paper. Our primary objective here is to provide research-based, scientifically grounded recommendations for how to give context to calls for testing the operational accuracy of FRT. Framework legislation concerning the regulation of FRT has included general calls for evaluation, and we provide guidance on how to implement such evaluation in practice. That work cannot be done solely in the confines of an academic lab. It will require the involvement of all stakeholders — FRT vendors, FRT users, policymakers, journalists, and civil society organizations — to promote a more reliable understanding of FRT performance. Since the time of the workshop, numerous industry developers and vendors have called for a moratorium on government and/or police use of FRT. Given the questions around the accuracy of the technology, we consider a pause to study the consequences of the technology further to be prudent at this time.

Adhering to the protocol and recommendations herein will not end the intense scrutiny around FRT, nor should it. We welcome continued conversation around these important issues, particularly the potential for these technologies to harm and disproportionately impact underrepresented communities. Our limited goal is to make concrete a general requirement that appears in nearly every legislative proposal to regulate FRT: determining whether the technology works as billed.

We hope that grounding our understanding of the operational and human impacts of this emerging technology will inform the wider debate on the future use of FRT, and whether or not it is ready for societal deployment.

Authors
  • Daniel E. Ho
  • Emily Black
  • Maneesh Agrawala
  • Fei-Fei Li

Related Publications

Russ Altman’s Testimony Before the U.S. Senate Committee on Health, Education, Labor, and Pensions
Russ Altman
Testimony | Quick Read | Oct 09, 2025

In this testimony presented to the U.S. Senate Committee on Health, Education, Labor, and Pensions hearing titled “AI’s Potential to Support Patients, Workers, Children, and Families,” Russ Altman highlights opportunities for congressional support to make AI applications for patient care and drug discovery stronger, safer, and human-centered.

Michelle M. Mello's Testimony Before the U.S. House Committee on Energy and Commerce Health Subcommittee
Michelle Mello
Testimony | Quick Read | Sep 02, 2025

In this testimony presented to the U.S. House Committee on Energy and Commerce’s Subcommittee on Health hearing titled “Examining Opportunities to Advance American Health Care through the Use of Artificial Intelligence Technologies,” Michelle M. Mello calls for policy changes that will promote effective integration of AI tools into healthcare by strengthening trust.

Response to the Department of Education’s Request for Information on AI in Education
Victor R. Lee, Vanessa Parli, Isabelle Hau, Patrick Hynes, Daniel Zhang
Response to Request | Quick Read | Aug 20, 2025

Stanford scholars respond to a federal RFI on advancing AI in education, urging policymakers to anchor their approach in proven research.

Labeling AI-Generated Content May Not Change Its Persuasiveness
Isabel Gallegos, Dr. Chen Shani, Weiyan Shi, Federico Bianchi, Izzy Benjamin Gainsburg, Dan Jurafsky, Robb Willer
Policy Brief | Quick Read | Jul 30, 2025

This brief evaluates the impact of authorship labels on the persuasiveness of AI-written policy messages.
