Evaluating Facial Recognition Technology: A Protocol for Performance Assessment in New Domains

This white paper offers research-based, scientifically grounded recommendations for giving context to calls to test the operational accuracy of facial recognition technology.
Preface
In May 2020, Stanford’s Institute for Human-Centered Artificial Intelligence (HAI) convened a half-day workshop to address the question of facial recognition technology (FRT) performance in new domains. The workshop included leading computer scientists, legal scholars, and representatives from industry, government, and civil society (listed in the Appendix). Given the limited time, the goal of the workshop was circumscribed: to examine the operational performance of FRT in new domains. While participants brought many perspectives to the workshop, there was relative consensus that (a) the wide range of emerging applications of FRT creates substantial uncertainty about how FRT performs in new domains, and (b) much more work is required to facilitate rigorous assessments of such performance. This White Paper is the result of the deliberations that followed the workshop.
FRT raises profound questions about the role of technology in society. The complex ethical and normative concerns about FRT’s impact on privacy, speech, racial equity, and the power of the state are worthy of serious debate, but they are beyond the limited scope of this White Paper. Our primary objective here is to provide research-based, scientifically grounded recommendations for giving context to calls to test the operational accuracy of FRT. Framework legislation concerning the regulation of FRT has included general calls for evaluation, and we provide guidance on how to implement such evaluation in practice. That work cannot be done solely within the confines of an academic lab. It will require the involvement of all stakeholders (FRT vendors, FRT users, policymakers, journalists, and civil society organizations) to promote a more reliable understanding of FRT performance. Since the time of the workshop, numerous industry developers and vendors have called for a moratorium on government and/or police use of FRT. Given the questions around the accuracy of the technology, we consider a pause to further study and understand its consequences to be prudent at this time.
Adhering to the protocol and recommendations herein will not end the intense scrutiny around FRT, nor should it. We welcome continued conversation around these important issues, particularly around the potential for these technologies to harm and disproportionately impact underrepresented communities. Our limited goal is to make concrete a general requirement that appears in nearly every legislative proposal to regulate FRT: an assessment of whether the technology works as billed.
We hope that grounding our understanding of the operational and human impacts of this emerging technology will inform the wider debate on the future use of FRT, and whether or not it is ready for societal deployment.







