The Privacy-Bias Trade-Off

This brief examines the tension between privacy protections and efforts to assess algorithmic bias, drawing on the U.S. federal government’s experience, and urges policymakers to consider feasibility, trade-offs, and unintended consequences before rushing into regulation.
Key Takeaways
As companies and regulators step up efforts to protect individuals’ information privacy, a common privacy principle (data minimization) can come into conflict with algorithmic fairness.
The U.S. federal government provides a compelling case study: Its adoption of data minimization in the Privacy Act of 1974 has brought many privacy benefits but stymies efforts to gather demographic data to assess disparities in program outcomes across federal agencies.
Coupled with procedures under the Paperwork Reduction Act of 1980, the Privacy Act has meant that agencies rarely and inconsistently collect data on protected attributes.
Twenty-one of 25 agencies noted substantial data challenges in responding to an executive order requiring agencies to conduct equity assessments of their programs.
Privacy principles should be harmonized to permit secure collection of demographic data to conduct disparity assessments.
Executive Summary
Algorithmic fairness and privacy issues are increasingly drawing both policymakers' and the public's attention amid rapid advances in artificial intelligence (AI). New AI applications in medicine, criminal justice, hiring, and elsewhere can make decisions that generate or exacerbate disparities along racial or gender lines, and in numerous cases already do. At the same time, the vast amounts of data collected and processed by public and private actors to train models carry complex implications for individual privacy.
Safeguarding privacy and addressing algorithmic bias can involve a less recognized trade-off. The principle of "data minimization," which the U.S. government has experimented with for almost 50 years, holds that entities should collect and retain only the data minimally necessary to achieve their objectives. But the result is that agencies may lack access to the demographic data (e.g., data on race and ethnicity) required to conduct equity assessments of public programs. Privacy, in short, can mean a lack of awareness.
In a new paper, "The Privacy-Bias Trade-Off," we document this tension between data minimization principles and racial disparity assessments in the U.S. government. We examine the government's recent efforts to introduce government-wide equity assessments of federal programs and consider a range of policy solutions, including amending or interpreting the Privacy Act to permit the collection of demographic data for disparity assessments.
Data minimization, while beneficial for privacy, has simultaneously made it legally, technically, and bureaucratically difficult to acquire demographic information necessary to conduct equity assessments. The more AI systems are deployed across government and society, the more imperative it will be to balance privacy and fairness.
Introduction
Race and ethnicity are socially constructed concepts, but demographic information remains critical to understanding, let alone mitigating, racial (or intersectional) disparities. As the algorithmic fairness literature elegantly puts it, there is no fairness without awareness.
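To make the point concrete, a disparity assessment is, at bottom, a comparison of outcome rates across groups, and that comparison requires a group label for every record. The minimal Python sketch below illustrates this with entirely hypothetical program decisions and group labels; it is not drawn from any agency's actual methodology.

```python
# Minimal sketch: measuring outcome disparities requires demographic labels.
# All decisions and group labels below are hypothetical, for illustration only.
from collections import defaultdict

def approval_rate_by_group(decisions, groups):
    """Return each group's approval rate (1 = approved, 0 = denied)."""
    totals, approvals = defaultdict(int), defaultdict(int)
    for decision, group in zip(decisions, groups):
        totals[group] += 1
        approvals[group] += decision
    return {g: approvals[g] / totals[g] for g in totals}

decisions = [1, 0, 1, 1, 0, 1, 0, 0, 0, 1]                     # hypothetical outcomes
groups    = ["A", "A", "A", "B", "B", "B", "B", "A", "B", "A"]  # hypothetical labels

rates = approval_rate_by_group(decisions, groups)
gap = max(rates.values()) - min(rates.values())
print(rates)                                  # {'A': 0.6, 'B': 0.4}
print(f"Demographic parity gap: {gap:.2f}")   # 0.20
```

If a privacy regime prevents collecting the group labels in the first place, this computation cannot be run at all, which is the crux of the trade-off documented here.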
Approaches to measuring race and ethnicity have varied over time. The U.S. government, for example, started collecting race-related data in the 1930s from applicants to the Social Security Administration (SSA); prior to 1980, the categories were “White,” “Negro,” and “Other.” Since then, the SSA has repeatedly changed its race/ethnicity codes. Most recently, the country’s chief statistician announced plans in 2022 to update government guidelines to further disaggregate racial categories; these include moving away from monolithic categories such as “Asian American,” which can mask substantial variance between subgroups.
On his first day in office, President Biden signed Executive Order (EO) 13985, requiring agencies to conduct equity assessments of federal policies and programs. The EO also acknowledged that “[m]any Federal datasets are not disaggregated by race, ethnicity, gender, disability, income, veteran status, or other key demographic variables.”
What it did not mention is one key structural reason for such difficulties: privacy protection.
The Privacy Act of 1974 and the Paperwork Reduction Act of 1980 both aim to limit the amount of data the government collects. The Privacy Act requires federal agencies to abide by a “data minimization” principle, namely to: (a) collect personally identifiable information only as minimally necessary to carry out their statutory mission; (b) use the information only for its stated collection purpose; and (c) refrain from sharing or linking the data. The Paperwork Reduction Act requires agencies to secure approval before requesting many kinds of data from the public: federal agencies adding new data collection mechanisms (such as surveys or web forms) must typically go through notice-and-comment and win approval from the White House Office of Management and Budget (OMB).
The U.S. government’s implementation of data minimization is a potent case study of a broader dilemma that has vexed regulators and industry alike: Data minimization can conflict with disparity or equity assessments. Existing research has examined tensions between privacy and fairness in the algorithmic context. It has also examined how privacy laws, such as the European Union’s General Data Protection Regulation (GDPR), can inadvertently prevent technologists from accessing the data they need to conduct fairness tests. Missing, though, is research documenting the impact of privacy-fairness trade-offs on government policy and data use, where the widely espoused data minimization principle has been in place for some 50 years.
In our paper, we examine how agencies are grappling with this “privacy-bias trade-off.” We assess agency responses to the EO’s requirement to conduct equity assessments and how agencies have dealt with privacy and fairness considerations in major claims programs. We outline federal agencies’ distinct approaches to data surveying and management and identify the most common barriers to implementing equity assessments.