As AI and predictive algorithms permeate ever more areas of decision making, from setting bail to evaluating job applications to making home loans, what happens when an algorithm arbitrarily discriminates against women, African-Americans, or other groups?
It happens all the time.
Amazon famously discarded a resume-reviewing system because it penalized women — probably a legacy of gender-skewed hiring patterns. Similarly, an AI model used by courts to predict recidivism incorrectly labeled Black defendants as “high risk” at twice the rate of white defendants.
Unfortunately, warns Daniel Ho, the William Benjamin Scott and Luna M. Scott Professor of Law at Stanford University and associate director of the Stanford Institute for Human-Centered Artificial Intelligence, many of the proposed solutions to fix algorithmic bias are on a collision course with Supreme Court rulings on equal protection.
In a new paper, Ho and Alice Xiang, the head of Fairness, Transparency, and Accountability Research at the Partnership on AI, a research group that focuses on responsible AI, warn that many of the strategies for increasing fairness clash directly with the high court’s push for “anti-classification”: the principle of remaining “blind” toward categories like race, gender, or religion.
In one key case, the Supreme Court rejected the University of Michigan’s attempt to give a modest statistical boost to applicants from under-represented communities. In another, the high court ruled against the city of New Haven, Conn., which had thrown out the results of a test for firefighters because no African-American candidates would have been promoted. As Chief Justice John Roberts summed up the issue in another case, “The way to stop discrimination on the basis of race is to stop discriminating on the basis of race.”
That poses a big legal obstacle for fixing biased algorithms, note Ho and Xiang, because most of the strategies involve adjusting algorithms to produce fairer outcomes along racial or gender lines. Since many minority students may have less time and money for SAT coaching classes, for example, it could make sense to lower the relative weight of SAT scores in evaluating their college applications.
But because such adjustments would be deemed race-based classifications, the authors say, they risk being struck down in court.
“The adjustments to algorithmic systems come very close to the University of Michigan’s 20-point boost, which the Supreme Court rejected,” Ho says. “The machine learning community working on algorithmic fairness hasn’t had close exchanges with the legal community. But when you put the two together, you realize there’s a collision.”
Adds Xiang, “It was striking to see how much of the machine-learning literature is legally suspect. The court has taken a very strong anti-classification stance. If actions are motivated by race, even with the ostensible goal to promote fairness, it probably won’t fly.”
At the same time, the authors warn, the demand for “blindness” could make algorithmic bias even worse.
That’s because machine learning models pick up on all kinds of correlations or proxies for race that may have no real-world significance but become part of the decision process.
Amazon’s resume-reviewing model, for example, never used gender as an explicit input. Instead, it “learned” that the company had hired very few engineers who came from women’s colleges. As a result, it down-weighted applications that mentioned women’s colleges.
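The mechanism is easy to reproduce in miniature. The sketch below is a hypothetical illustration, not Amazon's actual system: a naive "blind" screener scores resumes by the historical hire rate of each keyword. Gender is never an input, yet the invented proxy keyword "womens_college" inherits the bias baked into the training labels. All data and names here are made up for illustration.

```python
from collections import defaultdict

# Toy training data: (resume keywords, hired?) pairs reflecting a
# gender-skewed hiring history. Gender itself appears nowhere.
history = [
    ({"python", "womens_college"}, False),
    ({"python"}, True),
    ({"java", "womens_college"}, False),
    ({"java"}, True),
    ({"python", "java"}, True),
    ({"womens_college", "java"}, False),
]

def keyword_rates(data):
    """Historical hire rate for resumes containing each keyword."""
    seen, hired = defaultdict(int), defaultdict(int)
    for keywords, label in data:
        for kw in keywords:
            seen[kw] += 1
            hired[kw] += label
    return {kw: hired[kw] / seen[kw] for kw in seen}

def score(keywords, rates):
    """Average historical hire rate across a resume's keywords."""
    return sum(rates[kw] for kw in keywords) / len(keywords)

rates = keyword_rates(history)
# "womens_college" never co-occurs with a hire in this toy history,
# so any resume mentioning it is dragged down — a proxy for gender.
print(rates["womens_college"])                     # 0.0
print(score({"python"}, rates))                    # higher score
print(score({"python", "womens_college"}, rates))  # lower score
```

The point of the sketch is that "blindness" happens at the input level while the bias lives in the labels, so removing the protected attribute does nothing to remove the disparity.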
“It’s hard to be fair if you’re not aware of an algorithm’s potential impact on different subgroups,” says Ho. “That’s why blindness can often be significantly inferior to what machine learners call ‘fairness through awareness.’”
The good news is that Ho and Xiang see a possible solution to the legal morass.
A separate strand of affirmative action case law, tied to government contracting, has long permitted explicit racial and gender preferences. Federal, state, and local agencies create set-asides and bidding preferences for minority-owned contractors, and those preferences have passed legal muster because the agencies could document their own histories of discrimination.
Ho says that advocates for fairness could well justify race- or gender-based fixes on the basis of past discrimination. Like the Amazon system, most AI models are trained on historical data that may incorporate past patterns of discrimination.
What the law emphasizes, Ho says, is that the magnitude of the adjustment should track the evidence for historical discrimination. The government contracting precedent allows for explicit quantification, thus making fair machine learning more feasible.
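That idea of tying the size of a remedy to documented evidence can be sketched in code. This is a hypothetical illustration under invented numbers, not a method from the paper: it measures a historical selection-rate gap between two toy groups, then sizes a score adjustment in proportion to that measured gap (the `scale` factor is likewise an assumption).

```python
def selection_rate(outcomes):
    """Fraction of historical candidates selected (1 = selected)."""
    return sum(outcomes) / len(outcomes)

# Toy historical selection outcomes for two groups (invented data).
group_a = [1, 1, 0, 1, 1, 0, 1, 1]  # rate 0.75
group_b = [1, 0, 0, 0, 1, 0, 0, 0]  # rate 0.25

# The documented disparity: the evidence the adjustment must track.
gap = selection_rate(group_a) - selection_rate(group_b)  # 0.5

def adjusted_score(raw, disadvantaged, gap, scale=10):
    """Boost proportional to the measured disparity, not an
    arbitrary constant; scale is a hypothetical tuning choice."""
    return raw + (gap * scale if disadvantaged else 0)

print(adjusted_score(60, True, gap))   # 65.0
print(adjusted_score(60, False, gap))  # 60
```

The design choice mirrors the legal logic Ho describes: if the historical gap shrinks, the adjustment shrinks with it, making the remedy explicitly quantified rather than open-ended.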
Put another way, say Ho and Xiang, the jurisprudence of government contract law may offer an escape from the trap of blindness and a path toward fairness through awareness.
Stanford HAI's mission is to advance AI research, education, policy and practice to improve the human condition.