For aspiring home buyers, getting a mortgage often comes down to one talismanic number: the credit score.
Banks and other lenders are turning to artificial intelligence to develop increasingly sophisticated models for scoring credit risk. But even though credit-scoring companies are legally prohibited from considering factors like race or ethnicity, critics have long worried that the models contain hidden biases against disadvantaged communities, limiting their access to credit.
Now a preprint study in which researchers used artificial intelligence to test alternative credit-scoring models finds that there is indeed a problem for lower-income families and minority borrowers: The predictive tools are between 5 and 10 percent less accurate for these groups than for higher-income and non-minority groups.
Read the paper: How Costly is Noise? Data and Disparities in Consumer Credit
It’s not that the credit score algorithms themselves are biased against disadvantaged borrowers. Rather, it’s that the underlying data is less accurate in predicting creditworthiness for those groups, often because those borrowers have limited credit histories.
A “thin” credit history will in itself lower a person’s score, because lenders prefer more data to less. But it also means that one or two small dings, such as a delinquent payment many years in the past, can cause outsized damage to a person’s score.
“We’re working with data that’s flawed for all sorts of historical reasons,” says Laura Blattner, an assistant professor of finance at the Stanford Graduate School of Business, who co-authored the new study with Scott Nelson of the University of Chicago Booth School of Business. “If you have only one credit card and never had a mortgage, there’s much less information to predict whether you’re going to default. If you defaulted one time several years ago, that may not tell much about the future.”
The Root of the Problem
In analyzing the issue, the researchers used artificial intelligence and huge volumes of consumer data to test different credit-scoring models.
The first step was to figure out if the standard credit-score approaches were equally accurate across different demographic groups. They used AI to crunch anonymized consumer credit data on 50 million people that was provided by one of the major credit-score companies. They also worked with a huge marketing dataset that makes it possible to identify borrowers by income, race, and ethnicity.
Part of the challenge was figuring out whether people who were rejected for home loans would have been likely to default on mortgages if they had been approved. To do that, the AI models looked at how rejected mortgage applicants had kept up with other kinds of loans, such as car loans, which correlate very closely with how likely a person is to default on a mortgage.
Sure enough, the credit scores turned out to be less accurate for low-income and minority borrowers than for others.
Overall, Blattner and Nelson estimate that there is substantially more “noise” or misleading data in the credit scores of people in minority and low-income households. They found that scores for minorities are about 5 percent less accurate in predicting default risk than the scores of non-minority borrowers. Likewise, the scores for people in the bottom fifth of income are about 10 percent less predictive than those for higher-income borrowers.
But why? Were the algorithms biased because they couldn’t pick up on the distinctive patterns of particular demographic groups?
To find out, Blattner and Nelson tested alternative scoring models that had been fine-tuned to minority and low-income borrowers. The results didn’t change much: The scores were still less accurate for them.
The real problem was in the data. People with very limited credit files, who had taken out few loans and held few if any credit cards, were harder to assess for creditworthiness. That was especially the case for people who had one or two blemishes on their records. Because minority borrowers and low-income earners were more likely to have thin or spotty credit records, their credit scores were less accurate.
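To see why a thin file is harder to score, consider a toy calculation. It is only an illustrative sketch, not the model used in the study: the smoothed risk estimate, the file sizes of 5 and 100 payment records, and the 10 percent true missed-payment rate are all assumptions chosen for the example.

```python
import math


def estimated_risk(missed: int, total: int) -> float:
    """Toy risk estimate: Laplace-smoothed share of missed payments."""
    return (missed + 1) / (total + 2)


def ding_impact(total: int) -> float:
    """How far a single missed payment moves the toy estimate for a file of this size."""
    return estimated_risk(1, total) - estimated_risk(0, total)


def standard_error(true_rate: float, total: int) -> float:
    """Sampling noise of an observed missed-payment rate over `total` records."""
    return math.sqrt(true_rate * (1 - true_rate) / total)


# Hypothetical file sizes (number of payment records), chosen for illustration.
THIN_FILE, THICK_FILE = 5, 100

if __name__ == "__main__":
    for label, n in (("thin", THIN_FILE), ("thick", THICK_FILE)):
        print(
            f"{label} file ({n} records): one ding shifts the estimate by "
            f"{ding_impact(n):.3f}; noise at a 10% true rate is "
            f"{standard_error(0.10, n):.3f}"
        )
```

In this sketch, the same single blemish moves the estimate far more for the thin file than for the thick one, and the estimate itself is noisier, which mirrors the pattern the researchers describe.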
Blattner and Nelson estimate that making the existing credit report data more informative, such as by compiling thicker and more diverse files for minorities, could eliminate half the disparity in accuracy.
A Misallocation of Credit
The findings have a number of important implications. The study suggests that people from disadvantaged communities are being turned down for mortgages more often than they probably should be, though some of those borrowers may also get approved when they probably shouldn’t be. That creates a misallocation of credit. It can also perpetuate inequalities, because people who can’t get mortgages have less opportunity to build up solid credit records. They also lose out on a crucial avenue for building wealth.
Some experts have argued that the algorithms run into trouble because financial institutions are legally prohibited from incorporating factors like ethnicity, gender, or race into their models. Those prohibitions are intended to prevent discrimination, but they can also block lenders from recognizing differences between groups that might actually elevate some borrowers’ credit scores.
The study finds that this is not the source of the problem. Even when Blattner and Nelson created scoring models that were fine-tuned to minority and low-income borrowers, the scores were still less accurate for those groups.
There is no simple solution, Blattner says. But one possible strategy would be for financial companies to run experiments in which they approve loans to people with relatively low credit scores.
“If you’re a bank, you could give loans to people and see who pays,” says Blattner. “That’s exactly what some fin-tech companies are doing: giving loans and then learning.”
Stanford HAI's mission is to advance AI research, education, policy and practice to improve the human condition.