AI Hiring Tools Can Yield Racial Bias and Systemic Rejection

The first large-scale study of hiring algorithms in the wild finds concerning patterns in how these systems reject candidates.
It’s graduation season, and the Class of 2026 is entering one of the toughest labor markets in years. Entry-level hiring has slowed. At the same time, AI tools have made it easier than ever for job seekers to fire off applications. Together, fewer jobs and more applications mean companies are now seeing nearly three times as many applications for entry-level positions as in 2022. AI is changing not just whether firms hire, but how they hire. Ninety percent of U.S. employers use AI screening tools to sort and rank job seekers, with most relying on the same few third-party vendors. When one algorithm influences many employers, what is the impact on job seekers?
We followed 3.4 million people who submitted 4 million job applications to 1,700 job postings across 150 employers and 11 industry sectors. Each job application was assessed by an AI hiring tool built by a single third-party vendor. Our new paper offers a rare look inside the “black box” of algorithmic hiring, showing that these tools increase racial bias and shut the same people out of jobs everywhere they apply.

The hiring AI pipeline: Job seekers submit applications, their applications are sent to the hiring AI vendor, the vendor’s machine learning models make predictions, and the resulting labels of “recommend” or “do not recommend” are sent to the employer to inform decisions.
Surfacing racial bias at scale
We find substantial evidence of racial disparities in AI-based candidate screening. To measure adverse impact, we apply the EEOC’s “four-fifths rule,” a standard used under the relevant U.S. employment law (Title VII), which flags a position when one group is recommended at less than 80% of the rate of the most-recommended group. We found that 26% of Black applicants and 15% of Asian applicants applied to positions where the AI system discriminated against their racial group. To put this in perspective: If the AI had recommended Black and Asian candidates at the same rate as it recommended the most-favored group (typically white applicants), 40,000 more of their applications would have advanced to the next stage of hiring.
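As a rough illustration of the four-fifths check for a single position, the sketch below divides each group's recommendation rate by the highest group's rate and flags any group whose ratio falls below 0.8. The group labels, counts, and function names are hypothetical and are not taken from the study.

```python
def impact_ratios(recommended, applied):
    """Return each group's recommendation rate divided by the top group's rate."""
    rates = {group: recommended[group] / applied[group] for group in applied}
    top_rate = max(rates.values())
    return {group: rate / top_rate for group, rate in rates.items()}


def flagged_groups(recommended, applied, threshold=0.8):
    """Return the groups whose impact ratio falls below the threshold."""
    ratios = impact_ratios(recommended, applied)
    return sorted(group for group, ratio in ratios.items() if ratio < threshold)


# Hypothetical counts for one position (illustrative only).
applied = {"A": 200, "B": 100, "C": 50}
recommended = {"A": 100, "B": 35, "C": 22}

print(impact_ratios(recommended, applied))
print(flagged_groups(recommended, applied))
```

In this made-up example, group A is recommended at 50%, group B at 35%, and group C at 44%, so only group B (a ratio of about 0.7) falls below the four-fifths threshold.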
How adverse impact is measured matters. The vendor we study screens applicants for many different positions across many employers. If we pool all of its recommendations together, treating the vendor as one giant hiring process, we don’t find adverse impact. If we look at each position separately, as would be typical in an evaluation of adverse impact, we find adverse impact in many positions. For example, imagine the AI tool frequently recommends Black applicants for warehouse jobs but rarely recommends them for finance jobs. If we were to average all the jobs together, those two patterns would cancel each other out and it would seem like there is no discrimination. The big-picture average hides the real discrimination happening job by job.
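The warehouse and finance example can be made concrete with a small sketch. All counts below are invented for illustration, and the 0.8 threshold is the four-fifths rule described earlier; this is not the study's data or code.

```python
from collections import defaultdict

THRESHOLD = 0.8  # four-fifths rule

# Hypothetical rows: (position, group, applied, recommended).
rows = [
    ("warehouse", "Black", 1000, 600),
    ("warehouse", "white", 200, 100),
    ("finance", "Black", 200, 40),
    ("finance", "white", 1000, 500),
]


def flagged(counts):
    """counts maps group -> (applied, recommended); return groups below threshold."""
    rates = {group: rec / app for group, (app, rec) in counts.items()}
    top_rate = max(rates.values())
    return sorted(g for g, rate in rates.items() if rate / top_rate < THRESHOLD)


pooled = defaultdict(lambda: [0, 0])   # group -> [applied, recommended]
by_position = defaultdict(dict)        # position -> {group: (applied, recommended)}
for position, group, app, rec in rows:
    pooled[group][0] += app
    pooled[group][1] += rec
    by_position[position][group] = (app, rec)

pooled_flags = flagged(pooled)
position_flags = {pos: flagged(counts) for pos, counts in by_position.items()}

print("pooled:", pooled_flags)
print("per position:", position_flags)
```

With these numbers, the pooled rates are about 53% for Black applicants and 50% for white applicants, so nothing is flagged. Evaluated separately, the finance position recommends Black applicants at 20% against 50% for white applicants, a ratio of 0.4, and is flagged.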

Our study finds significant adverse impact on Black and Asian applicants.
Algorithmic monocultures can give rise to systemic rejection
We also study new concerns brought about by the shared dependence on a single hiring vendor. In our prior work, we theorized that algorithmic monocultures, in which many employers come to rely on the same algorithmic recommendations, could lead to some people being shut out from jobs. Using our large dataset of real hiring AI recommendations, we test this hypothesis. We find that people who submit multiple applications to positions screened by the same algorithmic hiring vendor are more likely to be rejected from every position to which they apply than would be true if the companies made decisions statistically independently from one another. Ten percent of applicants who submit four applications are rejected from all the places to which they apply.
We did not find the same pattern in a comparison dataset. We analyzed data from the largest prior study of hiring decisions, which sent 83,000 applications to 108 Fortune 500 firms during the same time period as our study and did not focus on whether AI was used to make decisions. We found that the rate at which applicants were rejected from every firm they applied to in this data was no higher than what you’d expect if each company decided independently of the others.
This suggests market concentration matters: As a single hiring vendor comes to dominate screening for an industry, it may be more likely that candidates are shut out.
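One simple way to picture the comparison against an independence baseline is sketched below. It assumes the baseline is built from each position's overall rejection rate: if positions decided independently, the chance that an applicant is rejected everywhere would be the product of the rejection rates of the positions they applied to. This is an assumed formulation for illustration, not necessarily the exact method used in the paper, and the applicants, positions, and outcomes are hypothetical.

```python
from collections import defaultdict
from math import prod

# Hypothetical records: (applicant, position, recommended).
applications = [
    ("a", "P1", False), ("a", "P2", False),
    ("b", "P1", False), ("b", "P2", False),
    ("c", "P1", True), ("c", "P2", True),
    ("d", "P1", True), ("d", "P2", True),
]


def systemic_rejection(applications):
    """Return (observed, expected) shares of applicants rejected everywhere."""
    pos_total = defaultdict(int)
    pos_rejected = defaultdict(int)
    by_applicant = defaultdict(list)
    for applicant, position, recommended in applications:
        pos_total[position] += 1
        pos_rejected[position] += not recommended
        by_applicant[applicant].append((position, recommended))

    # Each position's overall rejection rate.
    reject_rate = {p: pos_rejected[p] / pos_total[p] for p in pos_total}

    n = len(by_applicant)
    # Observed: share of applicants rejected from every position they applied to.
    observed = sum(
        all(not rec for _, rec in apps) for apps in by_applicant.values()
    ) / n
    # Expected under independence: product of per-position rejection rates.
    expected = sum(
        prod(reject_rate[p] for p, _ in apps) for apps in by_applicant.values()
    ) / n
    return observed, expected


observed, expected = systemic_rejection(applications)
print(observed, expected)
```

In this toy data each position rejects half of its applicants, so independent decisions would reject 25% of applicants from both positions. Because the two positions always agree, 50% are rejected from both, which is the kind of gap between observed and expected rates that signals systemic rejection.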

We find applicants are more likely to be rejected from every position they apply to than would be predicted by the baseline of each position making statistically independent decisions.

We analyze data from the largest previous study of hiring outcomes, finding that the rate at which applicants are rejected from every position they apply to is effectively predicted by the baseline of statistically independent decisions.
AI screening tools bring together three properties that should not co-exist in high-stakes decision-making: They are pervasively adopted, highly consequential, and opaque to the public. Our research makes progress toward illuminating the consequences of AI hiring tools, but much of this technology’s impact remains unclear. This space is rapidly evolving as new tools are built using language models and agents.
The key lesson from this work is the value of and need for independent research into algorithmic hiring. Without independent research, it will be difficult to pursue evidence-based AI policy to govern AI’s impact on individual job prospects and overall workforce composition.



