In California’s Santa Clara County, the pandemic has exposed the many challenges contact tracers confront as they scramble to head off the spread of COVID-19. Chief among them: language barriers, a fact that has contributed to troubling health disparities for the county’s Latino community. While only about 25 percent of Santa Clara’s population is Latino, that community accounts for more than 56 percent of the county’s COVID-19 cases.
“When we connect with people in their preferred language, it makes a huge difference in their willingness to share information about themselves, their health, and their families and friends,” said co-author Dr. Sarah Rudman, director of contact tracing for the County of Santa Clara Public Health Department.
The issue runs deeper than preference alone, Rudman explained. Many in the Latino community are understandably distrustful, if not fearful, of government employees requesting personal information. Meanwhile, a large portion is only able to have these complex conversations in Spanish. These two truths create a potent mix that can inhibit, or halt, contact tracing.
Being able to predict when a contact is a monolingual Spanish speaker or has limited English proficiency and assign the individual to one of the county’s native Spanish-speaking tracers is a key factor in the success of tracing efforts. Interpreters help, but only to a point. In a field where time is of the essence and tracers must reach thousands of cases and their contacts per day, even skilled interpreters can slow the process. As a result, it can take days to successfully reach and close a case. Every minute that passes, the disease is spreading.
Recently, however, experts from Stanford’s RegLab, a group that designs and evaluates programs, policies, and technologies to modernize government, have come to the aid of the Santa Clara County Public Health Department. In a study detailed in Proceedings of the National Academy of Sciences, the RegLab team describes how it applied machine learning to transform contact tracing in Santa Clara County and narrow the health gap between the county’s Latino community and other communities.
Predicting Language Preference
“Due to the decentralized nature of laboratory testing, initial reports of positive cases often contain only limited information, like name, address, and date of birth,” said Daniel Ho, a Stanford Law School professor who is faculty director of the RegLab and associate director of the Stanford Institute for Human-Centered Artificial Intelligence.
Read the study: A Language-matching Model to Improve Equity and Efficiency of COVID-19 Contact Tracing
Ho and his collaborators at Stanford were able to blend that scant data with demographic information from the census and other administrative data, then apply a machine learning algorithm that scores each contact on the language they are most likely to prefer, before the first call or text goes out.
The scores are based on a complex analysis of data like census block group, age, and names. The algorithm weighs these factors and finds patterns in the data that predict a person’s language preference. The team paid special attention to making the model explainable, in order to build support, understanding, and buy-in for the approach.
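For readers who want a concrete picture of what an explainable score can look like, here is a minimal sketch in Python. It is not the RegLab model: the feature names, weights, bias, and threshold are all hypothetical, chosen only to show how a handful of inputs (neighborhood, name, age) could be combined into a single probability whose parts can be inspected one by one.

```python
import math

# Hypothetical weights for illustration only; they are NOT taken from the study.
WEIGHTS = {
    "block_group_spanish_share": 3.0,  # assumed share of Spanish-speaking households in the census block group (0 to 1)
    "surname_spanish_share": 2.5,      # assumed share of people with this surname who prefer Spanish (0 to 1)
    "age_over_40": 0.5,                # 1 if the contact is older than 40, else 0
}
BIAS = -3.0  # hypothetical intercept


def spanish_preference_score(features):
    """Logistic score between 0 and 1; higher means Spanish is more likely preferred."""
    z = BIAS + sum(WEIGHTS[name] * features[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def explain(features):
    """Each feature's additive contribution to the log-odds, for human review."""
    return {name: WEIGHTS[name] * features[name] for name in WEIGHTS}


def route(features, threshold=0.5):
    """Send likely Spanish speakers to the bilingual team (hypothetical threshold)."""
    if spanish_preference_score(features) >= threshold:
        return "language_specialty_team"
    return "general_team"


if __name__ == "__main__":
    contact = {
        "block_group_spanish_share": 0.8,
        "surname_spanish_share": 0.9,
        "age_over_40": 1,
    }
    print(route(contact), explain(contact))
```

Because every contribution is a simple product of a weight and an input, a tracer or supervisor can see why a given contact was routed to the bilingual team, which is the kind of transparency the team describes.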
To test the algorithm’s effectiveness, the RegLab and the county conducted a randomized controlled trial. They randomly routed a subset of cases to a “language specialty team” with bilingual speakers and handled the remainder with the county’s typical process.
In a few short months, the algorithm’s effect crystallized. In the test group, case time from open to completion declined by nearly 14 hours relative to the control group. Same-day completions rose by 12 percent, and outright refusals to interview dipped by 4 percent.
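The logic of a trial like the one described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the study’s analysis code: the function names, the configurable treatment fraction, and the fixed random seed are all hypothetical. The sketch randomly splits cases into two arms and then compares an average outcome, such as hours from open to completion, between them.

```python
import random
from statistics import mean


def assign_arms(case_ids, treat_fraction=0.5, seed=0):
    """Randomly split cases into a specialty arm and a control arm.

    treat_fraction and seed are illustrative parameters, not values from the study.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(case_ids)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * treat_fraction)
    return {"specialty": shuffled[:cut], "control": shuffled[cut:]}


def mean_difference(outcomes, arms):
    """Mean outcome in the specialty arm minus mean outcome in the control arm."""
    specialty_mean = mean(outcomes[c] for c in arms["specialty"])
    control_mean = mean(outcomes[c] for c in arms["control"])
    return specialty_mean - control_mean
```

Here `outcomes` would be a dictionary mapping each case ID to a measured value. Because assignment is random, a negative difference in completion time can be attributed to the routing itself rather than to differences between the two groups of cases.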
The Luxury of Insight
The results have been powerful, Ho and Rudman agree. The new approach has built people’s trust and willingness to engage in the process. It has also allowed Rudman the opportunity to match resources to the job at hand — she has been able to ensure the bilingual tracers the county had could be assigned to the contacts most likely to need them, relying on interpreters as little as possible.
“Before the algorithm you could hear frustration in the voices of our tracers when they would get mismatched with a contact,” Rudman recalled. “After the algorithm, there would be talk of the families they had connected with, many of whom stayed on the phone only because the tracer spoke Spanish and pronounced their name correctly.”
In a field where every missed contact and every extra minute can mean additional infections, these are significant improvements, Ho said, noting that the partnership between people and machine was a surprising — and refreshing — outcome for him.
“There’s much worry in the AI community about whether machines will displace human judgment,” he said. “But this case is a model for how machines and people can integrate in complex ways that make both better.”
Additional authors include Lisa Lu (lead author) and Ben Anderson, both research fellows at the RegLab; Derek Ouyang, project lead at the RegLab and Stanford Future Bay Initiative; Raymond Ha, a PhD student at Stanford University; and Alexis D’Agostino, senior research and evaluation specialist, Santa Clara County Public Health Department.