Promoting Algorithmic Fairness in Clinical Risk Prediction | Stanford HAI

Policy Brief

Promoting Algorithmic Fairness in Clinical Risk Prediction

Date
September 09, 2022
Topics
Healthcare
Machine Learning
Ethics, Equity, Inclusion
Read Paper
Abstract

This brief examines the debate on algorithmic fairness in clinical predictive algorithms and recommends paths to safer, more equitable healthcare AI.

Key Takeaways

  • We studied the trade-offs clinical predictive algorithms face between accuracy and fairness for outcomes such as hospital mortality, prolonged hospital stays, and 30-day readmissions. We found that techniques that make these algorithms fairer can degrade their performance across the board.

  • Algorithmic fixes on the developer’s side should be only one of several remedies. Policymakers should consider ways to incentivize model developers to engage in participatory design practices that incorporate perspectives from patient advocacy groups and civil society organizations.

  • Algorithmic fixes may work in some contexts; in others, policymakers may need to mandate that a human stay in the decision-making loop, or using the algorithm may not be worthwhile at all.

Executive Summary

Healthcare providers and medical professionals are increasingly using machine learning to advance how treatment is delivered to patients. From medical image analysis to a range of data processing functions, these machine learning applications will only continue to shape patient-care experiences and medical outcomes. Developers, doctors, patients, and policymakers are just some of the stakeholders grappling with these algorithmic uses.

That said, machine learning in healthcare faces a fundamental problem: We cannot assume developers are making concerted strides to remedy bias and other fairness issues. Discriminatory AI decision-making is concerning in any setting, but it is especially troubling in clinical settings, where individuals’ well-being and physical safety are on the line and medical professionals face life-or-death decisions every day.

Until now, the conversation about measuring algorithmic fairness in healthcare has focused on fairness itself—and has not fully taken into account how fairness techniques could impact clinical predictive models, which are often derived from large clinical datasets. Our new research, published in the Journal of Biomedical Informatics, seeks to ground this debate in evidence, and suggests the best way forward in developing fairer machine learning tools for a clinical setting.

We explicitly measure trade-offs between the fairness and performance of clinical predictive models. Using three large datasets spanning decades of health outcomes, we evaluated predictions of hospital mortality, prolonged hospital stays, and 30-day hospital readmissions against three different notions of fairness across demographic groupings such as race, ethnicity, sex, and age. In total, we find that improvements in algorithmic fairness, achieved by minimizing differences between demographic groups, lower performance across multiple metrics. This exposes the many challenges ahead in mitigating the kind of algorithmic bias that has long harmed certain demographic groups in the United States.
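To make the kind of trade-off measured here concrete, below is a minimal, self-contained sketch. It builds a synthetic two-group patient cohort, near-equalizes recall across the groups by adjusting each group's decision threshold (a common post-processing fairness fix), and shows that overall accuracy falls as the recall gap closes. All data, score models, and thresholds are invented for illustration; they are not the datasets, fairness definitions, or methods used in the paper.

```python
import random

random.seed(0)

def make_cohort(n, base_rate, score_shift):
    """Synthetic patients as (true outcome, model risk score) pairs.

    Hypothetical data for illustration only. score_shift models a
    group whose risk scores are systematically inflated.
    """
    cohort = []
    for _ in range(n):
        y = 1 if random.random() < base_rate else 0
        score = 0.3 * y + 0.4 * random.random() + score_shift
        cohort.append((y, min(1.0, max(0.0, score))))
    return cohort

group_a = make_cohort(2000, base_rate=0.30, score_shift=0.15)
group_b = make_cohort(2000, base_rate=0.15, score_shift=0.00)

def recall(cohort, threshold):
    positives = [s for y, s in cohort if y == 1]
    return sum(s >= threshold for s in positives) / len(positives)

def overall_accuracy(groups_and_thresholds):
    correct = sum(sum((s >= t) == (y == 1) for y, s in c)
                  for c, t in groups_and_thresholds)
    total = sum(len(c) for c, _ in groups_and_thresholds)
    return correct / total

# One shared threshold: the groups end up with unequal recall.
t_shared = 0.5
gap_shared = abs(recall(group_a, t_shared) - recall(group_b, t_shared))
acc_shared = overall_accuracy([(group_a, t_shared), (group_b, t_shared)])

def matching_threshold(cohort, target):
    """Scan thresholds for the one whose recall is closest to target."""
    return min((i / 100 for i in range(101)),
               key=lambda t: abs(recall(cohort, t) - target))

# Group-specific threshold for group_b that near-equalizes recall.
t_b = matching_threshold(group_b, recall(group_a, t_shared))
gap_fair = abs(recall(group_a, t_shared) - recall(group_b, t_b))
acc_fair = overall_accuracy([(group_a, t_shared), (group_b, t_b)])

# Shrinking the recall gap costs overall accuracy in this toy setup.
print(f"recall gap: {gap_shared:.3f} -> {gap_fair:.3f}")
print(f"accuracy:   {acc_shared:.3f} -> {acc_fair:.3f}")
```

In the paper's setting the same tension appears with richer models and formal fairness criteria; threshold adjustment above is just one simple post-processing approach, shown here only to illustrate why closing a between-group gap can reduce aggregate performance.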

Policymakers should recognize that there is no technical solution to address unfairness in clinical predictive models that does not decrease accuracy. Consequently, they should consider ways to incentivize responsible algorithm development alongside policies that address broader, structural healthcare inequities such as those caused by racism and socioeconomic inequality. The use of clinical predictive models must either be narrowly calibrated to a particular setting or constructed so that a human healthcare provider stays in the decision-making loop to ensure fair patient treatment. If machine learning models do not promote health equity, it may be appropriate to abstain from using an algorithm altogether.

Authors
  • Stephen R. Pfohl
  • Agata Foryciarz
  • Nigam Shah

Related Publications

How Can AI Support Language Digitization and Digital Inclusion?
Juan Pava, Thomas S. Mullaney, Caroline Meinhardt, Audrey Gao, Diyi Yang
Deep Dive, Feb 26, 2026
White Paper

This white paper analyzes the varying ways AI tools can advance language digitization work, and provides recommendations for responsibly realizing the potential of AI in supporting the digital inclusion of digitally disadvantaged languages.


Toward Responsible AI in Health Insurance Decision-Making
Michelle Mello, Artem Trotsyuk, Abdoul Jalil Djiberou Mahamadou, Danton Char
Quick Read, Feb 10, 2026
Policy Brief

This brief proposes governance mechanisms for the growing use of AI in health insurance utilization review.


Response to FDA's Request for Comment on AI-Enabled Medical Devices
Desmond C. Ong, Jared Moore, Nicole Martinez-Martin, Caroline Meinhardt, Eric Lin, William Agnew
Quick Read, Dec 02, 2025
Response to Request

Stanford scholars respond to a federal RFC on evaluating AI-enabled medical devices, recommending policy interventions to help mitigate the harms of AI-powered chatbots used as therapists.


Moving Beyond the Term "Global South" in AI Ethics and Policy
Evani Radiya-Dixit, Angèle Christin
Quick Read, Nov 19, 2025
Issue Brief

This brief examines the limitations of the term "Global South" in AI ethics and policy, and highlights the importance of grounding such work in specific regions and power structures.
