Toward Fairness in Health Care Training Data | Stanford HAI

Policy Brief


Date
October 01, 2020
Topics
Healthcare
Ethics, Equity, Inclusion
Abstract

This brief highlights the lack of geographic representation in medical-imaging AI training data and calls for nationwide, diversity-focused data-sharing initiatives.

Key Takeaways

  • Bias arises when we build algorithms using datasets that don’t mirror the population. When generalized to broader swaths of the population, these nonrepresentative data can confound research findings.

  • In the studies we surveyed, the vast majority of the health data used to build AI algorithms came from just three states, with little or no representation from the remaining 47.

  • Policymakers, regulators, industry, and academia need to work together to ensure medical AI data reflect America’s diversity across not only geography but also many other important attributes. To that end, nationwide data-sharing initiatives should be a top priority.

Executive Summary

With recent advances in artificial intelligence (AI), researchers can now train sophisticated computer algorithms to interpret medical images, often with accuracy comparable to that of trained physicians. Yet our recent survey of medical research shows that these algorithms rely on datasets that lack population diversity and could introduce bias into the understanding of a patient’s health condition.

Artificial intelligence algorithms increasingly inform the decisions of human experts. In medical imaging, these algorithms may help a doctor spot a subtle finding or suggest an alternate diagnosis. But bias in the data used to train these high-stakes algorithms can bias the algorithms themselves. Our analysis shows that the datasets used to develop them come from only a handful of locations, raising serious questions for policymakers but also providing opportunities for course correction.

In our research, published in the Journal of the American Medical Association, we looked at data from more than 70 studies that used U.S. patient data to train algorithms designed to compete or collaborate with physicians on diagnostic tasks. Overwhelmingly, the datasets came from three states (California, Massachusetts, and New York), with little or no representation from the remaining 47. Rectifying this lack of representation in medical data should be front of mind for health policymakers and regulators. Lack of data diversity can be addressed in part by initiatives to streamline the nation’s digital infrastructure, to enhance the availability of patient data from underrepresented populations for larger studies, and to incentivize ethical data sharing and the democratization of medical data.

Read Paper
Authors
  • Amit Kaushal
  • Russ Altman
  • Curtis Langlotz

Related Publications

Response to FDA's Request for Comment on AI-Enabled Medical Devices
Desmond C. Ong, Jared Moore, Nicole Martinez-Martin, Caroline Meinhardt, Eric Lin, William Agnew
Response to Request | Healthcare; Regulation, Policy, Governance | Dec 02, 2025

Stanford scholars respond to a federal RFC on evaluating AI-enabled medical devices, recommending policy interventions to help mitigate the harms of AI-powered chatbots used as therapists.

Moving Beyond the Term "Global South" in AI Ethics and Policy
Evani Radiya-Dixit, Angèle Christin
Issue Brief | Ethics, Equity, Inclusion; International Affairs, International Security, International Development | Nov 19, 2025

This brief examines the limitations of the term "Global South" in AI ethics and policy, and highlights the importance of grounding such work in specific regions and power structures.

Russ Altman’s Testimony Before the U.S. Senate Committee on Health, Education, Labor, and Pensions
Russ Altman
Testimony | Healthcare; Regulation, Policy, Governance; Sciences (Social, Health, Biological, Physical) | Oct 09, 2025

In this testimony presented to the U.S. Senate Committee on Health, Education, Labor, and Pensions hearing titled “AI’s Potential to Support Patients, Workers, Children, and Families,” Russ Altman highlights opportunities for congressional support to make AI applications for patient care and drug discovery stronger, safer, and human-centered.

Michelle M. Mello's Testimony Before the U.S. House Committee on Energy and Commerce Health Subcommittee
Michelle Mello
Testimony | Healthcare; Regulation, Policy, Governance | Sep 02, 2025

In this testimony presented to the U.S. House Committee on Energy and Commerce’s Subcommittee on Health hearing titled “Examining Opportunities to Advance American Health Care through the Use of Artificial Intelligence Technologies,” Michelle M. Mello calls for policy changes that will promote effective integration of AI tools into healthcare by strengthening trust.