Improving AI Software for Healthcare Diagnostics | Stanford HAI

Policy Brief

Improving AI Software for Healthcare Diagnostics

Date: July 01, 2021
Topics: Healthcare

Abstract

This brief explores current regulatory frameworks for AI use in radiology and calls for stronger regulatory guidance to improve testing, enhance safety, and establish performance standards.

Key Takeaways

  • AI-based diagnostics show great promise to improve traditional medical imaging methods, such as CT scans, MRIs, and X-rays. These algorithms offer computational capabilities that process images with greater speed and accuracy than traditional methods and can improve patient outcomes for millions.

  • Current regulatory framework proposals fall short of building trust in these systems: they confuse the algorithm in question with the task it is designed to perform, inadequately establish standard-setting bodies, and lack rigor in the evaluation and development process.

  • Policymakers should turn to medical societies for the clinical definitions of diagnostic tasks. These groups should extend performance assessments beyond simply testing for accuracy.

Executive Summary

One of the most promising uses of artificial intelligence (AI) is in radiology, the medical specialization that uses imaging technology to diagnose and treat disease. AI holds great promise for more accurate healthcare diagnostics and even prediction of disease outcomes for patients. AI can improve traditional medical imaging methods like computed tomography (CT), magnetic resonance imaging (MRI), and X-ray by offering computational capabilities that process images with greater speed and accuracy, automatically recognizing complex patterns to assess a patient’s health. This sophisticated software needs more robust evaluation methods to reduce risk to the patient, to establish trust, and to ensure wider adoption. A clear example is the difficulty many researchers had in classifying imaging results in early studies of COVID-19.

In our article in the Journal of the American College of Radiology, “Regulatory Frameworks for Development and Evaluation of Artificial Intelligence-Based Diagnostic Imaging Algorithms,” we explore three major regulatory frameworks for radiology that have been proposed by the United States Food and Drug Administration (FDA), the European Union, and the International Medical Device Regulators Forum, respectively, and show how they ensure safety, effectiveness, and performance of AI-based applications. However, these regulatory bodies could be doing more to build trust in these systems; we recommend changes that need to be made so diagnostic AI can reach its full potential. The shortcomings we enumerate stem from the tendency to confuse the algorithm with the task it is designed to perform, from the lack of rigor in definitions of medical tasks, and from lax specification of the task, which makes it harder to compare similar algorithms directly. We also identify problems with unpredictability in model performance, insufficient testing infrastructure, and inherent conflicts of interest.

A better path forward depends on policymakers and medical societies adopting stronger regulatory guidance to improve testing, enhance safety, and establish performance standards for these algorithms. Gaps in the three regulatory frameworks we examined can be filled by the following four actions:

  1. Make sure the algorithm is always distinguished from the definition of the diagnostic task it is automating.

  2. Define elements of algorithmic performance beyond accuracy, such as transparency, use of fail-safes, and auditability.

  3. Divide the evaluation into discrete steps from the perspective of the potential user or evaluator: defining the diagnostic task; testing the algorithm’s capacity to perform in a controlled environment; evaluating real-world effectiveness against performance in the controlled environment; validating effectiveness in the local setting at each installed site; and conducting durability testing and monitoring to ensure the algorithm performs well over time.

  4. Encourage independent assessment by third-party evaluators by implementing a phased testing regime similar to those used by the pharmaceutical industry during drug development and by the broader software industry.

Taken together, these steps have the potential to shape diagnostic AI’s future for the better and ensure the technology is developed as fairly and as rapidly as possible.

Authors
  • David B. Larson
  • Daniel L. Rubin
  • Curtis Langlotz

Related Publications

Toward Responsible AI in Health Insurance Decision-Making
Michelle Mello, Artem Trotsyuk, Abdoul Jalil Djiberou Mahamadou, Danton Char
Quick Read · Feb 10, 2026
Policy Brief

This brief proposes governance mechanisms for the growing use of AI in health insurance utilization review.


Response to FDA's Request for Comment on AI-Enabled Medical Devices
Desmond C. Ong, Jared Moore, Nicole Martinez-Martin, Caroline Meinhardt, Eric Lin, William Agnew
Quick Read · Dec 02, 2025
Response to Request

Stanford scholars respond to a federal RFC on evaluating AI-enabled medical devices, recommending policy interventions to help mitigate the harms of AI-powered chatbots used as therapists.


Russ Altman’s Testimony Before the U.S. Senate Committee on Health, Education, Labor, and Pensions
Russ Altman
Quick Read · Oct 09, 2025
Testimony

In this testimony presented to the U.S. Senate Committee on Health, Education, Labor, and Pensions hearing titled “AI’s Potential to Support Patients, Workers, Children, and Families,” Russ Altman highlights opportunities for congressional support to make AI applications for patient care and drug discovery stronger, safer, and human-centered.


Michelle M. Mello's Testimony Before the U.S. House Committee on Energy and Commerce Health Subcommittee
Michelle Mello
Quick Read · Sep 02, 2025
Testimony

In this testimony presented to the U.S. House Committee on Energy and Commerce’s Subcommittee on Health hearing titled “Examining Opportunities to Advance American Health Care through the Use of Artificial Intelligence Technologies,” Michelle M. Mello calls for policy changes that will promote effective integration of AI tools into healthcare by strengthening trust.
