Improving AI Software for Healthcare Diagnostics | Stanford HAI

Policy Brief

Improving AI Software for Healthcare Diagnostics

Date: July 01, 2021
Topics: Healthcare

Abstract

This brief explores current regulatory frameworks for AI use in radiology and calls for stronger regulatory guidance to improve testing, enhance safety, and establish performance standards.

Key Takeaways

  • AI-based diagnostics show great promise to improve traditional medical imaging methods, such as CT scans, MRIs, and X-rays. These algorithms offer computational capabilities that process images with greater speed and accuracy than traditional methods and can improve patient outcomes for millions.

  • Current regulatory framework proposals fall short of building trust in these systems: they confuse the algorithm in question with the task it is designed to perform, inadequately establish standard-setting bodies, and lack rigor in the evaluation and development process.

  • Policymakers should turn to medical societies for the clinical definitions of diagnostic tasks. These groups should extend performance assessments beyond simply testing for accuracy.

Executive Summary

One of the most promising uses of artificial intelligence (AI) is in radiology, the medical specialization that uses imaging technology to diagnose and treat disease. AI holds great promise for more accurate healthcare diagnostics and even prediction of disease outcomes for patients. AI can improve traditional medical imaging methods like computed tomography (CT), magnetic resonance imaging (MRI), and X-ray by offering computational capabilities that process images with greater speed and accuracy, automatically recognizing complex patterns to assess a patient’s health. This sophisticated software needs more robust evaluation methods to reduce risk to the patient, to establish trust, and to ensure wider adoption. A clear example is the difficulty many researchers had in classifying imaging results in early studies of COVID-19.

In our article in the Journal of the American College of Radiology, “Regulatory Frameworks for Development and Evaluation of Artificial Intelligence-Based Diagnostic Imaging Algorithms,” we explore three major regulatory frameworks for radiology that have been proposed by the United States Food and Drug Administration (FDA), the European Union, and the International Medical Device Regulators Forum, respectively, and show how they ensure safety, effectiveness, and performance of AI-based applications. However, these regulatory bodies could be doing more to build trust in these systems; we recommend changes that need to be made so diagnostic AI can reach its full potential. The shortcomings we enumerate stem from the tendency to confuse the algorithm with the task it is designed to perform, from the lack of rigor in definitions of medical tasks, and from lax specification of the task, which makes it harder to compare similar algorithms directly. We also identify problems with unpredictability in model performance, insufficient testing infrastructure, and inherent conflicts of interest.

A better path forward depends on policymakers and medical societies adopting stronger regulatory guidance to improve testing, enhance safety, and establish performance standards for these algorithms. Gaps in the three regulatory frameworks we examined can be filled by the following four actions:

  1. Make sure the algorithm is always distinguished from the definition of the diagnostic task it is automating.

  2. Define elements of algorithmic performance beyond accuracy, such as transparency, use of fail-safes, and auditability.

  3. Divide the evaluation into discrete steps from the perspective of the potential user or evaluator: defining the diagnostic task; testing the algorithm’s capacity to perform in a controlled environment; evaluating real-world effectiveness against performance in the controlled environment; validating effectiveness in the local setting at each installed site; and conducting durability testing and monitoring to ensure the algorithm performs well over time.

  4. Encourage independent assessment by third-party evaluators by implementing a phased testing regime similar to those used by the pharmaceutical industry during drug development and by the broader software industry.

Taken together, these steps have the potential to shape diagnostic AI’s future for the better and ensure the technology is developed as fairly and as rapidly as possible.

Authors
  • David B. Larson
  • Daniel L. Rubin
  • Curtis Langlotz

Related Publications

Toward Responsible AI in Health Insurance Decision-Making
Michelle Mello, Artem Trotsyuk, Abdoul Jalil Djiberou Mahamadou, Danton Char
Quick Read · Feb 10, 2026
Policy Brief

This brief proposes governance mechanisms for the growing use of AI in health insurance utilization review.


Response to FDA's Request for Comment on AI-Enabled Medical Devices
Desmond C. Ong, Jared Moore, Nicole Martinez-Martin, Caroline Meinhardt, Eric Lin, William Agnew
Quick Read · Dec 02, 2025
Response to Request

Stanford scholars respond to a federal RFC on evaluating AI-enabled medical devices, recommending policy interventions to help mitigate the harms of AI-powered chatbots used as therapists.


Russ Altman’s Testimony Before the U.S. Senate Committee on Health, Education, Labor, and Pensions
Russ Altman
Quick Read · Oct 09, 2025
Testimony

In this testimony presented to the U.S. Senate Committee on Health, Education, Labor, and Pensions hearing titled “AI’s Potential to Support Patients, Workers, Children, and Families,” Russ Altman highlights opportunities for congressional support to make AI applications for patient care and drug discovery stronger, safer, and human-centered.


Michelle M. Mello's Testimony Before the U.S. House Committee on Energy and Commerce Health Subcommittee
Michelle Mello
Quick Read · Sep 02, 2025
Testimony

In this testimony presented to the U.S. House Committee on Energy and Commerce’s Subcommittee on Health hearing titled “Examining Opportunities to Advance American Health Care through the Use of Artificial Intelligence Technologies,” Michelle M. Mello calls for policy changes that will promote effective integration of AI tools into healthcare by strengthening trust.
