Nigam Shah
Professor of Medicine (Biomedical Informatics) and of Biomedical Data Science, Stanford University; Chief Data Scientist, Stanford Health Care; Faculty Affiliate, Stanford HAI

Promoting Algorithmic Fairness in Clinical Risk Prediction
Key Considerations For Incorporating Conversational AI in Psychotherapy
Assessing the accuracy of automatic speech recognition for psychotherapy
Foundation models are transforming artificial intelligence (AI) in healthcare by providing modular components that can be adapted to a range of downstream tasks, making AI development more scalable and cost-effective. Foundation models for structured electronic health records (EHR), trained on coded medical records from millions of patients, have demonstrated benefits including improved performance with fewer training labels and greater robustness to distribution shifts. However, questions remain about the feasibility of sharing these models across hospitals and their performance on local tasks. This multi-center study examined the adaptability of a publicly accessible structured EHR foundation model (FMSM), trained on 2.57 million patient records from Stanford Medicine. Experiments used EHR data from The Hospital for Sick Children (SickKids) and the Medical Information Mart for Intensive Care (MIMIC-IV). We assessed adaptability via continued pretraining on local data, and task adaptability relative to baselines trained locally from scratch, including a local foundation model. Evaluations on eight clinical prediction tasks showed that adapting the off-the-shelf FMSM matched the performance of gradient boosting machines (GBM) locally trained on all data, while providing a 13% improvement in settings with few task-specific training labels. With continued pretraining on local data, FMSM required fewer than 1% of training examples to match the performance of the fully trained GBM and was 60% to 90% more sample-efficient than local foundation models trained from scratch. Our findings demonstrate that adapting EHR foundation models across hospitals improves prediction performance at lower cost, underscoring the utility of base foundation models as modular components that streamline the development of healthcare AI.
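The study's code and the released model's interface are not reproduced here, but the adaptation recipe the abstract describes can be sketched: continue pretraining a shared EHR sequence model on local coded records, fit a small task head with few labels, and compare against a GBM trained from scratch on count features. Everything below is a minimal illustrative sketch on synthetic data; the encoder, the next-code objective, and all names are assumptions, not the published FMSM API.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

VOCAB, SEQ_LEN, DIM = 500, 64, 128
rng = np.random.default_rng(0)

# Synthetic "local hospital" data: per-patient sequences of medical codes plus
# a binary outcome (stand-ins for SickKids- or MIMIC-IV-style cohorts).
codes = rng.integers(1, VOCAB, size=(2000, SEQ_LEN))
labels = (codes[:, :10].mean(axis=1) > VOCAB / 2).astype(int)
train, test = slice(0, 1500), slice(1500, 2000)

class EhrEncoder(nn.Module):
    """Stand-in for a pretrained structured-EHR foundation model."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, DIM)
        self.gru = nn.GRU(DIM, DIM, batch_first=True)
        self.next_code = nn.Linear(DIM, VOCAB)  # self-supervised pretraining head

    def forward(self, x):                       # x: (batch, seq) of code ids
        hidden, _ = self.gru(self.embed(x))
        return hidden                           # (batch, seq, DIM)

model = EhrEncoder()  # in practice, load the published pretrained weights here

# (1) Continued pretraining on local records via next-code prediction.
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
x = torch.as_tensor(codes[train], dtype=torch.long)
inp, tgt = x[:, :-1], x[:, 1:]
for _ in range(3):                              # a few epochs, for illustration
    opt.zero_grad()
    logits = model.next_code(model(inp))        # (batch, seq-1, VOCAB)
    loss = loss_fn(logits.reshape(-1, VOCAB), tgt.reshape(-1))
    loss.backward()
    opt.step()

# (2) Task adaptation with few labels: a linear probe on frozen representations.
with torch.no_grad():
    reps = model(torch.as_tensor(codes, dtype=torch.long))[:, -1, :].numpy()
FEW = 100                                       # few task-specific labels
probe = LogisticRegression(max_iter=1000).fit(reps[:FEW], labels[:FEW])

# Baseline: GBM on bag-of-codes count features, trained on all available labels.
counts = np.stack([np.bincount(row, minlength=VOCAB) for row in codes])
gbm = GradientBoostingClassifier().fit(counts[train], labels[train])

print("FM probe AUROC:", roc_auc_score(labels[test], probe.predict_proba(reps[test])[:, 1]))
print("GBM AUROC:     ", roc_auc_score(labels[test], gbm.predict_proba(counts[test])[:, 1]))
```

The cost argument in the abstract maps onto this structure: the expensive self-supervised pretraining is done once and shared across sites, while each local hospital pays only for continued pretraining and a lightly supervised task head, rather than for a fully labeled training set or a foundation model trained from scratch.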
This new class of models may lead to more affordable, easily adaptable health AI.
Stanford experts examine the safety and accuracy of GPT-4 in serving curbside consultation needs of doctors.
Scholars detail the current state of large language models in healthcare and advocate for better evaluation frameworks.
While these tools show potential in clinical practice, we urgently need a systematic approach to evaluation.