Stanford’s Multimodal AI Model Advances Personalized Cancer Care | Stanford HAI
Date
January 27, 2025
Topics
Healthcare

The MUSK model combines clinical notes and images to predict prognosis and immunotherapy response.

A human pathologist is still the gold standard for diagnosing disease today. Unlike current artificial intelligence models, doctors don’t rely on a single data source to make clinical decisions; they factor in a patient’s demographics, medical history, imaging, and other characteristics of the disease. That is why AI has largely remained an assistive tool for doctors.

But now, researchers at Stanford have built a more useful AI model that factors in clinical notes and images to help predict patient outcomes and determine what treatment might work best. The new AI model – nicknamed MUSK for Multimodal transformer with Unified maSKed modeling – can look at unlabeled and unpaired image-text data at a large scale. The research, partially funded by the Stanford Institute for Human-Centered AI, was published in the journal Nature in early January.

Instead of analyzing a single data source in isolation, MUSK is multimodal: it consults clinical notes and images that humans haven’t had to manually pair.

“We try to extract concrete, complementary information from both modalities so that we can make a good clinical decision that cannot be achieved by a single modality,” said Jinxi Xiang, lead author of the study and a Stanford postdoctoral scholar in radiation oncology.

Doctors often need the most help with predicting outcomes and with precision in cancer therapies. Empowered with images and text, MUSK can better predict how a patient might respond to certain types of cancer treatment.

To develop that capability, the researchers pretrained MUSK using 50 million pathology images and 1 billion pathology-related text tokens (commonly grouped words and characters), representing 33 tumor types. That scale of training is a dramatic increase over the paired image-text datasets used in existing studies.

“Compared to the traditional AI approach, you can leverage unlabeled, large-scale, diverse data, so you don’t have to ask human experts to label them,” said Ruijiang Li, Stanford associate professor (research) of radiation oncology, whose lab focuses on applying machine and deep learning to medical imaging analysis and precision oncology. “Now we have designed this new architecture that can take in unpaired multimodal data sets for pretraining, so you are able to leverage a much larger data set to train more robust models.”
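The masked-modeling idea Li describes can be sketched in miniature. The snippet below is a hypothetical illustration, not MUSK’s actual pipeline (the function name, mask token, and mask ratio are all assumptions): a fraction of an input sequence is hidden and kept aside as reconstruction targets for a model to predict. Because the objective needs only the sequence itself, the same routine applies to text tokens from a clinical note or to patch IDs from a pathology image, which is why large amounts of unlabeled, unpaired data can be used for pretraining.

```python
import random

def mask_tokens(tokens, mask_ratio=0.3, mask_token="[MASK]", seed=0):
    """Hypothetical sketch of the masking step in unified masked modeling.

    A fraction of the input tokens is replaced with a mask symbol; the
    hidden originals become the targets a model would be trained to
    reconstruct. No labels or cross-modal pairing are needed, since the
    supervision comes from the sequence itself.
    """
    rng = random.Random(seed)
    n_mask = max(1, int(len(tokens) * mask_ratio))
    masked_positions = rng.sample(range(len(tokens)), n_mask)
    corrupted = list(tokens)
    targets = {}
    for pos in masked_positions:
        targets[pos] = corrupted[pos]  # what the model must predict
        corrupted[pos] = mask_token
    return corrupted, targets

# The same routine applies to text tokens from a pathology report...
report = ["tumor", "cells", "show", "high", "mitotic", "activity"]
corrupted_text, text_targets = mask_tokens(report)

# ...or to patch IDs from a whole-slide image, treated as a sequence.
patches = [f"patch_{i}" for i in range(10)]
corrupted_patches, patch_targets = mask_tokens(patches)
```

In an actual transformer pretraining loop, the corrupted sequence would be embedded and the model’s loss computed only at the masked positions; the sketch above shows just the data-corruption step that makes unpaired pretraining possible.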

So far, the Stanford team has tested MUSK with multimodal data from more than 8,000 patients, and the results show promise over existing models. By blending images and clinical reports, MUSK more accurately predicts outcomes such as melanoma relapse and how a patient might respond to immunotherapy for lung and gastro-esophageal cancers. The researchers also found the model performed well at predicting prognosis across 16 cancer types, especially common ones such as breast, lung, and colorectal cancer.

Moving forward, the MUSK approach to digital pathology could be generalized to other types of medical and biological data. For now, Li said, the team needs to gather more evidence before MUSK can be deployed in a clinical setting; the model would also need to undergo a clinical trial and receive regulatory approval.

“It’s a major step forward and valuable contribution to the field of multimodal foundation models,” Li added.

Contributor(s)
Vignesh Ramachandran

Related News

Collaborative Coding, Better Scaling, Health Tracking: HAI Awards $2.17M to Innovative Research
Nikki Goth Itoi | Apr 29, 2026 | Announcement
Seed grants will fund 29 research teams pursuing novel research ideas across disciplines.

An AI Health Coach Could Change Your Mindset
Katharine Miller | Apr 23, 2026 | News
Bloom, a health coaching app created by Stanford researchers, helps people tap into their own motivations.

Using LLMs To Improve Workplace Social Skills
Katharine Miller | Apr 20, 2026 | News
Practicing specific social skills with AI chatbots helps users build confidence and competence.