Sheng Wang | Generative AI for Multimodal Biomedicine | Stanford HAI

Seminar


Status: Past
Date: Wednesday, November 6, 2024, 12:00 PM - 1:15 PM PST
Location: Hybrid
Topics: Healthcare

HAI Seminar with Sheng Wang

Abstract:

Biomedicine is inherently multimodal, spanning imaging modalities such as pathology, CT, MRI, X-ray, and ultrasound, as well as omics modalities such as genomics, epigenomics, and transcriptomics. General-domain multimodal approaches do not transfer directly to biomedicine because biomedical images differ substantially from general-domain images, necessitating modality-specific approaches. In this talk, Sheng will introduce three recent works toward building multimodal biomedical foundation models.

First, Sheng will introduce GigaPath, the first whole-slide pathology foundation model that can handle gigapixel-scale pathology images. GigaPath uses a novel vision transformer architecture and achieves state-of-the-art results on 23 out of 26 cancer tasks, including subtyping and biomarker prediction. Next, he will introduce OCTCube, the first 3D foundation model for OCT retinal imaging. OCTCube significantly outperforms 2D models on 27 out of 29 tasks, including retinal disease prediction, cross-modality analysis, cross-device generalization, and systemic disease prediction. Finally, Sheng will introduce BiomedParse, a multimodal foundation model that integrates nine major biomedical imaging modalities by projecting them all into the text space, yielding superior performance on segmentation, detection, and recognition and paving the way for large-scale image-based biomedical discovery. He will conclude the talk with a discussion of how multimodal generative AI can advance future medical applications through multi-agent frameworks and integration with multi-omics datasets.
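For readers unfamiliar with the whole-slide setting, the core challenge is that a gigapixel image cannot be fed to a vision transformer directly, so whole-slide models typically tile the slide and then aggregate tile representations. The sketch below illustrates only that tile-then-aggregate pattern; the random-projection encoder and mean-pooling aggregator are placeholder stand-ins, not GigaPath's actual architecture (its tile encoder and long-range slide encoder are learned networks).

```python
import numpy as np

def tile_slide(slide: np.ndarray, tile: int = 256) -> np.ndarray:
    """Split an (H, W, 3) slide into non-overlapping tile x tile patches."""
    h, w, _ = slide.shape
    patches = [
        slide[y:y + tile, x:x + tile]
        for y in range(0, h - tile + 1, tile)
        for x in range(0, w - tile + 1, tile)
    ]
    return np.stack(patches)  # (num_tiles, tile, tile, 3)

def encode_tiles(tiles: np.ndarray, dim: int = 8, seed: int = 0) -> np.ndarray:
    """Placeholder tile encoder: a fixed random linear map over flattened
    tiles. A real pipeline would use a pretrained vision transformer here."""
    rng = np.random.default_rng(seed)
    flat = tiles.reshape(len(tiles), -1).astype(np.float64)
    proj = rng.standard_normal((flat.shape[1], dim)) / np.sqrt(flat.shape[1])
    return flat @ proj  # (num_tiles, dim)

def slide_embedding(tile_embs: np.ndarray) -> np.ndarray:
    """Placeholder slide-level aggregator: mean pooling over tile embeddings.
    A whole-slide model instead learns interactions across all tiles."""
    return tile_embs.mean(axis=0)

# A toy 1024x1024 "slide" yields a 4x4 grid of 256x256 tiles.
slide = np.zeros((1024, 1024, 3), dtype=np.uint8)
tiles = tile_slide(slide)
emb = slide_embedding(encode_tiles(tiles))
print(tiles.shape, emb.shape)  # (16, 256, 256, 3) (8,)
```

The point of the two-stage design is scale: a 100,000 x 100,000-pixel slide becomes tens of thousands of tiles, each encoded independently, with only the compact tile embeddings passed to the slide-level model.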

Speaker
Sheng Wang
Assistant Professor in the School of Computer Science and Engineering at the University of Washington, Seattle
Event Contact
Annie Benisch
abenisch@stanford.edu

Related Events

Event
Ashesh Rambachan | From Next-Token Prediction to Automatic Induction of Automata
Apr 13, 2026, 12:00 PM - 1:00 PM

Sequence data is ubiquitous in economics — job histories in labor economics, diagnosis and treatment sequences in health economics, strategic interactions in game theory. Generative sequence models can learn to predict these sequences well, but their complexity makes it hard to extract interpretable economic insights from their predictions.

Seminar
Caroline Meinhardt, Thomas Mullaney, Juan N. Pava, and Diyi Yang | How Can AI Support Language Digitization and Digital Inclusion?
Apr 15, 2026, 12:00 PM - 1:15 PM

What does digital inclusion look like in the age of AI? Over 6,000 of the world’s 7,000-plus living languages remain digitally disadvantaged.

Event
Matt Beane | Precision Proactivity: Measuring Cognitive Load in Real-World AI-Assisted Work
Apr 20, 2026, 12:00 PM - 1:00 PM

Systems like ChatGPT and Claude assist billions through proactive dialogue—offering unsolicited, task-relevant information. Drawing on Cognitive Load Theory, we study how cognitive load shapes performance in AI-assisted knowledge work.