
2024 Hoffman-Yee Grant Recipients

Six Stanford research teams have received funding to solve some of the most challenging problems in AI.


A human-centered foundation model for cells to accelerate drug discovery and personalized treatment

A Multidimensional Odyssey into the Human Mind

Data in the Age of Generative AI

Evo: A foundational model for generative genomics

Integrating Intelligence: Building Shared Conceptual Grounding for Interacting with Generative AI

Leveraging Technology to Improve Police-Community Relations: The Promise of Large-Scale Analysis of Body-Worn Camera Footage 

A human-centered foundation model for cells to accelerate drug discovery and personalized treatment

Imagine if we could build digital twins of your cells and simulate how they would respond to a drug treatment, taking into account your sex, age, and comorbidities, before ever exposing you to the drug. The key to such a medical future is developing end-to-end frameworks for cell modeling. Recent scientific and technological advances present a historic opportunity to make unprecedented progress in our fight against human disease. Specifically, new sources of biomedical data are rapidly becoming available, and new AI techniques make it possible to understand these massive datasets. We aim to (1) create a multimodal foundation model for cells capable of capturing function and state across human tissues and individuals, (2) develop an intuitive chat model interface to augment the ability of biologists to use and understand it, and (3) showcase its capacities by modeling cells affected by the menstrual cycle to answer critical questions in women’s health, with an initial focus on cardiovascular disease management.

Name | Role | School | Department
Emma Lundberg | Lead PI | Engineering | Bioengineering
Russ Altman | Co-PI | Engineering | Bioengineering
Jure Leskovec | Co-PI | Engineering | Computer Science
Stephen Quake | Co-PI | Engineering | Bioengineering
Serena Yeung-Levy | Co-PI | Medicine | Biomedical Data Science

A Multidimensional Odyssey into the Human Mind

The human brain, a complex organ influenced by psychological, biological, environmental, and physical factors, remains a significant challenge in neuroscience. Despite advances in understanding its structure, function, and biochemistry, a comprehensive, data-driven model integrating these diverse aspects has yet to be realized. Existing research has made strides with brain foundation models using MRI data and multi-modal approaches, but gaps remain in integrating various modalities and domains. This project aims to develop a “Brain World Model,” named “Brain-Bind,” by unifying datasets that reflect different aspects of brain health. This model will be trained in an end-to-end manner, harmonizing data sources, developing modality-specific encoders, and employing self-supervised learning to integrate non-imaging data with MRI-derived representations. The resulting model is intended to enhance diagnostic precision, inform personalized care, and contribute to a deeper understanding of neurological and cognitive processes.
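The proposal's core mechanism — modality-specific encoders whose outputs are aligned in a shared space by a self-supervised objective — can be illustrated with a minimal sketch. Everything here is hypothetical: the dimensions are invented, and the random linear projections stand in for trained encoder networks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature dimensions for two modalities and the shared space
D_MRI, D_TAB, D_SHARED = 16, 5, 4

# Modality-specific encoders: random linear projections standing in
# for trained networks (one per data modality).
W_mri = rng.standard_normal((D_MRI, D_SHARED))
W_tab = rng.standard_normal((D_TAB, D_SHARED))

def encode(x, W):
    """Project one modality's features into the shared space, L2-normalized."""
    z = x @ W
    return z / np.linalg.norm(z)

def alignment_loss(z_a, z_b):
    """Self-supervised objective: pull paired embeddings together.
    Equals 1 - cosine similarity, so 0 means perfectly aligned."""
    return 1.0 - float(z_a @ z_b)

x_mri = rng.standard_normal(D_MRI)   # e.g., an MRI-derived representation
x_tab = rng.standard_normal(D_TAB)   # e.g., non-imaging clinical measures
loss = alignment_loss(encode(x_mri, W_mri), encode(x_tab, W_tab))
```

Training would minimize this loss over paired samples from the same individual, which is how contrastive-style self-supervision typically integrates non-imaging data with imaging-derived representations.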

Name | Role | School | Department
Ehsan Adeli | Lead PI | Medicine | Psychiatry and Behavioral Sciences
Akshay Chaudhari | Co-PI | Medicine | Radiology
Anshul Kundaje | Co-PI | Medicine | Genetics
Fei-Fei Li | Co-PI | Engineering | Computer Science
Feng Vankee Lin | Co-PI | Medicine | Psychiatry and Behavioral Sciences
Kilian Pohl | Co-PI | Medicine | Psychiatry and Behavioral Sciences
Jiajun Wu | Co-PI | Engineering | Computer Science
Dan Yamins | Co-PI | Humanities and Sciences | Psychology

Data in the Age of Generative AI

Massive datasets are the cornerstone for developing large language models (LLMs) and other generative AI. However, these datasets have also sparked debates regarding generative AI, highlighted by several copyright disputes involving OpenAI. This proposal is dedicated to exploring critical aspects of data creation and attribution for generative AI. Our approach is three-fold: Firstly, we aim to establish guiding principles and scaling laws for assembling datasets tailored for training and aligning LLMs and other generative AI, ensuring their responsible application in various sectors. Secondly, we plan to develop scalable methods to trace the generative AI outputs to specific training data, enabling data attribution and valuation. Finally, we will investigate how to effectively use and monitor synthetic data produced by generative AI. These goals are closely linked and mutually reinforcing, with successful data attribution and synthetic data methods informing and improving the strategies for dataset design. Throughout our project, we will ground our research with real-world legal and policy considerations and high-impact applications in law and medicine.
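One simple way to think about tracing a generative AI output back to training data is nearest-neighbor search in an embedding space: embed the output, embed each training example, and rank by similarity. The sketch below uses a toy bag-of-characters embedding purely for illustration; real attribution methods in this space (e.g., influence-based approaches) are considerably more involved, and the corpus here is invented.

```python
import math

def embed(text, dim=8):
    """Toy bag-of-characters embedding standing in for a real text encoder."""
    v = [0.0] * dim
    for ch in text.lower():
        v[ord(ch) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def attribute(output_text, training_corpus):
    """Rank training examples by cosine similarity to a generated output."""
    q = embed(output_text)
    scored = []
    for doc in training_corpus:
        d = embed(doc)
        scored.append((sum(a * b for a, b in zip(q, d)), doc))
    return sorted(scored, reverse=True)

corpus = ["the court ruled on fair use",
          "protein folding with deep learning",
          "copyright law and model training data"]
ranking = attribute("a ruling on fair use and copyright", corpus)
```

With calibrated similarity scores, a ranking like this could also support data valuation, assigning each training example a share of credit for a given output.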

Name | Role | School | Department
James Zou | Lead PI | Medicine | Biomedical Data Science
Surya Ganguli | Co-PI | Humanities and Sciences | Applied Physics
Tatsunori Hashimoto | Co-PI | Engineering | Computer Science
Daniel Ho | Co-PI | Law | Law
Curt Langlotz | Co-PI | Medicine | Radiology
Percy Liang | Co-PI | Engineering | Computer Science
Mark Lemley | Co-PI | Law | Law
Megan Ma | Senior/Lead Research Scholar | Law | Law
Julian Nyarko | Co-PI | Law | Law
Christopher Ré | Co-PI | Engineering | Computer Science
Ellen Vitercik | Co-PI | Engineering | Management Science and Engineering

Evo: A foundational model for generative genomics 

DNA encodes the fundamental language for all living organisms. Recently, large language models have been used to learn this mysterious biological language to unlock a better understanding of this blueprint of life. Yet learning from DNA poses distinct challenges compared to natural language: sequences are extremely long, with the human genome spanning over 3 billion nucleotides, and they are highly sensitive to small changes, where a single point mutation can mean the difference between having a disease or not. Overcoming these technical challenges of modeling long DNA sequences can lead to a deeper understanding of human disease, the creation of novel therapeutics, and the possibility of engineering life itself.

This project aims to develop a new line of long sequence language models that can reproduce the organization of DNA sequences from the molecular to the whole genome scale. We will build on the Hyena architecture, an efficient long sequence model that leverages breakthroughs in deep signal processing and scales sub-quadratically with the length of data. We will extend the long convolutions of Hyena with a bidirectional and diffusion training paradigm. This approach will enable the modeling and design of DNA sequences from scratch as well as with an “infilling” ability, allowing both greater control and the mimicking of evolution’s process of continuous updating. Our team is dedicated to leading the ethical development of DNA sequence modeling and design, and to bring the innovation of AI systems for the betterment of human health.
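The sub-quadratic scaling mentioned above comes from applying long convolutions via the FFT, which costs O(n log n) rather than the O(n²) of direct computation. The snippet below illustrates only that scaling trick, not the Hyena architecture itself, which additionally uses implicitly parameterized filters and gating; the sequence length and random filter are placeholders.

```python
import numpy as np

def long_conv(u, k):
    """Circular convolution of signal u with an equally long filter k via FFT.

    Direct circular convolution costs O(n^2); the FFT route costs O(n log n),
    the kind of sub-quadratic scaling long-sequence models rely on.
    """
    n = len(u)
    return np.fft.irfft(np.fft.rfft(u) * np.fft.rfft(k), n=n)

n = 1 << 12                      # a 4096-step sequence (placeholder length)
u = np.random.randn(n)           # input signal, one channel
k = np.random.randn(n)           # stand-in for an implicitly parameterized filter
y = long_conv(u, k)

# Sanity check against the O(n^2) direct circular convolution on a small case
m = 8
a, b = np.random.randn(m), np.random.randn(m)
direct = np.array([sum(a[j] * b[(i - j) % m] for j in range(m)) for i in range(m)])
assert np.allclose(long_conv(a, b), direct)
```

Because the filter is as long as the input, a single such convolution mixes information across the entire sequence, which is what makes genome-scale context lengths tractable.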

Name | Role | School | Department
Brian Hie | Lead PI | Engineering | Chemical Engineering
Christopher Ré | Co-PI | Engineering | Computer Science
Stefano Ermon | Co-PI | Engineering | Computer Science
Stephen Baccus | Co-PI | Medicine | Neurobiology
Euan Ashley | Co-PI | Medicine | Cardiovascular Medicine

Integrating Intelligence: Building Shared Conceptual Grounding for Interacting with Generative AI 

Visual media, in the form of images, video, animation, and 3D virtual environments, are now central to the way people communicate stories, ideas, and information. Yet creating such visual content is challenging, as it requires significant visual design expertise. The promise of modern generative AI tools is that they will assist users in creating production-quality visual content from a simple text prompt describing what the user wants. But current black-box AI systems are difficult to work with: the AI often misinterprets the user's intent, and users lack a predictive conceptual model of what the AI will produce for a given prompt. This mutual lack of a theory of mind leads to collaboration by trial and error, where the user repeatedly tries different prompts hoping to find one that will produce the desired output.

In this project we take major steps towards allowing both entities (humans and AI) to develop a shared conceptual grounding that allows each to simulate how the other might operate given an input task. To this end, we focus on two key objectives. (1) First we will identify the concepts and mental processes expert human creators use when they make visual content. Such experts often convey their design intentions via natural language and sketches. We will analyze these linguistic and sketch outputs to identify common patterns in their workflows and we will investigate how such creators communicate with other human collaborators to establish common ground, repair misunderstandings, etc. (2) Second, we will build new generative AI tools that internally respect the ways humans mentally organize creation processes and workflows. For this objective, we will develop new algorithms and methods that incorporate the concepts humans use within generative AI models.

Ultimately we envision human creators collaborating with generative AI tools using a combination of natural language, example content, and code snippets, in a turn-taking fashion, to produce the desired content. Importantly, both the human and the AI will communicate with one another through a shared understanding of the concepts and mental processes relevant to the creation task. We will validate our approach with human-subject experiments that examine how our generative AI tools lower usability barriers for creating production-quality visual content. With our generative AI tools we aim to democratize visual content creation so that users of all skill levels can easily express their ideas and tell their stories using visual media.

Name | Role | School | Department
Maneesh Agrawala | Lead PI | Engineering | Computer Science
Judith Fan | Co-PI | Humanities and Sciences | Psychology
Kayvon Fatahalian | Co-PI | Engineering | Computer Science
Tobi Gerstenberg | Co-PI | Humanities and Sciences | Psychology
Nick Haber | Co-PI | Education | Graduate School of Education
Hari Subramonyam | Co-PI | Education | Graduate School of Education
Jiajun Wu | Co-PI | Engineering | Computer Science

Leveraging Technology to Improve Police-Community Relations: The Promise of Large-Scale Analysis of Body-Worn Camera Footage 

Police body-worn cameras have been at the center of police reform efforts over the past decade. Yet the vast majority of the footage generated by those cameras is never examined, undermining the cameras' utility as a tool for accountability and for improving interactions between the police and community members. We are harnessing artificial intelligence (AI) and large language models (LLMs) to unlock the research potential of body-worn camera footage to better understand the nature of law enforcement's encounters with the public. In turn, leveraging the resulting insights could fuel both the development and the systematic evaluation of officer trainings and other institutional interventions designed to improve policing. To advance these goals, we are building AI tools and state-of-the-art infrastructure on Stanford's campus to receive, secure, and process police footage for the purpose of conducting research aimed at improving relations between the police and the public. This includes body-worn camera footage of routine vehicle stops from several law enforcement agencies. We will begin by using this data to evaluate the effectiveness of a state-wide legal intervention designed to improve police-community interactions during vehicle stops, eradicate racial disparities in those stops, and increase trust. We will also explore the effectiveness of training and other methods to reduce escalation. In a world where police departments are increasingly utilizing AI to fight crime, we see the value of harnessing AI to make accessible a largely untapped source of data to improve police-community relations, a goal that can be lauded by the police and the policed alike. Using AI in this way could be crucial to advancing solutions to one of society's most pressing problems: how can we reimagine public safety?

Name | Role | School | Department
Jennifer Eberhardt | Lead PI | Graduate School of Business | Graduate School of Business
Ralph Banks | Co-PI | Law | Law
Dan Jurafsky | Co-PI | Humanities and Sciences | Linguistics
Benoit Monin | Co-PI | Graduate School of Business | Graduate School of Business
