Seed Research Grants | Stanford HAI

Seed Research Grants

Status: Closed
Date: Applications closed on September 15, 2025
Overview
2025 Recipients
2024 Recipients
2023 Recipients
2022 Recipients
2021 Recipients
2020 Recipients
2019 Recipients
2018 Recipients
This project is co-funded with the Stanford Center for Digital Health.

Point-of-care ultrasound (POCUS) allows clinicians to perform real-time assessments at the patient's bedside using portable ultrasound devices that can connect to their phone, thus improving diagnosis and reducing complications. However, training medical students and residents to use this technology effectively remains challenging due to limited faculty time and inconsistent teaching methods. Many trainees currently learn with minimal supervision, which can compromise patient care.

This project creates the first comprehensive, annotated database of POCUS images designed specifically to improve trainee performance with POCUS. We will collect and annotate 7,500 ultrasound clips from Stanford Medical Center covering heart, lung, and abdominal imaging (the three most common clinical applications). Unlike existing datasets that focus on disease detection, our database will carry dual annotations: each clip is labeled both for acquisition quality (identifying common acquisition errors) and for potential pathology.
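As an illustration of what such dual annotations could look like in practice, here is a minimal sketch of an annotation record; all field names and label values are hypothetical examples, not the project's actual schema.

```python
from dataclasses import dataclass
from typing import List

# Illustrative record for a dual-annotated POCUS clip: one label set for
# acquisition quality, one for pathology. Field names are hypothetical.
@dataclass
class PocusAnnotation:
    clip_id: str
    organ: str                     # "heart", "lung", or "abdomen"
    acquisition_errors: List[str]  # e.g. ["off-axis", "insufficient depth"]
    quality_score: int             # e.g. 1 (unusable) to 5 (expert-level)
    pathology_labels: List[str]    # e.g. ["pleural effusion"]; empty if normal

clip = PocusAnnotation(
    clip_id="example-000123",
    organ="lung",
    acquisition_errors=["insufficient depth"],
    quality_score=3,
    pathology_labels=["pleural effusion"],
)
```

Pairing both label types on the same clip is what lets a feedback model comment on how a trainee scanned, not just what the scan shows.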

This unique approach enables development of AI systems that provide real-time, personalized feedback to learners, which can simulate expert faculty guidance to overcome traditional teaching barriers with POCUS. The database will be released publicly to accelerate medical education research globally. Future applications include educational dashboards that help trainees improve their scanning technique and interpretation skills, potentially transforming how ultrasound is taught worldwide while ensuring AI augments rather than replaces human clinical judgment.

Name | Role | School | Department
Andre Kumar | Main PI | School of Medicine | Med/Hospital Medicine

The field of learning sciences has historically been limited not only by a “research-to-practice” gap but also by a “practice-to-research” gap: practitioners find it challenging to obtain actionable evidence for instructional methods that apply to their students’ unique learning contexts, and they also find it difficult to create the conditions needed to rigorously test their own instructional methods. This project aims to accelerate a virtuous cycle of learning-science research and practice by engaging both practitioners and researchers in the scientific discovery process.

The core enabling technology behind this initiative, Learning Strategy Studio, is a novel, strategy-following AI instructional agent whose instructional strategies can be configured in natural language by human actors. The agent implements a hierarchical instructional decision-making process across three layers: planning, activity design, and learner interaction. With this agent, we aim to engage both practitioners and researchers to author and experiment with various instructional strategies with real students. The agent’s interactions with learners, combined with its fine-grained strategic decision-making process and both proximal and distal measures of learning outcomes, will provide data that can yield actionable insights. These insights will benefit both human actors and AI agents, leading to continuous improvement in practice and research.

Name | Role | School | Department
Candace Thille | Main PI | Graduate School of Education | Graduate School of Education
Shima Salehi | Co-PI | Graduate School of Education | Graduate School of Education

Systematic reviews are essential for translating scientific findings into clinical guidelines and patient care, but they are slow and costly, often taking over a year and hundreds of thousands of dollars to complete. Existing workflow tools can help manage references but do not touch the content, while emerging GenAI platforms lack transparency and control, making them unsuitable for high-stakes medical use.

This project will evaluate and advance a new approach that embeds large language models (LLMs) within the structured PRISMA framework for systematic reviews. By limiting LLMs to critical but well-defined tasks such as screening and data extraction, and by integrating transparency features like reason tagging and traceable evidence retrieval, the system aims to combine efficiency with trustworthiness. We will benchmark the tool against gold-standard Cochrane reviews and study how transparency mechanisms improve reliability for guideline development. If successful, this work will lay the foundation for continuous, up-to-date “living reviews” that keep pace with rapidly expanding scientific evidence, changing how trusted knowledge is synthesized for healthcare.
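To make the transparency features concrete, here is one way an LLM-assisted screening decision with a reason tag and traceable evidence might be represented; the class and field names are hypothetical illustrations, not the project's actual design.

```python
from dataclasses import dataclass

# Illustrative record for an LLM-assisted PRISMA screening step.
# The key idea: every include/exclude decision carries a machine-readable
# reason tag and a verbatim evidence span, so reviewers can audit it.
@dataclass(frozen=True)
class ScreeningDecision:
    study_id: str
    include: bool
    reason_tag: str       # e.g. "wrong-population", "meets-criteria"
    evidence_quote: str   # verbatim span from the abstract justifying it

decision = ScreeningDecision(
    study_id="PMID-12345678",
    include=False,
    reason_tag="wrong-population",
    evidence_quote="participants were healthy adults, not children",
)
```

A human reviewer can then check the quoted span against the source abstract rather than trusting an opaque yes/no answer.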

Name | Role | School | Department
Eran Bendavid | Main PI | School of Medicine | Medicine (Primary Care and Population Health)
Carlos Guestrin | Co-PI | School of Engineering | Computer Science

Copyright is one of the central questions shaping the future of generative AI. A key point of dispute is the extent to which large language models (LLMs) reproduce copyrighted works from their training data, both within the models themselves ('memorization') and in their outputs at generation time ('extraction'). Drawing on methods from both machine learning and law, we will (1) measure how much publicly released LLMs (such as Llama) memorize specific books, (2) test how easily those books can be extracted in outputs, and (3) evaluate the associated copyright risks with greater precision than has previously been possible. We will also build tools that make our findings broadly accessible, providing policymakers, creators, and the public with a clearer understanding of these complex, intersecting technical and legal issues.
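As background on how extraction is typically quantified, a minimal sketch follows: prompt the model with a prefix from a book, then count how many tokens of its continuation match the true text verbatim. The 50-token threshold echoes prior memorization studies (e.g., Carlini et al.) and is not necessarily the metric this project will adopt.

```python
def verbatim_overlap(generated: str, reference: str) -> int:
    """Length (in whitespace tokens) of the longest common prefix between
    a model's continuation and the true continuation of a passage."""
    count = 0
    for g, r in zip(generated.split(), reference.split()):
        if g != r:
            break
        count += 1
    return count

def is_extractable(generated: str, reference: str, k: int = 50) -> bool:
    """A passage counts as (k-)extractable if the model reproduces at
    least k tokens verbatim; k = 50 follows prior memorization work."""
    return verbatim_overlap(generated, reference) >= k
```

Aggregating this score over many prompts per book gives a per-book extraction rate that can be compared across models.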

Name | Role | School | Department
Mark Lemley | Main PI | School of Law | Law School
Daniel Ho | Co-PI | School of Law | Law School
Percy Liang | Co-PI | School of Engineering | Computer Science

Learning to perform highly dexterous sensorimotor skills quickly and accurately typically requires practice with expert feedback. When expert coaching is unavailable, haptic technology—physical devices that stimulate the sense of touch—can provide feedback to novices acquiring motor skills; yet, many haptic interventions lead users towards one fixed motion path, which may be insufficient for learning. We hypothesize that modeling expert movement will enable the design of AI-adjustable, individualized guidance for novices, directing them to learn expert movement strategies or policies rather than following a fixed expert path. In the proposed work, we focus on a neonatal endotracheal intubation (ETI) task, a procedure clinicians must perform when a newborn (neonate) is unable to breathe on their own. We set out to understand and model expert ETI movement strategy using inverse reinforcement learning (IRL), then develop a system providing novices with AI-enabled haptic feedback based on our modeled expert movement policies. This work will provide a generalizable framework to explore motor skill learning strategies and support neonatal resuscitation training.

Name | Role | School | Department
Sean Follmer | Main PI | School of Engineering | Mechanical Engineering
Lou Halamek | Co-PI | School of Medicine | Pediatrics (Neonatology)

Industrial fisheries feed millions and support livelihoods worldwide, and ports are critical control points for management. Climate change is shifting fish distributions, but how this will affect where vessels land their catch remains uncertain. We need reliable, updatable forecasts of port use grounded in vessel decision making. Our project applies AI in a new way for ocean management by learning those choices from vessel-tracking data and replaying them under realistic “what-if” ocean conditions to establish climate’s causal role. We will use these insights to project changes in port use under future climates and provide policy-ready guidance to support more resilient governance, capacity planning, and food security.

Name | Role | School | Department
James Leape | Main PI | School of Sustainability | Oceans
Sara Constantino | Co-PI | School of Sustainability | Environmental Social Sciences

We propose to prototype a novel AI chip enabling human brain-inspired dendrocentric learning and inference based on spintronics, the electronics of electron spins rather than charges. We will demonstrate stepwise retrieval-augmented generation (RAG) in which the correct order of tokens mirrors the correct spike signals, exploiting the functional parallelism between our dendrocentric AI chip and neuroscience-based RAG: input spikes ~ query, permutation ~ learned fact, dendrite response ~ match. This AI chip could spur a paradigm shift in our AI ecosystem from 'cloud-centered' to 'human-centered' by alleviating privacy concerns, boosting personalization, and dramatically reducing the power consumption of all sorts of AI computing.

Name | Role | School | Department
Shan Wang | Main PI | School of Engineering | Materials Science and Engineering
Kwabena Boahen | Co-PI | School of Engineering | Bioengineering

Humans excel at parsing natural scenes, a capacity thought to reflect visual cortical tuning to the statistical structure of the natural world. In early visual cortex, this principle is well established: quantitative descriptors of low-level image statistics – such as orientation and spatial frequency – have enabled direct tests linking cortical coding to efficient representation of natural images. Extending this account to high-level visual cortex has been limited, however, by the absence of analogous descriptors for the complex statistics that govern natural image structure at larger scales. Generative diffusion models offer a way forward: by learning to reverse noise corruption of natural images, they acquire hypothesis-free, quantitative estimates of image probability without requiring explicit parameterization of high-level statistics. Here, we propose to use a diffusion model's learned natural image probabilities to test whether human perception and high-level visual cortex are calibrated to the same structure. We will compare perceptual judgments and cortical responses for image pairs drawn from high- and low-probability regions of the natural image manifold, matched for equal distances in image space. We predict that both perceptual discriminability and representational distance in ventral temporal cortex will scale with image probability, providing a direct test of efficient coding on the natural image manifold.
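For readers unfamiliar with how a diffusion model yields image probabilities: in score-based formulations (Song et al.'s framework), the trained denoiser approximates the score of the noised data distribution, and the probability-flow ODE converts that score into a likelihood. A sketch of the standard relations follows; this is the general framework, not necessarily the specific estimator the team will use:

```latex
% The trained denoiser approximates the score of the noised data:
\nabla_{x}\log p_t(x) \;\approx\; -\,\frac{\epsilon_\theta(x,t)}{\sigma_t}
% Integrating the probability-flow ODE (drift f, diffusion g) then gives
% an exact log-likelihood for a data point x_0:
\log p_0(x_0) \;=\; \log p_T(x_T)
  \;+\; \int_0^T \nabla\!\cdot\!\Big(f(x_t,t)
  - \tfrac{1}{2}\,g(t)^2\,\nabla_{x}\log p_t(x_t)\Big)\,dt
```

High- and low-probability image pairs can then be drawn by ranking images with this likelihood estimate.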

Name | Role | School | Department
Justin Gardner | Main PI | School of Humanities and Sciences | Psychology
Gordon Wetzstein | Co-PI | School of Engineering | Electrical Engineering

This project is co-funded with the Stanford Center for Digital Health.

Pneumonia is one of the leading causes of death in Ethiopian children under five, yet current systems detect outbreaks only after many children are already sick. In rural areas, oxygen and antibiotics, the most critical treatments for pneumonia, are often scarce, meaning health facilities can easily be caught without life-saving treatment during seasonal surges in pneumonia cases. This project will develop an early warning system that uses artificial intelligence to predict pneumonia spikes before they happen.

By combining health reports from clinics, call center data, and weather patterns, the system will give health officials advance notice of likely outbreaks and their locations. This will allow the Ethiopian Ministry of Health to prepare in advance: sending oxygen, medicines, and health workers to the areas that need them most. The project will be piloted in partnership with the Ethiopian Ministry of Health and directly integrated into their new national “situation room” for real-time health monitoring. The goal is to save children’s lives by giving the health system the tools to act earlier, respond faster, and reach every child who needs care.

Name | Role | School | Department
John Openshaw | Main PI | School of Medicine | Medicine (Infectious Diseases and Geographic Medicine)
Rishi Mediratta | Co-PI | School of Medicine | Pediatrics

This project is co-funded with the Stanford Center for Digital Health.

The human gut microbiome consists of trillions of microbial cells and hundreds of species that are integrated into human biology, impacting health, and in some cases, causing disease. Modulating and measuring this community of microbes holds great promise for developing new therapies and diagnostics for a range of diseases. But this community of microbes is complex, dynamic, and individualized. Therefore, progress toward reaching the biomedical potential to treat, cure, and prevent diseases has been slow. Here we propose to build AI agent co-pilots that can augment human scientists to accelerate important discoveries. Focusing initially on developing microbial therapies for inflammatory bowel disease, these AI agent researchers will be programmed with specialized knowledge and datasets. Human researchers will work with the AI agents and rapidly receive responses to queries. In addition, AI agents will provide hypotheses and suggest experiments that could be run in the lab to test the hypotheses. We envision such AI agent co-pilots as an important step toward enabling researchers to perform more efficient and rigorous science, resulting in robust discoveries that better reflect the mass of accumulating data and knowledge. As an ultimate outcome, we hope to accelerate cures and prevention for a range of gut microbiome-driven diseases.

Name | Role | School | Department
Justin Sonnenburg | Main PI | School of Medicine | Microbiology and Immunology
James Zou | Co-PI | School of Medicine | Biomedical Data Science

Courts across East Africa face fragmented legal systems and limited access to digitally searchable case law. These constraints leave judges without the resources they need to make timely, well-supported decisions, ultimately harming due process and rule of law. This project proposes a collaboration to benchmark, roll out, and rigorously evaluate a platform for legal research that integrates semantic search of continental case law, automated summarization, and a tailored precedent chatbot. The system was built in collaboration with more than 75 judges in Rwanda and Kenya by a Stanford-originated startup (Hakimu), promising to leapfrog generations of legal search technology.

The team will design an AI training program for East African judges; develop a collaborative and ecologically valid benchmark suite; and carry out a sandboxed randomized evaluation with more than 250 sitting judges in Kenya to assess the effect on quality and efficiency of decision making. The findings will inform how Hakimu is refined and adopted across African judiciaries, while also providing a broader model for responsible AI evaluation by demonstrating how high-stakes decision-support systems should be assessed before deployment.

Name | Role | School | Department
Daniel Ho | Main PI | School of Law | Law School

Proteins explore many shapes that underpin their function, yet traditional crystallography and today’s AI predictors typically yield standalone structures that are detached from experimental evidence. We propose a scientist-in-the-loop framework that guides state-of-the-art generative structure models with crystallographic measurements, bridging the gap between predicting structures and solving structures from experimental data. In addition to accelerating structure determination, we aim to produce data-consistent ensembles that better capture conformational variability than conventional single structures. This ensemble-centric paradigm reflects a long-term vision of the structural biology field and will provide a foundation for applications such as drug discovery and synthetic biology.

Name | Role | School | Department
Gordon Wetzstein | Main PI | School of Engineering | Electrical Engineering

Have you ever struggled to give someone directions? Explaining how to get from A to B is surprisingly complex – you need to anticipate what the listener will see, what they'll remember, and where they might get confused. This research asks what makes a navigation explanation good, and how to build computational models that generate effective guidance the way humans do: by reasoning about the listener. 

Rather than directly training an AI system to follow directions, we’re building a speaker model that generates helpful directions by simulating where a listener would get confused and adjusting what it communicates based on the situation. We will assess whether AI-generated explanations can match human-quality guidance using a novel experimental paradigm where a speaker with full knowledge of the environment must guide a listener who has only a limited view, pairing both human and AI speakers with human and AI listeners. 

This work bridges cognitive science research on explanation and spatial reasoning with human-AI collaboration, and has practical implications for making navigation systems more accessible and adaptive. We will release our collected data as a public benchmark, Open Navigation Dialogues, to support future research on explanation generation.

Name | Role | School | Department
Tobias Gerstenberg | Main PI | School of Humanities and Sciences | Psychology
Robert Hawkins | Co-PI | School of Humanities and Sciences | Linguistics

Humans have long observed the cosmos, seeking to understand the physics of the universe by interpreting the light we collect in our telescopes. We now have the ability to measure light from galaxies across a broad range of the electromagnetic spectrum, using a variety of space- and ground-based telescopes, each measuring the sky at different resolutions and with different instrumental effects. Because of the different character of data from different telescopes, it has not yet been possible to do scientific inference that truly uses the richness of our data. We have assembled an interdisciplinary team of researchers, including astrophysicists and computer scientists, to chart a new path toward using all of our data to understand the physics of galaxies. We propose to pilot a foundation model for galaxies, intentionally designed for interpretability, to build human trust and understanding in the model and its application to discovery.

Name | Role | School | Department
Susan Clark | Main PI | School of Humanities and Sciences | Physics
Surya Ganguli | Co-PI | School of Humanities and Sciences | Applied Physics
Risa Wechsler | Co-PI | School of Humanities and Sciences | Physics
Gordon Wetzstein | Co-PI | School of Engineering | Electrical Engineering

The discourse surrounding AI has always been religious: Peter Thiel barnstorms lectures on the Apocalypse and the Antichrist. Marc Andreessen maintains a techno-saviorist view in his lectures and writing on AI, asserting that technology can liberate the human soul and spirit while scorning the search for AGI as a search for God. Sam Altman quips that the most successful founders set out to establish a religion, not merely a company. Garry Tan advocates that Christianity can provide the spiritual and moral tenets to right the wayward technoculture of Silicon Valley. It is a perfect rhetorical setup: powerful machines, extreme claims, and apocalyptic angst.

Our project will study the religious convictions at the heart of this phenomenon, using (1) a mixed-methods analysis of dominant AI discourses and (2) ethnographic field work with communities of AI users. We will begin with an exploration of the legitimating power of religious language regarding AI, asking: what cultural work does religious language perform for proponents and critics of AI alike? We will follow up this discourse analysis with ethnographic investigations across four fieldwork sites to understand how users' experiences with AI fit with its public-facing religious rhetoric.

Focusing on users will hone our understanding of how religious language and concepts translate into users’ experiences and, perhaps more importantly, how they do not. This comparison of AI’s public representation and private logics will provide key insights into the pervasiveness of its deep religious discourse. It will help explain how religious thinking fuels impressions of the technology’s power to shape the real and imagined worlds of users, reveal the limitations and risks of endowing technology with divine power, and illuminate a path to a more ethical AI.

Name | Role | School | Department
Ari Kelman | Main PI | Graduate School of Education | Graduate School of Education
John Willinsky | Co-PI | Graduate School of Education | Graduate School of Education

The development and discovery of small molecule therapeutics to treat diseases like cancer requires balancing selectivity, efficacy, safety, and synthesizability. As generative models for chemistry become increasingly widely used for molecular optimization, large numbers of predicted compounds must be triaged for the aforementioned properties, a process that is tedious and error-prone for medicinal chemists. We plan to evaluate and improve AI agents that initially screen and evaluate molecules predicted to have high binding affinity by our generative molecular design tools.

Name | Role | School | Department
Grant Rotskoff | Main PI | School of Humanities and Sciences | Chemistry
Nathanael Gray | Co-PI | School of Humanities and Sciences | Chemical and Systems Biology Operations

Name | Role | School | Department
Ron Dror | Main PI | School of Engineering | Computer Science
Wah Chiu | Co-PI | School of Engineering | Bioengineering
Gordon Wetzstein | Co-PI | School of Engineering | Electrical Engineering

How can an LLM which has been custom built for a college course be wisely and safely incorporated into that course, to maximize learning and engagement, and to avoid gratuitous and unproductive uses of the tool?

We have built VHIL-E, the Virtual Human Interaction Lab’s Expert, a large language model (LLM) representing the lab’s research, outreach, and teaching, focused on the psychological aspects of virtual and augmented reality. Collected materials consist of approximately 2.3 million words, broken into approximately 10,000 chunks stored in an embedded index. Materials came from two books written about the lab’s research, 261 journal articles by VHIL scholars, 57 transcriptions of keynote addresses by lab members, 14 dissertations from lab students, 87 dedicated news articles about the lab’s research, 14 recorded question/answer lectures from multiple iterations of the course, and text from 17 lecture slides from a recent course year. In our first preliminary test, for open-ended responses, VHIL-E outperformed the baseline GPT model when replying to actual questions from journalists. Moreover, the baseline model hallucinated in five percent of replies, while VHIL-E hallucinated in only one percent. VHIL-E also performed slightly better (by 3 percentage points) than the baseline on a 232-question multiple-choice exam.
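The embedded chunk index described above can be sketched in miniature as follows. Real systems use learned neural text embeddings; this toy version substitutes bag-of-words vectors so the example is self-contained, and the chunk texts are invented.

```python
import math
import re
from collections import Counter

# Stand-in "embedding": bag-of-words token counts. A production system
# would replace this with a learned text-embedding model.
def embed(text: str) -> Counter:
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Invented example chunks; the real index holds ~10,000 of these.
chunks = [
    "presence in virtual reality increases with tracked body movement",
    "augmented reality overlays digital content on the physical world",
]
index = [(text, embed(text)) for text in chunks]

def retrieve(query: str, k: int = 1):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)[:k]

best = retrieve("what increases presence in virtual reality")[0][0]
```

Retrieved chunks are then placed in the LLM's context so that answers are grounded in the lab's own materials, which is what suppresses hallucination relative to the baseline model.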

Embedded within the Virtual People class (Fall 2025 and 2026), we will investigate how VHIL-E can be used to assist students with designing VR experiences that adhere to research-based best practices. Some students will use VHIL-E to iterate on their own VR design blueprints (VHIL-E: augmentation), whereas other students will use VHIL-E to create the VR design blueprint for them (VHIL-E: replacement). The control group will not use LLMs at all. This implementation isolates key augmentation-related instructional features, including immediate feedback and adaptivity (known from intelligent tutoring systems), and draws on prior work on AI-enabled co-creation and learner self-transcendence. As a measure of learning transfer, we will assess students’ abilities to bring their VR blueprints to life during an actual VR prototyping task and, finally, present their work during class. This longitudinal study design, which unfolds over 8 weeks, addresses three threats to the validity of educational LLM research by 1) deliberately centering on an instructional method, 2) having an adequate control condition, and 3) using a strong learning outcome measure. By investigating how a custom-built LLM can augment learners, our project closely aligns with HAI’s mission to create a framework for human-centered AI research.

Name | Role | School | Department
Jeremy Bailenson | Main PI | Graduate School of Education | Graduate School of Education
Dan Schwartz | Co-PI | Graduate School of Education | Graduate School of Education

Social scientists regularly test interventions – like programs to increase voter turnout, improve health behaviors, or reduce poverty – through expensive field experiments that can cost hundreds of thousands of dollars and take years to complete. Many of these interventions don't work, however, wasting precious time and resources that could have been used to help people. This research project aims to solve this problem by using artificial intelligence to predict which social interventions will be successful before researchers invest in costly real-world trials.

Our prior research has already shown that large language models (like GPT-4) can accurately forecast the results of survey experiments and even outperform human experts at predicting intervention outcomes. In this project, we seek to build on this work by expanding our archive of field experiment results, developing a new method of prediction using 'expert agents' (AI models that simulate and aggregate predictions from different types of social science experts), and collecting expert forecasts against which we can benchmark our new prediction tool. We will also build a user-friendly online platform where researchers can input study designs and receive predictions, helping researchers focus their limited resources on the most promising interventions and democratizing access to intervention research. Most importantly, this research can accelerate progress on society's biggest challenges by helping identify the most effective solutions faster and more efficiently than ever before.

Name | Role | School | Department
Robb Willer | Main PI | Graduate School of Education | Graduate School of Education

Large Language Models (LLMs) are sycophantic, excessively agreeing with users even when they are wrong. In this proposal we suggest that LLMs are subject to a deeper problem that goes beyond factual agreement: they are socially sycophantic, flattering and excessively affirming users, even when users are delusional or propose to harm themselves or others, and can even draw users into emotional dependency. Social sycophancy has clear risks to users and to society, but neither the prevalence of social sycophancy nor its consequences have been empirically measured. To address this gap, we bring together insights from sociolinguistics, natural language processing, human-computer interaction, and psychology toward three goals: (1) To characterize the phenomenon of social sycophancy, by measuring aspects like emotional validation and action endorsement across different large language models and in different languages and cultures; (2) To evaluate the impacts of social sycophancy on users, via human-subject experiments in which participants discuss social dilemmas with sycophantic (or non-sycophantic) LLMs, and we then measure how the different LLM responses affect participants’ behaviors and perceptions of AI; (3) To develop methods to mitigate social sycophancy, by aligning LLMs to focus more on longer-term consequences and not just immediate rewards, and by inoculating users against its impacts. Our proposed work will bridge disciplines to provide a conceptual and empirical foundation for understanding social sycophancy and contribute toward building safer, more trustworthy LLMs.

Name | Role | School | Department
Dan Jurafsky | Main PI | School of Humanities and Sciences | Linguistics

Many teachers want local, inquiry-based lessons, but lack the time, tools, and GIS skills to turn open data into classroom-ready activities. Students, especially in under-resourced schools, miss chances to build spatial thinking and do real civic inquiry. PlaceEd Co-Pilot is a limited-purpose, teacher-only AI that helps educators create place-based, map-driven lessons in minutes. It pulls curated civic and environmental layers, builds activities in a clear structure (objectives, launch, evidence steps, quick checks), explains each choice in plain language, and offers easy adaptations for different learners and low-tech settings. Students never use the tool; teachers review all materials before class.

We will run a one-year, mixed-methods study with 30 California middle school teachers (across math, science, and social studies), primarily from LAUSD, SFUSD, and SDUSD. The work includes a spring prototype, summer co-design, and fall classroom use. We ask three questions:

Efficiency: does the co-pilot boost successful lesson creation and cut prep time for low-GIS teachers?

Usability: how useful and pedagogically valuable do teachers find the map-based activities?

Co-Design: how does teacher feedback improve templates, explanations, and guardrails?

Our goal is more classrooms running rigorous local data lessons with clear evidence paths and much less prep.

Name | Role | School | Department
Bryan Brown | Main PI | Graduate School of Education | Graduate School of Education

Large language models (LLMs) can adapt their responses to individual users, but today’s methods for evaluating such personalization often rely on shallow demographic categories or task-specific histories. Our project introduces a benchmark rooted in psychology theory and empirical data. We build synthetic personas from networks of beliefs, attitudes, values, and goals, constructs that shape how people judge information. By comparing LLM outputs to these psychological networks, we can measure how well models personalize in ways that matter for real people. This approach avoids the pitfalls of stereotypes and increases transparency about why an output counts as personalized.
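One way to picture the comparison the benchmark describes is to reduce a persona's network of beliefs, attitudes, values, and goals to weighted stance scores and measure how closely a model output's stances align with them. The sketch below is purely illustrative: the constructs, weights, and the cosine-similarity metric are assumptions, not the project's actual benchmark.

```python
# Illustrative sketch: score how well a model output's stances match a
# synthetic persona's belief network, both reduced here to weighted
# stance dictionaries. All constructs and weights are invented.
import math

persona = {"climate_action": 0.9, "tech_optimism": -0.2, "privacy": 0.7}
output_stances = {"climate_action": 0.8, "tech_optimism": 0.1, "privacy": 0.6}

def alignment(p, o):
    """Cosine similarity over the constructs the two profiles share."""
    keys = p.keys() & o.keys()
    dot = sum(p[k] * o[k] for k in keys)
    norm_p = math.sqrt(sum(p[k] ** 2 for k in keys))
    norm_o = math.sqrt(sum(o[k] ** 2 for k in keys))
    return dot / (norm_p * norm_o)

print(round(alignment(persona, output_stances), 3))
```

A network-based benchmark would go further than this flat vector, e.g. by weighting constructs by their centrality in the persona's belief network.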

Name | Role | School | Department
Nicholas Haber | Main PI | Graduate School of Education | Graduate School of Education
Carol Dweck | Co-PI | School of Humanities and Sciences | Psychology

Dairy farms increasingly rely on AI systems that monitor cow health and activity, predict injuries, illness, and breeding cycles, and recommend treatments. These systems promise to improve productivity, animal welfare, and decision-making by learning from data generated through sensors, activity trackers, and robotic milking systems. However, despite their predictive capabilities, many AI systems cannot learn from their mistakes or improve their accuracy over time. Realizing this promise depends on closing AI learning loops by integrating farmers’ interventions and expertise as feedback and ground truth labels into AI models. AI systems generate predictions and alert farmers to potential problems, such as a cow showing signs of mastitis, inviting farmers to investigate and intervene. However, these predictions are not always correct; farmers’ interpretations of AI predictions remain inaccessible, and records of their interventions are often entered into legacy herd management software that remains disconnected from the AI system, preventing AI models from learning through human feedback. In this sense, ground truth is lost in the void. Our project sheds light on how AI developers attempt to bridge this ground truth void through their design choices and by engaging dairy farmers in new forms of data work. Through an ethnographic study of leading dairy technology providers, we investigate how AI developers design user interfaces and encourage farmers to share their interpretations of AI predictions and translate their interventions into training data for supervised learning. By uncovering how continuous feedback loops between AI systems and domain experts are achieved in practice, our findings will inform and guide the design of AI systems and organizational workflows that keep AI development grounded in domain expertise and support sustained model improvement over time. 

Name | Role | School | Department
Pamela Hinds | Main PI | School of Engineering | Management Science

Recent work shows that social scientists are both excited and cautious about the role of large language models (LLMs) in research. LLMs already assist with tasks like literature review, data labeling, survey design, programming, and writing, and are expected to further scientific discovery through hypothesis generation, experimental design, and simulation. Some studies show that LLMs can simulate human opinions and behaviors with surprising accuracy, sometimes outperforming traditional theories, raising important questions about their future utility in the social sciences. This project investigates whether predictable scaling laws, commonly seen in LLM benchmarks, also apply to social science tasks. Because such tasks are not typically prioritized in LLM training, they may exhibit different scaling dynamics, including diminishing or even inverse returns. Cultural ambiguity, representational gaps in training data, and domain-specific knowledge may limit performance gains from model size alone. Understanding these dynamics can help social scientists gauge when and how LLMs can be reliably used, and when domain-specific models may be necessary, which helps lay the groundwork for future interdisciplinary collaboration.
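A scaling law of the kind the project would test is typically a power law, error ≈ a · N^(−b), fit in log-log space to (model size, task error) pairs. The sketch below shows that fit on invented numbers; the data points and fitted constants are illustrative assumptions, not results from this project.

```python
# Sketch: fit a power-law scaling curve, error ~ a * N^(-b), to
# hypothetical (parameter count, task error) pairs via least squares
# in log-log space. All numbers are invented for illustration.
import math

sizes = [1e8, 1e9, 1e10, 1e11]      # parameter counts (hypothetical)
errors = [0.42, 0.31, 0.24, 0.19]   # task error rates (hypothetical)

# A power law is linear in log-log space: log(err) = log(a) - b*log(N).
xs = [math.log(n) for n in sizes]
ys = [math.log(e) for e in errors]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
a, b = math.exp(my - slope * mx), -slope

print(f"fitted: error ~ {a:.2f} * N^(-{b:.3f})")
```

Diminishing or inverse returns on social science tasks would show up as a small, zero, or negative fitted exponent b.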

Name | Role | School | Department
Diyi Yang | Main PI | School of Engineering | Computer Science
David Grusky | Co-PI | School of Humanities and Sciences | Sociology
Tatsunori Hashimoto | Co-PI | School of Engineering | Computer Science

This proposal seeks funding for an AI-powered framework that reinvents how scientists create software for complex physical simulations, such as climate modeling.

Currently, developing this essential software, known as partial differential equation (PDE) solvers, is a major bottleneck, requiring immense computational power and specialized expertise. While Large Language Models (LLMs) can help automate this process, existing methods are inefficient, often relying on computationally expensive, brute-force approaches. We propose a collaborative, 'scientist-in-the-loop' workflow where an AI acts as an intelligent partner to a researcher. This process has three stages:

Analysis: The AI performs a mathematical analysis of the scientific problem.

Genesis: It then generates a small number of initial software solutions.

Synthesis: Finally, a team of AI judges refines these solutions in a tournament-style process, with the scientist providing goals, feedback, and expert guidance to 'nudge the judge' toward the best outcome.

This human-centered approach promises to drastically cut computational costs and carbon emissions while producing more accurate, reliable solvers tailored to specific scientific needs. By transforming the AI from a simple code generator into a collaborative partner, this framework will empower researchers, accelerate discovery, and make high-performance computing more accessible to a broader scientific community.
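The Synthesis stage's tournament can be pictured as repeated selection by a scoring judge. In the sketch below, `score` is a stand-in for the ensemble of AI judges plus scientist feedback, and the candidate solvers are opaque strings; everything here is an illustrative assumption, not the proposed system.

```python
# Sketch of a tournament-style refinement round. `score` is a
# placeholder for the AI-judge ensemble described in the proposal;
# candidates are opaque strings standing in for generated solvers.

def score(candidate):
    # Hypothetical judge: prefers candidates of length 10. A real judge
    # would evaluate solver accuracy, stability, and cost.
    return -abs(len(candidate) - 10)

def tournament_round(candidates, keep=2):
    """Keep the top-`keep` candidates by judge score."""
    return sorted(candidates, key=score, reverse=True)[:keep]

pool = ["solver_v" + "x" * i for i in range(1, 8)]
survivors = tournament_round(pool)
print(survivors)
```

In the proposed workflow, the scientist would 'nudge the judge' between rounds by adjusting goals and feedback, and the survivors would seed the next generation of candidates.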

Name | Role | School | Department
Madeleine Udell | Main PI | School of Engineering | Management Science and Engineering

We introduce STEPS, a human-centered AI tutor designed to help introductory chemistry students learn how to tackle complex, real-world problems. Many STEM courses emphasize right answers over the process of problem solving, leaving students with little guidance on key steps such as defining a problem, making reasonable assumptions, monitoring progress, and checking results. STEPS augments, rather than replaces, teaching by doing three things: it helps instructors generate authentic, course-aligned problems that are appropriately scaffolded; it guides students through each step of the problem-solving process using effective tutor behaviors; and it provides clear, individualized feedback to both students and instructors about where students are succeeding and where they need support. In partnership with the Chemistry Departments at Stanford and Ohio State University, STEPS will be embedded in weekly assignments and reinforced in lectures, quizzes, and reflections. We will evaluate its impact using mixed methods, examining plan quality, solution accuracy, transfer to later assessments, engagement, and student confidence. Developed through participatory design and with attention to transparency and equity, STEPS aims to promote deeper learning while avoiding pitfalls of unstructured AI use. The project will produce shareable tools, prompt templates, rubrics, analytics, and implementation guides, and evidence about when AI tutoring helps or hinders problem solving.

Name | Role | School | Department
Shima Salehi | Main PI | Graduate School of Education | Graduate School of Education
Jennifer Schwartz Poehlmann | Co-PI | School of Humanities and Sciences | Chemistry

The discovery of cancer drug resistance mechanisms through RNA-seq analysis is hindered by the immense scale and complexity of experimental data and biomedical literature. Current language models, which rely on embedding-similarity search, struggle with specialized biomedical content and are unable to process long documents due to restricted context windows.

This project addresses these challenges by developing a next-generation AI-powered assistant for technical research. Our system constructs custom knowledge graphs to comprehensively index scientific publications, organizing large collections of complex medical documents into a curated, structured database for advanced reasoning. As the assistant analyzes literature, it uncovers new leads and connections, enabling iterative, in-depth investigation. The system will support researchers by providing precise information, linking experimental findings to resistance pathways, rigorously prioritizing gene candidates by biological plausibility, and suggesting evidence-based follow-up experiments. This project aims to accelerate the identification of drug resistance mechanisms and set new benchmarks for AI research tools, fostering more effective and reliable scientific inquiry in partnership with human experts.
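At its simplest, a knowledge graph of this kind stores subject-relation-object facts with their source publications and supports multi-hop traversal from a gene toward resistance pathways. The sketch below uses an adjacency list over invented entities and paper IDs; it is a minimal illustration of the indexing idea, not the project's system.

```python
# Minimal knowledge-graph sketch: facts are (relation, object, source)
# triples indexed by subject. All gene, pathway, and paper names are
# invented for illustration.
from collections import defaultdict

edges = defaultdict(list)

def add_fact(subject, relation, obj, source_paper):
    edges[subject].append((relation, obj, source_paper))

add_fact("GENE_A", "upregulated_in", "resistant_line_1", "paper_001")
add_fact("GENE_A", "member_of", "pathway_X", "paper_002")
add_fact("pathway_X", "implicated_in", "drug_resistance", "paper_003")

def trace(entity, depth=0):
    """Follow outgoing edges to surface multi-hop leads with citations."""
    for relation, obj, src in edges.get(entity, []):
        print("  " * depth + f"{entity} -{relation}-> {obj} [{src}]")
        trace(obj, depth + 1)

trace("GENE_A")
```

Keeping the source paper on every edge is what lets the assistant link a prioritized gene candidate back to the evidence supporting it.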

Name | Role | School | Department
James Ford | Main PI | School of Medicine | Medicine - Med/Oncology
Monica Lam | Co-PI | School of Engineering | Computer Science

Blue whales, the largest animals on Earth, play a vital role in maintaining healthy ocean ecosystems. They communicate across vast distances using complex songs, but scientists still know little about what these songs mean or how they relate to whale behavior. Our project brings together experts in computer science and ocean science to explore this problem. We will combine neuro-symbolic AI systems with animal-borne sensor data to study the syntax and functions of blue whale songs. By breaking down songs into their basic components and examining how these components are combined, we aim to predict behaviors such as when and where whales feed, and to uncover how communication patterns vary across groups and environments. This research will not only advance our understanding of whale communication, but also support conservation efforts by providing new insights into how blue whales interact with each other and their changing ocean habitat.

Name | Role | School | Department
Jiajun Wu | Main PI | School of Engineering | Computer Science
Jeremy Goldbogen | Co-PI | School of Sustainability | Oceans

We propose to transform human-AI codesign by integrating generative artificial intelligence, physics simulation, and human-machine interfaces, with a focus on discovering novel continuum robot mechanisms. Continuum robots, with their elastic, infinite-degree-of-freedom structures, enable applications ranging from minimally invasive surgery to inspection and 3D printing, but their mechanisms are often unintuitive, and innovation has historically been slow. Our approach addresses three key human design bottlenecks: (1) limited exploration of the design space, (2) costly and inefficient testing of design candidates, and (3) difficulty in reimagining new objectives for unconventional mechanisms. We leverage diffusion-based generative models guided by differentiable fitness functions to explore and iteratively refine highly expressive design spaces, combined with virtual and physical prototyping to facilitate human interaction and evaluation. By also solving inverse design problems that map novel mechanisms to potential tasks, we aim to accelerate innovative leaps. This interdisciplinary effort combines expertise in mechanical design, continuum robotics, human cognition, and generative AI, and seeks to establish a generalizable framework for human-AI collaboration in creative engineering.

Name | Role | School | Department
Allison Okamura | Main PI | School of Engineering | Computer Science
Judith Fan | Co-PI | School of Humanities and Sciences | Psychology
Shuran Song | Co-PI | School of Engineering | Electrical Engineering