Seed Research Grants | Stanford HAI


Seed Research Grants

Status: Closed
Date: Applications closed on September 15, 2025
Overview
2024 Recipients
2023 Recipients
2022 Recipients
2021 Recipients
2020 Recipients
2019 Recipients
2018 Recipients
Related
  • Stanford HAI Funds Groundbreaking AI Research Projects
    Nikki Goth Itoi
    Quick Read, Jan 30
    news

    Thirty-two interdisciplinary teams will receive $2.37 million in Seed Research Grants to work toward initial results on ambitious proposals.

  • Policy-Shaped Prediction: Avoiding Distractions in Model-Based Reinforcement Learning
    Nicholas Haber, Miles Huston, Isaac Kauvar
    Dec 13
    Research

    Model-based reinforcement learning (MBRL) is a promising route to sample-efficient policy optimization. However, a known vulnerability of reconstruction-based MBRL consists of scenarios in which detailed aspects of the world are highly predictable but irrelevant to learning a good policy. Such scenarios can lead the model to exhaust its capacity on meaningless content, at the cost of neglecting important environment dynamics. While existing approaches attempt to solve this problem, we highlight its continuing impact on leading MBRL methods, including DreamerV3 and DreamerPro, with a novel environment where background distractions are intricate, predictable, and useless for planning future actions. To address this challenge we develop a method for focusing the capacity of the world model through synergy of a pretrained segmentation model, a task-aware reconstruction loss, and adversarial learning. Our method outperforms a variety of other approaches designed to reduce the impact of distractors, and is an advance towards robust model-based reinforcement learning.

  • LABOR-LLM: Language-Based Occupational Representations with Large Language Models
    Susan Athey, Herman Brunborg, Tianyu Du, Ayush Kanodia, Keyon Vafa
    Dec 11
    Research

    Vafa et al. (2024) introduced a transformer-based econometric model, CAREER, that predicts a worker’s next job as a function of career history (an “occupation model”). CAREER was initially estimated (“pre-trained”) using a large, unrepresentative resume dataset, which served as a “foundation model,” and parameter estimation was continued (“fine-tuned”) using data from a representative survey. CAREER had better predictive performance than benchmarks. This paper considers an alternative where the resume-based foundation model is replaced by a large language model (LLM). We convert tabular data from the survey into text files that resemble resumes and fine-tune the LLMs using these text files with the objective of predicting the next token (word). The resulting fine-tuned LLM is used as an input to an occupation model. Its predictive performance surpasses all prior models. We demonstrate the value of fine-tuning and further show that by adding more career data from a different population, fine-tuning smaller LLMs surpasses the performance of fine-tuning larger models.

  • How Persuasive Is AI-generated Propaganda?
    Josh A. Goldstein, Jason Chao, Shelby Grossman, Alex Stamos, Michael Tomz
    Feb 20
    Research

    Can large language models, a form of artificial intelligence (AI), generate persuasive propaganda? We conducted a preregistered survey experiment of US respondents to investigate the persuasiveness of news articles written by foreign propagandists compared to content generated by GPT-3 davinci (a large language model). We found that GPT-3 can create highly persuasive text as measured by participants’ agreement with propaganda theses. We further investigated whether a person fluent in English could improve propaganda persuasiveness. Editing the prompt fed to GPT-3 and/or curating GPT-3’s output made GPT-3 even more persuasive, and, under certain conditions, as persuasive as the original propaganda. Our findings suggest that propagandists could use AI to create convincing content with limited effort.

  • Sociotechnical Audits: Broadening the Algorithm Auditing Lens to Investigate Targeted Advertising
    Michelle Lam, Ayush Pandit, Colin H. Kalicki, Rachit Gupta, Poonam Sahoo, Danaë Metaxa
    Oct 04
    Research

    Algorithm audits are powerful tools for studying black-box systems without direct knowledge of their inner workings. While very effective in examining technical components, the method stops short of a sociotechnical frame, which would also consider users themselves as an integral and dynamic part of the system. Addressing this limitation, we propose the concept of sociotechnical auditing: auditing methods that evaluate algorithmic systems at the sociotechnical level, focusing on the interplay between algorithms and users as each impacts the other. Just as algorithm audits probe an algorithm with varied inputs and observe outputs, a sociotechnical audit (STA) additionally probes users, exposing them to different algorithmic behavior and measuring their resulting attitudes and behaviors. As an example of this method, we develop Intervenr, a platform for conducting browser-based, longitudinal sociotechnical audits with consenting, compensated participants. Intervenr investigates the algorithmic content users encounter online, and also coordinates systematic client-side interventions to understand how users change in response. As a case study, we deploy Intervenr in a two-week sociotechnical audit of online advertising (N = 244) to investigate the central premise that personalized ad targeting is more effective on users. In the first week, we observe and collect all browser ads delivered to users, and in the second, we deploy an ablation-style intervention that disrupts normal targeting by randomly pairing participants and swapping all their ads. We collect user-oriented metrics (self-reported ad interest and feeling of representation) and advertiser-oriented metrics (ad views, clicks, and recognition) throughout, along with a total of over 500,000 ads. 
Our STA finds that targeted ads indeed perform better with users, but also that users begin to acclimate to different ads in only a week, casting doubt on the primacy of personalized ad targeting given the impact of repeated exposure. In comparison with other evaluation methods that only study technical components, or only experiment on users, sociotechnical audits evaluate sociotechnical systems through the interplay of their technical and human components.

  • How Culture Shapes What People Want From AI
    Chunchen Xu, Xiao Ge, Daigo Misaki, Hazel Markus, Jeanne Tsai
    May 11
    Research

    There is an urgent need to incorporate the perspectives of culturally diverse groups into AI developments. We present a novel conceptual framework for research that aims to expand, reimagine, and reground mainstream visions of AI using independent and interdependent cultural models of the self and the environment. Two survey studies support this framework and provide preliminary evidence that people apply their cultural models when imagining their ideal AI. Compared with European American respondents, Chinese respondents viewed it as less important to control AI and more important to connect with AI, and were more likely to prefer AI with capacities to influence. Reflecting both cultural models, findings from African American respondents resembled both European American and Chinese respondents. We discuss study limitations and future directions and highlight the need to develop culturally responsive and relevant AI to serve a broader segment of the world population.

  • Minority-group incubators and majority-group reservoirs for promoting the diffusion of climate change and public health adaptations
    Matthew Adam Turner, Alyson L Singleton, Mallory J Harris, Cesar Augusto Lopez, Ian Harryman, Ronan Forde Arthur, Caroline Muraida, James Holland Jones
    Jan 01
    Research

    Current theory suggests that heterogeneous metapopulation structures can help foster the diffusion of innovations to solve pressing issues including climate change adaptation and promoting public health. In this paper, we develop an agent-based model of the spread of adaptations in simulated populations with minority-majority metapopulation structure, where subpopulations have different preferences for social interactions (i.e., homophily) and, consequently, preferentially learn from their own group. In our simulations, minority-majority-structured populations with moderate degrees of in-group preference better spread and maintained an adaptation compared to populations with more equal-sized groups and weak homophily. Minority groups act as incubators for novel adaptations, while majority groups act as reservoirs for the adaptation once it has spread widely. This suggests that population structure with in-group preference could promote the maintenance of novel adaptations.

  • Interaction of a Buoyant Plume with a Turbulent Canopy Mixing Layer
    Hayoon Chung, Jeffrey R Koseff
    Jun 23
    Research

    This study aims to understand the impact of instabilities and turbulence arising from canopy mixing layers on wind-driven wildfire spread. Using an experimental flume (water) setup with model vegetation canopy and thermally buoyant plumes, we study the influence of canopy-induced shear and turbulence on the behavior of buoyant plume trajectories. Using the length of the canopy upstream of the plume source to vary the strength of the canopy turbulence, we observed behaviors of the plume trajectory under varying turbulence yet constant cross-flow conditions. Results indicate that increasing canopy turbulence corresponds to increased strength of vertical oscillatory motion and variability in the plume trajectory/position. Furthermore, we find that the canopy coherent structures characterized at the plume source set the intensity and frequency at which the plume oscillates. These perturbations then move longitudinally along the length of the plume at the speed of the free stream velocity. However, the buoyancy developed by the plume can resist this impact of the canopy structures. Due to these competing effects, the oscillatory behavior of plumes in canopy systems is observed more significantly in systems where the canopy turbulence is dominant. These effects also have an influence on the mixing and entrainment of the plumes. We offer scaling analyses to find flow regimes in which canopy induced turbulence would be relevant in plume dynamics.

  • Stanford AI Scholars Find Support for Innovation in a Time of Uncertainty
    Nikki Goth Itoi
    Jul 01
    news

    Stanford HAI offers critical resources for faculty and students to continue groundbreaking research across the vast AI landscape.

Understanding the genetic and molecular basis of disease can have a profound influence on the development of targeted, personalized therapies and disease risk management. However, the process of determining causal variants of diseases remains challenging and largely manual, requiring researchers to compile predictions from computational models and synthesize annotations scattered across various databases. We will develop and optimize large language model (LLM)-based chatbots with programmatic access to diverse biological databases and tools that will enable researchers to obtain relevant information and synthesize data to answer complex queries. Researchers will be able to interface with the chatbots through a web platform with data visualization capabilities and features specifically designed to facilitate AI-assisted variant prioritization. The integration of advances in generative AI and natural language processing with bioinformatics will empower researchers to efficiently analyze the functional effects of genetic variants, significantly advancing our ability to discover mechanisms of various genetic diseases.
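The "programmatic access to diverse biological databases" this chatbot would rely on follows a common tool-dispatch pattern: the model emits a structured tool call and a registry routes it to the right lookup. The sketch below is a minimal illustration under stated assumptions — the tool name, the toy annotation data, and the dispatch format are all hypothetical, not the project's actual interfaces.

```python
# Hypothetical tool registry: maps tool names to database-lookup functions.
TOOLS = {}

def tool(name):
    """Register a function as a callable tool for the chatbot."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register

@tool("variant_annotation")
def variant_annotation(variant_id):
    # Stand-in for a query against a real annotation database (e.g. ClinVar);
    # the entries here are illustrative only.
    toy_db = {"rs429358": {"gene": "APOE", "significance": "risk factor"}}
    return toy_db.get(variant_id, {"gene": None, "significance": "unknown"})

def dispatch(tool_call):
    """Route a model-emitted call of the form {'tool': ..., 'args': {...}}."""
    fn = TOOLS[tool_call["tool"]]
    return fn(**tool_call["args"])

result = dispatch({"tool": "variant_annotation",
                   "args": {"variant_id": "rs429358"}})
print(result["gene"])  # APOE
```

In a full system the LLM would decide which tool to call and with which arguments; the registry-and-dispatch shape stays the same as tools are added.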

Name | Role | School | Department
Anshul Kundaje | Main PI | School of Medicine | Genetics

Integration of AI in clinical decision-support tools promises to augment clinicians’ knowledge work. Despite significant AI advances, adoption of these tools in the real world remains limited due to their poor alignment with the needs of clinicians and their decisional contexts. For AI to be successfully adopted in clinical practice, it is crucial not only to build accurate models but also to deliver AI support in a way that considers factors that matter for decision-making, such as the context of decisions, the type of computational output, and individual differences in the reasoning and judgment of decision-makers. Designing interfaces to deliver machine-learned outputs tailored to these different dimensions remains a complex and unsystematic endeavor. To address this gap, the project brings together experts in clinical decision-making, human-computer interaction, and psychology to systematically design and evaluate different modes of delivering AI support. Using a mixed-methods approach, it first involves clinicians in a co-design process to create salient design variations of decision-support mechanisms for two clinical tasks: lab ordering and antibiotic selection. It then conducts an experimental evaluation of combinations of task and decision-support mechanisms to examine how different task-interface combinations impact clinicians' performance, confidence, and reasoning. Findings from the study will show which types of clinical decision-making tasks benefit from which types of AI support, and how different modes of delivering AI support influence performance and confidence. By systematically addressing limitations in existing AI integration in clinical practice, the project aims to bridge the gap between technological capability and practical usability, advancing how AI-assisted clinical decision-making is conceptualized and supported.

Name | Role | School | Department
Shriti Raj | Main PI | School of Medicine | Biomedical Informatics Research
Jonathan Chen | Co-PI | School of Medicine | Biomedical Informatics Research
Tobias Gerstenberg | Co-PI | School of Humanities and Sciences | Psychology

This project develops tools that apply large language models (LLMs) to help experts analyze discrimination in the courtroom. We focus on prior research on two kinds of bias: framing of Black Americans with dehumanizing metaphors (for example, body-centric language, or framing Black people as inherently criminal), and language about the physical spaces associated with Black Americans. We will develop an LLM-based algorithm for identifying how courtroom actors frame various participants, drawing on our prior work using LLMs for detecting implicit meanings like dehumanization. We will also study the downstream effect of this biased language with a human subjects experiment that measures how mentions of racialized spaces color the decisions of jurors. Our project aims to explore how LLMs can help with the critical, challenging task of understanding implicit language in the courtroom, ultimately improving access to justice for people of all demographics.

Name | Role | School | Department
Dan Jurafsky | Main PI | School of Humanities and Sciences | Linguistics

The ocean covers 71% of the planet and makes up 99% of its habitable space; however, we know very little about Earth’s marine biodiversity, and global genetic databases have significant gaps. Environmental DNA (eDNA, i.e., DNA shed by organisms into the environment) is increasingly being utilized to improve marine biodiversity monitoring. While eDNA is a promising way to monitor biodiversity trends, current methods are constrained by existing databases and human decision-making in data processing. Leveraging AI, our project aims to develop a new approach that enables context-aware phylogenetic inference using multiple datasets, combined with a novel statistical algorithm for reference-free species identification.
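The project's reference-free species identification is described only at a high level; one common building block for comparing sequences without a reference database is k-mer similarity rather than alignment. The toy sketch below — an illustration, not the project's algorithm — scores two reads as related when their k-mer sets overlap strongly.

```python
def kmers(seq, k=4):
    """Set of all length-k substrings of a DNA sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def jaccard(a, b, k=4):
    """Jaccard similarity between the k-mer sets of two sequences."""
    ka, kb = kmers(a, k), kmers(b, k)
    return len(ka & kb) / len(ka | kb)

read1 = "ACGTACGTAC"
read2 = "ACGTACGTTT"   # shares a long prefix with read1
read3 = "GGGGCCCCGG"   # unrelated composition

print(round(jaccard(read1, read2), 2))  # 0.67: most 4-mers shared
print(jaccard(read1, read3))            # 0.0: no shared 4-mers
```

Reference-free pipelines typically cluster reads by such similarity scores first, then infer taxonomy for whole clusters, which sidesteps gaps in the reference databases the abstract mentions.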

Name | Role | School | Department
Fiorenza Micheli | Main PI | School of Sustainability | Oceans
Julia Salzman | Co-PI | School of Medicine | Biomedical Data Science

Weather-related risks continue to stress insurance markets, one of the key tools available both for protecting households from the ravages of climate change and for incentivizing adaptation strategies. Florida’s growing hurricane risks have demonstrated the necessity of functioning insurance markets. Los Angeles’s recent fires have reinforced the centrality of insurance markets for managing and adapting to a changing physical and political environment. As risks shift, understanding the implications of pricing strategies, regulatory approaches, and market functions adds substantial complexity to insurance operations and oversight. This project proposes to develop AI tools to bring speed, efficiency, and eventually insight to these processes.

In Florida, two key government actors engage with insurance markets: the Office of Insurance Regulation (OIR) and the Insurance Consumer Advocate (ICA). Insurance companies frequently apply for regulatory approval for insurance rate changes in response to perceived risks, whether from hurricane frequency and severity, aggregate risks that impact reinsurance markets (i.e., insurance for insurance companies, a market function essential to Florida), competitive dynamics, or the litigation environment within Florida. OIR attempts to balance market stability, company solvency, and consumer protections, while ICA focuses primarily on consumers. OIR and ICA work requires substantial manual processing and analysis, and currently both offices can face backlogs in their approval and oversight processes.

This project seeks to automate the initial data extraction and coding from insurance rate filings in support of ICA and OIR and to facilitate their analysis. The analysis of insurance filings is representative of challenging tasks where highly specialized expertise is required to interpret and process many similar documents of high complexity. Our research aims to create a general processing system for document sets that captures experts’ key considerations, processes the documents, and helps experts compare and contrast them. Specifically, we will study how our system can help OIR and ICA accelerate the rate approval process while maintaining accuracy. If successful, this project will aid consumer protections and improve the efficiency of regulatory oversight of insurance markets.
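The first stage described — extracting a fixed schema of fields from each filing so experts can compare documents side by side — can be sketched as follows. A production system would use an LLM for extraction; here a regular expression stands in, and the field names, patterns, and sample filing text are all illustrative assumptions.

```python
import re

# Hypothetical extraction schema: one regex per field of interest.
SCHEMA = {
    "requested_rate_change": r"rate change of ([+-]?\d+(?:\.\d+)?)%",
    "line_of_business": r"line of business:\s*([A-Za-z -]+)",
}

def extract_fields(filing_text):
    """Return one record per schema field (None when the field is absent)."""
    record = {}
    for field, pattern in SCHEMA.items():
        m = re.search(pattern, filing_text, flags=re.IGNORECASE)
        record[field] = m.group(1).strip() if m else None
    return record

filing = ("The company requests an overall rate change of 12.5% "
          "for the line of business: Homeowners Multi-Peril.")
print(extract_fields(filing))
```

Once every filing is reduced to the same record shape, comparing and contrasting documents — the bottleneck the abstract identifies — becomes a routine tabular operation.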

Name | Role | School | Department
Monica Lam | Main PI | School of Engineering | Computer Science
Marc Roston | Senior Research Scholar | School of Sustainability | Precourt Institute for Energy

This project aims to improve urban drought management by developing AI-driven tools that enhance drought indicators and response strategies. As climate change increases the frequency and severity of droughts, cities face growing challenges in ensuring reliable water access while managing limited resources. Current drought indicators are infrequently tested and updated, limiting their effectiveness in guiding drought responses that mitigate socio-economic impacts. By integrating water resource systems modeling with reinforcement learning, this research will optimize drought indicators, adapt them to changing water supply and demand patterns, and create more effective drought response plans. The approach focuses on providing actionable indicators for urban water planners and offering insights for California drought policy.
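The coupling of a water-system simulation with indicator optimization can be sketched in miniature. In this toy version — exhaustive search stands in for reinforcement learning, and every number is an illustrative assumption — a drought indicator is simply a storage threshold that triggers use restrictions, and the simulation scores each candidate threshold by the total cost of restrictions plus shortages.

```python
def simulate(threshold, inflows, demand=10.0, storage=20.0, capacity=100.0):
    """Total cost of one storage trace: shortages plus restriction burden."""
    cost = 0.0
    for inflow in inflows:
        restricted = storage < threshold              # the drought indicator fires
        use = demand * (0.7 if restricted else 1.0)   # restrictions cut use by 30%
        cost += 1.0 if restricted else 0.0            # social cost per restricted step
        storage = min(capacity, storage + inflow) - use
        if storage < 0:                               # unmet demand is penalized heavily
            cost += -storage * 10.0
            storage = 0.0
    return cost

inflows = [2, 1, 0, 1, 2, 3]  # a dry spell
best = min(range(0, 60, 5), key=lambda t: simulate(t, inflows))
print("best storage threshold:", best)  # early restrictions beat later shortages
```

The RL formulation in the project generalizes this: instead of one static threshold, a learned policy can adapt the indicator to changing supply and demand patterns.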

Name | Role | School | Department
Sarah Fletcher | Main PI | School of Engineering/School of Sustainability | Civil and Environmental Engineering
Mykel Kochenderfer | Co-PI | School of Engineering | Aeronautics and Astronautics

Rapid advancements in artificial intelligence (AI) raise important concerns about cybersecurity risks. While existing work shows AI falls short of human expertise in cybersecurity, we aim to identify specific characteristics that could serve as indicators of emerging capabilities and risks by studying the gap between AI and peak human performance. We propose comparing world-class hackers, selected based on their track record in security research and top competitions, against AI systems in controlled environments. By analyzing how both humans and AI approach security challenges, we can map out the specific expertise, intuitive knowledge, and problem-solving approaches that make human experts effective but remain difficult for AI to replicate. Conversely, we seek to identify areas where AI's latent capabilities may offer unique advantages to better understand how experts may best utilize AI in their work. This detailed analysis will help develop better ways to evaluate AI systems in cybersecurity contexts, addressing a critical gap in evidence-based policymaking and laying the groundwork for future benchmarks and evaluation work.

Name | Role | School | Department
Daniel Ho | Main PI | School of Law | Law School
Dan Boneh | Co-PI | School of Engineering | Electrical Engineering
Percy Liang | Co-PI | School of Engineering | Computer Science

Realistic practice and personalized feedback are key for psychotherapy trainees to effectively learn clinical helping skills. Simulation-based practice methods, such as roleplay with peers or standardized patients, can offer interactive practice opportunities that complement theoretical knowledge. However, each practice method has its limitations. Moreover, while instructor-provided feedback on psychotherapy skills can significantly enhance therapists’ competence and quality, it is often time-consuming and costly for psychotherapy classroom instructors and supervisors to provide such feedback on every practice. Innovative training methods that are more cost-effective and scalable are therefore needed. Our project will develop CARE, an interactive training system leveraging large language models (LLMs) to provide scalable practice and feedback in psychotherapy classrooms. In this proposed project, we work to accomplish this vision via two complementary research thrusts: (1) building the technical foundations and (2) deploying CARE in psychotherapy learning communities.

Name | Role | School | Department
Bruce Arnow | Main PI | School of Medicine | Psychiatry and Behavioral Sciences
Diyi Yang | Co-PI | School of Engineering | Computer Science

Intercultural competence is a set of related skills that facilitate successful interaction with people of other cultures. Those partaking in intercultural exchange need to be aware of their partners’ behavioral and linguistic conventions, as well as the deeper interpretations orienting their partners’ behavioral choices. If communication augmented by generative AI understands such cultural differences, it will be better equipped to make impactful use of visual and linguistic media. Although current large language models (LLMs) already contain implicit socio-cultural knowledge, these inferences may be entangled or superimposed. We propose developing a cross-cultural knowledge tool which (1) implements a dynamic methodology for collecting cultural knowledge, (2) will be useful for mapping cultural variance within and across cultures, and (3) when overlaid on current LLMs, will facilitate constructive intercultural exchange.

Name | Role | School | Department
Diyi Yang | Main PI | School of Engineering | Computer Science
Amir Goldberg | Co-PI | Graduate School of Business | Graduate School of Business

Proteins are central to many biological processes, yet analyzing and designing them often requires advanced expertise in structural biology and computational methods. We propose developing a multimodal AI framework that bridges protein structure and text modalities, enabling more intuitive interaction with protein structure data and protein design tools. Building upon existing generative protein structure models and large language models, we aim to establish new evaluation metrics to systematically assess generated protein structures and automate the generation of human-readable descriptions from protein structures. We will conduct comprehensive evaluation of the quality and diversity of current protein structure model outputs, create a comprehensive dataset of protein structures paired with detailed textual explanations, and ultimately train an integrated protein structure-text generative model. If successful, this framework could offer interpretations of complex protein features and lower the barriers to computational structural biology and protein engineering. This effort will provide researchers, educators, and enthusiasts with accessible tools for exploring protein structures.

Name | Role | School | Department
Possu Huang | Main PI | School of Engineering | Bioengineering

This project aims to create the first comprehensive textbook on how AI systems can learn from and align with human preferences and values. While there are many books on AI algorithms and human decision-making, there is currently no single resource that brings these fields together to teach how to build AI systems that effectively learn from human feedback and preferences. The proposed textbook will bridge this gap by combining insights from computer science, psychology, economics, and ethics. It will cover both technical approaches like reinforcement learning from human feedback (RLHF) and broader considerations around ethics, fairness, and responsible AI development. The book will include practical tutorials, real-world examples, and interdisciplinary exercises to help students and practitioners build AI systems that better serve human needs and values. A companion website will provide additional resources and maintain an engaged learning community. This project represents an important step toward ensuring AI systems are developed in ways that benefit society while respecting human values and preferences.
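At the technical core of the proposed textbook is RLHF, which rests on a simple preference loss: a reward model is trained so that the response humans chose scores higher than the one they rejected, under a Bradley-Terry model of pairwise comparison. A minimal numeric sketch (no real model; rewards are plain floats, and the values are illustrative):

```python
import math

def preference_loss(reward_chosen, reward_rejected):
    """Negative log-likelihood that the chosen response beats the rejected one."""
    # Bradley-Terry: P(chosen > rejected) = sigmoid(r_chosen - r_rejected)
    gap = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-gap)))

# A correct ranking gives a small loss; a confidently wrong one is penalized.
print(round(preference_loss(2.0, 0.0), 4))  # ~0.1269
print(round(preference_loss(0.0, 2.0), 4))  # ~2.1269
```

Training the reward model means minimizing this loss over many human-labeled preference pairs; the resulting reward then guides the policy-optimization stage of RLHF.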

Name | Role | School | Department
Sanmi Koyejo | Main PI | School of Engineering | Computer Science
Susan Athey | Co-PI | Graduate School of Business | Graduate School of Business
Noah Goodman | Co-PI | School of Humanities and Sciences | Psychology
Jonathan Levav | Co-PI | Graduate School of Business | Graduate School of Business
Dorsa Sadigh | Co-PI | School of Engineering | Computer Science
Diyi Yang | Co-PI | School of Engineering | Computer Science

Globally, over 310 million major surgeries occur annually, with outcomes heavily reliant on surgeon experience, leading to variability, inefficiencies, complications, and increased healthcare costs. To address this, we propose the concept of Surgeon-Machine Interfaces (SMI): the integration of intelligent machine models in real time to provide tailored experiential insights and augment surgeons' capabilities. Using microvascular decompression (MVD) surgery as a proof of concept, the platform quantifies key surgical and anatomical interactions, generates outcome-driving predictive metrics, and enables real-time deployment in the operating room. This project aims to enhance surgical decision-making to reduce variability in MVD surgeries and improve patient outcomes while advancing the quality of surgical care.

Name | Role | School | Department
Vivek Buch | Main PI | School of Medicine | Neurosurgery
Ehsan Adeli | Co-PI | School of Medicine | Psychiatry and Behavioral Sciences

Language models are aligned to the collective voice of many, resulting in outputs that align with no one in particular. This results in, for example, LLM-generated writing that feels correct but uninteresting, or that sounds plausible but doesn’t align with the user’s views, or otherwise doesn’t echo the user’s voice or intent. How might we develop algorithms and interactions that leverage the power of modern AI models in a way that maintains and even amplifies what is uniquely "you"? Our research is developing techniques to train personalized models from a small number of demonstrated examples, allowing you to construct a large number of bounded abstractions—LLM “brushes” that capture specific voices or elements that you might want to deploy. We are developing a set of interaction techniques for nonexpert end users to author, command, and combine these brushes as they use large language models.

Name | Role | School | Department
Michael Bernstein | Main PI | School of Engineering | Computer Science
Diyi Yang | Co-PI | School of Engineering | Computer Science

Radiology reports are detailed documents generated during routine clinical care when radiologists interpret complex medical images. The primary purpose of these reports is to facilitate effective communication between radiologists and referring physicians. Recently, patients have also had direct access to these reports, often receiving them even before their physicians can provide context and explanations. The specialized terminology and complexity inherent in radiology reports can make them difficult to understand, thereby increasing anxiety and uncertainty, particularly among patients with diverse educational backgrounds.

Improved comprehension of clinical reports has been associated with better patient engagement, reduced anxiety, and enhanced treatment outcomes. Recognizing the challenges patients face in understanding radiology reports, we aim to leverage large language models (LLMs) to develop an automated framework for generating and evaluating patient-friendly clinical reports. Our work focuses on three specific objectives: (1) devising innovative methods to generate accessible, patient-friendly clinical reports; (2) establishing robust evaluation and monitoring strategies to ensure the safety, quality, and effectiveness of LLM-generated summaries; and (3) clinically validating these automatically generated reports through engagement with all stakeholders, including patients.
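Objective (2) calls for automated evaluation and monitoring of LLM-generated summaries. The sketch below shows two simple rule-based gates such a pipeline might start from — every key finding must be covered, and sentences must stay short. It is an illustration only: real validation would rely on clinician review, and the sample report text and thresholds are assumptions.

```python
def covers_findings(summary, key_findings):
    """Return any clinically important finding missing from the lay summary."""
    return [f for f in key_findings if f.lower() not in summary.lower()]

def avg_sentence_length(text):
    """Crude readability proxy: mean words per sentence."""
    sentences = [s for s in text.replace("!", ".").split(".") if s.strip()]
    return sum(len(s.split()) for s in sentences) / len(sentences)

summary = ("Your scan shows a small lung nodule. "
           "It is likely benign, but a follow-up scan is recommended.")

print(covers_findings(summary, ["lung nodule", "follow-up"]))  # []: all covered
print(avg_sentence_length(summary) < 15)                       # True: short sentences
```

Gates like these catch obvious failures cheaply before a summary ever reaches the slower, more expensive stages of monitoring — model-based factuality checks and, ultimately, the clinical validation the abstract describes.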

Name | Role | School | Department
Akshay Chaudhari | Main PI | School of Medicine | Radiology
Roxana Daneshjou | Co-PI | School of Medicine | Biomedical Data Science
Jason Hom | Co-PI | School of Medicine | Medicine
David Larson | Co-PI | School of Medicine | Radiology

Our long-term goal is to equip robots with flexible and creative problem-solving capabilities in the physical world, such as rolling out dough with a wine bottle if no rolling pin is available. This will require robots to physically reason about the environment, preferably with as little exploration as possible. While robots are increasingly capable of manipulating a variety of objects, humans excel in their capacity to generalize their manipulation skills to novel objects, scenarios, or goals through innovative usage of objects and out-of-the-box thinking. On top of that, they often do so in only a handful of attempts. Currently, robots lack this creativity, flexibility, and fast adaptation which ultimately will be required when deploying them into environments they have never seen before (e.g. disaster zones or other planets). In this project, we will develop and test computational models of physical problem solving, serving the twin goals of (1) equipping robots with this capacity, and (2) better understanding the underlying cognitive mechanisms that support this capacity in humans. The core research question addressed in this proposal is how an agent (whether human or robot) can effectively reason about the infinitely many ways of interacting with novel objects in order to achieve various goals in previously unseen scenarios.

Name | Role | School | Department
Jeannette Bohg | Main PI | School of Engineering | Computer Science
Kayvon Fatahalian | Co-PI | School of Engineering | Computer Science
Tobias Gerstenberg | Co-PI | School of Humanities and Sciences | Psychology

High-quality teacher feedback is crucial for students’ literacy development and agency as critical independent writers. While recent advances in AI have led to various automated feedback tools, current large language models (LLMs) struggle to provide feedback that meets pedagogical standards across content, structural, and social-affective dimensions. This research proposes to 1) measure the quality of teacher-written and AI-generated feedback along research-based dimensions, 2) provide insights into how expert teachers balance complex pedagogical responsibilities, and 3) enhance automated feedback generation. We will develop a suite of automated benchmarks for feedback quality dimensions, leveraging expert annotations on a large dataset of feedback comments to train language models. We will then collect an RLHF dataset of expert teacher preferences to train both a standard black-box reward model and an interpretable reward model relating preferences to our benchmarks. Finally, we will apply a novel reinforcement learning approach to optimize open-source LLMs to generate feedback that aligns with learned reward functions and captures the complex objectives of the task. The resulting systems have the potential to exceed human feedback capabilities and allow individual teachers to customize feedback priorities without additional training.
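An interpretable reward model of the kind described can be illustrated with a tiny Bradley-Terry fit: each comment gets scores on hypothetical benchmark dimensions (here, content, structure, social-affective), and a linear weight vector is learned from pairwise expert preferences by gradient ascent on the logistic log-likelihood. All numbers below are invented, and this is a generic technique sketch, not the proposal's model.

```python
import math

# pairs[i] = (features_of_preferred_comment, features_of_rejected_comment),
# with illustrative benchmark scores for (content, structure, social-affective).
pairs = [
    ((0.9, 0.7, 0.8), (0.4, 0.6, 0.2)),
    ((0.8, 0.9, 0.6), (0.5, 0.3, 0.5)),
    ((0.7, 0.6, 0.9), (0.6, 0.5, 0.1)),
]

def score(w, f):
    return sum(wi * fi for wi, fi in zip(w, f))

def fit_bradley_terry(pairs, lr=0.5, steps=500):
    w = [0.0, 0.0, 0.0]
    for _ in range(steps):
        for fp, fr in pairs:
            # P(preferred beats rejected) = sigmoid(score difference)
            p = 1.0 / (1.0 + math.exp(-(score(w, fp) - score(w, fr))))
            g = 1.0 - p  # gradient of log-likelihood w.r.t. the score gap
            w = [wi + lr * g * (a - b) for wi, a, b in zip(w, fp, fr)]
    return w

w = fit_bradley_terry(pairs)
```

Because the reward is a linear function of named benchmark scores, the learned weights directly expose which pedagogical dimension drives the preferences, which is the interpretability property the abstract contrasts with a black-box reward model.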

Name | Role | School | Department
Dora Demszky | Main PI | Graduate School of Education | Graduate School of Education
Rebeca Silverman | Co-PI | Graduate School of Education | Graduate School of Education

The development of a motion-capture glove using Fiber Bragg Grating (FBG) sensors offers a novel approach to capturing hand motion with exceptional precision and minimal interference. Designed to track intricate movements and tactile interactions, this project has the potential to advance research in robotics, surgery, and the arts.

An initial application focuses on piano performance, addressing ergonomic challenges faced by individuals with smaller hand spans, often women, who are at greater risk of injury due to the standardized design of the piano keyboard. By collecting high-resolution, multimodal data, this research aims to inform safer and more inclusive practices in music education and performance.

This work will generate a comprehensive dataset, integrating motion, tactile, and audio inputs, with applications extending to robotics, computer vision, and ergonomic musical instrument design. This initiative integrates advanced sensing technology with human-centered research to foster equity and inclusivity across multiple fields.
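For context on the sensing principle: an FBG reflects a narrow band at its Bragg wavelength, and axial strain shifts that wavelength roughly as Δλ/λ = (1 − p_e)·ε, with p_e the effective photo-elastic coefficient (about 0.22 for silica fiber). The conversion below is a simplification that ignores temperature cross-sensitivity, which real interrogation systems must compensate for.

```python
def fbg_strain(base_nm: float, measured_nm: float, p_e: float = 0.22) -> float:
    """Axial strain from a Bragg wavelength shift.

    Uses the standard relation dL/L = (1 - p_e) * strain, with p_e the
    effective photo-elastic coefficient (~0.22 for silica fiber);
    temperature effects are ignored for simplicity.
    """
    return (measured_nm - base_nm) / (base_nm * (1.0 - p_e))

# A 1550 nm grating stretched to 1550.6 nm reads roughly 500 microstrain.
strain = fbg_strain(1550.0, 1550.6)
```

Sub-picometer wavelength resolution thus translates to sub-microstrain sensitivity, which is what makes FBG arrays attractive for capturing fine finger motion without bulky sensors on the hand.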

Name | Role | School | Department
Elizabeth Schumann | Main PI | School of Humanities and Sciences | Music
Mark Cutkosky | Co-PI | School of Engineering | Mechanical Engineering

Social isolation and loneliness are growing challenges in American society, driven by broad changes in our social fabric and underlying physical and mental health conditions, and compounded by population aging. The consequences for health and wellbeing are profound, on par with smoking a pack of cigarettes a day, yet identifying persons facing social isolation and loneliness is challenging, and solutions are only beginning to emerge. To support better intervention and research, we will conduct a phone survey to definitively identify persons with these conditions and link those results to extensive health records to create an “electronic phenotype”. The phenotype will provide the means for systematically identifying a population at risk, and we will use implementation science frameworks to query stakeholders at different levels of the health system about how to apply the phenotype in practice to conduct screening and ultimately improve the support and services provided to affected persons.
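In spirit, an electronic phenotype combines a validated survey signal with EHR-derived features into a screening flag. The rule below is purely illustrative: the thresholds, feature names, and logic are our assumptions, not the study's validated criteria, which would be learned from the linked survey-EHR data.

```python
# Illustrative rule-based phenotype (hypothetical thresholds and features,
# not the study's validated criteria).

def isolation_phenotype(ucla_loneliness_score, lives_alone, visits_last_year,
                        has_caregiver_contact):
    """Flag a patient as potentially socially isolated or lonely."""
    survey_flag = ucla_loneliness_score >= 6  # short UCLA-3 scale ranges 3-9
    ehr_signals = sum([
        lives_alone,
        visits_last_year == 0,      # no outpatient visits on record
        not has_caregiver_contact,  # no caregiver documented in the chart
    ])
    return survey_flag or ehr_signals >= 2
```

The point of the proposed definitive phone survey is to supply the ground-truth labels against which any such EHR-based rule (or model) can be calibrated.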

Name | Role | School | Department
Karl Lorenz | Main PI | School of Medicine | Med/Primary Care and Population Health
Selen Bozkurt | Co-PI | Emory University/School of Medicine | Biomedical Informatics
Karleen Giannitrapani | Co-PI | School of Medicine | Med/Primary Care and Population Health

Robotic dexterous manipulation has recently made significant progress through imitation learning from human demonstrations. Tools like the Universal Manipulation Interface (UMI) have greatly expanded the scale at which we can collect robot behavior data, but most rely solely on vision and miss key sensory signals such as touch and audio. In this project, we enhance UMI with CoinFT—a custom, coin-sized six-axis force-torque sensor at each fingertip—and add audio recording, creating UMIFT. This setup enables us to collect large, multimodal datasets encompassing vision, touch, and sound, allowing robots to learn manipulation skills that are more adaptable, precise, and safe. By leveraging these richer sensory cues, robots will be capable of reliably handling soft and fragile objects as well, representing a significant step forward in human-like dexterous manipulation.
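A practical detail in any such multimodal collection rig is time-aligning sensor streams that run at different rates (e.g., ~30 Hz camera frames against higher-rate force-torque samples). The nearest-timestamp matcher below is a generic sketch under that assumption, not UMIFT's actual synchronization pipeline; the sample values are invented.

```python
import bisect

def align_to_frames(frame_ts, sensor_ts, sensor_vals):
    """For each camera frame time, pick the nearest sensor sample (e.g., a
    6-axis force-torque reading or an audio RMS value). Assumes sorted times."""
    out = []
    for t in frame_ts:
        i = bisect.bisect_left(sensor_ts, t)
        candidates = [j for j in (i - 1, i) if 0 <= j < len(sensor_ts)]
        j = min(candidates, key=lambda k: abs(sensor_ts[k] - t))
        out.append(sensor_vals[j])
    return out

frames = [0.00, 0.033, 0.066]  # ~30 Hz camera timestamps (seconds)
ft_ts = [0.000, 0.010, 0.020, 0.030, 0.040, 0.050, 0.060, 0.070]  # 100 Hz F/T
ft_fz = [0.0, 0.1, 0.2, 0.5, 0.6, 0.4, 0.3, 0.2]  # invented normal-force trace
aligned = align_to_frames(frames, ft_ts, ft_fz)
```

Real systems typically prefer hardware triggering or interpolation over nearest-sample matching, but the per-frame pairing of vision with touch and audio is the structure a multimodal imitation-learning dataset needs.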

Name | Role | School | Department
Mark Cutkosky | Main PI | School of Engineering | Mechanical Engineering
Shuran Song | Co-PI | School of Engineering | Electrical Engineering

Dengue is the most significant arboviral disease globally, causing an estimated 210 million infections per year and leading to high rates of morbidity and considerable economic burdens. Dengue virus is transmitted from person to person via mosquito vectors that thrive in different environments. There are currently no preventative pharmaceutical interventions, such as vaccines, for dengue control, so mosquito control is the primary disease management tool. Historically, large outbreaks of the disease predominantly occurred in urban and peri-urban areas, attributed to the primary vector, Aedes aegypti. Over the last decade, however, outbreaks have emerged in rural areas where a highly invasive secondary vector, Aedes albopictus, is also present. The precise habitat in rural areas, and thus the targets for rural dengue management, remain unknown. This project will pair mosquito and plant species ground surveys, drone imagery, and Meta’s Segment Anything Model to identify Ae. albopictus habitat and efficiently create fine (0.02 m) resolution habitat maps. Drone imagery will then be combined with high-resolution satellite imagery (0.5 m) to generate dynamic regional risk maps that can be used for disease management.
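Bridging the 0.02 m drone maps to the 0.5 m satellite grid amounts to aggregating a fine binary habitat mask into fractional habitat cover per coarse pixel (a 25×25 block at those resolutions). The block-averaging sketch below is a generic illustration of that step, shown with a 2×2 block on a tiny invented mask rather than real segmentation output.

```python
# Aggregate a fine-resolution habitat mask (e.g., SAM output at 0.02 m) into
# a coarser fractional-cover raster matching satellite pixels (0.5 m would be
# a 25x25 block; a 2x2 block is used here for brevity).

def block_fraction(mask, block):
    """mask: 2D list of 0/1 habitat pixels; returns the fraction of habitat
    pixels per block x block cell (dimensions must divide evenly)."""
    rows, cols = len(mask), len(mask[0])
    out = []
    for r in range(0, rows, block):
        row = []
        for c in range(0, cols, block):
            cells = [mask[rr][cc]
                     for rr in range(r, r + block)
                     for cc in range(c, c + block)]
            row.append(sum(cells) / len(cells))
        out.append(row)
    return out

mask = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
coarse = block_fraction(mask, 2)  # fractional habitat cover per coarse cell
```

The resulting fractional-cover raster can then serve as a covariate layer in the regional risk maps the project describes.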

Name | Role | School | Department
Erin Mordecai | Main PI | School of Humanities and Sciences | Biology

This proposal seeks to enable language models to learn specialized knowledge from extremely small datasets—on the order of a single textbook—by creating a new approach to data-efficient model adaptation. We will build an experimental platform that studies small-scale learning, employ synthetic data generation techniques to break current scaling-law constraints, and systematically analyze the scaling properties of adapted models. Our work will evaluate the resulting models on specialized domains (e.g., code and mathematics) to measure improvements in accuracy and the effects of potential issues such as hallucination and bias. In the long term, we aim to enable models to learn with human-like data efficiency and broaden their impact beyond large-scale pretraining settings.
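Analyzing scaling properties typically reduces to fitting a power law of the form loss ≈ a · N^(−b) to (dataset size, loss) points and asking whether an adaptation method changes the exponent. The log-log least-squares fit below is a generic technique sketch, not the project's methodology, checked on synthetic data generated from a known power law.

```python
import math

def fit_power_law(ns, losses):
    """Least-squares fit of loss = a * n**(-b) in log-log space."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(l) for l in losses]
    m = len(xs)
    xbar, ybar = sum(xs) / m, sum(ys) / m
    slope = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
             / sum((x - xbar) ** 2 for x in xs))
    a = math.exp(ybar - slope * xbar)
    return a, -slope  # b = -slope

# Synthetic check: data drawn from loss = 5 * n**-0.3 is recovered exactly.
ns = [1_000, 10_000, 100_000, 1_000_000]
losses = [5.0 * n ** -0.3 for n in ns]
a, b = fit_power_law(ns, losses)
```

Under this framing, "breaking current scaling-law constraints" with synthetic data would show up as a larger effective b (or a smaller a) at textbook-scale N than the baseline fit predicts.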

Name | Role | School | Department
Tatsunori Hashimoto | Main PI | School of Engineering | Computer Science

Unlocking the mysteries of disease onset and progression necessitates cutting-edge algorithms that bridge the molecular and morphological intricacies of individual cells. Our project uses AI to align and analyze two powerful imaging modalities: Xenium, a technology that maps molecular information at subcellular resolution, and H&E staining, the gold standard for visualizing tissue architecture. By combining these methods, we aim to reveal fine-grained relationships between molecular activity and tissue structure, offering unprecedented insights into cellular behavior in healthy and precancerous tissues.

This interdisciplinary approach leverages advanced AI models, such as Swin Transformers and VoxelMorph, to precisely align images and segment individual cells. By enhancing the accuracy of these techniques, we can better understand how cellular diversity and organization contribute to disease, paving the way for earlier diagnostics and personalized treatments. Our research integrates expertise in genetics, bioinformatics, pathology, and AI, creating a framework that will benefit both researchers and clinicians.

With support from expert pathologists, we will validate this approach to ensure it meets the highest standards of accuracy and usability. Ultimately, this work could transform diagnostic protocols, making them more precise and accessible while advancing our understanding of complex diseases.
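The deep registration models named above (Swin Transformers, VoxelMorph) solve a much harder version of a simple underlying problem: finding the transform that best superimposes corresponding structures across modalities. The toy below illustrates only the translation-only, point-based core of that problem on invented centroid coordinates; it is not the project's alignment method.

```python
# Toy translation-only registration between two point sets (e.g., nuclei
# centroids detected in an H&E tile and in the matched Xenium field): grid
# search the integer shift minimizing mean nearest-neighbor distance.

def mean_nn_dist(src, dst, shift):
    dx, dy = shift
    total = 0.0
    for (x, y) in src:
        total += min(((x + dx - u) ** 2 + (y + dy - v) ** 2) ** 0.5
                     for (u, v) in dst)
    return total / len(src)

def best_shift(src, dst, search=range(-5, 6)):
    return min(((dx, dy) for dx in search for dy in search),
               key=lambda s: mean_nn_dist(src, dst, s))

he_pts = [(10, 10), (12, 15), (20, 18)]                 # invented centroids
xen_pts = [(x + 3, y - 2) for (x, y) in he_pts]         # same cells, shifted
print(best_shift(he_pts, xen_pts))                      # recovers (3, -2)
```

Real tissue sections additionally deform non-rigidly between modalities, which is exactly why learned deformable registration (VoxelMorph-style) is needed rather than a rigid grid search.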

Name | Role | School | Department
Michael Snyder | Main PI | School of Medicine | Genetics

A hallmark of human intelligence is the exquisite balance of engaging goal-driven control over complex aspects of tasks while relegating undemanding aspects to more automatic processes. Visual search is a canonical task in which human performance can be guided both by goal-driven and automatic systems. Classical studies of visual search behavior in humans has provided fundamental insights into the trade-off between goal-driven and automatic behavior. However, a major challenge has been to make image computable models of visual search behavior for natural images that capitalize on these insights. Moreover, testing to see whether fundamental insights learned from behavior tested with simple, highly controlled visual search arrays generalize to search in natural images is a nontrivial challenge. Here we propose to use deep learning models with aligned linguistic and visual representations to build models of human visual search in natural images which can account for both automatic and goal-driven behavior. We will test these models using visual search stimuli generated with deep image synthesis designed specifically to test the tradeoff between automatic and goal-driven behavior. Ultimately we expect that our data and models of visual search will form the basis for artificial intelligence systems capable of monitoring the balance of goal-driven and automatic behavior and optimize human performance through image synthesis of stimuli which promote optimal balance of these two forces. 
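The automatic/goal-driven trade-off described above is often modeled as a priority map that mixes a bottom-up saliency map with a top-down target-similarity map, with gaze drawn to the peak. The sketch below is a deliberately minimal caricature of that idea with invented 2x2 maps, not the proposed image-computable model.

```python
# Toy priority map: weighted sum of bottom-up saliency (automatic) and
# top-down target similarity (goal-driven); the model "fixates" the peak.

def fixation(saliency, target_sim, w_goal):
    best, best_val = None, float("-inf")
    for r in range(len(saliency)):
        for c in range(len(saliency[0])):
            val = (1 - w_goal) * saliency[r][c] + w_goal * target_sim[r][c]
            if val > best_val:
                best, best_val = (r, c), val
    return best

saliency   = [[0.9, 0.1], [0.2, 0.3]]  # a flashy distractor at (0, 0)
target_sim = [[0.1, 0.2], [0.2, 0.9]]  # the target resembles (1, 1)

print(fixation(saliency, target_sim, w_goal=0.0))  # purely automatic
print(fixation(saliency, target_sim, w_goal=1.0))  # purely goal-driven
```

Estimating where a given observer sits on the w_goal axis from natural-image search behavior is, in miniature, the monitoring problem the closing sentence of the abstract envisions.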

Name | Role | School | Department
Justin Gardner | Main PI | School of Humanities and Sciences | Psychology
Jiajun Wu | Co-PI | School of Engineering | Computer Science

The construction industry is one of the most resource-intensive global sectors, often facing inefficiencies and waste. Despite advancements in technology, managers still rely on subjective visual assessments to monitor progress, leading to delays, cost overruns, and safety concerns. This project aims to develop a digital twin—a 4D queryable representation of construction sites that is generated by visual data (color and depth images) collected over time. Using advanced computer vision and machine learning models, the system will enable users of any expertise or skill to interact with the digital twin by querying it in natural language to obtain insights about progress, identify issues, and anticipate future outcomes. By integrating visual and linguistic models, the system will provide interpretable visual outputs in response to user queries, hence increasing transparency and explainability. This approach seeks to improve efficiency, enhance safety, and support sustainability in construction project management. 
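Stripped of the vision and language components, the "queryable 4D representation" is a time-indexed state history per building element that can be queried as of any date. The sketch below shows only that data-structure skeleton; the element names, states, and dates are invented, and the project's actual representation is built from color and depth imagery.

```python
from datetime import date

# Timestamped state observations per building element (illustrative data).
observations = [
    (date(2025, 1, 10), "wall_A3", "framed"),
    (date(2025, 2, 2),  "wall_A3", "drywalled"),
    (date(2025, 2, 20), "wall_A3", "painted"),
    (date(2025, 1, 15), "slab_2F", "poured"),
]

def state_as_of(element, when):
    """Latest recorded state of an element on or before `when`."""
    hist = [(t, s) for (t, e, s) in observations if e == element and t <= when]
    return max(hist)[1] if hist else None
```

A natural-language front end would translate a question like "was wall A3 painted by early February?" into exactly this kind of as-of lookup, and return the supporting imagery as the interpretable visual output the abstract describes.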

Name | Role | School | Department
Iro Armeni | Main PI | School of Engineering/School of Sustainability | Civil and Environmental Engineering
Gordon Wetzstein | Co-PI | School of Engineering | Electrical Engineering

This project aims to empower young people from disadvantaged backgrounds to use generative AI tools effectively, productively, and responsibly. While generative AI holds immense potential for enhancing productivity and solving real-world challenges, its improper use raises concerns about reliability and ethical considerations. To address these challenges, we will assess the current capacity of youth to engage with these tools, develop an educational curriculum emphasizing critical skills, sustainability, and ethical use, and rigorously evaluate its effectiveness through a randomized controlled trial involving 3,000 students across 150 classes. Partnering with a non-profit operating in rural India, we aim to build the capacity of underserved youth to harness generative AI in ways that benefit their personal and community development. The findings will provide valuable insights into fostering equitable access to AI education and its responsible use in developing country contexts.

Name | Role | School | Department
Prashant Loyalka | Main PI | Graduate School of Education | Graduate School of Education

We use critical theories of disability to examine the production and use of AI detection tools. First, we analyze how Human Computer Interaction and Natural Language Processing literature considers disability, such as through annotator recruitment, analyses of bias, and comparisons between people and models. Second, we conduct an algorithmic audit of several AI-detection tools to understand their treatment of disabled people. We focus specifically on the higher education context, and hope that our work will inform improved research directions and offer constructive considerations for the protection of disability rights with respect to AI. 

Name | Role | School | Department
Alfredo J. Artiles | Main PI | Graduate School of Education | Graduate School of Education

Dynamic brain data, teeming with biological and functional insights, are becoming increasingly accessible through advanced measurements, providing a gateway to understanding the inner workings of the brain in living subjects. However, the vast size and intricate complexity of these data also pose a daunting challenge in reliably extracting meaningful information across various data sources. In this project we will introduce a generalizable self-supervised deep manifold learning method for the exploration of dynamic patterns. Unlike existing methods, which extract patterns directly from the input data, the proposed brain cognitive network embedding (BCNE) seeks to capture the brain-state map by deciphering the temporospatial correlations within the data and subsequently applying manifold learning to this correlative representation. The performance of BCNE will be showcased through the analysis of several important dynamic brain datasets. We anticipate that BCNE will provide an effective tool for exploring general neuroscience inquiries.
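The distinctive first step, as we read the abstract, is replacing raw signals with their correlation structure before any embedding. The sketch below shows only that step on toy "voxel" time series: a plain Pearson correlation matrix stands in for the temporospatial correlative representation; the subsequent deep manifold learning is not sketched here.

```python
# Build a correlative representation from toy time series (illustrative only;
# not the BCNE pipeline itself).

def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a) ** 0.5
    vb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (va * vb)

def correlation_matrix(signals):
    return [[pearson(s, t) for t in signals] for s in signals]

signals = [
    [0.0, 1.0, 2.0, 3.0],   # rising activity
    [0.1, 1.1, 2.2, 2.9],   # tracks the first signal
    [3.0, 2.0, 1.0, 0.0],   # anti-correlated with the first
]
C = correlation_matrix(signals)
```

Working on C rather than on the raw signals is what makes the representation comparable across recording sessions and data sources with different amplitudes and units.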

Name | Role | School | Department
Lei Xing | Main PI | School of Medicine | Radiation Oncology - Radiation Physics

Hearing loss significantly impacts individuals of all ages, particularly children, where early detection and monitoring are critical for optimal speech and language development. In the United States, 3 in every 1,000 children are born with hearing loss, and 1 in 7 children aged 6 to 19 experience some form of hearing loss. Current hearing assessment methods, such as the Ling Sound Test, are challenging for parents to administer consistently, while access to specialists is often limited. We propose the development of the first AI-powered digital Ling Sound Test in a web-based platform for daily and long-term hearing monitoring at home. By addressing challenges such as uncontrolled home environments, background noise, and device variability, the research aims to deliver accurate, efficient, and robust hearing assessments. In addition to advanced statistical models, we will integrate a transformer-based model to refine hearing predictions and leverage a large language model (LLM) to provide personalized feedback. The gamified interface will be designed to sustain participation among children across different age groups. By enabling accessible, accurate, and engaging at-home hearing assessments, Shuno aims to empower families and foster better outcomes for children with hearing challenges.
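The Ling Sound Test works because its six sounds together span the speech-frequency range, so patterned misses localize a problem to a frequency region. The sketch below illustrates that bookkeeping only; the sound-to-band mapping is a simplification and the session data are invented, not output of the proposed AI models.

```python
# Score one at-home Ling-sound session (illustrative simplification).
LING_BANDS = {
    "m": "low", "oo": "low",
    "ah": "mid", "ee": "mid",
    "sh": "high", "s": "high",
}

def session_summary(responses):
    """responses: {sound: detected_bool}. Returns per-band miss counts so
    drift in one frequency region can be flagged across daily sessions."""
    misses = {"low": 0, "mid": 0, "high": 0}
    for sound, detected in responses.items():
        if not detected:
            misses[LING_BANDS[sound]] += 1
    return misses

today = {"m": True, "oo": True, "ah": True, "ee": True, "sh": False, "s": False}
print(session_summary(today))  # both high-frequency sounds missed
```

Trending such per-band summaries over weeks is what turns a parent-administered daily check into the long-term monitoring signal the platform targets.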

Name | Role | School | Department
Chris Piech | Main PI | School of Engineering | Computer Science

AI will increasingly be used to mediate interpersonal communication, especially in work settings. As people interact on digital platforms, LLM-based technology will assist them in crafting more effective messages, saving time and energy in the process. Such technologies already exist on the market; their use will likely proliferate dramatically in the near future. This study seeks to explore the implications of such technologies for culture and interactive productivity. Using an LLM-mediated interaction platform developed with the DSPy library, we will run a set of experiments in which pairs of individuals work together to perform a task. Participants will either be interacting with the assistance of an LLM-based intermediary, or without one. We will test how LLM intermediation affects participants’ perceptions of the quality of interaction, their sense of identification and connection with their partners, and their psychological well-being.

Name | Role | School | Department
Christopher Potts | Main PI | School of Humanities and Sciences | Linguistics
Amir Goldberg | Co-PI | Graduate School of Business | Graduate School of Business

Robot-assisted surgery (RAS) has had considerable success with well over 2,200,000 robot-assisted surgeries carried out in 2023 with Intuitive Surgical’s da Vinci Surgical System, which has an installed base of over 9,500 systems worldwide. In RAS, small incisions are made in the body and long, slender surgical instruments are inserted inside the patient’s abdomen along with a 3D camera. The surgeon teleoperates these instruments and the camera with two robotic arms from a remote console showing the 3D camera view. The patient-side robot instruments are controlled to follow the surgeon’s hand motions, with motion scaling for improved accuracy and filtering to remove hand tremor. The patient-side part of the latest systems has four robot arms; one of them holds the 3D camera and the remaining three hold the surgical instruments. 

A significant portion of many surgical procedures is devoted to three-handed tasks. The repetitive, less demanding part of these tasks is usually performed using one dedicated robot arm. The surgeon can then use the remaining two robot arms to perform the more demanding parts of the task. Since a surgeon can only control two arms at a time, they must switch the robot's control system to use one arm for one part of the task and then switch back to manipulate the other two arms as needed. This switching increases the surgeon's cognitive load and has been shown to cause negative outcomes such as collisions between the robot arms and tearing of human tissue. This presents a significant opportunity to develop autonomous systems that can carry out the tasks of the extra robotic arm(s).

The goal of this work is to develop a collaborative autonomous system that automates an extra robotic arm to work alongside the surgeon's teleoperated arms in a parallel and coordinated fashion. This will realize the full potential of having more than two robotic arms in RAS, unlike previous work in this area, where the autonomous auxiliary arms move alone while the teleoperated surgical instruments are idle. Our work will lead to a collaborative system in which an automated extra robotic arm efficiently assists the main surgeon as they teleoperate the other robotic arms. We hypothesize that our collaborative autonomous system will lead to faster execution of tasks than the state-of-the-art systems in this area. In addition, we hypothesize that this collaborative system will reduce the cognitive load of surgeons compared with the gold standard, where the main surgeon controls all the robotic arms. The knowledge and data generated from the proposed research will provide insights towards designing effective interactions between surgeons and autonomous systems.
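For an autonomous arm moving in a shared workspace with teleoperated instruments, smoothness matters: motions that start and end at rest are easier for the surgeon to anticipate. A minimum-jerk profile is one common choice for such point-to-point moves; the sketch below is a generic illustration, not the project's actual controller.

```python
# Minimum-jerk point-to-point profile (a common smooth-motion primitive;
# illustrative, not the project's controller).

def min_jerk(x0, xf, t, T):
    """Position at time t of a minimum-jerk move from x0 to xf over duration T."""
    tau = max(0.0, min(1.0, t / T))       # normalized, clipped time
    s = 10 * tau**3 - 15 * tau**4 + 6 * tau**5
    return x0 + (xf - x0) * s

# A 10 cm move over 2 s: endpoints exact, profile starts and ends at rest.
path = [min_jerk(0.0, 0.1, t / 10 * 2.0, 2.0) for t in range(11)]
```

The smooth polynomial has zero velocity and acceleration at both endpoints, which keeps the autonomous arm's motion predictable while the surgeon teleoperates the other arms nearby.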

Name | Role | School | Department
Allison Okamura | Main PI | School of Engineering | Mechanical Engineering
Carla Pugh | Co-PI | School of Medicine | Surgery

What knowledge is endowed to us by genetics versus what is learned through experience? Recent advances in AI have rendered this ancient puzzle tractable. It is now possible to train a model akin to those that have shown sparks of human intelligence (e.g., ChatGPT) on the experiences of infants and evaluate whether it learns like an infant. What knowledge, if any, needs to be built in for the model to learn like an infant? To start to answer this question, we will collect the most comprehensive multi-dimensional dataset from a single infant (2-24 months): we will deeply survey their visual and auditory experiences via head-mounted cameras, measure their neural activity through awake fMRI, and assess their language comprehension through behavioral tasks. Using these data, we will train a multi-modal model on the infant's experiential data and constrain it to have the same neural representations as measured in the infant. The model’s language acquisition will be evaluated against the child’s developmental milestones, with success measured by whether the model learns language at the same rate the child does. In doing so, we will make substantial progress in understanding visual and language development in infancy, which may inform what goes awry in developmental disorders (e.g., autism).

Name | Role | School | Department
Cameron Ellis | Main PI | School of Humanities and Sciences | Psychology
Michael Frank | Co-PI | School of Humanities and Sciences | Psychology
Dan Yamins | Co-PI | School of Humanities and Sciences | Psychology

When asked what they most desire in life, the majority of people cite happiness or well-being. Beyond its intrinsic value, high well-being is associated with improved mental and physical health, career success, and stronger social relationships. Despite its critical importance for individuals and its broader societal impact, global subjective well-being ratings have stagnated, even as technology has advanced. We propose leveraging LLM-based well-being chatbots to offer more engaging, personalized, and potentially more effective interventions. Our approach begins by enhancing existing well-being interventions, such as writing a gratitude letter or reflecting on a positive life moment, by incorporating tailored guidance and feedback. In the next phase, we aim to democratize the development of new interventions by providing easy-to-use chatbot templates for other researchers. This will lower technical barriers and encourage a broader range of innovative solutions from both the scientific community and the public, addressing the historical lack of diversity in well-being interventions. Finally, we will assess the effectiveness of these diverse interventions through a well-powered chatbot competition, identifying the most effective strategies for different populations. Through this project, we aim to explore how emerging AI technologies can supplement, not replace, traditional mental health resources by offering a low-barrier entry point for researchers to develop and test their own interventions, ultimately identifying the most effective strategies for enhancing well-being.

Name | Role | School | Department
Robb Willer | Main PI | School of Humanities and Sciences | Sociology
Johannes Eichstaedt | Co-PI | School of Humanities and Sciences | Psychology