Seed Research Grants | Stanford HAI

Seed Research Grants

Status: Closed
Date: Applications closed on September 15, 2025
Overview
2025 Recipients
2024 Recipients
2023 Recipients
2022 Recipients
2021 Recipients
2020 Recipients
2019 Recipients
2018 Recipients
This project is co-funded with the Stanford Center for Digital Health.

Point-of-care ultrasound (POCUS) allows clinicians to perform real-time assessments at the patient's bedside using portable ultrasound devices that can connect to their phone, thus improving diagnosis and reducing complications. However, training medical students and residents to use this technology effectively remains challenging due to limited faculty time and inconsistent teaching methods. Many trainees currently learn with minimal supervision, which can compromise patient care.

This project creates the first comprehensive, annotated database of POCUS images designed specifically to improve trainee performance with POCUS. We will collect and annotate 7,500 ultrasound clips from Stanford Medical Center covering heart, lung, and abdominal imaging (the three most common clinical applications). Unlike existing datasets that focus on disease detection, our database will carry dual annotations: each clip is labeled both for acquisition quality (identifying common acquisition errors) and for potential pathology.
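As an illustration of what such dual annotations could look like in practice, here is a minimal sketch of an annotation record; all field names and label values are hypothetical examples, not the project's actual schema.

```python
from dataclasses import dataclass
from typing import List

# Illustrative record for a dual-annotated POCUS clip: one label set for
# acquisition quality, one for pathology. Field names are hypothetical.
@dataclass
class PocusAnnotation:
    clip_id: str
    organ: str                     # "heart", "lung", or "abdomen"
    acquisition_errors: List[str]  # e.g. ["off-axis", "insufficient depth"]
    quality_score: int             # e.g. 1 (unusable) to 5 (expert-level)
    pathology_labels: List[str]    # e.g. ["pleural effusion"]; empty if normal

clip = PocusAnnotation(
    clip_id="example-000123",
    organ="lung",
    acquisition_errors=["insufficient depth"],
    quality_score=3,
    pathology_labels=["pleural effusion"],
)
```

Pairing both label types on the same clip is what lets a feedback model comment on how a trainee scanned, not just what the scan shows.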

This unique approach enables development of AI systems that provide real-time, personalized feedback to learners, which can simulate expert faculty guidance to overcome traditional teaching barriers with POCUS. The database will be released publicly to accelerate medical education research globally. Future applications include educational dashboards that help trainees improve their scanning technique and interpretation skills, potentially transforming how ultrasound is taught worldwide while ensuring AI augments rather than replaces human clinical judgment.

Name | Role | School | Department
Andre Kumar | Main PI | School of Medicine | Med/Hospital Medicine

The field of learning sciences has historically been limited not only by a “research-to-practice” gap but also by a “practice-to-research” gap: practitioners find it challenging to obtain actionable evidence for instructional methods that apply to their students’ unique learning contexts, and they also find it difficult to create the conditions needed to rigorously test their own instructional methods. This project aims to accelerate a virtuous cycle of learning-science research and practice by engaging both practitioners and researchers in the scientific discovery process.

The core enabling technology behind this initiative, Learning Strategy Studio, is a novel, strategy-following AI instructional agent whose instructional strategies can be configured in natural language by human actors. The agent implements a hierarchical instructional decision-making process across three layers: planning, activity design, and learner interaction. With this agent, we aim to engage both practitioners and researchers to author and experiment with various instructional strategies with real students. The agent’s interactions with learners, combined with its fine-grained strategic decision-making process and both proximal and distal measures of learning outcomes, will provide data that can yield actionable insights. These insights will benefit both human actors and AI agents, leading to continuous improvement in practice and research.

Name | Role | School | Department
Candace Thille | Main PI | Graduate School of Education | Graduate School of Education
Shima Salehi | Co-PI | Graduate School of Education | Graduate School of Education

Systematic reviews are essential for translating scientific findings into clinical guidelines and patient care, but they are slow and costly, often taking over a year and hundreds of thousands of dollars to complete. Existing workflow tools can help manage references but do not touch the content, while emerging GenAI platforms lack transparency and control, making them unsuitable for high-stakes medical use.

This project will evaluate and advance a new approach that embeds large language models (LLMs) within the structured PRISMA framework for systematic reviews. By limiting LLMs to critical but well-defined tasks such as screening and data extraction, and by integrating transparency features like reason tagging and traceable evidence retrieval, the system aims to combine efficiency with trustworthiness. We will benchmark the tool against gold-standard Cochrane reviews and study how transparency mechanisms improve reliability for guideline development. If successful, this work will lay the foundation for continuous, up-to-date “living reviews” that keep pace with rapidly expanding scientific evidence, changing how trusted knowledge is synthesized for healthcare.
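To make the transparency features concrete, here is one way an LLM-assisted screening decision with a reason tag and traceable evidence might be represented; the class and field names are hypothetical illustrations, not the project's actual design.

```python
from dataclasses import dataclass

# Illustrative record for an LLM-assisted PRISMA screening step.
# The key idea: every include/exclude decision carries a machine-readable
# reason tag and a verbatim evidence span, so reviewers can audit it.
@dataclass(frozen=True)
class ScreeningDecision:
    study_id: str
    include: bool
    reason_tag: str       # e.g. "wrong-population", "meets-criteria"
    evidence_quote: str   # verbatim span from the abstract justifying it

decision = ScreeningDecision(
    study_id="PMID-12345678",
    include=False,
    reason_tag="wrong-population",
    evidence_quote="participants were healthy adults, not children",
)
```

A human reviewer can then check the quoted span against the source abstract rather than trusting an opaque yes/no answer.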

Name | Role | School | Department
Eran Bendavid | Main PI | School of Medicine | Medicine (Primary Care and Population Health)
Carlos Guestrin | Co-PI | School of Engineering | Computer Science

Copyright is one of the central questions shaping the future of generative AI. A key point of dispute is the extent to which large language models (LLMs) reproduce copyrighted works from their training data, both within the models themselves ('memorization') and in their outputs at generation time ('extraction'). Drawing on methods from both machine learning and law, we will (1) measure how much publicly released LLMs (such as Llama) memorize specific books, (2) test how easily those books can be extracted in outputs, and (3) evaluate the associated copyright risks with greater precision than has previously been possible. We will also build tools that make our findings broadly accessible, providing policymakers, creators, and the public with a clearer understanding of these complex, intersecting technical and legal issues.
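As background on how extraction is typically quantified, a minimal sketch follows: prompt the model with a prefix from a book, then count how many tokens of its continuation match the true text verbatim. The 50-token threshold echoes prior memorization studies (e.g., Carlini et al.) and is not necessarily the metric this project will adopt.

```python
def verbatim_overlap(generated: str, reference: str) -> int:
    """Length (in whitespace tokens) of the longest common prefix between
    a model's continuation and the true continuation of a passage."""
    count = 0
    for g, r in zip(generated.split(), reference.split()):
        if g != r:
            break
        count += 1
    return count

def is_extractable(generated: str, reference: str, k: int = 50) -> bool:
    """A passage counts as (k-)extractable if the model reproduces at
    least k tokens verbatim; k = 50 follows prior memorization work."""
    return verbatim_overlap(generated, reference) >= k
```

Aggregating this score over many prompts per book gives a per-book extraction rate that can be compared across models.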

Name | Role | School | Department
Mark Lemley | Main PI | School of Law | Law School
Daniel Ho | Co-PI | School of Law | Law School
Percy Liang | Co-PI | School of Engineering | Computer Science

Learning to perform highly dexterous sensorimotor skills quickly and accurately typically requires practice with expert feedback. When expert coaching is unavailable, haptic technology—physical devices that stimulate the sense of touch—can provide feedback to novices acquiring motor skills; yet, many haptic interventions lead users towards one fixed motion path, which may be insufficient for learning. We hypothesize that modeling expert movement will enable the design of AI-adjustable, individualized guidance for novices, directing them to learn expert movement strategies or policies rather than following a fixed expert path. In the proposed work, we focus on a neonatal endotracheal intubation (ETI) task, a procedure clinicians must perform when a newborn (neonate) is unable to breathe on their own. We set out to understand and model expert ETI movement strategy using inverse reinforcement learning (IRL), then develop a system providing novices with AI-enabled haptic feedback based on our modeled expert movement policies. This work will provide a generalizable framework to explore motor skill learning strategies and support neonatal resuscitation training.

Name | Role | School | Department
Sean Follmer | Main PI | School of Engineering | Mechanical Engineering
Lou Halamek | Co-PI | School of Medicine | Pediatrics (Neonatology)

Industrial fisheries feed millions and support livelihoods worldwide, and ports are critical control points for management. Climate change is shifting fish distributions, but how this will affect where vessels land their catch remains uncertain. We need reliable, updatable forecasts of port use grounded in vessel decision making. Our project applies AI in a new way for ocean management by learning those choices from vessel-tracking data and replaying them under realistic “what-if” ocean conditions to establish climate’s causal role. We will use these insights to project changes in port use under future climates and provide policy-ready guidance to support more resilient governance, capacity planning, and food security.

Name | Role | School | Department
James Leape | Main PI | School of Sustainability | Oceans
Sara Constantino | Co-PI | School of Sustainability | Environmental Social Sciences

We propose to prototype a novel AI chip enabling human brain-inspired dendrocentric learning and inference based on spintronics, the electronics of electron spins rather than charges. We will demonstrate stepwise retrieval-augmented generation (RAG) in which the correct order of tokens mirrors the correct spike signals, exploiting the functional parallelism between our dendrocentric AI chip and neuroscience-based RAG: input spikes ~ query, permutation ~ learned fact, dendrite response ~ match. This AI chip could spur a paradigm shift in our AI ecosystem from 'cloud-centered' to 'human-centered' by alleviating privacy concerns, boosting personalization, and dramatically reducing the power consumption of all sorts of AI computing.

Name | Role | School | Department
Shan Wang | Main PI | School of Engineering | Materials Science and Engineering
Kwabena Boahen | Co-PI | School of Engineering | Bioengineering

Humans excel at parsing natural scenes, a capacity thought to reflect visual cortical tuning to the statistical structure of the natural world. In early visual cortex, this principle is well established: quantitative descriptors of low-level image statistics – such as orientation and spatial frequency – have enabled direct tests linking cortical coding to efficient representation of natural images. Extending this account to high-level visual cortex has been limited, however, by the absence of analogous descriptors for the complex statistics that govern natural image structure at larger scales. Generative diffusion models offer a way forward: by learning to reverse noise corruption of natural images, they acquire hypothesis-free, quantitative estimates of image probability without requiring explicit parameterization of high-level statistics. Here, we propose to use a diffusion model's learned natural image probabilities to test whether human perception and high-level visual cortex are calibrated to the same structure. We will compare perceptual judgments and cortical responses for image pairs drawn from high- and low-probability regions of the natural image manifold, matched for equal distances in image space. We predict that both perceptual discriminability and representational distance in ventral temporal cortex will scale with image probability, providing a direct test of efficient coding on the natural image manifold.
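For readers unfamiliar with how a diffusion model yields image probabilities: in score-based formulations (Song et al.'s framework), the trained denoiser approximates the score of the noised data distribution, and the probability-flow ODE converts that score into a likelihood. A sketch of the standard relations follows; this is the general framework, not necessarily the specific estimator the team will use:

```latex
% The trained denoiser approximates the score of the noised data:
\nabla_{x}\log p_t(x) \;\approx\; -\,\frac{\epsilon_\theta(x,t)}{\sigma_t}
% Integrating the probability-flow ODE (drift f, diffusion g) then gives
% an exact log-likelihood for a data point x_0:
\log p_0(x_0) \;=\; \log p_T(x_T)
  \;+\; \int_0^T \nabla\!\cdot\!\Big(f(x_t,t)
  - \tfrac{1}{2}\,g(t)^2\,\nabla_{x}\log p_t(x_t)\Big)\,dt
```

High- and low-probability image pairs can then be drawn by ranking images with this likelihood estimate.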

Name | Role | School | Department
Justin Gardner | Main PI | School of Humanities and Sciences | Psychology
Gordon Wetzstein | Co-PI | School of Engineering | Electrical Engineering

This project is co-funded with the Stanford Center for Digital Health.

Pneumonia is one of the leading causes of death in Ethiopian children under five, yet current systems detect outbreaks only after many children are already sick. In rural areas, oxygen and antibiotics, the most critical treatments for pneumonia, are often scarce, meaning health facilities can easily be caught without life-saving treatment during seasonal surges in pneumonia cases. This project will develop an early warning system that uses artificial intelligence to predict pneumonia spikes before they happen.

By combining health reports from clinics, call center data, and weather patterns, the system will give health officials advance notice of likely outbreaks and their locations. This will allow the Ethiopian Ministry of Health to prepare in advance: sending oxygen, medicines, and health workers to the areas that need them most. The project will be piloted in partnership with the Ethiopian Ministry of Health and directly integrated into their new national “situation room” for real-time health monitoring. The goal is to save children’s lives by giving the health system the tools to act earlier, respond faster, and reach every child who needs care.

Name | Role | School | Department
John Openshaw | Main PI | School of Medicine | Medicine (Infectious Diseases and Geographic Medicine)
Rishi Mediratta | Co-PI | School of Medicine | Pediatrics

This project is co-funded with the Stanford Center for Digital Health.

The human gut microbiome consists of trillions of microbial cells and hundreds of species that are integrated into human biology, impacting health, and in some cases, causing disease. Modulating and measuring this community of microbes holds great promise for developing new therapies and diagnostics for a range of diseases. But this community of microbes is complex, dynamic, and individualized. Therefore, progress toward reaching the biomedical potential to treat, cure, and prevent diseases has been slow. Here we propose to build AI agent co-pilots that can augment human scientists to accelerate important discoveries. Focusing initially on developing microbial therapies for inflammatory bowel disease, these AI agent researchers will be programmed with specialized knowledge and datasets. Human researchers will work with the AI agents and rapidly receive responses to queries. In addition, AI agents will provide hypotheses and suggest experiments that could be run in the lab to test the hypotheses. We envision such AI agent co-pilots as an important step toward enabling researchers to perform more efficient and rigorous science, resulting in robust discoveries that better reflect the mass of accumulating data and knowledge. As an ultimate outcome, we hope to accelerate cures and prevention for a range of gut microbiome-driven diseases.

Name | Role | School | Department
Justin Sonnenburg | Main PI | School of Medicine | Microbiology and Immunology
James Zou | Co-PI | School of Medicine | Biomedical Data Science

Courts across East Africa face fragmented legal systems and limited access to digitally searchable case law. These constraints leave judges without the resources they need to make timely, well-supported decisions, ultimately harming due process and rule of law. This project proposes a collaboration to benchmark, roll out, and rigorously evaluate a platform for legal research that integrates semantic search of continental case law, automated summarization, and a tailored precedent chatbot. The system was built in collaboration with more than 75 judges in Rwanda and Kenya by a Stanford-originated startup (Hakimu), promising to leapfrog generations of legal search technology.

The team will design an AI training program for East African judges; develop a collaborative and ecologically valid benchmark suite; and carry out a sandboxed randomized evaluation with more than 250 sitting judges in Kenya to assess the effect on quality and efficiency of decision making. The findings will inform how Hakimu is refined and adopted across African judiciaries, while also providing a broader model for responsible AI evaluation by demonstrating how high-stakes decision-support systems should be assessed before deployment.

Name | Role | School | Department
Daniel Ho | Main PI | School of Law | Law School

Proteins explore many shapes that underpin their function, yet traditional crystallography and today’s AI predictors typically yield standalone structures that are detached from experimental evidence. We propose a scientist-in-the-loop framework that guides state-of-the-art generative structure models with crystallographic measurements, bridging the gap between predicting structures and solving structures from experimental data. In addition to accelerating structure determination, we aim to produce data-consistent ensembles that better capture conformational variability than conventional single structures. This ensemble-centric paradigm reflects a long-term vision of the structural biology field and will provide a foundation for applications such as drug discovery and synthetic biology.

Name | Role | School | Department
Gordon Wetzstein | Main PI | School of Engineering | Electrical Engineering

Have you ever struggled to give someone directions? Explaining how to get from A to B is surprisingly complex – you need to anticipate what the listener will see, what they'll remember, and where they might get confused. This research asks what makes a navigation explanation good, and how to build computational models that generate effective guidance the way humans do: by reasoning about the listener. 

Rather than directly training an AI system to follow directions, we’re building a speaker model that generates helpful directions by simulating where a listener would get confused and adjusting what it communicates based on the situation. We will assess whether AI-generated explanations can match human-quality guidance using a novel experimental paradigm where a speaker with full knowledge of the environment must guide a listener who has only a limited view, pairing both human and AI speakers with human and AI listeners. 

This work bridges cognitive science research on explanation and spatial reasoning with human-AI collaboration, and has practical implications for making navigation systems more accessible and adaptive. We will release our collected data as a public benchmark, Open Navigation Dialogues, to support future research on explanation generation.

Name | Role | School | Department
Tobias Gerstenberg | Main PI | School of Humanities and Sciences | Psychology
Robert Hawkins | Co-PI | School of Humanities and Sciences | Linguistics

Humans have long observed the cosmos, seeking to understand the physics of the universe by interpreting the light we collect in our telescopes. We now have the ability to measure light from galaxies across a broad range of the electromagnetic spectrum, using a variety of space- and ground-based telescopes, each measuring the sky at different resolutions and with different instrumental effects. Because of the different character of data from different telescopes, it has not yet been possible to do scientific inference that truly uses the richness of our data. We have assembled an interdisciplinary team of researchers, including astrophysicists and computer scientists, to chart a new path toward using all of our data to understand the physics of galaxies. We propose to pilot a foundation model for galaxies, intentionally designed for interpretability, to build human trust and understanding in the model and its application to discovery.

Name | Role | School | Department
Susan Clark | Main PI | School of Humanities and Sciences | Physics
Surya Ganguli | Co-PI | School of Humanities and Sciences | Applied Physics
Risa Wechsler | Co-PI | School of Humanities and Sciences | Physics
Gordon Wetzstein | Co-PI | School of Engineering | Electrical Engineering

The discourse surrounding AI has always been religious: Peter Thiel barnstorms lectures on the Apocalypse and the Antichrist. Marc Andreessen maintains a techno-saviorist view in his lectures and writing on AI, asserting that technology can liberate the human soul and spirit while scorning the search for AGI as a search for God. Sam Altman quips that the most successful founders set out to establish a religion, not merely a company. Garry Tan advocates that Christianity can provide the spiritual and moral tenets to right the wayward technoculture of Silicon Valley. It is a perfect rhetorical setup: powerful machines, extreme claims, and apocalyptic angst.

Our project will study the religious convictions at the heart of this phenomenon, using (1) a mixed-methods analysis of dominant AI discourses and (2) ethnographic field work with communities of AI users. We will begin with an exploration of the legitimating power of religious language regarding AI, asking: what cultural work does religious language perform for proponents and critics of AI alike? We will follow up this discourse analysis with ethnographic investigations across four fieldwork sites to understand how users' experiences with AI fit with its public-facing religious rhetoric.

Focusing on users will hone our understanding of how religious language and concepts translate into users’ experiences and, perhaps more importantly, how they do not. This comparison of AI’s public representation and private logics will provide key insights into the pervasiveness of its deep religious discourse. It will help explain how religious thinking fuels impressions of the technology’s power to shape the real and imagined worlds of users, reveal the limitations and risks of endowing technology with divine power, and illuminate a path to a more ethical AI.

Name | Role | School | Department
Ari Kelman | Main PI | Graduate School of Education | Graduate School of Education
John Willinsky | Co-PI | Graduate School of Education | Graduate School of Education

The development and discovery of small molecule therapeutics to treat diseases like cancer requires balancing selectivity, efficacy, safety, and synthesizability. As generative models for chemistry become increasingly widely used for molecular optimization, large numbers of predicted compounds must be triaged for the aforementioned properties, a process that is tedious and error-prone for medicinal chemists. We plan to evaluate and improve AI agents that initially screen and evaluate molecules predicted to have high binding affinity by our generative molecular design tools.

Name | Role | School | Department
Grant Rotskoff | Main PI | School of Humanities and Sciences | Chemistry
Nathanael Gray | Co-PI | School of Humanities and Sciences | Chemical and Systems Biology Operations

Name | Role | School | Department
Ron Dror | Main PI | School of Engineering | Computer Science
Wah Chiu | Co-PI | School of Engineering | Bioengineering
Gordon Wetzstein | Co-PI | School of Engineering | Electrical Engineering

How can an LLM which has been custom built for a college course be wisely and safely incorporated into that course, to maximize learning and engagement, and to avoid gratuitous and unproductive uses of the tool?

We have built VHIL-E, the Virtual Human Interaction Lab’s Expert, a large language model (LLM) representing the lab’s research, outreach, and teaching, focused on the psychological aspects of virtual and augmented reality. Collected materials consist of approximately 2.3 million words, broken into approximately 10,000 chunks stored in an embedded index. Materials came from two books written about the lab’s research, 261 journal articles by VHIL scholars, 57 transcriptions of keynote addresses by lab members, 14 dissertations from lab students, 87 dedicated news articles about the lab’s research, 14 recorded question/answer lectures from multiple iterations of the course, and text from 17 lecture slides from a recent course year. In our first preliminary test, for open-ended responses, VHIL-E outperformed the baseline GPT model when replying to actual questions from journalists. Moreover, the baseline model hallucinated in five percent of replies, while VHIL-E hallucinated in only one percent. VHIL-E also performed slightly better (by 3 percentage points) than the baseline on a 232-question multiple-choice exam.
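The embedded chunk index described above can be sketched in miniature as follows. Real systems use learned neural text embeddings; this toy version substitutes bag-of-words vectors so the example is self-contained, and the chunk texts are invented.

```python
import math
import re
from collections import Counter

# Stand-in "embedding": bag-of-words token counts. A production system
# would replace this with a learned text-embedding model.
def embed(text: str) -> Counter:
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Invented example chunks; the real index holds ~10,000 of these.
chunks = [
    "presence in virtual reality increases with tracked body movement",
    "augmented reality overlays digital content on the physical world",
]
index = [(text, embed(text)) for text in chunks]

def retrieve(query: str, k: int = 1):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)[:k]

best = retrieve("what increases presence in virtual reality")[0][0]
```

Retrieved chunks are then placed in the LLM's context so that answers are grounded in the lab's own materials, which is what suppresses hallucination relative to the baseline model.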

Embedded within the Virtual People class (Fall 2025 and 2026), we will investigate how VHIL-E can be used to assist students with designing VR experiences that adhere to research-based best practices. Some students will use VHIL-E to iterate on their own VR design blueprints (VHIL-E: augmentation), whereas other students will use VHIL-E to create the VR design blueprint for them (VHIL-E: replacement). The control group will not use LLMs at all. This implementation isolates key augmentation-related instructional features, including immediate feedback and adaptivity (known from intelligent tutoring systems), and draws on prior work on AI-enabled co-creation and learner self-transcendence. As a measure of learning transfer, we will assess students’ abilities to bring their VR blueprints to life during an actual VR prototyping task and, finally, present their work during class. This longitudinal study design, which unfolds over 8 weeks, addresses three threats to the validity of educational LLM research by 1) deliberately centering on an instructional method, 2) having an adequate control condition, and 3) using a strong learning outcome measure. By investigating how a custom-built LLM can augment learners, our project closely aligns with HAI’s mission to create a framework for human-centered AI research.

Name | Role | School | Department
Jeremy Bailenson | Main PI | Graduate School of Education | Graduate School of Education
Dan Schwartz | Co-PI | Graduate School of Education | Graduate School of Education

Social scientists regularly test interventions – like programs to increase voter turnout, improve health behaviors, or reduce poverty – through expensive field experiments that can cost hundreds of thousands of dollars and take years to complete. Many of these interventions don't work, however, wasting precious time and resources that could have been used to help people. This research project aims to solve this problem by using artificial intelligence to predict which social interventions will be successful before researchers invest in costly real-world trials.

Our prior research has already shown that large language models (like GPT-4) can accurately forecast the results of survey experiments and even outperform human experts at predicting intervention outcomes. In this project, we seek to build on this work by expanding our archive of field experiment results, developing a new method of prediction using 'expert agents' (AI models that simulate and aggregate predictions from different types of social science experts), and collecting expert forecasts against which we can benchmark our new prediction tool. We will also build a user-friendly online platform where researchers can input study designs and receive predictions, helping researchers focus their limited resources on the most promising interventions and democratizing access to intervention research. Most importantly, this research can accelerate progress on society's biggest challenges by helping identify the most effective solutions faster and more efficiently than ever before.

Name | Role | School | Department
Robb Willer | Main PI | Graduate School of Education | Graduate School of Education

Large Language Models (LLMs) are sycophantic, excessively agreeing with users even when they are wrong. In this proposal we suggest that LLMs are subject to a deeper problem that goes beyond factual agreement: they are socially sycophantic, flattering and excessively affirming users, even when users are delusional or propose to harm themselves or others, and can even draw users into emotional dependency. Social sycophancy has clear risks to users and to society, but neither the prevalence of social sycophancy nor its consequences have been empirically measured. To address this gap, we bring together insights from sociolinguistics, natural language processing, human-computer interaction, and psychology toward three goals: (1) To characterize the phenomenon of social sycophancy, by measuring aspects like emotional validation and action endorsement across different large language models and in different languages and cultures; (2) To evaluate the impacts of social sycophancy on users, via human-subject experiments in which participants discuss social dilemmas with sycophantic (or non-sycophantic) LLMs, and we then measure how the different LLM responses affect participants’ behaviors and perceptions of AI; (3) To develop methods to mitigate social sycophancy, by aligning LLMs to focus more on longer-term consequences and not just immediate rewards, and by inoculating users against its impacts. Our proposed work will bridge disciplines to provide a conceptual and empirical foundation for understanding social sycophancy and contribute toward building safer, more trustworthy LLMs.

Name | Role | School | Department
Dan Jurafsky | Main PI | School of Humanities and Sciences | Linguistics

Many teachers want local, inquiry-based lessons, but lack the time, tools, and GIS skills to turn open data into classroom-ready activities. Students, especially in under-resourced schools, miss chances to build spatial thinking and do real civic inquiry. PlaceEd Co-Pilot is a limited-purpose, teacher-only AI that helps educators create place-based, map-driven lessons in minutes. It pulls curated civic and environmental layers, builds activities in a clear structure (objectives, launch, evidence steps, quick checks), explains each choice in plain language, and offers easy adaptations for different learners and low-tech settings. Students never use the tool; teachers review all materials before class.

We will run a one-year, mixed-methods study with 30 California middle school teachers (across math, science, and social studies), primarily from LAUSD, SFUSD, and SDUSD. The work includes a spring prototype, summer co-design, and fall classroom use. We ask three questions:

Efficiency: does the co-pilot boost successful lesson creation and cut prep time for low-GIS teachers?

Usability: how useful and pedagogically valuable do teachers find the map-based activities?

Co-Design: how does teacher feedback improve templates, explanations, and guardrails?

Our goal is more classrooms running rigorous local data lessons with clear evidence paths and much less prep.

Name | Role | School | Department
Bryan Brown | Main PI | Graduate School of Education | Graduate School of Education

Large language models (LLMs) can adapt their responses to individual users, but today’s methods for evaluating such personalization often rely on shallow demographic categories or task-specific histories. Our project introduces a benchmark rooted in psychology theory and empirical data. We build synthetic personas from networks of beliefs, attitudes, values, and goals, constructs that shape how people judge information. By comparing LLM outputs to these psychological networks, we can measure how well models personalize in ways that matter for real people. This approach avoids the pitfalls of stereotypes and increases transparency about why an output counts as personalized.
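One way to picture the comparison the benchmark describes is to reduce a persona's network of beliefs, attitudes, values, and goals to weighted stance scores and measure how closely a model output's stances align with them. The sketch below is purely illustrative: the constructs, weights, and the cosine-similarity metric are assumptions, not the project's actual benchmark.

```python
# Illustrative sketch: score how well a model output's stances match a
# synthetic persona's belief network, both reduced here to weighted
# stance dictionaries. All constructs and weights are invented.
import math

persona = {"climate_action": 0.9, "tech_optimism": -0.2, "privacy": 0.7}
output_stances = {"climate_action": 0.8, "tech_optimism": 0.1, "privacy": 0.6}

def alignment(p, o):
    """Cosine similarity over the constructs the two profiles share."""
    keys = p.keys() & o.keys()
    dot = sum(p[k] * o[k] for k in keys)
    norm_p = math.sqrt(sum(p[k] ** 2 for k in keys))
    norm_o = math.sqrt(sum(o[k] ** 2 for k in keys))
    return dot / (norm_p * norm_o)

print(round(alignment(persona, output_stances), 3))
```

A network-based benchmark would go further than this flat vector, e.g. by weighting constructs by their centrality in the persona's belief network.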

Name | Role | School | Department
Nicholas Haber | Main PI | Graduate School of Education | Graduate School of Education
Carol Dweck | Co-PI | School of Humanities and Sciences | Psychology

Dairy farms increasingly rely on AI systems that monitor cow health and activity, predict injuries, illness, and breeding cycles, and recommend treatments. These systems promise to improve productivity, animal welfare, and decision-making by learning from data generated through sensors, activity trackers, and robotic milking systems. However, despite their predictive capabilities, many AI systems cannot learn from their mistakes or improve their accuracy over time. Realizing this promise depends on closing AI learning loops by integrating farmers’ interventions and expertise as feedback and ground truth labels into AI models. AI systems generate predictions and alert farmers to potential problems, such as a cow showing signs of mastitis, inviting farmers to investigate and intervene. However, these predictions are not always correct; farmers’ interpretations of AI predictions remain inaccessible, and records of their interventions are often entered into legacy herd management software that remains disconnected from the AI system, preventing AI models from learning through human feedback. In this sense, ground truth is lost in the void. Our project sheds light on how AI developers attempt to bridge this ground truth void through their design choices and by engaging dairy farmers in new forms of data work. Through an ethnographic study of leading dairy technology providers, we investigate how AI developers design user interfaces and encourage farmers to share their interpretations of AI predictions and translate their interventions into training data for supervised learning. By uncovering how continuous feedback loops between AI systems and domain experts are achieved in practice, our findings will inform and guide the design of AI systems and organizational workflows that keep AI development grounded in domain expertise and support sustained model improvement over time. 

Name | Role | School | Department
Pamela Hinds | Main PI | School of Engineering | Management Science

Recent work shows that social scientists are both excited and cautious about the role of large language models (LLMs) in research. LLMs already assist with tasks like literature review, data labeling, survey design, programming, and writing, and are expected to further scientific discovery through hypothesis generation, experimental design, and simulation. Some studies show that LLMs can simulate human opinions and behaviors with surprising accuracy, sometimes outperforming traditional theories, raising important questions about their future utility in the social sciences. This project investigates whether predictable scaling laws, commonly seen in LLM benchmarks, also apply to social science tasks. Because such tasks are not typically prioritized in LLM training, they may exhibit different scaling dynamics, including diminishing or even inverse returns. Cultural ambiguity, representational gaps in training data, and domain-specific knowledge may limit performance gains from model size alone. Understanding these dynamics can help social scientists gauge when and how LLMs can be reliably used, and when domain-specific models may be necessary, which helps lay the groundwork for future interdisciplinary collaboration.
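A scaling law of the kind the project would test is typically a power law, error ≈ a · N^(−b), fit in log-log space to (model size, task error) pairs. The sketch below shows that fit on invented numbers; the data points and fitted constants are illustrative assumptions, not results from this project.

```python
# Sketch: fit a power-law scaling curve, error ~ a * N^(-b), to
# hypothetical (parameter count, task error) pairs via least squares
# in log-log space. All numbers are invented for illustration.
import math

sizes = [1e8, 1e9, 1e10, 1e11]      # parameter counts (hypothetical)
errors = [0.42, 0.31, 0.24, 0.19]   # task error rates (hypothetical)

# A power law is linear in log-log space: log(err) = log(a) - b*log(N).
xs = [math.log(n) for n in sizes]
ys = [math.log(e) for e in errors]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
a, b = math.exp(my - slope * mx), -slope

print(f"fitted: error ~ {a:.2f} * N^(-{b:.3f})")
```

Diminishing or inverse returns on social science tasks would show up as a small, zero, or negative fitted exponent b.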

Name | Role | School | Department
Diyi Yang | Main PI | School of Engineering | Computer Science
David Grusky | Co-PI | School of Humanities and Sciences | Sociology
Tatsunori Hashimoto | Co-PI | School of Engineering | Computer Science

This proposal seeks funding for an AI-powered framework that reinvents how scientists create software for complex physical simulations, such as climate modeling.

Currently, developing this essential software, known as partial differential equation (PDE) solvers, is a major bottleneck, requiring immense computational power and specialized expertise. While Large Language Models (LLMs) can help automate this process, existing methods are inefficient, often relying on computationally expensive, brute-force approaches. We propose a collaborative, 'scientist-in-the-loop' workflow where an AI acts as an intelligent partner to a researcher. This process has three stages:

Analysis: The AI performs a mathematical analysis of the scientific problem.

Genesis: It then generates a small number of initial software solutions.

Synthesis: Finally, a team of AI judges refines these solutions in a tournament-style process, with the scientist providing goals, feedback, and expert guidance to 'nudge the judge' toward the best outcome.

This human-centered approach promises to drastically cut computational costs and carbon emissions while producing more accurate, reliable solvers tailored to specific scientific needs. By transforming the AI from a simple code generator into a collaborative partner, this framework will empower researchers, accelerate discovery, and make high-performance computing more accessible to a broader scientific community.
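The Synthesis stage's tournament can be pictured as repeated selection by a scoring judge. In the sketch below, `score` is a stand-in for the ensemble of AI judges plus scientist feedback, and the candidate solvers are opaque strings; everything here is an illustrative assumption, not the proposed system.

```python
# Sketch of a tournament-style refinement round. `score` is a
# placeholder for the AI-judge ensemble described in the proposal;
# candidates are opaque strings standing in for generated solvers.

def score(candidate):
    # Hypothetical judge: prefers candidates of length 10. A real judge
    # would evaluate solver accuracy, stability, and cost.
    return -abs(len(candidate) - 10)

def tournament_round(candidates, keep=2):
    """Keep the top-`keep` candidates by judge score."""
    return sorted(candidates, key=score, reverse=True)[:keep]

pool = ["solver_v" + "x" * i for i in range(1, 8)]
survivors = tournament_round(pool)
print(survivors)
```

In the proposed workflow, the scientist would 'nudge the judge' between rounds by adjusting goals and feedback, and the survivors would seed the next generation of candidates.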

Name | Role | School | Department
Madeleine Udell | Main PI | School of Engineering | Management Science and Engineering

We introduce STEPS, a human-centered AI tutor designed to help introductory chemistry students learn how to tackle complex, real-world problems. Many STEM courses emphasize right answers over the process of problem solving, leaving students with little guidance on key steps such as defining a problem, making reasonable assumptions, monitoring progress, and checking results. STEPS augments, rather than replaces, teaching by doing three things: it helps instructors generate authentic, course-aligned problems that are appropriately scaffolded; it guides students through each step of the problem-solving process using effective tutor behaviors; and it provides clear, individualized feedback to both students and instructors about where students are succeeding and where they need support. In partnership with the Chemistry Departments at Stanford and Ohio State University, STEPS will be embedded in weekly assignments and reinforced in lectures, quizzes, and reflections. We will evaluate its impact using mixed methods, examining plan quality, solution accuracy, transfer to later assessments, engagement, and student confidence. Developed through participatory design and with attention to transparency and equity, STEPS aims to promote deeper learning while avoiding pitfalls of unstructured AI use. The project will produce shareable tools, prompt templates, rubrics, analytics, and implementation guides, and evidence about when AI tutoring helps or hinders problem solving.

Name | Role | School | Department
Shima Salehi | Main PI | Graduate School of Education | Graduate School of Education
Jennifer Schwartz Poehlmann | Co-PI | School of Humanities and Sciences | Chemistry

The discovery of cancer drug resistance mechanisms through RNA-seq analysis is hindered by the immense scale and complexity of experimental data and biomedical literature. Current language models, which rely on embedding-similarity search, struggle with specialized biomedical content and are unable to process long documents due to restricted context windows.

This project addresses these challenges by developing a next-generation AI-powered assistant for technical research. Our system constructs custom knowledge graphs to comprehensively index scientific publications, organizing large collections of complex medical documents into a curated, structured database for advanced reasoning. As the assistant analyzes literature, it uncovers new leads and connections, enabling iterative, in-depth investigation. The system will support researchers by providing precise information, linking experimental findings to resistance pathways, rigorously prioritizing gene candidates by biological plausibility, and suggesting evidence-based follow-up experiments. This project aims to accelerate the identification of drug resistance mechanisms and set new benchmarks for AI research tools, fostering more effective and reliable scientific inquiry in partnership with human experts.
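At its simplest, a knowledge graph of this kind stores subject-relation-object facts with their source publications and supports multi-hop traversal from a gene toward resistance pathways. The sketch below uses an adjacency list over invented entities and paper IDs; it is a minimal illustration of the indexing idea, not the project's system.

```python
# Minimal knowledge-graph sketch: facts are (relation, object, source)
# triples indexed by subject. All gene, pathway, and paper names are
# invented for illustration.
from collections import defaultdict

edges = defaultdict(list)

def add_fact(subject, relation, obj, source_paper):
    edges[subject].append((relation, obj, source_paper))

add_fact("GENE_A", "upregulated_in", "resistant_line_1", "paper_001")
add_fact("GENE_A", "member_of", "pathway_X", "paper_002")
add_fact("pathway_X", "implicated_in", "drug_resistance", "paper_003")

def trace(entity, depth=0):
    """Follow outgoing edges to surface multi-hop leads with citations."""
    for relation, obj, src in edges.get(entity, []):
        print("  " * depth + f"{entity} -{relation}-> {obj} [{src}]")
        trace(obj, depth + 1)

trace("GENE_A")
```

Keeping the source paper on every edge is what lets the assistant link a prioritized gene candidate back to the evidence supporting it.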

Name | Role | School | Department
James Ford | Main PI | School of Medicine | Medicine - Med/Oncology
Monica Lam | Co-PI | School of Engineering | Computer Science

Blue whales, the largest animals on Earth, play a vital role in maintaining healthy ocean ecosystems. They communicate across vast distances using complex songs, but scientists still know little about what these songs mean or how they relate to whale behavior. Our project brings together experts in computer science and ocean science to explore this problem. We will combine neuro-symbolic AI systems with animal-borne sensor data to study the syntax and functions of blue whale songs. By breaking down songs into their basic components and examining how these components are combined, we aim to predict behaviors such as when and where whales feed, and to uncover how communication patterns vary across groups and environments. This research will not only advance our understanding of whale communication, but also support conservation efforts by providing new insights into how blue whales interact with each other and their changing ocean habitat.

Name | Role | School | Department
Jiajun Wu | Main PI | School of Engineering | Computer Science
Jeremy Goldbogen | Co-PI | School of Sustainability | Oceans

We propose to transform human-AI codesign by integrating generative artificial intelligence, physics simulation, and human-machine interfaces, with a focus on discovering novel continuum robot mechanisms. Continuum robots, with their elastic, infinite-degree-of-freedom structures, enable applications ranging from minimally invasive surgery to inspection and 3D printing, but their mechanisms are often unintuitive, and innovation has historically been slow. Our approach addresses three key human design bottlenecks: (1) limited exploration of the design space, (2) costly and inefficient testing of design candidates, and (3) difficulty in reimagining new objectives for unconventional mechanisms. We leverage diffusion-based generative models guided by differentiable fitness functions to explore and iteratively refine highly expressive design spaces, combined with virtual and physical prototyping to facilitate human interaction and evaluation. By also solving inverse design problems that map novel mechanisms to potential tasks, we aim to accelerate innovative leaps. This interdisciplinary effort combines expertise in mechanical design, continuum robotics, human cognition, and generative AI, and seeks to establish a generalizable framework for human-AI collaboration in creative engineering.

Name | Role | School | Department
Allison Okamura | Main PI | School of Engineering | Computer Science
Judith Fan | Co-PI | School of Humanities and Sciences | Psychology
Shuran Song | Co-PI | School of Engineering | Electrical Engineering