Policy Brief

Simulating Human Behavior with AI Agents

Date: May 20, 2025
Topics: Generative AI
Abstract

This brief introduces a generative AI agent architecture that can simulate the attitudes of more than 1,000 real people in response to major social science survey questions.

Key Takeaways

  • Simulating human attitudes and behaviors could enable researchers to test interventions and theories and gain real-world insights.

  • We built an AI agent architecture that can simulate real people in ways far more complex than traditional approaches. Using this architecture, we created generative agents that simulate 1,000 individuals, each using an LLM paired with an in-depth interview transcript of the individual.

  • To test these generative agents, we evaluated the agents’ responses against the corresponding person’s responses to major social science surveys and experiments. We found that the agents replicated real participants’ responses 85% as accurately as the individuals replicated their own answers two weeks later on the General Social Survey.

  • Because these generative agents hold sensitive data and can mimic individual behavior, policymakers and researchers must work together to ensure that appropriate monitoring and consent mechanisms are used to help mitigate risks while also harnessing potential benefits.

Executive Summary

AI agents have been gaining widespread attention among the general public as AI systems that can pursue complex goals and directly take actions in both virtual and real-world environments. Today, people can use AI agents to make payments, reserve flights, and place grocery orders for them, and there is great excitement about the potential for AI agents to manage even more sophisticated tasks.

However, a different type of AI agent—a simulation of human behaviors and attitudes—is also on the rise. These simulation AI agents aim to help researchers ask “what if” questions about how people might respond to a range of social, political, or informational contexts. If these agents achieve high accuracy, they could enable researchers to test a broad set of interventions and theories, such as how people would react to new public health messages, product launches, or major economic or political shocks. Across economics, sociology, organizations, and political science, new ways of simulating individual behavior—and the behavior of groups of individuals—could help expand our understanding of social interactions, institutions, and networks. While work on these kinds of agents is progressing, current architectures still have some distance to cover before they can be used reliably.

In our paper, “Generative Agent Simulations of 1,000 People,” we introduce an AI agent architecture that simulates more than 1,000 real people. The agent architecture—built by combining the transcripts of two-hour, qualitative interviews with a large language model (LLM) and scored against social science benchmarks—successfully replicated real individuals’ responses to survey questions 85% as accurately as participants replicate their own answers across surveys staggered two weeks apart. The generative agents performed comparably in predicting people’s personality traits and experiment outcomes and were less biased than previously used simulation tools.

This architecture underscores the benefits of using generative agents as a research tool to glean new insights into real-world individual behavior. However, researchers and policymakers must also mitigate the risks of using generative agents in such contexts, including harms related to over-reliance on agents, privacy, and reputation.

Introduction

Simulations in which agents are used to model the behaviors and interactions of individuals have been a popular tool for empirical social research for years, even before the emergence of AI agents. Traditional approaches to building agent architectures, such as agent-based models or game theory, rely on clear sets of rules and environments manually specified by the researchers. While these rules make it relatively easy to interpret results, they also limit the contexts in which traditional agents can act while oversimplifying the real-life complexity of human behavior. This, in turn, can limit the generalizability and accuracy of the simulation results.

Generative AI models offer the opportunity to build general-purpose agents that can simulate behaviors across a variety of contexts. To create simulations that better reflect the myriad, often idiosyncratic factors that influence individuals’ attitudes, beliefs, and behaviors, we built a novel generative agent architecture that combines LLMs with in-depth interviews with real individuals.

We recruited 1,052 individuals—representative of the U.S. population across age, gender, race, region, education, and political ideology—to participate in two-hour qualitative interviews. These in-depth interviews, which included both pre-specified questions and adaptive follow-up questions, are a foundational social science method that researchers have successfully used to predict life outcomes beyond what could be learned from traditional surveys and demographic instruments. We also developed an AI interviewer that asked participants questions based on a semi-structured interview protocol from the American Voices Project, with topics ranging from life stories to people’s views on current social issues.
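
As an illustration of how such an interviewer might be structured, the sketch below asks pre-specified protocol questions and uses an LLM to generate adaptive follow-ups. It is a minimal sketch under stated assumptions, not the interviewer used in the study; the ask_llm helper and the protocol questions shown are hypothetical placeholders.

```python
# A minimal sketch of a semi-structured AI interviewer, assuming a hypothetical
# ask_llm(prompt: str) -> str helper that wraps whatever LLM API is available.
# The protocol questions are illustrative stand-ins, not the American Voices
# Project's actual wording.

PROTOCOL_QUESTIONS = [
    "Tell me the story of your life, starting wherever you like.",
    "How do you feel about the direction your community is heading?",
]

def run_interview(ask_llm, max_followups: int = 2) -> list[dict]:
    """Ask each protocol question, then LLM-generated adaptive follow-ups."""
    transcript = []
    for question in PROTOCOL_QUESTIONS:
        answer = input(f"{question}\n> ")  # participant's free-text response
        transcript += [{"role": "interviewer", "text": question},
                       {"role": "participant", "text": answer}]
        for _ in range(max_followups):
            history = "\n".join(f"{t['role']}: {t['text']}" for t in transcript)
            followup = ask_llm(
                "You are a qualitative interviewer. Given the interview so far, "
                "ask one brief follow-up question probing the last answer.\n\n"
                + history
            )
            answer = input(f"{followup}\n> ")
            transcript += [{"role": "interviewer", "text": followup},
                           {"role": "participant", "text": answer}]
    return transcript
```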

Then, we built the generative agents based on participants’ full interview transcripts and an LLM. When a generative agent was queried, the full transcript was injected into the model prompt, which instructed the model to imitate the relevant individual when responding to questions, including forced-choice prompts, surveys, and multi-stage interactional settings.
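
The querying step can be sketched briefly: the full interview transcript is placed in the prompt together with an instruction to answer as that person would. This is an assumed reconstruction of the idea, not the paper's code; ask_llm is the same hypothetical helper as above.

```python
# A minimal sketch of querying a transcript-grounded generative agent, using
# the same hypothetical ask_llm helper; an illustration of the idea, not the
# paper's implementation.

def agent_answer(ask_llm, transcript: list[dict], question: str,
                 choices: list[str] | None = None) -> str:
    """Answer a survey question as the interviewed person would."""
    interview_text = "\n".join(f"{t['role']}: {t['text']}" for t in transcript)
    prompt = (
        "Below is an interview with a real person. Answer the question that "
        "follows exactly as that person would, in their voice.\n\n"
        f"--- Interview transcript ---\n{interview_text}\n\n"
        f"Question: {question}\n"
    )
    if choices:  # forced-choice survey items
        prompt += "Respond with exactly one of: " + ", ".join(choices) + "\n"
    return ask_llm(prompt).strip()

# Example forced-choice query in the style of a GSS item:
# agent_answer(ask_llm, transcript,
#              "Generally speaking, would you say most people can be trusted?",
#              ["Can trust", "Cannot trust", "Depends"])
```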

Once the generative agents were in place, we evaluated them on their ability to predict participants’ responses to common social science surveys and experiments, which the participants completed after their in-depth interviews. We tested on the core module of the General Social Survey (widely used to assess survey respondents’ demographic backgrounds, behaviors, attitudes, and beliefs); the 44-item Big Five Inventory (designed to assess an individual’s personality); five well-known behavioral economic games (the dictator game, first and second player trust games, public goods game, and prisoner’s dilemma); and five social science experiments with control and treatment conditions. For the General Social Survey (which has categorical responses), we measured accuracy and correlation based on whether the agent selects the same survey response as the person. For the Big Five Inventory and the economic games (which have continuous responses), we assessed accuracy and correlation using mean absolute error.
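
To make the evaluation logic concrete, the sketch below computes agreement on categorical items, normalizes it by the participant's own two-week test-retest consistency (the basis of the 85% figure reported above), and uses mean absolute error for continuous responses. It is an assumed illustration, not the paper's evaluation code.

```python
# A minimal sketch of the normalized-accuracy idea (an assumed reconstruction,
# not the paper's evaluation code): the agent's agreement with a participant is
# divided by the participant's agreement with themselves on the same survey
# taken two weeks later, so 1.0 means the agent predicts the person as well as
# the person predicts themselves. Numbers in the final comment are illustrative.

import numpy as np

def agreement(a: list, b: list) -> float:
    """Fraction of categorical items on which two response vectors match."""
    return float(np.mean([x == y for x, y in zip(a, b)]))

def normalized_accuracy(agent_resps: list, wave1: list, wave2: list) -> float:
    """Agent-vs-person agreement, normalized by the person's own consistency."""
    return agreement(agent_resps, wave1) / agreement(wave2, wave1)

def mean_absolute_error(pred, true) -> float:
    """Error measure for continuous responses (Big Five, economic games)."""
    return float(np.mean(np.abs(np.asarray(pred) - np.asarray(true))))

# Illustrative only: if the agent matches 68% of a participant's wave-1 GSS
# answers and the participant matches 80% of their own answers two weeks later,
# normalized accuracy is 0.68 / 0.80 = 0.85.
```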

Authors
  • Joon Sung Park
  • Carolyn Q. Zou
  • Aaron Shaw
  • Benjamin Mako Hill
  • Carrie J. Cai
  • Meredith Ringel Morris
  • Robb Willer
  • Percy Liang
  • Michael S. Bernstein

Related Publications

Demographic Stereotypes in Text-to-Image Generation
Federico Bianchi, Pratyusha Kalluri, Esin Durmus, Faisal Ladhak, Myra Cheng, Debora Nozza, Tatsunori Hashimoto, Dan Jurafsky, James Zou, Aylin Caliskan
Policy Brief | Nov 30, 2023 | Topics: Generative AI; Foundation Models; Ethics, Equity, Inclusion

In this brief, Stanford scholars test a variety of ordinary text prompts to examine how major text-to-image AI models encode a wide range of dangerous biases about demographic groups.

Policy Implications of DeepSeek AI’s Talent Base
Amy Zegart, Emerson Johnston
Policy Brief | Quick Read | May 06, 2025 | Topics: International Affairs, International Security, International Development; Foundation Models; Workforce, Labor

This brief presents an analysis of Chinese AI startup DeepSeek’s talent base and calls for U.S. policymakers to reinvest in competing to attract and retain global AI talent.

Mind the (Language) Gap: Mapping the Challenges of LLM Development in Low-Resource Language Contexts
Juan Pava, Haifa Badi Uz Zaman, Caroline Meinhardt, Toni Friedman, Sang T. Truong, Daniel Zhang, Elena Cryst, Vukosi Marivate, Sanmi Koyejo
White Paper | Deep Dive | Apr 22, 2025 | Topics: International Affairs, International Security, International Development; Natural Language Processing

This white paper maps the LLM development landscape for low-resource languages, highlighting challenges, trade-offs, and strategies to increase investment; prioritize cross-disciplinary, community-driven development; and ensure fair data ownership.

Response to OSTP’s Request for Information on the Development of an AI Action Plan
Caroline Meinhardt, Daniel Zhang, Rishi Bommasani, Jennifer King, Russell Wald, Percy Liang, Daniel E. Ho
Response to Request | Mar 17, 2025 | Topics: Regulation, Policy, Governance

In this response to a request for information issued by the National Science Foundation’s Networking and Information Technology Research and Development National Coordination Office (on behalf of the Office of Science and Technology Policy), scholars from Stanford HAI, CRFM, and RegLab urge policymakers to prioritize four areas of policy action in their AI Action Plan: 1) Promote open innovation as a strategic advantage for U.S. competitiveness; 2) Maintain U.S. AI leadership by promoting scientific innovation; 3) Craft evidence-based AI policy that protects Americans without stifling innovation; 4) Empower government leaders with resources and technical expertise to ensure a “whole-of-government” approach to AI governance.