Stanford
University
  • Stanford Home
  • Maps & Directions
  • Search Stanford
  • Emergency Info
  • Terms of Use
  • Privacy
  • Copyright
  • Trademarks
  • Non-Discrimination
  • Accessibility
© Stanford University.  Stanford, California 94305.
Demographic Stereotypes in Text-to-Image Generation | Stanford HAI

Stay Up To Date

Get the latest news, advances in research, policy work, and education program updates from HAI in your inbox weekly.

Sign Up For Latest News

Navigate
  • About
  • Events
  • Careers
  • Search
Participate
  • Get Involved
  • Support HAI
  • Contact Us
Skip to content
  • About

    • About
    • People
    • Get Involved with HAI
    • Support HAI
    • Subscribe to Email
  • Research

    • Research
    • Fellowship Programs
    • Grants
    • Student Affinity Groups
    • Centers & Labs
    • Research Publications
    • Research Partners
  • Education

    • Education
    • Executive and Professional Education
    • Government and Policymakers
    • K-12
    • Stanford Students
  • Policy

    • Policy
    • Policy Publications
    • Policymaker Education
    • Student Opportunities
  • AI Index

    • AI Index
    • AI Index Report
    • Global Vibrancy Tool
    • People
  • News
  • Events
  • Industry
  • Centers & Labs
policyPolicy Brief

Demographic Stereotypes in Text-to-Image Generation

Date
November 30, 2023
Topics
Generative AI
Foundation Models
Ethics, Equity, Inclusion
Read Paper
abstract

This brief tests a variety of ordinary text prompts to examine how major text-to-image AI models encode a wide range of dangerous biases about demographic groups.

Key Takeaways

  • Text-to-image generative AI usage is growing, but the outputs of state-of-the-art models perpetuate and even exacerbate demographic stereotypes that can lead to increased hostility, discrimination, and violence toward individuals and communities.

  • Stable Diffusion generates images that encode substantial biases and stereotypes in response to ordinary text prompts that mention traits, descriptions, occupations, or objects—whether or not the prompts include explicit references to demographic characteristics or identities. These stereotypes persist despite mitigation strategies.

  • DALL-E similarly demonstrates substantial biases, often in a less straightforward way, despite OpenAI’s claims that it has implemented guardrails.

  • Technical fixes are insufficient to address the harms perpetuated by these systems. Policymakers need to understand how these biases translate into real-world harm and need to support holistic, comprehensive research approaches that meld technological evaluations with nuanced understandings of social and power dynamics.

Executive Summary

Text-to-image generative artificial intelligence (AI) systems such as Stable Diffusion and DALL-E that convert text descriptions provided by the user into synthetic images are exploding in popularity. However, users are often unaware that these models are trained on massive datasets of images that are primarily in English and often contain stereotyping, toxic, and pornographic content. As millions of images are generated each day using these AI systems, concerns around bias and stereotyping should be front and center in discussions.

In a new paper, “Easily Accessible Text-to-Image Generation Amplifies Demographic Stereotypes at Large Scale,” we show that major text-to-image AI models encode a wide range of dangerous biases about different communities. Past research has demonstrated these biases in previous language and vision models, with recent research starting to explore these issues in relation to image generation. This paper aims to highlight the depth and breadth of biases in recently popularized text-to-image AI models, namely Stable Diffusion and DALL-E. We test a variety of ordinary text prompts and find that the resulting images perpetuate substantial biases and stereotypes—whether or not the prompts contain explicit references to demographic attributes.

Our research underscores the urgent need for policymakers to address the harms resulting from the mass dissemination of stereotypes through major text-to-image AI models.

Finally, our third study aimed to identify previously unknown complex undercompensated groups defined by multiple attributes. After considering age, documented sex, and 12 chronic health indicators, we found many groups with multiple chronic conditions that were undercompensated by at least $10,000 and up to $29,600. For instance, enrollees with asthma, heart disease, and mental health and substance use disorders were undercompensated by about $12,000.

Policy Discussion

The patients at the center of our studies are marginalized in the healthcare system and many problems in the system persist. Health plan enrollees with multiple chronic conditions already face challenges in maintaining access to care. Patients with undercompensated health conditions also represent a considerable portion of the U.S. population— approximately 20 percent of the country’s population have a mental health or substance use disorder. 

Policymakers should use their convening power to bring together industry and patient advocacy groups to parse out the consequences of new fairness metrics.

Subscribe to Policy Program Updates

Our second study found that a combination of removing drug and certain health condition variables, using 1 percent of funds for reinsurance, and introducing fairness constraints on the loss function for four undercompensated chronic illnesses greatly improved overall fit and fairness. This approach also improved group fit for most undercompensated groups not included in the loss function.

Read Paper
Share
Link copied to clipboard!
Authors
  • Federico Bianchi
    Federico Bianchi
  • Pratyusha Kalluri
    Pratyusha Kalluri
  • Esin Durmus
    Esin Durmus
  • Faisal Ladhak
    Faisal Ladhak
  • Myra Cheng
    Myra Cheng
  • Debora Nozza
    Debora Nozza
  • Tatsunori Hashimoto
    Tatsunori Hashimoto
  • Dan Jurafsky
    Dan Jurafsky
  • James Zou
    James Zou
  • Aylin Caliskan
    Aylin Caliskan

Related Publications

Beyond DeepSeek: China's Diverse Open-Weight AI Ecosystem and Its Policy Implications
Caroline Meinhardt, Sabina Nong, Graham Webster, Tatsunori Hashimoto, Christopher Manning
Deep DiveDec 16, 2025
Issue Brief

Almost one year after the “DeepSeek moment,” this brief analyzes China’s diverse open-model ecosystem and examines the policy implications of their widespread global diffusion.

Issue Brief

Beyond DeepSeek: China's Diverse Open-Weight AI Ecosystem and Its Policy Implications

Caroline Meinhardt, Sabina Nong, Graham Webster, Tatsunori Hashimoto, Christopher Manning
Foundation ModelsInternational Affairs, International Security, International DevelopmentDeep DiveDec 16

Almost one year after the “DeepSeek moment,” this brief analyzes China’s diverse open-model ecosystem and examines the policy implications of their widespread global diffusion.

Moving Beyond the Term "Global South" in AI Ethics and Policy
Evani Radiya-Dixit, Angèle Christin
Quick ReadNov 19, 2025
Issue Brief

This brief examines the limitations of the term "Global South" in AI ethics and policy, and highlights the importance of grounding such work in specific regions and power structures.

Issue Brief

Moving Beyond the Term "Global South" in AI Ethics and Policy

Evani Radiya-Dixit, Angèle Christin
Ethics, Equity, InclusionInternational Affairs, International Security, International DevelopmentQuick ReadNov 19

This brief examines the limitations of the term "Global South" in AI ethics and policy, and highlights the importance of grounding such work in specific regions and power structures.

Validating Claims About AI: A Policymaker’s Guide
Olawale Salaudeen, Anka Reuel, Angelina Wang, Sanmi Koyejo
Quick ReadSep 24, 2025
Policy Brief

This brief proposes a practical validation framework to help policymakers separate legitimate claims about AI systems from unsupported claims.

Policy Brief

Validating Claims About AI: A Policymaker’s Guide

Olawale Salaudeen, Anka Reuel, Angelina Wang, Sanmi Koyejo
Foundation ModelsPrivacy, Safety, SecurityQuick ReadSep 24

This brief proposes a practical validation framework to help policymakers separate legitimate claims about AI systems from unsupported claims.

Toward Political Neutrality in AI
Jillian Fisher, Ruth E. Appel, Yulia Tsvetkov, Margaret E. Roberts, Jennifer Pan, Dawn Song, Yejin Choi
Quick ReadSep 10, 2025
Policy Brief

This brief introduces a framework of eight techniques for approximating political neutrality in AI models.

Policy Brief

Toward Political Neutrality in AI

Jillian Fisher, Ruth E. Appel, Yulia Tsvetkov, Margaret E. Roberts, Jennifer Pan, Dawn Song, Yejin Choi
DemocracyGenerative AIQuick ReadSep 10

This brief introduces a framework of eight techniques for approximating political neutrality in AI models.