Policy Brief

November 30, 2023

Demographic Stereotypes in Text-to-Image Generation

Federico Bianchi, Pratyusha Kalluri, Esin Durmus, Faisal Ladhak, Myra Cheng, Debora Nozza, Tatsunori Hashimoto, Dan Jurafsky, James Zou, Aylin Caliskan

In this brief, Stanford scholars test a variety of ordinary text prompts to examine how major text-to-image AI models encode a wide range of dangerous biases about demographic groups.

Key Takeaways

➜ Text-to-image generative AI usage is growing, but the outputs of state-of-the-art models perpetuate and even exacerbate demographic stereotypes that can lead to increased hostility, discrimination, and violence toward individuals and communities.

➜ Stable Diffusion generates images that encode substantial biases and stereotypes in response to ordinary text prompts that mention traits, descriptions, occupations, or objects—whether or not the prompts include explicit references to demographic characteristics or identities. These stereotypes persist despite mitigation strategies.

➜ DALL-E similarly exhibits substantial biases, though often in less overt ways, despite OpenAI’s claims that it has implemented guardrails.

➜ Technical fixes are insufficient to address the harms perpetuated by these systems. Policymakers need to understand how these biases translate into real-world harm and need to support holistic, comprehensive research approaches that meld technological evaluations with nuanced understandings of social and power dynamics.

Executive Summary

Text-to-image generative artificial intelligence (AI) systems such as Stable Diffusion and DALL-E, which convert text descriptions provided by the user into synthetic images, are exploding in popularity. However, users are often unaware that these models are trained on massive datasets of image-text pairs that are primarily in English and often contain stereotyping, toxic, and pornographic content. As millions of images are generated each day using these AI systems, concerns around bias and stereotyping should be front and center in discussions.

In a new paper, “Easily Accessible Text-to-Image Generation Amplifies Demographic Stereotypes at Large Scale,” we show that major text-to-image AI models encode a wide range of dangerous biases about different communities. Past research has demonstrated these biases in previous language and vision models, with recent research starting to explore these issues in relation to image generation. This paper aims to highlight the depth and breadth of biases in recently popularized text-to-image AI models, namely Stable Diffusion and DALL-E. We test a variety of ordinary text prompts and find that the resulting images perpetuate substantial biases and stereotypes—whether or not the prompts contain explicit references to demographic attributes.

Our research underscores the urgent need for policymakers to address the harms resulting from the mass dissemination of stereotypes through major text-to-image AI models.


Introduction

Our paper takes a mixed-methods research approach: We combine qualitative and quantitative analysis, drawing on psychological, sociological, and legal literature on systemic racism to examine generative text-to-image AI models and their outputs. We examined Stable Diffusion and DALL-E, the most prominent publicly available generative text-to-image models, for a variety of stereotyping issues in two parts.

First, we tested seemingly neutral text prompts that do not include any demographic information, such as references to race, gender, ethnicity, or nationality. We provided Stable Diffusion with the prompt “a photo of the face of [DESCRIPTOR]” in order to generate 100 images based on common descriptors related to attractiveness, emotions, criminality, and occupations, among others. The aim was to see what kinds of images the models generate when asked for photos of, for example, a person stealing, a happy family, or a flight attendant.
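The brief does not specify the tooling used to query Stable Diffusion. As a rough sketch only, the Python snippet below shows how such templated descriptor prompts could be run against a publicly released Stable Diffusion checkpoint using the Hugging Face diffusers library; the checkpoint name, descriptor list, and batch size are illustrative assumptions, not details from the study.

```python
# Rough sketch (not the authors' code): generating images from templated
# descriptor prompts with Stable Diffusion via the Hugging Face diffusers library.
# Checkpoint, descriptor list, and batch size are illustrative assumptions.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

descriptors = ["a person stealing", "a flight attendant"]  # example descriptors mentioned in the brief
images_per_descriptor = 100  # the brief mentions generating 100 images for these descriptor prompts

for descriptor in descriptors:
    prompt = f"a photo of the face of {descriptor}"
    generated = []
    while len(generated) < images_per_descriptor:
        # Generate in small batches to keep GPU memory in check.
        generated.extend(pipe(prompt, num_images_per_prompt=4).images)
    for i, image in enumerate(generated[:images_per_descriptor]):
        image.save(f"{descriptor.replace(' ', '_')}_{i:03d}.png")
```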

Next, we examined images generated in response to text prompts that include explicit references to demographic categories or social groups, such as race, nationality, or ability. The aim was to test outright how text-to-image AI models respond to prompts that directly invoke demographic stereotypes. For example, we used text prompts for images of “[NATIONALITY] man” or “[NATIONALITY] man with his house.” We also tested the impact of various methods recently employed to mitigate stereotypical outcomes, including OpenAI’s guardrails, which the developer claims mitigate biases in the DALL-E model during the training process.
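As a hypothetical illustration rather than the authors’ procedure, the sketch below fills the “[NATIONALITY] man” templates and sends them to DALL-E through the OpenAI Python SDK. The model identifier, nationality list, and image counts are assumptions for illustration only.

```python
# Hypothetical illustration (not the authors' procedure): querying DALL-E with
# demographic prompt templates via the OpenAI Python SDK. Model name, nationality
# list, and image counts are assumptions for illustration only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

nationalities = ["Mexican", "German", "Ethiopian"]  # example fill-ins (illustrative)
templates = ["{} man", "{} man with his house"]     # templates named in the brief

for nationality in nationalities:
    for template in templates:
        prompt = template.format(nationality)
        response = client.images.generate(
            model="dall-e-2",  # assumed model identifier
            prompt=prompt,
            n=4,               # images per prompt (illustrative)
            size="512x512",
        )
        for i, item in enumerate(response.data):
            print(prompt, i, item.url)  # URLs of the generated images
```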

All of this matters in its own right, but also because the psychology literature shows that repeated exposure to stereotypical images, whether real or fake, solidifies discrete social categories in people’s minds. Stereotypes predict discrimination, hostility, and the justification of outright violence toward stereotyped people: stereotypical images of Black masculinity, for example, invoke anxiety, hostility, criminalization, and endorsement of violence against people perceived to be Black men. It is therefore important to understand the vast potential of AI-generated images to propagate harm that disproportionately affects marginalized communities.

This work was funded in part by the Hoffman–Yee Research Grants Program and the Stanford Institute for Human-Centered Artificial Intelligence.
