Who Is Liable When Generative AI Says Something Harmful? | Stanford HAI

Date
October 11, 2023

Courts will have to grapple with this new challenge, although scholars believe much of generative AI will be protected by the First Amendment.

When interacting with ChatGPT, you may have encountered a response that starts with “As a language model …” before it politely refuses to do your bidding. The model is trained to respond this way to potentially harmful queries. Such queries are discovered through what’s called “red teaming,” in which a team of people (some technical experts and some not) crafts queries to try to trigger harmful behavior from the model; the results then help model developers create safeguards. In fact, the White House helped put together a red teaming effort for some of the most advanced foundation models. Red teaming isn’t perfect, though. Researchers have crafted a number of “jailbreaks,” both text-based and image-based, that cause the model to engage in harmful behavior despite safeguards.

But what happens when the safeguards fail and models are used for harm? Who is liable? In two new papers, my co-authors and I find that it will not be so easy to impose liability on general foundation model creators and deployers — assuming current large language-style models do not take autonomous actions in the world. Those seeking to impose liability on creators and deployers will generally be constrained by the First Amendment, face other difficult-to-meet statutory requirements, and in some cases face defendants that could retain immunity from liability under a law referred to as “Section 230.” Ultimately, in all of these cases courts will have to wade through low-level technical details that significantly affect any liability analysis.

In “Freedom of Speech and AI Output,” a recent article in the Journal of Free Speech Law, my co-authors Eugene Volokh, a professor at UCLA Law School and visiting fellow at the Hoover Institution, Mark Lemley, a professor at Stanford Law School, and I make the case that generative AI outputs (at least the speech-like ones) are likely entitled to First Amendment protections. This is in part because the model creators may have the right to channel their speech through the model, but also because users have a right to listen to that speech. In a landmark 1965 case, Lamont v. Postmaster General, the Supreme Court struck down a law that would require “communist political propaganda” to be held back by the U.S. Postmaster General unless the U.S. addressee specifically requested it. In a concurring opinion, Justice Brennan wrote, “I think the right to receive publications is such a fundamental right.”

If AI outputs are protected by the First Amendment, that protection makes it harder to hold model creators and deployers liable under a number of criminal and civil statutes when AI models do harmful things. That is exactly what Stanford assistant professor Tatsunori Hashimoto, Mark Lemley, and I break down further in our other work, “Where's the Liability in Harmful AI Speech?” also recently published in the Journal of Free Speech Law.

Consider a scenario where a terrorist organization tries to repurpose a machine learning model to recruit for its acts of terror. Two Supreme Court cases recently dealt with a similar question: Twitter v. Taamneh and Gonzalez v. Google. They addressed whether recommendation systems at Twitter, YouTube, and other social media companies aided ISIS in recruiting and committing acts of terror. If so, the companies could be liable under a law called the Justice Against Sponsors of Terrorism Act. In Taamneh, the Court ultimately found that the companies were not liable for their recommender systems’ involvement in any ISIS recruiting. The decision was partly due to difficulties in determining whether the companies knew that their platforms were being used to aid the specific act in question, and the plaintiffs would ultimately have faced further difficulties due to Section 230 immunity. In Gonzalez, however, Justice Gorsuch specifically noted that generative AI likely would not be covered by Section 230.

So, what happens when generative AI does not receive immunity from liability? What if ChatGPT is directly used by ISIS for recruiting people to commit an act of terror, paralleling the recent Supreme Court cases? Who would be liable? We argue that there are many roadblocks to holding the creator or host of a foundation model liable.

In some cases, generative AI can still be designed in ways that preserve a plausible Section 230 immunity argument (for example, by designing it to piece together bits of third-party content, as Google does for its search snippets). As in the recent Supreme Court cases, it will also be difficult to prove that the model creator knew that ISIS was using its model for this specific harmful purpose. And the creator could argue that ISIS repurposed the system for harm via elaborate prompt engineering or fine-tuning, bypassing safeguards and thus breaking the causal link to the downstream harm. This doesn’t even begin to grapple with the First Amendment challenges that will likely come along with such cases.

Much of the analysis will rest on minute details and technical design decisions. Strangely, in some ways, current law might incentivize a more hands-off approach to safety in order to preserve Section 230 immunity. The more you craft (or “align”) the speech of the model with your own preferences, rather than just regurgitating third-party content, the less likely you are to preserve Section 230 immunity.

Ultimately, we are almost certain to see Supreme Court cases dealing with the harms of generative AI in the not-so-distant future. The decisions in these cases will affect a large swath of future technical designs, as model creators attempt to avoid liability. We argue that policymakers and regulators may want to intervene before it comes to that, and to understand at a deep technical level which design decisions their interventions will incentivize.

Peter Henderson is finishing his PhD in computer science at Stanford, received his JD from Stanford Law School, and is an incoming assistant professor at Princeton University with appointments in the Department of Computer Science and School of Public and International Affairs.

Stanford HAI’s mission is to advance AI research, education, policy and practice to improve the human condition.

Authors
  • Peter Henderson