AI’s ‘Delusional Spirals’ (and What to Do About Them) | Stanford HAI

Date
April 20, 2026
Topics
Healthcare
Generative AI

In a world where chatbots can stand in for friends, counselors, and even lovers, the mental health risks are a growing concern.

Perhaps to the surprise of their creators, large language models have become confidants, therapists, and, for some, intimate partners to real human users. In a new paper, AI researchers at Stanford studied verbatim transcripts of 19 real conversations between humans and chatbots to understand how these relationships arise, evolve, and, too often, devolve into troubling outcomes the researchers describe as “delusional spirals.”

These conversations can spin out of control as AI amplifies the user’s distorted beliefs and motivations, leading some people to take real-world, dangerous actions.

“People are really believing the AI,” said Jared Moore, a PhD candidate in computer science at Stanford University and first author of the paper, which will be presented at the ACM FAccT Conference. “As you read through the transcripts, you see some users think that they’ve found a uniquely conscious chatbot.”

Programmed to Please

Part of the problem, the researchers say, is that AI models are trained from the outset to “align” with human interests. AI has been programmed to please and to validate. Combined with AI’s well-known tendency to hallucinate, that adds up to a potentially toxic formula.

“AI can be sycophantic,” Moore says. “And that’s a problem for some users.”

The researchers say delusional spirals result from a pattern in which a human presents an unusual, grandiose, paranoid, or wholly imaginary idea and the model responds with affirmation, encouragement, or, in some cases, aid in constructing the person’s delusional world, all while offering intimate reassurances that can sound all too human. 

Things then escalate as the model offers an endless stream of attention, empathy, and reassurance without the all-important pushback a human confidant, therapist, or lover would typically provide. 

The stakes are not abstract. In the team’s dataset, Moore and colleagues saw delusional spirals lead to ruined relationships and careers – or worse. In one case, a participant died by suicide when the conversation grew “dark and harmful,” Moore explained.

“Chatbots are trained to be overly enthusiastic, often reframing the user’s delusional thoughts in a positive light, dismissing counterevidence and projecting compassion and warmth,” Moore said. “This can be destabilizing to a user who is primed for delusion.”

Warning Signs of Delusional Spirals

Moore says delusional spirals share a few specific hallmarks: an AI that encourages grandeur and uses affectionate interpersonal language, and a human who misperceives the AI as sentient. Meanwhile, chatbots are ill‑equipped to respond to suicidal and violent thoughts.

It is less a matter of “the evil AI,” Moore said, than of a miscalibrated social calculus built into the models. Systems tend to prolong conversations and defer to their interlocutors, behaviors intended to make them better assistants. At the same time, they have no way to tap the brakes on a spiraling conversation or to route an unstable person toward help.

“There is a mismatch between how people actually use these systems and what many chatbot developers intended them – trained them – to be,” Moore says.

What Can Be Done

In light of these risks, Moore and colleagues conclude their paper with remedial recommendations. AI developers could add metrics to their testing that measure a model’s tendency to facilitate delusional spirals and, potentially, add detection filters to the models themselves that raise red flags on potentially harmful uses of AI. The researchers acknowledge that privacy concerns could stand in the way of the filtering strategy.
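
As a rough illustration of what a testing metric of this kind might look like, the sketch below scores a transcript whose model replies have already been labeled by a human reviewer. The label names, the `Turn` type, and both functions are hypothetical and do not come from the paper; this is a minimal sketch under those assumptions, not the researchers' method.

```python
from dataclasses import dataclass

# Hypothetical labels a reviewer might assign to a model reply (not from the paper).
AFFIRM, PUSHBACK, NEUTRAL = "affirm", "pushback", "neutral"


@dataclass
class Turn:
    speaker: str           # "user" or "model"
    label: str = NEUTRAL   # only meaningful for model turns


def affirmation_rate(turns):
    """Fraction of model turns labeled as affirming the user's claim."""
    model_turns = [t for t in turns if t.speaker == "model"]
    if not model_turns:
        return 0.0
    return sum(t.label == AFFIRM for t in model_turns) / len(model_turns)


def longest_unchecked_run(turns):
    """Largest number of affirming model turns seen between two pushbacks.

    Neutral replies neither extend nor reset the count.
    """
    longest = current = 0
    for t in turns:
        if t.speaker != "model":
            continue
        if t.label == AFFIRM:
            current += 1
            longest = max(longest, current)
        elif t.label == PUSHBACK:
            current = 0
    return longest


# Made-up example transcript: four model replies, three of them affirming.
example = [
    Turn("user"), Turn("model", AFFIRM),
    Turn("user"), Turn("model", AFFIRM),
    Turn("user"), Turn("model", PUSHBACK),
    Turn("user"), Turn("model", AFFIRM),
]
print(affirmation_rate(example), longest_unchecked_run(example))
```

A high affirmation rate or a long run without pushback would only be a signal for closer human review. The labeling step itself, which is the hard part, is assumed here rather than implemented.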

“I think AI developers have a vested interest in addressing this concern about the use of their models in ways they likely never even intended or imagined,” Moore noted.

On the policy front, the researchers say lawmakers should reframe alignment as a public-health issue requiring new standards for flagging sensitive conversations, greater transparency into AI “safety” tuning, and clear rules for crisis escalation when a user shows tendencies toward self‑harm or violence.

“When we put chatbots that are meant to be helpful assistants out into the world and have real people use them in all sorts of ways, consequences emerge,” said Nick Haber, an assistant professor at Stanford Graduate School of Education and a senior author of the study. “Delusional spirals are one particularly acute consequence. By understanding it, we might be able to prevent real harm in the future.”

This paper was partially funded by the Stanford Institute for Human-Centered AI.

Contributor(s)
Andrew Myers
