Be Careful What You Tell Your AI Chatbot

Date: October 15, 2025
Topics: Privacy, Safety, Security; Generative AI; Regulation, Policy, Governance

A Stanford study reveals that leading AI companies are pulling user conversations for training, highlighting privacy risks and a need for clearer policies.

Last month, Anthropic made a quiet change to its terms of service for customers: Conversations you have with its AI chatbot, Claude, will be used for training its large language model by default, unless you opt out. 

Anthropic is not alone in adopting this policy. A recent study of frontier developers’ privacy policies found that six leading U.S. companies feed user inputs back into their models to improve capabilities and win market share. Some give consumers the choice to opt out, while others do not.

Given this trend, should users of AI-powered chat systems worry about their privacy? “Absolutely yes,” says the study’s lead author, Jennifer King, Privacy and Data Policy Fellow at the Stanford Institute for Human-Centered AI. “If you share sensitive information in a dialogue with ChatGPT, Gemini, or other frontier models, it may be collected and used for training, even if it’s in a separate file that you uploaded during the conversation.”

King and her team of Stanford scholars examined AI developers’ privacy policies and identified several causes for concern, including long data retention periods, training on children’s data, and a general lack of transparency and accountability in developers’ privacy practices. In light of these findings, consumers should think twice about the information they share in AI chat conversations and, whenever possible, affirmatively opt out of having their data used for training. 

The History of Privacy Policies

As a communication tool, the internet-era privacy policy that’s now being applied to AI chats is deeply flawed. Typically written in convoluted legal language, these documents are difficult for consumers to read and understand. Yet, we have to agree to them if we want to visit websites, query search engines, and interact with large language models (LLMs).

In the last five years, AI developers have been scraping massive amounts of information from the public internet to train their models, a process that can inadvertently pull personal information into their datasets. “We have hundreds of millions of people interacting with AI chatbots, which are collecting personal data for training, and almost no research has been conducted to examine the privacy practices for these emerging tools,” King explains. In the United States, she adds, privacy protections for personal data collected by or shared with LLM developers are complicated by a patchwork of state-level laws and a lack of federal regulation.

In an effort to help close this research gap, the Stanford team compared the privacy policies of six U.S. companies: Amazon (Nova), Anthropic (Claude), Google (Gemini), Meta (Meta AI), Microsoft (Copilot), and OpenAI (ChatGPT). They analyzed a web of documents for each LLM, including its published privacy policies, linked subpolicies, and associated FAQs and guidance accessible from the chat interfaces, for a total of 28 lengthy documents.

To evaluate these policies, the researchers grounded their methodology in the California Consumer Privacy Act, the most comprehensive privacy law in the United States and one that all six frontier developers are required to comply with. For each company, they analyzed the language of the documentation to discern how the stated policies address three questions:

  1. Are user inputs to chatbots used to train or improve LLMs?

  2. What sources and categories of personal consumer data are collected, stored, and processed to train or improve LLMs?

  3. What are the users’ options for opting into or out of having their chats used for training?

Blurred Boundaries

The scholars found that all six companies use chat data from users by default to train their models, and some keep this information in their systems indefinitely. Some, but not all, of the companies state that they de-identify personal information before using it for training. And some allow human reviewers to read users’ chat transcripts as part of model training.

In the case of multiproduct companies such as Google, Meta, Microsoft, and Amazon, user interactions also routinely get merged with information gleaned from the other products consumers use on those platforms: search queries, purchases, social media engagement, and the like.

These practices can become problematic when, for example, users share personal biometric and health data without considering the implications. Here’s a realistic scenario: Imagine asking an LLM for dinner ideas. Maybe you specify that you want low-sugar or heart-friendly recipes. The chatbot can draw inferences from that input, and the algorithm may decide you fit a classification as a health-vulnerable individual. “This determination drips its way through the developer’s ecosystem. You start seeing ads for medications, and it’s easy to see how this information could end up in the hands of an insurance company. The effects cascade over time,” King explains.

Another red flag the researchers discovered concerns the privacy of children: Developers’ practices vary in this regard, but most are not taking steps to remove children’s input from their data collection and model training processes. Google announced earlier this year that it would train its models on data from teenagers if they opt in. By contrast, Anthropic says it does not collect children’s data or allow users under the age of 18 to create accounts, although it does not require age verification. And Microsoft says it collects data from children under 18 but does not use it to build language models. All of these practices raise consent issues, as children cannot legally consent to the collection and use of their data.

Privacy-Preserving AI

Across the board, the Stanford scholars observed that developers’ privacy policies lack essential information about their practices. They recommend policymakers and developers address data privacy challenges posed by LLM-powered chatbots through comprehensive federal privacy regulation, affirmative opt-in for model training, and filtering personal information from chat inputs by default.

“As a society, we need to weigh whether the potential gains in AI capabilities from training on chat data are worth the considerable loss of consumer privacy. And we need to promote innovation in privacy-preserving AI, so that user privacy isn’t an afterthought,” King concludes. 

Contributor: Nikki Goth Itoi
