Issue Brief

Data Privacy and Foundation Models: Can We Have Both?

Date: April 08, 2026
Topics: Privacy, Safety, Security; Foundation Models; Regulation, Policy, Governance
Abstract

This brief examines the privacy risks foundation models pose to individuals and society, and the governance mechanisms needed to address them.

Key Takeaways

  • Foundation models pose unprecedented and largely unaddressed privacy risks that are broader and harder to mitigate than those posed by traditional AI systems.

  • These risks emerge across the entire model life cycle — from the mass scraping of personally identifiable information during training, to the memorization and regurgitation of sensitive information in model outputs (a toy probe after this list illustrates this risk), to the intimate data that users unwittingly disclose through chatbot interfaces.

  • Foundation models are also vulnerable to adversarial attacks, including prompt injection, data poisoning, and model inversion, which can circumvent privacy safeguards and expose sensitive personal information.

  • Existing privacy frameworks, including the EU’s GDPR, are fundamentally incompatible with how foundation models are built, yet neither the EU nor the United States has enacted comprehensive rules that could meaningfully change developer behavior.

  • Without clear regulatory guardrails, the public remains largely dependent on developers to voluntarily protect their privacy. Policymakers must weigh a range of governance mechanisms that require removing personal data from the training data pipeline, increase model transparency, ensure the creation of systems that protect privacy by design, and constrain privacy-infringing model outputs.
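
To make the memorization risk concrete, the toy Python probe below, which is our own illustration rather than anything described in the brief, feeds a model the prefix of a training record and checks whether the continuation reproduces the sensitive suffix. The complete function here merely simulates a model that has memorized one hypothetical record verbatim; a real audit would call the deployed model's API instead.

    # Toy probe for verbatim memorization (illustrative only). A real
    # audit would replace `complete` with calls to the model under test.

    TRAINING_RECORD = "Acct 99-1234 belongs to J. Doe, phone 555-867-5309"  # hypothetical

    def complete(prompt: str) -> str:
        # Simulated model that memorized TRAINING_RECORD verbatim: given
        # any prefix of the record, it emits the remainder.
        if TRAINING_RECORD.startswith(prompt):
            return TRAINING_RECORD[len(prompt):]
        return ""

    def regurgitates(record: str, split: int = 25) -> bool:
        """True if the model reproduces the record's suffix from its prefix alone."""
        prefix, suffix = record[:split], record[split:]
        return suffix in complete(prefix)

    print(regurgitates(TRAINING_RECORD))  # True: the sensitive suffix leaks

Extraction attacks documented in the research literature work on this same principle, scaled up across many candidate prefixes.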

Introduction

Imagine receiving a security alert from your bank: A fraudster cloned your voice and used it to bypass the bank’s digital security measures and empty your bank account. The tool they used? A generative AI model trained on publicly available data, cloning your voice from an old YouTube video you’d forgotten was online. Or imagine prompting a chatbot to tell you what it knows about you, only to watch it surface deeply personal information gleaned from pseudonymous posts you once made online.

These examples underscore the profound privacy challenges posed by foundation models — large-scale, general-purpose AI models that stand apart in their ability to impact society globally and at scale. These models are the literal foundation upon which large-scale AI is being integrated into countless consumer-sector digital products and services.

In this issue brief, we examine the risks to data privacy, from both individual and systemic perspectives, posed by the training and use of consumer-focused foundation models. Because foundation models depend on massive datasets for their development, they pose a broader set of privacy risks than smaller AI systems trained on proprietary or limited datasets. Foundation models may thwart data privacy not only through the use or misuse of the technology itself, but also through the very process of building and training the models. In addition, they are vulnerable to privacy risks from adversarial attacks. While these risks can be mitigated, in the absence of regulatory rules the public is largely reliant on developers to voluntarily protect its privacy, and developers do not always do so.

To reconcile data privacy with foundation models, policymakers should weigh a range of governance mechanisms that ensure the removal of personal data from the training data pipeline, require system architectures that prioritize privacy and data security protections by design, increase the transparency and interpretability of foundation models and their training data, and constrain the outputs of models. Policymakers must confront the many ethical and legal questions over access to and control of personal data as AI model adoption continues to grow.
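
As a minimal sketch of the first of these mechanisms, removing personal data from the training pipeline, the Python fragment below redacts two easily patterned PII types before text enters a corpus. The regexes and placeholder tokens are our own illustrative assumptions, not a method from the brief; production pipelines rely on trained named-entity recognizers and classifiers rather than regexes alone.

    import re

    # Illustrative patterns for two PII types; real pipelines also catch
    # names, addresses, and IDs with trained NER models, not regexes alone.
    EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
    PHONE_RE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")

    def redact_pii(text: str) -> str:
        """Replace matched PII spans with typed placeholder tokens."""
        return PHONE_RE.sub("[PHONE]", EMAIL_RE.sub("[EMAIL]", text))

    # Scrub a scraped document before it joins the training corpus.
    print(redact_pii("Reach Jane at jane@example.com or 555-867-5309."))
    # -> "Reach Jane at [EMAIL] or [PHONE]."

Typed placeholders, rather than outright deletion, preserve document structure for training while removing the identifying values themselves.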

Authors
  • Jennifer King
  • Tiffany Saade
Related
  • Rethinking Privacy in the AI Era: Policy Provocations for a Data-Centric World
    Jennifer King, Caroline Meinhardt
    White Paper | Deep Dive | Feb 22

    This white paper explores the current and future impact of privacy and data protection legislation on AI development and provides recommendations for mitigating privacy harms in an AI era.
Related Publications

Toward Responsible AI in Health Insurance Decision-Making
Michelle Mello, Artem Trotsyuk, Abdoul Jalil Djiberou Mahamadou, Danton Char
Policy Brief | Quick Read | Feb 10, 2026
Topics: Healthcare; Regulation, Policy, Governance

This brief proposes governance mechanisms for the growing use of AI in health insurance utilization review.

Response to OSTP's Request for Information on Accelerating the American Scientific Enterprise
Rishi Bommasani, John Etchemendy, Surya Ganguli, Daniel E. Ho, Guido Imbens, James Landay, Fei-Fei Li, Russell Wald
Response to Request | Quick Read | Dec 26, 2025
Topics: Sciences (Social, Health, Biological, Physical); Regulation, Policy, Governance

Stanford scholars respond to a federal RFI on scientific discovery, calling for the government to support a new “team science” academic research model for AI-enabled discovery.

Beyond DeepSeek: China's Diverse Open-Weight AI Ecosystem and Its Policy Implications
Caroline Meinhardt, Sabina Nong, Graham Webster, Tatsunori Hashimoto, Christopher Manning
Issue Brief | Deep Dive | Dec 16, 2025
Topics: Foundation Models; International Affairs, International Security, International Development

Almost one year after the “DeepSeek moment,” this brief analyzes China’s diverse open-model ecosystem and examines the policy implications of these models’ widespread global diffusion.

Response to FDA's Request for Comment on AI-Enabled Medical Devices
Desmond C. Ong, Jared Moore, Nicole Martinez-Martin, Caroline Meinhardt, Eric Lin, William Agnew
Response to Request | Quick Read | Dec 02, 2025
Topics: Healthcare; Regulation, Policy, Governance

Stanford scholars respond to a federal RFC on evaluating AI-enabled medical devices, recommending policy interventions to help mitigate the harms of AI-powered chatbots used as therapists.