Data Privacy and Foundation Models: Can We Have Both?

Date

April 08, 2026

Topics

Privacy, Safety, Security

Foundation Models

Regulation, Policy, Governance

Read Paper

abstract

This brief examines the privacy risks foundation models pose to individuals and society, and governance mechanisms needed to address them.

Key Takeaways

Foundation models pose unprecedented and largely unaddressed privacy risks that are broader and harder to address than those posed by traditional AI systems.
These risks emerge across the entire model life cycle — from the mass scraping of personally identifiable information during training, to the memorization and regurgitation of sensitive information in model outputs, to the intimate data that users unwittingly disclose through chatbot interfaces.
Foundation models are also vulnerable to adversarial attacks, including prompt injection, data poisoning, and model inversion, that can circumvent privacy safeguards and expose sensitive personal information.
Existing privacy frameworks, including the EU’s GDPR, are fundamentally incompatible with how foundation models are built, yet neither the EU nor the United States has enacted comprehensive rules that could meaningfully change developer behavior.
Without clear regulatory guardrails, the public remains largely dependent on developers to voluntarily protect their privacy. Policymakers must weigh a range of governance mechanisms that require removing personal data from the training data pipeline, increase model transparency, ensure the creation of systems that protect privacy by design, and constrain privacy-infringing model outputs.

Introduction

Imagine receiving a security alert from your bank: A fraudster cloned your voice and used it to bypass the bank’s digital security measures and empty your bank account. The tool they used? A generative AI model trained on publicly available data, cloning your voice with an old YouTube video you’d forgotten was online. Or consider prompting a chatbot to tell you what it knows about you, and it surfaces deeply personal information gleaned from pseudonymous posts you once made online.

These examples underscore the profound privacy challenges posed by foundation models — large-scale, general-purpose AI models that stand apart in their ability to impact society globally and at scale. These models are the literal foundation upon which large-scale AI is being integrated into countless consumer-sector digital products and services.

In this issue brief, we examine the risks to data privacy, from both individual and systemic perspectives, posed by the training and use of consumer-focused foundation models. Because foundation models are dependent on massive datasets for their development, they pose a broader set of privacy risks than smaller AI systems trained on proprietary or limited datasets. Foundation models may thwart data privacy not only by the use or misuse of the technology itself, but also by the process of building and training them. In addition, they are vulnerable to privacy risks from adversarial attacks. While the risks can be mitigated, without regulatory rules in place, the public is largely reliant on developers to do the right thing to protect the public’s privacy, which unfortunately is not always the case.

To reconcile data privacy with foundation models, policymakers should weigh a range of governance mechanisms that ensure the removal of personal data from the training data pipeline, require system architectures that prioritize privacy and data security protections by design, increase the transparency and interpretability of foundation models and their training data, and constrain the outputs of models. Policymakers must confront the many ethical and legal questions over access to and control of personal data as AI model adoption continues to grow.

Read Paper

Related Publications

Operationalizing Real-Time Monitoring of Clinical AI

Zhongnan Fang, Lina Cheuy, Hye Sun Na, Akshay Chaudhari, David B. Larson

Quick ReadMay 14, 2026

Policy Brief

This brief demonstrates how real-time monitoring can address critical gaps in the oversight of radiological AI tools.

Policy Brief

Operationalizing Real-Time Monitoring of Clinical AI

Zhongnan Fang, Lina Cheuy, Hye Sun Na, Akshay Chaudhari, David B. Larson

HealthcareRegulation, Policy, GovernanceQuick ReadMay 14

This brief demonstrates how real-time monitoring can address critical gaps in the oversight of radiological AI tools.

Toward Responsible AI in Health Insurance Decision-Making

Michelle Mello, Artem Trotsyuk, Abdoul Jalil Djiberou Mahamadou, Danton Char

Quick ReadFeb 10, 2026

Policy Brief

This brief proposes governance mechanisms for the growing use of AI in health insurance utilization review.

Policy Brief

Toward Responsible AI in Health Insurance Decision-Making

Michelle Mello, Artem Trotsyuk, Abdoul Jalil Djiberou Mahamadou, Danton Char

HealthcareRegulation, Policy, GovernanceQuick ReadFeb 10

This brief proposes governance mechanisms for the growing use of AI in health insurance utilization review.

Response to OSTP's Request for Information on Accelerating the American Scientific Enterprise

Rishi Bommasani, John Etchemendy, Surya Ganguli, Daniel E. Ho, Guido Imbens, James Landay, Fei-Fei Li, Russell Wald

Quick ReadDec 26, 2025

Response to Request

Stanford scholars respond to a federal RFI on scientific discovery, calling for the government to support a new “team science” academic research model for AI-enabled discovery.

Response to Request

Response to OSTP's Request for Information on Accelerating the American Scientific Enterprise

Rishi Bommasani, John Etchemendy, Surya Ganguli, Daniel E. Ho, Guido Imbens, James Landay, Fei-Fei Li, Russell Wald

Sciences (Social, Health, Biological, Physical)Regulation, Policy, GovernanceQuick ReadDec 26

Stanford scholars respond to a federal RFI on scientific discovery, calling for the government to support a new “team science” academic research model for AI-enabled discovery.

Beyond DeepSeek: China's Diverse Open-Weight AI Ecosystem and Its Policy Implications

Caroline Meinhardt, Sabina Nong, Graham Webster, Tatsunori Hashimoto, Christopher Manning

Deep DiveDec 16, 2025

Issue Brief

Almost one year after the “DeepSeek moment,” this brief analyzes China’s diverse open-model ecosystem and examines the policy implications of their widespread global diffusion.

Issue Brief

Beyond DeepSeek: China's Diverse Open-Weight AI Ecosystem and Its Policy Implications

Caroline Meinhardt, Sabina Nong, Graham Webster, Tatsunori Hashimoto, Christopher Manning

Foundation ModelsInternational Affairs, International Security, International DevelopmentDeep DiveDec 16

Almost one year after the “DeepSeek moment,” this brief analyzes China’s diverse open-model ecosystem and examines the policy implications of their widespread global diffusion.

Navigate

Participate

Stay Up To Date

Data Privacy and Foundation Models: Can We Have Both?

Key Takeaways

Introduction

Jennifer King

Tiffany Saade

Rethinking Privacy in the AI Era: Policy Provocations for a Data-Centric World

Related Publications

Operationalizing Real-Time Monitoring of Clinical AI

Operationalizing Real-Time Monitoring of Clinical AI

Toward Responsible AI in Health Insurance Decision-Making

Toward Responsible AI in Health Insurance Decision-Making

Response to OSTP's Request for Information on Accelerating the American Scientific Enterprise

Response to OSTP's Request for Information on Accelerating the American Scientific Enterprise

Beyond DeepSeek: China's Diverse Open-Weight AI Ecosystem and Its Policy Implications

Beyond DeepSeek: China's Diverse Open-Weight AI Ecosystem and Its Policy Implications