
Policy | White Paper

Rethinking Privacy in the AI Era: Policy Provocations for a Data-Centric World

Date: February 22, 2024
Topics: Privacy, Safety, Security; Regulation, Policy, Governance
Read Paper
Abstract

This white paper explores the current and future impact of privacy and data protection legislation on AI development and provides recommendations for mitigating privacy harms in an AI era.

Executive Summary

  • In this paper, we present a series of arguments and predictions about how existing and future privacy and data protection regulation will impact the development and deployment of AI systems.

  • Data is the foundation of all AI systems. Going forward, AI development will continue to increase developers’ hunger for training data, fueling an even greater race for data acquisition than we have already seen in past decades.

  • Largely unrestrained data collection poses unique risks to privacy that extend beyond the individual level: these risks aggregate into societal-level harms that cannot be addressed through the exercise of individual data rights alone.

  • While existing and proposed privacy legislation, grounded in the globally accepted Fair Information Practices (FIPs), implicitly regulates AI development, it is not sufficient to address the data acquisition race or the resulting individual and systemic privacy harms.

  • Even legislation that contains explicit provisions on algorithmic decision-making and other forms of AI does not provide the data governance measures needed to meaningfully regulate the data used in AI systems.

  • We present three suggestions for how to mitigate the risks to data privacy posed by the development and adoption of AI:

  1. Denormalize data collection by default by shifting away from opt-out to opt-in data collection. Data collectors must facilitate true data minimization through “privacy by default” strategies and adopt technical standards and infrastructure for meaningful consent mechanisms.

  2. Focus on the AI data supply chain to improve privacy and data protection. Ensuring dataset transparency and accountability across the entire life cycle must be a focus of any regulatory system that addresses data privacy.

  3. Flip the script on the creation and management of personal data. Policymakers should support the development of new governance mechanisms and technical infrastructure (e.g., data intermediaries and data permissioning infrastructure) to support and automate the exercise of individual data rights and preferences, as sketched in the illustrative example after this list.
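
Recommendations 1 and 3 both presuppose technical infrastructure that can express, and automatically enforce, an individual's data preferences. The sketch below is a minimal illustration only, assuming a simple opt-in permission record evaluated by a data intermediary; the field names, purpose categories, and identifiers are hypothetical and do not come from the white paper.

```python
# Hypothetical sketch of an opt-in data-permissioning record that a data
# intermediary could evaluate before any collection occurs. All field names
# and purpose categories are illustrative assumptions, not a standard.
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class DataPermission:
    """Opt-in permission record: collection is denied unless explicitly granted."""
    subject_id: str                                   # pseudonymous identifier for the individual
    purposes: set[str] = field(default_factory=set)   # purposes the person has opted IN to
    expires: datetime | None = None                   # grants lapse rather than persist indefinitely

    def allows(self, purpose: str) -> bool:
        if self.expires is not None and datetime.now(timezone.utc) > self.expires:
            return False
        return purpose in self.purposes


# A data intermediary would check the record before a collector touches any data.
record = DataPermission(subject_id="u-12345", purposes={"service_delivery"})
print(record.allows("service_delivery"))   # True  -> explicitly granted
print(record.allows("model_training"))     # False -> never granted, so denied by default
```

The design point, under these assumptions, is that the absence of an explicit grant denies collection, which is one practical reading of "privacy by default."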

 

Introduction

Foundation models (e.g., GPT-4, Llama 2) are at the epicenter of AI, driving technological innovation and billions in investment. This paradigm shift has sparked widespread demands for regulation. Animated by factors as diverse as declining transparency and unsafe labor practices, limited protections for copyright and creative work, as well as market concentration and productivity gains, many have called for policymakers to take action.

Central to the debate about how to regulate foundation models is the process by which foundation models are released. Some foundation models, like Google DeepMind’s Flamingo, are fully closed, meaning they are available only to the model developer; others, such as OpenAI’s GPT-4, are limited access, available to the public but only as a black box; and still others, such as Meta’s Llama 2, are more open, with widely available model weights enabling downstream modification and scrutiny. As of August 2023, the U.K.’s Competition and Markets Authority documents that the most common release approach for publicly disclosed models is open release, based on data from Stanford’s Ecosystem Graphs. Developers like Meta, Stability AI, Hugging Face, Mistral, Together AI, and EleutherAI frequently release models openly.

Governments around the world are issuing policy related to foundation models. As part of these efforts, open foundation models have garnered significant attention: The recent U.S. Executive Order on Safe, Secure, and Trustworthy Artificial Intelligence tasks the National Telecommunications and Information Administration with preparing a report on open foundation models for the president. In the EU, open foundation models trained with fewer than 10^25 floating point operations (a measure of the amount of compute expended) appear to be exempted under the recently negotiated AI Act. The U.K.’s AI Safety Institute will “consider open-source systems as well as those deployed with various forms of access controls” as part of its initial priorities. Beyond governments, the Partnership on AI has introduced guidelines for the safe deployment of foundation models, recommending against open release for the most capable foundation models.
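
To put the 10^25 figure in perspective, a common rule of thumb estimates the training compute of a dense transformer at roughly 6 FLOPs per parameter per training token. The snippet below is a rough illustration using that heuristic and a publicly reported model scale; neither the heuristic nor the example scale is drawn from the AI Act or this paper.

```python
# Back-of-the-envelope estimate of training compute using the widely cited
# ~6 FLOPs per parameter per training token heuristic (an assumption, not a
# figure from the AI Act), checked against the 10^25 FLOP threshold.

AI_ACT_THRESHOLD_FLOPS = 1e25   # threshold referenced for the exemption discussed above

def estimated_training_flops(n_parameters: float, n_training_tokens: float) -> float:
    """Rough total training compute for a dense transformer."""
    return 6.0 * n_parameters * n_training_tokens

# Example: a 70-billion-parameter model trained on ~2 trillion tokens
# (roughly the publicly reported scale of Llama 2 70B).
flops = estimated_training_flops(70e9, 2e12)
print(f"~{flops:.2e} FLOPs")            # ~8.40e+23 FLOPs
print(flops < AI_ACT_THRESHOLD_FLOPS)   # True -> roughly an order of magnitude below 10^25
```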

Policy on foundation models should support the open foundation model ecosystem, while providing resources to monitor risks and create safeguards to address harms. Open foundation models provide significant benefits to society by promoting competition, accelerating innovation, and distributing power. For example, small businesses hoping to build generative AI applications could choose among a variety of open foundation models that offer different capabilities and are often less expensive than closed alternatives. Further, open models are marked by greater transparency and, thereby, accountability. When a model is released with its training data, independent third parties can better assess the model’s capabilities and risks.

However, an emerging concern is whether open foundation models pose distinct risks to society. Unlike closed foundation model developers, open developers have limited ability to restrict the use of their models by malicious actors, who can easily remove safety guardrails. Recent studies claim that open foundation models make it easier to generate disinformation and spear-phishing emails and to assist in the development of cyberweapons and bioweapons.

Correctly characterizing these distinct risks requires centering the marginal risk: To what extent do open foundation models increase risk relative to (a) closed foundation models or (b) pre-existing technologies like search engines? We find that for many dimensions, the existing evidence about the marginal risk of open foundation models remains quite limited. In some instances, such as AI-generated child sexual abuse material (CSAM) and nonconsensual intimate imagery (NCII), harms stemming from open foundation models have been better documented. For these demonstrated harms, proposals to restrict the release of foundation models by licensing compute-intensive models are mismatched, because the text-to-image models used to cause these harms require far less compute to train.
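
One way to make the marginal-risk framing concrete is to compare the estimated risk of a misuse outcome when open foundation models are available against the strongest baseline already available, whether a closed model or a pre-existing technology such as a search engine. The following is an illustrative formalization with hypothetical numbers, not a definition or data from the paper.

```python
# Illustrative-only formalization: the marginal risk of open foundation models is
# the increase in risk of a given misuse outcome relative to the strongest
# baseline already available to a malicious actor.
def marginal_risk(risk_with_open: float,
                  risk_with_closed: float,
                  risk_with_existing_tech: float) -> float:
    """Inputs are hypothetical risk estimates in [0, 1] for the same misuse scenario."""
    baseline = max(risk_with_closed, risk_with_existing_tech)
    return risk_with_open - baseline

# Hypothetical numbers only: if a search engine already enables the harm almost
# as effectively, the marginal risk added by open models is small.
print(round(marginal_risk(risk_with_open=0.30,
                          risk_with_closed=0.25,
                          risk_with_existing_tech=0.28), 3))   # 0.02
```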

More broadly, several regulatory approaches under consideration are likely to have a disproportionate impact on open foundation models and their developers, without meaningfully reducing risk. Even though these approaches do not differentiate between open and closed foundation model developers, they yield asymmetric compliance burdens. For example, legislation that holds developers liable for content generated using their models or their derivatives would harm open developers as users can modify their models to generate illicit content. Policymakers should exercise caution to avoid unintended consequences and ensure adequate consultation with open foundation model developers before taking action.

Authors
  • Jennifer King
  • Caroline Meinhardt

Related Publications

Response to OSTP’s Request for Information on the Development of an AI Action Plan
Caroline Meinhardt, Daniel Zhang, Rishi Bommasani, Jennifer King, Russell Wald, Percy Liang, Daniel E. Ho
Mar 17, 2025
Response to Request

In this response to a request for information issued by the National Science Foundation’s Networking and Information Technology Research and Development National Coordination Office (on behalf of the Office of Science and Technology Policy), scholars from Stanford HAI, CRFM, and RegLab urge policymakers to prioritize four areas of policy action in their AI Action Plan: 1) Promote open innovation as a strategic advantage for U.S. competitiveness; 2) Maintain U.S. AI leadership by promoting scientific innovation; 3) Craft evidence-based AI policy that protects Americans without stifling innovation; 4) Empower government leaders with resources and technical expertise to ensure a “whole-of-government” approach to AI governance.

Safeguarding Third-Party AI Research
Kevin Klyman, Shayne Longpre, Sayash Kapoor, Rishi Bommasani, Percy Liang, Peter Henderson
Feb 13, 2025
Policy Brief

This brief examines the barriers to independent AI evaluation and proposes safe harbors to protect good-faith third-party research.

Assessing the Implementation of Federal AI Leadership and Compliance Mandates
Jennifer Wang, Mirac Suzgun, Caroline Meinhardt, Daniel Zhang, Kazia Nowacki, Daniel E. Ho
Jan 17, 2025
White Paper

This white paper assesses federal efforts to advance leadership on AI innovation and governance through recent executive actions and emphasizes the need for senior-level leadership to achieve a whole-of-government approach.

What Makes a Good AI Benchmark?
Anka Reuel, Amelia Hardy, Chandler Smith, Max Lamparth, Malcolm Hardy, Mykel Kochenderfer
Dec 11, 2024
Policy Brief

This brief presents a novel assessment framework for evaluating the quality of AI benchmarks and scores 24 benchmarks against the framework.
