Borrowing from the Law to Filter Training Data for Foundation Models

Date: August 10, 2022
Topics: Machine Learning, DALL-E

Using “Pile of Law,” a dataset of legal materials, Stanford researchers explore filtering private or toxic content from training data for foundation models.

Foundation models are often trained on what is essentially the entire internet. By learning from such a vast dataset, they can impressively memorize and reproduce information that we want them to learn. For example, they might learn to accurately answer factual questions such as “Who is the president of the United States?” At the same time, however, foundation models can memorize and reproduce information that could be harmful. For example, they might disclose people’s Social Security numbers, credit card information, or criminal records, or answer questions about Muslims by suggesting they are terrorists.

These are problems that the creators of foundation models need to fix, says Peter Henderson, a JD/PhD student at Stanford. “We don’t want models to associate people with either their private content or with harmful characteristics.” 

To avoid such consequences, the creators of foundation models sometimes try to filter out private or toxic content before using a dataset to train a model. But trying to remove all – or even most – of the private or toxic content from the entirety of the internet is extremely challenging. One reason: Context matters. Privacy expectations differ across cultures and even across time. And deciding if a phrase is toxic might depend on who is speaking, why they are using a particular phrase, and the expectations of the readers. In sum: It’s a balancing act, and different researchers apply different standards. 

“We wondered if there was a more principled way to filter pretraining data,” Henderson says. He and his colleagues, including Mark Krass, also a JD/PhD student, had an idea: Look to the law. There’s a long history of courts setting standards for information disclosure, so why not import those standards into the machine learning environment?

To test their idea, Henderson and his colleagues assembled Pile of Law, a vast dataset of court and administrative opinions, legal code, case books, and other legal documents. They then explored whether Pile of Law could help identify a principled way to filter pretraining data, with a particular focus on privacy and toxicity.

Based on the team’s initial experiments, Pile of Law offers some valuable opportunities: First, it can help researchers ensure their training data meets minimum legal standards. And second, it can reveal problems with commonplace filtering standards, such as in the toxicity realm.

Filtering for Privacy

When Henderson and Krass first looked at the datasets currently used to train foundation models, they found none that were explicitly filtered for personally sensitive information. So they decided to identify the standards that courts and governments use to balance privacy and transparency and then test whether the implicit use of those standards in Pile of Law could point them toward a nuanced approach to data filtering. 

First the team cataloged the various ways courts have addressed privacy concerns. They found some bright-line rules that model designers might adapt to filter their training data. For example, no U.S. jurisdictions reveal minors’ names, Social Security numbers, financial account numbers, or dates of birth. But they also found approaches that were more contextual. For example, U.S. courts typically disclose people’s criminal records or litigants’ names in civil cases, but there are exceptions. In sexual assault cases, for example, the victims’ names are often pseudonymized. Similarly, administrative law judges use their discretion to protect the names of people who come before them in contexts such as applying for disability benefits or for political asylum.  
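
To make this concrete, below is a minimal, hypothetical sketch (in Python) of how a bright-line rule could be turned into a pretraining-data filter: regular expressions that redact U.S.-style Social Security numbers, long account-number-like digit runs, and explicitly labeled dates of birth. The patterns, placeholder labels, and function name are illustrative assumptions, not the researchers' implementation; a real filter would need validation, locale-specific formats, and contextual checks.

import re

# Illustrative bright-line patterns only; a production filter would need
# locale-specific formats, checksum validation, and contextual checks.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),        # e.g., 123-45-6789
    "account_number": re.compile(r"\b\d{12,19}\b"),     # long digit runs
    "date_of_birth": re.compile(
        r"(?:DOB|date of birth)[:\s]+\d{1,2}/\d{1,2}/\d{2,4}",
        re.IGNORECASE,
    ),
}

def redact_bright_line_pii(text: str) -> str:
    # Replace each match with a typed placeholder such as [SSN].
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label.upper()}]", text)
    return text

print(redact_bright_line_pii("Claimant (DOB: 4/12/1987), SSN 123-45-6789."))
# -> Claimant ([DATE_OF_BIRTH]), SSN [SSN].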

The existence of these contextual standards means that certain subsets of Pile of Law are already implicitly filtered to protect certain people’s privacy. In the immigration context, for example, people seeking asylum who allege that they were tortured in their own countries are likely to have been given pseudonyms in the public record. Henderson and his team decided to test whether a model could learn these contextualized standards by using Pile of Law as the training data. The result: a model that predicts with 80% accuracy whether a paragraph in an immigration case should use a pseudonym or not. And they showed that these predictions were aligned with the law: Sentences referencing asylum and torture were more likely to trigger pseudonymity than sentences referring to criminal offenses. 
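
As a rough sketch of how such a predictor could be trained (not the team's actual setup, which may differ in model choice, data construction, and labeling), one could fine-tune an off-the-shelf transformer as a binary classifier over paragraphs from immigration decisions, with label 1 for paragraphs that use a pseudonym and 0 otherwise. The tiny inline examples and the checkpoint name below are placeholders.

from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)
import torch

# Placeholder training examples; in practice these would be paragraphs
# extracted from immigration decisions in Pile of Law, labeled by whether
# the party's name is pseudonymized (1) or not (0).
paragraphs = [
    "The applicant, referred to as J-C-, testified that he was tortured.",
    "Defendant John Smith was convicted of burglary in state court.",
]
labels = [1, 0]

class ParagraphDataset(torch.utils.data.Dataset):
    def __init__(self, encodings, labels):
        self.encodings, self.labels = encodings, labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, idx):
        item = {k: torch.tensor(v[idx]) for k, v in self.encodings.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item

model_name = "bert-base-uncased"  # placeholder checkpoint, not the paper's model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

encodings = tokenizer(paragraphs, truncation=True, padding=True, max_length=512)
train_dataset = ParagraphDataset(encodings, labels)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="pseudonymity-clf", num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=train_dataset,
)
trainer.train()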

These and several other experiments suggest that Pile of Law can help researchers develop context-appropriate privacy filters, Henderson says. Next, the team would like to expand these efforts beyond the legal domain: Might a model learn to pseudonymize the names of asylum seekers in a dataset that includes the entire internet?

Filtering for Toxicity

In the toxicity arena, Henderson and Krass found a different landscape. Existing filters are widely used and go well beyond what court standards would suggest. Indeed, applying current toxicity filters to Pile of Law could filter out important portions of some key legal precedents from the civil rights era, including Brown v. Board of Education, the landmark case that led to the desegregation of schools in the United States. In addition, the team found that existing filters may remove toxic content from shorter spans of text while leaving it in place if it appears in longer written work, an unexplained outcome that is potentially problematic.
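
A toy illustration of the mechanism behind this over-filtering (not any specific production filter): many pretraining pipelines drop an entire document if it contains any word from a blocklist, so a court opinion that quotes abusive or discriminatory language is removed right along with genuinely toxic web text. The blocklist entries and example sentences below are hypothetical.

# Hypothetical blocklist; real lists used for pretraining data are much longer.
BLOCKLIST = {"damn", "hell"}

def keep_document(text: str) -> bool:
    # Document-level rule: discard the whole text if any blocklisted word appears.
    tokens = {token.strip('.,;:"()').lower() for token in text.split()}
    return BLOCKLIST.isdisjoint(tokens)

opinion = ('The plaintiff testified that her supervisor told her to "go to hell" '
           'and subjected her to abusive language at work.')
print(keep_document(opinion))                                 # False: the opinion is dropped
print(keep_document("The court granted summary judgment."))   # True: kept

Because a rule like this ignores who is speaking and why, the opinion documenting the harassment is treated the same as the harmful speech itself, which is exactly the kind of context the team argues current filters miss.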

“The lesson is to think more carefully before you take a filter off the shelf to filter data before training,” Henderson says. “We’re therefore calling for more research to properly address toxicity in the training data.”

Next: Legal Reasoning

While Henderson and Krass hope Pile of Law will help make data filtering less ad hoc than it is today, they also have a second goal: using Pile of Law to build foundation models that are capable of legal reasoning. The team has already shown that foundation models do a lousy job of understanding how to apply the law to a set of facts. But Henderson hopes that AI systems will one day improve attorneys’ efficiency and thoroughness by, for example, checking their citations and identifying all of the relevant arguments in a case. The goal, he says: to improve access to justice for people who can’t afford to pay for a lawyer. 

“It’s a tough challenge, but why not aim for a hard problem to solve?” he says. “And one that can actually help people.”

Stanford HAI’s mission is to advance AI research, education, policy and practice to improve the human condition.

Contributor(s): Katharine Miller
