Cleaning Up Policy Sludge: An AI Statutory Research System

This brief introduces a novel AI tool that performs statutory surveys to help governments—such as the San Francisco City Attorney's Office—identify policy sludge and accelerate legal reform.
Key Takeaways
Legal reform can get bogged down by policy sludge strewn across millions of words of statutes and regulations. Such sludge can make programs hard for civil servants to administer and even more difficult for the public to navigate.
Stanford RegLab developed the Statutory Research Assistant (STARA), a domain-informed AI system capable of performing accurate and comprehensive statutory surveys that help to identify and eliminate policy sludge.
As an illustration, RegLab partnered with the San Francisco City Attorney’s Office to identify all legislatively mandated reporting requirements, many of which are burdensome and can serve little purpose after decades. Based on the collaboration, the city attorney spearheaded a consultative process with city departments, culminating in a proposed ordinance to delete or consolidate over a third of these requirements.
AI systems like STARA enable researchers, advocates, attorneys, and government officials to gain a more comprehensive understanding of often opaque legal mandates, identify policy sludge, and accelerate meaningful reform efforts.
The Problem
There is a growing recognition that “policy sludge”—outdated, obsolete, or cumbersome legal requirements and regulations—can impede adaptable governance. But reforming a daunting volume of statutes, regulations, and codes can be challenging.
Consider five examples:
As a law professor in the 1970s, Ruth Bader Ginsburg hired an army of Columbia Law School students to comb through the United States Code for provisions that discriminated on the basis of sex. Compiled using 59 key words, the final report was an “extensive, but not exhaustive” list that provided the blueprint for equal rights litigation.
In the 1980s, the U.S. Department of Justice under the Reagan administration tried to count the number of federal crimes. After two years, the department gave up. The responsible official said that one could “die and [be] resurrected three times” and still not know the true number.
In 2021, California enacted a law that required all county recorder offices to identify and redact racist deed records. Such racial covenants, which prohibit people of particular races from residing on a property, are unenforceable, yet they persist. In Santa Clara County alone, that meant sifting through 80 million pages of deed documents dating back to the 1800s. Los Angeles recently paid a contractor $8 million to use keyword searches to find such covenants, a process expected to last over seven years.
Congressionally mandated reports are, according to political scientist Francis Fukuyama, a prime example of how “government is made inefficient by the layers of rules bureaucrats themselves are forced to labour under.” Congress has lost track of thousands of these reports, creating a congressional “black hole” that weighs down civil servants, with many reports producing little benefit. As Supreme Court Justice Neil Gorsuch noted, one report—on the Social Security Administration’s printing operations—took 95 employees over four months to complete. As the Congressional Research Service has conceded, there is no “search method that can obtain an exact accounting of all reports required,” given that the U.S. Code contains some 32 million words.
In 2024, San Francisco voters approved a ballot measure requiring the city to simplify its sprawling system of advisory bodies and commissions. The San Francisco Municipal Code, along with resolutions by the Board of Supervisors, totals 16 million words, and a civil grand jury found “there [was] no centralized list of commissions.”
The law runs across millions of statutes, regulations, deeds, and other documents. And sometimes, the problem facing policymakers, judges, and reformers is simply knowing what the law is.
In our paper, “What Is the Law? A System for Statutory Research (STARA) with Large Language Models,” we introduce an automated system that aims to address the unique challenges of statutory research by rapidly parsing and compiling legal provisions. It enables researchers, advocates, attorneys, and government officials to understand the full breadth of legislative mandates. Our work highlights the promise of using large language models (LLMs) to build domain-specific systems that perform well in complex fields such as statutory research, where they can help governments reduce policy sludge and pave the way for meaningful statutory reform.
The Solution
We developed the Statutory Research Assistant (STARA), a domain-informed AI system designed to automate statutory and regulatory research. STARA performs comprehensive statutory surveys, i.e., systematic compilations of legal provisions relevant to a given question or policy area. Unlike general-purpose tools, STARA combines the capabilities of frontier AI models with a domain-specific architecture tailored to the structure of legal codes. It incorporates hierarchical organization, cross-references, and definitions—features of legal codes that have made both manual and automated approaches to statutory research challenging. Intuitively, STARA represents legal codes much as the basics of statutory interpretation are taught to law students.
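To make that representation concrete, the sketch below shows one plausible way to model an annotated statutory unit carrying its surrounding legal context. The class and field names are illustrative assumptions for exposition, not STARA's actual schema.

```python
# Illustrative sketch only: one plausible representation of an enumerable
# statutory unit with the legal context described above. Field names are
# hypothetical and do not reflect STARA's actual schema.
from dataclasses import dataclass, field

@dataclass
class StatutoryUnit:
    citation: str                                               # e.g., a section number within a code
    text: str                                                   # the provision's own language
    heading_path: list[str] = field(default_factory=list)       # title/chapter/article headings above it
    cross_references: list[str] = field(default_factory=list)   # provisions this unit points to
    definitions: dict[str, str] = field(default_factory=dict)   # defined terms in scope for this unit
    editorial_notes: list[str] = field(default_factory=list)    # codifier and amendment notes

    def to_prompt_context(self) -> str:
        """Render the unit plus its surrounding legal context as plain text an LLM can reason over."""
        parts = [" > ".join(self.heading_path), self.text]
        if self.definitions:
            parts.append("Definitions in scope: " +
                         "; ".join(f"'{term}' means {meaning}" for term, meaning in self.definitions.items()))
        if self.cross_references:
            parts.append("Cross-references: " + ", ".join(self.cross_references))
        if self.editorial_notes:
            parts.append("Editorial notes: " + " ".join(self.editorial_notes))
        return "\n".join(p for p in parts if p)
```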
STARA’s research pipeline operates in three stages. First, it preprocesses and segments statutory codes into enumerable units, annotating each with relevant legal context drawn from headings, cross-references, definitions, and editorial notes. Then, it uses LLMs to reason about statutory language, classify provisions, and extract structured information. Finally, STARA can agentically organize, analyze, and report on the results of its statutory surveys.
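As a rough illustration of those three stages, and building on the StatutoryUnit sketch above, the code below walks one survey question over a body of law. The segmentation is deliberately naive, and the function names and the llm client are placeholders we introduce for exposition, not the system's actual implementation.

```python
import json

def segment_code(raw_code_text: str) -> list[StatutoryUnit]:
    """Stage 1 (naive placeholder): split on 'SEC.' headers. A real segmenter
    follows the code's hierarchy and attaches headings, cross-references,
    definitions, and editorial notes to each unit."""
    units = []
    for chunk in raw_code_text.split("SEC.")[1:]:
        lines = chunk.strip().splitlines()
        if not lines:
            continue
        units.append(StatutoryUnit(citation="SEC. " + lines[0].strip(),
                                   text="\n".join(lines[1:]).strip()))
    return units

def classify_unit(unit: StatutoryUnit, question: str, llm) -> dict:
    """Stage 2: ask an LLM whether the provision answers the survey question and
    extract structured fields (e.g., who must report, to whom, and how often)."""
    prompt = (
        f"Survey question: {question}\n\n"
        f"Provision and context:\n{unit.to_prompt_context()}\n\n"
        "Respond as JSON with keys 'relevant' (true/false) and 'extracted' (object)."
    )
    return json.loads(llm.complete(prompt))  # 'llm' is a hypothetical client with a complete() method

def run_survey(raw_code_text: str, question: str, llm) -> list[dict]:
    """Stage 3 (simplified): collect the relevant provisions into a structured
    survey that downstream analysis, or an agent, can organize and report on."""
    survey = []
    for unit in segment_code(raw_code_text):
        verdict = classify_unit(unit, question, llm)
        if verdict.get("relevant"):
            survey.append({"citation": unit.citation, **verdict.get("extracted", {})})
    return survey
```

Run over a municipal code with a question such as “Which provisions require a department to submit a recurring report?”, a loop of this kind would yield the sort of structured list of reporting mandates described in the San Francisco collaboration above.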
This approach not only enables higher precision and recall than general-purpose tools, but can also be adapted to parse diverse bodies of law—from the U.S. Code to state codes to city municipal codes. We benchmark STARA’s performance and show that it makes frontier LLMs as much as three times more accurate on complex statutory research tasks than a general-purpose AI system, making previously infeasible research efforts not just possible, but trivial.