A Trustworthy AI Assistant for Investigative Journalists | Stanford HAI

Date
December 01, 2025
Topics
Communications, Media

Gathering and analyzing data require time and expertise — two resources that cash-strapped newspapers often don’t have. Can AI help?

In 2023, an average of 2.5 local newspapers shut down every week. More than half of U.S. counties now have little or no reliable local news coverage, and the trend is accelerating.

This is a business problem. It is also, arguably, a democracy problem. For centuries, local journalism has kept voters engaged in local politics and politicians accountable to those voters. Small papers with investigative tenacity have also routinely broken stories of national importance — the Patriot-News uncovering Penn State’s Jerry Sandusky scandal, for instance.

The answer to this crisis? “Everybody says, ‘Let’s use AI to help,’ ” replies Monica Lam, a professor of computer science at Stanford University. The problem with this, she adds, is that most AI tools aren’t reliable. She cites a 2025 study conducted by the BBC in which the media outlet used major AI models to analyze news content on its website. More than half of the AI-generated answers had “significant issues,” according to the BBC, including factual errors and fabricated quotations.

“It’s not so easy,” says Lam.

Now, Lam is working with technologists and journalists to develop a more useful tool for the news industry. With Cheryl Phillips, the founder of Stanford’s Big Local News, along with seed funding from the Stanford Institute for Human-Centered AI and a grant from the Brown Institute for Media Innovation at Stanford and Columbia, Lam created DataTalk, a chatbot specifically designed to help investigative journalists and cash-strapped newsrooms do their work more efficiently without sacrificing factual accuracy. DataTalk is built on top of a large language model and designed to retrieve and analyze information kept in big, sometimes unruly, public databases.

“Journalism is losing a lot of people and deep investigative work is harder than ever,” Lam says. “If more people know about the tool we’re building, and if we can keep improving it and keep generating success stories, then our hope is to bolster this type of journalism into the future.”

What is DataTalk?

Investigative journalists often rely on knowledge of database languages like SQL and the expertise of data scientists to unearth important stories. With DataTalk, they could instead simply type their question into a chat window and get an answer within a few seconds.

Available for anyone to use, the tool is currently focused on campaign finance data, meaning its public use is constrained to questions related to federal political campaigns, such as how much money a candidate for Congress has raised from out of state.
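To make the out-of-state example concrete, here is a minimal sketch of the kind of query a journalist would otherwise have to write by hand. The table name `contributions`, its columns, and the sample rows are all hypothetical; they are not the actual federal campaign finance schema, and this is not DataTalk's own code.

```python
import sqlite3


def out_of_state_total(conn, candidate, home_state):
    """Sum contributions to `candidate` from donors outside `home_state`.

    Assumes a hypothetical table contributions(candidate, donor_state, amount).
    Rows with a missing donor_state are excluded by the <> comparison.
    """
    row = conn.execute(
        """
        SELECT COALESCE(SUM(amount), 0)
        FROM contributions
        WHERE candidate = ? AND donor_state <> ?
        """,
        (candidate, home_state),
    ).fetchone()
    return row[0]


# Build a tiny in-memory database with made-up sample rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contributions (candidate TEXT, donor_state TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO contributions VALUES (?, ?, ?)",
    [
        ("Candidate A", "ME", 500.0),
        ("Candidate A", "CA", 250.0),
        ("Candidate A", "NY", 100.0),
        ("Candidate B", "ME", 75.0),
    ],
)

print(out_of_state_total(conn, "Candidate A", "ME"))  # prints 350.0
```

Even this small query requires knowing which table holds donations and how donor location is recorded, which is the expertise a chat interface is meant to spare the reporter.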

But the tool is expanding. The Baltimore Banner recently began using DataTalk to discover news stories buried in 311 non-emergency call log data. In the coming months, Big Local News hopes to work with Lam and other journalism organizations to identify other key datasets that could be integrated into DataTalk and to build a system that will make it easy for local journalists to add their own data to the agent. State-level campaign finance records are one example.

Along with its analysis, DataTalk provides the code that it used to conduct the analysis and an explanation, in plain English, of what the code is doing. This lets the journalist verify that the question DataTalk asked in technical language is the same question the journalist asked in plain language. It also explains the ways in which its analysis may be limited.
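One way to picture a response of this kind is as a record with four parts: the answer, the code, the plain-English explanation, and the stated limitations. The sketch below is purely illustrative; the class name `AnalysisResponse` and its fields are assumptions made for this example and do not describe DataTalk's actual data structures.

```python
from dataclasses import dataclass, field


@dataclass
class AnalysisResponse:
    """Hypothetical container for an answer a journalist can audit."""

    answer: str        # the result, stated in plain language
    code: str          # the query or script that produced the result
    explanation: str   # plain-English description of what the code does
    limitations: list[str] = field(default_factory=list)  # known caveats

    def render(self) -> str:
        """Lay out every part so the reader can check code against question."""
        parts = [
            f"Answer: {self.answer}",
            "Code:",
            self.code,
            f"What the code does: {self.explanation}",
        ]
        if self.limitations:
            parts.append("Limitations:")
            parts.extend(f"- {item}" for item in self.limitations)
        return "\n".join(parts)


# Made-up example values, for illustration only.
response = AnalysisResponse(
    answer="Candidate A raised $350.00 from out-of-state donors.",
    code="SELECT SUM(amount) FROM contributions WHERE donor_state <> 'ME'",
    explanation="Adds up every donation whose donor state is not Maine.",
    limitations=["Donations with no recorded donor state are not counted."],
)
print(response.render())
```

Keeping the code and the caveats next to the answer is what allows a reporter, or an editor, to confirm that the technical question matches the one that was asked.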

To ensure DataTalk is accurate and useful, Lam and Phillips worked with domain expert Derek Willis, one of the country’s foremost campaign finance data journalists, who helped refine how the chatbot conducts its search and interpretation.

“Willis was able to provide really critical instructions to make sure that when a regular journalist asks a question of the agent, it knows which tables to go to and how to form a query out of the general instructions it received,” Phillips says. “Simpler datasets like 311 calls might not need this level of expertise. We consider the structure of the information and the domain we’re looking at to determine what kind of expertise is needed to ensure this model works.”

Once the tool was established, Lam’s group collaborated with Willis and the Big Local team to continually evaluate and improve the DataTalk interface. He also worked with students in Phillips’ class to help improve their understanding of how the agent works and has, since then, continued to maintain and improve the tool’s technical infrastructure.

From Classroom to Newsroom

In the fall of 2024, Phillips piloted the chatbot in her “Big Local Journalism” class. Students focused on campaign finance stories and, over the course of the quarter, published three stories in partnership with local newsrooms. One story compared the pools of donors for two candidates in a Hawaiian congressional race; another story looked at Kamala Harris’s campaign spending on reproductive health ads in Georgia. (The students manually fact-checked each story and replicated DataTalk’s analysis using their own code.)
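The replication step the students performed can be sketched as an independent recomputation followed by a comparison. Everything below is an assumption made for illustration: the record layout, the helper names `independent_total` and `matches`, and the one-cent tolerance are not taken from the class or from DataTalk.

```python
import math


def independent_total(records, candidate):
    """Recompute a candidate's total directly from the raw records."""
    return sum(r["amount"] for r in records if r["candidate"] == candidate)


def matches(reported, recomputed, tolerance=0.01):
    """Return True when two dollar figures agree to within `tolerance`."""
    return math.isclose(reported, recomputed, rel_tol=0.0, abs_tol=tolerance)


# Made-up raw records standing in for a downloaded dataset.
records = [
    {"candidate": "Candidate A", "amount": 500.0},
    {"candidate": "Candidate A", "amount": 250.0},
    {"candidate": "Candidate A", "amount": 100.0},
    {"candidate": "Candidate B", "amount": 75.0},
]

recomputed = independent_total(records, "Candidate A")
print(matches(850.0, recomputed))  # prints True: the figures agree
print(matches(900.0, recomputed))  # prints False: the figure needs review
```

A mismatch does not say which figure is wrong; it only signals that the story should not run until the discrepancy is explained.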

“The newsrooms that published this work were happy to have it,” Phillips says. “These were stories that they would not otherwise have been able to tell.”

Around this time, the Maine Monitor reached out to do its own analysis comparing campaign contributions from inside and outside of the state. Reassured by the success of the pilot, Phillips helped the journalist at the Monitor conduct her investigation.

An AI Toolbox for Journalism

DataTalk is one piece within Phillips’ and Lam’s more sweeping plan to support the world of investigative journalism; they have in mind a full toolkit of applications that help newsrooms generate stories, whether those newsrooms are small local operations collapsing under the strain of scarce resources or national outlets with plenty of investigative muscle. The scholars also plan to provide tutorials on how to use these applications and on the kinds of stories to which they might be applied.

Next up, the team hopes to add DataTalk functionality to Agenda Watch, which uses computational methods as well as AI to gather meeting agendas and minutes from city councils, school boards, and other local decision-making bodies around the U.S. Agenda Watch can also alert users to newsworthy items that appear in local documents.

“Taken together, this effort is meant to reduce the cost of producing accountability journalism,” Phillips says. “It makes it possible, we hope, to dig into investigations and produce stories that matter.”

Contributor(s)
Dylan Walsh