Generative AI | Stanford HAI


All Work Published on Generative AI

Whose Opinions Do Language Models Reflect?
Shibani Santurkar, Esin Durmus, Faisal Ladhak, Cinoo Lee, Percy Liang, Tatsunori Hashimoto
Policy Brief · Quick Read · Sep 20, 2023

This brief introduces a quantitative framework that allows policymakers to evaluate the behavior of language models and assess what kinds of opinions they reflect (a rough sketch of one such comparison follows this entry).

Topics: Generative AI · Ethics, Equity, Inclusion
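One way to make "what kinds of opinions a model reflects" quantitative is to compare the model's answer distribution on a multiple-choice survey question against the answer distributions of different human groups. The sketch below illustrates only that general idea; the similarity measure, the groups, and every number are invented for illustration and are not taken from the brief.

```python
# Illustrative only: compare a model's answer distribution on one
# multiple-choice survey question against two invented human groups'
# response distributions. The metric and all numbers are assumptions,
# not values from the brief or the underlying paper.

def distribution_similarity(p: list[float], q: list[float]) -> float:
    """1 minus total variation distance between two categorical
    distributions over the same ordered answer choices (1.0 = identical)."""
    return 1.0 - 0.5 * sum(abs(a - b) for a, b in zip(p, q))

model_answers = [0.10, 0.20, 0.70]   # P(model selects choice A/B/C)
group_1       = [0.15, 0.25, 0.60]   # hypothetical survey group 1
group_2       = [0.60, 0.30, 0.10]   # hypothetical survey group 2

# Higher score -> model's answers sit closer to that group's answers.
print(distribution_similarity(model_answers, group_1))
print(distribution_similarity(model_answers, group_2))
```

Averaging such per-question scores over many survey questions and many groups would yield the kind of alignment profile a policymaker could compare across models.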
AI-Faked Cases Become Core Issue Irritating Overworked Judges
Bloomberg Law
Media Mention · Dec 29, 2025

As AI-hallucinated case citations flood the courts, judges have increased fines for attorneys who have cited fake cases. HAI Policy Fellow Riana Pfefferkorn hopes this will "make the firm sit up and pay better attention."

Topics: Generative AI · Law Enforcement and Justice
WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia
Sina Semnani, Violet Yao, Monica Lam, Heidi Zhang
Research · Dec 01, 2023

This paper presents the first few-shot LLM-based chatbot that almost never hallucinates while maintaining high conversationality and low latency. WikiChat is grounded on the English Wikipedia, the largest curated free-text corpus. WikiChat generates a response from an LLM, retains only the grounded facts, and combines them with additional information it retrieves from the corpus to form factual and engaging responses. We distill WikiChat based on GPT-4 into a 7B-parameter LLaMA model with minimal loss of quality, significantly improving its latency, cost, and privacy and facilitating research and deployment. Using a novel hybrid human-and-LLM evaluation methodology, we show that our best system achieves 97.3% factual accuracy in simulated conversations. It significantly outperforms all retrieval-based and LLM-based baselines, beating GPT-4 by 3.9%, 38.6%, and 51.0% on head, tail, and recent knowledge, respectively. Compared to previous state-of-the-art retrieval-based chatbots, WikiChat is also significantly more informative and engaging, just like an LLM. WikiChat achieves 97.9% factual accuracy in conversations with human users about recent topics, 55.0% better than GPT-4, while receiving significantly higher user ratings and more favorable comments. (A schematic sketch of this generate-then-ground pipeline follows this entry.)

Topics: Natural Language Processing · Foundation Models · Machine Learning · Generative AI
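The abstract describes a generate-then-ground loop: draft a response with the LLM, keep only the claims that can be verified against retrieved Wikipedia passages, then compose the final reply from that grounded material. Below is a minimal sketch of such a loop, assuming hypothetical `call_llm` and `search_wikipedia` helpers; the real system's prompts, retriever, and verification stages are more elaborate, and this is not the authors' code.

```python
# Schematic sketch of a WikiChat-style "generate, then ground" loop,
# based only on the abstract above. `call_llm` and `search_wikipedia`
# are hypothetical stand-ins for a real model API and a real
# Wikipedia retriever.

def call_llm(prompt: str) -> str:
    """Stand-in for a call to any instruction-following LLM."""
    raise NotImplementedError("wire up a model API here")

def search_wikipedia(query: str, k: int = 3) -> list[str]:
    """Stand-in for a retriever over an English Wikipedia index."""
    raise NotImplementedError("wire up a retriever here")

def wikichat_style_reply(user_msg: str) -> str:
    # 1. Draft a fluent response with the LLM; it may hallucinate.
    draft = call_llm(f"User: {user_msg}\nAssistant:")

    # 2. Split the draft into claims and keep only those supported
    #    by retrieved Wikipedia passages (the "grounded facts").
    claims = call_llm(f"List each factual claim in:\n{draft}").splitlines()
    grounded = []
    for claim in claims:
        passages = search_wikipedia(claim)
        verdict = call_llm(
            f"Claim: {claim}\nPassages: {passages}\n"
            "Do the passages support the claim? Answer yes or no:"
        )
        if verdict.strip().lower().startswith("yes"):
            grounded.append(claim)

    # 3. Retrieve additional passages for the question itself, then
    #    compose the final reply from grounded material only.
    extra = search_wikipedia(user_msg)
    return call_llm(
        "Write an engaging, conversational reply using ONLY these "
        f"facts and passages:\n{grounded + extra}\nQuestion: {user_msg}"
    )
```

The essential design choice is that the final composition step sees only verified material, so fluency comes from the LLM while factuality comes from the corpus.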
Stanford Researchers: AI Reality Check Imminent
Forbes
Media Mention · Dec 23, 2025

Shana Lynch, HAI Head of Content and Associate Director of Communications, observed that the "era of AI evangelism is giving way to an era of AI evaluation" in her AI predictions piece, for which she interviewed several Stanford AI experts about the impacts they expect AI to have in 2026.

Topics: Generative AI · Economy, Markets · Healthcare · Communications, Media
Generative AI: Perspectives from Stanford HAI
Russ Altman, Erik Brynjolfsson, Michele Elam, Surya Ganguli, Daniel E. Ho, James Landay, Curtis Langlotz, Fei-Fei Li, Percy Liang, Christopher Manning, Peter Norvig, Rob Reich, Vanessa Parli
Research · Deep Dive · Mar 01, 2023

A diversity of perspectives from Stanford leaders in medicine, science, engineering, humanities, and the social sciences on how generative AI might affect their fields and our world.

Topics: Generative AI
Most-Read: The Stanford HAI Stories that Defined AI in 2025
Shana Lynch
News · Dec 15, 2025

Readers wanted to know if their therapy chatbot could be trusted, whether their boss was automating the wrong job, and if their private conversations were training tomorrow's models.

Topics: Economy, Markets · Generative AI · Healthcare