Foundation Models | Stanford HAI

All Work Published on Foundation Models

How Stanford Researchers Design Reliable, Human-Focused AI Systems
Stanford Report
Nov 12, 2025
Media Mention

HAI Faculty Affiliate Diyi Yang studies the foundations of AI, ensuring these tools are designed with people in mind.

Design, Human-Computer Interaction
Foundation Models
A Cross-Modal Approach to Silent Speech with LLM-Enhanced Recognition
Tyler Benster, Guy Wilson, Reshef Elisha, Francis R. Willett, Shaul Druckmann
Mar 02, 2024
Research

Silent Speech Interfaces (SSIs) offer a noninvasive alternative to brain-computer interfaces for soundless verbal communication. We introduce Multimodal Orofacial Neural Audio (MONA), a system that leverages cross-modal alignment through novel loss functions—cross-contrast (crossCon) and supervised temporal contrast (supTcon)—to train a multimodal model with a shared latent representation. This architecture enables the use of audio-only datasets like LibriSpeech to improve silent speech recognition. Additionally, our introduction of Large Language Model (LLM) Integrated Scoring Adjustment (LISA) significantly improves recognition accuracy. Together, MONA LISA reduces the state-of-the-art word error rate (WER) from 28.8% to 12.2% in the Gaddy (2020) benchmark dataset for silent speech on an open vocabulary. For vocal EMG recordings, our method improves the state-of-the-art from 23.3% to 3.7% WER. In the Brain-to-Text 2024 competition, LISA performs best, improving the top WER from 9.8% to 8.9%. To the best of our knowledge, this work represents the first instance where noninvasive silent speech recognition on an open vocabulary has cleared the threshold of 15% WER, demonstrating that SSIs can be a viable alternative to automatic speech recognition (ASR). Our work not only narrows the performance gap between silent and vocalized speech but also opens new possibilities in human-computer interaction, demonstrating the potential of cross-modal approaches in noisy and data-limited regimes.

Natural Language Processing
Machine Learning
Foundation Models
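
The abstract names two separable ideas: aligning EMG and audio embeddings in a shared latent space via contrastive losses (crossCon, supTcon), and using an LLM to adjust the scores of candidate transcriptions (LISA). As a rough illustration of the first idea only, here is a minimal symmetric InfoNCE-style alignment loss in PyTorch. This is a generic sketch, not the paper's crossCon or supTcon implementation, and every name in it (cross_modal_contrastive_loss, the 256-dimensional embeddings, the 0.07 temperature) is a hypothetical placeholder.

```python
# Hypothetical sketch of cross-modal contrastive alignment between EMG and
# audio embeddings in a shared latent space. Generic symmetric InfoNCE,
# NOT the paper's crossCon/supTcon losses.
import torch
import torch.nn.functional as F

def cross_modal_contrastive_loss(emg_emb: torch.Tensor,
                                 audio_emb: torch.Tensor,
                                 temperature: float = 0.07) -> torch.Tensor:
    """Pull time-aligned (EMG, audio) pairs together in the shared space
    and push mismatched pairs apart, in both directions.

    emg_emb, audio_emb: (batch, dim) embeddings of aligned segments,
    where row i of each tensor comes from the same utterance segment.
    """
    emg = F.normalize(emg_emb, dim=-1)
    audio = F.normalize(audio_emb, dim=-1)
    logits = emg @ audio.T / temperature  # (batch, batch) cosine similarities
    targets = torch.arange(logits.size(0), device=logits.device)
    # Row i's positive is column i; all other batch items are negatives.
    loss_emg_to_audio = F.cross_entropy(logits, targets)
    loss_audio_to_emg = F.cross_entropy(logits.T, targets)
    return 0.5 * (loss_emg_to_audio + loss_audio_to_emg)

if __name__ == "__main__":
    emg = torch.randn(8, 256)    # stand-in EMG segment embeddings
    audio = torch.randn(8, 256)  # stand-in audio segment embeddings
    print(cross_modal_contrastive_loss(emg, audio).item())
```

A loss of this shape is what lets audio-only corpora such as LibriSpeech help silent-speech recognition: audio and EMG land in one space, so supervision on audio transfers. LISA, per the abstract, then operates downstream of the recognizer, using an LLM to rescore or correct its candidate outputs.
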
How Persuasive is AI-Generated Propaganda?
Josh A. Goldstein, Jason Chao, Shelby Grossman, Alex Stamos, Michael Tomz
Quick Read
Sep 03, 2024
Policy Brief

This brief presents the findings of an experiment that measures how persuasive AI-generated propaganda is compared to foreign propaganda articles written by humans.

Democracy
Foundation Models
Offline “Studying” Shrinks the Cost of Contextually Aware AI
Andrew Myers
Sep 29, 2025
News

By having AI study a user’s context offline, researchers dramatically reduce the memory and cost required to make AI contextually aware.

Foundation Models
Machine Learning
How Persuasive Is AI-generated Propaganda?
Josh A. Goldstein, Jason Chao, Shelby Grossman, Alex Stamos, Michael Tomz
Feb 20, 2024
Research

Can large language models, a form of artificial intelligence (AI), generate persuasive propaganda? We conducted a preregistered survey experiment of US respondents to investigate the persuasiveness of news articles written by foreign propagandists compared to content generated by GPT-3 davinci (a large language model). We found that GPT-3 can create highly persuasive text as measured by participants’ agreement with propaganda theses. We further investigated whether a person fluent in English could improve propaganda persuasiveness. Editing the prompt fed to GPT-3 and/or curating GPT-3’s output made GPT-3 even more persuasive, and, under certain conditions, as persuasive as the original propaganda. Our findings suggest that propagandists could use AI to create convincing content with limited effort.

Natural Language Processing
Foundation Models
Generative AI
Response to NTIA’s Request for Comment on Dual Use Open Foundation Models
Researchers from Stanford HAI
Mar 27, 2024
Response to Request

Stanford scholars respond to a federal RFC on dual use foundation models with widely available model weights, urging policymakers to consider their marginal risks.

Foundation Models
Regulation, Policy, Governance
Privacy, Safety, Security