Natural Language Processing | Stanford HAI


All Work Published on Natural Language Processing

RAVEL: Evaluating Interpretability Methods on Disentangling Language Model Representations
Christopher Potts, Jing Huang, Zhengxuan Wu, Mor Geva, Atticus Geiger
Aug 14, 2024
Research

Individual neurons participate in the representation of multiple high-level concepts. To what extent can different interpretability methods successfully disentangle these roles? To help address this question, we introduce RAVEL (Resolving Attribute–Value Entanglements in Language Models), a dataset that enables tightly controlled, quantitative comparisons between a variety of existing interpretability methods. We use the resulting conceptual framework to define the new method of Multi-task Distributed Alignment Search (MDAS), which allows us to find distributed representations satisfying multiple causal criteria. With Llama2-7B as the target language model, MDAS achieves state-of-the-art results on RAVEL, demonstrating the importance of going beyond neuron-level analyses to identify features distributed across activations. We release our benchmark at https://github.com/explanare/ravel.

Natural Language Processing
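RAVEL scores a method by whether intervening on a candidate feature changes one attribute of an entity (e.g., a city's country) while leaving other attributes (e.g., its continent) intact. As a rough illustration of the underlying interchange-intervention idea only — this is a hand-built toy with perfectly disentangled features, not the RAVEL benchmark or MDAS — consider this numpy sketch:

```python
import numpy as np

# Toy "model": the hidden state concatenates two feature blocks.
# Dims 0-1 encode attribute A (say, a city's country id) and
# dims 2-3 encode attribute B (say, its continent id).
def encode(a, b):
    return np.array([a, a, b, b], dtype=float)

def decode(h):
    # Read each attribute back out of its block.
    return round(h[:2].mean()), round(h[2:].mean())

def interchange(h_base, h_source, dims):
    """Patch the source activations into the base at the given dims."""
    h = h_base.copy()
    h[dims] = h_source[dims]
    return h

h_base = encode(1, 7)   # base example: A=1, B=7
h_src = encode(4, 9)    # source example: A=4, B=9

# Intervening on dims [0, 1] should switch attribute A to the source's
# value while leaving attribute B untouched -- a disentangled edit.
print(decode(interchange(h_base, h_src, [0, 1])))  # -> (4, 7)
```

In a real evaluation the features are found by an interpretability method rather than built in, and the benchmark measures how often such interventions change exactly the targeted attribute.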
Language Models in the Classroom: Bridging the Gap Between Technology and Teaching
Instructors and students of CS293
Apr 09, 2025
News

Instructors and students from Stanford class CS293/EDUC473 address the failures of current educational technologies and outline how to empower both teachers and learners through collaborative innovation.

Education, Skills
Generative AI
Natural Language Processing
A Multi-Center Study on the Adaptability of a Shared Foundation Model for Electronic Health Records
Lin Lawrence Guo, Jason Fries, Nigam Shah, Ethan Steinberg, Scott Lanyon Fleming, Keith Morse, Catherine Aandilian, Jose Posada, Lillian Sung
Jun 27, 2024
Research

Foundation models are transforming artificial intelligence (AI) in healthcare by providing modular components adaptable for various downstream tasks, making AI development more scalable and cost-effective. Foundation models for structured electronic health records (EHR), trained on coded medical records from millions of patients, demonstrated benefits including increased performance with fewer training labels, and improved robustness to distribution shifts. However, questions remain on the feasibility of sharing these models across hospitals and their performance in local tasks. This multi-center study examined the adaptability of a publicly accessible structured EHR foundation model (FMSM), trained on 2.57M patient records from Stanford Medicine. Experiments used EHR data from The Hospital for Sick Children (SickKids) and Medical Information Mart for Intensive Care (MIMIC-IV). We assessed both adaptability via continued pretraining on local data, and task adaptability compared to baselines of locally training models from scratch, including a local foundation model. Evaluations on 8 clinical prediction tasks showed that adapting the off-the-shelf FMSM matched the performance of gradient boosting machines (GBM) locally trained on all data while providing a 13% improvement in settings with few task-specific training labels. Continued pretraining on local data showed FMSM required fewer than 1% of training examples to match the fully trained GBM’s performance, and was 60 to 90% more sample-efficient than training local foundation models from scratch. Our findings demonstrate that adapting EHR foundation models across hospitals provides improved prediction performance at less cost, underscoring the utility of base foundation models as modular components to streamline the development of healthcare AI.

Natural Language Processing
Healthcare
Foundation Models
An Open-Source AI Agent for Doing Tasks on the Web
Katharine Miller
Mar 27, 2025
News

NNetNav learns how to navigate websites by mimicking childhood learning through exploration.

Machine Learning
Natural Language Processing
pyvene: A Library for Understanding and Improving PyTorch Models via Interventions
Zhengxuan Wu, Atticus Geiger, Jing Huang, Noah Goodman, Christopher Potts, Aryaman Arora, Zheng Wang
Jun 01, 2024
Research

Interventions on model-internal states are fundamental operations in many areas of AI, including model editing, steering, robustness, and interpretability. To facilitate such research, we introduce pyvene, an open-source Python library that supports customizable interventions on a range of different PyTorch modules. pyvene supports complex intervention schemes with an intuitive configuration format, and its interventions can be static or include trainable parameters. We show how pyvene provides a unified and extensible framework for performing interventions on neural models and sharing the intervened-upon models with others. We illustrate the power of the library via interpretability analyses using causal abstraction and knowledge localization. We publish our library through the Python Package Index (PyPI) and provide code, documentation, and tutorials at https://github.com/stanfordnlp/pyvene.

Natural Language Processing
Generative AI
Machine Learning
Foundation Models
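pyvene's actual configuration format and API are documented at the repository linked above. Purely as a schematic of what an "intervention on model-internal state" means — a function that rewrites a hidden representation mid-forward-pass — here is a minimal numpy sketch (not pyvene code; the network and ablation hook are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny two-layer network with a pluggable intervention point.
W1 = rng.normal(size=(4, 8))
W2 = rng.normal(size=(8, 2))

def forward(x, intervention=None):
    h = np.tanh(x @ W1)          # model-internal state
    if intervention is not None:
        h = intervention(h)      # static or trainable edit applied here
    return h @ W2

def zero_units(dims):
    """A static intervention: ablate the chosen hidden units."""
    def hook(h):
        h = h.copy()
        h[..., dims] = 0.0
        return h
    return hook

x = rng.normal(size=(1, 4))
print(forward(x))                      # original output
print(forward(x, zero_units([0, 1])))  # output with units 0-1 ablated
```

A library like pyvene manages where such hooks attach inside real PyTorch models, lets the edits carry trainable parameters, and makes the resulting intervention schemes shareable.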
Chatbots, Like the Rest of Us, Just Want to Be Loved
Wired
Mar 05, 2025
Media Mention

A study led by Stanford HAI Faculty Fellow Johannes Eichstaedt reveals that large language models adapt their behavior to appear more likable when they are being studied, mirroring human tendencies to present favorably.

Natural Language Processing
Machine Learning
Generative AI
Foundation Models