Stanford
University
  • Stanford Home
  • Maps & Directions
  • Search Stanford
  • Emergency Info
  • Terms of Use
  • Privacy
  • Copyright
  • Trademarks
  • Non-Discrimination
  • Accessibility
© Stanford University.  Stanford, California 94305.
pyvene: A Library for Understanding and Improving PyTorch Models via Interventions | Stanford HAI

Stay Up To Date

Get the latest news, advances in research, policy work, and education program updates from HAI in your inbox weekly.

Sign Up For Latest News

Navigate
  • About
  • Events
  • Careers
  • Search
Participate
  • Get Involved
  • Support HAI
  • Contact Us
Skip to content
  • About

    • About
    • People
    • Get Involved with HAI
    • Support HAI
    • Subscribe to Email
  • Research

    • Research
    • Fellowship Programs
    • Grants
    • Student Affinity Groups
    • Centers & Labs
    • Research Publications
    • Research Partners
  • Education

    • Education
    • Executive and Professional Education
    • Government and Policymakers
    • K-12
    • Stanford Students
  • Policy

    • Policy
    • Policy Publications
    • Policymaker Education
    • Student Opportunities
  • AI Index

    • AI Index
    • AI Index Report
    • Global Vibrancy Tool
    • People
  • News
  • Events
  • Industry
  • Centers & Labs
research

pyvene: A Library for Understanding and Improving PyTorch Models via Interventions

Date
June 01, 2024
Topics
Natural Language Processing
Generative AI
Machine Learning
Foundation Models
Your browser does not support the video tag.
Read Paper
abstract

Interventions on model-internal states are fundamental operations in many areas of AI, including model editing, steering, robustness, and interpretability. To facilitate such research, we introduce pyvene, an open-source Python library that supports customizable interventions on a range of different PyTorch modules. pyvene supports complex intervention schemes with an intuitive configuration format, and its interventions can be static or include trainable parameters. We show how pyvene provides a unified and extensible framework for performing interventions on neural models and sharing the intervened upon models with others. We illustrate the power of the library via interpretability analyses using causal abstraction and knowledge localization. We publish our library through Python Package Index (PyPI) and provide code, documentation, and tutorials at ‘https://github.com/stanfordnlp/pyvene‘.

Share
Link copied to clipboard!
Authors
  • Zhengxuan Wu
  • Atticus Geiger
  • Jing Huang
  • headshot
    Noah Goodman
  • Christopher Potts
    Christopher Potts
  • Aryaman Arora
  • Zheng Wang

Related Publications

Stories for the Future 2024
Isabelle Levent
Deep DiveMar 31, 2025
Research

We invited 11 sci-fi filmmakers and AI researchers to Stanford for Stories for the Future, a day-and-a-half experiment in fostering new narratives about AI. Researchers shared perspectives on AI and filmmakers reflected on the challenges of writing AI narratives. Together researcher-writer pairs transformed a research paper into a written scene. The challenge? Each scene had to include an AI manifestation, but could not be about the personhood of AI or AI as a threat. Read the results of this project.

Research

Stories for the Future 2024

Isabelle Levent
Machine LearningGenerative AIArts, HumanitiesCommunications, MediaDesign, Human-Computer InteractionSciences (Social, Health, Biological, Physical)Deep DiveMar 31

We invited 11 sci-fi filmmakers and AI researchers to Stanford for Stories for the Future, a day-and-a-half experiment in fostering new narratives about AI. Researchers shared perspectives on AI and filmmakers reflected on the challenges of writing AI narratives. Together researcher-writer pairs transformed a research paper into a written scene. The challenge? Each scene had to include an AI manifestation, but could not be about the personhood of AI or AI as a threat. Read the results of this project.

The Promise and Perils of Artificial Intelligence in Advancing Participatory Science and Health Equity in Public Health
Abby C King, Zakaria N Doueiri, Ankita Kaulberg, Lisa Goldman Rosas
Feb 14, 2025
Research
Your browser does not support the video tag.

Current societal trends reflect an increased mistrust in science and a lowered civic engagement that threaten to impair research that is foundational for ensuring public health and advancing health equity. One effective countermeasure to these trends lies in community-facing citizen science applications to increase public participation in scientific research, making this field an important target for artificial intelligence (AI) exploration. We highlight potentially promising citizen science AI applications that extend beyond individual use to the community level, including conversational large language models, text-to-image generative AI tools, descriptive analytics for analyzing integrated macro- and micro-level data, and predictive analytics. The novel adaptations of AI technologies for community-engaged participatory research also bring an array of potential risks. We highlight possible negative externalities and mitigations for some of the potential ethical and societal challenges in this field.

Research
Your browser does not support the video tag.

The Promise and Perils of Artificial Intelligence in Advancing Participatory Science and Health Equity in Public Health

Abby C King, Zakaria N Doueiri, Ankita Kaulberg, Lisa Goldman Rosas
Foundation ModelsGenerative AIMachine LearningNatural Language ProcessingSciences (Social, Health, Biological, Physical)HealthcareFeb 14

Current societal trends reflect an increased mistrust in science and a lowered civic engagement that threaten to impair research that is foundational for ensuring public health and advancing health equity. One effective countermeasure to these trends lies in community-facing citizen science applications to increase public participation in scientific research, making this field an important target for artificial intelligence (AI) exploration. We highlight potentially promising citizen science AI applications that extend beyond individual use to the community level, including conversational large language models, text-to-image generative AI tools, descriptive analytics for analyzing integrated macro- and micro-level data, and predictive analytics. The novel adaptations of AI technologies for community-engaged participatory research also bring an array of potential risks. We highlight possible negative externalities and mitigations for some of the potential ethical and societal challenges in this field.

Policy-Shaped Prediction: Avoiding Distractions in Model-Based Reinforcement Learning
Nicholas Haber, Miles Huston, Isaac Kauvar
Dec 13, 2024
Research
Your browser does not support the video tag.

Model-based reinforcement learning (MBRL) is a promising route to sampleefficient policy optimization. However, a known vulnerability of reconstructionbased MBRL consists of scenarios in which detailed aspects of the world are highly predictable, but irrelevant to learning a good policy. Such scenarios can lead the model to exhaust its capacity on meaningless content, at the cost of neglecting important environment dynamics. While existing approaches attempt to solve this problem, we highlight its continuing impact on leading MBRL methods —including DreamerV3 and DreamerPro — with a novel environment where background distractions are intricate, predictable, and useless for planning future actions. To address this challenge we develop a method for focusing the capacity of the world model through synergy of a pretrained segmentation model, a task-aware reconstruction loss, and adversarial learning. Our method outperforms a variety of other approaches designed to reduce the impact of distractors, and is an advance towards robust model-based reinforcement learning.

Research
Your browser does not support the video tag.

Policy-Shaped Prediction: Avoiding Distractions in Model-Based Reinforcement Learning

Nicholas Haber, Miles Huston, Isaac Kauvar
Machine LearningFoundation ModelsDec 13

Model-based reinforcement learning (MBRL) is a promising route to sampleefficient policy optimization. However, a known vulnerability of reconstructionbased MBRL consists of scenarios in which detailed aspects of the world are highly predictable, but irrelevant to learning a good policy. Such scenarios can lead the model to exhaust its capacity on meaningless content, at the cost of neglecting important environment dynamics. While existing approaches attempt to solve this problem, we highlight its continuing impact on leading MBRL methods —including DreamerV3 and DreamerPro — with a novel environment where background distractions are intricate, predictable, and useless for planning future actions. To address this challenge we develop a method for focusing the capacity of the world model through synergy of a pretrained segmentation model, a task-aware reconstruction loss, and adversarial learning. Our method outperforms a variety of other approaches designed to reduce the impact of distractors, and is an advance towards robust model-based reinforcement learning.

LABOR-LLM: Language-Based Occupational Representations with Large Language Models
Susan Athey, Herman Brunborg, Tianyu Du, Ayush Kanodia, Keyon Vafa
Dec 11, 2024
Research
Your browser does not support the video tag.

Vafa et al. (2024) introduced a transformer-based econometric model, CAREER, that predicts a worker’s next job as a function of career history (an “occupation model”). CAREER was initially estimated (“pre-trained”) using a large, unrepresentative resume dataset, which served as a “foundation model,” and parameter estimation was continued (“fine-tuned”) using data from a representative survey. CAREER had better predictive performance than benchmarks. This paper considers an alternative where the resume-based foundation model is replaced by a large language model (LLM). We convert tabular data from the survey into text files that resemble resumes and fine-tune the LLMs using these text files with the objective to predict the next token (word). The resulting fine-tuned LLM is used as an input to an occupation model. Its predictive performance surpasses all prior models. We demonstrate the value of fine-tuning and further show that by adding more career data from a different population, fine-tuning smaller LLMs surpasses the performance of fine-tuning larger models.

Research
Your browser does not support the video tag.

LABOR-LLM: Language-Based Occupational Representations with Large Language Models

Susan Athey, Herman Brunborg, Tianyu Du, Ayush Kanodia, Keyon Vafa
Foundation ModelsNatural Language ProcessingDec 11

Vafa et al. (2024) introduced a transformer-based econometric model, CAREER, that predicts a worker’s next job as a function of career history (an “occupation model”). CAREER was initially estimated (“pre-trained”) using a large, unrepresentative resume dataset, which served as a “foundation model,” and parameter estimation was continued (“fine-tuned”) using data from a representative survey. CAREER had better predictive performance than benchmarks. This paper considers an alternative where the resume-based foundation model is replaced by a large language model (LLM). We convert tabular data from the survey into text files that resemble resumes and fine-tune the LLMs using these text files with the objective to predict the next token (word). The resulting fine-tuned LLM is used as an input to an occupation model. Its predictive performance surpasses all prior models. We demonstrate the value of fine-tuning and further show that by adding more career data from a different population, fine-tuning smaller LLMs surpasses the performance of fine-tuning larger models.