Zheng Wang

Latest Related to Zheng Wang

Research

pyvene: A Library for Understanding and Improving PyTorch Models via Interventions

Aryaman Arora, Zheng Wang, Atticus Geiger, Jing Huang, Zhengxuan Wu, Noah Goodman, Christopher Potts

Natural Language Processing, Generative AI, Machine Learning, Foundation Models

Jun 01

Interventions on model-internal states are fundamental operations in many areas of AI, including model editing, steering, robustness, and interpretability. To facilitate such research, we introduce pyvene, an open-source Python library that supports customizable interventions on a range of different PyTorch modules. pyvene supports complex intervention schemes with an intuitive configuration format, and its interventions can be static or include trainable parameters. We show how pyvene provides a unified and extensible framework for performing interventions on neural models and sharing the intervened-upon models with others. We illustrate the power of the library via interpretability analyses using causal abstraction and knowledge localization. We publish our library through the Python Package Index (PyPI) and provide code, documentation, and tutorials at https://github.com/stanfordnlp/pyvene.
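
As a rough illustration of what an intervention on a model-internal state can look like, the sketch below runs a toy two-layer model on a base input while overwriting one hidden unit with the value it took on a different source input. It is a framework-free toy in plain Python and does not use pyvene or PyTorch; the names `hidden_layer`, `output_layer`, `run`, and `swap_unit`, along with the weights and inputs, are hypothetical and chosen only for this example.

```python
# Toy illustration only: this is NOT pyvene's API and uses no PyTorch.
# All function names, weights, and inputs here are hypothetical.

def hidden_layer(x):
    """First layer: compute the model-internal state (two hidden units)."""
    return [x[0] + x[1], x[0] - x[1]]


def output_layer(h):
    """Second layer: map the hidden state to a scalar output."""
    return 2 * h[0] + 3 * h[1]


def run(x, intervention=None):
    """Run the model, optionally rewriting the hidden state in between."""
    h = hidden_layer(x)
    if intervention is not None:
        h = intervention(h)
    return output_layer(h)


def swap_unit(index, source_hidden):
    """Build a static intervention that overwrites one hidden unit
    with the value that unit took on a source input."""
    def intervention(h):
        patched = list(h)  # copy so the original state is not mutated
        patched[index] = source_hidden[index]
        return patched
    return intervention


base, source = [1, 2], [3, 5]

base_out = run(base)                   # unmodified forward pass
source_hidden = hidden_layer(source)   # hidden state recorded on the source
patched_out = run(base, swap_unit(0, source_hidden))

print(base_out, patched_out)
```

Comparing `base_out` with `patched_out` shows how much the output depends on the swapped unit, which is the kind of question such interventions are meant to answer. An intervention with trainable parameters would replace the fixed overwrite with a learned transformation of the hidden state.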