Christopher Potts

Professor of Linguistics and, by courtesy, of Computer Science, Stanford University

Latest Work

Optimizing Instructions and Demonstrations for Multi-Stage Language Model Programs

Krista Opsahl-Ong, Michael J Ryan, Josh Purtell, David Broman, Christopher Potts, Matei Zaharia, Omar Khattab

Nov 14

Research

Language Model Programs, i.e., sophisticated pipelines of modular language model (LM) calls, are increasingly advancing NLP tasks, but they require crafting prompts that are jointly effective for all modules. We study prompt optimization for LM programs, i.e., how to update these prompts to maximize a downstream metric without access to module-level labels or gradients. To make this tractable, we factorize our problem into optimizing the free-form instructions and few-shot demonstrations of every module and introduce several strategies to craft task-grounded instructions and navigate credit assignment across modules. Our strategies include (i) program- and data-aware techniques for proposing effective instructions, (ii) a stochastic mini-batch evaluation function for learning a surrogate model of our objective, and (iii) a meta-optimization procedure in which we refine how LMs construct proposals over time. Using these insights, we develop MIPRO, a novel algorithm for optimizing LM programs. MIPRO outperforms baseline optimizers on five of seven diverse multi-stage LM programs using a best-in-class open-source model (Llama-3-8B), by as much as 13% accuracy. We have released our new optimizers and benchmark in DSPy at http://dspy.ai.
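The idea of scoring candidate prompt configurations on random mini-batches can be illustrated with a toy sketch. This is not MIPRO and does not use the DSPy API: it swaps the learned surrogate model for a plain running mean with epsilon-greedy selection, and every name in it (`minibatch_search`, `toy_score`, `CANDIDATES`, `DATASET`) is hypothetical. Each candidate stands for one choice of instruction per module in a two-module program.

```python
import random
from itertools import product


def minibatch_search(candidates, score_fn, dataset, trials=200,
                     batch_size=8, epsilon=0.3, seed=0):
    """Return the candidate with the best running-mean mini-batch score.

    Assumption: a running mean stands in for the surrogate model
    described in the abstract; this is an illustration only.
    """
    rng = random.Random(seed)
    totals = {c: 0.0 for c in candidates}
    counts = {c: 0 for c in candidates}

    def mean(c):
        return totals[c] / counts[c]

    for step in range(trials):
        if step < len(candidates):
            cand = candidates[step]            # warm-up: try each candidate once
        elif rng.random() < epsilon:
            cand = rng.choice(candidates)      # explore a random candidate
        else:
            cand = max(candidates, key=mean)   # exploit the current best estimate
        batch = rng.sample(dataset, batch_size)  # stochastic mini-batch
        totals[cand] += sum(score_fn(cand, ex) for ex in batch) / batch_size
        counts[cand] += 1

    tried = [c for c in candidates if counts[c] > 0]
    return max(tried, key=mean)


# Hypothetical setup: 3 instruction options for each of 2 modules.
CANDIDATES = list(product(range(3), range(3)))
DATASET = list(range(100))


def toy_score(cand, example):
    """Toy metric: higher instruction indices answer more examples correctly."""
    threshold = 1 + 3 * cand[0] + cand[1]
    return 1.0 if example % 10 < threshold else 0.0


if __name__ == "__main__":
    print(minibatch_search(CANDIDATES, toy_score, DATASET))
```

Only the end-to-end metric is observed here, never a per-module label, which mirrors the credit-assignment setting the abstract describes. With small batches the estimates are noisy, so the selected candidate may vary with the seed.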

RAVEL: Evaluating Interpretability Methods on Disentangling Language Model Representations

Christopher Potts, Jing Huang, Zhengxuan Wu, Mor Geva, Atticus Geiger

Aug 14

Research

Individual neurons participate in the representation of multiple high-level concepts. To what extent can different interpretability methods successfully disentangle these roles? To help address this question, we introduce RAVEL (Resolving Attribute–Value Entanglements in Language Models), a dataset that enables tightly controlled, quantitative comparisons between a variety of existing interpretability methods. We use the resulting conceptual framework to define the new method of Multi-task Distributed Alignment Search (MDAS), which allows us to find distributed representations satisfying multiple causal criteria. With Llama2-7B as the target language model, MDAS achieves state-of-the-art results on RAVEL, demonstrating the importance of going beyond neuron-level analyses to identify features distributed across activations. We release our benchmark at https://github.com/explanare/ravel.

pyvene: A Library for Understanding and Improving PyTorch Models via Interventions

Zhengxuan Wu, Atticus Geiger, Jing Huang, Noah Goodman, Christopher Potts, Aryaman Arora, Zheng Wang

Jun 01

Research

Interventions on model-internal states are fundamental operations in many areas of AI, including model editing, steering, robustness, and interpretability. To facilitate such research, we introduce pyvene, an open-source Python library that supports customizable interventions on a range of different PyTorch modules. pyvene supports complex intervention schemes with an intuitive configuration format, and its interventions can be static or include trainable parameters. We show how pyvene provides a unified and extensible framework for performing interventions on neural models and sharing the intervened-upon models with others. We illustrate the power of the library via interpretability analyses using causal abstraction and knowledge localization. We publish our library through the Python Package Index (PyPI) and provide code, documentation, and tutorials at https://github.com/stanfordnlp/pyvene.
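To make "intervention on a model-internal state" concrete, here is a minimal pure-Python sketch of a static interchange intervention on a toy two-layer network. It does not use pyvene or PyTorch, and all names (`run`, `interchange`, `W1`, `W2`) are hypothetical; it only shows the operation of replacing selected hidden units in a base run with values captured from a source run.

```python
def hidden(x, w1):
    """First layer: ReLU of a matrix-vector product."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w1]


def output(h, w2):
    """Second layer: weighted sum of the hidden units."""
    return sum(w * hi for w, hi in zip(w2, h))


def run(x, w1, w2, intervention=None):
    """Forward pass with an optional function applied to the hidden state."""
    h = hidden(x, w1)
    if intervention is not None:
        h = intervention(h)
    return output(h, w2)


def interchange(source_h, units):
    """Build an intervention that copies `units` from a source hidden state."""
    def apply(base_h):
        return [source_h[i] if i in units else v for i, v in enumerate(base_h)]
    return apply


# Toy weights (assumed for illustration): identity first layer.
W1 = [[1.0, 0.0], [0.0, 1.0]]
W2 = [1.0, 10.0]
base, source = [1.0, 2.0], [3.0, 4.0]

if __name__ == "__main__":
    source_h = hidden(source, W1)
    print(run(base, W1, W2))                              # 21.0
    print(run(base, W1, W2, interchange(source_h, {1})))  # 41.0
```

Swapping only unit 1 moves the output from the base value toward the source value, which is the kind of evidence used to localize what a given unit contributes.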

All Related

DSPy: Compiling Declarative Language Model Calls into State-of-the-Art Pipelines

Evaluating Human and Machine Understanding of Data Visualizations

A Moderate Proposal for Radically Better AI-powered Web Search
