HAI Weekly Seminar with Art Owen
Variable Importance, Cohort Shapley Value, and Redlining
To explain what a black box algorithm does, Owen says, we can start by studying which variables are important for its decisions. Variable importance is studied by making hypothetical changes to predictor variables. Changing those variables one at a time, however, can produce input combinations that are outliers or very unlikely, physically impossible, or even logically impossible. It is problematic to base an explanation on outputs corresponding to impossible inputs. To avoid this problem, Owen introduces the cohort Shapley (CS) measure, which is based on the Shapley value from cooperative game theory.
There are many tradeoffs in picking a variable importance measure, so CS is not the only reasonable choice. One interesting property of CS is that it can detect 'redlining', meaning the impact of a protected variable on an algorithm's output even when that algorithm was trained without the protected variable.
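
The idea can be illustrated with a minimal sketch. It is not taken from the talk, and it rests on assumptions: a cohort is taken to be the set of observed subjects that match the target subject on a chosen subset of variables, and the value of that subset is the mean output over the cohort. The function name `cohort_shapley`, the `similar` rule (exact equality here), and the toy data with a protected `group` variable and a `zone` variable are all hypothetical.

```python
from itertools import combinations
from math import factorial


def cohort_shapley(X, y, target, similar=lambda a, b: a == b):
    """Sketch of cohort Shapley values for one target subject.

    X: list of rows (tuples of predictor values); y: list of outputs;
    target: index of the subject to explain.
    Assumption: value(subset) is the mean of y over observed subjects that are
    similar to the target on every variable in the subset.
    """
    n_vars = len(X[0])
    xt = X[target]

    def value(subset):
        # Only observed rows are averaged, so no hypothetical input is evaluated.
        cohort = [y[i] for i, row in enumerate(X)
                  if all(similar(row[j], xt[j]) for j in subset)]
        return sum(cohort) / len(cohort)  # never empty: the target matches itself

    phi = []
    for j in range(n_vars):
        others = [k for k in range(n_vars) if k != j]
        total = 0.0
        for size in range(len(others) + 1):
            # Standard Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n_vars - size - 1) / factorial(n_vars)
            for u in combinations(others, size):
                total += weight * (value(u + (j,)) - value(u))
        phi.append(total)
    return phi


# Hypothetical toy data: columns are (group, zone); group is the protected variable.
X = [(0, "A"), (0, "A"), (0, "A"), (0, "B"),
     (1, "A"), (1, "B"), (1, "B"), (1, "B")]

# Hypothetical model that uses zone only and never reads group.
y = [1.0 if zone == "A" else 0.0 for _, zone in X]

phi = cohort_shapley(X, y, target=0)
print(phi)  # [0.125, 0.375]
```

In this toy example the model never reads `group`, yet `group` receives a nonzero share (0.125) because it is correlated with `zone` in the data, which is the kind of redlining effect described above. The two shares sum to 0.5, the gap between the target's output (1.0) and the overall mean output (0.5).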