The Mythos of Model Interpretability
Zachary C. Lipton https://arxiv.org/abs/1606.03490
Outline: What is interpretability? What are its desiderata? What model properties confer interpretability? Caveats, pitfalls, and takeaways.
It is often claimed that a model is interpretable and therefore preferable, yet the desiderata that interpretability serves are seldom defined. Is the term even used consistently across papers? Authors write intelligible, transparent, and understandable, both interchangeably (within papers) and inconsistently (across papers). The one common thread: interpretability is something other than performance.
[Diagram: evaluation metric vs. interpretation]
Either the people seeking interpretable models are crazy, or our evaluation metrics are fundamentally mismatched from real-life objectives.
The aim here is to refine the discourse on interpretability by introducing more specific language.
Desideratum: trust. Does the model know when it’s uncertain? Does it make the same mistakes as a human? Are we comfortable with the model?
Desideratum: causality. We often hope the model’s learned associations tell us something about the natural world.
Desideratum: transferability. Models are trained simply to make predictions, but are often used to take actions. Consider a pneumonia-risk model (used in triage) that assigns lower risk to asthma patients (Caruana et al. 2015): the low risk reflects the aggressive care those patients historically received, so acting directly on the prediction would be dangerous. Training setups often differ from the real world, which is non-stationary, noisier, etc.
We may also want assurance that the model doesn’t depend on associations that break down under those shifts.
Desideratum: informativeness. A model may be trained to make a decision, but used in practice to aid a person in making a decision; an interpretation may simply be valuable for the extra bits of information it carries.
Notions of interpretability tend to fall into two categories: transparency (some understanding of how the model works) and some (potentially post-hoc) explanation of what it has learned.
One notion of transparency is simulatability, essentially simplicity: for example, advocating small decision trees, so that a person can step through the entire algorithm in reasonable time.
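As a concrete sketch of this notion (my own illustration, not from the talk), the snippet below uses scikit-learn to cap a decision tree's depth and leaf count so the whole decision procedure can be printed and traced by hand; the dataset and the particular size limits are arbitrary choices.

# A minimal sketch of simulatability-as-simplicity: limit the tree's size so a
# person can step through the entire decision procedure in reasonable time.
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_breast_cancer()

# max_depth / max_leaf_nodes are the knobs that keep the model human-simulatable.
tree = DecisionTreeClassifier(max_depth=3, max_leaf_nodes=8, random_state=0)
tree.fit(data.data, data.target)

# The whole model fits on one screen and can be traced rule by rule.
print(export_text(tree, feature_names=list(data.feature_names)))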
A second notion is decomposability: understanding the individual components of a model, such as the weights of a linear model or the nodes of a decision tree.
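A tiny sketch of decomposability (again my own example, on an arbitrary dataset): each coefficient of a linear model can be read on its own, though that reading is only meaningful when the input features are themselves meaningful.

# Decomposability sketch: every weight of a linear model admits an individual reading.
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression

data = load_diabetes()
model = LinearRegression().fit(data.data, data.target)

# Each (feature, weight) pair can be inspected in isolation; the caveat is that
# these readings depend on the features being individually interpretable.
for name, weight in zip(data.feature_names, model.coef_):
    print(f"{name}: {weight:+.2f}")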
A third notion, algorithmic transparency, would require only that we understand the behavior of the learning algorithm itself, as with convex optimization and generalization bounds.
"Ah yes, something curious is happening in node 75,345,167… maybe it sees a cat? Maybe we'll see something awesome if we jiggle the inputs?"
One form of post-hoc interpretation is text explanation: absent transparency, we might train a (possibly separate) model to generate explanations, e.g., treating generated captions as interpretations. (Image: Karpathy et al. 2015)
Another is local explanation: while the full mapping between input and output might be impossible to describe succinctly, an explanation of the model's behavior around a particular input is potentially useful. (Image: Wang et al. 2016)
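One way to make this concrete (a sketch of my own, in the spirit of LIME-style local surrogates rather than anything shown in the talk): perturb a single input, query the black-box model in that neighborhood, and fit a small linear model whose coefficients serve as the local explanation.

# Local-explanation sketch: approximate a black-box model around one input with
# a linear surrogate; the surrogate's coefficients are the local explanation.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import Ridge

data = load_breast_cancer()
black_box = GradientBoostingClassifier(random_state=0).fit(data.data, data.target)

x0 = data.data[0]                                # the instance to explain
rng = np.random.default_rng(0)
neighborhood = x0 + rng.normal(scale=0.1 * data.data.std(axis=0),
                               size=(500, data.data.shape[1]))
probs = black_box.predict_proba(neighborhood)[:, 1]

# Fit the surrogate on offsets from x0 so coefficients describe local sensitivity.
surrogate = Ridge(alpha=1.0).fit(neighborhood - x0, probs)
top = np.argsort(-np.abs(surrogate.coef_))[:5]
for i in top:
    print(f"{data.feature_names[i]}: {surrogate.coef_[i]:+.4f}")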
Yet another is explanation by example: a post-hoc explanation might retrieve labeled items that the model deems similar to the case at hand, e.g., retrieving histories from similar patients. (Image: Mikolov et al. 2014)
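A rough sketch of explanation by example (my own construction; using the hidden layer is just one way to obtain "similarity according to the model"): retrieve the training points nearest to a query in the model's learned representation.

# Explanation-by-example sketch: report the training items the model itself
# considers most similar to a query, via distances in its hidden representation.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0).fit(X, y)

def hidden_repr(model, data):
    """Activations of the single ReLU hidden layer, the model's notion of similarity."""
    return np.maximum(data @ model.coefs_[0] + model.intercepts_[0], 0)

H = hidden_repr(clf, X)
query = hidden_repr(clf, X[:1])
nearest = np.argsort(np.linalg.norm(H - query, axis=1))[1:6]  # skip the query itself
print("prediction:", clf.predict(X[:1])[0])
print("labels of the model's nearest neighbors:", y[nearest])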
Caveats and takeaways: depending on the notion of transparency invoked, linear models are not strictly more interpretable than deep learning.
Acknowledgments: Zachary C. Lipton was supported by the Division of Biomedical Informatics at UCSD, via training grant (T15LM011271) from the NIH/NLM. Thanks to Charles Elkan, Julian McAuley, David Kale, Maggie Makar, Been Kim, Lihong Li, Rich Caruana, Daniel Fried, Jack Berkowitz, & Sepp Hochreiter
References:
The Mythos of Model Interpretability (ICML Workshop on Human Interpretability 2016) - ZC Lipton http://arxiv.org/abs/1606.03490
Directly Modeling Missing Data with RNNs (MLHC 2016) - ZC Lipton, DC Kale, R Wetzel http://arxiv.org/abs/1606.04130
Learning to Diagnose (ICLR 2016) - ZC Lipton, DC Kale, C Elkan, R Wetzel http://arxiv.org/abs/1511.03677
Intelligible Models for Healthcare: Predicting Pneumonia Risk and Hospital 30-day Readmission (KDD 2015) - R Caruana et al. http://dl.acm.org/citation.cfm?id=2788613