

SLIDE 1

The Mythos of Model Interpretability

Zachary C. Lipton https://arxiv.org/abs/1606.03490

SLIDE 2

Outline

  • What is interpretability?
  • What are its desiderata?
  • What model properties confer interpretability?
  • Caveats, pitfalls, and takeaways
SLIDE 3

What is Interpretability?

  • Many papers make axiomatic claims that some model is interpretable and therefore preferable
  • But what interpretability is and precisely what desiderata it serves are seldom defined
  • Does interpretability hold consistent meaning across papers?

SLIDE 4

Inconsistent Definitions

  • Papers use the words interpretable, explainable, intelligible, transparent, and understandable, both interchangeably (within papers) and inconsistently (across papers)
  • One common thread, however, is that interpretability is something other than performance

SLIDE 5

We want good models

[Diagram: a model judged by an evaluation metric]

SLIDE 6

We also want interpretable models

[Diagram: evaluation metric plus interpretation]

SLIDE 7

The Human Wants Something the Metric Doesn’t

[Diagram: evaluation metric plus interpretation]

SLIDE 8

What Gives?

  • So either the metric captures everything and people seeking interpretable models are crazy, or…
  • The metrics / loss functions we optimize are fundamentally mismatched with real-life objectives
  • We hope to refine the discourse on interpretability, introducing more specific language
  • Through the lens of the literature, we create a taxonomy of both objectives & methods

SLIDE 9

Outline

  • What is interpretability?
  • What are its desiderata?
  • What model properties confer interpretability?
  • Caveats, pitfalls, and takeaways
SLIDE 10

Trust

  • Does the model know when it’s uncertain?
  • Does the model make the same mistakes as a human?
  • Are we comfortable with the model?

SLIDE 11

Causality

  • We may want models to tell us something about the natural world
  • Supervised models are trained simply to make predictions, but are often used to take actions
  • Caruana (2015) shows a mortality predictor (for use in triage) that assigns lower risk to asthma patients

SLIDE 12

Transferability

  • The idealized training setups often differ from the real world
  • The real problem may be non-stationary, noisier, etc.
  • Want sanity-checks that the model doesn’t depend on weaknesses in the setup
SLIDE 13

Informativeness

  • We may train a model to make a decision
  • But its real purpose is to aid a person in making a decision
  • Thus an interpretation may simply be valuable for the extra bits it carries

SLIDE 14

Outline

  • What is interpretability?
  • What are its desiderata?
  • What model properties confer interpretability?
  • Caveats, pitfalls, and takeaways
SLIDE 15

Transparency

  • Proposed solutions conferring interpretability tend to fall into two categories
  • Transparency addresses understanding how the model works
  • Explainability concerns the model’s ability to offer some (potentially post-hoc) explanation

SLIDE 16

Simulatability

  • One notion of transparency is simplicity
  • This accords with papers advocating small decision trees
  • A model is transparent if a person can step through the algorithm in reasonable time (see the sketch below)
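
A minimal sketch of simulatability, assuming scikit-learn and its bundled iris dataset: a depth-2 decision tree yields a handful of if/else rules a person can step through by hand.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()
tree = DecisionTreeClassifier(max_depth=2).fit(data.data, data.target)

# Print the small rule set a human can simulate directly.
print(export_text(tree, feature_names=data.feature_names))
```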

SLIDE 17

Decomposability

  • A relaxed notion requires understanding individual components of a model
  • Such as: the weights of a linear model or the nodes of a decision tree (see the sketch below)
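
A minimal sketch of decomposability, assuming scikit-learn and its bundled diabetes dataset: each weight of a standardized linear model can be read off as one individually meaningful component.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

data = load_diabetes()
X = StandardScaler().fit_transform(data.data)  # put features on a common scale
model = LinearRegression().fit(X, data.target)

# Each coefficient is an inspectable piece of the model.
for name, weight in zip(data.feature_names, model.coef_):
    print(f"{name}: {weight:+.1f}")
```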

SLIDE 18

Transparent Algorithms

  • A yet weaker notion would require only that we understand the behavior of the algorithm
  • E.g. convergence of convex optimization, generalization bounds (see the bound below)
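
As one standard illustration (a textbook fact, not from the slides): gradient descent with step size $1/L$ on a convex, $L$-smooth objective $f$ satisfies

$$f(x_t) - f(x^*) \le \frac{L \,\lVert x_0 - x^* \rVert^2}{2t},$$

so we understand the algorithm’s behavior even when the learned model itself is opaque.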


SLIDE 19

Post-Hoc Interpretability

“Ah yes, something cool is happening in node 75, 345, 167… maybe it sees a cat? Maybe we’ll see something awesome if we jiggle the inputs?”
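
“Jiggling the inputs” can be made concrete. A minimal sketch, assuming a hypothetical black-box `predict` function that returns a score for one class: perturb each input dimension and watch how much the score moves.

```python
import numpy as np

def perturbation_importance(predict, x: np.ndarray, eps: float = 0.1) -> np.ndarray:
    """Score each feature by how much jiggling it changes the prediction."""
    base = predict(x)
    importance = np.zeros(len(x))
    for i in range(len(x)):
        bumped = x.copy()
        bumped[i] += eps  # jiggle one input dimension
        importance[i] = abs(predict(bumped) - base)
    return importance
```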

SLIDE 20

Verbal Explanations

  • Just as people generate explanations (absent transparency), we might train a (possibly separate) model to generate explanations
  • We might consider image captions as interpretations of object predictions (Image: Karpathy et al 2015)

SLIDE 21

Saliency Maps

  • While the full relationship between input and output might be impossible to describe succinctly, local explanations are potentially useful (see the sketch below) (Image: Wang et al 2016)
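
A minimal sketch of one common saliency recipe, assuming PyTorch and a hypothetical pretrained classifier `model`: the gradient of the top class score with respect to the input pixels gives a local, per-pixel explanation.

```python
import torch

def saliency_map(model: torch.nn.Module, image: torch.Tensor) -> torch.Tensor:
    """Gradient saliency for a single (channels, height, width) image."""
    image = image.clone().requires_grad_(True)
    scores = model(image.unsqueeze(0))     # shape: (1, num_classes)
    scores[0, scores.argmax()].backward()  # d(top class score) / d(input)
    # Max over color channels: one importance value per pixel.
    return image.grad.abs().max(dim=0).values
```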

SLIDE 22

Case-Based Explanations

  • Another way to generate a post-hoc explanation might be to retrieve labeled items that are deemed similar by the model (see the sketch below)
  • For some models, we can retrieve histories from similar patients (Image: Mikolov et al 2014)
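
A minimal sketch, assuming the model exposes an embedding vector for each item (the embedding step itself is hypothetical here): explain a prediction by returning the most similar labeled training items.

```python
import numpy as np

def similar_cases(query_vec: np.ndarray, train_vecs: np.ndarray, k: int = 3) -> np.ndarray:
    """Indices of the k training items nearest the query in embedding space."""
    # Cosine similarity between the query and every training embedding.
    sims = train_vecs @ query_vec / (
        np.linalg.norm(train_vecs, axis=1) * np.linalg.norm(query_vec)
    )
    return np.argsort(-sims)[:k]
```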

SLIDE 23

Outline

  • What is interpretability?
  • What are its desiderata?
  • What model properties confer interpretability?
  • Caveats, pitfalls, and takeaways
SLIDE 24

Discussion Points

  • Linear models not strictly more interpretable than deep learning

  • Claims about interpretability must be qualified
  • Transparency may be at odds with the goals of AI
  • Post-hoc interpretations may potentially mislead
SLIDE 25

Thanks!

Acknowledgments: Zachary C. Lipton was supported by the Division of Biomedical Informatics at UCSD, via training grant (T15LM011271) from the NIH/NLM. Thanks to Charles Elkan, Julian McAuley, David Kale, Maggie Makar, Been Kim, Lihong Li, Rich Caruana, Daniel Fried, Jack Berkowitz, & Sepp Hochreiter

References:
  • The Mythos of Model Interpretability (ICML Workshop on Human Interpretability 2016) - ZC Lipton http://arxiv.org/abs/1606.03490
  • Directly Modeling Missing Data with RNNs (MLHC 2016) - ZC Lipton, DC Kale, R Wetzel http://arxiv.org/abs/1606.04130
  • Learning to Diagnose (ICLR 2016) - ZC Lipton, DC Kale, C Elkan, R Wetzel http://arxiv.org/abs/1511.03677
  • Intelligible Models for Healthcare: Predicting Pneumonia Risk and Hospital 30-day Readmission (KDD 2015) - R Caruana et al. http://dl.acm.org/citation.cfm?id=2788613