The Mythos of Model Interpretability
Zachary C. Lipton


SLIDE 1

The Mythos of Model Interpretability

Zachary C. Lipton https://arxiv.org/abs/1606.03490

SLIDE 2

Outline

  • What is interpretability?
  • What are its desiderata?
  • What model properties confer interpretability?
  • Caveats, pitfalls, and takeaways
SLIDE 3

What is Interpretability?

  • Many papers make axiomatic claims


This model is {interpretable, explainable, intelligible, transparent, understandable}

  • But what is interpretability, and why is it desirable?
  • Does it hold a consistent meaning across papers?
SLIDE 4

We want good models

Evaluation Metric

SLIDE 5

We also want interpretable models

Evaluation Metric Interpretation

SLIDE 6

The Human Wants Something the Metric Doesn’t

Evaluation Metric Interpretation

SLIDE 7

So What’s Up?

It seems either:

  • The metric captures everything and people are crazy, or
  • The metric is mismatched with our real objectives

We hope to refine the discourse on interpretability. In dialogue with the literature, we create a taxonomy of both objectives & methods.

SLIDE 8

Outline

  • What is interpretability?
  • What are its desiderata?
  • What model properties confer interpretability?
  • Caveats, pitfalls, and takeaways
SLIDE 9

Trust

  • Does the model know when it’s uncertain?
  • Does the model make the same mistakes as humans?
  • Are we comfortable with the model?
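
One way to make the first bullet concrete: a classifier that reports probabilities can defer to a human when its confidence is low. A minimal sketch, not from the paper; the `predict_or_defer` helper and the 0.8 threshold are invented for illustration:

```python
# Hypothetical sketch: defer to a human when the model is uncertain.
# `probs` stands in for any probabilistic classifier's output.
def predict_or_defer(probs, threshold=0.8):
    """Return the argmax class, or None to signal 'defer to a human'."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return best if probs[best] >= threshold else None

print(predict_or_defer([0.05, 0.92, 0.03]))  # confident: class 1
print(predict_or_defer([0.40, 0.35, 0.25]))  # uncertain: None
```

Whether such confidence scores are trustworthy is itself an interpretability question: a miscalibrated model will defer too rarely or too often.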

SLIDE 10

Causality

  • Tell us something about the natural world
  • Predictions vs. actions
  • Caruana (2015) shows a mortality predictor (for use in triage) that assigns lower risk to asthma patients

SLIDE 11

Transferability

  • Training setups differ from the wild
  • Reality may be non-stationary, noisy
  • Don’t want the model to depend on a weak setup

SLIDE 12

Informativeness

  • We may train a model to make a *decision*
  • But its real purpose is to be a feature
  • Thus an interpretation may simply be valuable for the extra bits it carries

SLIDE 13

Outline

  • What is interpretability?
  • What are its desiderata?
  • What model properties confer interpretability?
  • Caveats, pitfalls, and takeaways
SLIDE 14

Transparency

  • Proposed solutions conferring interpretability tend to fall into two categories
  • Transparency addresses understanding how the model works
  • Explainability concerns the model’s ability to offer some (potentially post-hoc) explanation

SLIDE 15

Simulatability

  • One notion of transparency is simplicity
  • Small decision trees, sparse linear models, rules
  • A model is simulatable if a person can *run* it
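
To illustrate simulatability, here is a sparse linear score a person could compute by hand in a few seconds. The feature names and weights are invented for illustration, not taken from the paper:

```python
# A sparse linear model is "simulatable": a person can run it by hand.
# All names and numbers below are hypothetical.
weights = {"age_over_65": 1.0, "prior_admissions": 0.5, "smoker": 0.8}
bias = -1.2

def score(features):
    """Risk score = bias + sum of weight * feature value."""
    return bias + sum(weights[f] * v for f, v in features.items())

patient = {"age_over_65": 1, "prior_admissions": 2, "smoker": 0}
print(score(patient))  # -1.2 + 1.0 + 1.0 + 0.0 = 0.8
```

With three nonzero weights, the whole computation fits in a reader's head; the same model with thousands of features would no longer be simulatable, even though it is still linear.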

SLIDE 16

Decomposability

  • A relaxed notion requires understanding individual components of a model
  • Such as: weights of a linear model or the nodes of a decision tree
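
A sketch of decomposability: in a linear model, each term can be read on its own as a per-feature contribution to the score. The names and weights below are made up:

```python
# Decomposability sketch: each weight has a standalone interpretation,
# so the score breaks into per-feature parts. Hypothetical values.
weights = {"blood_pressure": 0.7, "age": 0.2, "smoker": 1.1}

def contributions(x):
    # Each term weights[f] * x[f] can be inspected on its own.
    return {f: weights[f] * x.get(f, 0.0) for f in weights}

print(contributions({"blood_pressure": 1.0, "age": 3.0}))
```

Each entry of the result answers "how much did this feature move the prediction?", which is exactly the kind of component-level understanding the slide describes.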

SLIDE 17

Transparent Algorithms

  • We understand the behavior of the algorithm (but maybe not its output)
  • E.g. convergence of convex optimization, generalization bounds

SLIDE 18

Post-Hoc Interpretability

Ah yes, something cool is happening in node 75, 345, 167 … maybe it sees a cat? Try jiggling the inputs?
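
The "jiggling the inputs" idea can be sketched as a finite-difference sensitivity probe of a black-box model. This is a generic technique, not a method from the paper; the toy model `f` is invented:

```python
# Post-hoc probe of a black box: perturb each input dimension slightly
# and watch how the output moves (finite-difference sensitivity).
def sensitivity(model, x, eps=1e-4):
    base = model(x)
    grads = []
    for i in range(len(x)):
        xp = list(x)
        xp[i] += eps
        grads.append((model(xp) - base) / eps)
    return grads

# Toy black box: f(x) = 3*x0 + x1**2
f = lambda x: 3 * x[0] + x[1] ** 2
print(sensitivity(f, [1.0, 2.0]))  # roughly [3.0, 4.0]
```

Note the limitation the slide's joke points at: the probe tells you which inputs the output is locally sensitive to, not *why* the model behaves that way.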

SLIDE 19

Verbal Explanations

  • Just as people generate explanations (absent transparency), we might train a (possibly separate) model to generate explanations
  • Could think of captions as interpretations of a classification model

(Image: Karpathy et al 2015)

SLIDE 20

Saliency Maps

  • The mapping between input & output might be impossible to describe succinctly, so local explanations are potentially useful. (Image: Wang et al 2016)
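
A saliency-style local explanation can be approximated without gradients by the same finite-difference idea, normalized so the most influential input scores 1.0. The toy linear "network" below stands in for a real model and is purely illustrative:

```python
# Saliency sketch: score each input element by the magnitude of a
# finite-difference gradient, then normalize to [0, 1].
def saliency(model, x, eps=1e-4):
    base = model(x)
    grads = []
    for i in range(len(x)):
        xp = list(x)
        xp[i] += eps
        grads.append(abs(model(xp) - base) / eps)
    m = max(grads) or 1.0  # avoid division by zero on a flat model
    return [g / m for g in grads]

f = lambda x: 2 * x[0] + 0.5 * x[1]  # toy stand-in for a network
print(saliency(f, [0.3, 0.7]))  # approximately [1.0, 0.25]
```

For images, the same per-pixel scores rendered as a heatmap give the familiar saliency-map visualization.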

SLIDE 21

Case-Based Explanations

  • Retrieve labeled items that look similar to the model
  • Doctors employ this technique to explain treatments

(Image: Mikolov et al 2014)
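
Case-based explanation can be sketched as nearest-neighbor retrieval: justify a prediction by showing the training examples closest to the query in the model's feature space. The toy data and the `nearest` helper are hypothetical:

```python
# Case-based explanation sketch: retrieve the k training examples
# nearest to the query. In practice distances would be computed in a
# learned representation; here plain feature space, with toy data.
def nearest(query, examples, k=2):
    dist = lambda a, b: sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return sorted(examples, key=lambda e: dist(e[0], query))[:k]

train = [([0.0, 0.0], "cat"), ([1.0, 1.0], "dog"), ([0.1, 0.2], "cat")]
print(nearest([0.05, 0.1], train))  # the two nearby "cat" examples
```

The explanation offered to the user is then "we predicted cat because the input resembles these labeled cases", mirroring how a doctor cites similar past patients.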

SLIDE 22

Outline

  • What is interpretability?
  • What are its desiderata?
  • What model properties confer interpretability?
  • Caveats, pitfalls, and takeaways
SLIDE 23

Discussion Points

  • Linear models not strictly more interpretable than deep learning

  • Claims about interpretability must be qualified
  • Transparency may be at odds with the goals of AI
  • Post-hoc interpretations may potentially mislead
SLIDE 24

Thanks!

Acknowledgments: Zachary C. Lipton was supported by the Division of Biomedical Informatics at UCSD, via training grant (T15LM011271) from the NIH/NLM. Thanks to Charles Elkan, Julian McAuley, David Kale, Maggie Makar, Been Kim, Lihong Li, Rich Caruana, Daniel Fried, Jack Berkowitz, & Sepp Hochreiter

References: The Mythos of Model Interpretability (ICML Workshop on Human Interpretability 2016), ZC Lipton, https://arxiv.org/abs/1606.03490