
SLIDE 1

Interpretability in Machine Learning

SLIDE 2

Why Interpret?

SLIDE 3

The current state of machine learning

SLIDE 4

And its uses ...

Sources: https://www.tesla.com/videos/autopilot-self-driving-hardware-neighborhood-long, DeepMind, NYPost, MIT Technology Review

SLIDE 5

So are we in the golden age of AI?

SLIDE 6

Safety and well-being

SLIDE 7

Bias in algorithms

https://medium.com/@Joy.Buolamwini/response-racial-and-gender-bias-in-amazon-rekognition-commercial-ai-system-for-analyzing-faces-a289222eeced
https://www.infoq.com/presentations/unconscious-bias-machine-learning/

SLIDE 8

Adversarial Examples

SLIDE 9

Legal Issues - GDPR

SLIDE 10

And more ...

  • Interactive feedback – Can the model learn from human actions in an online setting? (Can you tell a model not to repeat a specific mistake?)
  • Recourse – Can a model tell us what actions we can take to change its output? (For example, what can you do to improve your credit score?)
SLIDE 11

In general, it seems like there are a few fundamental problems –

  • We don’t trust the models
  • We don’t know what happens in extreme cases
  • Mistakes can be expensive / harmful
  • Does the model make similar mistakes to humans?
  • How do we change the model when things go wrong?

Interpretability is one way we try to deal with these problems

SLIDE 12

What is interpretability?

SLIDE 13

  • There is no standard definition – most agree it is something different from performance.
  • The ability to explain or to present a model in understandable terms to humans (Doshi-Velez 2017).
  • Cynical view – it is whatever makes you feel good about the model.
  • It really depends on the target audience.

SLIDE 14

What does interpretation look like?

  • In pre-deep-learning machine learning, some models (e.g., linear regression, decision trees) are considered "interpretable" by design, as in the sketch below.
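
A minimal sketch, assuming scikit-learn and its bundled iris data (neither appears in the slides): "interpretable by design" models expose their reasoning directly through their parameters.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
X, y, names = iris.data, iris.target, iris.feature_names

# Linear model: each coefficient says how strongly a feature pushes the class score.
linear = LogisticRegression(max_iter=1000).fit(X, y)
for name, weight in zip(names, linear.coef_[0]):
    print(f"{name}: {weight:+.3f}")

# Decision tree: the learned rules can be printed as nested if/else statements.
tree = DecisionTreeClassifier(max_depth=3).fit(X, y)
print(export_text(tree, feature_names=names))
```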

SLIDE 15

What does interpretation look like?

  • Heatmap Visualization

[Jain 2018] [Sundararajan 2017]

SLIDE 16

What does interpretation look like?

  • Give prototypical examples

By Chire – Own work, Public Domain, https://commons.wikimedia.org/w/index.php?curid=11765684 [Kim 2016]

SLIDE 17

What does interpretation look like?

  • Bake it into the model

[Bastings et al 2019]

SLIDE 18

What does interpretation look like?

  • Provide explanation as text

[Rajani et al 2019] [Hancock et al 2018]

SLIDE 19

Some properties of Interpretations

  • Faithfulness – Does the explanation accurately represent the true reasoning behind the model's final decision?
  • Plausibility – Is the explanation correct, or something we can believe is true, given our current knowledge of the problem?
  • Understandability – Can it be put in terms that an end user without in-depth knowledge of the system can understand?
  • Stability – Do similar instances have similar interpretations?
SLIDE 20

Evaluating Interpretability [Doshi-Velez 2017]

  • Application-level evaluation – Put the model into practice and have the end users interact with the explanations to see if they are useful.
  • Human evaluation – Set up a Mechanical Turk task and ask non-experts to judge the explanations.
  • Functional evaluation – Design metrics that directly test properties of your explanation.
SLIDE 21

How to “interpret”? Some definitions

SLIDE 22

Global vs Local

  • Local – Do we explain an individual prediction? Examples: heatmaps, rationales.
  • Global – Do we explain the entire model? Examples: prototypes, linear regression, decision trees.

SLIDE 23

Inherent vs Post-hoc

  • Inherent – Is the explainability built into the model? Examples: rationales, linear regression, decision trees, natural language explanations.
  • Post-hoc – Is the model a black box that we try to understand with an external method? Examples: heatmaps (some forms), prototypes.

SLIDE 24

Model-Based vs Model-Agnostic

  • Model-based – Can it explain only a few classes of models? Examples: rationales, LR / decision trees, attention, gradients (differentiable models only).
  • Model-agnostic – Can it explain any model? Examples: LIME (Local Interpretable Model-agnostic Explanations), SHAP (Shapley values).

SLIDE 25

Some Locally Interpretable, Post-hoc methods

SLIDE 26

Saliency-Based Methods

  • Heatmap-based visualization
  • Need a differentiable model in most cases
  • Normally involve gradients

[Diagram: the model's prediction ("dog") is passed to an explanation method, which produces a heatmap over the input.]

SLIDE 27

[Adebayo et al 2018]

SLIDE 28

Saliency Example - Gradients

$g(x) : \mathbb{R}^d \to \mathbb{R}$

$F(g)(x)_i = \dfrac{\partial g(x)}{\partial x_i}$

How do we take the gradient with respect to words? Take the gradient with respect to the embedding of the word.
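
A rough sketch of this in PyTorch, under assumptions not in the slides: `model.embedding` and `model.forward_from_embeddings` are hypothetical hooks into the model, and each token's saliency is the norm of the gradient of the target score with respect to that token's embedding.

```python
import torch

def gradient_saliency(model, token_ids, target_class):
    # Sketch only: `embedding` and `forward_from_embeddings` are assumed hooks.
    model.eval()
    emb = model.embedding(token_ids)          # (seq_len, emb_dim), differentiable
    emb.retain_grad()                         # keep the gradient on this non-leaf tensor
    score = model.forward_from_embeddings(emb)[target_class]
    score.backward()                          # d(score) / d(embeddings)
    return emb.grad.norm(dim=-1)              # one saliency value per token
```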

SLIDE 29

Saliency Example – Leave-one-out

$g(x) : \mathbb{R}^d \to \mathbb{R}$

$F(g)(x)_i = g(x) - g(x_{\setminus i})$

How do we remove feature $i$?
  1. Zero out pixels in an image.
  2. Remove the word from the text.
  3. Replace the value with the population mean in tabular data.
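
For text, the formula above amounts to a short loop like the sketch below, where `predict_proba` is a placeholder name (not from the slides) for the black box's probability of the class being explained.

```python
# Leave-one-out saliency: score each token by how much the prediction drops
# when that token is removed. `predict_proba` is a hypothetical black box.
def leave_one_out(tokens, predict_proba):
    base = predict_proba(tokens)                      # g(x)
    scores = []
    for i in range(len(tokens)):
        reduced = tokens[:i] + tokens[i + 1:]         # x with word i removed
        scores.append(base - predict_proba(reduced))  # F(g)(x)_i = g(x) - g(x \ i)
    return scores
```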

SLIDE 30

Problems with Saliency Maps

  • Only capture first-order information.
  • Strange things can happen to heatmaps at second order.

[Feng et al 2018]

SLIDE 31

(Slide Credit – Julius Adebayo)

SLIDE 32

LIME – Local Interpretable Model-agnostic Explanations

[Diagram: the same inputs are fed to the black box (e.g., a neural network) and to a linear model; the linear model is fit so that its outputs are as close as possible to the black box's outputs.]

We can't do this globally, of course, but can we do it locally? That is the main idea behind LIME.

(Image Credit – Hung-yi Lee)

SLIDE 33

Intuition behind LIME

[Ribeiro et al 2016]

SLIDE 34

LIME - Image

  • 1. Given a data point you want to explain.
  • 2. Sample nearby points. Each image is represented as a set of superpixels (segments); randomly delete some segments and have the black box compute the probability of "frog" for each perturbed image (e.g., 0.85, 0.01, 0.52).

Ref: https://medium.com/@kstseng/lime-local-interpretable-model-agnostic-explanation%E6%8A%80%E8%A1%93%E4%BB%8B%E7%B4%B9-a67b6c34c3f8

(Slide Credit – Hung-yi Lee)

SLIDE 35

LIME - Image

  • 3. Fit a linear (or otherwise interpretable) model.

From each perturbed image, extract a binary vector $y_1, \dots, y_N$, where $N$ is the number of segments and

$y_n = \begin{cases} 1 & \text{segment } n \text{ exists} \\ 0 & \text{segment } n \text{ is deleted} \end{cases}$

Fit the linear model to map these vectors to the black-box probabilities (0.85, 0.01, 0.52, ...).

(Slide Credit – Hung-yi Lee)

SLIDE 36

LIME - Image

  • 4. Interpret the model you learned.

$z = x_1 y_1 + \dots + x_n y_n + \dots + x_N y_N$

where $N$ is the number of segments and $y_n = 1$ if segment $n$ exists, $0$ if it is deleted. Reading off the learned weights:
  • If $x_n \approx 0$, segment $n$ is not related to "frog".
  • If $x_n$ is positive, segment $n$ indicates the image is "frog".
  • If $x_n$ is negative, segment $n$ indicates the image is not "frog".

(Slide Credit – Hung-yi Lee)

SLIDE 37

The Math behind LIME

$\xi(x) = \operatorname*{argmin}_{g \in G} \; \mathcal{L}(f, g, \pi_x) + \Omega(g)$

The loss term $\mathcal{L}$ matches the interpretable model $g$ to the black box $f$ in the neighborhood $\pi_x$ of the instance; the penalty $\Omega(g)$ controls the complexity of the interpretable model.
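
A from-scratch sketch of the whole procedure for text (sample masked versions, weight them by proximity, fit a regularized linear model). `predict_proba` is a hypothetical black box, and this is an illustration of the idea rather than the official `lime` package.

```python
import numpy as np
from sklearn.linear_model import Ridge

def lime_text(text, predict_proba, num_samples=1000, sigma=0.75):
    rng = np.random.default_rng(0)
    words = text.split()
    N = len(words)
    masks = rng.integers(0, 2, size=(num_samples, N))  # y_n = 1 keep word n, 0 delete it
    masks[0] = 1                                        # keep the original instance too
    targets, weights = [], []
    for m in masks:
        kept = " ".join(w for w, keep in zip(words, m) if keep)
        targets.append(predict_proba(kept))             # black-box output z
        distance = 1.0 - m.mean()                       # fraction of words removed
        weights.append(np.exp(-(distance ** 2) / sigma ** 2))  # proximity kernel pi_x
    # Weighted linear fit: coefficients x_n play the role of the weights in
    # z = x_1 y_1 + ... + x_N y_N; the Ridge penalty stands in for Omega(g).
    linear = Ridge(alpha=1.0).fit(masks, targets, sample_weight=weights)
    return list(zip(words, linear.coef_))
```

Positive weights mark words that push the black box toward the explained class, negative weights mark words that push away from it, matching the reading on the previous slide.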

SLIDE 38

Example from NLP

SLIDE 39

Rationalization Models

SLIDE 40

General Idea

[Diagram: an Extractor selects part of the input and passes only that part to a Classifier, which makes the prediction, e.g., "Tree frog (97%)" or "Positive (98%)".]
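
A rough PyTorch sketch of this extractor–classifier wiring. It is a deliberate simplification of the rationale models in Tao Lei's slides: a linear extractor and a straight-through mask stand in for the original architecture and its REINFORCE-style training, and all sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class RationaleModel(nn.Module):
    def __init__(self, vocab_size, emb_dim=100, hidden=128, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.extractor = nn.Linear(emb_dim, 1)            # per-token keep/drop score
        self.classifier = nn.Sequential(
            nn.Linear(emb_dim, hidden), nn.ReLU(), nn.Linear(hidden, num_classes)
        )

    def forward(self, token_ids):
        emb = self.embed(token_ids)                       # (batch, seq, emb_dim)
        keep_prob = torch.sigmoid(self.extractor(emb))    # (batch, seq, 1)
        hard_mask = (keep_prob > 0.5).float()             # the rationale: keep or drop each token
        # Straight-through estimator so gradients still reach the extractor.
        mask = hard_mask + keep_prob - keep_prob.detach()
        rationale = emb * mask                            # classifier sees only kept tokens
        logits = self.classifier(rationale.mean(dim=1))   # pool over the sequence
        return logits, hard_mask.squeeze(-1)
```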

SLIDE 41

(Slides Credit – Tao Lei)

SLIDE 42

(Slides Credit – Tao Lei)

SLIDE 43

(Slides Credit – Tao Lei)

SLIDE 44

(Slides Credit – Tao Lei)

SLIDE 45

(Slides Credit – Tao Lei)

SLIDE 46

(Slides Credit – Tao Lei)

SLIDE 47

(Slides Credit – Tao Lei)

SLIDE 48

(Slides Credit – Tao Lei)

SLIDE 49

FRESH Model – Faithful Rationale Extraction using Saliency Thresholding

SLIDE 50

FRESH Model – Faithful Rationale Extraction using Saliency Thresholding

SLIDE 51

FRESH Model – Faithful Rationale Extraction using Saliency Thresholding
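
As the name suggests, FRESH decouples rationale extraction from prediction: score tokens with some saliency method, threshold to keep the top-scoring ones, and train a separate classifier that only ever sees the kept tokens, so the rationale is faithful by construction. The sketch below is a loose reading of that pipeline under heavy assumptions (`saliency_fn`, `train_classifier`, and a fixed top-k threshold are all placeholders), not the authors' implementation.

```python
# Hedged sketch of saliency-thresholded rationale extraction; `saliency_fn`
# and `train_classifier` are hypothetical placeholders.
def fresh_pipeline(dataset, saliency_fn, train_classifier, k=5):
    rationales = []
    for tokens, label in dataset:
        scores = saliency_fn(tokens)                    # e.g., attention or gradient saliency
        top = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)[:k]
        rationale = [tokens[i] for i in sorted(top)]    # keep the original word order
        rationales.append((rationale, label))
    # The final predictor is trained on rationales only: it cannot rely on
    # tokens outside the extracted rationale.
    return train_classifier(rationales)
```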

SLIDE 52

Some Results – Functional Evaluation

SLIDE 53

Some Results – Human Evaluation

SLIDE 54

Some Results – Human Evaluation

SLIDE 55

Important Points to take away

  • Interpretability – no consistent definition.
  • When designing a new system, ask your stakeholders what they want out of it.
  • See if you can use an inherently interpretable model.
  • If not, what method can you use to interpret the black box?
  • Ask – does this method make sense? Question assumptions!
  • Stress test and evaluate!