Sanity Checks for Saliency Maps



SLIDE 1

Sanity Checks for Saliency Maps

Julius Adebayo*+, Justin Gilmer#, Michael Muelly#, Ian Goodfellow#, Moritz Hardt^#, Been Kim#

*Work was done during the Google AI residency program, +MIT, ^UC Berkeley, #Google Brain.

SLIDE 2

Interpretability

To use machine learning more responsibly.

SLIDE 3

Investigating post-training interpretability methods.

Given a fixed model, find the evidence of prediction.

SLIDE 4

Junco Bird-ness

A trained machine learning model (e.g., neural network)

Given a fixed model, find the evidence of prediction. Why was this a Junco bird?

Investigating post-training interpretability methods.

SLIDE 5

One of the most popular techniques:

Saliency maps


Caaaaan do!

A trained machine learning model (e.g., neural network)

The promise: these pixels are the evidence of prediction.

Junco Bird-ness
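
As a rough illustration of what a saliency map computes, one common choice is the absolute gradient of the class score with respect to each input pixel. The slides do not name a specific method at this point, so the gradient-based map, the tiny two-layer network, and every weight value below are assumptions made only for this sketch.

```python
import math

def score(x, W1, w2):
    """Class score of a toy two-layer network: w2 . tanh(W1 x)."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    return sum(v * h for v, h in zip(w2, hidden))

def gradient_saliency(x, W1, w2):
    """Absolute gradient of the score with respect to each input 'pixel'."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    # d score / d x_j = sum_k w2[k] * (1 - hidden[k]^2) * W1[k][j]
    grads = [
        sum(w2[k] * (1.0 - hidden[k] ** 2) * W1[k][j] for k in range(len(W1)))
        for j in range(len(x))
    ]
    return [abs(g) for g in grads]

# Hypothetical weights and a 4-"pixel" input, chosen only for illustration.
W1 = [[0.5, -1.0, 0.0, 0.2], [0.0, 0.3, 0.0, -0.7]]
w2 = [1.0, -2.0]
x = [0.1, 0.4, -0.3, 0.2]

saliency = gradient_saliency(x, W1, w2)
print(saliency)
```

The third input is not connected to the hidden layer in this toy model, so its saliency is zero: the map depends on the model's weights, which is the property the sanity checks probe.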

SLIDE 6

The promise: these pixels are the evidence of prediction.

Sanity check question.


A trained machine learning model (e.g., neural network)

Junco Bird-ness

SLIDE 7

Sanity check question.

A trained machine learning model (e.g., neural network)

If so, when the prediction changes, the explanation should change. Extreme case: if the prediction is random, the explanation should REALLY change.

The promise: these pixels are the evidence of prediction.

Junco Bird-ness

SLIDE 8

Sanity check: When prediction changes, do explanations change?

Saliency map

SLIDE 9

Randomized weights! Network now makes garbage predictions.

Saliency map

Sanity check: When prediction changes, do explanations change?

SLIDE 10

Saliency map

!!!!!???!?

Sanity check: When prediction changes, do explanations change?

Randomized weights! Network now makes garbage predictions.

SLIDE 11

Saliency map

!!!!!???!?

Sanity check: When prediction changes, do explanations change?

Randomized weights! Network now makes garbage predictions.

the evidence of prediction?????

SLIDE 12

Figure: saliency maps before and after randomizing the weights, for Guided Backprop and Integrated Gradients.

Sanity check 1: When prediction changes, do explanations change?

No!
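
Sanity check 1 can be sketched as a small experiment: compute an explanation for the model, replace its weights with random values, compute the explanation again, and measure how similar the two maps are. The slides do not say how similarity is measured, so the rank correlation used here is an assumption, as are the toy network, the stand-in "trained" weights, and the hypothetical `input_only_map` explainer, which is included only to show what a failing method looks like.

```python
import math
import random

def ranks(values):
    """Rank of each entry (0 = smallest); ties are broken by index order."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order):
        result[index] = position
    return result

def rank_correlation(a, b):
    """Spearman rank correlation, computed as the Pearson correlation of ranks."""
    ra, rb = ranks(a), ranks(b)
    mean = (len(a) - 1) / 2.0
    cov = sum((p - mean) * (q - mean) for p, q in zip(ra, rb))
    var = sum((p - mean) * (p - mean) for p in ra)
    return cov / var

def gradient_map(x, W1, w2):
    """|d score / d x| for the toy network score = w2 . tanh(W1 x)."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    return [
        abs(sum(w2[k] * (1.0 - hidden[k] ** 2) * W1[k][j] for k in range(len(W1))))
        for j in range(len(x))
    ]

def input_only_map(x, W1, w2):
    """Hypothetical 'explanation' that never looks at the weights."""
    return [abs(x[j] - x[j - 1]) for j in range(len(x))]

def randomization_check(explain, x, W1, w2, rng):
    """Similarity between the map for the given weights and for random weights."""
    before = explain(x, W1, w2)
    random_W1 = [[rng.gauss(0.0, 1.0) for _ in row] for row in W1]
    random_w2 = [rng.gauss(0.0, 1.0) for _ in w2]
    after = explain(x, random_W1, random_w2)
    return rank_correlation(before, after)

rng = random.Random(0)
x = [0.9, -0.2, 0.4, 0.1, -0.7, 0.3, 0.8, -0.5]
# Stand-in for trained weights (hypothetical values, not a real trained model).
W1 = [[rng.gauss(0.0, 1.0) for _ in x] for _ in range(3)]
w2 = [rng.gauss(0.0, 1.0) for _ in range(3)]

print("gradient map:  ", randomization_check(gradient_map, x, W1, w2, rng))
print("input-only map:", randomization_check(input_only_map, x, W1, w2, rng))
```

A similarity of 1 for the input-only map means randomizing the weights left the explanation untouched, which is the failure pattern the slide reports. A method that reflects the model should generally score lower.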

SLIDE 13

Networks trained with….

Sanity check 2: For networks trained with true labels and with random labels, do explanations deliver different messages?

No!
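
Sanity check 2 can be sketched in the same spirit: train one model on the true labels and another on shuffled labels, then compare their explanations. This is a minimal sketch under stated assumptions. A logistic regression stands in for the networks on the slide, the synthetic data is invented for illustration, and the explanation is a plain gradient map (for a linear model, the gradient of the logit with respect to the input is the weight vector itself).

```python
import math
import random

def train_logistic(X, y, steps=300, lr=0.5):
    """Fit logistic-regression weights (no bias term) by batch gradient descent."""
    w = [0.0] * len(X[0])
    for _ in range(steps):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi))
            p = 1.0 / (1.0 + math.exp(-z))
            for j, xj in enumerate(xi):
                grad[j] += (p - yi) * xj / len(X)
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

rng = random.Random(0)
# Synthetic data (an assumption): 200 inputs with 5 features each.
X = [[rng.uniform(-1.0, 1.0) for _ in range(5)] for _ in range(200)]
true_labels = [1 if row[0] > 0 else 0 for row in X]  # label depends on feature 0 only
random_labels = true_labels[:]
rng.shuffle(random_labels)                           # breaks the input-label relationship

w_true = train_logistic(X, true_labels)
w_random = train_logistic(X, random_labels)

# Gradient saliency of a linear model: the absolute value of each weight.
saliency_true = [abs(w) for w in w_true]
saliency_random = [abs(w) for w in w_random]

print("true labels:  ", saliency_true)
print("random labels:", saliency_random)
```

In this toy setting the model trained on true labels puts most of its saliency on feature 0, while the model trained on shuffled labels has no real signal to pick up. An explanation method that passes the check should show that difference.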

SLIDE 14

Conclusion

Poster #30 10:45am - 12:45pm @Room 210

  • Confirmation bias: Just because it “makes sense” to humans doesn’t mean it reflects the evidence for prediction.

  • Do sanity checks for your interpretability methods! (e.g., TCAV [K. et al ’18])

  • Others who independently reached the same conclusions: [Nie, Zhang, Patel ’18] [Ulyanov, Vedaldi, Lempitsky ’18]

  • Some of these methods have been shown to be useful for humans. Why? More studies needed.