SLIDE 1

Sanity Checks for ‘Saliency’ Maps


Julius Adebayo PhD Student, MIT.

Joint work with

SLIDE 2

Some Motivation

  • Developer/Researcher: model debugging.
  • Safety concerns.
  • Ethical concerns.
  • Trust: satisfy a 'societal' need for reasoning in order to trust an automated system learned from data.

[Challenges for Transparency, Weller 2017; Doshi-Velez & Kim, 2017]

SLIDE 3

Goals: Model Debugging

  • Model debugging: reveal spurious correlations or the kinds of inputs on which a model is most likely to perform poorly. [Ribeiro et al., 2016]

SLIDE 4

Promise of Explanations

  • Model debugging: reveal spurious correlations or the kinds of inputs on which a model is most likely to perform poorly.

[Figure: 'Husky' example image]

SLIDE 5

Promise of Explanations

  • Model debugging: reveal spurious correlations or the kinds of inputs on which a model is most likely to perform poorly.

[Figure: 'Husky' example image with its explanation]

SLIDE 6

Promise of Explanations

  • Model debugging: reveal spurious correlations or the kinds of inputs on which a model is most likely to perform poorly.

[Figure: 'Husky' example image, its explanation, and the resulting fix]

SLIDE 7

Agenda

  • Overview of attribution methods
  • This talk will mostly focus on post-hoc explanation methods for deep neural networks.
  • The selection conundrum
  • Sanity checks & results
  • Theoretical justification by Nie et al. (2018)
  • Passing sanity checks & recent results
  • Conclusion
SLIDE 8

Saliency/Attribution Maps

[Figure: 'Corn' input image, the model's predictions, and an explanation heatmap]

SLIDE 9

Saliency/Attribution Maps

[Figure: 'Corn' input image, the model's predictions, and an explanation heatmap]

Attribution maps provide ‘relevance’ scores for each dimension of the input.

SLIDE 10

Saliency/Attribution Maps

[Figure: 'Corn' input image, the model's predictions, and an explanation heatmap]

Attribution maps provide ‘relevance’ scores for each dimension of the input.

S : R^d → R^C  (the model: a d-dimensional input to C class scores)

E : R^d → R^d  (the explanation: an input to one relevance score per input dimension)
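
As a toy illustration of these two maps, here is a minimal NumPy sketch in which a linear model stands in for the network; the dimensions, the linear model, and the gradient-based explainer are illustrative assumptions, not the deck's actual setup.

```python
import numpy as np

d, C = 12, 3                         # toy sizes: 12 input dimensions, 3 classes
rng = np.random.default_rng(0)
W = rng.normal(size=(C, d))          # a linear "network" standing in for S

def S(x):                            # S : R^d -> R^C, one score per class
    return W @ x

def E(x, i):                         # E : R^d -> R^d, one relevance score per input dimension
    return W[i]                      # for a linear model, the gradient of S_i(x) is row i of W

x = rng.normal(size=d)
i = int(np.argmax(S(x)))             # explain the top predicted class
print(S(x).shape, E(x, i).shape)     # (3,) (12,)
```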

SLIDE 11

How to compute attribution?

[Figure: 'Corn' input image, the model's predictions, and the resulting attribution]

E_grad(x) = ∂S_i(x) / ∂x

[SVZ’13]
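
Below is a minimal PyTorch sketch of this vanilla-gradient map. The tiny untrained CNN is only a stand-in (the experiments in the talk use networks such as Inception v3), and the input size and class index are arbitrary.

```python
import torch
import torch.nn as nn

# Placeholder model: a tiny untrained CNN with 10 output classes.
model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10),
)
model.eval()

def gradient_saliency(model, x, target_class):
    """E_grad(x) = dS_i/dx: gradient of the target-class score w.r.t. the input."""
    x = x.clone().requires_grad_(True)
    score = model(x)[0, target_class]
    score.backward()
    return x.grad.detach()

x = torch.rand(1, 3, 32, 32)
saliency = gradient_saliency(model, x, target_class=3)
print(saliency.shape)  # torch.Size([1, 3, 32, 32]): one value per input dimension
```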

SLIDE 12

Some Issues with the Gradient

'Visually noisy', and can violate sensitivity w.r.t. a baseline input [Sundararajan et al.; Shrikumar et al.; Smilkov et al.]

SLIDE 13

Integrated Gradients

Sum of ‘interior’ gradients.

[STY’17]
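
A minimal sketch of the idea, assuming a PyTorch classifier such as the toy `model` in the earlier gradient sketch; the all-zeros baseline and 32 interpolation steps are illustrative choices.

```python
import torch

def integrated_gradients(model, x, target_class, baseline=None, steps=32):
    """Average gradients along the straight path from baseline to x, scaled by (x - baseline)."""
    baseline = torch.zeros_like(x) if baseline is None else baseline
    total_grad = torch.zeros_like(x)
    for alpha in torch.linspace(0.0, 1.0, steps):
        point = (baseline + alpha * (x - baseline)).detach().requires_grad_(True)
        model(point)[0, target_class].backward()
        total_grad += point.grad
    return (x - baseline) * total_grad / steps
```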

SLIDE 14

SmoothGrad

Average attribution of ‘noisy’ inputs.

[STKVW’17]
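
A minimal sketch of the averaging step, again assuming a PyTorch classifier such as the toy `model` above; the noise level and sample count are illustrative, not prescribed values.

```python
import torch

def smoothgrad(model, x, target_class, n_samples=25, noise_level=0.15):
    """Average vanilla-gradient maps over Gaussian-perturbed copies of the input."""
    sigma = noise_level * (x.max() - x.min())
    total = torch.zeros_like(x)
    for _ in range(n_samples):
        noisy = (x + sigma * torch.randn_like(x)).requires_grad_(True)
        model(noisy)[0, target_class].backward()
        total += noisy.grad
    return total / n_samples
```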

SLIDE 15

Gradient-Input

Element-wise product of gradient and input.
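
In code this is a one-liner on top of the `gradient_saliency` sketch shown earlier (assumed to be in scope).

```python
def gradient_times_input(model, x, target_class):
    # Element-wise product of the input with its vanilla gradient map.
    return x * gradient_saliency(model, x, target_class)
```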

SLIDE 16

Guided BackProp

Zero out ‘negative’ gradients and ‘activations’ while back-propagating.
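
One common way to implement this rule in PyTorch is with backward hooks on the ReLU modules; a minimal sketch, assuming the network is built from `nn.ReLU` modules like the toy `model` above (the ordinary ReLU backward already zeroes positions with negative activations, and the hook additionally zeroes negative gradients).

```python
import torch
import torch.nn as nn

def guided_backprop(model, x, target_class):
    """Gradient of the class score with negative gradients zeroed at every ReLU."""
    def clamp_negative_grads(module, grad_input, grad_output):
        return (grad_input[0].clamp(min=0),)   # zero out negative gradients flowing back

    hooks = [m.register_full_backward_hook(clamp_negative_grads)
             for m in model.modules() if isinstance(m, nn.ReLU)]
    x = x.clone().requires_grad_(True)
    model(x)[0, target_class].backward()
    for h in hooks:
        h.remove()
    return x.grad.detach()
```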

SLIDE 17

Other Learned Kinds

[Figure: 'Corn' input image, predictions, and a learned-removal explanation]

[FV’17]

Formulate the explanation through learned patch removal.
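
A rough sketch in the spirit of [FV'17]: learn a mask whose "removed" (here, blurred) regions drive down the class score, while an L1 penalty keeps the mask small. The blur, step count, penalty weight, and learning rate are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def learned_mask(model, x, target_class, steps=100, lam=0.05, lr=0.1):
    blurred = F.avg_pool2d(x, kernel_size=11, stride=1, padding=5)  # crude stand-in for "removal"
    mask_logits = torch.zeros_like(x[:, :1]).requires_grad_(True)   # one mask value per pixel
    optimizer = torch.optim.Adam([mask_logits], lr=lr)
    for _ in range(steps):
        m = torch.sigmoid(mask_logits)
        perturbed = (1 - m) * x + m * blurred                       # m = 1 means "patch removed"
        loss = model(perturbed)[0, target_class] + lam * m.abs().mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return torch.sigmoid(mask_logits).detach()   # large values mark regions whose removal hurts the class
```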

SLIDE 18

Non-Image Settings: Molecules

SLIDE 19

The Selection Conundrum

SLIDE 20

The Selection Conundrum

For a particular task and model, how should a developer/researcher select which method to use?

SLIDE 21

Desirable Properties

  • Sensitivity to the parameters of the model being explained.
  • Dependence on the labeling of the data, i.e., the map should reflect the relationship between inputs and outputs.

SLIDE 22

Sanity Checks

  • We will use randomization as a way to test both requirements.
  • Model parameter randomization test: randomize (re-initialize) the parameters of a model and compare attribution maps for the trained model to those derived from the randomized model.
  • Data randomization test: compare attribution maps for a model trained with correct labels to those derived from a model trained with random labels (see the sketch below).
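
A minimal sketch of the data randomization test; `make_model`, `train`, and `explain` are placeholders for whatever architecture, training loop, and attribution method are being sanity-checked.

```python
import numpy as np

def data_randomization_test(make_model, train, explain, X_train, y_train, x_test, target_class):
    # Model trained on the true input-label relationship.
    model_true = train(make_model(), X_train, y_train)
    # Model trained on permuted labels: anything it learns cannot reflect the true labeling.
    y_random = np.random.permutation(y_train)
    model_random = train(make_model(), X_train, y_random)
    # A label-sensitive attribution method should produce clearly different maps for the two models.
    return explain(model_true, x_test, target_class), explain(model_random, x_test, target_class)
```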

SLIDE 23

Model Parameter Randomization

Inception V3

  • Cascading randomization from top to bottom layers (see the sketch below).
  • Independent layer randomization.
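
A minimal PyTorch sketch of cascading randomization; `explain` is a placeholder for the attribution method under test, and re-initializing each layer with small Gaussian noise is an illustrative choice rather than the paper's exact scheme.

```python
import copy
import torch.nn as nn

def cascading_randomization(model, x, target_class, explain):
    """Re-initialize layers from the top (logits) down, recording a map after each step."""
    model = copy.deepcopy(model)                 # leave the trained model untouched
    maps = []
    layers = [m for m in model.modules() if isinstance(m, (nn.Conv2d, nn.Linear))]
    for layer in reversed(layers):               # top (closest to the logits) to bottom
        for p in layer.parameters():
            nn.init.normal_(p, std=0.01)         # randomize this layer's weights and biases
        maps.append(explain(model, x, target_class))
    return maps                                  # compare each map to the original explanation
```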
SLIDE 24

Conjecture: If a model captures higher level class concepts, then saliency maps should change as the model is being randomized.

Model Parameter Randomization

[Figure: original image and original explanation, with maps from Gradient, Gradient-SG, Gradient-Input, Guided Back-propagation, GradCAM, Guided GradCAM, Integrated Gradients, and Integrated Gradients-SG under cascading randomization from top to bottom layers of Inception v3]

SLIDE 25

[Figure (build): same as above, with the logits layer randomized]

SLIDE 26

[Figure (build): randomization cascaded from the logits down through mixed_7c to mixed_6b]

SLIDE 27

[Figure (build): randomization cascaded through all layers, from the logits down to conv2d_1a_3x3]

SLIDE 28

[Figure (build): all layers randomized, from the logits down to conv2d_1a_3x3]

SLIDE 29

Metrics

Inception v3 - ImageNet

[Figure: Spearman rank correlation (ABS and No ABS) between the original attribution map and maps obtained under cascading randomization, plotted per layer from the logits down through the Mixed and Conv2d blocks; see caption note]

  • Rank correlation of attributions from the model with trained weights to those derived from partially randomized models (see the metric sketch below).
  • Attribution signs change; roughly similar regions are, however, still attributed.
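
A minimal sketch of the similarity metric referenced above: Spearman rank correlation between the attribution map of the trained model and that of a (partially) randomized model, computed with and without taking absolute values first.

```python
import numpy as np
from scipy.stats import spearmanr

def rank_correlation(map_trained, map_randomized, use_abs=True):
    a = np.asarray(map_trained).ravel()
    b = np.asarray(map_randomized).ravel()
    if use_abs:                      # "ABS": compare magnitudes only
        a, b = np.abs(a), np.abs(b)
    rho, _pvalue = spearmanr(a, b)   # "No ABS" keeps the sign of the attributions
    return rho
```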

SLIDE 30

Model Parameter Randomization

[Figure: successive (cascading) and independent randomization of the layers of a CNN on MNIST (conv_hidden1, conv_hidden2, fc2, output-fc), shown for Gradient, Gradient-SG, Gradient-VG, Guided Backpropagation, Guided GradCAM, Integrated Gradients, and Integrated Gradients-SG, alongside the original image and original explanation]

SLIDE 31

Medical Setting

[Figure: Guided Backpropagation on a skeletal radiograph (age prediction)]

SLIDE 32

Data Randomization

CNN - MNIST

[Figure: attribution maps (Gradient, Gradient-SG, Guided BackProp, GradCAM, Guided GradCAM, Integrated Gradients, Integrated Gradients-SG, Gradient-Input) from a model trained on true labels vs. one trained on random labels, shown with absolute-value and diverging visualizations, plus rank correlations with and without absolute values]

SLIDE 33

Data Randomization

MLP - MNIST

[Figure: attribution maps (Gradient, Gradient-SG, Guided BackProp, Integrated Gradients, Integrated Gradients-SG, Gradient-Input) from a model trained on true labels vs. one trained on random labels, with absolute-value and diverging visualizations and rank correlations with and without absolute values]

SLIDE 34

Some Insights

  • Nie et al. (ICML 2018) theoretically showed that Guided Backpropagation performs input reconstruction.
  • Observed in Mahendran et al., 2014 (ECCV) as well.

Figure from Nie et al., 2018.

SLIDE 35

Summary

  • We focused mostly on gradient-based methods.
  • Sanity checks don't tell you whether a method is good, only whether it is (undesirably) invariant to the model or the data.
  • Visual inspection alone can be deceiving.
SLIDE 36

What about other methods?

[Figure: cascading randomization from top to bottom layers of VGG-16 for LIME-5, LIME-10, LIME-20, LIME-50 (LIME variants), SHAP, Gradient, SmoothGrad, Guided BackProp, PatternNet, Pattern Attribution, Input-Gradient, Integrated Gradients, LRP-Z, LRP-EPS, LRP-SPAF, LRP-SPBF, VGrad, and DeepTaylor; several of these were not previously considered in the literature]

SLIDE 37

A Fix for Sanity Checks

  • Gupta et al. fix this with competition for gradients (CGI).

[Figure from Gupta et al., 2019.]

SLIDE 38

Other Assessment Methods

  • Hooker et al. (to appear at NeurIPS 2019) propose to remove and retrain (ROAR).
  • Adel et al. propose FSM to 'quantify' information content.
  • Yang et al. introduce a benchmark (with ground truth) and other metrics to assess how well a map captures model behavior.

SLIDE 39

Attacks

  • 'Adversarial' attacks on explanations by Ghorbani et al.
  • Mean-shift attack by Kindermans, Hooker, et al.
SLIDE 40

Conundrum Persists

  • For methods that pass the sanity checks, how do we choose among them?
  • Can end-users (developers) use these methods to debug?
  • What about other explanation classes (concepts and global methods)?