
Sanity Checks for Saliency Maps
Julius Adebayo*+, Justin Gilmer#, Michael Muelly#, Ian Goodfellow#, Moritz Hardt^#, Been Kim#
*Work was done during the Google AI residency program. +MIT, ^UC Berkeley, #Google Brain.


  1. Title slide: Sanity Checks for Saliency Maps (authors and affiliations as above).

  2. Interpretability: to use machine learning more responsibly.

  3. Investigating post-training interpretability methods. Given a fixed model, find the evidence for the prediction.

  4. Investigating post-training interpretability methods. Given a fixed, trained machine learning model (e.g., a neural network) that outputs "Junco bird-ness," find the evidence for the prediction: why was this a Junco bird?

  5. One of the most popular techniques: saliency maps. For a trained model (e.g., a neural network) predicting "Junco bird-ness," the promise is: these pixels are the evidence for the prediction ("Caaaaan do!").
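To make the promise concrete, here is a minimal numpy sketch of what a gradient ("vanilla backprop") saliency map computes. The tiny one-hidden-layer network is a hypothetical stand-in with random weights, not a trained model; the saliency of each input dimension is the absolute gradient of the class score with respect to that input.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tiny one-hidden-layer ReLU network; the weights are random
# placeholders standing in for a trained model.
W1 = rng.normal(size=(8, 4))   # input dim 8 -> hidden dim 4
W2 = rng.normal(size=(4,))     # hidden -> scalar class score

def saliency(x, W1, W2):
    """Vanilla-gradient saliency: |d(score)/d(input)|."""
    h_pre = x @ W1                 # hidden pre-activations
    # score = relu(h_pre) @ W2; backpropagate d(score)/dx by hand
    dh = W2 * (h_pre > 0)          # gradient through the ReLU
    return np.abs(W1 @ dh)         # chain rule back to the input

x = rng.normal(size=8)
print(saliency(x, W1, W2))         # one "saliency" value per input dimension
```

For an image classifier, x would be the flattened pixels and the result is reshaped back into an image: the saliency map shown on the slide.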

  6. Sanity check question. A trained model (e.g., a neural network) predicts "Junco bird-ness." The promise: these pixels are the evidence for the prediction.

  7. Sanity check question. The promise: these pixels are the evidence for the prediction. If so, when the prediction changes, the explanation should change. Extreme case: if the prediction is random, the explanation should REALLY change.

  8. Sanity check: when the prediction changes, do explanations change? (Saliency map shown.)

  9. Sanity check: when the prediction changes, do explanations change? Randomized weights! The network now makes garbage predictions.

  10. Sanity check: when the prediction changes, do explanations change? Randomized weights! The network now makes garbage predictions, yet the saliency map looks the same. !!!!!???!?

  11. Sanity check: when the prediction changes, do explanations change? Randomized weights! The network now makes garbage predictions. Is this still the evidence for the prediction?????

  12. Sanity check 1: when the prediction changes, do explanations change? No! (Before/after weight randomization, shown for Gradient (backprop), Guided Backprop, and Integrated Gradients.)
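The weight-randomization check can be sketched in a few lines. This is a toy stand-in, not the paper's CNN setup: the "model" is a linear scorer, whose gradient saliency is simply |w|, and the explanations before and after randomization are compared with a hand-rolled Spearman rank correlation (one of the similarity metrics the paper uses).

```python
import numpy as np

rng = np.random.default_rng(1)

def saliency(w):
    # For a linear score s = w . x, the gradient saliency is simply |w|.
    return np.abs(w)

w_trained = np.linspace(1.0, 2.0, 16)  # stand-in for trained weights
w_random = rng.normal(size=16)         # weights after randomization

s_before = saliency(w_trained)
s_after = saliency(w_random)

def rank_corr(a, b):
    """Spearman rank correlation, computed by hand (no ties expected)."""
    ra = a.argsort().argsort().astype(float)
    rb = b.argsort().argsort().astype(float)
    ra -= ra.mean()
    rb -= rb.mean()
    return float((ra @ rb) / np.sqrt((ra @ ra) * (rb @ rb)))

# A faithful saliency method should yield a low-magnitude correlation
# between explanations before and after weight randomization.
print(rank_corr(s_before, s_after))
```

The paper's alarming finding is the opposite behavior for several popular methods: the before/after similarity stays high even though the predictions have become garbage.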

  13. Sanity check 2: for networks trained with true vs. random labels, do explanations deliver different messages? No!
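Sanity check 2 (the data randomization test) can likewise be sketched with a toy model: train the same classifier once on true labels and once on shuffled labels, then compare the gradient saliencies, which for a linear model are |w|. Everything here (the synthetic data, the logistic-regression trainer, the single informative feature) is a hypothetical stand-in for the paper's CNN experiments.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy data: the label depends only on feature 0.
X = rng.normal(size=(200, 5))
y_true = (X[:, 0] > 0).astype(float)
y_random = rng.permutation(y_true)   # shuffled labels: no real signal

def train_logreg(X, y, steps=500, lr=0.5):
    """Plain gradient-descent logistic regression; returns the weights."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        w -= lr * (X.T @ (p - y)) / len(y)
    return w

w_true = train_logreg(X, y_true)
w_rand = train_logreg(X, y_random)

# Gradient saliency for a linear model is |w|: trained on true labels it
# should concentrate on feature 0; trained on random labels it should not.
print(np.abs(w_true), np.abs(w_rand))
```

A faithful explanation method passes this check, delivering clearly different messages for the two models; the paper shows several popular saliency methods do not.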

  14. Conclusion
  • Confirmation bias: just because an explanation "makes sense" to humans doesn't mean it reflects the evidence for the prediction.
  • Do sanity checks on your interpretability methods! (e.g., TCAV [Kim et al. '18])
  • Others who independently reached the same conclusions: [Nie, Zhang, Patel '18], [Ulyanov, Vedaldi, Lempitsky '18].
  • Some of these methods have been shown to be useful to humans. Why? More studies needed.
  Poster #30, 10:45am–12:45pm, Room 210.
