Interpretability in NLP: Moving Beyond Vision — Shuoyang Ding


  1. Interpretability in NLP: Moving Beyond Vision
  Shuoyang Ding
  Microsoft Translator Talk Series, Oct 10th, 2019
  Work done in collaboration with Philipp Koehn and Hainan Xu

  2. Outline
  • A Quick Tour of Interpretability
    • Model Transparency
    • Post-hoc Interpretations
  • Moving Visual Interpretability to Language:
    • Word Alignment for NMT Via Model Interpretation
    • Benchmarking Interpretations Via Lexical Agreement
  • Future Work


  4. What is Interpretability?
  • No consensus!
  • Categorization proposed in [Lipton 2018]:
    • Model Transparency
    • Post-hoc Interpretation

  5. Toy Example [figure: a speaker and several devices (game console, TV box, CD player, laptop)]

  6. Toy Example [figure: the same devices, with unknown connections (? ? ?) between the speaker and the devices]

  7. A Transparent Model [figure: an amplifier with numbered inputs 1–4 connecting the speaker to the game console, TV box, CD player, and laptop]

  8. Transparent Models
  • Build another model that accomplishes the same task, but with easily explainable behaviors
  • Deep neural networks are not interpretable…
  • So what models are? (Open question)
    • log-linear model?
    • attention model?

  9. Meh. Too lazy for that! [figure: the unknown connections (? ? ?) again]

  10. Post-hoc Interpretation
  • Ask a human!
    • Interpretation with a stand-alone model (a different task!)
  • Jiggle the cable!
    • Interpretation via sensitivity w.r.t. features


  12–15. A Little Abstraction… [figure-only slides]

  16. Relative Sensitivity…? [figure-only slide]

  17. Relative Sensitivity…? $\frac{\Delta y}{\Delta x_i}$; when $\Delta x_i \to 0$: $\frac{\partial y}{\partial x_i}$

  18. Saliency [equation slide: saliency of a feature as the gradient of the output w.r.t. that feature, $\frac{\partial y}{\partial x_i}$]

  19. What’s good about this?
  1. Model-agnostic, and yet with some exposure to the interpreted model
  2. Derivatives are easy to obtain for any DL toolkit
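For instance, a minimal PyTorch sketch of gradient saliency (not from the slides; the model and input are hypothetical stand-ins):

```python
import torch

# Hypothetical differentiable model and input.
model = torch.nn.Sequential(torch.nn.Linear(4, 8), torch.nn.Tanh(), torch.nn.Linear(8, 1))
x = torch.randn(1, 4, requires_grad=True)  # track gradients w.r.t. the input features

y = model(x).sum()       # scalar output to differentiate
y.backward()             # autograd computes dy/dx
saliency = x.grad.abs()  # per-feature sensitivity of the output
```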

  20. Saliency in Computer Vision [image: image saliency example from https://pair-code.github.io/saliency/]

  21. SmoothGrad
  • Gradients are a very local measure of sensitivity.
  • Highly non-linear models may have pathological points where the gradients are noisy. [Smilkov et al. 2017]

  22–23. SmoothGrad [figure-only slides]

  24. SmoothGrad
  • Solution: compute saliency for multiple copies of the same input corrupted with Gaussian noise, and average the saliency of the copies.
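A sketch of that averaging step, reusing the hypothetical model and x from the earlier snippet (the defaults mirror the N and σ reported later in the Setup slide):

```python
def smoothgrad(model, x, n_samples=30, sigma=0.15):
    """Average vanilla gradient saliency over noisy copies of the input."""
    grads = torch.zeros_like(x)
    for _ in range(n_samples):
        noisy = (x + sigma * torch.randn_like(x)).detach().requires_grad_(True)
        model(noisy).sum().backward()
        grads += noisy.grad.abs()
    return grads / n_samples
```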

  25. SmoothGrad [figure-only slide]

  26. SmoothGrad in Computer Vision [image panels: Original Image, Vanilla, SmoothGrad, Integrated Gradients; from https://pair-code.github.io/saliency/]

  27. Integrated Gradients (IG)
  • Proposed to solve feature saturation
  • Baseline: an input that carries no information
  • Compute gradients on interpolations between baseline & input, and average by integration [Sundararajan et al. 2017]
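A hedged sketch of that recipe, approximating the integral with a Riemann sum; the zero baseline is an assumption for illustration, not something the slides specify:

```python
def integrated_gradients(model, x, baseline=None, steps=50):
    """Average gradients along the straight path from baseline to input,
    then scale by (input - baseline)."""
    if baseline is None:
        baseline = torch.zeros_like(x)  # assumed "no information" input
    total = torch.zeros_like(x)
    for k in range(1, steps + 1):
        point = (baseline + (k / steps) * (x - baseline)).detach().requires_grad_(True)
        model(point).sum().backward()
        total += point.grad
    return (x - baseline) * total / steps
```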

  28. IG in Computer Vision [image panels: Original Image, Vanilla, SmoothGrad, Integrated Gradients; from https://pair-code.github.io/saliency/]

  29. Summary [side-by-side figures of the toy example]
  Model Transparency:
  • Build a model that operates in an explainable way
  • Interpretation does not depend on a specific output
  Post-hoc Interpretation:
  • Keep the original model intact
  • Interpretation depends on a specific output

  30. Summary
  • How is this related to what I’m talking about next?
    • Word Alignment for NMT Via Model Interpretation: transparent models vs. post-hoc interpretations
    • Benchmarking Interpretations Via Lexical Agreement: different post-hoc interpretation methods

  31. Outline
  • A Quick Tour of Interpretability
    • Model Transparency
    • Post-hoc Interpretations
  • Moving Visual Interpretability to Language:
    • Word Alignment for NMT Via Model Interpretation
    • Benchmarking Interpretations Via Lexical Agreement
  • Future Work

  32–33. Word Alignment [figure-only slides]

  34. Model Transparency? [figure-only slide]

  35. Model Transparency? Wait… word alignments should be aware of the output!

  36. Post-hoc Interpretations with Stand-alone Models? $p(a_{ij} \mid e, f)$ Hint: GIZA++, fast-align, etc.

  37–38. Post-hoc Interpretations with Perturbation/Sensitivity? [figure-only slides]

  39. “Feature” in Computer Vision [image; photo credit: Hainan Xu]

  40. “Feature” in NLP It’s straightforward to compute saliency for a single dimension of the word embedding.

  41. “Feature” in NLP But how to compose the saliency of each dimension into the saliency of a word?

  42. Li et al. 2016, Visualizing and Understanding Neural Models in NLP: average the absolute gradient over the $N$ embedding dimensions, $\frac{1}{N} \sum_{i=1}^{N} \left| \frac{\partial y}{\partial e_i} \right|$, range: $(0, \infty)$

  43. Our Proposal: Consider word embedding look-up as a dot product between the embedding matrix and a one-hot vector.

  44. Our Proposal: The 1 in the one-hot vector denotes the identity of the input word.

  45. Our Proposal: Let’s perturb that 1 like a real value, i.e., take the gradient with respect to the 1.

  46. Our Proposal: $\sum_i e_i \cdot \frac{\partial y}{\partial e_i}$, range: $(-\infty, \infty)$. Recall this is different from Li’s proposal: $\frac{1}{N} \sum_{i=1}^{N} \left| \frac{\partial y}{\partial e_i} \right|$
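A small sketch of the two compositions for a single word, where emb is the word’s embedding vector and emb_grad holds $\partial y / \partial e$ for it (both names are illustrative):

```python
def word_saliency_li(emb_grad):
    """Li et al. 2016: mean absolute gradient over embedding dims; always >= 0."""
    return emb_grad.abs().mean()

def word_saliency_ours(emb, emb_grad):
    """Gradient w.r.t. the one-hot 1, i.e. the dot product e . dy/de; can be negative."""
    return (emb * emb_grad).sum()
```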

  47. Why is this proposal better?
  • An input word may strongly discourage a certain translation and still carry a large (negative) gradient.
  • Those are salient words, but they shouldn’t be aligned.
  • Absolute value/L2-norm falls into this pit.

  48. Evaluation
  • Evaluation of interpretations is tricky!
  • Fortunately, there are human judgments to rely on.
  • We need to force-decode with the NMT model.
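Force decoding here means teacher-forcing the reference translation through the model, so each target word has a concrete score to differentiate. A hedged sketch (the model interface is hypothetical):

```python
def forced_decoding_saliency(model, src_emb, tgt_tokens):
    """Differentiate each reference target word's log-probability (under
    teacher forcing) w.r.t. the source embeddings, then compose per word."""
    src_emb = src_emb.detach().requires_grad_(True)
    log_probs = model(src_emb, tgt_tokens)  # (tgt_len, vocab), teacher-forced
    saliency = []
    for t, tok in enumerate(tgt_tokens):
        if src_emb.grad is not None:
            src_emb.grad.zero_()
        log_probs[t, tok].backward(retain_graph=True)
        saliency.append((src_emb * src_emb.grad).sum(dim=-1))  # e . dy/de per source word
    return saliency
```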

  49. Setup
  • Architecture: Convolutional S2S, LSTM, Transformer (with fairseq default hyperparameters)
  • Dataset: following Zenkel et al. [2019], which covers de-en, fr-en and ro-en
  • SmoothGrad hyperparameters: N = 30 and σ = 0.15
