

SLIDE 1

Interpretability in NLP: Moving Beyond Vision

Shuoyang Ding

Microsoft Translator Talk Series, Oct 10th, 2019
Work done in collaboration with Philipp Koehn and Hainan Xu

SLIDE 2

Outline

  • A Quick Tour of Interpretability
    • Model Transparency
    • Post-hoc Interpretations
  • Moving Visual Interpretability to Language:
    • Word Alignment for NMT Via Model Interpretation
    • Benchmarking Interpretations Via Lexical Agreement
  • Future Work



SLIDES 4–5

What is Interpretability?

  • No consensus!
  • Categorization proposed in [Lipton 2018]:
    • Model Transparency
    • Post-hoc Interpretation

SLIDE 6

Toy Example

[Figure: Speaker, TV Box, CD, Laptop, Game Console]

SLIDE 7

Toy Example

[Figure: Speaker, TV Box, CD, Laptop, Game Console, with unknown connections (? ? ?)]

SLIDE 8

A Transparent Model

[Figure: an Amplifier with numbered ports 1 2 3 4 connecting Speaker, TV Box, CD, Laptop, Game Console]

SLIDE 9

Transparent Models

  • Build another model that accomplishes the same task, but with easily explainable behaviors
  • Deep neural networks are not interpretable…
  • So what models are? (Open question)
    • log-linear models?
    • attention models?

SLIDE 10

  • Meh. Too lazy for that!

[Figure: Speaker, TV Box, CD, Laptop, Game Console, with unknown connections (? ? ?)]

SLIDE 11

Post-hoc Interpretation

  • Ask a human
    • Interpretation with a stand-alone model (different task!)
  • Jiggle the cable!
    • Interpretation via sensitivity w.r.t. features


SLIDES 13–16

A Little Abstraction…

[Figure-only slides]

SLIDES 17–18

Relative Sensitivity…?

[Figure-only slides]

SLIDE 19

Saliency

[Equation slide: saliency is defined as the derivative of the model output w.r.t. an input feature]

SLIDE 20

What’s good about this?

  • Model-agnostic, and yet with some exposure to the interpreted model
  • Derivatives are easy to obtain in any DL toolkit
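Since derivatives are exposed by any DL toolkit, vanilla gradient saliency fits in a few lines. A minimal PyTorch sketch, assuming a hypothetical `model` that maps an input tensor to a vector of scores (an illustration, not the talk's code):

```python
import torch

def vanilla_saliency(model, x, target_idx):
    """Saliency = gradient of one output score w.r.t. the input features."""
    x = x.clone().detach().requires_grad_(True)  # leaf tensor that tracks grads
    score = model(x)[target_idx]                 # scalar score being explained
    score.backward()                             # fills x.grad with d(score)/dx
    return x.grad.detach()
```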

SLIDE 21

Saliency in Computer Vision

[Figure: image saliency examples from https://pair-code.github.io/saliency/]

SLIDE 22

SmoothGrad

  • Gradients are a very local measure of sensitivity.
  • Highly non-linear models may have pathological points where the gradients are noisy.

[Smilkov et al. 2017]

SLIDES 23–24

SmoothGrad

[Figure-only slides]

SLIDE 25

SmoothGrad

  • Solution: calculate saliency for multiple copies of the same input corrupted with Gaussian noise, and average the saliency over the copies.
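A sketch of this averaging step, reusing the hypothetical `vanilla_saliency` from above. The defaults mirror the hyper-parameters reported later in the talk (N=30, σ=0.15); how σ should be scaled to the input range is left out here:

```python
import torch

def smoothgrad(model, x, target_idx, n_samples=30, sigma=0.15):
    """Average vanilla saliency over noisy copies of the input."""
    grads = [
        vanilla_saliency(model, x + sigma * torch.randn_like(x), target_idx)
        for _ in range(n_samples)
    ]
    return torch.stack(grads).mean(dim=0)
```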

SLIDE 26

SmoothGrad

[Figure-only slide]

SLIDE 27

SmoothGrad in Computer Vision

[Figure: Original Image vs. Vanilla, SmoothGrad, and Integrated Gradients saliency maps, from https://pair-code.github.io/saliency/]

SLIDE 28

Integrated Gradients (IG)

[Sundararajan et al. 2017]

  • Proposed to solve feature saturation
  • Baseline: an input that carries no information
  • Compute gradients on inputs interpolated between the baseline and the actual input, and average them by integration
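A sketch of the usual Riemann-sum approximation of that integral, again reusing the hypothetical `vanilla_saliency`; the 50-step resolution is an arbitrary choice for illustration:

```python
import torch

def integrated_gradients(model, x, baseline, target_idx, steps=50):
    """(x - baseline) times the average gradient along the straight-line path."""
    avg_grad = torch.zeros_like(x)
    for k in range(1, steps + 1):
        point = baseline + (k / steps) * (x - baseline)  # interpolated input
        avg_grad += vanilla_saliency(model, point, target_idx)
    return (x - baseline) * avg_grad / steps
```

For images the baseline is often an all-black image; for text, a zero embedding is a common stand-in.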

SLIDE 29

IG in Computer Vision

[Figure: Original Image vs. Vanilla, SmoothGrad, and Integrated Gradients saliency maps, from https://pair-code.github.io/saliency/]

SLIDE 30

Summary

[Figure: the toy example (Speaker, TV Box, CD, Computer, Game Console), shown once per approach]

Model Transparency:
  • Build a model that operates in an explainable way
  • Interpretation does not depend on the output

Post-hoc Interpretation:
  • Keep the original model intact
  • Interpretation depends on a specific output

SLIDE 31

Summary

  • How is this related to what I’m talking about next?
    • Word Alignment for NMT Via Model Interpretation: transparent models vs. post-hoc interpretations
    • Benchmarking Interpretations Via Lexical Agreement: different post-hoc interpretation methods

SLIDE 32

Outline

  • A Quick Tour of Interpretability
    • Model Transparency
    • Post-hoc Interpretations
  • Moving Visual Interpretability to Language:
    • Word Alignment for NMT Via Model Interpretation
    • Benchmarking Interpretations Via Lexical Agreement
  • Future Work

SLIDES 33–34

Word Alignment

[Figure-only slides: a word alignment example]

SLIDES 35–36

Model Transparency?

[Figure-only slides]

Wait… word alignments should be aware of the output!

SLIDE 37

Post-hoc Interpretations with Stand-alone Models?

p(a_ij | e, f)

Hint: GIZA++, fast-align, etc.

SLIDES 38–39

Post-hoc Interpretations with Perturbation/Sensitivity?

[Figure-only slides]

SLIDE 40

“Feature” in Computer Vision

[Figure: a photo. Photo credit: Hainan Xu]

SLIDE 41

“Feature” in NLP

It’s straightforward to compute saliency for a single dimension of a word embedding.

SLIDE 42

“Feature” in NLP

But how do we compose the saliency of each dimension into the saliency of a word?

SLIDE 43

Li et al. 2016

Visualizing and Understanding Neural Models in NLP

ψ_Li(w) = (1/N) Σ_{i=1}^{N} |∂y/∂e_i|        range: (0, ∞)

(e_i: the i-th dimension of the word embedding; N: embedding size)

SLIDE 44

Our Proposal

Consider word-embedding look-up as a dot product between the embedding matrix and a one-hot vector.

SLIDE 45

Our Proposal

The 1 in the one-hot vector denotes the identity of the input word.

SLIDE 46

Our Proposal

Let’s perturb that 1 as if it were a real value, i.e., take gradients with respect to the 1.

SLIDE 47

Our Proposal

ψ_ours(w) = Σ_i e_i · (∂y/∂e_i)        range: (−∞, ∞)

Recall this is different from Li’s proposal:

ψ_Li(w) = (1/N) Σ_{i=1}^{N} |∂y/∂e_i|        range: (0, ∞)
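Side by side, the two compositions differ only in how the per-dimension gradients are pooled. A sketch, assuming hypothetical `emb` and `emb_grad` tensors of shape (sentence_length, dim) holding the word embeddings and their gradients w.r.t. the output score y:

```python
import torch

def word_saliency_li(emb_grad):
    # Li et al. 2016: mean absolute gradient over embedding dims; range (0, inf)
    return emb_grad.abs().mean(dim=-1)

def word_saliency_ours(emb, emb_grad):
    # Ours: e_i . dy/de_i per word, i.e. the gradient w.r.t. the one-hot "1";
    # signed, so range (-inf, inf)
    return (emb * emb_grad).sum(dim=-1)
```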

SLIDE 48

Why is this proposal better?

  • An input word may strongly discourage a certain translation and still carry a large (negative) gradient.
  • Those are salient words, but they shouldn’t be aligned.
  • Absolute value/L2-norm falls into this pit.

SLIDE 49

Evaluation

  • Evaluation of interpretations is tricky!
  • Fortunately, there are human judgments to rely on.
  • We need to do forced decoding with the NMT model.

SLIDE 50

Setup

  • Architectures: Convolutional S2S, LSTM, Transformer (with fairseq default hyper-parameters)
  • Dataset: following Zenkel et al. [2019], which covers de-en, fr-en, and ro-en
  • SmoothGrad hyper-parameters: N=30 and σ=0.15

SLIDE 51

Baselines

  • Attention weights
  • Smoothed Attention: forward pass on multiple corrupted input samples, then average the attention weights over samples (a sketch follows below)
  • [Li et al. 2016]: compute the element-wise absolute value of embedding gradients, then average over embedding dimensions
  • [Li et al. 2016] + SmoothGrad
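The Smoothed Attention baseline can be sketched the same way as SmoothGrad, except that noisy forward passes are averaged in attention space rather than gradient space. Here `attention_fn` is a hypothetical callable returning the attention weights of one forward pass on (noised) input embeddings:

```python
import torch

def smoothed_attention(attention_fn, x, n_samples=30, sigma=0.15):
    """Average attention weights over noisy copies of the input embeddings."""
    weights = [
        attention_fn(x + sigma * torch.randn_like(x)) for _ in range(n_samples)
    ]
    return torch.stack(weights).mean(dim=0)
```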

SLIDE 52

Convolutional S2S on de-en

[Chart: AER (lower is better) for Attention, Smoothed Attention, Li+Grad, Li+SmoothGrad, Ours+Grad, and Ours+SmoothGrad, compared against fast-align, Zenkel et al. [2019], and GIZA++]

SLIDE 53

Attention on de-en

[Chart: AER (lower is better) of attention for Conv, LSTM, and Transformer, compared against fast-align, Zenkel et al. [2019], and GIZA++]

SLIDE 54

Ours+SmoothGrad on de-en

[Chart: AER (lower is better) of Ours+SmoothGrad for Conv, LSTM, and Transformer, compared against fast-align, Zenkel et al. [2019], and GIZA++]

SLIDES 55–56

Li vs. Ours

[Figure-only slides: saliency visualizations on an example sentence]
(English: We do not believe that we should cherry-pick.)

SLIDE 57

Summary

  • For each of these interpretation methods:
    • Attention: maximum transparency about how the model works, but hard to interpret
    • Stand-alone alignment models: give the best word alignments, but have nothing to do with the translation model
    • Saliency: a good combination of both worlds!

SLIDE 58

Outline

  • A Quick Tour of Interpretability
    • Model Transparency
    • Post-hoc Interpretations
  • Moving Visual Interpretability to Language:
    • Word Alignment for NMT Via Model Interpretation
    • Benchmarking Interpretations Via Lexical Agreement
  • Future Work

SLIDE 59

How about other NLP tasks?

  • Text Classification: [Aubakirova and Bansal 2016][Arras et al. 2016]
  • Sentiment Analysis: [Li et al. 2016][Arras et al. 2017]
  • Question Answering: [Mudrakarta et al. 2018]

SLIDE 60

Assumption

Post-hoc Interpretation = How the model made its decision

SLIDE 61

Assumption

Post-hoc Interpretation = How the model made its decision?

SLIDES 62–63

Quick Flashback

[Figure-only slides: Li et al. 2016 vs. Ours+SmoothGrad]

SLIDE 64

Research Question

  • How can we quantitatively test the effectiveness of model interpretation methods in the context of NLP?
  • What is this “effectiveness” correlated with: model size? architecture? task performance?

SLIDE 65

Computer Vision

[Figure: Yao et al. 2018, Weakly Supervised Medical Diagnosis and Localization from Multiple Resolutions]

SLIDE 66

Main Challenge

No ground-truth interpretation

SLIDE 67

Lexical Agreements

  • Frequently studied for interpretability [Linzen et al. 2016][Marvin and Linzen 2018][Gulordava et al. 2018][Giulianelli et al. 2018]
  • These works concentrate on evaluating probing-task performance, i.e., whether the model can predict the lexical agreement properly

SLIDES 68–71

E.g. Subject-Verb Agreement

However , most people , having been subjected to news footage of the devastated South Bronx , …

  • A. look  B. looks

“Probing Task”

SLIDES 72–73

The Test

However , most people , having been subjected to news footage of the devastated South Bronx , look
However , most people , having been subjected to news footage of the devastated South Bronx , looks

SLIDE 74

The Test

However , most people , having been subjected to news footage of the devastated South Bronx , look

The interpretation passes the test if ∀ w ∈ {footage, Bronx}: ψ(people) > ψ(w)

(ψ: feature importance/saliency)

SLIDE 75

The Test

However , most people , having been subjected to news footage of the devastated South Bronx , looks

The interpretation passes the test if ∃ w ∈ {footage, Bronx}: ψ(people) < ψ(w)

(ψ: feature importance/saliency)
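A sketch of this pass/fail check covering both verb forms (names hypothetical; `psi` maps each context word to its saliency for the predicted verb):

```python
def passes_test(psi, subject, attractors, verb_agrees):
    """Correct verb: the subject must outweigh every attractor.
    Incorrect verb: some attractor must outweigh the subject."""
    if verb_agrees:
        return all(psi[subject] > psi[w] for w in attractors)
    return any(psi[subject] < psi[w] for w in attractors)

# e.g. passes_test(psi, "people", ["footage", "Bronx"], verb_agrees=True)
```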

SLIDE 76

The Test

  • We constructed test sets based on two existing human-annotated corpora:
    • Penn Treebank: new, multiple attractors
    • syneval: Marvin and Linzen [2018], single attractor
  • We plan to construct another one from the CoNLL-2012 coreference resolution dataset -- stay tuned!

SLIDE 77

Interpreted Model

  • A language model!
  • With the final linear layer replaced by one fine-tuned to predict the specific agreement of interest
  • Plain word prediction may introduce out-of-scope agreements and interfere with evaluation

SLIDE 78

Experiment

  • Architectures:
    • LSTM model, trained on WikiText-2
    • QRNN model [Bradbury et al. 2017], trained on WikiText-2
    • Transformer model w/ adaptive input [Baevski and Auli, 2018], trained on WikiText-103
  • All fine-tuning was done on WikiText-2
  • For subject-verb agreement, verb tagging is done with the Stanford POS tagger

SLIDE 79

Probing Task Performance

[Chart: probing-task accuracy of LSTM, QRNN, and Transformer on penn and syneval]

SLIDE 80

Interpretation of LSTM

[Chart: test pass rate on penn and syneval for random, vanilla, li, li_smoothed, smoothed, and integral interpretation methods]

SLIDE 81

Interpretation of QRNN

[Chart: same setup as above, for QRNN]

SLIDE 82

Interpretation of Transformer

[Chart: same setup as above, for Transformer]

SLIDE 83

What's up with Transformer?

  • Two hypotheses:
    • A deeper model hurts interpretability
    • Too many attention heads hurt interpretability
  • SOTA model: 16 layers, 8 heads
  • Diagnostic models:
    • 4 layers, 8 heads
    • 4 layers, 1 head

SLIDES 84–87

16 layers, 8 heads / 4 layers, 8 heads / 4 layers, 4 heads / 4 layers, 2 heads

[Charts: test pass rate on penn and syneval for random, vanilla, li, li_smoothed, smoothed, and integral, at each model size]

SLIDE 88

Some Qualitative Checks

  • Are those interpretations just looking at the immediately preceding word?
  • No. They seem to get a lot of things right!

SLIDE 89

Some Qualitative Checks

  • Are they the same across different architectures?
  • No. Different architectures work differently.

SLIDE 90

Summary

  • Lexical agreements open up possibilities for rigorous quantitative checks of post-hoc interpretation methods in the context of NLP
  • Our proposed method consistently works best
  • Deep NLP models can be out of reach for existing interpretation methods

SLIDE 91

Outline

  • A Quick Tour of Interpretability
    • Model Transparency
    • Post-hoc Interpretations
  • Moving Visual Interpretability to Language:
    • Word Alignment for NMT Via Model Interpretation
    • Benchmarking Interpretations Via Lexical Agreement
  • Future Work

SLIDE 92

Future Work

  • Better interpretation methods that work for the deep architectures used in NLP
  • How can we use interpretability in real-world applications (QE?), or to improve our models?
  • How can we use interpretability to validate whether a model has learned certain linguistic properties?

SLIDE 93

Thanks!

email: dings@jhu.edu
twitter: @_sding
github: shuoyangd