Interpretability in NLP:
Moving Beyond Vision
Shuoyang Ding
Microsoft Translator Talk Series, Oct 10th, 2019
Work done in collaboration with Philipp Koehn and Hainan Xu
Outline
A Quick Tour of Interpretability
Model Transparency
Shuoyang Ding — Interpretability in NLP: Moving Beyond Vision
[Figure: Speaker, TV Box, CD, Laptop, Game Console]
[Figure: Speaker, TV Box, CD, Laptop, Game Console, with "? ? ?" marking unknowns]
[Figure: Speaker, TV Box, CD, Laptop, Game Console, with an Amplifier and labels 1 2 3 4]
task, but with easily explainable behaviors
task!)
when :
the interpreted model
https://pair-code.github.io/saliency/
Image Saliency
points where the gradients are noisy.
[Smilkov et al. 2017]
the same input corrupted with Gaussian noise, and average the saliency of the copies.
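The averaging step can be sketched as follows. This is a minimal NumPy sketch under stated assumptions, not the implementation used in the talk: `grad_fn`, `n_samples`, `sigma` and `seed` are hypothetical names, and `grad_fn` is assumed to return the gradient of the model output with respect to its input.

```python
import numpy as np

def smoothgrad(grad_fn, x, n_samples=50, sigma=0.1, seed=0):
    """Average the saliency (gradient) over noisy copies of the input x.

    grad_fn is an assumed callable mapping an input array to the gradient
    of the model output with respect to that input.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    total = np.zeros_like(x)
    for _ in range(n_samples):
        # Corrupt the input with Gaussian noise, then accumulate its saliency.
        noisy = x + rng.normal(0.0, sigma, size=x.shape)
        total += grad_fn(noisy)
    return total / n_samples

# Toy model (an assumption for illustration): y = sum(sin(x)), so dy/dx = cos(x).
x = np.array([0.0, 1.0, 2.0])
smoothed = smoothgrad(np.cos, x)
```

For a model whose gradient does not depend on the input (a linear model), the smoothed saliency equals the vanilla gradient, so the noise only matters where the gradient varies locally.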
[Figure: Original Image, Vanilla, SmoothGrad, Integrated Gradients. Source: https://pair-code.github.io/saliency/]
[Sundararajan et al. 2017]
feature saturation
carries no information
interpolated baseline & input and average by integration
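The interpolate-and-average idea can be sketched as a Riemann sum. This is a minimal NumPy sketch under stated assumptions, not the talk's code: `grad_fn`, `baseline` and `n_steps` are hypothetical names, the all-zero baseline is one assumed choice of an input that carries no information, and the attribution formula is the standard one from Sundararajan et al. (average gradient along the straight path, scaled by input minus baseline).

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline, n_steps=100):
    """Riemann-sum approximation of integrated gradients.

    grad_fn is an assumed callable returning d(output)/d(input).
    """
    x = np.asarray(x, dtype=float)
    baseline = np.asarray(baseline, dtype=float)
    # Midpoints of n_steps equal sub-intervals of [0, 1].
    alphas = (np.arange(n_steps) + 0.5) / n_steps
    total = np.zeros_like(x)
    for alpha in alphas:
        # Gradient at a point interpolated between baseline and input.
        total += grad_fn(baseline + alpha * (x - baseline))
    # Average gradient along the path, scaled by the input difference.
    return (x - baseline) * total / n_steps

# Toy model (an assumption for illustration): y = sum(x**2), so dy/dx = 2x.
x = np.array([1.0, -2.0, 3.0])
baseline = np.zeros_like(x)
attributions = integrated_gradients(lambda z: 2.0 * z, x, baseline)
```

In this toy case the attributions sum to the difference between the model output at the input and at the baseline, which is a convenient sanity check on any implementation.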
[Figure: Speaker, TV Box, CD, Computer, Game Console (shown twice)]
Model Transparency:
an explainable way
depend on output
Post-hoc interpretation:
specific output
Wait… word alignments should be aware of the output!
Hint: GIZA++, fast-align, etc.
Photo Credit: Hainan Xu
It’s straightforward to compute saliency for a single dimension of the word embedding.
But how do we compose the saliency of each dimension into the saliency of a word?
Visualizing and Understanding Neural Models in NLP
$\frac{1}{N} \sum_{i=1}^{N} \left| \frac{\partial y}{\partial e_i} \right|$, range: $(0, \infty)$
Consider word embedding look-up as a dot product between the embedding matrix and a one-hot vector.
The 1 in the one-hot vector denotes the identity of the input word.
Let’s perturb that 1 as if it were a real value, i.e. take gradients with respect to the 1.
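One way to spell out this step with the chain rule, as a sketch: here $E$ (the embedding matrix), $o$ (the one-hot vector) and $w$ (the index of the input word) are symbols introduced for illustration, while $y$ is the model output and $e_i$ the $i$-th dimension of the word's embedding.

```latex
e = E^{\top} o, \qquad e_i = \sum_{j} E_{ji}\, o_j
\quad\Longrightarrow\quad
\frac{\partial y}{\partial o_w}
  = \sum_{i} \frac{\partial y}{\partial e_i}\,\frac{\partial e_i}{\partial o_w}
  = \sum_{i} E_{wi}\,\frac{\partial y}{\partial e_i}
  = \sum_{i} e_i \cdot \frac{\partial y}{\partial e_i}
```

The last equality holds because $o$ is one-hot at position $w$, so $e_i = E_{wi}$.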
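$\sum_{i} e_i \cdot \frac{\partial y}{\partial e_i}$, range: $(-\infty, \infty)$
Recall this is different from Li’s proposal: $\frac{1}{N} \sum_{i=1}^{N} \left| \frac{\partial y}{\partial e_i} \right|$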
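The two ways of composing a word-level saliency can be compared on a toy model. This is a minimal NumPy sketch under stated assumptions: the model `y = tanh(w . e)`, the 5-word vocabulary, and all function and variable names are hypothetical, chosen only so the gradient can be written in closed form. The finite-difference check perturbs the 1 in the one-hot vector directly.

```python
import numpy as np

def word_saliency_ours(embedding, grad):
    """sum_i e_i * dy/de_i: gradient with respect to the 1 in the one-hot vector."""
    return float(np.dot(embedding, grad))

def word_saliency_li(grad):
    """(1/N) sum_i |dy/de_i|: averaged absolute gradient over dimensions."""
    return float(np.mean(np.abs(grad)))

# Hypothetical toy setup: 5-word vocabulary, 4-dimensional embeddings.
rng = np.random.default_rng(0)
E = rng.normal(size=(5, 4))
w = rng.normal(size=4)
word = 2
onehot = np.zeros(5)
onehot[word] = 1.0

def model(onehot_vec):
    # Embedding look-up as a product with the one-hot vector, then a toy output.
    return np.tanh(w @ (onehot_vec @ E))

e = onehot @ E
grad_e = (1.0 - np.tanh(w @ e) ** 2) * w  # closed-form dy/de for the toy model
ours = word_saliency_ours(e, grad_e)
li = word_saliency_li(grad_e)

# Perturb the 1 like a real value and measure the change in the output.
eps = 1e-6
bumped = onehot.copy()
bumped[word] += eps
finite_diff = (model(bumped) - model(onehot)) / eps
```

Here `ours` agrees with `finite_diff` up to numerical error and can be negative, while `li` is never negative, matching the two ranges given above.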
translation and still carry a large (negative) gradient.
Transformer (with fairseq default hyperparameters)
covers de-en, fr-en and ro-en.
input samples, then average the attention weights over samples
embedding gradients, then average over embedding dimensions
[Figure: AER by method: Attention, Smoothed Attention, Li+Grad, Li+SmoothGrad, Ours+Grad, Ours+SmoothGrad, fast-align, Zenkel et al. [2019], GIZA++. Axis runs from 15 to 45 and is annotated Better and Worse.]
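For readers unfamiliar with the metric, AER (alignment error rate) can be computed as below. This sketch assumes the standard definition with sure links S and possible links P, AER = 1 - (|A ∩ S| + |A ∩ P|) / (|A| + |S|); the slides do not state which variant was used, and the function name and toy links are hypothetical.

```python
def alignment_error_rate(hypothesis, sure, possible):
    """AER = 1 - (|A & S| + |A & P|) / (|A| + |S|).

    Each argument is a collection of (source_index, target_index) pairs;
    sure links are treated as a subset of possible links.
    """
    a = set(hypothesis)
    s = set(sure)
    p = set(possible) | s
    if not a and not s:
        return 0.0
    return 1.0 - (len(a & s) + len(a & p)) / (len(a) + len(s))

# Hypothetical toy alignment, for illustration only.
sure = {(0, 0), (1, 1)}
possible = {(0, 0), (1, 1), (2, 1)}
hypothesis = {(0, 0), (2, 1), (2, 2)}
aer = alignment_error_rate(hypothesis, sure, possible)
```

Lower values are better: a hypothesis that reproduces exactly the sure links scores 0.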
[Figure: AER by architecture (Conv, LSTM, Transformer) alongside fast-align, Zenkel et al. [2019], GIZA++. Axis runs from 15 to 65 and is annotated Better and Worse.]
(English: We do not believe that we should cherry-pick.)
model works, but is hard to interpret
alignments, but has nothing to do with the translation model
[Aubakirova and Bansal 2016][Arras et al. 2016]
[Li et al. 2016][Arras et al. 2017]
[Mudrakarta et al. 2018]
Post-hoc Interpretation = How did the model make its decision?
[Figure: Li et al. 2016 vs. Ours+SmoothGrad]
NLP?
model size? architecture? task performance?
Yao et al. 2018, "Weakly Supervised Medical Diagnosis and Localization from Multiple Resolutions"
2016][Marvin and Linzen 2018][Gulordava et al. 2018][Giulianelli et al. 2018]
performance, i.e. whether the model can predict the lexical agreements properly
However , most people , having been subjected to news footage of the devastated South Bronx , …
However , most people , having been subjected to news footage of the devastated South Bronx , look
However , most people , having been subjected to news footage of the devastated South Bronx , looks
However , most people , having been subjected to news footage of the devastated South Bronx , look
The interpretation passes the test if ∀ w ∈ {footage, Bronx}, ψ(people) > ψ(w)
ψ: feature importance/saliency
However , most people , having been subjected to news footage of the devastated South Bronx , looks
The interpretation passes the test if ∃ w ∈ {footage, Bronx} s.t. ψ(people) < ψ(w)
ψ: feature importance/saliency
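The two pass conditions can be written as a single check. This is a minimal sketch: the function name, the terms "cue" and "attractors", and the saliency scores are all hypothetical and introduced only for illustration; the conditions themselves follow the two slides (every competing noun must rank below "people" for the correct verb, and at least one must rank above it for the incorrect verb).

```python
def passes_agreement_test(saliency, cue, attractors, verb_is_correct):
    """Check the pass condition for one sentence.

    saliency maps each word to its importance score (psi).
    For the correct verb, the cue must outrank every attractor;
    for the incorrect verb, at least one attractor must outrank the cue.
    """
    if verb_is_correct:
        return all(saliency[cue] > saliency[w] for w in attractors)
    return any(saliency[cue] < saliency[w] for w in attractors)

# Hypothetical saliency scores, for illustration only.
psi_look = {"people": 0.9, "footage": 0.2, "Bronx": 0.1}
psi_looks = {"people": 0.1, "footage": 0.6, "Bronx": 0.05}
result_look = passes_agreement_test(psi_look, "people", ["footage", "Bronx"], True)
result_looks = passes_agreement_test(psi_looks, "people", ["footage", "Bronx"], False)
```

Both toy cases pass; swapping the `verb_is_correct` flag on `psi_looks` would fail, because "footage" outranks "people" there.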
human-annotated corpus
coreference resolution dataset -- stay tuned!
fine-tuned for predicting specific agreement of interest
agreements and interfere with evaluation
trained on WikiText-103
Stanford POS-tagger
[Figure: penn and syneval results for LSTM, QRNN, Transformer. Axis runs from 0.25 to 1.]
[Figure: penn and syneval results for random, vanilla, li, li_smoothed, smoothed, integral. Axis runs from 0.3 to 1.]
immediate previous word?
rigorous quantitative checks for post-hoc interpretation methods in the context of NLP
interpretation methods.
deep architectures in NLP.
applications (QE?), or improve our models?
the model learned certain linguistic properties?
email: dings@jhu.edu
twitter: @_sding
github: shuoyangd