Semantically Equivalent Adversarial Rules for Debugging NLP Models
Marco Tulio Ribeiro Carlos Guestrin
Sameer Singh (UC Irvine)
NLP / ML models are getting smarter: VQA
What type of road sign is shown? > STOP .
Visual7W [Zhu et al 2016]
The biggest city on the river Rhine is Cologne, Germany with a population
It is the second-longest river in Central and Western Europe (after the Danube), at about 1,230 km (760 mi)
How long is the Rhine? > 1230km
BiDAF [Seo et al 2017]
“panda” 57.7% confidence “gibbon” 99.3% confidence
What type of road sign is shown? > STOP . What type of road sign is shown?
What type of road sign is sho wn?
What type of road sign is shown? > Do not Enter. > STOP . What type of road sign is shown?
The biggest city on the river Rhine is Cologne, Germany with a population
It is the second-longest river in Central and Western Europe (after the Danube), at about 1,230 km (760 mi) How long is the Rhine? > More than 1,050,000 > 1230km How long is the Rhine?
What color is the sky? > Gray. > Blue. What color is the sky?
Detailed investigation of chum salmon, Oncorhynchus keta, showed that these fish digest ctenophores 20 times as fast as an equal weight of shrimps. What is the oncorhynchus also called? > Oncorhynchus keta What is the oncorhynchus also called?
> chum salmon
Sentence X: Good movie!
Translators: en - pt (Portuguese Translation: Bom filme!), en - fr (French Translation: Bon film!)
Back translators: pt - en, fr - en
Score:
Good movie 0.35
Good film 0.34
Great movie 0.1
…
Movie good 0.001
[Mallinson et al, 2017]
What color is the tray? > Pink
What colour is the tray? > Green
Which color is the tray? > Green
What color is it? > Green
What color is tray? > Pink
How color is the tray? > Green
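The tray example can be read as a small search problem: among paraphrases that score as close enough to the original question, keep the ones that change the model's answer. Below is a minimal sketch of that filter. The predictions are copied from the slide and stand in for a real VQA model; the semantic scores and the threshold are hypothetical values chosen only for illustration.

```python
from typing import Callable, Dict, List, Tuple


def find_seas(
    x: str,
    paraphrase_scores: Dict[str, float],
    predict: Callable[[str], str],
    threshold: float,
) -> List[Tuple[str, str]]:
    """Return (paraphrase, new_prediction) pairs that score at or above
    the threshold and change the model's prediction for x."""
    original = predict(x)
    adversaries = []
    for candidate, score in paraphrase_scores.items():
        if candidate == x or score < threshold:
            continue  # skip the input itself and low-scoring paraphrases
        prediction = predict(candidate)
        if prediction != original:
            adversaries.append((candidate, prediction))
    return adversaries


# Predictions copied from the slide; this lookup stands in for a real model.
SLIDE_PREDICTIONS = {
    "What color is the tray?": "Pink",
    "What colour is the tray?": "Green",
    "Which color is the tray?": "Green",
    "What color is it?": "Green",
    "What color is tray?": "Pink",
    "How color is the tray?": "Green",
}

# Hypothetical semantic scores (assumption, not from the talk).
HYPOTHETICAL_SCORES = {
    "What colour is the tray?": 0.95,
    "Which color is the tray?": 0.90,
    "What color is it?": 0.60,
    "What color is tray?": 0.40,
    "How color is the tray?": 0.20,
}

seas = find_seas(
    "What color is the tray?",
    HYPOTHETICAL_SCORES,
    SLIDE_PREDICTIONS.__getitem__,
    threshold=0.8,  # hypothetical cut-off
)
print(seas)
```

With these made-up scores, only the "colour" and "Which" variants pass the threshold and flip the answer; "What color is tray?" would be rejected both for its low score and because the prediction stays Pink.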
Find SEAs → Propose Candidate Rules → Select Small Rule Set
What type of road sign is shown?
Candidate Rules:
(What → Which)
(What type → Which type)
(What NOUN → Which NOUN)
(WP type → Which type)
(WP NOUN → Which NOUN)
…
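One way to arrive at candidate rules like these is to diff the original question against its adversarial paraphrase, take the changed span with and without one token of context, and then optionally replace tokens by their part-of-speech tags. The sketch below does that for the road-sign example. `TOY_TAGS` is a hypothetical hand-written lookup that replaces a real POS tagger, and the choice of a single token of right-hand context is an assumption made for brevity.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Set, Tuple

Rule = Tuple[str, str]

# Hypothetical tag lookup for this one example; a real system would run a POS tagger.
TOY_TAGS: Dict[str, str] = {"What": "WP", "type": "NOUN"}


def propose_rules(x: str, x_adv: str, tags: Dict[str, str]) -> Set[Rule]:
    """Propose (lhs, rhs) rewrite rules from one (original, paraphrase) pair."""
    a, b = x.split(), x_adv.split()
    ops = [op for op in SequenceMatcher(a=a, b=b, autojunk=False).get_opcodes()
           if op[0] != "equal"]
    if not ops:
        return set()  # identical sentences: nothing to propose
    # Smallest token span covering every change, on each side.
    i1, i2 = ops[0][1], ops[-1][2]
    j1, j2 = ops[0][3], ops[-1][4]

    def tagged(tokens: List[str]) -> List[str]:
        return [tags.get(tok, tok) for tok in tokens]

    rules: Set[Rule] = set()
    for n_context in (0, 1):  # zero or one token of right-hand context
        context = a[i2:i2 + n_context]
        if len(context) < n_context:
            continue
        # The changed source tokens and the shared context may each stay
        # literal or be generalized to tags; the replacement stays literal.
        for changed in (a[i1:i2], tagged(a[i1:i2])):
            for ctx in (context, tagged(context)):
                rules.add((" ".join(changed + ctx), " ".join(b[j1:j2] + ctx)))
    return rules


candidates = propose_rules(
    "What type of road sign is shown?",
    "Which type of road sign is shown?",
    TOY_TAGS,
)
for lhs, rhs in sorted(candidates):
    print(f"({lhs} → {rhs})")
```

Under these assumptions the sketch yields the five rules listed on the slide plus (WP → Which), the fully generalized version of the shortest rule.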
What type of road sign is shown? → Which type of road sign is shown?
What is the person looking at? → Which is the person looking at?
What was I thinking? → Which was I thinking?
Find SEAs → Propose Candidate Rules → Select Small Rule Set
Selected Rules:
(What NOUN → Which NOUN)
(What type → Which type)
(color → colour)
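The slide names the selection step without spelling out a procedure. One plausible reading, sketched below, is a greedy coverage heuristic: repeatedly pick the rule that flips the most instances not already flipped by a chosen rule, until a budget is reached. The function name, the budget, and the instance ids are all hypothetical.

```python
from typing import Dict, List, Set


def select_rules(flips: Dict[str, Set[int]], budget: int) -> List[str]:
    """Greedily choose up to `budget` rules, each time taking the rule that
    flips the most instances no chosen rule has flipped yet."""
    covered: Set[int] = set()
    chosen: List[str] = []
    remaining = dict(flips)
    while remaining and len(chosen) < budget:
        best = max(remaining, key=lambda rule: len(remaining[rule] - covered))
        if not remaining[best] - covered:
            break  # every remaining rule is redundant
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen


# Hypothetical data: rule -> ids of validation instances whose prediction it flips.
HYPOTHETICAL_FLIPS = {
    "What NOUN → Which NOUN": {1, 2, 3, 4, 5},
    "WP type → Which type": {1, 2},
    "color → colour": {6, 7},
}

selected = select_rules(HYPOTHETICAL_FLIPS, budget=2)
print(selected)
```

In this toy data, (WP type → Which type) is skipped because every instance it flips is already covered by the broader rule, which is the kind of redundancy a small rule set is meant to avoid.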
Visual7W-Telling [Zhu et al 2016]
BiDAF [Seo et al 2017]
FastText [Joulin et al 2016]
Humans
Top scored SEA
SEA (top 5) + Human
Evaluate adversaries for semantic equivalence
[Figure: bar charts for Visual Question Answering and Sentiment Analysis, with bars for Human, SEA, and Human + SEA. Values shown: 45, 36, 33.6 and 33, 26, 25.3.]
They are so easy to love…
What kind of meat is on the boy’s plate?
How many suitcases?
Also great directing and photography
[Figure: % correct predictions flipped, Experts vs. SEARs, for Visual QA and Sentiment. Values shown: 10.9, 14.2, 3.3, 3.]
[Figure: Time (minutes), Finding Rules vs. Evaluating SEARs, for Visual QA and Sentiment. Values shown: 5.4, 10.1, 12.9, 16.9.]
(color → colour) (WP VBZ → WP’s) …
Filter out bad rules → Augment training data → Retrain model
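The augmentation step can be sketched as applying each accepted rule to every training example and keeping the original label, on the grounds that an accepted rule should not change the meaning. The sketch below handles only literal rules such as (color → colour); tag-based rules such as (WP VBZ → WP’s) would need a POS tagger and are left out. The training pairs reuse questions and answers from earlier slides purely as toy data, and `apply_rule` and `augment` are hypothetical helper names.

```python
import re
from typing import List, Tuple

Rule = Tuple[str, str]      # (literal text to find, literal replacement)
Example = Tuple[str, str]   # (input text, label)


def apply_rule(text: str, rule: Rule) -> str:
    """Replace whole-word occurrences of the rule's left side."""
    lhs, rhs = rule
    # A function replacement keeps rhs literal (no backslash escapes).
    return re.sub(rf"\b{re.escape(lhs)}\b", lambda _match: rhs, text)


def augment(train: List[Example], rules: List[Rule]) -> List[Example]:
    """Return the training set plus rule-rewritten copies with unchanged labels."""
    augmented = list(train)
    seen = {text for text, _ in train}
    for text, label in train:
        for rule in rules:
            new_text = apply_rule(text, rule)
            if new_text not in seen:  # skip no-op rewrites and duplicates
                seen.add(new_text)
                augmented.append((new_text, label))
    return augmented


# Toy data built from examples on earlier slides.
TRAIN = [
    ("What color is the tray?", "Pink"),
    ("What type of road sign is shown?", "STOP"),
]
ACCEPTED_RULES = [("color", "colour"), ("What type", "Which type")]

augmented_train = augment(TRAIN, ACCEPTED_RULES)
for text, label in augmented_train:
    print(text, ">", label)
```

Retraining on `augmented_train` is then an ordinary training run; nothing in the sketch depends on the model.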
[Figure: % of flips due to bugs, Original vs. Augmented, for Visual QA and Sentiment. Values shown: 3.4, 1.4, 12.6, 12.6.]
SEA SEARS
good movie
Good movie 0.35 → 1
Good film 0.34 → 0.97
Great movie 0.1 → 0.29
…
good
good 0.7 → 1
great 0.2 → 0.29
excellent 0.05 → 0.07
…
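The second row of numbers on this slide is consistent with dividing each paraphrase score by the score the sentence assigns to itself, so that sentences of different lengths become comparable. A minimal sketch of that arithmetic follows. Treating the first entry in each group as the original sentence, and capping the ratio at 1, are assumptions; the raw scores are the ones on the slide.

```python
from typing import Dict


def normalized_scores(raw: Dict[str, float], original: str) -> Dict[str, float]:
    """Divide every paraphrase score by the original sentence's own score,
    capped at 1 (the cap is an assumption), rounded to two decimals."""
    self_score = raw[original]
    return {text: round(min(1.0, score / self_score), 2)
            for text, score in raw.items()}


# Raw scores from the slide.
movie = normalized_scores(
    {"Good movie": 0.35, "Good film": 0.34, "Great movie": 0.1},
    original="Good movie",
)
good = normalized_scores(
    {"good": 0.7, "great": 0.2, "excellent": 0.05},
    original="good",
)
print(movie)
print(good)
```

Before normalizing, "Good film" (0.34) looks closer to its source than "great" (0.2) does mainly because the two sources have different self-scores; after normalizing, 0.97 against 0.29 reflects how much each paraphrase drifts from its own original.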
FastText [Joulin et al 2016]