Adversarial Training for Satire Detection: Controlling for Confounding Variables

University of Stuttgart, Institute for Natural Language Processing. Adversarial Training for Satire Detection: Controlling for Confounding Variables. June 3rd, 2019. Robert McHardy, Heike Adel and Roman Klinger.

SLIDE 1

University of Stuttgart, Institute for Natural Language Processing

Adversarial Training for Satire Detection: Controlling for Confounding Variables

June 3rd, 2019 Robert McHardy, Heike Adel and Roman Klinger

SLIDE 2

Satire & Research Goals Model/Data Experiments & Results Conclusion

Motivation 1: Satire or not?

“After years of fighting there finally is a settlement between the Gema and Youtube. It became known today that in future every music video is allowed to be played back in Germany again, as long as the audio is removed.”

(translated from German)

University of Stuttgart McHardy/Adel/Klinger June 3rd, 2019 2 / 12

SLIDE 3

Motivation 2: Satire or not?

“Erfurt (dpo) – It is an organization which operates outside of law and order, funds numerous NPD operatives and is to a not inconsiderable extent involved in the series of murders of the so-called Zwickauer Zelle.”

(translated from German)

DPA is a German news agency – DPO does not exist (in this context).

SLIDE 4

Outline

1 Satire, Previous Work and Research Goals
2 Model and Data
3 Experiments & Results
4 Conclusion & Availability

SLIDE 5

Satire

Form of art to criticize in an entertaining manner
Stylistic devices include humor, irony, sarcasm
Goal: Mimic regular news in diction
It’s not misinformation or disinformation (fake news): articles typically contain satire markers (similar to irony or sarcasm)

Automatic Satire Detection

Automatically distinguish satirical news from regular news ⇒ Challenging task (even for humans)

SLIDE 6

Previous Work

Yang et al. 2017, De Sarkar et al. 2018
Created data sets which are automatically labeled from publication source
Potential limitation: Models might learn characteristics of publication sources instead of actual characteristics of satire (evaluation is not faulty, they use different publication sources for validation than for training)
⇒ Bad generalization to unseen publication sources?
⇒ Interpretation of models (regarding concepts of satire) misleading?

SLIDE 7

Our Contributions

We propose adversarial training: Improve robustness of model against confounding variable of publication sources
We show that adversarial training is crucial for the model to pay attention to satire instead of publication characteristics
We publish a large German data set for satire detection:
First dataset in German
First dataset including publication sources
Largest resource for satire detection so far

SLIDE 8

Outline

1 Satire, Previous Work and Research Goals
2 Model and Data
3 Experiments & Results
4 Conclusion & Availability

SLIDE 9

Model

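Architecture (diagram): input layer → LSTM layer → attention layer. These layers form the feature extractor, whose output feeds two heads: the satire detector (satire? yes/no) and the publication identifier (publication name).

Gradients shown in the diagram: $\partial J_s / \partial \theta_s$ for the satire detector, $\partial J_p / \partial \theta_p$ for the publication identifier, and both $\partial J_s / \partial \theta_f$ and $-\lambda \, \partial J_p / \partial \theta_f$ for the feature extractor.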
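The gradient flow on this slide can be illustrated with a toy sketch. It is not the authors' implementation: it assumes a single scalar weight per component (`theta_f`, `theta_s`, `theta_p`), a scalar input, sigmoid outputs with cross-entropy losses, and a binary publication label, all of which are simplifications chosen for illustration. What it shows is only the sign pattern: the satire detector and the publication identifier each descend their own loss, while the feature extractor descends the satire loss J_s and receives the publication loss J_p with reversed sign, scaled by λ.

```python
import math
from dataclasses import dataclass


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


@dataclass
class Params:
    theta_f: float  # feature extractor weight (hypothetical scalar stand-in)
    theta_s: float  # satire detector weight
    theta_p: float  # publication identifier weight


def gradients(p: Params, x: float, y_satire: int, y_pub: int) -> dict:
    """Gradients of the two cross-entropy losses J_s and J_p for one example."""
    f = p.theta_f * x                 # shared feature
    s_hat = sigmoid(p.theta_s * f)    # P(satire)
    p_hat = sigmoid(p.theta_p * f)    # P(publication == 1), binary for simplicity
    ds = s_hat - y_satire             # dJ_s / d(logit) for sigmoid + cross-entropy
    dp = p_hat - y_pub                # dJ_p / d(logit)
    return {
        "dJs_dtheta_s": ds * f,
        "dJs_dtheta_f": ds * p.theta_s * x,
        "dJp_dtheta_p": dp * f,
        "dJp_dtheta_f": dp * p.theta_p * x,
    }


def adversarial_step(p: Params, x: float, y_satire: int, y_pub: int,
                     lam: float = 1.0, lr: float = 0.1) -> Params:
    """One SGD step; the feature extractor gets the reversed publication gradient."""
    g = gradients(p, x, y_satire, y_pub)
    return Params(
        theta_f=p.theta_f - lr * (g["dJs_dtheta_f"] - lam * g["dJp_dtheta_f"]),
        theta_s=p.theta_s - lr * g["dJs_dtheta_s"],
        theta_p=p.theta_p - lr * g["dJp_dtheta_p"],
    )


if __name__ == "__main__":
    start = Params(theta_f=0.5, theta_s=-0.3, theta_p=0.8)
    print(adversarial_step(start, x=2.0, y_satire=1, y_pub=0, lam=1.0))
```

Setting `lam=0.0` recovers ordinary multi-task-free training of the satire path, so comparing the two settings isolates the effect of the reversed term on `theta_f`.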

SLIDE 10

Data Collection and Selection

Regular news:

Der Spiegel, Der Standard, Die Zeit, Süddeutsche Zeitung

Satire:

Der Enthüller, Eulenspiegel, Nordd. Nach., Der Postillon, Satirepatzer, Die Tagespresse, Titanic, Welt (Satire), Der Zeitspiegel, Eine Zeitung, Zynismus24

Articles published between January 1st, 2000 and May 1st, 2018

                          Average Length
Publication   #Articles   Article   Sent.   Title
Regular       320,219     663.45    17.79   6.86
Satire        9,643       269.28    18.73   9.52
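The article counts imply a strong class imbalance, which a few lines of arithmetic make explicit. The sketch below uses only the two counts from the statistics on this slide; the variable names are illustrative, and the "majority baseline" is simply the accuracy a classifier would reach by always predicting "regular", included here as context rather than as a result reported by the authors.

```python
# Article counts taken from the data statistics on this slide
n_regular = 320_219
n_satire = 9_643

total = n_regular + n_satire
satire_share = n_satire / total                 # fraction of satirical articles
regular_per_satire = n_regular / n_satire       # regular articles per satire article
majority_baseline_accuracy = n_regular / total  # always predicting "regular"

print(f"total articles:            {total}")
print(f"satire share:              {satire_share:.2%}")
print(f"regular per satire:        {regular_per_satire:.1f}")
print(f"majority-class accuracy:   {majority_baseline_accuracy:.2%}")
```

With fewer than 3% satirical articles, plain accuracy says little about satire detection quality, which is one reason class-sensitive measures are commonly preferred on data like this.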

SLIDE 11

Research Question 1: Performance

How does a decrease in publication classification performance through adversarial training affect the satire classification performance?

SLIDE 12

Research Question 2: Attention Weights

Is adversarial training effective for avoiding that the model pays most attention to the characteristics of publication source rather than actual satire?

Attention weights are shown as token highlighting for the model without adversarial training (no adv) and with adversarial training (adv); the highlighting itself is not preserved in this transcript.

Example 1 (no adv / adv):

Erfurt (dpo) - It is an organization which operates outside of law and order, funds numerous NPD operatives and is to a not inconsiderable extent involved in the series of murders of the so called Zwickauer Zelle.

Example 2 (no adv / adv):

After all, the proposal to allow family reunion only inclusive mothers-in-law is being discussed, whereof the Union hopes for an off-putting effect.
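To make the kind of inspection behind this slide concrete, here is a minimal sketch of how per-token attention weights could be ranked. It assumes the common formulation in which attention weights are a softmax over per-token scores; the slides do not spell out the exact attention function, and the scores below are invented for illustration only, not output of the authors' model. The helper name `top_attended` is likewise hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_attended(tokens, scores, k=3):
    """Return the k tokens with the largest attention weight, highest first."""
    weights = softmax(scores)
    ranked = sorted(zip(tokens, weights), key=lambda tw: tw[1], reverse=True)
    return ranked[:k]


# Tokens from the first example sentence; scores are made up for illustration.
tokens = ["Erfurt", "(", "dpo", ")", "-", "It", "is", "an", "organization"]
scores = [0.2, 0.1, 2.5, 0.1, 0.0, 0.3, 0.1, 0.1, 1.0]

top = top_attended(tokens, scores, k=2)
for token, weight in top:
    print(f"{token:>14}  {weight:.3f}")
```

Running the same ranking on weights from a model trained with and without the adversarial term would give a simple, side-by-side view of which tokens each model relies on.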

SLIDE 13

Conclusion and Availability

Observation: Satire detection models learn characteristics of publication sources

Our Contributions

Adversarial training to control for this confounding variable
⇒ Considerable reduction of publication identification performance while satire detection remains on comparable levels
⇒ Attention weights show effectiveness of our approach

First German dataset for satire detection
⇒ Dataset and code available at: http://www.ims.uni-stuttgart.de/data/germansatire

SLIDE 14

University of Stuttgart, Institute for Natural Language Processing