

SLIDE 1

University of Stuttgart
Institute for Natural Language Processing

Adversarial Training for Satire Detection: Controlling for Confounding Variables

June 3rd, 2019 Robert McHardy, Heike Adel and Roman Klinger

SLIDE 2

Satire & Research Goals Model/Data Experiments & Results Conclusion

Motivation 1: Satire or not?

“After years of fighting there finally is a settlement between GEMA and YouTube. It became known today that in future every music video is allowed to be played back in Germany again, as long as the audio is removed.”

(translated from German)

University of Stuttgart McHardy/Adel/Klinger June 3rd, 2019 2 / 12

SLIDE 3

Motivation 2: Satire or not?

“Erfurt (dpo) – It is an organization which operates outside of law and order, funds numerous NPD operatives and is to a not inconsiderable extent involved in the series of murders of the so-called Zwickauer Zelle.”

(translated from German)

DPA is a German news agency – DPO does not exist (in this context).


SLIDE 4

Outline

  1 Satire, Previous Work and Research Goals
  2 Model and Data
  3 Experiments & Results
  4 Conclusion & Availability

SLIDE 5

Satire

  • Form of art that criticizes in an entertaining manner
  • Stylistic devices include humor, irony, sarcasm
  • Goal: Mimic regular news in diction
  • It’s not misinformation or disinformation (fake news): articles typically contain satire markers (similar to irony or sarcasm)

Automatic Satire Detection

  • Automatically distinguish satirical news from regular news

⇒ Challenging task (even for humans)


SLIDE 6

Previous Work

Yang et al. 2017, De Sarkar et al. 2018

  • Created data sets which are automatically labeled based on the publication source
  • Potential limitation: Models might learn characteristics of publication sources instead of actual characteristics of satire
  • (The evaluation is not faulty: they use different publication sources for validation than for training)

⇒ Bad generalization to unseen publication sources?
⇒ Interpretation of models (regarding concepts of satire) misleading?
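
The point about using different publication sources for validation than for training can be illustrated with a source-disjoint split. The following is only a minimal sketch, not the procedure of the cited papers: the function `split_by_source`, the record layout and the placeholder texts are assumptions made for illustration, and the source names are borrowed from the data slide of this deck purely as labels.

```python
def split_by_source(articles, validation_sources):
    """Split records so that no publication source occurs on both sides."""
    validation_sources = set(validation_sources)
    train = [a for a in articles if a["source"] not in validation_sources]
    validation = [a for a in articles if a["source"] in validation_sources]
    return train, validation


# Hypothetical toy records; the texts are placeholders, not corpus content.
articles = [
    {"text": "placeholder regular article 1", "satire": False, "source": "Der Spiegel"},
    {"text": "placeholder regular article 2", "satire": False, "source": "Die Zeit"},
    {"text": "placeholder satirical article 1", "satire": True, "source": "Der Postillon"},
    {"text": "placeholder satirical article 2", "satire": True, "source": "Titanic"},
]

# Hold out one regular and one satirical source for validation.
train, validation = split_by_source(articles, {"Die Zeit", "Titanic"})

train_sources = {a["source"] for a in train}
held_out_sources = {a["source"] for a in validation}
print(sorted(train_sources), sorted(held_out_sources))
```

Such a split keeps the evaluation honest, but it does not stop a model from learning source-specific cues during training, which is the limitation raised above.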


SLIDE 7

Our Contributions

  • We propose adversarial training: Improve robustness of the model against the confounding variable of publication sources
  • We show that adversarial training is crucial for the model to pay attention to satire instead of publication characteristics
  • We publish a large German data set for satire detection:
      • First dataset in German
      • First dataset including publication sources
      • Largest resource for satire detection so far



SLIDE 9

Model

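[Figure: model architecture. A feature extractor (input layer, LSTM layer, attention layer) feeds two classifiers: the satire detector (output: satire? yes/no) and the publication identifier (output: publication name).]

Gradients shown in the figure:

  • Satire detector: $\frac{\partial J_s}{\partial \theta_s}$
  • Publication identifier: $\frac{\partial J_p}{\partial \theta_p}$
  • Feature extractor: $\frac{\partial J_s}{\partial \theta_f}$ and $-\lambda \frac{\partial J_p}{\partial \theta_f}$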
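
The gradient terms on this slide suggest a gradient-reversal style update: the feature extractor follows the satire loss J_s but receives the publication loss J_p with a flipped sign, scaled by λ. The following is a minimal sketch of one such update step, assuming plain gradient descent. The function `adversarial_step`, the learning rate `lr`, the dictionary keys and the toy gradient values are all assumptions for illustration and do not come from the slides.

```python
def adversarial_step(theta_f, theta_s, theta_p, grads, lr=0.5, lam=0.5):
    """One gradient-descent step with a reversed publication gradient.

    grads maps hypothetical keys to gradient lists:
      "Js_s": dJ_s/dtheta_s    "Js_f": dJ_s/dtheta_f
      "Jp_p": dJ_p/dtheta_p    "Jp_f": dJ_p/dtheta_f
    """
    # Satire detector: minimise the satire loss J_s.
    new_s = [t - lr * g for t, g in zip(theta_s, grads["Js_s"])]
    # Publication identifier: minimise the publication loss J_p.
    new_p = [t - lr * g for t, g in zip(theta_p, grads["Jp_p"])]
    # Feature extractor: descend on J_s, but ascend on J_p (scaled by lam),
    # i.e. apply dJ_s/dtheta_f - lam * dJ_p/dtheta_f.
    new_f = [
        t - lr * (gs - lam * gp)
        for t, gs, gp in zip(theta_f, grads["Js_f"], grads["Jp_f"])
    ]
    return new_f, new_s, new_p


# Toy one-dimensional parameters and gradients (made-up numbers).
grads = {"Js_s": [2.0], "Js_f": [2.0], "Jp_p": [4.0], "Jp_f": [2.0]}
new_f, new_s, new_p = adversarial_step([1.0], [1.0], [1.0], grads)
print(new_f, new_s, new_p)
```

With `lam=0` the feature extractor is trained on the satire loss alone. Larger values of `lam` push the shared features away from anything that helps identify the publication.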


SLIDE 10

Data Collection and Selection

  • Regular news: Der Spiegel, Der Standard, Die Zeit, Süddeutsche Zeitung
  • Satire: Der Enthüller, Eulenspiegel, Nordd. Nach., Der Postillon, Satirepatzer, Die Tagespresse, Titanic, Welt (Satire), Der Zeitspiegel, Eine Zeitung, Zynismus24
  • Articles published between January 1st, 2000 and May 1st, 2018

                            Average Length
Publication   #Articles   Article   Sent.   Title
Regular         320,219    663.45   17.79    6.86
Satire            9,643    269.28   18.73    9.52
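
The article counts in the table imply a strong class imbalance, which the short calculation below works out. Only the two counts are taken from the slide. The variable names are chosen for illustration.

```python
# Article counts as given in the table on this slide.
regular_articles = 320_219
satire_articles = 9_643

total_articles = regular_articles + satire_articles
satire_share = satire_articles / total_articles
regular_per_satire = regular_articles / satire_articles

print(f"total articles: {total_articles}")
print(f"share of satirical articles: {satire_share:.2%}")
print(f"regular articles per satirical article: {regular_per_satire:.1f}")
```

Satirical articles make up roughly 3% of the corpus, so a classifier that always predicts "regular" would already reach high accuracy. Accuracy alone is therefore not an informative measure on this data.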


SLIDE 11

Research Question 1: Performance

How does a decrease in publication classification performance through adversarial training affect the satire classification performance?


SLIDE 12

Research Question 2: Attention Weights

Is adversarial training effective for avoiding that the model pays most attention to the characteristics of the publication source rather than actual satire?

[Figure: attention weights over two example sentences, each shown without ("no adv") and with ("adv") adversarial training]

Example 1 (no adv / adv):

Erfurt (dpo) - It is an organization which operates outside of law and order, funds numerous NPD operatives and is to a not inconsiderable extent involved in the series of murders of the so-called Zwickauer Zelle.

Example 2 (no adv / adv):

After all, the proposal to allow family reunion only inclusive mothers-in-law is being discussed, whereof the Union hopes for an off-putting effect.
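
As a rough illustration of what "attention weights" over a sentence are, the sketch below normalises per-token scores with a softmax. This is an assumption about the general form of attention, not the model's actual attention layer. The function `attention_weights` and the scores are hypothetical and do not reproduce the weights shown on the slide.

```python
import math


def attention_weights(scores):
    """Turn per-token scores into weights that sum to 1 (softmax)."""
    highest = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# First tokens of the first example sentence with made-up scores.
tokens = ["Erfurt", "(", "dpo", ")", "-", "It", "is"]
scores = [0.1, 0.0, 2.0, 0.0, 0.0, 0.2, 0.1]

weights = attention_weights(scores)
most_attended = tokens[weights.index(max(weights))]

for token, weight in zip(tokens, weights):
    print(f"{token:>8} {weight:.3f}")
print("most attended token:", most_attended)
```

Inspecting which tokens receive the largest weights is one way to check whether a model relies on source-specific cues or on the content of the sentence.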


SLIDE 13

Conclusion and Availability

  • Observation: Satire detection models learn characteristics of publication sources

Our Contributions

  • Adversarial training to control for this confounding variable

⇒ Considerable reduction of publication identification performance while satire detection remains on comparable levels
⇒ Attention weights show effectiveness of our approach

  • First German dataset for satire detection

⇒ Dataset and code available at: http://www.ims.uni-stuttgart.de/data/germansatire

