SLIDE 1

Introduction Adversarial Training Distant Supervision Semi-supervision Summary

Adversarial Training for Weakly Supervised Event Detection

Xiaozhi Wang¹, Xu Han¹, Zhiyuan Liu¹, Maosong Sun¹, Peng Li²

¹ Department of Computer Science and Technology, Tsinghua University
² Pattern Recognition Center, WeChat, Tencent Inc.

July 22, 2019

Xiaozhi Wang, Xu Han, Zhiyuan Liu, Maosong Sun, Peng Li Adversarial Training for Weakly Supervised Event Detection 1 / 22

SLIDE 2

Introduction

  • Event Detection: detect event triggers and identify their event types.

Mark Twain and Olivia Langdon married in 1870.

Trigger: married
Event Type: Marry

  • First stage of event extraction.
  • Important for downstream NLP applications.
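As a hedged illustration of the task's input and output, the sketch below stores the example sentence's event in a hypothetical `EventMention` record and finds it with a toy lexicon lookup. The lexicon, the record and the function name are assumptions made for illustration; a real detector is a learned classifier, not a dictionary.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class EventMention:
    trigger: str      # the word that evokes the event
    position: int     # token index of the trigger in the sentence
    event_type: str   # label from the event schema


# Hypothetical toy lexicon (an assumption); real systems learn this mapping.
TOY_LEXICON: Dict[str, str] = {"married": "Marry", "sued": "Sue"}


def detect_events(tokens: List[str]) -> List[EventMention]:
    """Return one EventMention for every token found in the toy lexicon."""
    return [
        EventMention(tok, i, TOY_LEXICON[tok.lower()])
        for i, tok in enumerate(tokens)
        if tok.lower() in TOY_LEXICON
    ]


sentence = "Mark Twain and Olivia Langdon married in 1870"
mentions = detect_events(sentence.split())
print(mentions)
```

The output pairs each trigger with its position and type, which is the shape of prediction an event detection model is expected to produce.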

SLIDE 3

Challenge: data sparsity

33 event types, 599 documents, 6,000+ instances

Figure 1: Statistics of the ACE 2005 English data. Thanks to Chen et al., 2017.

SLIDE 4

Related Work: Distant Supervision

(a) Automatically Labeled Data Generation for Large Scale Event Extraction (Chen et al., 2017)
(b) Open-Domain Event Detection using Distant Supervision (Araki et al., 2018)

SLIDE 5

Related Work: Semi-supervision

Figure 2: Bootstrapped Training of Event Extraction Classifiers (Huang et al., 2012)

SLIDE 6

Related Work: Weakness

  • Sophisticated pre-defined rules: topic bias.
  • Existing instances in knowledge bases: low coverage.

SLIDE 7

Our Model

  • Adversarial training to denoise data without supervision.
  • Trigger-based latent instance discovery strategy to automatically construct a large-scale candidate set with good coverage.

SLIDE 8

Overall architecture

Figure 3: The overall architecture. The event type is Contact.

SLIDE 9

Adversarial Training

  • Discriminator
      • Detects events correctly.
      • Should resist noise.
  • Generator
      • Tries to confuse the discriminator.

SLIDE 10

Overall architecture

Figure 4: The overall architecture. The event type is Contact.

SLIDE 11

Overall architecture

Figure 5: The overall architecture. The event type is Contact.

SLIDE 12

Adversarial Training

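  • Discriminator
      • Treats x ∈ R as positive instances and x ∈ U as negative instances.
      • $\phi_D = \max \left( \mathbb{E}_{x \sim P_R} \left[ \log P(e \mid x, t) \right] + \mathbb{E}_{x \sim P_U} \left[ \log \left( 1 - P(e \mid x, t) \right) \right] \right)$
  • Generator
      • Selects the most confusing x ∈ U to fool the discriminator.
      • $\phi_G = \max \mathbb{E}_{x \sim P_U} \left[ \log P(e \mid x, t) \right]$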

SLIDE 13

Adversarial Training

  • Discriminator
      • Treats x ∈ R as positive instances and x ∈ U as negative instances.
      • $\mathcal{L}_D = -\sum_{x \in R} \frac{1}{|R|} \log P(e \mid x, t) - \sum_{x \in U} P_U(x) \log \left( 1 - P(e \mid x, t) \right)$
  • Generator
      • Selects the most confusing x ∈ U to fool the discriminator.
      • Confusing score: $P_U(x) = \frac{\exp(f(x))}{\sum_{\hat{x} \in U} \exp(f(\hat{x}))}$
      • $\mathcal{L}_G = -\sum_{x \in U} P_U(x) \log P(e \mid x, t)$
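To make the two losses concrete, here is a minimal numeric sketch of the confusing score P_U(x), the discriminator loss L_D and the generator loss L_G. The probabilities and scores are made-up numbers, the function names are hypothetical, and the code assumes every P(e|x, t) lies strictly between 0 and 1; it does not reproduce the authors' implementation.

```python
import math
from typing import List


def confusing_scores(f_scores: List[float]) -> List[float]:
    """Softmax over generator scores f(x), giving P_U(x) for each x in U."""
    m = max(f_scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in f_scores]
    total = sum(exps)
    return [e / total for e in exps]


def discriminator_loss(p_reliable: List[float], p_unreliable: List[float],
                       p_u: List[float]) -> float:
    """L_D: instances in R are positives, instances in U are weighted negatives."""
    pos = -sum(math.log(p) for p in p_reliable) / len(p_reliable)
    neg = -sum(w * math.log(1.0 - p) for w, p in zip(p_u, p_unreliable))
    return pos + neg


def generator_loss(p_unreliable: List[float], p_u: List[float]) -> float:
    """L_G: low when weight sits on instances in U that the discriminator accepts."""
    return -sum(w * math.log(p) for w, p in zip(p_u, p_unreliable))


# Made-up values (assumptions): discriminator outputs P(e|x, t) and scores f(x).
p_R = [0.9, 0.8]
p_U = [0.7, 0.2, 0.4]
f_U = [2.0, -1.0, 0.0]

weights = confusing_scores(f_U)
loss_d = discriminator_loss(p_R, p_U, weights)
loss_g = generator_loss(p_U, weights)
print(weights, loss_d, loss_g)
```

The generator lowers L_G by shifting weight toward instances in U that the discriminator already scores highly, while the discriminator lowers L_D by pushing exactly those instances down; that tension is the adversarial game.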

SLIDE 14

Method

  • Pre-train a standard model on the noisy dataset, and set a threshold on the model's confidence scores.
      • Reliable set R: instances with higher confidence.
      • Unreliable set U: instances with lower confidence.
  • Initialize the encoders with the pre-trained model, then conduct adversarial training.
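The split into R and U can be sketched as below. The threshold value, the tie-breaking rule (confidence equal to the threshold counts as reliable) and the `split_by_confidence` name are assumptions for illustration; the slides do not state them.

```python
from typing import List, Tuple

Instance = Tuple[str, float]  # (sentence, confidence from the pre-trained model)


def split_by_confidence(instances: List[Instance],
                        threshold: float) -> Tuple[List[Instance], List[Instance]]:
    """Split instances into a reliable set R and an unreliable set U."""
    # Assumption: a confidence equal to the threshold counts as reliable.
    reliable = [inst for inst in instances if inst[1] >= threshold]
    unreliable = [inst for inst in instances if inst[1] < threshold]
    return reliable, unreliable


# Made-up confidence scores for three distantly labeled sentences.
scored = [("sentence a", 0.95), ("sentence b", 0.40), ("sentence c", 0.72)]
R, U = split_by_confidence(scored, threshold=0.7)
print(len(R), len(U))
```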

SLIDE 15

Experiments

Figure: Precision-recall curves (x-axis: recall, y-axis: precision).

(a) CNN models: DMCNN+ADV, DMCNN+NA, DMCNN+MIL, DMCNN.
(b) BERT models: DMBERT+ADV, DMBERT+NA, DMBERT+MIL, DMBERT.

SLIDE 16

Method

  • Pre-train a model on the small high-quality dataset.
  • Retrieve candidate instances from a large-scale raw dataset to construct a large candidate set.
  • Automatically label the candidate set with the pre-trained model.
      • Reliable set R: small-scale human-annotated data.
      • Unreliable set U: large-scale auto-labeled data.
  • Conduct adversarial training; the instances recommended by the generator are then trusted.
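One way to read the last step is that instances with a high enough generator score are moved from U into R. The sketch below shows that reading only; the cutoff rule, the score values and the `promote_recommended` name are assumptions, since the slides do not say how "recommended" is decided.

```python
from typing import Dict, List, Tuple


def promote_recommended(reliable: List[str], unreliable: List[str],
                        scores: Dict[str, float],
                        cutoff: float) -> Tuple[List[str], List[str]]:
    """Move instances whose generator score reaches the cutoff from U into R."""
    # Assumption: instances without a score are treated as score 0.0.
    promoted = [x for x in unreliable if scores.get(x, 0.0) >= cutoff]
    kept = [x for x in unreliable if scores.get(x, 0.0) < cutoff]
    return reliable + promoted, kept


# Made-up generator scores for three auto-labeled instances.
R = ["r1"]
U = ["u1", "u2", "u3"]
gen_scores = {"u1": 0.6, "u2": 0.1, "u3": 0.3}
new_R, new_U = promote_recommended(R, U, gen_scores, cutoff=0.3)
print(new_R, new_U)
```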

SLIDE 17

Trigger-based latent instance discovery strategy

  • Intuition: if a word serves as the trigger in a known instance, raw sentences mentioning it may also express an event.
  • Retrieve the sentences in the NYT corpus that contain triggers from ACE 2005.
  • Simple but effective.
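The retrieval step can be sketched as a lookup of known trigger words in raw sentences. The trigger dictionary and the sentences below are toy stand-ins for ACE 2005 and the NYT corpus, and exact lowercase word matching without lemmatization is an assumption, not a detail given in the slides.

```python
import re
from typing import Dict, List, Set, Tuple

Candidate = Tuple[str, str, Set[str]]  # (sentence, trigger, candidate event types)


def discover_candidates(sentences: List[str],
                        trigger_types: Dict[str, Set[str]]) -> List[Candidate]:
    """Return one candidate for every known trigger word found in a sentence."""
    candidates = []
    for sent in sentences:
        # Assumption: plain lowercase word matching, no lemmatization.
        for token in re.findall(r"[A-Za-z]+", sent.lower()):
            if token in trigger_types:
                candidates.append((sent, token, trigger_types[token]))
    return candidates


# Toy stand-ins for the annotated triggers and the raw corpus.
known_triggers = {"sued": {"Sue"}, "married": {"Marry"}}
raw = [
    "The officials who have been sued told the jurors about it.",
    "The weather was mild on Tuesday.",
]
found = discover_candidates(raw, known_triggers)
print(found)
```

Candidates found this way are only possible events; their labels still have to be checked, which is the role of the adversarial training step.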

SLIDE 18

Experiments

                     Trigger Identification + Classification
Method               P       R       F1
Li’s Joint           73.7    62.3    67.5
JRNN                 66.0    73.0    69.3
ANN-FN               77.6    65.2    70.7
DLRNN                77.2    64.9    70.5
GMLATT               78.9    66.9    72.4
DMCNN+Chen’s DS      75.7    66.0    70.5
Bi-LSTM+GAN          71.3    74.7    73.0
GCN-ED               77.9    68.8    73.1
DMCNN                75.6    63.6    69.1
DMCNN+Boot           77.7    65.1    70.8
DMBERT               77.6    71.8    74.6
DMBERT+Boot          77.9    72.5    75.1

Table 1: The overall performance (%) of different models on ACE-2005.
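The F1 column is the harmonic mean of the P and R columns. The short check below recomputes it for three rows copied from Table 1; the `f1_score` helper is a hypothetical name, and the comparison allows for rounding to one decimal place.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, in the same units as the inputs."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, R, reported F1) for three rows copied from Table 1.
rows = {
    "DMCNN": (75.6, 63.6, 69.1),
    "DMBERT": (77.6, 71.8, 74.6),
    "DMBERT+Boot": (77.9, 72.5, 75.1),
}
for name, (p, r, reported) in rows.items():
    print(name, round(f1_score(p, r), 1), reported)
```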

SLIDE 19

Manual Evaluation

Method                  Average Precision    Fleiss’s Kappa
Chen et al. (2017)      88.9                 -
Zeng et al. (2018)      91.0                 -
Our First Iteration     91.7                 61.3
Our Second Iteration    87.5                 52.0

Table 2: The human evaluation results (%) of auto-labeled data.
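For readers unfamiliar with the agreement measure in the last column, here is a minimal sketch of the standard Fleiss' kappa formula. The rating table is made up (3 annotators, 4 instances, two categories); the slides do not give the number of annotators or the raw judgments, so none of these numbers come from the evaluation itself.

```python
from typing import List


def fleiss_kappa(counts: List[List[int]]) -> float:
    """Fleiss' kappa; counts[i][j] is how many raters put item i in category j.

    Every item must have the same number of raters. The degenerate case
    where chance agreement equals 1 is not handled in this sketch.
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item must be rated by the same number of raters")
    n_categories = len(counts[0])
    # Observed agreement: mean of the per-item agreement P_i.
    p_items = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_items) / n_items
    # Chance agreement P_e from the overall category proportions.
    p_cats = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_e = sum(p * p for p in p_cats)
    return (p_bar - p_e) / (1 - p_e)


# Made-up table (assumption): columns are [judged correct, judged incorrect].
ratings = [[3, 0], [3, 0], [0, 3], [2, 1]]
kappa = fleiss_kappa(ratings)
print(kappa)
```

Table 2 reports kappa as a percentage, so a value such as 61.3 corresponds to 0.613 on the scale this function returns.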

SLIDE 20

Case Study

Event Type: Justice    Subtype: Sue

In ACE-2005:
    Dell sued for “bait and switch” and false promises.
Discovered:
    1. The lawyers for the four former state officials who have been sued told the jurors . . .
    2. But litigation held up the project until . . . .

Table 3: Examples with highlighted triggers.

SLIDE 21

Conclusion and Future work

  • An effective adversarial training method for weakly supervised event detection.
      • Denoise and enhance distantly supervised models.
      • Automatically collect more diverse and accurate training data.
  • Future work
      • Extract event arguments.
      • A large-scale dataset.

SLIDE 22

The End

Thanks for listening. Questions are welcome.
