
Introduction | Adversarial Training | Distant Supervision | Semi-supervision | Summary

Adversarial Training for Weakly Supervised Event Detection
Xiaozhi Wang (1), Xu Han (1), Zhiyuan Liu (1), Maosong Sun (1), Peng Li (2)
(1) Department of Computer Science and Technology, Tsinghua University
(2) Pattern Recognition Center, WeChat, Tencent Inc.
July 22, 2019

Introduction
• Event Detection: detect event triggers and identify event types.
  Example: "Mark Twain and Olivia Langdon married in 1870." Event type: Marry.
• First stage of Event Extraction.
• Important for downstream NLP applications.

Challenge: data sparsity
• 33 event types
• 599 documents
• 6,000+ instances
Figure 1: Statistics of ACE 2005 English Data. (Thanks to Chen et al., 2017.)

Related Work: Distant Supervision
(a) Automatically Labeled Data Generation for Large Scale Event Extraction (Chen et al., 2017)
(b) Open-Domain Event Detection using Distant Supervision (Araki et al., 2018)

Related Work: Semi-supervision
Figure 2: Bootstrapped Training of Event Extraction Classifiers (Huang et al., 2012)

Related Work: Weakness
• Sophisticated pre-defined rules: topic bias.
• Existing instances in knowledge bases: low coverage.

Our Model
• Adversarial training to denoise data without supervision.
• Trigger-based latent instance discovery strategy to automatically construct a large-scale candidate set with good coverage.

Overall architecture
Figure 3: The overall architecture. The event type is Contact.

Adversarial Training
• Discriminator
  • To detect events correctly.
  • Should resist noise.
• Generator
  • To confuse the discriminator.

Adversarial Training
• Discriminator
  • $x \in \mathcal{R}$ as positive instances and $x \in \mathcal{U}$ as negative instances.
  • $\phi_D = \max \Big( \mathbb{E}_{x \sim P_{\mathcal{R}}} \big[ \log P(e \mid x, t) \big] + \mathbb{E}_{x \sim P_{\mathcal{U}}} \big[ \log \big( 1 - P(e \mid x, t) \big) \big] \Big)$.
• Generator
  • Select the most confusing $x \in \mathcal{U}$ to fool the discriminator.
  • $\phi_G = \max \Big( \mathbb{E}_{x \sim P_{\mathcal{U}}} \big[ \log P(e \mid x, t) \big] \Big)$.

Adversarial Training
• Discriminator
  • $x \in \mathcal{R}$ as positive instances and $x \in \mathcal{U}$ as negative instances.
  • $\mathcal{L}_D = - \sum_{x \in \mathcal{R}} \frac{1}{|\mathcal{R}|} \log P(e \mid x, t) - \sum_{x \in \mathcal{U}} P_{\mathcal{U}}(x) \log \big( 1 - P(e \mid x, t) \big)$.
• Generator
  • Select the most confusing $x \in \mathcal{U}$ to fool the discriminator.
  • Confusing score: $P_{\mathcal{U}}(x) = \frac{\exp(f(x))}{\sum_{\hat{x} \in \mathcal{U}} \exp(f(\hat{x}))}$.
  • $\mathcal{L}_G = - \sum_{x \in \mathcal{U}} P_{\mathcal{U}}(x) \log P(e \mid x, t)$.
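The two losses and the confusing score on this slide can be illustrated with a small numeric sketch. It is not the authors' implementation: the function names and the toy probabilities are assumptions made up for illustration, and the discriminator outputs P(e | x, t) and generator scores f(x) are passed in as plain numbers rather than produced by neural encoders.

```python
import math


def confusing_scores(f_scores):
    """Softmax over generator scores f(x) for x in U, giving P_U(x)."""
    m = max(f_scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in f_scores]
    total = sum(exps)
    return [e / total for e in exps]


def discriminator_loss(p_reliable, p_unreliable, p_u):
    """L_D: mean negative log-likelihood on R, plus a P_U-weighted
    penalty for assigning high probability to instances in U."""
    reliable_term = -sum(math.log(p) for p in p_reliable) / len(p_reliable)
    unreliable_term = -sum(
        w * math.log(1.0 - p) for w, p in zip(p_u, p_unreliable)
    )
    return reliable_term + unreliable_term


def generator_loss(p_unreliable, p_u):
    """L_G: P_U-weighted negative log-likelihood on U."""
    return -sum(w * math.log(p) for w, p in zip(p_u, p_unreliable))


# Toy values (hypothetical): discriminator probabilities P(e | x, t)
# for two reliable and three unreliable instances, and generator scores f(x).
p_reliable = [0.9, 0.8]
p_unreliable = [0.7, 0.2, 0.4]
f_scores = [2.0, 0.0, 1.0]

p_u = confusing_scores(f_scores)
loss_d = discriminator_loss(p_reliable, p_unreliable, p_u)
loss_g = generator_loss(p_unreliable, p_u)
```

In this sketch the generator lowers its loss by shifting P_U toward unreliable instances that the discriminator already scores highly, which is one way to read "select the most confusing x".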

Method
• Pre-train a standard model on the noisy dataset, and set a threshold on the model's confidence scores.
  • Reliable set R: instances with confidence above the threshold.
  • Unreliable set U: instances with confidence below the threshold.
• Initialize the encoders with the pre-trained model, then conduct adversarial training.
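The threshold-based split into R and U can be sketched as follows. The function name, the toy instances, and the choice to place instances exactly at the threshold into R are assumptions for illustration; the slide does not specify how ties are handled or how the threshold is chosen.

```python
def split_by_confidence(instances, confidences, threshold):
    """Partition instances into a reliable set R and an unreliable set U
    using the pre-trained model's confidence scores."""
    reliable, unreliable = [], []
    for instance, confidence in zip(instances, confidences):
        # Assumption: confidence equal to the threshold counts as reliable.
        if confidence >= threshold:
            reliable.append(instance)
        else:
            unreliable.append(instance)
    return reliable, unreliable


# Hypothetical instances with made-up confidence scores.
R, U = split_by_confidence(["a", "b", "c"], [0.9, 0.3, 0.5], threshold=0.5)
```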

Experiments
Figure: (a) Precision-Recall curves for the CNN models (DMCNN+ADV, DMCNN+NA, DMCNN+MIL, DMCNN). (b) Precision-Recall curves for the BERT models (DMBERT+ADV, DMBERT+NA, DMBERT+MIL, DMBERT). Axes: Recall (x), Precision (y).

Method
• Pre-train a model on the small high-quality dataset.
• Retrieve candidate instances from a large-scale raw dataset to construct a large candidate set.
• Automatically label the candidate set with the pre-trained model.
  • Reliable set R: small-scale human-annotated data.
  • Unreliable set U: large-scale auto-labeled data.
• Conduct adversarial training; the instances recommended by the generator are then trusted.
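The last step, trusting the instances recommended by the generator, might look roughly like the sketch below. The slide does not say how many instances are trusted or by what rule, so the top-k selection by confusing score, the function name, and the toy data are all assumptions.

```python
def promote_recommended(reliable, unreliable, confusing, top_k):
    """Move the top_k unreliable instances with the highest confusing
    score P_U(x) into the reliable set; return the two updated sets."""
    ranked = sorted(
        range(len(unreliable)), key=lambda i: confusing[i], reverse=True
    )
    chosen = ranked[:top_k]
    chosen_set = set(chosen)
    new_reliable = list(reliable) + [unreliable[i] for i in chosen]
    new_unreliable = [
        x for i, x in enumerate(unreliable) if i not in chosen_set
    ]
    return new_reliable, new_unreliable


# Hypothetical instances and confusing scores.
new_R, new_U = promote_recommended(
    ["r1"], ["u1", "u2", "u3"], [0.1, 0.6, 0.3], top_k=2
)
```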

Trigger-based latent instance discovery strategy
• Intuition: if a word serves as the trigger in a known instance, the raw sentences mentioning it may also express an event.
• Retrieve the sentences in the NYT corpus that contain triggers from ACE 2005.
• Simple but effective.
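A minimal sketch of the retrieval step, assuming exact lowercase word matching against a set of known triggers. The function name, the tokenization, and the example sentences are assumptions; the slide does not say whether lemmatization or any other normalization is applied.

```python
import re


def retrieve_candidates(sentences, triggers):
    """Return (sentence, matched_triggers) pairs for every raw sentence
    that mentions at least one known trigger word."""
    trigger_set = {t.lower() for t in triggers}
    candidates = []
    for sentence in sentences:
        # Assumption: simple alphabetic tokenization, exact surface match.
        tokens = re.findall(r"[A-Za-z]+", sentence.lower())
        hits = [tok for tok in tokens if tok in trigger_set]
        if hits:
            candidates.append((sentence, hits))
    return candidates


# Hypothetical raw sentences and trigger list.
raw_sentences = [
    "The officials who have been sued told the jurors.",
    "The weather was mild.",
]
known_triggers = {"sued", "married"}
candidates = retrieve_candidates(raw_sentences, known_triggers)
```

Exact matching would miss a paraphrase such as "litigation" unless it already appears as a trigger, so the coverage of the candidate set depends on the trigger list.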

Experiments

Trigger Identification + Classification
Method              P      R      F1
Li's Joint          73.7   62.3   67.5
JRNN                66.0   73.0   69.3
ANN-FN              77.6   65.2   70.7
DLRNN               77.2   64.9   70.5
GMLATT              78.9   66.9   72.4
DMCNN+Chen's DS     75.7   66.0   70.5
Bi-LSTM+GAN         71.3   74.7   73.0
GCN-ED              77.9   68.8   73.1
DMCNN               75.6   63.6   69.1
DMCNN+Boot          77.7   65.1   70.8
DMBERT              77.6   71.8   74.6
DMBERT+Boot         72.5   77.9   75.1

Table 1: The overall performance (%) of different models on ACE-2005.
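Assuming F1 in Table 1 follows the standard definition (the harmonic mean of precision and recall), the rows can be sanity-checked with a few lines. The function name is an assumption; the input values are taken from the DMBERT row.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# DMBERT row: P = 77.6, R = 71.8; rounds to the reported F1 of 74.6.
dmbert_f1 = f1_score(77.6, 71.8)
```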

Manual Evaluation

Method                  Average Precision   Fleiss's Kappa
Chen et al., 2017       88.9                -
Zeng et al., 2018       91.0                -
Our First Iteration     91.7                61.3
Our Second Iteration    87.5                52.0

Table 2: The human evaluation results (%) of auto-labeled data.

Case Study

Event Type: Justice; Subtype: Sue
In ACE-2005:  Dell sued for "bait and switch" and false promises.
Discovered:   1. The lawyers for the four former state officials who have been sued told the jurors . . .
              2. But litigation held up the project until . . . .

Table 3: The examples with highlighted triggers.
