SLIDE 1

Robust Distant Supervision Relation Extraction via Deep Reinforcement Learning

Pengda Qin, Weiran Xu and William Wang

BUPT

SLIDE 2

Outline

  • Motivation
  • Algorithm
  • Experiments
  • Conclusion

SLIDE 3

Outline

  • Motivation
  • Algorithm
  • Experiments
  • Conclusion

SLIDE 4

Relation Extraction

Plain Text Corpus (Unstructured Info) → Classifier → Entity-Relation Triple (Structured Info)

  • Relation Type with Labeled Dataset
  • Relation Type without Labeled Dataset

SLIDE 5

Distant Supervision

“If two entities participate in a relation, any sentence that contains those two entities might express that relation.” (Mintz et al., 2009)

SLIDE 6

Distant Supervision

Data (x): <Belgium, Nijlen>
Label (y): /location/contains

Target Corpus (Unlabeled)

Sentence Bag (Relation Label: /location/contains):

  • 1. Nijlen is a municipality located in the Belgian province of Antwerp.
  • 2. ……
  • 3. ……
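The labeling heuristic can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' pipeline: the `kb` and `mentions` dictionaries, the `build_bags` helper, and the second corpus sentence are hypothetical, and naive substring matching stands in for real entity linking.

```python
from collections import defaultdict

# Hypothetical knowledge base: (head, tail) -> relation label.
kb = {("Belgium", "Nijlen"): "/location/contains"}

# Hypothetical surface forms per entity (a stand-in for entity linking).
mentions = {"Belgium": ("Belgium", "Belgian"), "Nijlen": ("Nijlen",)}

def mentions_entity(sentence, entity):
    """True if any known surface form of the entity occurs in the sentence."""
    return any(form in sentence for form in mentions.get(entity, (entity,)))

def build_bags(kb, corpus):
    """Collect every sentence mentioning both entities of a KB pair into one bag."""
    bags = defaultdict(list)
    for sentence in corpus:
        for (head, tail), relation in kb.items():
            if mentions_entity(sentence, head) and mentions_entity(sentence, tail):
                bags[(head, tail, relation)].append(sentence)
    return dict(bags)

corpus = [
    "Nijlen is a municipality located in the Belgian province of Antwerp.",
    "Nijlen lies in Belgium, east of Antwerp.",  # made-up sentence
    "Antwerp is a port city.",
]
bags = build_bags(kb, corpus)
```

Every sentence in a bag inherits the relation label, whether or not it actually expresses the relation; that is where the noise comes from.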

SLIDE 7

Wrong Labeling

  • Within-Sentence-Bag Level
      • Hoffmann et al., ACL 2011.
      • Surdeanu et al., EMNLP 2012.
      • Zeng et al., EMNLP 2015.
      • Lin et al., ACL 2016.
  • Entity-Pair Level
      • None

SLIDE 8

Wrong Labeling

  • Entity-Pair Level
      • Place_of_Death (William O’Dwyer, New York city)

i. Some New York city mayors – William O’Dwyer, Vincent R. Impellitteri and Abraham Beame – were born abroad.
ii. Plenty of local officials have, too, including two New York city mayors, James J. Walker, in 1932, and William O’Dwyer, in 1950.

SLIDE 9

Wrong Labeling

  • Most entity pairs have only a few sentences
      • [Chart: 1 sentence 55%, 2 sentences 32%, other 4%]
  • Many entity pairs have repetitive sentences
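A distribution like the one on this slide can be computed by counting sentences per entity pair. The sketch below uses made-up toy bags, so its output does not reproduce the slide's percentages; `bag_size_distribution` is a hypothetical helper name.

```python
from collections import Counter

def bag_size_distribution(bags):
    """Return {bag size: fraction of entity pairs with that many sentences}."""
    sizes = Counter(len(sentences) for sentences in bags.values())
    total = sum(sizes.values())
    return {size: count / total for size, count in sorted(sizes.items())}

# Toy bags (made-up data): keys are entity pairs, values are sentence lists.
toy_bags = {
    ("a", "b"): ["s1"],
    ("c", "d"): ["s2"],
    ("e", "f"): ["s3", "s4"],
    ("g", "h"): ["s5", "s6", "s7"],
}
dist = bag_size_distribution(toy_bags)
```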

SLIDE 10

Outline

  • Motivation
  • Algorithm
  • Experiments
  • Conclusion

SLIDE 11

Overview

[Figure: DS Dataset (negative set, positive set, false positives marked) → False Positive Indicator → Cleaned Dataset (negative set, positive set, false positives marked)]

SLIDE 12

Requirements

False-Positive Indicator

  • Sentence-Level Indicator
  • Without Supervised Information
  • General Purpose and Offline Process

Learn a Policy to Denoise the Training Data

SLIDE 13

Overview

[Figure: the False Positive Indicator is a Policy-Based Agent. Its Action turns the DS Dataset (negative set, positive set, false positives marked) into the Cleaned Dataset; a Classifier is trained (Train) on the result and supplies the Reward.]

SLIDE 14

Deep Reinforcement Learning

  • State
      • Sentence vector
      • The average vector of previously removed sentences
  • Action
      • Remove & retain
  • Reward
      • ???
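The state and action definitions can be made concrete with a small sketch. The state concatenates the current sentence vector with the average of the removed sentence vectors, as listed on this slide. The logistic policy, the zero-vector fallback when nothing has been removed yet, and all function names are assumptions for illustration; the slide does not specify the agent's network.

```python
import math
import random

def mean_vector(vectors, dim):
    """Average of the given vectors; a zero vector if none were removed yet."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[k] for v in vectors) / len(vectors) for k in range(dim)]

def make_state(sentence_vec, removed_vecs):
    """State = current sentence vector ++ average of removed sentence vectors."""
    return sentence_vec + mean_vector(removed_vecs, len(sentence_vec))

def policy_prob_remove(state, weights, bias=0.0):
    """Hypothetical logistic policy: probability of choosing 'remove'."""
    z = sum(w * s for w, s in zip(weights, state)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def sample_action(state, weights, rng):
    """Sample 'remove' or 'retain' from the policy."""
    p_remove = policy_prob_remove(state, weights)
    return "remove" if rng.random() < p_remove else "retain"

rng = random.Random(0)
removed = [[1.0, 0.0], [0.0, 1.0]]
state = make_state([0.2, 0.4], removed)  # -> [0.2, 0.4, 0.5, 0.5]
action = sample_action(state, [0.0] * 4, rng)
```

With all-zero weights the policy is indifferent (probability 0.5 for each action); training would move the weights so that likely false positives get a high removal probability.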

SLIDE 15

Deep Reinforcement Learning

  • One agent per relation type
  • Sentence-level
      • Positive: Distantly-supervised positive sentences
      • Negative: Sampled from other relations
  • Split into training set and validation set

SLIDE 16

Deep Reinforcement Learning

[Figure: training loop over Epoch $i-1$ and Epoch $i$. In each epoch the RL Agent splits the noisy dataset $P^{ori}$ into a cleaned dataset and a removed part; the removed part is added to $N^{ori}$, and a Relation Classifier is trained to obtain $F_1^{i-1}$ and $F_1^{i}$ respectively. The removed parts are weighted by $\times(+R_i)$ and $\times(-R_i)$.]

$$R_i = \alpha\,(F_1^{i} - F_1^{i-1})$$
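The epoch-to-epoch reward can be sketched as follows. The default value of `alpha` is an arbitrary assumption (the slide gives no value), and `signed_rewards` is a hypothetical helper that simply returns the two signed values from the figure without deciding which removed part receives which sign.

```python
def reward(f1_current, f1_previous, alpha=100.0):
    """Scaled change in F1 between consecutive epochs.

    alpha (assumed value) rescales the small F1 difference.
    """
    return alpha * (f1_current - f1_previous)

def signed_rewards(f1_current, f1_previous, alpha=100.0):
    """Return the (+R_i, -R_i) pair that appears in the figure."""
    r = reward(f1_current, f1_previous, alpha)
    return r, -r

# F1 improved from 0.65 to 0.66, so the reward is positive.
plus_r, minus_r = signed_rewards(0.66, 0.65)
```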

SLIDE 17

Reward

[Figure: Positive Set (false positives marked) and Negative Set]

  • Accurate
  • Steady
  • Fast
  • Obvious

SLIDE 18

Reward

[Figure: Epoch $i$. Positive Set and Negative Set, with False Positives marked; a Relation Classifier is trained (Train) on the positive and negative data and $F_1$ is computed (Calculate).]
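The F1 score that the reward is built on is the standard harmonic mean of precision and recall. A minimal binary version, with label 1 as the positive class and made-up predictions:

```python
def f1_score(y_true, y_pred):
    """Binary F1 with label 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Made-up labels: 2 true positives, 1 false positive, 1 false negative.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
score = f1_score(y_true, y_pred)
```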

SLIDE 19

Outline

  • Motivation
  • Algorithm
  • Experiments
  • Conclusion

SLIDE 20

Evaluation on a Synthetic Noise Dataset

  • Dataset: SemEval-2010 Task 8
  • True Positive: Cause-Effect
  • False Positive: Other relation types
  • True Positive + False Positive: 1331 samples
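One way such a synthetic set could be assembled is sketched below. The placeholder sentences, the `make_noisy_positive_set` helper, and the 1131/200 split are assumptions chosen so that the total matches the 1331 samples on the slide; the actual sampling procedure is not described here.

```python
import random

def make_noisy_positive_set(true_positives, other_relation_samples, n_false, seed=0):
    """Mix n_false samples from other relations into the positive set.

    Returns shuffled (sentence, is_false_positive) pairs. The flag is kept
    only for evaluation and would be hidden from the agent.
    """
    rng = random.Random(seed)
    noise = rng.sample(other_relation_samples, n_false)
    mixed = [(s, False) for s in true_positives] + [(s, True) for s in noise]
    rng.shuffle(mixed)
    return mixed

# Placeholder sentences standing in for SemEval-2010 Task 8 instances.
cause_effect = [f"cause-effect sentence {k}" for k in range(1131)]
others = [f"other-relation sentence {k}" for k in range(500)]
noisy = make_noisy_positive_set(cause_effect, others, n_false=200)
```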

SLIDE 21

Evaluation on a Synthetic Noise Dataset

200 FPs in 1331 Samples

[Figure: F1 Score (0.64 to 0.685) vs. Epoch (10 to 100). False Positive / Removed Part: (198/388), (197/339), (195/308), (180/279), (179/260).]
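The bracketed pairs can be turned into two ratios, reading each pair as (false positives in the removed part / size of the removed part). That reading is an assumption based on the legend, and the slide text does not say which epoch each pair belongs to, so none is assigned here.

```python
# (false positives removed, total sentences removed), as listed on the slide.
removed_counts = [(198, 388), (197, 339), (195, 308), (180, 279), (179, 260)]
TOTAL_FALSE_POSITIVES = 200  # "200 FPs in 1331 Samples"

stats = []
for fp_removed, total_removed in removed_counts:
    precision = fp_removed / total_removed       # share of removed sentences that were FPs
    recall = fp_removed / TOTAL_FALSE_POSITIVES  # share of all FPs that were removed
    stats.append((precision, recall))
    print(f"{fp_removed}/{total_removed}: precision={precision:.3f}, recall={recall:.3f}")
```

Across the listed pairs, fewer true positives are removed by mistake (precision rises from about 0.51 to about 0.69) while most of the 200 injected false positives are still caught.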

SLIDE 22

Evaluation on a Synthetic Noise Dataset

0 FPs in 1331 samples

[Figure: F1 Score (0.68 to 0.75) vs. Epoch (10 to 100). False Positive / Removed Part: (0/258), (0/150), (0/121), (0/59), (0/32).]

SLIDE 23

Distant Supervision

  • Dataset: Riedel et al., 2010
      • http://iesl.cs.umass.edu/riedel/ecml/
  • CNN+ONE, PCNN+ONE
      • Distant supervision for relation extraction via piecewise convolutional neural networks. (Zeng et al., 2015)
  • CNN+ATT, PCNN+ATT
      • Neural relation extraction with selective attention over instances. (Lin et al., 2016)

SLIDE 24

Distant Supervision

[Figure: CNN-based results. Axis ticks: 0.4 to 1 and 0.05 to 0.4. Curves: CNN+ONE, CNN+ONE_RL, CNN+ATT, CNN+ATT_RL.]

SLIDE 25

Distant Supervision

[Figure: PCNN-based results. Axis ticks: 0.4 to 1 and 0.05 to 0.4. Curves: PCNN+ONE, PCNN+ONE_RL, PCNN+ATT, PCNN+ATT_RL.]

SLIDE 26

Outline

  • Motivation
  • Algorithm
  • Experiments
  • Conclusion

SLIDE 27

Conclusion

  • We propose a deep reinforcement learning method for robust distant supervision relation extraction.
  • Our method is model-agnostic.
  • Our method boosts the performance of recently proposed neural relation extractors.

SLIDE 28

Thank you!

Q&A