A Fine-grained and Noise-aware Method for Neural Relation Extraction



SLIDE 1

A Fine-grained and Noise-aware Method for Neural Relation Extraction

Advisor: Jia-Ling Koh
Source: CIKM 2019
Speaker: Shao-Wei Huang
Date: 2020/05/04


SLIDE 2

OUTLINE

⚫ Introduction ⚫ Method ⚫ Experiment ⚫ Conclusion

SLIDE 3

INTRODUCTION

➢ Relation Extraction (RE): extracting semantic relations between two entities from a text corpus.

(Ex): Donald Trump is the 45th President of the United States. → President_of

SLIDE 4

INTRODUCTION

➢ Supervised RE:

⚫ Heavily relies on human-annotated data to achieve outstanding performance.

⚫ Annotated datasets are limited in size and domain-specific, preventing large-scale supervised relation extraction.

SLIDE 5

INTRODUCTION

➢ Distant supervision RE:

⚫ Automatically generates large-scale training data by aligning a knowledge base with plain texts.

Relations in KB: President_of (Donald Trump, United States)
Sentences in plain text:
S1: Donald Trump is the 45th President of the United States.
S2: Donald Trump was born in the United States.
S3: Donald Trump believes the United States has incredible potential.

→ President_of (a training bag)
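The bag construction above can be sketched as follows. This is an illustrative simplification (exact string matching; real pipelines use entity linking), and the function name and data layout are hypothetical, not the paper's code:

```python
# Distant-supervision sketch: every sentence mentioning a KB entity pair is
# collected into one bag and weakly labeled with that pair's KB relations.
from collections import defaultdict

def build_bags(kb_triples, sentences):
    """kb_triples: iterable of (head, relation, tail); sentences: raw strings.
    Returns {(head, tail): {"relations": [...], "sentences": [...]}}."""
    pair_relations = defaultdict(list)
    for head, rel, tail in kb_triples:
        pair_relations[(head, tail)].append(rel)
    bags = {}
    for (head, tail), rels in pair_relations.items():
        # Naive mention detection: both entity strings occur in the sentence.
        mentions = [s for s in sentences if head in s and tail in s]
        if mentions:
            bags[(head, tail)] = {"relations": rels, "sentences": mentions}
    return bags

kb = {("Donald Trump", "President_of", "United States")}
corpus = [
    "Donald Trump is the 45th President of the United States.",
    "Donald Trump was born in the United States.",
    "Donald Trump believes the United States has incredible potential.",
]
bag = build_bags(kb, corpus)[("Donald Trump", "United States")]
```

All three sentences land in the same President_of bag even though only S1 expresses the relation, which is exactly the noise the paper targets.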

SLIDE 6

INTRODUCTION

➢ Challenges in distant supervision (1/2):

⚫ Multi-instance multi-label (MIML) problem.

Multi-label problem:
Relations in KB: Place_of_birth (Donald Trump, United States); President_of (Donald Trump, United States)
Sentences in plain text: S1: Donald Trump is the 45th President of the United States. (President_of) S2: Donald Trump was born in the United States. (Place_of_birth)

Multi-instance problem:
Relations in KB: Place_of_birth (Donald Trump, United States); President_of (Donald Trump, United States)
Sentences in plain text: S3: Donald Trump believes the United States has incredible potential. (-)

SLIDE 7

INTRODUCTION

➢ Challenges in distant supervision (2/2):

⚫ The assigned relation labels are annotated at the bag level (a set of sentences) instead of the sentence level. → Reinforcement Learning Model

SLIDE 8

INTRODUCTION

➢ Goal: given a training bag for (Donald Trump, United States) with relations Relation 1, Relation 2, · · · , Relation l, find the most informative sentence for each relation.

SLIDE 9

OUTLINE

Introduction Method Experiment Conclusion

SLIDE 10

REINFORCEMENT LEARNING

(Figure: agent-environment loop with state, action, and reward)

https://www.youtube.com/watch?v=vmkRMvhCW5c&t=1557s

SLIDE 11

REINFORCEMENT LEARNING

(Figure: reinforcement-learning example, continued)

https://www.youtube.com/watch?v=vmkRMvhCW5c&t=1557s

SLIDE 12

FRAMEWORK

(Figure: overall framework of the proposed model)

SLIDE 13

METHOD

Notation and Problem definition

➢ Let B = {⟨e1, e2⟩, (r1, · · · , rl), {S1, · · · , Sn}} be a training bag, where ⟨e1, e2⟩ is an entity pair, (r1, · · · , rl) are the relations that link the entity pair in the KB, and {S1, · · · , Sn} are the sentences from the corpus that mention this entity pair.

➢ Problem definition: figure out the most expressive sentence for each relation in B (in order to improve the extractor's performance).

SLIDE 14

METHOD

State

➢ Embedding of the target entity pair: [e1 ; e2]
➢ Encoding of the previously chosen sentence: S_pre, encoded by a CNN.

SLIDE 15

METHOD

State

(Figure: CNN encoding of "Donald Trump is the 45th President of the United States": word embeddings → convolution with filters → max pooling → S_pre)
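The convolution-and-max-pooling pipeline on this slide can be sketched in plain Python. Dimensions, embeddings, and filter weights below are toy values for illustration only; the real encoder learns its filters and operates on much larger vectors:

```python
# CNN sentence encoder sketch: word embeddings -> 1-D convolution over
# windows of adjacent words -> max pooling over time -> sentence vector.
import math

def conv_max_pool(embeddings, filters, window):
    """embeddings: list of word vectors; filters: flat weight lists of
    length window * emb_dim. Returns one max-pooled value per filter."""
    pooled = []
    for f in filters:
        scores = []
        for i in range(len(embeddings) - window + 1):
            # Flatten one window of `window` consecutive word vectors.
            patch = [x for w in embeddings[i:i + window] for x in w]
            scores.append(math.tanh(sum(p * q for p, q in zip(patch, f))))
        pooled.append(max(scores))  # max pooling over all window positions
    return pooled

# Toy 2-dimensional embeddings for a 4-word sentence and two filters.
emb = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
flt = [[0.1, 0.2, 0.3, 0.4, 0.5, 0.6], [0.6, 0.5, 0.4, 0.3, 0.2, 0.1]]
s_pre = conv_max_pool(emb, flt, window=3)  # sentence vector, one entry per filter
```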

SLIDE 16

METHOD

State

➢ Confidence score for S_pre: p(r | S_pre; θ), computed by the relation extractor over all relations (r = 53).
➢ Encoding of the current sentence: S_cur
➢ Confidence score for S_cur: p(r | S_cur; θ)
➢ s_t = [e1 ; e2 ; S_pre ; p(r | S_pre; θ) ; S_cur ; p(r | S_cur; θ)]
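Assembling the state is just a concatenation of those six pieces. All vectors below are toy values with made-up dimensions (the real state uses full-size embeddings and a 53-way confidence vector):

```python
# State-assembly sketch: concatenate entity embeddings, the previously
# chosen sentence's encoding and confidence, and the current sentence's.
def make_state(e1, e2, s_pre, p_pre, s_cur, p_cur):
    # s_t = [e1 ; e2 ; S_pre ; p(r|S_pre) ; S_cur ; p(r|S_cur)]
    return e1 + e2 + s_pre + p_pre + s_cur + p_cur

e1, e2 = [0.1, 0.2], [0.3, 0.4]        # entity-pair embeddings
s_pre, s_cur = [0.9, 0.8], [0.5, 0.6]  # CNN sentence encodings
p_pre, p_cur = [0.7, 0.3], [0.2, 0.8]  # toy confidences over 2 relations
state = make_state(e1, e2, s_pre, p_pre, s_cur, p_cur)
```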

SLIDE 17

METHOD

Action

➢ First part of the action: U decides whether to adopt the current sentence to replace the previously chosen one.
⚫ U = 1 → yes.
⚫ U = 0 → no.

(S_pre vs. S_cur: judge U = 1 or U = 0)

SLIDE 18

METHOD

Action

➢ Reasonable assumptions for each bag:
⚫ Expressed-at-least-once
⚫ Expressed-at-most-one
➢ Compete mechanism:
⚫ When multiple relations simultaneously intend to replace their S_pre with the S_cur,
⚫ only the relation with the highest Q(s_t, U = 1) performs the update.
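A minimal sketch of the compete mechanism, with illustrative relation names and Q-values (not from the paper): among the relations whose greedy choice is U = 1, only the one with the highest Q(s_t, U = 1) wins the current sentence.

```python
# Compete-mechanism sketch: resolve ties when several relations all want
# to adopt S_cur by letting only the highest-Q relation update.
def compete(q_values, wants_update):
    """q_values: {relation: Q(s_t, U=1)}; wants_update: relations whose
    greedy action is U = 1. Returns the single winning relation (or None)."""
    candidates = {r: q for r, q in q_values.items() if r in wants_update}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)

q = {"President_of": 0.9, "Place_of_birth": 0.4, "Employee_of": 0.7}
winner = compete(q, wants_update={"President_of", "Employee_of"})
```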

SLIDE 19

METHOD

Action

➢ Second part of the action: P decides whether this relation stops the search.
⚫ P = 1 → yes (believes it has picked out the expressive sentence for this relation).
⚫ P = 0 → no.

SLIDE 20

METHOD

Action

➢ Takes the action A* that equals argmax_A Q(s_t, A).
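A toy sketch of that argmax, assuming the action space is the four (U, P) combinations and standing in a tiny linear scorer for the real Q-network; the state and weights are made-up illustrative values:

```python
# Greedy action selection sketch: enumerate the (U, P) action pairs and
# take A* = argmax_A Q(s_t, A) under a toy linear Q-function.
import itertools

def select_action(state, weights):
    """weights[(U, P)] is a per-action weight vector; Q(s, a) is the dot
    product of the state with that action's weights."""
    actions = list(itertools.product([0, 1], repeat=2))  # all (U, P) pairs
    q = {a: sum(s * w for s, w in zip(state, weights[a])) for a in actions}
    return max(q, key=q.get), q

state = [0.5, -0.2, 0.8]
weights = {(0, 0): [1, 0, 0], (0, 1): [0, 1, 0],
           (1, 0): [0, 0, 1], (1, 1): [1, 1, 1]}
best, q = select_action(state, weights)
```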

SLIDE 21

METHOD

Reward

➢ Reward:
⚫ Executing an action → getting a reward (positive or negative).
⚫ The objective of reinforcement learning: learn the Q function by maximizing the total reward.

SLIDE 22

METHOD

Initial states s_1^j for all episodes

➢ Initial states s_1^j for all relations of a bag are the same:
⚫ B = {⟨e1, e2⟩, (r1, · · · , rl), {S1, · · · , Sn}}, j ∈ {1, 2, · · · , l}
⚫ s_1^j = [e1 ; e2 ; 0 ; 0 ; S1 ; p(r | S1; θ)], for all j.
⚫ The output values of the Q-function are then the same too.

SLIDE 23

METHOD

Initial states s_1^j for all episodes

➢ Heuristic initialization (a bag with sentences S1, S2, S3 and relations r1, r2, r3):

argmax_r p(r | S1; θ) = r3 → s_1^3 = [e1 ; e2 ; S1 ; p(r | S1; θ) ; S_{3+1} ; p(r | S_{3+1}; θ)]

argmax_r p(r | S2; θ) = r1 → s_1^1 = [e1 ; e2 ; S2 ; p(r | S2; θ) ; S_{3+1} ; p(r | S_{3+1}; θ)]

argmax_r p(r | S3; θ) = r2 → s_1^2 = [e1 ; e2 ; S3 ; p(r | S3; θ) ; S_{3+1} ; p(r | S_{3+1}; θ)]
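The heuristic assignment above can be sketched as follows: each sentence S_j initializes the chosen sentence of the relation the extractor is most confident about for S_j. The probability tables are made up for illustration:

```python
# Heuristic-initialization sketch: map each sentence to the relation with
# the highest extractor confidence, so each relation starts the episode
# with a plausible previously-chosen sentence instead of an empty one.
def heuristic_init(probs):
    """probs[j][r] = p(r | S_j; theta). Returns {relation: sentence_index}
    assigning each argmax relation the sentence that triggered it."""
    chosen = {}
    for j, dist in enumerate(probs):
        best_rel = max(dist, key=dist.get)
        chosen[best_rel] = j  # S_j becomes S_pre for best_rel
    return chosen

probs = [
    {"r1": 0.1, "r2": 0.2, "r3": 0.7},  # S1 -> r3
    {"r1": 0.6, "r2": 0.3, "r3": 0.1},  # S2 -> r1
    {"r1": 0.2, "r2": 0.5, "r3": 0.3},  # S3 -> r2
]
init = heuristic_init(probs)
```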

SLIDE 24

METHOD

➢ Reinforcement learning algorithm for MIML:
⚫ See Algorithm 1 in the paper.

SLIDE 25

METHOD

Joint training

➢ Optimization of the extractor:
⚫ Maximize the log-likelihood Σ_j log p(y_j | x_j; θ) over the chosen training set TS, where x_j is a sentence in TS with relation label y_j.

➢ Optimization of the reinforcement learning:
⚫ The target value (reward plus discounted maximum future Q) is regarded as the accurate value of Q(s_t^j, A_j); minimize the squared error between this target and the predicted Q value.

https://morvanzhou.github.io/tutorials/machine-learning/ML-intro/4-03-q-learning/
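A tabular sketch of that Q-learning step, with made-up states, actions, and numbers standing in for the paper's neural Q-network: the target y = reward + γ · max_a' Q(s', a') is treated as the accurate value, and the squared error to the current estimate is reduced.

```python
# Q-learning update sketch: form the bootstrapped target, measure the
# squared error against the current Q estimate, and step toward the target.
def q_update(q_table, s, a, reward, s_next, gamma=0.9, lr=0.5):
    target = reward + gamma * max(q_table[s_next].values())
    error = target - q_table[s][a]
    q_table[s][a] += lr * error   # one gradient-like step toward the target
    return error ** 2             # the squared loss being minimized

q_table = {"s0": {"keep": 0.0, "replace": 0.0},
           "s1": {"keep": 1.0, "replace": 0.2}}
loss = q_update(q_table, "s0", "replace", reward=1.0, s_next="s1")
```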

SLIDE 26

METHOD

➢ Joint training for the extractor and reinforcement learning:
⚫ See Algorithm 2 in the paper.

SLIDE 27

OUTLINE

Introduction Method Experiment Conclusion

SLIDE 28

EXPERIMENT

Dataset

➢ NYT+Freebase: aligning entities and relations in Freebase with the New York Times corpus.
⚫ NYT in 2005-2006 → training data
⚫ NYT in 2007 → testing data

SLIDE 29

EXPERIMENT

Performance with NN methods

https://blog.csdn.net/u013249853/article/details/96132766

SLIDE 30

EXPERIMENT

Performance with NN methods

SLIDE 31

EXPERIMENT

Reasonability of model design

https://blog.csdn.net/u013249853/article/details/96132766

SLIDE 32

EXPERIMENT

Case study

SLIDE 33

EXPERIMENT

Case study

SLIDE 34

OUTLINE

Introduction Method Experiment Conclusion

SLIDE 35

CONCLUSION

➢ Crafts reinforcement learning to solve the MIML problem and to generate sentence-level annotation signals in distantly supervised relation extraction.

➢ The chosen expressive sentences then serve as training instances to feed the extractor.

➢ Extensive experiments demonstrate that the model can effectively alleviate the MIML problem and achieves new state-of-the-art performance.