

SLIDE 1

Neural Networks and Coreference Resolution for Slot Filling

Heike Adel, Hinrich Schütze
Team CIS, University of Munich (LMU)
TAC workshop, November 16, 2015

CIS at TAC: Neural Networks and Coreference Resolution for Slot Filling Heike Adel 2015/11/16 1 / 21

SLIDE 2

◮ CIS Slot Filling System: Overview
◮ Improved Integration of Coreference Resolution
◮ Relation Classification Models for Slot Filling
◮ CIS Performance in the TAC Shared Task 2015



SLIDE 8

System overview (pipeline)

Query (entity name + starting point)
→ Alias component: aliases for the entity
→ Information retrieval component [Terrier]: documents with aliases
→ Entity linking component [WAT]: documents about the entity
→ Candidate extraction component [Stanford CoreNLP] (sentence extraction, filler extraction): possible slot fillers
→ Slot filler classification component: scored slot fillers
→ Postprocessing component: output


SLIDE 9

Contents of this talk

(System overview diagram repeated from the previous slide.)



SLIDE 12

How coreference could help slot filling

◮ Find every sentence with mentions of the entity
⇒ Provide models next in the pipeline with all (?) necessary information to fill the slots
◮ Get some slot fillers for free:
  ◮ The mention “XX-year-old” already includes the fact that the entity is XX years old (same for “XX-based” or “XX-born”)
  ◮ The mention “his mother” already includes the fact that the subject of the sentence is a child of the entity
⇒ Coreference is a very important component of this task!
⇒ According to [Min and Grishman 2012, Pink et al. 2014], shortcomings of coreference resolution are one of the most important error sources!
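The “free” fillers above can be read directly off the anaphor string once it is linked to the entity. A minimal sketch (the regexes and slot names are illustrative assumptions, not the CIS system's actual inventory):

```python
import re

# Illustrative patterns; slot names and regexes are assumptions,
# not the CIS system's actual code.
PATTERNS = [
    ("per:age", re.compile(r"\b(\d{1,3})-year-old\b")),
    ("per:location_based", re.compile(r"\b([A-Z][\w.]*)-based\b")),
    ("per:location_born", re.compile(r"\b([A-Z][\w.]*)-born\b")),
]

def free_fillers(sentence):
    """Return (slot, filler) pairs read directly off nominal anaphora."""
    return [(slot, m.group(1))
            for slot, pat in PATTERNS
            for m in pat.finditer(sentence)]

print(free_fillers("The 47-year-old Munich-based manager resigned."))
# → [('per:age', '47'), ('per:location_based', 'Munich')]
```

Such fillers still have to be attributed to the right entity, which is exactly where coreference comes in.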



SLIDE 14

Analysis: Shortcomings of coreference resolution systems

◮ Nominal anaphora like “XX-year-old”, “XX-based”, “XX-born” are not recognized as coreferent to the entity in the previous sentence in most cases
◮ Pronouns referring to the same entity are often clustered in the same chain; unfortunately, the entity is often clustered in another chain
  ◮ Unlinked chains
  ◮ Wrongly linked chains


SLIDE 18

Nominal anaphora: Improvements

◮ Heuristic (decision flow):
  1. Entity ∈ sentence_t? If no: ignore possible nominal anaphora.
  2. Nominal anaphor ∈ sentence_{t+1}? If no: ignore.
  3. Another entity directly after the anaphor? If yes: ignore.
  4. Otherwise: the nominal anaphor may refer to the entity.
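The decision flow above can be sketched in a few lines; the suffix list and the string-based entity matching are simplified assumptions for illustration, not the CIS system's actual implementation:

```python
# Nominal-anaphora heuristic sketch; suffix list and entity matching
# are simplified assumptions.
ANAPHOR_SUFFIXES = ("-year-old", "-based", "-born")

def find_anaphor(sentence):
    """Return the first token that looks like a nominal anaphor, if any."""
    for token in sentence.split():
        if token.endswith(ANAPHOR_SUFFIXES):
            return token
    return None

def anaphor_refers_to_entity(sent_t, sent_t1, entity, other_entities):
    if entity not in sent_t:
        return False              # step 1: entity not in sentence t
    anaphor = find_anaphor(sent_t1)
    if anaphor is None:
        return False              # step 2: no nominal anaphor in sentence t+1
    tokens = sent_t1.split()
    nxt = tokens.index(anaphor) + 1
    if nxt < len(tokens) and tokens[nxt].rstrip(".,") in other_entities:
        return False              # step 3: anaphor modifies another entity
    return True                   # step 4: anaphor may refer to the entity

print(anaphor_refers_to_entity(
    "Smith joined the company in 1990.",
    "The 47-year-old resigned yesterday.",
    "Smith", {"Jones"}))          # → True
```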



SLIDE 21

Expansion of coreference integration

◮ CIS SF system for the 2014 evaluation: coreference resolution only for entities from queries (<name>)
◮ BUT: consider a sentence like “He is her father.”
◮ Analysis: coreference resolution for fillers is especially important due to the newly introduced inverse slots
  ◮ 2014: 8 slots with PER fillers
  ◮ 2015: 20 slots with PER fillers
◮ Now: coreference resolution for both <name> and <filler>
  ◮ But only if the filler is a person
◮ Future work: investigate the effect of coreference resolution for fillers in more detail; extend it to other filler types as well



SLIDE 24

Coreference resource

◮ Observation: long runtime of coreference resolution systems
◮ Solution: corpus pre-processing
◮ TAC source corpus: ∼65% pre-processed with [Stanford CoreNLP] so far
  ◮ ∼30M chains and ∼105M mentions found
  ◮ ∼25M pronoun mentions
◮ Easily accessible format: chains of mention start offset - end offset pairs
  ◮ Example: NYT_ENG_20090601.0015 14
    2424-2441 87-95 170-178 812-820 890-892 1473-1483 1785-1793 2036-2044 2493-2495 211-250 1649-1657 798-892 587-595 1121-1129 1130-1132 ...
◮ Resource will be publicly available
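Under the assumption that each line of the resource holds a document id, a chain id, and then the start-end character-offset pairs (as in the example above), a reader fits in a few lines:

```python
# Sketch of a reader for the chain format above; the exact field layout
# (doc id, chain id, then start-end offset pairs) is an assumption based
# on the example line.
def parse_chain(line):
    doc_id, chain_id, *spans = line.split()
    mentions = [tuple(map(int, s.split("-"))) for s in spans]
    return doc_id, int(chain_id), mentions

doc, chain, mentions = parse_chain(
    "NYT_ENG_20090601.0015 14 2424-2441 87-95 170-178")
print(doc, chain, mentions[0])  # → NYT_ENG_20090601.0015 14 (2424, 2441)
```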


SLIDE 25

Classification component 2015

◮ Training data available for the slot?
  ◮ No: result = pattern matcher score [Roth 2013] (match: 1.0, no match: 0.0)
  ◮ Yes: result = weighted sum of the pattern matcher score ∈ {0.0, 1.0}, the SVM probability, the CNN probability and the RNN probability
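The weighted sum can be sketched as follows; the uniform weights are an illustrative assumption (in practice such weights are tuned on development data):

```python
# Score interpolation sketch; the weights are illustrative assumptions.
def combine(pattern_match, svm_prob, cnn_prob, rnn_prob,
            weights=(0.25, 0.25, 0.25, 0.25)):
    """pattern_match is 1.0 on a match and 0.0 otherwise;
    the other inputs are classifier probabilities in [0, 1]."""
    scores = (pattern_match, svm_prob, cnn_prob, rnn_prob)
    return sum(w * s for w, s in zip(weights, scores))

print(round(combine(1.0, 0.8, 0.6, 0.7), 3))  # → 0.775
```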


SLIDE 26

Convolutional neural networks: Motivation

◮ Extract most relevant n-grams
  ◮ Convolution: create n-gram representations
  ◮ Pooling: find most relevant n-grams
  ◮ ... independent of position in the sentence
◮ Use the n-gram based sentence representation for classification
◮ Word vectors: implicit handling of synonyms


SLIDE 27

CNNs for slot filling

(Figure: the sentence, with the relation arguments marked <>, is split into left, middle and right contexts; word vectors with a case indicator enter each context; convolution with W and pooling are applied per context; the flattened, concatenated representations feed a fully connected MLP with n_h hidden units and a softmax.)

◮ Input: pre-trained word embeddings [word2vec]
◮ Context splitting
◮ Convolution and pooling for all contexts separately
◮ MLP (one hidden layer) and softmax for relation classification
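The context-wise forward pass can be sketched with numpy; all dimensions and the random weights are illustrative assumptions, not the trained model:

```python
# Minimal numpy sketch of the context-wise CNN described above.
import numpy as np

rng = np.random.default_rng(0)
d, n, k, n_h, n_rel = 50, 3, 100, 200, 25  # emb dim, n-gram width, filters, hidden, relations

W = rng.standard_normal((k, n * d)) * 0.01  # convolution filters (shared per context here)

def conv_pool(context):
    """Convolve n-gram windows with W, then max-pool over positions."""
    # context: (num_words, d) matrix of word embeddings
    windows = [context[i:i + n].reshape(-1) for i in range(len(context) - n + 1)]
    feats = np.stack([W @ w for w in windows], axis=1)  # (k, num_windows)
    return feats.max(axis=1)                            # (k,) after max pooling

def forward(left, middle, right, W_h, W_o):
    # one representation per context, concatenated, then MLP + softmax
    rep = np.concatenate([conv_pool(left), conv_pool(middle), conv_pool(right)])
    hidden = np.tanh(W_h @ rep)
    logits = W_o @ hidden
    e = np.exp(logits - logits.max())
    return e / e.sum()

W_h = rng.standard_normal((n_h, 3 * k)) * 0.01
W_o = rng.standard_normal((n_rel, n_h)) * 0.01
probs = forward(rng.standard_normal((7, d)), rng.standard_normal((5, d)),
                rng.standard_normal((6, d)), W_h, W_o)
print(probs.shape)  # → (25,)
```

Max pooling per context is what makes the extracted n-grams position-independent, as the motivation slide notes.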


SLIDE 28

Recurrent neural networks: Motivation

◮ Create a global sentence representation
◮ ... using all available information
◮ Possibly more robust against insertions (than e.g. patterns)
◮ Possibly better for longer sentences (than a CNN)


SLIDE 29

RNNs for slot filling

(Figure: a uni-directional RNN reads the sentence, with the relation arguments marked <>, left to right using recurrent weights U and output weights V; the bi-directional RNN additionally reads the sentence right to left and combines the forward hidden states hf (weights Uf) with the backward hidden states hb (weights Ub) before the relation prediction.)

◮ Input: pre-trained word embeddings [word2vec]
◮ Softmax for classification
◮ (1) Uni-directional RNN
◮ (2) Bi-directional RNN
◮ (3) Multi-task bi-directional RNN
  ◮ Additional task: predict the type of the next word (rel argument 1, rel argument 2, other)
◮ Result of the RNN component: score of the most confident RNN
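For comparison with the CNN, the uni-directional variant can be sketched in a few lines; dimensions and random weights are again illustrative assumptions:

```python
# Minimal numpy sketch of a uni-directional RNN relation classifier.
import numpy as np

rng = np.random.default_rng(1)
d, h, n_rel = 50, 100, 25
U = rng.standard_normal((h, h + d)) * 0.01  # recurrent + input weights
V = rng.standard_normal((n_rel, h)) * 0.01  # output weights

def rnn_classify(words):
    """words: (seq_len, d) embeddings; returns a relation distribution."""
    state = np.zeros(h)
    for w in words:                          # read the sentence left to right
        state = np.tanh(U @ np.concatenate([state, w]))
    logits = V @ state                       # classify from the final state
    e = np.exp(logits - logits.max())
    return e / e.sum()

probs = rnn_classify(rng.standard_normal((8, d)))
print(probs.shape)  # → (25,)
```

The bi-directional version would run a second pass right to left and combine both hidden states per position before the softmax.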


SLIDE 30

Performance in the TAC shared task 2015



SLIDE 36

CIS runs

◮ All runs include coreference resolution
◮ All runs: automatically tuned slot-wise output thresholds
◮ Submission of five runs:
  ◮ Base run: classification with patterns + SVM + CNN
  ◮ Non-neural run: base run − CNN
  ◮ RNN run: base run + RNN
  ◮ EL run: base run + entity linking for document extraction
  ◮ High precision run: base run with output thresholds += 0.2

SLIDE 37

CIS system results

◮ Best run: PAT + SVM + CNN + RNN
◮ Final results:

                    mean macro   max macro   max micro
    high P run           12.87       14.01       13.77
    base run             20.15       21.89       19.70
    RNN run              20.79       22.45       20.90
    EL run               20.39       22.15       20.21
    non-neural run       17.60       19.28       14.62



SLIDE 40

Analysis 1: Impact of coreference resolution

◮ All submitted runs included coreference resolution
◮ Offline run without coreference resolution
◮ Evaluated using the official assessments and scoring scripts
◮ Results (max micro):

                            P       R      F1
    hop 0   base run      31.83   23.97   27.35
    hop 0   − coref       29.70   20.82   24.48
    hop 1   base run      11.63    7.21    8.90
    hop 1   − coref       10.50    5.66    7.36
    all     base run      24.02   16.70   19.70
    all     − coref       22.58   14.25   17.47

⇒ Large impact of coreference resolution on end-to-end performance (19.70 vs. 17.47 F1 overall)



SLIDE 42

Analysis 2: Impact of neural networks

◮ Design of runs to immediately assess the impact of the neural networks
◮ Results (max micro):

                                 P       R      F1
    hop 0   PAT+SVM            18.99   22.32   20.52
    hop 0   PAT+SVM+CNN        31.83   23.97   27.35
    hop 0   PAT+SVM+CNN+RNN    29.98   26.58   28.18
    hop 1   PAT+SVM             5.92    4.53    5.13
    hop 1   PAT+SVM+CNN        11.63    7.21    8.90
    hop 1   PAT+SVM+CNN+RNN    13.82    6.08    8.44
    all     PAT+SVM            14.64   14.60   14.62
    all     PAT+SVM+CNN        24.02   16.70   19.70
    all     PAT+SVM+CNN+RNN    25.53   17.69   20.90

⇒ Neural networks improve end-to-end performance by 6.28 F1 points (14.62 → 20.90 overall)
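The F1 column in these result tables is simply the harmonic mean of precision and recall:

```python
# F1 as the harmonic mean of precision and recall.
def f1(p, r):
    return 2 * p * r / (p + r)

print(f"{f1(25.53, 17.69):.2f}")  # → 20.90
```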


SLIDE 43

Conclusion

◮ Focus of this talk: coreference resolution, relation classification with neural networks
◮ Coreference resolution:
  ◮ Coreference resolution for both relation arguments
  ◮ Heuristic error post-processing
  ⇒ Considerable impact on end-to-end performance (esp. on recall)
◮ Neural networks:
  ◮ CNNs and RNNs
  ◮ Interpolation of scores with non-neural model results
  ⇒ Very large impact on end-to-end performance


SLIDE 44

Thanks for your attention!

Contact: heike.adel@cis.lmu.de
http://www.cis.uni-muenchen.de/~heike


SLIDE 45

References

◮ Terrier: Iadh Ounis, Gianni Amati, Vassilis Plachouras, Ben He, Craig Macdonald, Christina Lioma: Terrier: A high performance and scalable information retrieval platform. In: OSIR 2006.

◮ WAT: Francesco Piccinno, Paolo Ferragina: From Tagme to WAT: a new entity annotator. In: Workshop on Entity Recognition & Disambiguation 2014.

◮ Stanford CoreNLP: Christopher D. Manning, Mihai Surdeanu, John Bauer, Jenny Finkel, Steven J. Bethard, David McClosky: The Stanford CoreNLP natural language processing toolkit. In: ACL System Demonstrations 2014.

◮ Min and Grishman 2012: Bonan Min, Ralph Grishman: Challenges in the knowledge base population slot filling task. In: LREC 2012.

SLIDE 46

References

◮ Pink et al. 2014: Glen Pink, Joel Nothman, James R. Curran: Analysing recall loss in named entity slot filling. In: EMNLP 2014.

◮ Roth 2013: Benjamin Roth, Tassilo Barth, Michael Wiegand, Mittul Singh, Dietrich Klakow: Effective slot filling based on shallow distant supervision methods. In: TAC 2013.

◮ word2vec: Tomas Mikolov, Kai Chen, Greg Corrado, Jeffrey Dean: Efficient estimation of word representations in vector space. In: Workshop at ICLR 2013.

SLIDE 47

Acknowledgements

◮ Heike Adel is a recipient of the Google Europe Fellowship in Natural Language Processing, and this research is supported by this fellowship.
◮ This work was also supported by DFG (grant SCHU 2246/4-2).
◮ We would like to thank Pankaj Gupta for training the RNN models.