SLIDE 1

Recognizing Named Entities using Automatically Extracted Transduction Rules

  • D. Nouvel, J.Y. Antoine, N. Friburger, A. Soulet

Université François Rabelais Tours Laboratoire d’Informatique Equipe BDTLN

Nouvel et al. (François Rabelais Tours) REN using Extracted Rules 1 / 21

SLIDE 2

Named Entity Recognition

◮ Named Entity Recognition (NER) task :

  • Proper Nouns : person, location, organization (movie, brand. . .)
  • Definite Descriptions : time expression, amount, function (. . .)

◮ Named Entities Recognition (NER) by :

  • Detecting / delimiting NEs (determining frontiers, boundaries)
  • Categorizing / classifying / assigning a type to detected NEs

⇒ Finding markers as NEs boundaries

Example

The <prod> iPhone 4 </prod> was announced during the <time> 7th of June, 2010 </time> keynote by <pers> Steve Jobs </pers>, <fonc> chief executive officer </fonc> of the <org> Apple </org> company.
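
The inline annotation can be read as a stream of tokens and boundary markers. The sketch below is only an illustration of that view: the function name `markers_from_annotation` and the whitespace-based tokenization are assumptions, not part of the presented system.

```python
import re

def markers_from_annotation(text):
    """Split an inline-annotated sentence into ("marker" | "token", value) pairs."""
    items = []
    # A piece is either an XML-like tag (<pers>, </pers>, ...) or a run of other characters.
    for piece in re.findall(r"</?\w+>|[^\s<>]+", text):
        kind = "marker" if piece.startswith("<") else "token"
        items.append((kind, piece))
    return items

example = "by <pers> Steve Jobs </pers> of the <org> Apple </org> company."
for kind, value in markers_from_annotation(example):
    print(kind, value)
```

Seen this way, NER amounts to deciding which marker, if any, goes between two consecutive tokens.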

SLIDE 3

General Context

Outline

  • 1. General Context
  • 2. Mining Patterns from Corpus
  • 3. NER using Informative Rules
  • 4. Experimental Results
  • 5. Conclusion

SLIDE 4

General Context

Context of work

◮ Main approaches of NER :

  • Knowledge-based systems (difficult to attain good recall)
  • Machine learning systems (generally not easy to customize)

⇒ We try to find a common ground for combining / hybridizing systems

◮ Existing system : CasEN [Fri06] (transducer / rule-based system)

◮ Available corpus : Ester2 [GGC09], a corpus of transcriptions of French radio broadcasts annotated with NEs :

Corpus        Tokens   Sentences   NEs
Ester2-corr   40 167   1 300       2 798
Ester2-held   48 143   1 683       3 074

TABLE: Characteristics of Ester2 corpora

⇒ Our objective : from the Ester2 corpus (as training data), mine patterns and find informative rules that may enhance CasEN for NER

SLIDE 5

General Context

Data Flow for NER Learning and Evaluating

FIGURE: Data flow linking Learning Corpus (annotated texts), Patterns Mining [AS95], Rules Filtering, Test Corpus, Annotation (MaxEnt) and Annotated Corpus

SLIDE 7

Mining Patterns from Corpus

Extracting Patterns

◮ Finding rules that help detect and categorize NEs simultaneously, by determining their markers

  • he flies to Poznan → he flies to <loc> Poznan </loc>
  • president Obama → president <pers> Obama </pers>
  • the benefits of Apple → the benefits of <org> Apple </org>

◮ Preprocessing : tokens, lemmas, POS-tagging (TreeTagger)

⇒ Regular tokens : we only keep the lemma (generalized patterns)

⇒ Proper Nouns (PN) : we only keep the POS (avoids overfitting)

◮ Pattern Mining considerations :

  • Exhaustively looking for patterns on pre-annotated corpus
  • Extracting and filtering patterns correlated to NEs markers
  • Applying patterns on unseen (test) corpus
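
The preprocessing step can be sketched as follows. This is a minimal illustration that assumes the tagger output is already available as (form, lemma, POS) triples; the function name `generalize` is hypothetical and the tag `PN` follows the notation of these slides rather than any particular tagset.

```python
def generalize(tagged_tokens):
    """Keep the lemma of regular tokens and only the POS of proper nouns.

    tagged_tokens: list of (surface form, lemma, POS) triples.
    """
    return ["PN" if pos == "PN" else lemma for _form, lemma, pos in tagged_tokens]

sentence = [
    ("he", "he", "PRO"),
    ("flies", "fly", "VER"),
    ("to", "to", "PRP"),
    ("Poznan", "Poznan", "PN"),
]
print(generalize(sentence))
```

Replacing "Poznan" by `PN` is what lets a pattern learned on one city name apply to any other proper noun.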

SLIDE 8

Mining Patterns from Corpus

Building hierarchy of items

DET : the, a, this

CN : president, presidents, head, officer, . . .

PN : Poznan, Apple, . . .

SLIDE 9

Mining Patterns from Corpus

From Corpus to Patterns : concrete example

he / we (PRO)   travels / come (VER)   to (PRP)   <loc>   Poznan (PN)   </loc>   by / with (PRP)

Pre-annotated corpus sentences

◮ (. . .) As he travels to Poznan by plane, he thought (. . .)

◮ (. . .) , this time, we come to Barcelona with (. . .)

Extracted Patterns

◮ he travel to <loc> PN </loc> by
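
To see how the item hierarchy produces patterns at several levels of generality, the sketch below enumerates every variant of a pattern in which each lemma is either kept or replaced by its POS. It is an illustration only: the `PARENT` table and the function `generalizations` are assumptions, and this brute-force enumeration is not the sequential pattern mining algorithm of [AS95].

```python
from itertools import product

# Hypothetical fragment of the hierarchy: lemma -> POS category.
PARENT = {
    "he": "PRO", "we": "PRO",
    "travel": "VER", "come": "VER",
    "to": "PRP", "by": "PRP", "with": "PRP",
}

def generalizations(pattern):
    """Return every variant where each lemma is kept or lifted to its POS."""
    options = [(item, PARENT[item]) if item in PARENT else (item,) for item in pattern]
    return [" ".join(choice) for choice in product(*options)]

first = generalizations(["he", "travel", "to", "<loc>", "PN", "</loc>", "by"])
second = generalizations(["we", "come", "to", "<loc>", "PN", "</loc>", "with"])

# Patterns shared by both sentences are the ones whose support reaches 2.
shared = set(first) & set(second)
print(sorted(shared))
```

Markers and `PN` have no entry in the table, so they are left unchanged in every variant.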

SLIDE 10

Mining Patterns from Corpus

Filtering Patterns as Informative Rules

Transduction Rule

◮ A Transduction Rule is a morpho-syntactic pattern (relying on the POS-tagging hierarchy) containing NE markers, for which the standard pattern-mining parameters are defined :

  • Support : number of occurrences in the corpus
  • Confidence : proportion of the pattern's occurrences that carry its markers
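
A minimal sketch of these two parameters, under stated assumptions: sentences and rules are lists of items in which markers start with `<`, support counts the occurrences of the rule with its markers, and confidence divides that count by the occurrences of the same items without markers. The helper names and the toy corpus are hypothetical.

```python
def strip_markers(seq):
    """Drop NE markers, keeping only lemmas and POS items."""
    return [x for x in seq if not x.startswith("<")]

def count_occurrences(pattern, sentence):
    """Count contiguous occurrences of pattern in sentence."""
    n = len(pattern)
    return sum(sentence[i:i + n] == pattern for i in range(len(sentence) - n + 1))

def support_and_confidence(rule, corpus):
    bare = strip_markers(rule)
    with_markers = sum(count_occurrences(rule, s) for s in corpus)
    anywhere = sum(count_occurrences(bare, strip_markers(s)) for s in corpus)
    confidence = with_markers / anywhere if anywhere else 0.0
    return with_markers, confidence

# Toy pre-annotated corpus (illustrative data, not taken from Ester2).
corpus = [
    ["he", "fly", "to", "<loc>", "PN", "</loc>"],
    ["we", "come", "to", "<loc>", "PN", "</loc>", "with"],
    ["he", "talk", "to", "<pers>", "PN", "</pers>"],
    ["he", "write", "to", "PN"],
]
print(support_and_confidence(["to", "<loc>", "PN"], corpus))
```

Here "to PN" occurs four times, twice with a `<loc>` marker in between, so the rule has support 2 and confidence 0.5.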

Informative Transduction Rule

◮ By exhaustively mining the corpus, we obtain a very large set of rules

⇒ We need to filter out rules

⇒ For two rules where one is a generalization of the other, we keep :

  • The most specific one in terms of POS-tagging hierarchy
  • The most informative according to markers

SLIDE 12

NER using Informative Rules

Probability model

◮ Many rules are triggered at a given position

◮ Define a random variable M_i for the marker at position i, with probability

$P(M_i = m_{j_i})$

◮ Annotation probability for a sentence (assumption : markers are independent) :

$P(M_1 = m_{j_1}, M_2 = m_{j_2}, \ldots, M_n = m_{j_n}) \approx \prod_{i=1}^{n} P(M_i = m_{j_i})$

◮ Probability learned by Maximum Entropy modeling

◮ Use dynamic programming to search for the best annotation (XML-like / flat)

SLIDE 13

NER using Informative Rules

Dynamic programming

the (DET)
    ∅ 0.3   <time> 0.3   <org> 0.3   <pers> ∼0   <loc> ∼0

3rd (NUM)
    ∅ 0.2   <pers> 0.4   <loc> 0.2

Guggenheim (PN)
    ∅ 0.5   <org> 0.2   <org> 0.2   </pers> 0.2

Museum (CN)
    ∅ 0.1   </org> 0.6   </loc> 0.2

spent (VER)
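
The search for a flat, well-formed annotation can be sketched as a Viterbi-style dynamic program over the positions between tokens. This is an illustration under assumptions, not the authors' implementation: the state is the type of the currently open entity (or `None`), an entity may only open when none is open, and a closing marker must match the open type. The function name `best_annotation` and the probabilities below are hypothetical, loosely inspired by the example lattice.

```python
from math import log

def best_annotation(gaps):
    """Most probable well-formed flat marker sequence.

    gaps[i] maps each candidate marker ("" for no marker, "<org>", "</org>", ...)
    to its probability at position i. Returns a list of markers, or None.
    """
    best = {None: (0.0, [])}  # open entity type -> (log-probability, markers so far)
    for candidates in gaps:
        nxt = {}
        for state, (score, path) in best.items():
            for marker, p in candidates.items():
                if p <= 0:
                    continue
                if marker == "":
                    new_state = state            # nothing opens or closes
                elif marker.startswith("</"):
                    if state != marker[2:-1]:
                        continue                 # closing tag must match the open entity
                    new_state = None
                else:
                    if state is not None:
                        continue                 # flat annotation: no nesting
                    new_state = marker[1:-1]
                cand = (score + log(p), path + [marker])
                if new_state not in nxt or cand[0] > nxt[new_state][0]:
                    nxt[new_state] = cand
        best = nxt
    # Every opened entity must be closed by the end of the sentence.
    return best[None][1] if None in best else None

# Hypothetical probabilities for the four positions between
# "the", "3rd", "Guggenheim", "Museum" and "spent".
gaps = [
    {"": 0.3, "<time>": 0.3, "<org>": 0.3},
    {"": 0.2, "<pers>": 0.4, "<loc>": 0.2},
    {"": 0.5, "</pers>": 0.2, "<org>": 0.2},
    {"": 0.1, "</org>": 0.6, "</loc>": 0.2},
]
print(best_annotation(gaps))
```

Picking the most probable marker independently at each position would select `<pers>` at the second position and `</org>` at the last one, which is not well-formed; the dynamic program instead returns the best consistent sequence.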

SLIDE 15

Experimental Results

Ester2 Corpus

Pattern extraction results over Ester2-corr (40K tokens, 3K NEs)

Corpus        Sup.   Conf.   Rules     Inf. Rules   Gain
Ester2-corr   10     .5      2 270     1 119        2.03
              5      .5      28 047    3 673        7.63
              3      .3      458 875   12 653       36.27

TABLE: Extraction over Ester2 corpus at support and confidence thresholds
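
The Gain column appears to be the ratio between the number of extracted rules and the number of informative rules kept after filtering. This reading is an assumption, checked below against the reported figures; the function name `gain` is hypothetical.

```python
# (support, confidence, rules, informative rules, reported gain)
rows = [
    (10, 0.5, 2270, 1119, 2.03),
    (5, 0.5, 28047, 3673, 7.63),
    (3, 0.3, 458875, 12653, 36.27),
]

def gain(rules, informative):
    """Reduction factor obtained by keeping only informative rules."""
    return rules / informative

for sup, conf, rules, informative, reported in rows:
    print(sup, conf, round(gain(rules, informative), 3), reported)
```

The computed ratios agree with the reported values to within 0.01.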

Interpretation

◮ The number of patterns grows very large when the support / confidence thresholds are lowered

◮ Filtering patterns is effective and keeps the number of rules reasonable

SLIDE 16

Experimental Results

Predicting Markers

Rows : actual markers ; columns : tot, then predicted / <pers> </pers> <loc> </loc> <org> </org> <fonc> </fonc>, then rec. (empty cells of the original matrix are not shown)

/         : 27803 | 27168 46 5 114 68 91 75 28 28 | 0.98
<pers>    : 583 | 86 430 20 1 26 1 18 | 0.74
</pers>   : 592 | 48 470 45 27 | 0.79
<loc>     : 700 | 162 20 2 394 114 1 2 | 0.56
</loc>    : 698 | 137 2 16 2 407 127 | 0.58
<org>     : 448 | 203 30 45 157 2 6 | 0.35
</org>    : 443 | 176 59 69 122 2 | 0.27
<fonc>    : 225 | 84 1 2 3 2 129 | 0.57
</fonc>   : 219 | 112 27 6 10 14 48 | 0.22
prec.     : 0.94 0.77 0.83 0.68 0.66 0.40 0.33 0.81 0.46

TABLE: Confusion matrix between rule markers using a MaxEnt classifier
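
The recall column can be recomputed from the confusion matrix above as the correctly predicted count divided by the row total. In the sketch below, the choice of which count is the diagonal cell of each row is an assumption, made because it reproduces the reported recalls.

```python
# marker -> (row total, count assumed to be the diagonal cell, reported recall)
rows = {
    "/": (27803, 27168, 0.98),
    "<pers>": (583, 430, 0.74),
    "</pers>": (592, 470, 0.79),
    "<loc>": (700, 394, 0.56),
    "</loc>": (698, 407, 0.58),
    "<org>": (448, 157, 0.35),
    "</org>": (443, 122, 0.27),
    "<fonc>": (225, 129, 0.57),
    "</fonc>": (219, 48, 0.22),
}

def recall(total, correct):
    """Share of the actual markers of a given kind that were predicted correctly."""
    return correct / total

for marker, (total, correct, reported) in rows.items():
    print(f"{marker:8s} {recall(total, correct):.3f} (reported {reported})")
```

For `</org>`, 122 / 443 is about 0.275, so the reported 0.27 seems to be truncated rather than rounded.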

Interpretation

◮ Strong ambiguity between org/pers and org/loc (known problem)

◮ The beginning of an NE is not necessarily easier to find than its end (cf. pers, loc)

SLIDE 17

Experimental Results

Predicting NEs

FIGURE: Evaluating (SER, to be minimized) NER annotations

Interpretation

◮ MaxEnt accurately weights rules (even the less frequent / less confident ones)

SLIDE 18

Experimental Results

Hybridizing Symbolic and Mining Systems

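            Ins.   Del.   Typ.   Ext.   SER
Symbolic    43     348    171    257    29.0
total       +5     -51    +19    +8     -1.3
Coupled     48     297    190    265    27.7

Changes per NE type, in the order Ins. Del. Typ. Ext. (empty cells of the original table are not shown), followed by SER :

fonc : -1 +1 ; SER 28.8
loc : +4 -15 +3 +1 ; SER 16.8
org : -13 +11 ; SER 52.8
pers : +1 -20 +8 ; SER 15.3
time : -2 ; SER 24.6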

TABLE: Using informative rules to enhance a symbolic system

Interpretation

◮ Coupling the systems improves the symbolic system, thanks to generic rules such as :

  • from <pers> PN PN
  • to <loc> PN
  • for <time> / years </time> (“for a few years”)
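
To make the effect of such a generic rule concrete, the sketch below inserts a rule's markers wherever its non-marker items match a generalized token sequence. This direct insertion is a simplification for illustration only: in the approach presented here, triggered rules are weighted by the MaxEnt model and combined by dynamic programming, which is also what decides where an entity closes. The function name `apply_rule` is hypothetical.

```python
def apply_rule(rule, items):
    """Insert the markers of rule wherever its non-marker items match items."""
    bare = [x for x in rule if not x.startswith("<")]
    if not bare:
        return list(items)  # a rule made only of markers has no context to match
    out, i = [], 0
    while i < len(items):
        if items[i:i + len(bare)] == bare:
            out.extend(rule)        # copy the matched span together with its markers
            i += len(bare)
        else:
            out.append(items[i])
            i += 1
    return out

print(apply_rule(["to", "<loc>", "PN"], ["he", "fly", "to", "PN", "by", "plane"]))
print(apply_rule(["from", "<pers>", "PN", "PN"], ["letter", "from", "PN", "PN"]))
```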

SLIDE 20

Conclusion

Conclusion

Contributions

◮ Extracting rules using a morpho-syntactic hierarchy

◮ Filtering specific and informative patterns as rules

◮ Using patterns to annotate a text (Named Entities)

◮ Hybridizing systems

Further investigations

◮ How to better filter the patterns to be integrated into the knowledge base ?

◮ How to enrich patterns (syntax, semantics, anaphora) ?

◮ Assess performance with other models to predict markers

◮ Involved in the NER task of the Etape project (French National Research Agency, ANR)

SLIDE 21

Conclusion

Thank you

[AS95] Rakesh Agrawal and Ramakrishnan Srikant. Mining sequential patterns. In International Conference on Data Engineering (ICDE’95), pages 3–14, 1995.

[Fri06] Nathalie Friburger. Linguistique et reconnaissance automatique des noms propres. Meta : Translators’ Journal, 51-4 :637–650, 2006.

[GGC09] Sylvain Galliano, Guillaume Gravier, and Laura Chaubard. The ESTER 2 evaluation campaign for the rich transcription of French radio broadcasts. In 10th Conference of the International Speech Communication Association (INTERSPEECH’2009), 2009.
