Joint Entity Disambiguation and Clustering Angela Fahrni, Thierry - - PowerPoint PPT Presentation


slide-1
SLIDE 1

Joint Entity Disambiguation and Clustering

Angela Fahrni, Thierry Göckel and Michael Strube Heidelberg Institute for Theoretical Studies gGmbH Heidelberg, Germany

1 / 54

slide-2
SLIDE 2

[Background word cloud: Disambiguation, Recognition of NILs, Clustering]

2 / 54

slide-3
SLIDE 3

[Background word cloud: Disambiguation, Recognition of NILs, Clustering]

  • Common and Proper Nouns

2 / 54

slide-4
SLIDE 4

[Background word cloud: Disambiguation, Recognition of NILs, Clustering]

  • Common and Proper Nouns
  • Focus on Modelling

2 / 54

slide-5
SLIDE 5

[Background word cloud: Disambiguation, Recognition of NILs, Clustering]

  • Joint Approach
  • Common and Proper Nouns
  • Focus on Modelling

2 / 54

slide-6
SLIDE 6

[Background word cloud: Disambiguation, Recognition of NILs, Clustering]

  • Joint Approach
  • Markov Logic
  • Common and Proper Nouns
  • Focus on Modelling

2 / 54

slide-7
SLIDE 7

[Background word cloud: Disambiguation, Recognition of NILs, Clustering]

  • Joint Approach
  • Markov Logic
  • Common and Proper Nouns
  • Focus on Modelling
  • Low Average Ambiguity

2 / 54

slide-8
SLIDE 8

[Background word cloud: Disambiguation, Recognition of NILs, Clustering]

  • Joint Approach
  • Markov Logic
  • Common and Proper Nouns
  • Training on 500 English Wikipedia Articles
  • Focus on Modelling
  • Low Average Ambiguity

2 / 54

slide-9
SLIDE 9

[Background word cloud: Disambiguation, Recognition of NILs, Clustering]

  • Joint Approach
  • Markov Logic
  • Common and Proper Nouns
  • Training on 500 English Wikipedia Articles
  • Focus on Modelling
  • Competitive Results
  • Low Average Ambiguity

2 / 54

slide-10
SLIDE 10

[Background word cloud: Disambiguation, Recognition of NILs, Clustering]

  • Joint Approach
  • Markov Logic
  • Common and Proper Nouns
  • Training on 500 English Wikipedia Articles
  • Focus on Modelling
  • Competitive Results
  • Low Average Ambiguity

3 / 54

slide-11
SLIDE 11

Within the States, American crocodiles live in Florida. Recently, the biologist Aldecoa captured an older crocodile in the sunshine state. The biologist Aldecoa caught the hatchlings.

Labels: Text 1, Text 2

4 / 54

slide-13
SLIDE 13

Within the States, American crocodiles live in Florida. Recently, the biologist Aldecoa captured an older crocodile in the sunshine state. The biologist Aldecoa caught the hatchlings.

Labels: Text 1, Text 2

Knowledge Base

  • American Crocodiles (Animal)
  • René Lacoste (Tennis player)
  • Crocodile (Locomotive)
  • United States
  • Florida (US State)
  • State of matter
  • Ignacio Aldecoa (Spanish Author)
  • Emilio Aldecoa (Football player)
  • Biologist
  • Hatchling
  • Florida (Puerto Rico)
  • State (Polity)

4 / 54

slide-15
SLIDE 15

Within the States, American crocodiles live in Florida. Recently, the biologist Aldecoa captured an older crocodile in the sunshine state. The biologist Aldecoa caught the hatchlings.

Labels: Text 1, Text 2

Knowledge Base

  • American Crocodiles (Animal)
  • René Lacoste (Tennis player)
  • Crocodile (Locomotive)
  • United States
  • Florida (US State)
  • State of matter
  • Ignacio Aldecoa (Spanish Author)
  • Emilio Aldecoa (Football player)
  • Biologist
  • Hatchling
  • Florida (Puerto Rico)
  • State (Polity)

4 / 54
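The lookup from a mention to its knowledge-base candidates can be sketched as below. This is a minimal illustration, not the authors' system: the slide's arrows between mentions and entries did not survive extraction, so the mention-to-candidate mapping in `CANDIDATES` is an assumption, and `identify_candidates` is a hypothetical helper name.

```python
from typing import Dict, List

# Assumed mapping from surface forms to knowledge-base entries.
# The entry names come from the slide; which mention points to which
# entry is a guess made for illustration only.
CANDIDATES: Dict[str, List[str]] = {
    "states": ["United States", "State (Polity)", "State of matter"],
    "american crocodiles": ["American Crocodiles (Animal)"],
    "crocodile": [
        "American Crocodiles (Animal)",
        "René Lacoste (Tennis player)",
        "Crocodile (Locomotive)",
    ],
    "florida": ["Florida (US State)", "Florida (Puerto Rico)"],
    "sunshine state": ["Florida (US State)"],
    "aldecoa": [
        "Ignacio Aldecoa (Spanish Author)",
        "Emilio Aldecoa (Football player)",
    ],
    "biologist": ["Biologist"],
    "hatchlings": ["Hatchling"],
}


def identify_candidates(mention: str) -> List[str]:
    """Return the candidate entries for a mention, or an empty list if none."""
    return CANDIDATES.get(mention.strip().lower(), [])


if __name__ == "__main__":
    for mention in ["Florida", "Aldecoa", "Heidelberg"]:
        print(mention, "->", identify_candidates(mention))
```

Note that neither candidate for "Aldecoa" is a biologist, so a correct system would have to reject both and treat the mention as a NIL.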

slide-17
SLIDE 17

[Background word cloud: Disambiguation, Recognition of NILs, Clustering]

  • Joint Approach
  • Markov Logic
  • Common and Proper Nouns
  • Training on 500 English Wikipedia Articles
  • Focus on Modelling
  • Competitive Results
  • Low Average Ambiguity

5 / 54

slide-18
SLIDE 18

Our Last Year’s Approach

Cascaded Approach

for each Text t
    for all Noun n in Text t
        Entity candidates identification
        Entity disambiguation
        NIL detection
    end for
end for
Clustering of NILs

6 / 54

slide-19
SLIDE 19

Our Last Year’s Approach

Cascaded Approach

for each Text t
    for all Noun n in Text t
        Entity candidates identification
        Entity disambiguation
        NIL detection
    end for
end for
Clustering of NILs

Annotations: Local classifier

6 / 54

slide-20
SLIDE 20

Our Last Year’s Approach

Cascaded Approach

for each Text t
    for all Noun n in Text t
        Entity candidates identification
        Entity disambiguation
        NIL detection
    end for
end for
Clustering of NILs

Annotations: Graph-based, global approach; Local classifier

6 / 54

slide-21
SLIDE 21

Our Last Year’s Approach

Cascaded Approach

for each Text t
    for all Noun n in Text t
        Entity candidates identification
        Entity disambiguation
        NIL detection
    end for
end for
Clustering of NILs

Annotations: Graph-based, global approach; Local classifier; Graph-based clustering

6 / 54
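The cascade can be sketched as a plain loop in which each stage consumes the output of the previous one. This is a minimal sketch under stated assumptions: the stage order (candidates, disambiguation, NIL detection, then clustering) is reconstructed from the slide, and `cascade`, `cluster_by_string` and the callable parameters are hypothetical names standing in for the real components.

```python
from typing import Callable, Dict, List, Optional, Tuple

Mention = Tuple[int, str]  # (index of the text, noun)


def cluster_by_string(nils: List[Mention]) -> List[List[Mention]]:
    """Toy stand-in for NIL clustering: group NIL mentions by lowercased string."""
    clusters: Dict[str, List[Mention]] = {}
    for text_id, noun in nils:
        clusters.setdefault(noun.lower(), []).append((text_id, noun))
    return list(clusters.values())


def cascade(
    texts: List[List[str]],
    candidates_for: Callable[[str], List[str]],
    disambiguate: Callable[[str, List[str]], Optional[str]],
    is_nil: Callable[[str, str], bool],
    cluster_nils: Callable[[List[Mention]], List[List[Mention]]],
) -> Tuple[Dict[Mention, str], List[List[Mention]]]:
    """Run the stages one after another; no stage can revise an earlier one."""
    links: Dict[Mention, str] = {}
    nils: List[Mention] = []
    for text_id, nouns in enumerate(texts):
        for noun in nouns:
            cands = candidates_for(noun)        # entity candidates identification
            best = disambiguate(noun, cands)    # entity disambiguation
            if best is None or is_nil(noun, best):  # NIL detection
                nils.append((text_id, noun))
            else:
                links[(text_id, noun)] = best
    return links, cluster_nils(nils)            # clustering of NILs


if __name__ == "__main__":
    kb = {"Florida": ["Florida (US State)"]}
    links, clusters = cascade(
        [["Florida", "Aldecoa"], ["Aldecoa"]],
        lambda noun: kb.get(noun, []),
        lambda noun, cands: cands[0] if cands else None,
        lambda noun, entity: False,
        cluster_by_string,
    )
    print(links)
    print(clusters)
```

Because each stage only sees what the previous stage produced, a candidate missed early on can never be recovered later.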

slide-22
SLIDE 22

Concept/Entity Disambiguation

Local approaches: Local supervised classification or ranking approaches

Within the States, American crocodiles live in Florida. Recently, the biologist Aldecoa captured an older crocodile in the sunshine state. The biologist Aldecoa caught the hatchlings.

Labels: Text 1, Text 2

Knowledge Base

  • American Crocodiles (Animal)
  • René Lacoste (Tennis player)
  • Crocodile (Locomotive)
  • United States
  • Florida (US State)
  • State of matter
  • Ignacio Aldecoa (Spanish Author)
  • Emilio Aldecoa (Football player)
  • Biologist
  • Hatchling
  • Florida (Puerto Rico)
  • State (Polity)

7 / 54

slide-24
SLIDE 24

Concept/Entity Disambiguation

Local approaches: Local supervised classification or ranking approaches

Within the States, American crocodiles live in Florida. Recently, the biologist Aldecoa captured an older crocodile in the sunshine state. The biologist Aldecoa caught the hatchlings.

Labels: Text 1, Text 2

Knowledge Base

  • American Crocodiles (Animal)
  • René Lacoste (Tennis player)
  • Crocodile (Locomotive)
  • United States
  • Florida (US State)
  • State of matter
  • Ignacio Aldecoa (Spanish Author)
  • Emilio Aldecoa (Football player)
  • Biologist
  • Hatchling
  • Florida (Puerto Rico)
  • State (Polity)

7 / 54

slide-26
SLIDE 26

Concept/Entity Disambiguation

Local approaches: Local supervised classification or ranking approaches

Within the States, American crocodiles live in Florida. Recently, the biologist Aldecoa captured an older crocodile in the sunshine state. The biologist Aldecoa caught the hatchlings.

Labels: Text 1, Text 2

Knowledge Base

  • American Crocodiles (Animal)
  • René Lacoste (Tennis player)
  • Crocodile (Locomotive)
  • United States
  • Florida (US State)
  • State of matter
  • Ignacio Aldecoa (Spanish Author)
  • Emilio Aldecoa (Football player)
  • Biologist
  • Hatchling
  • Florida (Puerto Rico)
  • State (Polity)

7 / 54
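A local approach scores each candidate for a mention on its own, without looking at the decisions made for other mentions. The sketch below ranks candidates by word overlap with the context; this is an illustrative stand-in for a supervised classifier or ranker, and the descriptive word sets in `FLORIDA_CANDIDATES` as well as the name `local_rank` are assumptions, not part of the slides.

```python
from typing import Dict, Set

# Hypothetical descriptive words for each candidate entry.
FLORIDA_CANDIDATES: Dict[str, Set[str]] = {
    "Florida (US State)": {"states", "state", "crocodiles", "everglades"},
    "Florida (Puerto Rico)": {"municipality", "puerto", "rico"},
}


def local_rank(context: Set[str], candidates: Dict[str, Set[str]]) -> str:
    """Return the candidate sharing the most words with the mention's context.

    Each candidate is scored independently; ties go to the first candidate.
    """
    def score(entity: str) -> int:
        return len(context & candidates[entity])

    return max(candidates, key=score)


if __name__ == "__main__":
    context = {"within", "states", "american", "crocodiles", "live"}
    print(local_rank(context, FLORIDA_CANDIDATES))
```

A learned model would replace the overlap count with a weighted combination of many such features, but the decision would still be made one mention at a time.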

slide-28
SLIDE 28

Concept/Entity Disambiguation

Global approaches: Collective classification approaches

Within the States, American crocodiles live in Florida. Recently, the biologist Aldecoa captured an older crocodile in the sunshine state. The biologist Aldecoa caught the hatchlings.

Labels: Text 1, Text 2

Knowledge Base

  • American Crocodiles (Animal)
  • René Lacoste (Tennis player)
  • Crocodile (Locomotive)
  • United States
  • Florida (US State)
  • State of matter
  • Ignacio Aldecoa (Spanish Author)
  • Emilio Aldecoa (Football player)
  • Biologist
  • Hatchling
  • Florida (Puerto Rico)
  • State (Polity)

8 / 54
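A global approach picks the entities for all mentions together, so that a coherent combination can win even when one of its members is not the best local choice. The sketch below does this by brute-force enumeration; it is not the authors' graph-based method, and the scores, the `related` table and the name `global_assign` are hypothetical values chosen for illustration.

```python
from itertools import product
from typing import Dict, FrozenSet, List, Tuple


def global_assign(
    mentions: Dict[str, List[str]],
    local: Dict[Tuple[str, str], float],
    related: Dict[FrozenSet[str], float],
) -> Dict[str, str]:
    """Return the joint assignment maximising local scores plus pairwise relatedness."""
    names = list(mentions)
    best: Dict[str, str] = {}
    best_score = float("-inf")
    # Enumerate every combination of one candidate per mention.
    for combo in product(*(mentions[m] for m in names)):
        score = sum(local.get((m, c), 0.0) for m, c in zip(names, combo))
        score += sum(
            related.get(frozenset((a, b)), 0.0)
            for i, a in enumerate(combo)
            for b in combo[i + 1:]
        )
        if score > best_score:
            best, best_score = dict(zip(names, combo)), score
    return best


if __name__ == "__main__":
    mentions = {
        "crocodile": ["Crocodile (Locomotive)", "American Crocodiles (Animal)"],
        "Florida": ["Florida (Puerto Rico)", "Florida (US State)"],
    }
    # Made-up scores: the local evidence slightly prefers the locomotive.
    local = {
        ("crocodile", "Crocodile (Locomotive)"): 0.6,
        ("crocodile", "American Crocodiles (Animal)"): 0.5,
    }
    related = {
        frozenset(("American Crocodiles (Animal)", "Florida (US State)")): 1.0,
    }
    print(global_assign(mentions, local, related))
```

Exhaustive enumeration grows exponentially with the number of mentions, which is why practical systems rely on graph algorithms or approximate inference instead.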

slide-30
SLIDE 30

Our Last Year’s Approach

Cascaded Approach

for each Text t
    for all Noun n in Text t
        Entity candidates identification
        Entity disambiguation
        NIL detection
    end for
end for
Clustering of NILs

Annotations: Graph-based, global approach; Local classifier; Graph-based clustering

9 / 54

slide-31
SLIDE 31

Graph-based Approach

Recently, the [biologist] [Aldecoa] captured a [crocodile] in the [sunshine state]

10 / 54

slide-32
SLIDE 32

Summary Cascaded Approach

  • Global approach with competitive results (best system at NTCIR 9, between median and best system in the Chinese cross-lingual entity linking task at TAC 2011)

  • Error propagation
  • Training: how to integrate more (local) features?
  • Non-pairwise features?

11 / 54

slide-33
SLIDE 33

Error Propagation

Cascaded Approach

for each Text t
    for each Noun n in Text t
        Entity candidates identification
        Entity disambiguation
        NIL detection
    end for
end for
Clustering of NILs

Annotations: 97%

12 / 54

slide-34
SLIDE 34

Error Propagation

Cascaded Approach

for each Text t
    for each Noun n in Text t
        Entity candidates identification
        Entity disambiguation
        NIL detection
    end for
end for
Clustering of NILs

Annotations: 97%, 8%

12 / 54

slide-35
SLIDE 35

Error Propagation

Cascaded Approach

for each Text t
    for each Noun n in Text t
        Entity candidates identification
        Entity disambiguation
        NIL detection
    end for
end for
Clustering of NILs

Annotations: 97%, 8%, 72%

12 / 54

slide-36
SLIDE 36

Error Propagation

Cascaded Approach

for each Text t
    for each Noun n in Text t
        Entity candidates identification
        Entity disambiguation
        NIL detection
    end for
end for
Clustering of NILs

Annotations: 97%, 8%, 72%, 56%

12 / 54
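Why errors compound in a cascade can be shown with a short calculation. The sketch assumes that stage errors are independent, so the chance that a mention passes every stage correctly is the product of the per-stage accuracies. The numbers are hypothetical and are not the percentages on the slide, whose exact meaning is not recoverable from the transcript; `end_to_end` is an illustrative name.

```python
import math
from typing import Sequence


def end_to_end(stage_accuracies: Sequence[float]) -> float:
    """Probability that all stages are correct, assuming independent errors."""
    return math.prod(stage_accuracies)


if __name__ == "__main__":
    # Hypothetical per-stage accuracies for a three-stage cascade.
    stages = [0.9, 0.8, 0.5]
    print(f"end-to-end accuracy: {end_to_end(stages):.2f}")
```

Even when every stage looks acceptable in isolation, the product falls quickly, which is one motivation for deciding all stages jointly.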

slide-37
SLIDE 37

Summary Cascaded Approach

  • Global approach with competitive results (best system at NTCIR 9, between median and best system in the Chinese cross-lingual entity linking task at TAC 2011)

  • Error propagation
  • Training: how to integrate more (local) features?
  • Non-pairwise features?

13 / 54

slide-38
SLIDE 38

Learning Weights

Recently, the [biologist] [Aldecoa] captured a [crocodile] in the [sunshine state]

14 / 54

slide-40
SLIDE 40

Non-pairwise Features?

Recently, the [biologist] [Aldecoa] captured a [crocodile] in the [sunshine state]

16 / 54

slide-42
SLIDE 42

Disambiguation, Recognition of NILs, Clustering

  • Joint Approach
  • Markov Logic
  • Common and Proper Nouns
  • Training on 500 English Wikipedia Articles
  • Focus on Modelling
  • Competitive Results
  • Low Average Ambiguity

18 / 54

slide-43
SLIDE 43

Novel Approach

Cascaded Approach

for each Text t
  for each Noun n in Text t
    Entity candidates identification
    Entity disambiguation
    NIL detection
  end for
end for
Clustering of NILs

Joint Approach

for each Noun n in each Text
  Entity candidates identification
end for
Disambiguation, NIL detection, Clustering

19 / 54

slide-44
SLIDE 44

Joint Approach

Text 1: Within the States, American crocodiles live in Florida. Recently, the biologist Aldecoa captured an older crocodile in the sunshine state.

Text 2: The biologist Aldecoa caught the hatchlings.

20 / 54

slide-45
SLIDE 45

Joint Approach

Text 1: Within the States, American crocodiles live in Florida. Recently, the biologist Aldecoa captured an older crocodile in the sunshine state.

Text 2: The biologist Aldecoa caught the hatchlings.

Knowledge Base

  • American Crocodiles (Animal)
  • René Lacoste (Tennis player)
  • Crocodile (Locomotive)
  • United States
  • Florida (US State)
  • Florida (Puerto Rico)
  • State of matter
  • State (Polity)
  • Ignacio Aldecoa (Spanish Author)
  • Emilio Aldecoa (Football player)
  • Biologist
  • Hatchling

21 / 54

slide-52
SLIDE 52

Requirements

  • Joint inference
      • disambiguation, recognition of NILs and clustering
  • Non-pairwise features

→ Markov Logic

27 / 54

slide-54
SLIDE 54

Markov Logic

  • Markov Logic (ML) combines first-order logic with probabilities
  • A Markov Logic Network (MLN) is a set of pairs (F_i, w_i)
      • F_i: first-order formula
      • w_i: real-valued weight (w_i ∈ ℝ) associated with formula F_i

Domingos and Lowd 2009

29 / 54

slide-55
SLIDE 55

Markov Logic Networks (MLN)

An MLN is a template for constructing a Markov network. Given a set of constants C, a Markov network can be defined in the following way:

  • Binary vertex for each possible grounding of each predicate:
      • If ground predicate is true → 1
      • If ground predicate is false → 0
  • One feature for each possible grounding of each formula F_i:
      • If ground formula is true → 1
      • If ground formula is false → 0
      • Weight: w_i
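The grounding step can be illustrated with a small sketch. It assumes a toy domain of two mention constants and a single binary predicate, and grounds only the symmetry formula; the helper names (`ground_atoms`, `symmetry_groundings`, `feature_value`) are hypothetical and not taken from the slides or from any MLN toolkit.

```python
from itertools import product

def ground_atoms(predicates, constants):
    """Return every ground atom as (predicate_name, argument_tuple)."""
    atoms = []
    for name, arity in predicates.items():
        for args in product(constants, repeat=arity):
            atoms.append((name, args))
    return atoms

def symmetry_groundings(constants):
    """Ground hasSameEntity(m, n) -> hasSameEntity(n, m) for all m != n."""
    return [(m, n) for m, n in product(constants, repeat=2) if m != n]

def feature_value(world, m, n):
    """Return 1 if the ground implication holds in the world, else 0."""
    body = world[("hasSameEntity", (m, n))]
    head = world[("hasSameEntity", (n, m))]
    return int((not body) or head)

# Toy domain (assumed): two mentions and one binary predicate.
constants = ["m1", "m2"]
predicates = {"hasSameEntity": 2}
atoms = ground_atoms(predicates, constants)

# One possible world: each ground atom is a binary vertex (0 or 1).
world = {atom: 0 for atom in atoms}
world[("hasSameEntity", ("m1", "m2"))] = 1

# One binary feature per grounding of the formula.
features = [feature_value(world, m, n) for m, n in symmetry_groundings(constants)]
```

In this world the grounding for (m1, m2) is violated and the grounding for (m2, m1) is satisfied, so only one of the two features fires.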

30 / 54

slide-56
SLIDE 56

Probability Distribution

Probability of a possible world x specified by the ground Markov network:

$$P(X = x) = \frac{1}{Z} \exp\left( \sum_i w_i \, n_i(x) \right)$$

  • n_i(x) is the number of true groundings of F_i in x
  • w_i is the weight of F_i

Partition function Z:

$$Z = \sum_{x' \in \mathcal{X}} \exp\left( \sum_i w_i \, n_i(x') \right)$$

Domingos and Lowd 2009
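As a worked illustration of the distribution, the sketch below enumerates all possible worlds of a tiny ground network and normalises by the partition function. The two atoms, the single symmetry formula and its weight of 2.0 are assumptions chosen for the example; brute-force enumeration is only feasible for toy domains like this one.

```python
import math
from itertools import product

ATOMS = ["same(m1,m2)", "same(m2,m1)"]
W_SYMMETRY = 2.0  # assumed weight, for illustration only

def n_symmetry(world):
    """Number of true groundings of same(a,b) -> same(b,a) in the world."""
    a, b = world["same(m1,m2)"], world["same(m2,m1)"]
    return int((not a) or b) + int((not b) or a)

def unnormalized(world, weighted_counts):
    """exp(sum_i w_i * n_i(x)) for one world x."""
    return math.exp(sum(w * n(world) for w, n in weighted_counts))

def probability(world, weighted_counts):
    """P(X = x): unnormalised score divided by the partition function Z."""
    worlds = [dict(zip(ATOMS, values))
              for values in product([0, 1], repeat=len(ATOMS))]
    z = sum(unnormalized(x, weighted_counts) for x in worlds)
    return unnormalized(world, weighted_counts) / z

mln = [(W_SYMMETRY, n_symmetry)]
p_consistent = probability({"same(m1,m2)": 1, "same(m2,m1)": 1}, mln)
p_violating = probability({"same(m1,m2)": 1, "same(m2,m1)": 0}, mln)
```

A world that violates one grounding is not impossible, only less probable: here it is penalised by a factor of exp(2.0) relative to a world that satisfies both groundings.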

31 / 54

slide-57
SLIDE 57

Disambiguation and Clustering using Markov Logic

Text 1: Within the States, American crocodiles live in Florida. Recently, the biologist Aldecoa captured an older crocodile in the sunshine state.

Text 2: The biologist Aldecoa caught the hatchlings.

Knowledge Base

  • American Crocodiles (Animal)
  • René Lacoste (Tennis player)
  • Crocodile (Locomotive)
  • United States
  • Florida (US State)
  • Florida (Puerto Rico)
  • State of matter
  • State (Polity)
  • Ignacio Aldecoa (Spanish Author)
  • Emilio Aldecoa (Football player)
  • Biologist
  • Hatchling

32 / 54

slide-59
SLIDE 59

Disambiguation and Clustering using Markov Logic

Hidden Predicates

hasEntity(m,e)
hasSameEntity(m,n)

Hard Constraints

At most one entity/concept per mention:

∀m ∈ M : |{e ∈ E : hasEntity(m,e)}| ≤ 1

Symmetry:

∀m,n ∈ M : m ≠ n ∧ hasSameEntity(m,n) → hasSameEntity(n,m)

Transitivity:

∀m,n,l ∈ M : m ≠ n ∧ m ≠ l ∧ n ≠ l ∧ hasSameEntity(m,n) ∧ hasSameEntity(n,l) → hasSameEntity(m,l)

33 / 54

slide-60
SLIDE 60

Disambiguation and Clustering using Markov Logic

Hard Constraints (continued)

All members of a cluster refer to the same entity/concept:

∀m,n ∈ M : m ≠ n ∧ hasSameEntity(m,n) ∧ hasEntity(m,e) → hasEntity(n,e)

If two mentions refer to the same concept, they belong to the same cluster:

∀m,n ∈ M : m ≠ n ∧ hasEntity(m,e) ∧ hasEntity(n,e) → hasSameEntity(m,n)
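To make the hard constraints concrete, here is a minimal sketch that checks a candidate assignment of the two hidden predicates against all five of them. It is only a validity checker written for illustration, not the inference procedure used in the talk; the function name `violations` and the set-based encoding of the predicates are assumptions.

```python
from itertools import permutations

def violations(mentions, has_entity, has_same_entity):
    """List hard-constraint violations.

    has_entity: set of (mention, entity) pairs that are true.
    has_same_entity: set of (mention, mention) pairs that are true.
    """
    found = []
    entities_of = {m: {e for (x, e) in has_entity if x == m} for m in mentions}
    # At most one entity/concept per mention.
    for m in mentions:
        if len(entities_of[m]) > 1:
            found.append(("at_most_one_entity", m))
    for m, n in permutations(mentions, 2):
        linked = (m, n) in has_same_entity
        # Symmetry.
        if linked and (n, m) not in has_same_entity:
            found.append(("symmetry", m, n))
        # All members of a cluster refer to the same entity.
        if linked and not entities_of[m] <= entities_of[n]:
            found.append(("cluster_shares_entity", m, n))
        # Same entity implies same cluster.
        if entities_of[m] & entities_of[n] and not linked:
            found.append(("same_entity_implies_cluster", m, n))
    # Transitivity over three distinct mentions.
    for m, n, l in permutations(mentions, 3):
        if ((m, n) in has_same_entity and (n, l) in has_same_entity
                and (m, l) not in has_same_entity):
            found.append(("transitivity", m, n, l))
    return found

mentions = ["Florida", "sunshine state", "Aldecoa (text 1)", "Aldecoa (text 2)"]
has_entity = {("Florida", "Florida (US State)"),
              ("sunshine state", "Florida (US State)")}
has_same_entity = {("Florida", "sunshine state"), ("sunshine state", "Florida"),
                   ("Aldecoa (text 1)", "Aldecoa (text 2)"),
                   ("Aldecoa (text 2)", "Aldecoa (text 1)")}

consistent = violations(mentions, has_entity, has_same_entity)
broken = violations(mentions, has_entity,
                    has_same_entity - {("sunshine state", "Florida")})
```

The two Aldecoa mentions have no entity but still form a cluster, which is how a NIL cluster appears under this encoding.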

34 / 54

slide-61
SLIDE 61

Formulas with Learned Weight

Local Features

Prior probability of an entity/concept given a mention:

∀m ∈ M ∀e ∈ E_m : hasCommonness(m,e,s) → hasEntity(m,e)

Weight: +(s · w)

Example: crocodile: American crocodile: s = 0.4, Crocodile (Locomotive): s = 0.2, René Lacoste: s = 0.1, ...
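The effect of the score-scaled weight s · w can be sketched as follows, using the commonness values from the crocodile example. The value of the learned weight w is a hypothetical placeholder, and the function name is not from the slides.

```python
def commonness_scores(candidates, w):
    """Contribution s * w of the commonness formula for each candidate entity."""
    return {entity: s * w for entity, s in candidates.items()}

# Commonness values s for the mention "crocodile", as given in the example.
candidates = {
    "American crocodile": 0.4,
    "Crocodile (Locomotive)": 0.2,
    "René Lacoste": 0.1,
}
W_COMMONNESS = 2.0  # hypothetical learned weight

scores = commonness_scores(candidates, W_COMMONNESS)
best = max(scores, key=scores.get)
```

Taken on its own, this feature always prefers the most common sense; the other local and global formulas are what allow the model to override it.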

35 / 54

slide-62
SLIDE 62

Formulas with Learned Weight

Local Features

Not related to context:

∀m ∈ M ∀e ∈ E_m : hasRelatedness(m,e,s) ∧ s = 0 → hasEntity(m,e)

Weight: −(w)

Other local features: relatedness, local context similarity, string distance.

36 / 54

slide-63
SLIDE 63

Formulas with Learned Weight

Global Features

Same head and m is substring of n:

∀m,n ∈ M ∀e ∈ E_m : m ≠ n ∧ isSubStringHeadMatch(m,n,s) → hasEntity(m,e) ∧ hasEntity(n,e)

Weight: +(s · w)

Example: American crocodile – crocodile; Jimmy Allen – Allen

37 / 54

slide-64
SLIDE 64

Formulas with Learned Weight

Global Features

Two mentions with the same string tend to refer to the same entity/concept:

∀m,n ∈ M : m ≠ n ∧ hasSameString(m,n,s) → hasSameEntity(m,n)

Weight: +(s · w)

Example: crocodile – crocodile

38 / 54

slide-65
SLIDE 65

Formulas with Learned Weight

Global Features

Two mentions in two different documents are part of the same n-gram:

∀m,n ∈ M : m ≠ n ∧ shareNgram(m,n,s) → hasSameEntity(m,n)

Weight: +(s · w)

Example: biologist Aldecoa – biologist Aldecoa

39 / 54

slide-66
SLIDE 66

Input

Text 1: Within the States, American crocodiles live in Florida. Recently, the biologist Aldecoa captured an older crocodile in the sunshine state.

Text 2: The biologist Aldecoa caught the hatchlings.

Preprocessing

  • Tokenization
  • Part-of-Speech Tagging
  • Syntactic Parsing
  • Named Entity Recognition

Mention and Entity Candidates identification

Text 1:
  States: State of matter, State (Polity), United States
  American crocodiles: American Crocodiles (Animal)
  Florida: Florida (US State), Florida (Puerto Rico)
  biologist: Biologist
  Aldecoa: Ignacio Aldecoa, Emilio Aldecoa
  crocodile: American Crocodiles (Animal), Crocodile (Locomotive), René Lacoste
  sunshine state: Florida (US State)

Text 2:
  biologist: Biologist
  Aldecoa: Ignacio Aldecoa, Emilio Aldecoa
  hatchlings: Hatchling

Feature Extraction

Text 1:
  hasCommonness(States, State of matter, 0.3)
  hasCommonness(States, State (Polity), 0.3)
  hasCommonness(States, United States, 0.4)
  ...
  hasRelatedness(States, State of matter, 0.01)
  hasRelatedness(States, State (Polity), 0.03)
  hasRelatedness(States, United States, 0.31)
  ...
  isSubStringHeadMatch(sunshine state, states, 0.5)
  isSubStringHeadMatch(American crocodiles, crocodiles, 0.5)

Text 2:
  hasCommonness(Aldecoa, Ignacio Aldecoa, 0.3)
  hasCommonness(Aldecoa, Emilio Aldecoa, 0.7)
  hasCommonness(hatchlings, Hatchling, 1.0)
  ...
  hasRelatedness(Aldecoa, Ignacio Aldecoa, 0.0)
  hasRelatedness(Aldecoa, Emilio Aldecoa, 0.01)
  hasRelatedness(hatchlings, Hatchling, 0.3)
  ...
  sharedNgram(biologist (text 1), biologist (text 2), 1.0)
  sharedNgram(Aldecoa (text 1), Aldecoa (text 2), 1.0)

Regrouping across Documents

American crocodiles, crocodile:
  isSubStringHeadMatch(American crocodiles, crocodiles, 0.5)
  ...

Aldecoa (text 1), Aldecoa (text 2):
  hasRelatedness(Aldecoa (text 1), Ignacio Aldecoa, 0.0)
  hasRelatedness(Aldecoa (text 1), Emilio Aldecoa, 0.03)
  hasRelatedness(Aldecoa (text 2), Ignacio Aldecoa, 0.0)
  hasRelatedness(Aldecoa (text 2), Emilio Aldecoa, 0.01)
  ...
  sharedNgram(Aldecoa (text 1), Aldecoa (text 2), 1.0)

...

Inference

Postprocessing

Output

  American Crocodiles (Animal): American crocodiles, crocodiles
  United States: States
  Florida (US State): Florida, sunshine state
  Biologist: biologist (text 1), biologist (text 2)
  Hatchling: hatchlings
  Nil 3456: Aldecoa (text 1), Aldecoa (text 2)

40 / 54

slide-67
SLIDE 67

Learning and Inference

Learning: Online training using a perceptron

Inference: MAP inference using Cutting Planes combined with Integer Linear Programming (Gurobi)

Tool: TheBeast: http://code.google.com/p/thebeast/

Riedel 2009

41 / 54
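The interplay of online perceptron learning and MAP inference can be sketched on a single mention. This is a simplified stand-in, not TheBeast's implementation: MAP inference is replaced by a brute-force argmax over candidates (instead of cutting planes with an ILP solver), and only the two local features from the "States" example are used. All function names are hypothetical.

```python
def score(weights, features):
    """Weighted sum of feature values for one candidate entity."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())

def map_assignment(weights, candidates):
    """Brute-force argmax over candidates (stand-in for ILP-based MAP inference)."""
    return max(candidates, key=lambda entity: score(weights, candidates[entity]))

def perceptron_step(weights, candidates, gold, lr=1.0):
    """Predict with current weights; on a mistake, move weights towards gold."""
    predicted = map_assignment(weights, candidates)
    if predicted != gold:
        for name, value in candidates[gold].items():
            weights[name] = weights.get(name, 0.0) + lr * value
        for name, value in candidates[predicted].items():
            weights[name] = weights.get(name, 0.0) - lr * value
    return predicted

# Feature values for the mention "States", taken from the worked example.
candidates = {
    "State of matter": {"commonness": 0.3, "relatedness": 0.01},
    "State (Polity)": {"commonness": 0.3, "relatedness": 0.03},
    "United States": {"commonness": 0.4, "relatedness": 0.31},
}

weights = {}
first_prediction = perceptron_step(weights, candidates, gold="United States")
second_prediction = map_assignment(weights, candidates)
```

With all weights at zero the first prediction is an arbitrary tie-break; after one update both weights are positive and the gold entity is ranked first.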

slide-69
SLIDE 69

Training and Development Data

English Wikipedia articles (featured)

After the risks caused by the flammability of hydrogen became apparent, it was replaced with helium in blimps and balloons.

After the risks caused by the flammability of [[hydrogen]] became apparent, it was replaced with helium in [[non-rigid airship|blimps]] and [[gas balloon|balloons]].

Dataset       Documents   Mentions   In KB    NILs    Ave. Amb.
WP Training   500         46,810     43,547   3,263   2.18
WP Dev        100         7,197      6,610    587     2.11
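The annotated example sentence shows how internal hyperlinks provide gold mention/entity pairs. A minimal sketch of that extraction, assuming plain `[[target]]` and `[[target|anchor]]` wikilinks only (no templates, nested links or section anchors), could look like this; the function name is hypothetical.

```python
import re

# Matches [[target]] and [[target|anchor text]].
WIKILINK = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")

def extract_mentions(markup):
    """Return (mention string, linked article title) pairs from wiki markup."""
    pairs = []
    for match in WIKILINK.finditer(markup):
        target = match.group(1)
        anchor = match.group(2) or target  # unpiped links use the title as anchor
        pairs.append((anchor, target))
    return pairs

markup = ("After the risks caused by the flammability of [[hydrogen]] became "
          "apparent, it was replaced with helium in [[non-rigid airship|blimps]] "
          "and [[gas balloon|balloons]].")
mentions = extract_mentions(markup)
```

Note that "helium" is unlinked in the example, so it yields no training pair under this sketch.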

43 / 54

slide-71
SLIDE 71

Going Cross-lingual

  • Mapping of Chinese and Spanish Wikipedia articles to the English ones using interlanguage links and triangulation.
      • Mapped Chinese articles: 48.4%
      • Mapped Spanish articles: 60.2%
  • Disambiguation with respect to the mapped index
  • English link structure can be used to calculate relatedness
  • Trained on the internal hyperlinks of 500 English Wikipedia articles for all languages
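One way to read "interlanguage links and triangulation" is: use the direct link to the English article where it exists, otherwise go through an article in a third language that itself links to English. The sketch below follows that reading as an assumption; the link table and the unmapped title are invented toy data, and the function names are hypothetical.

```python
def map_to_english(title, lang, langlinks):
    """Map an article to its English title via a direct link or one pivot language.

    langlinks: dict mapping (lang, title) to a dict {target_lang: target_title}.
    """
    links = langlinks.get((lang, title), {})
    if "en" in links:
        return links["en"]
    # Triangulation: follow a link into another language, then on to English.
    for pivot_lang, pivot_title in links.items():
        pivot_links = langlinks.get((pivot_lang, pivot_title), {})
        if "en" in pivot_links:
            return pivot_links["en"]
    return None

# Toy interlanguage link table (hypothetical data).
langlinks = {
    ("es", "Cocodrilo americano"): {"en": "American crocodile"},
    ("es", "Estado de agregación"): {"de": "Aggregatzustand"},
    ("de", "Aggregatzustand"): {"en": "State of matter"},
    ("es", "Artículo sin enlaces"): {},
}

spanish_titles = ["Cocodrilo americano", "Estado de agregación", "Artículo sin enlaces"]
mapped = {t: map_to_english(t, "es", langlinks) for t in spanish_titles}
coverage = sum(v is not None for v in mapped.values()) / len(spanish_titles)
```

Coverage computed this way corresponds to the "mapped articles" percentages on the slide; articles without any path to English stay outside the mapped index.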

45 / 54

slide-73
SLIDE 73

English Entity Linking Task at TAC 2012

Run      Acc     B³ P    B³ R    B³ F1   B³+ P   B³+ R   B³+ F1
Best     -       -       -       -       -       -       0.730
Median   -       -       -       -       -       -       0.536
HITS     0.718   0.751   0.932   0.832   0.572   0.678   0.621
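For readers unfamiliar with the clustering columns, the sketch below computes plain B³ precision, recall and F1 from two mention-to-cluster assignments. It does not implement the B³+ variant reported in the table, which also takes the correctness of the knowledge-base link into account; the toy clusters are invented for illustration.

```python
def b_cubed(system, gold):
    """Plain B-cubed scores; system and gold map each mention to a cluster id."""
    def average_overlap(a, b):
        # For each mention: share of its cluster in `a` that is also in its cluster in `b`.
        total = 0.0
        for m in a:
            cluster_a = {x for x in a if a[x] == a[m]}
            cluster_b = {x for x in b if b[x] == b[m]}
            total += len(cluster_a & cluster_b) / len(cluster_a)
        return total / len(a)

    precision = average_overlap(system, gold)
    recall = average_overlap(gold, system)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Toy example: the system merges three mentions that gold splits into two clusters.
gold = {"a": "NIL1", "b": "NIL1", "c": "NIL2"}
system = {"a": "X", "b": "X", "c": "X"}
precision, recall, f1 = b_cubed(system, gold)
```

Over-merging keeps recall at 1.0 but lowers precision, which matches the pattern of high B³ recall and lower B³ precision in the HITS row.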

47 / 54

slide-74
SLIDE 74

Chinese Entity Linking Task at TAC 2012

Run    Micr.   B³ P    B³ R    B³ F1   B³+ P   B³+ R   B³+ F1
Best   -       -       -       -       -       -       0.740
HITS   0.843   0.863   0.811   0.836   0.738   0.742   0.740
EN     0.895   0.952   0.787   0.861   0.861   0.743   0.798
ZH     0.818   0.848   0.798   0.822   0.701   0.719   0.710

48 / 54

slide-75
SLIDE 75

Spanish Entity Linking Task at TAC 2012

Run     Micr.   B³ P    B³ R    B³ F1   B³+ P   B³+ R   B³+ F1
Best    -       -       -       -       -       -       0.641
HITS    0.707   0.648   0.880   0.746   0.464   0.638   0.538
HITS*   0.707   0.904   0.830   0.866   0.660   0.612   0.635

49 / 54

slide-76
SLIDE 76

Evaluation by Different Categories (Micro-average)

CAT    EN      EN/ZH   EN/ES
PER    0.832   0.767   0.880
ORG    0.771   0.869   0.798
GPE    0.553   0.892   0.515
KB     0.598   0.794   0.519
NILs   0.853   0.912   0.859

NW: 0.750, 0.828
Web: 0.657, 0.870

50 / 54

slide-77
SLIDE 77

Results on TAC 2011

System                     Micr.   B³ F
Upperbound I               87.5    87.4
Upperbound II              97.6    97.5
Best                       -       84.6
Median                     -       71.6
ML Dis.                    76.8    74.3
ML Dis. + NILs             78.3    75.4
ML Dis. + NILs + Clust.    82.9    80.1

51 / 54

slide-79
SLIDE 79

Current and future work

  • Integration of more linguistic features
  • More accurate relatedness measures
  • Scalability

53 / 54

slide-80
SLIDE 80

Thank you!

NLP group HITS
Mathias Niepert, University of Mannheim

Come to our poster!

Angela Fahrni angela.fahrni@h-its.org

54 / 54