NAME MATCHING WITH PHYLOGENIES
Nicholas Andrews, Jason Eisner, Mark Dredze
Martin Freeman, M Freeman, Martin Freedman, Marty Freemen, Marty Freeman, Martin F
Entity Linking, Coref Resolution
... Mitt Romney, President Barack Obama, Barack Obama, Secretary of State Hillary Clinton, Hillary Clinton, Barack Obama, Clinton, Obama ...
backward
common name variants to the correct page (unambiguously)
clearly not names (e.g. numbers)
Thomas Pynchon, Jr.
Thomas R. Pynchon
Thomas Pynchon Jr.
Thomas R. Pynchon Jr.
Thomas Ruggles Pynchon Jr.
Khawaja Gharibnawaz Muinuddin Hasan Chisty
Khwaja Gharib Nawaz
Khwaja Muin al-Din Chishti
Ghareeb Nawaz
Khwaja gharibnawaz
Muinuddin Chishti
Khawaja Gharibnawaz Muinuddin Hasan Chisty
Khwaja Gharib Nawaz
Khwaja Muin al-Din Chishti
Ghareeb Nawaz
Khwaja gharibnawaz
Khwaja Moinuddin Chishti
Muinuddin Chishti

Thomas Ruggles Pynchon, Jr.
Thomas Ruggles Pynchon Jr.
Thomas R. Pynchon, Jr.
Thomas R. Pynchon Jr.
Thomas Pynchon, Jr.
Thomas R. Pynchon
Thomas Pynchon Jr.
[Figure: MRR (axis 0.10 to 0.90) for Jaro-Winkler, Levenshtein, 10 entities, 10+unlabeled, Unsupervised, and 1500 entities]
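The comparison above scores each method by MRR (mean reciprocal rank), with Levenshtein distance as one of the baselines. As a rough illustration of how such a baseline could be evaluated, the sketch below ranks candidate names by edit distance to a query and averages the reciprocal rank of the correct candidate. The function names, the candidate list, and the query-to-answer pairs are hypothetical, chosen from name variants shown earlier in the slides; this is not the authors' evaluation code or data.

```python
from typing import Dict, List


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insert, delete, and substitute."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def rank_candidates(query: str, candidates: List[str]) -> List[str]:
    """Sort candidates by edit distance to the query (ties keep input order)."""
    return sorted(candidates, key=lambda c: levenshtein(query, c))


def mean_reciprocal_rank(gold: Dict[str, str], candidates: List[str]) -> float:
    """Average over queries of 1 / (rank of the correct candidate)."""
    total = 0.0
    for query, answer in gold.items():
        ranking = rank_candidates(query, candidates)
        total += 1.0 / (ranking.index(answer) + 1)
    return total / len(gold)


# Hypothetical toy data: canonical names and noisy queries (assumed pairing).
CANDIDATES = ["Martin Freeman", "Thomas Pynchon", "Barack Obama"]
GOLD = {"Marty Freemen": "Martin Freeman", "Obama": "Barack Obama"}

if __name__ == "__main__":
    print(rank_candidates("Marty Freemen", CANDIDATES))
    print(mean_reciprocal_rank(GOLD, CANDIDATES))
```

Plain edit distance treats every character edit alike, so it cannot tell a common nickname or abbreviation from an arbitrary typo; a learned model of name variation is meant to address that gap.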
Nicholas Andrews, Jason Eisner, Mark Dredze. Name Phylogeny: A Generative Model of String Variation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2012.