Entity Linking
Laura Dietz, dietz@cs.umass.edu
University of Massachusetts

Problem: Entity Linking
Given a query mention in a source document, identify which Wikipedia entity it represents, or return NIL if no entity matches.
Query → Entity or NIL
Example Query:
Northern Ireland has a population of about one and a half million people. At the time of partition in 1921 Protestants / unionists had a two-thirds majority in the [...] James Craig, described the state as having ‘a Protestant Parliament for a Protestant people.’ The state effectively discriminated against Catholics in housing, jobs, and political representation.
Source: http://cain.ulst.ac.uk/othelem/incorepaper09.htm
Query mention: Northern Ireland. Entity: Northern Ireland
Query mention: James Craig. Entity: James Craig
Symbol Notation:
Q: Query String (here: James Craig)
V: Name Variants (within-doc coreference)
M: Neighbor Mentions (NER tagger; alternative: mention detection)
S: Sentence (term models)
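The Q/V/M/S notation can be pictured as a small record per query mention. The following is a minimal sketch, not the author's implementation: `QueryContext` and `build_context` are hypothetical names, the substring test is only a crude stand-in for within-document coreference, and `doc_mentions` stands in for the output of an NER tagger.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class QueryContext:
    """Context of one query mention, in the slide's Q/V/M/S notation."""
    query: str                                           # Q: query string
    variants: List[str] = field(default_factory=list)    # V: name variants
    neighbors: List[str] = field(default_factory=list)   # M: neighbor mentions
    sentence: str = ""                                   # S: sentence with the mention


def build_context(query, sentences, doc_mentions):
    """Hypothetical helper; doc_mentions is assumed to come from an NER tagger."""
    # S: first sentence that contains the query string.
    sentence = next((s for s in sentences if query in s), "")
    # V: crude stand-in for coreference: mentions that contain the
    # query string or are contained in it.
    variants = [m for m in doc_mentions
                if m != query and (query in m or m in query)]
    # M: every other mention in the document.
    neighbors = [m for m in doc_mentions
                 if m != query and m not in variants]
    return QueryContext(query, variants, neighbors, sentence)


# Illustrative input only; the mention list is made up for this sketch.
ctx = build_context(
    "James Craig",
    ["James Craig, described the state as having a Protestant Parliament."],
    ["James Craig", "Sir James Craig", "Northern Ireland", "Catholics"],
)
print(ctx.variants, ctx.neighbors)
```

A real system would replace the substring test with a coreference resolver, which is why V and M are kept as separate fields here.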
[Figure: query mention James Craig and the entity James Craig, 1st Viscount Craigavon, with related names: Ulster Unionists, Northern Ireland, Prime Minister of Northern Ireland, Sir James Craig, 1st Viscount Craigavon, Irish Unionist, Unionism in Ireland, Ulster]
Candidate entity: James Craig, 1st Viscount Craigavon
  title: James Craig, 1st Viscount Craigavon
  anchor text: Sir James Craig's, Craig Administration
  disambiguation: James Craig
  freebase name: Lord Craigavon
Candidate entity: James Craig (actor)
  title: James Craig (actor)
  anchor text: James Craig, James Craig in
  disambiguation: James Craig
  freebase name: James Craig (actor)
Features: Name variants, Document Terms, Links, Popularity ...
Example features for the query mention James Craig:
  is exact title match?
  is disambiguation match?
  inlinks through this name
  is approx match?
  TF-IDF similarity score
Query + Candidate Entities: feature vector for supervised re-ranking and classification
Re-ranking of the Candidate Entities
NIL classification: Is it similar enough to be a match? NIL?
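The feature list above can be turned into a feature vector per (query, candidate) pair, followed by re-ranking and a NIL decision. This is a minimal sketch under several assumptions: the entity dictionaries, the anchor counts, the hand-set weights, and the fixed threshold are all made up for illustration. The slides describe learned weights and a NIL classifier, and the TF-IDF feature is left out here.

```python
def mention_entity_features(query, variants, entity):
    """Features between a query (with name variants) and one candidate entity.

    entity is a dict with the hypothetical keys title, disambiguation,
    and anchor_counts (anchor text -> number of inlinks using it).
    """
    names = [query] + list(variants)
    title = entity["title"]
    return {
        "exact_title_match": float(any(n == title for n in names)),
        "disambiguation_match": float(
            any(n in entity["disambiguation"] for n in names)),
        "inlinks_through_name": float(
            sum(entity["anchor_counts"].get(n, 0) for n in names)),
        "approx_match": float(any(n.lower() in title.lower() for n in names)),
    }


def score(features, weights):
    """Weighted sum of feature values."""
    return sum(weights[name] * value for name, value in features.items())


def link(query, variants, candidates, weights, nil_threshold):
    """Re-rank candidates; return the best title, or "NIL" below the threshold."""
    scored = [(score(mention_entity_features(query, variants, e), weights),
               e["title"]) for e in candidates]
    if not scored:
        return "NIL"
    best_score, best_title = max(scored)
    return best_title if best_score >= nil_threshold else "NIL"


# Illustrative knowledge base entries; the anchor counts are invented.
KB = [
    {"title": "James Craig, 1st Viscount Craigavon",
     "disambiguation": ["James Craig"],
     "anchor_counts": {"Sir James Craig": 12, "James Craig": 3}},
    {"title": "James Craig (actor)",
     "disambiguation": ["James Craig"],
     "anchor_counts": {"James Craig": 5}},
]
# Hand-set weights, standing in for weights that would be learned.
WEIGHTS = {"exact_title_match": 2.0, "disambiguation_match": 1.0,
           "inlinks_through_name": 0.1, "approx_match": 0.5}

print(link("James Craig", ["Sir James Craig"], KB, WEIGHTS, nil_threshold=1.0))
print(link("John Doe", [], KB, WEIGHTS, nil_threshold=1.0))
```

In this toy setup the name variant is what separates the two candidates, because only one entity receives inlinks through "Sir James Craig".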
Example Query (same text as the first example). Query mention: James Craig
Query: James Craig + Name Variants + Neighbors + Sentence
Mentions in the query document: James Craig, Northern Ireland, Catholics
Candidate entity: American Catholic Church (not compatible)
[Figure: query mention James Craig with related names: Ulster Unionists, Northern Ireland, Prime Minister of Northern Ireland, Sir James Craig, 1st Viscount Craigavon, James Craig, 1st Viscount Craigavon, Irish Unionist, Unionism in Ireland, Ulster]
Mentions: James Craig, Northern Ireland, Catholics
Neighbors of the candidate entities: Ulster Unionists, Northern Ireland, Prime Minister of Northern Ireland; Nashville, Tennessee; B-Movies
Method 4: Requires iterative optimization
Method 5: Integrate over [...]. Can be solved inside a search engine
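To illustrate why a joint assignment calls for iterative optimization, here is a generic coordinate-ascent sketch. It is not the algorithm of any of the cited Method 4 papers: the function name `joint_assign`, the local scores, and the relatedness table are assumptions made up for this example.

```python
def joint_assign(candidates, local, related, max_sweeps=10):
    """Assign one entity per mention so that local fit plus pairwise
    compatibility is (locally) maximized.

    candidates: mention -> list of candidate entities
    local:      (mention, entity) -> mention-to-entity score
    related:    frozenset({entity_a, entity_b}) -> compatibility score
    """
    # Start from the best candidate per mention, ignoring the other mentions.
    assignment = {}
    for mention, cands in candidates.items():
        assignment[mention] = max(cands, key=lambda e: local[(mention, e)])

    for _ in range(max_sweeps):
        changed = False
        for mention, cands in candidates.items():
            others = [assignment[o] for o in assignment if o != mention]

            def total(entity):
                coherence = sum(related.get(frozenset((entity, other)), 0.0)
                                for other in others)
                return local[(mention, entity)] + coherence

            best = max(cands, key=total)
            if best != assignment[mention]:
                assignment[mention] = best
                changed = True
        if not changed:  # converged: a full sweep changed nothing
            break
    return assignment


# Invented scores: taken alone, the actor is slightly preferred.
candidates = {
    "James Craig": ["James Craig (actor)", "James Craig, 1st Viscount Craigavon"],
    "Northern Ireland": ["Northern Ireland"],
}
local = {
    ("James Craig", "James Craig (actor)"): 0.6,
    ("James Craig", "James Craig, 1st Viscount Craigavon"): 0.5,
    ("Northern Ireland", "Northern Ireland"): 1.0,
}
related = {
    frozenset(("James Craig, 1st Viscount Craigavon", "Northern Ireland")): 1.0,
}
result = joint_assign(candidates, local, related)
print(result["James Craig"])
```

Each mention's best entity depends on the current choices for the other mentions, so the loop has to sweep until nothing changes. This dependency is what Method 5 avoids by integrating over the neighbors instead.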
Identify context of query mention
Preprocessing: build a special KB Index
Search Index with special Fields
[Figure: example contents of the index fields: Ulster Unionists, Northern Ireland, Prime Minister of Northern Ireland]
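A search index with special fields can be sketched as one inverted index per field. The following in-memory version is only an illustration: the class `FieldedIndex`, the field names, and the link titles in the example entries are assumptions, and a real system would use a search engine's fielded index instead.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase word tokens; punctuation is dropped."""
    return re.findall(r"\w+", text.lower())


class FieldedIndex:
    """Toy inverted index with one postings table per field."""

    def __init__(self):
        # postings[field][term] -> set of entity ids
        self.postings = defaultdict(lambda: defaultdict(set))

    def add(self, entity_id, fields):
        """fields: field name -> list of strings stored in that field."""
        for field_name, values in fields.items():
            for value in values:
                for term in tokenize(value):
                    self.postings[field_name][term].add(entity_id)

    def search(self, field_name, phrase):
        """Entities whose field contains every term of the phrase."""
        terms = tokenize(phrase)
        if not terms:
            return set()
        matches = [self.postings[field_name].get(t, set()) for t in terms]
        return set.intersection(*matches)


index = FieldedIndex()
# Titles, anchor text and names follow the slides; link titles are invented.
index.add("craigavon", {
    "title": ["James Craig, 1st Viscount Craigavon"],
    "anchor_text": ["Sir James Craig's", "Craig Administration"],
    "disambiguation": ["James Craig"],
    "freebase_name": ["Lord Craigavon"],
    "inlink_titles": ["Northern Ireland", "Ulster Unionists"],
})
index.add("actor", {
    "title": ["James Craig (actor)"],
    "anchor_text": ["James Craig"],
    "disambiguation": ["James Craig"],
    "freebase_name": ["James Craig (actor)"],
    "inlink_titles": ["B-Movies"],
})

print(index.search("disambiguation", "James Craig"))
print(index.search("inlink_titles", "Northern Ireland"))
```

Keeping the fields separate lets the query string be matched against names while neighbor mentions are matched against link titles.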
Neighbor features (query mention: James Craig; neighbor mention: Northern Ireland):
  neighbor occurs in text?
  neighbor in inlink titles?
  neighbor in outlink titles?
  is approx match?
  TF-IDF similarity score
Machine learn the feature weights
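The neighbor features and the learned weights can be sketched as follows. This is an illustration under stated assumptions: the entity dictionaries and their link titles are invented, only the first three features are computed, and a simple perceptron stands in for whatever learner is actually used.

```python
def neighbor_features(neighbor, entity):
    """Features between one neighbor mention and one candidate entity.

    entity is a dict with the hypothetical keys text, inlink_titles,
    and outlink_titles.
    """
    n = neighbor.lower()
    return {
        "neighbor_in_text": float(n in entity["text"].lower()),
        "neighbor_in_inlink_titles": float(
            any(n == t.lower() for t in entity["inlink_titles"])),
        "neighbor_in_outlink_titles": float(
            any(n == t.lower() for t in entity["outlink_titles"])),
    }


def score(features, weights):
    return sum(weights.get(name, 0.0) * value
               for name, value in features.items())


def train_perceptron(examples, epochs=10):
    """examples: list of (features, label) with label +1 (correct) or -1."""
    weights = {}
    for _ in range(epochs):
        for features, label in examples:
            if label * score(features, weights) <= 0:  # misclassified
                for name, value in features.items():
                    weights[name] = weights.get(name, 0.0) + label * value
    return weights


# Illustrative entities; the link titles are made up for this sketch.
politician = {
    "text": "James Craig was the first Prime Minister of Northern Ireland.",
    "inlink_titles": ["Northern Ireland", "Ulster Unionists"],
    "outlink_titles": ["Northern Ireland", "Ulster"],
}
actor = {
    "text": "James Craig was an American actor born in Nashville, Tennessee.",
    "inlink_titles": ["B-Movies"],
    "outlink_titles": ["Nashville, Tennessee"],
}

examples = [
    (neighbor_features("Northern Ireland", politician), +1),
    (neighbor_features("Northern Ireland", actor), -1),
]
weights = train_perceptron(examples)
print(score(neighbor_features("Northern Ireland", politician), weights))
print(score(neighbor_features("Northern Ireland", actor), weights))
```

With these two training examples the learned weights favor the politician, since only that entity shares the neighbor Northern Ireland in its text and links.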
Query mention features (query mention: James Craig):
  is exact title match?
  is disambiguation match?
  inlinks through this name
  is approx match?
  TF-IDF similarity score
Machine learn the feature weights
Issue the Entity Linking query against the special KB Index (IR; actually: structured matching)
Select the entity that maximizes: [...]
Example Query: ABC shot "Lost" in Australia
Query mention: ABC
True entity: American Broadcasting Company
Context "Australia" and mention similarity will point instead to Australian Broadcasting Corporation
Approach: Identify misleading neighbors (variant of M5)
[Figure: average recall (y-axis) versus cutoff rank k (x-axis, k = 5 to 20), one panel per year: 2009, 2010, 2011, 2012. Curves: Q, QV, QVM_nrm, QVM_nrm LTR]
Q: Query String, V: Name Variants, M: Neighbor Mentions, S: Sentence
M1 (Popularity); variant of M5 (Joint Retrieval); M5 + M2 (JR + ML)
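The measure plotted above, average recall at cutoff rank k, can be computed as sketched below. This assumes every query has exactly one true entity; the function name, the rankings, and the gold labels are made up for illustration and are not results from the slides.

```python
def average_recall_at_k(rankings, gold, k):
    """Fraction of queries whose true entity appears in the top k candidates.

    rankings: query id -> ranked list of candidate entities
    gold:     query id -> the single true entity
    """
    if not rankings:
        raise ValueError("need at least one query")
    hits = sum(1 for query_id, ranked in rankings.items()
               if gold[query_id] in ranked[:k])
    return hits / len(rankings)


# Invented rankings for three query mentions.
rankings = {
    "James Craig": ["James Craig (actor)",
                    "James Craig, 1st Viscount Craigavon"],
    "ABC": ["Australian Broadcasting Corporation",
            "American Broadcasting Company"],
    "Northern Ireland": ["Northern Ireland"],
}
gold = {
    "James Craig": "James Craig, 1st Viscount Craigavon",
    "ABC": "American Broadcasting Company",
    "Northern Ireland": "Northern Ireland",
}

for k in (1, 2):
    print(k, average_recall_at_k(rankings, gold, k))
```

Recall at a cutoff shows how often the candidate retrieval step keeps the true entity within reach of a later re-ranking step.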
M1 Popularity / Keyphraseness:
  Mihalcea et al. In CIKM, 2007. "Wikify!: linking documents to encyclopedic knowledge."
M2 Machine Learn Mention-to-Entity Similarity:
  Bunescu et al. In EACL, 2006. "Using Encyclopedic Knowledge for Named Entity Disambiguation."
[...] "Entity disambiguation for knowledge base population."
M4 Joint Assignment:
  Silviu Cucerzan. In EMNLP-CoNLL, 2007. "Large-scale named entity disambiguation based on Wikipedia data."
  Ratinov et al. In ACL, 2011. "Local and global algorithms for disambiguation to Wikipedia."
  Entity-to-Entity Features: Ceccarelli et al. In CIKM, 2013. "Learning relatedness measures for entity linking."
M5 Joint Retrieval Model:
  Dalton et al. In OAIR, 2013. "A neighborhood relevance model for entity linking."
more:
  http://nlp.cs.rpi.edu/kbp/2014/elreading.html
  http://www.mendeley.com/groups/3339761/entity-linking-and-retrieval/
List of toolkits: http://nlp.cs.rpi.edu/kbp/2014/tools.html
Several Online Demos:
  UIUC Wikifier http://cogcomp.cs.illinois.edu/demo/wikify/
  TagMe! http://tagme.di.unipi.it/
  AIDA https://gate.d5.mpi-inf.mpg.de/webaida/
TAC KBP Entity Linking Task http://nlp.cs.rpi.edu/kbp/2014/
SIGIR Entity Recognition and Disambiguation Challenge http://web-ngram.research.microsoft.com/erd2014/
INEX 2014 Tweet Contextualization Track https://inex.mmci.uni-saarland.de/tracks/qa/

Questions?
email: dietz@cs.umass.edu
web: http://ciir.cs.umass.edu/~dietz/