http://lcl.uniroma1.it
Babelfying the Multilingual Web: state-of-the-art Disambiguation and Entity Linking of Web pages in 50 languages Roberto Navigli navigli@di.uniroma1.it
ERC Starting Grant MultiJEDI No. 259234 LIDER EU Project No. 610782
Joint work with Andrea Moro and Alessandro Raganato
Andrea Moro, Alessandro Raganato and Roberto Navigli. Entity Linking meets Word Sense Disambiguation: a Unified Approach. Transactions of the Association for Computational Linguistics (TACL), 2, 2014.
Motivation
languages
independently of the source/target language
performance multilingual text understanding
BabelNet (http://babelnet.org)
A wide-coverage multilingual semantic network and encyclopedic dictionary in 50 languages!
Concepts from WordNet
Named Entities and specialized concepts from Wikipedia
Concepts integrated from both resources
New 2.5 version out!
BabelNet as a Multilingual Inventory for
Concepts: "calcio" in Italian can denote different concepts, for example kick, calcium, or soccer.
Named Entities: the text "Mario" can refer to different things, such as the video game character, a soccer player (Gómez), or even a music album.
Calcio / Kick in BabelNet 2.5
Calcio / Calcium in BabelNet 2.5
Calcio / Soccer in BabelNet 2.5
Word Sense Disambiguation in a Nutshell
Input: striker (target word), “Thomas and Mario are strikers playing in Munich” (context)
WSD system, supported by knowledge
Output: sense of target word
Entity Linking in a Nutshell
Input: Thomas (target mention), “Thomas and Mario are strikers playing in Munich” (context)
Entity Linking system, supported by knowledge
Output: Named Entity
Disambiguation and Entity Linking together!
BabelNet is a huge multilingual inventory for both word senses and named entities!
Babelfy: A Joint approach to WSD and EL
Personalized PageRank is the state-of-the-art method for graph-based word sense disambiguation; however, it cannot be run for each new input on huge graphs.
Idea: precompute semantic signatures for the nodes!
The semantic signature of a node is the set of nodes most relevant to it in the graph, computed by using random walk with restart.
Andrea Moro and Alessandro Raganato and Roberto Navigli. 2014. Entity Linking meets Word Sense Disambiguation: a Unified Approach. TACL http://babelfy.org
Semantic Signatures: RWR
restart at the target vertex;
Step 1: Calculate Semantic Signatures
striker
athlete sport soccer player
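The idea of Step 1 can be illustrated with a minimal sketch of random walk with restart on a toy graph. The graph, its edge weights, the restart probability, the number of steps and the visit threshold are all illustrative assumptions, not the values or the implementation used by Babelfy.

```python
import random
from collections import Counter


def semantic_signature(graph, start, restart_prob=0.15, steps=10_000,
                       min_hits=100, seed=0):
    """Approximate the semantic signature of `start` by random walk with restart.

    `graph` maps each node to a dict {neighbour: edge weight}.
    Nodes visited at least `min_hits` times form the signature.
    """
    rng = random.Random(seed)
    hits = Counter()
    current = start
    for _ in range(steps):
        neighbours = graph.get(current, {})
        # Restart at the target vertex with a fixed probability,
        # or when the walk reaches a node without outgoing edges.
        if not neighbours or rng.random() < restart_prob:
            current = start
            continue
        nodes = list(neighbours)
        weights = [neighbours[n] for n in nodes]
        # Otherwise follow an outgoing edge, chosen proportionally to its weight.
        current = rng.choices(nodes, weights=weights, k=1)[0]
        if current != start:
            hits[current] += 1
    return {node for node, count in hits.items() if count >= min_hits}


# Hypothetical toy graph built around the "striker" example.
toy_graph = {
    "striker": {"athlete": 1.0, "soccer player": 2.0, "sport": 1.0},
    "athlete": {"sport": 1.0, "striker": 1.0},
    "soccer player": {"striker": 1.0, "athlete": 1.0, "sport": 1.0},
    "sport": {"athlete": 1.0},
}
signature = semantic_signature(toy_graph, "striker")
```

Because the signatures depend only on the graph and not on the input text, they can be computed once and reused for every new input.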
A Joint approach to WSD and EL
meanings from BabelNet by matching mentions with BabelNet lexicalizations;
signatures;
coherent candidates;
Step 2: Find all possible meanings of words in context
“Thomas and Mario are strikers playing in Munich”
Thomas: Thomas (novel), Seth Thomas, Thomas Müller
Mario: Mario Gómez, Mario (Album), Mario (Character)
strikers: Striker (Movie), Striker (Video Game), striker (Sport)
Munich: Munich (City), FC Bayern Munich, Munich (Song)
Ambiguity!
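Candidate identification can be sketched as a lookup of text fragments in an inventory of lexicalizations. The inventory below is a hypothetical hand-written dictionary standing in for BabelNet, and it is keyed directly by surface form; this is a simplification, so lemmatization and any partial matching are assumed to have happened already.

```python
def find_candidates(tokens, lexicalizations, max_len=3):
    """Return {(start index, fragment): candidate meanings} for every
    fragment of up to `max_len` tokens that matches a lexicalization."""
    candidates = {}
    for start in range(len(tokens)):
        for end in range(start + 1, min(start + max_len, len(tokens)) + 1):
            fragment = " ".join(tokens[start:end])
            meanings = lexicalizations.get(fragment.lower())
            if meanings:
                candidates[(start, fragment)] = meanings
    return candidates


# Hypothetical inventory (assumption): surface form -> candidate meanings.
toy_lexicalizations = {
    "thomas": ["Thomas (novel)", "Seth Thomas", "Thomas Müller"],
    "mario": ["Mario Gómez", "Mario (Album)", "Mario (Character)"],
    "strikers": ["Striker (Movie)", "Striker (Video Game)", "striker (Sport)"],
    "munich": ["Munich (City)", "FC Bayern Munich", "Munich (Song)"],
}

sentence = "Thomas and Mario are strikers playing in Munich"
candidates = find_candidates(sentence.split(), toy_lexicalizations)
```

Every matched fragment here has three candidate meanings, which is exactly the ambiguity the following steps have to resolve.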
Step 3: Connect all the candidate meanings
Thomas and Mario are strikers playing in Munich
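A minimal sketch of this step, assuming that two candidates are connected when they belong to different mentions and one appears in the semantic signature of the other. The candidate lists and signatures below are hypothetical toy data, and `connect_candidates` is an illustrative name rather than part of any Babelfy API.

```python
def connect_candidates(candidates, signatures):
    """Build directed edges between candidate meanings of different mentions.

    candidates: {mention: [meanings]}
    signatures: {meaning: set of meanings in its semantic signature}
    """
    edges = set()
    for mention_a, meanings_a in candidates.items():
        for mention_b, meanings_b in candidates.items():
            if mention_a == mention_b:
                continue  # never connect two candidates of the same mention
            for a in meanings_a:
                for b in meanings_b:
                    if b in signatures.get(a, set()):
                        edges.add(((mention_a, a), (mention_b, b)))
    return edges


# Hypothetical toy data (assumption).
toy_candidates = {
    "Thomas": ["Thomas (novel)", "Thomas Müller"],
    "Mario": ["Mario Gómez", "Mario (Character)"],
    "Munich": ["Munich (City)", "FC Bayern Munich"],
}
toy_signatures = {
    "Thomas Müller": {"FC Bayern Munich", "Mario Gómez"},
    "Mario Gómez": {"FC Bayern Munich", "Thomas Müller"},
    "FC Bayern Munich": {"Thomas Müller", "Mario Gómez", "Munich (City)"},
    "Munich (City)": {"FC Bayern Munich"},
}
edges = connect_candidates(toy_candidates, toy_signatures)
```

In this toy example the football-related candidates end up densely connected, while candidates such as Thomas (novel) stay isolated.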
Step 4: Extract a dense subgraph
Thomas and Mario are strikers playing in Munich
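One way to picture this step is a simplified greedy heuristic: repeatedly take the most ambiguous mention and drop its least connected candidate, until every mention has at most `max_ambiguity` candidates left. This is a sketch under stated assumptions (plain degree as the score, arbitrary tie-breaking, toy data), not the exact densest-subgraph procedure of the TACL paper.

```python
def extract_dense_subgraph(vertices, edges, max_ambiguity=1):
    """Greedy pruning of weakly connected candidates.

    vertices: set of (mention, meaning) pairs
    edges: set of directed ((mention, meaning), (mention, meaning)) pairs
    """
    vertices = set(vertices)
    edges = set(edges)
    while vertices:
        by_mention = {}
        for vertex in vertices:
            by_mention.setdefault(vertex[0], []).append(vertex)
        # Most ambiguous mention first (ties broken by mention name).
        _, group = max(by_mention.items(),
                       key=lambda item: (len(item[1]), item[0]))
        if len(group) <= max_ambiguity:
            break
        # Degree = number of incoming plus outgoing edges.
        degree = {v: 0 for v in group}
        for src, dst in edges:
            if src in degree:
                degree[src] += 1
            if dst in degree:
                degree[dst] += 1
        weakest = min(group, key=lambda v: (degree[v], v))
        vertices.remove(weakest)
        edges = {(s, d) for s, d in edges if weakest not in (s, d)}
    return vertices, edges


# Hypothetical toy graph (assumption).
TM = ("Thomas", "Thomas Müller")
MG = ("Mario", "Mario Gómez")
FCB = ("Munich", "FC Bayern Munich")
toy_vertices = {TM, MG, FCB, ("Thomas", "Thomas (novel)"),
                ("Mario", "Mario (Character)"), ("Munich", "Munich (City)")}
toy_edges = {(TM, MG), (MG, TM), (TM, FCB), (FCB, TM), (MG, FCB), (FCB, MG)}

dense_vertices, dense_edges = extract_dense_subgraph(toy_vertices, toy_edges)
```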
Step 5: Select the most reliable meanings
“Thomas and Mario are strikers playing in Munich”
Thomas: Thomas (novel), Seth Thomas, Thomas Müller
Mario: Mario Gómez, Mario (Album), Mario (Character)
strikers: Striker (Movie), Striker (Video Game), striker (Sport)
Munich: Munich (City), FC Bayern Munich, Munich (Song)
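The final selection can be sketched as follows, assuming each candidate is scored by its degree in the dense subgraph, normalized over the candidates of the same mention, and kept only if the score reaches a threshold. The scoring function, the threshold value and the toy graph are illustrative assumptions.

```python
def select_meanings(vertices, edges, threshold=0.5):
    """Pick, for each mention, the best-connected candidate meaning.

    A mention is left unlinked when none of its candidates has any edge
    or when the best normalized score is below `threshold`.
    """
    degree = {v: 0 for v in vertices}
    for src, dst in edges:
        if src in degree:
            degree[src] += 1
        if dst in degree:
            degree[dst] += 1
    by_mention = {}
    for vertex in vertices:
        by_mention.setdefault(vertex[0], []).append(vertex)
    selected = {}
    for mention, group in by_mention.items():
        total = sum(degree[v] for v in group)
        if total == 0:
            continue  # no evidence at all for this mention
        best = max(group, key=lambda v: (degree[v], v))
        if degree[best] / total >= threshold:
            selected[mention] = best[1]
    return selected


# Hypothetical toy graph (assumption).
TM = ("Thomas", "Thomas Müller")
MG = ("Mario", "Mario Gómez")
ST = ("strikers", "striker (Sport)")
FCB = ("Munich", "FC Bayern Munich")
toy_vertices = {TM, MG, ST, FCB, ("Thomas", "Seth Thomas"),
                ("Munich", "Munich (City)"), ("playing", "play (Theatre)")}
toy_edges = {(TM, MG), (MG, TM), (TM, FCB), (MG, FCB),
             (TM, ST), (MG, ST), (FCB, ST)}

selected = select_meanings(toy_vertices, toy_edges)
```

The threshold lets the system abstain: in the toy data the mention "playing" has no connected candidate, so it is left without a meaning rather than being forced onto a poor one.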
Experimental Results: State-of-the-art multilingual disambiguation
Experimental Results
Experimental Results: State-of-the-art Entity Linking
Thanks or…
(grazie)
http://babelnet.org http://babelfy.org
Roberto Navigli