
  1. Babelfying the Multilingual Web: state-of-the-art Disambiguation and Entity Linking of Web pages in 50 languages. Roberto Navigli, navigli@di.uniroma1.it, http://lcl.uniroma1.it. ERC Starting Grant MultiJEDI No. 259234, ERC StG: Multilingual Joint Word Sense Disambiguation (MultiJEDI). LIDER EU Project No. 610782.

  2. Joint work with Andrea Moro and Alessandro Raganato. A. Moro, A. Raganato, R. Navigli. Entity Linking meets Word Sense Disambiguation: a Unified Approach. Transactions of the Association for Computational Linguistics (TACL), 2, 2014. Babelfying the Multilingual Web: state-of-the-art Disambiguation and Entity Linking. Andrea Moro and Roberto Navigli.

  3. 2014.05.08. BabelNet & friends. Roberto Navigli.

  5. Motivation
     • Domain-specific Web content is available in many languages
     • Information should be extracted and processed independently of the source/target language
     • This could be done automatically by means of high-performance multilingual text understanding

  6. BabelNet (http://babelnet.org): a wide-coverage multilingual semantic network and encyclopedic dictionary in 50 languages! It combines concepts from WordNet, Named Entities and specialized concepts from Wikipedia, and concepts integrated from both resources.

  7. BabelNet (http://babelnet.org)

  8. New 2.5 version out!
     • 50 languages covered (including Latin!)
     • 21 million textual definitions
     • 67M word senses and named entities!
     • 1.1B RDF triples available via SPARQL endpoint
     • Seamless integration of:
       • WordNet 3.0
       • Wikipedia
       • Wiktionary
       • OmegaWiki: a collaborative multilingual dictionary
       • Open Multilingual WordNet [Bond and Foster, 2013]
     • Translations for all open-class parts of speech

  9. BabelNet as a Multilingual Inventory for:
     • Concepts: "calcio" in Italian can denote different concepts
     • Named Entities: the text "Mario" can refer to different things, such as the video game character, a soccer player (Gómez), or even a music album

  10. Calcio / Kick in BabelNet 2.5

  11. Calcio / Calcium in BabelNet 2.5

  12. Calcio / Soccer in BabelNet 2.5

  13. Word Sense Disambiguation in a Nutshell. Input: “striker” (target word) and “Thomas and Mario are strikers playing in Munich” (context). A WSD system, drawing on knowledge, outputs the sense of the target word.

  14. Entity Linking in a Nutshell. Input: “Thomas” (target mention) and “Thomas and Mario are strikers playing in Munich” (context). An Entity Linking system, drawing on knowledge, outputs the Named Entity.

  15. Disambiguation and Entity Linking together! BabelNet is a huge multilingual inventory for both word senses and named entities!

  17. Babelfy: A Joint approach to WSD and EL. Personalized PageRank is the state-of-the-art method for graph-based word sense disambiguation; however, it cannot be run for each new input on huge graphs. Idea: precompute semantic signatures for the nodes! The semantic signature of a node is the set of nodes most relevant to it in the graph, computed by using random walk with restart. Andrea Moro, Alessandro Raganato and Roberto Navigli. 2014. Entity Linking meets Word Sense Disambiguation: a Unified Approach. TACL. http://babelfy.org

  18. Semantic Signatures: RWR
      1. Start from one target vertex of the semantic network;
      2. Randomly select a neighbor of the current vertex or restart at the target vertex;
      3. Keep counting the hitting frequencies;
      4. Take the most visited vertices.
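
The four RWR steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy graph, the function name `semantic_signature`, and the parameters `steps`, `restart_prob`, and `top_k` are hypothetical and do not come from the slides or from BabelNet.

```python
import random
from collections import Counter

# Hypothetical toy semantic network as adjacency lists (not real BabelNet data).
TOY_GRAPH = {
    "striker": ["soccer player", "offside"],
    "soccer player": ["striker", "athlete", "sport"],
    "offside": ["striker", "sport"],
    "athlete": ["soccer player", "sport"],
    "sport": ["soccer player", "offside", "athlete"],
    "calcium": ["chemical element"],
    "chemical element": ["calcium"],
}


def semantic_signature(graph, target, steps=10_000, restart_prob=0.15,
                       top_k=3, seed=0):
    """Approximate the semantic signature of `target` by random walk with restart.

    All parameter values are illustrative assumptions, not the authors' settings.
    """
    rng = random.Random(seed)
    visits = Counter()
    current = target  # step 1: start from the target vertex
    for _ in range(steps):
        neighbors = graph.get(current, [])
        # Step 2: restart at the target (also when stuck), else move to a neighbor.
        if not neighbors or rng.random() < restart_prob:
            current = target
        else:
            current = rng.choice(neighbors)
            visits[current] += 1  # step 3: count hitting frequencies
    visits.pop(target, None)  # the target itself is not part of its signature
    # Step 4: keep the most visited vertices.
    return [vertex for vertex, _ in visits.most_common(top_k)]


signature = semantic_signature(TOY_GRAPH, "striker")
print(signature)
```

Because the walk keeps restarting at the target, vertices close to it accumulate most of the visits, while vertices in an unconnected part of the graph are never reached.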

  19. Step 1: Calculate Semantic Signatures offside striker athlete soccer player sport

  20. A Joint approach to WSD and EL
      1. Given an input text, select all the possible candidate meanings from BabelNet by matching mentions with BabelNet lexicalizations;
      2. Connect all the candidate meanings by using semantic signatures;
      3. Extract a dense subgraph containing semantically coherent candidates;
      4. Select the most connected candidate for each fragment of text.
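
A minimal sketch of steps 1, 2, and 4 on the running example follows. It is not the Babelfy implementation: `LEXICON` and `SIGNATURES` are hypothetical hand-written toy data standing in for BabelNet lexicalizations and precomputed semantic signatures, all function names are illustrative, and the dense-subgraph step (3) is skipped for brevity.

```python
import re
from collections import defaultdict
from itertools import combinations

# Hypothetical toy inventory: surface form -> candidate meanings.
LEXICON = {
    "thomas": ["Thomas Müller", "Seth Thomas", "Thomas (novel)"],
    "mario": ["Mario Gómez", "Mario (Character)", "Mario (Album)"],
    "strikers": ["striker (Sport)", "Striker (Video Game)", "Striker (Movie)"],
    "munich": ["Munich (City)", "FC Bayern Munich", "Munich (Song)"],
}

# Hypothetical precomputed semantic signatures: meaning -> related meanings.
SIGNATURES = {
    "Thomas Müller": {"FC Bayern Munich", "striker (Sport)", "Mario Gómez"},
    "Mario Gómez": {"FC Bayern Munich", "striker (Sport)", "Thomas Müller"},
    "striker (Sport)": {"Thomas Müller", "Mario Gómez"},
    "FC Bayern Munich": {"Thomas Müller", "Mario Gómez", "Munich (City)"},
    "Mario (Character)": {"Striker (Video Game)"},
}


def find_candidates(text):
    """Step 1: match tokens of the text against the toy lexicon."""
    tokens = re.findall(r"\w+", text.lower())
    return {t: LEXICON[t] for t in tokens if t in LEXICON}


def connect(candidates):
    """Step 2: link candidates of different mentions via semantic signatures."""
    degree = defaultdict(int)
    for cands_a, cands_b in combinations(candidates.values(), 2):
        for a in cands_a:
            for b in cands_b:
                if b in SIGNATURES.get(a, ()) or a in SIGNATURES.get(b, ()):
                    degree[a] += 1
                    degree[b] += 1
    return degree


def disambiguate(text):
    """Step 4: pick the most connected candidate for each mention."""
    candidates = find_candidates(text)
    degree = connect(candidates)
    return {m: max(cands, key=lambda c: degree[c])
            for m, cands in candidates.items()}


print(disambiguate("Thomas and Mario are strikers playing in Munich"))
```

Counting edges only between candidates of different mentions keeps two readings of the same word from reinforcing each other.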

  21. Step 2: Find all possible meanings of words in context. “Thomas and Mario are strikers playing in Munich”
      • Thomas: Seth Thomas, Thomas Müller, Thomas (novel)
      • Mario: Mario (Character), Mario (Album), Mario Gómez
      • strikers: striker (Sport), Striker (Video Game), Striker (Movie)
      • Munich: Munich (City), FC Bayern Munich, Munich (Song)

  22. Step 2: Find all possible meanings of words in context. Ambiguity! Each mention in “Thomas and Mario are strikers playing in Munich” has several candidate meanings.

  23. Step 3: Connect all the candidate meanings. “Thomas and Mario are strikers playing in Munich”

  24. Step 4: Extract a dense subgraph. “Thomas and Mario are strikers playing in Munich”
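
One simple way to picture dense-subgraph extraction is greedy pruning: repeatedly take the most ambiguous mention and drop its least connected candidate. The sketch below is a simplified stand-in, not the heuristic used in Babelfy; the function names, the `max_ambiguity` parameter, and the toy candidates and edges are all assumptions made for illustration.

```python
def degree(vertex, alive, edges):
    """Number of edges touching `vertex` whose endpoints are both still alive."""
    return sum(1 for e in edges if vertex in e and e <= alive)


def prune_to_dense_subgraph(candidates, edges, max_ambiguity=1):
    """Greedily drop weakly connected candidates (simplified illustration)."""
    remaining = {m: list(cs) for m, cs in candidates.items()}
    while remaining:
        # Pick the mention that still has the most candidate meanings.
        mention = max(remaining, key=lambda m: len(remaining[m]))
        if len(remaining[mention]) <= max_ambiguity:
            break
        alive = {c for cs in remaining.values() for c in cs}
        # Remove its candidate with the fewest connections in the current graph.
        weakest = min(remaining[mention], key=lambda c: degree(c, alive, edges))
        remaining[mention].remove(weakest)
    return remaining


# Hypothetical toy input: candidate meanings per mention and undirected edges.
toy_candidates = {
    "Thomas": ["Thomas Müller", "Seth Thomas"],
    "Mario": ["Mario Gómez", "Mario (Character)"],
    "strikers": ["striker (Sport)", "Striker (Video Game)"],
}
toy_edges = {
    frozenset(pair) for pair in [
        ("Thomas Müller", "Mario Gómez"),
        ("Thomas Müller", "striker (Sport)"),
        ("Mario Gómez", "striker (Sport)"),
        ("Mario (Character)", "Striker (Video Game)"),
    ]
}

print(prune_to_dense_subgraph(toy_candidates, toy_edges))
```

Degrees are recomputed after every removal, so a candidate that was supported only by an already discarded meaning loses that support.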

  25. Step 5: Select the most reliable meanings. “Thomas and Mario are strikers playing in Munich”
      • Thomas: Seth Thomas, Thomas Müller, Thomas (novel)
      • Mario: Mario (Character), Mario (Album), Mario Gómez
      • strikers: striker (Sport), Striker (Video Game), Striker (Movie)
      • Munich: Munich (City), FC Bayern Munich, Munich (Song)

  26. Experimental Results: State-of-the-art multilingual disambiguation

  27. Experimental Results

  28. Experimental Results: State-of-the-art Entity Linking

  29. Integrating Syntactic and Semantic Analysis into the Open Information Extraction Paradigm Andrea Moro and Roberto Navigli
