

NIFify: Towards Better Quality Entity Linking Datasets. Henry Rosales-Méndez, Aidan Hogan and Barbara Poblete, University of Chile, {hrosales,ahogan,bpoblete}@dcc.uchile.cl. May 14, 2019. LA-WEB 2019 - 10th Latin American Web Congress.


  1. NIFify: Towards Better Quality Entity Linking Datasets†. Henry Rosales-Méndez, Aidan Hogan and Barbara Poblete, University of Chile, {hrosales,ahogan,bpoblete}@dcc.uchile.cl. May 14, 2019. †LA-WEB 2019 - 10th Latin American Web Congress

  2. Example

  3. Example - Entity Recognition

  4. Example - Entity Disambiguation

  5. Name Variations in Entity Linking: Michael Joseph Jackson, Michael J. Jackson, King of Pop

  6. Name Variations in Entity Linking: Michael Jackson
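
The name-variation slides can be illustrated with a minimal Python sketch in which several surface forms resolve to a single knowledge-base entry. The dictionary, the `link` helper, and the choice of a DBpedia IRI as the target are assumptions made for illustration; the slides do not say how any particular EL system performs this step.

```python
from typing import Optional

# Hypothetical target identifier, chosen only for illustration.
MICHAEL_JACKSON = "http://dbpedia.org/resource/Michael_Jackson"

# Hypothetical surface-form dictionary: each name variation from the
# slides points to the same knowledge-base entry.
SURFACE_FORMS = {
    "michael joseph jackson": MICHAEL_JACKSON,
    "michael j. jackson": MICHAEL_JACKSON,
    "king of pop": MICHAEL_JACKSON,
    "michael jackson": MICHAEL_JACKSON,
}


def link(mention: str) -> Optional[str]:
    """Return the identifier for a mention, or None if it is unknown."""
    # Normalize case and collapse whitespace before the lookup.
    key = " ".join(mention.lower().split())
    return SURFACE_FORMS.get(key)


if __name__ == "__main__":
    for name in ["Michael Joseph Jackson", "King of Pop", "Prince"]:
        print(name, "->", link(name))
```

A plain lookup like this ignores context, so it cannot tell apart two different people who share a name; that is the part a disambiguation step has to handle.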

  7. Are there benchmark datasets to measure Entity Linking results?

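  8. Overview of popular EL datasets

     | Dataset | Mn | Typ | Format |
     |---|---|---|---|
     | MSNBC | ✗ | ✗ | MSNBC |
     | IITB | ✓ | ✗ | IITB |
     | AIDA/CoNLL | ✓ | ✗ | AIDA |
     | ACE2004 | ✗ | ✗ | MSNBC |
     | AQUAINT | ✗ | ✗ | MSNBC |
     | DBpedia Spotlight | ✓ | ✗ | Lexvo |
     | KORE50 | ✓ | ✗ | AIDA |
     | N3-RSS 500 | ✓ | ✗ | NIF |
     | Reuters 128 | ✓ | ✗ | NIF |
     | News-100 | ✓ | ✗ | NIF |
     | Wes2015 | ✓ | ✗ | NIF |
     | SemEval 2015 Task 13 | ✓ | ✗ | SemEval |
     | Thibaudet | ✗ | ✓ | RENDEN |
     | Bergson | ✗ | ✓ | RENDEN |
     | DBpedia Abstracts | ✗ | ✗ | NIF |
     | MEANTIME | ✓ | ✓ | CAT |
     | VoxEL | ✓ | ✗ | NIF |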

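  9. Overview of popular EL datasets

     | Dataset | Mn | Typ | Format |
     |---|---|---|---|
     | MSNBC | ✗ | ✗ | MSNBC |
     | IITB | ✓ | ✗ | IITB |
     | AIDA/CoNLL | ✓ | ✗ | AIDA |
     | ACE2004 | ✗ | ✗ | MSNBC |
     | AQUAINT | ✗ | ✗ | MSNBC |
     | DBpedia Spotlight | ✓ | ✗ | NIF |
     | KORE50 | ✓ | ✗ | NIF |
     | N3-RSS 500 | ✓ | ✗ | NIF |
     | Reuters 128 | ✓ | ✗ | NIF |
     | News-100 | ✓ | ✗ | NIF |
     | Wes2015 | ✓ | ✗ | NIF |
     | SemEval 2015 Task 13 | ✓ | ✗ | SemEval |
     | Thibaudet | ✗ | ✓ | RENDEN |
     | Bergson | ✗ | ✓ | RENDEN |
     | DBpedia Abstracts | ✗ | ✗ | NIF |
     | MEANTIME | ✓ | ✓ | CAT |
     | VoxEL | ✓ | ✗ | NIF |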

  10. Proposal: NIFify, a tool that simultaneously supports the creation, visualization, and validation of NIF datasets, as well as the comparison of EL systems.
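
To make "creation of NIF datasets" concrete, here is a rough Python sketch of what one annotated document might look like when written out as NIF in Turtle. It is not NIFify's code: `to_nif`, `turtle_string`, and the `http://example.org/doc1` base IRI are hypothetical, and the string escaping covers only the most common cases. The `nif:` and `itsrdf:` properties are taken from the NIF 2.0 Core and ITS RDF vocabularies.

```python
NIF = "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#"
ITSRDF = "http://www.w3.org/2005/11/its/rdf#"
XSD = "http://www.w3.org/2001/XMLSchema#"


def turtle_string(value):
    """Quote a Python string as a Turtle literal (basic escapes only)."""
    escaped = (value.replace("\\", "\\\\").replace('"', '\\"')
               .replace("\n", "\\n").replace("\r", "\\r"))
    return f'"{escaped}"'


def to_nif(base, text, mentions):
    """Serialize a document and its (begin, end, kb_iri) mentions as Turtle."""
    def integer(n):
        return f'"{n}"^^xsd:nonNegativeInteger'

    # The context resource spans the whole document text.
    context = f"<{base}#char=0,{len(text)}>"
    lines = [
        f"@prefix nif: <{NIF}> .",
        f"@prefix itsrdf: <{ITSRDF}> .",
        f"@prefix xsd: <{XSD}> .",
        "",
        f"{context} a nif:Context, nif:String ;",
        f"    nif:isString {turtle_string(text)} ;",
        f"    nif:beginIndex {integer(0)} ;",
        f"    nif:endIndex {integer(len(text))} .",
    ]
    # One resource per mention, tied to the context by character offsets.
    for begin, end, kb_iri in mentions:
        lines += [
            "",
            f"<{base}#char={begin},{end}> a nif:Phrase, nif:String ;",
            f"    nif:referenceContext {context} ;",
            f"    nif:anchorOf {turtle_string(text[begin:end])} ;",
            f"    nif:beginIndex {integer(begin)} ;",
            f"    nif:endIndex {integer(end)} ;",
            f"    itsrdf:taIdentRef <{kb_iri}> .",
        ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_nif(
        "http://example.org/doc1",
        "Michael Jackson was a singer.",
        [(0, 15, "http://dbpedia.org/resource/Michael_Jackson")],
    ))
```

Deriving `nif:anchorOf` from the offsets, rather than accepting it as a separate input, keeps the two from drifting apart in generated data.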

  11. Related Work
      • NIF-Dataset creation QRTool BENGAL Automatic NIF Creation Demo Source Code

  12. Related Work
      • NIF-Dataset creation QRTool BENGAL Automatic NIF Creation Demo Source Code
      • NIF-Dataset validation Eaglet Demo Source Code NIF-Dataset Validation

  13. Related Work
      • NIF-Dataset creation QRTool BENGAL Automatic NIF Creation Demo Source Code
      • NIF-Dataset validation Eaglet Demo Source Code NIF-Dataset Validation
      • Benchmarking GERBIL Benchmark Demo Source Code Visualization NIF-Dataset Creation Orbis Benchmark Demo Source Code Visualization

  14. Related Work
      • NIF-Dataset creation QRTool BENGAL Automatic NIF Creation Demo Source Code
      • NIF-Dataset validation Eaglet Demo Source Code NIF-Dataset Validation
      • Benchmarking GERBIL Benchmark Demo Source Code Visualization NIF-Dataset Creation Orbis Benchmark Demo Source Code Visualization NIF-Dat

  15. NIFify - Creation

  16. NIFify - Creation

  17. NIFify - Creation

  18. NIFify - Creation

  19. NIFify - Validation

  20. NIFify - Validation

  21. Errors found in current NIF datasets

      | Dataset | Spelling Error | Link Error | Format Error |
      |---|---|---|---|
      | DBpedia Spotlight | 8 | 23 | 4 |
      | N3-RSS 500 | 1 | 34 | – |
      | Reuters 128 | 4 | 71 | – |
      | News-100 | 9 | 1515 | – |
      | Wes2015 | – | 609 | – |
      | VoxEL | – | 8 | – |

      • https://users.dcc.uchile.cl/~hrosales/dataset_errors.html
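
The slide does not define how each error category is detected, so the following Python sketch only shows two plausible consistency checks on a single annotation: whether the anchor text matches the character offsets, and whether the link is an absolute http(s) IRI. The function `check_annotation` and its messages are hypothetical and are not NIFify's validation rules.

```python
from urllib.parse import urlparse


def check_annotation(text, begin, end, anchor, iri):
    """Return a list of problems found in one annotation (empty if none)."""
    problems = []

    # The offsets must select a non-empty span inside the document.
    if not (0 <= begin < end <= len(text)):
        problems.append("offsets out of range")
    # The recorded anchor must be exactly the text at those offsets.
    elif text[begin:end] != anchor:
        problems.append("anchor does not match offsets")

    # The link target should at least be an absolute http(s) IRI.
    parsed = urlparse(iri)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        problems.append("link is not an absolute http(s) IRI")

    return problems


if __name__ == "__main__":
    doc = "Michael Jackson was a singer."
    good = "http://dbpedia.org/resource/Michael_Jackson"
    print(check_annotation(doc, 0, 15, "Michael Jackson", good))
    print(check_annotation(doc, 0, 15, "Micheal Jackson", good))
    print(check_annotation(doc, 0, 15, "Michael Jackson", "Michael_Jackson"))
```

Checks like these are purely syntactic. Deciding whether a well-formed IRI points to the right entity, or still resolves in the knowledge base, would need a lookup against the knowledge base itself.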

  22. Errors found in current NIF datasets

  23. NIFify - Benchmarking

  24. NIFify - Benchmarking

  25. Conclusion
      • NIFify: Creation/Validation/Visualization/Benchmark
      • Demo: https://users.dcc.uchile.cl/~hrosales/NIFify_v2.html
      • Source Code: https://github.com/henryrosalesmendez/NIFify_v2

  26. NIFify: Towards Better Quality Entity Linking Datasets†. Henry Rosales-Méndez, Aidan Hogan and Barbara Poblete, University of Chile, {hrosales,ahogan,bpoblete}@dcc.uchile.cl. May 14, 2019. †LA-WEB 2019 - 10th Latin American Web Congress
