From Dictionaries to Cross-lingual Lexical Resources



SLIDE 1

Julia Bosque-Gil

From Dictionaries to Cross-lingual Lexical Resources

Guadalupe Aguado-de-Cea, Elena Montiel-Ponsoda, Ilan Kernerman, Noam Ordan

Ontology Engineering Group (OEG), Universidad Politécnica de Madrid (UPM)

{lupe,emontiel}@fi.upm.es

K Dictionaries, Tel Aviv

{ilan, noam}@kdictionaries.com

Acknowledgements: Supported by the Spanish Ministry of Economy and Competitiveness through project 4V (TIN2013-46238-C4-2-R), the Excellence Network ReTeLe (TIN2015-68955-REDT), and the Juan de la Cierva program, and by the Spanish Ministry of Education, Culture and Sports through the FPU program.

SLIDE 2

WHY

MetaForum 2016, 4-5.07.16, Lisboa, Portugal

SLIDE 3

Data in proprietary formats:

  • Isolated
  • Not interoperable
  • Multiple access points
  • Duplicates

SLIDE 4

“Red” (computer network)

  • http://es.wiktionary.org
  • http://rae.es
  • http://www.wikilengua.org/index.php/Terminesp:red
  • http://es.wikipedia.org
  • http://www.apertium.org
  • http://kdictionaries.com

SLIDE 5

Complementary information, but not linked:

  • “Red”: Etymology: Del latín “rete”; Gender: “f”; Definition: “Conjunto de ordenadores o de equipos informáticos conectados entre sí….”
  • “Red”: Translations: “xarxa” (ca), “rede” (ga), …
  • “Red”: Norm: UNE 21302-131; English: network; German: Netzwerk
  • “Red”: Pronunciation: [red]; Grammatical category: sustantivo femenino (feminine noun); Singular: “red”; Plural: “redes”
  • “Red_de_computadores”: Category: redes informáticas; Image
  • “Red”: Translation: “rede” (pt); Grammatical category: n, f; Definition: “conjunto de computadoras con (...)”

*Picture attribution: http://commons.wikimedia.org/wiki/User:Gugerell

SLIDE 6

NIF

NLP Interchange Format

Linguistic Linked Open Data cloud

SLIDE 7

WHAT

SLIDE 8

K Dictionaries multi-language Global Series

  • Spanish set of the K Dictionaries (KD) multi-language Global Series (24 languages)
  • The approach followed in this series is to compile a core vocabulary for each language as a standalone project and have it translated into other languages in further projects.
  • Not biased towards any language; each is represented on its own terms.
  • Translated into another language at a later phase, creating a pair-specific, and thus pair-sensitive, interlingual representation

http://kdictionaries-online.com/

SLIDE 9

K Dictionaries XML proprietary format

<SenseGrp identifier="SE00000730" version="1">
  <Synonym>agitado</Synonym>
  <Definition>que es muy animado</Definition>
  <TranslationCluster identifier="TC00001663" text="que es muy animado" type="def">
    <Locale lang="nl">
      <TranslationBlock>
        <TranslationCtn>
          <Translation>verhit</Translation>
        </TranslationCtn>
        <TranslationCtn>
          <Translation>vurig</Translation>
        </TranslationCtn>
      </TranslationBlock>
    </Locale>
    <Locale lang="no">
      <TranslationBlock>
        <TranslationCtn>
          <Translation>ivrig, oppsatt, opphetet</Translation>
        </TranslationCtn>
      </TranslationBlock>
      […]

Multilingual information for the Spanish headword acalorado (heated): translations into Dutch (nl) and Norwegian (no)
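As an illustration of the data exploration step, the translations in such a sense group could be collected per language with a few lines of Python. This is only a sketch: the sample below is adapted from the slide, the closing tags for the elided parts are an assumption added so that the fragment parses, and splitting comma-separated alternatives is likewise an assumption about how the Norwegian cell should be read.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Sample adapted from the slide. The closing tags that replace the elided
# "[…]" parts are an assumption so that the fragment is well-formed.
SAMPLE = """
<SenseGrp identifier="SE00000730" version="1">
  <Synonym>agitado</Synonym>
  <Definition>que es muy animado</Definition>
  <TranslationCluster identifier="TC00001663" text="que es muy animado" type="def">
    <Locale lang="nl">
      <TranslationBlock>
        <TranslationCtn><Translation>verhit</Translation></TranslationCtn>
        <TranslationCtn><Translation>vurig</Translation></TranslationCtn>
      </TranslationBlock>
    </Locale>
    <Locale lang="no">
      <TranslationBlock>
        <TranslationCtn><Translation>ivrig, oppsatt, opphetet</Translation></TranslationCtn>
      </TranslationBlock>
    </Locale>
  </TranslationCluster>
</SenseGrp>
"""


def translations_by_language(sense_xml: str) -> Dict[str, List[str]]:
    """Map each Locale language code to the translations of the sense definition."""
    root = ET.fromstring(sense_xml.strip())
    result: Dict[str, List[str]] = {}
    for cluster in root.iter("TranslationCluster"):
        # Clusters of type "def" translate the definition; "exmp" clusters
        # translate usage examples and are skipped here.
        if cluster.get("type") != "def":
            continue
        for locale in cluster.iter("Locale"):
            lang = locale.get("lang")
            for translation in locale.iter("Translation"):
                # Assumption: commas separate alternative translations.
                parts = [p.strip() for p in (translation.text or "").split(",")]
                result.setdefault(lang, []).extend(p for p in parts if p)
    return result


print(translations_by_language(SAMPLE))
```

Keeping the result keyed by language code makes it straightforward to create one target-language entry per translation later on.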

SLIDE 10

K Dictionaries XML proprietary format

  […]
  </TranslationCluster>
  <ExampleCtn type="sid" version="1">
    <Example>sesión acalorada</Example>
    <TranslationCluster identifier="TC00001664" text="sesión acalorada" type="exmp">
      <Locale lang="nl">
        <TranslationBlock>
          <TranslationCtn>
            <Translation>vurige zitting</Translation>
          </TranslationCtn>
        </TranslationBlock>
      </Locale>
      <Locale lang="no">
        <TranslationBlock>
          <TranslationCtn>
            <Translation>et opphetet møte</Translation>
          </TranslationCtn>
        </TranslationBlock>
        […]
    </TranslationCluster>

SLIDE 11

HOW

SLIDE 12

5 tasks to migrate resources into linked data (Vila-Suero et al., 2014):

  • 1. Data exploration
  • 2. URI naming strategy definition
  • 3. Modeling
  • 4. RDF generation
  • 5. Linking
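To make task 2 more concrete, a URI naming strategy could be as simple as a function that mints one URI per lexical entry. The sketch below is hypothetical: the `http://example.org/kd/` namespace, the `language/lemma-pos` pattern and the `entry_uri` helper are assumptions for illustration, not the scheme used for the KD data.

```python
from urllib.parse import quote

BASE = "http://example.org/kd/"  # hypothetical namespace, not KD's real one


def entry_uri(lemma: str, lang: str, pos: str) -> str:
    """Mint a URI for a lexical entry from its language, lemma and part of speech."""
    # Spaces become underscores; any remaining non-ASCII or reserved
    # characters are percent-encoded so the result is a valid URI.
    slug = quote(lemma.strip().replace(" ", "_"), safe="")
    return f"{BASE}{lang}/{slug}-{pos}"


print(entry_uri("acalorado", "es", "adj"))
print(entry_uri("sesión acalorada", "es", "n"))
```

Including the language code and part of speech in the path is one way to keep homographs across languages or categories from colliding on the same URI.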

SLIDE 13

5 tasks to migrate resources into linked data (Vila-Suero et al., 2014):

  • 1. Data exploration
  • 2. URI naming strategy definition
  • 3. Modeling
  • 4. RDF generation
  • 5. Linking

SLIDE 14

Modeling: lemon-ontolex

Model for representing linguistic information in RDF(S) (semantics by reference)

“An ontology-based semantic lexicon would leave the semantics to the ontology, focusing instead on providing domain-specific terms and object descriptions in the ontology.” (Buitelaar, 2010)

  • Concise and descriptive (external repositories of linguistic categories)
  • Modular and extensible (5 modules)

SLIDE 15

The origins…

  • LMF, Lexical Markup Framework (ISO 24613): XML
  • LexInfo, LIR: represent lexical information relative to an ontology (OWL)
  • SKOS (W3C Standard): designed for taxonomy/vocabulary representation in RDF

Nowadays… de facto standard for transforming linguistic resources into the linked data format

From McCrae, J., “lemon: The Lexicon Model for Ontologies”, SD-LLOD-15

http://linghub.lider-project.eu/

SLIDE 16

lemon-ontolex

SLIDE 17

The vartrans module

SLIDE 18

Modelling example for acalorado
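The diagram for this slide is not available in the transcript. As a rough sketch of what such a model might look like, the Python below emits N-Triples for acalorado and its two Dutch translations using the ontolex and vartrans vocabularies. The `http://example.org/kd/` namespace, the URI patterns, the single-sense simplification and the use of `skos:definition` on the sense are all assumptions made for illustration; they are not taken from the authors' model.

```python
ONTOLEX = "http://www.w3.org/ns/lemon/ontolex#"
VARTRANS = "http://www.w3.org/ns/lemon/vartrans#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/kd/"  # hypothetical namespace, not KD's real one


def iri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(text, lang):
    """Build a language-tagged literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}'


def entry_triples(lemma, lang, definition=None):
    """Triples for one lexical entry with a canonical form and a single sense."""
    entry = f"{BASE}{lang}/{lemma}"
    form, sense = f"{entry}#form", f"{entry}#sense1"
    triples = [
        (iri(entry), iri(RDF_TYPE), iri(ONTOLEX + "LexicalEntry")),
        (iri(entry), iri(ONTOLEX + "canonicalForm"), iri(form)),
        (iri(form), iri(ONTOLEX + "writtenRep"), literal(lemma, lang)),
        (iri(entry), iri(ONTOLEX + "sense"), iri(sense)),
    ]
    if definition:
        # Assumption: the definition is attached to the sense via skos:definition.
        triples.append((iri(sense), iri(SKOS + "definition"), literal(definition, lang)))
    return sense, triples


def translation_triples(name, source_sense, target_sense):
    """A vartrans:Translation resource linking a source sense to a target sense."""
    trans = f"{BASE}trans/{name}"
    return [
        (iri(trans), iri(RDF_TYPE), iri(VARTRANS + "Translation")),
        (iri(trans), iri(VARTRANS + "source"), iri(source_sense)),
        (iri(trans), iri(VARTRANS + "target"), iri(target_sense)),
    ]


es_sense, triples = entry_triples("acalorado", "es", "que es muy animado")
for dutch in ("verhit", "vurig"):
    nl_sense, nl_triples = entry_triples(dutch, "nl")
    triples += nl_triples
    triples += translation_triples(f"acalorado-es-{dutch}-nl", es_sense, nl_sense)

ntriples = [" ".join(t) + " ." for t in triples]
print("\n".join(ntriples))
```

Because each translation is its own resource linking two senses, a Dutch entry created this way can later be linked to entries from other language pairs without changing the Spanish side.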

SLIDE 19

CONCLUSIONS

SLIDE 20

LD paradigm

  • Innovative way of representing and linking data
  • Keeping language specificities as much as possible, in a cross-lingual graph, bottom-up fashion
  • Retrieving language information by linking translations
  • In the KD case, RDF links contribute to automatic growth of lexical resources, at a different pace
  • “One-stop shop”, rather than many shops

[Figure labels: EN, DE, ES, FR, IT]

SLIDE 21

Thank you!

See you at the poster session…
