Collaborative NLP-aided ontology modelling
Chiara Ghidini Marco Rospocher
ghidini@fbk.eu rospocher@fbk.eu
International Winter School on Language and Data/Knowledge Technologies TrentoRISE – Trento, 24th February 2012
Many definitions of an ontology exist in the literature; here we refer to an ontology as a “formal specification of the
Ontologies contain a formal explicit description of:
Concepts (aka classes)
Relations (aka roles)
Individuals (aka instances)
Classes (and relations) can be ordered in taxonomies using the
(*) [Gruber, T.R. (1993). A Translation Approach to Portable Ontology
[Figure: example ontology with two classes, People (Andrew, Charles, Patty) and Town (Rome, Milan, London, Paris), connected by the relations hasWife, hasBrother and livesIn.]
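The example above can be sketched in a few lines of Python. This is only an illustration, assuming a plain dictionary-and-triples representation. The slide text preserves the names of the classes, individuals and relations but not which individual is linked to which, so the assertions in `triples` are hypothetical, as are the helper names `class_of` and `objects`.

```python
# Classes (concepts) mapped to their individuals (instances).
classes = {
    "People": {"Andrew", "Charles", "Patty"},
    "Town": {"Rome", "Milan", "London", "Paris"},
}

# Relation (role) assertions as (subject, relation, object) triples.
# ASSUMPTION: these pairings are illustrative; the original figure's
# actual links are not recoverable from the slide text.
triples = [
    ("Andrew", "hasWife", "Patty"),
    ("Andrew", "hasBrother", "Charles"),
    ("Andrew", "livesIn", "Rome"),
    ("Charles", "livesIn", "London"),
]


def class_of(individual):
    """Return the name of the class an individual belongs to, or None."""
    for name, members in classes.items():
        if individual in members:
            return name
    return None


def objects(subject, relation):
    """Return every object linked to `subject` through `relation`."""
    return [o for s, r, o in triples if s == subject and r == relation]
```

Keeping individuals, classes and relations in separate structures mirrors the three kinds of description listed earlier: concepts, relations and individuals.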
Classes (and relations) can be ordered in taxonomies using the
Example: biological classification
Same for roles
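Ordering classes in a taxonomy can be sketched as a chain of direct "is-a" links that is followed transitively. The snippet below is a minimal illustration using a heavily simplified slice of the biological classification (several ranks are omitted). The relation name `subclass_of` and the helper functions are assumptions made for this sketch, since the slide text naming the relation is cut off.

```python
# Direct "is-a" links: each class points to its immediate parent.
# ASSUMPTION: simplified biological classification, intermediate ranks omitted.
subclass_of = {
    "Homo sapiens": "Primates",
    "Primates": "Mammalia",
    "Mammalia": "Chordata",
    "Chordata": "Animalia",
}


def ancestors(cls):
    """Return all superclasses of `cls`, nearest first."""
    result = []
    while cls in subclass_of:
        cls = subclass_of[cls]
        result.append(cls)
    return result


def is_subclass_of(child, parent):
    """True if `parent` is reachable from `child` through is-a links."""
    return parent in ancestors(child)
```

The same structure would work for roles: replace class names with relation names and the links with "sub-relation" links.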
Concepts can be formally described through axioms
A Pizza Margherita is a pizza which has both tomato topping and
Slide taken from “Ontology-Driven Conceptual Modelling” A tutorial by Nicola Guarino.
To share common understanding of the structure of information
To enable reuse of domain knowledge
To make domain assumptions explicit
To separate domain knowledge from the operational
To analyze domain knowledge
Large taxonomies categorizing Web sites (such as on Yahoo!)
Medical ontologies (such as SNOMED) to annotate documents
Categorizations of products for sale and their features (such as
Therefore……
1.
How to write an ontology?
How to change this axiom? Is this information relevant? What is the meaning
2.
Do I really need to read all this?
Wikis support collaborative editing;
Users are quite familiar with viewing/editing wiki
Only a web-browser is required on the client side;
Wikis provide a shared knowledge repository
Wikis can provide a uniform tool/interface for the
1.
[Figure: wiki page “Mountain” with the text “A mountain is a large landform that stretches above the surrounding land in a limited area usually in the form of a [...] steeper than a hill. The highest mountain on earth is the Mount Everest”.]
2.
[Figure: wiki page “Mountain” combining unstructured content (the textual description of a mountain) with structured content:
Mountain ⊑ Landform
Mountain ⊑ ∀madeOf.(Earth ⊔ Rock)
Mountain ⊑ ∃height.≥2500
Mountain ⊑ ¬Hill ⊓ ¬Plain
Mountain(Mt.Everest)
Mountain(Mt.Kilimanjaro)]
3. different views to support different modeling actors;
[Figure: three views of the wiki page “Mountain”.
Unstructured view: the textual description of a mountain.
Semi-structured view:
is a: landform
made of: earth, rock
height: at least 2,500m
different from: hill, plain
samples: [...]
Fully-structured view:
Mountain ⊑ Landform
Mountain ⊑ ∀madeOf.(Earth ⊔ Rock)
Mountain ⊑ ∃height.≥2500
Mountain ⊑ ¬Hill ⊓ ¬Plain
Mountain(Mt.Everest)
Mountain(Mt.Kilimanjaro)]
Alignment between the different views
[Figure: the three views of the wiki page “Mountain” side by side.
Unstructured view: “A mountain is a large landform that stretches above the surrounding land in a limited area usually in the form of a [...] steeper than a hill. The highest mountain on earth is the Mount Everest”
Semi-structured view: is a landform; made of earth, rock; height at least 2,500m; different from hill, plain; samples [...]
Fully structured view: Mountain ⊑ Landform; Mountain ⊑ ∀madeOf.(Earth ⊔ Rock); Mountain ⊑ ∃height.≥2500; Mountain ⊑ ¬Hill ⊓ ¬Plain; Mountain(Mt. Everest); Mountain(Mt. Kilimanjaro)]
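One way to picture the alignment is as a mapping from each row of the semi-structured view to one axiom of the fully structured view. The sketch below only illustrates that idea: the `to_axioms` helper and the row keys are hypothetical, the pairing of "samples" with the two instance assertions is an assumption, and nothing here describes how the tool itself performs its translation.

```python
def to_axioms(concept, rows):
    """Render semi-structured rows as description-logic axiom strings."""
    axioms = []
    for parent in rows.get("is a", []):
        axioms.append(f"{concept} ⊑ {parent}")
    fillers = rows.get("made of", [])
    if fillers:
        axioms.append(f"{concept} ⊑ ∀madeOf.({' ⊔ '.join(fillers)})")
    if "height at least" in rows:
        axioms.append(f"{concept} ⊑ ∃height.≥{rows['height at least']}")
    others = rows.get("different from", [])
    if others:
        axioms.append(f"{concept} ⊑ " + " ⊓ ".join(f"¬{o}" for o in others))
    for individual in rows.get("samples", []):
        axioms.append(f"{concept}({individual})")
    return axioms


# Rows of the semi-structured view for the "Mountain" page.
# ASSUMPTION: the "samples" values are taken from the structured view.
mountain = {
    "is a": ["Landform"],
    "made of": ["Earth", "Rock"],
    "height at least": 2500,
    "different from": ["Hill", "Plain"],
    "samples": ["Mt.Everest", "Mt.Kilimanjaro"],
}

for axiom in to_axioms("Mountain", mountain):
    print(axiom)
```

A mapping of this kind is what lets a domain expert edit the semi-structured rows while a knowledge engineer works on the corresponding axioms.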
Collaborative editing between knowledge experts and knowledge engineers
Web 2.0 tool
Term extraction features
Automatic translation from and to OWL and BPMN
Support for validation and feedback
Integrated ontology and process modeling
Graphical and textual editing
Available as open source tool. Demo at moki.fbk.eu
Support ontology modeling by extracting concepts
… actually, by automatically extracting key-phrases
Key-phrases are the terms characterizing a document or a
Automatic concepts extraction plays an important role in
To boost the ontology construction/extension phase
To “validate” an ontology against a domain corpus
A framework for supporting ontology building/evaluation by
A fully-working and publicly available implementation of the
[Figure: framework pipeline. Steps: corpus collection, key-concepts extraction, alignment with additional resources, validation / evaluation. Inputs and outputs: domain corpus, current ontology, external resources (e.g. WordNet), candidate key-concepts list, enriched key-concepts list, extended ontology, ontology metrics.]
The corpus can be manually or automatically selected (e.g.
Corpus could consist of:
(large) collection of documents
A single big document
Corpus collection → Key-concepts extraction → Alignment with external resources → Manual validation
Performed by the KX (Keyphrase eXtraction) tool, which:
exploits linguistic information and statistical measures to select a list of weighted keywords from documents;
handles multi-words;
offers flexible parameter configuration;
is easily adaptable to new languages;
ranked 2nd (out of 20) at SemEval-2010, in the task on “Automatic Keyphrase Extraction from Scientific Articles”.
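To make the idea of combining linguistic information with statistical measures concrete, here is a minimal sketch. It is not the KX algorithm: a stop-word filter stands in for the linguistic information, n-gram frequency multiplied by phrase length stands in for the statistical measures, and every name and parameter (`extract_keyphrases`, `STOPWORDS`, `max_len`, `top_k`) is hypothetical.

```python
import re
from collections import Counter

# ASSUMPTION: a tiny illustrative stop-word list, not a real linguistic resource.
STOPWORDS = {"a", "an", "the", "of", "in", "is", "and", "to", "for", "by", "on"}


def extract_keyphrases(documents, max_len=3, top_k=5):
    """Return up to `top_k` (phrase, weight) pairs, heaviest first."""
    counts = Counter()
    for text in documents:
        tokens = re.findall(r"[a-z]+", text.lower())
        for n in range(1, max_len + 1):
            for i in range(len(tokens) - n + 1):
                gram = tokens[i:i + n]
                # Crude linguistic filter: a candidate may not start
                # or end with a stop word.
                if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                    continue
                counts[" ".join(gram)] += 1
    # Keep phrases seen more than once; weight = frequency * word count,
    # so a repeated multi-word phrase outranks its individual words.
    weighted = {p: c * len(p.split()) for p, c in counts.items() if c > 1}
    return sorted(weighted.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
```

Calling `extract_keyphrases` on a list of document strings returns a weighted list in which multi-word candidates such as "ontology modelling" can outrank their component words, which is the behaviour the "handles multi-words" point refers to.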
Extracted key-concepts aligned and enriched with additional
WordNet (& WN domains): synonyms, definitions, SUMO labels;
Wikipedia: link to the Wikipedia page corresponding to the term (exploiting BabelNet);
Other external resources (e.g. dictionary).
Enriched key-concepts list matched against the ontology, to
Ontology Extension:
The user decides which of the extracted key-concepts to add to the ontology;
The additional details provided in the enriched list may guide the formalization;
Ontology Terminological Evaluation:
Automatically computed metrics (variants of IR precision and recall) support users in determining the terminological coverage
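The evaluation metrics are described as variants of IR precision and recall. A minimal sketch of the plain, unweighted versions follows, assuming that ontology terms and key-concepts are compared by exact, case-insensitive match. The actual variants used by the framework are not specified here, and the function name `terminological_metrics` is hypothetical.

```python
def terminological_metrics(ontology_terms, key_concepts):
    """Compare ontology terms with key-concepts extracted from a corpus.

    Returns (precision, recall, missing), where `missing` lists the
    key-concepts that do not appear in the ontology.
    """
    onto = {t.lower() for t in ontology_terms}
    keys = {k.lower() for k in key_concepts}
    shared = onto & keys
    # Precision: share of ontology terms that the corpus confirms.
    precision = len(shared) / len(onto) if onto else 0.0
    # Recall: share of corpus key-concepts that the ontology covers.
    recall = len(shared) / len(keys) if keys else 0.0
    return precision, recall, sorted(keys - onto)
```

The `missing` list doubles as a set of candidates for ontology extension, which is why the same extraction pipeline can serve both extension and evaluation.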
The proposed approach can support several different ontology
Ontology construction boosting: building an ontology from scratch;
Ontology extension: adding new concepts to an existing
Ontology evaluation: terminologically evaluating an ontology against a domain corpus;
Ontology ranking: ranking candidate ontologies with respect to a given domain corpus;
Ranking of ontology concepts: determining which are the domain-wise most relevant concepts defined in an ontology.
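Two of the tasks above, ontology ranking and ranking of ontology concepts, can be sketched with the same ingredients: a weighted key-concept list extracted from the corpus and the concept labels of each ontology. The scoring used below (the sum of the matched key-concept weights) is an assumption made for illustration, as are the function names.

```python
def rank_concepts(concept_labels, weighted_key_concepts):
    """Order an ontology's concepts by the weight of the matching key-concept."""
    weights = {k.lower(): w for k, w in weighted_key_concepts}
    scored = [(label, weights.get(label.lower(), 0.0)) for label in concept_labels]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


def rank_ontologies(ontologies, weighted_key_concepts):
    """Order candidate ontologies by the total weight of their matched concepts."""
    totals = {
        name: sum(w for _, w in rank_concepts(labels, weighted_key_concepts))
        for name, labels in ontologies.items()
    }
    return sorted(totals.items(), key=lambda pair: (-pair[1], pair[0]))
```

Here `ontologies` is a dictionary from an ontology name to its list of concept labels, and `weighted_key_concepts` is a list of (term, weight) pairs such as the output of a keyphrase extractor.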
Framework fully implemented in MoKi
Publicly available @ moki.fbk.eu
Accepts a collection of digital documents in any popular
Let’s see it in action!
Starting Point: a collaborative ontology modeling framework
Goal: to support building rich and high-quality ontologies
Issue: current state-of-the-art NLP techniques for information
mainly focused on the extraction of terms;
more suitable to support the construction of light-weight medium-quality ontologies;
Challenge: how to appropriately exploit NLP techniques to
Objective:
Address key research challenges in NLP and ontology
Strong algorithmic and methodological aspects, together