Data Integration with Ontologies
Sebastian Brandt brandt@cs.manchester.ac.uk (slides by Bijan Parsia bparsia@cs.man.ac.uk)
Friday, 2 May 2014

Ontology Based Data Access (OBDA)
• Ontology at run time?
  – More, ontology for the end user!??!
• By end user, I mean “someone writing queries”
• Familiar
  – Controlled vocabulary
  – Query by example
• New
  – “Better” queries
  – Integrated views of data

“Better” queries
[Diagram: Person above Student and Employee, with attributes hasAge and hasSalary]
• Better how?
  – Consider a simple schema
  – What does the logical schema look like?
      create table employee (id number(4), hasAge number(3), hasSalary number(6));
      create table student (id number(4), hasAge number(3), hasSalary number(5));
  – Lots of variants
• Sane queries
  – SELECT hasAge FROM employee WHERE hasSalary >= 50000;
  – SELECT hasAge FROM student WHERE hasSalary >= 50000;
  – What about Persons?
    • Union query?
• Rather write
  – SELECT hasAge FROM Person WHERE hasSalary >= 50000;
  – no matter what kind of persons there are
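To make the "Union query?" option concrete, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are invented for illustration, and the column types are plain INTEGER instead of the slide's number(n) so that the script runs on SQLite.

```python
import sqlite3

# Hypothetical sample rows; the slide gives only the table definitions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (id INTEGER, hasAge INTEGER, hasSalary INTEGER);
    CREATE TABLE student  (id INTEGER, hasAge INTEGER, hasSalary INTEGER);
    INSERT INTO employee VALUES (1, 45, 60000), (2, 30, 40000);
    INSERT INTO student  VALUES (3, 22, 55000), (4, 19, 12000);
""")

# Without a Person abstraction, the query author must know every table
# that holds persons and union them by hand.
UNION_QUERY = """
    SELECT hasAge FROM employee WHERE hasSalary >= 50000
    UNION ALL
    SELECT hasAge FROM student WHERE hasSalary >= 50000
"""

ages = sorted(row[0] for row in conn.execute(UNION_QUERY))
print(ages)
```

Every new kind of person (a new table) forces a change to this query, which is the maintenance burden that a single query over Person is meant to avoid.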

What do we want?
• We want to be able to query our data
  – in the same way
    • no matter how the underlying structure changes
  – in a “natural” way
    • so that I get the answers I need
  – effectively
    • no waiting until the end of time
  – unobtrusively
    • i.e., without too much disruption to my information systems
• Often the people using the data
  – are not the same as the people who
    • collect the data
    • curate the data
    • manage the data
    • build apps using the data
• Opportunities for impedance mismatch

Bioinformatics case
• Thousands (if not 100s of 1000s) of data sources
  – Not all are databases!
    • or SQL databases!
• Very much over the same or related data
• Domain knowledge is widely shared
  – Biologists know what they are talking about
    • genes, proteins, trees, etc.
• Data structure knowledge not widely shared
  – Consequence of the first point!
• What must they do to get an answer?

Workflow
1. Discover (all!) relevant sources
2. Assimilate their structure and content
3. Formulate query fragments
  – Each source might have its own!
  – The user must understand how things come together
4. Dispatch the queries
  • Need to understand the interfaces!
5. Synthesize the results
[Figure: query fragments such as "select ? ????" and "//protein/[@?dfl]" sent to separate sources, with the answer "gene, that, I, want". Image: http://www.publicdomainpictures.net/view-image.php?image=21541&picture=trassliga-kablar-pa-pole]

The hope
• An ontology
  – representing domain knowledge
  – in a reasonably familiar way
  – would provide easier access
• For example:
  – http://www.cs.man.ac.uk/~stevensr/tambis/video/Tut-Tao-query.avi

Two Basic Strategies
• In general:
  – TBox = Schema; ABox = Data
• ETL
  – Convert the databases into an ABox
• Federation
  – Split, dispatch, and splice queries on the fly
https://babbage.inf.unibz.it/trac/obdapublic/wiki/ObdalibQuestIntro
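As a rough illustration of the ETL strategy, the sketch below turns relational rows into ABox assertions, written as (subject, predicate, object) tuples. The employee/student tables, the sample rows, and the individual naming scheme "table/id" are assumptions made for this example, not part of the slides.

```python
import sqlite3

def rows_to_abox(conn, table, class_name):
    """Turn each row of `table` into one class assertion and two property assertions."""
    assertions = []
    # `table` comes from trusted code here; never interpolate user input into SQL.
    cursor = conn.execute(f"SELECT id, hasAge, hasSalary FROM {table}")
    for row_id, age, salary in cursor:
        individual = f"{table}/{row_id}"  # hypothetical naming scheme
        assertions.append((individual, "rdf:type", class_name))
        assertions.append((individual, "hasAge", age))
        assertions.append((individual, "hasSalary", salary))
    return assertions

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (id INTEGER, hasAge INTEGER, hasSalary INTEGER);
    CREATE TABLE student  (id INTEGER, hasAge INTEGER, hasSalary INTEGER);
    INSERT INTO employee VALUES (1, 45, 60000);
    INSERT INTO student  VALUES (3, 22, 55000);
""")

# The conversion runs once, at development time; the ABox is then queried
# on its own and the source tables are no longer consulted.
abox = rows_to_abox(conn, "employee", "Employee") + rows_to_abox(conn, "student", "Student")
for assertion in abox:
    print(assertion)
```

Federation would skip this materialization step and instead translate each incoming query against the source tables at query time.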

We always need mappings!
• We need to map the data structure
  – into the common schema/TBox
  – no matter what
  – no free lunch
  – but we saw how to do that!
• ETL is a development time thing
  – Develop the mapping
  – Run the conversion
  – Mappings inactive at runtime
  – What are the pros/cons?
• Federation leaves the data in situ
  – But has to exploit the mappings at query time
  – Pros/cons?

Issues
• We have a non-standard query language
  – OWL or SPARQL (a SQL-like conjunctive query language)
• We have to do “extra” work
  – Build the common ontology
  – Create the mappings
• We have computational issues
  – Data complexity of OWL is very high (NP-hard for OWL 2 DL)

Trade expressivity for performance
• We want an ontology language
  – which is expressive enough to represent DB schemas
  – with good data complexity (at least)
  – sound and complete algorithms for federated query answering
• Answer: OWL QL
  – http://www.w3.org/TR/owl2-profiles/#OWL_2_QL
  – A restriction of OWL
    • “The OWL 2 QL profile is designed so that sound and complete query answering is in LOGSPACE (more precisely, in AC0) with respect to the size of the data (assertions), while providing many of the main features necessary to express conceptual models such as UML class diagrams and ER diagrams.”
    • “...data (assertions) that is stored in a standard relational database system can be queried through an ontology via a simple rewriting mechanism, i.e., by rewriting the query into an SQL query that is then answered by the RDBMS system, without any changes to the data.”
    • (Based on the DL-Lite family of DLs.)
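The "simple rewriting mechanism" in the quotation can be sketched for the easiest case only: a query for the instances of one named class, under an ontology that contains nothing but subclass axioms between named classes. This is a simplification for illustration, not the full OWL 2 QL rewriting (which also has to handle roles and existentials). The class PhDStudent is a hypothetical addition.

```python
def subsumees(target, subclass_axioms):
    """Return every named class entailed to be a subclass of `target`, plus `target` itself."""
    found = {target}
    frontier = [target]
    while frontier:
        current = frontier.pop()
        for sub, sup in subclass_axioms:
            if sup == current and sub not in found:
                found.add(sub)
                frontier.append(sub)
    return found

def rewrite_class_query(target, subclass_axioms):
    """Rewrite q(x) <- target(x) into a union of atomic queries, one per subsumee."""
    return [f"q(x) <- {cls}(x)" for cls in sorted(subsumees(target, subclass_axioms))]

# Each pair (sub, sup) stands for the axiom "sub SubClassOf: sup".
TBOX = [("Student", "Person"), ("Employee", "Person"), ("PhDStudent", "Student")]

rewritten = rewrite_class_query("Person", TBOX)
for query in rewritten:
    print(query)
```

The rewriting depends only on the ontology and the query, never on the data, which is why the database itself can then answer the resulting union.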

Several important moves
• Restrict the expression language
  – B ::= A | ∃R | ∃R⁻
    • Only unqualified existentials!
    • No hasFinger some Finger
  – C ::= B | ¬B | C1 ⊓ C2
• Odd axiom shapes
  – B ⊑ C
    • SubClass axioms are asymmetric
  – (funct R), (funct R⁻)
• No negations on LHS
• No conjunctions on LHS
  – Are conjunctions meaningful here?
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.102.1525&rep=rep1&type=pdf

We can express
• ISA
  – using A1 ⊑ A2
• disjointness
  – using A1 ⊑ ¬A2
• role-typing
  – using ∃R ⊑ A1 (or ∃R⁻ ⊑ A2)
• participation constraints
  – using A ⊑ ∃R (resp., A ⊑ ∃R⁻)
• non-participation constraints
  – using A ⊑ ¬∃R and A ⊑ ¬∃R⁻
• functionality restrictions
  – using (funct R) and (funct R⁻)
  – but no other counting
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.102.1525&rep=rep1&type=pdf
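As a worked instance of these patterns, here is one way a small university schema might be written. Only the classes Student, Employee, and Person appear in the slides; the role worksFor, the class Employer, and the choice of constraints are assumptions made for illustration.

```latex
\begin{align*}
\mathit{Student}  &\sqsubseteq \mathit{Person}            && \text{(ISA)} \\
\mathit{Employee} &\sqsubseteq \mathit{Person}            && \text{(ISA)} \\
\mathit{Student}  &\sqsubseteq \neg \mathit{Employee}     && \text{(disjointness, assumed)} \\
\exists \mathit{worksFor}      &\sqsubseteq \mathit{Employee} && \text{(role typing, domain)} \\
\exists \mathit{worksFor}^{-}  &\sqsubseteq \mathit{Employer} && \text{(role typing, range)} \\
\mathit{Employee} &\sqsubseteq \exists \mathit{worksFor}  && \text{(participation)} \\
&(\mathrm{funct}\ \mathit{worksFor})                      && \text{(functionality)}
\end{align*}
```

Each left-hand side is a basic concept (a named class or an unqualified existential), so every axiom has the shape B ⊑ C.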

Recall our example
[Diagram: Person above Student and Employee, with attributes hasAge and hasSalary]
    create table employee (id number(4), hasAge number(3), hasSalary number(6));
    create table student (id number(4), hasAge number(3), hasSalary number(5));
• Want to write
  – SELECT hasAge FROM Person WHERE hasSalary >= 50000;
  – and get the right answers
• We build an ontology
  – Student SubClassOf: Person
  – Employee SubClassOf: Person
  – Etc.
• We build mappings
• Our query now works!
  – No change to database
https://babbage.inf.unibz.it/trac/obdapublic/wiki/SimpleHelloWorldTutorial
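The slide leaves the mappings unspecified, so the following is only a sketch of how the pieces might fit together at query time. It assumes the simplest possible mapping, one table per class, and it expands direct subclasses only. The dictionaries, the rewrite function, and the sample rows are hypothetical and do not reflect the syntax of any particular OBDA tool.

```python
import sqlite3

# Ontology: the two SubClassOf axioms from the slide (subclass -> superclass).
SUBCLASS_OF = {"Student": "Person", "Employee": "Person"}

# Hypothetical mappings: each mapped class names the table that populates it.
MAPPINGS = {"Student": "student", "Employee": "employee"}

def rewrite(class_name, column, condition):
    """Rewrite a query over an ontology class into SQL over the mapped tables."""
    # The class itself plus its direct subclasses; unmapped classes contribute nothing.
    classes = [class_name] + [sub for sub, sup in SUBCLASS_OF.items() if sup == class_name]
    tables = [MAPPINGS[c] for c in classes if c in MAPPINGS]
    parts = [f"SELECT {column} FROM {table} WHERE {condition}" for table in tables]
    return " UNION ALL ".join(parts)

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (id INTEGER, hasAge INTEGER, hasSalary INTEGER);
    CREATE TABLE student  (id INTEGER, hasAge INTEGER, hasSalary INTEGER);
    INSERT INTO employee VALUES (1, 45, 60000), (2, 30, 40000);
    INSERT INTO student  VALUES (3, 22, 55000), (4, 19, 12000);
""")

# The user asks about Person; the tables are untouched and there is no Person table.
sql = rewrite("Person", "hasAge", "hasSalary >= 50000")
results = sorted(row[0] for row in conn.execute(sql))
print(sql)
print(results)
```

Adding a new kind of person then means adding one axiom and one mapping, while the user's query over Person stays the same.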
