Islands of tractability in ontology-based data access

Michael Zakharyaschev
Department of Computer Science and Information Systems, Birkbeck, University of London
http://www.dcs.bbk.ac.uk/~michael
supported by EPSRC grants ExODA EP/H05099X and iTract EP/M012670

Data access in industry (from Norwegian Petroleum Directorate's FactPages)

show me the wellbores completed before 2008 where Statoil as a drilling operator sampled less than 10 meters of cores

5 days later:

    SELECT DISTINCT cores.wlbName, cores.lenghtM,
                    wellbore.wlbDrillingOperator, wellbore.wlbCompletionYear
    FROM
      ( (SELECT wlbName, wlbNpdidWellbore, (wlbTotalCoreLength * 0.3048) AS lenghtM
         FROM wellbore_core
         WHERE wlbCoreIntervalUom = '[ft ]' )
        UNION
        (SELECT wlbName, wlbNpdidWellbore, wlbTotalCoreLength AS lenghtM
         FROM wellbore_core
         WHERE wlbCoreIntervalUom = '[m ]' )
      ) as cores,
      ( (SELECT wlbNpdidWellbore, wlbDrillingOperator, wlbCompletionYear
         FROM wellbore_development_all )
        UNION
        (SELECT wlbNpdidWellbore, wlbDrillingOperator, wlbCompletionYear
         FROM wellbore_exploration_all )
        UNION
        (SELECT wlbNpdidWellbore, wlbDrillingOperator, wlbCompletionYear
         FROM wellbore_shallow_all )
      ) as wellbore
    WHERE wellbore.wlbNpdidWellbore = cores.wlbNpdidWellbore ...

In STATOIL:
- 1,000 TB of relational data
- 2,000 tables
- different schemas
- 30–70% of time on data gathering

UCL 16.11.15

Ontology-based data access (OBDA) (the Romans ≈ 2007)

query:

    SELECT DISTINCT ?unit ?well
    WHERE {
      [] npdv:stratumForWellbore ?wellboreURI ;
         npdv:inLithostratigraphicUnit [ npdv:name ?unit ] .
      ?wellboreURI npdv:name ?well .
      ?core a npdv:WellboreCore ;
            npdv:coreForWellbore ?wellboreURI .
    }

ontology:

[Figure: ontology diagram with classes ProductionWellbore, Wellbore, WellboreCore, WellboreStratum and properties coreForWellbore, stratumForWellbore.]

mappings:

    [] rdf:type rr:TriplesMap;
       rr:logicalTable "select * from wellbore_core";
       rr:subjectMap [ a rr:TermMap;
                       rr:template "&npd-v2;wellbore/{wlbNpdidWellbore}/"; ];
       rr:propertyObjectMap [ rr:property npdv:coreIntervalBottom;
                              rr:column "wlbCoreIntervalBottom" ];
       ...

data sources:

    CREATE TABLE wellbore_core (
      wlbName varchar(60) NOT NULL,
      wlbCoreNumber int(11) NOT NULL,
      wlbCoreIntervalTop decimal(13,6),
      ...
    )

Ontology
- gives a high-level conceptual view of the data
- provides a convenient & natural vocabulary for user queries
- enriches incomplete data with background knowledge

OBDA via FO-rewriting

[Figure: the OBDA pipeline.]
- query $q$ + ontology $\mathcal{T}$, by rewriting: $q'$
- $q'$ + mapping, by unfolding: a query over the database
- database ($n$-ary relations) + mapping: virtual ABox $\mathcal{A}$ (triples)
- virtual ABox $\mathcal{A}$ + ontology $\mathcal{T}$: canonical model (derived triples)

Example ontology axiom:

    npdv:MoveableFacility ⊑ npdv:Facility

Example mapping:

    npdv:MoveableFacility(URI("&npdv;facility/{}",t7)) :- facility_moveable(t1,...,t6,t7,t8,...,t10)

For all $\mathcal{A}$ and $\vec{a}$:

\[ \mathcal{T}, \mathcal{A} \models q(\vec{a}) \iff \mathcal{I}_{\mathcal{A}} \models q'(\vec{a}) \]

reduction to DB query evaluation

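OWL 2 QL profile of OWL 2 (W3C 2012)

Each item gives the first-order form, followed by the description logic form.

Roles
- $\varrho(x, y) ::= \top \mid P(x, y) \mid P(y, x)$; DL: $R ::= \top \mid P \mid P^-$

Basic concepts
- $\tau(x) ::= \top \mid A(x) \mid \exists y\, \varrho(x, y)$; DL: $B ::= \top \mid A \mid \exists R$

TBoxes
- $\forall x\, (\tau(x) \to \tau'(x))$; DL: $B \sqsubseteq B'$
- $\forall x, y\, (\varrho(x, y) \to \varrho'(x, y))$; DL: $R \sqsubseteq R'$
- $\forall x\, \varrho(x, x)$; DL: $R$ is reflexive
- $\forall x\, (\tau(x) \land \tau'(x) \to \bot)$; DL: $B \sqcap B' \sqsubseteq \bot$
- $\forall x, y\, (\varrho(x, y) \land \varrho'(x, y) \to \bot)$; DL: $R \sqcap R' \sqsubseteq \bot$
- $\forall x\, (\varrho(x, x) \to \bot)$; DL: $R$ is irreflexive

Sugar (expressible via additional role inclusions)
- $\forall x\, (\tau(x) \to \exists y\, (\varrho_1(x, y) \land \dots \land \varrho_k(x, y) \land \tau'(y)))$; DL: $B \sqsubseteq \exists R.B'$

ABoxes
- $\{ A(a), P(a, b), \dots \}$

based on the 'DL-Lite family' designed by the Romans (≈ 2005) and extended by Artale, Calvanese, Kontchakov & Z (2007–9)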

Example

Staff ontology $\mathcal{T}$:
- $\forall x\, (\mathit{ProjectManager}(x) \to \exists y\, (\mathit{isAssistedBy}(x, y) \land \mathit{PA}(y)))$
- $\forall x\, (\exists y\, \mathit{managesProject}(x, y) \to \mathit{ProjectManager}(x))$
- $\forall x\, (\mathit{ProjectManager}(x) \to \mathit{Staff}(x))$
- $\forall x\, (\mathit{PA}(x) \to \mathit{Secretary}(x))$

User query $q$: find the staff assisted by secretaries

\[ q(x) = \exists y\, (\mathit{Staff}(x) \land \mathit{isAssistedBy}(x, y) \land \mathit{Secretary}(y)) \]

PE-rewriting of ontology-mediated query $(\mathcal{T}, q)$:

\[ q'(x) = \exists y\, \big[ \mathit{Staff}(x) \land \mathit{isAssistedBy}(x, y) \land (\mathit{Secretary}(y) \lor \mathit{PA}(y)) \big] \lor \mathit{ProjectManager}(x) \lor \exists z\, \mathit{managesProject}(x, z) \]
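To make the effect of the rewriting concrete, here is a minimal Python sketch that evaluates both $q$ and $q'$ directly over a small ABox stored as sets of tuples. The individuals (`ann`, `bob`, `carl`, `dora`, `sue`, `pat`, `p1`) and the helper names are invented for illustration and are not from the slides. The ontology is never consulted at query time: its consequences are already compiled into $q'$.

```python
# Hypothetical ABox: each predicate maps to a set of tuples (assumed data).
abox = {
    "Staff": {("ann",), ("carl",)},
    "Secretary": {("sue",)},
    "PA": {("pat",)},
    "ProjectManager": {("bob",)},
    "isAssistedBy": {("ann", "sue"), ("carl", "pat")},
    "managesProject": {("dora", "p1")},
}


def holds(pred, *args):
    """True iff the atom pred(args) is explicitly stored in the ABox."""
    return tuple(args) in abox.get(pred, set())


def individuals():
    """All constants mentioned anywhere in the ABox."""
    return {c for facts in abox.values() for fact in facts for c in fact}


def q(x):
    """Original query: Staff(x) and isAssistedBy(x, y) and Secretary(y)."""
    return any(
        holds("Staff", x) and holds("isAssistedBy", x, y) and holds("Secretary", y)
        for y in individuals()
    )


def q_rewritten(x):
    """PE-rewriting q' from the slide, evaluated over the raw data only."""
    dom = individuals()
    assisted = any(
        holds("Staff", x)
        and holds("isAssistedBy", x, y)
        and (holds("Secretary", y) or holds("PA", y))
        for y in dom
    )
    manages = any(holds("managesProject", x, z) for z in dom)
    return assisted or holds("ProjectManager", x) or manages


plain_answers = sorted(x for x in individuals() if q(x))
certain_answers = sorted(x for x in individuals() if q_rewritten(x))
print(plain_answers)
print(certain_answers)
```

In this toy data, `bob` and `dora` have no recorded assistant at all, yet they are answers to $q'$ because the ontology guarantees that every project manager is a staff member assisted by some PA, and every PA is a secretary.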

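Why are OWL 2 QL OMQs FO-rewritable?

Canonical model (chase) $\mathcal{C}_{\mathcal{T},\mathcal{A}}$ of a given consistent $(\mathcal{T}, \mathcal{A})$:
- homomorphically embeddable into every model of $(\mathcal{T}, \mathcal{A})$
- $\mathcal{T}, \mathcal{A} \models q \iff \mathcal{C}_{\mathcal{T},\mathcal{A}} \models q$ for any CQ $q$

Example: $\mathcal{T} = \{ A \sqsubseteq \exists R^-.\exists R.B,\ B \sqsubseteq \exists S.B \}$, $\mathcal{A} = \{ A(a) \}$

[Figure: the canonical model $\mathcal{C}_{\mathcal{T},\mathcal{A}}$ built step by step from $a$, with $R$- and $S$-edges and labels $A$, $B$.]

all Horn DLs have canonical models, but OMQ $(\{ \exists R.A \sqsubseteq A \}, A(x))$ is not FO-rewritable (recursive datalog needed)

Bounded depth derivation property: there is a function $f$ such that

\[ \mathcal{T}, \mathcal{A} \models q \iff \mathcal{C}^N_{\mathcal{T},\mathcal{A}} \models q \]

with $\mathcal{C}^N_{\mathcal{T},\mathcal{A}}$ constructed in $N = f(|\mathcal{T}|, |q|)$ steps $\iff$ FO-rewritability

$f$ is polynomial for OWL 2 QL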
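The bounded construction of the canonical model can be sketched in Python for the example TBox above. This is an illustration under stated assumptions, not the construction from the talk: axioms are restricted to the shape "concept ⊑ ∃r1.∃r2...concept", a "step" is taken to mean one round in which every applicable axiom fires once per element, and the fresh element names `w1`, `w2`, ... are made up.

```python
from itertools import count

# Axioms of the form  lhs ⊑ ∃r1.∃r2. ... .rhs ; "R-" stands for the inverse of R.
# This encodes T = {A ⊑ ∃R⁻.∃R.B, B ⊑ ∃S.B} from the example.
TBOX = [("A", ["R-", "R"], "B"), ("B", ["S"], "B")]


def chase(abox_concepts, steps):
    """Build the first `steps` rounds of the chase.

    abox_concepts: set of (concept, element) pairs, e.g. {("A", "a")}.
    Returns (concepts, roles) with roles as (role, source, target) triples.
    """
    fresh = count(1)
    concepts = set(abox_concepts)
    roles = set()
    fired = set()  # (axiom index, element) pairs that have already been applied
    for _ in range(steps):
        # Iterate over a snapshot so facts added in this round fire next round.
        for concept, elem in sorted(concepts):
            for i, (lhs, path, rhs) in enumerate(TBOX):
                if concept != lhs or (i, elem) in fired:
                    continue
                fired.add((i, elem))
                current = elem
                for role in path:
                    successor = f"w{next(fresh)}"  # fresh anonymous element
                    if role.endswith("-"):
                        roles.add((role[:-1], successor, current))
                    else:
                        roles.add((role, current, successor))
                    current = successor
                concepts.add((rhs, current))
    return concepts, roles


concepts, roles = chase({("A", "a")}, steps=3)
print(sorted(concepts))
print(sorted(roles))
```

Every further round only extends the $S$-chain by one more $B$-element, so the full canonical model is infinite; the bounded depth derivation property says that a prefix whose depth depends only on $|\mathcal{T}|$ and $|q|$ is enough to answer $q$.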

What is the price of OBDA?

- reduction to DB query evaluation could be too expensive, in which case OBDA would not be viable

1. what is the size of rewritings?
   - depending on the type of OMQs
   - depending on the type of rewritings
   new research (succinctness) problem
2. what is the combined complexity of OMQ answering?
   - depending on the type of OMQs
   well-known problem in DB theory

it may turn out that reduction to DB query evaluation is not the best way of answering OMQs

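Tree-witness rewriting of OMQ $Q = (\mathcal{T}, q)$

[Figure: a homomorphism $h$ from $q$ into the canonical model $\mathcal{C}_{\mathcal{T},\mathcal{A}}$; the subqueries $q_{t_1}$ and $q_{t_2}$ are mapped into the trees $\mathcal{C}_{\mathcal{T}}^{\tau_1(a_1)}$ and $\mathcal{C}_{\mathcal{T}}^{\tau_2(a_2)}$.]

\[ q_{\mathsf{tw}}(\vec{x}) = \bigvee_{\substack{\Theta \text{ independent set} \\ \text{of tree witnesses}}} \exists \vec{y}\, \Big( \bigwedge_{S(\vec{z}) \in q \setminus q_\Theta} S(\vec{z}) \;\land\; \bigwedge_{t \in \Theta} \mathsf{tw}_t \Big) \]

$\Theta$ is independent if $q_t \cap q_{t'} = \emptyset$, for any distinct $t, t' \in \Theta$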

The number of tree witnesses

[Figure: the query $q(x_1, x_2, x_3)$ and the canonical model $\mathcal{C}_{\mathcal{T},\{A(a)\}}$, with labels $a$, $A$, $B$, $x_1$, $x_2$, $x_3$.]

exponentially many tree witnesses, hence a huge tw-rewriting

however, it can be simplified to a polynomial-size PE-rewriting:

\[ q(x_1, x_2, x_3) \lor \exists z\, \Big( A(z) \land \bigwedge_{i=1}^{n} \big( (x_i = z) \lor \exists y\, (R(y, x_i) \land R(y, z)) \big) \Big) \]

can we always do this?

Circuit complexity

P/poly: the class of problems decidable by polynomial-size circuit families

$\mathrm{P} \subseteq \mathrm{P/poly}$

if $\mathrm{NP} \not\subseteq \mathrm{P/poly}$ then $\mathrm{P} \neq \mathrm{NP}$

- almost all Boolean functions with $n$ inputs require circuits of size $\Theta(2^n / n)$ (Shannon 1949)

are there complex Boolean functions $f_n$ in NP? (known lower bound: $5n - o(n)$)

nobody knows, but ...

Monotone circuit complexity (Razborov, Raz, et al. 1985)

Boolean variables $e_{ij}$ give graph $G = (V, E)$: $V = \{1, \dots, n\}$, $E = \{ \{i, j\} \mid e_{ij} = 1 \}$

- $\mathrm{CLIQUE}_{n,k}(\vec{e}) = 1$ iff $G$ contains a $k$-clique (e.g., for $k \le n^{1/4}$)
  - monotone circuits: exp ($2^{\varepsilon \sqrt{k}}$)
  - monotone formulas: exp
  - formulas with $\neg$: superpoly unless $\mathrm{NP} \subseteq \mathrm{P/poly}$
- $\mathrm{MATCHING}_{n}(\vec{e}) = 1$ iff the bipartite graph $\vec{e}$ with $n$ vertices in each part has a perfect matching (subset of edges containing every node once)
  - monotone formulas: exp
  - formulas with $\neg$: poly
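The definition of $\mathrm{CLIQUE}_{n,k}$ can be spelled out as a brute-force Python function, which is just the naive monotone formula: an OR over all $k$-subsets of vertices of the AND of their edge variables. This is only a sketch of the definition, and the encoding of $\vec{e}$ as a dictionary keyed by `frozenset({i, j})` is an assumption made here. It has $\binom{n}{k}$ disjuncts and says nothing about the lower bounds quoted above.

```python
from itertools import combinations


def clique(n, k, e):
    """Return 1 iff the graph on vertices 1..n encoded by e has a k-clique.

    e maps frozenset({i, j}) to 0 or 1; missing pairs count as 0 (assumed encoding).
    """
    return int(
        any(
            all(e.get(frozenset(pair), 0) for pair in combinations(subset, 2))
            for subset in combinations(range(1, n + 1), k)
        )
    )


# A 4-vertex graph containing the triangle 1-2-3 and the extra edge 3-4.
edges = {frozenset(p): 1 for p in [(1, 2), (2, 3), (1, 3), (3, 4)]}
print(clique(4, 3, edges))
print(clique(4, 4, edges))
```

The function is monotone: switching any $e_{ij}$ from 0 to 1 can only turn the output from 0 to 1, never the other way, because no negation occurs in the OR-of-ANDs.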

Tree-witness rewriting as a Boolean function

OMQ $Q = (\mathcal{T}, q)$ gives a hypergraph $H_Q = (V, E)$ where
- vertices $V$ = atoms of $q$
- hyperedges $E$ = tree witnesses $q_t$

monotone Boolean hypergraph function for $Q$ (or hypergraph $H_Q$):

\[ f_Q = \bigvee_{E' \subseteq E \text{ independent}} \Big( \bigwedge_{v \in V \setminus V_{E'}} p_v \;\land\; \bigwedge_{e \in E'} p_e \Big) \]

(some tweaks required in case of exponentially many tree witnesses)

- Boolean formula $\varphi$ for $f_Q$ gives an FO-rewriting of size $O(|\varphi| \cdot |Q|)$
- monotone Boolean formula $\varphi$ for $f_Q$ gives a PE-rewriting
- monotone Boolean circuit $\varphi$ for $f_Q$ gives an NDL-rewriting (nonrecursive datalog)

tool for obtaining upper succinctness and complexity bounds using classical circuit complexity
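The hypergraph function can be evaluated directly from its definition by enumerating independent sets of hyperedges. The Python sketch below does exactly that; the data layout (a set of vertices, a dict from hyperedge names to vertex sets, and dicts `p_v` and `p_e` of truth values) is an assumption for illustration. The enumeration is exponential in $|E|$, so it shows what $f_Q$ computes, not how to obtain a small formula or circuit for it.

```python
from itertools import combinations


def hypergraph_function(vertices, hyperedges, p_v, p_e):
    """Evaluate f = OR over independent E' of
    (AND of p_v for v not covered by E') AND (AND of p_e for e in E').

    vertices: set of vertex names; hyperedges: dict name -> set of vertices.
    """
    names = list(hyperedges)
    for size in range(len(names) + 1):
        for chosen in combinations(names, size):
            sets = [hyperedges[e] for e in chosen]
            # E' is independent iff its hyperedges are pairwise disjoint.
            if any(a & b for a, b in combinations(sets, 2)):
                continue
            covered = set().union(*sets)
            if all(p_e[e] for e in chosen) and all(
                p_v[v] for v in vertices - covered
            ):
                return True
    return False


# Toy hypergraph: three atoms, two overlapping tree witnesses (invented example).
V = {1, 2, 3}
E = {"t1": {1, 2}, "t2": {2, 3}}

print(hypergraph_function(V, E, {1: False, 2: False, 3: True}, {"t1": True, "t2": False}))
print(hypergraph_function(V, E, {1: False, 2: False, 3: False}, {"t1": True, "t2": True}))
```

In the second call both hyperedge variables are true, but `t1` and `t2` share vertex 2, so they cannot be used together and vertex 3 (or vertex 1) stays uncovered with a false variable.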

Tool for lower bounds

For any OMQ $Q = (\mathcal{T}, q)$ and assignment $\alpha \colon \mathrm{predicates}(q) \to \{0, 1\}$, take the ABox with a single individual $a$:

\[ \mathcal{A}_\alpha = \{ A(a) \mid \alpha(A) = 1 \} \cup \{ P(a, a) \mid \alpha(P) = 1 \} \]

Primitive evaluation function:

\[ g_Q(\alpha) = 1 \iff \mathcal{T}, \mathcal{A}_\alpha \models q(\vec{a}) \]

- FO-rewriting $q'$ of $Q$ gives a Boolean formula for $g_Q$ of size $O(|q'|)$
- PE-rewriting $q'$ of $Q$ gives a monotone Boolean formula for $g_Q$
- NDL-rewriting $q'$ of $Q$ gives a monotone Boolean circuit for $g_Q$

(proof by quantifier elimination)

tool for obtaining lower succinctness bounds using classical circuit complexity

Case study: OMQs with ontologies of depth 1

depth 1: no axioms such as $A \sqsubseteq \exists P$, $\exists P^- \sqsubseteq \exists R$

[Figure: canonical models of depth 1 and of depth 2, with labels $a$, $b$, $A$.]

- $Q = (\mathcal{T}, q)$ with $\mathcal{T}$ of depth 1: hypergraph $H_Q$ is of degree $\le 2$ (each vertex belongs to $\le 2$ hyperedges)
- hypergraph $H$ of degree $\le 2$: there exists an OMQ $Q_H$ with $\mathcal{T}$ of depth 1 and $H \cong H_{Q_H}$

What can hypergraph functions of degree 2 compute?
