
Islands of tractability in ontology-based data access



  1. Islands of tractability in ontology-based data access
     Michael Zakharyaschev
     Department of Computer Science and Information Systems, Birkbeck, University of London
     http://www.dcs.bbk.ac.uk/~michael
     Supported by EPSRC grants ExODA EP/H05099X and iTract EP/M012670

  2. Data access in industry (from the Norwegian Petroleum Directorate's FactPages)
     "Show me the wellbores completed before 2008 where Statoil as a drilling operator sampled less than 10 meters of cores."
     5 days later:

         SELECT DISTINCT cores.wlbName, cores.lenghtM,
                         wellbore.wlbDrillingOperator, wellbore.wlbCompletionYear
         FROM
           ( (SELECT wlbName, wlbNpdidWellbore, (wlbTotalCoreLength * 0.3048) AS lenghtM
              FROM wellbore_core
              WHERE wlbCoreIntervalUom = '[ft ]')
             UNION
             (SELECT wlbName, wlbNpdidWellbore, wlbTotalCoreLength AS lenghtM
              FROM wellbore_core
              WHERE wlbCoreIntervalUom = '[m ]')
           ) AS cores,
           ( (SELECT wlbNpdidWellbore, wlbDrillingOperator, wlbCompletionYear
              FROM wellbore_development_all)
             UNION
             (SELECT wlbNpdidWellbore, wlbDrillingOperator, wlbCompletionYear
              FROM wellbore_exploration_all)
             UNION
             (SELECT wlbNpdidWellbore, wlbDrillingOperator, wlbCompletionYear
              FROM wellbore_shallow_all)
           ) AS wellbore
         WHERE wellbore.wlbNpdidWellbore = cores.wlbNpdidWellbore
         ...

     In STATOIL:
       – 1,000 TB of relational data
       – 2,000 tables
       – different schemas
       – 30–70% of time spent on data gathering
     UCL 16.11.15

  3. Ontology-based data access (OBDA) (the Romans ≈ 2007)
     Query:

         SELECT DISTINCT ?unit ?well
         WHERE {
           [] npdv:stratumForWellbore ?wellboreURI ;
              npdv:inLithostratigraphicUnit [ npdv:name ?unit ] .
           ?wellboreURI npdv:name ?well .
           ?core a npdv:WellboreCore ;
                 npdv:coreForWellbore ?wellboreURI .
         }

     Ontology:
     [Diagram labels: Wellbore, ProductionWellbore, WellboreCore, WellboreStratum, coreForWellbore, stratumForWellbore]

     Mappings:

         [] rdf:type rr:TriplesMap ;
            rr:logicalTable "select * from wellbore_core" ;
            rr:subjectMap [ a rr:TermMap ;
                            rr:template "&npd-v2;wellbore/{wlbNpdidWellbore}/" ; ] ;
            rr:propertyObjectMap [ rr:property npdv:coreIntervalBottom ;
                                   rr:column "wlbCoreIntervalBottom" ] ;
         ...

     Data sources:

         CREATE TABLE wellbore_core (
           wlbName varchar(60) NOT NULL,
           wlbCoreNumber int(11) NOT NULL,
           wlbCoreIntervalTop decimal(13,6),
           ...
         )

     Ontology
       – gives a high-level conceptual view of the data
       – provides a convenient & natural vocabulary for user queries
       – enriches incomplete data with background knowledge

  4. OBDA via FO-rewriting
     [Diagram labels: query $q$; rewriting (+ ontology $\mathcal{T}$); $q'$; unfolding (+ mapping); database (n-ary relations); virtual ABox $\mathcal{A}$ (triples); canonical model (derived triples)]
     Ontology axiom: npdv:MoveableFacility ⊑ npdv:Facility
     Mapping: npdv:MoveableFacility(URI("&npdv;facility/{}", t7)) :- facility_moveable(t1, ..., t6, t7, t8, ..., t10)
     For all $\mathcal{A}$ and $\vec{a}$:
       $\mathcal{T}, \mathcal{A} \models q(\vec{a}) \iff \mathcal{I}_{\mathcal{A}} \models q'(\vec{a})$
     Reduction to DB query evaluation

  5. OWL 2 QL profile of OWL 2 (W3C 2012)
     Each construct is listed in description-logic syntax, followed by its first-order form.
     Roles:
       $R ::= \top \mid P \mid P^-$                      $\varrho(x,y) ::= \top \mid P(x,y) \mid P(y,x)$
     Basic concepts:
       $B ::= \top \mid A \mid \exists R$                $\tau(x) ::= \top \mid A(x) \mid \exists y\, \varrho(x,y)$
     TBoxes:
       $B \sqsubseteq B'$                                $\forall x\, (\tau(x) \to \tau'(x))$
       $R \sqsubseteq R'$                                $\forall x,y\, (\varrho(x,y) \to \varrho'(x,y))$
       $R$ is reflexive                                  $\forall x\, \varrho(x,x)$
       $B \sqcap B' \sqsubseteq \bot$                    $\forall x\, (\tau(x) \wedge \tau'(x) \to \bot)$
       $R \sqcap R' \sqsubseteq \bot$                    $\forall x,y\, (\varrho(x,y) \wedge \varrho'(x,y) \to \bot)$
       $R$ is irreflexive                                $\forall x\, (\varrho(x,x) \to \bot)$
     Sugar (expressible via additional role inclusions):
       $B \sqsubseteq \exists R.B'$                      $\forall x\, (\tau(x) \to \exists y\, (\varrho_1(x,y) \wedge \dots \wedge \varrho_k(x,y) \wedge \tau'(y)))$
     ABoxes: $\{ A(a), P(a,b), \dots \}$
     Based on the 'DL-Lite family' designed by the Romans (≈ 2005) and extended by Artale, Calvanese, Kontchakov & Z (2007–9)

  6. Example
     Staff ontology $\mathcal{T}$:
       $\forall x\, (\mathit{ProjectManager}(x) \to \exists y\, (\mathit{isAssistedBy}(x,y) \wedge \mathit{PA}(y)))$
       $\forall x\, (\exists y\, \mathit{managesProject}(x,y) \to \mathit{ProjectManager}(x))$
       $\forall x\, (\mathit{ProjectManager}(x) \to \mathit{Staff}(x))$
       $\forall x\, (\mathit{PA}(x) \to \mathit{Secretary}(x))$
     User query $q$: find the staff assisted by secretaries
       $q(x) = \exists y\, (\mathit{Staff}(x) \wedge \mathit{isAssistedBy}(x,y) \wedge \mathit{Secretary}(y))$
     PE-rewriting of the ontology-mediated query $(\mathcal{T}, q)$:
       $q'(x) = \exists y\, \big[ \mathit{Staff}(x) \wedge \mathit{isAssistedBy}(x,y) \wedge (\mathit{Secretary}(y) \vee \mathit{PA}(y)) \big] \vee \mathit{ProjectManager}(x) \vee \exists z\, \mathit{managesProject}(x,z)$
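
The point of the PE-rewriting is that it can be evaluated over the data alone, with no reasoning at query time. The following is a minimal sketch of that idea, assuming an ABox stored as Python sets of tuples; the individuals (`ann`, `bob`, `carol`, ...) and the helper names are hypothetical and not part of the slides.

```python
# ABox as a dict: predicate name -> set of tuples (hypothetical sample data).
abox = {
    "Staff": {("carol",)},
    "Secretary": {("dan",)},
    "PA": {("eve",)},
    "ProjectManager": {("ann",)},
    "managesProject": {("bob", "p1")},
    "isAssistedBy": {("carol", "dan"), ("frank", "eve")},
}

def individuals(abox):
    """All constants occurring anywhere in the ABox."""
    return {c for tuples in abox.values() for t in tuples for c in t}

def holds(abox, pred, *args):
    """True iff the atom pred(args) is explicitly stored in the ABox."""
    return tuple(args) in abox.get(pred, set())

def q_rewritten(abox, x):
    """Evaluate the PE-rewriting q'(x) of slide 6 over the raw data only."""
    dom = individuals(abox)
    assisted = any(
        holds(abox, "Staff", x)
        and holds(abox, "isAssistedBy", x, y)
        and (holds(abox, "Secretary", y) or holds(abox, "PA", y))
        for y in dom
    )
    manages = any(holds(abox, "managesProject", x, z) for z in dom)
    return assisted or holds(abox, "ProjectManager", x) or manages

answers = sorted(a for a in individuals(abox) if q_rewritten(abox, a))
print(answers)  # ['ann', 'bob', 'carol']
```

Here `ann` and `bob` are returned although the data contains no `Staff` or `isAssistedBy` facts about them: the disjuncts `ProjectManager(x)` and `∃z managesProject(x, z)` stand in for the ontology axioms. `frank` is not returned, since nothing makes him a member of `Staff`.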

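  7. Why are OWL 2 QL OMQs FO-rewritable?
     – Canonical model (chase) $\mathcal{C}_{\mathcal{T},\mathcal{A}}$ of a given consistent $(\mathcal{T}, \mathcal{A})$: homomorphically embeddable into every model of $(\mathcal{T}, \mathcal{A})$
         $\mathcal{T}, \mathcal{A} \models q \iff \mathcal{C}_{\mathcal{T},\mathcal{A}} \models q$ for any CQ $q$
       Example: $\mathcal{T} = \{ A \sqsubseteq \exists R^-.\exists R.B,\ B \sqsubseteq \exists S.B \}$, $\mathcal{A} = \{ A(a) \}$
       [Figure: successive stages of the construction of $\mathcal{C}_{\mathcal{T},\mathcal{A}}$ from $a$, with edges labelled $R$ and $S$ and elements labelled $A$ and $B$]
       All Horn DLs have canonical models, but the OMQ $(\{ \exists R.A \sqsubseteq A \}, A(x))$ is not FO-rewritable (recursive datalog needed).
     – Bounded depth derivation property: there is a function $f$ such that
         $\mathcal{T}, \mathcal{A} \models q \iff \mathcal{C}^N_{\mathcal{T},\mathcal{A}} \models q$
       with $\mathcal{C}^N_{\mathcal{T},\mathcal{A}}$ constructed in $N = f(|\mathcal{T}|, |q|)$ steps
       ($\iff$ FO-rewritability)
       $f$ is polynomial for OWL 2 QL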
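
To make the step-bounded canonical model concrete, here is a minimal sketch of a chase run for a fixed number of rounds on the example TBox above. It is a simplification under stated assumptions: every axiom has the form C ⊑ ∃role.D, the nested axiom A ⊑ ∃R⁻.∃R.B is flattened with a hypothetical auxiliary concept `Aux`, each axiom fires once per element, and `steps=4` is an arbitrary choice, not the bound f(|T|, |q|) from the slide.

```python
from itertools import count

# Axioms C ⊑ ∃role.D written as (C, role, D); "R-" denotes the inverse of R.
# "Aux" is a hypothetical helper concept that flattens A ⊑ ∃R⁻.∃R.B.
TBOX = [("A", "R-", "Aux"), ("Aux", "R", "B"), ("B", "S", "B")]

def chase(concepts, roles, tbox, steps):
    """Apply each applicable axiom once per element, for `steps` rounds."""
    concepts = set(concepts)
    roles = set(roles)
    fresh = count(1)
    done = set()  # (axiom index, element) pairs that already fired
    for _ in range(steps):
        new_c, new_r = set(), set()
        for i, (c, role, d) in enumerate(tbox):
            for (c2, e) in sorted(concepts):
                if c2 != c or (i, e) in done:
                    continue
                done.add((i, e))
                w = f"w{next(fresh)}"  # fresh witness (labelled null)
                new_c.add((d, w))
                if role.endswith("-"):
                    new_r.add((role[:-1], w, e))  # inverse role: edge from w to e
                else:
                    new_r.add((role, e, w))
        concepts |= new_c
        roles |= new_r
    return concepts, roles

concepts, roles = chase({("A", "a")}, set(), TBOX, steps=4)
print(sorted(roles))
```

After four rounds the structure contains the edges R(w1, a), R(w1, w2), S(w2, w3) and S(w3, w4). Further rounds only extend the S-chain of B-elements, which is why the full canonical model is infinite and a bound N on the construction depth matters.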

  8. What is the price of OBDA?
     – reduction to DB query evaluation could be too expensive, in which case OBDA would not be viable
     1. What is the size of rewritings?
        – depending on the type of OMQs
        – depending on the type of rewritings
        A new research (succinctness) problem.
     2. What is the combined complexity of OMQ answering?
        – depending on the type of OMQs
        A well-known problem in DB theory.
     It may turn out that reduction to DB query evaluation is not the optimal way of OMQ answering.

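  9. Tree-witness rewriting of OMQ $Q = (\mathcal{T}, q)$
     [Figure labels: $q$, $h$, $\mathcal{C}_{\mathcal{T},\mathcal{A}}$, $q_{t_1}$, $q_{t_2}$, $\mathcal{C}_{\mathcal{T}}^{\tau_1(a_1)}$, $\mathcal{C}_{\mathcal{T}}^{\tau_2(a_2)}$]
     $$q_{\mathsf{tw}}(\vec{x}) = \exists \vec{y} \bigvee_{\substack{\Theta \text{ independent set} \\ \text{of tree witnesses}}} \Big( \bigwedge_{S(\vec{z}) \in q \setminus q_\Theta} S(\vec{z}) \ \wedge \bigwedge_{t \in \Theta} \mathsf{tw}_t \Big)$$
     $\Theta$ is independent if $q_t \cap q_{t'} = \emptyset$ for any distinct $t, t' \in \Theta$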

  10. The number of tree witnesses
      [Figure labels: $q(x_1, x_2, x_3)$, $\mathcal{C}_{\mathcal{T}, \{A(a)\}}$, $a$, $A$, $B$, $x_1$, $x_2$, $x_3$]
      Exponentially many tree witnesses give a huge tw-rewriting. However, it can be simplified to a polynomial-size PE-rewriting:
      $$q(x_1, x_2, x_3) \ \vee\ \exists z\, \Big( A(z) \wedge \bigwedge_{i=1}^{n} \big( (x_i = z) \vee \exists y\, (R(y, x_i) \wedge R(y, z)) \big) \Big)$$
      Can we always do this?

  11. Circuit complexity
      P/poly: the class of problems decidable by polynomial-size circuit families
        $\mathrm{P} \subseteq \mathrm{P/poly}$
        if $\mathrm{NP} \not\subseteq \mathrm{P/poly}$ then $\mathrm{P} \ne \mathrm{NP}$
      – almost all Boolean functions with $n$ inputs require circuits of size $\Theta(2^n / n)$ (Shannon 1949)
      Are there complex Boolean functions $f_n$ in NP? (known lower bound: $5n - o(n)$)
      Nobody knows, but ...

  12. Monotone circuit complexity (Razborov, Raz, et al. 1985)
      Boolean variables $e_{ij}$ give a graph $G = (V, E)$: $V = \{1, \dots, n\}$, $E = \{ \{i, j\} \mid e_{ij} = 1 \}$
      – $\mathrm{CLIQUE}_{n,k}(\vec{e}) = 1$ iff $G$ contains a $k$-clique
          monotone circuits: exp ($2^{\varepsilon \sqrt{k}}$, e.g., for $k \le n^{1/4}$)
          monotone formulas: exp
          formulas with ¬: superpoly unless $\mathrm{NP} \subseteq \mathrm{P/poly}$
      – $\mathrm{MATCHING}_n(\vec{e}) = 1$ iff the bipartite graph $\vec{e}$ with $n$ vertices in each part has a perfect matching (a subset of edges containing every node once)
          monotone formulas: exp
          formulas with ¬: poly
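
The two functions can be written down directly from their definitions. The sketch below is a brute-force evaluator meant only to make the definitions concrete; it says nothing about circuit size. The function names and the edge encodings (pairs of vertex numbers) are assumptions made for illustration.

```python
from itertools import combinations, permutations

def clique(n, k, edges):
    """CLIQUE_{n,k}: True iff the graph on vertices 1..n has a k-clique."""
    e = {frozenset(p) for p in edges}
    return any(
        all(frozenset(p) in e for p in combinations(s, 2))
        for s in combinations(range(1, n + 1), k)
    )

def matching(n, edges):
    """MATCHING_n: True iff the bipartite graph has a perfect matching.

    Both parts are numbered 0..n-1; (i, j) joins left vertex i to right vertex j.
    """
    return any(
        all((i, pi[i]) in edges for i in range(n))
        for pi in permutations(range(n))
    )

print(clique(4, 3, [(1, 2), (2, 3), (1, 3)]))  # True: {1, 2, 3} is a triangle
print(matching(2, {(0, 0), (1, 0)}))           # False: both left vertices need right vertex 0
```

Both functions are monotone: adding edges (switching some $e_{ij}$ from 0 to 1) can never change the value from 1 to 0, which is what makes monotone circuits and formulas a natural model for them.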

  13. Tree-witness rewriting as a Boolean function
      An OMQ $Q = (\mathcal{T}, q)$ gives a hypergraph $H_Q = (V, E)$, where
        vertices $V$ = atoms of $q$
        hyperedges $E$ = tree witnesses $q_t$
      Monotone Boolean hypergraph function for $Q$ (or hypergraph $H_Q$):
      $$f_Q = \bigvee_{E' \subseteq E \text{ independent}} \Big( \bigwedge_{v \in V \setminus V_{E'}} p_v \ \wedge \bigwedge_{e \in E'} p_e \Big)$$
      (some tweaks required in case of exponentially many tree witnesses)
      – a Boolean formula $\varphi$ for $f_Q$ gives an FO-rewriting of size $O(|\varphi| \cdot |Q|)$
      – a monotone Boolean formula $\varphi$ for $f_Q$ gives a PE-rewriting
      – a monotone Boolean circuit $\varphi$ for $f_Q$ gives an NDL-rewriting (nonrecursive datalog)
      A tool for obtaining upper succinctness and complexity bounds using classical circuit complexity.
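
A hypergraph function can be evaluated by brute force straight from its definition, which may help when checking small examples by hand. The sketch below assumes that $V_{E'}$ denotes the set of vertices covered by the hyperedges in $E'$ and that "independent" means pairwise disjoint, as on slide 9; the sample hypergraph and all names are hypothetical.

```python
from itertools import combinations

def hypergraph_function(vertices, hyperedges, p_v, p_e):
    """Evaluate f_H by enumerating all independent subsets E' of hyperedges.

    vertices:   iterable of vertex names
    hyperedges: dict mapping a hyperedge name to its set of vertices
    p_v, p_e:   dicts with the Boolean value of each vertex / hyperedge variable
    """
    names = list(hyperedges)
    for k in range(len(names) + 1):
        for chosen in combinations(names, k):
            sets = [hyperedges[e] for e in chosen]
            # E' is independent iff its hyperedges are pairwise disjoint
            if any(a & b for a, b in combinations(sets, 2)):
                continue
            covered = set().union(*sets)  # assumed meaning of V_{E'}
            if all(p_e[e] for e in chosen) and all(
                p_v[v] for v in vertices if v not in covered
            ):
                return True
    return False

# Hypothetical hypergraph: three vertices, two overlapping hyperedges.
V = [1, 2, 3]
E = {"e1": {1, 2}, "e2": {2, 3}}
print(hypergraph_function(V, E, {1: False, 2: False, 3: True},
                          {"e1": True, "e2": False}))  # True, via E' = {e1}
```

The enumeration is exponential in the number of hyperedges, so this is only a reference evaluator; the slide's point is that compact formulas or circuits for the same function translate into compact rewritings.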

  14. Tool for lower bounds
      For any OMQ $Q = (\mathcal{T}, q)$ and assignment $\alpha \colon \mathrm{predicates}(q) \to \{0, 1\}$,
        $\mathcal{A}_\alpha = \{ A(a) \mid \alpha(A) = 1 \} \cup \{ P(a, a) \mid \alpha(P) = 1 \}$
      is an ABox with a single individual $a$.
      Primitive evaluation function:
        $g_Q(\alpha) = 1 \iff \mathcal{T}, \mathcal{A}_\alpha \models q(\vec{a})$
      – an FO-rewriting $q'$ of $Q$ gives a Boolean formula for $g_Q$ of size $O(|q'|)$
      – a PE-rewriting $q'$ of $Q$ gives a monotone Boolean formula for $g_Q$
      – an NDL-rewriting $q'$ of $Q$ gives a monotone Boolean circuit for $g_Q$
      (proof by quantifier elimination)
      A tool for obtaining lower succinctness bounds using classical circuit complexity.

  15. Case study: OMQs with ontologies of depth 1
      Depth 1: no axioms such as $A \sqsubseteq \exists P$, $\exists P^- \sqsubseteq \exists R$
      [Figure: canonical models of depth 1 and depth 2, with elements $a$, $b$ and label $A$]
      – $Q = (\mathcal{T}, q)$ with $\mathcal{T}$ of depth 1 gives a hypergraph $H_Q$ of degree ≤ 2 (each vertex belongs to ≤ 2 hyperedges)
      – for a hypergraph $H$ of degree ≤ 2, there exists an OMQ $Q_H$ with $\mathcal{T}$ of depth 1 and $H \cong H_{Q_H}$
      What can hypergraph functions of degree 2 compute?
