Optimizing Query Answering under Ontological Constraints
Giorgio Orsi (1,2) and Andreas Pieris (2)
(1) Institute for the Future of Computing, Oxford Martin School, University of Oxford
(2) Department of Computer Science, University of Oxford
VLDB 2011


  5. Ontological Databases
     - Figure labels: Ontological Reasoning, DB Constraints, Ontological DB
     - Database D (≈ ABox), constraints Σ (≈ TBox)
     - Query: Q(X) ← ∃Y φ(X, Y)
     - Answers: { t | D ∪ Σ ⊨ ∃u φ(t, u) }

  6. Ontological Constraints (examples)
     - Concept Inclusion: ∀X emp(X) → person(X)
     - (Inverse) Relation Inclusion: ∀X ∀Y manages(X, Y) → isManaged(Y, X)
     - Relation Transitivity: ∀X ∀Y ∀Z mgs(X, Y), mgs(Y, Z) → mgs(X, Z)
     - Participation: ∀X emp(X) → ∃Y report(X, Y)
     - Disjointness: ∀X emp(X), customer(X) → ⊥
     - Functionality: ∀X ∀Y ∀Z reports(X, Y), reports(X, Z) → Y = Z

  10. Datalog± [Calì et al., PODS 09]
      - Datalog variant allowing in the head (Datalog+):
        - ∃-variables → TGDs: ∀X ∀Y φ(X, Y) → ∃Z ψ(X, Z)
        - Equality atoms → EGDs: ∀X φ(X) → Xi = Xj
        - Constant false (⊥) → NCs: ∀X φ(X) → ⊥
      - But query answering under Datalog+ is undecidable
      - Datalog+ is syntactically restricted → Datalog±
      - TGDs are more expressive than inclusion dependencies:
        ∀D ∀P ∀A runs(D, P), area(P, A) → ∃E employee(E, D, A)

  15. The Chase Procedure
      - Input: database D, set of TGDs Σ
      - Output: a model of D ∪ Σ
      - D: person(john)
      - Σ: ∀X person(X) → ∃Y father(Y, X)
           ∀X ∀Y father(X, Y) → person(X)
      - chase(D, Σ) = D ∪ { father(z1, john), person(z1), father(z2, z1), … }
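
The chase steps on this slide can be sketched in Python. This is only an illustration under assumptions that are not in the slides: facts are encoded as tuples, the two TGDs are hard-coded, a TGD fires only when its head is not yet satisfied, and the `max_rounds` cut-off is introduced here because the chase for this Σ never terminates.

```python
from itertools import count

# Facts are tuples (predicate, arg1, ...); labelled nulls are the strings "z1", "z2", ...
def chase(database, max_rounds):
    """Chase sketch for the two TGDs of the slide, stopped after max_rounds rounds."""
    facts = set(database)
    fresh = count(1)  # generator of fresh null indices
    for _ in range(max_rounds):
        new_facts = set()
        for fact in facts:
            # TGD 1: person(X) -> exists Y. father(Y, X)
            if fact[0] == "person":
                x = fact[1]
                has_father = any(
                    f[0] == "father" and f[2] == x for f in facts | new_facts
                )
                if not has_father:
                    new_facts.add(("father", f"z{next(fresh)}", x))
            # TGD 2: father(X, Y) -> person(X)
            if fact[0] == "father":
                new_facts.add(("person", fact[1]))
        if new_facts <= facts:
            break  # fixpoint reached (never happens for this particular Sigma)
        facts |= new_facts
    return facts

result = chase({("person", "john")}, max_rounds=3)
print(sorted(result))
```

With three rounds the sketch produces exactly the finite prefix shown on the slide: person(john), father(z1, john), person(z1), father(z2, z1).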

  16. Query Answering via Chase
      - [Figure: the chase C = chase(D, Σ) is mapped by homomorphisms h1, h2, … to h1(C), h2(C), … inside the models M1, M2, … of D ∪ Σ; a homomorphism h maps Q into C.]
      - D ∪ Σ ⊨ Q ⇔ chase(D, Σ) ⊨ Q [see, e.g., Deutsch, Nash & Remmel, PODS 08]

  19. Query Answering via Rewriting
      - Compilation: Σ and Q are compiled into a rewritten query Q_Σ
      - Evaluation: Q_Σ is evaluated over D

  20. Chase vs Rewriting

  21. Linear TGDs
      - ∀X ∀Y r(X, Y) → ∃Z ψ(X, Z) (single body atom)
      - Properly generalize inclusion dependencies.
      - Enjoy the bounded-derivation depth property.
      - FO-rewritable ⇒ query answering in AC0 (data complexity).

  22. FO-rewritability: example [Gottlob et al., ICDE 11]
      - Σ: promoter(X) → ∃Y promotesTo(X, Y)
           promotesTo(X, Y) → customer(Y)
      - Q: q ← promotesTo(A, B), customer(B)
      - Q_Σ so far: q ← promotesTo(A, B), customer(B) (original query)

  23. FO-rewritability: example [Gottlob et al., ICDE 11]
      - Rewriting step: the atom customer(B) of q ← promotesTo(A, B), customer(B) is resolved with promotesTo(X, Y) → customer(Y) under { Y = B }
      - Result: q ← promotesTo(A, B), promotesTo(V0, B) (V0 is fresh)

  24. FO-rewritability: example [Gottlob et al., ICDE 11]
      - Factorization: the two atoms of q ← promotesTo(A, B), promotesTo(V0, B) are unified under { A = V0 }
      - Result: q ← promotesTo(A, B)

  25. FO-rewritability: example [Gottlob et al., ICDE 11]
      - Rewriting step: q ← promotesTo(A, B) is resolved with promoter(X) → ∃Y promotesTo(X, Y) under { X = A, Y = B }
      - Result: q ← promoter(A)

  26. FO-rewritability: example [Gottlob et al., ICDE 11]
      - Σ: promoter(X) → ∃Y promotesTo(X, Y)
           promotesTo(X, Y) → customer(Y)
      - Q: q ← promotesTo(A, B), customer(B)
      - UCQ rewriting Q_Σ (first-order):
           q ← promotesTo(A, B), customer(B)
           q ← promotesTo(A, B)
           q ← promoter(A)
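
To see why the UCQ rewriting matters, the three conjunctive queries of the final slide can be evaluated directly over a database, without looking at Σ. The sketch below is illustrative only: the dictionary encoding of the database and the constant `ann` are assumptions made here, not part of the slides.

```python
# A database maps each relation name to a set of tuples.

def q1(db):
    # q <- promotesTo(A, B), customer(B)   (the original query)
    customers = db.get("customer", set())
    return any((b,) in customers for (_a, b) in db.get("promotesTo", set()))

def q2(db):
    # q <- promotesTo(A, B)
    return bool(db.get("promotesTo"))

def q3(db):
    # q <- promoter(A)
    return bool(db.get("promoter"))

def ucq_rewriting(db):
    # The rewriting is the union (disjunction) of the three CQs.
    return q1(db) or q2(db) or q3(db)

db = {"promoter": {("ann",)}}   # hypothetical database with one promoter
print(q1(db))             # False: the original query alone finds nothing
print(ucq_rewriting(db))  # True: promoter(ann) entails the query under Sigma
```

The database stores neither promotesTo nor customer facts, yet Σ guarantees that ann promotes to some customer; the rewriting captures this entailment with a plain first-order query over the stored data.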

  28. FO-rewritability
      - Desirable properties of an FO-rewriting:
        - independent of the DB
        - executable by any DBMS
        - easy to compute (e.g., polynomial time)
        - small size (e.g., polynomial size)
      - Unions of Conjunctive Queries (UCQs) [Calvanese et al., JAR 07; Pérez-Urbina et al., JAL 09; Calì et al., PODS 09; Gottlob et al., ICDE 11; and others]
        - executable by any DBMS
        - DB independent
        - easy to optimize and distribute
        - worst-case exponential size in Q and Σ

  30. FO-rewritability
      - Combined and hybrid FO-rewriting [Pérez-Urbina et al., JAL 09; Kontchakov et al., KR 10; Gottlob and Schwentick, DL 11]
        - good computational properties (e.g., polynomial in size)
        - requires access to the DB
      - Purely intensional Datalog rewriting [Pérez-Urbina et al., JAL 09; Rosati and Almatelli, KR 10]
        - very compressed representation
        - purely intensional
        - requires view-creation or Datalog engine

  32. Datalog Rewriting: Keep it First-Order!
      - A Datalog query is (in general) not a first-order query
        - a non-recursive Datalog query is a first-order query
        - a bounded Datalog query is a first-order query
      - Input:
        - a (w.l.o.g. Boolean) conjunctive query Q = ⟨q, ρ⟩; e.g., Q: q(X) ← p(X), s(X, Y) corresponds to ⟨q, q(X) ← p(X), s(X, Y)⟩
        - a set of linear TGDs Σ
      - Output:
        - a bounded Datalog query Q_Σ = ⟨q, π_Σ⟩

  35. Datalog Rewriting: Skolemization (and renaming)
      - Σ                           Σ^f
        r(X, Y) → ∃Z s(Y, Z)        r(X1, Y1) → s(Y1, f1(Y1))
        s(X, Y) → ∃Z p(Y, Y, Z)     s(X2, Y2) → p(Y2, Y2, f2(Y2))
        p(X, Y, Z) → t(Z)           p(X3, Y3, Z3) → t(Z3)
      - Σ^f and Σ are equisatisfiable (not equivalent)
      - Introduce one Skolem function for each existential variable
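
The Skolemization and renaming step can be sketched as follows. The encoding is an assumption made for this illustration: an atom is a pair (predicate, tuple of variable names), a Skolem term is a pair (function name, tuple of arguments), and each Skolem function takes as arguments the head variables that also occur in the body, which matches f1(Y1) and f2(Y2) on the slide.

```python
from itertools import count

def skolemize(rules):
    """rules: list of (body_atom, head_atom, existential_vars) for linear TGDs."""
    fresh = count(1)  # one Skolem function per existential variable: f1, f2, ...
    result = []
    for i, (body, head, ex_vars) in enumerate(rules, start=1):
        body_pred, body_args = body
        head_pred, head_args = head
        # Frontier: renamed universal variables shared by body and head.
        frontier = []
        for v in head_args:
            if v not in ex_vars and v in body_args and f"{v}{i}" not in frontier:
                frontier.append(f"{v}{i}")
        skolem = {v: (f"f{next(fresh)}", tuple(frontier)) for v in ex_vars}
        new_body = (body_pred, tuple(f"{v}{i}" for v in body_args))
        new_head = (
            head_pred,
            tuple(skolem[v] if v in skolem else f"{v}{i}" for v in head_args),
        )
        result.append((new_body, new_head))
    return result

def show(term):
    """Render a variable, Skolem term, or atom as text."""
    if isinstance(term, str):
        return term
    name, args = term
    return f"{name}({','.join(show(a) for a in args)})"

sigma = [
    (("r", ("X", "Y")), ("s", ("Y", "Z")), {"Z"}),
    (("s", ("X", "Y")), ("p", ("Y", "Y", "Z")), {"Z"}),
    (("p", ("X", "Y", "Z")), ("t", ("Z",)), set()),
]
for body, head in skolemize(sigma):
    print(f"{show(body)} -> {show(head)}")
```

Run on the three TGDs of the slide, the sketch prints the three rules of Σ^f listed above.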

  37. Datalog Rewriting: Rule Saturation
      - Apply the resolution inference rule to rules in Σ^f
        - at least one of the rules contains Skolem terms
      - Σ^f:
          δ1: r(X1, Y1) → s(Y1, f1(Y1))
          δ2: s(X2, Y2) → p(Y2, Y2, f2(Y2))
          δ3: p(X3, Y3, Z3) → t(Z3)
      - [Σ^f] (saturated set):
          …
          r(X1, Y1) → p(f1(Y1), f1(Y1), f2(f1(Y1)))
          …
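
A single resolution step between two Skolemized linear rules can be sketched with textbook unification. This is not the authors' implementation: the term encoding (variables as strings, Skolem terms and atoms as (name, args) pairs) and the function names are assumptions, and the sketch relies on the two rules sharing no variables, as the renaming on the slides ensures.

```python
def substitute(term, theta):
    """Apply substitution theta to a variable, Skolem term, or atom."""
    if isinstance(term, str):
        return substitute(theta[term], theta) if term in theta else term
    name, args = term
    return (name, tuple(substitute(a, theta) for a in args))

def occurs(var, term, theta):
    term = substitute(term, theta)
    if isinstance(term, str):
        return term == var
    return any(occurs(var, a, theta) for a in term[1])

def unify(s, t, theta):
    """Return a most general unifier extending theta, or None."""
    s, t = substitute(s, theta), substitute(t, theta)
    if s == t:
        return theta
    if isinstance(s, str):
        return None if occurs(s, t, theta) else {**theta, s: t}
    if isinstance(t, str):
        return unify(t, s, theta)
    if s[0] != t[0] or len(s[1]) != len(t[1]):
        return None
    for a, b in zip(s[1], t[1]):
        theta = unify(a, b, theta)
        if theta is None:
            return None
    return theta

def resolve(rule1, rule2):
    """Resolve head of rule1 against body of rule2: body1 -> head2 under the mgu."""
    (body1, head1), (body2, head2) = rule1, rule2
    theta = unify(body2, head1, {})
    if theta is None:
        return None
    return (substitute(body1, theta), substitute(head2, theta))

def show(term):
    if isinstance(term, str):
        return term
    return f"{term[0]}({','.join(show(a) for a in term[1])})"

d1 = (("r", ("X1", "Y1")), ("s", ("Y1", ("f1", ("Y1",)))))
d2 = (("s", ("X2", "Y2")), ("p", ("Y2", "Y2", ("f2", ("Y2",)))))
d3 = (("p", ("X3", "Y3", "Z3")), ("t", ("Z3",)))

body, head = resolve(d1, d2)
print(f"{show(body)} -> {show(head)}")
```

Resolving δ1 with δ2 yields the derived rule shown on the slide, r(X1,Y1) → p(f1(Y1),f1(Y1),f2(f1(Y1))). Saturation would repeat such steps until no new rule appears; that loop is omitted here.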

  39. Datalog Rewriting: Properties of Rule Saturation
      - [Σ^f] mimics the chase derivations.
          δ1: r(X1, Y1) → s(Y1, f1(Y1))
          δ2: s(X2, Y2) → p(Y2, Y2, f2(Y2))
          δ3: p(X3, Y3, Z3) → t(Z3)
