database theory conjunctive queries static analysis
play

Database Theory: Conjunctive Queries & Static Analysis CS 645 - PowerPoint PPT Presentation

Database Theory: Conjunctive Queries & Static Analysis CS 645 Feb 20, 2006 1 Life of a database theoretician Expressiveness of query languages Any query in L1 can be expressed in L2 Query q cannot be expressed in L


  1. Database Theory: Conjunctive Queries & Static Analysis CS 645 Feb 20, 2006 1

  2. Life of a database theoretician • Expressiveness of query languages – Any query in L1 can be expressed in L2 – Query q cannot be expressed in L • Complexity of languages – Bounds on resources required to evaluate any query in language L • Static analysis of queries (for optimization) – Given q in L: is it minimal? – Given q1 and q2 in L: are they equivalent? • Views 2

  3. Coming lectures • TODAY: – Overview of languages – Conjunctive queries (CQs) – Properties of CQs – Containment/equivalence for CQs • Next Week – Adding recursion – Reasoning about views 3

  4. Query languages • So far we ʼ ve seen: – Relational algebra – Relational calculus – SQL 4

  5. Review: relational algebra • Five operators: – Union: ∪ – Difference: - – Selection: σ – Projection: Π – Cartesian Product: × • Derived or auxiliary operators: – Intersection, complement – Joins (natural,equi-join, theta join) – Renaming: ρ

  6. Review: relational calculus Name and sid of students who are taking the English: course “DB” Π name , sid ( Students Takes σ name = ”DB” ( Course) RA: {x name, x sid | ∃ x cid ∃ x term Students(x sid ,x name ) ∧ Takes(x sid ,x cid ) ∧ RC: Course(x cid ,”DB”, x term ) }

  7. Review: SQL Basic form: SELECT attributes FROM relations (possibly multiple, joined) WHERE conditions (selections)

  8. Query language classes Algebra Logic SQL Recursive Queries Expressiveness SFW + FO queries RA (safe) RC UNION EXCEPT Conjunctive single datalog Queries rule

  9. Conjunctive Queries abbreviated : CQ • A subset of FO queries (i.e. less expressive) • Many queries in practice are conjunctive • Some optimizers handle only conjunctive queries - break larger queries into many CQs • CQ ʼ s have “better” theoretical properties than arbitrary queries 9

  10. Conjunctive Queries in rule-based (datalog) notation Variables Subgoals P(x,z) :- R(x,y) & R(y,z) ← Implicit ∃ Head Body Conjunction “IF” • R: Extensional database (EDB) - stored • P: Intentional database (IDB) - computed 10

  11. Conjunctive Queries Intuitively: when facts in the body are true of stored relations, then we infer the fact in the head P(x,z) :- R(x,y) & R(y,z) • More formally: • Consider all possible substitutions : assignments of the variables in the body 11

  12. Examples EDB Relation: ManagedBy(emp,mgr) A(x) :- ManagedBy(“Smith”,y) & ManagedBy(x,y) All employees having the same manager as “Smith” ManagedBy(x,y) A(x) substitution substitution Sam Frank Sally substitution y = Bob y = Joan Smith Joan y = Joan Smith x = ? x = Smith x = Sally Sally Joan Smith Bob 12

  13. Defining answers to CQ general form ans(u) :- R1(u1) & ... & Rn(un) e.g. ui = (x,y,z) of a CQ q v(ui) = (a,a,b) • A substitution v is a function from variables into the domain. e.g. x → a, y → a, z → b, u → c • Let I be an instance, i.e. relations I(R1) ... I(Rn) • A tuple t is in the answer q(I) if there is a substitution v s.t: – v(u1) ∈ I(R1) for each i, and – t = v(u) 13

  14. Examples EDB Relation: ManagedBy(emp,mgr) • Find all employees having the same director as Smith: A(x) :- ManagedBy(“Smith”,y), ManagedBy(y,z), ManagedBy(x,u), ManagedBy(u,z) (Your director is your manager ʼ s manager) 14

  15. Query language classes Algebra Logic SQL Recursive Queries Expressiveness SFW + FO queries RA (safe) RC UNION EXCEPT single RA: Conjunctive datalog σ , π , × Queries rule

  16. CQ and RA Relational Algebra: • CQ correspond precisely to σ C , Π A , × (missing: ∪ , –) A(x) :- ManagedBy(“Smith”,y), ManagedBy(x,y) Π $2.name $1.manager=$2.manager σ name=“Smith” 16 ManagedBy ManagedBy

  17. Query language classes Algebra Logic SQL Recursive Queries Expressiveness SFW + FO queries RA (safe) RC UNION EXCEPT single RA: Conjunctive S d FW datalog σ , π , × Queries rule

  18. CQ and SQL Rule-based: A(x) :- ManagedBy(“Smith”,y), ManagedBy(x,y) Notice SQL: “distinct” select distinct m2.name from ManagedBy m1, ManagedBy m2 where m1.name=“Smith” AND m1.manager=m2.manager 18

  19. Boolean queries A() :- ManagedBy(“Smith”, x), ManagedBy(“Sally”, x) Is there someone who manages both Smith and Sally? non-empty relation • Returns: – relation { 〈〉 } if the answer is yes – relation { } if the answer is no empty relation 19

  20. Properties of Conjunctive Queries • Satisfiability – A query q is satisfiable if there exists some input relation I such that q(I) is non-empty. – FACT: Every CQ is satisfiable. • Monotonicity – A query q is monotonic if for each instance I,J over schema, I ⊆ J implies q(I) ⊆ q(J). – FACT: Every CQ is monotonic. 20

  21. Satisfiability of CQs We can always generate satisfying EDB relations from the body of the rule. S(x,y,z) :- P(x,w) & R(w,y,v) & P(v,z) d e a b b c d P a b R b c d S a c e d e 21

  22. Monotonicity of CQs general form ans(u) :- R1(u1) & ... & Rn(un) e.g. ui = (x,y,z) of a CQ q • Consider two databases I, J s.t. I ⊆ J. • let t ∈ q(I). – Then for some substitution v: • v(ui) ∈ I(Ri) for each i. • t = v(u) – Since I ⊆ J, v(ui) ∈ J(Ri) for each i – So t ∈ q(J)

  23. Consequence of monotonicity Product ( pname, price, category, maker) Find products that are more expensive than all those produced By “Gizmo-Works” SELECT name FROM Product WHERE price > ALL (SELECT price FROM Purchase WHERE maker= ʻ Gizmo-Works ʼ ) • This query is NOT monotone. • Therefore, it is not in the class of conjunctive queries. • It cannot be expressed as a simple SFW query.

  24. Extensions of CQs 24

  25. Query language classes Algebra Logic SQL Recursive Queries Expressiveness SFW + FO queries RA (safe) RC UNION EXCEPT single RA: Conjunctive S d FW datalog σ , π , × Queries rule

  26. Extensions of CQ: disequality CQ ≠ Find managers that manage at least 2 employees A(y) :- ManagedBy(x,y), ManagedBy(z,y), x ≠ z 26

  27. Extensions of CQ: inequality CQ < Find employees earning more than their manager A(y) :- ManagedBy(x,y), Salary(x,u), Salary(y,v), u > v Additional EDB Relation: Salary(emp,money)

  28. Extensions of CQ: negation CQ ¬ Find people sharing the same office with Alice, but with a different manager A(y) :- Office(“Alice”,u), Office(y,u), ManagedBy(“Alice”,x), ¬ManagedBy(y,x) Additional EDB Relation: Office(emp,officenum) 28

  29. Extensions of CQ: union UCQ Unions of conjunctive queries Rule-based: A(name) :- Employee(name, dept, age, salary), age > 50 A(name) :- RetiredEmployee(name, address) Datalog notation is very convenient for expressing unions (no need for ∨ ) 29

  30. Query language classes Algebra Logic SQL Recursive Queries Expressiveness SFW + FO queries RA (safe) RC UNION EXCEPT UCQ CQ < CQ ≠ CQ ¬ single RA: Conjunctive S d FW datalog σ , π , × Queries rule

  31. Extensions of CQ • If we extend too much, we capture FO – Namely: CQs + Union, Negation • Theoreticians need to be careful: small extensions may make a huge difference on certain theoretical properties of CQ 31

  32. Query language classes Algebra Logic SQL Recursive Queries Expressiveness (safe) RC SFW + FO queries RA UNION UCQ ¬ EXCEPT UCQ CQ < CQ ≠ CQ ¬ single RA: Conjunctive S d FW datalog σ , π , × Queries rule

  33. Query Equivalence and Containment • One kind of static analysis • Useful for query optimization • Intensively studied since 1977 33

  34. Query Equivalence SELECT x.name, x.manager FROM Employee x, Employee y WHERE x.dept = ʻ Sales ʼ and x.office = y.office and x.floor = 5 and y.dept = ʻ Sales ʼ Hmmmm…. Is there a simpler way to write that ? 34

  35. Query Equivalence • Queries q 1 and q 2 are equivalent if for every database D , q 1 ( D ) = q 2 ( D ). • Notation: q 1 ≡ q 2 relations equal 35

  36. Query Containment • Query q 1 is contained in q 2 if for every database D , q 1 ( D ) ⊆ q 2 ( D ). • Notation: q 1 ⊆ q 2 • Obviously: q 1 ⊆ q 2 and q 2 ⊆ q 1 iff q 1 ≡ q 2 • Conversely: q 1 ∧ q 2 ≡ q 2 iff q 1 ⊆ q 2 We will study the containment problem only.

  37. Sidenote: containment for Boolean queries • Recall: q 1 is contained in q 2 if for every database D , q 1 ( D ) ⊆ q 2 ( D ). – if q1, q2 are boolean they return { 〈〉 } or { } – containment says: – whenever q1(D) = { 〈〉 } then q2(D) = { 〈〉 }. • Containment is implication: q1 → q2 37

  38. Examples of Query Containments Is q 1 ⊆ q 2 ? q 1 (x) :- R(x,y), R(y,z), R(z,w) q 2 (x) :- R(x,y), R(y,z) 38

  39. Examples of Query Containments Is q 1 ⊆ q 2 ? q 1 (x) :- R(x,y), R(y,z), R(z,x) q 2 (x) :- R(x,y), R(y,x) Counter-example B A C 39

  40. Examples of Query Containments Is q 1 ⊆ q 2 ? q 1 (x) :- R(x,u), R(u,u) q 2 (x) :- R(x,u), R(u,v), R(v,w) Example A B 40

  41. Examples of Query Containments Is q 1 ⊆ q 2 ? q 1 (x) :- R(x,u), R(u,”Smith”) q 2 (x) :- R(x,u), R(u,v) 41

  42. Query Containment • Theorem Query containment for CQ is decidable and NP-complete. (query complexity) 42

Download Presentation
Download Policy: The content available on the website is offered to you 'AS IS' for your personal information and use only. It cannot be commercialized, licensed, or distributed on other websites without prior consent from the author. To download a presentation, simply click this link. If you encounter any difficulties during the download process, it's possible that the publisher has removed the file from their server.

Recommend


More recommend