Relational Database: Identities of Relational Algebra; Example of - - PowerPoint PPT Presentation

relational database identities of relational algebra
SMART_READER_LITE
LIVE PREVIEW

Relational Database: Identities of Relational Algebra; Example of - - PowerPoint PPT Presentation

Relational Database: Identities of Relational Algebra; Example of Query Optimization Greg Plaxton Theory in Programming Practice, Fall 2005 Department of Computer Science University of Texas at Austin Selection Splitting For any database


slide-1
SLIDE 1

Relational Database: Identities of Relational Algebra; Example of Query Optimization

Greg Plaxton Theory in Programming Practice, Fall 2005 Department of Computer Science University of Texas at Austin

slide-2
SLIDE 2

Selection Splitting

  • For any database relation R and predicates p, q, we have

σp∧q(R) = σp(σq(R))

  • A corollary is that selection is commutative, that is,

σp(σq(R)) = σq(σp(R))

Theory in Programming Practice, Plaxton, Fall 2005

slide-3
SLIDE 3

Projection Refinement

  • For any subsets a and b of a database relation R such that a ⊆ b, we

have πa(R) = πa(πb(R))

Theory in Programming Practice, Plaxton, Fall 2005

slide-4
SLIDE 4

Commutativity of Selection and Projection

  • For any subset a of the attributes of a database relation R, and any

predicate p that names only attributes in a, we have πa(σp(R)) = σp(πa(R))

Theory in Programming Practice, Plaxton, Fall 2005

slide-5
SLIDE 5

Commutativity and Associativity of Union, Cross Product, Join

  • Union and cross product are commutative and associative
  • Join is commutative
  • For any database relations R, S, and T such that (1) R and S have at

least one common attribute, (2) S and T have at least one common attribute, and (3) no attribute is common to R, S, and T, we have (R ⊲ ⊳ S) ⊲ ⊳ T = R ⊲ ⊳ (S ⊲ ⊳ T)

Theory in Programming Practice, Plaxton, Fall 2005

slide-6
SLIDE 6

Selection Pushing

  • For any database relations R and S, any predicate p, and any operator

in the set {∪, ∩, −}, we have σp(R S) = σp(R) σp(S)

  • For any database relations R and S, any predicate p that depends only
  • n attributes of R, and any operator in the set {×, ⊲

⊳}, we have σp(R S) = σp(R) S

Theory in Programming Practice, Plaxton, Fall 2005

slide-7
SLIDE 7

Projection Pushing

  • For any database relations R and S, and any set of attributes a, we

have πa(R ∪ S) = πa(R) ∪ πa(S)

Theory in Programming Practice, Plaxton, Fall 2005

slide-8
SLIDE 8

Distributivity of Projection over Join

  • For any database relations R and S with associated sets of attributes

r and s, respectively, and any sets of attributes a, b, and c such that a ⊆ r ∪ s, b = (a ∩ r) ∪ d, and c = (a ∩ s) ∪ d where d = r ∩ s, we have πa(R ⊲ ⊳ S) = πa(πb(R) ⊲ ⊳ πc(S))

Theory in Programming Practice, Plaxton, Fall 2005

slide-9
SLIDE 9

Query Optimization

  • We are given a query in the form of a relational algebra expression α
  • We could evaluate α directly
  • Instead, it might be more efficient to use identities such as the ones

presented earlier to obtain an equivalent expression β for which a direct evaluation is more efficient

Theory in Programming Practice, Plaxton, Fall 2005

slide-10
SLIDE 10

An Example of Query Optimization

  • We consider an abstraction of the movie example discussed in the

course packet

  • For the sake of brevity, we use the letters A through I to refer to the

nine attributes of the example: A for Title, B for Actor, C for Director, D for Genre, E for Year, F for Theatre, G for Time, H for Rating, I for Address

  • We have three database relations R, S, and T

with attributes {A, B, C, D, E}, {A, F, G, H}, and {F, I}, respectively

  • Let p (resp., q) denote a predicate asserting that attribute B (resp.,

G) has a particular given value

  • We wish to evaluate πI(σp∧q(R ⊲

⊳ S ⊲ ⊳ T))

Theory in Programming Practice, Plaxton, Fall 2005

slide-11
SLIDE 11

Example: High Level

  • We wish to evaluate

πI(σp∧q(R ⊲ ⊳ S ⊲ ⊳ T))

  • We will prove that this expression is equivalent to

πI(πF[πA(σp(R)) ⊲ ⊳ πA,F(σq(S))] ⊲ ⊳ T)

  • Why is the latter expression likely to be more efficient to evaluate

directly?

  • In what follows we will give a step-by-step proof of the equivalence of

the two preceding formulae

Theory in Programming Practice, Plaxton, Fall 2005

slide-12
SLIDE 12

Step One

  • Claim:

πI(σp∧q(R ⊲ ⊳ S ⊲ ⊳ T)) = πI(σp∧q[(R ⊲ ⊳ S) ⊲ ⊳ T])

  • This claim follows from the associativity of ⊲

⊳, since the necessary preconditions for applying this rule are met: – R and S have at least one common attribute (A is common) – S and T have at least one common attribute (F is common) – no attribute is common to R, S, and T

Theory in Programming Practice, Plaxton, Fall 2005

slide-13
SLIDE 13

Step Two

  • Claim:

πI(σp∧q[(R ⊲ ⊳ S) ⊲ ⊳ T]) = πI(σp∧q(R ⊲ ⊳ S) ⊲ ⊳ T)

  • This claim follows from selection pushing over join, since the necessary

precondition for applying this rule is met – The predicate p ∧ q names only the attributes B and G, both of which are attributes of R ⊲ ⊳ S

Theory in Programming Practice, Plaxton, Fall 2005

slide-14
SLIDE 14

Lemma 1

  • For any database relations R and S, and any predicates p and q such

that p depends only on attributes of R and q depends only on attributes

  • f S, we have

σp∧q(R ⊲ ⊳ S) = σp(R) ⊲ ⊳ σq(S)

  • Proof:

– By selection splitting and commutativity of join, σp∧q(R ⊲ ⊳ S) = σp(σq(S ⊲ ⊳ R)) – By selection pushing over join and commutativity of join, σp(σq(S ⊲ ⊳ R)) = σp(R ⊲ ⊳ σq(S)) – By selection pushing over join, σp(R ⊲ ⊳ σq(S)) = σp(R) ⊲ ⊳ σq(S)

Theory in Programming Practice, Plaxton, Fall 2005

slide-15
SLIDE 15

Step Three

  • Claim:

πI(σp∧q(R ⊲ ⊳ S) ⊲ ⊳ T) = πI([σp(R) ⊲ ⊳ σq(S)] ⊲ ⊳ T)

  • This claim follows from Lemma 1 since p only involves attribute B and

q only involves attribute G – Note that B is an attribute of R and G is an attribute of S

Theory in Programming Practice, Plaxton, Fall 2005

slide-16
SLIDE 16

Step Four

  • Claim:

πI([σp(R) ⊲ ⊳ σq(S)] ⊲ ⊳ T) = πI(πF[σp(R) ⊲ ⊳ σq(S)] ⊲ ⊳ πF,I(T))

  • This claim follows from distributivity of projection over join, since the

necessary preconditions for applying this rule are met, with – r = {A, B, C, D, E, F, G, H} as the set of attributes of σp(R) ⊲ ⊳ σq(S) – s = {F, I} as the set of attributes of T – d = r ∩ s = {F} – a = {I} ⊆ r ∪ s – b = (a ∩ r) ∪ d = {F} – c = (a ∩ s) ∪ d = {F, I}

Theory in Programming Practice, Plaxton, Fall 2005

slide-17
SLIDE 17

Step Five

  • Claim:

πI(πF[σp(R) ⊲ ⊳ σq(S)] ⊲ ⊳ πF,I(T)) = πI(πF[σp(R) ⊲ ⊳ σq(S)] ⊲ ⊳ T)

  • This claim follows from the observation that πF,I(T) = T, since {F, I}

is the set of attributes of T

Theory in Programming Practice, Plaxton, Fall 2005

slide-18
SLIDE 18

Lemma 2

  • Claim:

πF(σp(R) ⊲ ⊳ σq(S)) = πF(πA(σp(R)) ⊲ ⊳ πA,F(σq(S)))

  • This claim follows from distributivity of projection over join, since the

necessary preconditions for applying this rule are met, with – r = {A, B, C, D, E} as the set of attributes of σp(R) – s = {A, F, G, H} as the set of attributes of σq(S) – d = r ∩ s = {A} – a = {F} ⊆ r ∪ s – b = (a ∩ r) ∪ d = {A} – c = (a ∩ s) ∪ d = {A, F}

Theory in Programming Practice, Plaxton, Fall 2005

slide-19
SLIDE 19

Step Six

  • Claim:

πI(πF[σp(R) ⊲ ⊳ σq(S)] ⊲ ⊳ T) = πI(πF[πA(σp(R)) ⊲ ⊳ πA,F(σq(S))] ⊲ ⊳ T)

  • This claim follows from Lemma 2

Theory in Programming Practice, Plaxton, Fall 2005