deciding equivalences among aggregate queries w erner
play

Deciding Equivalences among Aggregate Queries W erner Nutt - PDF document

Deciding Equivalences among Aggregate Queries W erner Nutt German Resea rch Center fo r AI (DFKI) Saa rb r uck en, Germany Y ehoshua Sagiv, Sa ra Shurin The Heb rew Universit y Jerusalem, Israel Deciding


  1. Deciding Equivalences among Aggregate Queries W erner Nutt German Resea rch Center fo r AI (DFKI) Saa rb r� uck en, Germany Y ehoshua Sagiv, Sa ra Shurin The Heb rew Universit y Jerusalem, Israel � Deciding Equivalences among Aggregate Queries PODS June 1998 { Slide 1

  2. Ackno wledgement � This w o rk o riginated within the ESPRIT Long T erm Resea rch Project "F oundations of Data W a rehouse Qualit y" (D W Q) � Deciding Equivalences among Aggregate Queries PODS June 1998 { Slide 2

  3. Motivation In recent y ea r, increased interest in optimization of aggregate queries � data w a rehousing decision supp o rt � Aggregate queries a re costly � they touch many data items need fo r sp ecialized optimizati on techniques ; Idea: Use p revious results to answ er new queries � exploit redundancy! � create redundancy! T o do so, w e have to b e able to answ er the question: \What can b e computed from what?" (= the view usabilit y p roblem) � Deciding Equivalences among Aggregate Queries PODS June 1998 { Slide 3

  4. Decision Supp o rt Queries Gr o wth Change in in Sales Sales as Sales Sales as vs. % of This % of Last Ca tegor y vs. Pr oduct Region Month Ca tegor y Month Last Month F ramis Cen tral 110 � 12% 31% 3% F ramis Eastern 179 -3% 28% -1% F ramis W estern 55 5% � 12% 1% 344 � 6% 33% 1% T otal F r amis Widget Cen tral 66 2% 18% 2% Widget Eastern 102 4% 12% 5% Widget W estern 39 -9% 9% -1% 207 1% 13% 4% T otal Widget Grand T otal 551 4% 20% 2% Example of a business rep ort. Exceptionally high v alues are mark ed with an asterisk ( � ). Exceptionally lo w v alues are sho wn as b old. (The example is tak en from R. Kim ball, The Data W arehouse T o olkit, Addison W esley) An SQL-query fo r the �rst column: Select p.product -na me as Product, m.region name as Region, sum(f.sal es) as Sales This Month From sales fact f, product p, market m, time t Where f.product key = p.product key, f.market key = m.market key, f.time key = t.time key, p.product name in ('Framis', 'Widget' ), t.month = 'May', t.year = 1996 Groupby p.product name, m.region name � Deciding Equivalences among Aggregate Queries PODS June 1998 { Slide 4

  5. Aggregate Queries: Abstract Notation SQL notation: Select p.A, s.B, max(r.C), sum(s.D), count(*) From p, r, s Where p.Z = r.Z, p.A = s.A, r.W = 'Joe', s.B < 10 Groupby p.A, s.B Abstract notation: ( A; max ( C ) ; sum ( D ) ; coun t ) q B ; s ( A; ) & p ( A; ) & ( Z ) & B ; D Z r ; C ; W 0 0 = & � W Joe B 10 In general: ( x ( y ) ; ( y )) & q ; : : : ; x ; � : : : ; � R C 1 m 1 1 n n Sho rt: ( � � ( � )) & , q x ; � y R C with conjunction of relational atoms R conjunction of compa risons C � Deciding Equivalences among Aggregate Queries PODS June 1998 { Slide 5

  6. The View Usabilit y Problem Given views ( � � ( � )) & v x ; � y R C i i i i i i and a query ( � � ( � )) & q x ; � y R C ; is there a query � ~ ~ ~ ( � ( � )) & q x; � y R C ; such that � consists of instantiations of the R v i � and ~ a re equivalent q q (i.e., and ~ p ro duce the same results q q over all databases ) W e need a syntactic cha racterization of ; equivalence! � Deciding Equivalences among Aggregate Queries PODS June 1998 { Slide 6

  7. Dimensions of the Problem ( � � ( � ) ) & q x ; � y R C Which aggregate functions? � { min , max { coun t { sum { coun t distinct { . . . � Queries { without compa risons { with compa risons over the: rationals , � � integers � Deciding Equivalences among Aggregate Queries PODS June 1998 { Slide 7

  8. Previous W o rk � Equivalence of conjunctive queries Chandra/Merlin 1977, Klug 1988 View usabilit y fo r conjunctive queries � Levy et al. 1995 � Containment and equivalence of conjunc- tive queries under bag-semantics Chaudhuri/V a rdi 1993 � Equivalence p reserving transfo rmations of aggregate queries Levy/Mumick 1994, Gupta et al. 1995 � View usabilit y fo r aggregate queries (su�- cient criteria) Srivastava et al. 1996 � View usabilit y fo r data cub es Ha rina ra y an et al. 1996, Gupta et al. 1997 (Almost) no complete cha racterizations fo r ; aggregate queries! � Deciding Equivalences among Aggregate Queries PODS June 1998 { Slide 8

  9. Decouple Aggregations Observation: In ( Pro ( Sales ) ; ( Pro�t )) & , q d ; max sum R C the aggregates ( Sales ), ( Pro�t ), max sum a re functionally dep endent on Pro d . De�nition: ( � ( y )) & q x ; � R C j j is the -th k ernel of j ( � ( y ) ; ( y )) & . q x ; � : : : ; � R C n n 1 1 � Deciding Equivalences among Aggregate Queries PODS June 1998 { Slide 9

  10. Divide and Conquer Theo rem: ( � ( y ) ; ( y )) q x ; � : : : ; � 1 1 n n 0 ( � ( y ) ; ( y )) q x ; � : : : ; � 1 1 n n a re equivalent if and only if their k ernels ( � ( y )) q x; � j j j 0 ( � ( y )) q x; � j j j a re pairwise equivalent fo r all 2 1 ::n . j it su�ces to solve the equivalence p roblem ; fo r queries with a single aggregate term ( � � ( y )) & q x ; R C ( simple aggregate queries). � Deciding Equivalences among Aggregate Queries PODS June 1998 { Slide 10

  11. Aggregate Queries and Conjunctive Queries The co re of ( � � ( y )) & q x ; R C is the conjunctive query � ( � ) & q x ; y R C : Examples: � The co re of ( � ( y )) & q x ; sum R C is � ( � ) & q x; y R C : The co re of � ( � coun t ) & q x; R C is � ( � ) & q x R C : Strategy: Reduce equivalence of simple aggre- gate queries to p rop erties of their co res. � Deciding Equivalences among Aggregate Queries PODS June 1998 { Slide 11

  12. Reminder on Conjunctive Queries � Conjunctive queries have the fo rm ( � ) & q x R C relational atom conjunction conjunction with va riables of relational of compa risons atoms D � := the result of over database D q q 0 0 � and a re equivalent (written � ) i� q q q q D 0D = fo r all db's D q q 0 0 � is contained in (written � ) i� q q q q D 0D � fo r all db's D q q Ho w can w e check containment? ; � Deciding Equivalences among Aggregate Queries PODS June 1998 { Slide 12

  13. Query Homomo rphisms An homomo r phism from 0 0 0 ( � ) & q x R C to ( � ) & q x R C is a substitution such that � � = � � � x x 0 � � � R R 0 � j = . C � C Theo rem (Chandra/Merlin 77): F o r relational conjunctive queries: 0 � , there is an homomo rphism q q 0 from to q q Finding an homomo rphism is NP-complete! � Deciding Equivalences among Aggregate Queries PODS June 1998 { Slide 13

  14. Containment and Compa risons Classical example: 0 p ( u; ) & � q v u v p ( y ) & p ( z ) q ; z ; y 0 W e have , � q q 0 but no homomo rphism from to . q q Idea: replace with its linea r expansion ( q ) ! q L L p ( y ) & p ( z ) & q ; z ; y y < z f y g <z p ( y ) & p ( z ) & = q ; z ; y y z f y = z g p ( y ) & p ( z ) & q ; z ; y y > z f y g >z (case analysis) Theo rem (Klug 88): 0 fo r every in ( q ) , � , q q q L L L there is an homomo rphism 0 from to q q L P Containment with compa risons is � -complete. 2 (van der Meyden) � Deciding Equivalences among Aggregate Queries PODS June 1998 { Slide 14

  15. Relational Max-Queries 0 0 ( � max ( y )) � ( � max ( y )) ? q x ; R q x; R Theo rem: F o r relational max-queries: 0 0 the co res � and � � , q q q q a re equivalent Relational queries deliver the same max only if they deliver the same values ! � Deciding Equivalences among Aggregate Queries PODS June 1998 { Slide 15

Download Presentation
Download Policy: The content available on the website is offered to you 'AS IS' for your personal information and use only. It cannot be commercialized, licensed, or distributed on other websites without prior consent from the author. To download a presentation, simply click this link. If you encounter any difficulties during the download process, it's possible that the publisher has removed the file from their server.

Recommend


More recommend