pessimistic query optimization tighter upper bounds for
play

Pessimistic Query Optimization: Tighter Upper Bounds for - PowerPoint PPT Presentation

Pessimistic Query Optimization: Tighter Upper Bounds for Intermediate Join Cardinalities Walter Cai Magdalena Balazinska Dan Suciu University of Washington [walter,magda,suciu]@cs.washington.edu July 19th, 2019 Cai, Balazinska, Suciu (UW)


  1. Pessimistic Query Optimization: Tighter Upper Bounds for Intermediate Join Cardinalities Walter Cai Magdalena Balazinska Dan Suciu University of Washington [walter,magda,suciu]@cs.washington.edu July 19th, 2019 Cai, Balazinska, Suciu (UW) Pessimistic Query Optimization July 19th, 2019 1 / 27

  2. Systematic Underestimation Query optimizers assume: ◮ Uniformity ◮ Independence Cai, Balazinska, Suciu (UW) Pessimistic Query Optimization July 19th, 2019 2 / 27

  3. Background: Cardinality Bounds Background: Cardinality Bounds 1 Tightened Cardinality Bounds 2 Evaluation 3 Cai, Balazinska, Suciu (UW) Pessimistic Query Optimization July 19th, 2019 3 / 27

  4. Background: Cardinality Bounds Background: Cardinality Bounds 1 Tightened Cardinality Bounds 2 Evaluation 3 Cai, Balazinska, Suciu (UW) Pessimistic Query Optimization July 19th, 2019 3 / 27

  5. Background: Cardinality Bounds Example Query (SQL) SELECT * FROM pseudonym , cast, movie_companies , company_name WHERE pseudonym.person id = cast.person id AND cast.movie id id = movie_companies.movie id AND movie_companies.company id = company_name.id; Cai, Balazinska, Suciu (UW) Pessimistic Query Optimization July 19th, 2019 4 / 27

  6. Background: Cardinality Bounds Example Query (Join Graph & Datalog) pseudonym Q ( x , y , z , w ) � pseudo ( x , y ) , cast ( y , z ) , cast mc ( z , w ) , cn ( w ) movie companies company name y �→ person z �→ movie Figure: Join Graph. w �→ company Cai, Balazinska, Suciu (UW) Pessimistic Query Optimization July 19th, 2019 5 / 27

  7. Background: Cardinality Bounds Review: Entropy Take random variable X : � h ( X ) = − P ( X = a ) · log( P ( X = a )) a Multiple variables: � h ( X , Y ) = − P ( X = a , Y = b ) · log( P ( X = a , Y = b )) a , b Conditional entropy: � P ( X = a , Y = b ) � � h ( X | Y ) = − P ( X = a , Y = b ) · log P ( Y = b ) a , b Cai, Balazinska, Suciu (UW) Pessimistic Query Optimization July 19th, 2019 6 / 27

  8. Background: Cardinality Bounds Review: Entropy X ∼ P ( x 1 , . . . , x n ) ◮ Fact: h ( X ) ≤ log( n ) ◮ h ( X ) = log( n ) iff P is uniform Cai, Balazinska, Suciu (UW) Pessimistic Query Optimization July 19th, 2019 7 / 27

  9. Background: Cardinality Bounds Connection to Entropy Q ( x , y , z , w ) � pseudo ( x , y ) , cast ( y , z ) , mc ( z , w ) , cn ( w ) ◮ Create random variable for each attribute. x → X , y → Y , z → Z , w → W ◮ Let ( X , Y , Z , W ) be uniformly distributed over true output of Q . � � � � h ( X , Y , Z , W ) = log � Q ( x , y , z , w ) � � � � � � � � � exp h ( X , Y , Z , W ) = � Q ( x , y , z , w ) � � � Cai, Balazinska, Suciu (UW) Pessimistic Query Optimization July 19th, 2019 8 / 27

  10. Background: Cardinality Bounds Entropic Bound � � � � � � � Q ( x , y , z , w ) � = exp h ( X , Y , Z , W ) � � ◮ Suffices to bound h ( X , Y , Z , W ) . Cai, Balazinska, Suciu (UW) Pessimistic Query Optimization July 19th, 2019 9 / 27

  11. Background: Cardinality Bounds Entropic Bound h ( X , Y , Z , W ) ≤ h ( X | Y ) + h ( Y , Z ) + h ( W | Z ) Cai, Balazinska, Suciu (UW) Pessimistic Query Optimization July 19th, 2019 10 / 27

  12. Background: Cardinality Bounds Entropic Bound � � � � � Q ( x , y , z , w ) � = exp( h ( X , Y , Z , W )) ≤ exp( h ( X | Y ) + h ( Y , Z ) + h ( W | Z )) � � h ( Y , Z ) ≤ log( count ( cast )) h ( X | Y ) ≤ log( max degree ( pseudonym )) h ( W | Z ) ≤ log( max degree ( movie companies )) Cai, Balazinska, Suciu (UW) Pessimistic Query Optimization July 19th, 2019 11 / 27

  13. Background: Cardinality Bounds Entropic Bound � � � � � Q ( x , y , z , w ) � = exp( h ( X , Y , Z , W )) ≤ exp( h ( X | Y ) + h ( Y , Z ) + h ( W | Z )) � � h ( Y , Z ) ≤ log c cast h ( X | Y ) ≤ log d y pseudo h ( W | Z ) ≤ log d z mc Cai, Balazinska, Suciu (UW) Pessimistic Query Optimization July 19th, 2019 11 / 27

  14. Background: Cardinality Bounds Cardinality Bound � � � � � Q ( x , y , z , w ) � = exp( h ( X , Y , Z , W )) � � ≤ exp( h ( X | Y ) + h ( Y , Z ) + h ( W | Z ) ) � �� �� �� � � ��� �� ��� � � ��� �� ��� � ≤ log d y ≤ log c cast ≤ log d z mc pseudo ≤ d y pseudo · c cast · d z mc Cai, Balazinska, Suciu (UW) Pessimistic Query Optimization July 19th, 2019 12 / 27

  15. Background: Cardinality Bounds Many Entropic Bounds h ( X , Y , Z , W ) ≤ ... h ( X , Y ) + h ( Z | Y ) + h ( W | Z ) h ( X , Y ) + h ( Z | Y ) + h ( W ) h ( X , Y ) + h ( Z , W ) h ( X , Y ) + h ( Z | W ) + h ( W ) h ( X | Y ) + h ( Y , Z ) + h ( W | Z ) h ( X | Y ) + h ( Y | Z ) + h ( Z , W ) h ( X | Y ) + h ( Y | Z ) + h ( Z | W ) + h ( Z ) Cai, Balazinska, Suciu (UW) Pessimistic Query Optimization July 19th, 2019 13 / 27

  16. Background: Cardinality Bounds Entropic Bounds Q ( x , y , z , w ) � pseudo ( x , y ) , cast ( y , z ) , mc ( z , w ) , cn ( w )  c pseudo · d y cast · d z  mc    c pseudo · d y   cast · c cn       c pseudo · c mc       c pseudo · d w mc · c cn � �   � �  � Q ( x , y , z , w ) � ≤ min � � d y  pseudo · c cast · d z   mc    d y   pseudo · c cast · c cn      d y pseudo · d z  cast · c mc       d y pseudo · d z cast · d w mc · c cn  Cai, Balazinska, Suciu (UW) Pessimistic Query Optimization July 19th, 2019 14 / 27

  17. Background: Cardinality Bounds Entropic Bounds Neat! But is it useful? ◮ Short answer: ‘No’. (Not yet, anyway) ◮ Bounds still too loose (overestimation) ◮ Need to tighten ◮ How to tighten? Partitioning Cai, Balazinska, Suciu (UW) Pessimistic Query Optimization July 19th, 2019 15 / 27

  18. Tightened Cardinality Bounds Background: Cardinality Bounds 1 Tightened Cardinality Bounds 2 Evaluation 3 Cai, Balazinska, Suciu (UW) Pessimistic Query Optimization July 19th, 2019 16 / 27

  19. Tightened Cardinality Bounds Q ( x , y , z , w ) � pseudo ( x , y ) , cast ( y , z ) , mc ( z , w ) , cn ( w ) company name pseudonym movie companies cast pseudo cast mc cn Cai, Balazinska, Suciu (UW) Pessimistic Query Optimization July 19th, 2019 17 / 27

  20. Tightened Cardinality Bounds Q ( x , y , z , w ) � pseudo ( x , y ) , cast ( y , z ) , mc ( z , w ) , cn ( w ) company name pseudonym movie companies cast pseudo [0,0] pseudo [1,0] cast [0,0] cast [1,0] mc [0,0] mc [1,0] cn [0] pseudo [0,1] pseudo [1,1] cast [0,1] cast [1,1] mc [0,1] mc [1,1] cn [1] ◮ Value based hashing ◮ Analagous to hash join Cai, Balazinska, Suciu (UW) Pessimistic Query Optimization July 19th, 2019 17 / 27

  21. Tightened Cardinality Bounds Q ( x , y , z , w ) � pseudo ( x , y ) , cast ( y , z ) , mc ( z , w ) , cn ( w ) company name pseudonym movie companies cast cast [1,0] pseudo [0,1] mc [0,1] cn [1] ◮ Value based hashing ◮ Analagous to hash join Cai, Balazinska, Suciu (UW) Pessimistic Query Optimization July 19th, 2019 17 / 27

  22. Tightened Cardinality Bounds Q ( x , y , z , w ) � pseudo ( x , y ) , cast ( y , z ) , mc ( z , w ) , cn ( w ) company name pseudonym movie companies cast pseudo [0,0] pseudo [1,0] cast [0,0] cast [1,0] mc [0,0] mc [1,0] cn [0] pseudo [0,1] pseudo [1,1] cast [0,1] cast [1,1] mc [0,1] mc [1,1] cn [1] ◮ Q ( D ) : query evaluated on database D � Q ( D ) = Q ( D [ J ]) ◮ Q ( D [ J ]) : query evaluated on parition D [ J ] J Cai, Balazinska, Suciu (UW) Pessimistic Query Optimization July 19th, 2019 17 / 27

  23. Tightened Cardinality Bounds Q ( x , y , z , w ) � pseudo ( x , y ) , cast ( y , z ) , mc ( z , w ) , cn ( w ) company name pseudonym movie companies cast pseudo [0,0] pseudo [1,0] cast [0,0] cast [1,0] mc [0,0] mc [1,0] cn [0] pseudo [0,1] pseudo [1,1] cast [0,1] cast [1,1] mc [0,1] mc [1,1] cn [1] � Q ( D ) = Q ( D [ J ]) ◮ Bound each partition D [ J ] J � � � ◮ Sum will be bound on full database � � � Q ( D ) bound ( Q ( D [ J ])) � ≤ � � J Cai, Balazinska, Suciu (UW) Pessimistic Query Optimization July 19th, 2019 17 / 27

  24. Tightened Cardinality Bounds Partition Bounding  c pseudo [ J ] · d y cast [ J ] · d z    mc [ J ]    c pseudo [ J ] · d y  cast [ J ] · c cn [ J ]         c pseudo [ J ] · c mc [ J ]       c pseudo [ J ] · d w mc [ J ] · c cn [ J ]  � � �   � �  � Q ( D ) min � ≤ � � d y pseudo [ J ] · c cast [ J ] · d z     mc [ J ] J ∈{ 0 , 1 } 4    d y  pseudo [ J ] · c cast [ J ] · c cn [ J ]       d y  pseudo [ J ] · d z  cast [ J ] · c mc [ J ]       d y pseudo [ J ] · d z cast [ J ] · d w  mc [ J ] · c cn [ J ]   Cai, Balazinska, Suciu (UW) Pessimistic Query Optimization July 19th, 2019 18 / 27

  25. Tightened Cardinality Bounds Optimizations ◮ Bound Formula Generation ◮ Partition Budgeting ◮ Combats exponential runtime w.r.t. hash size ◮ Non-monotonic behaviour ◮ Filter Predicates Cai, Balazinska, Suciu (UW) Pessimistic Query Optimization July 19th, 2019 19 / 27

Download Presentation
Download Policy: The content available on the website is offered to you 'AS IS' for your personal information and use only. It cannot be commercialized, licensed, or distributed on other websites without prior consent from the author. To download a presentation, simply click this link. If you encounter any difficulties during the download process, it's possible that the publisher has removed the file from their server.

Recommend


More recommend