a framework for testing query
play

A Framework for Testing Query Transformation Rules HICHAM G. - PowerPoint PPT Presentation

A Framework for Testing Query Transformation Rules HICHAM G. ELMONGUI, VIVEK NARASAYYA, RAVI RAMAMURTHY SIGMOD09 Presenter: ZHANG, Yushan @Prism 1 2020/10/8 Outline Background of this research The Challenges Framework Design


  1. A Framework for Testing Query Transformation Rules HICHAM G. ELMONGUI, VIVEK NARASAYYA, RAVI RAMAMURTHY SIGMOD’09 Presenter: ZHANG, Yushan @Prism 1 2020/10/8

  2. Outline ➢ Background of this research ➢ The Challenges ➢ Framework Design ➢ Experiment ➢ Reflections 2 2020/10/8

  3. Background Query Optimization: Overview Two Requirements: ➢ Produce faster plans ➢ Guarantee correct results https://docs.oracle.com/database/121/TGSQL/tgsql_optcncpt.htm#TGSQL196 3 2020/10/8

  4. Background Transformation Rule: example Let 𝐹 denote a relational algebra expression, 𝜄 is a predicate E.g. 𝜏 𝜄 1 ∧𝜄 2 𝐹 ≡ 𝜏 𝜄 1 (𝜏 𝜄2 𝐹) , a cascade of 𝜏 𝐹 1 ⋈ 𝜄 𝐹 2 ≡ 𝐹 2 ⋈ 𝜄 𝐹 1 , theta-join is commutative Database System Concepts , 7 th edition. P747-751 4 2020/10/8

  5. Background Testing Aspects ➢ Coverage ➢ a given transformation rule ( rule ) should be executed with different queries ➢ Correctness ➢ a given transformation rule should NOT alter the results of a query ➢ a pair of transformation rules should NOT alter the results of a query ➢ Performance… (not considered in this paper) 5 2020/10/8

  6. The Background Challenges The Challenges ➢ Efficiently generate queries that exercise a particular rule (Generation Problem) ➢ Randomly generation [1][2] takes many trials before find a query - slow ➢ Randomly generated queries are hard to interpret - complex ➢ Efficiently execute test suites for correctness testing (Compression Problem) ➢ Intuitive approach: turn off the rules one by one, and compare the results ➢ Problem: un-optimized queries could be extremely slow [1] H. Bati, L. Giakoumakis, S.Herbert, A.Surna. A genetic approach for random testing of database systems. Proceedings of VLDB 2007. 6 2020/10/8 [2] D.Slutz, Massive Stochastic Testing of SQL. Proceeding of VLDB 1998.

  7. The Background Approach Challenges Generation with Rule Patterns ➢ Captures the sufficient condition for a rule to execute is hard ➢ Pull up GBAgg requires certain functional dependency ➢ Use Rule Patterns ( necessary condition ) ➢ A query exercising the given rule must have the corresponding pattern ➢ Initialize the “wholes” in the pattern ➢ Support rule composition (extends to test multiple rules) Take-away message: ➢ reduce the trials with known necessary conditions 7 2020/10/8

  8. The Background Approach Challenges Test Suite Compression (TSC) ➢ Very high cost for executing the test suite ➢ A query could exercise multiple rules ➢ A rule is exercised by multiple rules and 𝑙 = 1 Intuition ( bipartite graph problem ): A set of transformation rules: 𝑆 = 𝑠 1 , 𝑠 2 , ⋯ 𝑠 𝑈𝑇 1 = {𝑟 1 } 𝑜 𝑈𝑇 2 = {𝑟 2 } A test suite for a rule has 𝑙 queries 𝐷𝑝𝑡𝑢(𝑈𝑇 1 ) = 100 + 180 Overall test suites for R: 𝑈𝑇 =ڂ 𝑗 𝑈𝑇 𝑗 𝐷𝑝𝑡𝑢 𝑈𝑇 2 = 100 + 120 (with cost ) Nodes are rules and queries 𝐷𝑝𝑡𝑢(𝑈𝑇) = 𝐷𝑝𝑡𝑢(𝑈𝑇 1 ) + 𝐷𝑝𝑡𝑢(𝑈𝑇 2 ) (with cost ) Edge from 𝑠 𝑗 to 𝑟 𝑘 denotes 𝑠 𝑗 is exercised when 𝑟 𝑘 is optimized. Goal : Minimize cost of TS Question: what is the optimal solution for this example? Constraint : Each rule is exercised k times 8 2020/10/8

  9. The Background Approach Challenges Test Suite Compression (TSC) ➢ Very high cost for executing the test suite ➢ A query could exercise multiple rules ➢ A rule is exercised by multiple rules and 𝑙 = 1 Intuition ( bipartite graph problem ): A set of transformation rules: 𝑆 = 𝑠 1 , 𝑠 2 , ⋯ 𝑠 𝑈𝑇 1 = {𝑟 1 } 𝑜 𝑈𝑇 2 = {𝑟 2 } A test suite for a rule has 𝑙 queries 𝐷𝑝𝑡𝑢(𝑈𝑇 1 ) = 100 + 180 Overall test suites for R: 𝑈𝑇 =ڂ 𝑗 𝑈𝑇 𝑗 𝐷𝑝𝑡𝑢 𝑈𝑇 2 = 100 + 120 (with cost ) Nodes are rules and queries 𝐷𝑝𝑡𝑢(𝑈𝑇) = 𝐷𝑝𝑡𝑢(𝑈𝑇 1 ) + 𝐷𝑝𝑡𝑢(𝑈𝑇 2 ) (with cost ) Edge from 𝑠 𝑗 to 𝑟 𝑘 denotes 𝑠 𝑗 is exercised when 𝑟 𝑘 is optimized. Optimal solution: 𝐷𝑝𝑡𝑢(𝑈𝑇) = 100 + 120 + 120 9 2020/10/8

  10. The Background Approach Challenges Test Suite Compression (TSC) ➢ Reduction from the Set Cover Problem to TSC (NP-Hard) ➢ Only approximation algorithm is possible ➢ Two algorithms ➢ Applying the Set Cover Heuristic (ignores edge cost) ➢ Constrained Set Multicover algorithms (SMC) ➢ TopKIndependent algorithm (considers edge cost) ➢ Consider edge costs for each rule separately, sort them and use greedy picks ➢ Further optimization to boost the picking process 10 2020/10/8

  11. The Framework Background Challenges Design Framework Overview 11 2020/10/8

  12. The Framework Background Experiments Challenges Design Experiment Setup The approach is prototyped in Microsoft SQL Server Databases: TPC-H Used 30 rules of the optimizer ➢ The efficiency of query generation: random vs. patthern ➢ The effectiveness of two algorithms: SetMultiCover vs. TopKIndependent ➢ The importance of exploiting monotonicity ( further optimization on TopKIndependent ) 12 2020/10/8

  13. The Framework Background Experiments Challenges Design Random vs. Pattern based Generation 13 2020/10/8

  14. The Framework Background Experiments Challenges Design Compression Quality Y-axis: Log scale Optimizer estimated cost 14 2020/10/8

  15. Summary This papers considers the problem of testing transformation rules of a query optimizer Main Contributions: ➢ Efficiently generate queries to cover the given rules ➢ Use query patterns to reduce the random generations trials ➢ Efficiently execute test suites for correctness testing ➢ Use approximation to pick a subset of the queries 15 2020/10/8

  16. Reflections ➢ This paper has a very clean and clear writing style, which worth a detailed reading ➢ There is a well-established logic between the defined problem and the solution ➢ A typical example is used throughout the paper ➢ The approach illustration is simple to understand without the theory 16 2020/10/8

Download Presentation
Download Policy: The content available on the website is offered to you 'AS IS' for your personal information and use only. It cannot be commercialized, licensed, or distributed on other websites without prior consent from the author. To download a presentation, simply click this link. If you encounter any difficulties during the download process, it's possible that the publisher has removed the file from their server.

Recommend


More recommend