Generating Efficient Execution Plans for Vertically Partitioned XML Databases

Patrick Kling, M. Tamer Özsu, and Khuzaima Daudjee. University of Waterloo, David R. Cheriton School of Computer Science. VLDB 2011.


  1. Generating Efficient Execution Plans for Vertically Partitioned XML Databases. Patrick Kling, M. Tamer Özsu, and Khuzaima Daudjee. University of Waterloo, David R. Cheriton School of Computer Science. VLDB 2011

  2. The Problem
     • Centralized query evaluation techniques for XML well understood
     • These techniques do not scale to large collection sizes and heavy workloads
     • Goal: use distribution to improve scalability
     • Focus on end-to-end cost of query evaluation

  3. Distributed XML Query Evaluation: Two Scenarios
     • Integrating multiple data sources
       • Fragmentation is determined by existing data sources
       • Need flexible fragmentation model to express this
     • Distribution for performance
       • Choose fragmentation to suit workload
       • Can use more constrained fragmentation model
       • Fragmentation specification allows for distributed query optimization

  5. Outline
     1 Fragmenting XML Collections
     2 Querying Distributed XML Collections
       Query Model
       Distributed Query Evaluation
       Improving Performance
     3 Performance Evaluation
     4 Conclusion

  7. Fragmenting XML Collections
     • Ad-hoc fragmentation
     • Structure-based fragmentation

  8. Ad-hoc Fragmentation
     • Cut arbitrary edges in document tree
     • Highly flexible (good for data integration)
     • No explicit fragmentation specification
     • Limited potential for exploiting fragmentation characteristics for query optimization
     • Not a suitable choice for this work

  9. Structure-based Fragmentation
     • Fragmentation according to characteristics of data or schema
     • Yields a fragmentation specification that can be exploited for query optimization
     • Better choice when distributing for performance

  10. Our Fragmentation Model
      • Focus on simplicity and precise fragmentation specification
      • Focus on partitioning collection (replication is orthogonal)
      • Follow semantics of relational fragmentation techniques
        • Horizontal fragmentation (based on predicates/selection)
        • Vertical fragmentation (based on partitioning of schema/projection)
        • Hybrid fragmentation (combination of horizontal and vertical steps)
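
To make the relational analogy concrete, here is a minimal sketch of horizontal fragmentation as selection: each document goes to the one fragment whose predicate it satisfies. The function `horizontal_fragments`, the fragment names `fH1`/`fH2`, and the last-name predicates are hypothetical illustrations, not taken from the slides.

```python
import xml.etree.ElementTree as ET

def horizontal_fragments(docs, predicates):
    """Assign each document to the single fragment whose predicate it satisfies."""
    fragments = {name: [] for name in predicates}
    for doc in docs:
        matching = [name for name, pred in predicates.items() if pred(doc)]
        if len(matching) != 1:
            # As in relational fragmentation, predicates should be complete and disjoint.
            raise ValueError("predicates must be complete and disjoint")
        fragments[matching[0]].append(doc)
    return fragments

def last_name(doc):
    """Text of author/name/last, or an empty string if it is missing."""
    return doc.findtext("name/last", default="")

# Hypothetical predicates: split authors by the first letter of their last name.
predicates = {
    "fH1": lambda d: last_name(d)[:1].upper() <= "M",
    "fH2": lambda d: last_name(d)[:1].upper() > "M",
}

docs = [
    ET.fromstring(f"<author><name><last>{n}</last></name></author>")
    for n in ("Dean", "Shakespeare", "Austen")
]
frags = horizontal_fragments(docs, predicates)
print({k: [last_name(d) for d in v] for k, v in frags.items()})
```

Vertical fragmentation, by contrast, partitions the schema rather than the set of documents, so every document contributes a piece to several fragments.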

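  12. Vertical Fragmentation
      [Figure: the document tree of author 2 split into vertical fragments. Fragment f^V_1 holds author 2 together with the nodes P 1→2 and P 1→3; fragment f^V_2 holds RP 1→2, name 2, first 2 (Jane), and last 2 (Dean); fragment f^V_3 holds RP 1→3 and pubs 2.]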

  13. Vertical Fragmentation Specification
      Vertical fragmentation is specified by a fragmentation schema.
      [Figure: schema graph over the element types author, agent, name, first, last, pubs, book, chapter, and reference, with edges labelled ONCE, OPT, or MULT, partitioned into the fragments f^V_1 to f^V_4.]
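
As a rough illustration of how a fragmentation schema can drive vertical fragmentation, the sketch below assigns each element name to a fragment and cuts the document wherever a parent-child edge crosses fragments, leaving a placeholder in the parent's fragment. The element-to-fragment mapping `FRAGMENT_OF` is an assumption that only loosely follows the figure (the assignment of agent, book, chapter, and reference is a guess), and the `proxy` element, the `proxy-id` attribute, and `split_vertically` are hypothetical names, not the authors' implementation.

```python
import xml.etree.ElementTree as ET

# Assumed element-name -> fragment mapping (a guess based on the slide's figure).
FRAGMENT_OF = {
    "author": "f1", "agent": "f1",
    "name": "f2", "first": "f2", "last": "f2",
    "pubs": "f3", "book": "f3",
    "chapter": "f4", "reference": "f4",
}

def split_vertically(root):
    """Return {fragment id: [subtree roots]} for one document tree."""
    fragments = {}
    counter = [0]  # running id used to pair each placeholder with its subtree

    def copy_into(node, frag):
        clone = ET.Element(node.tag, dict(node.attrib))
        clone.text = node.text
        for child in node:
            child_frag = FRAGMENT_OF[child.tag]
            if child_frag == frag:
                clone.append(copy_into(child, frag))
            else:
                # Edge crosses fragments: leave a placeholder here and
                # store the child subtree in its own fragment.
                counter[0] += 1
                pid = str(counter[0])
                ET.SubElement(clone, "proxy", {"id": pid, "to": child_frag})
                sub = copy_into(child, child_frag)
                sub.set("proxy-id", pid)
                fragments.setdefault(child_frag, []).append(sub)
        return clone

    root_frag = FRAGMENT_OF[root.tag]
    fragments.setdefault(root_frag, []).append(copy_into(root, root_frag))
    return fragments

doc = ET.fromstring(
    "<author><name><first>Jane</first><last>Dean</last></name>"
    "<pubs><book><chapter><reference>r1</reference></chapter></book></pubs></author>"
)
frags = split_vertically(doc)
for fid in sorted(frags):
    print(fid, [ET.tostring(e, encoding="unicode") for e in frags[fid]])
```

The matching `id`/`proxy-id` values play the role of the paired P and RP nodes in the figure: they record where a subtree was cut so that fragments can be reconnected later.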

  15. Query Model
      XQ, a subset of XPath
      • Nested paths with child and descendant steps
      • Explicit node tests and wild cards
      • Value constraints (numeric or textual)
      •
        Q := σ | ∗ | Q // Q | Q / Q | Q [ q ]
        q := Q | . (= | ≠) str | . (= | ≠ | ≤ | < | ≥ | >) num
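
The grammar above can be mirrored by a small abstract syntax tree. The following is a minimal sketch; the class names (`Step`, `Path`, `Filter`, `Compare`) and the `render` helper are hypothetical and chosen only to follow the productions of Q and q.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Step:            # σ or ∗
    test: str          # element name, or "*" for the wild card

@dataclass
class Path:            # Q / Q  or  Q // Q
    left: "Query"
    axis: str          # "/" or "//"
    right: "Query"

@dataclass
class Filter:          # Q [ q ]
    query: "Query"
    pred: "Pred"

@dataclass
class Compare:         # . op str  or  . op num
    op: str
    value: Union[str, float]

Query = Union[Step, Path, Filter]
Pred = Union[Step, Path, Filter, Compare]

STR_OPS = {"=", "≠"}
NUM_OPS = {"=", "≠", "≤", "<", "≥", ">"}

def render(q) -> str:
    """Serialize an XQ expression back to its textual form."""
    if isinstance(q, Step):
        return q.test
    if isinstance(q, Path):
        return f"{render(q.left)}{q.axis}{render(q.right)}"
    if isinstance(q, Filter):
        return f"{render(q.query)}[{render(q.pred)}]"
    if isinstance(q, Compare):
        # Strings admit only = and ≠; numbers admit all six comparisons.
        allowed = STR_OPS if isinstance(q.value, str) else NUM_OPS
        if q.op not in allowed:
            raise ValueError(f"operator {q.op} not allowed here")
        shown = f'"{q.value}"' if isinstance(q.value, str) else str(q.value)
        return f". {q.op} {shown}"
    raise TypeError(f"not an XQ node: {q!r}")

example = Path(
    Filter(Step("author"),
           Filter(Step("name"),
                  Filter(Step("first"), Compare("=", "William")))),
    "//",
    Step("reference"),
)
print(render(example))
```

The example is a simplified variant of the query on the following slide, restricted to what the two productions express directly (a single nested predicate, no conjunction).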

  20. Query Example
      “Find all references in publications written by authors whose first name is ‘William’ and whose last name is ‘Shakespeare’”
      /author[./name[./first = “William” and ./last = “Shakespeare”]]//reference
      • Node tests (author, name, first, last, reference)
      • Value constraints (= “William”, = “Shakespeare”)
      • Structural constraints (the / and // steps)

  25. Tree Patterns
      • Pattern nodes with node tests and value constraints
      • Edges annotated with XPath axes
      • Extraction point nodes
      [Figure: tree pattern for the example query. author has a / edge to name and a // edge to reference; name has / edges to first (.=’William’) and last (.=’Shakespeare’).]
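
One possible in-memory representation of such a tree pattern is sketched below. The `PatternNode` class and its field names are assumptions made for illustration; the slides do not prescribe a data structure.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class PatternNode:
    test: str                      # node test: element name or "*"
    value: Optional[str] = None    # value constraint (.= value), if any
    axis: str = "/"                # axis on the edge from the parent: "/" or "//"
    extract: bool = False          # True for extraction point nodes
    children: List["PatternNode"] = field(default_factory=list)

def example_pattern() -> PatternNode:
    """Tree pattern for the example query on the slide."""
    return PatternNode("author", children=[
        PatternNode("name", children=[
            PatternNode("first", value="William"),
            PatternNode("last", value="Shakespeare"),
        ]),
        PatternNode("reference", axis="//", extract=True),
    ])

def describe(node: PatternNode, depth: int = 0) -> List[str]:
    """One indented line per pattern node, for inspection."""
    line = "  " * depth + f"{node.axis}{node.test}"
    if node.value is not None:
        line += f" [.='{node.value}']"
    if node.extract:
        line += " (extraction point)"
    lines = [line]
    for child in node.children:
        lines.extend(describe(child, depth + 1))
    return lines

pattern = example_pattern()
print("\n".join(describe(pattern)))
```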

  34. Evaluating Tree Pattern Queries
      [Figure: the tree pattern, with its extraction point (shown as a e 1 on the slide) at reference, matched against a document tree with nodes author 4, name 4, first 4 (William), last 4 (Shakespeare), pubs 4, book 4, chapter 4, chapter 5, and reference 4. Result: [a e 1 = reference 4].]
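
The matching shown in the figure can be imitated by a naive navigational evaluator. This is a sketch of plain recursive navigation, not the authors' algorithm; the tuple encoding of the pattern, the helper names, and the placement of the reference element under the first chapter are assumptions.

```python
import xml.etree.ElementTree as ET

# Pattern node: (node test, value constraint, [(axis, child pattern), ...], is extraction point)
PATTERN = ("author", None, [
    ("/", ("name", None, [
        ("/", ("first", "William", [], False)),
        ("/", ("last", "Shakespeare", [], False)),
    ], False)),
    ("//", ("reference", None, [], True)),
], False)

def candidates(elem, axis):
    """Children for the child axis, all proper descendants for the descendant axis."""
    if axis == "/":
        return list(elem)
    return [d for d in elem.iter() if d is not elem]

def match(elem, pattern, out):
    """Return True if the pattern embeds at elem; append extraction nodes to out."""
    test, value, edges, extract = pattern
    if test != "*" and elem.tag != test:
        return False
    if value is not None and (elem.text or "").strip() != value:
        return False
    found = []
    for axis, child in edges:
        hits, ok = [], False
        for cand in candidates(elem, axis):   # visit every candidate, no short-circuit
            if match(cand, child, hits):
                ok = True
        if not ok:
            return False                      # one unsatisfied branch rejects elem
        found.extend(hits)
    if extract:
        found.append(elem)
    out.extend(found)                         # commit only after all branches matched
    return True

DOC = ET.fromstring(
    "<author><name><first>William</first><last>Shakespeare</last></name>"
    "<pubs><book><chapter><reference>ref-4</reference></chapter><chapter/></book></pubs>"
    "</author>"
)
results = []
match(DOC, PATTERN, results)
print([e.text for e in results])
```

Because the sketch collects results without removing duplicates, patterns with several nested descendant edges could report the same node more than once; a real evaluator would deduplicate or use structural joins instead.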

  35. Evaluating Tree Pattern Queries
      • Various centralized approaches exist
        • Navigating document trees
        • Structural joins
      • We leverage these for distributed query evaluation

  36. Querying Vertically Distributed XML Collections
      • Input
        • Fragmentation-unaware tree pattern query
        • Fragmentation schema
      • Tasks
        • Annotate tree pattern nodes with corresponding fragments
        • Decompose tree pattern into sub-patterns for individual fragments
        • Convert sub-patterns to local plans using existing techniques (each site is free to choose local strategy)
        • Generate distributed execution plan that specifies how results are combined
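
The first two tasks can be sketched as follows: label every pattern node with the fragment of its element type, then cut the pattern at edges whose endpoints carry different labels. The mapping `FRAGMENT_OF`, the dictionary encoding of pattern nodes, and the functions `annotate` and `decompose` are hypothetical. The sketch assumes there are no wild cards and simply records a cross-fragment `//` edge as is; it does not work out which intermediate fragments such an edge passes through, which a real planner would need to do.

```python
# Assumed element-name -> fragment mapping (illustrative only).
FRAGMENT_OF = {
    "author": "f1", "agent": "f1",
    "name": "f2", "first": "f2", "last": "f2",
    "pubs": "f3", "book": "f3",
    "chapter": "f4", "reference": "f4",
}

def node(test, axis="/", value=None, extract=False, children=()):
    """Build one pattern node as a plain dictionary."""
    return {"test": test, "axis": axis, "value": value,
            "extract": extract, "children": list(children)}

def annotate(p):
    """Task 1: label each pattern node with the fragment holding its element type."""
    p["fragment"] = FRAGMENT_OF[p["test"]]
    for child in p["children"]:
        annotate(child)

def decompose(p):
    """Task 2: cut cross-fragment edges; return (sub-patterns, cut edges)."""
    subs, cross = [], []

    def cut(n):
        local = {k: v for k, v in n.items() if k != "children"}
        local["children"] = []
        for child in n["children"]:
            child_local = cut(child)
            if child["fragment"] == n["fragment"]:
                local["children"].append(child_local)
            else:
                subs.append(child_local)
                cross.append((n["fragment"], child["axis"], child["fragment"]))
        return local

    subs.insert(0, cut(p))  # the root's sub-pattern comes first
    return subs, cross

pattern = node("author", children=[
    node("name", children=[node("first", value="William"),
                           node("last", value="Shakespeare")]),
    node("reference", axis="//", extract=True),
])
annotate(pattern)
subs, cross = decompose(pattern)
print([(s["test"], s["fragment"]) for s in subs])
print(cross)
```

Each sub-pattern can then be handed to the site holding its fragment, while the recorded cut edges indicate which local results have to be combined in the distributed plan.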

  37. Querying Vertically Distributed XML Collections
      • Annotate tree pattern nodes
      • Decompose tree pattern
      • Convert sub-patterns into local plans
      • Generate distributed execution plan
      [Figure: the example tree pattern. author has a / edge to name and a // edge to reference; name has / edges to first (.=’William’) and last (.=’Shakespeare’).]
