Generating Efficient Execution Plans for Vertically Partitioned XML

Generating Efficient Execution Plans for Vertically Partitioned XML Databases
Patrick Kling, M. Tamer Özsu, and Khuzaima Daudjee
University of Waterloo, David R. Cheriton School of Computer Science
VLDB 2011

The Problem
• Centralized query evaluation techniques for XML well understood
• These techniques do not scale to large collection sizes and heavy workloads
• Goal: use distribution to improve scalability
• Focus on end-to-end cost of query evaluation

Distributed XML Query Evaluation: Two Scenarios
• Integrating multiple data sources
  • Fragmentation is determined by existing data sources
  • Need flexible fragmentation model to express this
• Distribution for performance
  • Choose fragmentation to suit workload
  • Can use more constrained fragmentation model
  • Fragmentation specification allows for distributed query optimization

Outline
1 Fragmenting XML Collections
2 Querying Distributed XML Collections
  • Query Model
  • Distributed Query Evaluation
  • Improving Performance
3 Performance Evaluation
4 Conclusion

Fragmenting XML Collections
• Ad-hoc fragmentation
• Structure-based fragmentation

Ad-hoc fragmentation
• Cut arbitrary edges in document tree
• Highly flexible (good for data integration)
• No explicit fragmentation specification
• Limited potential for exploiting fragmentation characteristics for query optimization
• Not a suitable choice for this work

Structure-based Fragmentation
• Fragmentation according to characteristics of data or schema
• Yields a fragmentation specification that can be exploited for query optimization
• Better choice when distributing for performance

Our Fragmentation Model
• Focus on simplicity and precise fragmentation specification
• Focus on partitioning collection (replication is orthogonal)
• Follow semantics of relational fragmentation techniques
  • Horizontal fragmentation (based on predicates/selection)
  • Vertical fragmentation (based on partitioning of schema/projection)
  • Hybrid fragmentation (combination of horizontal and vertical steps)

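Vertical Fragmentation
[Figure: a document tree with the nodes author 2, name 2, first 2 (Jane), last 2 (Dean), and pubs 2, divided into vertical fragments f^V_1, f^V_2, and f^V_3. The figure also carries the node labels P 1→2, P 1→3, RP 1→2, and RP 1→3.]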

Vertical Fragmentation Specification
Vertical fragmentation is specified by a fragmentation schema.
[Figure: fragmentation schema over the element types author, agent, name, first, last, pubs, book, chapter, and reference. Edges carry the cardinality labels ONCE, OPT, and MULT, and the element types are grouped into fragments f^V_1 to f^V_4.]

Query model
XQ, subset of XPath
• Nested paths with child and descendant steps
• Explicit node tests and wild cards
• Value constraints (numeric or textual)

Q := σ | ∗ | Q // Q | Q / Q | Q [ q ]
q := Q | . = str | . ≠ str | . = num | . ≠ num | . ≤ num | . < num | . ≥ num | . > num

Query Example
“Find all references in publications written by authors whose first name is ‘William’ and whose last name is ‘Shakespeare’”

/author[./name[./first = "William" and ./last = "Shakespeare"]]//reference

• Node tests (author, name, first, last, reference)
• Value constraints (= "William", = "Shakespeare")
• Structural constraints (the / and // steps and the nested predicates)
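The example query can be tried on a small document. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module, which supports only a subset of XPath (no nested path predicates and no "and"), so the name constraint is checked separately with chained predicates. The sample document and the function name find_references are illustrative assumptions, not part of the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the element names follow the slides,
# the content and nesting are made up for illustration.
DOC = """
<author>
  <name><first>William</first><last>Shakespeare</last></name>
  <pubs>
    <book>
      <chapter><reference>ref-1</reference></chapter>
      <chapter/>
    </book>
  </pubs>
</author>
"""

def find_references(author, first, last):
    """Evaluate /author[./name[./first=F and ./last=L]]//reference."""
    # Node test on the document root.
    if author.tag != "author":
        return []
    # Value constraints: chained predicates stand in for "and".
    if author.find(f"name[first='{first}'][last='{last}']") is None:
        return []
    # Structural constraint: reference at any depth below author.
    return author.findall(".//reference")

root = ET.fromstring(DOC)
refs = [r.text for r in find_references(root, "William", "Shakespeare")]
print(refs)
```

The split into a predicate check followed by a descendant search is a workaround for the limited XPath support of the library, not a property of the query model.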

Tree Patterns
• Pattern nodes with node tests and value constraints
• Edges annotated with XPath axes
• Extraction point nodes

author
  / name
    / first  .='William'
    / last   .='Shakespeare'
  // reference

Evaluating Tree Pattern Queries
[Figure: the tree pattern for the example query, with reference as extraction point a^e_1, matched against a document tree containing the nodes author 4, name 4, first 4 (William), last 4 (Shakespeare), pubs 4, book 4, chapter 4, chapter 5, and reference 4.]
Result: [ a^e_1 = reference 4 ]
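Matching a tree pattern against a document tree can be sketched as a naive navigational evaluation. This is a minimal illustration, not the evaluation strategy of the paper: the PatternNode class, the helper names, and the placement of reference under a chapter in the sample document are all assumptions. The pattern root is matched against the document root element.

```python
from dataclasses import dataclass, field
from typing import Optional
import xml.etree.ElementTree as ET

@dataclass
class PatternNode:
    test: str                     # element name, or "*" for a wildcard
    axis: str = "/"               # edge from the parent: "/" or "//"
    value: Optional[str] = None   # textual value constraint, if any
    extract: bool = False         # True for extraction point nodes
    children: list = field(default_factory=list)

def candidates(elem, axis):
    """Children for "/", all proper descendants for "//"."""
    if axis == "/":
        return list(elem)
    it = elem.iter()
    next(it)                      # skip elem itself
    return list(it)

def matches(p, elem):
    """True if the sub-pattern rooted at p can be embedded at elem."""
    if p.test != "*" and p.test != elem.tag:
        return False
    if p.value is not None and (elem.text or "").strip() != p.value:
        return False
    return all(any(matches(c, e) for e in candidates(elem, c.axis))
               for c in p.children)

def collect(p, elem, out):
    """Gather elements bound to extraction points; assumes matches(p, elem)."""
    if p.extract and not any(x is elem for x in out):
        out.append(elem)
    for c in p.children:
        for e in candidates(elem, c.axis):
            if matches(c, e):
                collect(c, e, out)

def evaluate(pattern, root):
    out = []
    if matches(pattern, root):
        collect(pattern, root, out)
    return out

# Tree pattern of the example query.
pattern = PatternNode("author", children=[
    PatternNode("name", "/", children=[
        PatternNode("first", "/", value="William"),
        PatternNode("last", "/", value="Shakespeare"),
    ]),
    PatternNode("reference", "//", extract=True),
])

# Hypothetical document; the position of reference is an assumption.
doc = ET.fromstring(
    "<author><name><first>William</first><last>Shakespeare</last></name>"
    "<pubs><book><chapter><reference/></chapter><chapter/></book></pubs>"
    "</author>")

result = evaluate(pattern, doc)
print([e.tag for e in result])
```

The sketch re-checks sub-patterns repeatedly and is meant only to make the semantics concrete; practical evaluators avoid this redundant work.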

Evaluating Tree Pattern Queries
• Various centralized approaches exist
  • Navigating document trees
  • Structural joins
• We leverage these for distributed query evaluation
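Of the centralized approaches listed above, the structural join can be illustrated briefly. The sketch below assumes a region numbering in which every element receives a (start, end) interval from a depth-first traversal, so that a is an ancestor of d exactly when a.start < d.start and d.end < a.end. The numbering scheme, the function names, and the sample document are assumptions for illustration and are not taken from the slides.

```python
import xml.etree.ElementTree as ET

def label_regions(root):
    """Assign each element a (start, end, tag) region by depth-first traversal."""
    regions = []
    counter = 0

    def visit(elem):
        nonlocal counter
        counter += 1
        start = counter
        for child in elem:
            visit(child)
        counter += 1
        regions.append((start, counter, elem.tag))

    visit(root)
    return sorted(regions)        # document order (by start position)

def structural_join(ancestors, descendants):
    """Return all (a, d) pairs where region a contains region d.

    Both inputs must be sorted by start position. A stack holds the chain
    of nested ancestor candidates that are still open.
    """
    pairs = []
    stack = []
    i = 0
    for d in descendants:
        # Push every ancestor candidate that starts before d.
        while i < len(ancestors) and ancestors[i][0] < d[0]:
            a = ancestors[i]
            while stack and stack[-1][1] < a[0]:
                stack.pop()       # closed before a starts
            stack.append(a)
            i += 1
        # Drop candidates that closed before d starts.
        while stack and stack[-1][1] < d[0]:
            stack.pop()
        # Everything left on the stack contains d.
        pairs.extend((a, d) for a in stack)
    return pairs

doc = ET.fromstring(
    "<author><name><first>William</first><last>Shakespeare</last></name>"
    "<pubs><book><chapter><reference/></chapter><chapter/></book></pubs>"
    "</author>")
regions = label_regions(doc)

def by_tag(tag):
    return [r for r in regions if r[2] == tag]

author_refs = structural_join(by_tag("author"), by_tag("reference"))
chapter_refs = structural_join(by_tag("chapter"), by_tag("reference"))
print(author_refs, chapter_refs)
```

Each input list is scanned once, which is why joins of this kind are attractive when element lists are already stored in document order.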

Querying Vertically Distributed XML Collections
• Input
  • Fragmentation-unaware tree pattern query
  • Fragmentation schema
• Tasks
  • Annotate tree pattern nodes with corresponding fragments
  • Decompose tree pattern into sub-patterns for individual fragments
  • Convert sub-patterns to local plans using existing techniques (each site is free to choose local strategy)
  • Generate distributed execution plan that specifies how results are combined

Querying Vertically Distributed XML Collections
• Annotate tree pattern nodes
• Decompose tree pattern
• Convert sub-patterns into local plans
• Generate distributed execution plan

author
  / name
    / first  .='William'
    / last   .='Shakespeare'
  // reference
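The first two tasks, annotating pattern nodes with fragments and cutting the pattern at fragment boundaries, can be sketched as follows. The assignment of element types to fragments in FRAGMENT_OF is a hypothetical reading of the fragmentation schema, and the class and function names are assumptions. The sketch does not handle wildcards or descendant edges that pass through intermediate fragments, and it stops before local and distributed plan generation.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class PNode:
    test: str
    axis: str = "/"
    value: Optional[str] = None
    children: list = field(default_factory=list)
    fragment: str = ""            # filled in by annotate()

# Hypothetical fragmentation schema: element type -> fragment name.
FRAGMENT_OF = {
    "author": "fV1", "agent": "fV1",
    "name": "fV2", "first": "fV2", "last": "fV2",
    "pubs": "fV3", "book": "fV3",
    "chapter": "fV4", "reference": "fV4",
}

def annotate(p):
    """Label every pattern node with the fragment holding its element type."""
    p.fragment = FRAGMENT_OF[p.test]   # wildcards are not handled here
    for c in p.children:
        annotate(c)

def decompose(p):
    """Split an annotated pattern into per-fragment sub-patterns.

    Returns (sub_patterns, cut_edges); each cut edge is a tuple
    (parent test, axis, child test) that crosses a fragment boundary.
    """
    subs, cuts, roots = [], [], [p]

    def copy_local(n):
        local = PNode(n.test, n.axis, n.value, [], n.fragment)
        for c in n.children:
            if c.fragment == n.fragment:
                local.children.append(copy_local(c))
            else:
                cuts.append((n.test, c.axis, c.test))
                roots.append(c)   # c becomes the root of another sub-pattern
        return local

    while roots:
        subs.append(copy_local(roots.pop(0)))
    return subs, cuts

pattern = PNode("author", children=[
    PNode("name", "/", children=[
        PNode("first", "/", "William"),
        PNode("last", "/", "Shakespeare"),
    ]),
    PNode("reference", "//"),
])

annotate(pattern)
subs, cuts = decompose(pattern)
for s in subs:
    print(s.fragment, s.test, [c.test for c in s.children])
print(cuts)
```

Under this assumed schema the example pattern yields three sub-patterns, one each for the fragments holding author, name, and reference, and the two cut edges record how a distributed plan would have to recombine their results.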
