Generating Efficient Execution Plans for Vertically Partitioned XML Databases

slide-1
SLIDE 1

Generating Efficient Execution Plans for Vertically Partitioned XML Databases

Patrick Kling, M. Tamer Özsu, and Khuzaima Daudjee

University of Waterloo, David R. Cheriton School of Computer Science

VLDB 2011

slide-2
SLIDE 2

The Problem

  • Centralized query evaluation techniques for XML are well understood
  • These techniques do not scale to large collection sizes and heavy workloads
  • Goal: use distribution to improve scalability
  • Focus on end-to-end cost of query evaluation

slide-3
SLIDE 3

Distributed XML Query Evaluation: Two Scenarios

  • Integrating multiple data sources
      • Fragmentation is determined by existing data sources
      • Need flexible fragmentation model to express this
  • Distribution for performance
      • Choose fragmentation to suit workload
      • Can use more constrained fragmentation model
      • Fragmentation specification allows for distributed query optimization

slide-5
SLIDE 5

Outline

1 Fragmenting XML Collections
2 Querying Distributed XML Collections
    Query Model
    Distributed Query Evaluation
    Improving Performance
3 Performance Evaluation
4 Conclusion

slide-7
SLIDE 7

Fragmenting XML Collections

  • Ad-hoc fragmentation
  • Structure-based fragmentation


slide-8
SLIDE 8

Ad-hoc fragmentation

  • Cut arbitrary edges in document tree
  • Highly flexible (good for data integration)
  • No explicit fragmentation specification
  • Limited potential for exploiting fragmentation characteristics for query optimization
  • Not a suitable choice for this work

slide-9
SLIDE 9

Structure-based Fragmentation

  • Fragmentation according to characteristics of data or schema
  • Yields a fragmentation specification that can be exploited for query optimization
  • Better choice when distributing for performance

slide-10
SLIDE 10

Our Fragmentation Model

  • Focus on simplicity and precise fragmentation specification
  • Focus on partitioning collection (replication is orthogonal)
  • Follow semantics of relational fragmentation techniques
      • Horizontal fragmentation (based on predicates/selection)
      • Vertical fragmentation (based on partitioning of schema/projection)
      • Hybrid fragmentation (combination of horizontal and vertical steps)

slide-12
SLIDE 12

Vertical Fragmentation

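[Figure: a document tree split into three vertical fragments that are linked by proxy nodes]

  • Fragment f^V_1: author2 with proxy nodes P^{1→2}_{13} and P^{1→3}_{14}
  • Fragment f^V_2: root proxy RP^{1→2}_{13} with name2 (first2 = "Jane", last2 = "Dean")
  • Fragment f^V_3: root proxy RP^{1→3}_{14} with pubs2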

slide-13
SLIDE 13

Vertical Fragmentation Specification

Vertical fragmentation is specified by a fragmentation schema.

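[Figure: fragmentation schema; edges between element types carry the cardinality annotations ONCE, OPT, and MULT]

  • Fragment f^V_1: author, agent
  • Fragment f^V_2: name, first, last
  • Fragment f^V_3: pubs, book
  • Fragment f^V_4: chapter, reference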
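The effect of such a schema on a single document can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the element-to-fragment mapping is read off the schema above, while the `P`/`RP` element names, the `id`/`to`/`from` attributes, and the counter used for proxy IDs are hypothetical choices for this sketch, not the authors' storage format.

```python
import itertools
import xml.etree.ElementTree as ET
from collections import defaultdict

# Element name -> fragment number, read off the fragmentation schema above.
FRAGMENT_OF = {
    "author": 1, "agent": 1,
    "name": 2, "first": 2, "last": 2,
    "pubs": 3, "book": 3,
    "chapter": 4, "reference": 4,
}

def vertical_fragments(root):
    """Split one document tree into per-fragment trees linked by proxy pairs."""
    ids = itertools.count(1)        # hypothetical proxy ID generator
    fragments = defaultdict(list)   # fragment number -> list of subtree roots

    def copy_into(src, frag):
        node = ET.Element(src.tag)
        node.text = src.text
        for child in src:
            child_frag = FRAGMENT_OF[child.tag]
            if child_frag == frag:
                node.append(copy_into(child, frag))
            else:
                # Cut the edge: leave a proxy (P) here and start a
                # root proxy (RP) with the same ID in the other fragment.
                pid = str(next(ids))
                ET.SubElement(node, "P", {"id": pid, "to": str(child_frag)})
                rp = ET.Element("RP", {"id": pid, "from": str(frag)})
                rp.append(copy_into(child, child_frag))
                fragments[child_frag].append(rp)
        return node

    top = FRAGMENT_OF[root.tag]
    fragments[top].append(copy_into(root, top))
    return fragments

doc = ET.fromstring(
    "<author><name><first>Jane</first><last>Dean</last></name><pubs/></author>"
)
fragments = vertical_fragments(doc)
for frag in sorted(fragments):
    for tree in fragments[frag]:
        print(frag, ET.tostring(tree, encoding="unicode"))
```

Each cut edge produces one proxy in the parent fragment and one root proxy with the same ID in the child fragment, which is what later allows results from different fragments to be joined on IDs.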

slide-15
SLIDE 15

Query model

XQ, subset of XPath

  • Nested paths with child and descendant steps
  • Explicit node tests and wild cards
  • Value constraints (numeric or textual)
  • Grammar:

    Q := σ | ∗ | Q//Q | Q/Q | Q[q]
    q := Q | . {=, ≠} str | . {=, ≠, ≤, <, ≥, >} num

slide-20
SLIDE 20

Query Example

“Find all references in publications written by authors whose first name is ‘William’ and whose last name is ‘Shakespeare’”

/author[./name[./first = "William" and ./last = "Shakespeare"]]//reference

  • Node tests
  • Value constraints
  • Structural constraints

slide-25
SLIDE 25

Tree Patterns

[Figure: tree pattern for the example query]

    author
      /  name
           /  first  [. = 'William']
           /  last   [. = 'Shakespeare']
      // reference   (extraction point)

  • Pattern nodes with node tests and value constraints
  • Edges annotated with XPath axes
  • Extraction point nodes
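A tree pattern of this kind maps naturally onto a small recursive data structure. The following is a minimal sketch; the class name `PatternNode`, its field names, and the helper `to_xpath` are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class PatternNode:
    """One node of a tree pattern (all names are illustrative)."""
    test: str                      # node test: element name or "*"
    axis: str = "/"                # axis on the edge from the parent: "/" or "//"
    value: Optional[str] = None    # optional value constraint (. = value)
    extract: bool = False          # True for extraction point nodes
    children: List["PatternNode"] = field(default_factory=list)

def has_extract(node):
    """True if the node or one of its descendants is an extraction point."""
    return node.extract or any(has_extract(c) for c in node.children)

def to_xpath(node):
    """Render a pattern back into path syntax: branches that do not lead
    to the extraction point become predicates."""
    text = node.axis + node.test
    predicates = ["." + to_xpath(c) for c in node.children if not has_extract(c)]
    if predicates:
        text += "[" + " and ".join(predicates) + "]"
    if node.value is not None:
        text += f" = '{node.value}'"
    for child in node.children:
        if has_extract(child):
            text += to_xpath(child)
    return text

pattern = PatternNode("author", children=[
    PatternNode("name", children=[
        PatternNode("first", value="William"),
        PatternNode("last", value="Shakespeare"),
    ]),
    PatternNode("reference", axis="//", extract=True),
])
print(to_xpath(pattern))
```

Printing the pattern reproduces the example query (with single quotes around the string constants), which is a quick check that the structure captures node tests, value constraints, axes, and the extraction point.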

slide-34
SLIDE 34

Evaluating Tree Pattern Queries

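[Figure: the tree pattern is matched against a document tree; a^e_1 marks the extraction point]

    author
      /  name
           /  first  [. = 'William']
           /  last   [. = 'Shakespeare']
      // reference   (a^e_1)

Document nodes, in document order: author4, name4, first4 ("William"), last4 ("Shakespeare"), pubs4, book4, chapter4, reference4, chapter5

Result: [a^e_1 = reference4]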

slide-35
SLIDE 35

Evaluating Tree Pattern Queries

  • Various centralized approaches exist
      • Navigating document trees
      • Structural joins
  • We leverage these for distributed query evaluation
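As a rough illustration of the navigational approach, the sketch below matches the example pattern against a small document by walking the tree. It is a naive matcher written for this example, not the algorithm used by any particular system; the dictionary layout of the pattern, the `id` attribute, and the exact nesting of the sample document are assumptions.

```python
import xml.etree.ElementTree as ET

# Sample document; the nesting below pubs is an assumption for this sketch.
DOC = ET.fromstring(
    "<author><name><first>William</first><last>Shakespeare</last></name>"
    "<pubs><book><chapter><reference id='reference4'/></chapter><chapter/>"
    "</book></pubs></author>"
)

PATTERN = {"test": "author", "axis": "/", "children": [
    {"test": "name", "axis": "/", "children": [
        {"test": "first", "axis": "/", "value": "William"},
        {"test": "last", "axis": "/", "value": "Shakespeare"},
    ]},
    {"test": "reference", "axis": "//", "extract": True},
]}

def candidates(elem, axis):
    """Children for '/', all proper descendants for '//'."""
    if axis == "/":
        return list(elem)
    return [d for d in elem.iter() if d is not elem]

def node_ok(p, elem):
    """Check the node test and the optional value constraint."""
    if p["test"] not in ("*", elem.tag):
        return False
    return p.get("value") is None or (elem.text or "").strip() == p["value"]

def embeddings(p, elem):
    """Return the extraction-point elements of matches rooted at elem,
    or None if the sub-pattern p does not match here."""
    if not node_ok(p, elem):
        return None
    results = [elem] if p.get("extract") else []
    for child in p.get("children", []):
        found = False
        for cand in candidates(elem, child["axis"]):
            sub = embeddings(child, cand)
            if sub is not None:
                found = True
                results.extend(sub)
        if not found:
            return None   # a required branch has no match
    return results

matches = embeddings(PATTERN, DOC)
print([m.get("id") for m in matches])
```

The matcher revisits subtrees and may report an extraction node more than once on patterns with several matching paths; structural joins avoid this kind of repeated navigation.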

slide-36
SLIDE 36

Querying Vertically Distributed XML Collections

  • Input
      • Fragmentation-unaware tree pattern query
      • Fragmentation schema
  • Tasks
      • Annotate tree pattern nodes with corresponding fragments
      • Decompose tree pattern into sub-patterns for individual fragments
      • Convert sub-patterns to local plans using existing techniques (each site is free to choose local strategy)
      • Generate distributed execution plan that specifies how results are combined
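The first two tasks can be sketched as follows. This is a simplified illustration: the element-to-fragment mapping and the fragment hierarchy are read off the fragmentation schema and proxy names used in this talk, while the dictionary layout of the pattern and the string form of the output are hypothetical. Wildcard node tests and child edges that span more than one fragment boundary are not handled.

```python
from collections import defaultdict

# Annotation source: element name -> fragment, and fragment -> parent fragment.
FRAGMENT_OF = {
    "author": 1, "agent": 1,
    "name": 2, "first": 2, "last": 2,
    "pubs": 3, "book": 3,
    "chapter": 4, "reference": 4,
}
PARENT_FRAGMENT = {2: 1, 3: 1, 4: 3}

PATTERN = {"test": "author", "axis": "/", "children": [
    {"test": "name", "axis": "/", "children": [
        {"test": "first", "axis": "/", "value": "William"},
        {"test": "last", "axis": "/", "value": "Shakespeare"},
    ]},
    {"test": "reference", "axis": "//"},
]}

def label(p):
    """Printable form of a pattern node."""
    return p["test"] + (f"[.='{p['value']}']" if p.get("value") else "")

def decompose(p, subs=None):
    """Collect, per fragment, the pattern edges of its sub-pattern.
    Edges that cross fragments are replaced by proxy / root proxy pairs."""
    subs = defaultdict(list) if subs is None else subs
    src = FRAGMENT_OF[p["test"]]
    for child in p.get("children", []):
        # Fragments on the way from the child's fragment up to src.
        chain, frag = [], FRAGMENT_OF[child["test"]]
        while frag != src:
            chain.append(frag)
            frag = PARENT_FRAGMENT[frag]
        parent_label, frag = label(p), src
        for nxt in reversed(chain):
            subs[frag].append(f"{parent_label} {child['axis']} P{frag}→{nxt}")
            parent_label, frag = f"RP{frag}→{nxt}", nxt
        subs[frag].append(f"{parent_label} {child['axis']} {label(child)}")
        decompose(child, subs)
    return dict(subs)

for frag, edges in sorted(decompose(PATTERN).items()):
    print(frag, edges)
```

For the example query, the descendant edge from author to reference passes through fragment 3 even though no pattern node lives there, so fragment 3 receives a sub-pattern consisting only of a root proxy and a proxy.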

slide-41
SLIDE 41

Querying Vertically Distributed XML Collections

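  • Annotate tree pattern nodes
  • Decompose tree pattern
  • Convert sub-patterns into local plans
  • Generate distributed execution plan

[Figure: tree pattern annotated with fragments]

    author                              (f^V_1)
      /  name                           (f^V_2)
           /  first  [. = 'William']      (f^V_2)
           /  last   [. = 'Shakespeare']  (f^V_2)
      // reference                      (f^V_4)

[Figure: decomposition into one sub-pattern per fragment]

  • q^1_1(f^V_1): author with / a^p_2 : P^{1→2} and // a^p_3 : P^{1→3}
  • q^2_1(f^V_2): a^{rp}_2 : RP^{1→2} with / name; name with / first [. = 'William'] and / last [. = 'Shakespeare']
  • q^3_1(f^V_3): a^{rp}_3 : RP^{1→3} with // a^p_4 : P^{3→4}
  • q^4_1(f^V_4): a^{rp}_4 : RP^{3→4} with // a^e_1 : reference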

slide-45
SLIDE 45

Querying Vertically Distributed XML Collections

  • Annotate tree pattern nodes
  • Decompose tree pattern
  • Convert sub-patterns into local plans
  • Generate distributed execution plan

[Figure: distributed execution plan; each local plan p^i_1(f^V_i) evaluates sub-pattern q^i_1(f^V_i), and results are joined on proxy IDs]

    ⋈ id(a^p_3) = id(a^{rp}_3)
      ├─ ⋈ id(a^p_2) = id(a^{rp}_2)
      │    ├─ p^1_1(f^V_1)
      │    └─ p^2_1(f^V_2)
      └─ ⋈ id(a^p_4) = id(a^{rp}_4)
           ├─ p^3_1(f^V_3)
           └─ p^4_1(f^V_4)
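The cross-fragment joins in such a plan are equality joins on proxy IDs. A minimal sketch, assuming each local plan returns a list of tuples (here dictionaries) that carry the IDs of the proxy and root proxy nodes they matched; all tuple contents below are made-up sample data, and `hash_join` is a hypothetical helper rather than part of any system mentioned in the talk.

```python
from collections import defaultdict

def hash_join(left, right, left_key, right_key):
    """Equality join of two lists of dict tuples on left_key = right_key."""
    index = defaultdict(list)
    for r in right:
        index[r[right_key]].append(r)
    return [{**l, **r} for l in left for r in index.get(l[left_key], [])]

# Made-up local results: one list per local plan.
p1 = [{"ap2": 13, "ap3": 14}, {"ap2": 15, "ap3": 16}]    # authors with their proxies
p2 = [{"arp2": 15}]                                       # names that satisfy the value constraints
p3 = [{"arp3": 14, "ap4": 19}, {"arp3": 16, "ap4": 21}]  # proxies below pubs
p4 = [{"arp4": 19, "ae1": "reference2"},
      {"arp4": 21, "ae1": "reference4"}]                  # references

# Same join tree as the plan above: (p1 ⋈ p2) ⋈ (p3 ⋈ p4).
left = hash_join(p1, p2, "ap2", "arp2")
right = hash_join(p3, p4, "ap4", "arp4")
result = hash_join(left, right, "ap3", "arp3")
print([t["ae1"] for t in result])
```

Note how p4 computes a tuple for reference2 that the final join discards; this wasted local work is what the improvements that follow try to avoid.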

slide-46
SLIDE 46

Improving Distributed Execution Plans

  • Pruning irrelevant fragments
  • Join order
  • Push cross-fragment joins into local plans


slide-50
SLIDE 50

Pushing Cross-Fragment Joins

A large fraction of local results is discarded by the cross-fragment join.

[Figure: fragment f^V_4 with three sub-trees, each below its own root proxy]

  • RP^{3→4}_{19}: chapter2, reference2
  • RP^{3→4}_{20}: chapter3, reference3
  • RP^{3→4}_{21}: chapter4, reference4

  • Idea: only access relevant sub-trees in fragment
  • Avoid computing irrelevant local results
  • Use pipelining to push cross-fragment join into local plan
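The difference between joining after the fact and pushing the join into the local plan can be sketched as follows. This is an illustration only: the fragment is modeled as a dictionary from root proxy ID to the references below it (IDs taken from the figure above), and the function names and the `accessed` log are hypothetical.

```python
# Fragment f^V_4 modeled as: root proxy ID -> references in that sub-tree.
FRAGMENT_4 = {19: ["reference2"], 20: ["reference3"], 21: ["reference4"]}

def evaluate_subtree(fragment, rp_id, accessed):
    """Evaluate the local sub-pattern below one root proxy node."""
    accessed.append(rp_id)   # record which sub-trees were touched
    return [(rp_id, ref) for ref in fragment[rp_id]]

def plan_without_pushing(fragment, upstream_ids, accessed):
    """Compute all local results first, then apply the cross-fragment join."""
    local = [t for rp_id in fragment
             for t in evaluate_subtree(fragment, rp_id, accessed)]
    wanted = set(upstream_ids)
    return [t for t in local if t[0] in wanted]

def plan_with_pushing(fragment, upstream_ids, accessed):
    """Consume proxy IDs from the upstream plan one at a time and only
    evaluate sub-trees whose root proxy matches (a dictionary lookup
    stands in for an index on root proxy nodes)."""
    for proxy_id in upstream_ids:
        if proxy_id in fragment:
            yield from evaluate_subtree(fragment, proxy_id, accessed)

touched_plain, touched_pushed = [], []
plain = plan_without_pushing(FRAGMENT_4, [21], touched_plain)
pushed = list(plan_with_pushing(FRAGMENT_4, iter([21]), touched_pushed))
print(plain, touched_plain)
print(pushed, touched_pushed)
```

Both variants return the same result, but the pushed variant touches only the sub-tree below root proxy 21, and it can start producing output as soon as the first upstream ID arrives.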

slide-52
SLIDE 52

A Local Query Plan

[Figure: local plan p^2_1(f^V_2), built from structural joins. Operators shown: projection π on a^{rp}_2; semi-joins ⋉ a1/a3 and ⋉ a1/a2; join ⋈ a^{rp}_2/a1; scans of a^{rp}_2 : RP^{1→2}_∗, a1 : name, a2 : first (with selection σ a2 = 'William'), and a3 : last (with selection σ a3 = 'Shakespeare')]

  • Plan scans root proxy nodes in fragment
  • Idea: filter these root proxy nodes before evaluating remainder of plan
  • Works for navigating plans and plans based on structural joins (shown here)

slide-53
SLIDE 53

A Local Query Plan

[Figure: the same local plan p^2_1(f^V_2) after the change: the scan of a^{rp}_2 : RP^{1→2}_∗ now passes through an additional join (⋈ ...) before the rest of the plan, and the projection becomes π a^{rp}_2, ...]

  • Plan scans root proxy nodes in fragment
  • Idea: filter these root proxy nodes before evaluating remainder of plan
  • Works for navigating plans and plans based on structural joins (shown here)

slide-54
SLIDE 54

Pushing Cross-Fragment Joins

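[Figure: distributed plan with cross-fragment joins pushed into the local plans. Starting from p^1_1(f^V_1), each join is evaluated inside the next modified local plan, directly on that plan's scan of root proxy nodes]

  • p^{2′}_1(f^V_2): ⋈ id(a^p_2) = id(a^{rp}_2) on scan a^{rp}_2 : RP^{1→2}_∗
  • p^{3′}_1(f^V_3): ⋈ id(a^p_3) = id(a^{rp}_3) on scan a^{rp}_3 : RP^{1→3}_∗
  • p^{4′}_1(f^V_4): ⋈ id(a^p_4) = id(a^{rp}_4) on scan a^{rp}_4 : RP^{3→4}_∗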

slide-58
SLIDE 58

Pushing Cross-Fragment Joins: Implementation

  • Can use full pipelining if both inputs to join are ordered
  • Alternatively, can use index on root proxy nodes
  • Full parallelism after first tuple received by local plan

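When both inputs arrive ordered by ID, the join can be a streaming merge that never materializes either side. A minimal sketch, assuming integer IDs in ascending order that are unique within each stream; `merge_join_ids` is a hypothetical helper written for this illustration.

```python
def merge_join_ids(proxy_ids, root_proxy_ids):
    """Yield the IDs that occur in both ascending streams, reading each
    input lazily so that output can start before the inputs are exhausted."""
    left, right = iter(proxy_ids), iter(root_proxy_ids)
    l, r = next(left, None), next(right, None)
    while l is not None and r is not None:
        if l == r:
            yield l
            l, r = next(left, None), next(right, None)
        elif l < r:
            l = next(left, None)
        else:
            r = next(right, None)

upstream = (i for i in [13, 15, 21])        # proxy IDs from the upstream plan
local = (i for i in [19, 20, 21])           # root proxy IDs scanned locally
print(list(merge_join_ids(upstream, local)))
```

If the inputs are not ordered, an index lookup on root proxy IDs serves the same purpose at the cost of random access.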

slide-59
SLIDE 59

Pushing Cross-Fragment Joins

  • Avoids accessing large portion of sub-trees within a fragment
  • Can only be fully used in left-deep plans
  • Decreases flexibility (e.g., where joins are performed)


slide-60
SLIDE 60

Label Path Filtering

  • Cross-fragment join pushing works well but decreases flexibility
  • Goal: find a solution that can obtain partial benefit for scenarios where join pushing cannot be applied
  • Idea: use selection instead of join to filter out some root proxy nodes

slide-62
SLIDE 62

Label Path Filtering

  • Assign to each proxy node the label path from the document root
  • Filter for label paths that are compatible with the query

Document nodes, in document order: author4, name4, first4 ("William"), last4 ("Shakespeare"), pubs4, book4, chapter4, reference4, chapter5

Example label path: /author/pubs/book

slide-63
SLIDE 63

Label Path Filtering

[Figure: distributed plan that joins p^1_1(f^V_1), p^2_1(f^V_2), p^3_1(f^V_3), and p^{4′}_1(f^V_4) on proxy IDs (id(a^{rp}_i) = id(a^p_i) for i = 2, 3, 4). Inside p^{4′}_1(f^V_4), the scan of root proxy nodes a^{rp}_4 : RP^{3→4}_∗ is filtered by the selection σ label(a^{rp}_4) = /author/pubs/book]

  • Assume there are two types of publications: book and article
  • Can use selection to filter chapters based on publication type
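One way to read "compatible" is that the label path of a root proxy can be matched by the leading steps of the query, with descendant steps free to skip labels. The sketch below implements that reading; it is an assumption made for illustration and may differ from the definition used by the authors. The query /author/pubs/book//reference is a hypothetical variant of the running example chosen so that the publication type matters.

```python
def compatible(steps, labels):
    """Check whether a label path (list of element names from the document
    root) can be matched by the leading steps of a query, given as a list
    of (axis, test) pairs. Steps left over may match below the proxy."""
    def go(si, li):
        if li == len(labels):
            return True            # whole label path consumed
        if si == len(steps):
            return False           # query ends above the proxy
        axis, test = steps[si]
        if test in ("*", labels[li]) and go(si + 1, li + 1):
            return True            # this step matches this label
        # A descendant step may skip over the current label.
        return axis == "//" and go(si, li + 1)
    return go(0, 0)

# Hypothetical query: /author/pubs/book//reference
query = [("/", "author"), ("/", "pubs"), ("/", "book"), ("//", "reference")]

print(compatible(query, ["author", "pubs", "book"]))      # root proxy below a book
print(compatible(query, ["author", "pubs", "article"]))   # root proxy below an article
```

Because the check needs only the label path stored with each root proxy, it is a plain selection and can be applied regardless of join order, at the price of letting through root proxies that a join on IDs would have eliminated.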

slide-64
SLIDE 64

Label Path Filtering

  • Can be used in more cases
  • Retains higher degree of flexibility
  • Benefit is more limited (does not filter all irrelevant root proxy nodes)

slide-65
SLIDE 65

Determining the Best Distributed Execution Plan

  • Join pushing and label path filtering are not always advantageous
  • Determine best execution plan using cost model

slide-67
SLIDE 67

Performance Evaluation

  • Implemented techniques within Natix
  • 12 GB XMark collection (auction data)
  • 1 Amazon EC2 instance for each of 10 vertical fragments

slide-68
SLIDE 68

Performance Evaluation

  • XPathMark queries (with few filtering value constraints)
  • Modified, more selective XPathMark queries

A1    /site/closed_auctions/closed_auction/annotation/description/text/keyword
A2    //closed_auction//keyword
A3    /site/closed_auctions/closed_auction//keyword
A4    /site/closed_auctions/closed_auction[annotation/description/text/keyword]/date
A5    /site/closed_auctions/closed_auction[descendant::keyword]/date
A6    /site/people/person[profile/gender and profile/age]/name
B7    //person[profile/@income]/name

A1S   /site/closed_auctions/closed_auction[price > 600]/annotation/description/text/keyword
A2S   //closed_auction[price > 600]//keyword
A3S   /site/closed_auctions/closed_auction[price > 600]//keyword
A4S   /site/closed_auctions/closed_auction[price > 600][annotation/description/text/keyword]/date
A5S   /site/closed_auctions/closed_auction[price > 600][descendant::keyword]/date
A6S   /site/people/person[starts-with(name, 'Ry')][profile/gender and profile/age]/name
B7S   //person[starts-with(name, 'Ry')][profile/@income]/name

slide-69
SLIDE 69

Performance Evaluation: XPathMark

[Figure: bar chart of response time (seconds, axis from 200 to 1400) for queries A1 to A6 and B7, comparing the plans cent, dist, and push]

slide-70
SLIDE 70

Performance Evaluation: Selective XPathMark

[Figure: bar chart of response time (seconds, axis from 200 to 1400) for the queries labeled A1 to A6 and B7, comparing the plans cent, dist, and push]

slide-71
SLIDE 71

Conclusions

  • Distribution can make XML query evaluation more scalable
  • Join pushing can significantly improve query performance
  • A cost model is essential for finding the optimal technique for a given query

slide-72
SLIDE 72

References

[1] Patrick Kling, M. Tamer Özsu, Khuzaima Daudjee: Generating Efficient Execution Plans for Vertically Partitioned XML Databases, PVLDB 2010.
[2] Patrick Kling, M. Tamer Özsu, Khuzaima Daudjee: Scaling XML Query Processing: Distribution, Localization and Pruning, DAPD 2011.