
Massive Data Algorithmics Lecture 4: External Search Trees



  1. Outline: Introduction; Weight-balanced B-tree; Persistent trees.

  2. Database queries
      A database query may ask for all employees with age between a₁ and a₂, and salary between s₁ and s₂.
      [Figure: employees plotted as points, with date of birth on the horizontal axis (19,500,000 to 19,559,999) and salary on the vertical axis. Example point: G. Ometer, born Aug 16, 1954, salary $3,500.]

  3. Balanced binary search trees
      A balanced binary search tree with the points in the leaves.
      [Figure: binary search tree with root 49, internal nodes 23 and 80 on the next level, then 10, 37, 62, 89; the leaves are 3, 10, 19, 23, 30, 37, 49, 59, 62, 70, 80, 89, 93, 97.]

  4. Balanced binary search trees
      The search path for 25.
      [Figure: the search tree from slide 3 with the search path for 25 highlighted.]

  5. Balanced binary search trees
      The search paths for 25 and for 90.
      [Figure: the search tree from slide 3 with the search paths for 25 and for 90 highlighted.]

  6. Example 1D range query
      A 1-dimensional range query with [25, 90].
      [Figure: the search tree from slide 3 with the nodes visited by the query [25, 90] highlighted.]

  7. Example 1D range query
      A 1-dimensional range query with [61, 90].
      [Figure: the search tree from slide 3 with the nodes visited by the query [61, 90] highlighted and the split node marked.]

  8. Node types for a query
      Three types of nodes for a given query:
      - White nodes: never visited by the query
      - Grey nodes: visited by the query, unclear if they lead to output
      - Black nodes: visited by the query, whole subtree is output

  9. Examining 1D range queries
      For any 1D range query, we can identify $O(\log n)$ nodes whose subtrees together contain exactly the answers to the query.
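
The node classification and the $O(\log n)$ claim above can be illustrated with a minimal Python sketch. It is not the lecture's code: the names `Node`, `build`, `canonical_nodes` and `report` are hypothetical, and each node stores the smallest and largest key of its subtree instead of a single split value, which keeps the sketch short.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    lo: int                         # smallest key stored below this node
    hi: int                         # largest key stored below this node
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(keys: List[int]) -> Node:
    """Build a balanced tree with the (sorted, non-empty) keys in the leaves."""
    if len(keys) == 1:
        return Node(keys[0], keys[0])
    mid = (len(keys) + 1) // 2
    left, right = build(keys[:mid]), build(keys[mid:])
    return Node(left.lo, right.hi, left, right)


def canonical_nodes(node: Optional[Node], a: int, b: int) -> List[Node]:
    """Return the 'black' nodes for the query [a, b], from left to right."""
    if node is None or node.hi < a or node.lo > b:
        return []                   # subtree lies outside the query range
    if a <= node.lo and node.hi <= b:
        return [node]               # black: the whole subtree is output
    # grey: the subtree overlaps the range only partially, so descend
    return canonical_nodes(node.left, a, b) + canonical_nodes(node.right, a, b)


def report(node: Node) -> List[int]:
    """List all keys in the leaves below node."""
    if node.left is None:
        return [node.lo]
    return report(node.left) + report(node.right)


keys = [3, 10, 19, 23, 30, 37, 49, 59, 62, 70, 80, 89, 93, 97]
root = build(keys)
black = canonical_nodes(root, 25, 90)
answer = [key for node in black for key in report(node)]
print([(n.lo, n.hi) for n in black], answer)
```

The query descends along at most two root-to-leaf paths, so the number of black nodes is bounded by roughly twice the height of the tree.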

  10. Toward 2D range queries
      For any 2D range query, we can identify $O(\log n)$ nodes that together represent all points that have a correct first coordinate.

  11. Toward 2D range queries
      [Figure: search tree on x-coordinate over the points (1, 5), (3, 8), (4, 2), (5, 9), (6, 7), (7, 3), (8, 1), (9, 4).]

  12. Toward 2D range queries (continued)
      [Figure: the same point set as on slide 11.]

  13. Toward 2D range queries
      [Figure: the point set (1, 5), (3, 8), (4, 2), (5, 9), (6, 7), (7, 3), (8, 1), (9, 4), with a data structure for searching on y-coordinate attached.]

  14. Toward 2D range queries
      [Figure: the points sorted by x-coordinate, (1, 5), (3, 8), (4, 2), (5, 9), (6, 7), (7, 3), (8, 1), (9, 4), together with the same points ordered by y-coordinate: (5, 9), (3, 8), (6, 7), (1, 5), (9, 4), (7, 3), (4, 2), (8, 1).]

  15. 2D range trees
      Every internal node stores a whole tree in an associated structure, on y-coordinate.
      Question: How much storage does this take?

  16. 2D range queries
      [Figure: a 2D range query on the range tree; nodes ν, µ and µ′ are marked, along with points labelled p.]

  17. 2D range queries
      [Figure: the same query, with nodes ν, µ and µ′ marked.]
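
The 2D range tree idea from the preceding slides can be sketched as follows. This is an illustrative simplification, not the lecture's structure: a sorted list of (y, x) pairs stands in for the associated tree on y-coordinate, and the names `RangeNode`, `query` and `stored_entries` are hypothetical. Counting the entries in all associated lists gives one way to explore the storage question from slide 15.

```python
from bisect import bisect_left, bisect_right
from typing import List, Optional, Tuple

Point = Tuple[int, int]


class RangeNode:
    def __init__(self, pts: List[Point]):
        """pts must be non-empty and sorted by x-coordinate."""
        self.xlo, self.xhi = pts[0][0], pts[-1][0]
        # associated structure: all points below this node, sorted by y
        self.by_y = sorted((y, x) for x, y in pts)
        self.left: Optional["RangeNode"] = None
        self.right: Optional["RangeNode"] = None
        if len(pts) > 1:
            mid = len(pts) // 2
            self.left = RangeNode(pts[:mid])
            self.right = RangeNode(pts[mid:])


def query(node: Optional[RangeNode], x1, x2, y1, y2) -> List[Point]:
    """Report all points with x1 <= x <= x2 and y1 <= y <= y2."""
    if node is None or node.xhi < x1 or node.xlo > x2:
        return []
    if x1 <= node.xlo and node.xhi <= x2:
        # every point below has a correct x: search the associated list on y
        i = bisect_left(node.by_y, (y1, float("-inf")))
        j = bisect_right(node.by_y, (y2, float("inf")))
        return [(x, y) for y, x in node.by_y[i:j]]
    return query(node.left, x1, x2, y1, y2) + query(node.right, x1, x2, y1, y2)


def stored_entries(node: Optional[RangeNode]) -> int:
    """Total number of entries over all associated lists."""
    if node is None:
        return 0
    return len(node.by_y) + stored_entries(node.left) + stored_entries(node.right)


points = [(1, 5), (3, 8), (4, 2), (5, 9), (6, 7), (7, 3), (8, 1), (9, 4)]
tree = RangeNode(sorted(points))
print(sorted(query(tree, 3, 7, 2, 8)))
print(stored_entries(tree))
```

Each point appears in one associated list per level of the primary tree, so in this sketch the total storage grows as $O(n \log n)$.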

  18. Secondary structures
      When secondary structures are used, a rebalance on $v$ often requires $O(w(v))$ I/Os ($w(v)$ is the weight of $v$).
      - If $\Omega(w(v))$ inserts have to be made below $v$ between operations ⇒ $O(1)$ amortized split bound ⇒ $O(\log_B N)$ amortized insert bound
      Nodes in a standard B-tree do not have this property.
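
The amortization argument on this slide can be spelled out as below. The constants $c_1$ and $c_2$ are names introduced here for illustration only; they stand for the hidden constants in the $O(w(v))$ and $\Omega(w(v))$ bounds.

```latex
\begin{align*}
\text{cost of one rebalance of } v &\le c_1\, w(v) \text{ I/Os},\\
\text{inserts below } v \text{ between two rebalances of } v &\ge c_2\, w(v),\\
\text{amortized rebalancing charge per insert at } v
  &\le \frac{c_1\, w(v)}{c_2\, w(v)} = \frac{c_1}{c_2} = O(1),\\
\text{amortized cost of one insert}
  &= \underbrace{O(\log_B N)}_{\text{nodes on the root-to-leaf path}} \cdot\, O(1)
   = O(\log_B N).
\end{align*}
```

Each insert passes through one node per level, so it is charged $O(1)$ at each of the $O(\log_B N)$ nodes on its path.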

  19. BB[$\alpha$]-tree
      In internal memory, BB[$\alpha$]-trees have the desired property.
      Defined using a weight constraint:
      - The ratio between the weight of the left child and the weight of the right child of a node $v$ is between $\alpha$ and $1 - \alpha$ ($\alpha < 1$) ⇒ height $O(\log N)$
      If $2/11 < \alpha < 1 - \tfrac{1}{2}\sqrt{2}$, rebalancing can be performed using rotations.
      It seems hard to implement BB[$\alpha$]-trees I/O-efficiently.

  20. Weight-balanced B-tree
      Idea: combination of B-tree and BB[$\alpha$]-tree
      - Weight constraint on nodes instead of degree constraint
      - Rebalancing performed using split/fuse as in a B-tree
      Weight-balanced B-tree with parameters $b$ and $k$ ($b > 8$, $k \ge 8$):
      - All leaves are on the same level and contain between $k/4$ and $k$ elements
      - An internal node $v$ at level $l$ has $w(v) < b^l k$
      - Except for the root, an internal node $v$ at level $l$ has $w(v) > \tfrac{1}{4} b^l k$
      - The root has more than one child
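
The definition above can be turned into a small invariant checker. The sketch below is only an illustration under stated assumptions: `WBNode`, `weight` and `satisfies_invariants` are hypothetical names, leaves are taken to be at level 0, a root that is itself a leaf is allowed to hold fewer than $k/4$ elements, and weights are recomputed from scratch instead of being stored in the nodes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class WBNode:
    level: int                                   # leaves are at level 0 (assumption)
    children: List["WBNode"] = field(default_factory=list)
    elements: List[int] = field(default_factory=list)


def weight(v: WBNode) -> int:
    """w(v): number of elements stored in the leaves below v."""
    if v.level == 0:
        return len(v.elements)
    return sum(weight(c) for c in v.children)


def satisfies_invariants(root: WBNode, b: int, k: int) -> bool:
    def check(v: WBNode, is_root: bool) -> bool:
        w = weight(v)
        cap = b ** v.level * k
        if v.level == 0:
            # leaf: between k/4 and k elements (lower bound waived for a root leaf)
            return w <= k and (is_root or 4 * w >= k)
        if any(c.level != v.level - 1 for c in v.children):
            return False                         # all leaves must be on the same level
        if w >= cap:
            return False                         # w(v) < b^l k
        if is_root:
            if len(v.children) < 2:
                return False                     # the root has more than one child
        elif 4 * w <= cap:
            return False                         # w(v) > (1/4) b^l k
        return all(check(c, False) for c in v.children)

    return check(root, True)


good = WBNode(1, children=[WBNode(0, elements=list(range(8))) for _ in range(2)])
bad = WBNode(1, children=[WBNode(0, elements=[1]), WBNode(0, elements=list(range(8)))])
print(satisfies_invariants(good, b=16, k=8), satisfies_invariants(bad, b=16, k=8))
```

A real implementation would store $w(v)$ in each node and update it along the search path, since recomputing weights as done here costs time proportional to the subtree size.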

  21. Weight-balanced B-tree
      Every internal node has degree between $\frac{\frac{1}{4} b^l k}{b^{l-1} k} = \tfrac{1}{4} b$ and $\frac{b^l k}{\frac{1}{4} b^{l-1} k} = 4b$ ⇒ height $O(\log_b (N/k))$
      External memory:
      - Choose $4b = B$ (or even $B^c$ for $0 < c \le 1$)
      - $k = B$
      ⇒ $O(N/B)$ space, $O(\log_B (N/B) + T/B)$ query
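
The degree bounds follow from dividing the weight bounds of a node by those of its children. A minimal sketch of that arithmetic, using exact fractions; `weight_bounds` and `degree_bounds` are hypothetical helper names, and the strictness of the inequalities is ignored:

```python
from fractions import Fraction
from typing import Tuple


def weight_bounds(b: int, k: int, level: int) -> Tuple[Fraction, Fraction]:
    """Lower and upper weight bound (1/4 b^l k, b^l k) for a non-root node at a level."""
    cap = b ** level * k
    return Fraction(cap, 4), Fraction(cap)


def degree_bounds(b: int, k: int, level: int) -> Tuple[Fraction, Fraction]:
    """Smallest and largest possible degree of an internal node at level >= 1."""
    lo, hi = weight_bounds(b, k, level)
    child_lo, child_hi = weight_bounds(b, k, level - 1)
    # fewest children: lightest node made of the heaviest children, and vice versa
    return lo / child_hi, hi / child_lo


for level in (1, 2, 3):
    print(level, degree_bounds(b=16, k=64, level=level))
```

With $b = 16$ the bounds are $4$ and $64$ at every level, matching $\tfrac{1}{4} b$ and $4b$ independently of $k$ and $l$.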

  22. Weight-balanced B-tree: Insert
      Search for the relevant leaf $u$ and insert the new element.
      Traverse the path from $u$ to the root:
      - If a level-$l$ node $v$ now has $w(v) = b^l k + 1$, then split it into nodes $v'$ and $v''$ with $w(v') \ge \lfloor \tfrac{1}{2}(b^l k + 1) \rfloor - b^{l-1} k$ and $w(v'') \le \lceil \tfrac{1}{2}(b^l k + 1) \rceil + b^{l-1} k$
      The algorithm is correct since $b^{l-1} k \le \tfrac{1}{8} b^l k$, such that $w(v') \ge \tfrac{3}{8} b^l k$ and $w(v'') \le \tfrac{5}{8} b^l k$.
      - It touches $O(\log_b (N/k))$ nodes
      Weight-balance property: $\Omega(b^l k)$ updates below $v'$ and $v''$ before the next rebalance operation.
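
The split step can be illustrated on the list of child weights of an overfull node. The sketch below is an assumption about how a split point might be chosen (the largest prefix of children whose total weight does not exceed half); `split_children` is a hypothetical name and the function works on weights only, not on actual tree nodes.

```python
from typing import List, Tuple


def split_children(child_weights: List[int]) -> Tuple[List[int], List[int]]:
    """Split the children of an overfull node into two groups of near-equal weight.

    The left group is the longest prefix with weight at most floor(total / 2),
    so it misses half the weight by less than one child's weight.
    """
    total = sum(child_weights)
    half = total // 2
    i, prefix = 0, 0
    while i < len(child_weights) and prefix + child_weights[i] <= half:
        prefix += child_weights[i]
        i += 1
    return child_weights[:i], child_weights[i:]


# Example with b = 16, k = 8, level l = 2: b^l k = 2048, children weigh < b^(l-1) k = 128.
cap, child_cap = 16 ** 2 * 8, 16 * 8
weights = [100] * 20 + [49]          # total weight 2049 = b^l k + 1
left, right = split_children(weights)
print(sum(left), sum(right))
```

Because every child weighs less than $b^{l-1} k$, the left group weighs more than $\lfloor \tfrac{1}{2}(b^l k + 1) \rfloor - b^{l-1} k$ and the right group less than $\lceil \tfrac{1}{2}(b^l k + 1) \rceil + b^{l-1} k$, which is the guarantee stated on the slide.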

  23. Weight-balanced B-tree: Delete
      Search for the relevant leaf $u$ and delete the element.
      Traverse the path from $u$ to the root:
      - If a level-$l$ node $v$ now has $w(v) = \tfrac{1}{4} b^l k - 1$, then fuse it with a sibling into a node $v'$ with $\tfrac{2}{4} b^l k - 1 \le w(v') \le \tfrac{5}{4} b^l k - 1$
      - If now $w(v') \ge \tfrac{7}{8} b^l k$, then split it into nodes with weight $\ge \tfrac{7}{16} b^l k - 1 - b^{l-1} k \ge \tfrac{5}{16} b^l k - 1$ and $\le \tfrac{5}{8} b^l k + b^{l-1} k \le \tfrac{6}{8} b^l k$
      The algorithm is correct and touches $O(\log_b (N/k))$ nodes.
      Weight-balance property: $\Omega(b^l k)$ updates below $v'$ and $v''$ before the next rebalance operation.
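
The fuse-then-split rule for deletions can be sketched in the same style, again on child weights only. `fuse_and_maybe_split` is a hypothetical name, `cap` stands for $b^l k$, and the choice of split point (longest prefix not exceeding half the weight) is an assumption made for this illustration.

```python
from typing import List


def fuse_and_maybe_split(v_children: List[int],
                         sibling_children: List[int],
                         cap: int) -> List[List[int]]:
    """Fuse an underfull node with a sibling; split again if the result is heavy.

    cap is b^l k for the level of the two nodes. Returns one or two groups of
    child weights.
    """
    fused = v_children + sibling_children
    total = sum(fused)
    if 8 * total < 7 * cap:              # w(v') < 7/8 b^l k: keep the fused node
        return [fused]
    half = total // 2
    i, prefix = 0, 0
    while i < len(fused) and prefix + fused[i] <= half:
        prefix += fused[i]
        i += 1
    return [fused[:i], fused[i:]]


# Example with b = 16, k = 8, level l = 2: cap = 2048, underfull weight 511.
cap = 16 ** 2 * 8
underfull = [100, 100, 100, 100, 111]    # weight 511 = cap / 4 - 1
light = fuse_and_maybe_split(underfull, [100] * 6, cap)
heavy = fuse_and_maybe_split(underfull, [100] * 15, cap)
print([sum(g) for g in light], [sum(g) for g in heavy])
```

In the first call the fused node stays below $\tfrac{7}{8} b^l k$ and is kept as one node; in the second it is split, and both halves land between $\tfrac{5}{16} b^l k - 1$ and $\tfrac{6}{8} b^l k$.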

  24. Summary/Conclusion: Weight-balanced B-tree
      Weight-balanced B-tree with branching parameter $b$ and leaf parameter $k = \Omega(B)$:
      - $O(N/B)$ space
      - Height $O(\log_b (N/k))$
      - $O(\log_B N)$ rebalancing operations after an update
      - $\Omega(w(v))$ updates below $v$ between consecutive operations on $v$
      Weight-balanced B-tree with branching parameter $B^c$ and leaf parameter $B$:
      - Updates in $O(\log_B N)$ and queries in $O(\log_B N + T/B)$ I/Os
      Construction bottom-up in $O(\frac{N}{B} \log_{M/B} \frac{N}{B})$ I/Os.
