Massive Data Algorithmics Lecture 7: Range Searching

  1. Massive Data Algorithmics Lecture 7: Range Searching
     Outline:
     - Three-Sided Range Queries
     - Internal Priority Search Tree
     - Externalizing Priority Search Tree

  2. Three-Sided Range Queries
     Interval management: 1.5-dimensional search
     More general 2D problem: dynamic 3-sided range searching
     - Maintain a set of points in the plane such that, given a query $(q_1, q_2, q_3)$, all points $(x, y)$ with $q_1 \le x \le q_2$ and $y \ge q_3$ can be found efficiently
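The query definition above can be pinned down with a brute-force reference, which is useful as a correctness oracle for the faster structures. This is a minimal sketch; the function name `three_sided_naive` and the sample points are illustrative assumptions, not part of the lecture.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def three_sided_naive(points: List[Point], q1: float, q2: float, q3: float) -> List[Point]:
    """Return all (x, y) with q1 <= x <= q2 and y >= q3 by scanning every point."""
    return sorted(p for p in points if q1 <= p[0] <= q2 and p[1] >= q3)


# Hypothetical sample data: the query box is open towards +y.
points = [(1, 5), (2, 9), (4, 3), (6, 7), (8, 8), (9, 1)]
result = three_sided_naive(points, q1=2, q2=8, q3=7)
print(result)
```

The scan touches all N points regardless of the output size T, which is exactly what the structures on the following slides avoid.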

  3. Three-Sided Range Queries: Static Solution
     Static solution:
     - Sweep top-down, inserting $x$ into a persistent B-tree when the sweep reaches $(x, y)$
     - Answer a query by performing a range query with $[q_1, q_2]$ in the version of the B-tree at $q_3$
     Optimal:
     - $O(N/B)$ space
     - $O(\log_B N + T/B)$ query
     - $O(\frac{N}{B} \log_{M/B} \frac{N}{B})$ construction
     Dynamic? In internal memory: priority search tree

  4. Internal Priority Search Tree
     Base tree on x-coordinates with nodes augmented with points
     Heap on y-coordinates:
     - Decreasing y values on each root-leaf path
     - $(x, y)$ stored on the path from the root to the leaf holding $x$
     - If $v$ holds a point then parent($v$) holds a point

  5. Internal Priority Search Tree: Insert
     Linear space
     Insert of $(x, y)$ (assuming a fixed x-coordinate set):
     - Compare $y$ with the y-coordinate of the point in the root
     - Smaller: recursively insert $(x, y)$ in the subtree on the path to $x$
     - Bigger: insert in the root and recursively insert the old point in the subtree on the path to its x-coordinate
     ⇒ $O(\log N)$ update
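The insertion rule can be sketched in a few lines. This is an illustrative sketch, not the lecture's code: it assumes distinct x-coordinates drawn from a fixed set known in advance, and the names `Node`, `insert`, `stored`, and `heap_ok` are hypothetical.

```python
class Node:
    """Node of a balanced base tree over a fixed, sorted set of distinct x-coordinates."""

    def __init__(self, xs):
        self.point = None                  # each node holds at most one (x, y) point
        self.left = self.right = None
        mid = len(xs) // 2
        self.split = xs[mid - 1] if mid else xs[0]   # x <= split belongs to the left subtree
        if len(xs) > 1:
            self.left, self.right = Node(xs[:mid]), Node(xs[mid:])


def insert(node, p):
    """Push p down the path towards its x-coordinate, keeping y-values heap ordered."""
    while True:
        if node.point is None:             # first empty node on the path: store here
            node.point = p
            return
        if p[1] > node.point[1]:           # bigger y: take this node, push the old point down
            node.point, p = p, node.point
        if node.left is None:              # occupied leaf: only possible with a repeated x
            raise ValueError("duplicate x-coordinate")
        node = node.left if p[0] <= node.split else node.right


def stored(node):
    """Yield every point stored in the subtree rooted at node."""
    if node is not None:
        if node.point is not None:
            yield node.point
        yield from stored(node.left)
        yield from stored(node.right)


def heap_ok(node, parent_y=float("inf")):
    """Check decreasing y on root-leaf paths and that no point sits below an empty node."""
    if node is None:
        return True
    if node.point is None:
        return not list(stored(node))
    y = node.point[1]
    return y <= parent_y and heap_ok(node.left, y) and heap_ok(node.right, y)


root = Node([1, 2, 4, 6, 8, 9])            # hypothetical fixed x-coordinate set
for p in [(1, 5), (2, 9), (4, 3), (6, 7), (8, 8), (9, 1)]:
    insert(root, p)
print(root.point)
```

Each insertion follows a single root-to-leaf path, so its cost is bounded by the height of the balanced base tree.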

  10. Internal Priority Search Tree: Query
      Query with $(q_1, q_2, q_3)$ starting at root $v$:
      - Report the point in $v$ if it satisfies the query
      - Visit both children of $v$ if its point is reported
      - Always visit the child(ren) of $v$ on the path(s) to $q_1$ and $q_2$
      ⇒ $O(\log N + T)$ query
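A query along these lines can be sketched as follows. This is an assumption-laden sketch rather than the lecture's code: the tree is built statically (highest point at the node, the rest split by median x), x-coordinates are assumed distinct, and the descent additionally stops at any node whose y is below $q_3$, which heap order makes safe. The names `PSTNode`, `build`, and `query` are hypothetical.

```python
class PSTNode:
    def __init__(self, point):
        self.point = point
        self.split = point[0]      # x <= split lies in the left subtree
        self.left = self.right = None


def build(points):
    """Static build: the highest point stays here, the rest is split by median x."""
    if not points:
        return None
    top = max(points, key=lambda p: p[1])
    rest = sorted(p for p in points if p != top)
    node = PSTNode(top)
    if rest:
        node.split = rest[(len(rest) - 1) // 2][0]
        node.left = build([p for p in rest if p[0] <= node.split])
        node.right = build([p for p in rest if p[0] > node.split])
    return node


def query(node, q1, q2, q3, out):
    """Append every point with q1 <= x <= q2 and y >= q3 to out."""
    if node is None or node.point[1] < q3:
        return                     # heap order: nothing deeper can have y >= q3
    if q1 <= node.point[0] <= q2:
        out.append(node.point)     # report the point stored in this node
    if q1 <= node.split:           # left subtree may still intersect [q1, q2]
        query(node.left, q1, q2, q3, out)
    if q2 > node.split:            # right subtree may still intersect [q1, q2]
        query(node.right, q1, q2, q3, out)


pts = [(1, 5), (2, 9), (4, 3), (6, 7), (8, 8), (9, 1)]   # hypothetical sample data
root = build(pts)
found = []
query(root, 2, 8, 7, found)
print(sorted(found))
```

A visited node either reports its point, lies on a search path to $q_1$ or $q_2$, or is a child of such a node, which is the source of the $O(\log N + T)$ bound.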

  16. Externalizing Priority Search Tree
      Natural idea: block the tree
      Problem:
      - $O(\log_B N)$ I/Os to follow the paths to $q_1$ and $q_2$
      - But $O(T)$ I/Os may be used to visit other nodes ("overshooting")
      ⇒ $O(\log_B N + T)$ query

  17. Externalizing Priority Search Tree
      Solution idea:
      - Store $B$ points in each node:
        * $O(B^2)$ points stored in each supernode
        * $B$ output points can pay for overshooting
      - Bootstrapping:
        * Store the $O(B^2)$ points in each supernode in a static structure

  18. Base Tree
      Base tree: weight-balanced B-tree with branching parameter $B/4$ and leaf parameter $B$ on x-coordinates
      Points in heap order:
      - Root stores the $B$ top points for each of the $\Theta(B)$ child slabs
      - Remaining points stored recursively
      Points in each node stored in a $B^2$-structure
      - Persistent B-tree structure for the static problem
      ⇒ Linear space

  19. Answering Queries
      Query with $(q_1, q_2, q_3)$ starting at root $v$:
      - Query the $B^2$-structure of $v$ and report the points satisfying the query
      - Visit a child $w$ of $v$ if either
        * $w$ is on the path to $q_1$ or $q_2$, or
        * all points stored in $v$ for the slab of $w$ satisfy the query
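The visiting rule can be simulated in memory to see which nodes a query touches. This is a toy sketch under loud assumptions: the block size `B = 3` and `FANOUT = 2` are illustrative (the slides use $\Theta(B)$ children), a plain list scan stands in for the $B^2$-structure, the base tree is static rather than a weight-balanced B-tree, and the names `Node` and `query` are hypothetical.

```python
B = 3        # toy block size (assumption)
FANOUT = 2   # toy branching factor (assumption; the slides use Theta(B))


class Node:
    def __init__(self, pts):
        # pts is non-empty and sorted by x; cut it into at most FANOUT slabs.
        size = -(-len(pts) // FANOUT)              # ceiling division
        self.slabs = []                            # (x_lo, x_hi, top points, child)
        for i in range(0, len(pts), size):
            slab = pts[i:i + size]
            by_y = sorted(slab, key=lambda p: p[1], reverse=True)
            top, rest = by_y[:B], sorted(by_y[B:])  # B highest points stay in this node
            child = Node(rest) if rest else None    # the remainder is stored recursively
            self.slabs.append((slab[0][0], slab[-1][0], top, child))


def query(node, q1, q2, q3, out):
    """Report matching points into out and return the number of nodes visited."""
    visited = 1
    for lo, hi, top, child in node.slabs:
        # Stand-in for querying the B^2-structure of this node.
        hits = [p for p in top if q1 <= p[0] <= q2 and p[1] >= q3]
        out.extend(hits)
        on_path = lo <= q1 <= hi or lo <= q2 <= hi
        # Descend only along the search paths, or when all stored points of the
        # slab were reported (so the child may hold further matches).
        if child is not None and (on_path or len(hits) == len(top)):
            visited += query(child, q1, q2, q3, out)
    return visited


pts = sorted((i, (7 * i) % 20) for i in range(20))   # hypothetical points, distinct x and y
root = Node(pts)
out = []
visited = query(root, 3, 15, 10, out)
print(sorted(out), visited)
```

Whenever a child off the search paths is visited, the `B` points stored for its slab in the parent have all been reported, which is what pays for the extra visit.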

  22. Query Analysis
      Analysis:
      - $O(\log_B B^2 + T_v/B) = O(1 + T_v/B)$ I/Os used to visit node $v$
      - $O(\log_B N)$ nodes on the paths to $q_1$ and $q_2$
      - For each visited node $v$ not on the path to $q_1$ or $q_2$, $B$ points are reported in parent($v$)
      ⇒ $O(\log_B N + T/B)$
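The steps of the analysis can be written out as one short derivation. This is a sketch that assumes, as the slides suggest, that each reported point is stored in exactly one node (so the $T_v$ sum to $T$) and that the $B$ points charged for different off-path nodes are disjoint.

```latex
\begin{align*}
\text{cost of visiting } v
  &= O(\log_B B^2 + T_v/B) = O(2 + T_v/B) = O(1 + T_v/B) \\
\#\{\text{visited nodes on the paths to } q_1, q_2\}
  &= O(\log_B N) \\
\#\{\text{visited nodes off the paths}\}
  &\le T/B \quad \text{(each is charged to $B$ points reported in its parent)} \\
\text{total I/Os}
  &= O(\log_B N) + O(T/B) + \sum_{v} O(T_v/B)
   = O(\log_B N + T/B)
\end{align*}
```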
