external memory geometric data structures

External Memory Geometric Data Structures, Lars Arge, Duke University - PowerPoint PPT Presentation



  1. External Memory Geometric Data Structures
     Lars Arge, Duke University
     June 29, 2002
     Summer School on Massive Datasets

  2. External memory data structures
     So Far So Good
     • Yesterday we discussed "dimension 1.5" problems:
       – Interval stabbing and point location
     • We developed a number of useful tools/techniques
       – Logarithmic method
       – Weight-balanced B-trees
       – Global rebuilding
     • On Thursday we also discussed several tools/techniques
       – B-trees
       – Persistent B-trees
       – Construction using buffer technique

  3. Interval Management
     • Maintain $N$ intervals with unique endpoints dynamically such that a stabbing query with a point $x$ can be answered efficiently
     • Solved using external interval tree
     • We obtained the same bounds as for the 1d case
       – Space: $O(N/B)$
       – Query: $O(\log_B N + T/B)$
       – Updates: $O(\log_B N)$ I/Os
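To make the stabbing-query semantics and the size of these bounds concrete, here is a minimal Python sketch. The names `stabbing_query` and `io_estimate` are hypothetical and not from the slides: the first is a brute-force reference (not the external interval tree), and the second only evaluates $\log_B N + T/B$, ignoring the constants hidden by the $O$-notation.

```python
import math

def stabbing_query(intervals, x):
    """Brute-force reference: every interval (lo, hi) with lo <= x <= hi."""
    return [(lo, hi) for lo, hi in intervals if lo <= x <= hi]

def io_estimate(n, b, t):
    """Evaluate log_B N + T/B (an order of magnitude, constants ignored)."""
    return math.log(n, b) + t / b

intervals = [(1, 5), (2, 3), (6, 9)]
hits = stabbing_query(intervals, 3)          # intervals containing x = 3
estimate = io_estimate(10**6, 10**3, 2000)   # N = 10^6, B = 10^3, T = 2000
```

With these assumed values the estimate is roughly 2 + 2 = 4 I/Os, whereas scanning all intervals would read about N/B = 1000 blocks.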

  4. Interval Management
     • External interval tree:
       – Fan-out $\Theta(\sqrt{B})$ weight-balanced B-tree on endpoints
       – Intervals stored in $O(B)$ secondary structures in each internal node
       – Query efficiency using filtering
       – Bootstrapping used to avoid $O(B)$ search cost in each node
         * Size $O(B^2)$ underflow structure in each node
         * Constructed using sweep and persistent B-tree
         * Dynamic using global rebuilding
     [Figure: internal node $v$ with its secondary structures; label "$m$ blocks"]

  5. 3-Sided Range Searching
     • Interval management corresponds to a simple form of 2d range search
       [Figure: interval $[x_1, x_2]$ drawn as the point $(x_1, x_2)$; stabbing point $x$ drawn as $(x, x)$]
     • More general problem: Dynamic 3-sided range searching
       – Maintain set of points in plane such that given query $(q_1, q_2, q_3)$, all points $(x, y)$ with $q_1 \le x \le q_2$ and $y \ge q_3$ can be found efficiently
       [Figure: 3-sided query region bounded by $q_1$, $q_2$ and $q_3$]
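The correspondence between the two problems can be checked with a short Python sketch. It assumes the mapping suggested by the figure labels: interval $[x_1, x_2]$ becomes the point $(x_1, x_2)$, and a stabbing query at $x$ becomes the 3-sided query $(-\infty, x, x)$. The function names are hypothetical and both queries are brute force.

```python
import math

def three_sided(points, q1, q2, q3):
    """Brute-force 3-sided query: points with q1 <= x <= q2 and y >= q3."""
    return [(x, y) for x, y in points if q1 <= x <= q2 and y >= q3]

def stab_via_three_sided(intervals, x):
    """Answer a stabbing query by querying the points (x1, x2).

    [x1, x2] contains x  <=>  x1 <= x and x2 >= x,
    which is the 3-sided query (q1, q2, q3) = (-inf, x, x).
    """
    points = [(lo, hi) for lo, hi in intervals]
    return three_sided(points, -math.inf, x, x)

intervals = [(1, 5), (2, 3), (6, 9), (4, 7)]
result = stab_via_three_sided(intervals, 4)   # intervals containing 4
```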

  6. 3-Sided Range Searching: Static Solution
     • Construction: Sweep top-down inserting $x$ in persistent B-tree at $(x, y)$
       – $O(N/B)$ space
       – $O(\frac{N}{B} \log_B N)$ I/O construction using buffer technique
     • Query $(q_1, q_2, q_3)$: Perform range query with $[q_1, q_2]$ in B-tree at $q_3$
       – $O(\log_B N + T/B)$ I/Os
     • Dynamic using logarithmic method
       – Insert: $O(\log_B^2 N)$
       – Query: $O(\log_B^2 N + T/B)$
     • Improve to $O(\log_B N)$? Deletes?
     [Figure: query range $[q_1, q_2]$ at sweep position $q_3$]
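The logarithmic method mentioned above can be sketched in memory as follows. This is an illustration under stated assumptions, not the slides' construction: it uses the classic binary variant (level $i$ holds either nothing or a static structure on exactly $2^i$ points), and the hypothetical class `StaticThreeSided` is a sorted-list stand-in for the persistent B-tree.

```python
import bisect

class StaticThreeSided:
    """Stand-in static structure: built once on a point set, never updated."""
    def __init__(self, points):
        self.points = sorted(points)
        self.xs = [p[0] for p in self.points]

    def query(self, q1, q2, q3):
        lo = bisect.bisect_left(self.xs, q1)
        hi = bisect.bisect_right(self.xs, q2)
        return [p for p in self.points[lo:hi] if p[1] >= q3]

class LogarithmicMethod:
    """Insert-only dynamization: levels[i] is None or a structure on 2**i points."""
    def __init__(self):
        self.levels = []

    def insert(self, point):
        carry = [point]
        i = 0
        while True:
            if i == len(self.levels):
                self.levels.append(None)
            if self.levels[i] is None:
                # Rebuild one static structure from everything carried so far.
                self.levels[i] = StaticThreeSided(carry)
                return
            # Level is full: empty it and carry its points to the next level.
            carry.extend(self.levels[i].points)
            self.levels[i] = None
            i += 1

    def query(self, q1, q2, q3):
        out = []
        for level in self.levels:   # a query must visit every non-empty level
            if level is not None:
                out.extend(level.query(q1, q2, q3))
        return out
```

Inserts behave like incrementing a binary counter, and each query pays once per level, which is where the extra logarithmic factor in the bounds comes from. Deletions are not supported by this sketch, matching the open "Deletes?" question on the slide.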

  7. Internal Priority Search Tree
     [Figure: priority search tree on x-coordinates 1, 4, 5, 9, 13, 16, 19, 20. The root (key 9) holds (16,20); its children (keys 4 and 16) hold (5,6) and (19,9); the next level (keys 1, 5, 13, 19) holds (1,2), (9,4), (13,3), (20,3); leaf 4 holds (4,1)]
     • Base tree on $x$-coordinates with nodes augmented with points
     • Heap on $y$-coordinates
       – Decreasing $y$ values on root-leaf path
       – $(x, y)$ on path from root to leaf holding $x$
       – If $v$ holds point then $parent(v)$ holds point

  8. Internal Priority Search Tree
     [Figure: the example tree from slide 7 during Insert (10,21); (10,21) is shown at the root next to (16,20)]
     • Linear space
     • Insert of $(x, y)$ (assuming fixed $x$-coordinate set):
       – Compare $y$ with $y$-coordinate in root
       – Smaller: Recursively insert $(x, y)$ in subtree on path to $x$
       – Bigger: Insert in root and recursively insert old point in subtree
       ⇒ $O(\log N)$ update
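The insertion rule above can be sketched in Python as follows. The `Node` class and `insert` function are hypothetical names; the sketch assumes a balanced binary base tree built once over the fixed $x$-coordinate set, distinct $x$-coordinates, and that an empty node on the path simply receives the point.

```python
class Node:
    """Base-tree node over a sorted, fixed list of x-coordinates."""
    def __init__(self, xs):
        self.point = None                   # at most one (x, y) per node
        if len(xs) == 1:                    # leaf holding a single x
            self.key, self.left, self.right = xs[0], None, None
        else:
            mid = len(xs) // 2
            self.key = xs[mid - 1]          # largest x in the left subtree
            self.left, self.right = Node(xs[:mid]), Node(xs[mid:])

def insert(root, p):
    """Insert p = (x, y); x must belong to the fixed coordinate set."""
    v = root
    while True:
        if v.point is None:                 # free slot on the path: store here
            v.point = p
            return
        if p[1] > v.point[1]:               # bigger y: p stays, old point moves on
            v.point, p = p, v.point
        if v.left is None:                  # occupied leaf: same x seen twice
            raise ValueError("duplicate x-coordinate")
        v = v.left if p[0] <= v.key else v.right

root = Node([1, 4, 5, 9, 13, 16, 19, 20])
for pt in [(4, 1), (1, 2), (9, 4), (5, 6), (13, 3), (20, 3), (19, 9), (16, 20)]:
    insert(root, pt)
```

Each insertion follows a single root-to-leaf path, which gives the $O(\log N)$ update; with the points of the slide-7 example, the root ends up holding (16,20).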

  9. Internal Priority Search Tree
     [Figure: the example tree from slide 7 with a 3-sided query marked]
     • Query with $(q_1, q_2, q_3)$ starting at root $v$:
       – Report point in $v$ if satisfying query
       – Visit both children of $v$ if point reported
       – Always visit child(s) of $v$ on path(s) to $q_1$ and $q_2$
       ⇒ $O(\log N + T)$ query
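A query along these lines can be sketched in Python. To stay self-contained, the sketch builds a simplified static variant (each node stores the top point of its subtree plus a split key, with no separate leaf level), which is an assumption and not the exact tree drawn on the slides. The names `PSTNode` and `query` are hypothetical, and $x$-coordinates are assumed distinct.

```python
class PSTNode:
    """Static priority search tree node built from a non-empty point list."""
    def __init__(self, points):
        self.point = max(points, key=lambda p: p[1])      # highest y on top
        rest = sorted(p for p in points if p != self.point)
        mid = len(rest) // 2
        # Left subtree gets x < split, right subtree gets x >= split.
        self.split = rest[mid][0] if rest else self.point[0]
        self.left = PSTNode(rest[:mid]) if rest[:mid] else None
        self.right = PSTNode(rest[mid:]) if rest[mid:] else None

def query(v, q1, q2, q3, out=None):
    """Collect all points with q1 <= x <= q2 and y >= q3."""
    if out is None:
        out = []
    if v is None or v.point[1] < q3:
        return out            # heap order: nothing below can reach q3
    x, y = v.point
    if q1 <= x <= q2:
        out.append((x, y))
    if q1 < v.split:          # left x-range intersects [q1, q2]
        query(v.left, q1, q2, q3, out)
    if q2 >= v.split:         # right x-range intersects [q1, q2]
        query(v.right, q1, q2, q3, out)
    return out

pts = [(16, 20), (5, 6), (19, 9), (1, 2), (9, 4), (13, 3), (20, 3), (4, 1)]
tree = PSTNode(pts)
found = sorted(query(tree, 4, 19, 4))
```

A node off the two search paths is only entered when its parent's point was reported, so the visited nodes can be charged to the search paths and to the output, in line with the $O(\log N + T)$ bound.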

  10. Externalizing Priority Search Tree
     [Figure: the example tree from slide 7]
     • Natural idea: Block tree
     • Problem:
       – $O(\log_B N)$ I/Os to follow paths to $q_1$ and $q_2$
       – But $O(T)$ I/Os may be used to visit other nodes ("overshooting")
       ⇒ $O(\log_B N + T)$ query

  11. Externalizing Priority Search Tree
     [Figure: the example tree from slide 7]
     • Solution idea:
       – Store $B$ points in each node ⇒
         * $O(B^2)$ points stored in each supernode
         * $B$ output points can pay for "overshooting"
       – Bootstrapping:
         * Store $O(B^2)$ points in each supernode in static structure

  12. External Priority Search Tree
     • Base tree: Weight-balanced B-tree on $x$-coordinates ($a, k = B$)
     • Points in "heap order":
       – Root stores $B$ top points for each of the $\Theta(B)$ child slabs
       – Remaining points stored recursively
     • Points in each node stored in "$O(B^2)$-structure"
       – Persistent B-tree structure for static problem
       ⇒ Linear space
     [Figure: node with $\Theta(B)$ child slabs]

  13. External Priority Search Tree
     • Query with $(q_1, q_2, q_3)$ starting at root $v$:
       – Query $O(B^2)$-structure and report points satisfying query
       – Visit child $v$ if
         * $v$ on path to $q_1$ or $q_2$
         * All points corresponding to $v$ satisfy query

  14. External Priority Search Tree
     • Analysis:
       – $O(\log_B B^2 + T_v/B) = O(1 + T_v/B)$ I/Os used to visit node $v$
       – $O(\log_B N)$ nodes on path to $q_1$ or $q_2$
       – For each node $v$ not on path to $q_1$ or $q_2$ visited, $B$ points reported in $parent(v)$
       ⇒ $O(\log_B N + T/B)$ query
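The three observations above combine into the final bound as follows. This is a sketch of the summation, assuming $T_v$ denotes the number of points reported at node $v$ (so $\sum_v T_v = T$) and that each visited off-path node is charged to its own $B$ reported points in the parent, so at most $T/B$ such nodes exist.

```latex
\begin{align*}
\sum_{v \text{ visited}} O\!\left(1 + \frac{T_v}{B}\right)
  &= O\bigl(\#\{\text{path nodes}\}\bigr)
   + O\bigl(\#\{\text{off-path nodes}\}\bigr)
   + O\!\left(\sum_{v} \frac{T_v}{B}\right) \\
  &= O(\log_B N) + O\!\left(\frac{T}{B}\right) + O\!\left(\frac{T}{B}\right) \\
  &= O\!\left(\log_B N + \frac{T}{B}\right).
\end{align*}
```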

  15. External Priority Search Tree
     • Insert $(x, y)$ (assuming fixed $x$-coordinate set, static base tree):
       – Find relevant node $v$:
         * Query $O(B^2)$-structure to find $B$ points in root corresponding to node $u$ on path to $x$
         * If $y$ smaller than $y$-coordinates of all $B$ points then recursively search in $u$
       – Insert $(x, y)$ in $O(B^2)$-structure of $v$
       – If $O(B^2)$-structure contains $> B$ points for child $u$, remove lowest point and insert recursively in $u$
     • Delete: Similarly

  16. External Priority Search Tree
     • Analysis:
       – Query visits $O(\log_B N)$ nodes
       – $O(B^2)$-structure queried/updated in each node
         * One query
         * One insert and one delete
     • $O(B^2)$-structure analysis:
       – Query: $O(\log_B B^2 + B/B) = O(1)$
       – Update in $O(1)$ I/Os using update block and global rebuilding
       ⇒ $O(\log_B N)$ I/Os
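The "update block and global rebuilding" idea can be illustrated with a small in-memory Python sketch. Everything here is an assumption for illustration: the class name `BufferedStatic`, the sorted list standing in for the static structure, and the policy of rebuilding once the update block holds $B$ pending operations. Presumably the amortized $O(1)$ bound comes from spreading one $O(B)$-I/O rebuild over $B$ updates.

```python
class BufferedStatic:
    """Static point set plus an update block of at most B pending operations."""
    def __init__(self, points, block_size):
        self.B = block_size
        self.static = sorted(points)   # stand-in for the static structure
        self.updates = []              # the update block
        self.rebuilds = 0

    def insert(self, p):
        self._log(("ins", p))

    def delete(self, p):
        self._log(("del", p))

    def _log(self, op):
        self.updates.append(op)        # one block write in the external setting
        if len(self.updates) >= self.B:
            self._rebuild()            # global rebuilding once the block is full

    def _current(self):
        pts = set(self.static)
        for kind, p in self.updates:   # replay pending operations in order
            if kind == "ins":
                pts.add(p)
            else:
                pts.discard(p)
        return pts

    def _rebuild(self):
        self.static = sorted(self._current())
        self.updates = []
        self.rebuilds += 1

    def query(self, q1, q2, q3):
        return sorted(p for p in self._current()
                      if q1 <= p[0] <= q2 and p[1] >= q3)

s = BufferedStatic([(1, 5), (2, 1)], block_size=3)
s.insert((3, 4))
s.delete((1, 5))
before = s.query(0, 10, 2)     # answered from static part + update block
s.insert((4, 9))               # third pending update triggers a rebuild
after = s.query(0, 10, 2)
```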

  17. Removing Fixed $x$-coordinate Set Assumption
     • Deletion:
       – Delete point as previously
       – Delete $x$-coordinate from base tree using global rebuilding
       ⇒ $O(\log_B N)$ I/Os amortized
     • Insertion:
       – Insert $x$-coordinate in base tree and rebalance (using splits)
       – Insert point as previously
     • Split: Boundary in $v$ becomes boundary in $parent(v)$
     [Figure: node $v$ split into $v'$ and $v''$]

  18. Removing Fixed $x$-coordinate Set Assumption
     • Split: When $v$ splits $B$ new points needed in $parent(v)$
     • One point obtained from $v'$ ($v''$) using "bubble-up" operation:
       – Find top point $p$ in $v'$
       – Insert $p$ in $O(B^2)$-structure
       – Remove $p$ from $O(B^2)$-structure of $v'$
       – Recursively bubble-up point to $v$
     • Bubble-up in $O(\log_B w(v))$ I/Os
       – Follow one path from $v$ to leaf
       – Uses $O(1)$ I/O in each node
       ⇒ Split in $O(B \log_B w(v)) = O(w(v))$ I/Os
     [Figure: nodes $v'$ and $v''$ below $parent(v)$]
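The last equality can be unpacked as follows. The derivation assumes the usual weight-balanced B-tree property that a node $v$ at level $\ell$ (leaves at level 0) has weight $w(v) = \Theta(a^{\ell} k)$, here with $a = k = B$ and $B \ge 2$. The amortization step at the end is the standard weight-balanced argument and is not spelled out on the slide.

```latex
\begin{align*}
w(v) = \Theta\!\left(B^{\ell+1}\right)
  &\;\Longrightarrow\; \log_B w(v) = \ell + 1 + O(1), \\
B \cdot \log_B w(v) = O\bigl(B(\ell+1)\bigr)
  &= O\!\left(B^{\ell+1}\right) = O\bigl(w(v)\bigr)
  \qquad \text{since } \ell + 1 \le B^{\ell} \text{ for } B \ge 2 .
\end{align*}
% If v splits only once per \Omega(w(v)) insertions below it, an O(w(v)) split
% costs O(1) amortized per insertion and level, i.e. O(\log_B N) over all levels.
```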
