union find

Union-Find Part I, Lecture 21, November 6, 2014


  1. CS 573: Algorithms, Fall 2014

     Requirements from the data-structure
     1. Maintain a collection of sets.
     2. makeSet(x): creates a set that contains the single element x.
     3. find(x): returns the set that contains x.
     4. union(A, B): returns the set A ∪ B. That is, it merges the two sets A and B and returns the merged set.

     Amortized Analysis
     1. Use the data-structure as a black-box inside an algorithm. ... Union-Find in Kruskal's algorithm for computing an MST.
     2. Bounded worst case time per operation.
     3. Care: overall running time spent in the data-structure.
     4. Amortized running-time of an operation = average time to perform an operation on the data-structure.
     5. Amortized time per operation = (overall running time) / (number of operations).

  2. Reversed Trees: representing sets in the Union-Find DS
     [Figure: The Union-Find representation of the sets A = {a, b, c, d, e} and B = {f, g, h, i, j, k}. The set A is uniquely identified by a pointer to the root of A, which is the node containing a.]

     Reversed Trees
     !esrever ni retteb si gnihtyreve esuaceB
     1. Reversed Trees:
        1.1 Initially: every element is its own node.
        1.2 Node v: p(v) is the pointer to its parent.
        1.3 A set is uniquely identified by its root node/element.
     2. makeSet: create a singleton pointing to itself.
     3. find(x):
        3.1 Start from the node containing x and traverse up the tree until arriving at the root.
        3.2 find(x): x → b → a.
        3.3 a is returned as the set.

     Union operation in reversed trees
     Just hang them on each other.
     union(a, p): merge two sets.
     1. Hang the root of one tree on the root of the other.
     2. A destructive operation: the two original sets no longer exist.

     Pseudo-code of naive version...
     makeSet(x)
         p(x) ← x
     find(x)
         if x = p(x) then
             return x
         return find(p(x))
     union(x, y)
         A ← find(x)
         B ← find(y)
         p(B) ← A
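The naive pseudo-code above can be sketched in Python roughly as follows. This is a minimal sketch, not the lecture's code: the class name `NaiveUnionFind`, the dictionary `parent` (standing in for p(·)), and the helper `hops` are assumptions introduced for illustration, and find is written as a loop instead of recursion so that long chains do not hit Python's recursion limit.

```python
class NaiveUnionFind:
    """Reversed trees with no balancing (hypothetical helper class)."""

    def __init__(self):
        self.parent = {}  # parent[x] plays the role of p(x)

    def make_set(self, x):
        self.parent[x] = x  # a singleton points to itself

    def find(self, x):
        # Walk up the tree until reaching a node that is its own parent.
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def union(self, x, y):
        a = self.find(x)
        b = self.find(y)
        self.parent[b] = a  # hang the root of y's tree on the root of x's tree
        return a

    def hops(self, x):
        # Number of parent pointers that find(x) follows (illustration only).
        count = 0
        while self.parent[x] != x:
            x = self.parent[x]
            count += 1
        return count


uf = NaiveUnionFind()
for name in "abcdefgh":
    uf.make_set(name)
# Repeatedly hang the current tree under a fresh singleton: this builds a chain.
for x, y in ["gh", "fg", "ef", "de", "cd", "bc", "ab"]:
    uf.union(x, y)
print(uf.find("h"), uf.hops("h"))  # prints: a 7
```

With eight elements the deepest node is seven pointers away from the root, which illustrates why a single naive find can cost time linear in the number of elements.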

  3. Example... The long chain
     After: makeSet(a), makeSet(b), makeSet(c), makeSet(d), makeSet(e), makeSet(f), makeSet(g), makeSet(h),
     union(g, h), union(f, g), union(e, f), union(d, e), union(c, d), union(b, c), union(a, b).
     [Figure: the trees over a, ..., h built by this sequence.]

     Find is slow, hack it!
     1. find might require Ω(n) time.
     2. Q: How to improve performance?
     3. Two “hacks”:
        (i) Union by rank: maintain in the root of each tree a bound on its depth (rank).
            Rule: hang the smaller tree on the larger tree in union.
        (ii) Path compression: during find, make all pointers on the path point to the root.

     Path compression in action...
     [Figure: (a) The tree before performing find(z), and (b) the reversed tree after performing find(z) that uses path compression.]

     Pseudo-code of improved version...
     makeSet(x)
         p(x) ← x
         rank(x) ← 0
     find(x)
         if x ≠ p(x) then
             p(x) ← find(p(x))
         return p(x)
     union(x, y)
         A ← find(x)
         B ← find(y)
         if rank(A) > rank(B) then
             p(B) ← A
         else
             p(A) ← B
             if rank(A) = rank(B) then
                 rank(B) ← rank(B) + 1
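A minimal Python sketch of the improved version, following the pseudo-code above. The class name `UnionFind` and the dictionaries `parent` and `rank` are assumptions made for illustration. One deliberate deviation, also flagged in the code: the sketch returns early when both elements are already in the same set, a guard the slide pseudo-code does not show, so that a redundant union cannot raise a rank.

```python
class UnionFind:
    """Union by rank plus path compression (hypothetical helper class)."""

    def __init__(self):
        self.parent = {}  # parent[x] plays the role of p(x)
        self.rank = {}    # rank[x] is only meaningful while x is a root

    def make_set(self, x):
        self.parent[x] = x
        self.rank[x] = 0

    def find(self, x):
        # Path compression: every node on the path ends up pointing at the root.
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]

    def union(self, x, y):
        a = self.find(x)
        b = self.find(y)
        if a == b:
            return a  # assumption: guard added here, not part of the slide
        if self.rank[a] > self.rank[b]:
            self.parent[b] = a  # hang the lower-rank root under the higher one
            return a
        self.parent[a] = b
        if self.rank[a] == self.rank[b]:
            self.rank[b] += 1  # equal ranks: the surviving root grows by one
        return b


uf = UnionFind()
for name in "abcdefgh":
    uf.make_set(name)
for x, y in ["gh", "fg", "ef", "de", "cd", "bc", "ab"]:
    uf.union(x, y)
print(sorted(set(uf.parent.values())), uf.rank["h"])  # prints: ['h'] 1
```

On the long-chain sequence, union by rank alone already keeps every element one pointer away from the root h, because each later union hangs a rank-0 singleton under the rank-1 root.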

  4. Part II: Analyzing the Union-Find Data-Structure
     “When you’re not a leader, you’re little people.”

     Definition
     Let v be a node in a UnionFind data-structure D. v is a leader ⇐⇒ v is the root of a (reversed) tree in D.

     Lemma
     Once a node x stops being a leader, it can never become a leader again.
     Proof.
     1. x stopped being a leader because a union operation hung x on y.
     2. From this point on...
     3. x might change only its parent pointer (find).
     4. The parent pointer of x will never become equal to x again.
     5. x is never a leader again.

     Another Lemma
     Once a node stops being a leader, its rank is fixed.
     Proof.
     1. The rank of an element changes only by a union operation.
     2. A union operation changes the rank only for... the “new” leader of the new set.
     3. If an element is no longer a leader, then its rank is fixed.

  5. Ranks are strictly monotonically increasing
     Lemma
     Ranks are monotonically increasing in the reversed trees... along a path from a node to the root of the tree.
     Proof...
     1. Claim: ∀ u → v in DS: rank(u) < rank(v).
     2. Proof by induction. Base: all singletons. Holds.
     3. Assume the claim holds at time t, before an operation.
     4. If the operation is union(A, B), assume that we hung root(A) on root(B). It must be that rank(root(B)) is now larger than rank(root(A)) (verify!). The claim is true after the operation!
     5. If the operation is find: it traverses a path π, then all the nodes of π are made to point to the last node v of π. By induction, rank(v) > rank of all other nodes of π. For every node that gets compressed, the rank of its new parent is larger than its own rank.

     Trees grow exponentially in size with rank
     Lemma
     When a node gets rank k ⇒ there are at least 2^k elements in its subtree.
     Proof.
     1. The proof is by induction.
     2. For k = 0: obvious, since a singleton has rank zero and a single element in the set.
     3. A node u gets rank k only if the two merged roots u, v have rank k − 1.
     4. By induction, u and v each have ≥ 2^{k−1} nodes before the merge.
     5. The merged tree has ≥ 2^{k−1} + 2^{k−1} = 2^k nodes.

     Having higher rank is rare
     Lemma
     The number of nodes that get assigned rank k throughout the execution of the Union-Find DS is at most n / 2^k.
     Proof.
     1. By induction. For k = 0 it is obvious.
     2. When v becomes of rank k: charge to the roots merged, u and v.
     3. Before the union: u and v are of rank k − 1.
     4. After the merge: rank(v) = k and rank(u) = k − 1.
     5. u is no longer a leader. Its rank is now fixed.
     6. u, v leave rank k − 1 ⇒ v enters rank k.
     7. By induction: at most n / 2^{k−1} nodes of rank k − 1 are created.
        ⇒ # nodes of rank k ≤ (n / 2^{k−1}) / 2 = n / 2^k.
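The three lemmas above can be checked empirically on random inputs. The following is only a sanity-check sketch, not a proof and not part of the lecture: the function name `check_rank_lemmas`, its parameters, and the use of random unions are all assumptions. It re-implements union by rank with path compression on integer elements 0..n-1 and, after every union, tests that ranks strictly increase along parent pointers and that a root of rank k has at least 2^k elements. At the end it tests the counting lemma, using the fact that a node whose final rank is at least k was assigned rank k at some point.

```python
import random
from collections import Counter


def check_rank_lemmas(n=64, steps=200, seed=0):
    """Return True if no lemma is violated during a random run (hypothetical helper)."""
    rng = random.Random(seed)
    parent = list(range(n))
    rank = [0] * n

    def find(x):
        if parent[x] != x:
            parent[x] = find(parent[x])  # path compression
        return parent[x]

    def root(x):
        # Root lookup without compression, so that checking does not alter the trees.
        while parent[x] != x:
            x = parent[x]
        return x

    def union(x, y):
        a, b = find(x), find(y)
        if a == b:
            return  # assumption: ignore unions inside one set
        if rank[a] > rank[b]:
            a, b = b, a  # make a the root of smaller or equal rank
        parent[a] = b
        if rank[a] == rank[b]:
            rank[b] += 1

    for _ in range(steps):
        union(rng.randrange(n), rng.randrange(n))
        # Ranks strictly increase along every parent pointer.
        if any(parent[v] != v and rank[v] >= rank[parent[v]] for v in range(n)):
            return False
        # A root of rank k has at least 2^k elements in its tree.
        sizes = Counter(root(v) for v in range(n))
        if any(size < 2 ** rank[r] for r, size in sizes.items()):
            return False
    # At most n / 2^k nodes were ever assigned rank k.
    for k in range(max(rank) + 1):
        if sum(1 for v in range(n) if rank[v] >= k) > n / 2 ** k:
            return False
    return True


print(check_rank_lemmas())
```

A run that returns False would point to a bug in the implementation rather than in the lemmas, so a check like this is mostly useful as a regression test.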

  6. log* in detail
     1. log*(n): the number of times one has to take lg of a number to get a number smaller than two.
     2. log* 2 = 1.
     3. log* 2^2 = 2.
     4. log* 2^{2^2} = 1 + log*(2^2) = 2 + log* 2 = 3.
     5. log* 2^{2^{2^2}} = log*(65536) = 4.
     6. log* 2^{2^{2^{2^2}}} = log* 2^{65536} = 5.
     7. log* is a monotone increasing function.
     8. β = 2^{2^{2^{2^2}}} = 2^{65536}: a huge number.
     For practical purposes, log* returns a value ≤ 5.

     Find takes logarithmic time
     Lemma
     The time to perform a single find operation when we perform union by rank and path compression is O(log n).
     Proof.
     1. The rank of the leader v of a reversed tree T bounds the depth of T.
     2. By the previous lemma: max rank ≤ lg n.
     3. The depth of the tree is O(log n).
     4. The time to perform find is bounded by the depth of the tree.

     Can do much better!
     Theorem
     For a sequence of m operations over n elements, the overall running time of the UnionFind data-structure is O((n + m) log* n).
     1. Intuitively: the UnionFind data-structure takes constant time per operation... (unless n is larger than β, which is unlikely).
     2. Not quite correct if n is sufficiently large...

     The tower function...
     Definition
     Tower(b) = 2^{Tower(b − 1)} and Tower(0) = 1.
     Tower(i): a tower 2^{2^{···^2}} of height i. Observe that log*(Tower(i)) = i.

     Definition
     For i ≥ 1, let Block(i) = [Tower(i − 1) + 1, Tower(i)]; that is,
         Block(i) = [z, 2^{z−1}] for z = Tower(i − 1) + 1.
     Also Block(0) = [0, 1]. As such,
         Block(0) = [0, 1],
         Block(1) = [2, 2],
         Block(2) = [3, 4],
         Block(3) = [5, 16],
         Block(4) = [17, 65536],
         Block(5) = [65537, 2^{65536}],
         ...
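The definitions of log*, Tower and Block above translate almost directly into code. The sketch below is an illustration under assumptions: the function names `tower`, `log_star` and `block` are made up here, and `log_star` takes the floor of lg at each step (via `int.bit_length`), which keeps the computation in exact integers and gives the same count for positive integers.

```python
def tower(i):
    """Tower(0) = 1 and Tower(i) = 2 ** Tower(i - 1)."""
    t = 1
    for _ in range(i):
        t = 2 ** t
    return t


def log_star(n):
    """How many times lg must be applied to the positive integer n to get below 2."""
    count = 0
    while n >= 2:
        n = n.bit_length() - 1  # floor(lg n) for a positive integer
        count += 1
    return count


def block(i):
    """Block(i) as a pair (low, high) of inclusive endpoints."""
    if i == 0:
        return (0, 1)
    return (tower(i - 1) + 1, tower(i))


print([log_star(tower(i)) for i in range(6)])  # prints: [0, 1, 2, 3, 4, 5]
print([block(i) for i in range(5)])
# prints: [(0, 1), (2, 2), (3, 4), (5, 16), (17, 65536)]
```

Python integers are arbitrary precision, so `tower(5)` (which equals 2^65536) is still computable here; `tower(6)` is not representable in practice.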

  7. Running time of find...
     1. The RT of find(x) is proportional to the length of the path from x to the root of its tree.
     2. ...start from x and visit the sequence:
        x_1 = x, x_2 = p(x_1), x_3 = p(x_2), ..., x_i = p(x_{i−1}), ..., x_m = p(x_{m−1}) = root of the tree.
     3. rank(x_1) < rank(x_2) < rank(x_3) < ... < rank(x_m).
     4. The RT of find(x) is O(m).
     Definition
     A node x is in the ith block if rank(x) ∈ Block(i).
     5. Looking for ways to pay for the find operation.
     6. Since the other two operations take constant time...

     Blocks and jumping pointers
     1. The maximum rank of a node v is O(log n).
     2. The # of blocks is O(log* n), as O(log n) ∈ Block(c log* n) (c: a constant, say 2).
     3. find(x): π is the path used.
     4. Partition π into blocks by rank.
     5. Price of find: the length of π.
     6. For a node x: ν = index_B(x) is the index of the block containing rank(x).
     7. rank(x) ∈ Block(index_B(x)).
     8. index_B(x): the block of x.

     The path of find operation, and its pointers
     1. During a find operation...
     2. π: the path traversed.
     3. The ranks of the nodes visited in π are monotone increasing.
     4. Once the path leaves the ith block, it never goes back!
     5. Charge visits to nodes in π that are next to an element in a different block...
     6. ...to the total number of blocks, which is O(log* n).

     The pointers between blocks...
     [Figure: a find path passing through Block(0), Block(1), Block(1...4), Block(5), Block(6...7), Block(8), Block(9) and Block(10), with pointers labeled “internal jump” and “between jump”.]
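To make the block bookkeeping concrete, the sketch below computes index_B for a rank and splits the pointers of a find path into those that stay inside a block and those that cross into a later block. It is an illustration under assumptions: the names `block_index` and `count_jumps` are invented here, the path is given simply as its list of ranks, and reading the figure labels "internal jump" and "between jump" as same-block and different-block pointers is an interpretation.

```python
def block_index(rank):
    """Smallest i with rank <= Tower(i), i.e. the index of the block containing rank."""
    i, t = 0, 1  # invariant: t == Tower(i)
    while rank > t:
        t = 2 ** t
        i += 1
    return i


def count_jumps(ranks):
    """Count (internal, between) pointers along a path given by strictly increasing ranks."""
    internal = between = 0
    for lower, upper in zip(ranks, ranks[1:]):
        if block_index(lower) == block_index(upper):
            internal += 1  # both endpoints lie in the same block
        else:
            between += 1   # the pointer leaves its block and never returns
    return internal, between


path_ranks = [0, 1, 2, 3, 4, 5, 9, 17]
print([block_index(r) for r in path_ranks])  # prints: [0, 0, 1, 2, 2, 3, 3, 4]
print(count_jumps(path_ranks))               # prints: (3, 4)
```

Because ranks only increase along the path, the number of block-crossing pointers is at most the number of blocks minus one, which is where the O(log* n) charge per find comes from; the remaining pointers have to be paid for separately.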
