disjoint set data structure

Disjoint-set data structure (Union-Find), CS 5633 -- Spring 2008



  1. Union-Find Data Structures
     CS 5633 -- Spring 2008
     Carola Wenk
     Slides courtesy of Charles Leiserson with small changes by Carola Wenk
     3/25/08 CS 5633 Analysis of Algorithms

     Disjoint-set data structure (Union-Find)
     Problem:
     • Maintain a dynamic collection of pairwise-disjoint sets S = {S_1, S_2, …, S_r}.
     • Each set S_i has one element distinguished as the representative element, rep[S_i].
     • Must support 3 operations:
       • MAKE-SET(x): adds new set {x} to S with rep[{x}] = x (for any x ∉ S_i for all i)
       • UNION(x, y): replaces sets S_x, S_y with S_x ∪ S_y in S (for any x, y in distinct sets S_x, S_y)
       • FIND-SET(x): returns representative rep[S_x] of set S_x containing element x

     Disjoint-set data structure (Union-Find) II
     • In all operations pointers to the elements x, y in the data structure are given.
     • Hence, we do not need to first search for the element in the data structure.
     • Let n denote the overall number of elements (equivalently, the number of MAKE-SET operations).

     Union-Find Example
     The representative of each set is underlined in the original slides.
     Initially:         S = {}
     MAKE-SET(2):       S = {{2}}
     MAKE-SET(3):       S = {{2}, {3}}
     MAKE-SET(4):       S = {{2}, {3}, {4}}
     FIND-SET(4) = 4
     UNION(2, 4):       S = {{2, 4}, {3}}
     FIND-SET(4) = 2
     MAKE-SET(5):       S = {{2, 4}, {3}, {5}}
     UNION(4, 5):       S = {{2, 4, 5}, {3}}
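The three operations and the example trace can be replayed with a deliberately naive reference model. This is only a sketch under stated assumptions: the class name `NaiveDisjointSets`, the dictionary representation, and the `sets` helper are hypothetical, and UNION is assumed to keep the representative of x's set, which matches FIND-SET(4) = 2 after UNION(2, 4) in the example.

```python
class NaiveDisjointSets:
    """Reference model: the representative of every element is stored directly."""

    def __init__(self):
        self.rep = {}  # element -> representative of the set containing it

    def make_set(self, x):
        if x in self.rep:
            raise ValueError(f"{x!r} is already in some set")
        self.rep[x] = x  # rep[{x}] = x

    def find_set(self, x):
        return self.rep[x]

    def union(self, x, y):
        rx, ry = self.rep[x], self.rep[y]
        if rx == ry:
            return
        # Assumption: the merged set keeps the representative of x's set.
        for e in list(self.rep):
            if self.rep[e] == ry:
                self.rep[e] = rx

    def sets(self):
        """Hypothetical helper: the current collection S as sorted lists."""
        groups = {}
        for e, r in self.rep.items():
            groups.setdefault(r, []).append(e)
        return sorted(sorted(g) for g in groups.values())


# Replay of the example trace.
ds = NaiveDisjointSets()
for x in (2, 3, 4):
    ds.make_set(x)
rep_before = ds.find_set(4)   # 4 is alone in its set
ds.union(2, 4)
rep_after = ds.find_set(4)    # 4 now shares a set with 2
ds.make_set(5)
ds.union(4, 5)
final_sets = ds.sets()
```

UNION costs Θ(n) per call in this model because it scans every element; the structures in the remaining slides reduce that cost.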

  2. Simple linked-list solution
     Store each set S_i = {x_1, x_2, …, x_k} as an (unordered) doubly linked list. Define representative element rep[S_i] to be the front of the list, x_1.
     [Figure: list S_i: x_1, x_2, …, x_k, with rep[S_i] = x_1 at the front]
     • MAKE-SET(x) initializes x as a lone node. Cost: Θ(1)
     • FIND-SET(x) walks left in the list containing x until it reaches the front of the list. Cost: Θ(n)
     • UNION(x, y) calls FIND-SET on y, finds the last element of list x, and concatenates both lists, leaving rep. as FIND-SET(x). Cost: Θ(n)
     [Slide annotation: How?]

     Simple balanced-tree solution
     Store each set S_i = {x_1, x_2, …, x_k} as a balanced tree (ignoring keys). Define representative element rep[S_i] to be the root of the tree.
     [Slide annotation: maintain how?]
     [Figure: tree for S_i = {x_1, x_2, x_3, x_4, x_5}, rooted at rep[S_i] = x_1]
     • MAKE-SET(x) initializes x as a lone node. Cost: Θ(1)
     • FIND-SET(x) walks up the tree containing x until reaching root. Cost: Θ(log n)
     • UNION(x, y) calls FIND-SET on y, finds a leaf of x and concatenates both trees, changing rep. of y. Cost: Θ(log n)

     Plan of attack
     • We will build a simple disjoint-union data structure that, in an amortized sense, performs significantly better than Θ(log n) per op., even better than Θ(log log n), Θ(log log log n), ..., but not quite Θ(1).
     • To reach this goal, we will introduce two key tricks. Each trick converts a trivial Θ(n) solution into a simple Θ(log n) amortized solution. Together, the two tricks yield a much better solution.
     • First trick arises in an augmented linked list. Second trick arises in a tree structure.

     Augmented linked-list solution
     Store S_i = {x_1, x_2, …, x_k} as unordered doubly linked list.
     Augmentation: Each element x_j also stores pointer rep[x_j] to rep[S_i] (which is the front of the list, x_1).
     [Figure: list S_i: x_1, x_2, …, x_k, each element with a rep pointer to x_1 = rep[S_i]]
     • FIND-SET(x) returns rep[x]. Cost: Θ(1)
     • UNION(x, y) concatenates lists containing x and y and updates the rep pointers for all elements in the list containing y. Cost: Θ(n)
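The augmented linked-list solution might be sketched as follows. The names `Node`, `make_set`, `find_set`, `union`, and `adversarial_cost` are hypothetical. Two simplifying assumptions are made: the list is singly linked, and each representative also stores a `tail` pointer so that the end of x's list is found in constant time. The slides specify a doubly linked list and do not say how the last element is located.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None
        self.rep = self   # pointer to the front of the list, i.e. rep[S_i]
        self.tail = self  # assumption: maintained only on the representative


def make_set(value):
    """MAKE-SET: a lone node that is its own representative. Theta(1)."""
    return Node(value)


def find_set(node):
    """FIND-SET: follow the stored rep pointer. Theta(1)."""
    return node.rep


def union(x, y):
    """UNION: append y's list to x's list and rewrite rep pointers of y's list.

    Returns the number of rep pointers rewritten (the cost of the call).
    """
    rx, ry = find_set(x), find_set(y)
    if rx is ry:
        return 0
    rx.tail.next = ry      # concatenate: x's list followed by y's list
    rx.tail = ry.tail
    rewritten = 0
    node = ry
    while node is not None:
        node.rep = rx      # every element of y's old list now points to rx
        rewritten += 1
        node = node.next
    return rewritten


def adversarial_cost(n):
    """Always append the growing list onto a fresh singleton: 1 + 2 + ... + (n-1)."""
    nodes = [make_set(i) for i in range(n)]
    return sum(union(nodes[i], nodes[0]) for i in range(1, n))
```

With this fixed concatenation order a bad sequence of n - 1 UNIONs rewrites 1 + 2 + … + (n - 1) pointers in total, which is Θ(n²).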

  3. Example of augmented linked-list solution
     Each element x_j stores pointer rep[x_j] to rep[S_i].
     UNION(x, y)
     • concatenates the lists containing x and y, and
     • updates the rep pointers for all elements in the list containing y.
     [Figure, three frames: S_x is the list x_1, x_2 with rep[S_x] = x_1; S_y is the list y_1, y_2, y_3 with rep[S_y] = y_1; the result S_x ∪ S_y is the list x_1, x_2, y_1, y_2, y_3 with rep[S_x ∪ S_y] = x_1]

     Alternative concatenation
     UNION(x, y) could instead
     • concatenate the lists containing y and x, and
     • update the rep pointers for all elements in the list containing x.
     [Figure, first frame: the lists S_x: x_1, x_2 and S_y: y_1, y_2, y_3 before the union]

  4. Alternative concatenation (continued)
     UNION(x, y) could instead
     • concatenate the lists containing y and x, and
     • update the rep pointers for all elements in the list containing x.
     [Figure, two further frames: the lists of S_x and S_y are joined into S_x ∪ S_y, and the rep pointers of x_1, x_2 are redirected to rep[S_x ∪ S_y]]

     Trick 1: Smaller into larger (weighted-union heuristic)
     To save work, concatenate smaller list onto the end of the larger list. Cost = Θ(length of smaller list).
     Augment list to store its weight (# elements).
     • Let n denote the overall number of elements (equivalently, the number of MAKE-SET operations).
     • Let m denote the total number of operations.
     • Let f denote the number of FIND-SET operations.
     Theorem: Cost of all UNIONs is O(n log n).
     Corollary: Total cost is O(m + n log n).

     Analysis of Trick 1 (weighted-union heuristic)
     Theorem: Total cost of UNIONs is O(n log n).
     Proof.
     • Monitor an element x and set S_x containing it.
     • After initial MAKE-SET(x), weight[S_x] = 1.
     • Each time S_x is united with S_y:
       • if weight[S_y] ≥ weight[S_x]:
         • pay 1 to update rep[x], and
         • weight[S_x] at least doubles (increases by weight[S_y]).
       • if weight[S_y] < weight[S_x]:
         • pay nothing, and
         • weight[S_x] only increases.
     Thus pay ≤ log n for x.
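Trick 1 can be illustrated with a small cost-counting sketch. The names `WNode`, `weighted_union`, and `total_cost_pairwise` are hypothetical, and, as a simplifying assumption not taken from the slides, the list is singly linked with `tail` and `weight` fields kept on each representative.

```python
class WNode:
    def __init__(self, value):
        self.value = value
        self.next = None
        self.rep = self
        self.tail = self   # assumption: valid on representatives only
        self.weight = 1    # number of elements; valid on representatives only


def weighted_union(x, y):
    """Append the smaller list onto the larger one.

    Returns the number of rep pointers rewritten, i.e. the length of the
    smaller list.
    """
    big, small = x.rep, y.rep
    if big is small:
        return 0
    if big.weight < small.weight:
        big, small = small, big
    big.tail.next = small
    big.tail = small.tail
    big.weight += small.weight
    cost = 0
    node = small
    while node is not None:
        node.rep = big     # only elements of the smaller list pay
        cost += 1
        node = node.next
    return cost


def total_cost_pairwise(n):
    """Merge n singletons in rounds of equal-sized pairs (n a power of two)."""
    nodes = [WNode(i) for i in range(n)]
    total, step = 0, 1
    while step < n:
        for i in range(0, n, 2 * step):
            total += weighted_union(nodes[i], nodes[i + step])
        step *= 2
    return total
```

Merging equal-sized lists is the expensive case for this heuristic: every round rewrites n/2 pointers and there are log n rounds, giving (n/2) log n in total, which is consistent with the O(n log n) theorem. Adding singletons to one growing list costs only 1 per UNION.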

  5. Disjoint set forest: Representing sets as trees
     Store each set S_i = {x_1, x_2, …, x_k} as an unordered, potentially unbalanced, not necessarily binary tree, storing only parent pointers. rep[S_i] is the tree root.
     [Figure: tree for S_i = {x_1, x_2, x_3, x_4, x_5, x_6}, rooted at rep[S_i] = x_1]
     • MAKE-SET(x) initializes x as a lone node. Cost: Θ(1)
     • FIND-SET(x) walks up the tree containing x until it reaches the root. Cost: Θ(depth[x])
     • UNION(x, y) calls FIND-SET twice and concatenates the trees containing x and y … Cost: Θ(depth[x] + depth[y])

     Trick 1 adapted to trees
     • UNION(x, y) can use a simple concatenation strategy: Make root FIND-SET(y) a child of root FIND-SET(x). ⇒ FIND-SET(y) = FIND-SET(x).
     • Adapt Trick 1 to this context:
       Union-by-weight: Merge tree with smaller weight into tree with larger weight.
     • Variant of Trick 1 (see book):
       Union-by-rank: rank of a tree = its height
     [Figure: two trees, one with nodes x_1, …, x_6 and one with nodes y_1, …, y_5]

     Trick 1 adapted to trees (union-by-weight)
     • Height of a tree is at most logarithmic in its weight, height(T) ≤ log weight(T) (logarithms base 2), because:
       • Induction on n.
       • Base case: a lone node has height 0 = log 1.
       • Height of a tree T is determined by the two subtrees T_1, T_2 that T has been united from. Assume weight(T_1) ≤ weight(T_2), so the root of T_1 becomes a child of the root of T_2 and weight(T) = weight(T_1) + weight(T_2).
       • Inductively the heights of T_1, T_2 are at most the logs of their weights.
       • height(T) = max(height(T_2), height(T_1) + 1).
       • If the maximum is height(T_2):
         height(T) = height(T_2) ≤ log weight(T_2) < log weight(T)
       • If the maximum is height(T_1) + 1:
         height(T) = height(T_1) + 1 ≤ log weight(T_1) + 1 = log(2 * weight(T_1)) ≤ log weight(T)
     • Thus the total cost of any m operations is O(m log n).

     Trick 2: Path compression
     When we execute a FIND-SET operation and walk up a path p to the root, we know the representative for all the nodes on path p.
     Path compression makes all of those nodes direct children of the root.
     Cost of FIND-SET(x) is still Θ(depth[x]).
     [Figure: FIND-SET(y_2) on a tree with nodes x_1, …, x_6 and y_1, …, y_5]
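Both tricks can be combined in a short disjoint-set forest sketch. The class name `DisjointSetForest` and the dictionary-based parent pointers are assumptions made for illustration (the slides store parent pointers in the nodes), and a root is marked here by pointing to itself.

```python
class DisjointSetForest:
    def __init__(self):
        self.parent = {}  # element -> parent; a root is its own parent
        self.weight = {}  # root -> number of elements in its tree

    def make_set(self, x):
        self.parent[x] = x
        self.weight[x] = 1

    def find_set(self, x):
        # First pass: walk up the path p to the root.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass (Trick 2): make every node on p a direct child of the root.
        while x != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, x, y):
        rx, ry = self.find_set(x), self.find_set(y)
        if rx == ry:
            return rx
        # Trick 1 (union-by-weight): hang the lighter tree below the heavier root.
        if self.weight[rx] < self.weight[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        self.weight[rx] += self.weight[ry]
        return rx


# Eight elements merged pairwise give a tree of height 3 = log 8.
forest = DisjointSetForest()
for e in range(1, 9):
    forest.make_set(e)
for a, b in [(1, 2), (3, 4), (5, 6), (7, 8), (1, 3), (5, 7), (1, 5)]:
    forest.union(a, b)
parent_of_8_before = forest.parent[8]  # 8 still hangs below 7
root_of_8 = forest.find_set(8)         # walks 8 -> 7 -> 5 -> 1 and compresses
```

After the final FIND-SET, elements 8, 7, and 5 are all direct children of the root, so a repeated FIND-SET on any of them follows a single pointer.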
