union find data structures

Union-Find Data Structures. Carola Wenk. Slides courtesy of Charles Leiserson.

CMPS 2200, Fall 2012. Union-Find Data Structures. Carola Wenk. Slides courtesy of Charles Leiserson with small changes by Carola Wenk. 10/29/12, CMPS 2200 Intro. to Algorithms.


  1. CMPS 2200, Fall 2012. Union-Find Data Structures. Carola Wenk. Slides courtesy of Charles Leiserson with small changes by Carola Wenk. 10/29/12, CMPS 2200 Intro. to Algorithms.

  2. Disjoint-set data structure (Union-Find)
     Problem:
     • Maintain a dynamic collection of pairwise-disjoint sets S = {S_1, S_2, …, S_r}.
     • Each set S_i has one element distinguished as the representative element, rep[S_i].
     • Must support 3 operations:
       • MAKE-SET(x): adds new set {x} to S with rep[{x}] = x (for any x ∉ S_i for all i)
       • UNION(x, y): replaces sets S_x, S_y with S_x ∪ S_y in S (for any x, y in distinct sets S_x, S_y)
       • FIND-SET(x): returns representative rep[S_x] of set S_x containing element x

  3. Union-Find Example
     (On the slide, the representative of each set is underlined.)
     Initially:      S = {}
     MAKE-SET(2):    S = {{2}}
     MAKE-SET(3):    S = {{2}, {3}}
     MAKE-SET(4):    S = {{2}, {3}, {4}}
     FIND-SET(4) = 4
     UNION(2, 4):    S = {{2, 4}, {3}}
     FIND-SET(4) = 2
     MAKE-SET(5):    S = {{2, 4}, {3}, {5}}
     UNION(4, 5):    S = {{2, 4, 5}, {3}}
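The trace above can be replayed with a deliberately naive reference implementation. This is only a sketch: the class name `NaiveDisjointSets` and the dictionary-based representation are assumptions made for illustration and do not come from the slides. Its UNION relabels every element of one set, so it costs Θ(n) per call.

```python
class NaiveDisjointSets:
    """Hypothetical baseline: a dict maps each element to its representative."""

    def __init__(self):
        self.rep = {}  # element -> representative of the set containing it

    def make_set(self, x):
        if x in self.rep:
            raise ValueError(f"{x!r} already belongs to a set")
        self.rep[x] = x  # a new singleton set is represented by x itself

    def find_set(self, x):
        return self.rep[x]

    def union(self, x, y):
        rx, ry = self.rep[x], self.rep[y]
        if rx == ry:
            return
        # Relabel every element of y's set; this scan is Theta(n) per union.
        for z, r in self.rep.items():
            if r == ry:
                self.rep[z] = rx


def replay_slide_example():
    """Replay the operation sequence from the example slide."""
    s = NaiveDisjointSets()
    s.make_set(2)
    s.make_set(3)
    s.make_set(4)
    before = s.find_set(4)   # 4: element 4 is still alone
    s.union(2, 4)
    after = s.find_set(4)    # 2: the representative of {2, 4}
    s.make_set(5)
    s.union(4, 5)
    return before, after, s.find_set(5), s.find_set(3)
```

Calling `replay_slide_example()` returns `(4, 2, 2, 3)`, matching the FIND-SET results shown in the trace.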

  4. Plan of attack
     • We will build a simple disjoint-set data structure that, in an amortized sense, performs significantly better than Θ(log n) per op., even better than Θ(log log n), Θ(log log log n), ..., but not quite Θ(1).
     • To reach this goal, we will introduce two key tricks. Each trick converts a trivial Θ(n) solution into a simple Θ(log n) amortized solution. Together, the two tricks yield a much better solution.
     • First trick arises in an augmented linked list. Second trick arises in a tree structure.

  5. Augmented linked-list solution
     Store S_i = {x_1, x_2, …, x_k} as an unordered doubly linked list.
     Augmentation: Each element x_j also stores a pointer rep[x_j] to rep[S_i] (which is the front of the list, x_1).
     Assume a pointer to x is given.
     [Figure: list S_i: x_1, x_2, …, x_k, with each element's rep pointer pointing to rep[S_i] = x_1.]
     • FIND-SET(x) returns rep[x]. Cost: Θ(1).
     • UNION(x, y) concatenates the lists containing x and y and updates the rep pointers for all elements in the list containing y. Cost: Θ(n).
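A minimal sketch of the augmented list might look as follows. It simplifies the slide's structure: the list is singly linked, and the front node keeps a `tail` pointer so that concatenation itself takes constant time. The names `Node`, `tail`, and `members` are assumptions introduced for illustration.

```python
class Node:
    """One list element; `rep` always points to the front of its list."""

    def __init__(self, value):
        self.value = value
        self.next = None
        self.rep = self    # a new element is the front of its own list
        self.tail = self   # only meaningful on the front node (assumption)


def make_set(value):
    return Node(value)


def find_set(node):
    return node.rep  # Theta(1): follow a single pointer


def union(x, y):
    """Append y's list to x's list and relabel y's elements: Theta(len of y's list)."""
    head_x, head_y = x.rep, y.rep
    if head_x is head_y:
        return head_x
    head_x.tail.next = head_y      # concatenate the two lists
    head_x.tail = head_y.tail
    node = head_y
    while node is not None:        # update rep pointers in y's old list
        node.rep = head_x
        node = node.next
    return head_x


def members(node):
    """Values of the list containing `node`, front to back."""
    out, cur = [], node.rep
    while cur is not None:
        out.append(cur.value)
        cur = cur.next
    return out
```

With lists (x1, x2) and (y1, y2, y3), `union(x2, y3)` yields the list x1, x2, y1, y2, y3, and every element's `rep` then points to x1.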

  6. Example of augmented linked-list solution
     Each element x_j stores pointer rep[x_j] to rep[S_i].
     UNION(x, y)
     • concatenates the lists containing x and y, and
     • updates the rep pointers for all elements in the list containing y.
     [Figure: before the union. List S_x: x_1, x_2 with rep pointers to rep[S_x]; list S_y: y_1, y_2, y_3 with rep pointers to rep[S_y].]

  7. Example of augmented linked-list solution
     Each element x_j stores pointer rep[x_j] to rep[S_i].
     UNION(x, y)
     • concatenates the lists containing x and y, and
     • updates the rep pointers for all elements in the list containing y.
     [Figure: after concatenation. S_x ∪ S_y: x_1, x_2, y_1, y_2, y_3; the elements y_1, y_2, y_3 still point to rep[S_y].]

  8. Example of augmented linked-list solution
     Each element x_j stores pointer rep[x_j] to rep[S_i].
     UNION(x, y)
     • concatenates the lists containing x and y, and
     • updates the rep pointers for all elements in the list containing y.
     [Figure: after the rep updates. S_x ∪ S_y: x_1, x_2, y_1, y_2, y_3; all rep pointers point to rep[S_x ∪ S_y].]

  9. Alternative concatenation
     UNION(x, y) could instead
     • concatenate the lists containing y and x, and
     • update the rep pointers for all elements in the list containing x.
     [Figure: before the union. List S_x: x_1, x_2 with rep pointers to rep[S_x]; list S_y: y_1, y_2, y_3 with rep pointers to rep[S_y].]

  10. Alternative concatenation
      UNION(x, y) could instead
      • concatenate the lists containing y and x, and
      • update the rep pointers for all elements in the list containing x.
      [Figure: after concatenation. S_x ∪ S_y: y_1, y_2, y_3, x_1, x_2; the elements x_1, x_2 still point to rep[S_x].]

  11. Alternative concatenation
      UNION(x, y) could instead
      • concatenate the lists containing y and x, and
      • update the rep pointers for all elements in the list containing x.
      [Figure: after the rep updates. S_x ∪ S_y: y_1, y_2, y_3, x_1, x_2; all rep pointers point to rep[S_x ∪ S_y].]

  12. Trick 1: Smaller into larger (weighted-union heuristic)
      To save work, concatenate the smaller list onto the end of the larger list. Cost = Θ(length of smaller list).
      Augment list to store its weight (# elements).
      • Let n denote the overall number of elements (equivalently, the number of MAKE-SET operations).
      • Let m denote the total number of operations.
      • Let f denote the number of FIND-SET operations.
      Theorem: Cost of all UNIONs is O(n log n).
      Corollary: Total cost is O(m + n log n).
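The corollary follows from the theorem by adding up the three kinds of operations. The accounting below is a sketch that assumes MAKE-SET and FIND-SET each cost Θ(1) in the linked-list representation, and that the constant overhead of each UNION is absorbed into the O(n log n) term, since there are fewer than n successful UNIONs.

```latex
\begin{align*}
\text{total cost}
  &= \underbrace{n \cdot \Theta(1)}_{\text{MAKE-SET}}
   + \underbrace{f \cdot \Theta(1)}_{\text{FIND-SET}}
   + \underbrace{O(n \log n)}_{\text{all UNIONs}} \\
  &= O(n + f + n \log n) \\
  &= O(m + n \log n), \qquad \text{since } n + f \le m .
\end{align*}
```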

  13. Analysis of Trick 1 (weighted-union heuristic)
      Theorem: Total cost of UNIONs is O(n log n).
      Proof.
      • Monitor an element x and the set S_x containing it.
      • After the initial MAKE-SET(x), weight[S_x] = 1.
      • Each time S_x is united with S_y:
        • if weight[S_y] ≥ weight[S_x]:
          – pay 1 to update rep[x], and
          – weight[S_x] at least doubles (increases by weight[S_y]).
        • if weight[S_y] < weight[S_x]:
          – pay nothing, and
          – weight[S_x] only increases.
      Since weight[S_x] can never exceed n, it can double at most log n times. Thus pay ≤ log n for x.
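The bound can be checked empirically by counting rep-pointer updates. The sketch below uses Python lists in place of linked lists, and the names `WeightedListSets`, `rep_updates`, and `count_updates` are assumptions made for illustration. Only the smaller set is relabeled on each UNION.

```python
import math
import random


class WeightedListSets:
    """Smaller-into-larger union; counts how many rep pointers get rewritten."""

    def __init__(self):
        self.rep = {}        # element -> representative
        self.members = {}    # representative -> elements of its set
        self.rep_updates = 0

    def make_set(self, x):
        self.rep[x] = x
        self.members[x] = [x]

    def find_set(self, x):
        return self.rep[x]

    def union(self, x, y):
        rx, ry = self.rep[x], self.rep[y]
        if rx == ry:
            return
        # Make rx the representative of the larger (or equal) set.
        if len(self.members[rx]) < len(self.members[ry]):
            rx, ry = ry, rx
        for z in self.members[ry]:   # pay 1 per element of the smaller set
            self.rep[z] = rx
            self.rep_updates += 1
        self.members[rx].extend(self.members.pop(ry))


def count_updates(n, seed=0):
    """Run 4n random unions on n singletons and return the total update count."""
    rng = random.Random(seed)
    s = WeightedListSets()
    for i in range(n):
        s.make_set(i)
    for _ in range(4 * n):
        s.union(rng.randrange(n), rng.randrange(n))
    return s.rep_updates
```

For any n and any union sequence, `count_updates(n)` stays at or below `n * math.log2(n)`, because each element is relabeled only when its set at least doubles in size.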

  14. Disjoint-set forest: Representing sets as trees
      Store each set S_i = {x_1, x_2, …, x_k} as an unordered, potentially unbalanced, not necessarily binary tree, storing only parent pointers. rep[S_i] is the tree root.
      • MAKE-SET(x) initializes x as a lone node. Cost: Θ(1).
      • FIND-SET(x) walks up the tree containing x until it reaches the root. Cost: Θ(depth[x]).
      • UNION(x, y) calls FIND-SET twice and concatenates the trees containing x and y… Cost: Θ(depth[x]).
      [Figure: example tree for S_i = {x_1, x_2, x_3, x_4, x_5, x_6} with root rep[S_i] = x_1; x_4 and x_3 appear below the root, and x_2, x_5, x_6 at the bottom level.]
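A parent-pointer forest without any heuristic can be sketched as below. The class name `Forest`, the convention that a root is its own parent, and the helper `depth` are assumptions for illustration. The example also shows why a heuristic is needed: an unlucky sequence of UNIONs builds a chain whose depth is linear in n.

```python
class Forest:
    """Disjoint-set forest storing only parent pointers (no balancing)."""

    def __init__(self):
        self.parent = {}

    def make_set(self, x):
        self.parent[x] = x  # convention (assumption): a root is its own parent

    def find_set(self, x):
        while self.parent[x] != x:  # walk up: Theta(depth[x])
            x = self.parent[x]
        return x

    def union(self, x, y):
        rx, ry = self.find_set(x), self.find_set(y)
        if rx != ry:
            self.parent[ry] = rx  # root of y's tree becomes a child of x's root

    def depth(self, x):
        d = 0
        while self.parent[x] != x:
            x = self.parent[x]
            d += 1
        return d


def build_chain(n):
    """Unite singletons so that element 0 ends up at depth n - 1."""
    f = Forest()
    for i in range(n):
        f.make_set(i)
    for i in range(1, n):
        f.union(i, i - 1)  # the old root i - 1 becomes a child of i
    return f
```

After `build_chain(8)`, element 0 sits at depth 7, so a single FIND-SET on it walks the whole chain.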

  15. Trick 1 adapted to trees
      • UNION(x, y) can use a simple concatenation strategy: Make root FIND-SET(y) a child of root FIND-SET(x).
      • Adapt Trick 1 to this context: Union-by-weight: Merge tree with smaller weight into tree with larger weight.
      • Variant of Trick 1 (see book): Union-by-rank: rank of a tree = its height.
      [Figure: Example: UNION(x_4, y_2). The tree rooted at x_1 (nodes x_1, …, x_6) and the tree rooted at y_1 (nodes y_1, …, y_5); y_1 appears attached below x_1.]

  16. Trick 1 adapted to trees (union-by-weight)
      • Height of tree is logarithmic in weight, because:
        • Induction on n.
        • Height of a tree T is determined by the two subtrees T_1, T_2 that T has been united from.
        • Inductively, the heights of T_1, T_2 are at most the logs of their weights.
        • If T_1 and T_2 have different heights:
          height(T) = max(height(T_1), height(T_2)) ≤ max(log weight(T_1), log weight(T_2)) < log weight(T)
        • If T_1 and T_2 have the same height (assume 2 ≤ weight(T_1) < weight(T_2)):
          height(T) = height(T_1) + 1 ≤ log(2 · weight(T_1)) ≤ log weight(T)
      • Thus the total cost of any m operations is O(m log n).
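The height bound can be checked on random inputs. The following is a sketch only; `WeightedForest`, `height`, and `check_height_bound` are hypothetical names, and `height` is a slow helper meant purely for verification.

```python
import math
import random


class WeightedForest:
    """Disjoint-set forest with union-by-weight (no path compression)."""

    def __init__(self):
        self.parent = {}
        self.weight = {}  # number of nodes in the tree; valid for roots only

    def make_set(self, x):
        self.parent[x] = x
        self.weight[x] = 1

    def find_set(self, x):
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def union(self, x, y):
        rx, ry = self.find_set(x), self.find_set(y)
        if rx == ry:
            return rx
        if self.weight[rx] < self.weight[ry]:
            rx, ry = ry, rx            # rx is now the heavier root
        self.parent[ry] = rx           # lighter tree hangs below heavier root
        self.weight[rx] += self.weight[ry]
        return rx

    def height(self, root):
        """Longest node-to-root path in root's tree (slow; for checking only)."""
        best = 0
        for x in self.parent:
            d, cur = 0, x
            while self.parent[cur] != cur:
                cur = self.parent[cur]
                d += 1
            if cur == root:
                best = max(best, d)
        return best


def check_height_bound(n, seed=0):
    """After random unions, verify height <= log2(weight) for every tree."""
    rng = random.Random(seed)
    f = WeightedForest()
    for i in range(n):
        f.make_set(i)
    for _ in range(2 * n):
        f.union(rng.randrange(n), rng.randrange(n))
    roots = [x for x in f.parent if f.parent[x] == x]
    return all(f.height(r) <= math.log2(f.weight[r]) for r in roots)
```

Because every FIND-SET walks at most height-many edges, this bound is what gives the O(m log n) total for m operations.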

  17. Trick 2: Path compression
      When we execute a FIND-SET operation and walk up a path p to the root, we know the representative for all the nodes on path p.
      Path compression makes all of those nodes direct children of the root.
      Cost of FIND-SET(x) is still Θ(depth[x]).
      [Figure: Example: FIND-SET(y_2) on the tree rooted at x_1 that contains x_1, …, x_6 and y_1, …, y_5.]
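Path compression can be sketched as a two-pass FIND-SET: the first pass locates the root, the second re-parents every node on the path. The class name `CompressedForest` and the four-node chain in the example are assumptions for illustration; the union here is the simple concatenation strategy, without union-by-weight.

```python
class CompressedForest:
    """Parent-pointer forest whose find_set compresses the path it walks."""

    def __init__(self):
        self.parent = {}

    def make_set(self, x):
        self.parent[x] = x

    def find_set(self, x):
        # Pass 1: walk up to the root, Theta(depth[x]).
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Pass 2: make every node on the path a direct child of the root.
        while x != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, x, y):
        rx, ry = self.find_set(x), self.find_set(y)
        if rx != ry:
            self.parent[ry] = rx  # simple concatenation, no weights


def compress_chain():
    """Build the hypothetical chain a -> b -> c -> d by hand, then call find_set(a)."""
    f = CompressedForest()
    for v in "abcd":
        f.make_set(v)
    f.parent["a"], f.parent["b"], f.parent["c"] = "b", "c", "d"
    root = f.find_set("a")
    return root, dict(f.parent)
```

After `compress_chain()`, the nodes a, b, and c all point directly at d, so a later FIND-SET on any of them follows a single pointer.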
