disjoint set data structure

Disjoint-set data structure (Union-Find), CS 5633, Spring 2006



  1. Union-Find Data Structures
     Carola Wenk. Slides courtesy of Charles Leiserson with small changes by Carola Wenk.
     3/30/06, CS 5633 Analysis of Algorithms, Spring 2006

     Disjoint-set data structure (Union-Find)
     Problem:
     • Maintain a dynamic collection of pairwise-disjoint sets S = {S_1, S_2, …, S_r}.
     • Each set S_i has one element distinguished as the representative element, rep[S_i].
     • Must support 3 operations:
       • MAKE-SET(x): adds new set {x} to S with rep[{x}] = x (for any x ∉ S_i for all i)
       • UNION(x, y): replaces sets S_x, S_y with S_x ∪ S_y in S (for any x, y in distinct sets S_x, S_y)
       • FIND-SET(x): returns representative rep[S_x] of set S_x containing element x

     Disjoint-set data structure (Union-Find) II
     • In all operations the elements x, y are given (as pointers or references, for example).
     • Hence, we do not need to first search for the element in the data structure.
     • Let n denote the overall number of elements (equivalently, the number of MAKE-SET operations).

     Simple linked-list solution
     Store each set S_i = {x_1, x_2, …, x_k} as an (unordered) doubly linked list. Define representative element rep[S_i] to be the front of the list, x_1.
     [Figure: S_i drawn as the list x_1, x_2, …, x_k, with rep[S_i] pointing at x_1.]
     • MAKE-SET(x) initializes x as a lone node. Θ(1)
     • FIND-SET(x) walks left in the list containing x until it reaches the front of the list. Θ(n)
     • UNION(x, y) calls FIND-SET on x and y and concatenates the lists containing x and y, leaving the representative as FIND-SET(x). Θ(n)
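The simple linked-list solution can be sketched in Python as follows. This is a minimal illustration rather than code from the slides: the class name `Node` and the field names `prev` and `next` are assumptions, and elements are passed around as node references, matching the slides' remark that x and y are given as pointers.

```python
class Node:
    """One element of a set, stored as a node of a doubly linked list."""

    def __init__(self, value):
        self.value = value
        self.prev = None  # neighbor toward the front of the list (assumed name)
        self.next = None  # neighbor toward the back of the list (assumed name)


def make_set(value):
    """MAKE-SET: a lone node is its own one-element list. Theta(1)."""
    return Node(value)


def find_set(x):
    """FIND-SET: walk left until the front of the list. Theta(n) worst case."""
    while x.prev is not None:
        x = x.prev
    return x


def union(x, y):
    """UNION: append y's list to x's list, so the rep stays find_set(x)."""
    front_x, front_y = find_set(x), find_set(y)
    if front_x is front_y:
        return front_x  # x and y are already in the same set
    tail = x
    while tail.next is not None:  # walk right to the back of x's list
        tail = tail.next
    tail.next = front_y
    front_y.prev = tail
    return front_x
```

Both `find_set` and `union` may traverse a whole list, which is where the Θ(n) bounds come from.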

  2. Simple balanced-tree solution
     Store each set S_i = {x_1, x_2, …, x_k} as a balanced tree (ignoring keys). Define representative element rep[S_i] to be the root of the tree.
     [Slide annotation on "balanced tree": maintain how?]
     [Figure: S_i = {x_1, x_2, x_3, x_4, x_5} drawn as a tree with root x_1 = rep[S_i], nodes x_4 and x_3 below it, and x_2 and x_5 at the bottom level.]
     • MAKE-SET(x) initializes x as a lone node. Θ(1)
     • FIND-SET(x) walks up the tree containing x until reaching the root. Θ(log n)
     • UNION(x, y) calls FIND-SET on x and y and concatenates the trees containing x and y, changing the representative of x or y. Θ(log n)

     Plan of attack
     • We will build a simple disjoint-union data structure that, in an amortized sense, performs significantly better than Θ(log n) per op., even better than Θ(log log n), Θ(log log log n), ..., but not quite Θ(1).
     • To reach this goal, we will introduce two key tricks. Each trick converts a trivial Θ(n) solution into a simple Θ(log n) amortized solution. Together, the two tricks yield a much better solution.
     • First trick arises in an augmented linked list. Second trick arises in a tree structure.

     Augmented linked-list solution
     Store S_i = {x_1, x_2, …, x_k} as an unordered doubly linked list.
     Augmentation: Each element x_j also stores pointer rep[x_j] to rep[S_i] (which is the front of the list, x_1).
     [Figure: S_i drawn as the list x_1, x_2, …, x_k, with a rep pointer from every element to rep[S_i] = x_1.]
     • FIND-SET(x) returns rep[x]. Θ(1)
     • UNION(x, y) concatenates the lists containing x and y and updates the rep pointers for all elements in the list containing y. Θ(n)

     Example of augmented linked-list solution
     Each element x_j stores pointer rep[x_j] to rep[S_i].
     UNION(x, y)
     • concatenates the lists containing x and y, and
     • updates the rep pointers for all elements in the list containing y.
     [Figure: S_x: x_1, x_2 with rep[S_x]; S_y: y_1, y_2, y_3 with rep[S_y].]
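A minimal Python sketch of the augmented linked-list solution follows. It is an illustration under stated assumptions, not the slides' code: the list is kept singly linked for brevity, and a `tail` field on the representative (an assumed addition) lets UNION find the end of x's list without walking it.

```python
class Node:
    """List element that also stores a pointer to its set's representative."""

    def __init__(self, value):
        self.value = value
        self.next = None
        self.rep = self   # rep[x]: the front of the list containing x
        self.tail = self  # assumption: last node of the list, valid on the rep only


def make_set(value):
    """MAKE-SET: a lone node is its own representative. Theta(1)."""
    return Node(value)


def find_set(x):
    """FIND-SET: just follow the stored rep pointer. Theta(1)."""
    return x.rep


def union(x, y):
    """UNION: append y's list to x's list and redirect y's rep pointers."""
    rep_x, rep_y = find_set(x), find_set(y)
    if rep_x is rep_y:
        return rep_x  # already in the same set
    rep_x.tail.next = rep_y    # concatenate the two lists
    rep_x.tail = rep_y.tail
    node = rep_y
    while node is not None:    # Theta(length of y's list) pointer updates
        node.rep = rep_x
        node = node.next
    return rep_x
```

The loop over y's list is the whole cost of UNION here, which is why the choice of which list to relabel matters in the slides that follow.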

  3. Example of augmented linked-list solution (continued)
     Each element x_j stores pointer rep[x_j] to rep[S_i].
     UNION(x, y)
     • concatenates the lists containing x and y, and
     • updates the rep pointers for all elements in the list containing y.
     [Figure, two frames: S_x ∪ S_y drawn as x_1, x_2 followed by y_1, y_2, y_3. The first frame still shows rep[S_x] and rep[S_y]; the second shows a single representative rep[S_x ∪ S_y].]

     Alternative concatenation
     UNION(x, y) could instead
     • concatenate the lists containing y and x, and
     • update the rep pointers for all elements in the list containing x.
     [Figure, two frames: S_x: x_1, x_2 with rep[S_x] and S_y: y_1, y_2, y_3 with rep[S_y], first separate and then joined as S_x ∪ S_y.]

  4. Alternative concatenation (continued)
     UNION(x, y) could instead
     • concatenate the lists containing y and x, and
     • update the rep pointers for all elements in the list containing x.
     [Figure: S_x ∪ S_y with x_1, x_2 and y_1, y_2, y_3 joined under a single representative rep[S_x ∪ S_y].]

     Trick 1: Smaller into larger (weighted-union heuristic)
     To save work, concatenate the smaller list onto the end of the larger list. Cost = Θ(length of smaller list). Augment the list to store its weight (# elements).
     • Let n denote the overall number of elements (equivalently, the number of MAKE-SET operations).
     • Let m denote the total number of operations.
     • Let f denote the number of FIND-SET operations.
     Theorem: Cost of all UNIONs is O(n log n).
     Corollary: Total cost is O(m + n log n).

     Analysis of Trick 1 (weighted-union heuristic)
     Theorem: Total cost of UNIONs is O(n log n).
     Proof.
     • Monitor an element x and the set S_x containing it.
     • After the initial MAKE-SET(x), weight[S_x] = 1.
     • Each time S_x is united with a set S_y with weight[S_y] ≥ weight[S_x],
       • pay 1 to update rep[x], and
       • weight[S_x] at least doubles (increases by weight[S_y]).
     • Each time S_x is united with a smaller set S_y,
       • pay nothing, and
       • weight[S_x] only increases.
     Thus pay ≤ log n for x.

     Disjoint set forest: Representing sets as trees
     Store each set S_i = {x_1, x_2, …, x_k} as an unordered, potentially unbalanced, not necessarily binary tree, storing only parent pointers. rep[S_i] is the tree root.
     [Figure: S_i = {x_1, x_2, x_3, x_4, x_5, x_6} drawn as a tree with root x_1 = rep[S_i], nodes x_4 and x_3 below it, and x_2, x_5, x_6 at the bottom level.]
     • MAKE-SET(x) initializes x as a lone node. Θ(1)
     • FIND-SET(x) walks up the tree containing x until it reaches the root. Θ(depth[x])
     • UNION(x, y) concatenates the trees containing x and y …
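The weighted-union heuristic (Trick 1) can be sketched on the augmented list as below. This is an illustrative sketch with assumed names: `weight` and `tail` are stored on the representative only, the list is singly linked for brevity, and `union` returns the number of rep pointers it rewrote so the cost can be added up.

```python
class Node:
    """List element with a rep pointer; the rep also tracks tail and weight."""

    def __init__(self, value):
        self.value = value
        self.next = None
        self.rep = self
        self.tail = self  # assumption: valid on the representative only
        self.weight = 1   # number of elements, valid on the representative only


def make_set(value):
    return Node(value)


def find_set(x):
    return x.rep


def union(x, y):
    """Append the smaller list to the larger one; return pointer updates paid."""
    big, small = find_set(x), find_set(y)
    if big is small:
        return 0
    if big.weight < small.weight:
        big, small = small, big  # always relabel the smaller list
    big.tail.next = small
    big.tail = small.tail
    big.weight += small.weight
    cost = 0
    node = small
    while node is not None:      # Theta(length of smaller list)
        node.rep = big
        cost += 1
        node = node.next
    return cost


# Merge 16 singletons pairwise in rounds and add up the rep-pointer updates.
nodes = [make_set(i) for i in range(16)]
total_cost = 0
step = 1
while step < len(nodes):
    for i in range(0, len(nodes), 2 * step):
        total_cost += union(nodes[i], nodes[i + step])
    step *= 2
```

In this run every element is relabeled at most log2(16) = 4 times, so `total_cost` stays within the n log n bound of the theorem.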

  5. Trick 1 adapted to trees
     • UNION(x, y) can use a simple concatenation strategy: make root FIND-SET(y) a child of root FIND-SET(x).
       ⇒ FIND-SET(y) = FIND-SET(x).
     • Adapt Trick 1 to this context:
       Union-by-weight: Merge the tree with smaller weight into the tree with larger weight.
     • Variant of Trick 1 (see book):
       Union-by-rank: rank of a tree = its height
     [Figure: a tree with root x_1 over x_4, x_3, x_2, x_5, x_6, and a tree with root y_1 over y_4, y_3, y_2, y_5 attached below it.]

     Trick 1 adapted to trees (union-by-weight)
     • Height of tree is logarithmic in weight, because:
       • Induction on the weight.
       • Height of a tree T is determined by the two subtrees T_1, T_2 that T has been united from.
       • Inductively, the heights of T_1, T_2 are the logs of their weights.
       • height(T) = max(height(T_1), height(T_2)), possibly +1, but only if T_1, T_2 have the same height.
     • Thus total cost is O(m log n).

     Trick 2: Path compression
     When we execute a FIND-SET operation and walk up a path p to the root, we know the representative for all the nodes on path p.
     Path compression makes all of those nodes direct children of the root.
     Cost of FIND-SET(x) is still Θ(depth[x]).
     [Figure, two frames: FIND-SET(y_2) on the combined tree with root x_1 over x_4, x_3, x_2, x_5, x_6 and the subtree y_1, y_4, y_3, y_2, y_5.]
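Both tricks can be combined in a disjoint-set forest, sketched below in Python. The names are assumptions for illustration, and a root is marked by pointing to itself (the slides only say that parent pointers are stored). The sketch uses union-by-weight; the union-by-rank variant mentioned above would compare an upper bound on height instead of `weight`.

```python
class Node:
    """Forest node storing only a parent pointer, plus a weight on roots."""

    def __init__(self, value):
        self.value = value
        self.parent = self  # assumption: a root is its own parent
        self.weight = 1     # number of nodes in the tree, valid on roots only


def make_set(value):
    """MAKE-SET: a lone node is a one-node tree. Theta(1)."""
    return Node(value)


def find_set(x):
    """FIND-SET with path compression. Theta(depth[x])."""
    root = x
    while root.parent is not root:  # first pass: walk up to the root
        root = root.parent
    while x is not root:            # second pass: re-hang the path on the root
        next_up = x.parent
        x.parent = root
        x = next_up
    return root


def union(x, y):
    """UNION by weight: the lighter root becomes a child of the heavier root."""
    root_x, root_y = find_set(x), find_set(y)
    if root_x is root_y:
        return root_x
    if root_x.weight < root_y.weight:
        root_x, root_y = root_y, root_x
    root_y.parent = root_x
    root_x.weight += root_y.weight
    return root_x
```

After `find_set(x)` returns, every node that was on the path from x to the root is a direct child of the root, so later FIND-SET calls on those nodes take one step.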
