Data Structures for Disjoint Sets Course: CS 5130 - Advanced Data - PowerPoint PPT Presentation

Data Structures for Disjoint Sets Course: CS 5130 - Advanced Data Structures and Algorithms Instructor: Dr. Badri Adhikari

Overview
Some applications involve grouping n distinct elements into a collection of disjoint sets. Two frequent operations in such applications:
(a) finding the unique set that contains a given element
(b) uniting two sets
How can we maintain a data structure that supports these operations? Two implementations:
(a) Linked-list implementation of disjoint sets
(b) Rooted-tree implementation of disjoint sets

Disjoint set operations
A disjoint-set data structure maintains a collection S = {S_1, S_2, S_3, ..., S_k} of disjoint dynamic sets. Each set is identified by a representative, which is some member of the set.
- In some applications, it may not matter which member is used
- In some applications, the smallest member is used
- In some applications, a user-selected member is used
Each element of a set is represented by an object x.

Disjoint set operations
MAKE-SET(x) - creates a new set whose only member (and thus representative) is x. Because the sets are disjoint, x must not already be in another set.
UNION(x, y) - unites the dynamic sets that contain x and y, say S_x and S_y, into a new set. What will be the new representative? x’s or y’s? Where do we implement it?
FIND-SET(x) - returns a pointer to the representative of the (unique) set containing x.
Running times of disjoint-set data structures depend on two parameters:
(a) n - the number of MAKE-SET operations
(b) m - the total number of MAKE-SET, UNION, and FIND-SET operations
Always, m ≥ n. Why?

Example application 1 - Reachability in Maze
Maze - Is B reachable from A?
https://www.coursera.org/learn/data-structures/

Example application 1 - Reachability in Maze
preprocess(maze) {
    for each cell c in maze:
        MAKE-SET(c)
    for each cell c in maze:
        for each neighbor n of c:
            UNION(c, n)
}
is-reachable(A, B) {
    return FIND-SET(A) = FIND-SET(B)
}
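The preprocess / is-reachable idea above can be sketched in Python. This is a minimal illustration under stated assumptions, not the course's implementation: the maze is assumed to be a list of strings in which "#" is a wall, a "neighbor" is an adjacent open cell, and the helper names and the dictionary-based parent pointers are made up for this example.

```python
def make_set(parent, x):
    parent[x] = x                      # x starts as its own representative

def find_set(parent, x):
    while parent[x] != x:              # follow parent pointers to the representative
        x = parent[x]
    return x

def union(parent, x, y):
    rx, ry = find_set(parent, x), find_set(parent, y)
    if rx != ry:
        parent[ry] = rx                # merge y's set into x's set

def preprocess(maze):
    """Assumed encoding: maze is a list of equal-length strings, '#' is a wall."""
    parent = {}
    open_cells = [(r, c)
                  for r, row in enumerate(maze)
                  for c, ch in enumerate(row) if ch != "#"]
    for cell in open_cells:
        make_set(parent, cell)
    for (r, c) in open_cells:
        # Looking only down and right visits every adjacent pair exactly once.
        for neighbor in ((r + 1, c), (r, c + 1)):
            if neighbor in parent:     # inside the grid and not a wall
                union(parent, (r, c), neighbor)
    return parent

def is_reachable(parent, a, b):
    return find_set(parent, a) == find_set(parent, b)

# Hypothetical 3x4 maze: the wall column separates the left part from the right.
MAZE = ["..#.",
        ".##.",
        "..#."]
```

With this maze, preprocess(MAZE) puts (0, 0) and (2, 1) in the same set, while (0, 0) and (2, 3) end up in different sets, so the second pair is reported as not reachable.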

Example application 2 - Connected Components
Determine the connected components of an undirected graph!
[Figure: a graph with four connected components]
SAME-COMPONENT(a, d)
SAME-COMPONENT(f, i)

Example application 2 - Connected Components
[Figure: the disjoint sets after processing each edge, one at a time]
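The edge-by-edge processing can be sketched as follows. The slide's figure is not reproduced in this transcript, so the vertex and edge lists below are an assumed example chosen to give four components; the function names are illustrative as well.

```python
def connected_components(vertices, edges):
    """Return a dict mapping each vertex to the representative of its component."""
    parent = {v: v for v in vertices}  # MAKE-SET for every vertex

    def find_set(x):
        while parent[x] != x:
            x = parent[x]
        return x

    for u, v in edges:                 # process one edge at a time
        ru, rv = find_set(u), find_set(v)
        if ru != rv:
            parent[rv] = ru            # UNION the two components
    return {v: find_set(v) for v in vertices}

def same_component(rep, u, v):
    return rep[u] == rep[v]

# Assumed example graph (not taken from the slide's figure).
VERTICES = list("abcdefghij")
EDGES = [("b", "d"), ("e", "g"), ("a", "c"), ("h", "i"),
         ("a", "b"), ("e", "f"), ("b", "c")]
```

For this assumed graph the components are {a, b, c, d}, {e, f, g}, {h, i}, and {j}, so a query on (a, d) succeeds and a query on (f, i) fails.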

Linked list representation of disjoint sets
Say S_1 contains members f, g, and d, with f as the representative member.
Each object in the list contains a set member, a pointer to the next object in the list, and a pointer back to the set object. Each set object has pointers head and tail to the first and last objects of its list.
MAKE-SET(x) - we create a new linked list whose only object is x.
FIND-SET(x) - we follow the pointer from x back to its set object and then return the member in the object that head points to. For example, FIND-SET(g) would return f.
[Figure: linked list representation of disjoint set S_1]
MAKE-SET(x) and FIND-SET(x) both need O(1) time. How?

A simple implementation of Union
We can perform UNION(x, y) by appending y’s list onto the end of x’s list. The representative of x’s list becomes the representative of the resulting set. Use the tail pointer of x’s list to quickly find where to append y’s list.
We must update the pointer to the set object for each object originally in y’s list, which takes time linear in the length of y’s list.
Example: UNION(g, e) causes pointer updates for c, h, e, b.
If the objects did not store these pointers, UNION would take far less time. What is the downside?
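A minimal Python sketch of the linked-list representation with this simple UNION. The class names, the returned update count, and the build_list helper are assumptions added for illustration; only the head, tail, next, and back-pointer fields come from the description above.

```python
class SetObject:
    """Set object with head and tail pointers into its list."""
    def __init__(self):
        self.head = None
        self.tail = None

class Node:
    """List object: a member, a next pointer, and a pointer back to the set object."""
    def __init__(self, member):
        self.member = member
        self.next = None
        self.set = None

def make_set(member):
    s, node = SetObject(), Node(member)
    node.set = s
    s.head = s.tail = node
    return node

def find_set(node):
    return node.set.head.member        # two pointer hops, O(1)

def union(x, y):
    """Append y's list onto x's list; return how many back-pointers were updated."""
    sx, sy = x.set, y.set
    if sx is sy:
        return 0
    sx.tail.next = sy.head             # splice y's list after x's tail
    sx.tail = sy.tail
    updates = 0
    node = sy.head
    while node is not None:            # linear in the length of y's list
        node.set = sx
        updates += 1
        node = node.next
    sy.head = sy.tail = None
    return updates

def build_list(members):
    """Hypothetical helper: build one set whose list order follows `members`."""
    nodes = [make_set(m) for m in members]
    for node in nodes[1:]:
        union(nodes[0], node)
    return {n.member: n for n in nodes}
```

For instance, with s1 = build_list("fgd") and s2 = build_list("cheb"), calling union(s1["g"], s2["e"]) updates the four back-pointers of c, h, e, b, after which find_set on any of the seven objects returns f.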

Running time of the linked list implementation
Suppose we have objects x_1, x_2, ..., x_n. We execute a sequence of n MAKE-SET operations followed by n-1 UNION operations, so that m = 2n-1. [m - the total number of MAKE-SET, UNION, and FIND-SET operations]
Total time for n MAKE-SET operations = Θ(n).
If the i-th UNION operation updates i objects (the worst case), the number of objects updated by all n-1 UNION operations is 1 + 2 + ... + (n-1) = n(n-1)/2 = Θ(n^2).
This is a sequence of 2n-1 operations on n objects that takes Θ(n^2) time in total. So each operation, on average, requires Θ(n) time.
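The quadratic count can be checked with a small simulation. This sketch assumes the worst-case order UNION(x_2, x_1), UNION(x_3, x_2), ..., in which the list being appended is always the growing one; the dictionary bookkeeping and the function name are illustrative stand-ins for the linked lists.

```python
def count_updates_naive(n):
    """Count set-pointer updates for the assumed worst-case UNION sequence."""
    owner = {i: i for i in range(1, n + 1)}       # element -> id of its set
    members = {i: [i] for i in range(1, n + 1)}   # set id -> its list of elements
    updates = 0
    for i in range(1, n):
        # UNION(x_{i+1}, x_i): append the list containing x_i onto x_{i+1}'s list.
        dst, src = owner[i + 1], owner[i]
        for elem in members[src]:                 # the i-th UNION touches i objects
            owner[elem] = dst
            updates += 1
        members[dst].extend(members.pop(src))
    return updates
```

For any n, count_updates_naive(n) equals n(n-1)/2, matching the sum above.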

A weighted-union heuristic
In the worst case, our implementation of the UNION procedure requires an average of Θ(n) time per call. Why? Maybe we are always appending a longer list onto a shorter list.
Solution: we store the length of each list along with the list. This way, we can always append the shorter list onto the longer one.
With this simple weighted-union heuristic, a single UNION operation can still take Ω(n) time if both sets have Ω(n) members.
Overall, the total time spent updating object pointers over all UNION operations is O(n lg n): an object’s pointer is updated only when it is in the shorter list, so the size of its set at least doubles each time, which can happen at most lg n times. That is, each UNION operation takes O(lg n) time on average.
Each MAKE-SET and FIND-SET takes O(1) time, and there are O(m) of them in total. Thus the total time for the entire sequence is O(m + n lg n).
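A hedged sketch of the weighted-union heuristic that counts pointer updates. Python lists and dictionaries stand in for the linked lists here, and the class name, the updates counter, and the balanced_merge helper are assumptions for illustration only.

```python
import math

class WeightedSets:
    def __init__(self, elements):
        self.owner = {x: x for x in elements}      # element -> representative
        self.members = {x: [x] for x in elements}  # representative -> its list
        self.updates = 0                           # total pointer updates so far

    def find_set(self, x):
        return self.owner[x]

    def union(self, x, y):
        a, b = self.owner[x], self.owner[y]
        if a == b:
            return
        if len(self.members[a]) < len(self.members[b]):
            a, b = b, a                            # make b the shorter list
        for elem in self.members[b]:               # update only the shorter list
            self.owner[elem] = a
            self.updates += 1
        self.members[a].extend(self.members.pop(b))

def balanced_merge(n):
    """Merge n singletons pairwise, then pairs of pairs, ... (n a power of two)."""
    sets = WeightedSets(range(n))
    step = 1
    while step < n:
        for i in range(0, n, 2 * step):
            sets.union(i, i + step)
        step *= 2
    return sets
```

In this sketch balanced_merge(8) performs 12 pointer updates, which is (n/2) lg n and within the n lg n bound, while the chain-like sequence from the previous slide needs only n-1 updates because the singleton is always the list that gets appended.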

Disjoint-set forests
We represent sets by rooted trees, with each node containing one member and each tree representing a set. Each member points only to its parent. The root of each tree contains the representative and is its own parent.
- MAKE-SET creates a tree with just one node.
- FIND-SET follows parent pointers until it finds the root of the tree. The nodes visited on this simple path toward the root constitute the find path.
- UNION causes the root of one tree to point to the root of the other.
A straightforward algorithm that uses this representation is no faster than one that uses the linked-list representation.
- Each MAKE-SET takes O(1) time.
- Each UNION takes O(1) time once the two roots are known.
- Each FIND-SET can take anywhere from O(1) to O(n) time.
(FIND-SET is the challenge here, whereas UNION was the challenge in the linked-list representation.)
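A minimal sketch of the plain forest, with no heuristics, showing how FIND-SET can degrade to O(n). The class, the find_path_length helper, and the particular chain-building sequence are assumptions for illustration.

```python
class SimpleForest:
    def __init__(self):
        self.parent = {}

    def make_set(self, x):
        self.parent[x] = x                 # the root is its own parent

    def find_set(self, x):
        while self.parent[x] != x:         # walk the find path to the root
            x = self.parent[x]
        return x

    def union(self, x, y):
        rx, ry = self.find_set(x), self.find_set(y)
        if rx != ry:
            self.parent[ry] = rx           # root of y's tree points to root of x's tree

    def find_path_length(self, x):
        """Number of parent hops from x to its root (illustrative helper)."""
        hops = 0
        while self.parent[x] != x:
            x = self.parent[x]
            hops += 1
        return hops

def build_chain(n):
    """UNION(2, 1), UNION(3, 2), ...: each old root is hung under a new singleton."""
    forest = SimpleForest()
    for i in range(1, n + 1):
        forest.make_set(i)
    for i in range(1, n):
        forest.union(i + 1, i)
    return forest
```

With build_chain(n) the tree is a single path 1 → 2 → ... → n, so FIND-SET(1) follows n-1 parent pointers.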

Heuristics to improve running time - Union by Rank
Scenario: a sequence of n-1 UNION operations may create a tree that is just a linear chain of n nodes.
Similar to the weighted-union heuristic, we could make the root of the tree with fewer nodes point to the root of the tree with more nodes. Rather than tracking sizes, for each node we maintain a rank, which is an upper bound on the height of the node. During a UNION operation, we make the root with smaller rank point to the root with larger rank.
This improves the time required for each FIND-SET from O(n) to O(lg n). The total running time is O(m lg n), because each of the m operations takes O(lg n) time: MAKE-SET is O(1), and a UNION first runs FIND-SET on its two arguments.
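The O(lg n) bound on find paths can be observed with a short sketch of union by rank alone, without path compression. The class and the max_depth helper are illustrative assumptions.

```python
import math
import random

class RankedForest:
    def __init__(self, elements):
        self.parent = {x: x for x in elements}
        self.rank = {x: 0 for x in elements}

    def find_set(self, x):                 # deliberately no path compression here
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def union(self, x, y):
        rx, ry = self.find_set(x), self.find_set(y)
        if rx == ry:
            return
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx                # rx is now the root with the larger rank
        self.parent[ry] = rx               # smaller-rank root points to larger-rank root
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1             # equal ranks: the new root's rank grows by one

def max_depth(forest):
    """Longest find path in the forest, in parent hops (illustrative helper)."""
    deepest = 0
    for x in forest.parent:
        hops = 0
        while forest.parent[x] != x:
            x = forest.parent[x]
            hops += 1
        deepest = max(deepest, hops)
    return deepest
```

Running the chain-producing sequence UNION(1, 0), UNION(2, 1), ... on this structure leaves every node at most one hop from the root, and for any sequence of unions on n elements the longest find path stays at or below floor(lg n).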

Heuristics to improve running time - Path compression
Path compression is simple and yet highly effective. During FIND-SET operations, make each node on the find path point directly to the root.
Path compression does not change any ranks. What is the consequence?
[Figure: path compression during the FIND-SET operation, showing the tree prior to and after executing FIND-SET(a). Triangles are subtrees whose root nodes are shown.]
Future FIND-SET operations on the nodes of that find path take constant time, as long as the root does not change.
Now, with union by rank and path compression together, the total running time is O(m α(n)), where α(n) is the very slowly growing inverse Ackermann function. The total is therefore almost O(m), and each operation, on average, takes almost constant time.

Pseudocode for disjoint-set forests
Path compression implementation
With each node x, we maintain the integer value x.rank, which is an upper bound on the height of x. The parent of x is x.p.
MAKE-SET creates a singleton set; the single node in the corresponding tree has an initial rank of 0.
Each FIND-SET operation leaves the ranks unchanged. The FIND-SET procedure is a two-pass method: as it recurses, it makes one pass up the find path to find the root, and as the recursion unwinds, it makes a second pass back down the find path to update each node to point directly to the root.
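The slide's pseudocode did not survive in this transcript. A minimal Python sketch of MAKE-SET and the two-pass recursive FIND-SET, using the x.p and x.rank fields described above, might look like this; the Node class, the member field, and the hand-wired demo chain are assumptions for illustration.

```python
class Node:
    def __init__(self, member):
        self.member = member
        self.p = self          # a new node is its own parent (a root)
        self.rank = 0          # initial rank of a singleton is 0

def make_set(member):
    return Node(member)

def find_set(x):
    if x is not x.p:
        # First pass: recurse up the find path to the root.
        # Second pass: as the recursion unwinds, point x directly at the root.
        x.p = find_set(x.p)
    return x.p

# Demo: wire a chain a -> b -> c -> d by hand, then compress it.
a, b, c, d = (make_set(m) for m in "abcd")
a.p, b.p, c.p = b, c, d
root = find_set(a)
```

After find_set(a), the nodes a, b, and c all point directly to d and every rank is still 0. Because this version recurses once per node on the find path, a very long chain could hit Python's recursion limit; an iterative two-pass loop is a common alternative.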

Pseudocode for disjoint-set forests
The UNION operation has two cases, depending on whether the roots of the trees have equal ranks.
If the roots have unequal ranks, we make the root with higher rank the parent of the root with lower rank, but the ranks themselves remain unchanged.
If the roots have equal ranks, we arbitrarily choose one of the roots as the parent and increment its rank.
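The two UNION cases can be sketched in Python as a LINK step applied to the two roots. This is an illustration under assumptions rather than the slide's own pseudocode: the Node class and the function names are made up here, and the same-root guard in union is an addition so that uniting two members of one set is a no-op.

```python
class Node:
    def __init__(self, member):
        self.member = member
        self.p = self
        self.rank = 0

def find_set(x):
    if x is not x.p:
        x.p = find_set(x.p)    # path compression
    return x.p

def link(x, y):
    """x and y must be two distinct roots."""
    if x.rank > y.rank:
        y.p = x                # unequal ranks: higher rank becomes parent, ranks unchanged
    else:
        x.p = y
        if x.rank == y.rank:
            y.rank += 1        # equal ranks: y is chosen as parent and its rank grows

def union(x, y):
    rx, ry = find_set(x), find_set(y)
    if rx is not ry:           # added guard: nothing to do if already in one set
        link(rx, ry)

# Demo of both cases.
a, b, c = Node("a"), Node("b"), Node("c")
union(a, b)                    # equal ranks (0 and 0): b becomes the root, rank 1
union(c, a)                    # unequal ranks (0 and 1): c is linked under b
```

After these two calls, b is the root of all three nodes with rank 1, while a and c keep rank 0.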

Classwork
Draw a linked-list representation and a forest representation of the following disjoint sets:
[Figure: sets S_1, S_2, and S_3]

Summary
Disjoint sets can be represented in two ways: using linked lists and using trees (forests).
The linked-list implementation with the weighted-union heuristic has a total running time of O(m + n lg n).
With the ‘union by rank’ heuristic and the path compression heuristic, the disjoint-set forest implementation takes almost O(m) total running time.
