
CSE 332: Disjoint Set Union/Find (and finishing Dijkstra's algorithm)
Richard Anderson, Steve Seitz
Winter 2014

Announcements
• Reading for this lecture: Chapter 8.

Edsger Wybe Dijkstra
http://www.cs.utexas.edu/users/EWD/
• Edsger Wybe Dijkstra was one of the most influential members of computing science's founding generation. Among the domains in which his scientific contributions are fundamental are
  – algorithm design
  – programming languages
  – program design
  – operating systems
  – distributed processing
  – formal specification and verification
  – design of mathematical arguments

Dijkstra's Algorithm
Assume all edges have non-negative cost.
  S = {}; d[s] = 0; d[v] = infinity for v != s
  While S != V
    Choose v in V-S with minimum d[v]
    Add v to S
    For each w in the neighborhood of v
      d[w] = min(d[w], d[v] + c(v, w))
[Figure: weighted example graph on vertices s, u, v, x, y, z; the edge weights are not recoverable from the extracted text.]

Simulate Dijkstra's algorithm (starting from s) on the graph
[Figure: weighted graph on vertices s, a, b, c, d, with a table to fill in. Columns: Round, Vertex Added, s, a, b, c, d. The edge weights are not recoverable from the extracted text.]
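A minimal Python sketch of the pseudocode above, following it line by line (a linear scan picks the minimum, with no priority queue). The function name `dijkstra`, the dict-of-dicts graph format, and the small `example_graph` are assumptions made for illustration; the graph is not the one from the slide figures.

```python
import math

def dijkstra(graph, s):
    """Shortest distances from s. graph[v] maps each neighbor w to the cost c(v, w)."""
    d = {v: math.inf for v in graph}   # d[v] = infinity for v != s
    d[s] = 0
    S = set()                          # vertices whose label is final
    while len(S) < len(graph):         # While S != V
        # Choose v in V-S with minimum d[v]
        v = min((u for u in graph if u not in S), key=lambda u: d[u])
        S.add(v)
        for w, cost in graph[v].items():
            d[w] = min(d[w], d[v] + cost)
    return d

# Hypothetical graph with non-negative edge costs (not from the slides).
example_graph = {
    "s": {"a": 1, "b": 4},
    "a": {"b": 2, "c": 6},
    "b": {"c": 3},
    "c": {},
}
distances = dijkstra(example_graph, "s")
```

Here the direct edge s-b costs 4, but the route through a costs 1 + 2 = 3, so the label of b is lowered when a is added to S.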

Correctness Proof
• Elements in S have the correct label
• Key to proof: when v is added to S, it has the correct distance label.
[Figure: set S containing s and u, with the edge (u,v) and a second path leaving S on the edge (x,y).]

Proof
• Let v be a vertex in V-S with minimum d[v]
• Let P_v be a path of length d[v], with an edge (u,v)
• Let P be some other path to v. Suppose P first leaves S on the edge (x, y)
  – P = P_sx + c(x,y) + P_yv
  – Len(P_sx) + c(x,y) >= d[y]
  – Len(P_yv) >= 0
  – Len(P) >= d[y] + 0 >= d[v]

Union-Find Data Structure
• ADT Definition
• How it's implemented with pointers
• Optimizations
• Results of analysis
  – (Some of the strangest mathematics in CS)

Making Connections
You have a set of nodes (numbered 1-9) on a network. You are given a sequence of pairwise connections between them:
  3-5
  4-2
  1-6
  5-7
  4-8
  3-7
Q: Are nodes 2 and 4 (indirectly) connected?
Q: How about nodes 3 and 8?
Q: Are any of the paired connections redundant due to indirect connections?
Q: How many sub-networks do you have?

Making Connections
Answering these questions is much easier if we create disjoint sets of nodes that are connected:
Start: {1} {2} {3} {4} {5} {6} {7} {8} {9}
  3-5
  4-2
  1-6
  5-7
  4-8
  3-7

Applications of Disjoint Sets
Maintaining disjoint sets in this manner arises in a number of areas, including:
  – Networks
  – Transistor interconnects
  – Compilers
  – Image segmentation
  – Building mazes (this lecture)
  – Graph problems
    • Minimum Spanning Trees (upcoming topic in this class)
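The four questions from the Making Connections slides can be answered by merging sets as each connection arrives. The sketch below is a deliberately naive list-of-sets version, not the up-tree structure developed later in the lecture; the names `connect_all` and `connected` are hypothetical.

```python
def connect_all(nodes, connections):
    """Merge the sets of connected nodes; return the final sets and any redundant connections."""
    sets = [{v} for v in nodes]                    # Start: {1} {2} ... {9}
    redundant = []
    for a, b in connections:
        set_a = next(s for s in sets if a in s)
        set_b = next(s for s in sets if b in s)
        if set_a is set_b:
            redundant.append((a, b))               # a and b were already connected
        else:
            sets.remove(set_b)
            set_a |= set_b                         # merge set_b into set_a
    return sets, redundant

def connected(sets, a, b):
    """True if a and b ended up in the same set."""
    return any(a in s and b in s for s in sets)

connections = [(3, 5), (4, 2), (1, 6), (5, 7), (4, 8), (3, 7)]
sets, redundant = connect_all(range(1, 10), connections)
```

For this input the final sets are {3,5,7}, {4,2,8}, {1,6}, and {9}: nodes 2 and 4 are connected, nodes 3 and 8 are not, the connection 3-7 is redundant, and there are four sub-networks.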

Disjoint Set ADT
• Data: set of pairwise disjoint sets.
• Required operations
  – Union – merge two sets to create their union
  – Find – determine which set an item appears in
• A common operation sequence:
  – Connect two elements if not already connected:
    if (Find(x) != Find(y)) then Union(x,y)

Disjoint Sets and Naming
• Maintain a set of pairwise disjoint sets.
  – {3,5,7} , {4,2,8}, {9}, {1,6}
• Each set has a unique name: one of its members (for convenience)
  – {3,5,7} , {4,2,8}, {9}, {1,6}
  (The slide marks the name of each set; the marking is lost in this text. The Union and Find examples imply the names 5, 8, 9, and 1.)

Union
• Union(x,y) – take the union of two sets named x and y
  – {3,5,7} , {4,2,8}, {9}, {1,6}
  – Union(5,1)
    {3,5,7,1,6}, {4,2,8}, {9}

Find
• Find(x) – return the name of the set containing x.
  – {3,5,7,1,6}, {4,2,8}, {9}
  – Find(1) = 5
  – Find(4) = 8

Example
S before:
  {1,2,7,8,9,13,19} {3} {4} {5} {6} {10} {11,17} {12} {14,20,26,27} {15,16,21} ... {22,23,24,29,30,32,33,34,35,36}
Find(8) = 7
Find(14) = 20
Union(7,20)
S after:
  {1,2,7,8,9,13,19,14,20,26,27} {3} {4} {5} {6} {10} {11,17} {12} {15,16,21} ... {22,23,24,29,30,32,33,34,35,36}

Nifty Application: Building Mazes
Idea: Build a random maze by erasing walls.
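The Union and Find examples above can be reproduced with a simple "quick-find" table that stores the name of every element's set. This is only a sketch of the ADT's behavior, not the representation the lecture goes on to build. The class name `NamedDisjointSets`, the helper `connect`, and the set names 5, 8, 9, 1 (inferred from Union(5,1) and Find(4) = 8) are assumptions.

```python
class NamedDisjointSets:
    """Quick-find sketch: name_of[x] is the name of the set containing x."""

    def __init__(self, named_sets):
        # named_sets maps each set's name to its members (names are assumed here).
        self.name_of = {}
        for name, members in named_sets.items():
            for x in members:
                self.name_of[x] = name

    def find(self, x):
        """Return the name of the set containing x."""
        return self.name_of[x]

    def union(self, x, y):
        """Merge the sets named x and y; the merged set keeps the name x."""
        for item in list(self.name_of):
            if self.name_of[item] == y:
                self.name_of[item] = x

def connect(ds, x, y):
    """Connect two elements if not already connected."""
    if ds.find(x) != ds.find(y):
        ds.union(ds.find(x), ds.find(y))

ds = NamedDisjointSets({5: [3, 5, 7], 8: [4, 2, 8], 9: [9], 1: [1, 6]})
ds.union(5, 1)     # {3,5,7,1,6}, {4,2,8}, {9}
```

Find is a single lookup here, but Union has to relabel every member of one set, which is the kind of cost the tree-based representation tries to avoid.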

Building Mazes
• Pick Start and End
[Figure: grid of cells with Start and End marked.]

Building Mazes
• Repeatedly pick random walls to delete.
[Figure: the same grid with some walls removed.]

Desired Properties
• None of the boundary is deleted (except at "start" and "end").
• Every cell is reachable from every other cell.
• There are no cycles – no cell can reach itself by a path unless it retraces some part of the path.

A Cycle
[Figure: maze from Start to End that contains a cycle.]

A Good Solution
[Figure: maze from Start to End with no cycles.]

A Hidden Tree
[Figure: the same maze with the tree of cell-to-cell connections drawn in.]

Number the Cells
We start with disjoint sets S = { {1}, {2}, {3}, {4}, … {36} }.
We have all possible walls between neighbors W = { (1,2), (1,7), (2,8), (2,3), … }, 60 walls total.

  Start   1  2  3  4  5  6
          7  8  9 10 11 12
         13 14 15 16 17 18
         19 20 21 22 23 24
         25 26 27 28 29 30
         31 32 33 34 35 36   End

Idea: Union-find operations will be done on cells.

Maze Building with Disjoint Union/Find
Algorithm sketch:
1. Choose wall at random.
   → Boundary walls are not in wall list, so left alone
2. Erase wall if the neighbors are in disjoint sets.
   → Avoids cycles
3. Take union of those sets.
4. Go to 1, iterate until there is only one set.
   → Every cell reachable from every other cell.

Pseudocode
• S = set of sets of connected cells
  – Initialize to {{1}, {2}, …, {n}}
• W = set of walls
  – Initialize to set of all walls {{1,2},{1,7}, …}
• Maze = set of walls in maze (initially empty)

  While there is more than one set in S
    Pick a random non-boundary wall (x,y) and remove from W
    u = Find(x);
    v = Find(y);
    if u ≠ v then
      Union(u,v)
    else
      Add wall (x,y) to Maze
  Add remaining members of W to Maze

Example Step
Pick (8,14)
S: {1,2,7,8,9,13,19} {3} {4} {5} {6} {10} {11,17} {12} {14,20,26,27} {15,16,21} ... {22,23,24,29,30,32,33,34,35,36}

Example
Find(8) = 7
Find(14) = 20
Union(7,20)
S: {1,2,7,8,9,13,19,14,20,26,27} {3} {4} {5} {6} {10} {11,17} {12} {15,16,21} ... {22,23,24,29,30,32,33,34,35,36}

Example
Pick (19,20)
S: {1,2,7,8,9,13,19,14,20,26,27} {3} {4} {5} {6} {10} {11,17} {12} {15,16,21} ... {22,23,24,29,30,32,33,34,35,36}
Cells 19 and 20 are already in the same set, so by the pseudocode the wall (19,20) is added to Maze.
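The maze pseudocode above could be sketched in Python as follows. The function names `find` and `build_maze`, the `seed` parameter, and the shuffle-then-pop way of picking a random wall are assumptions for illustration. Union and Find use a plain parent array in which -1 marks a root, with no balancing.

```python
import random

def find(up, x):
    """Follow parent pointers from x to the root (the name of x's set)."""
    while up[x] >= 0:
        x = up[x]
    return x

def build_maze(rows, cols, seed=0):
    """Return (maze, erased, up): walls kept, walls erased, and the parent array."""
    rng = random.Random(seed)
    n = rows * cols
    up = [-1] * (n + 1)                  # cells are numbered 1..n; index 0 is unused
    walls = []                           # W: interior walls only, so the boundary is left alone
    for cell in range(1, n + 1):
        if cell % cols != 0:
            walls.append((cell, cell + 1))       # wall to the right neighbor
        if cell + cols <= n:
            walls.append((cell, cell + cols))    # wall to the neighbor below
    rng.shuffle(walls)                   # popping from a shuffled list picks a random wall
    maze, erased = [], []
    num_sets = n
    while num_sets > 1:                  # While there is more than one set in S
        x, y = walls.pop()
        u, v = find(up, x), find(up, y)
        if u != v:
            up[u] = v                    # Union(u, v): the wall is erased
            erased.append((x, y))
            num_sets -= 1
        else:
            maze.append((x, y))          # same set already: keep the wall, avoiding a cycle
    maze.extend(walls)                   # Add remaining members of W to Maze
    return maze, erased, up

maze, erased, up = build_maze(6, 6, seed=1)
```

On the 6 x 6 grid there are 60 interior walls. Each erased wall merges two sets, so exactly 35 walls are erased and 25 remain, whatever the random order.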

Example at the End
S: {1,2,3,4,5,6,7,… 36}
[Figure: the finished 6 x 6 maze from Start to End. Legend: remaining walls in W; walls previously added to Maze.]

Data structure for disjoint sets?
• Represent: {3,5,7} , {4,2,8}, {9}, {1,6}
• Support: find(x), union(x,y)

Union/Find Trade-off
• Known result:
  – Find and Union cannot both be done in worst-case O(1) time with any data structure.
• We will instead aim for good amortized complexity.
• For m operations on n elements:
  – Target complexity: O(m) i.e. O(1) amortized

Tree-based Approach
• Each set is a tree
• Root of each tree is the set name.
• Allow large fanout (why?)

Up-Tree for DS Union/Find
Observation: we will only traverse these trees upward from any given node to find the root.
Idea: reverse the pointers (make them point up from child to parent). The result is an up-tree.
Initial state: 1 2 3 4 5 6 7 (seven single-node trees)
Intermediate state:
[Figure: up-trees with roots 1, 3, and 7 over nodes 1 to 7; the parent links are not recoverable from the extracted text.]
Roots are the names of each set.

Find Operation
Find(x): follow x to the root and return the root.
[Figure: the same up-tree forest.]

Union Operation
Union(i, j) - assuming i and j roots, point i to j.
[Figure: the up-tree forest over nodes 1 to 7 before and after a Union.]

Simple Implementation
• Array of indices
  up[x] = -1 means x is a root.
[Figure: the up array for nodes 1 to 7.]

A Bad Case
  1 2 3 … n
  Union(1,2)
  Union(2,3)
  :
  Union(n-1,n)
  Find(1): n steps!!
[Figure: after the unions the forest is a single chain with n at the root and 1 at the bottom.]

Implementation
  void Union(int x, int y) {
    assert(up[x]<0 && up[y]<0);
    up[x] = y;
  }

  int Find(int x) {
    while(up[x] >= 0) {
      x = up[x];
    }
    return x;
  }

runtime for Union: O(1)
runtime for Find: O(n) in the worst case (the depth of x)
Amortized complexity is no better.

Two Big Improvements
Can we do better? Yes!
1. Union-by-size
   • Improve Union so that Find only takes worst case time of Θ(log n).
2. Path compression
   • Improve Find so that, with Union-by-size, Find takes amortized time of almost Θ(1).
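The bad case for the simple implementation can be checked directly. The sketch below is a Python transcription of the Union and Find code above, with a step counter added; `make_forest` and `find_with_steps` are hypothetical names.

```python
def make_forest(n):
    """Parent array for nodes 1..n; -1 marks a root (index 0 is unused)."""
    return [-1] * (n + 1)

def union(up, x, y):
    """Point root x at root y."""
    assert up[x] < 0 and up[y] < 0
    up[x] = y

def find_with_steps(up, x):
    """Return (root, number of parent pointers followed)."""
    steps = 0
    while up[x] >= 0:
        x = up[x]
        steps += 1
    return x, steps

n = 8
up = make_forest(n)
for i in range(1, n):
    union(up, i, i + 1)        # Union(1,2), Union(2,3), ..., Union(n-1,n)
root, steps = find_with_steps(up, 1)
```

After these unions the forest is one chain, so Find(1) follows n - 1 pointers (visiting all n nodes) before reaching the root n.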

Union-by-Size
Union-by-size
  – Always point the smaller tree to the root of the larger tree
S-Union(7,1)
[Figure: the up-tree forest over nodes 1 to 7 before and after S-Union(7,1).]

Example Again
  1 2 3 … n
  S-Union(1,2)
  S-Union(2,3)
  :
  S-Union(n-1,n)
  Find(1): constant time
[Figure: after the unions every other node points directly at a single root.]

Analysis of Union-by-Size
• Theorem: With union-by-size an up-tree of height h has size at least 2^h.
• Proof by induction
  – Base case: h = 0. The up-tree has one node, 2^0 = 1
  – Inductive hypothesis: Assume true for h-1
  – Observation: tree gets taller only as a result of a union.
    T = S-Union(T1, T2)
  – T reaches height h only when the tree that is pointed at the other root has height h-1, so that tree has size at least 2^(h-1). The other tree is at least as large, so T has size at least 2^(h-1) + 2^(h-1) = 2^h.
[Figure: trees T1 and T2 being merged, one of height h-1 and one of height ≤ h-1.]

Analysis of Union-by-Size
• What is worst case complexity of Find(x) in an up-tree forest of n nodes?
• (Amortized complexity is no better.)

Worst Case for Union-by-Size
  n/2 Unions-by-size
  n/4 Unions-by-size
[Figure: equal-sized trees merged in pairs, round after round.]

Example of Worst Case (cont'd)
After n-1 = n/2 + n/4 + … + 1 Unions-by-size
[Figure: the resulting tree; a Find from the deepest leaf follows log2 n pointers.]
If there are n = 2^k nodes then the longest path from leaf to root has length k.
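A small Python sketch of union-by-size that replays both the "Example Again" sequence and the worst case above. The `size` array and the names `s_union`, `depth`, `chain_depth`, and `worst_case_height` are assumptions. Because S-Union needs roots, the replay calls Find first, which the slide's sequence leaves implicit.

```python
def make_forest(n):
    """Return (up, size) for nodes 1..n: every node is a root (-1) of a size-1 tree."""
    return [-1] * (n + 1), [1] * (n + 1)

def find(up, x):
    while up[x] >= 0:
        x = up[x]
    return x

def s_union(up, size, x, y):
    """Union-by-size of roots x and y: point the smaller tree at the larger root."""
    if size[x] > size[y]:
        x, y = y, x                  # make x the smaller (or equal-sized) tree
    up[x] = y
    size[y] += size[x]
    return y                         # the root of the merged tree

def depth(up, x):
    """Number of parent pointers between x and its root."""
    steps = 0
    while up[x] >= 0:
        x = up[x]
        steps += 1
    return steps

def chain_depth(n):
    """Replay S-Union(1,2), S-Union(2,3), ..., then report the depth of node 1."""
    up, size = make_forest(n)
    for i in range(1, n):
        s_union(up, size, find(up, i), find(up, i + 1))
    return depth(up, 1)

def worst_case_height(n):
    """Merge equal-sized trees in rounds (n/2 unions, n/4 unions, ...); n must be a power of two."""
    up, size = make_forest(n)
    roots = list(range(1, n + 1))
    while len(roots) > 1:
        roots = [s_union(up, size, roots[i], roots[i + 1])
                 for i in range(0, len(roots), 2)]
    return max(depth(up, x) for x in range(1, n + 1))
```

For example, `chain_depth(8)` is 1, while `worst_case_height(8)` is 3 = log2 8, matching the bound that a tree of height h has at least 2^h nodes.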
