13 A: External Algorithms II; Disjoint Sets; Java API Support

SLIDE 1

External Sorting Disjoint Sets Java API Support for Data Structures Puzzlers

13 A: External Algorithms II; Disjoint Sets; Java API Support

CS1102S: Data Structures and Algorithms

Martin Henz

April 15, 2009

Generated on Friday 17th April, 2009, 12:37 CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support 1

SLIDE 2

1. External Sorting
2. Disjoint Sets
3. Java API Support for Data Structures
4. Puzzlers

SLIDE 3

1. External Sorting
   - Model for External Sorting
   - The Simple Algorithm
   - Multiway Merge
2. Disjoint Sets
3. Java API Support for Data Structures
4. Puzzlers

SLIDE 4

Tapes as Storage

Similar to disks
- Access time is many orders of magnitude slower than main memory

Additional characteristics
- Large amounts of data can be read sequentially quite efficiently
- Access to previous locations is extremely slow, as it requires rewinding the tape!

SLIDE 5

External Sorting

Main idea
- Use tapes sequentially, and read one block from each input tape

Merge blocks
- Sort the blocks
- Use the merge procedure from mergesort to merge

SLIDE 6

The Simple Algorithm: Overview

Four tapes
- Two input tapes; two output tapes

Read and write runs
- Read runs from the input tape, sort them and write them alternately to the output tapes

Continue, writing larger runs
- Read one run from each "output" tape and merge the two on the fly, writing the merged runs alternately to the "input" tapes
- Continue until one tape has all sorted data
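The overview above can be sketched in memory. This is only a rough model, assuming tapes are replaced by lists of runs and that main memory holds `memorySize` records (at least 1) at a time. The names `SimpleExternalSort`, `makeRuns`, `mergeRuns` and `sort` are illustrative and do not come from the lecture. Writing runs alternately to two tapes and then merging the i-th run of each tape amounts to merging neighbouring runs pairwise, which is what the sketch does.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SimpleExternalSort {
    // Phase 1: cut the input into runs of at most memorySize records and sort each run in memory.
    static List<List<Integer>> makeRuns(List<Integer> input, int memorySize) {
        List<List<Integer>> runs = new ArrayList<List<Integer>>();
        for (int i = 0; i < input.size(); i += memorySize) {
            int end = Math.min(i + memorySize, input.size());
            List<Integer> run = new ArrayList<Integer>(input.subList(i, end));
            Collections.sort(run);
            runs.add(run);
        }
        return runs;
    }

    // Merge procedure from mergesort: combine two sorted runs into one sorted run.
    static List<Integer> mergeRuns(List<Integer> a, List<Integer> b) {
        List<Integer> out = new ArrayList<Integer>(a.size() + b.size());
        int i = 0, j = 0;
        while (i < a.size() && j < b.size()) {
            if (a.get(i) <= b.get(j)) out.add(a.get(i++));
            else out.add(b.get(j++));
        }
        while (i < a.size()) out.add(a.get(i++));
        while (j < b.size()) out.add(b.get(j++));
        return out;
    }

    // Phase 2: each pass merges neighbouring runs pairwise, halving the number of runs.
    static List<Integer> sort(List<Integer> input, int memorySize) {
        List<List<Integer>> runs = makeRuns(input, memorySize);
        while (runs.size() > 1) {
            List<List<Integer>> next = new ArrayList<List<Integer>>();
            for (int i = 0; i < runs.size(); i += 2) {
                if (i + 1 < runs.size()) next.add(mergeRuns(runs.get(i), runs.get(i + 1)));
                else next.add(runs.get(i));  // odd run out is carried over unchanged
            }
            runs = next;
        }
        return runs.isEmpty() ? new ArrayList<Integer>() : runs.get(0);
    }
}
```

With 13 records and `memorySize` 3, the first phase produces 5 runs, and three merging passes (5, 3, 2 runs) leave a single sorted run.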

SLIDE 7

Multiway Merge

Why only four tapes?
- If we have more than four tapes, we can take advantage of them by using multiway merge

How do we find the smallest element during the merge?
- Priority queue!

Each iteration of the inner loop
- deleteMin to find the smallest element
- insert the next element from the tape from which the deleted element came
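The inner loop described above can be sketched with `java.util.PriorityQueue`, where `poll` plays the role of deleteMin and `add` the role of insert. The class `MultiwayMerge`, its helper `Head` and the use of in-memory lists in place of tapes are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

class MultiwayMerge {
    // One queue entry: the current head of a run plus the iterator that yields its successor.
    private static final class Head implements Comparable<Head> {
        final int value;
        final Iterator<Integer> rest;

        Head(int value, Iterator<Integer> rest) {
            this.value = value;
            this.rest = rest;
        }

        @Override
        public int compareTo(Head other) {
            return Integer.compare(value, other.value);
        }
    }

    // Merge k sorted runs; each loop iteration does one deleteMin (poll) and at most one insert (add).
    static List<Integer> merge(List<List<Integer>> runs) {
        PriorityQueue<Head> queue = new PriorityQueue<Head>();
        for (List<Integer> run : runs) {
            Iterator<Integer> it = run.iterator();
            if (it.hasNext()) queue.add(new Head(it.next(), it));
        }
        List<Integer> out = new ArrayList<Integer>();
        while (!queue.isEmpty()) {
            Head smallest = queue.poll();
            out.add(smallest.value);
            // Refill from the same run ("tape") that the smallest element came from.
            if (smallest.rest.hasNext()) queue.add(new Head(smallest.rest.next(), smallest.rest));
        }
        return out;
    }
}
```

The queue never holds more than k entries, so each step costs O(log k) regardless of how long the runs are.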

SLIDE 8

Polyphase Merge and Replacement Selection

Polyphase merge: main idea
- Make use of fewer tapes by re-using tapes for reading and writing
- Leads to a tape organization using kth-order Fibonacci numbers

Replacement selection: main idea
- Make use of the input tape as output tape, reusing the tapes "on the fly"

SLIDE 9

1. External Sorting
2. Disjoint Sets
   - Equivalence Relations
   - The Dynamic Equivalence Problem
   - Basic Data Structure
   - Variants
   - Applications
3. Java API Support for Data Structures
4. Puzzlers

SLIDE 10

Equivalence Relations

Definition
An equivalence relation is a relation R on a set S that satisfies three properties:
1. (Reflexive) aRa, for all a ∈ S.
2. (Symmetric) aRb if and only if bRa.
3. (Transitive) aRb and bRc implies aRc.

Examples
- Electrical connectivity (metal wires between points)
- Cities belonging to the same country

SLIDE 11

The Dynamic Equivalence Problem

Initial setup
- Collection of N disjoint sets, each with one element

Operations
- find(a): return the set of which a is an element
- union(a, b): merge the sets to which a and b belong, so that find(a) = find(b)

SLIDE 12

Strategies

Fast Find, Slow Union
- Use array repres to store the equivalence class of each element
- find(a): return repres[a]
- union(a, b): for every element x with repres[x] = repres[b], set repres[x] to repres[a]

Fast Union, Reasonable Find
- Union/find data structure
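A minimal sketch of the fast-find, slow-union strategy, assuming elements are numbered 0 to N-1. The class name `QuickFind` is illustrative. The sketch reads `repres[b]` once before the loop, because that entry itself changes while the array is being relabelled.

```java
class QuickFind {
    private final int[] repres;  // repres[i] is the representative of i's equivalence class

    QuickFind(int n) {
        repres = new int[n];
        for (int i = 0; i < n; i++) repres[i] = i;  // initially every element is alone
    }

    // O(1): a single array lookup.
    int find(int a) {
        return repres[a];
    }

    // O(N): relabel every member of b's class with a's representative.
    void union(int a, int b) {
        int ra = repres[a];
        int rb = repres[b];
        if (ra == rb) return;
        for (int x = 0; x < repres.length; x++) {
            if (repres[x] == rb) repres[x] = ra;
        }
    }
}
```

A sequence of N - 1 unions therefore costs O(N²) in total, which motivates the forest-based structure that follows.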

SLIDE 13

Basic Data Structure

Idea
- Maintain a forest corresponding to the equivalence relation

Union
- Merge trees

Find
- Return root of tree

Observe
- Only upward direction needed!

SLIDE 14

Example

[Figures: the forest at initial setup, and after union(4, 5)]

SLIDE 15

Example

[Figures: the forest after union(4, 5), and after union(6, 7)]

SLIDE 16

Example

[Figures: the forest after union(6, 7), and after union(4, 6)]

SLIDE 17

Representation

Idea
- Remember the parent node only; mark a root with −1
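The representation can be sketched as a single `int` array, assuming elements are numbered 0 to N-1. The class name `DisjointSetsBasic` and the choice to hang the root of b's tree under the root of a's tree are assumptions of this sketch, which applies no balancing yet.

```java
import java.util.Arrays;

class DisjointSetsBasic {
    private final int[] parent;  // parent[i] == -1 marks i as the root of its tree

    DisjointSetsBasic(int n) {
        parent = new int[n];
        Arrays.fill(parent, -1);  // N singleton trees
    }

    // Follow parent links upward until a root is reached.
    int find(int x) {
        while (parent[x] != -1) x = parent[x];
        return x;
    }

    // Merge the trees containing a and b by hanging b's root under a's root.
    void union(int a, int b) {
        int ra = find(a);
        int rb = find(b);
        if (ra != rb) parent[rb] = ra;
    }
}
```

Replaying the unions from the example slides, union(4, 5), union(6, 7) and union(4, 6), leaves 4 as the root of a tree containing 4, 5, 6 and 7.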

SLIDE 18

Variants

Problem
- How to choose the root for a union? A bad choice can lead to long paths.

Union-by-size
- Always make the smaller tree a subtree of the larger tree

Analysis
- A node's depth increases only when its tree is the smaller one in a union.
- After that union, the node's tree is at least twice as large as before, so this can happen at most log N times.
- Height is therefore less than or equal to log N
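Union-by-size needs the size of each tree. One common trick, used here as an assumption rather than something stated on the slide, is to store the negated size in the root's array entry, which generalizes the −1 root marker. The class name `DisjointSetsBySize` and the `size` helper are illustrative.

```java
import java.util.Arrays;

class DisjointSetsBySize {
    // s[i] < 0: i is a root and -s[i] is the size of its tree; otherwise s[i] is i's parent.
    private final int[] s;

    DisjointSetsBySize(int n) {
        s = new int[n];
        Arrays.fill(s, -1);  // every element starts as the root of a tree of size 1
    }

    int find(int x) {
        while (s[x] >= 0) x = s[x];
        return x;
    }

    // Make the smaller tree a subtree of the larger one; on a tie, a's root stays the root.
    void union(int a, int b) {
        int ra = find(a);
        int rb = find(b);
        if (ra == rb) return;
        if (s[rb] < s[ra]) {  // more negative means larger: make ra the larger root
            int t = ra;
            ra = rb;
            rb = t;
        }
        s[ra] += s[rb];  // sizes are stored negated, so adding the entries adds the sizes
        s[rb] = ra;
    }

    int size(int x) {
        return -s[find(x)];
    }
}
```

For example, after union(4, 5), union(6, 7) and union(4, 6), calling union(3, 4) attaches the singleton 3 below root 4 rather than the other way round.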

SLIDE 19

Variants

Union-by-height
- Always make the shorter tree a subtree of the taller tree

Height
- As with union-by-size: O(log N)

SLIDE 20

Path Compression

During find, make every node on the path from the argument to the root point directly to the root.

[Figure: the forest after find(14)]

SLIDE 21

A Very Slowly Growing Function

Definition
- log∗ N is the number of times log (base 2) needs to be applied to N until the result is ≤ 1.

Examples
- log∗ 2 = 1
- log∗ 4 = 2
- log∗ 16 = 3
- log∗ 65536 = 4
- ...
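The definition can be turned into a few lines of code. The sketch below assumes base-2 logarithms and uses integer arithmetic: each intermediate result is rounded up, which does not change the count because the thresholds 1, 2, 4, 16, 65536 are themselves integers. The names `IteratedLog`, `ceilLog2` and `logStar` are illustrative.

```java
class IteratedLog {
    // Ceiling of log2(n) for n >= 1, computed without floating point to avoid rounding errors.
    static int ceilLog2(long n) {
        return n <= 1 ? 0 : 64 - Long.numberOfLeadingZeros(n - 1);
    }

    // Number of times log2 must be applied before the value drops to at most 1.
    static int logStar(long n) {
        int count = 0;
        while (n > 1) {
            n = ceilLog2(n);
            count++;
        }
        return count;
    }
}
```

`logStar(65536)` returns 4 and `logStar(65537)` returns 5; the next jump, to 6, only happens above 2^65536, far beyond any practical N.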

SLIDE 22

Runtime

Consider variant
- Union-by-height combined with path compression

Theorem
- The running time of M unions and finds is O(M log∗ N).
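A sketch of the combined variant. Once paths are compressed, the stored heights are no longer exact, so the sketch treats them as upper bounds (often called ranks). The class name `DisjointSetsCompressed`, the separate `rank` array and the helper `sameSet` are assumptions made for illustration.

```java
import java.util.Arrays;

class DisjointSetsCompressed {
    private final int[] parent;  // -1 marks a root
    private final int[] rank;    // upper bound on the height of the tree rooted at i

    DisjointSetsCompressed(int n) {
        parent = new int[n];
        rank = new int[n];
        Arrays.fill(parent, -1);
    }

    // Path compression: once the root is known, every node on the search path points to it.
    int find(int x) {
        if (parent[x] == -1) return x;
        parent[x] = find(parent[x]);
        return parent[x];
    }

    // Union-by-height: attach the shorter tree below the root of the taller one.
    void union(int a, int b) {
        int ra = find(a);
        int rb = find(b);
        if (ra == rb) return;
        if (rank[ra] < rank[rb]) {  // make ra the root with the larger rank
            int t = ra;
            ra = rb;
            rb = t;
        }
        parent[rb] = ra;
        if (rank[ra] == rank[rb]) rank[ra]++;  // equal heights: the result is one level taller
    }

    boolean sameSet(int a, int b) {
        return find(a) == find(b);
    }
}
```

The recursive `find` is short, but its recursion depth equals the path length; since union-by-height keeps paths at O(log N), that depth stays small.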

SLIDE 23

1. External Sorting
2. Disjoint Sets
3. Java API Support for Data Structures
   - Collections, Lists, Iterators
   - Trees
   - Hashing
   - PriorityQueue
   - Sorting
4. Puzzlers

SLIDE 24

The Top-level Collection Interface

    public interface Collection<Any> extends Iterable<Any> {
        int size();
        boolean isEmpty();
        void clear();
        boolean contains(Any x);
        boolean add(Any x);     // sic
        boolean remove(Any x);  // sic
        java.util.Iterator<Any> iterator();
    }

SLIDE 25

The List Interface in Collection API

    public interface List<Any> extends Collection<Any> {
        Any get(int idx);
        Any set(int idx, Any newVal);
        void add(int idx, Any x);
        void remove(int idx);
        ListIterator<Any> listIterator(int pos);
    }

SLIDE 26

ArrayList and LinkedList

    public class ArrayList<Any> implements List<Any> { ... }
    public class LinkedList<Any> implements List<Any> { ... }

SLIDE 27

Iterators

    public interface Iterator<Any> {
        boolean hasNext();
        Any next();
        void remove();
    }

SLIDE 28

ListIterators

    public interface ListIterator<Any> extends Iterator<Any> {
        boolean hasPrevious();
        Any previous();
        void add(Any x);
        void set(Any newVal);
    }
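A short usage sketch for the two iterator interfaces, written against the real `java.util` types. The class `IteratorDemo` and its two methods are made up for illustration.

```java
import java.util.Iterator;
import java.util.List;
import java.util.ListIterator;

class IteratorDemo {
    // Remove all even numbers through the iterator, which is safe during traversal.
    static void removeEvens(List<Integer> list) {
        Iterator<Integer> it = list.iterator();
        while (it.hasNext()) {
            if (it.next() % 2 == 0) it.remove();
        }
    }

    // Walk backwards from the end with a ListIterator, doubling each element in place.
    static void doubleAll(List<Integer> list) {
        ListIterator<Integer> it = list.listIterator(list.size());
        while (it.hasPrevious()) {
            int value = it.previous();
            it.set(value * 2);  // set replaces the element last returned by previous()
        }
    }
}
```

Both methods work on an `ArrayList` as well as a `LinkedList`; on a `LinkedList`, the iterator avoids the linear-time cost of repeated `get(idx)` calls.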

SLIDE 29

TreeSet

- Implements Collection
- Guarantees O(log N) time for add, remove and contains

SLIDE 30

AbstractMap<K,V>

Basic operations
- V get(K key): returns the value to which the specified key is mapped.
- V put(K key, V value): associates the specified value with the specified key in this map.

Other operations
- containsKey(key), containsValue(val), remove(key)

SLIDE 31

TreeMap

- Extends AbstractMap
- Guarantees O(log N) time for put, get, containsKey and remove
- containsValue has to search all entries and takes O(N) time
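A small usage sketch for the two tree-based classes. The class `TreeDemo` and its methods `countWords` and `distinctSorted` are hypothetical helpers. Only `get`, `put` and `add` from the real API are used.

```java
import java.util.TreeMap;
import java.util.TreeSet;

class TreeDemo {
    // Count word occurrences; TreeMap keeps the keys in sorted order.
    static TreeMap<String, Integer> countWords(String[] words) {
        TreeMap<String, Integer> counts = new TreeMap<String, Integer>();
        for (String w : words) {
            Integer old = counts.get(w);  // null if the key is absent
            counts.put(w, old == null ? 1 : old + 1);
        }
        return counts;
    }

    // Collect the distinct values; iterating over a TreeSet visits them in ascending order.
    static TreeSet<Integer> distinctSorted(int[] xs) {
        TreeSet<Integer> set = new TreeSet<Integer>();
        for (int x : xs) set.add(x);  // add returns false for duplicates and leaves the set unchanged
        return set;
    }
}
```

Each `get`, `put` and `add` call costs O(log N), so both methods run in O(N log N) overall.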

SLIDE 32

HashMap

- Extends AbstractMap
- Uses separate chaining with rehashing
- Rehashing is governed by initial capacity and load factor, set in the constructor
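The two constructor parameters can be seen in a short sketch. The values 64 and 0.75f and the class `HashMapDemo` are arbitrary choices for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class HashMapDemo {
    static Map<String, Integer> build() {
        // Initial capacity 64, load factor 0.75: the table is rehashed once the
        // number of entries exceeds capacity * load factor.
        Map<String, Integer> ages = new HashMap<String, Integer>(64, 0.75f);
        ages.put("ann", 31);
        ages.put("bob", 27);
        ages.put("ann", 32);  // put on an existing key replaces the old value
        ages.remove("bob");
        return ages;
    }
}
```

A larger initial capacity avoids early rehashing at the cost of memory; a smaller load factor shortens the chains at the cost of more frequent rehashing.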

SLIDE 33

HashSet

Implements Collection using HashMap

SLIDE 34

PriorityQueue

- Implements Collection
- Efficient implementation of heap data structure
- Operation names:
  - deleteMin is called "poll"
  - insert is called "add"
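To make the naming concrete, here is a small sketch that sorts an array by inserting everything into a `PriorityQueue` and then repeatedly removing the minimum. The class `PriorityQueueDemo` and the method `sortedCopy` are illustrative names.

```java
import java.util.PriorityQueue;

class PriorityQueueDemo {
    // Sort via the library heap: add is insert, poll is deleteMin.
    static int[] sortedCopy(int[] input) {
        PriorityQueue<Integer> heap = new PriorityQueue<Integer>();
        for (int x : input) heap.add(x);
        int[] out = new int[input.length];
        for (int i = 0; i < out.length; i++) out[i] = heap.poll();
        return out;
    }
}
```

Note that `poll` returns `null` on an empty queue instead of throwing, so the loop above relies on polling exactly as many times as elements were added.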

SLIDE 35

Sorting

Generic sorting
- Supported by class Collections
- Uses mergesort in order to minimize the number of comparisons

Sorting of built-in numerical types
- Supported by class Arrays
- Uses an efficient implementation of quicksort, to take advantage of its tight inner loop
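Both entry points in a short sketch. The class `SortingDemo` and its methods are hypothetical. `Arrays.sort` and `Collections.sort` are the real library calls, and `Collections.sort` is documented to be stable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class SortingDemo {
    // Primitive array: sorted by class Arrays.
    static int[] sortPrimitives(int[] xs) {
        int[] copy = xs.clone();
        Arrays.sort(copy);
        return copy;
    }

    // Generic list: sorted by class Collections with a comparator.
    // The sort is stable, so words of equal length keep their original order.
    static List<String> sortByLength(List<String> words) {
        List<String> copy = new ArrayList<String>(words);
        Collections.sort(copy, new Comparator<String>() {
            @Override
            public int compare(String a, String b) {
                return Integer.compare(a.length(), b.length());
            }
        });
        return copy;
    }
}
```

Stability only matters for objects, where equal keys can belong to distinguishable elements; for primitive values it is irrelevant, which leaves the library free to use a different algorithm there.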

SLIDE 36

1. External Sorting
2. Disjoint Sets
3. Java API Support for Data Structures
4. Puzzlers

SLIDE 37

Last Puzzler: Package Deal

    package click;

    public class CodeTalk {
        public void doIt() {
            printMessage();
        }

        void printMessage() {
            System.out.println("Click");
        }
    }

SLIDE 38

Last Puzzler: Package Deal

    package hack;

    import click.CodeTalk;

    public class TypeIt {
        private static class ClickIt extends CodeTalk {
            void printMessage() {
                System.out.println("Hack");
            }
        }

        public static void main(String[] args) {
            ClickIt clickit = new ClickIt();
            clickit.doIt();
        }
    }

What does clickit.doIt() print? "Click" or "Hack"?

SLIDE 39

Java’s Access Modifiers

- public: available wherever the class is available
- private: only available within the class
- protected: available in subclasses and within the same package
- none: available within the same package

SLIDE 40

Access Modifiers Govern Inheritance

Overriding only available methods
- A method can be overridden only when it is available according to the modifier rules.

Package visibility
- Method printMessage can only be overridden within package click.

Result
- "Click" is printed.

SLIDE 41

This Week and Beyond

- Thursday tutorial: Assignment 9
- Friday lecture: CS1102S summary, outlook; questions?
- Next week: Reading week, consultation by appointment
- 27/4 and 28/4: no consultation
- 29/4, 5pm: Final