Amortized Analysis and Union-Find, 02283, Inge Li Gørtz



SLIDE 1

Amortized Analysis and Union-Find

02283, Inge Li Gørtz

SLIDE 2

Today

  • Amortized analysis
  • 3 different methods
  • 2 examples
  • Union-Find data structures
  • Worst-case complexity
  • Amortized complexity

SLIDE 3

Amortized Analysis

  • Amortized analysis.
  • Average running time per operation over a worst-case sequence of operations.
  • Time required to perform a sequence of data operations is averaged over all the operations performed.
  • Motivation: traditional worst-case-per-operation analysis can give too pessimistic a bound if the only way of having an expensive operation is to have a lot of cheap ones before it.
  • Different from average-case analysis: average over time, not over inputs.

SLIDE 4

Amortized Analysis

  • Methods.
  • Aggregate method
  • Accounting method
  • Potential method

SLIDE 5

Aggregate method

  • Aggregate.
  • Determine total cost.
  • Amortized cost = total cost/#operations.

SLIDE 6

Dynamic Tables

  • Doubling strategy.
  • Start with empty array of size 1.
  • Insert: If the array is full, create a new array of double the size and reinsert all elements.
  • Analysis: n insert operations. Assume n is a power of 2.
  • Number of insertions: 1 + 2 + 4 + ... + 2^{log n} = O(n).
  • Total cost: O(n).
  • Amortized cost per insert: O(1).
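
The doubling strategy can be sketched in Python as follows. The class name `DynamicTable` and the `copies` counter are illustrative assumptions, not part of the lecture; the counter exists only to make the aggregate bound visible.

```python
class DynamicTable:
    """Array that doubles its capacity when full (illustrative sketch)."""

    def __init__(self):
        self.capacity = 1                     # start with an empty array of size 1
        self.slots = [None] * self.capacity
        self.size = 0                         # number of stored elements
        self.copies = 0                       # elements reinserted during doublings

    def insert(self, value):
        if self.size == self.capacity:
            # Table is full: allocate double the space and reinsert everything.
            new_slots = [None] * (2 * self.capacity)
            for i in range(self.size):
                new_slots[i] = self.slots[i]
                self.copies += 1
            self.slots = new_slots
            self.capacity *= 2
        self.slots[self.size] = value
        self.size += 1


n = 1024                                      # a power of 2, as in the analysis
table = DynamicTable()
for i in range(n):
    table.insert(i)

total_cost = n + table.copies                 # n plain writes plus all reinsertions
print(table.copies, total_cost, total_cost / n)
```

For n = 1024 the reinsertions sum to 1 + 2 + ... + 512 = 1023, so the total cost stays below 3n and the cost per insert is bounded by a constant.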

SLIDE 7

Accounting method

  • Accounting.
  • Some types of operations are overcharged.
  • Credit allocated with elements in the data structure is used to pay for subsequent operations.
  • Total credit non-negative at all times -> total amortized cost is an upper bound on the actual cost.

SLIDE 8

Dynamic Tables

  • Amortized costs:
  • Amortized cost of insertion: 3
  • 1 for own insertion
  • 1 for its first reinsertion.
  • 1 to pay for reinsertion of one of the items that have already been reinserted once.

SLIDE 9

Dynamic Tables

  • Analysis: keep 2 credits on each element in the array that is beyond the middle.
  • table not full: insert costs 1, and we have 2 credits to save.
  • table full, i.e., doubling: half of the elements have 2 credits each. Use these to pay for reinsertion of all elements in the new array.

  • Amortized cost per operation: 3.

[Figure: example arrays in which each element beyond the middle holds 2 credits.]

SLIDE 10

Example: Stack with MultiPop

  • Stack with MultiPop.
  • Push(e): push element e onto stack.
  • MultiPop(k): pop top k elements from the stack
  • Worst case: Implement via linked list or array.
  • Push: O(1).
  • MultiPop: O(k).
  • Amortized cost per operation: 2.

SLIDE 11

Stack: Aggregate Analysis

  • Amortized analysis. Sequence of n Push and MultiPop operations.
  • Each object popped at most once for each time it is pushed.
  • #pops on non-empty stack ≤ #Push operations ≤ n.
  • Total time O(n).
  • Amortized cost per operation: 2n/n = 2.
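
A small simulation can illustrate the aggregate argument. This is a sketch under assumptions: the class name `MultiPopStack`, the `cost` counter, and the particular operation sequence are hypothetical and chosen only for illustration.

```python
class MultiPopStack:
    """Stack with MultiPop that counts elementary steps (illustrative sketch)."""

    def __init__(self):
        self.items = []
        self.cost = 0                  # total actual cost so far

    def push(self, e):
        self.items.append(e)
        self.cost += 1

    def multipop(self, k):
        popped = []
        while k > 0 and self.items:
            popped.append(self.items.pop())
            self.cost += 1
            k -= 1
        if not popped:
            self.cost += 1             # a MultiPop that pops nothing still costs 1
        return popped


stack = MultiPopStack()
ops = 0
for _ in range(100):
    for e in range(9):                 # nine cheap pushes ...
        stack.push(e)
        ops += 1
    stack.multipop(1000)               # ... then one expensive MultiPop
    ops += 1

print(stack.cost, ops, stack.cost / ops)
```

Each MultiPop here costs 9, yet the total over 1000 operations is 1800, which respects the bound of 2 per operation.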

SLIDE 12

Stack: Accounting Method

  • Amortized analysis. Sequence of n Push and MultiPop operations.
  • Pay 2 credits for each Push.
  • Keep 1 credit on each element on the stack.
  • Amortized cost per operation:
  • Push: 2
  • MultiPop: 1 (to pay for pop on empty stack).

SLIDE 13

Potential method

  • Potential functions.
  • Prepaid credit (potential) associated with the data structure (money in the bank).
  • Can be used to pay for future operations.
  • Ensure there is always enough “money in the bank”.
  • Amortized cost of an operation: actual cost plus increase in potential due to the operation.
  • D_i: data structure after i operations.
  • Potential function Φ(D_i) maps D_i onto a real value.
  • amortized cost = actual cost + Δ(D_i) = actual cost + Φ(D_i) − Φ(D_{i−1}).

SLIDE 14

Potential Functions

  • Amortized cost:
  • amortized cost = actual cost + Δ(D_i) = actual cost + Φ(D_i) − Φ(D_{i−1}).
  • Stack.
  • Φ(D_i) = #elements on the stack.
  • amortized cost of Push = 1 + Δ(D_i) = 2.
  • amortized cost of MultiPop(k), where k' = min(k, |S|) elements are popped:
  • if S ≠ ∅: amortized cost = k' + Φ(D_i) − Φ(D_{i−1}) = k' − k' = 0.
  • if S = ∅: amortized cost = 1 + Δ(D_i) = 1.
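
The stack potential can be checked mechanically. The sketch below computes actual and amortized costs for a sequence of operations using Φ = number of elements on the stack; the function name `amortized_costs` and the tuple encoding of operations are assumptions made for this example.

```python
def amortized_costs(operations):
    """Return (actual, amortized) cost pairs using Phi = number of stacked elements.

    `operations` is a list of ("push", e) or ("multipop", k) tuples.
    """
    stack = []
    result = []
    for name, arg in operations:
        phi_before = len(stack)
        if name == "push":
            stack.append(arg)
            actual = 1
        else:
            popped = min(arg, len(stack))
            del stack[len(stack) - popped:]   # remove the top `popped` elements
            actual = max(popped, 1)           # popping nothing still costs 1
        phi_after = len(stack)
        result.append((actual, actual + phi_after - phi_before))
    return result


ops = [("push", 1), ("push", 2), ("push", 3),
       ("multipop", 2), ("multipop", 5), ("multipop", 1)]
costs = amortized_costs(ops)
print(costs)
```

In this run each Push has amortized cost 2, each MultiPop on a non-empty stack has amortized cost 0, and the final MultiPop on the empty stack has amortized cost 1.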

SLIDE 15

Potential Functions

  • Amortized cost:
  • amortized cost = actual cost + Δ(D_i) = actual cost + Φ(D_i) − Φ(D_{i−1}).
  • Dynamic tables
  • Φ(D_i) = 2(k − L/2) if k ≥ L/2, and 0 otherwise.
  • L = current array size, k = number of elements in array.
  • amortized cost of insertion:
  • Array not full: amortized cost = 1 + 2 = 3
  • Array full (doubling): Actual cost = L + 1, Φ(D_{i−1}) = L, Φ(D_i) = 2: amortized cost = L + 1 + (2 − L) = 3.
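
The potential function for dynamic tables, 2(k − L/2) when k ≥ L/2 and 0 otherwise, can be evaluated along a run of insertions. This is a minimal sketch; the helper names `phi` and `insert_costs` are hypothetical, and the potential is written as max(2k − L, 0), which is the same piecewise function.

```python
def phi(k, L):
    """Potential of a table holding k elements in an array of size L."""
    return max(2 * k - L, 0)          # equals 2(k - L/2) if k >= L/2, else 0


def insert_costs(n):
    """Amortized cost of each of n insertions into an initially empty table."""
    L, k = 1, 0                       # array size and number of elements
    costs = []
    for _ in range(n):
        before = phi(k, L)
        if k == L:
            actual = L + 1            # L reinsertions plus the new element
            L *= 2
        else:
            actual = 1
        k += 1
        costs.append(actual + phi(k, L) - before)
    return costs


print(insert_costs(6))
print(max(insert_costs(1000)))
```

No insertion has amortized cost above 3, whether or not it triggers a doubling; the very first insertion is cheaper because the table starts with potential 0.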

SLIDE 16

Amortized Cost vs Actual Cost

  • Total cost:
  • ∑ amortized cost = ∑ (actual cost + Δ(D_i)) = ∑ actual cost + Φ(D_n) − Φ(D_0).
  • ∑ actual cost = ∑ amortized cost + Φ(D_0) − Φ(D_n).
  • If the potential is always nonnegative and Φ(D_0) = 0, then ∑ actual cost ≤ ∑ amortized cost.

SLIDE 17

Potential Method

  • Summary:
  • 1. Pick a potential function, Φ, that will work (art).
  • 2. Use potential function to bound the amortized cost of the operations you're interested in.
  • 3. Bound Φ(D_0) − Φ(D_final).
  • Techniques to find potential functions: if the actual cost of an operation is high, then the decrease in potential due to this operation must be large, to keep the amortized cost low.

SLIDE 18

Union-Find Data Structures

SLIDE 19

Union-Find Data Structure

  • Union-Find data structure:
  • Makeset(x): Create a singleton set containing x and return its identifier.
  • Union(A,B): Combine the sets identified by A and B into a new set, destroying the old sets. Return the identifier of the new set.

  • Find(x): Return the identifier of the set containing x.
  • Only requirement for identifier: find(x) = find(y) iff x and y are in the same set.
  • Applications: Connectivity, Kruskal’s algorithm for MST, ...

SLIDE 20

A Simple Union-Find Data Structure

  • Quick-Union:
  • Each set represented by a tree. Elements are represented by nodes. Root is also identifier.

  • Make-Set(x): Create a new node x. Set p(x) = x.
  • Find(x): Follow parent pointers to the root. Return the root.
  • Union(A,B): Make root(B) a child of root(A).

SLIDE 21

A Simple Union-Find Data Structure

  • Quick-Union:
  • Union(A,B): Make root(B) a child of root(A).

[Figure: forest over elements 1 to 9, initially and after each of Union(3,1), Union(7,5), Union(7,8), Union(3,7).]

SLIDE 22

A Simple Union-Find Data Structure

  • Quick-Union:
  • Each set represented by a tree. Elements are represented by nodes. Root is also identifier.

  • Make-Set(x): Create a new node x. Set p(x) = x.
  • Find(x): Follow parent pointers to the root. Return the root.
  • Union(A,B): Make root(B) a child of root(A).
  • Analysis:
  • Make-Set(x) and Union(A,B): O(1)
  • Find(x): O(h), where h is the height of the tree containing x. Worst-case O(n).
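
A minimal Python sketch of Quick-Union, assuming a dictionary of parent pointers (the class name `QuickUnion` is hypothetical). Unlike the slide's Union(A,B), which receives set identifiers, this sketch calls `find` first so that it also accepts arbitrary elements; when given roots, those calls return immediately and Union stays O(1).

```python
class QuickUnion:
    """Each set is a tree; the root is the set identifier (illustrative sketch)."""

    def __init__(self):
        self.parent = {}

    def make_set(self, x):
        self.parent[x] = x                 # p(x) = x marks x as a root
        return x

    def find(self, x):
        while self.parent[x] != x:         # follow parent pointers to the root
            x = self.parent[x]
        return x

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a   # make root(B) a child of root(A)
        return root_a


uf = QuickUnion()
for x in range(1, 10):
    uf.make_set(x)
for a, b in [(3, 1), (7, 5), (7, 8), (3, 7)]:
    uf.union(a, b)

print(uf.find(8), uf.find(2))
```

After the four unions, element 8 reaches root 3 through 7, while element 2 is still a singleton.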

SLIDE 23

A Simple Union-Find Data Structure

  • Quick Find:
  • Each set represented by a tree of height at most one. Elements are represented by nodes. Root is also identifier.
  • Make-Set(x): Create a new node x. Set p(x) = x and size(x) = 1.
  • Find(x): Follow parent pointer to root. Return root.
  • Union(A,B): Move all elements from the smaller set to the larger set (change parent pointers). I.e., if B is the smaller set, set p(x) = A for every element x of B and size(A) = size(A) + size(B).

SLIDE 24

A Simple Union-Find Data Structure

  • Quick Find:
  • Union(A,B): Move all elements from the smaller set to the larger set (change parent pointers).

[Figure: trees over elements 1 to 9, initially and after each of Union(3,1), Union(7,5), Union(7,8), Union(3,7).]

SLIDE 25

A Simple Union-Find Data Structure

  • Quick Find:
  • Each set represented by a tree of height at most one. Elements are represented by nodes. Root is also identifier.
  • Make-Set(x): Create a new node x. Set p(x) = x and size(x) = 1.
  • Find(x): Follow parent pointer to root. Return root.
  • Union(A,B): Move all elements from the smaller set to the larger set (change parent pointers). I.e., if B is the smaller set, set p(x) = A for every element x of B and size(A) = size(A) + size(B).

  • Analysis:
  • Make-Set(x) and Find(x): O(1)
  • Union(A,B): O(n)
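
Quick-Find might be sketched as below. The per-root `members` list is an assumption of this example: it gives Union a way to enumerate the smaller set, and its length plays the role of size(x). The class and attribute names are hypothetical.

```python
class QuickFind:
    """Trees of height at most one: every element points at its root (sketch)."""

    def __init__(self):
        self.leader = {}       # p(x) for every element x
        self.members = {}      # root -> list of the elements in its set

    def make_set(self, x):
        self.leader[x] = x
        self.members[x] = [x]
        return x

    def find(self, x):
        return self.leader[x]                 # a single pointer lookup

    def union(self, a, b):
        a, b = self.find(a), self.find(b)
        if a == b:
            return a
        if len(self.members[a]) < len(self.members[b]):
            a, b = b, a                       # let a denote the larger set
        for x in self.members[b]:
            self.leader[x] = a                # change parent pointers of smaller set
        self.members[a].extend(self.members.pop(b))
        return a


qf = QuickFind()
for x in range(1, 10):
    qf.make_set(x)
for a, b in [(3, 1), (7, 5), (7, 8), (3, 7)]:
    qf.union(a, b)

print(qf.find(1), sorted(qf.members[qf.find(1)]))
```

In the last union the set {3, 1} is smaller than {7, 5, 8}, so its two elements are moved and 7 becomes the identifier of the merged set.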

SLIDE 26

Amortized Complexity of Quick-Find

  • Amortized analysis: Consider a sequence of k Unions.
  • Observation 1: How many elements can be touched by the k Unions?
  • Consider an element x:
  • What can we say about the size of the set containing x before and after a union that changes x’s parent pointer?
  • How large can the set containing x be after the k Unions?
  • How many times can x’s parent pointer be changed?

SLIDE 27

Amortized Complexity of Quick-Find

  • Amortized analysis:
  • Each time x’s parent pointer changes, the size of the set containing it at least doubles.
  • At most 2k elements can be touched by k unions.
  • Size of set containing x after k unions at most 2k.
  • x’s parent pointer is updated at most lg(2k) times.
  • In total O(k log k) parent pointers updated in a sequence of k unions.
  • Amortized time per union: O(log k).
  • Lemma. Using the Quick-Find data structure a Find operation takes worst case time O(1), a Make-Set operation time O(1), and a sequence of n Union operations takes time O(n log n).
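
The doubling argument can be observed experimentally. The sketch below merges n singletons in balanced rounds, always moving the smaller set, and counts parent-pointer updates; the function name `count_pointer_updates` and the pairing schedule are assumptions chosen for illustration.

```python
import math


def count_pointer_updates(n):
    """Union n singletons pairwise, moving the smaller set; count pointer updates."""
    leader = {x: x for x in range(n)}
    members = {x: [x] for x in range(n)}
    updates = 0
    roots = list(range(n))
    while len(roots) > 1:
        next_roots = []
        for i in range(0, len(roots) - 1, 2):
            a, b = roots[i], roots[i + 1]
            if len(members[a]) < len(members[b]):
                a, b = b, a                   # a is the larger set
            for x in members[b]:
                leader[x] = a
                updates += 1
            members[a].extend(members.pop(b))
            next_roots.append(a)
        if len(roots) % 2 == 1:
            next_roots.append(roots[-1])      # odd one out waits for the next round
        roots = next_roots
    return updates


n = 1024
updates = count_pointer_updates(n)
bound = n * math.log2(n)
print(updates, bound)
```

With n = 1024 every round performs 512 pointer updates over 10 rounds, for 5120 updates in total, within the n lg n bound.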

SLIDE 28

A Better Union-Find Data Structure

  • Union-by-Weight or Union-by-Rank.
  • Union-by-Weight. Make the root of the smaller tree a child of the root of the bigger tree.

  • Union-by-Rank. Each node x has an integer rank(x) associated.
  • Make-Set(x): Create a new node x. Set p(x) = x and rank(x) = 0.
  • Find(x): Follow parent pointers to the root. Return the root.
  • Union(A,B): 3 cases:
  • rank(A) > rank(B). Make B a child of A.
  • rank(A) < rank(B). Make A a child of B.
  • rank(A) = rank(B). Make B a child of A and set rank(A) = rank(A)+1.
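
The three Union-by-Rank cases could be written as follows. This is a sketch, not a definitive implementation: the class name `UnionByRank` is hypothetical, and `union` calls `find` first so that it accepts arbitrary elements rather than only set identifiers.

```python
class UnionByRank:
    """Quick-Union where the root of lower rank is linked below the other (sketch)."""

    def __init__(self):
        self.parent = {}
        self.rank = {}

    def make_set(self, x):
        self.parent[x] = x
        self.rank[x] = 0
        return x

    def find(self, x):
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def union(self, a, b):
        a, b = self.find(a), self.find(b)
        if a == b:
            return a
        if self.rank[a] < self.rank[b]:
            a, b = b, a                   # case rank(A) < rank(B): A goes below B
        self.parent[b] = a                # now rank(a) >= rank(b): b goes below a
        if self.rank[a] == self.rank[b]:
            self.rank[a] += 1             # equal ranks: the new root's rank grows
        return a


uf = UnionByRank()
for x in range(8):
    uf.make_set(x)
for a, b in [(0, 1), (2, 3), (0, 2), (4, 5), (0, 4)]:
    uf.union(a, b)

print(uf.find(5), uf.rank[0])
```

The root ends with rank 2 while its set holds six elements, which agrees with the invariant that a root of rank r has at least 2^r elements below it.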

SLIDE 29

Analysis of Union-by-Rank

  • Increasing ranks.
  • rank(x) < rank(p(x)).
  • A root of set containing x: Find(x) takes O(rank(A)+1) time.
  • rank(A) ≤ lg n: Show |A| ≥ 2^{rank(A)} by induction.
  • A=Makeset(x): rank(A)=0 and |A| = 2^0 = 1.
  • A=Union(B,C): 2 cases
  • rank(A)=rank(B) or rank(A)=rank(C): ok, since set only got larger.
  • rank(B)=rank(C)=k and rank(A)=k+1: |A| = |B| + |C| ≥ 2^k + 2^k = 2^{k+1}.
  • Lemma. Using the Union-by-Rank data structure a Find operation takes worst case time O(log n), and a Make-Set or Union operation takes time O(1).

SLIDE 30

Path Compression

  • Path compression. After each Find(x) operation, for all nodes y on the path from x to the root, set p(y) to the root.
  • Example. Find(6).

[Figure: tree over elements 1 to 9 before and after Find(6) with path compression.]

SLIDE 31

Path Compression

  • Path compression. After each Find(x) operation, for all nodes y on the path from x to the root, set p(y) to the root.
  • Tarjan: Union-by-Rank and path compression. Starting from an empty data structure, n Make-Set operations, at most n Union operations, and m Find operations take total time O(n + m α(n)).
  • α(n): the extremely slowly growing inverse of Ackermann’s function.
  • Analysis complicated, but algorithm simple.
  • Two one-pass variants: path halving and path splitting. Same asymptotic running time.
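
A sketch of Find with path compression, together with the one-pass path-halving variant. The class name `UnionFind` and the method name `find_halving` are hypothetical, and the parent pointers in the usage lines are set by hand purely to create long paths for the demonstration.

```python
class UnionFind:
    """Union-by-rank with path compression (illustrative sketch)."""

    def __init__(self):
        self.parent = {}
        self.rank = {}

    def make_set(self, x):
        self.parent[x] = x
        self.rank[x] = 0
        return x

    def find(self, x):
        root = x
        while self.parent[root] != root:      # first pass: locate the root
            root = self.parent[root]
        while self.parent[x] != root:         # second pass: repoint the path
            next_x = self.parent[x]
            self.parent[x] = root
            x = next_x
        return root

    def find_halving(self, x):
        # Path halving: every other node on the path skips to its grandparent.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        a, b = self.find(a), self.find(b)
        if a == b:
            return a
        if self.rank[a] < self.rank[b]:
            a, b = b, a
        self.parent[b] = a
        if self.rank[a] == self.rank[b]:
            self.rank[a] += 1
        return a


# Hand-built chain 4 -> 3 -> 2 -> 1 to demonstrate compression.
uf = UnionFind()
uf.parent = {1: 1, 2: 1, 3: 2, 4: 3}
uf.rank = {1: 3, 2: 2, 3: 1, 4: 0}
root = uf.find(4)

# Hand-built chain 5 -> 4 -> 3 -> 2 -> 1 to demonstrate halving.
uh = UnionFind()
uh.parent = {1: 1, 2: 1, 3: 2, 4: 3, 5: 4}
uh.rank = {1: 4, 2: 3, 3: 2, 4: 1, 5: 0}
root_h = uh.find_halving(5)

print(root, uf.parent)
print(root_h, uh.parent)
```

Full compression makes every node on the path a child of the root, whereas halving only shortens the path by roughly half per Find but needs a single pass.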

SLIDE 32

Ackermann’s function

  • Ackermann’s function: A_k(j) = j + 1 if k = 0, and A_k(j) = A_{k−1}^{(j+1)}(j) if k ≥ 1, where the superscript (j+1) denotes applying A_{k−1} j + 1 times.
  • Inverse Ackermann: α(n) = min{k : A_k(1) ≥ n}.
  • Grows extremely slowly.
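
To get a feeling for the growth, the definition A_0(j) = j + 1, A_k(j) = A_{k−1} applied j + 1 times to j, can be evaluated directly for small arguments. This is a sketch under an explicit assumption: A_4(1) is far too large to compute, so `alpha` simply returns 4 for every n above A_3(1), which is only valid for n up to A_4(1).

```python
def ackermann(k, j):
    """A_k(j): A_0(j) = j + 1, and A_k(j) applies A_{k-1} to j a total of j + 1 times."""
    if k == 0:
        return j + 1
    value = j
    for _ in range(j + 1):
        value = ackermann(k - 1, value)
    return value


def alpha(n):
    """min{k : A_k(1) >= n}; assumes n <= A_4(1), which is astronomically large."""
    for k in range(4):
        if ackermann(k, 1) >= n:
            return k
    return 4      # assumption: n does not exceed A_4(1)


print([ackermann(k, 1) for k in range(4)])
print([alpha(n) for n in (2, 3, 7, 8, 2047, 2048, 10**9)])
```

The values A_0(1), ..., A_3(1) are 2, 3, 7 and 2047, so α(n) is at most 4 for any input size that could occur in practice.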

SLIDE 33

Path Compression: Analysis

  • Potential function
  • Potential of node x after i operations: φ_i(x).
  • Potential of forest after i operations: Φ_i = ∑_x φ_i(x).
  • Auxiliary functions (for x, where rank(x) > 0):
  • level(x) = max{k : rank(p(x)) ≥ A_k(rank(x))}.
  • iter(x) = max{j : rank(p(x)) ≥ A_{level(x)}^{(j)}(rank(x))}.
  • Properties:
  • (1) 0 ≤ level(x) < α(n).
  • (2) 1 ≤ iter(x) ≤ rank(x).

SLIDE 34

Path Compression: Analysis

  • Potential function:
  • φ_i(x) = α(n) · rank(x) if x is a root or rank(x) = 0.
  • φ_i(x) = (α(n) − level(x)) · rank(x) − iter(x) if x is not a root and rank(x) ≥ 1.
  • Potential of forest after i operations: Φ_i = ∑_x φ_i(x).
  • Properties:
  • (1) 0 ≤ level(x) < α(n).
  • (2) 1 ≤ iter(x) ≤ rank(x).
  • (3) 0 ≤ φ_i(x) ≤ α(n) · rank(x).
  • (4) If x is not a root and rank(x) > 0, then φ_i(x) < α(n) · rank(x).

SLIDE 35

Path Compression: Analysis

  • Potential function:
  • φ_i(x) = α(n) · rank(x) if x is a root or rank(x) = 0.
  • φ_i(x) = (α(n) − level(x)) · rank(x) − iter(x) if x is not a root and rank(x) ≥ 1.
  • Potential of forest after i operations: Φ_i = ∑_x φ_i(x).
  • Amortized costs:
  • Makeset(x): O(1)
  • Union(A,B) and Find(x): O(α(n))
  • Lemma 1. Suppose x not root, and ith operation Union or Find. Then
  • x’s potential cannot increase
  • if x has positive rank and iter or level changes, then x’s potential decreases by at least 1.

SLIDE 36

Path Compression: Analysis

  • Lemma 1. Suppose x not root, and ith operation Union or Find. Then
  • x’s potential cannot increase
  • if x has positive rank and iter or level changes, then x’s potential decreases by at least 1.
  • Proof:
  • x not root => rank(x) unchanged
  • n does not change => α(n) unchanged
  • rank(x) = 0: ok

SLIDE 37

Path Compression: Analysis

  • Lemma 1. Suppose x not root, and ith operation Union or Find. Then
  • x’s potential cannot increase
  • if x has positive rank and iter or level changes, then x’s potential decreases by at least 1.
  • Proof: rank(x) > 0
  • level increases monotonically over time
  • level unchanged => iter either increases or is unchanged
  • both unchanged: ok

SLIDE 38

Path Compression: Analysis

  • Lemma 1. Suppose x not root, and ith operation Union or Find. Then
  • x’s potential cannot increase
  • if x has positive rank and iter or level changes, then x’s potential decreases by at least 1.
  • Proof: rank(x) > 0
  • level unchanged and iter increases: iter increases by at least 1, so the potential decreases by at least 1.
  • level increases:
  • level increases by at least 1 => (α(n) − level(x)) · rank(x) drops by at least rank(x).
  • iter might decrease by at most rank(x) − 1.
  • x’s potential decreases by at least 1.

SLIDE 39

Path Compression: Analysis

  • Lemma 1. Suppose x not root, and ith operation Union or Find. Then
  • x’s potential cannot increase
  • if x has positive rank and iter or level changes, then x’s potential decreases by at least 1.
  • Lemma. The amortized cost of Union(A,B) is O(α(n)).
  • Assume B made parent of A.
  • Real cost 1. Show increase in potential at most α(n).
  • Only A’s, B’s and the children of B’s potential can change.
  • potential of B’s children can only decrease.
  • A: decreases due to property 4 (before operation it was α(n) · rank(A)).
  • B: root both before and after. rank increases by at most 1 => potential of B increases by at most α(n).

SLIDE 40

Path Compression: Analysis

  • Lemma 1. Suppose x not root, and ith operation Union or Find. Then
  • x’s potential cannot increase
  • if x has positive rank and iter or level changes, then x’s potential decreases by at least 1.
  • Lemma. The amortized cost of Find(x) is O(α(n)).
  • Real cost s = length of path.
  • Show decrease in potential at least max{0, s − (α(n) + 2)}.
  • No node’s potential increases (Lemma 1 + rank of root unchanged).
  • Show at least max{0, s − (α(n) + 2)} nodes’ potentials decrease by at least 1.
  • Amortized cost = s − max{0, s − (α(n) + 2)} = O(α(n)).

SLIDE 41

Path Compression: Analysis

  • Show at least max{0, s − (α(n) + 2)} nodes’ potentials decrease by at least 1.
  • x node such that
  • rank(x) > 0
  • x has an ancestor y (not the root) with level(x) = level(y) before the Find operation.
  • x’s potential decreases by at least 1: k = level(x), i = iter(x) before Find.
  • Before Find operation: rank(p(y)) ≥ A_k(rank(y)) ≥ A_k(rank(p(x))) ≥ A_k(A_k^{(i)}(rank(x))) = A_k^{(i+1)}(rank(x)).
  • After: rank(p(x)) = rank(p(y)), rank(p(y)) not decreased, and rank(x) unchanged: rank(p(x)) ≥ A_k^{(i+1)}(rank(x)).
  • Either iter(x) or level(x) increases. Lemma 1 implies potential of x decreases.

SLIDE 42

Summary

  • Amortized analysis.
  • 2 Examples: Dynamic tables, stack with multipop.
  • Union-Find Data Structure.
  • Union-by-Rank + path compression: worst case + amortized bounds.
