
Randomized algorithms

Inge Li Gørtz

1

  • Today
  • What are randomized algorithms?
  • Properties of randomized algorithms
  • Three examples:
  • Median/Select.
  • Quick-sort
  • Closest pair of points

Randomized algorithms

2

Randomized Algorithms

3

  • So far we dealt with deterministic algorithms:
  • Giving the same input to the algorithm repeatedly results in:
  • The same running time.
  • The same output.
  • Randomized algorithm:
  • Can make random choices (using a random number generator)

Randomized Algorithms

4

  • A randomized algorithm has access to a random number generator rand(a, b), where a, b are integers, a < b.
  • rand(a, b) returns a number x ∈ {a, a + 1, …, b − 1, b}.
  • Each of the b − a + 1 numbers is equally likely.
  • Calling rand(a, b) again results in a new, possibly different number.
  • rand(a, b) “has no memory”: the current number does not depend on the results of previous calls.
  • Statistically speaking: rand(a, b) generates numbers independently according to the uniform distribution.
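A minimal sketch of such a generator in Python, assuming the standard library's `random.randint` is an acceptable stand-in for rand(a, b). The helper `empirical_frequencies` is a hypothetical name introduced here only to illustrate the "equally likely" property.

```python
import random
from collections import Counter

def rand(a: int, b: int) -> int:
    """Return an integer drawn uniformly from {a, a+1, ..., b} (both ends included)."""
    if a >= b:
        raise ValueError("rand(a, b) requires a < b")
    return random.randint(a, b)  # randint includes both endpoints

def empirical_frequencies(a: int, b: int, trials: int = 60_000, seed: int = 0) -> dict:
    """Hypothetical helper: observed relative frequency of each value of rand(a, b)."""
    random.seed(seed)  # fixed seed so the experiment is repeatable
    counts = Counter(rand(a, b) for _ in range(trials))
    return {x: counts[x] / trials for x in range(a, b + 1)}
```

Calling `empirical_frequencies(1, 3)` should give three frequencies close to 1/3 each.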

Random number generator

5

  • The algorithm can make random decisions using rand

Randomized algorithms

Running
  while (rand(0,1) = 0) do
    print(“running”);
  end while
  print(“stopped”);

Julefrokost
  while (rand(1,4) = rand(1,4)) do
    have another Christmas beer;
  end while
  go home;

6

  • A randomized algorithm might never stop.
  • Example: Find the position of 7 in a three-element array containing the numbers 4, 3 and 7.

Properties of randomized algorithms

Find7(A[1…3])
  goon := true;
  while (goon) do
    i := rand(1,3);
    if (A[i] = 7) then
      print(Bingo: 7 found at position i);
      goon := false;
    end if
  end while

7

  • A randomized algorithm might find the wrong solution.
  • Example: Find the position of the minimum element in a three-element array.
  • If A = [5,6,4] and i = 2 and j = 1 then the algorithm will wrongly return 5.

Properties of randomized algorithms

MinOfThree(A[1…3])
  i := rand(1,3);
  j := rand(1,3);
  return(min(A[i],A[j]));

8

  • Neither example is as bad as it looks.
  • It is highly unlikely that Find7 runs for a long time.
  • It is more likely that MinOfThree returns a correct result than a wrong one.

Properties of randomized algorithms

9

Analysis of Randomized Algorithms

10

  • Running time: O(1).
  • What is the chance that the answer of MinOfThree is correct?
  • 9 possible pairs all with probability 1/9:
  • (1,1), (1,2), (1,3), (2,1), (2,2), (2,3), (3,1), (3,2), (3,3)
  • Assume wlog that A[1] is the minimum:
  • 5 of the 9 pairs contain the index 1 and give the correct minimum.
  • P[correct] = 5/9 > 1/2.
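The count of 5 out of 9 pairs can be checked by brute-force enumeration. The following is a small sketch, assuming distinct array elements and 0-based indices (the slides use 1-based indices); the function name is made up for this illustration.

```python
from fractions import Fraction
from itertools import product

def min_of_three_success_probability(A) -> Fraction:
    """Exact probability that min(A[i], A[j]) equals min(A) for independent uniform i, j."""
    n = len(A)
    true_min = min(A)
    # Enumerate all n * n equally likely index pairs (i, j).
    good = sum(1 for i, j in product(range(n), repeat=2)
               if min(A[i], A[j]) == true_min)
    return Fraction(good, n * n)
```

For the array A = [5, 6, 4] used earlier, the function returns the fraction 5/9.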

Analysis of MinOfThree

MinOfThree(A[1…3])
  i := rand(1,3);
  j := rand(1,3);
  return(min(A[i],A[j]));

11

  • Running MinOfThree once gives the correct answer with probability 5/9 and the incorrect one with probability 4/9.
  • Idea: Run MinOfThree many times and pick the smallest value returned.
  • The probability that MinOfThree fails every time in k runs is $(4/9)^k$.
  • To be 99% sure to find the minimum, choose k such that $(4/9)^k \le 1/100$, i.e. $k \ge 6$.
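The repetition idea can be sketched as follows. The function names `runs_needed` and `amplified_min` are assumptions introduced for this illustration, and the per-run failure probability 4/9 is taken from the analysis above.

```python
import random
from fractions import Fraction

def runs_needed(failure_per_run: Fraction, target_failure: Fraction) -> int:
    """Smallest k with failure_per_run ** k <= target_failure (needs 0 < failure_per_run < 1)."""
    k = 1
    failure = failure_per_run
    while failure > target_failure:
        failure *= failure_per_run  # one more independent run
        k += 1
    return k

def min_of_three(A, rng=random):
    """One run of MinOfThree on a three-element list (0-based indices)."""
    i = rng.randrange(3)
    j = rng.randrange(3)
    return min(A[i], A[j])

def amplified_min(A, k, rng=random):
    """Run MinOfThree k times and keep the smallest value returned."""
    return min(min_of_three(A, rng) for _ in range(k))
```

With a per-run failure probability of 4/9 and a target of 1/100, `runs_needed` returns 6, since (4/9)^5 ≈ 0.017 is still too large and (4/9)^6 ≈ 0.0077 is small enough.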

Analysis of MinOfThree


12

  • Running time is variable. What is the expected running time?
  • Assume wlog that A[1] = 7.
  • If rand(1, 3) = 1 then the while-loop is terminated.
  • P[rand(1,3) = 1] = 1/3
  • If rand(1, 3) ≠ 1 then the while-loop is not terminated.
  • P[rand(1,3) ≠ 1] = 2/3
  • The while-loop is terminated after one iteration with probability P[rand(1,3) = 1] = 1/3.

Analysis of Find7

Find7(A[1…3])
  goon := true;
  while (goon) do
    i := rand(1,3);
    if (A[i] = 7) then
      print(Bingo: 7 found at position i);
      goon := false;
    end if
  end while

13

  • The while-loop stops after exactly two iterations if
  • rand(1, 3) ≠ 1 in the first iteration.
  • rand(1, 3) = 1 in the second iteration.
  • This happens with probability 2/3 and 1/3, resp.
  • The probability for both to happen is, by independence, $\frac{2}{3} \cdot \frac{1}{3} = \frac{2}{9}$.

Analysis of Find7


14

  • The while-loop stops after exactly three iterations if
  • rand(1, 3) ≠ 1 in the first two iterations.
  • rand(1, 3) = 1 in the third iteration.
  • This happens with probability 2/3, 2/3 and 1/3, resp.
  • The probability for all three to happen is, by independence, $(2/3)^2 \cdot 1/3 = 4/27$.
  • The while-loop stops after exactly k iterations if
  • rand(1, 3) ≠ 1 in the first k−1 iterations.
  • rand(1, 3) = 1 in the kth iteration.
  • This happens with probability 2/3 (for each of the first k−1 iterations) and 1/3, resp.
  • The probability for all k to happen is, by independence, $(2/3)^{k-1} \cdot 1/3$.

Analysis of Find7

15

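  • We have $\sum_{k=0}^{\infty} k \cdot x^k = \frac{x}{(1-x)^2}$ for $|x| < 1$.
  • The expected running time of Find7 is $\sum_{k=1}^{\infty} k \cdot \left(\frac{2}{3}\right)^{k-1} \cdot \frac{1}{3} = \frac{1}{2} \sum_{k=1}^{\infty} k \cdot \left(\frac{2}{3}\right)^{k} = \frac{1}{2} \cdot \frac{2/3}{(1/3)^2} = 3$ iterations.
  • Idea: Stop a Las Vegas algorithm if it runs “much longer” than the expected time and restart it.

Expected running time

16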
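The expectation can be sanity-checked numerically. The sketch below assumes the array contains exactly one 7 and uses 0-based positions; `expected_iterations_partial` is a hypothetical helper that sums the first terms of the series $\sum_k k (1-p)^{k-1} p$ with p = 1/3.

```python
import random
from fractions import Fraction

def find7(A, rng):
    """Las Vegas search for 7: returns (0-based position, number of loop iterations)."""
    iterations = 0
    while True:
        iterations += 1
        i = rng.randrange(len(A))   # uniform random index
        if A[i] == 7:
            return i, iterations

def expected_iterations_partial(p: Fraction, terms: int) -> Fraction:
    """Partial sum of sum_{k>=1} k * (1-p)^(k-1) * p, the expected number of iterations."""
    return sum(k * (1 - p) ** (k - 1) * p for k in range(1, terms + 1))

def average_iterations(A, trials, seed=1):
    """Empirical average number of iterations over many independent runs."""
    rng = random.Random(seed)
    return sum(find7(A, rng)[1] for _ in range(trials)) / trials
```

Both the partial sum with p = 1/3 and the empirical average for A = [4, 3, 7] should be close to 3.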

  • We have seen two kinds of algorithms:
  • Monte Carlo algorithms: stop after a fixed (polynomial) time and give the correct answer with probability greater than 50%.
  • Las Vegas algorithms: have variable running time but always give the correct answer.

Types of randomized algorithms

17

  • Analyse the expected number of times “running” is printed:
  • Analyse the expected number of beers you get if you follow the algorithm:

Randomized algorithms

Running
  while (rand(0,1) = 0) do
    print(“running”);
  end while
  print(“stopped”);

Julefrokost
  while (rand(1,4) = rand(1,4)) do
    have another Christmas beer;
  end while
  go home;
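One way to approach both questions, offered as a sketch rather than the intended solution: if the loop test succeeds with probability q each time, the body runs exactly k times with probability $q^k (1-q)$, so the expected number of executions is $\sum_k k \, q^k (1-q) = q/(1-q)$ by the summation formula for $\sum k x^k$. For Running q = 1/2, and for Julefrokost q = 4/16 = 1/4. All function names below are assumptions made for this illustration.

```python
import random
from fractions import Fraction

def expected_loop_body_executions(q: Fraction) -> Fraction:
    """Expected number of body executions when the loop test succeeds with probability q."""
    return q / (1 - q)

def running_test(rng) -> bool:
    return rng.randint(0, 1) == 0                   # loop test of Running

def julefrokost_test(rng) -> bool:
    return rng.randint(1, 4) == rng.randint(1, 4)   # loop test of Julefrokost

def simulate(continue_test, rng, trials=100_000) -> float:
    """Average number of loop-body executions over many simulated runs."""
    total = 0
    for _ in range(trials):
        while continue_test(rng):
            total += 1
    return total / trials
```

Under these assumptions the formula gives 1 expected print for Running and 1/3 of a beer for Julefrokost, and the simulation should land close to both values.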

18

Summations

$\sum_{k=0}^{\infty} k \cdot x^k = \frac{x}{(1-x)^2}$ for $|x| < 1$.

$\sum_{k=0}^{\infty} x^k = \frac{1}{1-x}$ for $|x| < 1$.

$\sum_{k=0}^{\infty} k \cdot x^{k-1} = \frac{1}{(1-x)^2}$ for $|x| < 1$.

Median/Select

20

  • Given n numbers S = {a1, a2, …, an}.
  • Median: number that is in the middle position if in sorted order.
  • Select(S,k): Return the kth smallest number in S.
  • Min(S) = Select(S,1), Max(S)= Select(S,n), Median = Select(S,n/2).
  • Assume the numbers are distinct.

Select

Select(S, k) {
  Choose a pivot s ∈ S uniformly at random.
  For each element e in S
    if e < s put e in S’
    if e > s put e in S’’
  if |S’| = k-1 then return s
  if |S’| ≥ k then call Select(S’, k)
  if |S’| < k then call Select(S’’, k - |S’| - 1)
}
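The pseudocode above translates almost line by line into Python. This is a sketch assuming distinct numbers in a list and a 1-based rank k; the optional `rng` parameter is an addition for reproducible testing and is not part of the slides.

```python
import random

def select(S, k, rng=random):
    """Return the k-th smallest element (k is 1-based) of a list of distinct numbers."""
    if not 1 <= k <= len(S):
        raise ValueError("k must satisfy 1 <= k <= len(S)")
    s = rng.choice(S)                      # pivot chosen uniformly at random
    smaller = [e for e in S if e < s]      # S'
    larger = [e for e in S if e > s]       # S''
    if len(smaller) == k - 1:
        return s                           # exactly k-1 elements are below the pivot
    if len(smaller) >= k:
        return select(smaller, k, rng)     # the answer lies among the smaller elements
    return select(larger, k - len(smaller) - 1, rng)
```

For example, `select(data, 1)` gives the minimum and `select(data, len(data))` the maximum of a list `data`.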

21

  • Worst case running time: $T(n) = cn + c(n-1) + c(n-2) + \cdots = \Theta(n^2)$.
  • If there is at least an ε fraction of elements both larger and smaller than s: $T(n) = cn + (1-\varepsilon)cn + (1-\varepsilon)^2 cn + \cdots = \left(1 + (1-\varepsilon) + (1-\varepsilon)^2 + \cdots\right) cn \le cn/\varepsilon$.
  • Limit number of bad pivots.
  • Intuition: A fairly large fraction of elements are “well-centered” => random pivot likely to be good.

Select

22

  • Phase j: Size of set at most $n(3/4)^j$ and at least $n(3/4)^{j+1}$.
  • Element is central if at least a quarter of the elements in the current call are smaller and at least a quarter are larger.
  • At least half the elements are central.
  • Pivot central => size of set shrinks to at most 3/4 of its size => current phase ends.
  • Pr[s is central] = 1/2.
  • Expected number of iterations before a central pivot is found = 2 => expected number of iterations in phase j at most 2.
  • X: random variable equal to number of steps taken by algorithm.
  • Xj: number of steps in phase j.
  • X = X1 + X2 + …
  • Number of steps in one iteration in phase j is at most $cn(3/4)^j$.
  • $E[X_j] \le 2cn(3/4)^j$.
  • Expected running time: $E[X] = \sum_j E[X_j] \le \sum_j 2cn\left(\frac{3}{4}\right)^j = 2cn \sum_j \left(\frac{3}{4}\right)^j \le 8cn$.

Select

23

Quicksort

24

  • Given n numbers S = {a1, a2, …, an} return the sorted list.
  • Assume the numbers are distinct.

Quicksort

Quicksort(S) {
  if |S| ≤ 1 return S
  else
    Choose a pivot s ∈ S uniformly at random.
    For each element e in S
      if e < s put e in S’
      if e > s put e in S’’
    L = Quicksort(S’)
    R = Quicksort(S’’)
    Return the sorted list L◦s◦R.
}
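A direct Python rendering of the pseudocode above, as a sketch assuming distinct numbers. It builds new lists rather than sorting in place, and the `rng` parameter is an addition for reproducible testing.

```python
import random

def quicksort(S, rng=random):
    """Return a sorted copy of a list of distinct numbers using random pivots."""
    if len(S) <= 1:
        return list(S)
    s = rng.choice(S)                      # pivot chosen uniformly at random
    smaller = [e for e in S if e < s]      # S'
    larger = [e for e in S if e > s]       # S''
    # Concatenate sorted S', the pivot, and sorted S''.
    return quicksort(smaller, rng) + [s] + quicksort(larger, rng)
```

The result is the same for every choice of pivots; only the number of comparisons varies from run to run.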

25

  • Worst case: Quicksort requires Ω(n²) comparisons if the pivot is the smallest element in the list in each recursive call.
  • If the pivot always is the median then T(n) = O(n log n).
  • For i < j, define the random variable $X_{ij} = 1$ if $a_i$ and $a_j$ are compared by the algorithm, and $X_{ij} = 0$ otherwise.
  • X total number of comparisons: $X = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} X_{ij}$.
  • Expected number of comparisons, by linearity of expectation: $E[X] = E\left[\sum_{i=1}^{n-1} \sum_{j=i+1}^{n} X_{ij}\right] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} E[X_{ij}]$.

Quicksort: Analysis

26

  • Expected number of comparisons: $E[X] = E\left[\sum_{i=1}^{n-1} \sum_{j=i+1}^{n} X_{ij}\right] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} E[X_{ij}]$.
  • Since Xij only takes values 0 and 1: $E[X_{ij}] = \Pr[X_{ij} = 1]$.
  • Let a1 < a2 < … < an denote the elements in sorted order. Then ai and aj are compared iff ai or aj is the first pivot chosen from Zij = {ai,…,aj}.
  • Pivot chosen independently uniformly at random => all elements from Zij equally likely to be chosen as first pivot from this set.
  • We have $\Pr[X_{ij} = 1] = 2/(j - i + 1)$.
  • Thus $E[X] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} E[X_{ij}] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \Pr[X_{ij} = 1] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \frac{2}{j-i+1} = \sum_{i=1}^{n-1} \sum_{k=2}^{n-i+1} \frac{2}{k} < \sum_{i=1}^{n-1} \sum_{k=1}^{n} \frac{2}{k} = \sum_{i=1}^{n-1} O(\log n) = O(n \log n)$.

Quicksort: Analysis
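The double sum can be evaluated exactly for small n and compared with the upper bound used in the last step. Both function names below are made up for this sketch; exact rational arithmetic is used so the comparison has no rounding error.

```python
from fractions import Fraction

def expected_comparisons(n: int) -> Fraction:
    """Exact value of sum_{i=1}^{n-1} sum_{j=i+1}^{n} 2 / (j - i + 1)."""
    return sum((Fraction(2, j - i + 1)
                for i in range(1, n)
                for j in range(i + 1, n + 1)), Fraction(0))

def upper_bound(n: int) -> Fraction:
    """The bound sum_{i=1}^{n-1} sum_{k=1}^{n} 2 / k from the derivation."""
    inner = sum((Fraction(2, k) for k in range(1, n + 1)), Fraction(0))
    return (n - 1) * inner
```

For n = 3 the exact value is 1 + 1 + 2/3 = 8/3: the two adjacent pairs are always compared, and the outer pair is compared unless the middle element is the first pivot.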

27

Closest pair of points

A randomized algorithm

28


  • Closest pair of points. Given n points in the plane, find a pair with smallest Euclidean distance between them.
  • Fundamental geometric primitive.
  • Graphics, computer vision, geographic information systems, molecular modeling, air traffic control.
  • Special case of nearest neighbor, Euclidean MST, Voronoi diagrams.
  • Brute force. Compare all pairs => O(n²) time.
  • 1-D version. Sort and scan => O(n log n) time.
  • Simplifying assumption. No two points coincide (for a simpler presentation).

Closest Pair of Points

Thank you to Kevin Wayne for inspiration to slides.

30

  • Assume wlog that points are in the unit square.
  • Put the points in random order p1, p2, …, pn.
  • Let δ = d(p1,p2). Check for each point pi (in order) if there exists a point pj, j<i, such that d(pi,pj) < δ.
  • If such a point is found, update δ to the smallest such distance.

Randomized algorithm

[Figure: 15 numbered points in the unit square, with distances δ1 and δ2 marked.]

31


How to check a point

[Figure: 15 numbered points in the unit square, overlaid with a grid of subsquares of side length δ/2.]

  • δ: current smallest distance. Divide the unit square into subsquares with side length δ/2.
  • If two points i and j are in the same subsquare then d(i,j) < δ.
  • If d(i,j) < δ then j is in the 5×5 grid of subsquares around i.

34


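Closest Pair of Points

[Figure: 15 numbered points in the unit square, with distances δ2 and δ3 marked.]

  • Use a hash table to store which subsquare each point is in. Only store points already looked at (red points).
  • When δ changes, a new round starts: rebuild the hash table for the new subsquare size and rehash all points p1, …, pi.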
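The whole procedure can be sketched in Python as below. This is an illustration under stated assumptions, not the reference implementation: points are distinct `(x, y)` tuples, a plain `dict` keyed by subsquare coordinates plays the role of the hash table, and the helper names `cell` and `make_dictionary` are chosen here. Each subsquare holds at most one stored point, because two stored points in the same subsquare would be closer than δ.

```python
import math
import random

def closest_pair_distance(points, rng=random):
    """Smallest pairwise distance among distinct (x, y) points, via random order + grid."""
    pts = list(points)
    if len(pts) < 2:
        raise ValueError("need at least two points")
    rng.shuffle(pts)                                  # random processing order

    def cell(p, delta):
        side = delta / 2                              # subsquares of side delta/2
        return (math.floor(p[0] / side), math.floor(p[1] / side))

    def make_dictionary(prefix, delta):
        return {cell(q, delta): q for q in prefix}    # rehash all points seen so far

    delta = math.dist(pts[0], pts[1])
    grid = make_dictionary(pts[:2], delta)
    for i in range(2, len(pts)):
        p = pts[i]
        cx, cy = cell(p, delta)
        best = delta
        for dx in range(-2, 3):                       # the 5x5 block of subsquares
            for dy in range(-2, 3):
                q = grid.get((cx + dx, cy + dy))
                if q is not None:
                    best = min(best, math.dist(p, q))
        if best < delta:                              # new round: delta shrinks
            delta = best
            grid = make_dictionary(pts[:i + 1], delta)
        else:
            grid[(cx, cy)] = p                        # just insert the new point
    return delta
```

Comparing the result against a brute-force check over all pairs on a few hundred random points is a quick way to gain confidence in the grid logic.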

36



  • Number of lookup operations: at most 25 per point => O(n) total.
  • Number of distance calculations: at most 25 per point => O(n) total.
  • Number of MakeDictionary operations: O(n).
  • Number of insertions:
  • Random variable X = number of insertions.
  • Random variable $X_i = 1$ if point i causes δ to change, and $X_i = 0$ otherwise.
  • Pr[Xi = 1] ≤ 2/i, since δ changes only if pi is one of the two points of the closest pair among p1, …, pi.
  • Expected number of insertions: $E[X] = n + \sum_{i=1}^{n} i \cdot E[X_i] = n + \sum_{i=1}^{n} i \cdot \Pr[X_i = 1] \le n + \sum_{i=1}^{n} i \cdot \frac{2}{i} = n + 2n = 3n$.
  • Use hashtable as dictionary: Expected O(n) time in total.

Closest Pair of Points: Analysis
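The 3n bound on insertions can be checked empirically. The sketch below counts insertions with the same accounting as the sum above (one insertion per point, plus i re-insertions whenever point i changes δ). It finds the nearest earlier point by brute force, since only the count matters here, and it assumes independent uniform random points, which are already in random order. Function names are hypothetical.

```python
import math
import random

def insertions_in_one_run(n, rng):
    """Insertions counted as n + sum of i over all points i that change delta."""
    pts = [(rng.random(), rng.random()) for _ in range(n)]
    delta = math.dist(pts[0], pts[1])
    insertions = 2                          # p1 and p2 are each inserted once
    for idx in range(2, n):                 # idx is 0-based; point number is idx + 1
        closest = min(math.dist(pts[idx], q) for q in pts[:idx])
        insertions += 1                     # the new point itself
        if closest < delta:                 # delta changes: rehash idx + 1 points
            delta = closest
            insertions += idx + 1
    return insertions

def average_insertions(n, trials, seed=1):
    """Average insertion count over independent random inputs."""
    rng = random.Random(seed)
    return sum(insertions_in_one_run(n, rng) for _ in range(trials)) / trials
```

For n = 40 the average over a few hundred trials should stay near or below 3n = 120.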

40