CS 473: Algorithms, Fall 2016. Review session, Lecture 99 (September 30, 2016).


SLIDE 1

CS 473: Algorithms

Chandra Chekuri Ruta Mehta

University of Illinois, Urbana-Champaign

Fall 2016

Chandra & Ruta (UIUC) CS473 1 Fall 2016 1 / 57

SLIDE 2

CS 473: Algorithms, Fall 2016

Review session

Lecture 99

September 30, 2016

SLIDE 3

What we saw so far...

Fast Fourier Transform (FFT).
Dynamic programming: string algorithms; graph algorithms (shortest path, independent set, dominating set, etc.).
Randomized algorithms: QuickSort; high-probability analysis via the Markov, Chebyshev, and Chernoff inequalities.
Hashing, fingerprinting.

SLIDE 5

Part I FFT

SLIDE 6

What is the Fast Fourier Transform?

Definition

Given a vector a = (a0, a1, . . . , an−1), the Discrete Fourier Transform (DFT) of a is the vector a′ = (a′0, a′1, . . . , a′n−1) where a′j = a(ω_n^j) for 0 ≤ j < n, and ω_n is a primitive n-th root of unity. That is, a′ is the sample representation, at the n-th roots of unity, of the polynomial with coefficient representation a. We have shown that a′ can be computed from a in O(n log n) time. This divide-and-conquer algorithm is called the Fast Fourier Transform (FFT).

SLIDE 7

Why FFT? Convolution and Polynomial Multiplication

Convolution

Convolution of vectors a = (a0, a1, . . . , an−1) and b = (b0, b1, . . . , bn−1) is the vector c = (c0, c1, . . . , c2n−2), where

ck = Σ_{i,j : i+j=k} ai · bj

SLIDE 8

Why FFT? Convolution and Polynomial Multiplication

Convolution

Convolution of vectors a = (a0, a1, . . . , an−1) and b = (b0, b1, . . . , bn−1) is the vector c = (c0, c1, . . . , c2n−2), where

ck = Σ_{i,j : i+j=k} ai · bj

Polynomial Multiplication

If vectors a and b are the coefficients of two degree-(n − 1) polynomials, (abusing notation) a(x) = Σ_{i=0}^{n−1} ai x^i and b(x) = Σ_{i=0}^{n−1} bi x^i, then c is the coefficient vector of the product polynomial a(x) · b(x).

SLIDE 9

Why FFT? Convolution and Polynomial Multiplication

Convolution

Given vectors a = (a0, a1, . . . , an−1) and b = (b0, b1, . . . , bn−1), find their convolution vector c = (c0, c1, . . . , c2n−2).

1. Compute the values of Pa and Pb at the 2n-th roots of unity, to get their sample representations a′ and b′.
2. Compute the sample representation c′ = (a′0 b′0, . . . , a′2n−2 b′2n−2) of the product c = a · b.
3. Compute c from c′ using the inverse Fourier transform.

Step 1 takes O(n log n) time using two FFTs; Step 2 takes O(n) time; Step 3 takes O(n log n) time using one inverse FFT.
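As a concrete illustration of the three steps above, here is a minimal sketch in Python using NumPy's FFT routines (the function name poly_multiply and the NumPy dependency are my additions, not part of the slides):

```python
import numpy as np

def poly_multiply(a, b):
    """Multiply two polynomials (coefficient vectors) via the FFT:
    evaluate at roots of unity, multiply pointwise, then interpolate
    with the inverse FFT."""
    n = len(a) + len(b) - 1           # number of coefficients of the product
    size = 1 << (n - 1).bit_length()  # next power of two, for the FFT
    fa = np.fft.fft(a, size)          # sample representation a'
    fb = np.fft.fft(b, size)          # sample representation b'
    fc = fa * fb                      # pointwise product c'
    c = np.fft.ifft(fc).real[:n]      # inverse FFT recovers coefficients
    return [round(x) for x in c]      # integer inputs -> integer outputs

# (1 + 2x)(3 + 4x) = 3 + 10x + 8x^2
print(poly_multiply([1, 2], [3, 4]))  # [3, 10, 8]
```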

SLIDE 10

Problem

Suppose we are given a bit string B[1..n]. A triple of distinct indices 1 ≤ i < j < k ≤ n is called a well-spaced triple in B if B[i] = B[j] = B[k] = 1 and k − j = j − i. (a) Describe a brute-force algorithm to determine whether B has a well-spaced triple in O(n^2) time.

SLIDE 11

Application of FFT

Suppose we are given a bit string B[1..n]. A triple of distinct indices 1 ≤ i < j < k ≤ n is called a well-spaced triple in B if B[i] = B[j] = B[k] = 1 and k − j = j − i. (b) Describe an algorithm to determine whether B has a well-spaced triple in O(n log n) time.

SLIDE 13

Application of FFT

Suppose we are given a bit string B[1..n]. A triple of distinct indices 1 ≤ i < j < k ≤ n is called a well-spaced triple in B if B[i] = B[j] = B[k] = 1 and k − j = j − i. (c) Describe an algorithm to determine the number of well-spaced triples in B in O(n log n) time.
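A hedged solution sketch for part (c), not given on the slides: convolving B with itself counts, for each index sum s, the pairs (i, k) of ones with i + k = s; a well-spaced triple is such a pair with i < k, centered at an index j with B[j] = 1. Assuming 0-indexed bits, and using NumPy's convolve for brevity:

```python
import numpy as np

def count_well_spaced_triples(B):
    """Count triples i < j < k with B[i] = B[j] = B[k] = 1 and
    k - j = j - i, via one convolution of B with itself.
    (np.convolve is quadratic; an FFT-based convolution, as above,
    brings the total down to O(n log n).)"""
    c = np.convolve(B, B)  # c[s] = #{(i, k) : i + k = s, B[i] = B[k] = 1}
    total = 0
    for j, bit in enumerate(B):
        if bit == 1:
            # c[2j] counts ordered pairs (i, k), including i = k = j;
            # remove that one pair, then halve to get pairs with i < k.
            total += (c[2 * j] - 1) // 2
    return int(total)

print(count_well_spaced_triples([1, 1, 1, 0, 1]))  # 2: (0,1,2) and (0,2,4)
```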

SLIDE 14

Part II Dynamic Programming

SLIDE 15

Recursion

Reduction: reduce one problem to another.

Recursion: a special case of reduction.
1. Reduce the problem to a smaller instance of itself.
2. Self-reduction:
   A problem instance of size n is reduced to one or more instances of size n − 1 or less.
   For termination, problem instances of small size are solved by some other method as base cases.

SLIDE 16

What is Dynamic Programming?

Every recursion can be memoized. Automatic memoization does not help us understand whether the resulting algorithm is efficient or not.


Dynamic Programming:

A recursion that when memoized leads to an efficient algorithm.
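A minimal illustration of the point above, with an example of my own that is not taken from the slides: the naive Fibonacci recursion versus the same recursion memoized.

```python
from functools import lru_cache

# Naive recursion: exponential time, since subproblems repeat.
def fib_naive(n):
    return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)

# The same recursion, memoized: each subproblem is solved once,
# so the running time drops to O(n) additions.
@lru_cache(maxsize=None)
def fib_memo(n):
    return n if n < 2 else fib_memo(n - 1) + fib_memo(n - 2)

print(fib_naive(10), fib_memo(50))  # 55 12586269025
```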

SLIDE 18

Edit Distance

Definition

Edit distance between two words X and Y is the minimum number of letter insertions, letter deletions, and letter substitutions required to obtain Y from X.

Example

The edit distance between FOOD and MONEY is at most 4: FOOD → MOOD → MONOD → MONED → MONEY

SLIDE 19

Edit Distance: Alternate View

Alignment

Place words one on top of the other, with gaps in the first word indicating insertions, and gaps in the second word indicating deletions.

F O O _ D
M O N E Y

SLIDE 20

Edit Distance: Alternate View

Formally, an alignment is a set M of pairs (i, j) such that each index appears at most once and there is no “crossing”: if i < i′ and i is matched to j, then i′ is matched to some j′ > j. In the example above, M = {(1, 1), (2, 2), (3, 3), (4, 5)}.

The cost of an alignment is the number of mismatched columns plus the number of unmatched indices in both strings.

SLIDE 22

Edit Distance Problem

Problem

Given two words, find the edit distance between them, i.e., an alignment of smallest cost.

SLIDE 23

Edit Distance

Basic observation

Let X = αx and Y = βy, where α, β are strings and x, y are single characters. An alignment of X and Y ends in one of three ways: x aligned with y, x aligned with a gap, or y aligned with a gap.

Observation: the prefixes must have an optimal alignment!

EDIST(X, Y) = min{ [x ≠ y] + EDIST(α, β), 1 + EDIST(α, Y), 1 + EDIST(X, β) }

where [x ≠ y] is 1 if x ≠ y and 0 otherwise.

SLIDE 25

Recursive Algorithm

Assume X is stored in array A[1..m] and Y is stored in B[1..n]

EDIST(A[1..i], B[1..j]):
    If (i = 0) return j
    If (j = 0) return i
    m1 = 1 + EDIST(A[1..(i − 1)], B[1..j])
    m2 = 1 + EDIST(A[1..i], B[1..(j − 1)])
    If (A[i] = B[j]) then
        m3 = EDIST(A[1..(i − 1)], B[1..(j − 1)])
    Else
        m3 = 1 + EDIST(A[1..(i − 1)], B[1..(j − 1)])
    return min(m1, m2, m3)

Call EDIST(A[1..m], B[1..n])

SLIDE 26

Memoizing the Recursive Algorithm

int M[0..m][0..n]
Initialize all entries M[i][j] to ∞
return EDIST(A[1..m], B[1..n])

EDIST(A[1..i], B[1..j]):
    If (M[i][j] < ∞) return M[i][j]  (* return stored value *)
    If (i = 0) M[i][j] = j
    ElseIf (j = 0) M[i][j] = i
    Else
        m1 = 1 + EDIST(A[1..(i − 1)], B[1..j])
        m2 = 1 + EDIST(A[1..i], B[1..(j − 1)])
        If (A[i] = B[j])
            m3 = EDIST(A[1..(i − 1)], B[1..(j − 1)])
        Else
            m3 = 1 + EDIST(A[1..(i − 1)], B[1..(j − 1)])
        M[i][j] = min(m1, m2, m3)
    return M[i][j]

Chandra & Ruta (UIUC) CS473 19 Fall 2016 19 / 57

slide-27
SLIDE 27

Removing Recursion to obtain Iterative Algorithm

EDIST(A[1..m], B[1..n]):
    int M[0..m][0..n]
    for i = 0 to m do M[i][0] = i
    for j = 0 to n do M[0][j] = j
    for i = 1 to m do
        for j = 1 to n do
            M[i][j] = min( [A[i] ≠ B[j]] + M[i − 1][j − 1],
                           1 + M[i − 1][j],
                           1 + M[i][j − 1] )

Analysis

1. Running time is O(mn).
2. Space used is O(mn).
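The iterative algorithm above, transcribed into runnable Python as a sketch (the function name and 0-indexed strings are my choices):

```python
def edit_distance(A, B):
    """Iterative DP for edit distance, filling M row by row.
    M[i][j] = edit distance between A[:i] and B[:j]."""
    m, n = len(A), len(B)
    M = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        M[i][0] = i            # delete all i characters of A[:i]
    for j in range(n + 1):
        M[0][j] = j            # insert all j characters of B[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            mismatch = 0 if A[i - 1] == B[j - 1] else 1
            M[i][j] = min(mismatch + M[i - 1][j - 1],  # align A[i] with B[j]
                          1 + M[i - 1][j],             # delete A[i]
                          1 + M[i][j - 1])             # insert B[j]
    return M[m][n]

print(edit_distance("FOOD", "MONEY"))  # 4, matching the slide example
```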

SLIDE 30

Matrix and DAG of Computation

[Figure: the DP table as a DAG from (0, 0) to (m, n); entry (i, j) depends on entries (i − 1, j − 1), (i − 1, j), and (i, j − 1).]

Figure: Iterative algorithm in previous slide computes values in row order.

SLIDE 31

Problem

Given a graph G = (V, E), a matching is a set of edges M ⊆ E such that no two edges in M share an endpoint. Describe an efficient algorithm that, given a tree T = (V, E) and non-negative weights w : E → R+, finds a maximum-weight matching in T.
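One possible DP sketch for this exercise (an illustration under my own conventions, not the official solution): root the tree and compute, for each node, the best matching weight with the node left unmatched versus matched to one of its children.

```python
def max_matching_tree(n, edges):
    """Maximum-weight matching in a tree via DP over subtrees.
    unmatched[v]: best matching weight in v's subtree with v unmatched.
    matched[v]:   best with v matched to one of its children.
    edges: list of (u, v, w); nodes are 0..n-1; assumes a tree."""
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))

    # iterative DFS from root 0; reversed(order) processes children first
    parent, order = [-1] * n, []
    stack, seen = [0], [False] * n
    seen[0] = True
    while stack:
        u = stack.pop()
        order.append(u)
        for v, _ in adj[u]:
            if not seen[v]:
                seen[v] = True
                parent[v] = u
                stack.append(v)

    NEG = float('-inf')
    unmatched, matched = [0] * n, [NEG] * n
    for u in reversed(order):
        children = [(v, w) for v, w in adj[u] if v != parent[u]]
        base = sum(max(unmatched[c], matched[c]) for c, _ in children)
        unmatched[u] = base
        # matching u to child c forces c to be unmatched in its own subtree
        matched[u] = max(
            (base - max(unmatched[c], matched[c]) + unmatched[c] + w
             for c, w in children),
            default=NEG)
    return max(unmatched[0], matched[0])

# path 0-1-2-3 with weights 2, 1, 2: take edges (0,1) and (2,3)
print(max_matching_tree(4, [(0, 1, 2), (1, 2, 1), (2, 3, 2)]))  # 4
```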

SLIDE 33

Dijkstra’s Algorithm

Initialize dist(s, v) = ∞ for each node v
Initialize S = ∅, dist(s, s) = 0
for i = 1 to |V| do
    Let v be such that dist(s, v) = min_{u∈V−S} dist(s, u)
    S = S ∪ {v}
    for each u in Adj(v) do
        dist(s, u) = min( dist(s, u), dist(s, v) + ℓ(v, u) )

1. Using Fibonacci heaps, running time: O(m + n log n).
2. Can compute the shortest-path tree.
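For reference, a compact Python version using the standard-library binary heap (heapq); note this gives an O((m + n) log n) bound, not the Fibonacci-heap bound quoted above:

```python
import heapq

def dijkstra(adj, s):
    """Dijkstra's algorithm with a binary heap.
    adj: dict mapping node -> list of (neighbor, length)."""
    dist = {v: float('inf') for v in adj}
    dist[s] = 0
    heap = [(0, s)]
    done = set()
    while heap:
        d, v = heapq.heappop(heap)
        if v in done:          # skip stale heap entries
            continue
        done.add(v)
        for u, w in adj[v]:    # relax edges out of v
            if d + w < dist[u]:
                dist[u] = d + w
                heapq.heappush(heap, (dist[u], u))
    return dist

graph = {'s': [('a', 1), ('b', 4)], 'a': [('b', 2)], 'b': []}
print(dijkstra(graph, 's'))  # {'s': 0, 'a': 1, 'b': 3}
```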

SLIDE 34

Single-Source Shortest Paths with Negative Edge Lengths

Single-Source Shortest Path Problems

Input: a directed graph G = (V, E) with arbitrary (including negative) edge lengths. For edge e = (u, v), ℓ(e) = ℓ(u, v) is its length.

Given nodes s, t, find a shortest path from s to t.
Given node s, find shortest paths from s to all other nodes.

[Figure: an example directed graph with edge lengths.]

SLIDE 35

Negative Length Cycles

Definition

A cycle C is a negative length cycle if the sum of the edge lengths of C is negative.

[Figure: a graph containing a negative-length cycle.]

Dijkstra’s algorithm does not work with negative edge lengths.

SLIDE 36

Shortest Paths and Recursion

1. Compute the shortest path distance from s to t recursively?
2. What are the smaller sub-problems?

Lemma

Let G be a directed graph with arbitrary edge lengths. If s = v0 → v1 → v2 → . . . → vk is a shortest path from s to vk, then for 1 ≤ i < k, s = v0 → v1 → v2 → . . . → vi is a shortest path from s to vi.

Sub-problem idea: paths of fewer hops/edges.

SLIDE 39

Hop-based Recursion: Bellman-Ford Algorithm

Single-source problem: fix source s. Assume that all nodes are reachable from s in G (remove nodes unreachable from s). Let d(v, k) be the shortest walk length from s to v using at most k edges.

Recursion for d(v, k):

d(v, k) = min{ min_{u∈V} ( d(u, k − 1) + ℓ(u, v) ), d(v, k − 1) }

Base case: d(s, 0) = 0 and d(v, 0) = ∞ for all v ≠ s.

SLIDE 43

A Basic Lemma

Lemma

Assume s can reach all nodes in G = (V, E). Then:
1. There is a negative-length cycle in G iff d(v, n) < d(v, n − 1) for some node v ∈ V.
2. If there is no negative-length cycle in G, then dist(s, v) = d(v, n − 1) for all v ∈ V.

SLIDE 44

Bellman-Ford Algorithm

for each u ∈ V do
    d(u, 0) ← ∞
d(s, 0) ← 0
for k = 1 to n do
    for each v ∈ V do
        d(v, k) ← d(v, k − 1)
        for each edge (u, v) ∈ In(v) do
            d(v, k) = min{ d(v, k), d(u, k − 1) + ℓ(u, v) }
for each v ∈ V do
    dist(s, v) ← d(v, n − 1)
    If d(v, n) < d(v, n − 1)
        Return ‘‘Negative Cycle in G’’

Running time: O(mn). Space: O(n^2), which can be reduced to O(m + n).

SLIDE 50

Bellman-Ford with Space Saving

for each u ∈ V do
    d(u) ← ∞
d(s) ← 0
for k = 1 to n − 1 do
    for each v ∈ V do
        for each edge (u, v) ∈ In(v) do
            d(v) = min{ d(v), d(u) + ℓ(u, v) }
(* One more iteration to check if distances change *)
for each v ∈ V do
    for each edge (u, v) ∈ In(v) do
        if (d(v) > d(u) + ℓ(u, v))
            Output ‘‘Negative Cycle’’
for each v ∈ V do
    dist(s, v) ← d(v)
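The space-saving version above, as a runnable Python sketch (the edge-list representation and function name are my choices):

```python
def bellman_ford(n, edges, s):
    """Space-saving Bellman-Ford: n-1 relaxation rounds over all edges,
    then one extra pass to detect a negative cycle.
    edges: list of (u, v, length); nodes are 0..n-1."""
    INF = float('inf')
    d = [INF] * n
    d[s] = 0
    for _ in range(n - 1):
        for u, v, w in edges:
            if d[u] + w < d[v]:
                d[v] = d[u] + w
    # one more pass: any further improvement means a negative cycle
    for u, v, w in edges:
        if d[u] != INF and d[u] + w < d[v]:
            raise ValueError("Negative cycle in G")
    return d

print(bellman_ford(3, [(0, 1, 4), (0, 2, 5), (1, 2, -2)], 0))  # [0, 4, 2]
```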

SLIDE 51

Problem

Given a directed graph G = (V, E) with non-negative edge lengths ℓ : E → R+, describe an algorithm that finds the shortest cycle in G that contains a specific node s.

SLIDE 52

Problem

Given a directed graph G = (V, E) with non-negative edge lengths ℓ : E → R+, describe an algorithm to find the shortest cycle containing s with at most k edges.

SLIDE 53

Part III Randomization

SLIDE 54

Randomized Algorithms

[Figure: a deterministic algorithm maps input x to output y. A randomized algorithm additionally reads random bits r, mapping input x to output y_r.]

SLIDE 56

Types of Randomized Algorithms

Typically one encounters the following types:

1. Las Vegas randomized algorithms: for a given input x, the output of the algorithm is always correct, but the running time is a random variable. In this case we are interested in analyzing the expected running time.
2. Monte Carlo randomized algorithms: for a given input x, the running time is deterministic but the output is random, and correct with some probability. In this case we are interested in analyzing the probability of a correct output (and also the running time).
3. Algorithms whose running time and output may both be random.

SLIDE 58

Ping and find.

Consider a deterministic algorithm A that is trying to find an element in an array X of size n. At every step it is allowed to ask the value of one cell in the array, and the adversary is allowed, after each such ping, to shuffle elements around in the array in any way it sees fit. For the best possible deterministic algorithm, the number of rounds it has to play this game until it finds the required element is
(A) O(1) (B) O(n) (C) O(n log n) (D) O(n^2) (E) ∞.

SLIDE 59

Ping and find randomized.

Consider an algorithm randFind that is trying to find an element in an array X of size n. At every step it asks the value of one random cell in the array, and the adversary is allowed, after each such ping, to shuffle elements around in the array in any way it sees fit. In expectation, this algorithm stops after
(A) O(1) (B) O(log n) (C) O(n) (D) O(n^2) (E) ∞
steps.

SLIDE 60

Median

Consider the problem of finding an “approximate median” of an unsorted array A[1..n]: an element of A with rank between n/4 and 3n/4. Finding an approximate median is not any easier than finding a proper median; however, n/2 elements of A qualify as approximate medians, and hence a random element is good with probability 1/2!

SLIDE 61

Part IV Basics of Randomization

SLIDE 62

Discrete Probability Space

Definition

A discrete probability space is a pair (Ω, Pr) consisting of a finite set Ω of elementary events and a function Pr : Ω → [0, 1] which assigns a probability Pr[ω] to each ω ∈ Ω, such that Σ_{ω∈Ω} Pr[ω] = 1.

Example

An unbiased coin: Ω = {H, T} and Pr[H] = Pr[T] = 1/2.

SLIDE 63

Events

Definition

An event is a collection of elementary events. The probability of an event A ⊆ Ω, denoted by Pr[A], is Σ_{ω∈A} Pr[ω].

Union Bound

For any two events E and F, Pr[E ∪ F] ≤ Pr[E] + Pr[F].

Independence

Events A and B are called independent if Pr[A ∩ B] = Pr[A] · Pr[B].

SLIDE 66

Random Variables

Definition

Given a probability space (Ω, Pr), a (real-valued) random variable X over Ω is a function X : Ω → R.

Definition (Expectation: average of X as per Pr)

The expectation of X, E[X], is defined as Σ_{ω∈Ω} Pr[ω] · X(ω). If S is the set of all values that X takes, the expectation can also be written as Σ_{x∈S} x · Pr[X = x].

Linearity of Expectation

Given two random variables X1 and X2, E[X1 + X2] = E[X1] + E[X2].

SLIDE 69

Independence of Random Variables

Random variables X and Y are said to be independent if ∀x, y, Pr[X = x ∧ Y = y] = Pr[X = x] · Pr[Y = y]

Multiplication

If X and Y are independent then E[XY] = E[X] E[Y].

SLIDE 70

Part V Randomized Quick Sort

SLIDE 71

Randomized QuickSort

Randomized QuickSort

1. Pick a pivot element uniformly at random from the array.
2. Split the array into 3 subarrays: those smaller than the pivot, those larger than the pivot, and the pivot itself.
3. Recursively sort the subarrays, and concatenate them.
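The three steps, sketched in Python (an illustrative out-of-place version, not the in-place partitioning usually used in practice):

```python
import random

def quicksort(arr):
    """Randomized QuickSort: random pivot, three-way split, recurse."""
    if len(arr) <= 1:
        return arr
    pivot = random.choice(arr)               # uniformly random pivot
    smaller = [x for x in arr if x < pivot]  # elements smaller than pivot
    equal   = [x for x in arr if x == pivot]
    larger  = [x for x in arr if x > pivot]
    return quicksort(smaller) + equal + quicksort(larger)

print(quicksort([3, 1, 4, 1, 5, 9, 2, 6]))  # [1, 1, 2, 3, 4, 5, 6, 9]
```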

SLIDE 72

Analysis via Recurrence

1. Given array A of size n, let Q(A) be the number of comparisons of randomized QuickSort on A.
2. Note that Q(A) is a random variable.
3. Let Ai_left and Ai_right be the left and right subarrays obtained if the pivot has rank i. Let Xi be the indicator random variable that is 1 if the pivot has rank i in A, and 0 otherwise. Then

Q(A) = n + Σ_{i=1}^{n} Xi · ( Q(Ai_left) + Q(Ai_right) ).

Since each element of A has probability exactly 1/n of being chosen: E[Xi] = Pr[pivot is the element with rank i] = 1/n.

SLIDE 74

Independence of Random Variables

Lemma

The random variable Xi is independent of Q(Ai_left) and of Q(Ai_right), i.e.,

E[ Xi · Q(Ai_left) ] = E[Xi] · E[ Q(Ai_left) ]
E[ Xi · Q(Ai_right) ] = E[Xi] · E[ Q(Ai_right) ]

Proof.

This is because the algorithm, while recursing on Ai_left and Ai_right, uses new random coin tosses that are independent of the coin tosses used to decide the first pivot; only the latter decides the value of Xi.

SLIDE 75

Analysis via Recurrence

Let T(n) = max_{A : |A|=n} E[Q(A)] be the worst-case expected running time of randomized QuickSort on arrays of size n. We have, for any A:

Q(A) = n + Σ_{i=1}^{n} Xi · ( Q(Ai_left) + Q(Ai_right) )

By linearity of expectation, and independence of the random variables:

E[Q(A)] = n + Σ_{i=1}^{n} E[Xi] · ( E[Q(Ai_left)] + E[Q(Ai_right)] )

⇒ E[Q(A)] ≤ n + Σ_{i=1}^{n} (1/n) · ( T(i − 1) + T(n − i) ).

slide-78
SLIDE 78

Analysis via Recurrence

Let T(n) = max_{A : |A|=n} E[Q(A)] be the worst-case expected running time of randomized QuickSort on arrays of size n. We derived:

E[Q(A)] ≤ n + Σ_{i=1}^{n} (1/n) · ( T(i − 1) + T(n − i) ).

Note that the above holds for any A of size n. Therefore,

max_{A : |A|=n} E[Q(A)] = T(n) ≤ n + Σ_{i=1}^{n} (1/n) · ( T(i − 1) + T(n − i) ).

SLIDE 80

Solving the Recurrence

T(n) ≤ n + Σ_{i=1}^{n} (1/n) · ( T(i − 1) + T(n − i) ), with base case T(1) = 0.

Lemma

T(n) = O(n log n).

Proof.

(Guess and) verify by induction.

SLIDE 83

Part VI Inequalities

SLIDE 84

Markov’s Inequality

Markov’s inequality

Let X be a non-negative random variable over a probability space (Ω, Pr). For any a > 0,

Pr[X ≥ a] ≤ E[X] / a.
SLIDE 85

Chebyshev’s Inequality

Variance

The variance of X measures how much X deviates from its mean value. Formally,

Var(X) = E[ (X − E[X])^2 ] = E[X^2] − E[X]^2

Chebyshev’s Inequality

Given a ≥ 0,

Pr[ |X − E[X]| ≥ a ] ≤ Var(X) / a^2

If X and Y are independent, then Var(X + Y) = Var(X) + Var(Y).

SLIDE 87

Chebyshev’s Inequality: Under Mutual Independence

Let X1, . . . , Xk be k independent random variables such that, for each i ∈ [1, k], Xi equals 1 with probability pi and 0 with probability 1 − pi. Let X = Σ_{i=1}^{k} Xi and µ = E[X] = Σ_i pi. Then

Var(X) ≤ µ ⇒ Pr[ |X − µ| ≥ a ] ≤ Var(X) / a^2 < µ / a^2

For δ > 0, Pr[ X ≥ (1 + δ)µ ] ≤ 1 / (δ^2 µ).
For 0 < δ < 1, Pr[ X ≤ (1 − δ)µ ] ≤ 1 / (δ^2 µ).

SLIDE 89

Chernoff Bound

Let X1, . . . , Xk be k independent random variables such that, for each i ∈ [1, k], Xi equals 1 with probability pi and 0 with probability 1 − pi. Let X = Σ_{i=1}^{k} Xi and µ = E[X] = Σ_i pi. For any 0 < δ < 1, it holds that:

Pr[ X ≥ (1 + δ)µ ] ≤ e^{−δ^2 µ / 3} and Pr[ X ≤ (1 − δ)µ ] ≤ e^{−δ^2 µ / 2}

Tighter bound

For δ > 0,

Pr[ X ≥ (1 + δ)µ ] ≤ ( e^δ / (1 + δ)^{1+δ} )^µ
Pr[ X ≤ (1 − δ)µ ] ≤ ( e^{−δ} / (1 − δ)^{1−δ} )^µ

SLIDE 92

Problem: Approximate Median

Suppose you are presented with a very large set S of real numbers, and you would like to approximate the median of these numbers by sampling. You may assume all numbers in S are distinct. Let |S| = n. We say x is an ε-approximate median of S if at least (1/2 − ε)n elements of S are less than x and at least (1/2 − ε)n are greater than x. Consider an algorithm that samples S′ ⊆ S uniformly at random and outputs a median of S′. Show that there is an absolute constant c, independent of n, such that if the sample size is c, then with probability 1 − δ the number returned is an ε-approximate median.
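A sketch of the sampling algorithm in question (the function name and the fixed sample size are illustrative assumptions; sampling is done with replacement for simplicity):

```python
import random

def approx_median(S, c=999):
    """Sample c elements of S uniformly at random and return the median
    of the sample. By the Chernoff bound, a constant c = c(eps, delta)
    independent of n suffices for the output to be an eps-approximate
    median with probability 1 - delta."""
    sample = sorted(random.choice(S) for _ in range(c))
    return sample[c // 2]

S = list(range(1_000_000))
x = approx_median(S)
print(0.25 * len(S) < x < 0.75 * len(S))  # True with overwhelming probability
```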

SLIDE 93