15-251 Great Theoretical Ideas in Computer Science, Lecture 20: Randomized Algorithms


slide-1
SLIDE 1

15-251 Great Theoretical Ideas in Computer Science

Lecture 20: Randomized Algorithms

November 5th, 2015

slide-2
SLIDE 2

So far

Formalization of computation/algorithm
Computability / Uncomputability
Computational complexity
NP-completeness:

  • Identifying intractable problems.
  • Making use of intractable problems (in social choice).
  • Dealing with intractable problems: approximation algorithms, online algorithms.

Graph theory and graph algorithms

slide-3
SLIDE 3

Next

slide-4
SLIDE 4

Randomness and the universe

Does the universe have true randomness? Newtonian physics suggests that the universe evolves deterministically. Quantum physics says otherwise.

slide-5
SLIDE 5

Randomness and the universe

Does the universe have true randomness? God does not play dice with the world.

  • Albert Einstein

Einstein, don’t tell God what to do.

  • Niels Bohr
slide-6
SLIDE 6

Randomness and the universe

Does the universe have true randomness? Even if it doesn’t, we can still model our uncertainty about things using probability. Randomness is an essential tool in modeling and analyzing nature. It also plays a key role in computer science.

slide-7
SLIDE 7

Randomness in computer science

  • Randomized algorithms: does randomness speed up computation?
  • Statistics via sampling: e.g. election polls.
  • Nash equilibrium in game theory: a Nash equilibrium always exists if players can have probabilistic strategies.
  • Cryptography: a secret is only as good as the entropy/uncertainty in it.

slide-8
SLIDE 8

Randomness in computer science

  • Randomized models for deterministic objects: e.g. the www graph.
  • Quantum computing: randomness is inherent in quantum mechanics.
  • Machine learning theory: data is generated by some probability distribution.
  • Coding theory: encode data to be able to deal with random noise.

slide-9
SLIDE 9

Randomness and algorithms

How can randomness be used in computation? Where can it come into the picture? Given some algorithm that solves a problem…

  • What if the input is chosen randomly?
  • What if the algorithm can make random choices?

slide-11
SLIDE 11

Randomness and algorithms

What is a randomized algorithm?

A randomized algorithm is an algorithm that is allowed to flip a coin. (It can make decisions based on the output of the coin flip.)

In 15-251: a randomized algorithm is an algorithm that is allowed to call:

  • RandInt(n)
  • Bernoulli(p)

(We'll assume these take O(1) time.)
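These two primitives are easy to model with Python's standard random module (a sketch; the helper names RandInt and Bernoulli simply mirror the interface used on the slide):

```python
import random

def RandInt(n):
    # Uniformly random integer in {1, 2, ..., n}, as in the lecture.
    return random.randint(1, n)

def Bernoulli(p):
    # Returns 1 with probability p, and 0 with probability 1 - p.
    return 1 if random.random() < p else 0
```

Both calls take O(1) time, matching the convention above.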

slide-12
SLIDE 12

Randomness and algorithms

An example:

    def f(x):
        y = Bernoulli(0.5)
        if y == 0:
            while x > 0:
                print("What up?")
                x = x - 1
        return x + y

For a fixed input (e.g. x = 3):

  • the output can vary
  • the running time can vary

slide-13
SLIDE 13

Randomness and algorithms

For a randomized algorithm, how should we:

  • measure its correctness?
  • measure its running time?

If we require that it is

  • always correct, and
  • always runs in time O(T(n)),

then we have a deterministic algorithm with time complexity O(T(n)). (Why? Fix the random bits to any one value, say all 0s; correctness and the time bound still hold for that choice, and the algorithm is now deterministic.)

slide-14
SLIDE 14

Randomness and algorithms

So for a randomized algorithm to be interesting:

  • it is not correct all the time, or
  • it doesn't always run in time O(T(n)).

(It either gambles with correctness or with running time.)

slide-15
SLIDE 15

Types of randomized algorithms

Given an array A[1 … n] with n elements (n even). Half of the array contains 0s, the other half contains 1s. Goal: find an index that contains a 1.

Doesn't gamble with correctness, gambles with run-time:

    repeat:
        k = RandInt(n)
        if A[k] = 1, return k

Gambles with correctness, doesn't gamble with run-time:

    repeat 300 times:
        k = RandInt(n)
        if A[k] = 1, return k
    return "Failed"
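The two variants translate directly into Python (a sketch; the array is 0-indexed here, so RandInt(n) becomes random.randrange on the index range):

```python
import random

def find_one_las_vegas(A):
    # Always correct; the running time is random (expected O(1) iterations).
    while True:
        k = random.randrange(len(A))
        if A[k] == 1:
            return k

def find_one_monte_carlo(A):
    # Worst-case O(1) probes; fails with probability at most 1/2^300.
    for _ in range(300):
        k = random.randrange(len(A))
        if A[k] == 1:
            return k
    return "Failed"
```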

slide-16
SLIDE 16

Types of randomized algorithms

    repeat 300 times:
        k = RandInt(n)
        if A[k] = 1, return k
    return "Failed"

Pr[failure] = 1/2^300. Worst-case running time: O(1). This is called a Monte Carlo algorithm (gambles with correctness but not with time).

slide-17
SLIDE 17

Types of randomized algorithms

    repeat:
        k = RandInt(n)
        if A[k] = 1, return k

Pr[failure] = 0. Worst-case running time: can't bound it (could get super unlucky). Expected running time: O(1) (2 iterations). This is called a Las Vegas algorithm (gambles with time but not with correctness).
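The "2 iterations" figure comes from the fact that each probe independently succeeds with probability 1/2, so the number of probes is a Geometric(1/2) random variable with mean 2. A quick empirical sanity check (my own sketch, not part of the lecture):

```python
import random

def iterations_until_one(A):
    # Count random probes into A until we hit a 1.
    count = 0
    while True:
        count += 1
        if A[random.randrange(len(A))] == 1:
            return count

A = [0] * 500 + [1] * 500           # half 0s, half 1s
trials = 20000
avg = sum(iterations_until_one(A) for _ in range(trials)) / trials
# avg should be close to 2, the mean of a Geometric(1/2) variable
```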

slide-18
SLIDE 18

Given an array A[1 … n] with n elements (n even). Half of the array contains 0s, the other half contains 1s. Goal: find an index that contains a 1.

                Correctness    Run-time
Deterministic   always         Ω(n)
Monte Carlo     w.h.p.         O(1)
Las Vegas       always         O(1) w.h.p.

w.h.p. = with high probability

slide-19
SLIDE 19

Formal definition: Monte Carlo algorithm

Let f : Σ* → Σ* be a computational problem. Suppose A is a randomized algorithm such that:

  • ∀x ∈ Σ*, Pr[A(x) ≠ f(x)] ≤ ε
  • ∀x ∈ Σ*, the number of steps A(x) takes is ≤ T(|x|) (no matter what the random choices are).

Then we say A is a T(n)-time Monte Carlo algorithm for f with ε probability of error.

slide-20
SLIDE 20

Formal definition: Las Vegas algorithm

Let f : Σ* → Σ* be a computational problem. Suppose A is a randomized algorithm such that:

  • ∀x ∈ Σ*, A(x) = f(x)
  • ∀x ∈ Σ*, E[number of steps A(x) takes] ≤ T(|x|)

Then we say A is a T(n)-time Las Vegas algorithm for f.

slide-21
SLIDE 21

NEXT ON THE MENU

  • Example of a Monte Carlo algorithm: Min Cut
  • Example of a Las Vegas algorithm: Quicksort

slide-22
SLIDE 22

Example of a Monte Carlo Algorithm: Min Cut Gambles with correctness. Doesn’t gamble with running time.

slide-23
SLIDE 23

Cut Problems

Max Cut Problem (Ryan O'Donnell's favorite problem): Given a graph G = (V, E), color the vertices red and blue so that the number of edges e = {u, v} whose endpoints receive two different colors is maximized.

slide-24
SLIDE 24

Cut Problems

Max Cut Problem (Ryan O'Donnell's favorite problem): Given a graph G = (V, E), find a non-empty subset S ⊂ V such that the number of edges from S to V − S is maximized.

Size of the cut = # edges from S to V − S.

slide-25
SLIDE 25

Cut Problems

Min Cut Problem (my favorite problem): Given a graph G = (V, E), find a non-empty subset S ⊂ V such that the number of edges from S to V − S is minimized.

Size of the cut = # edges from S to V − S.

slide-26
SLIDE 26

Contraction algorithm for min cut

Let's see a super simple randomized algorithm for Min-Cut.

slide-27
SLIDE 27

Contraction algorithm for min cut

(figure: graph on vertices a, b, c, d, e; size of min-cut: 2)

Select an edge randomly: green edge selected. Contract that edge.

slide-28
SLIDE 28

(figure: graph after contracting the green edge)

Contraction algorithm for min cut

Select an edge randomly: green edge selected. Contract that edge (delete self loops). Size of min-cut: 2.

slide-29
SLIDE 29

(figure: current contracted graph)

Contraction algorithm for min cut

Select an edge randomly: purple edge selected. Contract that edge (delete self loops). Size of min-cut: 2.


slide-31
SLIDE 31

(figure: current contracted graph)

Contraction algorithm for min cut

Select an edge randomly: blue edge selected. Contract that edge (delete self loops). Size of min-cut: 2.


slide-33
SLIDE 33

(figure: two super-vertices remain)

Contraction algorithm for min cut

Select an edge randomly: blue edge selected. Contract that edge (delete self loops). When two vertices remain, you have your cut: {a, b, c, d} and {e}, size: 2. Size of min-cut: 2.

slide-34
SLIDE 34

(figure: the original graph again, for a second run)

Contraction algorithm for min cut

Select an edge randomly: green edge selected. Contract that edge (delete self loops). Size of min-cut: 2.


slide-36
SLIDE 36

(figure: current contracted graph)

Contraction algorithm for min cut

Select an edge randomly: yellow edge selected. Contract that edge (delete self loops). Size of min-cut: 2.


slide-38
SLIDE 38

(figure: current contracted graph)

Contraction algorithm for min cut

Select an edge randomly: red edge selected. Contract that edge (delete self loops). Size of min-cut: 2.


slide-40
SLIDE 40

(figure: two super-vertices remain)

Contraction algorithm for min cut

Select an edge randomly: red edge selected. Contract that edge (delete self loops). When two vertices remain, you have your cut: {a} and {b, c, d, e}, size: 3. Size of min-cut: 2. (This run did not find a minimum cut.)

slide-41
SLIDE 41

G = G0 → G1 → G2 → ⋯ → G_(n−2)

(n vertices to start, one contraction per step, down to 2 vertices: n − 2 iterations)

Observation: For any i, a cut in Gi of size k corresponds exactly to a cut in G of size k.
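The contraction process illustrated on the preceding slides can be sketched in Python with an edge-list multigraph (my own minimal implementation, not code from the lecture; the leader/find bookkeeping is just one way to track super-vertices):

```python
import random

def contract_min_cut(edges, n):
    # One run of the contraction algorithm on a connected multigraph.
    # edges: list of (u, v) pairs over vertices 0 .. n-1.
    # Returns the size of the cut this run happens to find.
    leader = list(range(n))          # leader[v] = super-vertex containing v

    def find(v):
        while leader[v] != v:
            v = leader[v]
        return v

    edges = list(edges)
    for _ in range(n - 2):           # n - 2 contractions leave 2 super-vertices
        u, v = random.choice(edges)  # select an edge uniformly at random
        leader[find(u)] = find(v)    # contract it
        edges = [e for e in edges    # delete self loops
                 if find(e[0]) != find(e[1])]
    return len(edges)                # edges crossing the two super-vertices
```

A single run can return a non-minimum cut (as in the second run above); repeating it and keeping the smallest answer is the boosting step discussed later.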

slide-42
SLIDE 42

(figure: a cut in Gi and the corresponding cut of the same size in the original graph G, both shown on vertices a, b, c, d, e)

slide-43
SLIDE 43

Poll

Let k be the size of a minimum cut.

Which of the following are true (can select more than one):

  • For every Gi, k ≤ min_v deg_Gi(v)
  • For G = G0, k ≤ min_v deg_G(v)
  • For every Gi, k ≥ min_v deg_Gi(v)
  • For G = G0, k ≥ min_v deg_G(v)

slide-44
SLIDE 44

Poll

For every Gi, k ≤ min_v deg_Gi(v).

i.e., for every Gi and every v ∈ Gi, k ≤ deg_Gi(v).

Why? A single vertex v forms a cut of size deg(v), and the same cut exists in the original graph.

(figure: in Gi, deg(a) = 3; this cut has size 3, so k ≤ 3)

slide-45
SLIDE 45

Contraction algorithm for min cut

Theorem: Let G = (V, E) be a graph with n vertices. The probability that the contraction algorithm outputs a min-cut is ≥ 1/n².

Should we be impressed?

  • The algorithm runs in polynomial time.
  • There are exponentially many cuts (roughly 2^n).
  • There is a way to boost the probability of success to 1 − 1/e^n (and still remain in polynomial time).

slide-46
SLIDE 46

Proof of theorem

Fix some minimum cut F, corresponding to a partition (S, V − S). Let |F| = k, |V| = n, |E| = m.

Will show: Pr[algorithm outputs F] ≥ 1/n².

(Note Pr[success] ≥ Pr[algorithm outputs F].)

slide-47
SLIDE 47

Proof of theorem

Fix some minimum cut F, corresponding to a partition (S, V − S), with |F| = k, |V| = n, |E| = m.

When does the algorithm output F?

  • What if the algorithm picks an edge in F to contract? Then it cannot output F.
  • What if it never picks an edge in F to contract? Then it will output F.

slide-48
SLIDE 48

Proof of theorem

Fix some minimum cut F as above, with |F| = k, |V| = n, |E| = m.

Let Ei = the event that no edge in F is contracted in iteration i.

Pr[algorithm outputs F] = Pr[algorithm never contracts an edge in F] = Pr[E1 ∩ E2 ∩ ⋯ ∩ E_(n−2)]

slide-49
SLIDE 49

Let Ei = the event that no edge in F is contracted in iteration i.
Goal: Pr[E1 ∩ E2 ∩ ⋯ ∩ E_(n−2)] ≥ 1/n².

By the chain rule:

Pr[E1 ∩ E2 ∩ ⋯ ∩ E_(n−2)] = Pr[E1] · Pr[E2 | E1] · Pr[E3 | E1 ∩ E2] ⋯ Pr[E_(n−2) | E1 ∩ ⋯ ∩ E_(n−3)]

We want to write Pr[E1] in terms of k and n:

Pr[E1] = 1 − (# edges in F)/(total # edges) = 1 − k/m

slide-50
SLIDE 50

Proof of theorem

Let Ei = the event that no edge in F is contracted in iteration i.
Goal: Pr[E1 ∩ E2 ∩ ⋯ ∩ E_(n−2)] ≥ 1/n².

Observation: ∀v ∈ V: k ≤ deg(v).

Recall: Σ_{v ∈ V} deg(v) = 2m, so 2m ≥ kn, i.e. m ≥ kn/2.

Pr[E1] = 1 − k/m ≥ 1 − k/(kn/2) = 1 − 2/n

slide-51
SLIDE 51

Proof of theorem

Let Ei = the event that no edge in F is contracted in iteration i.
Goal: Pr[E1 ∩ E2 ∩ ⋯ ∩ E_(n−2)] ≥ 1/n².

Pr[E1 ∩ E2 ∩ ⋯ ∩ E_(n−2)] = Pr[E1] · Pr[E2 | E1] · Pr[E3 | E1 ∩ E2] ⋯ Pr[E_(n−2) | E1 ∩ ⋯ ∩ E_(n−3)] ≥ (1 − 2/n) · Pr[E2 | E1] ⋯

Next we want to write Pr[E2 | E1] in terms of k and n:

Pr[E2 | E1] = 1 − k/(# remaining edges)

slide-52
SLIDE 52

Proof of theorem

Let Ei = the event that no edge in F is contracted in iteration i.
Goal: Pr[E1 ∩ E2 ∩ ⋯ ∩ E_(n−2)] ≥ 1/n².

Let G′ = (V′, E′) be the graph after iteration 1.

Observation: ∀v ∈ V′: k ≤ deg_G′(v).

Σ_{v ∈ V′} deg_G′(v) = 2|E′| ≥ k(n − 1), so |E′| ≥ k(n − 1)/2.

Pr[E2 | E1] = 1 − k/|E′| ≥ 1 − k/(k(n − 1)/2) = 1 − 2/(n − 1)

slide-53
SLIDE 53

Proof of theorem

Let Ei = the event that no edge in F is contracted in iteration i.
Goal: Pr[E1 ∩ E2 ∩ ⋯ ∩ E_(n−2)] ≥ 1/n².

Pr[E1 ∩ E2 ∩ ⋯ ∩ E_(n−2)] ≥ (1 − 2/n) · (1 − 2/(n − 1)) · Pr[E3 | E1 ∩ E2] ⋯ Pr[E_(n−2) | E1 ∩ ⋯ ∩ E_(n−3)]

Repeating the same argument for each subsequent iteration gives one such factor per contraction.

slide-54
SLIDE 54

Proof of theorem

Let Ei = the event that no edge in F is contracted in iteration i.
Goal: Pr[E1 ∩ E2 ∩ ⋯ ∩ E_(n−2)] ≥ 1/n².

Pr[E1 ∩ E2 ∩ ⋯ ∩ E_(n−2)]
  ≥ (1 − 2/n)(1 − 2/(n − 1))(1 − 2/(n − 2)) ⋯ (1 − 2/(n − (n − 4)))(1 − 2/(n − (n − 3)))
  = ((n − 2)/n)((n − 3)/(n − 1))((n − 4)/(n − 2))((n − 5)/(n − 3)) ⋯ (2/4)(1/3)
  = 2/(n(n − 1)) ≥ 1/n²
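The telescoping product can be verified exactly with rational arithmetic (a quick check of my own, not part of the lecture):

```python
from fractions import Fraction

def contraction_success_lower_bound(n):
    # Product of (1 - 2/(n - i)) for i = 0, ..., n - 3,
    # i.e. (1 - 2/n)(1 - 2/(n-1)) ... (1 - 2/3).
    p = Fraction(1)
    for i in range(n - 2):
        p *= 1 - Fraction(2, n - i)
    return p

# Telescoping: the product collapses to 2/(n(n-1)), which is >= 1/n^2.
```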


slide-56
SLIDE 56

Contraction algorithm for min cut

Theorem: Let G = (V, E) be a graph with n vertices. The probability that the contraction algorithm outputs a min-cut is ≥ 1/n².

Should we be impressed?

  • The algorithm runs in polynomial time.
  • There are exponentially many cuts (roughly 2^n).
  • There is a way to boost the probability of success to 1 − 1/e^n (and still remain in polynomial time).

slide-57
SLIDE 57

Boosting phase

Run the algorithm t times using fresh random bits. Output the smallest cut among the ones you find.

(figure: t independent runs of the Contraction Algorithm on G, producing cuts F1, F2, F3, …, Ft; output the minimum among the Fi's)

Larger t ⇒ better success probability. What is the relation between t and the success probability?

slide-58
SLIDE 58

Boosting phase

What is the relation between t and the success probability?

Let Ai = the event that in the i'th repetition, we don't find a min cut.

Pr[error] = Pr[don't find a min cut] = Pr[A1 ∩ A2 ∩ ⋯ ∩ At]
          = Pr[A1] Pr[A2] ⋯ Pr[At]   (independent events)
          = Pr[A1]^t ≤ (1 − 1/n²)^t

slide-59
SLIDE 59

Boosting phase

Pr[error] ≤ (1 − 1/n²)^t

Extremely useful inequality: ∀x ∈ ℝ: 1 + x ≤ e^x

slide-60
SLIDE 60

Boosting phase

Extremely useful inequality: ∀x ∈ ℝ: 1 + x ≤ e^x. Take x = −1/n²:

Pr[error] ≤ (1 + x)^t ≤ (e^x)^t = e^(xt) = e^(−t/n²)

Let t = n³ ⇒ Pr[error] ≤ e^(−n³/n²) = 1/e^n ⇒ Pr[success] ≥ 1 − 1/e^n
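Numerically, the chain of inequalities is easy to check for small n (a sketch of my own; with t = n³ the bound is already astronomically small):

```python
import math

def boosted_error_bound(n):
    # After t = n^3 independent runs of the contraction algorithm,
    # Pr[error] <= (1 - 1/n^2)^t.
    t = n ** 3
    return (1 - 1 / n**2) ** t

# By 1 + x <= e^x with x = -1/n^2, this is at most e^(-t/n^2) = e^(-n).
```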

slide-61
SLIDE 61

Conclusion for min cut

We have a polynomial-time algorithm that solves the min cut problem with probability 1 − 1/e^n. Theoretically, not equal to 1. Practically, equal to 1.

slide-62
SLIDE 62

Important Note: We can boost the success probability of Monte Carlo algorithms via repeated trials. Boosting is not specific to the Min-cut algorithm.
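A generic boosting wrapper for any Monte Carlo algorithm whose answers can be compared (my own sketch; the parameters `alg` and `better` are hypothetical names, not from the lecture):

```python
import random

def boost(alg, better, t):
    # Run Monte Carlo algorithm `alg` t times with fresh randomness
    # and keep the best answer found, according to `better(a, b)`.
    best = alg()
    for _ in range(t - 1):
        candidate = alg()
        if better(candidate, best):
            best = candidate
    return best

# Toy example: each run "finds a cut" of random size; boosting keeps the smallest.
smallest = boost(lambda: random.randrange(10), lambda a, b: a < b, 500)
```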

slide-63
SLIDE 63

Example of a Las Vegas Algorithm: Quicksort Doesn’t gamble with correctness. Gambles with running time.

slide-64
SLIDE 64

Quicksort Algorithm

On input S = (x1, x2, …, xn):

  • If n ≤ 1, return S.

(example input: 8 2 7 99 5 4)

slide-65
SLIDE 65

Quicksort Algorithm

On input S = (x1, x2, …, xn):

  • If n ≤ 1, return S.
  • Pick uniformly at random a "pivot" xm.

(example input: 8 2 7 99 5 4; pivot 4 is picked)

slide-67
SLIDE 67

Quicksort Algorithm

On input S = (x1, x2, …, xn):

  • If n ≤ 1, return S.
  • Pick uniformly at random a "pivot" xm.
  • Compare xm to all other x's. Let S1 = {xi : xi < xm}, S2 = {xi : xi > xm}.

(example: pivot 4 is compared to 8, 2, 7, 99, 5)


slide-70
SLIDE 70

Quicksort Algorithm

On input S = (x1, x2, …, xn):

  • If n ≤ 1, return S.
  • Pick uniformly at random a "pivot" xm.
  • Compare xm to all other x's. Let S1 = {xi : xi < xm}, S2 = {xi : xi > xm}.
  • Recursively sort S1 and S2.

(example: pivot 4; S1 = {2}, S2 = {8, 7, 99, 5})
slide-72
SLIDE 72

Quicksort Algorithm

On input S = (x1, x2, …, xn):

  • If n ≤ 1, return S.
  • Pick uniformly at random a "pivot" xm.
  • Compare xm to all other x's. Let S1 = {xi : xi < xm}, S2 = {xi : xi > xm}.
  • Recursively sort S1 and S2.
  • Return [S1, xm, S2].

(example: pivot 4; S1 = {2}, S2 = {8, 7, 99, 5}; sorted output: 2 4 5 7 8 99)
slide-73
SLIDE 73

Quicksort Algorithm

This is a Las Vegas algorithm:

  • always gives the correct answer
  • running time can vary depending on our luck

It is not too difficult to show that the expected run-time is ≤ 2n ln n = O(n log n). In practice, it is basically the fastest sorting algorithm!
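The algorithm on the slides translates almost line by line into Python (a sketch; like the slides, it assumes distinct elements, and it copies sublists rather than partitioning in place):

```python
import random

def quicksort(S):
    # Las Vegas: always returns the sorted list; time varies with pivot luck.
    if len(S) <= 1:
        return S
    pivot = random.choice(S)              # uniformly random pivot xm
    S1 = [x for x in S if x < pivot]      # elements below the pivot
    S2 = [x for x in S if x > pivot]      # elements above the pivot
    return quicksort(S1) + [pivot] + quicksort(S2)
```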

slide-74
SLIDE 74

Final remarks

Randomized algorithms can be faster and much more elegant than their deterministic counterparts.

There are some interesting problems for which:

  • there is a poly-time randomized algorithm,
  • we can't find a poly-time deterministic algorithm.

Another (morally) million dollar question: Is P = BPP? That is, does every efficient randomized algorithm have an efficient deterministic counterpart?

Randomness adds an interesting dimension to computation.