SLIDE 1 Advanced Algorithms (III)
Shanghai Jiao Tong University
Chihao Zhang
March 16th, 2020
SLIDE 2
Balls-into-Bins
SLIDE 3
Balls-into-Bins
Throw m balls into n bins uniformly at random
SLIDE 4 Balls-into-Bins
Throw m balls into n bins uniformly at random
- What is the chance that some bin contains more than one ball? (Birthday paradox)
SLIDE 5 Balls-into-Bins
Throw m balls into n bins uniformly at random
- What is the chance that some bin contains more than one ball? (Birthday paradox)
- How many balls in the fullest bin? (Max load)
SLIDE 6 Balls-into-Bins
Throw m balls into n bins uniformly at random
- What is the chance that some bin contains more than one ball? (Birthday paradox)
- How many balls does one need to throw to hit all bins? (Coupon Collector)
- How many balls in the fullest bin? (Max load)
SLIDE 7
Birthday Paradox
SLIDE 8
Birthday Paradox
In a group of more than 30 people, with very high probability two of them share the same birthday
SLIDE 9 Birthday Paradox
In a group of more than 30 people, with very high probability two of them share the same birthday
Pr[no same birthday] = 1 ⋅ ((n−1)/n) ⋅ ((n−2)/n) ⋯ ((n−m+1)/n) = ∏_{i=1}^{m−1} (1 − i/n) ≤ exp(−∑_{i=1}^{m−1} i/n) = exp(−m(m−1)/(2n)) (using 1 − x ≤ e^{−x})
SLIDE 10
SLIDE 11
Pr[no same birthday] ≤ exp(−m(m−1)/(2n))
SLIDE 12
Pr[no same birthday] ≤ exp(−m(m−1)/(2n))
For m = 30, n = 365, the probability is less than 0.304
SLIDE 13
Pr[no same birthday] ≤ exp(−m(m−1)/(2n))
For m = 30, n = 365, the probability is less than 0.304
For m = O(√n), the probability can be arbitrarily close to 0.
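This bound is easy to sanity-check by simulation. A minimal Python sketch (the function name, trial count, and seed are my own choices, not from the slides):

```python
import math
import random

def no_collision_prob(m, n, trials=20000, seed=0):
    """Estimate Pr[all m balls land in distinct bins] by simulation."""
    rng = random.Random(seed)
    ok = sum(
        len({rng.randrange(n) for _ in range(m)}) == m  # all m bins distinct?
        for _ in range(trials)
    )
    return ok / trials

# The slides' bound: Pr[no same birthday] <= exp(-m(m-1)/(2n))
m, n = 30, 365
bound = math.exp(-m * (m - 1) / (2 * n))
est = no_collision_prob(m, n)
print(round(bound, 3), round(est, 3))
```

The estimate should land somewhat below the analytic bound of roughly 0.304, since the bound discards lower-order terms.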
SLIDE 14
Max Load
SLIDE 15
Max Load
Let X_i be the number of balls in the i-th bin
SLIDE 16 Max Load
Let X_i be the number of balls in the i-th bin
What is X = max_{i∈[n]} X_i? We analyze this when m = n
SLIDE 17 Max Load
Let X_i be the number of balls in the i-th bin
What is X = max_{i∈[n]} X_i? We analyze this when m = n
If we can argue that X_1 is less than k with probability 1 − O(1/n), then by the union bound, Pr[X ≥ k] = O(1)
SLIDE 18
SLIDE 19
Again by the union bound, Pr[X_1 ≥ k] ≤ (n choose k) ⋅ n^{−k} ≤ 1/k!
SLIDE 20 Again by the union bound, Pr[X_1 ≥ k] ≤ (n choose k) ⋅ n^{−k} ≤ 1/k!
We apply Stirling's formula: k! ≈ √(2πk) ⋅ (k/e)^k
SLIDE 21 Again by the union bound, Pr[X_1 ≥ k] ≤ (n choose k) ⋅ n^{−k} ≤ 1/k!
We apply Stirling's formula: k! ≈ √(2πk) ⋅ (k/e)^k
So Pr[X_1 ≥ k] ≤ 1/k! ≤ (e/k)^k
SLIDE 22 Again by the union bound, Pr[X_1 ≥ k] ≤ (n choose k) ⋅ n^{−k} ≤ 1/k!
We apply Stirling's formula: k! ≈ √(2πk) ⋅ (k/e)^k
So Pr[X_1 ≥ k] ≤ 1/k! ≤ (e/k)^k
We want (e/k)^k = O(1/n). Choose k = O(log n / log log n)
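The log n / log log n growth of the max load can be observed empirically. A sketch (the parameter choices and seed are mine; the slides do not pin down the constant factor):

```python
import math
import random

def max_load(n, seed=0):
    """Throw n balls into n bins u.a.r.; return the load of the fullest bin."""
    rng = random.Random(seed)
    counts = [0] * n
    for _ in range(n):
        counts[rng.randrange(n)] += 1
    return max(counts)

loads = {n: max_load(n) for n in (10**3, 10**4, 10**5)}
for n, load in loads.items():
    scale = math.log(n) / math.log(math.log(n))  # predicted growth rate
    print(n, load, round(scale, 2))
```

The observed maxima grow very slowly with n, tracking log n / log log n up to a small constant factor.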
SLIDE 23
Concentration Bounds
SLIDE 24
Concentration Bounds
We shall develop general tools to obtain “with high probability” results…
SLIDE 25
Concentration Bounds
We shall develop general tools to obtain “with high probability” results… These results are critical for analyzing randomized algorithms
SLIDE 26
Concentration Bounds
We shall develop general tools to obtain “with high probability” results… These results are critical for analyzing randomized algorithms. This is the main topic of the coming 4-5 weeks
SLIDE 27
Markov Inequality
SLIDE 28
Markov Inequality
Markov Inequality: For any nonnegative random variable X and a > 0, Pr[X > a] ≤ E[X]/a
SLIDE 29
Markov Inequality
Markov Inequality: For any nonnegative random variable X and a > 0, Pr[X > a] ≤ E[X]/a
Proof. E[X] = E[X ∣ X > a] ⋅ Pr[X > a] + E[X ∣ X ≤ a] ⋅ Pr[X ≤ a] ≥ a ⋅ Pr[X > a]
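As a tiny concrete check of the inequality (an example of my own, not from the slides), take X to be the roll of a fair six-sided die:

```python
from fractions import Fraction

# X = outcome of a fair six-sided die: nonnegative, E[X] = 7/2.
values = [1, 2, 3, 4, 5, 6]
EX = Fraction(sum(values), len(values))
a = 5
tail = Fraction(sum(1 for v in values if v > a), len(values))  # Pr[X > 5]
print(tail, "<=", EX / a, ":", tail <= EX / a)  # -> 1/6 <= 7/10 : True
```

Here Markov gives the (loose) bound 7/10 on a tail whose true value is 1/6; the inequality is tight only for two-point distributions.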
SLIDE 30
Applications
SLIDE 31 Applications
- A Las Vegas randomized algorithm with expected running time O(n) terminates in O(n²) time with probability 1 − O(1/n)
SLIDE 32 Applications
- A Las Vegas randomized algorithm with expected running time O(n) terminates in O(n²) time with probability 1 − O(1/n)
- In the n-balls-into-n-bins problem, E[X_i] = 1. So Pr[X_1 > log n / log log n] ≤ log log n / log n
SLIDE 33 Applications
- A Las Vegas randomized algorithm with expected running time O(n) terminates in O(n²) time with probability 1 − O(1/n)
- In the n-balls-into-n-bins problem, E[X_i] = 1. So Pr[X_1 > log n / log log n] ≤ log log n / log n
This is far from the truth…
SLIDE 34
Chebyshev’s Inequality
SLIDE 35
Chebyshev’s Inequality
A common trick to improve concentration is to consider E[f(X)] instead of E[X] for some non-decreasing f : ℝ → ℝ
SLIDE 36
Chebyshev’s Inequality
A common trick to improve concentration is to consider E[f(X)] instead of E[X] for some non-decreasing f : ℝ → ℝ
Pr[X ≥ a] = Pr[f(X) ≥ f(a)] ≤ E[f(X)] / f(a)
SLIDE 37
Chebyshev’s Inequality
A common trick to improve concentration is to consider E[f(X)] instead of E[X] for some non-decreasing f : ℝ → ℝ
Pr[X ≥ a] = Pr[f(X) ≥ f(a)] ≤ E[f(X)] / f(a)
f(x) = x² gives Chebyshev’s inequality
SLIDE 38 Chebyshev’s Inequality
A common trick to improve concentration is to consider E[f(X)] instead of E[X] for some non-decreasing f : ℝ → ℝ
Pr[X ≥ a] = Pr[f(X) ≥ f(a)] ≤ E[f(X)] / f(a)
f(x) = x² gives Chebyshev’s inequality:
Pr[X ≥ a] ≤ E[X²]/a², or Pr[|X − E[X]| ≥ a] ≤ Var[X]/a²
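To see why using the second moment helps, compare the two bounds on a Binomial(n, 1/2) tail (an example of my own, not from the slides):

```python
# X ~ Binomial(n, 1/2): E[X] = n/2, Var[X] = n/4.
# Bound Pr[X >= 3n/4] by Markov and by Chebyshev.
n = 100
EX, VarX = n / 2, n / 4
a = 3 * n / 4
markov = EX / a                      # = 2/3, independent of n
chebyshev = VarX / (a - EX) ** 2     # = (n/4) / (n/4)^2 = 4/n
print(round(markov, 3), round(chebyshev, 3))  # -> 0.667 0.04
```

Markov’s bound stays at 2/3 no matter how large n is, while Chebyshev’s bound 4/n vanishes as n grows.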
SLIDE 39
Coupon Collector
SLIDE 40
Coupon Collector
Recall that the coupon collector problem asks:
SLIDE 41 Coupon Collector
Recall that the coupon collector problem asks:
“How many balls does one need to throw so that none of the n bins is empty?”
SLIDE 42 Coupon Collector
Recall that the coupon collector problem asks:
“How many balls does one need to throw so that none of the n bins is empty?”
We already established that E[X] = nH_n ≈ n(log n + γ)
SLIDE 43 Coupon Collector
Recall that the coupon collector problem asks:
“How many balls does one need to throw so that none of the n bins is empty?”
We already established that E[X] = nH_n ≈ n(log n + γ)
The Markov inequality only provides very weak concentration…
SLIDE 44
SLIDE 45
In order to apply Chebyshev’s inequality, we need to compute
Var[X] = E[X²] − (E[X])²
SLIDE 46 In order to apply Chebyshev’s inequality, we need to compute
Var[X] = E[X²] − (E[X])²
Recall that X = ∑_{i=0}^{n−1} X_i, where each X_i follows the geometric distribution with parameter (n−i)/n
SLIDE 47 In order to apply Chebyshev’s inequality, we need to compute
Var[X] = E[X²] − (E[X])²
Recall that X = ∑_{i=0}^{n−1} X_i, where each X_i follows the geometric distribution with parameter (n−i)/n
X_0, …, X_{n−1} are independent, so
SLIDE 48 In order to apply Chebyshev’s inequality, we need to compute
Var[X] = E[X²] − (E[X])²
Recall that X = ∑_{i=0}^{n−1} X_i, where each X_i follows the geometric distribution with parameter (n−i)/n
X_0, …, X_{n−1} are independent, so
Var[∑_{i=0}^{n−1} X_i] = ∑_{i=0}^{n−1} Var[X_i]
SLIDE 49 Variance of Geometric Variables
Assume Y follows the geometric distribution with parameter p
E[Y²] = ∑_{i=1}^{∞} i²(1 − p)^{i−1} p = (2 − p)/p²
Var[Y] = E[Y²] − (E[Y])² = (1 − p)/p²
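The closed forms can be checked by truncating the series numerically (a sketch; the cutoff is my choice and merely needs to make the tail negligible):

```python
def truncated_moments(p, cutoff=5000):
    """Approximate E[Y] and E[Y^2] for Y ~ Geometric(p) on {1, 2, ...}."""
    EY = sum(i * (1 - p) ** (i - 1) * p for i in range(1, cutoff))
    EY2 = sum(i * i * (1 - p) ** (i - 1) * p for i in range(1, cutoff))
    return EY, EY2

p = 1 / 3
EY, EY2 = truncated_moments(p)
# Slides: E[Y^2] = (2 - p)/p^2 = 15 and Var[Y] = (1 - p)/p^2 = 6 for p = 1/3
print(round(EY2, 6), round(EY2 - EY * EY, 6))
```

For p = 1/3 the truncated sums reproduce E[Y²] = 15 and Var[Y] = 6 to floating-point accuracy.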
SLIDE 50
SLIDE 51 Var[X] = ∑_{i=0}^{n−1} Var[X_i] = ∑_{i=0}^{n−1} n⋅i/(n−i)² ≤ n² ⋅ ∑_{i=0}^{n−1} 1/(n−i)² = n² ⋅ (1/1² + 1/2² + … + 1/n²) ≤ π²n²/6.
SLIDE 52 Var[X] = ∑_{i=0}^{n−1} Var[X_i] = ∑_{i=0}^{n−1} n⋅i/(n−i)² ≤ n² ⋅ ∑_{i=0}^{n−1} 1/(n−i)² = n² ⋅ (1/1² + 1/2² + … + 1/n²) ≤ π²n²/6.
By Chebyshev’s inequality, Pr[X ≥ nH_n + cn] ≤ π²/(6c²)
SLIDE 53 Var[X] = ∑_{i=0}^{n−1} Var[X_i] = ∑_{i=0}^{n−1} n⋅i/(n−i)² ≤ n² ⋅ ∑_{i=0}^{n−1} 1/(n−i)² = n² ⋅ (1/1² + 1/2² + … + 1/n²) ≤ π²n²/6.
By Chebyshev’s inequality, Pr[X ≥ nH_n + cn] ≤ π²/(6c²)
The use of Chebyshev’s inequality is often referred to as the “second-moment method”
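The concentration of X around n⋅H_n shows up clearly in simulation (a sketch; the trial count and seeds are my choices):

```python
import random

def throws_to_fill(n, seed):
    """Throw balls into n bins u.a.r. until no bin is empty; count the throws."""
    rng = random.Random(seed)
    seen, throws = set(), 0
    while len(seen) < n:
        seen.add(rng.randrange(n))
        throws += 1
    return throws

n = 1000
samples = [throws_to_fill(n, seed) for seed in range(200)]
mean = sum(samples) / len(samples)
nHn = n * sum(1 / k for k in range(1, n + 1))  # E[X] = n * H_n (~7485 here)
print(round(mean), round(nHn))
```

The empirical mean lands within a few percent of n⋅H_n, and the spread of the samples is of order n, matching Var[X] ≤ π²n²/6.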
SLIDE 54
Random Graph
SLIDE 55
Random Graph
Erdős–Rényi random graph G(n, p)
SLIDE 56
Random Graph
Erdős–Rényi random graph G(n, p)
n vertices, each edge appears with probability p independently
SLIDE 57
Random Graph
Erdős–Rényi random graph G(n, p)
n vertices, each edge appears with probability p independently
Given a graph property P, define its threshold function r(n) as:
SLIDE 58 Random Graph
Erdős–Rényi random graph G(n, p)
n vertices, each edge appears with probability p independently
Given a graph property P, define its threshold function r(n) as:
- if p ≪ r(n), G ∼ G(n, p) does not satisfy P whp;
- if p ≫ r(n), G ∼ G(n, p) satisfies P whp.
SLIDE 59
SLIDE 60
We will show that the property P = “G contains a 4-clique” has threshold function n^{−2/3}
SLIDE 61
We will show that the property P = “G contains a 4-clique” has threshold function n^{−2/3}
For every S ∈ ([n] choose 4), let X_S be the indicator of the event “G[S] is a clique”.
SLIDE 62 We will show that the property P = “G contains a 4-clique” has threshold function n^{−2/3}
For every S ∈ ([n] choose 4), let X_S be the indicator of the event “G[S] is a clique”.
Let X = ∑_{S ∈ ([n] choose 4)} X_S; then G satisfies P iff X > 0.
SLIDE 63
SLIDE 64 Then E[X] = ∑_{S ∈ ([n] choose 4)} E[X_S] ≈ n⁴p⁶/24.
SLIDE 65 Then E[X] = ∑_{S ∈ ([n] choose 4)} E[X_S] ≈ n⁴p⁶/24.
If p ≪ n^{−2/3}, then E[X] = o(1). So by the Markov inequality
SLIDE 66 Then E[X] = ∑_{S ∈ ([n] choose 4)} E[X_S] ≈ n⁴p⁶/24.
If p ≪ n^{−2/3}, then E[X] = o(1). So by the Markov inequality
Pr[X ≥ 1] ≤ E[X] = o(1)
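The approximation E[X] ≈ n⁴p⁶/24 uses (n choose 4) ≈ n⁴/24; a quick numeric check of how tight this is (the values of n and p are my own):

```python
from math import comb

n, p = 1000, 0.01
exact = comb(n, 4) * p ** 6          # E[X] = C(n, 4) * p^6 exactly
approx = n ** 4 / 24 * p ** 6        # the slides' n^4 p^6 / 24
print(round(exact / approx, 4))      # -> 0.994
```

Already at n = 1000 the two expressions differ by less than one percent, so the approximation is harmless for the asymptotics.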
SLIDE 67
SLIDE 68
It is not necessarily true that E[X] = Ω(1) implies Pr[X > 0] = 1 − o(1). (Why?)
SLIDE 69
It is not necessarily true that E[X] = Ω(1) implies Pr[X > 0] = 1 − o(1). (Why?)
We require some control over Var[X]
SLIDE 70
It is not necessarily true that E[X] = Ω(1) implies Pr[X > 0] = 1 − o(1). (Why?)
We require some control over Var[X]. By Chebyshev’s inequality,
SLIDE 71
It is not necessarily true that E[X] = Ω(1) implies Pr[X > 0] = 1 − o(1). (Why?)
We require some control over Var[X]. By Chebyshev’s inequality,
Pr[X = 0] ≤ Pr[|X − E[X]| ≥ E[X]] ≤ Var[X]/E[X]² = E[X²]/E[X]² − 1
SLIDE 72
It is not necessarily true that E[X] = Ω(1) implies Pr[X > 0] = 1 − o(1). (Why?)
We require some control over Var[X]. By Chebyshev’s inequality,
Pr[X = 0] ≤ Pr[|X − E[X]| ≥ E[X]] ≤ Var[X]/E[X]² = E[X²]/E[X]² − 1
A sufficient condition is E[X²] = (1 + o(1)) ⋅ E[X]²
SLIDE 73
SLIDE 74 E[X²] − E[X]² = E[(∑_{S ∈ ([n] choose 4)} X_S)²] − (E[∑_{S ∈ ([n] choose 4)} X_S])²
= ∑_{S,T ∈ ([n] choose 4): |S∩T|=2} (E[X_S ⋅ X_T] − E[X_S]E[X_T]) + ∑_{S,T ∈ ([n] choose 4): |S∩T|=3} (E[X_S ⋅ X_T] − E[X_S]E[X_T]) + ∑_{S ∈ ([n] choose 4)} (E[X_S²] − E[X_S]²)
(pairs with |S∩T| ≤ 1 are independent, so their covariances vanish)
≤ n⁶p¹¹ + n⁵p⁹ + n⁴p⁶ = o(E[X]²) when p ≫ n^{−2/3}
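Both directions of the threshold behavior show up in a brute-force simulation (a sketch; n, the constants 0.1 and 10, and the trial counts are my choices, not from the slides):

```python
import random
from itertools import combinations

def has_4_clique(n, p, rng):
    """Sample G(n, p) and look for a 4-clique by checking all 4-subsets."""
    adj = [[False] * n for _ in range(n)]
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adj[u][v] = adj[v][u] = True
    return any(all(adj[a][b] for a, b in combinations(S, 2))
               for S in combinations(range(n), 4))

rng = random.Random(0)
n, r = 40, 40 ** (-2 / 3)            # threshold r(n) = n^(-2/3)
below = sum(has_4_clique(n, 0.1 * r, rng) for _ in range(30))
above = sum(has_4_clique(n, 10 * r, rng) for _ in range(30))
print(below, above)                  # near 0 below the threshold, near 30 above
```

Even at a small n, sampling well below the threshold essentially never produces a 4-clique, while sampling well above it essentially always does.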