SLIDE 1 Hoeffding’s Bound
Theorem Let X1, . . . , Xn be independent random variables with E[Xi] = µi and Pr(Bi ≤ Xi ≤ Bi + ci) = 1. Then
Pr(|∑_{i=1}^n Xi − ∑_{i=1}^n µi| ≥ ε) ≤ 2e^{−2ε²/∑_{i=1}^n ci²} .
Do we need independence?
SLIDE 2
Martingales
Definition A sequence of random variables Z0, Z1, . . . is a martingale with respect to the sequence X0, X1, . . . if for all n ≥ 0 the following hold:
1 Zn is a function of X0, X1, . . . , Xn;
2 E[|Zn|] < ∞;
3 E[Zn+1 | X0, X1, . . . , Xn] = Zn.
Definition A sequence of random variables Z0, Z1, . . . is a martingale when it is a martingale with respect to itself, that is
1 E[|Zn|] < ∞;
2 E[Zn+1 | Z0, Z1, . . . , Zn] = Zn.
SLIDE 3 Conditional Expectation
Definition E[Y | Z = z] = ∑_y y Pr(Y = y | Z = z) , where the summation is over all y in the range of Y .
Lemma For any random variables X and Y ,
E[X] = E_Y[E_X[X | Y ]] = ∑_y Pr(Y = y) E[X | Y = y] , where the sum is over all values y in the range of Y .
SLIDE 4 Lemma For any random variables X and Y ,
E[X] = E_Y[E_X[X | Y ]] = ∑_y Pr(Y = y) E[X | Y = y] , where the sum is over all values y in the range of Y .
Proof.
∑_y Pr(Y = y) E[X | Y = y] = ∑_y Pr(Y = y) ∑_x x Pr(X = x | Y = y)
= ∑_x x ∑_y Pr(X = x | Y = y) Pr(Y = y)
= ∑_x x ∑_y Pr(X = x ∩ Y = y)
= ∑_x x Pr(X = x) = E[X].
SLIDE 5 Example
Consider a two phase game:
- Phase I: roll one die. Let X be the outcome.
- Phase II: Flip X fair coins, let Y be the number of HEADs.
- You receive a dollar for each HEAD.
Y is distributed B(X, 1/2), so E[Y | X = a] = a/2, and
E[Y ] = ∑_{i=1}^6 E[Y | X = i] Pr(X = i) = ∑_{i=1}^6 (i/2) · (1/6) = 7/4 .
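The value E[Y] = 7/4 can be checked with a quick simulation (a sketch; the function name is mine, not from the slides):

```python
import random

def two_phase_game(rng):
    # Phase I: roll one die.
    x = rng.randint(1, 6)
    # Phase II: flip x fair coins, count the HEADs.
    return sum(rng.random() < 0.5 for _ in range(x))

rng = random.Random(0)
trials = 200_000
avg = sum(two_phase_game(rng) for _ in range(trials)) / trials
print(avg)  # close to E[Y] = 7/4 = 1.75
```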
SLIDE 6 Conditional Expectation as a Random Variable
Definition The expression E[Y | Z] is a random variable f (Z) that takes on the value E[Y | Z = z] when Z = z.
Consider the outcome of rolling two dice X1, X2, and let X = X1 + X2. Then
E[X | X1] = ∑_x x Pr(X = x | X1) = ∑_{x=X1+1}^{X1+6} x · (1/6) = X1 + 7/2 .
In the two-phase game, E[Y | X] = X/2.
SLIDE 7 If E[Y | Z] is a random variable, it has an expectation.
Theorem E[Y ] = E[E[Y | Z]] .
Here E[X | X1] = X1 + 7/2, so E[E[X | X1]] = E[X1] + 7/2 = 7/2 + 7/2 = 7 .
SLIDE 8
Martingales
Definition A sequence of random variables Z0, Z1, . . . is a martingale with respect to the sequence X0, X1, . . . if for all n ≥ 0 the following hold:
1 Zn is a function of X0, X1, . . . , Xn;
2 E[|Zn|] < ∞;
3 E[Zn+1 | X0, X1, . . . , Xn] = Zn.
Definition A sequence of random variables Z0, Z1, . . . is a martingale when it is a martingale with respect to itself, that is
1 E[|Zn|] < ∞;
2 E[Zn+1 | Z0, Z1, . . . , Zn] = Zn.
SLIDE 9
Martingale Example
A series of fair games (E[gain] = 0), not necessarily independent. Game 1: bet $1. Game i > 1: bet 2i if won in round i − 1; bet i otherwise. Xi = amount won in ith game (Xi < 0 if ith game lost). Zi = total winnings at end of ith game.
SLIDE 10
Example
Xi = amount won in ith game (Xi < 0 if ith game lost). Zi = total winnings at end of ith game. Z1, Z2, . . . is a martingale with respect to X1, X2, . . . : E[Xi] = 0, so E[|Zi|] ≤ ∑_{j≤i} E[|Xj|] < ∞, and E[Zi+1 | X1, X2, . . . , Xi] = Zi + E[Xi+1] = Zi.
SLIDE 11
Gambling Strategies
I play series of fair games (win with probability 1/2). Game 1: bet $1. Game i > 1: bet 2i if I won in round i − 1; bet i otherwise. Xi = amount won in ith game. (Xi < 0 if ith game lost). Zi = total winnings at end of ith game. Assume that (before starting to play) I decide to quit after k games: what are my expected winnings?
SLIDE 12
Lemma Let Z0, Z1, Z2, . . . be a martingale with respect to X0, X1, . . . . For any fixed n, E_{X[1:n]}[Zn] = E[Z0] . (Here X[1 : i] = X1, . . . , Xi.)
Proof. Since the Zi form a martingale, Zi−1 = E_{Xi}[Zi | X0, X1, . . . , Xi−1]. Then
E_{X[1:i−1]}[Zi−1] = E_{X[1:i−1]}[E_{Xi}[Zi | X0, X1, . . . , Xi−1]] = E_{X[1:i]}[Zi] .
Thus, E_{X[1:n]}[Zn] = E_{X[1:n−1]}[Zn−1] = · · · = E[Z0] .
SLIDE 13
Gambling Strategies
I play a series of fair games (win with probability 1/2). Game 1: bet $1. Game i > 1: bet 2i if I won in round i − 1; bet i otherwise. Xi = amount won in ith game (Xi < 0 if ith game lost). Zi = total winnings at end of ith game. Assume that (before starting to play) I decide to quit after k games: what are my expected winnings? By the lemma, E[Zk] = E[Z1] = 0.
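A simulation of this betting scheme illustrates that, for a stopping rule fixed in advance, the expected winnings stay 0 (a sketch; function name and parameters are mine):

```python
import random

def winnings_after_k_games(k, rng):
    # Fair games: win or lose the bet with probability 1/2 each.
    # Bet $1 in game 1; in game i > 1 bet 2i after a win, i after a loss.
    total, won_last = 0, False
    for i in range(1, k + 1):
        bet = 1 if i == 1 else (2 * i if won_last else i)
        won_last = rng.random() < 0.5
        total += bet if won_last else -bet
    return total

rng = random.Random(1)
trials = 200_000
avg = sum(winnings_after_k_games(10, rng) for _ in range(trials)) / trials
print(avg)  # close to E[Z_10] = 0
```

The bets depend on the history, so the Xi are not independent; the simulation still averages to 0 because each game is fair.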
SLIDE 14 A Different Strategy
Same gambling game. What happens if I:
- play a random number of games?
- decide to stop only when I have won $1000?
SLIDE 15 Stopping Time
Definition A non-negative, integer random variable T is a stopping time for the sequence Z0, Z1, . . . if the event “T = n” depends only on the value of random variables Z0, Z1, . . . , Zn. Intuition: corresponds to a strategy for determining when to stop a sequence based only on values seen so far. In the gambling game:
- the first time I win 10 games in a row: a stopping time;
- the last time I win: not a stopping time.
SLIDE 16
Consider again the gambling game: let T be a stopping time. Zi = total winnings at end of ith game. What are my winnings at the stopping time, i.e. E[ZT]? Fair game: E[ZT] = E[Z0] = 0? But “T = first time my total winnings are at least $1000” is a stopping time, and E[ZT] ≥ 1000...
SLIDE 17 Martingale Stopping Theorem
Theorem If Z0, Z1, . . . is a martingale with respect to X1, X2, . . . and T is a stopping time for X1, X2, . . . , then E[ZT] = E[Z0] whenever one of the following holds:
- there is a constant c such that, for all i, |Zi| ≤ c;
- T is bounded;
- E[T] < ∞, and there is a constant c such that E[ |Zi+1 − Zi| | X1, . . . , Xi ] < c.
SLIDE 18 Proof of Martingale Stopping Theorem (Sketch)
We need to show that E[|ZT|] < ∞, so that we can use
E[ZT] = E[Z0] + ∑_{i≤T} E[E[(Zi − Zi−1) | X1, . . . , Xi−1]] .
- If there is a constant c such that, for all i, |Zi| ≤ c, the sum is bounded.
- If T is bounded, the sum has a finite number of terms.
- If E[T] < ∞ and there is a constant c such that E[ |Zi+1 − Zi| | X1, . . . , Xi ] < c, then
E[|ZT|] ≤ E[|Z0|] + ∑_{i=1}^∞ E[ E[ |Zi+1 − Zi| | X1, . . . , Xi ] · 1_{i≤T} ]
≤ E[|Z0|] + c ∑_{i=1}^∞ Pr(T ≥ i) ≤ E[|Z0|] + c E[T] < ∞ .
SLIDE 19 Martingale Stopping Theorem Applications
We play a sequence of fair games with the following stopping rules:
1 T is chosen from a distribution with finite expectation: E[ZT] = E[Z0].
2 T is the first time we have made $1000: E[T] is unbounded.
3 We double the bet until the first win: E[T] = 2, but E[ |Zi+1 − Zi| | X1, . . . , Xi ] is unbounded.
SLIDE 20 Example: The Gambler’s Ruin
- Consider a sequence of independent, fair 2-player gambling games.
- In each round, each player wins or loses $1 with probability 1/2.
- Xi = amount won by player 1 in the ith round.
- If player 1 has lost in round i: Xi < 0.
- Zi = total amount won by player 1 after the ith round.
- Z0 = 0.
- The game ends when one player runs out of money:
- Player 1 must stop when she loses a net ℓ1 dollars (ZT = −ℓ1).
- Player 2 must stop when she loses a net ℓ2 dollars (ZT = ℓ2).
- q = probability the game ends with player 1 winning ℓ2 dollars.
SLIDE 21 Example: The Gambler’s Ruin
- T = first time player 1 wins ℓ2 dollars or loses ℓ1 dollars.
- T is a stopping time for X1, X2, . . . .
- Z0, Z1, . . . is a martingale.
- Zi’s are bounded.
- Martingale Stopping Theorem: E[ZT] = E[Z0] = 0.
E[ZT] = qℓ2 − (1 − q)ℓ1 = 0  ⟹  q = ℓ1/(ℓ1 + ℓ2) .
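The formula q = ℓ1/(ℓ1 + ℓ2) can be checked by simulating the random walk until one player is ruined (a sketch; names and parameters are mine):

```python
import random

def player1_wins(l1, l2, rng):
    # Fair +/-1 walk; player 1 stops at -l1 (ruin) or +l2 (wins l2 dollars).
    z = 0
    while -l1 < z < l2:
        z += 1 if rng.random() < 0.5 else -1
    return z == l2

rng = random.Random(2)
l1, l2, trials = 3, 5, 100_000
q_hat = sum(player1_wins(l1, l2, rng) for _ in range(trials)) / trials
print(q_hat)  # close to l1 / (l1 + l2) = 0.375
```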
SLIDE 22 Example: A Ballot Theorem
- Candidate A and candidate B run for an election.
- Candidate A gets a votes.
- Candidate B gets b votes.
- a > b.
- Votes are counted in random order:
- chosen from all permutations on all n = a + b votes.
- What is the probability that A is always ahead in the count?
SLIDE 23 Example: A Ballot Theorem
- Si = number of votes A is leading by after i votes are counted
- (if A is trailing, Si < 0).
- Sn = a − b.
- For 0 ≤ k ≤ n − 1: Xk = S_{n−k}/(n − k).
- Consider X0, X1, . . . , X_{n−1}.
- This sequence goes backward in time!
E[Xk | X0, X1, . . . , Xk−1] = ?
SLIDE 24 Example: A Ballot Theorem
E[Xk | X0, X1, . . . , Xk−1] = ?
- Conditioning on X0, X1, . . . , Xk−1 is equivalent to conditioning on Sn, Sn−1, . . . , S_{n−k+1}.
- ai = number of votes for A after the first i votes are counted.
- The (n − k + 1)th vote is a random vote among these first n − k + 1 votes.
S_{n−k} = S_{n−k+1} + 1 with prob. (n − k + 1 − a_{n−k+1})/(n − k + 1)  (the (n − k + 1)th vote is for B),
S_{n−k} = S_{n−k+1} − 1 with prob. a_{n−k+1}/(n − k + 1)  (the (n − k + 1)th vote is for A).
SLIDE 25
E[S_{n−k} | S_{n−k+1}] = (S_{n−k+1} + 1) · (n − k + 1 − a_{n−k+1})/(n − k + 1) + (S_{n−k+1} − 1) · a_{n−k+1}/(n − k + 1)
= S_{n−k+1} · (n − k)/(n − k + 1)   (since 2a_{n−k+1} − (n − k + 1) = S_{n−k+1}).
Therefore
E[Xk | X0, X1, . . . , Xk−1] = E[S_{n−k}/(n − k)] = S_{n−k+1}/(n − k + 1) = Xk−1 ,
so X0, X1, . . . , X_{n−1} is a martingale.
SLIDE 26 Example: A Ballot Theorem
T = min{k : Xk = 0} if such k exists; T = n − 1 otherwise.
- T is a stopping time.
- T is bounded.
- Martingale Stopping Theorem:
E[XT] = E[X0] = E[Sn]/n = (a − b)/(a + b) . Two cases:
1 A leads throughout the count. 2 A does not lead throughout the count.
SLIDE 27
1 A leads throughout the count.
For 0 ≤ k ≤ n − 1: S_{n−k} > 0, so Xk > 0 and T = n − 1. XT = X_{n−1} = S1. Since A gets the first vote in the count, S1 = 1, so XT = 1.
2 A does not lead throughout the count.
For some j < n: Sj = 0, and then X_{n−j} = 0. So T < n − 1 and XT = 0.
SLIDE 28
Example: A Ballot Theorem
Putting all together:
1 A leads throughout the count: XT = 1. 2 A does not lead throughout the count: XT = 0
E[XT] = (a − b)/(a + b) = 1 · Pr(Case 1) + 0 · Pr(Case 2) . That is,
Pr(A leads throughout the count) = (a − b)/(a + b) .
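The ballot theorem is easy to test by shuffling the votes directly (a sketch; names and parameters are mine):

```python
import random

def a_always_ahead(a, b, rng):
    # Count a random permutation of a votes for A (+1) and b votes for B (-1);
    # report whether A strictly leads at every point of the count.
    votes = [1] * a + [-1] * b
    rng.shuffle(votes)
    lead = 0
    for v in votes:
        lead += v
        if lead <= 0:
            return False
    return True

rng = random.Random(3)
a, b, trials = 6, 3, 100_000
p_hat = sum(a_always_ahead(a, b, rng) for _ in range(trials)) / trials
print(p_hat)  # close to (a - b) / (a + b) = 1/3
```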
SLIDE 29 A Different Gambling Game
Two stages:
1 roll one die; let X be the outcome; 2 roll X standard dice; your gain Z is the sum of the outcomes
What is your expected gain?
SLIDE 30 Wald’s Equation
Theorem Let X1, X2, . . . be nonnegative, independent, identically distributed random variables with distribution X. Let T be a stopping time for this sequence. If T and X have bounded expectation, then
E[∑_{i=1}^T Xi] = E[T] · E[X] .
Note that T need not be independent of X1, X2, . . . . This is a corollary of the martingale stopping theorem.
SLIDE 31 Proof
For i ≥ 1, let Zi = ∑_{j=1}^i (Xj − E[X]).
The sequence Z1, Z2, . . . is a martingale with respect to X1, X2, . . . : E[Z1] = 0, E[T] < ∞, and since the Xi are nonnegative,
E[ |Zi+1 − Zi| | X1, . . . , Xi ] = E[|Xi+1 − E[X]|] ≤ 2E[X] .
Hence we can apply the martingale stopping theorem: E[ZT] = E[Z1] = 0 . We now find
0 = E[ZT] = E[∑_{j=1}^T (Xj − E[X])] = E[∑_{j=1}^T Xj − T E[X]] = E[∑_{j=1}^T Xj] − E[T] · E[X] ,
so E[∑_{j=1}^T Xj] = E[T] · E[X] .
SLIDE 32 A Different Gambling Game
Two stages:
1 roll one die; let X be the outcome; 2 roll X standard dice; your gain Z is the sum of the outcomes
What is your expected gain? Let Yi = outcome of the ith die in the second stage, so Z = ∑_{i=1}^X Yi.
X is a stopping time for Y1, Y2, . . . . By Wald’s equation:
E[Z] = E[X] · E[Yi] = (7/2)² = 49/4 .
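Wald’s equation for this two-stage game can be checked empirically (a sketch; the function name is mine):

```python
import random

def two_stage_gain(rng):
    # Stage 1: roll one die; the outcome X acts as the stopping time.
    x = rng.randint(1, 6)
    # Stage 2: roll X dice; the gain is the sum of the outcomes.
    return sum(rng.randint(1, 6) for _ in range(x))

rng = random.Random(4)
trials = 200_000
avg = sum(two_stage_gain(rng) for _ in range(trials)) / trials
print(avg)  # close to E[X] * E[Y_i] = (7/2)^2 = 12.25
```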
SLIDE 33 Example: a k-run
- We flip a fair coin until we get a consecutive sequence of k HEADs.
- What is the expected number of times we flip the coin?
- A SWITCH is a HEAD followed by a TAIL.
- Let X1 be the number of flips till k HEADs or the first SWITCH.
- Let Xi be the number of flips following the (i − 1)th SWITCH till k HEADs or the next SWITCH (Xi includes the last HEAD or TAIL).
- Let T be the first i with k HEADs.
E[Xi] = ∑_{j=1}^∞ j 2^{−j} + ∑_{j=1}^{k−1} j 2^{−j} + (k − 1)2^{−(k−1)} ,  E[T] = 2^{k−1} .
- The expected number of coin flips is E[Xi] · E[T].
SLIDE 34
- Let Xi be the number of flips following the (i − 1)th SWITCH till k HEADs or the next SWITCH (Xi includes the last HEAD or TAIL).
- Let T be the first i with k HEADs.
- Xi = number of flips till (and including) the first HEAD, plus up to k − 2 more HEADs followed by a TAIL, or k − 1 more HEADs:
E[Xi] = ∑_{j=1}^∞ j 2^{−j} + ∑_{j=1}^{k−1} j 2^{−j} + (k − 1)2^{−(k−1)} .
- The probability that Xi ends with k HEADs is 2^{−(k−1)}: a sequence of k − 1 HEADs following the first one. Hence E[T] = 2^{k−1}.
- By Wald’s equation, the expected number of coin flips is E[Xi] · E[T].
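The product E[Xi] · E[T] can be compared against a direct simulation of flipping until k consecutive HEADs (a sketch; function names are mine, and I use k = 3, for which the formula gives (2 + 1 + 1/2) · 4 = 14):

```python
import random

def flips_until_k_heads(k, rng):
    # Flip a fair coin until k consecutive HEADs appear; return the flip count.
    flips = run = 0
    while run < k:
        flips += 1
        run = run + 1 if rng.random() < 0.5 else 0
    return flips

def expected_flips_formula(k):
    # E[X_i] = sum_{j>=1} j 2^-j + sum_{j=1}^{k-1} j 2^-j + (k-1) 2^-(k-1),
    # where the first (infinite) sum equals 2.  Multiply by E[T] = 2^(k-1).
    e_xi = 2 + sum(j * 2.0 ** -j for j in range(1, k)) + (k - 1) * 2.0 ** -(k - 1)
    return e_xi * 2 ** (k - 1)

rng = random.Random(5)
k, trials = 3, 200_000
avg = sum(flips_until_k_heads(k, rng) for _ in range(trials)) / trials
print(expected_flips_formula(k), avg)  # formula gives 14.0; simulation is close
```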
SLIDE 35 Hoeffding’s Bound
Theorem Let X1, . . . , Xn be independent random variables with E[Xi] = µi and Pr(Bi ≤ Xi ≤ Bi + ci) = 1. Then
Pr(|∑_{i=1}^n Xi − ∑_{i=1}^n µi| ≥ ε) ≤ 2e^{−2ε²/∑_{i=1}^n ci²} .
Do we need independence?
SLIDE 36 Tail Inequalities
Theorem (Azuma-Hoeffding Inequality) Let Z0, Z1, . . . , Zn be a martingale (with respect to X1, X2, . . . ) such that |Zk − Zk−1| ≤ ck. Then, for all t ≥ 0 and any λ > 0,
Pr(|Zt − Z0| ≥ λ) ≤ 2e^{−λ²/(2∑_{k=1}^t ck²)} .
The following corollary is often easier to apply.
Corollary Let X0, X1, . . . be a martingale such that for all k ≥ 1, |Xk − Xk−1| ≤ c . Then for all t ≥ 1 and λ > 0,
Pr(|Xt − X0| ≥ λc√t) ≤ 2e^{−λ²/2} .
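A numeric sanity check of the corollary with a fair ±1 random walk (a martingale with c = 1; parameters are mine): the empirical tail probability should fall below the Azuma bound 2e^{−λ²/2}.

```python
import math
import random

rng = random.Random(6)
t, trials, lam = 100, 50_000, 2.0
# Z_t = sum of t independent +/-1 steps: a martingale with |Z_k - Z_{k-1}| = 1.
exceed = 0
for _ in range(trials):
    z = sum(1 if rng.random() < 0.5 else -1 for _ in range(t))
    if abs(z) >= lam * math.sqrt(t):
        exceed += 1
bound = 2 * math.exp(-lam ** 2 / 2)
print(exceed / trials, bound)  # empirical tail is below the Azuma bound
```

The bound is not tight here (the true tail is roughly Gaussian), which is typical of Azuma-type estimates.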
SLIDE 37 Tail Inequalities: A More General Form
Theorem (Azuma-Hoeffding Inequality) Let Z0, Z1, . . . be a martingale with respect to X0, X1, X2, . . . such that
Bk ≤ Zk − Zk−1 ≤ Bk + ck ,
for some constants ck and for some random variables Bk that may be functions of X0, X1, . . . , Xk−1. Then, for any t ≥ 0 and λ > 0,
Pr(|Zt − Z0| ≥ λ) ≤ 2e^{−2λ²/∑_{k=1}^t ck²} .
SLIDE 38 Proof
Let X^k = X0, . . . , Xk and Yi = Zi − Zi−1. Since E[Zi | X^{i−1}] = Zi−1,
E[Yi | X^{i−1}] = E[Zi − Zi−1 | X^{i−1}] = 0 .
Since Pr(Bi ≤ Yi ≤ Bi + ci | X^{i−1}) = 1, by Hoeffding’s Lemma:
E[e^{βYi} | X^{i−1}] ≤ e^{β²ci²/8} .
Lemma (Hoeffding’s Lemma) Let X be a random variable such that Pr(X ∈ [a, b]) = 1 and E[X] = 0. Then for every λ > 0, E[e^{λX}] ≤ e^{λ²(b−a)²/8}.
SLIDE 39 Proof of the Lemma
Since f (x) = e^{λx} is a convex function, for any α ∈ (0, 1) and x ∈ [a, b], f (x) ≤ αf (a) + (1 − α)f (b) . Thus, for α = (b − x)/(b − a) ∈ (0, 1),
e^{λx} ≤ ((b − x)/(b − a)) e^{λa} + ((x − a)/(b − a)) e^{λb} .
Taking expectations and using E[X] = 0, we have
E[e^{λX}] ≤ (b/(b − a)) e^{λa} − (a/(b − a)) e^{λb} ≤ e^{λ²(b−a)²/8} .
SLIDE 40 Proof of Azuma-Hoeffding Inequality
E[e^{βYi} | X^{i−1}] ≤ e^{β²ci²/8} .
E[e^{β ∑_{i=1}^n Yi}] = E[ e^{β ∑_{i=1}^{n−1} Yi} · E[e^{βYn} | X^{n−1}] ]
≤ e^{β²cn²/8} · E[e^{β ∑_{i=1}^{n−1} Yi}]
≤ · · · ≤ e^{β² ∑_{i=1}^n ci²/8} .
SLIDE 41 E[e^{β ∑_{i=1}^n Yi}] ≤ e^{β² ∑_{i=1}^n ci²/8} .
Pr(Zt − Z0 ≥ λ) = Pr(∑_{i=1}^t Yi ≥ λ) ≤ E[e^{β ∑_{i=1}^t Yi}]/e^{βλ} ≤ e^{−λβ} e^{β² ∑_{i=1}^t ci²/8} ≤ e^{−2λ²/∑_{k=1}^t ck²} ,
for β = 4λ/∑_{i=1}^t ci² . Applying the same bound to −Zt gives
Pr(|Zt − Z0| ≥ λ) ≤ 2e^{−2λ²/∑_{k=1}^t ck²} .
SLIDE 42 Example
Assume that you play a sequence of n fair games, where the bet bi in game i depends on the outcome of previous games. Let B = maxi bi. The probability of winning or losing more than λ is bounded by
Pr(|Zn| ≥ λ) ≤ 2e^{−2λ²/(nB²)} ,  equivalently  Pr(|Zn| ≥ λB√n) ≤ 2e^{−2λ²} ,
and more precisely Pr(|Zn| ≥ λ √(∑_i bi²)) ≤ 2e^{−2λ²} .
SLIDE 43
Doob Martingale
Let X1, X2, . . . , Xn be a sequence of random variables, and let Y = f (X1, . . . , Xn) be a random variable with E[|Y |] < ∞. For i = 0, 1, . . . , n, let
Z0 = E[Y ] = E_{X[1,n]}[f (X1, . . . , Xn)] ,
Zi = E_{X[i+1,n]}[Y | X1 = x1, X2 = x2, . . . , Xi = xi] ,
Zn = E[Y | X1 = x1, X2 = x2, . . . , Xn = xn] = f (x1, . . . , xn) .
Theorem Z0, Z1, . . . , Zn is a martingale with respect to X1, X2, . . . , Xn.
SLIDE 44
Proof
We use: Fact E[E[V | U, W ] | W ] = E[V | W ].
Y = f (X1, . . . , Xn), Z0 = E[Y ], Zi = E_{X[i+1,n]}[Y | X1 = x1, . . . , Xi = xi].
Z0, Z1, . . . , Zn is a martingale if E_{Xi+1}[Zi+1 | X1 = x1, . . . , Xi = xi] = Zi:
E_{Xi+1}[Zi+1 | x1, x2, . . . , xi] = E_{Xi+1}[E_{X[i+2,n]}[Y | X1, . . . , Xi+1] | x1, . . . , xi] = E_{X[i+1,n]}[Y | x1, x2, . . . , xi] = Zi .
SLIDE 45 Simple Example
Y = f (X1, . . . , Xn) = ∑_{i=1}^n Xi, where the Xi are independent and uniform on [0, 1].
Z0 = E[Y ] = E_{X[1,n]}[f (X1, . . . , Xn)] = E[∑_{i=1}^n Xi] = n/2 .
Zi = E_{X[i+1,n]}[Y | x1, . . . , xi] = ∑_{j=1}^i xj + E[∑_{j=i+1}^n Xj] = ∑_{j=1}^i xj + (n − i)/2 .
Zn = E[Y | x1, . . . , xn] = f (x1, . . . , xn) = ∑_{j=1}^n xj .
E_{Xi+1}[Zi+1 | x1, . . . , xi] = E_{Xi+1}[∑_{j=1}^{i+1} Xj + (n − i − 1)/2] = ∑_{j=1}^i xj + (n − i)/2 = Zi .
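The martingale property E_{Xi+1}[Zi+1 | x1, . . . , xi] = Zi can be checked numerically for a sampled prefix (a sketch; the helper z and the parameters are mine):

```python
import random

def z(i, values, n):
    # Z_i = sum of the first i revealed values + (n - i)/2.
    return sum(values[:i]) + (n - i) / 2

rng = random.Random(7)
n, i = 10, 4
prefix = [rng.random() for _ in range(i)]
# Estimate E[Z_{i+1} | x_1..x_i] by sampling the next uniform X_{i+1}.
trials = 200_000
est = sum(z(i + 1, prefix + [rng.random()], n) for _ in range(trials)) / trials
print(est, z(i, prefix, n))  # the two agree: the martingale property
```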
SLIDE 46 Example: Polya’s Urn
- Start with M balls, R red, M − R blue.
- Repeat n times: We pick a ball uniformly at random. If Red
we add a red ball, else we add a blue ball.
- Xi = 1 if we add a red ball in step i, else Xi = 0
- We want to estimate Sn(R/M) = ∑_{i=1}^n Xi = f (X1, . . . , Xn).
- Claim: E[Sn(R/M)] = nR/M.
Proof by induction on t that E[St] = tR/M:
E[St+1 | St] = St + (R + St)/(M + t) ,
E[St+1] = E[E[St+1 | St]] = tR/M + (R + tR/M)/(M + t) = tR/M + R/M = (t + 1)R/M .
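The claim E[Sn] = nR/M is easy to check by simulating the urn (a sketch; the function name and parameters are mine):

```python
import random

def polya_red_additions(m, r, n, rng):
    # Urn with m balls, r red.  For n steps, draw a ball uniformly at random
    # and add one ball of the same color; count how many red balls were added.
    red, total, added = r, m, 0
    for _ in range(n):
        if rng.random() < red / total:
            red += 1
            added += 1
        total += 1
    return added

rng = random.Random(8)
m, r, n, trials = 10, 3, 20, 100_000
avg = sum(polya_red_additions(m, r, n, rng) for _ in range(trials)) / trials
print(avg)  # close to n * r / m = 6.0
```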
SLIDE 47 Example: Polya’s Urn
Start with M balls, R red, M − R blue. Repeat n times: pick a ball uniformly at random; if red, add a red ball, else add a blue ball. Xi = 1 if we added a red ball in step i, else Xi = 0; Sn(R/M) = ∑_{i=1}^n Xi, and E[Sn(R/M)] = nR/M.
Let Zi = E[Sn | X1 = x1, . . . , Xi = xi]. We prove that Z1, . . . , Zn is a martingale.
Zi = E[Sn | X1 = x1, . . . , Xi = xi] = ∑_{j=1}^i xj + E[S_{n−i}((R + ∑_{j=1}^i xj)/(M + i))] = ∑_{j=1}^i xj + (n − i)(R + ∑_{j=1}^i xj)/(M + i) .
E[Zi+1 | X1, . . . , Xi] = E[E[Sn | X1, X2, . . . , Xi+1] | X1 = x1, . . . , Xi = xi]
= E[∑_{j=1}^i xj + Xi+1 + S_{n−i−1}((R + ∑_{j=1}^i xj + Xi+1)/(M + i + 1))] .
SLIDE 48 Zi = E[Sn | X1 = x1, . . . , Xi = xi] = ∑_{j=1}^i xj + (n − i)(R + ∑_{j=1}^i xj)/(M + i) .
Write S = R + ∑_{j=1}^i xj. Using E[Xi+1 | x1, . . . , xi] = S/(M + i),
E[Zi+1 | X1, . . . , Xi] = E[∑_{j=1}^i xj + Xi+1 + (n − i − 1)(S + Xi+1)/(M + i + 1)]
= ∑_{j=1}^i xj + S/(M + i) + (n − i − 1)(S + S/(M + i))/(M + i + 1)
= ∑_{j=1}^i xj + S/(M + i) + (n − i − 1) · ((M + i + 1)/(M + i)) · S/(M + i + 1)
= ∑_{j=1}^i xj + S/(M + i) + (n − i − 1) S/(M + i)
= ∑_{j=1}^i xj + (n − i) S/(M + i) = Zi .
SLIDE 49 Example: Edge Exposure Martingale
Let G be a random graph from Gn,p. Consider the m = n(n − 1)/2 possible edges in some arbitrary order, and let Xi = 1 if the ith edge is present.
F(G) = size of a maximum clique in G. Z0 = E[F(G)], Zi = E[F(G) | X1, X2, . . . , Xi], for i = 1, . . . , m. Z0, Z1, . . . , Zm is a Doob martingale. (F(G) could be any finite-valued function on graphs.)
SLIDE 50 Tail Inequalities: Doob Martingales
Let X1, . . . , Xn be a sequence of random variables, and let Y be a random variable with:
- Y a function of X1, X2, . . . , Xn;
- E[|Y |] < ∞.
Let Zi = E[Y | X1, . . . , Xi], i = 0, 1, . . . , n. Then Z0, Z1, . . . , Zn is a martingale with respect to X1, . . . , Xn. If we can use the Azuma-Hoeffding inequality, Pr(|Zn − Z0| ≥ λ) ≤ . . . , then we have Pr(|Y − E[Y ]| ≥ λ) ≤ . . . . We need a bound on |Zi − Zi−1|.
SLIDE 51 McDiarmid Bound
Theorem Assume that f (X1, X2, . . . , Xn) satisfies
|f (x1, . . . , xi, . . . , xn) − f (x1, . . . , yi, . . . , xn)| ≤ ci ,
and that X1, . . . , Xn are independent. Then
Pr(|f (X1, . . . , Xn) − E[f (X1, . . . , Xn)]| ≥ λ) ≤ 2e^{−2λ²/∑_{k=1}^n ck²} .
[Changing the value of Xi changes the value of the function by at most ci.]
SLIDE 52 Proof
Define a Doob martingale Z0, Z1, . . . , Zn:
- Z0 = E[f (X1, . . . , Xn)] = E[f (X̄)]
- Zi = E[f (X1, . . . , Xn) | X1, . . . , Xi] = E[f (X̄) | X^i]
- Zn = f (X1, . . . , Xn) = f (X̄)
We want to prove that this martingale satisfies the conditions of
Theorem (Azuma-Hoeffding Inequality) Let Z0, Z1, . . . be a martingale with respect to X0, X1, X2, . . . such that Bk ≤ Zk − Zk−1 ≤ Bk + ck , for some constants ck and for some random variables Bk that may be functions of X0, X1, . . . , Xk−1. Then, for all t ≥ 0 and any λ > 0, Pr(|Zt − Z0| ≥ λ) ≤ 2e^{−2λ²/∑_{k=1}^t ck²} .
SLIDE 53
Lemma If X1, . . . , Xn are independent then, for some random variable Bk, Bk ≤ Zk − Zk−1 ≤ Bk + ck .
Zk − Zk−1 = E[f (X̄) | X^k] − E[f (X̄) | X^{k−1}] .
Hence Zk − Zk−1 is bounded above by
sup_x E[f (X̄) | X^{k−1}, Xk = x] − E[f (X̄) | X^{k−1}]
and bounded below by
inf_y E[f (X̄) | X^{k−1}, Xk = y] − E[f (X̄) | X^{k−1}] .
Thus, we need to show
sup_x E[f (X̄) | X^{k−1}, Xk = x] − inf_y E[f (X̄) | X^{k−1}, Xk = y] ≤ ck .
SLIDE 54 It suffices to bound
sup_{x,y} E[f (X̄, x) − f (X̄, y) | X^{k−1}] ,
where (X̄, x) denotes X̄ with its kth coordinate set to x. Because the Xi are independent, the values of Xk+1, . . . , Xn do not depend on the values of X1, . . . , Xk, so for any values z̄, x, y the conditional distribution of the remaining coordinates is the same under Xk = x and Xk = y. Hence
sup_{x,y} E[f (X̄, x) − f (X̄, y) | X1 = z1, . . . , Xk−1 = zk−1]
= sup_{x,y} ∑ Pr((Xk+1 = zk+1) ∩ . . . ∩ (Xn = zn)) · (f (z̄, x) − f (z̄, y)) .
But f (z̄, x) − f (z̄, y) ≤ ck, and therefore E[f (X̄, x) − f (X̄, y) | X^{k−1}] ≤ ck .
SLIDE 55
Example: Pattern Matching
Given a string and a pattern: is the pattern interesting? Does it appear more often than is expected in a random string? Is the number of occurrences of the pattern concentrated around the expectation?
SLIDE 56
A = (a1, a2, . . . , an) is a string of characters, each chosen independently and uniformly at random from Σ, with m = |Σ|. Pattern: B = (b1, . . . , bk), a fixed string with bi ∈ Σ. F = number of occurrences of B in the random string A. E[F] = ?
SLIDE 57
A = (a1, a2, . . . , an) is a string of characters, each chosen independently and uniformly at random from Σ, with m = |Σ|. Pattern: B = (b1, . . . , bk), a fixed string with bi ∈ Σ. F = number of occurrences of B in the random string A.
E[F] = (n − k + 1)(1/m)^k .
Can we bound the deviation of F from its expectation?
SLIDE 58
F = number of occurrences of B in the random string A. Z0 = E[F], Zi = E[F | a1, . . . , ai], for i = 1, . . . , n. Z0, Z1, . . . , Zn is a Doob martingale, and Zn = F.
SLIDE 59 F = number of occurrences of B in the random string A. Z0 = E[F], Zi = E[F | a1, . . . , ai], for i = 1, . . . , n. Z0, Z1, . . . , Zn is a Doob martingale, and Zn = F. Each character in A can participate in no more than k occurrences, so
|Zi − Zi+1| ≤ k . Azuma-Hoeffding inequality (version 1): Pr(|F − E[F]| ≥ λ) ≤ 2e^{−λ²/(2nk²)} .
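The expectation E[F] = (n − k + 1)(1/m)^k, and the concentration around it, can be observed in simulation (a sketch; names and the binary alphabet are my choices):

```python
import random

def count_occurrences(a, b):
    # Number of (possibly overlapping) occurrences of pattern b in string a.
    k = len(b)
    return sum(a[i:i + k] == b for i in range(len(a) - k + 1))

rng = random.Random(9)
sigma, n, b = "ab", 50, "aba"
m, k = len(sigma), len(b)
trials = 100_000
avg = sum(
    count_occurrences("".join(rng.choice(sigma) for _ in range(n)), b)
    for _ in range(trials)
) / trials
print(avg)  # close to (n - k + 1) / m^k = 48 / 8 = 6.0
```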
SLIDE 60 Application: Balls and Bins
We throw m balls independently and uniformly at random into n bins. Let Xi = the bin that the ith ball falls into, and let F be the number of empty bins after the m balls are thrown. Then the sequence Zi = E[F | X1, . . . , Xi] is a Doob martingale. F = f (X1, X2, . . . , Xm) satisfies the Lipschitz condition with bound 1, thus |Zi+1 − Zi| ≤ 1, and we obtain
Pr(|F − E[F]| ≥ ε) ≤ 2e^{−2ε²/m} .
Here E[F] = n(1 − 1/n)^m, but we could obtain the concentration result without knowing E[F].
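The formula E[F] = n(1 − 1/n)^m is straightforward to verify empirically (a sketch; names and parameters are mine):

```python
import random

def empty_bins(m, n, rng):
    # Throw m balls uniformly at random into n bins; count the empty bins.
    hit = [False] * n
    for _ in range(m):
        hit[rng.randrange(n)] = True
    return n - sum(hit)

rng = random.Random(10)
m, n, trials = 30, 20, 100_000
avg = sum(empty_bins(m, n, rng) for _ in range(trials)) / trials
print(avg, n * (1 - 1 / n) ** m)  # simulation vs E[F] = n(1-1/n)^m
```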
SLIDE 61
Application: Chromatic Number
Given a random graph G in Gn,p, the chromatic number χ(G) is the minimum number of colors required to color all vertices of the graph so that no adjacent vertices have the same color. We use the vertex exposure martingale defined below: let Gi be the random subgraph of G induced by the set of vertices 1, . . . , i, let Z0 = E[χ(G)], and let Zi = E[χ(G) | G1, . . . , Gi] . Since a vertex uses no more than one new color, the gap between Zi and Zi−1 is at most 1. We conclude
Pr(|χ(G) − E[χ(G)]| ≥ λ√n) ≤ 2e^{−2λ²} .
This result holds even without knowing E[χ(G)].
SLIDE 62 Example: Edge Exposure Martingale
Let G be a random graph from Gn,p. Consider the m = n(n − 1)/2 possible edges in some arbitrary order, and let Xi = 1 if the ith edge is present.
F(G) = size of a maximum clique in G. Z0 = E[F(G)], Zi = E[F(G) | X1, X2, . . . , Xi], for i = 1, . . . , m. Z0, Z1, . . . , Zm is a Doob martingale. (F(G) could be any finite-valued function on graphs.)