 
Randomized Algorithms: The Chernoff bound
Speaker: Chuang-Chieh Lin
Advisor: Professor Maw-Shang Chang
National Chung Cheng University
2006/10/25

Outline
- Introduction
- The Chernoff bound
- Markov's Inequality
- The Moment Generating Functions
- The Chernoff Bound for a Sum of Poisson Trials
- The Chernoff Bound for Special Cases
- Set Balancing Problem
- Error-reduction for BPP
- References
Computation Theory Lab, CSIE, CCU, Taiwan

Introduction
- Goal: The Chernoff bound can be used to analyze the tail of the distribution of a sum of independent random variables, with some extensions to the case of dependent or correlated random variables.
- Markov's Inequality and moment generating functions, both introduced in the following slides, are the main tools needed.

Math tool
Professor Herman Chernoff's bound, Annals of Mathematical Statistics, 1952

Chernoff bounds
In its most general form, the Chernoff bound for a random variable $X$ is obtained as follows: for any $t > 0$,
\[ \Pr[X \ge a] \le \frac{\mathrm{E}[e^{tX}]}{e^{ta}}, \]
where $\mathrm{E}[e^{tX}]$ is a moment generating function, or equivalently,
\[ \ln \Pr[X \ge a] \le -ta + \ln \mathrm{E}[e^{tX}]. \]
The value of $t$ that minimizes $\mathrm{E}[e^{tX}] / e^{ta}$ gives the best possible bound.
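
The minimization over $t$ can be tried numerically. The sketch below is only an illustration under assumed parameters: the binomial distribution, the values of `n`, `p`, `a`, and the grid search are choices made here, not part of the slides. It evaluates $\mathrm{E}[e^{tX}] / e^{ta}$ directly from the probability mass function and compares the best value found with the exact tail and with the cruder ratio $\mathrm{E}[X]/a$ given by Markov's Inequality.

```python
import math

# Hypothetical example: X ~ Binomial(n, p); we bound Pr[X >= a].
n, p, a = 100, 0.5, 70

# Probability mass function of X, straight from the binomial formula.
pmf = [math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

def chernoff_ratio(t):
    """E[e^{tX}] / e^{ta}, with the expectation taken over the pmf."""
    mgf = sum(q * math.exp(t * k) for k, q in enumerate(pmf))
    return mgf / math.exp(t * a)

# Crude grid search for the minimizing t in (0, 3].
grid = [k / 1000 for k in range(1, 3001)]
best_t = min(grid, key=chernoff_ratio)
chernoff_bound = chernoff_ratio(best_t)

exact_tail = sum(pmf[a:])      # exact Pr[X >= a]
markov_bound = (n * p) / a     # E[X] / a, for comparison

print(f"best t on the grid = {best_t:.3f}")
print(f"exact tail         = {exact_tail:.3e}")
print(f"Chernoff bound     = {chernoff_bound:.3e}")
print(f"Markov bound       = {markov_bound:.3e}")
```

A grid search is used only to keep the sketch short; for distributions with a closed-form moment generating function the minimizing $t$ can usually be found by calculus.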

Markov's Inequality
For any random variable $X \ge 0$ and any $a > 0$,
\[ \Pr[X \ge a] \le \frac{\mathrm{E}[X]}{a}. \]
We can use Markov's Inequality to derive the famous Chebyshev's Inequality:
\[ \Pr[\,|X - \mathrm{E}[X]| \ge a\,] = \Pr[(X - \mathrm{E}[X])^2 \ge a^2] \le \frac{\mathrm{Var}[X]}{a^2}. \]
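
Both inequalities can be checked on a small distribution. The following sketch assumes a fair six-sided die as the random variable (an example chosen here, not taken from the slides) and uses exact rational arithmetic so the comparison is free of rounding.

```python
from fractions import Fraction

# Hypothetical example: X is the value shown by one fair six-sided die.
values = [1, 2, 3, 4, 5, 6]
prob = Fraction(1, 6)

mean = sum(v * prob for v in values)                 # E[X]
var = sum((v - mean) ** 2 * prob for v in values)    # Var[X]

def tail(a):
    """Exact Pr[X >= a]."""
    return sum((prob for v in values if v >= a), Fraction(0))

def deviation(a):
    """Exact Pr[|X - E[X]| >= a]."""
    return sum((prob for v in values if abs(v - mean) >= a), Fraction(0))

a_markov = 5
markov_bound = mean / a_markov                # E[X] / a
print("Markov:   ", tail(a_markov), "<=", markov_bound)

a_cheb = 2
chebyshev_bound = var / a_cheb ** 2           # Var[X] / a^2
print("Chebyshev:", deviation(a_cheb), "<=", chebyshev_bound)
```

In this example both bounds hold with a lot of slack, which is typical: Markov's and Chebyshev's inequalities use only the mean and the variance.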

Proof of the Chernoff bound
It follows directly from Markov's inequality. For $t > 0$ the map $x \mapsto e^{tx}$ is increasing and $e^{tX} \ge 0$, so
\[ \Pr[X \ge a] = \Pr[e^{tX} \ge e^{ta}] \le \frac{\mathrm{E}[e^{tX}]}{e^{ta}}. \]
So, how do we calculate the term $\mathrm{E}[e^{tX}]$?

Moment Generating Functions
\[ M_X(t) = \mathrm{E}[e^{tX}]. \]
This function gets its name because we can generate the $i$th moment of the random variable $X$ by differentiating $M_X(t)$ $i$ times and then evaluating the result at $t = 0$:
\[ \frac{d^i}{dt^i} M_X(t) \Big|_{t=0} = \mathrm{E}[X^i]. \]
Remark: for a discrete random variable, $\mathrm{E}[X^i] = \sum_{x} x^i \cdot \Pr[X = x]$, where the sum ranges over the values $x$ taken by $X$.

Moment Generating Functions (cont'd)
We can easily see why the moment generating function works as follows:
\begin{align*}
\frac{d^i}{dt^i} M_X(t) \Big|_{t=0}
  &= \frac{d^i}{dt^i} \mathrm{E}[e^{tX}] \Big|_{t=0} \\
  &= \frac{d^i}{dt^i} \sum_{s} e^{ts} \Pr[X = s] \Big|_{t=0} \\
  &= \sum_{s} \frac{d^i}{dt^i} e^{ts} \Pr[X = s] \Big|_{t=0} \\
  &= \sum_{s} s^i e^{ts} \Pr[X = s] \Big|_{t=0} \\
  &= \sum_{s} s^i \Pr[X = s] \\
  &= \mathrm{E}[X^i].
\end{align*}
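
As a small worked instance of this differentiation, consider a 0-1 random variable. The variable and the symbol $p$ are chosen here for illustration; the same kind of variable appears later as a single Poisson trial.

```latex
% Illustrative example: X is a 0-1 random variable with Pr[X = 1] = p.
\begin{align*}
M_X(t) &= \mathrm{E}[e^{tX}]
        = (1 - p)\, e^{t \cdot 0} + p\, e^{t \cdot 1}
        = 1 - p + p e^{t}, \\
\frac{d^i}{dt^i} M_X(t) &= p e^{t} \qquad (i \ge 1), \\
\frac{d^i}{dt^i} M_X(t) \Big|_{t=0} &= p
        = 0^i \cdot (1 - p) + 1^i \cdot p
        = \mathrm{E}[X^i].
\end{align*}
```

Every moment equals $p$ here because $X^i = X$ for a 0-1 variable.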

Moment Generating Functions (cont'd)
- The concept of the moment generating function (mgf) is connected with a distribution rather than with a random variable.
- Two different random variables with the same distribution will have the same mgf.

Moment Generating Functions (cont'd)
- Fact: If $M_X(t) = M_Y(t)$ for all $t \in (-c, c)$ for some $c > 0$, then $X$ and $Y$ have the same distribution.
- If $X$ and $Y$ are two independent random variables, then $M_{X+Y}(t) = M_X(t) M_Y(t)$.
- Let $X_1, \ldots, X_k$ be independent random variables with mgf's $M_1(t), \ldots, M_k(t)$. Then the mgf of the random variable $Y = \sum_{i=1}^{k} X_i$ is given by
\[ M_Y(t) = \prod_{i=1}^{k} M_i(t). \]

Moment Generating Functions (cont'd)
If $X$ and $Y$ are two independent random variables, then $M_{X+Y}(t) = M_X(t) M_Y(t)$.
Proof:
\begin{align*}
M_{X+Y}(t) &= \mathrm{E}[e^{t(X+Y)}] \\
           &= \mathrm{E}[e^{tX} e^{tY}] \\
           &= \mathrm{E}[e^{tX}]\, \mathrm{E}[e^{tY}] \\
           &= M_X(t) M_Y(t).
\end{align*}
Here we have used that $X$ and $Y$ are independent, and hence $e^{tX}$ and $e^{tY}$ are independent, to conclude that $\mathrm{E}[e^{tX} e^{tY}] = \mathrm{E}[e^{tX}]\, \mathrm{E}[e^{tY}]$.
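
The product rule can be checked numerically on small discrete distributions. In the sketch below the two probability mass functions and the helper names `mgf` and `pmf_of_sum` are made up for illustration; the distribution of $X + Y$ is built by convolving the two pmfs, which is valid because $X$ and $Y$ are assumed independent.

```python
import math
from itertools import product

# Hypothetical pmfs of two independent discrete random variables.
pmf_x = {0: 0.2, 1: 0.5, 3: 0.3}
pmf_y = {-1: 0.4, 2: 0.6}

def mgf(pmf, t):
    """M(t) = E[e^{tS}] for a finite pmf given as {value: probability}."""
    return sum(q * math.exp(t * s) for s, q in pmf.items())

def pmf_of_sum(pmf_a, pmf_b):
    """pmf of A + B when A and B are independent (discrete convolution)."""
    out = {}
    for (sa, qa), (sb, qb) in product(pmf_a.items(), pmf_b.items()):
        out[sa + sb] = out.get(sa + sb, 0.0) + qa * qb
    return out

pmf_z = pmf_of_sum(pmf_x, pmf_y)

# Relative gap between M_{X+Y}(t) and M_X(t) * M_Y(t) at a few values of t.
gaps = []
for t in (-1.0, 0.3, 1.5):
    lhs = mgf(pmf_z, t)
    rhs = mgf(pmf_x, t) * mgf(pmf_y, t)
    gaps.append(abs(lhs - rhs) / rhs)
    print(f"t = {t:+.1f}: M_(X+Y) = {lhs:.6f}, M_X * M_Y = {rhs:.6f}")

max_gap = max(gaps)
```

The two columns agree up to floating-point rounding, as the proof predicts.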

Chernoff bound for the sum of Poisson trials
- Poisson trials: the distribution of a sum of independent 0-1 random variables, which need not be identically distributed.
- Bernoulli trials: the same as above, except that all the random variables are identically distributed.

Chernoff bound for the sum of Poisson trials (cont'd)
Let $X_i$, $i = 1, \ldots, n$, be mutually independent 0-1 random variables with $\Pr[X_i = 1] = p_i$ and $\Pr[X_i = 0] = 1 - p_i$. Let $X = X_1 + \cdots + X_n$ and $\mathrm{E}[X] = \mu = p_1 + \cdots + p_n$. Then
\begin{align*}
M_{X_i}(t) = \mathrm{E}[e^{tX_i}] &= p_i e^{t \cdot 1} + (1 - p_i) e^{t \cdot 0} \\
  &= p_i e^{t} + (1 - p_i) \\
  &= 1 + p_i (e^{t} - 1) \\
  &\le e^{p_i (e^{t} - 1)} \qquad \text{(since } 1 + y \le e^{y}\text{)}.
\end{align*}
Therefore
\[ M_X(t) = \mathrm{E}[e^{tX}] = M_{X_1}(t) M_{X_2}(t) \cdots M_{X_n}(t) \le e^{(p_1 + p_2 + \cdots + p_n)(e^{t} - 1)} = e^{(e^{t} - 1)\mu}, \]
since $\mu = p_1 + p_2 + \cdots + p_n$. We will use this result later.
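
The bound $M_X(t) \le e^{(e^{t}-1)\mu}$ can be checked numerically. The sketch below assumes an arbitrary list of success probabilities `p` (made up for illustration) and compares the exact product of the individual moment generating functions with the exponential upper bound at several values of $t$.

```python
import math

# Hypothetical success probabilities p_i of five Poisson trials.
p = [0.1, 0.5, 0.25, 0.9, 0.3]
mu = sum(p)  # E[X] = p_1 + ... + p_n

def mgf_exact(t):
    """M_X(t) as the product of the factors 1 + p_i (e^t - 1)."""
    return math.prod(1 + pi * (math.exp(t) - 1) for pi in p)

def mgf_upper(t):
    """The upper bound e^{(e^t - 1) mu}."""
    return math.exp((math.exp(t) - 1) * mu)

ts = (-2.0, -0.5, 0.5, 1.0, 2.0)
for t in ts:
    print(f"t = {t:+.1f}: exact = {mgf_exact(t):.6f}, bound = {mgf_upper(t):.6f}")
```

The inequality $1 + y \le e^{y}$ holds for every real $y$, so the bound is valid for negative $t$ as well; this is what the lower-tail version of the Chernoff bound relies on.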

Chernoff bound for the sum of Poisson trials (cont'd)
Theorem 1: Let $X = X_1 + \cdots + X_n$, where $X_1, \ldots, X_n$ are $n$ independent Poisson trials such that $\Pr[X_i = 1] = p_i$ holds for each $i = 1, 2, \ldots, n$, and let $\mu = \mathrm{E}[X]$. Then,
(1) for any $d > 0$, $\Pr[X \ge (1+d)\mu] \le \left( \dfrac{e^{d}}{(1+d)^{1+d}} \right)^{\mu}$;
(2) for $d \in (0, 1]$, $\Pr[X \ge (1+d)\mu] \le e^{-\mu d^2 / 3}$;
(3) for $R \ge 6\mu$, $\Pr[X \ge R] \le 2^{-R}$.
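
To get a feeling for how tight the three bounds are, they can be compared with the exact tail of a concrete sum of Poisson trials. Everything specific in this sketch is an assumption made for illustration: the probabilities in `p`, the thresholds `a` and `R`, and the helper `exact_pmf`, which computes the distribution of $X$ by dynamic programming.

```python
import math

# Hypothetical success probabilities for n = 120 Poisson trials.
p = [0.05, 0.10, 0.15] * 40
mu = sum(p)  # E[X], about 12

def exact_pmf(probs):
    """Distribution of a sum of independent 0-1 variables, one trial at a time."""
    dist = [1.0]                       # before any trial, the sum is 0
    for pi in probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            nxt[k] += q * (1 - pi)     # this trial fails
            nxt[k + 1] += q * pi       # this trial succeeds
        dist = nxt
    return dist

pmf = exact_pmf(p)

def tail(a):
    """Exact Pr[X >= a] for an integer threshold a."""
    return sum(pmf[a:])

a = 18                  # threshold, roughly (1 + d) * mu with d = 0.5
d = a / mu - 1
bound1 = (math.exp(d) / (1 + d) ** (1 + d)) ** mu    # part (1)
bound2 = math.exp(-mu * d * d / 3)                   # part (2), needs 0 < d <= 1

R = 80                  # part (3) needs R >= 6 * mu
bound3 = 2.0 ** (-R)

print(f"Pr[X >= {a}] = {tail(a):.3e}, bound (1) = {bound1:.3e}, bound (2) = {bound2:.3e}")
print(f"Pr[X >= {R}] = {tail(R):.3e}, bound (3) = {bound3:.3e}")
```

Bound (1) is the strongest of the first two, and (2) trades some tightness for a simpler expression.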

Proof of Theorem 1
(Recall Markov's inequality: for any random variable $X \ge 0$ and any $a > 0$, $\Pr[X \ge a] \le \mathrm{E}[X]/a$.)
By Markov's inequality, for any $t > 0$ we have
\[ \Pr[X \ge (1+d)\mu] = \Pr[e^{tX} \ge e^{t(1+d)\mu}] \le \frac{\mathrm{E}[e^{tX}]}{e^{t(1+d)\mu}} \le \frac{e^{(e^{t}-1)\mu}}{e^{t(1+d)\mu}}. \]
For any $d > 0$, setting $t = \ln(1+d) > 0$ gives (1).
To prove (2), we need to show that for $0 < d \le 1$,
\[ \frac{e^{d}}{(1+d)^{1+d}} \le e^{-d^2/3}. \]
Taking the logarithm of both sides, this is equivalent to $d - (1+d)\ln(1+d) + d^2/3 \le 0$, which can be proved with calculus.
To prove (3), let $R = (1+d)\mu$. Then, for $R \ge 6\mu$, $d = R/\mu - 1 \ge 5$. Hence, using (1),
\[ \Pr[X \ge (1+d)\mu] \le \left( \frac{e^{d}}{(1+d)^{1+d}} \right)^{\mu} \le \left( \frac{e}{1+d} \right)^{(1+d)\mu} \le \left( \frac{e}{6} \right)^{R} \le 2^{-R}. \]
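
The calculus step in (2) and the chain of inequalities in (3) can be spelled out as follows. This is one standard way to fill in the details and is offered as a sketch; the slides do not say which argument the speaker had in mind, and the name $f$ is introduced here.

```latex
% Part (2): show f(d) <= 0 on [0, 1], where
\begin{align*}
f(d)   &= d - (1+d)\ln(1+d) + \tfrac{d^2}{3}, \\
f'(d)  &= 1 - \ln(1+d) - 1 + \tfrac{2d}{3} = -\ln(1+d) + \tfrac{2d}{3}, \\
f''(d) &= -\frac{1}{1+d} + \frac{2}{3}.
\end{align*}
% f''(d) < 0 for 0 <= d < 1/2 and f''(d) > 0 for d > 1/2, so f' first
% decreases and then increases on [0, 1]. Since
\[ f'(0) = 0 \qquad\text{and}\qquad f'(1) = \tfrac{2}{3} - \ln 2 < 0, \]
% we get f'(d) <= 0 on [0, 1]. Hence f is non-increasing there, and
\[ f(d) \le f(0) = 0 \qquad \text{for } 0 \le d \le 1. \]

% Part (3): the three inequalities after applying (1), with R = (1+d)mu.
\begin{align*}
\left( \frac{e^{d}}{(1+d)^{1+d}} \right)^{\mu}
  &\le \left( \frac{e^{1+d}}{(1+d)^{1+d}} \right)^{\mu}
   = \left( \frac{e}{1+d} \right)^{(1+d)\mu}
   && \text{since } e^{d} \le e^{1+d}, \\
  &\le \left( \frac{e}{6} \right)^{R}
   && \text{since } 1 + d \ge 6, \\
  &\le 2^{-R}
   && \text{since } e/6 < 1/2.
\end{align*}
```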

[Figure: probability against $X$, with $\mu - d\mu$, $\mu$, and $\mu + d\mu$ marked on the horizontal axis]
Similarly, we have:
Theorem: Let $X = \sum_{i=1}^{n} X_i$, where $X_1, \ldots, X_n$ are $n$ independent Poisson trials such that $\Pr[X_i = 1] = p_i$. Let $\mu = \mathrm{E}[X]$. Then, for $0 < d < 1$:
(1) $\Pr[X \le (1-d)\mu] \le \left( \dfrac{e^{-d}}{(1-d)^{(1-d)}} \right)^{\mu}$;
(2) $\Pr[X \le (1-d)\mu] \le e^{-\mu d^2 / 2}$.
Corollary: For $0 < d < 1$, $\Pr[\,|X - \mu| \ge d\mu\,] \le 2 e^{-\mu d^2 / 3}$.
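
The corollary follows by combining the upper-tail and lower-tail bounds. The short derivation below is added as a sketch of that step; it uses only part (2) of Theorem 1 and part (2) of the theorem on this slide.

```latex
% For 0 < d < 1, the event |X - mu| >= d*mu is the union of the two tail events.
\begin{align*}
\Pr[\,|X - \mu| \ge d\mu\,]
  &\le \Pr[X \ge (1+d)\mu] + \Pr[X \le (1-d)\mu] \\
  &\le e^{-\mu d^2/3} + e^{-\mu d^2/2} \\
  &\le 2 e^{-\mu d^2/3},
\end{align*}
% where the last step uses e^{-mu d^2/2} <= e^{-mu d^2/3}.
```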