Concentration inequalities and tail bounds


  1. Concentration inequalities and tail bounds. Prof. John Duchi

  2. Outline
      I. Basics and motivation: (1) Law of large numbers, (2) Markov inequality, (3) Chernoff bounds
      II. Sub-Gaussian random variables: (1) Definitions, (2) Examples, (3) Hoeffding inequalities
      III. Sub-exponential random variables: (1) Definitions, (2) Examples, (3) Chernoff/Bernstein bounds

  3. Motivation. Often in this class, the goal is to argue that a sequence of random variables (or vectors) $X_1, X_2, \ldots$ satisfies $\frac{1}{n} \sum_{i=1}^n X_i \stackrel{p}{\to} E[X]$. Law of large numbers: if $E[\|X\|] < \infty$, then $P\left( \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^n X_i \ne E[X] \right) = 0$.

  4. Markov inequalities. Theorem (Markov's inequality): Let $X$ be a non-negative random variable. Then for all $t > 0$, $P(X \ge t) \le \frac{E[X]}{t}$.

  5. Chebyshev inequalities. Theorem (Chebyshev's inequality): Let $X$ be a real-valued random variable with $E[X^2] < \infty$. Then for all $t > 0$, $P(X - E[X] \ge t) \le \frac{E[(X - E[X])^2]}{t^2} = \frac{\mathrm{Var}(X)}{t^2}$. Example: i.i.d. sampling.
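The i.i.d. sampling example named on this slide can be worked out as follows. This is a sketch under the assumption that $X_1, \ldots, X_n$ are i.i.d. with mean $\mu$ and finite variance $\sigma^2$; the symbols $\mu$, $\sigma^2$, and $\bar X_n$ are introduced here for illustration and do not appear on the slide.

```latex
\bar X_n := \frac{1}{n} \sum_{i=1}^n X_i,
\qquad
\mathrm{Var}(\bar X_n) = \frac{1}{n^2} \sum_{i=1}^n \mathrm{Var}(X_i) = \frac{\sigma^2}{n},
\qquad\text{so}\qquad
P(\bar X_n - \mu \ge t) \le \frac{\mathrm{Var}(\bar X_n)}{t^2} = \frac{\sigma^2}{n t^2}.
```

The bound decays only like $1/n$, which is the motivation for the exponential bounds that follow.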

  6. Chernoff bounds. Moment generating function: for a random variable $X$, the MGF is $M_X(\lambda) := E[e^{\lambda X}]$. Example: normally distributed random variables.
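The normal example mentioned on this slide can be computed by completing the square. The derivation below is a sketch for the mean-zero case $X \sim N(0, \sigma^2)$, which is an assumption made here to match the later sub-Gaussian definition.

```latex
M_X(\lambda)
= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( \lambda x - \frac{x^2}{2\sigma^2} \right) dx
= \exp\left( \frac{\lambda^2 \sigma^2}{2} \right)
  \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(x - \lambda\sigma^2)^2}{2\sigma^2} \right) dx
= \exp\left( \frac{\lambda^2 \sigma^2}{2} \right).
```

The last integral equals one because its integrand is the $N(\lambda\sigma^2, \sigma^2)$ density.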

  7. Chernoff bounds. Theorem (Chernoff bound): For any random variable $X$ and $t \ge 0$, $P(X - E[X] \ge t) \le \inf_{\lambda \ge 0} M_{X - E[X]}(\lambda) e^{-\lambda t} = \inf_{\lambda \ge 0} E[e^{\lambda (X - E[X])}] e^{-\lambda t}$.
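A short sketch of why the Chernoff bound holds, using only Markov's inequality from the earlier slide: for any fixed $\lambda > 0$ the map $x \mapsto e^{\lambda x}$ is increasing, so the two events below coincide.

```latex
P(X - E[X] \ge t)
= P\left( e^{\lambda (X - E[X])} \ge e^{\lambda t} \right)
\le \frac{E\left[ e^{\lambda (X - E[X])} \right]}{e^{\lambda t}}
= M_{X - E[X]}(\lambda)\, e^{-\lambda t}.
```

At $\lambda = 0$ the right-hand side equals one, so the inequality is trivial there; taking the infimum over $\lambda \ge 0$ gives the theorem.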

  8. Sub-Gaussian random variables. Definition (Sub-Gaussianity): A mean-zero random variable $X$ is $\sigma^2$-sub-Gaussian if $E[e^{\lambda X}] \le \exp\left( \frac{\lambda^2 \sigma^2}{2} \right)$ for all $\lambda \in \mathbb{R}$. Example: $X \sim N(0, \sigma^2)$.

  9. Properties of sub-Gaussians. Proposition (sums of sub-Gaussians): Let $X_i$ be independent, mean-zero, $\sigma_i^2$-sub-Gaussian. Then $\sum_{i=1}^n X_i$ is $\sum_{i=1}^n \sigma_i^2$-sub-Gaussian.
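The proposition follows in one line from independence, which lets the MGF of the sum factor into a product. A sketch:

```latex
E\left[ \exp\left( \lambda \sum_{i=1}^n X_i \right) \right]
= \prod_{i=1}^n E\left[ e^{\lambda X_i} \right]
\le \prod_{i=1}^n \exp\left( \frac{\lambda^2 \sigma_i^2}{2} \right)
= \exp\left( \frac{\lambda^2 \sum_{i=1}^n \sigma_i^2}{2} \right)
\quad \text{for all } \lambda \in \mathbb{R}.
```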

  10. Concentration inequalities. Theorem: Let $X$ be $\sigma^2$-sub-Gaussian. Then for $t \ge 0$, $P(X - E[X] \ge t) \le \exp\left( -\frac{t^2}{2\sigma^2} \right)$ and $P(X - E[X] \le -t) \le \exp\left( -\frac{t^2}{2\sigma^2} \right)$.
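The upper-tail bound can be derived by combining the Chernoff bound with the sub-Gaussian MGF condition and optimizing over $\lambda$. A sketch, using $E[X] = 0$ from the definition of sub-Gaussianity:

```latex
P(X \ge t)
\le \inf_{\lambda \ge 0} \exp\left( \frac{\lambda^2 \sigma^2}{2} - \lambda t \right)
= \exp\left( -\frac{t^2}{2\sigma^2} \right),
\qquad \text{attained at } \lambda = \frac{t}{\sigma^2}.
```

The lower-tail bound follows by applying the same argument to $-X$, which is also $\sigma^2$-sub-Gaussian.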

  11. Concentration: convergence of an independent sum. Corollary: Let $X_i$ be independent $\sigma_i^2$-sub-Gaussian. Then for $t \ge 0$, $P\left( \frac{1}{n} \sum_{i=1}^n X_i \ge t \right) \le \exp\left( -\frac{n t^2}{2 \cdot \frac{1}{n} \sum_{i=1}^n \sigma_i^2} \right)$.

  12. Example: bounded random variables. Proposition: Let $X \in [a, b]$, with $E[X] = 0$. Then $E[e^{\lambda X}] \le \exp\left( \frac{\lambda^2 (b - a)^2}{8} \right)$.
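As a numerical sanity check of this proposition, the sketch below compares the exact MGF of a mean-zero two-point distribution on $\{a, b\}$ with the bound $\exp(\lambda^2 (b - a)^2 / 8)$ over a grid of $\lambda$ values. The two-point distribution, the grid, and the function names are illustrative choices made here and are not taken from the slides.

```python
import math

def two_point_mgf(lam, a, b):
    """MGF of the mean-zero variable equal to a w.p. b/(b-a) and b w.p. -a/(b-a).

    Requires a < 0 < b so that both probabilities are valid.
    """
    return (b * math.exp(lam * a) - a * math.exp(lam * b)) / (b - a)

def bounded_mgf_bound(lam, a, b):
    """Right-hand side of the proposition: exp(lam^2 (b - a)^2 / 8)."""
    return math.exp(lam ** 2 * (b - a) ** 2 / 8.0)

a, b = -1.0, 3.0
lams = [k / 10.0 for k in range(-50, 51)]  # lambda in [-5, 5]
holds = all(
    two_point_mgf(lam, a, b) <= bounded_mgf_bound(lam, a, b) + 1e-12
    for lam in lams
)
print("bound holds on the grid:", holds)
```

A grid check like this cannot prove the inequality, but it makes a sign or constant error in a reconstructed formula easy to spot.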

  13. Maxima of sub-Gaussian random variables (in expectation). $E\left[ \max_{j \le n} X_j \right] \le \sqrt{2 \sigma^2 \log n}$.

  14. Maxima of sub-Gaussian random variables (in probability). $P\left( \max_{j \le n} X_j \ge \sqrt{2 \sigma^2 (\log n + t)} \right) \le e^{-t}$.
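The two bounds on maxima can be checked by simulation. The sketch below assumes the $X_j$ are i.i.d. $N(0, \sigma^2)$, one particular $\sigma^2$-sub-Gaussian case chosen here for illustration; the function name, sample sizes, and seed are likewise hypothetical.

```python
import math
import random

def simulate_max(n, sigma, trials, t, seed=0):
    """Return (average of max_j X_j, fraction of trials whose max exceeds the tail threshold)."""
    rng = random.Random(seed)
    threshold = math.sqrt(2.0 * sigma ** 2 * (math.log(n) + t))
    total, exceed = 0.0, 0
    for _ in range(trials):
        m = max(rng.gauss(0.0, sigma) for _ in range(n))
        total += m
        exceed += m >= threshold
    return total / trials, exceed / trials

n, sigma, t = 1000, 1.0, 1.0
mean_max, tail_freq = simulate_max(n, sigma, trials=200, t=t)
expectation_bound = math.sqrt(2.0 * sigma ** 2 * math.log(n))
tail_bound = math.exp(-t)
print("average max:", mean_max, "bound:", expectation_bound)
print("tail frequency:", tail_freq, "bound:", tail_bound)
```

Both empirical quantities should sit below their bounds; the tail bound comes from a union bound over the $n$ variables, so it is typically loose.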

  15. Hoeffding's inequality. If $X_i$ are independent and bounded in $[a_i, b_i]$, then for $t \ge 0$, $P\left( \frac{1}{n} \sum_{i=1}^n (X_i - E[X_i]) \ge t \right) \le \exp\left( -\frac{2 n t^2}{\frac{1}{n} \sum_{i=1}^n (b_i - a_i)^2} \right)$ and $P\left( \frac{1}{n} \sum_{i=1}^n (X_i - E[X_i]) \le -t \right) \le \exp\left( -\frac{2 n t^2}{\frac{1}{n} \sum_{i=1}^n (b_i - a_i)^2} \right)$.
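A minimal sketch of how the one-sided Hoeffding bound is used in practice: evaluating the bound for given interval widths, and inverting it to find a sample size that guarantees failure probability at most $\delta$ when every variable has the same width. The function names and the parameter $\delta$ are introduced here for illustration.

```python
import math

def hoeffding_bound(t, widths):
    """One-sided bound exp(-2 n t^2 / ((1/n) sum (b_i - a_i)^2)); widths[i] = b_i - a_i."""
    n = len(widths)
    avg_sq_width = sum(w * w for w in widths) / n
    return math.exp(-2.0 * n * t * t / avg_sq_width)

def sample_size(t, delta, width):
    """Smallest n with exp(-2 n t^2 / width^2) <= delta, for a common width b - a."""
    return math.ceil(width ** 2 * math.log(1.0 / delta) / (2.0 * t ** 2))

# 100 variables in [0, 1], deviation t = 0.1: the bound is exp(-2).
print(hoeffding_bound(0.1, [1.0] * 100))
# Sample size needed for deviation 0.1 with failure probability at most 0.05.
print(sample_size(0.1, 0.05, 1.0))
```

For a two-sided guarantee, the two inequalities on the slide combine by a union bound, which doubles the failure probability.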

  16. Equivalent definitions of sub-Gaussianity. Theorem: The following are equivalent (up to constants): (i) $E[\exp(X^2 / \sigma^2)] \le e$; (ii) $E[|X|^k]^{1/k} \le \sigma \sqrt{k}$; (iii) $P(|X| \ge t) \le \exp\left( -\frac{t^2}{2\sigma^2} \right)$. If in addition $X$ is mean-zero, then the following is also equivalent to i–iii above: (iv) $X$ is $\sigma^2$-sub-Gaussian.

  17. Sub-exponential random variables. Definition (Sub-exponential): A mean-zero random variable $X$ is $(\tau^2, b)$-sub-exponential if $E[\exp(\lambda X)] \le \exp\left( \frac{\lambda^2 \tau^2}{2} \right)$ for $|\lambda| \le \frac{1}{b}$. Example: exponential random variable, with density $p(x) = \beta e^{-\beta x}$ for $x \ge 0$.

  18. Sub-exponential random variables. Example: $\chi^2$ random variable. Let $Z \sim N(0, \sigma^2)$ and $X = Z^2$. Then $E[e^{\lambda X}] = \frac{1}{[1 - 2 \lambda \sigma^2]_+^{1/2}}$.
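The MGF in this example comes from a Gaussian integral with a rescaled variance. A sketch of the computation:

```latex
E\left[ e^{\lambda Z^2} \right]
= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi\sigma^2}}
  \exp\left( -\frac{(1 - 2\lambda\sigma^2)\, x^2}{2\sigma^2} \right) dx
= \frac{1}{\sqrt{1 - 2\lambda\sigma^2}}
\quad \text{for } \lambda < \frac{1}{2\sigma^2}.
```

For $\lambda \ge \frac{1}{2\sigma^2}$ the integral diverges, which is what the positive-part notation $[\cdot]_+$ encodes. Because the MGF is finite only near zero, $Z^2$ cannot be sub-Gaussian, which is why the weaker sub-exponential condition is needed.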

  19. Concentration of sub-exponentials. Theorem: Let $X$ be $(\tau^2, b)$-sub-exponential. Then $P(X \ge E[X] + t) \le \max\left\{ e^{-\frac{t^2}{2\tau^2}}, e^{-\frac{t}{2b}} \right\}$, which equals $e^{-\frac{t^2}{2\tau^2}}$ if $0 \le t \le \frac{\tau^2}{b}$ and $e^{-\frac{t}{2b}}$ if $t \ge \frac{\tau^2}{b}$.
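The two regimes of this bound can be made concrete with a small helper. This is a sketch; the function name and the example parameters $\tau^2 = 4$, $b = 2$ are chosen here for illustration only.

```python
import math

def subexp_tail_bound(t, tau2, b):
    """Upper bound on P(X >= E[X] + t) for a (tau2, b)-sub-exponential X and t >= 0."""
    if t < 0:
        raise ValueError("t must be non-negative")
    if t <= tau2 / b:
        return math.exp(-t ** 2 / (2.0 * tau2))  # sub-Gaussian regime (small t)
    return math.exp(-t / (2.0 * b))              # exponential regime (large t)

tau2, b = 4.0, 2.0  # the regimes switch at t = tau2 / b = 2
for t in (1.0, 2.0, 4.0):
    print(t, subexp_tail_bound(t, tau2, b))
```

At the switch point $t = \tau^2 / b$ the two expressions agree, so the bound is continuous in $t$.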

  20. Sums of sub-exponential random variables. Let $X_i$ be independent $(\tau_i^2, b_i)$-sub-exponential random variables. Then $\sum_{i=1}^n X_i$ is $\left( \sum_{i=1}^n \tau_i^2, b_* \right)$-sub-exponential, where $b_* = \max_i b_i$. Corollary: If $X_i$ satisfy the above, then $P\left( \left| \frac{1}{n} \sum_{i=1}^n (X_i - E[X_i]) \right| \ge t \right) \le 2 \exp\left( -\min\left\{ \frac{n t^2}{2 \cdot \frac{1}{n} \sum_{i=1}^n \tau_i^2}, \frac{n t}{2 b_*} \right\} \right)$.

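  21. Bernstein conditions and sub-exponentials. Suppose $X$ is mean-zero with $|E[X^k]| \le \frac{1}{2} k! \, \sigma^2 b^{k-2}$ for all integers $k \ge 2$. Then $E[e^{\lambda X}] \le \exp\left( \frac{\lambda^2 \sigma^2}{2 (1 - b |\lambda|)} \right)$ for $|\lambda| < \frac{1}{b}$.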

  22. Johnson-Lindenstrauss and high-dimensional embedding. Question: Let $u_1, \ldots, u_m \in \mathbb{R}^d$ be arbitrary. Can we find a mapping $F : \mathbb{R}^d \to \mathbb{R}^n$, $n \ll d$, such that $(1 - \delta) \|u_i - u_j\|_2^2 \le \|F(u_i) - F(u_j)\|_2^2 \le (1 + \delta) \|u_i - u_j\|_2^2$ for all pairs $i, j$? Theorem (Johnson-Lindenstrauss embedding): For $n \gtrsim \frac{1}{\delta^2} \log m$ such a mapping exists.

  23. Proof of Johnson-Lindenstrauss continued. $P\left( \left| \frac{\|X u\|_2^2}{n \|u\|_2^2} - 1 \right| \ge t \right) \le 2 \exp\left( -\frac{n t^2}{8} \right)$ for $t \in [0, 1]$.
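A toy illustration of the embedding can be sketched with a random linear map. The slides do not spell out the construction, so the code below assumes $F(u) = X u / \sqrt{n}$ with $X \in \mathbb{R}^{n \times d}$ having i.i.d. $N(0, 1)$ entries, which is the standard choice consistent with the normalization $\|Xu\|_2^2 / (n \|u\|_2^2)$ above. The dimensions, seed, and helper names are hypothetical, and the example uses only the standard library, so it is slow for large $d$.

```python
import math
import random

def gaussian_matrix(n, d, rng):
    """n x d matrix with i.i.d. N(0, 1) entries (assumed construction)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]

def embed(X, u):
    """F(u) = X u / sqrt(n), where n is the number of rows of X."""
    scale = 1.0 / math.sqrt(len(X))
    return [scale * sum(x_k * u_k for x_k, u_k in zip(row, u)) for row in X]

def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

rng = random.Random(0)
d, n, m = 300, 200, 5  # toy sizes; in practice n is much smaller than d
points = [[rng.uniform(-1.0, 1.0) for _ in range(d)] for _ in range(m)]
X = gaussian_matrix(n, d, rng)
images = [embed(X, u) for u in points]

# Ratio of embedded to original squared distance for every pair of points.
ratios = [
    sq_dist(images[i], images[j]) / sq_dist(points[i], points[j])
    for i in range(m)
    for j in range(i + 1, m)
]
print("distortion range:", min(ratios), max(ratios))
```

Because $F$ is linear, $F(u_i) - F(u_j) = F(u_i - u_j)$, so each ratio is exactly the quantity controlled by the concentration inequality with $u = u_i - u_j$; a union bound over the pairs then gives the $\log m$ dependence in the theorem.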

  24. Reading and bibliography
      1. S. Boucheron, O. Bousquet, and G. Lugosi. Concentration inequalities. In O. Bousquet, U. von Luxburg, and G. Rätsch, editors, Advanced Lectures in Machine Learning, pages 208–240. Springer, 2004.
      2. V. Buldygin and Y. Kozachenko. Metric Characterization of Random Variables and Random Processes, volume 188 of Translations of Mathematical Monographs. American Mathematical Society, 2000.
      3. M. Ledoux. The Concentration of Measure Phenomenon. American Mathematical Society, 2001.
      4. S. Boucheron, G. Lugosi, and P. Massart. Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford University Press, 2013.
