Self-bounding functions and concentration of variance

Self-bounding functions and concentration of variance. Andreas Maurer. Advances in stochastic inequalities and their applications, BIRS 2009.


  1. Self-bounding functions and concentration of variance Andreas Maurer Advances in stochastic inequalities and their applications, BIRS 2009

  2. Notation and definitions

     $\Omega := \prod_{k=1}^n \Omega_k$ is some product space with product probability $\mu = \otimes_{k=1}^n \mu_k$.

     For $x \in \Omega$ and $y \in \Omega_k$ write $x_{y,k} := (x_1, \dots, x_{k-1}, y, x_{k+1}, \dots, x_n)$.

     $f : \Omega \to \mathbb{R}$ is some generic function, bounded below. For $1 \le k \le n$ define functions $\inf_k f$, $Df : \Omega \to \mathbb{R}$ by

     $$\inf_k f(x) := \inf_{y \in \Omega_k} f(x_{y,k}), \qquad Df(x) := \sum_{k=1}^n \left( f(x) - \inf_k f(x) \right)^2 .$$

     $Df$ is a local measure of the sensitivity of $f$ to modifications of individual arguments.
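
The definitions of $\inf_k f$ and $Df$ can be made concrete with a small numerical sketch. It assumes every coordinate space is the interval $[0,1]$ and approximates the infimum by a search over a finite grid of candidate values, so it is only an approximation in general. The names `inf_k`, `D` and `candidates` are illustrative and do not come from the slides.

```python
import numpy as np

def inf_k(f, x, k, candidates):
    """Approximate inf_k f(x): minimise f over replacements of coordinate k."""
    values = []
    for y in candidates:
        x_mod = np.array(x, dtype=float)  # fresh copy of x
        x_mod[k] = y                      # this is x_{y,k}
        values.append(f(x_mod))
    return min(values)

def D(f, x, candidates):
    """Approximate Df(x) = sum_k (f(x) - inf_k f(x))^2 on the candidate grid."""
    fx = f(np.asarray(x, dtype=float))
    return sum((fx - inf_k(f, x, k, candidates)) ** 2 for k in range(len(x)))

# Example: f(x) = x_1 + ... + x_n on [0, 1]^n, where inf_k f(x) = f(x) - x_k,
# so Df(x) should equal sum_k x_k^2 up to rounding.
grid = np.linspace(0.0, 1.0, 101)
x = np.array([0.2, 0.5, 1.0])
print(D(np.sum, x, grid))
```

For the sum, the infimum over each coordinate is attained at $y = 0$, which lies on the grid, so the grid search is exact in this example.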

  3. Theorem 1 (Boucheron, Lugosi, Massart (2003), Maurer (2006))

     $$\Pr\{ f - E[f] \ge t \} \le \exp\left( \frac{-t^2}{2 \|Df\|_\infty} \right).$$

     If also $\forall k,\ f - \inf_k f \le 1$ a.s., then

     $$\Pr\{ E[f] - f \ge t \} \le \exp\left( \frac{-t^2}{2 \|Df\|_\infty + 2t/3} \right).$$

     Applies to convex Lipschitz functions, eigenvalues of random symmetric matrices, shortest TSP tours, ...

  4. Theorem 2 (Boucheron, Lugosi, Massart (2003), Maurer (2006))

     Suppose $Df \le a f$ a.s., with $a > 0$. Then

     $$\Pr\{ f - E[f] \ge t \} \le \exp\left( \frac{-t^2}{2 a E[f] + a t} \right).$$

     If also $\forall k,\ f - \inf_k f \le 1$ a.s. and $a \ge 1$, then

     $$\Pr\{ E[f] - f \ge t \} \le \exp\left( \frac{-t^2}{2 a E[f]} \right).$$

     This talk is about applications of this result.

  5. Application 1: amendment to Theorem 1, idea from Boucheron, Lugosi, Massart (2009)

     If $f \ge 0$ and $f^2 - \inf_k f^2 \le 1$, then

     $$\Pr\{ E[f] - f \ge t \} \le \exp\left( \frac{-t^2}{8 \|Df\|_\infty} \right).$$

     Proof:

     $$D(f^2) = \sum_k \left( f^2 - \inf_k f^2 \right)^2 = \sum_k \left( f - \inf_k f \right)^2 \left( f + \inf_k f \right)^2 \le (Df)(2f)^2 \le 4 \|Df\|_\infty f^2,$$

     so by Theorem 2 applied to $f^2$

     $$\Pr\{ E[f] - f \ge t \} \le \Pr\left\{ E[f^2] - f^2 \ge E[f]\, t \right\} \le \exp\left( \frac{-t^2}{8 \|Df\|_\infty} \right).$$

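  6. Application 2 (with Massi Pontil for COLT09)

     $X, X_1, \dots, X_n$ are iid random variables with values in $[0,1]$. We want bounds on $E X$ in terms of the sample $\mathbf{X} = (X_1, \dots, X_n)$ that hold with high confidence $1 - \delta$. Below, $\bar{\mathbf{X}}$ denotes the sample mean and $V$ the variance of $X$.

     Hoeffding:

     $$\Pr\left\{ E X - \bar{\mathbf{X}} \le \sqrt{\frac{\ln 1/\delta}{2n}} \right\} \ge 1 - \delta.$$

     Bernstein/Bennett:

     $$\Pr\left\{ E X - \bar{\mathbf{X}} \le \sqrt{V} \sqrt{\frac{2 \ln 1/\delta}{n}} + \frac{\ln 1/\delta}{3n} \right\} \ge 1 - \delta.$$

     To use Bernstein without other information we need a bound on the standard deviation $\sqrt{V}$ in terms of the sample.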
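
A quick numerical comparison shows why the Bernstein/Bennett bound is attractive when the variance is small. The sketch below simply evaluates the two deviation terms from the inequalities above; the function names `hoeffding_radius` and `bernstein_radius` are illustrative, and the variance is treated as known, which is exactly the assumption the following slides remove.

```python
import math

def hoeffding_radius(n, delta):
    """Deviation term sqrt(ln(1/delta) / (2n)) from Hoeffding's inequality."""
    return math.sqrt(math.log(1.0 / delta) / (2.0 * n))

def bernstein_radius(variance, n, delta):
    """Deviation term sqrt(2 V ln(1/delta) / n) + ln(1/delta) / (3n)."""
    log_term = math.log(1.0 / delta)
    return math.sqrt(2.0 * variance * log_term / n) + log_term / (3.0 * n)

n, delta = 1000, 0.05
for variance in (0.25, 0.01):
    # variance = 0.25 is the largest possible value for a [0, 1]-valued variable
    print(variance, hoeffding_radius(n, delta), bernstein_radius(variance, n, delta))
```

At the maximal variance 0.25 the Bernstein term is slightly larger than the Hoeffding term, while for small variance it is considerably smaller.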

  7. Estimators for variance and standard deviation

     For the variance use the sample variance $\hat{V}$: for $x \in [0,1]^n$,

     $$\hat{V}(x) = \frac{1}{2n(n-1)} \sum_{i,j} (x_i - x_j)^2 .$$

     For the standard deviation we use $\sqrt{\hat{V}}$. Then we can show this: $f := n \hat{V}$ satisfies

     $$f - \inf_k f \le 1 \quad \text{and} \quad Df \le \frac{n}{n-1} f,$$

     and Theorem 2 gives the lower tail bounds

     $$\Pr\{ V - \hat{V} > t \} \le \exp\left( \frac{-(n-1) t^2}{2V} \right), \quad \text{and} \quad \Pr\left\{ \sqrt{V} - \sqrt{\hat{V}} > t \right\} \le \exp\left( \frac{-(n-1) t^2}{2} \right).$$
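
The two self-bounding properties of $f = n\hat{V}$ can be checked numerically on a random sample. The sketch below relies on an observation that is not stated on the slide: $f$ is quadratic in each coordinate, so the infimum over the $k$-th coordinate is attained at the mean of the remaining points, which lies in $[0,1]$. The function names are illustrative.

```python
import numpy as np

def sample_variance(x):
    """V_hat(x) = (1 / (2 n (n - 1))) * sum_{i,j} (x_i - x_j)^2."""
    n = len(x)
    diffs = x[:, None] - x[None, :]
    return np.sum(diffs ** 2) / (2.0 * n * (n - 1))

def self_bounding_check(x):
    """Return f = n * V_hat, max_k (f - inf_k f), and Df for the sample x."""
    n = len(x)
    f = n * sample_variance(x)
    increments = np.empty(n)
    for k in range(n):
        x_min = x.copy()
        # Assumed minimiser of y -> f(x_{y,k}): the mean of the other points.
        x_min[k] = np.delete(x, k).mean()
        increments[k] = f - n * sample_variance(x_min)
    return f, increments.max(), np.sum(increments ** 2)

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=20)
f, max_inc, Df = self_bounding_check(x)
print(max_inc <= 1.0, Df <= 20 / 19 * f)
```

This is a spot check on one sample rather than a proof; it only illustrates the inequalities $f - \inf_k f \le 1$ and $Df \le \frac{n}{n-1} f$.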

  8. Other methods to get such bounds

     Audibert, Munos, Szepesvári (2007): apply Bernstein-like bounds to $X_i$, $-X_i$ and $-(X_i - E X)^2$ respectively, and combine to get

     $$\Pr\left\{ \sqrt{V} - \sqrt{\hat{V}_{\mathrm{emp}}} > t \right\} \le 3 \exp\left( \frac{-n t^2}{3.24} \right),$$

     where $\hat{V}_{\mathrm{emp}} = (n-1) \hat{V} / n$ (= variance of the empirical distribution).

     Alternative: $\hat{V}$ is a U-statistic with kernel $q(x, x') = (x - x')^2 / 2$. Hoeffding's version of Bennett's inequality for U-statistics leads to

     $$\Pr\left\{ \sqrt{V} - \sqrt{\hat{V}} > t \right\} \le \exp\left( \frac{-(n-1) t^2}{2.62} \right).$$

  9. Empirical Bernstein bounds

     Substitution of the above in Bernstein's inequality gives the empirical version:

     $$\Pr\left\{ E X - \bar{\mathbf{X}} \le \sqrt{\hat{V}} \sqrt{\frac{2 \ln 2/\delta}{n}} + \frac{7 \ln 2/\delta}{3(n-1)} \right\} \ge 1 - \delta.$$

     Applications: multi-armed bandit problem (Audibert, Munos, Szepesvári, 2007), stopping algorithms (Mnih, Szepesvári, Audibert, 2008), sample variance penalization (Pontil, Maurer, 2009).
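
Since the empirical Bernstein bound depends only on the sample, it can be computed directly. The following is a minimal sketch for values in $[0,1]$; the function name `empirical_bernstein_upper` is illustrative, and the sample mean and the unbiased sample variance are taken from NumPy.

```python
import math
import numpy as np

def empirical_bernstein_upper(sample, delta):
    """Upper confidence bound on E X, valid with probability >= 1 - delta
    for iid samples with values in [0, 1], following the bound above."""
    x = np.asarray(sample, dtype=float)
    n = x.size
    # Unbiased sample variance; equals (1/(2n(n-1))) * sum_{i,j} (x_i - x_j)^2.
    v_hat = x.var(ddof=1)
    log_term = math.log(2.0 / delta)
    return (x.mean()
            + math.sqrt(2.0 * v_hat * log_term / n)
            + 7.0 * log_term / (3.0 * (n - 1)))

rng = np.random.default_rng(0)
sample = rng.uniform(0.4, 0.6, size=500)   # low-variance data in [0, 1]
print(sample.mean(), empirical_bernstein_upper(sample, delta=0.05))
```

For a constant sample the variance term vanishes and only the $7 \ln(2/\delta) / (3(n-1))$ term remains, which decays at rate $1/n$ rather than $1/\sqrt{n}$.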

  10. Application 3 (Largest eigenvalue of the Gramian)

     $X = (X_1, \dots, X_n)$ are independent random variables distributed in the unit ball $B$ of a Hilbert space $H$.

     $G(x)_{ij} = \langle x_i, x_j \rangle$, $f(x) = \lambda_{\max}(x)$ = largest eigenvalue of $G(x)$.

     By Weyl's monotonicity, $\inf_k f(x) = f(x_{0,k})$. Also $\exists u \in \mathbb{R}^n$, $\|u\|_{\mathbb{R}^n} = 1$, such that

     $$f(x) - f(x_{0,k}) \le \Big\| \sum_i u_i x_i \Big\|^2 - \Big\| \sum_{i \ne k} u_i x_i \Big\|^2 = \Big\langle u_k x_k,\ \sum_i u_i x_i + \sum_{i \ne k} u_i x_i \Big\rangle \le 2 |u_k| \Big\| \sum_i u_i x_i \Big\| = 2 |u_k| \sqrt{f(x)} .$$

     Conclusion 1: $f - \inf_k f \le 1$.

     Conclusion 2: square and sum over $k$ to get $Df \le 4f$.
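
Both conclusions for the largest Gramian eigenvalue can be spot-checked numerically. The sketch below assumes the Hilbert space is $\mathbb{R}^5$ and uses the slide's identity $\inf_k f(x) = f(x_{0,k})$, i.e. the infimum is obtained by replacing the $k$-th point with the zero vector. The function names and the way the points are drawn are illustrative choices.

```python
import numpy as np

def lambda_max(points):
    """Largest eigenvalue of the Gramian G_ij = <x_i, x_j>."""
    gram = points @ points.T
    return np.linalg.eigvalsh(gram)[-1]   # eigvalsh returns ascending eigenvalues

def check_gramian(points):
    """Return f, max_k (f - inf_k f), and Df, using inf_k f(x) = f(x_{0,k})."""
    f = lambda_max(points)
    increments = []
    for k in range(points.shape[0]):
        zeroed = points.copy()
        zeroed[k] = 0.0                   # x_{0,k}: k-th point replaced by 0
        increments.append(f - lambda_max(zeroed))
    increments = np.array(increments)
    return f, increments.max(), np.sum(increments ** 2)

rng = np.random.default_rng(1)
pts = rng.normal(size=(30, 5))
# Rescale any point with norm > 1 so that all points lie in the unit ball.
pts /= np.maximum(1.0, np.linalg.norm(pts, axis=1, keepdims=True))
f, max_inc, Df = check_gramian(pts)
print(max_inc <= 1.0, Df <= 4.0 * f)
```

As with any simulation, this illustrates the inequalities on one draw and does not replace the argument on the slide.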

  11. Application 3 (Largest eigenvalue of the Gramian)

     $X = (X_1, \dots, X_n)$ are independent random variables distributed in the unit ball $B$ of a Hilbert space $H$.

     $G(x)_{ij} = \langle x_i, x_j \rangle$, $f(x) = \lambda_{\max}(x)$ = largest eigenvalue of $G(x)$.

     From Theorem 2 we get

     $$\Pr\{ \lambda_{\max} - E \lambda_{\max} > t \} \le \exp\left( \frac{-t^2}{8 E \lambda_{\max} + 4t} \right),$$

     $$\Pr\{ E \lambda_{\max} - \lambda_{\max} > t \} \le \exp\left( \frac{-t^2}{8 E \lambda_{\max}} \right).$$

     For the largest singular value $\sigma_{\max}$ of the matrix $X$ we get

     $$\Pr\{ \pm(\sigma_{\max} - E \sigma_{\max}) > t \} \le e^{-t^2/8} .$$

  12. Another result related to self-bounded functions

     Theorem 3. Suppose $f, g : \Omega \to \mathbb{R}$, $0 \le f \le g$, $Df \le a g$, $Dg \le a g$ and $a \ge 1$. Then

     $$\Pr\{ f - E f > t \} \le \exp\left( \frac{-t^2}{4 a E g + 3 a t / 2} \right).$$

     If also $f - \inf_k f \le 1$, then

     $$\Pr\{ E f - f > t \} \le \exp\left( \frac{-t^2}{4 a E g + a t} \right).$$

  13. Application 4 (any eigenvalue of the Gramian)

     $X = (X_1, \dots, X_n)$ are independent random variables distributed in the unit ball $B$ of a Hilbert space $H$.

     $G(X)_{ij} = \langle X_i, X_j \rangle$; now let $\lambda_d(X)$ be any eigenvalue of $G(X)$.

     Set $f := \lambda_d / 2$ and $g := \lambda_{\max} / 2$. We can show

     $$0 \le f \le g, \quad f - \inf_k f \le 1, \quad Df \le 2g, \quad Dg \le 2g .$$

     Applying Theorem 3 gives

     $$\Pr\{ \lambda_d - E \lambda_d > t \} \le \exp\left( \frac{-t^2}{16 E \lambda_{\max} + 6t} \right),$$

     $$\Pr\{ E \lambda_d - \lambda_d > t \} \le \exp\left( \frac{-t^2}{16 E \lambda_{\max} + 4t} \right).$$
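
The constants 16, 6 and 4 come from substituting $f = \lambda_d/2$, $g = \lambda_{\max}/2$ and $a = 2$ into Theorem 3 and evaluating it at deviation $t/2$. The arithmetic is not spelled out on the slide; one way to check it is:

```latex
\begin{align*}
\Pr\{\lambda_d - E\lambda_d > t\}
  &= \Pr\{f - Ef > t/2\}
   \le \exp\!\left(\frac{-(t/2)^2}{4 \cdot 2 \cdot Eg + 3 \cdot 2 \cdot (t/2)/2}\right) \\
  &= \exp\!\left(\frac{-t^2/4}{4\,E\lambda_{\max} + 3t/2}\right)
   = \exp\!\left(\frac{-t^2}{16\,E\lambda_{\max} + 6t}\right), \\[4pt]
\Pr\{E\lambda_d - \lambda_d > t\}
  &= \Pr\{Ef - f > t/2\}
   \le \exp\!\left(\frac{-(t/2)^2}{4 \cdot 2 \cdot Eg + 2 \cdot (t/2)}\right) \\
  &= \exp\!\left(\frac{-t^2/4}{4\,E\lambda_{\max} + t}\right)
   = \exp\!\left(\frac{-t^2}{16\,E\lambda_{\max} + 4t}\right).
\end{align*}
```

Here $4 a E g = 4 \cdot 2 \cdot E\lambda_{\max}/2 = 4 E\lambda_{\max}$, and the final step multiplies numerator and denominator by 4.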

  14. References

     [1] J. Y. Audibert, R. Munos, C. Szepesvári. Exploration-exploitation trade-off using variance estimates in multi-armed bandits. Theoretical Computer Science, 2008.

     [2] S. Boucheron, G. Lugosi, P. Massart. Concentration inequalities using the entropy method. Annals of Probability 31:1583-1614, 2003.

     [3] M. Ledoux. The Concentration of Measure Phenomenon. AMS Surveys and Monographs 89, 2001.

     [4] A. Maurer. Concentration inequalities for functions of independent variables. Random Structures Algorithms 29:121-138, 2006.

  15. [5] V. Mnih, C. Szepesvári, J. Y. Audibert. Empirical Bernstein Stopping. ICML 2008.
