Mathematical Foundations of Infinite-Dimensional Statistical Models



  1. Mathematical Foundations of Infinite-Dimensional Statistical Models: 3.3 The Entropy Method and Talagrand's Inequality (3.3.4, 3.3.5). 이종진, Seoul National University, ga0408@snu.ac.kr. Nov 15, 2018.

  2. Table of Contents. 3.3 The Entropy Method and Talagrand's Inequality: 3.3.2 & 3.3.3; 3.3.4 The Entropy Method for Random Variables with Bounded Differences and for Self-Bounding Random Variables; 3.3.5 The Upper Tail in Talagrand's Inequality for Nonidentically Distributed Random Variables ∗.

  3. $\mathrm{Ent}_\mu f := E_\mu(f \log f) - E_\mu f \cdot \log E_\mu f$
     ◮ Exponential inequality $E e^{\lambda(Z - EZ)} \le \dots$ for: 1. subadditive random variables; 2. functions satisfying the bounded differences condition; 3. self-bounding random variables.
     ◮ Talagrand's inequality: 1. the upper tail, Bousquet's version ($v_n$); 2. the lower tail, Klein's version ($v_n$); 3. the lower tail, Klein-Rio version ($V_n$); 4. the upper tail for nonidentically distributed random variables ($V_n$).
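The entropy functional on this slide can be evaluated directly on a finite probability space. The following is a minimal sketch, assuming a strictly positive function given by its values and a probability vector; the name `entropy` and the sample numbers are illustrative and not from the slides.

```python
import math

def entropy(values, probs):
    """Ent_mu(f) = E[f log f] - E[f] * log E[f] for a positive f on a finite space."""
    mean_f = sum(p * v for v, p in zip(values, probs))
    mean_f_log_f = sum(p * v * math.log(v) for v, p in zip(values, probs))
    return mean_f_log_f - mean_f * math.log(mean_f)

probs = [0.2, 0.5, 0.3]                          # illustrative probability vector
const_ent = entropy([2.0, 2.0, 2.0], probs)      # constant f: entropy vanishes
varying_ent = entropy([1.0, 2.0, 4.0], probs)    # non-constant f: entropy is positive
scaled_ent = entropy([3.0, 6.0, 12.0], probs)    # Ent(c f) = c Ent(f) with c = 3
```

The entropy is zero for constants, nonnegative by Jensen's inequality applied to $x \log x$, and homogeneous of degree one, which are the properties the entropy method relies on.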

  4. 3.3.2 & 3.3.3

  5. Theorem 3.3.7. Let $Z = Z(X_1, \dots, X_n)$, $X_i$ independent, be a subadditive random variable relative to $Z_k = Z_k(X_1, \dots, X_{k-1}, X_{k+1}, \dots, X_n)$, $k = 1, \dots, n$, such that $EZ \ge 0$ and for which there exist random variables $Y_k \le Z - Z_k \le 1$ such that $E_k Y_k \ge 0$. Let $\sigma^2 < \infty$ be any real number satisfying
     $$\frac{1}{n} \sum_{k=1}^{n} E_k Y_k^2 \le \sigma^2,$$
     and set $v := 2EZ + n\sigma^2$. Then
     $$\log E e^{\lambda(Z - EZ)} \le v\,(e^{\lambda} - \lambda - 1) = v\,\phi(-\lambda), \quad \lambda \ge 0.$$

  6. ◮ A Taylor expansion gives $\mathrm{Var}\, Z \le 2EZ + n\sigma^2$.
     ◮ Applying Proposition 3.1.6 to Theorem 3.3.7 yields upper bounds for the tail probabilities of $Z - EZ$.
     Corollary 3.3.8. Let $Z$ be as in Theorem 3.3.7. Then, for all $t \ge 0$,
     $$P(Z \ge EZ + t) \le \exp\big(-v\,h_1(t/v)\big) \le \exp\Big(-\frac{3t}{4} \log\Big(1 + \frac{2t}{3v}\Big)\Big) \le \exp\Big(-\frac{t^2}{2v + 2t/3}\Big)$$
     and
     $$P\big(Z \ge EZ + \sqrt{2vx} + x/3\big) \le e^{-x}, \quad x \ge 0.$$
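The chain of bounds in Corollary 3.3.8 can be checked numerically. The sketch below assumes the usual definition $h_1(u) = (1+u)\log(1+u) - u$ (not restated on the slides); the function names `tail_bounds` and `deviation` are illustrative.

```python
import math

def h1(u):
    """h_1(u) = (1 + u) log(1 + u) - u (assumed definition)."""
    return (1.0 + u) * math.log1p(u) - u

def tail_bounds(t, v):
    """Return the three successive upper bounds on P(Z >= EZ + t)."""
    bennett = math.exp(-v * h1(t / v))
    log_form = math.exp(-0.75 * t * math.log1p(2.0 * t / (3.0 * v)))
    bernstein = math.exp(-t * t / (2.0 * v + 2.0 * t / 3.0))
    return bennett, log_form, bernstein

def deviation(x, v):
    """Deviation level sqrt(2 v x) + x / 3 exceeded with probability at most exp(-x)."""
    return math.sqrt(2.0 * v * x) + x / 3.0

bennett, log_form, bernstein = tail_bounds(t=1.0, v=2.0)
```

Each bound is weaker but easier to invert than the one before it; the last display of the corollary is the inverted form, and evaluating the first bound at `deviation(x, v)` gives a value of at most `exp(-x)`.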

  7. Theorem 3.3.9 (Upper tail of Talagrand's inequality, Bousquet's version). Let $(S, \mathcal{S})$ be a measurable space, and let $n \in \mathbb{N}$. Let $X_1, \dots, X_n$ be independent $S$-valued random variables. Let $\mathcal{F}$ be a countable set of measurable real-valued functions on $S$ such that $\|f\|_\infty \le U < \infty$ and $Ef(X_1) = \dots = Ef(X_n) = 0$ for all $f \in \mathcal{F}$. Let
     $$S_j = \sup_{f \in \mathcal{F}} \sum_{k=1}^{j} f(X_k) \quad \text{or} \quad S_j = \sup_{f \in \mathcal{F}} \Big| \sum_{k=1}^{j} f(X_k) \Big|, \quad j = 1, \dots, n,$$
     and let the parameters $\sigma^2$ and $v_n$ be defined by
     $$U^2 \ge \sigma^2 \ge \frac{1}{n} \sum_{k=1}^{n} \sup_{f \in \mathcal{F}} Ef^2(X_k) \quad \text{and} \quad v_n = 2U\,ES_n + n\sigma^2.$$

  8. Theorem 3.3.9 (continued). Then
     $$\log E e^{\lambda(S_n - ES_n)} \le v_n\,(e^{\lambda} - 1 - \lambda), \quad \lambda \ge 0.$$
     As a consequence,
     $$P(S_n \ge ES_n + x) \le P\Big(\max_{1 \le j \le n} S_j \ge ES_n + x\Big) \le e^{-(v_n/U^2)\,h_1(xU/v_n)} \le \exp\Big[-\frac{3x}{4U} \log\Big(1 + \frac{2xU}{3v_n}\Big)\Big] \le \exp\Big[-\frac{x^2}{2v_n + 2xU/3}\Big]$$
     and
     $$P\Big(S_n \ge ES_n + \sqrt{2v_n x} + \frac{Ux}{3}\Big) \le P\Big(\max_{1 \le j \le n} S_j \ge ES_n + \sqrt{2v_n x} + \frac{Ux}{3}\Big) \le e^{-x},$$
     for all $x \ge 0$.

  9. Theorem 3.3.10 (Lower tail of Talagrand's inequality: Klein's version). Under the same hypotheses and notation as in Theorem 3.3.9, we have
     $$E e^{-t(S_n - ES_n)} \le \exp\Big(v_n\,\frac{e^{4t} - 1 - 4t}{16}\Big) = e^{v_n \phi(-4t)/16}, \quad 0 \le t < 1.$$
     As a consequence, for all $x \ge 0$,
     $$P(S_n \le ES_n - x) \le \exp\Big(-\frac{v_n}{16U^2}\,h_1\Big(\frac{4xU}{v_n}\Big)\Big) \le \exp\Big(-\frac{3x}{16U} \log\Big(1 + \frac{8xU}{3v_n}\Big)\Big) \le \exp\Big(-\frac{x^2}{2v_n + 8xU/3}\Big)$$
     and
     $$P\Big(S_n \le ES_n - \sqrt{2v_n x} - \frac{4Ux}{3}\Big) \le e^{-x}.$$

  10. Remark 3.3.11 (Klein-Rio version, Klein and Rio (2005)). Setting
      $$V_n = 2U\,ES_n + \sup_{f} \sum_{k=1}^{n} Ef^2(X_k),$$
      one has
      $$E e^{-t(S_n - ES_n)} \le \exp\Big(V_n\,\frac{e^{3t} - 1 - 3t}{9}\Big) = e^{V_n \phi(-3t)/9}, \quad 0 \le t < 1,$$
      and, as a consequence, for all $x \ge 0$,
      $$P(S_n \le ES_n - x) \le \exp\Big(-\frac{V_n}{9U^2}\,h_1\Big(\frac{3xU}{V_n}\Big)\Big) \le \exp\Big(-\frac{x}{4U} \log\Big(1 + \frac{2xU}{V_n}\Big)\Big) \le \exp\Big(-\frac{x^2}{2V_n + 2xU}\Big)$$
      and
      $$P\big(S_n \le ES_n - \sqrt{2V_n x} - Ux\big) \le e^{-x}.$$
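The difference between Klein's $v_n$ and the Klein-Rio $V_n$ is the order of the sum and the supremum in the variance term. The sketch below uses a hypothetical table of second moments $Ef^2(X_k)$ and a hypothetical value for $ES_n$, and it assumes that $n\sigma^2$ is taken at its smallest admissible value, the sum over $k$ of $\sup_f Ef^2(X_k)$.

```python
import math

# Hypothetical second moments E f^2(X_k): rows are functions f, columns are indices k.
second_moments = [
    [0.9, 0.1, 0.1],
    [0.1, 0.9, 0.1],
    [0.1, 0.1, 0.9],
]
U = 1.0          # envelope bound, ||f||_inf <= U
mean_sup = 2.0   # hypothetical value of E S_n

n = len(second_moments[0])
sum_of_sups = sum(max(row[k] for row in second_moments) for k in range(n))
sup_of_sums = max(sum(row) for row in second_moments)

v_n = 2.0 * U * mean_sup + sum_of_sups   # Klein (assumed smallest n * sigma^2)
V_n = 2.0 * U * mean_sup + sup_of_sums   # Klein-Rio

def klein_deviation(x):
    """Lower-tail deviation level sqrt(2 v_n x) + 4 U x / 3."""
    return math.sqrt(2.0 * v_n * x) + 4.0 * U * x / 3.0

def klein_rio_deviation(x):
    """Lower-tail deviation level sqrt(2 V_n x) + U x."""
    return math.sqrt(2.0 * V_n * x) + U * x
```

Since a supremum of sums never exceeds the sum of suprema, $V_n \le v_n$ under this reading, and the Klein-Rio deviation level is smaller in both the variance term and the linear term.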

  11. 3.3.4 The Entropy Method for Random Variables with Bounded Differences and for Self-Bounding Random Variables

  12. Bounded Differences. Definition 3.3.12. Let $(S_i, \mathcal{S}_i)$, $i = 1, \dots, n$, be measurable spaces, and let $f : \prod_{i=1}^{n} S_i \to \mathbb{R}$ be a measurable function. $f$ has bounded differences if
      $$\sup_{x_i,\, x_i' \in S_i} \big| f(x_1, \dots, x_n) - f(x_1, \dots, x_{i-1}, x_i', x_{i+1}, \dots, x_n) \big| \le c_i, \quad i \le n,$$
      where, for each $i$, $c_i$ is a measurable function of $x_j$, $j \ne i$, and there exists a finite constant $c$ such that $\sum_{i=1}^{n} c_i^2 \le c^2$ for all $(x_1, \dots, x_n) \in \prod_{i=1}^{n} S_i$. If $Z = f(X_1, \dots, X_n)$, where the $X_i$ are independent $S_i$-valued random variables, we say that the random variable $Z$ has bounded differences.

  13. Theorem 3.3.14. If $Z$ has bounded differences and $\sum_i c_i^2 \le c^2$, then, for all $\lambda \ge 0$,
      $$E e^{\lambda(Z - EZ)} \le e^{\lambda^2 c^2 / 8} \quad (3.115)$$
      so that, for all $t \ge 0$,
      $$\Pr\{Z \ge EZ + t\} \le e^{-2t^2/c^2}, \quad \Pr\{Z \le EZ - t\} \le e^{-2t^2/c^2}. \quad (3.116)$$
      Moreover,
      $$\mathrm{Var}(Z) \le \frac{c^2}{4}. \quad (3.117)$$
      Proof (sketch). Use the identity
      $$\mathrm{Ent}_\mu\big(e^{\lambda(Y - EY)}\big) = E e^{\lambda Y}\big(\lambda L_Y'(\lambda) - L_Y(\lambda)\big) = E e^{\lambda Y} \int_0^{\lambda} t\,L_Y''(t)\,dt, \quad L_Y = \log F_Y,$$
      together with tensorisation of entropy (Proposition 2.5.3).
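A simple instance of Theorem 3.3.14 is a sum of $n$ independent Bernoulli variables: changing one coordinate moves the sum by at most $c_i = 1$, so $c^2 = n$. The sketch below compares the exact binomial upper tail with the bound (3.116); the parameters $n = 50$, $p = 1/2$ and the helper name are illustrative choices.

```python
import math

def binomial_upper_tail(n, p, threshold):
    """P(Z >= threshold) for Z ~ Binomial(n, p), computed exactly."""
    start = max(0, math.ceil(threshold))
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(start, n + 1))

n, p = 50, 0.5
c_squared = n  # each coordinate changes Z by at most c_i = 1

checks = []
for t in (2.0, 5.0, 10.0):
    exact = binomial_upper_tail(n, p, n * p + t)
    bound = math.exp(-2.0 * t * t / c_squared)   # right-hand side of (3.116)
    checks.append((t, exact, bound))

variance = n * p * (1 - p)  # equals c^2 / 4 when p = 1/2, so (3.117) is attained
```

With $p = 1/2$ the variance bound (3.117) holds with equality, which shows the constant $1/4$ cannot be improved in general.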

  14. Previous seminar. Definition. If $Z$ and $Z_k$ satisfy
      $$0 \le Z - Z_k \le 1 \ (1 \le k \le n), \quad \sum_{k=1}^{n} (Z - Z_k) \le Z,$$
      then $Z$ is called self-bounding.
      ◮ If $Z$ is self-bounding, then, with $L(\lambda) := \log F(\lambda)$, the inequality of Proposition 3.3.1 can be put in the much simpler form
      $$(\lambda - \phi(\lambda))\,L'(\lambda) - L(\lambda) \le \phi(\lambda)\,EZ. \quad (3.79)$$

  15. Theorem 3.3.15. Let $Z$ be a self-bounding random variable. Then
      $$\log E\big(e^{\lambda(Z - EZ)}\big) \le \phi(-\lambda)\,EZ, \quad \lambda \in \mathbb{R}. \quad (3.123)$$
      This applies in particular to $Z = \sup_{f \in \mathcal{F}} \sum_{k=1}^{n} f(X_k)$, where $\mathcal{F}$ is countable and $0 \le f(x) \le 1$ for all $x \in S$ and $f \in \mathcal{F}$. Here $\phi(\lambda) = e^{-\lambda} + \lambda - 1$.
      Proof. Since $\phi(\lambda) + \phi(-\lambda) = \phi'(\lambda)\,\phi'(-\lambda)$, the function $\psi_0(\lambda) := v\,\phi(-\lambda)$, with $v = EZ$, is a solution of (3.79) with equality.
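The claim in the proof can be verified by direct substitution. The following worked computation assumes $\phi(\lambda) = e^{-\lambda} + \lambda - 1$ as on the slide and takes $v = EZ$, so that $\psi_0(\lambda) = v(e^{\lambda} - \lambda - 1)$.

```latex
\begin{aligned}
\lambda - \phi(\lambda) &= 1 - e^{-\lambda}, \qquad \psi_0'(\lambda) = v\,(e^{\lambda} - 1), \\
(\lambda - \phi(\lambda))\,\psi_0'(\lambda) - \psi_0(\lambda)
  &= v\,\big[(1 - e^{-\lambda})(e^{\lambda} - 1) - (e^{\lambda} - \lambda - 1)\big] \\
  &= v\,\big[e^{\lambda} - 2 + e^{-\lambda} - e^{\lambda} + \lambda + 1\big] \\
  &= v\,(e^{-\lambda} + \lambda - 1) = v\,\phi(\lambda).
\end{aligned}
```

So $\psi_0$ satisfies (3.79) with equality when $v = EZ$, which is why it serves as the comparison function for $L$.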

  16. As a consequence of Theorem 3.3.15 and Proposition 3.1.6,
      $$\Pr\{Z \ge EZ + t\} \le \exp\big(-(EZ)\,h_1(t/EZ)\big), \quad (3.124)$$
      $$\Pr\{Z \le EZ - t\} \le \exp\big(-(EZ)\,h_1(-t/EZ)\big),$$
      $$\Pr\{Z \ge EZ + t\} \le \exp\Big(-\frac{3t}{4} \log\Big(1 + \frac{2t}{3EZ}\Big)\Big) \le \exp\Big(-\frac{t^2}{2EZ + 2t/3}\Big),$$
      $$\Pr\{Z \le EZ - t\} \le \exp\big(-t^2/(2EZ)\big),$$
      and $\mathrm{Var}(Z) \le EZ$.
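A binomial variable $Z = \sum_k X_k$ with $X_k \in \{0, 1\}$ is self-bounding (take $Z_k = Z - X_k$), so these bounds can be compared with exact probabilities. The sketch below assumes $h_1(u) = (1+u)\log(1+u) - u$; the parameters $n = 40$, $p = 0.1$, $t = 3$ are illustrative.

```python
import math

def h1(u):
    """h_1(u) = (1 + u) log(1 + u) - u (assumed definition)."""
    return (1.0 + u) * math.log1p(u) - u

def binomial_pmf(n, p, k):
    return math.comb(n, k) * p**k * (1.0 - p)**(n - k)

n, p = 40, 0.1
mean_z = n * p               # EZ
var_z = n * p * (1.0 - p)    # Var(Z), to compare with EZ

t = 3.0
upper_exact = sum(binomial_pmf(n, p, k) for k in range(math.ceil(mean_z + t), n + 1))
lower_exact = sum(binomial_pmf(n, p, k) for k in range(0, math.floor(mean_z - t) + 1))
upper_bound = math.exp(-mean_z * h1(t / mean_z))   # (3.124)
lower_bound = math.exp(-t * t / (2.0 * mean_z))    # sub-Gaussian lower tail
```

The bounds depend on the distribution only through $EZ$, which is what makes the self-bounding property convenient when the variance of $Z$ is hard to compute.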

  17. 3.3.5 The Upper Tail in Talagrand's Inequality for Nonidentically Distributed Random Variables ∗

  18. Theorem 3.3.16. Let $X_i$, $i \in \mathbb{N}$, be independent $S$-valued random variables, and let $\mathcal{F}$ be a countable class of functions $f = (f_1, \dots, f_n) : S \to [-1, 1]^n$ such that $Ef_k(X_k) = 0$ for all $f \in \mathcal{F}$ and $k = 1, \dots, n$. Set
      $$T_n(f) = \sum_{k=1}^{n} f_k(X_k), \quad Z = \sup_{f \in \mathcal{F}} T_n(f),$$
      and
      $$V_n = \sup_{f \in \mathcal{F}} ET_n^2(f) = \sup_{f \in \mathcal{F}} \sum_{k=1}^{n} E[f_k(X_k)]^2, \quad \mathcal{V}_n = 2EZ + V_n. \quad (3.126)$$
      Then, for all $t \in [0, 2/3)$,
      $$L(t) := \log\big(Ee^{tZ}\big) \le tEZ + \frac{t^2}{2 - 3t}\,\mathcal{V}_n, \quad (3.127)$$
      and therefore, for all $x \ge 0$,
      $$\Pr\Big\{Z \ge EZ + \sqrt{2\mathcal{V}_n x} + \frac{3x}{2}\Big\} \le e^{-x}. \quad (3.128)$$
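The step from (3.127) to (3.128) is a Chernoff optimisation: by Markov's inequality, $\Pr\{Z \ge EZ + s\} \le \exp(L(t) - tEZ - ts)$ for every admissible $t$. The sketch below minimises the resulting exponent on a grid; `V` stands for the combined term $2EZ$ plus the variance term in (3.127), and its value and the grid size are illustrative.

```python
import math

def chernoff_exponent(t, s, V):
    """Exponent t^2 V / (2 - 3t) - t s obtained from (3.127) and Markov's inequality."""
    return t * t * V / (2.0 - 3.0 * t) - t * s

def best_exponent(s, V, grid=20000):
    """Minimise the exponent over a grid of t in [0, 2/3)."""
    ts = [(2.0 / 3.0) * i / grid for i in range(grid)]
    return min(chernoff_exponent(t, s, V) for t in ts)

V, x = 5.0, 2.0                         # illustrative values
s = math.sqrt(2.0 * V * x) + 1.5 * x    # deviation level in (3.128)
exponent = best_exponent(s, V)
```

For this deviation level the optimal exponent equals $-x$ up to grid error, so the linear term $3x/2$ in (3.128) is exactly what the bound (3.127) allows.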

  19. Proof. To prove Theorem 3.3.16 we need Lemmas 3.3.17 to 3.3.19.
      Lemma 3.3.17. Let $F(t) = Ee^{tZ}$, let $g(t; X_1, \dots, X_n) = e^{tZ}$ and let $g_k(t; X_1, \dots, X_n)$, $k = 1, \dots, n$, be nonnegative functions such that $E(g_k \log g_k) < \infty$ for all $t \ge 0$. Then
      $$tF'(t) - F(t) \log F(t) = \mathrm{Ent}_P(g(t)) \le \sum_{k=1}^{n} E\big[g_k \log(g_k / E_k g_k)\big] + \sum_{k=1}^{n} E\big[(g - g_k) \log(g / E_k g)\big]. \quad (3.129)$$

  20. Lemma 3.3.18. For $g = e^{tZ}$ and the functions $g_k$, $1 \le k \le n$, defined by (3.130), we have
      $$E\big((g - g_k) \log(g / E_k g)\big) \le tE(g - g_k).$$
      Lemma 3.3.19.
      $$E\Big[g_k \log\frac{g_k}{h_k} + (1 + t)(h_k - g_k)\Big] \le \frac{t^2 e^t V_n}{2}\,F(t). \quad (3.134)$$

  21. Proof of Theorem 3.3.16. Setting as usual $L(t) = \log Ee^{tZ} = \log F(t)$, Lemmas 3.3.17 to 3.3.19 give
      $$t(1 - t)\,L'(t) - L(t) \le t^2 e^t V_n / 2.$$
      Dividing both sides by $t^2$ and noting that $(L/t)' = L'/t - L/t^2$, this becomes
      $$\Big(\frac{L}{t}\Big)' - L' \le \frac{e^t V_n}{2}.$$
      Integrating and using a Taylor expansion, ...
      $$L(t) - tEZ \le \frac{t^2}{(2 - t)(1 - t)}\,V_n + \frac{t^2}{1 - t}\,EZ \le \frac{t^2 (V_n + 2EZ)}{2 - 3t}.$$
      This proves (3.127). To prove (3.128), apply Proposition 3.1.6 with $\phi(\lambda) = \mathcal{V}_n \lambda^2 / (2(1 - 3\lambda/2))$, where $\mathcal{V}_n = 2EZ + V_n$.
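The last inequality in the proof rests on two elementary comparisons, valid for $0 \le t < 2/3$ and written out here as a short worked step (assuming $V_n \ge 0$ and $EZ \ge 0$).

```latex
\begin{aligned}
(2 - t)(1 - t) &= 2 - 3t + t^2 \ge 2 - 3t > 0
  &&\Longrightarrow\quad \frac{t^2}{(2 - t)(1 - t)} \le \frac{t^2}{2 - 3t}, \\
2 - 3t &\le 2 - 2t = 2(1 - t)
  &&\Longrightarrow\quad \frac{t^2}{1 - t} \le \frac{2t^2}{2 - 3t}.
\end{aligned}
```

Multiplying the first line by $V_n$ and the second by $EZ$ and adding gives the bound $t^2 (V_n + 2EZ)/(2 - 3t)$.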
