Mathematical Foundations of Infinite-Dimensional Statistical Models

Mathematical Foundations of Infinite-Dimensional Statistical Models: 3.3 The Entropy Method and Talagrand's Inequality (3.3.4, 3.3.5)
이종진, Seoul National University, ga0408@snu.ac.kr
Nov 15, 2018

Table of Contents
3.3 The Entropy Method and Talagrand's Inequality
    3.3.2 & 3.3.3
    3.3.4 The Entropy Method for Random Variables with Bounded Differences and for Self-Bounding Random Variables
    3.3.5 The Upper Tail in Talagrand's Inequality for Nonidentically Distributed Random Variables ∗

\[ \operatorname{Ent}_\mu f := E_\mu(f \log f) - E_\mu f \cdot \log E_\mu f \]
◮ Exponential inequalities $E e^{\lambda(Z - EZ)} \le \dots$ for
1. Subadditive random variables
2. Functions satisfying the bounded differences condition
3. Self-bounding random variables
◮ Talagrand's inequality
1. The upper tail in Talagrand's inequality, Bousquet's version ($v_n$)
2. The lower tail in Talagrand's inequality, Klein's version ($v_n$)
3. The lower tail in Talagrand's inequality, Klein-Rio version ($V_n$)
4. The upper tail in Talagrand's inequality for nonidentically distributed random variables ($V_n$)

3.3.2 & 3.3.3

Theorem 3.3.7 Let $Z = Z(X_1, \dots, X_n)$, $X_i$ independent, be a subadditive random variable relative to $Z_k = Z_k(X_1, \dots, X_{k-1}, X_{k+1}, \dots, X_n)$, $k = 1, \dots, n$, such that $EZ \ge 0$ and for which there exist random variables $Y_k \le Z - Z_k \le 1$ such that $E_k Y_k \ge 0$. Let $\sigma^2 < \infty$ be any real number satisfying
\[ \frac{1}{n} \sum_{k=1}^{n} E_k Y_k^2 \le \sigma^2, \]
and set $v := 2EZ + n\sigma^2$. Then
\[ \log E e^{\lambda(Z - EZ)} \le v\,(e^{\lambda} - \lambda - 1) = v\,\phi(-\lambda), \qquad \lambda \ge 0. \]

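◮ A Taylor expansion of the bound in Theorem 3.3.7 gives $\operatorname{Var} Z \le 2EZ + n\sigma^2$.
◮ Applying Proposition 3.1.6 to Theorem 3.3.7 yields upper bounds on the tail probabilities of $Z - EZ$.

Corollary 3.3.8 Let $Z$ be as in Theorem 3.3.7. Then, for all $t \ge 0$,
\[ P(Z \ge EZ + t) \le \exp\bigl(-v\,h_1(t/v)\bigr) \le \exp\Bigl(-\frac{3t}{4} \log\Bigl(1 + \frac{2t}{3v}\Bigr)\Bigr) \le \exp\Bigl(-\frac{t^2}{2v + 2t/3}\Bigr) \]
and
\[ P\bigl(Z \ge EZ + \sqrt{2vx} + x/3\bigr) \le e^{-x}, \qquad x \ge 0. \]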
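The chain of bounds in Corollary 3.3.8 can be checked numerically. The sketch below assumes the usual definition $h_1(u) = (1+u)\log(1+u) - u$, which the slides use without restating; the function names and the value v = 2 are illustrative only.

```python
import math

def h1(u: float) -> float:
    """Assumed definition: h_1(u) = (1 + u) log(1 + u) - u, for u > -1."""
    return (1.0 + u) * math.log1p(u) - u

def tail_exponents(t: float, v: float):
    """Exponents of the three successive bounds on P(Z >= EZ + t)."""
    first = v * h1(t / v)
    second = 0.75 * t * math.log1p(2.0 * t / (3.0 * v))
    third = t * t / (2.0 * v + 2.0 * t / 3.0)
    return first, second, third

for t in (0.1, 1.0, 10.0):
    first, second, third = tail_exponents(t, v=2.0)
    # A larger exponent gives a smaller, hence sharper, probability bound.
    assert first >= second >= third
    print(t, math.exp(-first), math.exp(-second), math.exp(-third))
```

Each step of the chain trades sharpness for a simpler closed form; the last expression is the familiar Bernstein-type bound.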

Theorem 3.3.9 (Upper tail of Talagrand's inequality, Bousquet's version) Let $(S, \mathcal{S})$ be a measurable space, and let $n \in \mathbb{N}$. Let $X_1, \dots, X_n$ be independent $S$-valued random variables. Let $\mathcal{F}$ be a countable set of measurable real-valued functions on $S$ such that $\|f\|_\infty \le U < \infty$ and $Ef(X_1) = \dots = Ef(X_n) = 0$ for all $f \in \mathcal{F}$. Let
\[ S_j = \sup_{f \in \mathcal{F}} \sum_{k=1}^{j} f(X_k) \quad \text{or} \quad S_j = \sup_{f \in \mathcal{F}} \Bigl| \sum_{k=1}^{j} f(X_k) \Bigr|, \qquad j = 1, \dots, n, \]
and let the parameters $\sigma^2$ and $v_n$ be defined by
\[ U^2 \ge \sigma^2 \ge \frac{1}{n} \sum_{k=1}^{n} \sup_{f \in \mathcal{F}} E f^2(X_k) \quad \text{and} \quad v_n = 2U\,ES_n + n\sigma^2. \]

Theorem 3.3.9 (Upper tail of Talagrand's inequality, Bousquet's version), continued. Then
\[ \log E e^{\lambda(S_n - ES_n)} \le v_n\,(e^{\lambda} - 1 - \lambda), \qquad \lambda \ge 0. \]
As a consequence,
\[ P(S_n \ge ES_n + x) \le P\Bigl(\max_{1 \le j \le n} S_j \ge ES_n + x\Bigr) \le \exp\Bigl(-\frac{v_n}{U^2}\, h_1\Bigl(\frac{xU}{v_n}\Bigr)\Bigr) \le \exp\Bigl[-\frac{3x}{4U} \log\Bigl(1 + \frac{2xU}{3v_n}\Bigr)\Bigr] \le \exp\Bigl[-\frac{x^2}{2v_n + 2xU/3}\Bigr] \]
and
\[ P\Bigl(S_n \ge ES_n + \sqrt{2v_n x} + \frac{Ux}{3}\Bigr) \le P\Bigl(\max_{1 \le j \le n} S_j \ge ES_n + \sqrt{2v_n x} + \frac{Ux}{3}\Bigr) \le e^{-x}, \]
for all $x \ge 0$.
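To see how the constants of Theorem 3.3.9 enter in practice, the sketch below evaluates $v_n$ and the deviation threshold $\sqrt{2 v_n x} + Ux/3$. All numerical inputs (n, U, σ², and in particular the value used for $ES_n$) are hypothetical; in applications $ES_n$ has to be bounded separately.

```python
import math

def bousquet_v(U: float, ES_n: float, n: int, sigma2: float) -> float:
    """v_n = 2 U E S_n + n sigma^2, as in Theorem 3.3.9."""
    return 2.0 * U * ES_n + n * sigma2

def bousquet_deviation(x: float, U: float, v_n: float) -> float:
    """Threshold that S_n - E S_n exceeds with probability at most e^{-x}."""
    return math.sqrt(2.0 * v_n * x) + U * x / 3.0

def bernstein_type_tail(x: float, U: float, v_n: float) -> float:
    """Last bound of the chain: exp(-x^2 / (2 v_n + 2 x U / 3))."""
    return math.exp(-x * x / (2.0 * v_n + 2.0 * x * U / 3.0))

# Hypothetical inputs: n = 1000, U = 1, sigma^2 = 0.25, assumed E S_n = 20.
v_n = bousquet_v(U=1.0, ES_n=20.0, n=1000, sigma2=0.25)
level = math.log(20.0)                      # e^{-level} = 0.05
dev = bousquet_deviation(level, U=1.0, v_n=v_n)
print(v_n, dev, bernstein_type_tail(dev, 1.0, v_n))
```

Evaluating the Bernstein-type bound at `dev` returns a value slightly above $e^{-x}$: the deviation form and the Bernstein-type form are two different relaxations of the same moment generating function bound, so they are not exact inverses of each other.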

Theorem 3.3.10 (Lower tail of Talagrand's inequality: Klein's version) Under the same hypotheses and notation as in Theorem 3.3.9, we have
\[ E e^{-t(S_n - ES_n)} \le \exp\Bigl(v_n\, \frac{e^{4t} - 1 - 4t}{16}\Bigr) = e^{v_n \phi(-4t)/16}, \qquad 0 \le t < 1. \]
As a consequence, for all $x \ge 0$,
\[ P(S_n \le ES_n - x) \le \exp\Bigl(-\frac{v_n}{16U^2}\, h_1\Bigl(\frac{4xU}{v_n}\Bigr)\Bigr) \le \exp\Bigl(-\frac{3x}{16U} \log\Bigl(1 + \frac{8xU}{3v_n}\Bigr)\Bigr) \le \exp\Bigl(-\frac{x^2}{2v_n + 8xU/3}\Bigr) \]
and
\[ P\Bigl(S_n \le ES_n - \sqrt{2v_n x} - \frac{4Ux}{3}\Bigr) \le e^{-x}. \]

Remark 3.3.11 (Klein-Rio version, Klein and Rio (2005)) Setting
\[ V_n = 2U\,ES_n + \sup_{f \in \mathcal{F}} \sum_{k=1}^{n} E f^2(X_k), \]
one has
\[ E e^{-t(S_n - ES_n)} \le \exp\Bigl(V_n\, \frac{e^{3t} - 1 - 3t}{9}\Bigr) = e^{V_n \phi(-3t)/9}, \qquad 0 \le t < 1, \]
and, as a consequence, for all $x \ge 0$,
\[ P(S_n \le ES_n - x) \le \exp\Bigl(-\frac{V_n}{9U^2}\, h_1\Bigl(\frac{3xU}{V_n}\Bigr)\Bigr) \le \exp\Bigl(-\frac{x}{4U} \log\Bigl(1 + \frac{2xU}{V_n}\Bigr)\Bigr) \le \exp\Bigl(-\frac{x^2}{2V_n + 2xU}\Bigr) \]
and
\[ P\bigl(S_n \le ES_n - \sqrt{2V_n x} - Ux\bigr) \le e^{-x}. \]
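The two lower-tail versions differ in the variance proxy and in the constant of the linear term. A small sketch comparing the deviation thresholds follows; the input values are hypothetical, and the same variance proxy is used for both only to isolate the effect of the linear term (in general $V_n \le v_n$, which favours the Klein-Rio form further).

```python
import math

def klein_lower_deviation(x: float, U: float, v_n: float) -> float:
    """Klein's version: P(S_n <= E S_n - dev) <= e^{-x}, dev = sqrt(2 v_n x) + 4Ux/3."""
    return math.sqrt(2.0 * v_n * x) + 4.0 * U * x / 3.0

def klein_rio_lower_deviation(x: float, U: float, V_n: float) -> float:
    """Klein-Rio version: dev = sqrt(2 V_n x) + U x."""
    return math.sqrt(2.0 * V_n * x) + U * x

# Hypothetical values; the variance proxies are set equal on purpose.
x, U, proxy = 3.0, 1.0, 290.0
d_klein = klein_lower_deviation(x, U, proxy)
d_klein_rio = klein_rio_lower_deviation(x, U, proxy)
print(d_klein, d_klein_rio, d_klein - d_klein_rio)
```

With equal proxies the gap between the two thresholds is exactly $Ux/3$.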

3.3.4 The Entropy Method for Random Variables with Bounded Differences and for Self-Bounding Random Variables

Bounded Differences

Definition 3.3.12 Let $(S_i, \mathcal{S}_i)$, $i = 1, \dots, n$, be measurable spaces, and let $f : \prod_{i=1}^{n} S_i \to \mathbb{R}$ be a measurable function. $f$ has bounded differences if, for each $i \le n$,
\[ \sup_{x_i, x_i' \in S_i} \bigl| f(x_1, \dots, x_n) - f(x_1, \dots, x_{i-1}, x_i', x_{i+1}, \dots, x_n) \bigr| \le c_i, \]
where, for each $i$, $c_i$ is a measurable function of $x_j$, $j \ne i$, and there exists a finite constant $c$ such that $\sum_{i=1}^{n} c_i^2 \le c^2$ for all $(x_1, \dots, x_n) \in \prod_{i=1}^{n} S_i$.
If $Z = f(X_1, \dots, X_n)$, where the $X_i$ are independent $S_i$-valued random variables, we say that the random variable $Z$ has bounded differences.

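Theorem 3.3.14 If $Z$ has bounded differences and $\sum_i c_i^2 \le c^2$, then, for all $\lambda \ge 0$,
\[ E e^{\lambda(Z - EZ)} \le e^{\lambda^2 c^2/8}, \tag{3.115} \]
so that, for all $t \ge 0$,
\[ \Pr\{Z \ge EZ + t\} \le e^{-2t^2/c^2}, \qquad \Pr\{Z \le EZ - t\} \le e^{-2t^2/c^2}. \tag{3.116} \]
Moreover,
\[ \operatorname{Var}(Z) \le \frac{c^2}{4}. \tag{3.117} \]
Proof. Combine the identity
\[ \operatorname{Ent}_\mu(e^{\lambda Y}) = E e^{\lambda Y} \bigl(\lambda L_Y'(\lambda) - L_Y(\lambda)\bigr) = E e^{\lambda Y} \int_0^{\lambda} t\, L_Y''(t)\, dt, \qquad L_Y = \log F_Y, \quad F_Y(\lambda) = E e^{\lambda Y}, \]
with tensorisation of entropy (Proposition 2.5.3).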
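As an illustration of (3.116), consider the sample mean of n independent Uniform[0, 1] variables: changing one coordinate moves the mean by at most 1/n, so $c_i = 1/n$ and $c^2 = 1/n$. The simulation below is only a sketch; the choice of distribution, n, t, the number of trials and the seed are all assumptions made for the example.

```python
import math
import random

def bounded_differences_bound(t: float, c2: float) -> float:
    """Upper-tail bound exp(-2 t^2 / c^2) from (3.116)."""
    return math.exp(-2.0 * t * t / c2)

def empirical_upper_tail(n: int, t: float, trials: int, seed: int = 0) -> float:
    """Fraction of trials in which the mean of n Uniform[0,1] draws is >= 1/2 + t."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        z = sum(rng.random() for _ in range(n)) / n
        if z >= 0.5 + t:
            hits += 1
    return hits / trials

n, t = 50, 0.1
c2 = n * (1.0 / n) ** 2      # c_i = 1/n for the sample mean, so sum of c_i^2 = 1/n
bound = bounded_differences_bound(t, c2)
freq = empirical_upper_tail(n, t, trials=20000)
print(f"empirical frequency {freq:.4f}, bound {bound:.4f}")
```

The bound is distribution-free, so for a specific distribution such as the uniform it is typically far from tight.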

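Recap from the previous seminar

Definition $Z$ is called self-bounding if $Z$ and $Z_k$ satisfy
\[ 0 \le Z - Z_k \le 1 \quad (1 \le k \le n), \qquad \sum_{k=1}^{n} (Z - Z_k) \le Z. \]
◮ If $Z$ is self-bounding, the inequality of Proposition 3.3.1 can be rewritten, with $L(\lambda) := \log F(\lambda)$, in the much simpler form
\[ (\lambda - \phi(\lambda))\, L'(\lambda) - L(\lambda) \le \phi(\lambda)\, EZ. \tag{3.79} \]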

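Theorem 3.3.15 Let $Z$ be a self-bounding random variable. Then
\[ \log E\bigl(e^{\lambda(Z - EZ)}\bigr) \le \phi(-\lambda)\, EZ, \qquad \lambda \in \mathbb{R}. \tag{3.123} \]
This applies in particular to $Z = \sup_{f \in \mathcal{F}} \sum_{k=1}^{n} f(X_k)$, where $\mathcal{F}$ is countable and $0 \le f(x) \le 1$ for all $x \in S$ and $f \in \mathcal{F}$.
Here $\phi(\lambda) = e^{-\lambda} + \lambda - 1$.
Proof. Since $\phi(\lambda) + \phi(-\lambda) = \phi'(\lambda)\,\phi'(-\lambda)$, where $\phi'(-\lambda)$ is read as the derivative of $\lambda \mapsto \phi(-\lambda)$, the function $\psi_0(\lambda) := v\,\phi(-\lambda)$ with $v = EZ$ solves (3.79) with equality.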
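The claim that $\psi_0$ solves (3.79) with equality can be verified directly. The computation below is a worked check under the assumptions $\phi(\lambda) = e^{-\lambda} + \lambda - 1$ and $v = EZ$, using $\lambda - \phi(\lambda) = 1 - e^{-\lambda}$:

```latex
\begin{align*}
\psi_0(\lambda) &= v\,\phi(-\lambda) = v\,(e^{\lambda} - \lambda - 1),
\qquad \psi_0'(\lambda) = v\,(e^{\lambda} - 1), \\
(\lambda - \phi(\lambda))\,\psi_0'(\lambda) - \psi_0(\lambda)
 &= v\bigl[(1 - e^{-\lambda})(e^{\lambda} - 1) - (e^{\lambda} - \lambda - 1)\bigr] \\
 &= v\bigl[e^{\lambda} + e^{-\lambda} - 2 - e^{\lambda} + \lambda + 1\bigr] \\
 &= v\,(e^{-\lambda} + \lambda - 1) = v\,\phi(\lambda).
\end{align*}
```

So the right-hand side of (3.79), $\phi(\lambda)\,EZ$, is attained exactly by $\psi_0$, which is what makes it the natural comparison function for $L$.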

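As a consequence of Theorem 3.3.15 and Proposition 3.1.6, for $t \ge 0$ (and $t \le EZ$ in the lower-tail bounds),
\[ \Pr\{Z \ge EZ + t\} \le \exp\bigl(-(EZ)\, h_1(t/EZ)\bigr), \tag{3.124} \]
\[ \Pr\{Z \le EZ - t\} \le \exp\bigl(-(EZ)\, h_1(-t/EZ)\bigr), \]
\[ \Pr\{Z \ge EZ + t\} \le \exp\Bigl(-\frac{3t}{4} \log\Bigl(1 + \frac{2t}{3EZ}\Bigr)\Bigr) \le \exp\Bigl(-\frac{t^2}{2EZ + 2t/3}\Bigr), \]
\[ \Pr\{Z \le EZ - t\} \le \exp\Bigl(-\frac{t^2}{2EZ}\Bigr), \]
and $\operatorname{Var}(Z) \le EZ$.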

3.3.5 The Upper Tail in Talagrand's Inequality for Nonidentically Distributed Random Variables ∗

Theorem 3.3.16 Let $X_i$, $i \in \mathbb{N}$, be independent $S$-valued random variables, and let $\mathcal{F}$ be a countable class of functions $f = (f_1, \dots, f_n) : S \to [-1, 1]^n$ such that $E f_k(X_k) = 0$ for all $f \in \mathcal{F}$ and $k = 1, \dots, n$. Set
\[ T_n(f) = \sum_{k=1}^{n} f_k(X_k), \qquad Z = \sup_{f \in \mathcal{F}} T_n(f), \]
and
\[ V_n = \sup_{f \in \mathcal{F}} E\, T_n^2(f) = \sup_{f \in \mathcal{F}} \sum_{k=1}^{n} E[f_k(X_k)]^2, \qquad \mathcal{V}_n = 2EZ + V_n. \tag{3.126} \]
Then, for all $t \in [0, 2/3)$,
\[ L(t) := \log\bigl(E e^{tZ}\bigr) \le t\,EZ + \frac{t^2}{2 - 3t}\, \mathcal{V}_n, \tag{3.127} \]
and therefore, for all $x \ge 0$,
\[ \Pr\Bigl\{Z \ge EZ + \sqrt{2 \mathcal{V}_n x} + \frac{3x}{2}\Bigr\} \le e^{-x}. \tag{3.128} \]
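The passage from (3.127) to (3.128) is a Chernoff argument: one minimises the exponent $t^2 \mathcal{V}/(2 - 3t) - tu$ over $t \in [0, 2/3)$, where $\mathcal{V}$ stands for the quantity $2EZ + V_n$ of (3.126). The sketch below does this on a grid and checks that at $u = \sqrt{2\mathcal{V}x} + 3x/2$ the minimum is close to $-x$. The numerical values and the grid size are assumptions chosen for illustration.

```python
import math

def log_mgf_bound(t: float, script_v: float) -> float:
    """Bound on L(t) - t*EZ from (3.127): t^2 * script_v / (2 - 3t), for 0 <= t < 2/3."""
    return t * t * script_v / (2.0 - 3.0 * t)

def deviation(x: float, script_v: float) -> float:
    """Threshold in (3.128): sqrt(2 * script_v * x) + 3x/2."""
    return math.sqrt(2.0 * script_v * x) + 1.5 * x

def chernoff_exponent(u: float, script_v: float, grid: int = 10000) -> float:
    """Minimise log_mgf_bound(t) - t*u over grid points t = (2/3) * i / grid, i < grid."""
    ts = (2.0 / 3.0 * i / grid for i in range(grid))
    return min(log_mgf_bound(t, script_v) - t * u for t in ts)

script_v, x = 10.0, 2.0      # hypothetical values of 2EZ + V_n and of x
exponent = chernoff_exponent(deviation(x, script_v), script_v)
print(exponent)              # close to -x
```

The grid search is used here only for transparency; Proposition 3.1.6 gives the same inversion in closed form.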

Proof. The proof of Theorem 3.3.16 relies on Lemmas 3.3.17 to 3.3.19.

Lemma 3.3.17 Let $F(t) = E e^{tZ}$, let $g(t; X_1, \dots, X_n) = e^{tZ}$, and let $g_k(t; X_1, \dots, X_n)$, $k = 1, \dots, n$, be nonnegative functions such that $E(g_k \log g_k) < \infty$ for all $t \ge 0$. Then
\[ tF'(t) - F(t) \log F(t) = \operatorname{Ent}_P(g(t)) \le \sum_{k=1}^{n} E\bigl[g_k \log(g_k / E_k g_k)\bigr] + \sum_{k=1}^{n} E\bigl[(g - g_k) \log(g / E_k g)\bigr]. \tag{3.129} \]

Lemma 3.3.18 For $g = e^{tZ}$ and the functions $g_k$, $1 \le k \le n$, defined by (3.130), we have
\[ E\bigl((g - g_k) \log(g / E_k g)\bigr) \le t\, E(g - g_k). \]

Lemma 3.3.19
\[ E\Bigl(g_k \log \frac{g_k}{h_k} + (1 + t)(h_k - g_k)\Bigr) \le \frac{t^2 e^{t} V_n}{2}\, F(t). \tag{3.134} \]

Proof of Theorem 3.3.16

Proof. Set, as usual, $L(t) = \log E e^{tZ} = \log F(t)$. Lemmas 3.3.17 to 3.3.19 give
\[ t(1 - t)\, L'(t) - L(t) \le \frac{t^2 e^{t} V_n}{2}. \]
Dividing both sides by $t^2$ and noting that $(L/t)' = L'/t - L/t^2$, this becomes
\[ \Bigl(\frac{L}{t}\Bigr)' - L' \le \frac{e^{t} V_n}{2}. \]
Integrating and using a Taylor expansion,
\[ L(t) - t\,EZ \le \frac{t^2}{(2 - t)(1 - t)}\, V_n + \frac{t^2}{1 - t}\, EZ \le \frac{t^2 (V_n + 2EZ)}{2 - 3t}. \]
This proves (3.127). To prove (3.128), apply Proposition 3.1.6 with $\phi(\lambda) = \mathcal{V}_n \lambda^2 / \bigl(2(1 - 3\lambda/2)\bigr)$, where $\mathcal{V}_n = 2EZ + V_n$.
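The integration step is stated only briefly on the slide. One way to fill it in, assuming the integral is taken over $[0, t]$ with $0 < t < 2/3$, that $L(0) = 0$ and $\lim_{s \to 0} L(s)/s = L'(0) = EZ$, and that the Taylor expansion refers to the bound $e^{t} - 1 \le 2t/(2 - t)$ for $0 \le t < 2$, is:

```latex
\begin{align*}
\Bigl(\frac{L}{t}\Bigr)' - L' \le \frac{e^{t} V_n}{2}
 \quad &\Longrightarrow \quad
 \frac{L(t)}{t} - EZ - L(t) \le \frac{(e^{t} - 1)\,V_n}{2}, \\
L(t) - t\,EZ
 &\le \frac{t^{2}}{1 - t}\,EZ + \frac{t\,(e^{t} - 1)}{2(1 - t)}\,V_n
 \le \frac{t^{2}}{1 - t}\,EZ + \frac{t^{2}}{(2 - t)(1 - t)}\,V_n .
\end{align*}
```

The final inequality of the proof then follows from $(2 - t)(1 - t) = 2 - 3t + t^2 \ge 2 - 3t$ and $2(1 - t) \ge 2 - 3t$, which give $\frac{1}{(2 - t)(1 - t)} \le \frac{1}{2 - 3t}$ and $\frac{1}{1 - t} \le \frac{2}{2 - 3t}$.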
