Mathematical Foundations of Infinite-Dimensional Statistical Models

SLIDE 1

Mathematical Foundations of Infinite-Dimensional Statistical Models: 3.3 The Entropy Method and Talagrand’s Inequality (3.3.4, 3.3.5)

이종진

Seoul National University ga0408@snu.ac.kr

Nov 15, 2018

SLIDE 2

Table of Contents

3.3 The Entropy Method and Talagrand’s Inequality

  • 3.3.2 & 3.3.3
  • 3.3.4 The Entropy Method for Random Variables with Bounded Differences and for Self-Bounding Random Variables
  • 3.3.5 The Upper Tail in Talagrand’s Inequality for Nonidentically Distributed Random Variables∗

SLIDE 3

◮ Entropy: $\operatorname{Ent}_\mu f := E_\mu(f \log f) - E_\mu f \cdot \log E_\mu f$

◮ Exponential inequality $E e^{\lambda(Z - EZ)} \le \dots$

  • 1. Subadditive random variables
  • 2. Functions with the bounded differences condition
  • 3. Self-bounding random variables

◮ Talagrand’s inequality

  • 1. The upper tail in Talagrand’s inequality, Bousquet’s version, ($v_n$)
  • 2. The lower tail in Talagrand’s inequality, Klein’s version, ($v_n$)
  • 3. The lower tail in Talagrand’s inequality, Klein-Rio version, ($V_n$)
  • 4. The upper tail in Talagrand’s inequality for nonidentically distributed random variables, ($V_n$)
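
As a small illustration of the entropy functional defined at the top of this slide, the sketch below evaluates it on a finite probability space. The function name `entropy` and the example values are hypothetical and chosen only for illustration; the convention $0 \log 0 = 0$ is assumed.

```python
import math

def entropy(values, probs):
    """Ent_mu(f) = E[f log f] - E[f] log E[f] for f >= 0 on a finite space."""
    def xlogx(x):
        # Convention 0 * log 0 = 0.
        return 0.0 if x == 0.0 else x * math.log(x)

    mean = sum(p * v for v, p in zip(values, probs))
    return sum(p * xlogx(v) for v, p in zip(values, probs)) - xlogx(mean)

# Hypothetical example: f takes three values under the measure mu.
f = [0.5, 1.0, 4.0]
mu = [0.2, 0.5, 0.3]

ent_f = entropy(f, mu)                             # nonnegative by Jensen's inequality
ent_const = entropy([2.0, 2.0, 2.0], mu)           # zero for a constant function
ent_scaled = entropy([3.0 * v for v in f], mu)     # homogeneity: Ent(3f) = 3 Ent(f)
```

The three computed quantities illustrate nonnegativity, vanishing on constants, and homogeneity of degree one, which are the properties the entropy method relies on.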

SLIDE 4

3.3.2 & 3.3.3

SLIDE 5

Theorem 3.3.7

Let $Z = Z(X_1, \dots, X_n)$, $X_i$ independent, be a subadditive random variable relative to $Z_k = Z_k(X_1, \dots, X_{k-1}, X_{k+1}, \dots, X_n)$, $k = 1, \dots, n$, such that $EZ \ge 0$ and for which there exist random variables $Y_k \le Z - Z_k \le 1$ such that $E_k Y_k \ge 0$. Let $\sigma^2 < \infty$ be any real number satisfying

$$\frac{1}{n} \sum_{k=1}^n E_k Y_k^2 \le \sigma^2,$$

and set $v := 2EZ + n\sigma^2$. Then

$$\log E e^{\lambda(Z - EZ)} \le v(e^{\lambda} - \lambda - 1) = v\varphi(-\lambda), \quad \lambda \ge 0.$$

SLIDE 6

◮ A Taylor expansion gives $\operatorname{Var} Z \le 2EZ + n\sigma^2$.
◮ Applying Prop. 3.1.6 to Thm 3.3.7 yields upper bounds on the tail probabilities of $Z - EZ$.
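
The variance bound can be read off by comparing second-order terms on both sides of the exponential inequality of Theorem 3.3.7. A short sketch, assuming the cumulant generating function of $Z$ admits the usual expansion near $\lambda = 0$:

```latex
\begin{align*}
\log E e^{\lambda(Z - EZ)} &= \tfrac{\lambda^2}{2}\operatorname{Var}(Z) + o(\lambda^2), \\
v\,(e^{\lambda} - \lambda - 1) &= \tfrac{\lambda^2}{2}\,v + o(\lambda^2),
\qquad \lambda \downarrow 0 .
\end{align*}
% Divide the inequality of Theorem 3.3.7 by \lambda^2/2 and let \lambda \downarrow 0:
\[
\operatorname{Var}(Z) \le v = 2EZ + n\sigma^2 .
\]
```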

Corollary 3.3.8

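Let $Z$ be as in Theorem 3.3.7. Then, for all $t \ge 0$,

$$P(Z \ge EZ + t) \le \exp\left(-v\,h_1(t/v)\right) \le \exp\left(-\frac{3t}{4}\log\left(1 + \frac{2t}{3v}\right)\right) \le \exp\left(-\frac{t^2}{2v + 2t/3}\right)$$

and

$$P\left(Z \ge EZ + \sqrt{2vx} + x/3\right) \le e^{-x}, \quad x \ge 0.$$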
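
The chain of bounds in Corollary 3.3.8 can be checked numerically. The sketch below assumes the standard definition $h_1(u) = (1+u)\log(1+u) - u$, which is not written out on the slide; the function names and the grid of test values are hypothetical.

```python
import math

def h1(u):
    # Assumed definition: h_1(u) = (1 + u) log(1 + u) - u.
    return (1.0 + u) * math.log1p(u) - u

def exponents(t, v):
    """Return the three exponents of Corollary 3.3.8 (larger exponent = sharper bound)."""
    bennett = v * h1(t / v)
    log_form = 0.75 * t * math.log1p(2.0 * t / (3.0 * v))
    bernstein = t * t / (2.0 * v + 2.0 * t / 3.0)
    return bennett, log_form, bernstein

def deviation(x, v):
    # Threshold sqrt(2 v x) + x / 3 from the last display of the corollary.
    return math.sqrt(2.0 * v * x) + x / 3.0

grid = [(t, v) for t in (0.1, 1.0, 5.0, 50.0) for v in (0.5, 2.0, 20.0)]

# Each exponent dominates the next, so the tail bounds weaken from left to right.
chain_ok = all(a >= b >= c for a, b, c in (exponents(t, v) for t, v in grid))

# At the threshold, the Bennett-type exponent is at least x, giving the e^{-x} bound.
dev_ok = all(
    v * h1(deviation(x, v) / v) >= x
    for x in (0.1, 1.0, 10.0)
    for v in (0.5, 2.0, 20.0)
)
```

Note that the $e^{-x}$ statement follows from the first (Bennett-type) bound rather than from the last (Bernstein-type) one, which is slightly too weak at this threshold.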

SLIDE 7

Theorem 3.3.9 (Upper tail of Talagrand’s inequality, Bousquet’s version)

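Let $(S, \mathcal{S})$ be a measurable space, and let $n \in \mathbb{N}$. Let $X_1, \dots, X_n$ be independent $S$-valued random variables. Let $\mathcal{F}$ be a countable set of measurable real-valued functions on $S$ such that $\|f\|_\infty \le U < \infty$ and $Ef(X_1) = \dots = Ef(X_n) = 0$ for all $f \in \mathcal{F}$. Let

$$S_j = \sup_{f \in \mathcal{F}} \sum_{k=1}^j f(X_k) \quad \text{or} \quad S_j = \sup_{f \in \mathcal{F}} \left|\sum_{k=1}^j f(X_k)\right|, \quad j = 1, \dots, n,$$

and let the parameters $\sigma^2$ and $v_n$ be defined by

$$U^2 \ge \sigma^2 \ge \frac{1}{n} \sum_{k=1}^n \sup_{f \in \mathcal{F}} Ef^2(X_k) \quad \text{and} \quad v_n = 2U\,ES_n + n\sigma^2.$$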

SLIDE 8

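Theorem 3.3.9 (Upper tail of Talagrand’s inequality, Bousquet’s version), continued

Then

$$\log E e^{\lambda(S_n - ES_n)} \le v_n(e^{\lambda} - 1 - \lambda), \quad \lambda \ge 0.$$

As a consequence,

$$P(S_n \ge ES_n + x) \le P\left(\max_{1 \le j \le n} S_j \ge ES_n + x\right) \le \exp\left(-\frac{v_n}{U^2}\,h_1\left(\frac{xU}{v_n}\right)\right) \le \exp\left[-\frac{3x}{4U}\log\left(1 + \frac{2xU}{3v_n}\right)\right] \le \exp\left[-\frac{x^2}{2v_n + 2xU/3}\right]$$

and

$$P\left(S_n \ge ES_n + \sqrt{2v_n x} + \frac{Ux}{3}\right) \le P\left(\max_{1 \le j \le n} S_j \ge ES_n + \sqrt{2v_n x} + \frac{Ux}{3}\right) \le e^{-x},$$

for all $x \ge 0$.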
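
To see how the quantities in Theorem 3.3.9 fit together, the following sketch computes $v_n$ and the deviation threshold $\sqrt{2v_n x} + Ux/3$ for made-up inputs. All numerical values (`U`, `sigma2`, `n`, `mean_sup`) are hypothetical, `mean_sup` stands in for $ES_n$ (which in practice has to be bounded separately), and $h_1(u) = (1+u)\log(1+u) - u$ is assumed.

```python
import math

def h1(u):
    # Assumed definition: h_1(u) = (1 + u) log(1 + u) - u.
    return (1.0 + u) * math.log1p(u) - u

def v_n(U, sigma2, n, mean_sup):
    # v_n = 2 U E S_n + n sigma^2, with mean_sup playing the role of E S_n.
    return 2.0 * U * mean_sup + n * sigma2

def upper_tail_bound(x, U, v):
    # Bennett-type bound exp(-(v_n / U^2) h_1(x U / v_n)) on P(S_n >= E S_n + x).
    return math.exp(-(v / U ** 2) * h1(x * U / v))

def deviation_threshold(x, U, v):
    # sqrt(2 v_n x) + U x / 3.
    return math.sqrt(2.0 * v * x) + U * x / 3.0

# Hypothetical inputs satisfying sigma2 <= U^2.
U, sigma2, n, mean_sup = 1.0, 0.25, 1000, 12.0
v = v_n(U, sigma2, n, mean_sup)
x = 3.0
s = deviation_threshold(x, U, v)
bound_at_s = upper_tail_bound(s, U, v)   # at most exp(-x)
```

With these inputs the variance term $n\sigma^2$ dominates $v_n$, so the threshold is driven by the Gaussian-type term $\sqrt{2v_n x}$ rather than by $Ux/3$.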

SLIDE 9

Theorem 3.3.10 (Lower tail of Talagrand’s inequality: Klein’s version)

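Under the same hypotheses and notation as in Theorem 3.3.9, we have

$$E e^{-t(S_n - ES_n)} \le \exp\left(v_n\,\frac{e^{4t} - 1 - 4t}{16}\right) = e^{v_n \varphi(-4t)/16}, \quad 0 \le t < 1.$$

As a consequence, for all $x \ge 0$,

$$P(S_n \le ES_n - x) \le \exp\left(-\frac{v_n}{16U^2}\,h_1\left(\frac{4xU}{v_n}\right)\right) \le \exp\left(-\frac{3x}{16U}\log\left(1 + \frac{8xU}{3v_n}\right)\right) \le \exp\left(-\frac{x^2}{2v_n + 8xU/3}\right)$$

and

$$P\left(S_n \le ES_n - \sqrt{2v_n x} - \frac{4Ux}{3}\right) \le e^{-x}.$$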

SLIDE 10

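Remark 3.3.11 (Klein-Rio version, Klein and Rio (2005))

Setting

$$V_n = 2U\,ES_n + \sup_{f \in \mathcal{F}} \sum_{k=1}^n Ef^2(X_k),$$

we have

$$E e^{-t(S_n - ES_n)} \le \exp\left(V_n\,\frac{e^{3t} - 1 - 3t}{9}\right) = e^{V_n \varphi(-3t)/9}, \quad 0 \le t < 1,$$

and, as a consequence, for all $x \ge 0$,

$$P(S_n \le ES_n - x) \le \exp\left(-\frac{V_n}{9U^2}\,h_1\left(\frac{3xU}{V_n}\right)\right) \le \exp\left(-\frac{x}{4U}\log\left(1 + \frac{2xU}{V_n}\right)\right) \le \exp\left(-\frac{x^2}{2V_n + 2xU}\right)$$

and

$$P\left(S_n \le ES_n - \sqrt{2V_n x} - Ux\right) \le e^{-x}.$$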
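
The two lower-tail statements differ in both the variance proxy ($v_n$ versus $V_n$) and the linear term ($4Ux/3$ versus $Ux$). A minimal sketch comparing the resulting deviation thresholds, with entirely hypothetical inputs; it takes $n\sigma^2$ equal to $\sum_k \sup_f Ef^2(X_k)$, the smallest value Theorem 3.3.9 allows, and assumes some value for $\sup_f \sum_k Ef^2(X_k)$:

```python
import math

def klein_threshold(x, U, v_n):
    # sqrt(2 v_n x) + 4 U x / 3  (Theorem 3.3.10)
    return math.sqrt(2.0 * v_n * x) + 4.0 * U * x / 3.0

def klein_rio_threshold(x, U, V_n):
    # sqrt(2 V_n x) + U x  (Remark 3.3.11)
    return math.sqrt(2.0 * V_n * x) + U * x

# Hypothetical inputs.
U, mean_sup = 1.0, 12.0               # mean_sup stands in for E S_n
per_index_sup = [0.25] * 1000         # sup_f E f^2(X_k), one entry per k
sup_of_sum = 200.0                    # sup_f sum_k E f^2(X_k); never exceeds sum(per_index_sup)

v_n = 2.0 * U * mean_sup + sum(per_index_sup)
V_n = 2.0 * U * mean_sup + sup_of_sum

x = 3.0
klein = klein_threshold(x, U, v_n)
klein_rio = klein_rio_threshold(x, U, V_n)
```

Since a supremum of sums never exceeds the sum of suprema, $V_n \le v_n$ under this choice of $\sigma^2$, so the Klein-Rio threshold is the smaller of the two here.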

SLIDE 11

3.3.4 The Entropy Method for Random Variables with Bounded Differences and for Self-Bounding Random Variables

SLIDE 12

Bounded Differences

Definition 3.3.12

Let $(S_i, \mathcal{S}_i)$, $i = 1, \dots, n$, be measurable spaces, and let $f : \prod_{i=1}^n S_i \to \mathbb{R}$ be a measurable function. $f$ has bounded differences if

$$\sup_{x_i, x_i' \in S_i} \left| f(x_1, \dots, x_n) - f(x_1, \dots, x_{i-1}, x_i', x_{i+1}, \dots, x_n) \right| \le c_i, \quad i = 1, \dots, n,$$

where, for each $i$, $c_i$ is a measurable function of $x_j$, $j \ne i$, and there exists a finite constant $c$ such that

$$\sum_{i=1}^n c_i^2 \le c^2 \quad \text{for all } (x_1, \dots, x_n) \in \prod_{i=1}^n S_i.$$

If $Z = f(X_1, \dots, X_n)$, where $X_i$ are $S_i$-valued independent random variables, we say that the random variable $Z$ has bounded differences.

SLIDE 13

Theorem 3.3.14

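If $Z$ has bounded differences and $\sum_{i=1}^n c_i^2 \le c^2$, then, for all $\lambda \ge 0$,

$$E e^{\lambda(Z - EZ)} \le e^{\lambda^2 c^2/8}, \qquad (3.115)$$

so that, for all $t \ge 0$,

$$\Pr\{Z \ge EZ + t\} \le e^{-2t^2/c^2}, \quad \Pr\{Z \le EZ - t\} \le e^{-2t^2/c^2}. \qquad (3.116)$$

Moreover,

$$\operatorname{Var}(Z) \le \frac{c^2}{4}. \qquad (3.117)$$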

Proof.

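$$\operatorname{Ent}_\mu\left(e^{\lambda(Y - EY)}\right) = E e^{\lambda Y}\left(\lambda L_Y'(\lambda) - L_Y(\lambda)\right) = E e^{\lambda Y} \int_0^\lambda t\,L_Y''(t)\,dt, \quad L_Y = \log F_Y,$$

combined with the tensorisation of entropy (Proposition 2.5.3).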
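
A concrete instance of Theorem 3.3.14, sketched under assumptions not taken from the slides: $Z$ is the sample mean of $n$ independent Uniform$[0,1]$ variables, so changing one coordinate moves $Z$ by at most $c_i = 1/n$ and $c^2 = \sum_i c_i^2 = 1/n$. The simulation below (fixed seed, hypothetical sample sizes) compares empirical tail frequencies with the bound $e^{-2t^2/c^2}$.

```python
import math
import random

def bounded_differences_bound(t, c2):
    # Right-hand side of (3.116): exp(-2 t^2 / c^2).
    return math.exp(-2.0 * t * t / c2)

n = 20
c2 = n * (1.0 / n) ** 2          # c_i = 1/n for the mean of [0, 1]-valued variables

rng = random.Random(0)           # fixed seed so the run is reproducible
reps = 20000
means = [sum(rng.random() for _ in range(n)) / n for _ in range(reps)]

t = 0.1
expected = 0.5                   # E Z for Uniform[0, 1] coordinates
upper_freq = sum(m >= expected + t for m in means) / reps
lower_freq = sum(m <= expected - t for m in means) / reps
sample_var = sum((m - expected) ** 2 for m in means) / reps
bound = bounded_differences_bound(t, c2)
```

In this example the bound is far from tight, because it uses only the range of each coordinate and not its variance; the Bernstein-type results earlier in the section address exactly that gap.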

SLIDE 14

Previous seminar

Definition

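If $Z$ and $Z_k$ satisfy $0 \le Z - Z_k \le 1$ ($1 \le k \le n$) and $\sum_k (Z - Z_k) \le Z$, then $Z$ is called self-bounding.

◮ If $Z$ is self-bounding, then, with $L(\lambda) := \log F(\lambda)$, the inequality of Proposition 3.3.1 can be put in the much simpler form

$$(\lambda - \varphi(\lambda))\,L'(\lambda) - L(\lambda) \le \varphi(\lambda)\,EZ. \qquad (3.79)$$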

SLIDE 15

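Theorem 3.3.15

Let $Z$ be a self-bounding random variable. Then

$$\log E\left(e^{\lambda(Z - EZ)}\right) \le \varphi(-\lambda)\,EZ, \quad \lambda \in \mathbb{R}. \qquad (3.123)$$

This applies in particular to $Z = \sup_{f \in \mathcal{F}} \sum_{k=1}^n f(X_k)$, where $\mathcal{F}$ is countable and $0 \le f(x) \le 1$ for all $x \in S$ and $f \in \mathcal{F}$.

– $\varphi(\lambda) = e^{-\lambda} + \lambda - 1$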

Proof.

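Since $\varphi(\lambda) + \varphi(-\lambda) = \varphi'(\lambda)\,\frac{d}{d\lambda}\left[\varphi(-\lambda)\right]$, the function $\psi_0(\lambda) := v\varphi(-\lambda)$, with $v = EZ$, is a solution of (3.79).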
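
The claim in the proof can be checked numerically: with $\varphi(\lambda) = e^{-\lambda} + \lambda - 1$, the function $\psi_0(\lambda) = v\varphi(-\lambda)$ turns (3.79) into an equality when $v$ plays the role of $EZ$. The sketch below evaluates the residual on a grid; the value of `v` and the grid are hypothetical.

```python
import math

def phi(lam):
    # phi(lambda) = e^{-lambda} + lambda - 1, as on the slide.
    return math.exp(-lam) + lam - 1.0

def psi0(lam, v):
    # psi_0(lambda) = v * phi(-lambda) = v * (e^lambda - lambda - 1).
    return v * phi(-lam)

def dpsi0(lam, v):
    # Derivative of lambda -> v * (e^lambda - lambda - 1).
    return v * (math.exp(lam) - 1.0)

def residual(lam, v):
    # Left side minus right side of (3.79) with L = psi_0 and EZ = v.
    return (lam - phi(lam)) * dpsi0(lam, v) - psi0(lam, v) - phi(lam) * v

def identity_gap(lam):
    # phi(l) + phi(-l) minus phi'(l) * d/dl[phi(-l)] = (1 - e^{-l}) * (e^{l} - 1).
    return phi(lam) + phi(-lam) - (1.0 - math.exp(-lam)) * (math.exp(lam) - 1.0)

v = 2.5  # hypothetical value standing in for EZ
grid = [k / 10.0 for k in range(-30, 31)]
max_residual = max(abs(residual(lam, v)) for lam in grid)
max_identity_gap = max(abs(identity_gap(lam)) for lam in grid)
```

Both maxima are zero up to floating-point rounding, consistent with $\psi_0$ being the extremal solution against which $L$ is compared.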

SLIDE 16

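As a consequence of Theorem 3.3.15 and Proposition 3.1.6,

$$\Pr\{Z \ge EZ + t\} \le \exp\left(-(EZ)\,h_1(t/EZ)\right), \quad \Pr\{Z \le EZ - t\} \le \exp\left(-(EZ)\,h_1(-t/EZ)\right), \qquad (3.124)$$

$$\Pr\{Z \ge EZ + t\} \le \exp\left(-\frac{3t}{4}\log\left(1 + \frac{2t}{3EZ}\right)\right) \le \exp\left(-\frac{t^2}{2EZ + 2t/3}\right),$$

$$\Pr\{Z \le EZ - t\} \le \exp\left(-\frac{t^2}{2EZ}\right),$$

and $\operatorname{Var}(Z) \le EZ$.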
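
For intuition, here is a standard self-bounding example that does not appear on the slides and is used here only as an illustration: $Z$ is the number of distinct values in a sample and $Z_k$ is the same count with the $k$-th observation removed. Then $Z - Z_k \in \{0, 1\}$, and $\sum_k (Z - Z_k)$ counts the values that occur exactly once, which is at most $Z$. The sketch checks the definition on simulated samples and compares the sample variance with the sample mean; all sizes and the seed are hypothetical.

```python
import random

def distinct_count(xs):
    return len(set(xs))

def leave_one_out(xs):
    # Z_k: number of distinct values after dropping the k-th observation.
    return [distinct_count(xs[:k] + xs[k + 1:]) for k in range(len(xs))]

def is_self_bounding_instance(xs):
    z = distinct_count(xs)
    diffs = [z - zk for zk in leave_one_out(xs)]
    # 0 <= Z - Z_k <= 1 for every k, and sum_k (Z - Z_k) <= Z.
    return all(0 <= d <= 1 for d in diffs) and sum(diffs) <= z

rng = random.Random(1)  # fixed seed for reproducibility
samples = [[rng.randrange(10) for _ in range(15)] for _ in range(2000)]

all_self_bounding = all(is_self_bounding_instance(xs) for xs in samples)

zs = [distinct_count(xs) for xs in samples]
mean_z = sum(zs) / len(zs)
var_z = sum((z - mean_z) ** 2 for z in zs) / len(zs)
```

The empirical variance falls well below the empirical mean, in line with $\operatorname{Var}(Z) \le EZ$.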

SLIDE 17

3.3.5 The Upper Tail in Talagrand’s Inequality for Nonidentically Distributed Random Variables∗

SLIDE 18

Theorem 3.3.16

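Let $X_i$, $i \in \mathbb{N}$, be independent $S$-valued random variables, and let $\mathcal{F}$ be a countable class of functions $f = (f^1, \dots, f^n) : S \to [-1, 1]^n$ such that $Ef^k(X_k) = 0$ for all $f \in \mathcal{F}$ and $k = 1, \dots, n$. Set

$$T_n(f) = \sum_{k=1}^n f^k(X_k), \quad Z = \sup_{f \in \mathcal{F}} T_n(f)$$

and

$$V_n = \sup_{f \in \mathcal{F}} E T_n^2(f) = \sup_{f \in \mathcal{F}} \sum_{k=1}^n E[f^k(X_k)]^2, \quad \mathcal{V}_n = 2EZ + V_n. \qquad (3.126)$$

Then, for all $t \in [0, 2/3)$,

$$L(t) := \log(E e^{tZ}) \le tEZ + \frac{t^2}{2 - 3t}\,\mathcal{V}_n, \qquad (3.127)$$

and therefore, for all $x \ge 0$,

$$\Pr\left\{Z \ge EZ + \sqrt{2\mathcal{V}_n x} + \frac{3x}{2}\right\} \le e^{-x}. \qquad (3.128)$$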
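
The passage from the moment generating function bound (3.127) to the deviation inequality (3.128) is a Chernoff optimisation. The sketch below carries it out numerically for the quadratic-over-linear bound $t^2 \mathcal{V}/(2 - 3t)$; the closed-form maximiser used in `chernoff_exponent` is the usual one for bounds of this sub-gamma shape and is an assumption of this sketch, not a formula from the slides. The argument `V` stands for the combined quantity $2EZ + \sup_f ET_n^2(f)$, and its values are hypothetical.

```python
import math

C = 1.5  # the constant 3/2 appearing in (3.127) and (3.128)

def log_mgf_bound(t, V):
    # Bound on L(t) - t EZ from (3.127): t^2 V / (2 - 3 t), for 0 <= t < 2/3.
    return t * t * V / (2.0 - 3.0 * t)

def threshold(x, V):
    # Deviation sqrt(2 V x) + 3 x / 2 from (3.128).
    return math.sqrt(2.0 * V * x) + C * x

def chernoff_exponent(s, V):
    # Maximise t * s - log_mgf_bound(t, V) over 0 <= t < 2/3.
    t_star = (1.0 - 1.0 / math.sqrt(1.0 + 2.0 * C * s / V)) / C
    return t_star, t_star * s - log_mgf_bound(t_star, V)

# Hypothetical values: at the threshold the optimal exponent equals x.
V, x = 10.0, 2.0
s = threshold(x, V)
t_star, exponent = chernoff_exponent(s, V)

# Crude grid search over [0, 2/3) as an independent check of the maximiser.
grid_best = max(t * s - log_mgf_bound(t, V) for t in (k / 3000.0 for k in range(2000)))
```

By Markov's inequality, $\Pr\{Z - EZ \ge s\} \le \exp(-\sup_t [ts - \text{bound}(t)])$, so an exponent equal to $x$ at $s = \sqrt{2\mathcal{V}x} + 3x/2$ is exactly the statement (3.128).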

SLIDE 19

Proof.

To prove Theorem 3.3.16 we need Lemmas 3.3.17 to 3.3.19.

Lemma 3.3.17

Let $F(t) = E e^{tZ}$, let $g(t; X_1, \dots, X_n) = e^{tZ}$ and let $g_k(t; X_1, \dots, X_n)$, $k = 1, \dots, n$, be nonnegative functions such that $E(g_k \log g_k) < \infty$ for all $t \ge 0$. Then

$$tF'(t) - F(t)\log F(t) = \operatorname{Ent}_P(g(t)) \le \sum_{k=1}^n E\left[g_k \log(g_k/E_k g_k)\right] + \sum_{k=1}^n E\left[(g - g_k)\log(g/E_k g)\right]. \qquad (3.129)$$
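
The identity on the left of (3.129) is a direct computation from the definition of entropy. A short sketch, assuming differentiation under the expectation is allowed (for instance because $Z$ is bounded, as it is when every $f^k$ takes values in $[-1, 1]$):

```latex
\begin{align*}
\operatorname{Ent}_P\bigl(g(t)\bigr)
  &= E\bigl(e^{tZ}\log e^{tZ}\bigr) - E e^{tZ}\,\log E e^{tZ} \\
  &= t\,E\bigl(Z e^{tZ}\bigr) - F(t)\log F(t) \\
  &= t\,F'(t) - F(t)\log F(t).
\end{align*}
```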

SLIDE 20

Lemma 3.3.18

For $g = e^{tZ}$ and the functions $g_k$, $1 \le k \le n$, defined by (3.130), we have

$$E\left((g - g_k)\log(g/E_k g)\right) \le tE(g - g_k).$$

Lemma 3.3.19

$$E\left[g_k \log\frac{g_k}{h_k} + (1 + t)(h_k - g_k)\right] \le \frac{t^2 e^t V_n}{2}\,F(t). \qquad (3.134)$$

SLIDE 21

Proof of Theorem 3.3.16

Proof.

Setting as usual $L(t) = \log E e^{tZ} = \log F(t)$, Lemmas 3.3.17 to 3.3.19 give

$$t(1 - t)L'(t) - L(t) \le t^2 e^t V_n/2.$$

Dividing both sides by $t^2$ and noting that $(L/t)' = L'/t - L/t^2$, this becomes

$$\left(\frac{L}{t}\right)' - L' \le \frac{e^t V_n}{2}.$$

Integrating and using a Taylor expansion, ...

$$L(t) - tEZ \le \frac{t^2}{(2 - t)(1 - t)}\,V_n + \frac{t^2}{1 - t}\,EZ \le \frac{t^2 (V_n + 2EZ)}{2 - 3t}.$$

This proves (3.127). To prove (3.128), apply Proposition 3.1.6 with $\varphi(\lambda) = \mathcal{V}_n \lambda^2/(2(1 - 3\lambda/2))$, where $\mathcal{V}_n = V_n + 2EZ$.
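
The integration step elided above can be sketched as follows. This is one possible route and may differ in detail from the argument in the book. It assumes $L(0) = 0$ and $L(s)/s \to L'(0) = EZ$ as $s \downarrow 0$, and it uses the elementary bound $e^t - 1 \le 2t/(2 - t)$ for $0 \le t < 2$, which follows from comparing power series since $k! \ge 2^{k-1}$.

```latex
\begin{align*}
\frac{L(t)}{t} - EZ - L(t)
  &= \int_0^t \Bigl[\bigl(L(s)/s\bigr)' - L'(s)\Bigr]\,ds
   \le \frac{V_n}{2}\int_0^t e^{s}\,ds
   = \frac{V_n}{2}\,\bigl(e^{t} - 1\bigr), \\
(1 - t)\,L(t)
  &\le t\,EZ + \frac{t\,(e^{t} - 1)}{2}\,V_n
   \le t\,EZ + \frac{t^2}{2 - t}\,V_n, \\
L(t) - t\,EZ
  &\le \frac{t^2}{(2 - t)(1 - t)}\,V_n + \frac{t^2}{1 - t}\,EZ,
  \qquad 0 \le t < 1 .
\end{align*}
% Final comparison, valid for 0 <= t < 2/3:
%   (2 - t)(1 - t) = 2 - 3t + t^2 >= 2 - 3t   and   1/(1 - t) <= 2/(2 - 3t).
```

The two comparisons in the closing comment give the last inequality of the proof, with the restriction $t < 2/3$ coming from the denominator $2 - 3t$.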
