Stability of Talagrand’s Gaussian Transport-Entropy Inequality


slide-1
SLIDE 1

Stability of Talagrand’s Gaussian Transport-Entropy Inequality

Dan Mikulincer Geometric and Functional Inequalities in Convexity and Probability

Weizmann Institute of Science Based on joint work with Ronen Eldan and Alex Zhai

slide-2
SLIDE 2

Geometry and Information

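Throughout, $G \sim \gamma$ will denote the standard Gaussian in $\mathbb{R}^d$.

Definition (Wasserstein distance between $\mu$ and $\gamma$)
\[ W_2(\mu, \gamma) := \inf_{\pi} \left( \mathbb{E}_{\pi}\left[ \|x - y\|_2^2 \right] \right)^{1/2}, \]
where $\pi$ ranges over all possible couplings of $\mu$ and $\gamma$.

Definition (Relative entropy between $\mu$ and $\gamma$)
\[ \mathrm{Ent}(\mu\|\gamma) := \mathbb{E}_{\mu}\left[ \ln\left( \frac{d\mu}{d\gamma}(x) \right) \right]. \]

Remark: if $X \sim \mu$ we will also write $\mathrm{Ent}(X\|G)$, $W_2(X, G)$.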
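Both definitions can be evaluated numerically in dimension one. The following is a minimal sketch, not part of the talk: it assumes $d = 1$, uses the fact that the optimal coupling on the line pairs equal quantiles, and approximates both integrals with a midpoint rule. The function names and the integration window are illustrative choices.

```python
import math
from statistics import NormalDist

STD = NormalDist(0.0, 1.0)  # the standard Gaussian gamma in dimension 1


def w2_squared_to_gaussian(mu: NormalDist, n: int = 20_000) -> float:
    """Midpoint-rule estimate of W_2^2(mu, gamma) in dimension 1.

    On the line the optimal coupling pairs equal quantiles, so
    W_2^2 = int_0^1 (F_mu^{-1}(u) - F_gamma^{-1}(u))^2 du.
    """
    total = 0.0
    for i in range(n):
        u = (i + 0.5) / n
        total += (mu.inv_cdf(u) - STD.inv_cdf(u)) ** 2
    return total / n


def relative_entropy_to_gaussian(mu: NormalDist, lo: float = -12.0,
                                 hi: float = 12.0, n: int = 20_000) -> float:
    """Midpoint-rule estimate of Ent(mu||gamma) = E_mu[ln(dmu/dgamma)]."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        f = mu.pdf(x)
        if f > 0.0:  # skip points where the density underflows to zero
            total += f * math.log(f / STD.pdf(x)) * h
    return total


if __name__ == "__main__":
    shifted = NormalDist(1.5, 1.0)
    print(w2_squared_to_gaussian(shifted), relative_entropy_to_gaussian(shifted))
```

For a unit-variance Gaussian shifted by $a$, the two estimates should be close to $a^2$ and $a^2/2$ respectively.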

slide-6
SLIDE 6

Talagrand’s Inequality

In '96 Talagrand proved the following inequality, which connects geometry and information.

Theorem (Talagrand’s Gaussian transport-entropy inequality)
Let $\mu$ be a measure on $\mathbb{R}^d$. Then
\[ W_2^2(\mu, \gamma) \le 2\,\mathrm{Ent}(\mu\|\gamma). \]

It is enough to consider measures such that $\mu \ll \gamma$.

slide-7
SLIDE 7

Talagrand’s Inequality - Applications

  • By considering measures of the form $\mathbb{1}_A\,d\gamma$ the inequality implies a (non-sharp) Gaussian isoperimetric inequality.
  • The inequality tensorizes and may be used to show dimension-free Gaussian concentration bounds.
  • If $f$ is convex, then applying the inequality to $e^{-\lambda f}\,d\gamma$ yields a one-sided Gaussian concentration for concave functions.
slide-11
SLIDE 11

Gaussians

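If $\gamma_{a,\Sigma} = \mathcal{N}(a, \Sigma)$, in $\mathbb{R}^d$:

  • $\mathrm{Ent}(\gamma_{a,\Sigma}\|\gamma) = \frac{1}{2}\left( \mathrm{Tr}(\Sigma) + \|a\|_2^2 - \ln(\det(\Sigma)) - d \right)$
  • $W_2^2(\gamma_{a,\Sigma}, \gamma) = \|a\|_2^2 + \left\| \sqrt{\Sigma} - \mathrm{Id} \right\|_{HS}^2$

In particular, for any $a \in \mathbb{R}^d$,
\[ W_2^2(\gamma_{a,\mathrm{Id}}, \gamma) = 2\,\mathrm{Ent}(\gamma_{a,\mathrm{Id}}\|\gamma). \]

These are the only equality cases.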
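The two closed forms make Talagrand's inequality easy to test on Gaussians. A small sketch, assuming a diagonal covariance so that only the standard library is needed; the function names are illustrative and not from the talk.

```python
import math


def entropy_gaussian(a, variances):
    """Ent(N(a, Sigma) || gamma) for Sigma = diag(variances)."""
    d = len(a)
    trace = sum(variances)
    log_det = sum(math.log(v) for v in variances)
    return 0.5 * (trace + sum(x * x for x in a) - log_det - d)


def w2_squared_gaussian(a, variances):
    """W_2^2(N(a, Sigma), gamma) for Sigma = diag(variances)."""
    shift = sum(x * x for x in a)
    hilbert_schmidt = sum((math.sqrt(v) - 1.0) ** 2 for v in variances)
    return shift + hilbert_schmidt


def talagrand_deficit(a, variances):
    """2 Ent - W_2^2, which Talagrand's inequality says is nonnegative."""
    return 2.0 * entropy_gaussian(a, variances) - w2_squared_gaussian(a, variances)


if __name__ == "__main__":
    print(talagrand_deficit([3.0, -1.0], [1.0, 1.0]))   # translate of gamma
    print(talagrand_deficit([0.0, 0.0], [4.0, 0.25]))   # non-identity covariance
```

The first call is an equality case (identity covariance), while the second has a strictly positive gap.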

slide-15
SLIDE 15

Stability

Define the deficit
\[ \delta_{\mathrm{Tal}}(\mu) = 2\,\mathrm{Ent}(\mu\|\gamma) - W_2^2(\mu, \gamma). \]

The question of stability deals with approximate equality cases.

Question: Suppose that $\delta_{\mathrm{Tal}}(\mu)$ is small; must $\mu$ be close to a translate of the standard Gaussian?

Note that the deficit is invariant to translations. So, it will be enough to consider centered measures.
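The translation invariance can be checked directly. A short derivation, assuming $\mu$ has a finite second moment and writing $\mu_a$ for the law of $X + a$ with $X \sim \mu$ and $m = \mathbb{E}[X]$ (the symbols $\mu_a$ and $m$ are notation introduced here, not taken from the slides):

```latex
\begin{align*}
\mathrm{Ent}(\mu_a\|\gamma) &= \mathrm{Ent}(\mu\|\gamma) + \langle a, m\rangle + \tfrac{1}{2}\|a\|_2^2,\\
W_2^2(\mu_a,\gamma) &= W_2^2(\mu,\gamma) + 2\langle a, m\rangle + \|a\|_2^2,\\
\delta_{\mathrm{Tal}}(\mu_a) &= 2\,\mathrm{Ent}(\mu_a\|\gamma) - W_2^2(\mu_a,\gamma) = \delta_{\mathrm{Tal}}(\mu).
\end{align*}
```

The first line comes from the change in $\tfrac{1}{2}\mathbb{E}\|X\|_2^2$ under the shift, and the second from splitting $W_2^2$ into the squared distance between the means plus the cost between the centered measures.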

slide-18
SLIDE 18

Instability

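Theorem (Fathi, Indrei, Ledoux '14)
Let $\mu$ be a centered measure on $\mathbb{R}^d$. Then
\[ \delta_{\mathrm{Tal}}(\mu) \gtrsim \min\left( \frac{W_{1,1}(\mu, \gamma)^2}{d}, \frac{W_{1,1}(\mu, \gamma)}{\sqrt{d}} \right). \]

  • The 1-dimensional case was proven earlier by Barthe and Kolesnikov.

However:

Theorem
There exists a sequence of centered Gaussian mixtures $\{\mu_n\}$ on $\mathbb{R}$ such that $\delta_{\mathrm{Tal}}(\mu_n) \to 0$, but $W_2^2(\mu_n, \gamma) > 1$.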

slide-20
SLIDE 20

Bounding the Deficit

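In the 1-dimensional case, Talagrand actually showed
\[ \delta_{\mathrm{Tal}}(\mu) = 2\int_{\mathbb{R}} \left( \varphi_\mu' - 1 - \ln(\varphi_\mu') \right) d\gamma \ge 0, \]
where $\varphi_\mu$ is the transport map $\varphi_\mu = F_\mu^{-1} \circ F_\gamma$.

For translated Gaussians, $\varphi_{\gamma_{a,1}}(x) = x + a$, which shows the equality cases. We will take a different route.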
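The one-dimensional identity follows from a change of variables. A sketch of the computation, assuming $\varphi_\mu$ is the increasing, differentiable map that pushes $\gamma$ forward to $\mu$ and that all integrals are finite; with the normalization $\delta_{\mathrm{Tal}} = 2\,\mathrm{Ent} - W_2^2$ the right-hand side carries a factor $2$:

```latex
\begin{align*}
\mathrm{Ent}(\mu\|\gamma) &= \int_{\mathbb{R}} \left( \frac{\varphi_\mu(x)^2 - x^2}{2} - \ln \varphi_\mu'(x) \right) d\gamma(x),\\
W_2^2(\mu,\gamma) &= \int_{\mathbb{R}} \left( \varphi_\mu(x) - x \right)^2 d\gamma(x),\\
2\,\mathrm{Ent}(\mu\|\gamma) - W_2^2(\mu,\gamma)
  &= 2\int_{\mathbb{R}} \left( x\,\varphi_\mu(x) - x^2 - \ln \varphi_\mu'(x) \right) d\gamma(x)\\
  &= 2\int_{\mathbb{R}} \left( \varphi_\mu'(x) - 1 - \ln \varphi_\mu'(x) \right) d\gamma(x).
\end{align*}
```

The last step uses Gaussian integration by parts, $\int x\,\varphi_\mu\,d\gamma = \int \varphi_\mu'\,d\gamma$, together with $\int x^2\,d\gamma = 1$. Since $s - 1 - \ln s \ge 0$ with equality only at $s = 1$, the deficit vanishes exactly when $\varphi_\mu' \equiv 1$.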

slide-23
SLIDE 23

Bounding the Deficit - the Föllmer Drift

Our central construct will be the Föllmer drift, which is the solution to the following variational problem:
\[ v_t := \arg\min_{u_t} \frac{1}{2} \int_0^1 \mathbb{E}\left[ \|u_t\|_2^2 \right] dt, \]
where $u_t$ ranges over all adapted drifts for which $B_1 + \int_0^1 u_t\,dt$ has the same law as $\mu$. We denote
\[ X_t := B_t + \int_0^t v_s\,ds. \]
slide-25
SLIDE 25

Bounding the Deficit - the Föllmer Drift

The process $v_t$ goes back at least to the works of Föllmer ('86). In a later work by Lehec ('12) it is shown that if $\mu$ has finite entropy relative to $\gamma$, then $v_t$ is well defined and that:

  • 1. $v_t$ is a martingale, with $v_t(X_t) = \nabla \ln\left( P_{1-t}\left( \frac{d\mu}{d\gamma} \right)(X_t) \right)$.
  • 2. $\mathrm{Ent}(\mu\|\gamma) = \mathrm{Ent}(X_\cdot\|B_\cdot) = \frac{1}{2} \int_0^1 \mathbb{E}\left[ \|v_t\|_2^2 \right] dt$.
  • 3. In the Wiener space, the density of $X_t$ with respect to $B_t$ is given by $\frac{d\mu}{d\gamma}(\omega_1)$.
  • 4. If $G \sim \gamma$, independent from $X_1$, then $X_t \stackrel{\mathrm{law}}{=} tX_1 + \sqrt{t(1-t)}\,G$.
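For a Gaussian target the Föllmer drift is explicit, which gives a concrete way to see properties 2 and 4 at work. The sketch below is an illustration under stated assumptions, not the construction from the talk: it takes $d = 1$ and $\mu = \mathcal{N}(0, \sigma^2)$, for which evaluating $\nabla \ln P_{1-t}(d\mu/d\gamma)$ by hand gives $v_t(x) = kx/(1 + kt)$ with $k = \sigma^2 - 1$. The Euler-Maruyama discretization, the sample sizes, and the function names are all illustrative choices.

```python
import math
import random


def follmer_drift(t, x, sigma2):
    """Follmer drift v_t(x) toward mu = N(0, sigma2) in dimension 1."""
    k = sigma2 - 1.0
    return k * x / (1.0 + k * t)


def simulate_x1(sigma2, n_paths=5000, n_steps=100, seed=0):
    """Euler-Maruyama samples of X_1 for dX_t = v_t(X_t) dt + dB_t, X_0 = 0."""
    rng = random.Random(seed)
    dt = 1.0 / n_steps
    sd = math.sqrt(dt)
    samples = []
    for _ in range(n_paths):
        x = 0.0
        for i in range(n_steps):
            x += follmer_drift(i * dt, x, sigma2) * dt + rng.gauss(0.0, sd)
        samples.append(x)
    return samples


def drift_energy(sigma2, n=100_000):
    """(1/2) int_0^1 E|v_t|^2 dt, using Var(X_t) = t^2 sigma2 + t(1 - t)."""
    k = sigma2 - 1.0
    total = 0.0
    for i in range(n):
        t = (i + 0.5) / n
        var_xt = t * t * sigma2 + t * (1.0 - t)  # from property 4
        total += (k / (1.0 + k * t)) ** 2 * var_xt
    return 0.5 * total / n


if __name__ == "__main__":
    xs = simulate_x1(4.0)
    print(sum(x * x for x in xs) / len(xs))                 # sample variance of X_1
    print(drift_energy(4.0), 0.5 * (3.0 - math.log(4.0)))   # energy vs. entropy
```

The sample variance of $X_1$ should be close to $\sigma^2$, and the drift energy should match the closed-form entropy $\frac{1}{2}(\sigma^2 - 1 - \ln \sigma^2)$.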
slide-30
SLIDE 30

Proof of Talagrand’s Inequality

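Proof of Talagrand’s Inequality (Lehec).
\[ W_2^2(\mu, \gamma) \le \mathbb{E}\left[ \|X_1 - B_1\|_2^2 \right] = \mathbb{E}\left[ \left\| \int_0^1 v_t\,dt \right\|_2^2 \right] \le \int_0^1 \mathbb{E}\left[ \|v_t\|_2^2 \right] dt = 2\,\mathrm{Ent}(\mu\|\gamma). \]

The goal is to make this quantitative.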

slide-32
SLIDE 32

Stability for Measures with a Finite Poincaré Constant

We say that $\mu$ satisfies a Poincaré inequality, with constant $C_p(\mu)$, if for every smooth function $f$,
\[ \mathrm{Var}_\mu(f) \le C_p(\mu)\,\mathbb{E}_\mu\left[ \|\nabla f\|_2^2 \right]. \]

We will prove:

Theorem
Let $\mu$ be a centered measure on $\mathbb{R}^d$ with $C_p(\mu) < \infty$. Then
\[ \delta_{\mathrm{Tal}}(\mu) \ge \frac{\ln(C_p(\mu) + 1)}{4 C_p(\mu)}\,\mathrm{Ent}(\mu\|\gamma). \]

slide-34
SLIDE 34

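Measures with a Finite Poincaré Constant

The Poincaré inequality is used to prove the following comparison lemma:

Lemma
Assume that $\mu$ is centered and that $C_p(\mu) < \infty$. Then

  • For $0 \le t \le \frac{1}{2}$,
\[ \mathbb{E}\left[ \|v_t\|_2^2 \right] \le \mathbb{E}\left[ \|v_{1/2}\|_2^2 \right] \frac{(C_p(\mu) + 1)\,t}{(C_p(\mu) - 1)\,t + 1}. \]

  • For $\frac{1}{2} \le t \le 1$,
\[ \mathbb{E}\left[ \|v_t\|_2^2 \right] \ge \mathbb{E}\left[ \|v_{1/2}\|_2^2 \right] \frac{(C_p(\mu) + 1)\,t}{(C_p(\mu) - 1)\,t + 1}. \]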

slide-35
SLIDE 35

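Proof. Recall $X_t \stackrel{\mathrm{law}}{=} tX_1 + \sqrt{t(1-t)}\,G$. Hence,
\[ C_p(X_t) \le t^2 C_p(\mu) + t(1-t), \]
and
\[ \mathbb{E}\left[ \|v_t(X_t)\|_2^2 \right] \le \left( t^2 C_p(\mu) + t(1-t) \right) \mathbb{E}\left[ \|\nabla v_t(X_t)\|_2^2 \right] = \left( t^2 C_p(\mu) + t(1-t) \right) \frac{d}{dt}\,\mathbb{E}\left[ \|v_t(X_t)\|_2^2 \right]. \]

The function
\[ g(t) := \mathbb{E}\left[ \|v_{1/2}\|_2^2 \right] \frac{(C_p(\mu) + 1)\,t}{(C_p(\mu) - 1)\,t + 1} \]
solves
\[ f(t) = \left( t^2 C_p(\mu) + t(1-t) \right) f'(t), \quad \text{with } f\left( \tfrac{1}{2} \right) = \mathbb{E}\left[ \|v_{1/2}\|_2^2 \right]. \]

Now apply Grönwall’s inequality.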
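The claim that $g$ solves the comparison equation $f(t) = (t^2 C_p(\mu) + t(1-t))\,f'(t)$ with $f(1/2) = \mathbb{E}\|v_{1/2}\|_2^2$ can be sanity-checked numerically. A minimal sketch, where `cp` stands for $C_p(\mu)$ and `a` for $\mathbb{E}\|v_{1/2}\|_2^2$ (both parameter names are illustrative), and the derivative is approximated by central differences:

```python
def g(t, cp, a=1.0):
    """Comparison function from the lemma; `a` stands for E|v_{1/2}|^2."""
    return a * (cp + 1.0) * t / ((cp - 1.0) * t + 1.0)


def ode_residual(t, cp, a=1.0, h=1e-6):
    """g(t) - (t^2 cp + t(1 - t)) g'(t), with g' by central differences."""
    g_prime = (g(t + h, cp, a) - g(t - h, cp, a)) / (2.0 * h)
    return g(t, cp, a) - (t * t * cp + t * (1.0 - t)) * g_prime


if __name__ == "__main__":
    for cp in (0.5, 1.0, 3.0, 10.0):
        print(cp, [ode_residual(t, cp) for t in (0.1, 0.5, 0.9)])
```

The residuals should vanish up to finite-difference error, and $g(1/2)$ returns `a` for every value of `cp`.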

slide-39
SLIDE 39

A Martingale Formulation

We will use the following martingale formulation:
\[ Y_t := \mathbb{E}\left[ X_1 \mid \mathcal{F}_t \right]. \]
By the martingale representation theorem, for some process $\Gamma_t$, which is uniquely defined, $Y_t$ satisfies
\[ Y_t = \int_0^t \Gamma_s\,dB_s. \]
This implies
\[ v_t = \int_0^t \frac{\Gamma_s - \mathrm{Id}}{1 - s}\,dB_s. \]

slide-42
SLIDE 42

A Martingale Formulation

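It turns out that $\Gamma_t$ is a positive definite matrix, hence
\[ \mathrm{Ent}(\mu\|\gamma) = \frac{1}{2} \int_0^1 \mathbb{E}\left[ \|v_s\|_2^2 \right] ds = \frac{1}{2}\,\mathrm{Tr} \int_0^1 \int_0^s \frac{\mathbb{E}\left[ (\Gamma_t - \mathrm{Id})^2 \right]}{(1-t)^2}\,dt\,ds = \frac{1}{2}\,\mathrm{Tr} \int_0^1 \frac{\mathbb{E}\left[ (\Gamma_t - \mathrm{Id})^2 \right]}{1-t}\,dt, \]
and
\[ W_2^2(\mu, \gamma) \le \mathbb{E}\left[ \left\| \int_0^1 \Gamma_t\,dB_t - \int_0^1 dB_t \right\|_2^2 \right] = \mathrm{Tr} \int_0^1 \mathbb{E}\left[ (\Gamma_t - \mathrm{Id})^2 \right] dt. \]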
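The two representations can be tested on a one-dimensional Gaussian, where $\Gamma_t$ is deterministic. The sketch below assumes $\mu = \mathcal{N}(0, \sigma^2)$, for which a direct computation (an assumption of this example, not stated on the slides) gives $\Gamma_t = \sigma^2 / (1 + (\sigma^2 - 1)t)$. It compares the entropy formula with the closed form, checks that the coupling cost dominates the true $W_2^2 = (\sigma - 1)^2$, and evaluates the gap $\int_0^1 t\,(\Gamma_t - 1)^2/(1-t)\,dt$ between twice the entropy and the coupling cost. Function names are illustrative.

```python
import math


def gamma_t(t, sigma2):
    """Gamma_t for mu = N(0, sigma2) in dimension 1 (deterministic here)."""
    return sigma2 / (1.0 + (sigma2 - 1.0) * t)


def midpoint(f, n=200_000):
    """Midpoint-rule approximation of int_0^1 f(t) dt."""
    return sum(f((i + 0.5) / n) for i in range(n)) / n


def entropy_via_gamma(sigma2):
    """(1/2) int_0^1 (Gamma_t - 1)^2 / (1 - t) dt."""
    return 0.5 * midpoint(lambda t: (gamma_t(t, sigma2) - 1.0) ** 2 / (1.0 - t))


def coupling_cost(sigma2):
    """int_0^1 (Gamma_t - 1)^2 dt, an upper bound on W_2^2(mu, gamma)."""
    return midpoint(lambda t: (gamma_t(t, sigma2) - 1.0) ** 2)


def deficit_lower_bound(sigma2):
    """int_0^1 t (Gamma_t - 1)^2 / (1 - t) dt = 2 Ent - coupling cost."""
    return midpoint(lambda t: t * (gamma_t(t, sigma2) - 1.0) ** 2 / (1.0 - t))


if __name__ == "__main__":
    s2 = 4.0
    print(entropy_via_gamma(s2), 0.5 * (s2 - 1.0 - math.log(s2)))
    print(coupling_cost(s2), (math.sqrt(s2) - 1.0) ** 2)
    print(deficit_lower_bound(s2))
```

Because the coupling $(X_1, B_1)$ is not optimal in general, the second pair of numbers differs, and the last value is only a lower bound on the true deficit.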

slide-44
SLIDE 44

Bounding the Deficit - Martingales

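\[ \delta_{\mathrm{Tal}}(\mu) = 2\,\mathrm{Ent}(\mu\|\gamma) - W_2^2(\mu, \gamma) \ge \mathrm{Tr} \int_0^1 \frac{t \cdot \mathbb{E}\left[ (\Gamma_t - \mathrm{Id})^2 \right]}{1-t}\,dt. \]

Integration by parts gives:
\[ \delta_{\mathrm{Tal}}(\mu) \ge \mathrm{Tr} \int_0^1 t(1-t) \cdot \frac{\mathbb{E}\left[ (\Gamma_t - \mathrm{Id})^2 \right]}{(1-t)^2}\,dt = \int_0^1 t(1-t)\,\frac{d}{dt}\,\mathbb{E}\left[ \|v_t\|_2^2 \right] dt = \int_0^1 (2t-1)\,\mathbb{E}\left[ \|v_t\|_2^2 \right] dt. \]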
slide-47
SLIDE 47

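Applying the Lemma

\[ \delta_{\mathrm{Tal}}(\mu) \ge \int_0^1 (2t-1)\,\mathbb{E}\left[ \|v_t\|_2^2 \right] dt \ge \mathbb{E}\left[ \|v_{1/2}\|_2^2 \right] \int_0^1 (2t-1)\,\frac{(C_p(\mu) + 1)\,t}{(C_p(\mu) - 1)\,t + 1}\,dt \ge \mathbb{E}\left[ \|v_{1/2}\|_2^2 \right] \frac{\ln(C_p(\mu) + 1)}{4 C_p(\mu)}. \]

If $\mathbb{E}\left[ \|v_{1/2}\|_2^2 \right] \ge \mathrm{Ent}(\mu\|\gamma)$, this shows
\[ \delta_{\mathrm{Tal}}(\mu) \ge \frac{\ln(C_p(\mu) + 1)}{4 C_p(\mu)}\,\mathrm{Ent}(\mu\|\gamma). \]
The other case is easier.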
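The last inequality in the chain is a statement about an explicit one-variable integral, $\int_0^1 (2t-1)\frac{(C_p+1)t}{(C_p-1)t+1}\,dt \ge \frac{\ln(C_p+1)}{4C_p}$, so it can be checked numerically. A minimal sketch, with `cp` standing for $C_p(\mu)$ and a midpoint rule as the (illustrative) quadrature:

```python
import math


def lemma_integral(cp, n=100_000):
    """Midpoint-rule value of int_0^1 (2t - 1)(cp + 1)t / ((cp - 1)t + 1) dt."""
    total = 0.0
    for i in range(n):
        t = (i + 0.5) / n
        total += (2.0 * t - 1.0) * (cp + 1.0) * t / ((cp - 1.0) * t + 1.0)
    return total / n


def stated_lower_bound(cp):
    """The bound ln(cp + 1) / (4 cp) used on the slide."""
    return math.log(cp + 1.0) / (4.0 * cp)


if __name__ == "__main__":
    for cp in (0.5, 1.0, 2.0, 10.0, 100.0):
        print(cp, lemma_integral(cp), stated_lower_bound(cp))
```

For `cp = 1` the integrand reduces to $(2t-1)\cdot 2t$, whose integral is $1/3$, which gives a quick correctness check of the quadrature.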

slide-51
SLIDE 51

Further Results

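Other bounds on $\frac{d}{dt}\,\mathbb{E}\left[ \|v_t\|_2^2 \right]$ will yield different results.

For example, if $\mathrm{tr}(\mathrm{Cov}(\mu)) \le d$, then
\[ \frac{d}{dt}\,\mathbb{E}\left[ \|v_t\|_2^2 \right] \ge \frac{\left( \mathbb{E}\left[ \|v_t\|_2^2 \right] \right)^2}{d}. \]
This gives:

Theorem
Let $\mu$ be a measure on $\mathbb{R}^d$ such that $\mathrm{tr}(\mathrm{Cov}(\mu)) \le d$. Then
\[ \delta_{\mathrm{Tal}}(\mu) \ge \min\left( \frac{\mathrm{Ent}(\mu\|\gamma)^2}{6d}, \frac{\mathrm{Ent}(\mu\|\gamma)}{4} \right). \]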
slide-54
SLIDE 54

Further Results

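Two other results:

Theorem
Let $\mu$ be a measure on $\mathbb{R}^d$ and let $\{\lambda_i\}_{i=1}^d$ be the eigenvalues of $\mathrm{Cov}(\mu)$. Then
\[ \delta_{\mathrm{Tal}}(\mu) \ge \sum_{i=1}^d \frac{2(1 - \lambda_i) + (\lambda_i + 1)\ln(\lambda_i)}{\lambda_i - 1}\,\mathbb{1}_{\{\lambda_i < 1\}}. \]

Theorem
Let $\mu$ be a measure on $\mathbb{R}^d$. There exists another measure $\nu$ such that
\[ \delta_{\mathrm{Tal}}(\mu) \ge \frac{1}{3\sqrt{3}}\,\frac{\mathrm{Ent}(\mu\|\gamma)^{3/2}}{\sqrt{d}}. \]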
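The covariance-eigenvalue bound can be compared against the exact deficit of a centered Gaussian, which depends only on the same eigenvalues. A small sketch, assuming $\mu = \mathcal{N}(0, \Sigma)$ so that $\delta_{\mathrm{Tal}}(\mu) = \sum_i \left( 2\sqrt{\lambda_i} - 2 - \ln \lambda_i \right)$ (this follows from the Gaussian closed forms for $\mathrm{Ent}$ and $W_2^2$); the function names are illustrative.

```python
import math


def eigenvalue_bound(eigs):
    """Right-hand side of the covariance-eigenvalue theorem."""
    total = 0.0
    for lam in eigs:
        if lam < 1.0:  # only eigenvalues below 1 contribute
            total += (2.0 * (1.0 - lam) + (lam + 1.0) * math.log(lam)) / (lam - 1.0)
    return total


def gaussian_deficit(eigs):
    """delta_Tal(N(0, Sigma)) when Sigma has eigenvalues `eigs`."""
    return sum(2.0 * math.sqrt(lam) - 2.0 - math.log(lam) for lam in eigs)


if __name__ == "__main__":
    eigs = [0.1, 0.5, 0.9, 2.0]
    print(eigenvalue_bound(eigs), gaussian_deficit(eigs))
```

On Gaussians the bound sits below the true deficit and becomes tight as an eigenvalue tends to $0$; eigenvalues at or above $1$ contribute nothing to the bound.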

slide-57
SLIDE 57

Log-Sobolev Inequality

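Definition (Fisher information of $\mu$ with respect to $\gamma$)
\[ I(\mu\|\gamma) = \mathbb{E}_\mu\left[ \left\| \nabla \ln\left( \frac{d\mu}{d\gamma} \right) \right\|_2^2 \right]. \]

In '75 Gross proved:

Theorem (Log-Sobolev inequality)
Let $\mu$ be a measure on $\mathbb{R}^d$. Then
\[ 2\,\mathrm{Ent}(\mu\|\gamma) \le I(\mu\|\gamma). \]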

slide-59
SLIDE 59

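Define $\delta_{LS}(\mu) = I(\mu\|\gamma) - 2\,\mathrm{Ent}(\mu\|\gamma)$, and recall
\[ v_t := v_t(X_t) = \nabla \ln\left( P_{1-t}\left( \frac{d\mu}{d\gamma} \right)(X_t) \right). \]

It follows that
\[ \mathrm{Tr} \int_0^1 \frac{\mathbb{E}\left[ (\Gamma_t - \mathrm{Id})^2 \right]}{(1-t)^2}\,dt = \mathbb{E}\left[ \|v_1\|_2^2 \right] = I(\mu\|\gamma). \]

Since $\mathrm{Ent}(\mu\|\gamma) = \frac{1}{2}\,\mathrm{Tr} \int_0^1 \frac{\mathbb{E}\left[ (\Gamma_t - \mathrm{Id})^2 \right]}{1-t}\,dt$, we get
\[ \delta_{LS}(\mu) = \mathrm{Tr} \int_0^1 \frac{t \cdot \mathbb{E}\left[ (\Gamma_t - \mathrm{Id})^2 \right]}{(1-t)^2}\,dt. \]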
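The formula for the log-Sobolev deficit can be checked on a one-dimensional Gaussian. The sketch assumes $\mu = \mathcal{N}(0, \sigma^2)$, for which $\Gamma_t = \sigma^2/(1 + (\sigma^2 - 1)t)$ and $I(\mu\|\gamma) = (\sigma^2 - 1)^2/\sigma^2$ (both computed by hand for this example, not taken from the slides); the names are illustrative.

```python
import math


def gamma_t(t, sigma2):
    """Gamma_t for mu = N(0, sigma2) in dimension 1 (deterministic here)."""
    return sigma2 / (1.0 + (sigma2 - 1.0) * t)


def fisher_information(sigma2):
    """I(N(0, sigma2) || gamma) = E_mu[(x (1 - 1/sigma2))^2]."""
    return (sigma2 - 1.0) ** 2 / sigma2


def ls_deficit_via_gamma(sigma2, n=200_000):
    """Midpoint rule for int_0^1 t (Gamma_t - 1)^2 / (1 - t)^2 dt."""
    total = 0.0
    for i in range(n):
        t = (i + 0.5) / n
        total += t * ((gamma_t(t, sigma2) - 1.0) / (1.0 - t)) ** 2
    return total / n


if __name__ == "__main__":
    s2 = 4.0
    ent = 0.5 * (s2 - 1.0 - math.log(s2))
    print(ls_deficit_via_gamma(s2), fisher_information(s2) - 2.0 * ent)
```

Both printed numbers should agree, and they are nonnegative, as Gross's inequality requires.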

slide-63
SLIDE 63

The Shannon-Stam Inequality

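In '48 Shannon noted the following inequality, which was later proved by Stam in '56.

Theorem (Shannon-Stam Inequality)
Let $X, Y$ be independent random vectors in $\mathbb{R}^d$ and let $G \sim \gamma$. Then, for any $\lambda \in [0, 1]$,
\[ \mathrm{Ent}\left( \sqrt{\lambda}\,X + \sqrt{1-\lambda}\,Y \,\middle\|\, G \right) \le \lambda\,\mathrm{Ent}(X\|G) + (1-\lambda)\,\mathrm{Ent}(Y\|G). \]
Moreover, equality holds if and only if $X$ and $Y$ are Gaussians with identical covariances.

Define
\[ \delta_\lambda(X, Y) = \lambda\,\mathrm{Ent}(X\|G) + (1-\lambda)\,\mathrm{Ent}(Y\|G) - \mathrm{Ent}\left( \sqrt{\lambda}\,X + \sqrt{1-\lambda}\,Y \,\middle\|\, G \right). \]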
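For centered one-dimensional Gaussians the Shannon-Stam deficit has a closed form, which makes the equality case visible. A minimal sketch, assuming $X \sim \mathcal{N}(0, \sigma_X^2)$ and $Y \sim \mathcal{N}(0, \sigma_Y^2)$ are independent, so that $\sqrt{\lambda}X + \sqrt{1-\lambda}Y \sim \mathcal{N}(0, \lambda\sigma_X^2 + (1-\lambda)\sigma_Y^2)$; the function names are illustrative.

```python
import math


def ent_centered_gaussian(var):
    """Ent(N(0, var) || G) in dimension 1."""
    return 0.5 * (var - 1.0 - math.log(var))


def shannon_stam_deficit(var_x, var_y, lam):
    """delta_lambda(X, Y) for independent centered Gaussians X and Y."""
    mixed_var = lam * var_x + (1.0 - lam) * var_y
    return (lam * ent_centered_gaussian(var_x)
            + (1.0 - lam) * ent_centered_gaussian(var_y)
            - ent_centered_gaussian(mixed_var))


if __name__ == "__main__":
    print(shannon_stam_deficit(2.0, 2.0, 0.3))   # identical variances
    print(shannon_stam_deficit(0.5, 3.0, 0.5))   # different variances
```

In this family the deficit reduces to $\frac{1}{2}\left[ \ln(\lambda\sigma_X^2 + (1-\lambda)\sigma_Y^2) - \lambda\ln\sigma_X^2 - (1-\lambda)\ln\sigma_Y^2 \right]$, which is nonnegative by concavity of the logarithm and zero exactly when the variances coincide.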

slide-65
SLIDE 65

Deficit of the Shannon-Stam Inequality

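For simplicity we’ll focus on the case $\lambda = \frac{1}{2}$.

Now, for $X, Y$ independent random variables, take two independent Brownian motions $B_t^X, B_t^Y$ and $\Gamma_t^X, \Gamma_t^Y$ as above.

We get
\[ \frac{X + Y}{\sqrt{2}} = \frac{1}{\sqrt{2}} \left( \int_0^1 \Gamma_t^X\,dB_t^X + \int_0^1 \Gamma_t^Y\,dB_t^Y \right) \stackrel{\mathrm{law}}{=} \int_0^1 \sqrt{\frac{(\Gamma_t^X)^2 + (\Gamma_t^Y)^2}{2}}\,dB_t, \]
for some Brownian motion $B_t$.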

slide-67
SLIDE 67

Bounding the Deficit

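If $H_t = \sqrt{\frac{(\Gamma_t^X)^2 + (\Gamma_t^Y)^2}{2}}$, then
\[ \mathrm{Ent}\left( \frac{X + Y}{\sqrt{2}} \,\middle\|\, G \right) \le \frac{1}{2}\,\mathrm{Tr} \int_0^1 \frac{\mathbb{E}\left[ (\mathrm{Id} - H_t)^2 \right]}{1-t}\,dt. \]

Consequently,
\[ 2\,\delta_{\frac{1}{2}}(X, Y) \ge \mathrm{Tr} \int_0^1 \left( \frac{\mathbb{E}\left[ (\mathrm{Id} - \Gamma_t^Y)^2 \right]}{2(1-t)} + \frac{\mathbb{E}\left[ (\mathrm{Id} - \Gamma_t^X)^2 \right]}{2(1-t)} - \frac{\mathbb{E}\left[ (\mathrm{Id} - H_t)^2 \right]}{1-t} \right) dt = \mathrm{Tr} \int_0^1 \frac{2\,\mathbb{E}[H_t] - \mathbb{E}[\Gamma_t^X] - \mathbb{E}[\Gamma_t^Y]}{1-t}\,dt. \]

Manipulating the matrix square root then shows
\[ \delta_{\frac{1}{2}}(X, Y) \gtrsim \mathrm{Tr} \int_0^1 \frac{\mathbb{E}\left[ (\Gamma_t^X - \Gamma_t^Y)^2 (\Gamma_t^X + \Gamma_t^Y)^{-1} \right]}{1-t}\,dt. \]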
slide-71
SLIDE 71

Deficit of Log-Concave Measures

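Fact: if $X$ is log-concave, then $\Gamma_t^X \preceq \frac{1}{t}\,\mathrm{Id}$ almost surely.

So, if both $X$ and $Y$ are log-concave,
\[ \delta_{\frac{1}{2}}(X, Y) \gtrsim \mathrm{Tr} \int_0^1 \frac{t \cdot \mathbb{E}\left[ (\Gamma_t^X - \Gamma_t^Y)^2 \right]}{1-t}\,dt. \]

In particular,
\[ \delta_{\frac{1}{2}}(X, G) \gtrsim \mathrm{Tr} \int_0^1 \frac{t \cdot \mathbb{E}\left[ (\Gamma_t^X - \mathrm{Id})^2 \right]}{1-t}\,dt. \]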

slide-74
SLIDE 74

The Entropic Central Limit Theorem

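Let $\{X_i\}$ be i.i.d. copies of $X$ and $S_n = \frac{1}{\sqrt{n}} \sum_{i=1}^n X_i$. Set
\[ H_t = \sqrt{\frac{\sum_{i=1}^n (\Gamma_t^i)^2}{n}}. \]
Then
\[ S_n \stackrel{\mathrm{law}}{=} \int_0^1 H_t\,dB_t. \]

Using this, we show
\[ \mathrm{Ent}(S_n\|G) \le C_X\,\mathrm{Tr} \int_0^1 \frac{\mathbb{E}\left[ (H_t - \mathbb{E}[H_t])^2 \right]}{1-t}\,dt, \]
where $C_X > 0$ depends on $X$. This can be used to prove the entropic central limit theorem.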

slide-78
SLIDE 78

Quantitative Entropic Central Limit Theorem

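For a more quantitative result we have the formula
\[ \mathrm{Ent}(S_n\|G) \le \frac{\mathrm{poly}(C_p(X))}{n}\,\mathrm{Tr} \int_0^1 \frac{\mathbb{E}\left[ \left( \Gamma_t^2 - \mathbb{E}\left[ H_t^2 \right] \right)^2 \right]}{1-t}\,dt = \frac{\mathrm{poly}(C_p(X))}{n}\,\mathrm{Tr} \int_0^1 \frac{\mathrm{Var}(\Gamma_t^2)}{1-t}\,dt, \]
valid for $X$ which satisfies a Poincaré inequality.

For $X$ log-concave, $\Gamma_t \preceq \frac{1}{t}\,\mathrm{Id}$, and
\[ \mathrm{Tr} \int_0^1 \frac{\mathrm{Var}(\Gamma_t^2)}{1-t}\,dt \le \mathrm{Tr} \int_0^1 \frac{1}{t^2}\,\frac{\mathbb{E}\left[ (\Gamma_t - \mathrm{Id})^2 \right]}{1-t}\,dt. \]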

slide-79
SLIDE 79

Thank You