Complete Monotonicity Conjecture of Heat Equation (MDDS, SJTU, 2019)



SLIDE 1

Complete Monotonicity Conjecture of Heat Equation

MDDS, SJTU, 2019
Fan Cheng, Shanghai Jiao Tong University

chengfan@sjtu.edu.cn

How to Solve Gaussian Interference Channel

SLIDE 2

From 2008 to 2019

SLIDE 3

h(X + √t Z)

∂f(y, t)/∂t = (1/2) ∂²f(y, t)/∂y²
h(X) = −∫ f(y) ln f(y) dy
Y_t = X + √t Z,  Z ∼ N(0, 1)

• A new mathematical theory on the Gaussian distribution
• Its application to the Gaussian interference channel
• History, progress, and future

SLIDE 4

Outline

• History of the "Super-H" theorem
• Boltzmann equation, heat equation
• Shannon Entropy Power Inequality
• Complete Monotonicity Conjecture
• How to Solve Gaussian Interference Channel

SLIDE 5

Study of Heat

Heat transfer

• The history begins with the work of Joseph Fourier around 1807
• In a remarkable memoir, Fourier invented both the heat equation and the method of Fourier analysis for its solution

∂f(y, t)/∂t = (1/2) ∂²f(y, t)/∂y²

SLIDE 6

Information Age

Z_t ∼ N(0, t)

Gaussian channel: X and Z_t are mutually independent. The p.d.f. of X is g(x).

Y_t := X + Z_t

The p.d.f. of Y_t is the convolution of the densities of X and Z_t:

f(y; t) = ∫ g(x) (1/√(2πt)) e^{−(y−x)²/(2t)} dx

∂f(y; t)/∂t = (1/2) ∂²f(y; t)/∂y²

The p.d.f. of Y_t is a solution to the heat equation, and vice versa. The Gaussian channel and the heat equation are identical in mathematics.

C. E. Shannon, "A mathematical theory of communication," Bell System Technical Journal, 1948.
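As a quick numerical sanity check (ours, not part of the original slides), the sketch below takes a non-Gaussian input X, uniform on {−1, +1}, and verifies by finite differences that the output density of the Gaussian channel still satisfies the heat equation; all names are illustrative.

```python
import math

def phi(x, var):
    """N(0, var) density."""
    return math.exp(-x * x / (2 * var)) / math.sqrt(2 * math.pi * var)

def f(y, t):
    """Density of Y_t = X + Z_t for X uniform on {-1, +1}, Z_t ~ N(0, t)."""
    return 0.5 * phi(y - 1.0, t) + 0.5 * phi(y + 1.0, t)

# Central finite differences: f_t should equal (1/2) f_yy even though X is
# discrete, because convolving with the Gaussian kernel solves the PDE.
h = 1e-4
y, t = 0.3, 0.8
f_t = (f(y, t + h) - f(y, t - h)) / (2 * h)
f_yy = (f(y + h, t) - 2 * f(y, t) + f(y - h, t)) / (h * h)
print(abs(f_t - 0.5 * f_yy) < 1e-6)  # True
```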

SLIDE 7

Ludwig Boltzmann

Ludwig Eduard Boltzmann (1844-1906), Vienna, Austrian Empire

Boltzmann formula: S = −k_B ln W
Gibbs formula: S = −k_B Σ_j p_j ln p_j
Boltzmann equation: df/dt = (∂f/∂t)_force + (∂f/∂t)_diff + (∂f/∂t)_coll
H-theorem: H(f(t)) is non-decreasing

SLIDE 8

"Super H-theorem" for the Boltzmann Equation

• McKean's problem on the Boltzmann equation (1966): is H(f(t)) CM in t, when f(t) satisfies the Boltzmann equation?
• False: disproved by E. Lieb in the 1970s
• For the particular Bobylev-Krook-Wu explicit solutions, this "theorem" holds true for n ≤ 101 and breaks down afterwards

  • H. P. McKean, NYU. National Academy of Sciences

A function is completely monotone (CM) iff the signs of its derivatives alternate: +, −, +, −, … (e.g., 1/t, e^{−t})
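To make the definition concrete, here is a small SymPy check (our sketch, not from the slides) that the two example functions 1/t and e^{−t} have alternating derivative signs:

```python
import sympy as sp

t = sp.symbols('t', positive=True)

def derivative_signs(g, point=1.5, orders=6):
    """Signs of g, g', g'', ... evaluated at `point`."""
    return [int(sp.sign(sp.diff(g, t, n).evalf(subs={t: point})))
            for n in range(orders)]

# A CM function has signs +, -, +, -, ... for g, g', g'', ...
print(derivative_signs(1 / t))       # [1, -1, 1, -1, 1, -1]
print(derivative_signs(sp.exp(-t)))  # [1, -1, 1, -1, 1, -1]
```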

SLIDE 9

"Super H-theorem" for the Heat Equation

• Heat equation: is H(f(t)) CM in t, if f(t) satisfies the heat equation?
• Equivalently, is h(X + √t Z) CM in t?
• The signs of the first two derivatives were obtained
• McKean failed to obtain the 3rd and 4th (it is easy to compute the derivatives; it is hard to prove their signs)

"This suggests that……, etc., but I could not prove it"

  • H. P. McKean
  • C. Villani, 2010 Fields Medalist
SLIDE 10

Claude E. Shannon and EPI

• Entropy power inequality (Shannon 1948): for any two independent continuous random variables X and Y,

e^{2h(X+Y)} ≥ e^{2h(X)} + e^{2h(Y)}

Equality holds iff X and Y are Gaussian.
• Motivation: Gaussian noise is the worst noise
• Impact: a new characterization of the Gaussian distribution in information theory
• Comments: most profound! (Kolmogorov)

All of the following can be proved by the Entropy Power Inequality (EPI):
• Central limit theorem
• Capacity region of the Gaussian broadcast channel
• Capacity region of the Gaussian Multiple-Input Multiple-Output broadcast channel
• Uncertainty principle
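A hedged numerical illustration (ours): entropy powers are in closed form for Gaussians, which achieve equality, and for uniforms, whose sum is triangular with h(X + Y) = 1/2 nats exactly:

```python
import math

def h_gauss(var):
    """Differential entropy of N(0, var), in nats."""
    return 0.5 * math.log(2 * math.pi * math.e * var)

# Equality case: independent Gaussians X ~ N(0, a), Y ~ N(0, b).
a, b = 1.7, 0.6
lhs = math.exp(2 * h_gauss(a + b))                       # e^{2h(X+Y)}
rhs = math.exp(2 * h_gauss(a)) + math.exp(2 * h_gauss(b))
print(abs(lhs - rhs) < 1e-9)  # True: Gaussians achieve equality

# Strict case: X, Y i.i.d. uniform on [0, 1]: h(X) = h(Y) = 0, and X + Y is
# triangular on [0, 2] with h(X + Y) = 1/2 exactly.
lhs_u = math.exp(2 * 0.5)
rhs_u = math.exp(0) + math.exp(0)
print(lhs_u > rhs_u)          # True: strict inequality for non-Gaussians
```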

SLIDE 11

Entropy Power Inequality

• Shannon himself didn't give a proof but an explanation, which turned out to be wrong
• The first proofs were given by A. J. Stam (1959) and N. M. Blachman (1966)
• Research on the EPI: generalization, new proofs, new connections. E.g., the Gaussian interference channel is open; some stronger "EPI" should exist.
• Stanford information theory school: Thomas Cover and his students A. El Gamal, M. H. Costa, A. Dembo, A. Barron (1980-1990)
• After 2000, Princeton and UC Berkeley

Heart of Shannon theory

SLIDE 12

Ramification of EPI

• Shannon EPI
• Gaussian perturbation: h(X + √t Z)
• Fisher information: J(X + √t Z) = 2 ∂h(X + √t Z)/∂t
• Fisher information is decreasing in t
• e^{2h(X + √t Z)} is concave in t
• Fisher information inequality (FII):

1/J(X+Y) ≥ 1/J(X) + 1/J(Y)

• Tight Young's inequality: ‖f ∗ g‖_r ≥ c ‖f‖_p ‖g‖_q

Status quo: the FII implies the EPI and all its generalizations. Many network information theory problems remain open even when the noise is Gaussian.
  • The EPI alone is not sufficient
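Both identities on this slide can be checked in closed form when the input is Gaussian (a sketch with our own variable names; X ~ N(0, a) gives X + √t Z ~ N(0, a + t)):

```python
import math

a = 2.0  # variance of the Gaussian input X (illustrative value)

def h(t):
    """h(X + sqrt(t) Z) for X ~ N(0, a): entropy of N(0, a + t)."""
    return 0.5 * math.log(2 * math.pi * math.e * (a + t))

def J(t):
    """Fisher information of N(0, a + t)."""
    return 1.0 / (a + t)

# de Bruijn identity: d/dt h(X + sqrt(t) Z) = J(X + sqrt(t) Z) / 2
t, eps = 0.5, 1e-6
dh_dt = (h(t + eps) - h(t - eps)) / (2 * eps)
print(abs(dh_dt - J(t) / 2) < 1e-9)   # True

# FII with equality for independent Gaussians X ~ N(0, a), Y ~ N(0, b):
b = 0.7
Jx, Jy, Jxy = 1 / a, 1 / b, 1 / (a + b)
print(abs(1 / Jxy - (1 / Jx + 1 / Jy)) < 1e-12)  # True
```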
SLIDE 13

Remarks

• Costa's EPI: e^{2h(Y_t)} is concave in t
  • Costa derived the first two derivatives by very involved calculus (1985)
  • The IT society did not know McKean's paper until 2014
• Log-Sobolev inequality
  • A. Dembo gave a very simple proof via the FII (1987)
  • C. Villani simplified Costa's calculus (2006)
• The first two derivatives are not commonly used in network information theory
  • In geometry, mathematicians need the first derivative to estimate the speed of convergence. However, information theorists are not interested.
• Relation with the CLT

SLIDE 14

Where our journey begins

• Shannon entropy power inequality
• Fisher information inequality
• h(X + √t Z)
• Is H(f(t)) CM?
  • When f(t) satisfies the Boltzmann equation: disproved
  • When f(t) satisfies the heat equation: unknown
  • We didn't even know what CM was!

Information theorists got lost over the past 70 years; mathematicians ignored it.
• Raymond introduced this paper to me in 2008
• I made some progress with Chandra Nair in 2011 (MGL)
• Complete monotonicity (CM) was discovered in 2012
• The third derivative in 2013 (key breakthrough)
• The fourth order in 2014
• Recently, CM → GIC

SLIDE 15

Motivation

Motivation: to find some inequalities to obtain a better rate region; e.g., the convexity of h(X + e^{−t} Z), the concavity of J(X + √t Z) in t, etc.

"Any progress?" "Nope…" It is widely believed that there should be no new EPI except the Shannon EPI and the FII.

Observation: J(X + √t Z) is convex in t

J(X + √t Z) = 2 ∂h(X + √t Z)/∂t ≥ 0 (de Bruijn, 1958)
J^{(1)} = ∂J(X + √t Z)/∂t ≤ 0 (McKean 1966, Costa 1985)

Could the sign of the third derivative be determined?

SLIDE 16

Discovery

Observation: J(X + √t Z) is convex in t
• For X = 0 (the pure Gaussian case): h(√t Z) = (1/2) ln 2πet, J(√t Z) = 1/t, and 1/t is CM: +, −, +, −, …
• If the observation is true, the first three derivative signs are: +, −, +
• Q: Is the sign of the 4th derivative −? Because Z is Gaussian! If so, then…
• The signs of the derivatives of h(X + √t Z) are independent of X. Invariant!
• Exactly the same problem as in McKean's 1966 paper
To convince people, we must prove the convexity.

My own opinion:
  • A new fundamental result on the Gaussian distribution
  • Invariants are very important in mathematics
  • In mathematics, the more beautiful, the more powerful
  • Very hard to make any progress
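The observation can be probed numerically (our sketch, not the speaker's code): for X uniform on {−1, +1}, compute J(X + √t Z) by quadrature and check that it is decreasing and convex on a grid of t values:

```python
import math

def phi(x, var):
    return math.exp(-x * x / (2 * var)) / math.sqrt(2 * math.pi * var)

def J(t, lo=-12.0, hi=12.0, n=4000):
    """Fisher information of Y_t = X + sqrt(t) Z for X uniform on {-1, +1}:
    J(t) = integral of f_z^2 / f dz, by the trapezoid rule on a wide grid."""
    dz = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        z = lo + i * dz
        f = 0.5 * phi(z - 1, t) + 0.5 * phi(z + 1, t)
        # closed-form z-derivative of the two-Gaussian mixture
        fz = (0.5 * phi(z - 1, t) * (-(z - 1) / t)
              + 0.5 * phi(z + 1, t) * (-(z + 1) / t))
        w = 0.5 if i in (0, n) else 1.0
        total += w * fz * fz / f * dz
    return total

ts = [0.4 + 0.2 * k for k in range(6)]  # t = 0.4, 0.6, ..., 1.4
Js = [J(t) for t in ts]
decreasing = all(a > b for a, b in zip(Js, Js[1:]))
convex = all(Js[i - 1] - 2 * Js[i] + Js[i + 1] > 0 for i in range(1, len(Js) - 1))
print(decreasing, convex)  # True True
```

The Cramér-Rao bound 1/(1 + t) ≤ J(t) ≤ 1/t gives a quick sanity check on the quadrature.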
SLIDE 17

Challenge

Let X ∼ g(x), and write f_k for ∂^k f(z, t)/∂z^k.

  • h(Y_t) = −∫ f(z, t) ln f(z, t) dz: no closed-form expression except for some special g(x).
  • f(z, t) satisfies the heat equation.
  • J(Y_t) = ∫ f_1^2/f dz
  • J^{(1)}(Y_t) = −∫ f (f_2/f − f_1^2/f^2)^2 dz
  • So what is J^{(2)}? (Heat equation, integration by parts)
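These two integral formulas can be sanity-checked in the one case with closed forms, X ~ N(0, 1) (our sketch; then f(z, t) is the N(0, 1 + t) density, J = 1/(1 + t), and J^{(1)} = −1/(1 + t)^2):

```python
import math

def quadrature(integrand, lo=-14.0, hi=14.0, n=4000):
    dz = (hi - lo) / n
    return sum(integrand(lo + i * dz) * dz for i in range(n))

t = 0.8
var = 1.0 + t  # variance of Y_t = X + sqrt(t) Z for X ~ N(0, 1)
f = lambda z: math.exp(-z * z / (2 * var)) / math.sqrt(2 * math.pi * var)
f1 = lambda z: f(z) * (-z / var)   # df/dz
logf2 = lambda z: -1.0 / var       # (ln f)'' = f_2/f - (f_1/f)^2, constant here

J = quadrature(lambda z: f1(z) ** 2 / f(z))
J1 = -quadrature(lambda z: f(z) * logf2(z) ** 2)

print(abs(J - 1 / var) < 1e-6)        # True: J = 1/(1+t)
print(abs(J1 + 1 / var ** 2) < 1e-6)  # True: J' = -1/(1+t)^2
```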
SLIDE 18

Challenge (cont'd)

It is straightforward to calculate the derivatives; it is not at all obvious how to prove their signs.

SLIDE 19

Breakthrough

Integration by parts: ∫ u dv = uv − ∫ v du

First breakthrough since McKean 1966

SLIDE 20

SLIDE 21

GCMC

Gaussian complete monotonicity conjecture (GCMC): J(X + √t Z) is CM in t

A general form: number partitions. Hard to determine the coefficients.

Conjecture 2: log J(X + √t Z) is convex in t

Hard to find the coefficients γ_{l,k}!

SLIDE 22

Remarks:

  • C. Villani showed the work of H. P. McKean to us.
  • G. Toscani cited our work within two weeks:

 • "the consequences of the evolution of the entropy and of its subsequent derivatives along the solution to the heat equation have important consequences"
 • "indeed the argument of McKean about the signs of the first two derivatives are equivalent to the proof of the logarithmic Sobolev inequality"

Gaussian optimality for derivatives of differential entropy using linear matrix inequalities

  • X. Zhang, V. Anantharam, Y. Geng - Entropy, 2018 - mdpi.com
  • A new method to prove signs by LMI
  • Verified the first four derivatives
  • For the fifth order derivative, current methods cannot find a solution
SLIDE 23

Completely monotone functions

Herbert R. Stahl, 2013

g(t) = ∫₀^∞ e^{−ty} dν(y)

A new expression for entropy, involving special functions in mathematical physics

How to construct ν(y)?
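For one concrete family the measure is explicit (our example, with illustrative names): if X ~ N(0, a), then J(X + √t Z) = 1/(a + t), and 1/(a + t) = ∫₀^∞ e^{−ty} e^{−ay} dy, i.e. dν(y) = e^{−ay} dy. A midpoint-rule check:

```python
import math

a, t = 2.0, 0.5
n, hi = 200000, 60.0  # truncate the integral at y = 60 (tail ~ e^{-150})
dy = hi / n
# midpoint rule for integral_0^hi e^{-(t+a) y} dy
integral = sum(math.exp(-(t + a) * (i + 0.5) * dy) * dy for i in range(n))
print(abs(integral - 1 / (a + t)) < 1e-6)  # True: matches 1/(a + t)
```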

SLIDE 24

Completely monotone functions

Theorem: if a function g(t) is CM in t, then log g(t) is also convex in t
• If J(Y_t) is CM in t, then log J(Y_t) is convex in t (Conjecture 1 implies Conjecture 2)
• From a CM function f(t), a Schur-convex function can be obtained
• Schur-convexity → majorization theory

Remarks: the current tools in information theory don't work. More sophisticated tools must be built to attack this problem. A new mathematical foundation of information theory.
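A quick symbolic check of the theorem's conclusion on a CM example (ours): for g(t) = 1/(1 + t), the second derivative of log g is 1/(1 + t)², which is positive, so log g is convex:

```python
import sympy as sp

t = sp.symbols('t', positive=True)
g = 1 / (1 + t)  # a CM function (Fisher information for a unit-variance Gaussian input)
second = sp.simplify(sp.diff(sp.log(g), t, 2))
print(sp.simplify(second - 1 / (1 + t) ** 2) == 0)  # True: (log g)'' = 1/(1+t)^2
print(second.is_positive)                           # True: log g is convex
```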

SLIDE 25

True vs. False

If GCMC is true:
• A fundamental breakthrough in mathematical physics, information theory, and any discipline related to the Gaussian distribution
• A new expression for Fisher information
• The derivative signs are an invariant
• Though h(X + √t Z) looks very messy, certain regularity exists
• Application: Gaussian interference channel?

If GCMC is false:
• No failure, as the heat equation is a physical phenomenon
• A "Gauss constant" (e.g., 2019), an order at which the Gaussian distribution fails. Painful!

SLIDE 26

Complete Monotonicity: How to Solve Gaussian Interference Channel

• Two fundamental channel coding problems: BC and GIC
• h(aX_1 + bX_2 + Z_1) and h(cX_1 + dX_2 + Z_2) exceed the capability of the EPI
• Han-Kobayashi inner bound
• Many researchers have contributed to this model
• Foundation of wireless communication

SLIDE 27

The Thick Shell over h(X + √t Z)

h(X + √t Z) is hard to estimate:
• The p.d.f. of X + √t Z is messy
• So are f(y) log f(y) and ∫ f(y) log f(y) dy
• No generally useful lower or upper bounds
  • This is the thick shell over h(X + √t Z)

SLIDE 28

Analysis: alternating is the worst

• If the CM property of h(X + √t Z) is not true
  • Take 5 for example: if CM breaks down after n = 5
  • If we just take the 5th derivative, there may be nothing special (so the GIC won't be so hard)
• CM affects the rate region of the GIC
• Prof. Siu, Yum-Tong: "Alternating is the worst thing in analysis, as the integral is hard to converge, though CM is very beautiful"
• It is not strange that the Gaussian distribution is the worst in information theory
• Common viewpoint: information theory is about information inequalities: EPI, MGL, etc.
• CM is a class of inequalities. We should regard it as a whole in application, and pivot our viewpoint from individual inequalities.
SLIDE 29

Information Decomposition

• The lesson learned from complete monotonicity:

J(X + √t Z) = ∫₀^∞ e^{−ty} dν(y)

• Two independent components:
  • e^{−ty} stands for complete monotonicity
  • dν(y) serves as the identity of J(X + √t Z)
• Information decomposition: Fisher information = complete monotonicity + Borel measure
• CM is the thick shell. It can be used for estimates in majorization theory
  • Very useful in analysis and geometry
• dν(y) involves only y; t is removed
  • The thick shell is removed from Fisher information
  • dν(y) is relatively easier to study than Fisher information
  • We know very little about dν(y)
• CM alone is useless for (network) information theory
  • The current constraints on dν(y) are too loose
  • Only the "special one" is useful; otherwise every CM function would have the same meaning in information theory

SLIDE 30

CM && GIC

A fundamental problem should have a nice and clean solution. To understand complete monotonicity is not an easy job (10 years). Top players are ready, but the football is missing…

SLIDE 31

Thanks!