Representation formulae for score functions — Ivan Nourdin, Giovanni Peccati and Yvik Swan — PowerPoint PPT Presentation

SLIDE 1

Representation formulae for score functions

Ivan Nourdin, Giovanni Peccati and Yvik Swan⋆

Département de Mathématique, Université de Liège

July 2, 2014

SLIDE 2

1. Score
2. Stein and Fisher
3. Controlling the relative entropy
4. Key identity
5. Cattywampus Stein’s method
6. Extension
7. Coda

SLIDE 3

Scoooores

SLIDE 4

1. Score
2. Stein and Fisher
3. Controlling the relative entropy
4. Key identity
5. Cattywampus Stein’s method
6. Extension
7. Coda

SLIDE 5

Let X be a centered d-random vector with covariance B > 0.

Definition. The Stein kernel of X is a d × d matrix τX(X) such that E[τX(X)∇ϕ(X)] = E[Xϕ(X)] for all ϕ ∈ C_c^∞(R^d).

Definition. The score of X is the d × 1 vector ρX(X) such that E[ρX(X)ϕ(X)] = −E[∇ϕ(X)] for all ϕ ∈ C_c^∞(R^d).
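The two defining identities are easy to sanity-check by Monte Carlo. A minimal sketch in Python (ours, not from the slides), taking X standard Gaussian in d = 1, where ρX(x) = −x and τX(x) = 1, with the arbitrarily chosen test function ϕ(x) = sin x:

```python
import numpy as np

# Monte Carlo sanity check of the two defining identities in d = 1 for
# X ~ N(0, 1), whose score is rho(x) = -x and Stein kernel is tau(x) = 1.
# Test function: phi(x) = sin(x), an arbitrary smooth choice.
rng = np.random.default_rng(0)
x = rng.standard_normal(2_000_000)

phi = np.sin(x)
dphi = np.cos(x)                      # phi'(x)

# Score identity:  E[rho(X) phi(X)] = -E[phi'(X)]
lhs_score = np.mean(-x * phi)
rhs_score = -np.mean(dphi)

# Stein kernel identity:  E[tau(X) phi'(X)] = E[X phi(X)]
lhs_kernel = np.mean(1.0 * dphi)
rhs_kernel = np.mean(x * phi)

print(lhs_score, rhs_score, lhs_kernel, rhs_kernel)
```

Both pairs agree up to Monte Carlo error; for non-Gaussian X the same identities pin down ρX and τX.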

SLIDE 6

In the Gaussian case Z ∼ Nd(0, C), the Stein identity E[Zϕ(Z)] = E[C∇ϕ(Z)] gives ρZ(Z) = −C⁻¹Z and τZ(Z) = C. Intuitively, a measure of the proximity ρX(X) ≈ −B⁻¹X and τX(X) ≈ B should provide an assessment of “Gaussianity”.

SLIDE 7

Definition. The standardised Fisher information of X is

Jst(X) = B E[ (ρX(X) + B⁻¹X)(ρX(X) + B⁻¹X)ᵀ ].

A simple computation gives Jst(X) = B J(X) − Id with J(X) = E[ρX(X)ρX(X)ᵀ] the Fisher information matrix.

Definition. The Stein discrepancy is

S(X) = E[ ‖τX(X) − B‖²_H.S. ].
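As a worked example (ours, not on the slides): for the standardized uniform law X ∼ U(−√3, √3) the Stein kernel is τX(x) = (3 − x²)/2, and the Stein discrepancy (d = 1, B = 1) evaluates exactly to S(X) = 1/5. A quick numerical confirmation:

```python
import numpy as np

# Illustration: the Stein kernel of the standardized uniform law
# X ~ U(-sqrt(3), sqrt(3)) (mean 0, variance 1) is tau(x) = (3 - x^2)/2,
# and its Stein discrepancy S(X) = E[(tau(X) - 1)^2] equals exactly 1/5.
rng = np.random.default_rng(1)
a = np.sqrt(3.0)
x = rng.uniform(-a, a, 2_000_000)
tau = (3.0 - x**2) / 2.0

# Check the defining identity E[tau(X) phi'(X)] = E[X phi(X)] with phi = sin.
lhs = np.mean(tau * np.cos(x))
rhs = np.mean(x * np.sin(x))

# Stein discrepancy (d = 1, B = 1): S(X) = E[(tau(X) - 1)^2] = 1/5.
S = np.mean((tau - 1.0) ** 2)
print(lhs, rhs, S)
```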
SLIDE 8

Control on Jst(X) and S(X) provides control on several distances (Kullback-Leibler, Kolmogorov, Wasserstein, Hellinger, Total Variation, ...) between the law of X and the Gaussian.

Controlling Jst(X):
  • Johnson and Barron, through careful analysis of the score function (PTRF, 2004)
  • Artstein, Ball, Barthe and Naor, through a “variational tour de force” (PTRF, 2004)

Controlling S(X):
  • Cacoullos, Papathanassiou and Utev (AoP, 1994), in a number of settings
  • Nourdin and Peccati, through their infamous Malliavin/Stein fourth moment theorem (PTRF, 2009)
  • Extension to abstract settings (Ledoux, AoP, 2012)

SLIDE 9

1. Score
2. Stein and Fisher
3. Controlling the relative entropy
4. Key identity
5. Cattywampus Stein’s method
6. Extension
7. Coda

SLIDE 10

Let Z be centered Gaussian with density φ = φd(·; C).

Definition. The relative entropy between X and Z is

D(X ‖ Z) = E[log(f(X)/φ(X))] = ∫_{R^d} f(x) log( f(x)/φ(x) ) dx.

The Pinsker-Csiszár-Kullback inequality yields

2 TV(X, Z) ≤ √(2 D(X ‖ Z)).

In other words, D(X ‖ Z) ⇒ TV(X, Z)².
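A numerical illustration of the inequality (the example and values are ours, not from the slides), comparing a Gaussian with inflated variance to the standard Gaussian, where D is in closed form and TV is computed by direct integration:

```python
import numpy as np
from math import log, sqrt

# Numerical check of the Pinsker-Csiszar-Kullback inequality
#   2 TV(X, Z) <= sqrt(2 D(X || Z))
# for X ~ N(0, s^2) against Z ~ N(0, 1).
s = 1.5

# D(N(0, s^2) || N(0, 1)) = (s^2 - 1 - log s^2) / 2  (closed form).
D = (s**2 - 1.0 - log(s**2)) / 2.0

# TV(X, Z) = (1/2) * integral of |f - phi|, computed on a fine grid.
x = np.linspace(-12.0, 12.0, 400_001)
dx = x[1] - x[0]
f = np.exp(-x**2 / (2.0 * s**2)) / (s * sqrt(2.0 * np.pi))
phi = np.exp(-x**2 / 2.0) / sqrt(2.0 * np.pi)
TV = 0.5 * np.sum(np.abs(f - phi)) * dx

print(2.0 * TV, sqrt(2.0 * D))
```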

SLIDE 11

Usefulness of Jst(X) can be seen via the de Bruijn identity. Let Xt = √t X + √(1−t) Z and Γt = tB + (1−t)C. Then

D(X ‖ Z) = ∫_0^1 (1/(2t)) tr( C Γt⁻¹ Jst(Xt) ) dt + (1/2)( tr(C⁻¹B) − d ) + ∫_0^1 (1/(2t)) tr( C Γt⁻¹ − Id ) dt.

In other words, Jst(Xt) ⇒ D(X ‖ Z) ⇒ TV(X, Z)².
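The identity can be sanity-checked in the pure Gaussian case X ∼ N(0, B) with d = 1 and C = 1 (a sketch of ours, not from the slides): there Jst(Xt) = 0, only the deterministic terms survive, and the result must match the closed form D(X ‖ Z) = (B − 1 − log B)/2:

```python
import numpy as np

# Sanity check of the de Bruijn identity (d = 1, C = 1) for X ~ N(0, B):
# then Jst(Xt) = 0 and Gamma_t = t*B + (1 - t), so the identity reduces to
#   D(X || Z) = (B - 1)/2 + int_0^1 (1/(2t)) (1/Gamma_t - 1) dt,
# which must equal the closed form (B - 1 - log B)/2.
B = 2.0
t = np.linspace(1e-9, 1.0, 2_000_001)
Gamma = t * B + (1.0 - t)

# (1/(2t)) (1/Gamma_t - 1) simplifies to (1 - B)/(2 Gamma_t): finite at t = 0.
integrand = (1.0 - B) / (2.0 * Gamma)
dt = t[1] - t[0]
integral = np.sum(integrand) * dt

D_debruijn = (B - 1.0) / 2.0 + integral
D_exact = (B - 1.0 - np.log(B)) / 2.0
print(D_debruijn, D_exact)
```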

SLIDE 12

Usefulness of S(X) can be seen via Stein’s method. Fix d = 1. Then, given h : R → R such that ‖h‖∞ ≤ 1, seek gh solving the Stein equation to get

E[h(X)] − E[h(Z)] = E[ gh′(X) − X gh(X) ] = E[ (1 − τX(X)) gh′(X) ]

so that

TV(X, Z) = (1/2) sup_{‖h‖∞ ≤ 1} |E[h(X)] − E[h(Z)]| ≤ (1/2) sup_{‖h‖∞ ≤ 1} ‖gh′‖∞ √S(X).

In other words, S(X) ⇒ TV(X, Z)².

SLIDE 13

If h is not smooth, there is no way of obtaining sufficiently precise estimates on the quantity “∇gh” in dimension greater than 1. For the moment, Stein’s method only works in dimension 1 for the total variation distance. The IT approach via de Bruijn’s identity does not suffer from this “dimensionality issue”. We aim to mix the Stein’s method approach and the IT approach. To this end we need one final ingredient: a representation formula for the score in terms of the Stein kernel.

SLIDE 14

1. Score
2. Stein and Fisher
3. Controlling the relative entropy
4. Key identity
5. Cattywampus Stein’s method
6. Extension
7. Coda

SLIDE 15

Theorem. Let Xt = √t X + √(1−t) Z with X and Z independent. Then

ρt(Xt) + C⁻¹Xt = − (t/√(1−t)) E[ (Id − C⁻¹τX(X)) Z | Xt ]    (1)

for all 0 < t < 1.

Proof when d = 1 and C = 1. For any test function ϕ,

E[ E[(1 − τX(X))Z | Xt] ϕ(Xt) ] = E[(1 − τX(X)) Z ϕ(Xt)]
= √(1−t) E[ϕ′(Xt)] − √(1−t) E[τX(X) ϕ′(Xt)]
= √(1−t) E[ϕ′(Xt)] − (√(1−t)/√t) E[X ϕ(Xt)]
= √(1−t) E[ϕ′(Xt)] − (√(1−t)/t) E[Xt ϕ(Xt)] + ((1−t)/t) E[Z ϕ(Xt)]
= √(1−t) E[ϕ′(Xt)] − (√(1−t)/t) E[Xt ϕ(Xt)] + ((1−t)/t) √(1−t) E[ϕ′(Xt)]
= (√(1−t)/t) ( E[ϕ′(Xt)] − E[Xt ϕ(Xt)] )
= −(√(1−t)/t) E[ (ρt(Xt) + Xt) ϕ(Xt) ].
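Identity (1) can be tested by Monte Carlo without estimating any conditional expectation: integrating both sides against a test function ϕ(Xt) turns it into an identity between plain expectations. A sketch in d = 1, C = 1 (the standardized-uniform X, whose kernel τX(x) = (3 − x²)/2 is explicit, and ϕ = sin are our choices, not from the slides):

```python
import numpy as np

# Monte Carlo check of identity (1) in d = 1, C = 1.  Tested against
# phi(x) = sin(x), (1) is equivalent to the unconditional identity
#   E[(1 - tau(X)) Z phi(Xt)]
#     = (sqrt(1-t)/t) * (E[phi'(Xt)] - E[Xt phi(Xt)]).
# X is standardized uniform, with Stein kernel tau(x) = (3 - x^2)/2.
rng = np.random.default_rng(2)
n = 2_000_000
a = np.sqrt(3.0)
X = rng.uniform(-a, a, n)
Z = rng.standard_normal(n)
t = 0.7
Xt = np.sqrt(t) * X + np.sqrt(1.0 - t) * Z
tau = (3.0 - X**2) / 2.0

lhs = np.mean((1.0 - tau) * Z * np.sin(Xt))
rhs = (np.sqrt(1.0 - t) / t) * (np.mean(np.cos(Xt)) - np.mean(Xt * np.sin(Xt)))
print(lhs, rhs)
```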
SLIDE 16

This formula provides a nearly one-line argument. Define

∆(X, t) = E[ (Id − C⁻¹τX(X)) Z | Xt ].

Take d = 1 and all variances equal to 1. Then

Jst(Xt) = E[ (ρt(Xt) + Xt)² ] = (t²/(1−t)) E[∆(X, t)²]

so that

D(X ‖ Z) = (1/2) ∫_0^1 (t/(1−t)) E[∆(X, t)²] dt.

Also, by the conditional Jensen inequality and independence of Z,

E[∆(X, t)²] ≤ E[(1 − τX(X))²] = S(X).

SLIDE 17

This yields D(X ‖ Z) ≤ (1/2) S(X) ∫_0^1 t/(1−t) dt, which is useless, since the integral diverges at t = 1. There is hope, nevertheless: ∫_0^1 t/(1−t) dt is barely infinite (the divergence is only logarithmic).

SLIDE 18

Recall Xt = √t X + √(1−t) Z. Then ∆(X, t) = E[(1 − τX(X))Z | Xt] is such that ∆(X, 0) = ∆(X, 1) = 0 a.s. Hence we need to identify conditions under which

(t/(1−t)) E[∆(X, t)²]

is integrable at t = 1.

SLIDE 19

The behaviour of ∆(X, t) around t ≈ 1 is central to the understanding of the law of X. The behaviour of E[∆(X, t)²] at t ≈ 1 is closely connected to the so-called MMSE dimension studied by the IT community. This quantity revolves around the remarkable “MMSE formula”

(d/dr) I(X; √r X + Z) = (1/2) E[ (X − E[X | √r X + Z])² ]

due to Guo, Shamai and Verdú (IEEE, 2005). The connection is explicitly stated in NPSb (IEEE, 2014).
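In the Gaussian case X ∼ N(0, v) both sides of the MMSE formula are in closed form, which gives a quick consistency check (the numerical values below are arbitrary choices of ours):

```python
import numpy as np

# Check of the Guo-Shamai-Verdu I-MMSE formula for X ~ N(0, v):
#   I(X; sqrt(r) X + Z) = (1/2) log(1 + r v),
#   mmse(r) = E[(X - E[X | sqrt(r) X + Z])^2] = v / (1 + r v),
# so that dI/dr = (1/2) mmse(r).  We compare a central finite difference
# of I with (1/2) mmse at r = 1.3.
v = 2.0
r = 1.3
h = 1e-6

def mutual_info(r_):
    return 0.5 * np.log(1.0 + r_ * v)

dI_dr = (mutual_info(r + h) - mutual_info(r - h)) / (2.0 * h)
mmse = v / (1.0 + r * v)
print(dI_dr, 0.5 * mmse)
```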

SLIDE 20

1. Score
2. Stein and Fisher
3. Controlling the relative entropy
4. Key identity
5. Cattywampus Stein’s method
6. Extension
7. Coda

SLIDE 21

In NPSa (JFA, 2014) we suggest the following IT alternative to Stein’s method. First cut the integral:

2 D(X ‖ Z) ≤ E[(1 − τX(X))²] ∫_0^{1−ε} t/(1−t) dt + ∫_{1−ε}^1 (t/(1−t)) E[∆(X, t)²] dt
          ≤ E[(1 − τX(X))²] |log ε| + ∫_{1−ε}^1 (t/(1−t)) E[∆(X, t)²] dt.

Next suppose that when t is close to 1 we have

E[∆(X, t)²] ≤ Cκ t⁻¹ (1−t)^κ    (2)

for some κ > 0.

SLIDE 22

We deduce

2 D(X ‖ Z) ≤ S(X) |log ε| + Cκ ∫_{1−ε}^1 (1−t)^{−1+κ} dt = S(X) |log ε| + (Cκ/κ) ε^κ.

The optimal choice is ε = E[(1 − τX(X))²]^{1/κ} = S(X)^{1/κ}, which leads to

D(X ‖ Z) ≤ (1/(2κ)) S(X) |log S(X)| + (Cκ/(2κ)) S(X),

which provides a bound on the total variation distance in terms of S(X) of the correct order, up to a logarithmic factor.
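The computation behind this choice of ε can be verified numerically (the constants below are arbitrary values of ours): plugging ε = S^{1/κ} into f(ε) = S|log ε| + (Cκ/κ)ε^κ reproduces the stated bound exactly, and a grid search confirms the choice is within a constant factor of the true minimum.

```python
import numpy as np

# Check that eps = S^(1/kappa) is near-optimal for the bound
#   2 D <= f(eps) = S |log eps| + (C / kappa) * eps^kappa,
# and that it yields f(eps) = (1/kappa) S |log S| + (C / kappa) S.
S, kappa, C = 1e-3, 0.5, 2.0

def f(eps):
    return S * abs(np.log(eps)) + (C / kappa) * eps**kappa

eps_star = S ** (1.0 / kappa)
claimed = (1.0 / kappa) * S * abs(np.log(S)) + (C / kappa) * S

# Grid search over eps in [1e-12, 1] (log-spaced) for the true minimum.
grid = np.exp(np.linspace(np.log(1e-12), 0.0, 200_001))
f_min = f(grid).min()
print(f(eps_star), claimed, f_min)
```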

SLIDE 23

Under what conditions do we have (2)? It is relatively easy to show (via Hölder’s inequality) that

E[|τX(X)|^{2+η}] < ∞ and E[|∆(X, t)|] ≤ c t⁻¹ (1−t)^δ    (3)

together imply (2). It now remains to identify under which conditions we have (3).

Lemma (Poly’s first lemma). Let X be an integrable random variable and let Y be an R^d-valued random vector having an absolutely continuous distribution. Then

E|E[X | Y]| = sup E[X g(Y)],

where the supremum is taken over all g ∈ C¹_c such that ‖g‖∞ ≤ 1.
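A Monte Carlo illustration of the lemma for a jointly Gaussian pair (our example, not from the slides), where E[X | Y] = ρY is explicit and the supremum is approached by g ≈ sign, the pointwise limit of admissible g with ‖g‖∞ ≤ 1:

```python
import numpy as np

# Poly's first lemma for a jointly Gaussian pair (X, Y) with
# correlation rho: E[X | Y] = rho * Y, so
#   E|E[X | Y]| = rho * E|Y| = rho * sqrt(2/pi),
# and E[X g(Y)] with g = sign(.) attains this value.
rng = np.random.default_rng(3)
n = 4_000_000
rho = 0.6
Y = rng.standard_normal(n)
X = rho * Y + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)

target = rho * np.sqrt(2.0 / np.pi)       # E|E[X | Y]| in closed form
attained = np.mean(X * np.sign(Y))        # E[X g(Y)] with g = sign
print(target, attained)
```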

SLIDE 24

Thus E|E[Z(1 − τX(X)) | Xt]| = sup E[Z(1 − τX(X)) g(Xt)]. Now choose g ∈ C¹_c such that ‖g‖∞ ≤ 1. Then

E[Z(1 − τX(X)) g(Xt)] = E[Z g(Xt)] − E[Z g(Xt) τX(X)]
= E[Z g(Xt)] − √(1−t) E[τX(X) g′(Xt)]
= E[Z (g(Xt) − g(X))] − (√(1−t)/√t) E[X g(Xt)]

and thus

|E[Z(1 − τX(X)) g(Xt)]| ≤ |E[Z (g(Xt) − g(X))]| + t⁻¹ √(1−t).

SLIDE 25

Also

sup |E[Z (g(Xt) − g(X))]| = sup | ∫_R x E[ g(√t X + √(1−t) x) − g(X) ] φ1(x) dx |
≤ 2 ∫_R |x| TV( √t X + √(1−t) x, X ) φ1(x) dx.

Wrapping up, we get

E|E[Z(1 − τX(X)) | Xt]| ≤ 2 E[ |Z| TV( √t X + √(1−t) Z, X ) ] + t⁻¹ √(1−t).

It therefore all boils down to a condition on TV( √t X + √(1−t) x, X ).

SLIDE 26

Recall that we want

E|E[Z(1 − τX(X)) | Xt]| ≤ c t⁻¹ (1−t)^δ.    (3)

As it turns out, in view of previous results, a sufficient condition for (3) is

TV( √t X + √(1−t) x, X ) ≤ κ (1 + |x|) t⁻¹ (1−t)^α.

This condition (and its multivariate extension) is satisfied by a wide family of random vectors, including those to which the fourth moment bound S(X) ≤ c(E[X⁴] − 3) applies.

SLIDE 27

Theorem (Entropic CLTs on Wiener chaos). Let d ≥ 1 and q1, ..., qd ≥ 1 be fixed integers. Consider vectors Fn = (F1,n, ..., Fd,n) = (Iq1(h1,n), ..., Iqd(hd,n)), n ≥ 1, with hi,n ∈ H^{⊙qi}. Let Cn denote the covariance matrix of Fn and let Zn ∼ Nd(0, Cn) be a centered Gaussian random vector in R^d with the same covariance matrix as Fn. Let ∆n := E[‖Fn‖⁴] − E[‖Zn‖⁴], and assume that Cn → C > 0 and ∆n → 0 as n → ∞. Then the random vector Fn admits a density for n large enough, and

D(Fn ‖ Zn) = O(1) ∆n |log ∆n| as n → ∞,    (4)

where O(1) indicates a bounded numerical sequence depending on d, q1, ..., qd, as well as on the sequence {Fn}.
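The fourth moment phenomenon behind the theorem can be illustrated with a discrete stand-in for a second-chaos element (a sketch of ours; this is an analogue, not the theorem’s actual Wiener-chaos setting):

```python
import numpy as np

# Discrete analogue of a second Wiener chaos:
#   F_n = (1/sqrt(2n)) * sum_{i<=n} (xi_i^2 - 1),  xi_i i.i.d. N(0, 1),
# has mean 0, variance 1 and E[F_n^4] - 3 = 12/n -> 0, so the fourth
# moment criterion applies and F_n is asymptotically Gaussian.
rng = np.random.default_rng(4)
n, m = 20, 200_000                     # chaos "size" and Monte Carlo sample
xi = rng.standard_normal((m, n))
F = (xi**2 - 1.0).sum(axis=1) / np.sqrt(2.0 * n)

second = np.mean(F**2)                 # should be close to 1
fourth_excess = np.mean(F**4) - 3.0    # should be close to 12/n = 0.6
print(second, fourth_excess)
```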

SLIDE 28

1. Score
2. Stein and Fisher
3. Controlling the relative entropy
4. Key identity
5. Cattywampus Stein’s method
6. Extension
7. Coda

SLIDE 29

Let Xi, i = 1, ..., n, be independent random vectors with Stein kernels τi(Xi) and score functions ρi(Xi). For all t = (t1, ..., tn) ∈ [0, 1]^n such that Σ_{i=1}^n ti = 1, define

Wt = Σ_{i=1}^n √ti Xi

and denote by Γt the corresponding covariance matrix. Then

ρt(Wt) + Γt⁻¹Wt = Σ_{i=1}^n (ti/√t_{i+1}) E[ (Id − Γt⁻¹ τi(Xi)) ρ_{i+1}(X_{i+1}) | Wt ],

where we identify X_{n+1} = X1 and t_{n+1} = t1.

SLIDE 30

Lemma (Poly’s second lemma). Let X and Y be square-integrable random variables with E[X] = 0. Then

E[ (E[X | Y])² ] = sup_{ϕ ∈ H(Y)} ( E[X ϕ(Y)] )²,

where the supremum is taken over the collection H(Y) of functions ϕ such that E[ϕ(Y)] = 0 and E[ϕ(Y)²] ≤ 1.

Theorem. Let Wn = (1/√n) Σ_{i=1}^n Xi, where the Xi are independent random variables with Stein kernels τi(Xi) and score functions ρi(Xi). Then

Jst(Wn) = sup_{ϕ ∈ H(Wn)} ( E[ ϕ′(Wn) − Wn ϕ(Wn) ] )².

SLIDE 31

There seem to be many applications of this last formula. For instance, the difference Jst(W_{n+1}) − Jst(Wn) can be studied in quite some detail. We had hoped to obtain the “entropy jump inequality” as well as the “monotonicity of entropy”. There is, however, some work left before we can say hooray.

SLIDE 32

1. Score
2. Stein and Fisher
3. Controlling the relative entropy
4. Key identity
5. Cattywampus Stein’s method
6. Extension
7. Coda

SLIDE 33

Just a final word to say thank you to Janna, Jay and Larry for the great conference.

SLIDE 34
SLIDE 35

The key is a generalisation of the Carbery-Wright inequality: there is a universal constant c > 0 such that, for any polynomial Q : R^n → R of degree at most d and any α > 0, we have

E[ Q(X1, ..., Xn)² ]^{1/(2d)} P( |Q(X1, ..., Xn)| ≤ α ) ≤ c d α^{1/d},

where X1, ..., Xn are independent random variables with common distribution N(0, 1).
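A Monte Carlo look at the inequality for the degree-2 polynomial Q(X1, X2) = X1 X2 (our example, not from the slides): with E[Q²] = 1 and d = 2, the small-ball probability P(|Q| ≤ α) should be O(α^{1/2}), so the ratio P(|Q| ≤ α)/√α stays bounded as α decreases.

```python
import numpy as np

# Small-ball probabilities for Q(X1, X2) = X1 * X2 with X1, X2 ~ N(0, 1)
# independent (so E[Q^2] = 1, degree d = 2).  Carbery-Wright predicts
# P(|Q| <= alpha) <= c * d * alpha^(1/d) = O(sqrt(alpha)).
rng = np.random.default_rng(5)
m = 4_000_000
Q = rng.standard_normal(m) * rng.standard_normal(m)

ratios = []
for alpha in (0.1, 0.01):
    p = np.mean(np.abs(Q) <= alpha)
    ratios.append(p / np.sqrt(alpha))
print(ratios)
```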

SLIDE 36

Explicit conditions: fix d, q1, ..., qd ≥ 1.

1. Let F = (F1, ..., Fd) be a random vector such that Fi = Iqi(hi) with hi ∈ H^{⊙qi}.
2. Set N = 2d(q − 1) with q = max_{1≤i≤d} qi.
3. Let C be the covariance matrix of F.

Let Γ = Γ(F) denote the Malliavin matrix of F, and assume that E[det Γ] > 0 (which is equivalent to assuming that F has a density). There exists a constant c_{q,d,‖C‖H.S.} > 0 (depending only on q, d and ‖C‖H.S., with a continuous dependence in the last parameter) such that, for any x ∈ R^d and t ∈ [1/2, 1],

TV( √t F + √(1−t) x, F ) ≤ c_{q,d,‖C‖H.S.} ( β^{−1/(N+1)} ∧ 1 ) (1 + ‖x‖1) (1−t)^{1/(2(2N+4)(d+1)+2)}.

SLIDE 37

Theorem (Entropic fourth moment bound). Let Fn = (F1,n, ..., Fd,n) be a sequence of d-dimensional random vectors such that: (i) Fi,n belongs to the qi-th Wiener chaos of G, with 1 ≤ q1 ≤ q2 ≤ ... ≤ qd; (ii) each Fi,n has variance 1; (iii) E[Fi,n Fj,n] = 0 for i ≠ j; and (iv) the law of Fn admits a density fn on R^d. Write

∆n := ∫_{R^d} ‖x‖⁴ ( fn(x) − φd(x) ) dx,

where ‖·‖ stands for the Euclidean norm, and assume that ∆n → 0 as n → ∞. Then

∫_{R^d} fn(x) log( fn(x)/φd(x) ) dx = O(1) ∆n |log ∆n|,    (5)

where O(1) stands for a bounded numerical sequence depending on d, q1, ..., qd and on the sequence {Fn}.

SLIDE 38

Corollary. Let d ≥ 1 and q1, ..., qd ≥ 1 be fixed integers. Consider vectors Fn = (F1,n, ..., Fd,n) = (Iq1(h1,n), ..., Iqd(hd,n)), n ≥ 1, with hi,n ∈ H^{⊙qi}. Let Cn denote the covariance matrix of Fn and let Zn ∼ Nd(0, Cn) be a centered Gaussian random vector in R^d with the same covariance matrix as Fn. Assume that Cn → C > 0. Then the following three assertions are equivalent, as n → ∞: (i) ∆n → 0; (ii) Fn converges in distribution to Z ∼ Nd(0, C); (iii) D(Fn ‖ Zn) → 0.