Gaussian Model Selection with Unknown Variance Y. Baraud, C. Giraud - PowerPoint PPT Presentation

Gaussian Model Selection with Unknown Variance

Y. Baraud, C. Giraud and S. Huet
Université de Nice - Sophia Antipolis, INRA Jouy en Josas
Luminy, 13-17 November 2006

The statistical setting

The statistical model

Observations: $Y_i = \mu_i + \sigma \varepsilon_i$, $i = 1, \dots, n$
• $\mu = (\mu_1, \dots, \mu_n)' \in \mathbb{R}^n$ and $\sigma > 0$ are unknown
• $\varepsilon_1, \dots, \varepsilon_n$ are i.i.d. standard Gaussian

Collection of models / estimators
• $\mathcal{S} = \{S_m,\ m \in \mathcal{M}\}$ is a countable collection of linear subspaces of $\mathbb{R}^n$ (models)
• $\hat\mu_m$ = least-squares estimator of $\mu$ on $S_m$

Example: change-point detection

• $\mu_i = f(x_i)$ with $f : [0,1] \to \mathbb{R}$ piecewise constant.
• $\mathcal{M}$ is the set of increasing sequences $m = (t_0, \dots, t_q)$ with $q \in \{1, \dots, p\}$, $t_0 = 0$, $t_q = 1$, and $\{t_1, \dots, t_{q-1}\} \subset \{x_1, \dots, x_n\}$.
• Models: $S_m = \{(g(x_1), \dots, g(x_n))',\ g \in \mathcal{F}_m\}$, where
$$\mathcal{F}_{(t_0, \dots, t_q)} = \Big\{ g = \sum_{j=1}^{q} a_j \mathbf{1}_{[t_{j-1}, t_j[} \ \text{ with } (a_1, \dots, a_q) \in \mathbb{R}^q \Big\}.$$
• No residual squares to estimate the variance.
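For a change-point model, the least-squares estimator has a simple form: on each interval $[t_{j-1}, t_j[$ it is the average of the observations whose design point falls in that interval. The sketch below illustrates this under stated assumptions; the function names, the toy data, and the convention of assigning a design point equal to $t_q = 1$ to the last interval are illustrative choices, not part of the slides.

```python
from statistics import fmean

def segment_index(xi, breakpoints):
    """Index j-1 of the interval [t_{j-1}, t_j[ that contains xi."""
    q = len(breakpoints) - 1
    for j in range(1, q + 1):
        if breakpoints[j - 1] <= xi < breakpoints[j]:
            return j - 1
    # Assumption: a point equal to t_q = 1 is assigned to the last interval.
    return q - 1

def fit_piecewise_constant(x, y, breakpoints):
    """Least-squares fit on S_m for m = breakpoints = (t_0, ..., t_q).

    Returns the fitted values and the residual sum of squares ||Y - mu_hat_m||^2.
    """
    groups = {}
    for xi, yi in zip(x, y):
        groups.setdefault(segment_index(xi, breakpoints), []).append(yi)
    # Projection onto piecewise-constant vectors = per-segment averages.
    means = {j: fmean(vals) for j, vals in groups.items()}
    fitted = [means[segment_index(xi, breakpoints)] for xi in x]
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return fitted, rss

# Toy data (hypothetical): one change point at 0.6.
x = [0.1, 0.2, 0.3, 0.6, 0.7, 0.8]
y = [1.0, 1.2, 0.8, 3.1, 2.9, 3.0]
fitted, rss = fit_piecewise_constant(x, y, (0.0, 0.6, 1.0))
```

The residual sum of squares returned here is the quantity $\|Y - \hat\mu_m\|^2$ that enters the selection criteria.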

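Risk on a single model

Euclidean risk on $S_m$:
$$\mathbb{E}\big[\|\mu - \hat\mu_m\|^2\big] = \underbrace{\|\mu - \mu_m\|^2}_{\text{bias}} + \underbrace{D_m \sigma^2}_{\text{variance}},$$
where $\mu_m$ is the orthogonal projection of $\mu$ onto $S_m$ and $D_m = \dim(S_m)$.

Ideal: estimate $\mu$ with $\hat\mu_{m^*}$, where $m^*$ minimizes $m \mapsto \mathbb{E}\big[\|\mu - \hat\mu_m\|^2\big]$.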

Model selection

Selection rule: we set $D_m = \dim(S_m)$ and select $\hat m$ minimizing
$$\mathrm{Crit}_L(m) = \|Y - \hat\mu_m\|^2 \Big(1 + \frac{\mathrm{pen}(m)}{n - D_m}\Big) \qquad (1)$$
or
$$\mathrm{Crit}_K(m) = \frac{n}{2} \log\Big(\frac{\|Y - \hat\mu_m\|^2}{n}\Big) + \frac{1}{2}\,\mathrm{pen}'(m). \qquad (2)$$

Some classical penalties:
• FPE: $\mathrm{pen}(m) = 2 D_m$
• AIC: $\mathrm{pen}'(m) = 2 D_m$
• BIC: $\mathrm{pen}'(m) = D_m \log n$
• AMDL: $\mathrm{pen}'(m) = 3 D_m \log n$
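The two criteria and the four classical penalties translate directly into code. The following is a minimal sketch, assuming each model is summarized by its residual sum of squares $\|Y - \hat\mu_m\|^2$ and its dimension $D_m$; the function names and the numerical values are hypothetical and only serve as an illustration.

```python
import math

def crit_L(rss, n, D, pen):
    """Criterion (1): ||Y - mu_hat_m||^2 * (1 + pen(m) / (n - D_m))."""
    return rss * (1.0 + pen / (n - D))

def crit_K(rss, n, pen_prime):
    """Criterion (2): (n/2) log(||Y - mu_hat_m||^2 / n) + pen'(m) / 2."""
    return 0.5 * n * math.log(rss / n) + 0.5 * pen_prime

def select(models, n, rule):
    """models maps a label m to (RSS_m, D_m); returns the label minimizing the criterion."""
    def score(m):
        rss, D = models[m]
        if rule == "FPE":  # FPE is stated with pen in criterion (1)
            return crit_L(rss, n, D, 2 * D)
        pen_prime = {
            "AIC": 2 * D,
            "BIC": D * math.log(n),
            "AMDL": 3 * D * math.log(n),
        }[rule]
        return crit_K(rss, n, pen_prime)
    return min(models, key=score)

# Hypothetical residual sums of squares for three nested models, n = 50.
models = {"m1": (120.0, 1), "m2": (60.0, 2), "m3": (55.0, 5)}
chosen = {rule: select(models, 50, rule) for rule in ("FPE", "AIC", "BIC", "AMDL")}
```

On these toy numbers all four rules pick the two-dimensional model, since moving to dimension 5 barely lowers the residual sum of squares.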

Objectives

• For classical criteria: to analyze the Euclidean risk of $\hat\mu_{\hat m}$ with regard to the complexity of the family of models $\mathcal{S}$, and to compare this risk to
$$\inf_{m \in \mathcal{M}} \mathbb{E}\big[\|\mu - \hat\mu_m\|^2\big].$$
• To propose penalties versatile enough to take into account the complexity of $\mathcal{S}$ and the sample size.

Complexity: we say that $\mathcal{S}$ has an index of complexity $(M, a)$ if, for all $D \ge 1$,
$$\mathrm{card}\{m \in \mathcal{M},\ D_m = D\} \le M e^{aD}.$$

Theorem 1: Performances of classical penalties

Let $K > 1$ and $\mathcal{S}$ with complexity $(M, a) \in \mathbb{R}_+^2$. If for all $m \in \mathcal{M}$,
$$D_m \le D_{\max}(K, M, a) \ \text{(explicit)} \quad \text{and} \quad \mathrm{pen}(m) \ge K^2 \phi^{-1}(a) D_m,$$
with $\phi(x) = (x - 1 - \log x)/2$ for $x \ge 1$, then
$$\mathbb{E}\big[\|\mu - \hat\mu_{\hat m}\|^2\big] \le \frac{K}{K-1} \inf_{m \in \mathcal{M}} \Big\{ \|\mu - \mu_m\|^2 \Big(1 + \frac{\mathrm{pen}(m)}{n - D_m}\Big) + \mathrm{pen}(m)\,\sigma^2 \Big\} + R$$
where
$$R = \frac{K \sigma^2}{K-1} \Big[ K^2 \phi^{-1}(a) + 2K + \frac{8 K M e^{-a}}{\big(e^{\phi(K)/2} - 1\big)^2} \Big].$$

Performances of $\hat\mu_{\hat m}$

• Under the above hypotheses, if $\mathrm{pen}(m) = K \phi^{-1}(a) D_m$ with $K > 1$, then
$$\mathbb{E}\big[\|\mu - \hat\mu_{\hat m}\|^2\big] \le c(K, M)\, \phi^{-1}(a) \Big[ \inf_{m \in \mathcal{M}} \mathbb{E}\big[\|\mu - \hat\mu_m\|^2\big] + \sigma^2 \Big].$$
• The condition "$\mathrm{pen}(m) \ge K^2 \phi^{-1}(a) D_m$ with $K > 1$" is sharp (at least when $a = 0$ and $a = \log n$).

Roughly, for large values of $n$ this imposes the following restrictions on the complexity:
• FPE: $a < 0.15$
• AIC: $a < 0.15$
• BIC: $a < \frac{1}{2} \log(n)$
• AMDL: $a < \frac{3}{2} \log(n)$
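The function $\phi$ is increasing on $[1, +\infty[$ with $\phi(1) = 0$, so $\phi^{-1}$ can be evaluated numerically by bisection. The sketch below does this with hypothetical helper names. One way to read the 0.15 threshold, offered here as an interpretation rather than a statement from the slides: with a penalty of $2 D_m$ and $K$ close to 1, the condition $\mathrm{pen}(m) \ge K^2 \phi^{-1}(a) D_m$ requires $\phi^{-1}(a) < 2$, that is, $a < \phi(2) \approx 0.153$.

```python
import math

def phi(x):
    """phi(x) = (x - 1 - log x) / 2, defined for x >= 1."""
    return (x - 1.0 - math.log(x)) / 2.0

def phi_inv(a, tol=1e-12):
    """Inverse of phi on [1, +inf[ by bisection (phi is increasing there)."""
    lo, hi = 1.0, 2.0
    while phi(hi) < a:          # enlarge the bracket until phi(hi) >= a
        hi *= 2.0
    while hi - lo > tol * hi:   # halve the bracket until it is tight
        mid = 0.5 * (lo + hi)
        if phi(mid) < a:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Threshold on a such that phi_inv(a) stays below 2 (penalty 2 * D_m, K close to 1).
a_threshold = phi(2.0)
```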

Dkhi function

For $x \ge 0$, we define
$$\mathrm{Dkhi}[D, N, x] = \frac{1}{\mathbb{E}(X_D)} \times \mathbb{E}\Big[\Big(X_D - x \frac{X_N}{N}\Big)_+\Big] \in\ ]0, 1],$$
where $X_D$ and $X_N$ are two independent $\chi^2(D)$ and $\chi^2(N)$ random variables.

Computation: $x \mapsto \mathrm{Dkhi}[D, N, x]$ is decreasing and
$$\mathrm{Dkhi}[D, N, x] = \mathbb{P}\Big(F_{D+2, N} \ge \frac{x}{D+2}\Big) - \frac{x}{D}\, \mathbb{P}\Big(F_{D, N+2} \ge \frac{(N+2)\,x}{D N}\Big),$$
where $F_{D,N}$ is a Fisher random variable with $D$ and $N$ degrees of freedom.
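The closed form in terms of Fisher variables can be checked with the size-biasing identity $\mathbb{E}[X_D\, g(X_D)] = D\, \mathbb{E}[g(X_{D+2})]$ for chi-square variables. The derivation below is a reconstruction, not taken from the slides; $X_{D+2}$ and $X_{N+2}$ denote auxiliary independent chi-square variables with the indicated degrees of freedom, and $F_{a,b}$ has the law of $(X_a/a)/(X_b/b)$.

```latex
\begin{aligned}
\mathbb{E}\Big[\Big(X_D - x\frac{X_N}{N}\Big)_+\Big]
 &= \mathbb{E}\big[X_D \mathbf{1}\{X_D \ge x X_N / N\}\big]
    - \frac{x}{N}\,\mathbb{E}\big[X_N \mathbf{1}\{X_D \ge x X_N / N\}\big] \\
 &= D\,\mathbb{P}\big(X_{D+2} \ge x X_N / N\big)
    - x\,\mathbb{P}\big(X_D \ge x X_{N+2} / N\big) \\
 &= D\,\mathbb{P}\Big(F_{D+2,N} \ge \frac{x}{D+2}\Big)
    - x\,\mathbb{P}\Big(F_{D,N+2} \ge \frac{(N+2)\,x}{D N}\Big).
\end{aligned}
```

Dividing by $\mathbb{E}(X_D) = D$ gives the stated expression for $\mathrm{Dkhi}[D, N, x]$.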

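Theorem 2: a general risk bound

Let pen be an arbitrary non-negative penalty function and assume that $N_m = n - D_m \ge 2$ for all $m \in \mathcal{M}$. If $\hat m$ exists a.s., then for any $K > 1$
$$\mathbb{E}\big[\|\mu - \hat\mu_{\hat m}\|^2\big] \le \frac{K}{K-1} \inf_{m \in \mathcal{M}} \Big\{ \|\mu - \mu_m\|^2 \Big(1 + \frac{\mathrm{pen}(m)}{N_m}\Big) + \mathrm{pen}(m)\,\sigma^2 \Big\} + \Sigma \qquad (3)$$
where
$$\Sigma = \frac{K^2 \sigma^2}{K-1} \sum_{m \in \mathcal{M}} (D_m + 1)\, \mathrm{Dkhi}\Big[D_m + 1,\ N_m - 1,\ \frac{N_m - 1}{K N_m}\, \mathrm{pen}(m)\Big].$$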

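Minimal penalties

• Choose $K > 1$ and $\mathcal{L} = \{L_m,\ m \in \mathcal{M}\}$ non-negative numbers (weights) such that
$$\Sigma' = \sum_{m \in \mathcal{M}} (D_m + 1)\, e^{-L_m} < +\infty.$$
• For any $m \in \mathcal{M}$ set
$$\mathrm{pen}^{L}_{K, \mathcal{L}}(m) = K\, \frac{N_m}{N_m - 1}\, \mathrm{Dkhi}^{-1}\big[D_m + 1,\ N_m - 1,\ e^{-L_m}\big].$$
• When $L_m \vee D_m \le \kappa n$ with $\kappa < 1$:
$$\mathrm{pen}^{L}_{K, \mathcal{L}}(m) \le C(K, \kappa)\, (L_m \vee D_m).$$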
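Evaluating this penalty requires inverting $x \mapsto \mathrm{Dkhi}[D, N, x]$, which is decreasing, so bisection applies. The sketch below is a rough illustration that approximates Dkhi by Monte Carlo on a fixed sample using only the standard library; the function names, the sample size and the default $K = 1.1$ are assumptions made for the example. The approximation is poor when $e^{-L_m}$ is very small, in which case an exact Fisher survival function from a statistics library would be preferable.

```python
import math
import random

def chi2_sample(df, size, rng):
    """Draws from chi2(df), using chi2(df) = Gamma(shape=df/2, scale=2)."""
    return [rng.gammavariate(df / 2.0, 2.0) for _ in range(size)]

def make_dkhi(D, N, size=20000, seed=0):
    """Monte Carlo approximation of x -> Dkhi[D, N, x] on one fixed sample."""
    rng = random.Random(seed)
    xd = chi2_sample(D, size, rng)
    xn = chi2_sample(N, size, rng)

    def dkhi(x):
        total = sum(max(a - x * b / N, 0.0) for a, b in zip(xd, xn))
        return total / (size * D)  # divide by E(X_D) = D

    return dkhi

def dkhi_inverse(dkhi, q, tol=1e-6):
    """Solve Dkhi[D, N, x] = q for x >= 0 by bisection, for 0 < q <= 1."""
    lo, hi = 0.0, 1.0
    while dkhi(hi) > q:   # Dkhi is decreasing in x: enlarge until below q
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if dkhi(mid) > q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def penalty(D_m, n, L_m, K=1.1):
    """K * N_m / (N_m - 1) * Dkhi^{-1}[D_m + 1, N_m - 1, exp(-L_m)], approximately."""
    N_m = n - D_m
    dkhi = make_dkhi(D_m + 1, N_m - 1)
    return K * N_m / (N_m - 1) * dkhi_inverse(dkhi, math.exp(-L_m))

# Illustration with hypothetical values: larger weights give larger penalties.
pen_small = penalty(D_m=2, n=30, L_m=1.0)
pen_large = penalty(D_m=2, n=30, L_m=3.0)
```

Because both calls reuse the same seed, they share one Monte Carlo sample, so the comparison between the two penalties is not blurred by sampling noise.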

How to choose the $L_m$?

• When $\mathcal{S}$ has a complexity $(M, a)$: a possible choice is $L_m = a D_m + 3 \log(D_m + 1)$. Then
$$\Sigma' = \sum_{m \in \mathcal{M}} (D_m + 1)\, e^{-L_m} \le M \sum_{D \ge 1} D^{-2}.$$
• For change-point detection: we choose
$$L_m = L(|m|) = \log \binom{n}{|m| - 2} + 2 \log(|m|),$$
for which
$$\Sigma' = \sum_{D=2}^{p+1} \binom{n}{D-2}\, D\, e^{-L(D)} = \sum_{D=2}^{p+1} \frac{1}{D} \le \log(p + 1).$$
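The change-point computation can be verified numerically. The sketch below assumes, as the formula suggests, that $|m| = q + 1 = D_m + 1$ is the number of entries of $m = (t_0, \dots, t_q)$ and that there are $\binom{n}{|m| - 2}$ models of a given size; the function names and the values of $n$ and $p$ are illustrative.

```python
import math

def L_changepoint(size, n):
    """L(|m|) = log C(n, |m| - 2) + 2 log |m|, with size = |m|."""
    return math.log(math.comb(n, size - 2)) + 2.0 * math.log(size)

def sigma_prime(n, p):
    """Sum over |m| = D in {2, ..., p+1} of C(n, D-2) * D * exp(-L(D))."""
    total = 0.0
    for D in range(2, p + 2):
        n_models = math.comb(n, D - 2)  # models with |m| = D (assumption above)
        total += n_models * D * math.exp(-L_changepoint(D, n))
    return total

# Hypothetical sizes: n = 50 design points, at most p = 10 segments.
value = sigma_prime(50, 10)
harmonic = sum(1.0 / D for D in range(2, 12))
```

Here `value` agrees with the harmonic sum $\sum_{D=2}^{p+1} 1/D$ up to rounding, and both stay below $\log(p + 1)$.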
