Recursive identification of smoothing spline ANOVA models




  1. Recursive identification of smoothing spline ANOVA models Marco Ratto, Andrea Pagano European Commission, Joint Research Centre, Ispra, ITALY July 8, 2009

  2. Introduction We discuss different approaches to the estimation and identification of smoothing spline ANOVA models: • the ‘classical’ approach [Wahba, 1990; Gu, 2002], as improved by Storlie et al. [ACOSSO]; • the recursive approach of Ratto et al. [2007] and Young [2001] [SDR].

  3. Introduction: ACOSSO ‘a new regularization method for simultaneous model fitting and variable selection in nonparametric regression models in the framework of smoothing spline ANOVA’. COSSO [Lin and Zhang, 2006] penalizes the sum of component norms, instead of the squared norm employed in the traditional smoothing spline method. Storlie et al. introduce an adaptive weight in the COSSO penalty, allowing more flexibility in the estimate of important functional components (applying a heavier penalty to unimportant ones).

  4. Introduction: SDR Using the State-Dependent Parameter Regression (SDR) approach of Young [2001], Ratto et al. [2007] developed a non-parametric approach very similar to smoothing splines, based on recursive filtering and smoothing estimation [the Kalman Filter, KF, combined with Fixed Interval Smoothing, FIS; Kalman, 1960; Young, 1999]: • couched within optimal Maximum Likelihood estimation; • flexible in adapting to local discontinuities, heavy non-linearity and heteroscedastic error terms.

  5. Goals of the paper 1. develop a formal comparison and demonstrate equivalences between the ‘classical’ tensor product cubic spline approach and the SDR approach; 2. discuss advantages and disadvantages of these approaches; 3. propose a unified approach to smoothing spline ANOVA models that combines the best of the discussed methods.

  6. State Dependent Regressions and smoothing splines: Additive models Denote the generic mapping as z(X), where X ∈ [0, 1]^p and p is the number of parameters. The simplest example of smoothing spline mapping estimation of z is the additive model:

f(X) = f_0 + \sum_{j=1}^{p} f_j(X_j)    (1)

  7. To estimate f we can use a multivariate smoothing spline minimization problem, that is, given λ, find the minimizer f(X_k) of:

(1/N) \sum_{k=1}^{N} (z_k − f(X_k))^2 + \sum_{j=1}^{p} λ_j \int_0^1 [f_j''(X_j)]^2 dX_j    (2)

where a Monte Carlo sample of dimension N is assumed. This minimization problem requires the estimation of the p hyper-parameters λ_j (also denoted as smoothing parameters): GCV, GML, etc. (see e.g. Wahba, 1990; Gu, 2002).
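The penalized least-squares problem (2) can be sketched, for p = 1 on an equally spaced, sorted sample, by replacing the curvature integral with a discrete second-difference penalty. This is only a structural illustration, not the exact cubic spline; the helper name `penalized_smoother` is mine, not from the paper.

```python
import numpy as np

def penalized_smoother(z, lam):
    """Discrete analogue of (2) for p = 1: minimize
    (1/N)*sum((z_k - f_k)^2) + lam*sum(second differences of f)^2,
    assuming sorted, equally spaced X. Closed form: f = (I + lam*N*D'D)^{-1} z."""
    n = len(z)
    # Second-difference operator D, shape (n-2, n)
    D = np.zeros((n - 2, n))
    for i in range(n - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    A = np.eye(n) + lam * n * (D.T @ D)
    return np.linalg.solve(A, z)
```

As λ → 0 the solution reproduces the data (perfect fit); larger λ trades fit for smoothness, mirroring the role of the λ_j in (2).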

  8. In the recursive approach of Ratto et al. [2007], the additive model is put into the State-Dependent Parameter Regression (SDR) form of Young [2001]. Consider the case p = 1 and z(X) = g(X) + e, with e ∼ N(0, σ²), i.e. z_k = s_k + e_k, where k = 1, …, N and s_k is the estimate of g(X_k). The s_k is characterized in some stochastic manner, borrowing from non-stationary time series processes and using the Generalized Random Walk (GRW) class of non-stationary random sequences [see e.g. Young and Ng, 1989; Ng and Young, 1990].

  9. The integrated random walk (IRW) process provides the same smoothing properties as a cubic spline, in the overall State-Space (SS) formulation:

Observation Equation: z_k = s_k + e_k
State Equations: s_k = s_{k−1} + d_{k−1}    (3)
                 d_k = d_{k−1} + η_k

where d_k is the ‘slope’ of s_k, η_k ∼ N(0, σ²_η) and η_k is independent of e_k. For the recursive estimate of s_k, the MC sample has to be sorted in ascending order of X, i.e. the k and k−1 subscripts in (3) denote adjacent elements under such ordering.
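A minimal simulation of the SS formulation (3), assuming Gaussian disturbances; the helper name `simulate_irw` is hypothetical:

```python
import numpy as np

def simulate_irw(n, sigma_eta, sigma_e, s0=0.0, d0=0.0, seed=0):
    """Simulate model (3): an integrated random walk level s_k with slope d_k,
    observed with noise as z_k = s_k + e_k."""
    rng = np.random.default_rng(seed)
    s, d = s0, d0
    z = np.empty(n)
    for k in range(n):
        s = s + d                                    # s_k = s_{k-1} + d_{k-1}
        d = d + sigma_eta * rng.standard_normal()    # d_k = d_{k-1} + eta_k
        z[k] = s + sigma_e * rng.standard_normal()   # z_k = s_k + e_k
    return z
```

With σ_η = σ_e = 0 the IRW degenerates to an exact straight line, which is why the process embeds the cubic-spline notion of smoothness.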

  10. Figure 1: sorted z_k and underlying signal versus the sorted k-ordering (ascending X_1).

  11. SDR procedure 1. Optimize by ML (via the prediction error decomposition [Schweppe, 1965]) the hyper-parameter associated with (3): NVR = σ²_η/σ². The NVR plays the inverse role of a smoothing parameter: the smaller the NVR, the smoother the estimate of s_k. 2. Given the NVR, the FIS algorithm yields ŝ_{k|N}: the ŝ_{k|N} from the IRW process is the equivalent of f(X_k) in the cubic smoothing spline model. The recursive procedures also provide standard errors of the estimated ŝ_{k|N}.
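Step 2 (KF forward pass followed by FIS backward pass) can be sketched for the IRW model as below. This is an illustrative implementation, not the authors' code: it assumes σ² = 1 (only the ratio NVR = σ²_η/σ² matters for the smoothed mean) and a large-variance initialization in place of a properly diffuse prior.

```python
import numpy as np

def irw_smooth(z, nvr):
    """KF + fixed-interval (RTS) smoother for the IRW model (3),
    state x_k = [s_k, d_k], observation noise variance normalized to 1,
    slope noise variance = NVR. Returns the smoothed level s_{k|N}."""
    n = len(z)
    F = np.array([[1.0, 1.0], [0.0, 1.0]])
    H = np.array([1.0, 0.0])
    Q = np.diag([0.0, nvr])
    x, P = np.zeros(2), np.eye(2) * 1e6        # crude diffuse initialization
    xf = np.zeros((n, 2)); Pf = np.zeros((n, 2, 2))
    xp = np.zeros((n, 2)); Pp = np.zeros((n, 2, 2))
    for k in range(n):
        x_pred = F @ x                          # one-step-ahead prediction
        P_pred = F @ P @ F.T + Q
        innov = z[k] - H @ x_pred               # innovation
        S = H @ P_pred @ H + 1.0                # innovation variance
        K = P_pred @ H / S                      # Kalman gain
        x = x_pred + K * innov
        P = P_pred - np.outer(K, H @ P_pred)
        xp[k], Pp[k] = x_pred, P_pred
        xf[k], Pf[k] = x, P
    xs = xf.copy()                              # backward FIS (RTS) pass
    for k in range(n - 2, -1, -1):
        A = Pf[k] @ F.T @ np.linalg.inv(Pp[k + 1])
        xs[k] = xf[k] + A @ (xs[k + 1] - xp[k + 1])
    return xs[:, 0]
```

On data that are exactly linear in the sorted ordering, the smoother recovers the line (both the residuals and the slope increments can be driven to zero), consistent with the cubic-spline analogy.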

  12. The recursive ML optimization In the ‘classical’ smoothing spline estimates, a ‘penalty’ is always plugged into the objective function (GCV, GML, etc.) used to optimize the λ's, to limit the ‘degrees of freedom’ of the spline model. In GCV we have to find the λ that minimizes

GCV_λ = [ (1/N) \sum_k (z_k − f_λ(X_k))^2 ] / (1 − df(λ)/N)^2    (4)

where df ∈ [0, N] denotes the ‘degrees of freedom’ of the spline and where we have explicitly indicated the dependency on λ in the GCV formula.

  13. In the recursive notation just introduced:

GCV_NVR = [ (1/N) \sum_k (z_k − ŝ_{k|N})^2 ] / (1 − df(NVR)/N)^2    (5)

Without the penalty term, the optimum would always be attained at λ = 0 (or NVR → ∞), i.e. perfect fit.
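Given a fitted curve and the degrees of freedom of the smoother, the GCV score of (4)-(5) is a one-liner; `gcv` is a hypothetical helper name used here for illustration.

```python
import numpy as np

def gcv(z, fit, df):
    """GCV score of (4)/(5): mean squared residual inflated by
    the degrees-of-freedom penalty (1 - df/N)^2."""
    z = np.asarray(z, dtype=float)
    n = len(z)
    return np.mean((z - np.asarray(fit, dtype=float)) ** 2) / (1.0 - df / n) ** 2
```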

  14. In SDR, however, the penalty is intrinsically plugged in by the fact that the ML estimate is based on the filtered estimate ŝ_{k|k−1} = ŝ_{k−1} + d̂_{k−1} and not on the smoothed estimate ŝ_{k|N}; namely, we find the NVR that minimizes:

−2·log(L) = const + \sum_{k=3}^{N} log(1 + P_{k|k−1}) + (N − 2)·log(σ̂²)    (6)

σ̂² = [1/(N − 2)] \sum_{k=3}^{N} (z_k − ŝ_{k|k−1})² / (1 + P_{k|k−1})

where P_{k|k−1} is the one-step-ahead forecast error variance of the state ŝ_{k|k−1} provided by the Kalman Filter.

  15. • ŝ_{k|k−1} is based only on the information contained in [1, …, k−1], while smoothed estimates use the entire information set [1, …, N]. • A zero variance for e_k implies ŝ_{k|k−1} = ŝ_{k−1} + d̂_{k−1} = z_{k−1} + d̂_{k−1}, i.e. the one-step-ahead prediction of z_k is given by the linear extrapolation of the adjacent value z_{k−1}. • The limit NVR → ∞ (λ → 0) is not a ‘perfect fit’ situation.

  16. Figure 2: The case of NVR → ∞ (sorted z_k and signal versus the sorted k-ordering, ascending X_1): no perfect fit for the recursive case!

  17. Equivalence between SDR and cubic spline To complete the equivalence between the SDR and cubic spline formulations, we need to link the NVR estimated by the ML procedure to the smoothing parameter λ. This is easily accomplished by setting λ = 1/(NVR · N⁴).

  18. In the general additive case (1), the recursive procedure just described needs to be applied, in turn, for each term f_j(X_{j,k}) = ŝ_{j,k|N}, requiring a different sorting strategy for each ŝ_{j,k|N}. Hence the ‘backfitting’ procedure, as described in Young [2000, 2001], is exploited. Finally, the estimated NVR_j's can be converted into λ_j values and the additive model put into the standard cubic spline form.
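A backfitting sweep for the additive model (1) can be sketched as below: each f_j is refreshed by smoothing the partial residuals under its own sorting of X_j. The simple second-difference smoother and the helper names are illustrative assumptions standing in for the SDR recursions, and a fixed `lam` replaces the ML-estimated NVR_j (which the slides convert via λ_j = 1/(NVR_j · N⁴)).

```python
import numpy as np

def smooth_sorted(x, y, lam):
    """Penalized second-difference smoother of y against x:
    sorts by x, smooths, and returns values in the original order."""
    order = np.argsort(x)
    n = len(y)
    D = np.zeros((n - 2, n))
    for i in range(n - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    f = np.empty(n)
    f[order] = np.linalg.solve(np.eye(n) + lam * (D.T @ D), np.asarray(y)[order])
    return f

def backfit_additive(X, z, lam=1.0, n_sweeps=10):
    """Backfitting sketch for (1): cycle over terms, smoothing the
    partial residuals against each X_j under its own sorting."""
    n, p = X.shape
    f0 = z.mean()
    fj = np.zeros((p, n))
    for _ in range(n_sweeps):
        for j in range(p):
            partial = z - f0 - fj.sum(axis=0) + fj[j]
            fj[j] = smooth_sorted(X[:, j], partial, lam)
            fj[j] -= fj[j].mean()       # centre each term for identifiability
    return f0, fj
```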

  19. State Dependent Regressions and smoothing splines: ANOVA models with interaction functions The additive model concept (1) can be generalized to include 2-way (and higher) interaction functions via the functional ANOVA decomposition. For example, we can let

f(X) = f_0 + \sum_{j=1}^{p} f_j(X_j) + \sum_{j<i} f_{j,i}(X_j, X_i)    (7)

  20. In the ANOVA smoothing spline context, corresponding optimization problems with interaction functions and their solutions can be obtained conveniently with the reproducing kernel Hilbert space (RKHS) approach (see Wahba, 1990). In the SDR context, an interaction function is formalized as the product of two states, f_{1,2}(X_1, X_2) = s_1 · s_2, each of them characterized by an IRW stochastic process.

  21. Hence the estimation of a single interaction term z(X_k) = f(X_{1,k}, X_{2,k}) + e_k is formalized as:

Observation Equation: z_k = s^I_{1,k} · s^I_{2,k} + e_k
State Equations (j = 1, 2): s^I_{j,k} = s^I_{j,k−1} + d^I_{j,k−1}    (8)
                            d^I_{j,k} = d^I_{j,k−1} + η^I_{j,k}

where I = {1, 2} is a multi-index denoting the interaction term under estimation and η^I_{j,k} ∼ N(0, σ²_{η^I_j}). The two terms s^I_{j,k} are estimated iteratively by running the recursive procedure in turn.

  22. • Take an initial estimate of s^I_{1,k} and s^I_{2,k} by regressing z on the product of simple linear or quadratic polynomials p_1(X_1) · p_2(X_2), and set s^{I,0}_{j,k} = p_j(X_{j,k}); • iterate for i = 1, 2: – fix s^{I,i−1}_{2,k} and estimate NVR^I_1 and s^{I,i}_{1,k} using the recursive procedure; – fix s^{I,i}_{1,k} and estimate NVR^I_2 and s^{I,i}_{2,k} using the recursive procedure; • the product s^{I,2}_{1,k} · s^{I,2}_{2,k} obtained after the second iteration provides the recursive SDR estimate of the interaction function.
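The alternating scheme above can be sketched with a weighted penalized smoother standing in for the SDR recursions: with s_2 fixed, minimizing Σ(z_k − f_k·s_{2,k})² plus a roughness penalty on f (sorted by X_1) gives the linear system (diag(s_2²) + λ D'D) f = s_2·z, and symmetrically for s_1. The initialization is simplified to separate linear fits rather than the polynomial-product regression, and λ is fixed instead of ML-estimated, so this is only a structural illustration.

```python
import numpy as np

def weighted_smooth_sorted(x, w, wy, lam):
    """Solve (diag(w) + lam*D'D) f = wy in ascending order of x,
    then return f in the original order."""
    order = np.argsort(x)
    n = len(x)
    D = np.zeros((n - 2, n))
    for i in range(n - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    A = np.diag(w[order]) + lam * (D.T @ D)
    f = np.empty(n)
    f[order] = np.linalg.solve(A, wy[order])
    return f

def fit_interaction(x1, x2, z, lam=1e-3, n_iter=2):
    """Sketch of the iterative interaction estimate of (8): alternate
    between the two states, each update smoothing against one X
    with the other state held fixed."""
    # simplified initialization: separate linear fits of z on each coordinate
    s1 = np.polyval(np.polyfit(x1, z, 1), x1)
    s2 = np.polyval(np.polyfit(x2, z, 1), x2)
    for _ in range(n_iter):
        s1 = weighted_smooth_sorted(x1, s2 ** 2, s2 * z, lam)  # fix s2, update s1
        s2 = weighted_smooth_sorted(x2, s1 ** 2, s1 * z, lam)  # fix s1, update s2
    return s1 * s2
```

The product s_1 · s_2 is identified only up to a scale exchanged between the two factors, which is why the recovered interaction is best judged up to proportionality.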

  23. Unfortunately, in the case of interaction functions we cannot derive an explicit and full equivalence between SDR and cubic splines of the type mentioned for first-order ANOVA terms. Therefore, in order to exploit the estimation results in the context of a smoothing spline ANOVA model, we take a different approach, similar to the ACOSSO case.

