1. A General Class of Score-Driven Smoothers
Giuseppe Buccheri, Scuola Normale Superiore
Joint work with Giacomo Bormetti (a), Fulvio Corsi (b) and Fabrizio Lillo (a)
(a) University of Bologna; (b) University of Pisa and City University of London
IAAE 2018, Montréal, June 27, 2018

2. Key facts
Following Cox (1981), we divide time-varying parameter models into two classes:
1. Parameter-driven models: parameters evolve in time based on idiosyncratic innovations (e.g. local level, stochastic volatility, stochastic intensity)
2. Observation-driven models: parameters evolve in time based on nonlinear functions of past observations (e.g. GARCH, MEM, DCC, score-driven models)
We shall see that there is a trade-off between:
1. Estimation complexity and computational speed
◮ Here observation-driven models are superior
2. Flexibility
◮ Here parameter-driven models are superior
Why a difference in flexibility?
◮ Observation-driven: $\mathrm{Var}[f_{t+1}|\mathcal{F}_t] = 0$ but $\mathrm{Var}[f_{t+1}] > 0$
◮ Parameter-driven: $\mathrm{Var}[f_{t+1}|\mathcal{F}_t] > 0$ and $\mathrm{Var}[f_{t+1}] > 0$

3. A different interpretation
Consider a standard GARCH(1,1) model:
$$r_t = \sigma_t \epsilon_t, \quad \epsilon_t \sim \mathrm{N}(0, 1)$$
$$\sigma^2_{t+1} = c + a r_t^2 + b \sigma_t^2$$
There are two possible interpretations of the dynamic equation for $\sigma^2_{t+1}$:
1. It is the true DGP of volatility
2. Since $\sigma^2_{t+1}$ is $\mathcal{F}_t$-measurable, it can be seen as a filter, i.e. $\sigma^2_{t+1} = \mathrm{E}[\zeta^2_{t+1}|\mathcal{F}_t]$, where $\zeta_{t+1}$ is the volatility of the true, parameter-driven DGP (e.g. the SV model)
Interpretation 1 is more common in the financial econometrics literature, while interpretation 2 is closer to the filtering literature. A simulation sketch of interpretation 2 follows below.
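To make interpretation 2 concrete, here is a minimal Python sketch (ours, not from the slides; all parameter values are hypothetical) that simulates a stochastic-volatility DGP and then runs the misspecified GARCH(1,1) recursion over the simulated returns as a filter:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4000

# Parameter-driven DGP: log-variance follows an AR(1) (an SV model)
mu, phi, s_eta = np.log(0.04), 0.98, 0.15     # hypothetical values
h = np.empty(n)
h[0] = mu
for t in range(n - 1):
    h[t + 1] = mu + phi * (h[t] - mu) + s_eta * rng.standard_normal()
r = np.exp(h / 2) * rng.standard_normal(n)    # returns with volatility zeta_t = e^{h_t/2}

# Misspecified observation-driven filter: GARCH(1,1) recursion
c, a, b = 0.0008, 0.05, 0.93                  # hypothetical values
sigma2 = np.empty(n)
sigma2[0] = r.var()
for t in range(n - 1):
    sigma2[t + 1] = c + a * r[t] ** 2 + b * sigma2[t]
# sigma2[t+1] is F_t-measurable: it plays the role of E[zeta_{t+1}^2 | F_t]
```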

4. An example: ARCH filtering and smoothing
◮ "Filtering and forecasting with misspecified ARCH models I: Getting the right variance with the wrong model". Nelson (1992), JoE
◮ "Asymptotic filtering theory for univariate ARCH models". Nelson & Foster (1994), Ecta
◮ "Filtering and forecasting with misspecified ARCH models II: Making the right forecast with the wrong model". Nelson & Foster (1995), JoE
◮ "Asymptotically optimal smoothing with ARCH models". Nelson (1996), Ecta
Quoting Nelson (1992): "Note that our use of the term 'estimate' corresponds to its use in the filtering literature rather than the statistics literature; that is, an ARCH model with (given) fixed parameters produces 'estimates' of the true underlying conditional covariance matrix at each point in time in the same sense that a Kalman filter produces 'estimates' of unobserved state variables in a linear system."

5. Motivations and Objectives
A key observation
◮ Observation-driven models as DGPs → all relevant information is contained in past observations → no room for smoothing
◮ Observation-driven models as filters → they can benefit from using all observations → smoothing is useful
Related literature
◮ Little attention has been paid to the problem of smoothing with misspecified observation-driven models. Harvey (2013) proposed a smoothing algorithm for a dynamic Student-t location model.
Objective of this paper
◮ Fill this gap by proposing a methodology to smooth the filtered estimates of a general class of observation-driven models, namely the score-driven models of Creal et al. (2013) and Harvey (2013)

6. Filtering and smoothing in linear Gaussian models
Consider the general linear Gaussian state-space model:
$$y_t = Z\alpha_t + \epsilon_t, \quad \epsilon_t \sim \mathrm{N}(0, H)$$
$$\alpha_{t+1} = c + T\alpha_t + \eta_t, \quad \eta_t \sim \mathrm{N}(0, Q)$$
Kalman forward filter: $a_{t+1} = \mathrm{E}[\alpha_{t+1}|\mathcal{F}_t]$, $P_{t+1} = \mathrm{Var}[\alpha_{t+1}|\mathcal{F}_t]$:
$$v_t = y_t - Za_t, \quad F_t = ZP_tZ' + H, \quad K_t = TP_tZ'F_t^{-1}$$
$$a_{t+1} = c + Ta_t + K_tv_t, \quad P_{t+1} = TP_t(T - K_tZ)' + Q, \quad t = 1, \dots, n$$
Kalman backward smoother: $\hat\alpha_t = \mathrm{E}[\alpha_t|\mathcal{F}_n]$, $\hat P_t = \mathrm{Var}[\alpha_t|\mathcal{F}_n]$, $t \le n$:
$$r_{t-1} = Z'F_t^{-1}v_t + L_t'r_t, \quad N_{t-1} = Z'F_t^{-1}Z + L_t'N_tL_t$$
$$\hat\alpha_t = a_t + P_tr_{t-1}, \quad \hat P_t = P_t - P_tN_{t-1}P_t$$
where $L_t = T - K_tZ$, $r_n = 0$, $N_n = 0$ and $t = n, \dots, 1$.
◮ The conditional log-density is $\log p(y_t|\mathcal{F}_{t-1}) = -\frac{1}{2}\log|F_t| - \frac{1}{2}v_t'F_t^{-1}v_t$ (up to an additive constant)
◮ As $Z$, $H$, $T$, $Q$ are constant, the variance recursion has a fixed-point solution $\bar P$ that is referred to as the steady state of the Kalman filter
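A minimal Python transcription of these recursions for a univariate state (our sketch; scalar system quantities $Z, T, c, H, Q$ and initial values $a_1, P_1$ must be supplied by the user):

```python
import numpy as np

def kalman_filter_smoother(y, Z, T, c, H, Q, a1, P1):
    """Univariate Kalman forward filter and backward mean smoother,
    transcribing the recursions above (scalar system matrices)."""
    n = len(y)
    a = np.empty(n + 1); P = np.empty(n + 1)
    v = np.empty(n); F = np.empty(n); K = np.empty(n)
    a[0], P[0] = a1, P1
    for t in range(n):                       # forward pass, t = 1..n
        v[t] = y[t] - Z * a[t]
        F[t] = Z * P[t] * Z + H
        K[t] = T * P[t] * Z / F[t]
        a[t + 1] = c + T * a[t] + K[t] * v[t]
        P[t + 1] = T * P[t] * (T - K[t] * Z) + Q
    alpha_hat = np.empty(n)
    r = 0.0                                  # r_n = 0
    for t in range(n - 1, -1, -1):           # backward pass, t = n..1
        L = T - K[t] * Z
        r = Z * v[t] / F[t] + L * r          # r_{t-1}
        alpha_hat[t] = a[t] + P[t] * r       # smoothed mean
    return a[:n], alpha_hat
```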

7. A more general representation
Introduce the score and information matrix of the conditional density:
$$\nabla_t = \left(\frac{\partial \log p(y_t|\mathcal{F}_{t-1})}{\partial a_t'}\right)', \qquad \mathcal{I}_{t|t-1} = \mathrm{E}_{t-1}[\nabla_t\nabla_t']$$
After some simple algebra, we can rewrite the Kalman filter and smoothing recursions for the mean in the steady state as:
$$a_{t+1} = c + Ta_t + R\nabla_t, \quad (1)$$
where $R = T\bar P$, and:
$$r_{t-1} = \nabla_t + L_t'r_t \quad (2)$$
$$\hat\alpha_t = a_t + T^{-1}Rr_{t-1} \quad (3)$$
where $L_t = T - R\mathcal{I}_{t|t-1}$.
◮ The Kalman recursions for the mean are re-parametrized in terms of $\nabla_t$ and $\mathcal{I}_{t|t-1}$
◮ The new representation is more general, as it relies only on the conditional density $p(y_t|\mathcal{F}_{t-1})$, which is defined for any observation-driven model
◮ The Kalman filter is a score-driven process
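The algebra behind this representation is short; as a sanity check (our derivation, using only the Gaussian log-density of slide 6):
$$\nabla_t = \frac{\partial}{\partial a_t}\Big[-\tfrac{1}{2}\log|F_t| - \tfrac{1}{2}(y_t - Za_t)'F_t^{-1}(y_t - Za_t)\Big] = Z'F_t^{-1}v_t$$
$$\mathcal{I}_{t|t-1} = Z'F_t^{-1}\,\mathrm{E}_{t-1}[v_tv_t']\,F_t^{-1}Z = Z'F_t^{-1}Z$$
In the steady state $P_t = \bar P$, so $K_tv_t = T\bar PZ'F_t^{-1}v_t = R\nabla_t$ and $L_t = T - K_tZ = T - R\mathcal{I}_{t|t-1}$, which yields (1)-(3).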

8. The Score-Driven Smoother (SDS)
◮ Based on eq. (1), score-driven models can be viewed as approximate filters for nonlinear non-Gaussian state-space models
◮ By analogy, we can regard eqs. (2), (3) as an approximate smoother for nonlinear non-Gaussian models
Assume $y_t|\mathcal{F}_{t-1} \sim p(y_t|f_t, \Theta)$, where $f_t$ is a vector of time-varying parameters and $\Theta$ collects all static parameters. In score-driven models:
$$f_{t+1} = \omega + As_t + Bf_t \quad (4)$$
where $s_t = S_t\nabla_t$, $\nabla_t = \frac{\partial \log p(y_t|f_t, \Theta)}{\partial f_t}$ and $S_t = \mathcal{I}_{t|t-1}^{-\alpha}$, $\alpha \in [0, 1]$. We generalize eqs. (2), (3) as:
$$r_{t-1} = s_t + (B - AS_t\mathcal{I}_{t|t-1})'r_t \quad (5)$$
$$\hat f_t = f_t + B^{-1}Ar_{t-1} \quad (6)$$
with $t = n, \dots, 1$ and $r_n = 0$. We name the smoother (5), (6) the "Score-Driven Smoother" (SDS). It has the same structure as the Kalman backward smoothing recursions, but it uses the score of the non-Gaussian density and is nonlinear in the observations.
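For a scalar time-varying parameter the backward pass (5)-(6) takes only a few lines; a Python sketch (our illustration; the arrays s, S, I and f are assumed to have been stored during the forward filter):

```python
import numpy as np

def sds_backward(s, S, I, A, B, f):
    """Score-Driven Smoother backward pass, eqs. (5)-(6), scalar f_t.
    s[t]: scaled score s_t;  S[t]: scaling S_t;
    I[t]: information I_{t|t-1};  f[t]: forward-filtered parameter."""
    n = len(f)
    f_hat = np.empty(n)
    r = 0.0                                     # r_n = 0
    for t in range(n - 1, -1, -1):              # t = n..1
        r = s[t] + (B - A * S[t] * I[t]) * r    # eq. (5): r_{t-1}
        f_hat[t] = f[t] + (A / B) * r           # eq. (6): B^{-1} A r_{t-1}
    return f_hat
```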

9. SDS methodology
$$y_t|\mathcal{F}_{t-1} \sim p(y_t|f_t, \Theta), \qquad f_{t+1} = \omega + As_t + Bf_t$$
1. Estimation of the static parameters:
$$\tilde\Theta = \arg\max_\Theta \sum_{t=1}^n \log p(y_t|f_t, \Theta)$$
2. Forward filter:
$$f_{t+1} = \tilde\omega + \tilde As_t + \tilde Bf_t$$
3. Backward smoother:
$$r_{t-1} = s_t + (\tilde B - \tilde AS_t\mathcal{I}_{t|t-1})'r_t, \qquad \hat f_t = f_t + \tilde B^{-1}\tilde Ar_{t-1}$$
◮ The SDS is computationally simple (maximization of a closed-form likelihood plus one forward and one backward recursion)
◮ The SDS is general, in that it can handle any observation density $p(y_t|f_t, \Theta)$, with a potentially large number of time-varying parameters
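As an illustration of step 1 for the Gaussian GARCH case treated on the next slide (a sketch under our own parametrization and start values, not the authors' code; no positivity constraints are imposed):

```python
import numpy as np
from scipy.optimize import minimize

def neg_loglik(theta, y):
    """Negative Gaussian log-likelihood of the score-driven GARCH
    f_{t+1} = w + a*(y_t^2 - f_t) + b*f_t, evaluated via the forward filter."""
    w, a, b = theta
    f = y.var()                 # hypothetical initialization of f_1
    ll = 0.0
    for yt in y:
        ll += -0.5 * np.log(2 * np.pi * f) - yt ** 2 / (2 * f)
        f = w + a * (yt ** 2 - f) + b * f
    return -ll

# Step 1 (hypothetical start values); steps 2-3 then rerun the forward
# filter and the backward smoother at the estimate res.x:
# res = minimize(neg_loglik, x0=[0.01, 0.1, 0.95], args=(y,), method="Nelder-Mead")
```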

10. Example: GARCH-SDS
Consider the model:
$$y_t = \sigma_t\epsilon_t, \quad \epsilon_t \sim \mathrm{NID}(0, 1)$$
The predictive density is thus:
$$p(y_t|\sigma_t^2) = \frac{1}{\sqrt{2\pi}\,\sigma_t}\,e^{-\frac{y_t^2}{2\sigma_t^2}}$$
Setting $f_t = \sigma_t^2$ and $S_t = \mathcal{I}_{t|t-1}^{-1}$, eq. (4) reduces to:
$$f_{t+1} = \omega + a(y_t^2 - f_t) + bf_t$$
i.e. the standard GARCH(1,1) model. The smoothing recursions (5), (6) reduce to:
$$r_{t-1} = y_t^2 - f_t + (b - a)r_t, \qquad \hat f_t = f_t + b^{-1}ar_{t-1}$$
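Putting the two recursions together, a minimal Python implementation of the GARCH-SDS (our sketch; the initialization of $f_1$ is a hypothetical choice):

```python
import numpy as np

def garch_sds(y, omega, a, b):
    """Forward GARCH(1,1) filter plus SDS backward smoother (slide 10).
    Returns filtered variances f_t and smoothed variances f_hat_t."""
    n = len(y)
    f = np.empty(n)
    f[0] = y.var()                     # hypothetical initialization
    for t in range(n - 1):             # forward: f_{t+1} = w + a(y_t^2 - f_t) + b f_t
        f[t + 1] = omega + a * (y[t] ** 2 - f[t]) + b * f[t]
    f_hat = np.empty(n)
    r = 0.0                            # r_n = 0
    for t in range(n - 1, -1, -1):     # backward SDS pass
        r = y[t] ** 2 - f[t] + (b - a) * r
        f_hat[t] = f[t] + (a / b) * r
    return f, f_hat
```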

11. Example: GARCH-SDS
Figure: Filtered (blue dotted) and smoothed (red) estimates of the GARCH(1,1) model.

12. Other examples
◮ MEM (Engle, 2002):
$$y_t = \mu_t\epsilon_t, \quad \epsilon_t \sim \mathrm{Gamma}(\alpha)$$
where $\mathrm{Gamma}(\alpha)$ has density $\Gamma(\alpha)^{-1}\alpha^\alpha\epsilon_t^{\alpha-1}e^{-\alpha\epsilon_t}$
◮ AR(1) with a time-varying coefficient:
$$y_t = c + \alpha_ty_{t-1} + \epsilon_t, \quad \epsilon_t \sim \mathrm{N}(0, q^2)$$
◮ Wishart-GARCH (Gorgi et al., 2018):
$$r_t|\mathcal{F}_{t-1} \sim \mathrm{N}_k(0, V_t), \qquad X_t|\mathcal{F}_{t-1} \sim \mathrm{W}_k(V_t/\nu, \nu)$$
where $\mathrm{N}_k(0, V_t)$ is a multivariate zero-mean normal distribution with covariance matrix $V_t$ and $\mathrm{W}_k(V_t/\nu, \nu)$ is a Wishart distribution with mean $V_t$ and $\nu \ge k$ degrees of freedom
A worked derivation of the MEM case follows below.
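For the MEM the same steps as in the GARCH example go through. Sketching the derivation (ours, not spelled out on the slides): since $y_t = \mu_t\epsilon_t$ with $\epsilon_t \sim \mathrm{Gamma}(\alpha)$ of unit mean, the conditional log-density is
$$\log p(y_t|\mu_t) = \mathrm{const} + (\alpha - 1)\log y_t - \alpha\log\mu_t - \frac{\alpha y_t}{\mu_t}$$
so that
$$\nabla_t = \frac{\alpha(y_t - \mu_t)}{\mu_t^2}, \qquad \mathcal{I}_{t|t-1} = \frac{\alpha}{\mu_t^2}, \qquad s_t = \mathcal{I}_{t|t-1}^{-1}\nabla_t = y_t - \mu_t$$
Hence eq. (4) gives the MEM(1,1) filter $\mu_{t+1} = \omega + a(y_t - \mu_t) + b\mu_t$, and since $S_t\mathcal{I}_{t|t-1} = 1$ the smoother (5), (6) reduces, exactly as in the GARCH case, to $r_{t-1} = y_t - \mu_t + (b - a)r_t$ and $\hat\mu_t = \mu_t + b^{-1}ar_{t-1}$.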

13. Other example: MEM-SDS
Figure: Filtered (blue dotted) and smoothed (red) estimates of the MEM(1,1) model.

14. Other example: t.v. AR(1)-SDS
Figure: Filtered (blue dotted) and smoothed (red) estimates of the autoregressive coefficient of the AR(1) model.

15. Other example: Wishart-GARCH-SDS
Figure: Comparison among simulated observations of $X_t$ (grey lines), simulated true covariances $V_t$ (black lines), and filtered (blue dotted lines) and smoothed (red lines) (co)variances of the realized Wishart-GARCH model in the case $k = 5$. Left panel: the variance $V(1,1)$ of the first asset; right panel: the covariance $V(1,2)$ between the first and second assets.
