Pooling versus model selection for nowcasting with many predictors: - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Pooling versus model selection for nowcasting with many predictors: An application to German GDP

Vladimir Kuzin DIW Berlin Massimiliano Marcellino EUI Florence Christian Schumacher Deutsche Bundesbank

slide-2
SLIDE 2

1 Introduction

1.1 What is nowcasting?

decision makers regularly request information on the current state of the economy, in particular with respect to GDP

GDP is a comprehensive business cycle indicator, but it is sampled at quarterly frequency only and published with considerable delay

example: at the beginning of May, German GDP is available only for the fourth quarter of the previous year; to obtain a second-quarter GDP nowcast, we have to make a projection with a forecast horizon of two quarters from the end of the GDP sample

economist's task: estimate current-quarter GDP using all information that is currently available

slide-3
SLIDE 3

1.2 Why is nowcasting GDP challenging?

there are many difficulties; we discuss two, both leading to unbalanced data:

  • 1. GDP is quarterly data, while many important indicators are sampled at monthly or higher frequency: the mixed-frequency problem
  • 2. indicators for nowcasting are available with different publication lags → this leads to the so-called ragged edge of multivariate datasets, Wallis (1986)

question here: how to nowcast quarterly German GDP with a large set of monthly indicators with missing observations?

slide-4
SLIDE 4

1.3 Nowcast approaches from the recent literature

  • 1. mixed-data sampling or MIDAS regressions with a few predictors, see Ghysels, Sinko, and Valkanov (2007) EctrRev, Clements and Galvão (2009) JBES
  • 2. factor models based on large datasets:
       (a) large state-space factor model by Giannone, Reichlin, and Small (2008) JME
       (b) New Eurocoin based on dynamic principal components, Altissimo et al. (2007) CEPR WP, used by CEPR to assess the current state of the economy
  • 3. a mix of both: Factor-MIDAS, Marcellino and Schumacher (2010) OBES: factors are estimated on ragged-edge data and plugged into MIDAS regressions

slide-5
SLIDE 5

1.4 Pooling versus model selection

in empirical forecast exercises of the model developers, all the single approaches above performed well

we also evaluate the performance of pooling of forecasts from these models

why pooling? stylised fact from forecast literature: very good forecast performance of pooling, Timmermann (2005) HdBEcForec

why pooling? specifying or selecting single nowcast models can be difficult

within each model class above, we have to make a lot of decisions concerning specification: variable selection, factor estimation method, number of factors, autoregressive dynamics → potential misspecification

slide-6
SLIDE 6

in theory, pooling provides insurance against misspecification of single models and structural breaks, Clements and Hendry (2004) EctrcsJ

recent empirical evidence that supports pooling based on balanced data: Clark and McCracken (2010) JAE fc, Assenmacher-Wesche and Pesaran (2008) NIEcRev

slide-7
SLIDE 7

1.5 What we do

we provide an empirical comparison exercise of nowcast pooling for German GDP based on about one hundred monthly predictors with a ragged edge

we compare pooling to model selection of single models, where alternative model selection procedures are applied, e.g. information criteria, past performance

outline:

  • 2. MIDAS regressions
  • 3. Factor-MIDAS with factors estimated from many indicators
  • 4. Empirical now- and forecast comparison
  • 5. Conclusions
slide-8
SLIDE 8

2 MIDAS regressions

Ghysels, Sinko, and Valkanov (2007) EctrRev, Clements and Galvão (2009) JBES

MIDAS equation for quarterly GDP growth $y_{t_q+h_q}$ and forecast horizon $h_q$, explained by one particular monthly indicator $x_{t_m}$:

$$y_{t_q+h_q} = \beta_0 + \beta_1 b(L_m, \theta) x^{(3)}_{t_m} + \varepsilon_{t_q+h_q} \qquad (1)$$

$$b(L_m, \theta) = \sum_{k=0}^{K} c(k, \theta) L_m^k, \qquad c(k, \theta) = \frac{\exp(\theta_1 k + \theta_2 k^2)}{\sum_{k=0}^{K} \exp(\theta_1 k + \theta_2 k^2)} \qquad (2)$$

with $L_m x_{t_m} = x_{t_m-1}$; time indices are related by $t_q = t_m/3$

the superscript in $x^{(3)}_{t_m}$ indicates that the sampling frequency of $x_{t_m}$ is three times higher than that of $y_{t_q}$
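The weighting scheme in equation (2) can be sketched in a few lines of Python. This is only an illustration: the function names `exp_almon_weights` and `midas_term` are hypothetical, and the indicator is assumed to be a plain list indexed by month.

```python
import math

def exp_almon_weights(theta1, theta2, K):
    """Normalized exponential Almon weights c(k, theta) for k = 0..K, as in (2)."""
    raw = [math.exp(theta1 * k + theta2 * k * k) for k in range(K + 1)]
    total = sum(raw)
    return [w / total for w in raw]

def midas_term(x_monthly, t_m, theta1, theta2, K):
    """b(L_m, theta) x_{t_m}: weighted sum of x at months t_m, t_m-1, ..., t_m-K.

    Assumes t_m - K >= 0 so that all lags exist in the list.
    """
    weights = exp_almon_weights(theta1, theta2, K)
    return sum(w * x_monthly[t_m - k] for k, w in enumerate(weights))

# theta = (0, 0) gives equal weights; a negative theta2 makes weights decay in k
flat = exp_almon_weights(0.0, 0.0, 3)
decaying = exp_almon_weights(0.1, -0.05, 11)
```

Because the weights sum to one, $\beta_1$ carries the overall scale of the indicator's effect, while $\theta_1, \theta_2$ only shape how that effect is distributed over the lags.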

slide-9
SLIDE 9

2.1 Dynamics

GDP is explained by the monthly indicator and its monthly lags $x^{(3)}_{t_m-i}$, $i = 0, 1, \ldots, K$

with unrestricted $b(L_m)$, large $K$ leads to an overparameterized model

to solve this, the lag polynomial is a non-linear exponential (Almon) function determined by two coefficients $\theta_1, \theta_2$ only, to be estimated by nonlinear least squares (NLS)

for large lag orders and mixing of data with very different frequencies, MIDAS is parametrically very parsimonious
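To make the NLS step concrete, here is a minimal sketch that exploits the structure of the problem: for fixed $\theta$ the model is linear in $\beta_0, \beta_1$, so these can be concentrated out by OLS and $\theta$ searched over a grid. The grid search is a simplification chosen for illustration, not the estimation routine used in the presentation, and all names and the simulated data are hypothetical.

```python
import math

def almon(theta1, theta2, K):
    raw = [math.exp(theta1 * k + theta2 * k * k) for k in range(K + 1)]
    s = sum(raw)
    return [w / s for w in raw]

def ols_line(z, y):
    """Closed-form OLS of y on a constant and z; returns (b0, b1, ssr)."""
    n = len(y)
    zbar, ybar = sum(z) / n, sum(y) / n
    szz = sum((zi - zbar) ** 2 for zi in z)
    szy = sum((zi - zbar) * (yi - ybar) for zi, yi in zip(z, y))
    b1 = szy / szz
    b0 = ybar - b1 * zbar
    ssr = sum((yi - b0 - b1 * zi) ** 2 for zi, yi in zip(z, y))
    return b0, b1, ssr

def fit_midas_grid(y, x, months, K, grid):
    """Minimize the sum of squared residuals over a theta grid.

    y[i] is the quarterly target paired with monthly index months[i].
    Returns (beta0, beta1, theta1, theta2, ssr).
    """
    best = None
    for th1 in grid:
        for th2 in grid:
            w = almon(th1, th2, K)
            z = [sum(wk * x[t - k] for k, wk in enumerate(w)) for t in months]
            b0, b1, ssr = ols_line(z, y)
            if best is None or ssr < best[-1]:
                best = (b0, b1, th1, th2, ssr)
    return best

# Simulated, noise-free data with known parameters (purely illustrative)
x = [math.sin(1.3 * t) + math.cos(0.7 * t) for t in range(60)]
months = list(range(8, 60, 3))            # every third month: skip-sampling
true_w = almon(0.5, -0.5, 5)
y = [1.0 + 2.0 * sum(wk * x[t - k] for k, wk in enumerate(true_w)) for t in months]
fit = fit_midas_grid(y, x, months, K=5, grid=[-1.0, -0.5, 0.0, 0.5])
```

In practice a gradient-based optimizer would replace the grid, but the point stands: only two shape parameters are estimated regardless of how large $K$ is.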

slide-10
SLIDE 10

2.2 Forecasting

MIDAS equation is based on the direct estimation technique, see Marcellino, Stock, and Watson (2006) JEctrcs: LHS variables are $(t_q + h_q)$-dated, whereas RHS predictors are $t_m$-dated

for each horizon $h_q = h_m/3$, we have a different model:

$$y_{T_q+h_q|T_m} = \beta_0 + \beta_1 b(L_m, \widehat{\theta}) x_{T_m} \qquad (3)$$

Note: For estimation, we have to use the skip-sampled $x^{(3)}_{t_m}$, whereas MIDAS exploits all monthly (lagged) observations $x_{T_m}, x_{T_m-1}, \ldots$ in the forecast through $b(L_m, \widehat{\theta}) x_{T_m}$
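The difference between skip-sampled estimation rows and the forecast row can be illustrated as follows. The indexing convention is an assumption made for this sketch: months are 0-based and quarter $q$ ends in month $3q + 2$; the function names are hypothetical.

```python
def estimation_rows(x, K, h_m, n_quarters):
    """One regressor row per quarter: monthly lags ending h_m months before quarter end.

    Returns a list of (quarter index, [x_t, x_{t-1}, ..., x_{t-K}]).
    Quarters whose lags would reach before the start of x are dropped.
    """
    rows = []
    for q in range(n_quarters):
        t = 3 * q + 2 - h_m              # last month used to explain quarter q
        if t - K >= 0:
            rows.append((q, [x[t - k] for k in range(K + 1)]))
    return rows

def forecast_row(x, K):
    """Lags ending at the latest available month T_m, used to form the forecast."""
    T = len(x) - 1
    return [x[T - k] for k in range(K + 1)]

x = list(range(12))                      # twelve months of a toy indicator
rows = estimation_rows(x, K=2, h_m=1, n_quarters=4)
latest = forecast_row(x, K=2)
```

Each estimation row starts three months after the previous one, so the regression uses one observation per quarter, while the forecast row always starts at the most recent month.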

slide-11
SLIDE 11

2.3 Additional MIDAS features

MIDAS regressions take into account the most recent observations of the indicator, which is often available in a more timely manner than GDP

MIDAS can be augmented by autoregressive terms, taking into account potential hikes in the impulse response function (IRF), Clements and Galvão (2009) JBES

more than one predictor can be taken into account simply by adding terms to the regression: per predictor $x_{j,t_m}$, the number of coefficients increases by three ($\beta_{1,j}, \theta_{1,j}, \theta_{2,j}$)

however, in the application below, we forecast with MIDAS and one indicator only, taken from a large set of potential predictors (single-indicator MIDAS)

slide-12
SLIDE 12

3 Factor-MIDAS with factors estimated from many indicators

Marcellino and Schumacher (2010) OBES: Factor-MIDAS based on mixed-frequency and ragged-edge data is a two-step procedure, see Boivin and Ng (2005) IJCB for the single-frequency case

  • 1. estimate factors based on monthly ragged-edge data
  • 2. make a nowcast with MIDAS, where factor estimates are the predictors
slide-13
SLIDE 13

3.1 Some simple theory behind Factor-MIDAS

Marcellino and Schumacher (2008) CEPR WP, revised: monthly one-factor ($r = 1$) model

$$X_{t_m} = \Lambda f_{t_m} + \xi_{t_m} \qquad (4)$$

and assume that monthly unobserved GDP growth is part of $X_{t_m}$, $y_{t_m} \in X_{t_m}$:

$$y_{t_m} = \lambda_y f_{t_m} + \xi_{y,t_m} \qquad (5)$$

assume AR model for the factor and idiosyncratic components

$$f_{t_m} = \rho_f f_{t_m-1} + e_{f,t_m} \qquad (6)$$

$$\xi_{y,t_m} = \rho_{\xi_y} \xi_{y,t_m-1} + e_{\xi_y,t_m} \qquad (7)$$

given this monthly factor specification, we can write the factor representation for $y$ three months ahead

$$y_{t_m+3} = \kappa_0 y_{t_m} + \kappa_1 f_{t_m} + \kappa_{e_f}(L_m) e_{f,t_m+3} + \kappa_{\xi_y}(L_m) e_{\xi_y,t_m+3} \qquad (8)$$
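The coefficients in (8) are not spelled out on the slide. Under the AR(1) assumptions (6) and (7), read as $f_{t_m} = \rho_f f_{t_m-1} + e_{f,t_m}$, one way to obtain them is to iterate both processes three months forward and substitute $\xi_{y,t_m} = y_{t_m} - \lambda_y f_{t_m}$ from (5). The following is a worked sketch of that step, not a quotation from the paper:

```latex
\begin{align*}
f_{t_m+3} &= \rho_f^3 f_{t_m} + (1 + \rho_f L_m + \rho_f^2 L_m^2)\, e_{f,t_m+3} \\
\xi_{y,t_m+3} &= \rho_{\xi_y}^3 \xi_{y,t_m} + (1 + \rho_{\xi_y} L_m + \rho_{\xi_y}^2 L_m^2)\, e_{\xi_y,t_m+3} \\
y_{t_m+3} &= \lambda_y f_{t_m+3} + \xi_{y,t_m+3} \\
          &= \rho_{\xi_y}^3 y_{t_m} + \lambda_y(\rho_f^3 - \rho_{\xi_y}^3) f_{t_m}
             + \lambda_y (1 + \rho_f L_m + \rho_f^2 L_m^2)\, e_{f,t_m+3}
             + (1 + \rho_{\xi_y} L_m + \rho_{\xi_y}^2 L_m^2)\, e_{\xi_y,t_m+3}
\end{align*}
```

Matching terms with (8) gives $\kappa_0 = \rho_{\xi_y}^3$, $\kappa_1 = \lambda_y(\rho_f^3 - \rho_{\xi_y}^3)$, $\kappa_{e_f}(L_m) = \lambda_y(1 + \rho_f L_m + \rho_f^2 L_m^2)$ and $\kappa_{\xi_y}(L_m) = 1 + \rho_{\xi_y} L_m + \rho_{\xi_y}^2 L_m^2$.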

slide-14
SLIDE 14

consider time aggregation

$$y_{t_q} = (1 + 2L_m + 3L_m^2 + 2L_m^3 + L_m^4) y_{t_m} = \varphi_y(L_m) y_{t_m}$$

for $t_m = 3, 6, 9, \ldots$ and $t_m = 3t_q$, and multiply by $\varphi_y(L_m)$ on both sides of the equation

$$(1 - \kappa_0 L_m^3) \varphi_y(L_m) y_{t_m+3} = \kappa_1 \varphi_y(L_m) f_{t_m} + \varphi_y(L_m) \left( \kappa_{e_f}(L_m) e_{f,t_m+3} + \kappa_{\xi_y}(L_m) e_{\xi_y,t_m+3} \right) \qquad (9)$$

as time aggregation implies $y_{t_q+1} = \varphi_y(L_m) y_{t_m+3}$, this is like a MIDAS equation with parameter restrictions from the factor model and the time aggregation scheme

Factor-MIDAS should be regarded as an approximation rather than a structural model

the residual in MIDAS can be serially dependent, which is typical for direct estimation
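The time aggregation polynomial with weights (1, 2, 3, 2, 1) can be applied mechanically to a monthly series. The sketch below assumes a 0-based list of monthly growth rates, so that months $t_m = 3, 6, 9, \ldots$ correspond to list indices 2, 5, 8, ...; the function name is hypothetical.

```python
AGG_WEIGHTS = [1, 2, 3, 2, 1]   # coefficients of 1 + 2L + 3L^2 + 2L^3 + L^4

def aggregate_quarterly(y_monthly):
    """Apply the aggregation polynomial at the last month of each quarter.

    The first quarter is skipped because four monthly lags are not yet available.
    """
    out = []
    for t in range(len(y_monthly)):
        is_quarter_end = (t + 1) % 3 == 0
        if is_quarter_end and t >= len(AGG_WEIGHTS) - 1:
            out.append(sum(w * y_monthly[t - j] for j, w in enumerate(AGG_WEIGHTS)))
    return out

constant = aggregate_quarterly([1] * 9)
impulse = aggregate_quarterly([0, 0, 0, 0, 0, 1, 0, 0, 0])
```

The impulse example shows how a single monthly shock in month 6 enters the second quarter with weight 1 and the third quarter with weight 2, which is the overlap that makes the aggregated residual serially dependent.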

slide-15
SLIDE 15

3.2 Factor estimation with ragged-edge data

assumption: monthly $(N \times 1)$ vector $X_{t_m}$ with $N$ large has a factor structure

$$X_{t_m} = \Lambda F_{t_m} + \xi_{t_m} \qquad (10)$$

with factors $F_{t_m} = (f_{1,t_m}, \ldots, f_{r,t_m})$, loadings $\Lambda$, idiosyncratic components $\xi_{t_m}$

if data $X = (X_1, \ldots, X_{T_m})$ is balanced, there are different ways to estimate $F$: principal components analysis (PCA) as in Stock and Watson (2002) JBES; dynamic PCA as in Forni et al. (2005) JASA; subspace algorithms, Marcellino and Kapetanios (2009) JTSA

this 'first wave' of factor forecast papers does not consider the difficulties of real-time data (ragged edge, mixed frequencies)

slide-16
SLIDE 16

3.2.1 Vertical realignment of the data and dynamic PCA

same procedure as in New Eurocoin, Altissimo et al. (2006) CEPR WP

variable $i$ is released with $k_i$ months of publication lag → in period $T_m$, the final observation available is in period $T_m - k_i$

balancing by 'vertical' realignment

$$\tilde{x}_{i,T_m} = x_{i,T_m-k_i} \qquad (11)$$

applying this procedure for each series and harmonising at the beginning of the sample yields a balanced data set $\tilde{X}_{t_m}$

factor estimation from $\tilde{X}_{t_m}$ by dynamic PCA, Forni et al. (2005) JASA
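Vertical realignment as in (11) amounts to shifting each series forward by its own publication lag and then trimming the start of the sample. A minimal sketch, assuming each series is a list over the same months with `None` marking values not yet published (the function name and data layout are hypothetical):

```python
def vertical_realign(series, lags):
    """Shift series i forward by its publication lag k_i, then trim the sample start.

    series: dict name -> list of monthly values, None where unpublished at the end
    lags:   dict name -> k_i (number of missing months at the end of the sample)
    """
    k_max = max(lags.values())
    balanced = {}
    for name, values in series.items():
        k = lags[name]
        shifted = [values[t - k] if t - k >= 0 else None for t in range(len(values))]
        balanced[name] = shifted[k_max:]   # harmonise the beginning of the sample
    return balanced

ragged = {
    "survey": [1, 2, 3, 4, 5],             # published without delay
    "orders": [10, 20, 30, None, None],    # two months of publication lag
}
balanced = vertical_realign(ragged, {"survey": 0, "orders": 2})
```

The result is a rectangular data set, at the cost of changing the timing relationship between series whenever publication lags differ.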

slide-17
SLIDE 17

3.2.2 Kalman smoother estimation in a large state-space model

Doz, Giannone, and Reichlin (2006) ECB WP, Giannone, Reichlin, and Small (2008) JME:

$$X_{t_m} = \Lambda F_{t_m} + \xi_{t_m} \qquad (12)$$

$$\Psi(L_m) F_{t_m} = B \eta_{t_m} \qquad (13)$$

factor VAR with $\Psi(L_m) = \sum_{i=1}^{p} \Psi_i L_m^i$ and $L_m x_{t_m} = x_{t_m-1}$; the $q$-dimensional vector $\eta_{t_m}$ contains the dynamic shocks that drive the factors; the identification matrix $B$ is $(r \times q)$-dimensional

model has a state-space representation with factors as states; the Kalman smoother provides factor estimates given the coefficients

coefficients are estimated outside the state-space model based on initial PCA factors; no iterative ML, rather a two-step procedure
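How a state-space model copes with the ragged edge can be seen in a stripped-down case. The sketch below is a forward Kalman filter only (not the smoother), for a single AR(1) factor with uncorrelated idiosyncratic noise, and it takes all parameters as given, in the spirit of the two-step approach. Observations are processed one at a time, so missing values are simply skipped. The function name and parameterisation are assumptions for illustration.

```python
def kalman_filter_one_factor(X, lam, r, rho, q, f0=0.0, p0=1.0):
    """Filtered factor means and variances for x_it = lam_i * f_t + noise_it.

    X:   list of rows (one per month); entries are floats or None if missing
    lam: loadings, r: idiosyncratic variances, rho: AR(1) coefficient of the factor
    q:   variance of the factor shock; f0, p0: initial state mean and variance
    """
    f, p = f0, p0
    out = []
    for row in X:
        # prediction step for the AR(1) factor
        f, p = rho * f, rho * rho * p + q
        # sequential scalar updates; valid because the noises are uncorrelated
        for x_i, lam_i, r_i in zip(row, lam, r):
            if x_i is None:
                continue                  # ragged edge: nothing to update with
            gain = p * lam_i / (lam_i * lam_i * p + r_i)
            f = f + gain * (x_i - lam_i * f)
            p = (1.0 - gain * lam_i) * p
        out.append((f, p))
    return out

no_data = kalman_filter_one_factor([[None, None]], [1.0, 1.0], [1.0, 1.0],
                                   rho=0.5, q=0.1, f0=1.0, p0=1.0)
precise = kalman_filter_one_factor([[2.0]], [2.0], [1e-12], rho=1.0, q=1.0)
```

With no data in a month the estimate is just the model prediction, and each available indicator tightens it; this is why the approach needs no realignment of the series.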

slide-18
SLIDE 18

4 Empirical nowcast comparison

4.1 Data and nowcast simulation design

data: German quarterly GDP from 1992Q1 until 2007Q4, 111 monthly indicators

recursive design with increasing sample size: each month, we re-estimate models and compute new nowcasts with monthly horizon $h_m = 1, 2, \ldots, 6$

evaluation sample from 2000Q1 until 2007Q4

statistic: MSE relative to in-sample mean naive forecast
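The evaluation statistic can be sketched as follows. The benchmark here is the mean of all GDP growth observations before the target quarter, recomputed as the sample grows; this ignores the publication lag of GDP, which is a simplification made for the example. The function name and toy numbers are hypothetical.

```python
def relative_mse(y, model_forecasts, first_eval):
    """MSE of a model relative to the recursive in-sample mean benchmark.

    y:               realized quarterly growth rates
    model_forecasts: model_forecasts[t] is the forecast made for y[t]
    first_eval:      index of the first quarter in the evaluation sample (>= 1)
    """
    num = den = 0.0
    for t in range(first_eval, len(y)):
        naive = sum(y[:t]) / t            # mean of the sample available before t
        num += (y[t] - model_forecasts[t]) ** 2
        den += (y[t] - naive) ** 2
    return num / den

y = [1.0, 3.0, 2.0, 4.0]
perfect = relative_mse(y, y, first_eval=2)
same_as_naive = relative_mse(y, [0.0, 0.0, 2.0, 2.0], first_eval=2)
```

Values below one mean the model beats the naive benchmark over the evaluation sample, which is how the entries in the result tables should be read.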

slide-19
SLIDE 19

4.2 Empirical results

slide-20
SLIDE 20

4.2.1 Fixed specifications

columns: monthly horizon h_m; h_m = 1, 2, 3 are nowcasts of the current quarter, h_m = 4, 5, 6 are forecasts one quarter ahead

horizon h_m                                            1     2     3     4     5     6

A. Single-indicator MIDAS
survey: bus. exp., wholesale trade           MIDAS     0.72  0.67  0.78  0.80  0.67  0.87
survey: bus. exp., consumer goods prod.      MIDAS     0.78  0.67  0.75  0.89  0.92  0.96
survey: bus. exp., retail trade              AR-MIDAS  0.79  0.79  0.87  1.16  1.17  0.91
survey consumer sentiment (GfK)              AR-MIDAS  0.74  0.84  0.95  0.93  1.12  1.24
long-term interest rate (1-2 years mat.)     MIDAS     0.86  0.90  0.87  0.89  0.85  0.86
production, intermediate goods prod.         MIDAS     0.82  0.91  0.95  0.96  0.99  1.03
orders (domestic), intermediate goods prod.  MIDAS     0.88  0.92  0.99  1.23  1.39  1.04

B. Large factor models
VA-DPCA, r = 1, q = 1                        AR-MIDAS  0.77  0.66  0.84  1.00  0.97  1.09
VA-DPCA, r = 1, q = 1                        MIDAS     0.69  0.76  0.96  0.96  1.02  1.08
KFS-PCA, r = 1, q = 1                        MIDAS     0.73  0.89  0.85  1.09  1.07  0.89
VA-DPCA, r = 2, q = 2                        AR-MIDAS  0.85  0.77  0.93  1.08  1.03  1.06
KFS-PCA, r = 1, q = 1                        AR-MIDAS  0.85  0.90  0.82  1.12  1.08  1.02

slide-21
SLIDE 21

4.2.2 Model selection based on information criteria and past performance

columns: monthly horizon h_m; h_m = 1, 2, 3 are nowcasts of the current quarter, h_m = 4, 5, 6 are forecasts one quarter ahead

horizon h_m                                            1     2     3     4     5     6

A. Information criteria model selection
single-indicator MIDAS/AR-MIDAS  BIC                   0.96  1.07  1.50  1.04  1.70  1.01
VA-DPCA, MIDAS                   Bai, Ng (2002, 2007)  1.09  1.00  0.95  1.19  1.35  1.08
VA-DPCA, AR-MIDAS                Bai, Ng (2002, 2007)  1.17  0.84  0.83  1.28  1.05  0.77
KFS-PCA, MIDAS                   Bai, Ng (2002, 2007)  1.29  1.66  0.83  1.59  1.04  0.73
KFS-PCA, AR-MIDAS                Bai, Ng (2002, 2007)  1.48  1.53  0.88  1.26  1.22  1.07

B. Model and variable selection by past MSE performance
single-indicator MIDAS           MSE                   0.86  1.26  0.99  1.20  1.05  1.24
large factor models              MSE                   0.89  0.84  0.85  0.93  0.91  0.66

slide-22
SLIDE 22

4.2.3 Pooling MSE

columns: monthly horizon h_m; h_m = 1, 2, 3 are nowcasts of the current quarter, h_m = 4, 5, 6 are forecasts one quarter ahead

horizon h_m                                1     2     3     4     5     6

single-indicator MIDAS  equal-weight mean  0.90  0.93  0.95  0.94  0.97  1.00
single-indicator MIDAS  MSE-weighted mean  0.84  0.89  0.92  0.88  0.92  0.95
single-indicator MIDAS  median             0.95  0.97  0.98  1.00  1.01  1.04
large factor models     equal-weight mean  0.88  0.91  0.83  1.00  0.89  0.64
large factor models     MSE-weighted mean  0.87  0.88  0.80  0.82  0.88  0.61
large factor models     median             0.89  0.84  0.85  0.93  0.91  0.66
all                     equal-weight mean  0.76  0.81  0.83  0.94  0.88  0.75
all                     MSE-weighted mean  0.79  0.84  0.82  0.81  0.84  0.67
all                     median             0.79  0.81  0.85  0.94  0.91  0.78
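The three pooling schemes named in this table can be sketched as below. The slides do not state the exact weighting formula, so the MSE-weighted mean is implemented here with weights proportional to the inverse of each model's past MSE, which is one common choice and an assumption of this example; the function name is hypothetical.

```python
import statistics

def pool(forecasts, scheme="mean", past_mse=None):
    """Combine individual model nowcasts into one pooled nowcast.

    forecasts: list of nowcasts from the individual models
    scheme:    "mean" (equal weights), "median", or "mse" (inverse past-MSE weights)
    past_mse:  past MSE per model, required for scheme "mse"
    """
    if scheme == "mean":
        return sum(forecasts) / len(forecasts)
    if scheme == "median":
        return statistics.median(forecasts)
    if scheme == "mse":
        inverse = [1.0 / m for m in past_mse]      # assumed weighting rule
        total = sum(inverse)
        return sum(w / total * f for w, f in zip(inverse, forecasts))
    raise ValueError("unknown pooling scheme: " + scheme)

nowcasts = [1.0, 2.0, 6.0]
pooled_mean = pool(nowcasts)
pooled_median = pool(nowcasts, "median")
pooled_mse = pool(nowcasts, "mse", past_mse=[1.0, 1.0, 2.0])
```

The median ignores the outlying third nowcast entirely, whereas the inverse-MSE scheme only downweights it because that model performed worse in the past.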

slide-23
SLIDE 23

4.2.4 Pooling percentiles

columns: monthly horizon h_m; h_m = 1, 2, 3 are nowcasts of the current quarter, h_m = 4, 5, 6 are forecasts one quarter ahead

horizon h_m                             1     2     3     4     5     6

A. Pooling vs single-indicator MIDAS
MIDAS models         equal-weight mean  0.13  0.16  0.15  0.13  0.21  0.22
MIDAS models         MSE-weighted mean  0.08  0.11  0.10  0.07  0.11  0.13
MIDAS models         median             0.23  0.22  0.21  0.26  0.31  0.29
all                  equal-weight mean  0.03  0.04  0.02  0.13  0.07  0.02
all                  MSE-weighted mean  0.05  0.07  0.02  0.02  0.04  0.00
all                  median             0.05  0.04  0.03  0.13  0.10  0.02

B. Pooling vs individual large factor models
large factor models  equal-weight mean  0.25  0.22  0.11  0.20  0.06  0.08
large factor models  MSE-weighted mean  0.24  0.17  0.06  0.00  0.06  0.06
large factor models  median             0.25  0.14  0.13  0.06  0.07  0.10
all                  equal-weight mean  0.07  0.12  0.11  0.09  0.06  0.21
all                  MSE-weighted mean  0.12  0.14  0.08  0.00  0.04  0.10
all                  median             0.11  0.12  0.13  0.08  0.07  0.24
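The slides do not define how these percentiles are computed. One plausible reading, assumed here, is the share of individual models whose relative MSE is lower than that of the pooled nowcast, so that small values mean pooling ranks near the top of the distribution. The function name and numbers are hypothetical.

```python
def percentile_rank(pooled_mse, individual_mses):
    """Share of individual models that achieve a lower MSE than the pooled nowcast."""
    better = sum(1 for m in individual_mses if m < pooled_mse)
    return better / len(individual_mses)

# Toy example: one of four individual models beats the pooled nowcast
rank = percentile_rank(0.8, [0.7, 0.9, 1.0, 1.1])
```

Under this reading, an entry of 0.00 would mean that no individual model in the comparison set outperformed the pooled nowcast at that horizon.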

slide-24
SLIDE 24

5 Conclusions

single-indicator MIDAS and Factor-MIDAS models can tackle ragged-edge and mixed-frequency data for nowcasting

data mining: very good single models can be identified on an ex-post basis only

information criteria do not work well; selection by past performance works only for factor models, and only to some extent

pooling often outperforms information-criteria model selection and the majority of fixed specifications (but not all)

pooling is more stable over time