SLIDE 1

Forecasting Inflation Using Dynamic Model Averaging

Gary Koop and Dimitris Korobilis

September 20, 2010

Gary Koop and Dimitris Korobilis () Dynamic Model Averaging September 20, 2010 1 / 33

SLIDE 6

Introduction

Macroeconomists typically have many time series variables

But even with all this information forecasting of macroeconomic variables like inflation, GDP growth, etc. can be very hard

Sometimes hard to beat very simple forecasting procedures (e.g. random walk)

Imagine a regression of inflation on many predictors

Such a regression might fit well in practice, but forecast poorly

SLIDE 12

Why? There are many reasons, but three stand out:

Regressions with many predictors can over-fit (over-parameterization problems)

Marginal effects of predictors change over time (parameter change/structural breaks)

The relevant forecasting model may change (model change)

This paper uses an approach called Dynamic Model Averaging (DMA) in an attempt to address these problems

Application: Forecasting US inflation shows DMA works quite well

SLIDE 17

The Generalized Phillips Curve

Generalized Phillips curve: Inflation dependent on lagged inflation, unemployment and other predictors

Many papers use generalized Phillips curve models for inflation forecasting

Regression-based methods based on:

$$y_t = \phi + x_{t-1}' \beta + \sum_{j=1}^{p} \gamma_j y_{t-j} + \varepsilon_t$$

$y_t$ is inflation and $x_{t-1}$ are lags of other predictors

To make things concrete, following is our list of predictors (other papers use similar)

SLIDE 18

UNEMP: unemployment rate.

CONS: the percentage change in real personal consumption expenditures.

INV: the percentage change in private residential fixed investment.

GDP: the percentage change in real GDP.

HSTARTS: the log of housing starts (total new privately owned housing units).

EMPLOY: the percentage change in employment (All Employees: Total Private Industries, seasonally adjusted).

PMI: the change in the Institute of Supply Management (Manufacturing): Purchasing Manager’s Composite Index.

SLIDE 19

WAGE: the percentage change in average hourly earnings in manufacturing.

TBILL: three month Treasury bill (secondary market) rate.

SPREAD: the spread between the 10 year and 3 month Treasury bill rates.

DJIA: the percentage change in the Dow Jones Industrial Average.

MONEY: the percentage change in the money supply (M1).

INFEXP: University of Michigan measure of inflation expectations.

COMPRICE: the change in the commodities price index (NAPM commodities price index).

VENDOR: the change in the NAPM vendor deliveries index.

SLIDE 23

Forecasting With Generalized Phillips Curve

Write more compactly as: $y_t = z_t \theta + \varepsilon_t$

$z_t$ contains all predictors, lagged inflation, an intercept

Note $z_t$ = information available for forecasting $y_t$

When forecasting $h$ periods ahead, $z_t$ will contain variables dated $t - h$ or earlier
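A minimal sketch of how the regressor vector could be assembled, assuming the data sit in plain Python lists. The function name `build_design` and its arguments are illustrative and do not come from the slides; `p` is the number of inflation lags and `h` the forecast horizon, so every entry of a row is dated t - h or earlier.

```python
def build_design(y, x, p, h=1):
    """Build rows z_t = [1, x_{t-h}, y_{t-h}, ..., y_{t-h-p+1}] and targets y_t.

    y: list of inflation observations
    x: list of lists, x[t] holds the other predictors observed at time t
    p: number of lags of y (assumed choice, not fixed by the slides)
    h: forecast horizon
    """
    rows, targets = [], []
    start = h + p - 1  # first t for which all required lags exist
    for t in range(start, len(y)):
        lags = [y[t - h - j] for j in range(p)]
        rows.append([1.0] + list(x[t - h]) + lags)
        targets.append(y[t])
    return rows, targets


# Toy data, purely illustrative.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
x = [[10.0], [20.0], [30.0], [40.0], [50.0]]
rows, targets = build_design(y, x, p=2, h=1)
print(rows[0], targets[0])
```

With `h=1` each row matches the regression on the previous slide: an intercept, the predictors dated t - 1, and the lags of inflation from t - 1 back to t - p.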

SLIDE 30

Consider forecasting $y_{\tau+1}$.

Recursive forecasting methods: $\hat{\theta}$ = estimate using data through $\tau$.

So $\hat{\theta}$ will change (a bit) with $\tau$, but can change too slowly

Rolling forecasts use: $\hat{\theta}$, an estimate using data from $\tau - \tau_0$ through $\tau$.

Better at capturing parameter change, but need to choose $\tau_0$

Recursive and rolling forecasts might be imperfect solutions

Why not use a model which formally models the parameter change as well?
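The difference between recursive and rolling estimation can be sketched on toy data. This is only an illustration under simplifying assumptions: a single regressor plus intercept, closed-form OLS, and a made-up break in the slope halfway through the sample. The helper names are hypothetical.

```python
def ols_fit(xs, ys):
    """Closed-form OLS for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((xi - mean_x) ** 2 for xi in xs)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def recursive_estimate(xs, ys, tau):
    """Estimate using all data through time tau (expanding window)."""
    return ols_fit(xs[: tau + 1], ys[: tau + 1])


def rolling_estimate(xs, ys, tau, tau0):
    """Estimate using only data from tau - tau0 through tau."""
    start = max(0, tau - tau0)
    return ols_fit(xs[start : tau + 1], ys[start : tau + 1])


# Toy data (assumption): the slope jumps from 2 to 5 at t = 50.
xs = [float(t % 10) for t in range(100)]
ys = [(2.0 if t < 50 else 5.0) * xi for t, xi in enumerate(xs)]

tau = 99
_, b_recursive = recursive_estimate(xs, ys, tau)
_, b_rolling = rolling_estimate(xs, ys, tau, tau0=19)
print(b_recursive, b_rolling)
```

The recursive slope ends up between the old and new values because it pools both regimes, while the rolling slope reflects only the recent regime. The price is that the window length `tau0` has to be chosen by hand.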

SLIDE 34

Time Varying Parameter (TVP) Models

TVP models gaining popularity in empirical macroeconomics

$$y_t = z_t \theta_t + \varepsilon_t$$

$$\theta_t = \theta_{t-1} + \eta_t$$

$$\varepsilon_t \overset{ind}{\sim} N(0, H_t)$$

$$\eta_t \overset{ind}{\sim} N(0, Q_t)$$

Standard statistical methods (e.g. involving Kalman filter and state smoother) exist for them
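To make the TVP model concrete, the following sketch simulates data from it. It assumes a constant measurement variance `h` and a state innovation covariance of the form `q` times the identity; both are simplifications chosen for illustration, and the function name is hypothetical.

```python
import random


def simulate_tvp(z, theta0, h, q, seed=0):
    """Simulate y_t = z_t . theta_t + eps_t with theta_t = theta_{t-1} + eta_t.

    z: list of regressor vectors z_t
    theta0: starting coefficient vector
    h: variance of eps_t (held constant here, an assumption)
    q: variance of each element of eta_t (Q_t = q * I here, an assumption)
    Returns (ys, path) where path[t] is the coefficient vector at time t.
    """
    rng = random.Random(seed)
    theta = list(theta0)
    ys, path = [], []
    for z_t in z:
        # Random-walk evolution of the coefficients.
        theta = [th + rng.gauss(0.0, q ** 0.5) for th in theta]
        # Measurement equation.
        y_t = sum(zi * th for zi, th in zip(z_t, theta)) + rng.gauss(0.0, h ** 0.5)
        ys.append(y_t)
        path.append(theta)
    return ys, path


z = [[1.0, float(t % 4)] for t in range(40)]
ys, path = simulate_tvp(z, theta0=[0.5, 1.0], h=0.1, q=0.01)
print(len(ys), path[-1])
```

Setting `q=0` switches off the time variation, which gives back the constant-coefficient regression used earlier.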

SLIDE 41

Why not use TVP model to forecast inflation?

Advantage: models parameter change in a formal manner

Disadvantage: same predictors used at all points in time.

If number of predictors large, over-fit, over-parameterization problems

In our empirical work, we show very poor forecast performance

Bayesian model averaging (BMA) methods are a popular way of addressing this problem in cross-sectional regression

How to adapt BMA to TVP models?

SLIDE 47

Dynamic Model Averaging (DMA)

Define $K$ models which have $z_t^{(k)}$, for $k = 1, .., K$, as predictors

$z_t^{(k)}$ is subset of $z_t$.

Set of models:

$$y_t = z_t^{(k)} \theta_t^{(k)} + \varepsilon_t^{(k)}$$

$$\theta_{t+1}^{(k)} = \theta_t^{(k)} + \eta_t^{(k)}$$

$\varepsilon_t^{(k)}$ is $N(0, H_t^{(k)})$

$\eta_t^{(k)}$ is $N(0, Q_t^{(k)})$

Let $L_t \in \{1, 2, .., K\}$ denote which model applies at $t$
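One way to picture the model set is to enumerate every subset of the predictor list. The sketch below does this for three of the predictor names listed earlier. It assumes that each predictor can be switched in or out independently, which the slides suggest but do not spell out, and the function name is illustrative.

```python
from itertools import combinations


def enumerate_models(predictors):
    """Return every subset of `predictors`, one tuple per candidate model."""
    models = []
    for size in range(len(predictors) + 1):
        models.extend(combinations(predictors, size))
    return models


# Three predictors give 2**3 = 8 candidate models.
models = enumerate_models(["UNEMP", "TBILL", "INFEXP"])
for k, model in enumerate(models, start=1):
    print(k, model)
```

Each tuple plays the role of one model's predictor set, and the loop index plays the role of the model label k.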

SLIDE 56

Why not just forecast using BMA over these TVP models at every point in time?

Different weights in averaging at every point in time.

Or why not just select a single TVP forecasting model at every point in time?

Different forecasting models selected at each point in time.

If $K$ is large (e.g. $K = 2^m$), this is computationally infeasible.

With cross-sectional BMA have to work with model space $K = 2^m$ which is computationally burdensome

In present time series context, forecasting through time $\tau$ involves $2^{m\tau}$ models.

Also, Bayesian inference in TVP model requires MCMC (unlike cross-sectional regression). Computationally burdensome.

Even clever algorithms like MC-cubed are not good enough to handle this.
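The scale of the problem is easy to check with exact integer arithmetic. The values below are assumptions for illustration only: `m = 15` is the number of predictors in the list given earlier, and `tau = 100` is an arbitrary sample length.

```python
def model_space_size(m):
    """Number of models when each of m predictors is either in or out."""
    return 2 ** m


def model_path_count(m, tau):
    """Number of model sequences when any model may apply at each of tau dates."""
    return 2 ** (m * tau)


m, tau = 15, 100  # illustrative values, not taken from the slides
print(model_space_size(m))
print(len(str(model_path_count(m, tau))), "decimal digits")
```

Even the static model space is in the tens of thousands, and the number of sequences over time is far beyond anything that could be enumerated.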

SLIDE 63

Another strategy has been used to deal with similar problems in different contexts (e.g. multiple structural breaks): Markov switching

Markov transition matrix, $P$, with elements $p_{ij} = \Pr(L_t = i \mid L_{t-1} = j)$ for $i, j = 1, .., K$.

“If $j$ is the forecasting model at $t - 1$, we switch to forecasting model $i$ at time $t$ with probability $p_{ij}$”

Bayesian inference is theoretically straightforward, but computationally infeasible

$P$ is $K \times K$: an enormous matrix.

Even if computation were possible, imprecise estimation of so many parameters

SLIDE 67

Solution: DMA

Adopt approach used by Raftery et al (2007 working paper) in an engineering application

Involves two approximations

First approximation means we do not need MCMC in each TVP model (only Kalman filtering and smoothing)

See paper for details. Idea: replace $Q_t^{(k)}$ and $H_t^{(k)}$ by estimates

SLIDE 73

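Sketch of some Kalman filtering ideas (where $y^{t-1}$ are observations through $t - 1$)

$$\theta_{t-1} \mid y^{t-1} \sim N(\hat{\theta}_{t-1}, \Sigma_{t-1|t-1})$$

Textbook formula for $\hat{\theta}_{t-1}$ and $\Sigma_{t-1|t-1}$

Then update

$$\theta_t \mid y^{t-1} \sim N(\hat{\theta}_{t-1}, \Sigma_{t|t-1})$$

$$\Sigma_{t|t-1} = \Sigma_{t-1|t-1} + Q_t$$

Get rid of $Q_t$ by approximating:

$$\Sigma_{t|t-1} = \frac{1}{\lambda} \Sigma_{t-1|t-1}$$

$0 < \lambda \le 1$ is forgetting factor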
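The forgetting-factor recursion can be sketched as a single predict and update step for one TVP regression. This is a simplified illustration, not the authors' code: the measurement variance `h` is treated as known and constant, whereas the slides say it is replaced by an estimate, and the function name is hypothetical.

```python
def forgetting_kalman_step(theta, sigma, z, y, lam, h):
    """One Kalman step for y_t = z_t . theta_t + eps_t with forgetting factor lam.

    theta: coefficient estimate from t-1 (list of length m)
    sigma: covariance Sigma_{t-1|t-1} (m x m list of lists)
    z: regressors at time t, y: observed value at time t
    lam: forgetting factor, 0 < lam <= 1
    h: measurement variance, assumed known here (an assumption)
    Returns (forecast, new_theta, new_sigma).
    """
    m = len(theta)
    # Prediction: Sigma_{t|t-1} = Sigma_{t-1|t-1} / lam, in place of "+ Q_t".
    pred = [[sigma[i][j] / lam for j in range(m)] for i in range(m)]
    forecast = sum(z[i] * theta[i] for i in range(m))
    # pz = Sigma_{t|t-1} z', f = forecast variance.
    pz = [sum(pred[i][j] * z[j] for j in range(m)) for i in range(m)]
    f = sum(z[i] * pz[i] for i in range(m)) + h
    error = y - forecast
    new_theta = [theta[i] + pz[i] * error / f for i in range(m)]
    new_sigma = [[pred[i][j] - pz[i] * pz[j] / f for j in range(m)] for i in range(m)]
    return forecast, new_theta, new_sigma


# Toy run (assumption): intercept-only model whose level jumps from 1 to 3.
theta, sigma = [0.0], [[100.0]]
for t in range(100):
    y_t = 1.0 if t < 50 else 3.0
    _, theta, sigma = forgetting_kalman_step(theta, sigma, [1.0], y_t, lam=0.95, h=1.0)
print(theta)
```

With `lam=1.0` the prediction step leaves the covariance unchanged, so the recursion reduces to recursive least squares with no time variation in the coefficients.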

SLIDE 82

Forgetting factors like this have long been used in state space literature

Implies that observations $j$ periods in the past have weight $\lambda^j$.

Or effective window size of $\frac{1}{1-\lambda}$.

Choose value of $\lambda$ near one

$\lambda = 0.99$: observations five years ago 80% as much weight as last period’s observation.

$\lambda = 0.95$: observations five years ago 35% as much weight as last period’s observations.

We focus on $\lambda \in [0.95, 1.00]$.

If $\lambda = 1$ no time variation in parameters (standard recursive forecasting)

Main results for $\lambda = 0.99$
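The quoted percentages can be reproduced with a few lines of arithmetic. The sketch assumes quarterly data, so that five years corresponds to 20 periods. The slides do not state the frequency here, but that assumption matches the 80% and 35% figures.

```python
def relative_weight(lam, j):
    """Weight on an observation j periods back, relative to the latest one."""
    return lam ** j


def effective_window(lam):
    """Effective window size 1 / (1 - lam); unbounded when lam equals 1."""
    return float("inf") if lam == 1.0 else 1.0 / (1.0 - lam)


periods = 20  # five years of quarterly data (assumption)
for lam in (0.99, 0.95):
    print(lam, relative_weight(lam, periods), effective_window(lam))
```

The weights come out at roughly 0.82 and 0.36, and the effective windows at about 100 and 20 periods, so the smaller forgetting factor discounts the past much faster.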

SLIDE 87

Back to Model Averaging/Selection

Goal for forecasting at time $t$ is $\pi_{t|t-1,k} \equiv \Pr(L_t = k \mid y^{t-1})$

Can average across $k = 1, .., K$ forecasts using $\pi_{t|t-1,k}$ as weights (DMA)

E.g. point forecasts ($\hat{\theta}_{t-1}^{(k)}$ from Kalman filter in model $k$):

$$E(y_t \mid y^{t-1}) = \sum_{k=1}^{K} \pi_{t|t-1,k} \, z_t^{(k)} \hat{\theta}_{t-1}^{(k)}$$

Can forecast with model $j$ at time $t$ if $\pi_{t|t-1,j}$ is highest (Dynamic model selection: DMS)

Raftery et al (2007) propose another forgetting factor to approximate $\pi_{t|t-1,k}$
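Given the model probabilities and each model's own point forecast, the DMA and DMS forecasts differ only in how they combine them. A small sketch with made-up numbers follows; the function names are illustrative.

```python
def dma_forecast(weights, forecasts):
    """Probability-weighted average of the individual model forecasts."""
    return sum(w * f for w, f in zip(weights, forecasts))


def dms_forecast(weights, forecasts):
    """Forecast of the single model with the highest probability."""
    best = max(range(len(weights)), key=lambda k: weights[k])
    return forecasts[best]


# Made-up predicted model probabilities and per-model forecasts.
weights = [0.2, 0.5, 0.3]
forecasts = [2.0, 3.0, 4.0]
print(dma_forecast(weights, forecasts))
print(dms_forecast(weights, forecasts))
```

Here `forecasts[k]` stands for the product of model k's regressors and its filtered coefficient estimate, which in practice would come from that model's Kalman filter.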

SLIDE 94

Complete details in paper. Idea: use similar state space updating formulae for models as is done with states then use similar forgetting factor Some key steps/notation: pk

ytjyt1

is predictive density for model k evaluated at yt (Normal distribution with mean and variance from Kalman …lter) Suppose we use "Markov switching” approach with pkl being probability of switching from model k to l Then model prediction equation would be: πtjt1,k =

K

∑

l=1

πt1jt1,lpkl But remember: hard to estimate pkl so use approximation: πtjt1,k = πα

t1jt1,k

∑K

l=1 πα t1jt1,l

Gary Koop and Dimitris Korobilis () Dynamic Model Averaging September 20, 2010 18 / 33

SLIDE 95

Complete details in paper. Idea: use similar state space updating formulae for models as is done with states then use similar forgetting factor Some key steps/notation: pk

ytjyt1

is predictive density for model k evaluated at yt (Normal distribution with mean and variance from Kalman …lter) Suppose we use "Markov switching” approach with pkl being probability of switching from model k to l Then model prediction equation would be: πtjt1,k =

K

∑

l=1

πt1jt1,lpkl But remember: hard to estimate pkl so use approximation: πtjt1,k = πα

t1jt1,k

∑K

l=1 πα t1jt1,l

0 < α 1 is forgetting factor with similar interpretation to λ

Gary Koop and Dimitris Korobilis () Dynamic Model Averaging September 20, 2010 18 / 33

SLIDE 96

Complete details in paper.

Idea: use similar state space updating formulae for models as is done with states, then use a similar forgetting factor

Some key steps/notation:

$p_k(y_t \mid y^{t-1})$ is the predictive density for model $k$ evaluated at $y_t$ (Normal distribution with mean and variance from Kalman filter)

Suppose we use a "Markov switching" approach with $p_{kl}$ being the probability of switching from model $k$ to $l$

Then the model prediction equation would be:

$$\pi_{t|t-1,k} = \sum_{l=1}^{K} \pi_{t-1|t-1,l} \, p_{kl}$$

But remember: hard to estimate $p_{kl}$, so use the approximation:

$$\pi_{t|t-1,k} = \frac{\pi_{t-1|t-1,k}^{\alpha}}{\sum_{l=1}^{K} \pi_{t-1|t-1,l}^{\alpha}}$$

$0 < \alpha \le 1$ is a forgetting factor with similar interpretation to $\lambda$

Focus on $\alpha \in [0.95, 1.00]$
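
The forgetting-factor approximation can be sketched in a few lines. The prediction step below follows the formula on the slide. The update step is an assumption here, since the slides defer the details to the paper: it applies Bayes' rule, $\pi_{t|t,k} \propto \pi_{t|t-1,k} \, p_k(y_t \mid y^{t-1})$. Function names and numbers are hypothetical.

```python
from typing import List, Sequence


def predict_model_probs(post_probs: Sequence[float], alpha: float) -> List[float]:
    """Prediction step: raise last period's probabilities to the power alpha and renormalise."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    powered = [p ** alpha for p in post_probs]
    total = sum(powered)
    return [p / total for p in powered]


def update_model_probs(pred_probs: Sequence[float], pred_densities: Sequence[float]) -> List[float]:
    """Update step (assumed Bayes rule): reweight by each model's predictive density at y_t."""
    unnormalised = [p * d for p, d in zip(pred_probs, pred_densities)]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


# Hypothetical example with K = 3 models.
post = [0.7, 0.2, 0.1]                      # pi_{t-1|t-1,k}
pred = predict_model_probs(post, 0.95)      # pi_{t|t-1,k}: flattened towards equal weights
densities = [0.1, 0.4, 0.2]                 # p_k(y_t | y^{t-1}), made-up values
updated = update_model_probs(pred, densities)  # pi_{t|t,k}
```

With $\alpha < 1$ the prediction step shrinks the probabilities towards equal weights, so a model that starts forecasting well can regain weight quickly. With $\alpha = 1$ the prediction step leaves the probabilities unchanged.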

SLIDE 104

Interpretation of forgetting factor α

Easy to show:

$$\pi_{t|t-1,k} \propto \prod_{i=1}^{t-1} \left[ p_k(y_{t-i} \mid y^{t-i-1}) \right]^{\alpha^i}$$

$p_k(y_t \mid y^{t-1})$ is the predictive density for model $k$ evaluated at $y_t$ (measure of forecast performance of model $k$)

Model $k$ will receive more weight at time $t$ if it has forecast well in the recent past

Interpretation of "recent past" is controlled by the forgetting factor, $\alpha$

$\alpha = 0.99$: forecast performance five years ago receives 80% as much weight as forecast performance last period

$\alpha = 0.95$: forecast performance five years ago receives only about 35% as much weight.

$\alpha = 1$: can show $\pi_{t|t-1,k}$ is proportional to the marginal likelihood using data through time $t-1$ (standard BMA)
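
The 80% and 35% figures can be checked directly. The sketch below assumes quarterly data, so that "five years ago" means 20 periods back, and takes the relative weight on a predictive density $i$ periods back to be $\alpha^i$. The helper name is hypothetical.

```python
QUARTERS_PER_YEAR = 4  # assumption: quarterly observations


def relative_weight(alpha: float, periods_back: int) -> float:
    """Exponent applied to the predictive density observed `periods_back` periods ago."""
    return alpha ** periods_back


five_years = 5 * QUARTERS_PER_YEAR
w99 = relative_weight(0.99, five_years)  # roughly 0.82
w95 = relative_weight(0.95, five_years)  # roughly 0.36
w100 = relative_weight(1.00, five_years)  # exactly 1: no forgetting, as in standard BMA
```

Small changes in $\alpha$ therefore imply large changes in how quickly past forecast performance is discounted.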

SLIDE 111

Summary So Far

We want to do DMA or DMS

These use TVP models which allow marginal effects to change over time

These allow for the forecasting model to switch over time

So can switch from one parsimonious forecasting model to another (avoid over-parameterization)

But a full formal Bayesian analysis is computationally infeasible

Sensible approximations make it computationally feasible.

State space updating formula must be run $K$ times, instead of (roughly speaking) $K^T$ MCMC algorithms

SLIDE 117

Forecasting US Inflation

Data from 1959Q1 through 2008Q2.

Two measures of inflation, based on CPI and GDP deflator

15 predictors listed previously (all variables transformed to be approximately stationary)

All models include an intercept and two lags of the dependent variable

$\alpha = 0.99$, $\lambda = 0.99$ (if time permits, sensitivity analysis at end of seminar)

Paper has 3 forecast horizons: $h = 1, 4, 8$; here I will focus on $h = 1$

SLIDE 121

Is DMA Parsimonious?

Even though there are 15 potential predictors, most probability is attached to very parsimonious models with only a few predictors.

$\text{Size}_k$ = number of predictors in model $k$ ($\text{Size}_k$ does not include the intercept plus two lags of the dependent variable)

Figure 1 plots

$$E(\text{Size}_t) = \sum_{k=1}^{K} \pi_{t|t-1,k} \, \text{Size}_k$$
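
As a rough illustration of the expected-size calculation at a single date, the sketch below represents each model as the set of optional predictors it includes, so that its size is the number of elements in the set. The predictor names and weights are hypothetical.

```python
# Each model is the set of optional predictors it includes; the intercept and
# the two lags of the dependent variable are in every model and are not counted.
models = [
    frozenset(),                    # no optional predictors
    frozenset({"UNEMP"}),           # hypothetical predictor name
    frozenset({"UNEMP", "TBILL"}),  # hypothetical predictor names
]
weights = [0.5, 0.3, 0.2]  # pi_{t|t-1,k} at one date, summing to one

# E(Size_t) = sum over models of (model probability) * (number of predictors)
expected_size = sum(w * len(m) for w, m in zip(weights, models))
```

Repeating this calculation at every date gives a time series of the kind plotted in Figure 1.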

SLIDE 122

Figure 1: Expected Number of Predictors

SLIDE 128

Which Variables are Good Predictors for Inflation?

Posterior inclusion probability for the $j$th predictor $= \sum_{k \in J} \pi_{t|t-1,k}$, where $k \in J$ indicates models which include the $j$th predictor

See figures

Any predictor where the inclusion probability is never above 0.5 is excluded from the appropriate figure.

Lots of evidence of predictor change in all cases.

DMA/DMS will pick this up automatically
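
A minimal sketch of the inclusion-probability calculation and of the 0.5 plotting rule, assuming models are stored as sets of predictor names and the weights $\pi_{t|t-1,k}$ are available for each date. All names and numbers are hypothetical.

```python
from typing import FrozenSet, List, Sequence


def inclusion_probability(predictor: str,
                          models: Sequence[FrozenSet[str]],
                          weights: Sequence[float]) -> float:
    """Sum the probabilities of all models that contain the given predictor."""
    return sum(w for m, w in zip(models, weights) if predictor in m)


models = [
    frozenset(),
    frozenset({"UNEMP"}),
    frozenset({"UNEMP", "TBILL"}),
    frozenset({"TBILL"}),
]
# One row of model probabilities per date (two dates here).
weights_by_date = [
    [0.2, 0.4, 0.3, 0.1],
    [0.6, 0.1, 0.1, 0.2],
]

unemp_path: List[float] = [inclusion_probability("UNEMP", models, w) for w in weights_by_date]
tbill_path: List[float] = [inclusion_probability("TBILL", models, w) for w in weights_by_date]

# Keep only predictors whose inclusion probability exceeds 0.5 at some date.
paths = {"UNEMP": unemp_path, "TBILL": tbill_path}
plotted = [name for name, path in paths.items() if max(path) > 0.5]
```

In this made-up example the first predictor matters at the first date and loses importance at the second, which is the kind of predictor change the figures are meant to show.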

SLIDE 129

Figure 1: Predictors for CPI, h = 1

SLIDE 130

Figure 5: Predictors for GDP Deflator Inflation, h = 1

SLIDE 136

Forecast Performance

Pseudo-out-of-sample forecasting exercise

Forecast evaluation begins in 1970Q1

Measures of forecast performance using point forecasts: mean squared forecast error (MSFE) and mean absolute forecast error (MAFE).

Forecast metric involving entire predictive distribution: the sum of log predictive likelihoods.

Predictive likelihood = predictive density for $y_t$ (given data through time $t-1$) evaluated at the actual outcome.
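
The three metrics can be sketched as follows. The log predictive likelihood assumes a Normal predictive density with a given mean and variance, in line with the Kalman filter output described earlier. The function names and numbers are hypothetical, and any scaling used in the tables is not reproduced here.

```python
import math
from typing import Sequence


def msfe(actuals: Sequence[float], forecasts: Sequence[float]) -> float:
    """Mean squared forecast error."""
    return sum((y - f) ** 2 for y, f in zip(actuals, forecasts)) / len(actuals)


def mafe(actuals: Sequence[float], forecasts: Sequence[float]) -> float:
    """Mean absolute forecast error."""
    return sum(abs(y - f) for y, f in zip(actuals, forecasts)) / len(actuals)


def sum_log_pred_like(actuals: Sequence[float],
                      means: Sequence[float],
                      variances: Sequence[float]) -> float:
    """Sum of log Normal predictive densities, each evaluated at the realised outcome."""
    total = 0.0
    for y, m, v in zip(actuals, means, variances):
        total += -0.5 * (math.log(2.0 * math.pi * v) + (y - m) ** 2 / v)
    return total


# Hypothetical outcomes and one-step-ahead forecasts.
actuals = [1.0, 2.0, 3.0]
forecasts = [1.5, 2.0, 2.0]
variances = [1.0, 1.0, 1.0]

mse = msfe(actuals, forecasts)
mae = mafe(actuals, forecasts)
slpl = sum_log_pred_like(actuals, forecasts, variances)
```

MSFE and MAFE use only the point forecast, while the log predictive likelihood also penalises a predictive variance that is too wide or too narrow. Higher (less negative) sums indicate better density forecasts.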

SLIDE 144

Forecasting Methods

DMA with $\alpha = \lambda = 0.99$.

DMS with $\alpha = \lambda = 0.99$.

TVP model with all the predictors.

DMA but constant coefficients (i.e. DMA where $\lambda = 1$).

BMA (i.e. DMA where $\lambda = \alpha = 1$).

Recursive OLS forecasts using an AR(2) model.

Recursive OLS forecasts using all of the predictors.

Random walk forecasts.
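
The simplest benchmark in this list can be sketched in a pseudo-out-of-sample loop. The sketch assumes the random walk forecast takes its plain no-change form, where the one-step-ahead forecast equals the last observed value. The exact specification used in the paper may differ, and the series below is made up.

```python
from typing import List, Sequence


def random_walk_forecasts(series: Sequence[float], start: int) -> List[float]:
    """One-step-ahead no-change forecasts for periods start, ..., len(series) - 1.

    The forecast for period t uses only data through period t - 1.
    """
    if start < 1:
        raise ValueError("need at least one observation before the first forecast")
    return [series[t - 1] for t in range(start, len(series))]


# Hypothetical series; evaluation starts at index 2.
series = [2.0, 2.5, 2.2, 3.0, 2.8]
start = 2

rw_forecasts = random_walk_forecasts(series, start)
rw_actuals = list(series[start:])
rw_errors = [y - f for y, f in zip(rw_actuals, rw_forecasts)]
```

The other methods fit the same loop: at each date the model is re-estimated or updated using data through $t-1$ only, and the resulting forecast is compared with the realised value.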

SLIDE 151

Discussion of Log Predictive Likelihoods

Preferred method of Bayesian forecast comparison

DMA or DMS always forecast best.

TVP model with all predictors leads to very poor forecasting performance.

Shrinkage provided by DMA or DMS is of great value in forecasting.

Treatment of model change more important than parameter change (see especially $h = 1$ where $\lambda = 1$ does well).

At short horizons, conventional BMA does okay, but not at longer horizons.

DMS tends to forecast a bit better than DMA

SLIDE 156

Discussion of MSFE and MAFE

Patterns noted with predictive likelihoods mainly still hold (although DMA does better relative to DMS)

Simple forecasting methods (AR(2) or random walk model) are always inferior to DMA and DMS

With CPI inflation, recursive OLS forecasting using all the predictors does well for $h = 8$, but not for GDP deflator nor for shorter horizons

In general: DMA and DMS look to be safe options. Usually they do best, and where they do not, they do not go too far wrong

Unlike other methods, which might perform well in some cases but very poorly in others

SLIDE 157

Table 1: Comparing Different Forecasting Methods: CPI inflation

h = 1

| Forecast Method | Sum of log pred. like. | MSFE | MAFE |
|---|---|---|---|
| DMA | -85.31 | 47.48 | 26.37 |
| DMS | -82.26 | 48.96 | 27.82 |
| TVP | -182.36 | 54.70 | 32.20 |
| DMA (λ = 1) | -81.63 | 45.00 | 23.02 |
| BMA (DMA with α = λ = 1) | -84.12 | 46.07 | 24.14 |
| Recursive OLS – AR(2) | | 57.52 | 41.58 |
| Recursive OLS – All Preds. | | 52.76 | 34.16 |
| Random Walk | | 54.59 | 35.14 |

SLIDE 158

Table 2a: Comparing Different Forecasting Methods: GDP Deflator inflation

h = 1

| Forecast Method | Sum of log pred. like. | MSFE | MAFE |
|---|---|---|---|
| DMA | -27.10 | 34.47 | 12.98 |
| DMS | -24.97 | 35.61 | 13.70 |
| TVP | -176.90 | 38.85 | 16.99 |
| DMA (λ = 1) | -21.47 | 33.17 | 12.02 |
| BMA (DMA with α = λ = 1) | -25.00 | 34.58 | 13.10 |
| Recursive OLS – AR(2) | | 40.10 | 17.34 |
| Recursive OLS – All Preds. | | 37.34 | 14.30 |
| Random Walk | | 37.39 | 15.19 |

SLIDE 164

Conclusions

When forecasting in the presence of change/breaks/turbulence, want an approach which:

Allows for forecasting model to change over time

Allows for marginal effects of predictors to change over time

Automatically does the shrinkage necessary to reduce risk of over-parameterization/over-fitting

In theory, DMA and DMS should satisfy these criteria

In practice, we find DMA and DMS to forecast well in an exercise involving US inflation.
