Maximum likelihood estimation of factor models on data sets with arbitrary pattern of missing data



slide-1
SLIDE 1

Maximum likelihood estimation of factor models on data sets with arbitrary pattern of missing data.

Marta Bańbura, Michele Modugno

European Central Bank

6th Eurostat Colloquium, 29 September 2010

slide-3
SLIDE 3

Framework

Dynamic factor model:

$$y_t = \Lambda f_t + \varepsilon_t$$

$$f_t = A_1 f_{t-1} + \cdots + A_p f_{t-p} + u_t$$

  • State space form.
  • Factors are unobservable.
  • Factors follow a VAR dynamic.
  • $y_t$ can follow an arbitrary pattern of missing data.
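
To make the setting concrete, the following is a minimal simulation sketch, not the authors' code. It assumes a VAR(1) for the factors (the slide allows a general VAR(p)), and the dimensions, coefficient values and missing-data pattern are all hypothetical choices for illustration.

```python
import numpy as np

def simulate_dfm(T=200, n=8, r=2, p_missing=0.2, seed=0):
    """Simulate y_t = Lambda f_t + eps_t with VAR(1) factors, then mask entries."""
    rng = np.random.default_rng(seed)
    Lambda = rng.normal(size=(n, r))      # loadings (hypothetical values)
    A = 0.7 * np.eye(r)                   # stable VAR(1) coefficient (assumed)
    f = np.zeros((T, r))
    for t in range(1, T):
        f[t] = A @ f[t - 1] + rng.normal(size=r)
    y = f @ Lambda.T + 0.5 * rng.normal(size=(T, n))
    # Arbitrary missing pattern: random gaps plus one series that starts late
    mask = rng.random((T, n)) < p_missing
    mask[: T // 2, 0] = True
    y_obs = np.where(mask, np.nan, y)     # NaN marks a missing observation
    return y_obs, f, Lambda

y_obs, f, Lambda = simulate_dfm()
print(y_obs.shape, np.isnan(y_obs).mean().round(2))
```

The first series is missing for the first half of the sample, while the others have randomly scattered gaps, so no single "balanced" subsample exists.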
slide-4
SLIDE 4

Arbitrary pattern of missing data

[Table: example data matrix in which numeric observations are interspersed with NaN entries at varying positions across rows and columns.]

slide-5
SLIDE 5

Estimation of dynamic factor model

For small-scale models: maximum likelihood

  • Use of optimisation methods:
      • In the time domain (Engle and Watson, 1981; Quah and Sargent, 1992)
      • In the frequency domain (Geweke, 1977; Sargent and Sims, 1977)
  • EM algorithm (Watson and Engle, 1983; Shumway and Stoffer, 1982)

With missing data: Shumway and Stoffer (1982)

  • Only if the loading matrix is known!
slide-6
SLIDE 6

Estimation of dynamic factor model

For large-scale models:

  • Principal components (Connor and Korajczyk (1986, 1988), Forni and Reichlin (1996, 1998), Forni, Hallin, Lippi and Reichlin (2000), Stock and Watson (2002))

With missing data: Giannone, Reichlin and Small (2008)

  • Only at the end of the sample!
  • Maximum likelihood with EM algorithm

Doz, Giannone, and Reichlin (2006): the likelihood is written for a misspecified model that assumes no serial or cross-correlation of the idiosyncratic components; if N, T → +∞, no problem!

slide-7
SLIDE 7

Contributions of this paper

Generalize Doz, Giannone, and Reichlin (2006) in order to:

  • adapt EM algorithm to deal with a general pattern of missing data
  • introduce restrictions on the parameters: group specific factors
  • model dynamics of idiosyncratic component

Moreover:

  • Monte Carlo evaluation
  • News concept: helps to understand the sources of forecast revisions in case of multiple data releases
  • Application to the euro area data: real-time forecasting and backdating of GDP

slide-8
SLIDE 8

EM, how does it work

Solution to problems in which latent or missing data make the likelihood intractable. Write the likelihood as if all the data were observed, i.e. in terms of both observables and factors. Iterate:

  • E-step: replace the sufficient statistics by their expectations (given the parameters from the previous iteration)
  • M-step: maximise the “expected likelihood”
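
In symbols, one common way to write the two steps is sketched below. The notation is an assumption for illustration: $\theta$ collects the parameters, $\theta(r)$ is the estimate after iteration $r$, $\Omega_T$ is the available data, $Y$ and $F$ are the complete observables and factors, and $l$ is the joint log-likelihood.

$$\text{E-step:} \quad L(\theta, \theta(r)) = E_{\theta(r)}\left[ l(Y, F; \theta) \mid \Omega_T \right]$$

$$\text{M-step:} \quad \theta(r+1) = \arg\max_{\theta} L(\theta, \theta(r))$$
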
slide-9
SLIDE 9

EM, dynamic factor model

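$$y_t = \Lambda f_t + \xi_t, \qquad \xi_t \sim N(0, R)$$

$$f_t = A f_{t-1} + u_t, \qquad u_t \sim N(0, Q)$$

$f_t$ observed and $y_t$ available:

$$\hat{\Lambda} = \left( \sum_{t=1}^{T} y_t f_t' \right) \left( \sum_{t=1}^{T} f_t f_t' \right)^{-1}$$

$$\hat{R} = \mathrm{diag}\left( \frac{1}{T} \sum_{t=1}^{T} (y_t - \hat{\Lambda} f_t)(y_t - \hat{\Lambda} f_t)' \right) = \mathrm{diag}\left( \frac{1}{T} \left( \sum_{t=1}^{T} y_t y_t' - \hat{\Lambda} \sum_{t=1}^{T} f_t y_t' \right) \right)$$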
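
As a numerical check of the closed-form estimators for the case where the factors are observed, here is a small NumPy sketch. The function name `complete_data_mstep`, the simulated data and all dimensions are hypothetical and chosen only for illustration.

```python
import numpy as np

def complete_data_mstep(y, f):
    """Closed-form estimates of Lambda and diagonal R when factors are observed.

    y: (T, n) observations, f: (T, r) factors.
    """
    T = y.shape[0]
    Syf = y.T @ f                      # sum_t y_t f_t'
    Sff = f.T @ f                      # sum_t f_t f_t'
    Lam = Syf @ np.linalg.inv(Sff)     # (sum y f') (sum f f')^{-1}
    Syy = y.T @ y                      # sum_t y_t y_t'
    # diag( 1/T (sum y y' - Lam sum f y') ); note sum f y' = Syf'
    R = np.diag(np.diag(Syy - Lam @ Syf.T)) / T
    return Lam, R

# Hypothetical simulated data
rng = np.random.default_rng(1)
T, n, r = 5000, 4, 2
Lam_true = rng.normal(size=(n, r))
f = rng.normal(size=(T, r))
y = f @ Lam_true.T + 0.3 * rng.normal(size=(T, n))
Lam_hat, R_hat = complete_data_mstep(y, f)
print(np.abs(Lam_hat - Lam_true).max())
```

The loading estimate is an ordinary least squares regression of $y_t$ on $f_t$, and the two expressions for $\hat{R}$ coincide because $\hat{\Lambda} \sum_t f_t f_t' = \sum_t y_t f_t'$.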

slide-10
SLIDE 10

EM, dynamic factor model

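$f_t$ unobserved and $y_t$ available (EM, $\Omega_T$ = data):

$$\Lambda(r+1) = \left( \sum_{t=1}^{T} E_{\theta(r)}\left[ y_t f_t' \mid \Omega_T \right] \right) \left( \sum_{t=1}^{T} E_{\theta(r)}\left[ f_t f_t' \mid \Omega_T \right] \right)^{-1}$$

$$R(r+1) = \mathrm{diag}\left( \frac{1}{T} \left( \sum_{t=1}^{T} E_{\theta(r)}\left[ y_t y_t' \mid \Omega_T \right] - \Lambda(r+1) \sum_{t=1}^{T} E_{\theta(r)}\left[ f_t y_t' \mid \Omega_T \right] \right) \right)$$

  • Analogously for A and Q in the state equation.

Expectations of the sufficient statistics without missing data (Watson and Engle, 1983):

  • $E_{\theta(r)}\left[ y_t f_t' \mid \Omega_T \right] = y_t E_{\theta(r)}\left[ f_t' \mid \Omega_T \right]$ or $E_{\theta(r)}\left[ y_t y_t' \mid \Omega_T \right] = y_t y_t'$
  • $E_{\theta(r)}\left[ f_t' \mid \Omega_T \right]$ or $E_{\theta(r)}\left[ f_{t-1} f_{t-1}' \mid \Omega_T \right]$ from the Kalman smoother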
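
For reference, the smoothed second moments follow from the smoother's mean and covariance through a standard identity. Here $f_{t|T}$ and $P_{t|T}$ denote the Kalman smoother's mean and covariance of $f_t$ given $\Omega_T$; this notation is assumed for illustration and does not appear on the slides.

$$E_{\theta(r)}\left[ f_t f_t' \mid \Omega_T \right] = f_{t|T} f_{t|T}' + P_{t|T}$$
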
slide-11
SLIDE 11

EM for the factor model, yt contains missing values

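$y_t$ with missing data ⇒ we can no longer use Watson and Engle (1983). Instead:

$$\mathrm{vec}\left( \Lambda(r+1) \right) = \left( \sum_{t=1}^{T} E_{\theta(r)}\left[ f_t f_t' \mid \Omega_T \right] \otimes W_t \right)^{-1} \mathrm{vec}\left( \sum_{t=1}^{T} W_t y_t E_{\theta(r)}\left[ f_t' \mid \Omega_T \right] \right)$$

$$R(r+1) = \mathrm{diag}\Bigg( \frac{1}{T} \sum_{t=1}^{T} \Big( W_t y_t y_t' W_t' - W_t y_t E_{\theta(r)}\left[ f_t' \mid \Omega_T \right] \Lambda(r+1)' W_t - W_t \Lambda(r+1) E_{\theta(r)}\left[ f_t \mid \Omega_T \right] y_t' W_t' + W_t \Lambda(r+1) E_{\theta(r)}\left[ f_t f_t' \mid \Omega_T \right] \Lambda(r+1)' W_t + (I - W_t) R(r) (I - W_t) \Big) \Bigg)$$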
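
The update above can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: $W_t$ is taken to be the diagonal selection matrix with ones for observed entries of $y_t$ and zeros for missing ones, the function name `mstep_missing` is hypothetical, and in the demo the smoothed moments are replaced by the true simulated factors (zero smoothing uncertainty) instead of Kalman smoother output.

```python
import numpy as np

def mstep_missing(y, Ef, Eff, R_prev):
    """One M-step update of Lambda and diagonal R when y contains NaNs.

    y: (T, n), NaN = missing; Ef: (T, r) smoothed means E[f_t | Omega_T];
    Eff: (T, r, r) smoothed second moments E[f_t f_t' | Omega_T];
    R_prev: (n, n) R from the previous iteration.
    Requires each series to have enough observations for the system to be invertible.
    """
    T, n = y.shape
    r = Ef.shape[1]
    obs = ~np.isnan(y)
    y0 = np.where(obs, y, 0.0)                 # W_t y_t: missing entries set to zero
    lhs = np.zeros((n * r, n * r))
    rhs = np.zeros((n, r))
    for t in range(T):
        W = np.diag(obs[t].astype(float))      # assumed selection matrix W_t
        lhs += np.kron(Eff[t], W)              # E[f f'] (x) W_t
        rhs += np.outer(y0[t], Ef[t])          # W_t y_t E[f_t']
    # vec() stacks columns, hence Fortran order
    vec_lam = np.linalg.solve(lhs, rhs.flatten(order="F"))
    Lam = vec_lam.reshape((n, r), order="F")
    R_sum = np.zeros((n, n))
    I = np.eye(n)
    for t in range(T):
        W = np.diag(obs[t].astype(float))
        yf = np.outer(y0[t], Ef[t])
        R_sum += (np.outer(y0[t], y0[t]) - yf @ Lam.T @ W - W @ Lam @ yf.T
                  + W @ Lam @ Eff[t] @ Lam.T @ W + (I - W) @ R_prev @ (I - W))
    return Lam, np.diag(np.diag(R_sum)) / T

# Demo on hypothetical simulated data with about 30% of entries missing
rng = np.random.default_rng(2)
T, n, r = 400, 3, 2
Lam_true = rng.normal(size=(n, r))
f = rng.normal(size=(T, r))
y = f @ Lam_true.T + 0.2 * rng.normal(size=(T, n))
y_miss = y.copy()
y_miss[rng.random((T, n)) < 0.3] = np.nan
Eff = np.einsum("ti,tj->tij", f, f)            # stand-in for smoother output
Lam_hat, R_hat = mstep_missing(y_miss, f, Eff, 0.04 * np.eye(n))
print(np.abs(Lam_hat - Lam_true).max())
```

When nothing is missing, every $W_t$ is the identity and the loading update reduces to the complete-data formula $\left( \sum_t y_t f_t' \right) \left( \sum_t f_t f_t' \right)^{-1}$.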

slide-12
SLIDE 12

Application: Daily Inflation Tracking

The application proposes a framework to nowcast and forecast euro area HICP inflation exploiting weekly and daily data. HICP data are published 15 days after the end of the reference month. Only for total HICP is there a flash estimate for the current month, released at the end of the month. However, until the end of the month, weekly and daily data are available. We want to exploit this information in order to know where inflation is before the flash estimate is published.

slide-13
SLIDE 13

Data

We focus on two groups of series. The first is the Weekly Commission Oil Bulletin Price Statistics (WOB).

Member states initiated a survey to gather data on petroleum prices. Every Monday they communicate to the European Commission the consumer prices of petroleum products, both net of duties and taxes and inclusive of all duties and taxes in force. These data are averages over the period.

We focus on the consumer prices. In comparison to raw oil prices, they account for distribution and retail margins. (Released 2 or 3 days after the reference period.)

slide-14
SLIDE 14

Data

The second group is the World Market Prices of Raw Materials (RMP).

The series are weighted according to the commodity imports of euro area countries in 1999-2001, excluding intra-EU trade. They are published in the middle of the following week.

Raw material prices capture some of the global price dynamics as well as prices at the early part of the pricing chain. (Released 2 or 3 days after the reference period.)

slide-15
SLIDE 15

Why RMP and WOB? Coincident!

[Figure: overall HICP, Diesel, Euro Super 95, Gas oil and total RMP, plotted on a time axis running from 1/4/2005 to 1/10/2009.]

slide-16
SLIDE 16

Why RMP and WOB? More timely!

[Figure: release calendar for March 2010 (days 1 to 31). Releases in the order shown on the slide:]

  • RMP for 22-26 Feb
  • WOB for 23 Feb - 1 Mar
  • RMP for 1-5 Mar
  • WOB for 2-8 Mar
  • HICP releases for Feb 10
  • RMP for 8-12 Mar
  • WOB for 9-15 Mar
  • RMP for 15-19 Mar
  • WOB for 16-22 Mar
  • RMP for 22-26 Mar
  • WOB for 23-29 Mar
  • Flash Estimate for Mar 10

slide-19
SLIDE 19

Literature Review

The use of higher-frequency indicators to nowcast/forecast lower-frequency indicators started with monthly data for GDP. Among others, Giannone, Reichlin and Small (2008) for the U.S. and Bańbura and Rünstler (2007) for the euro area show that using monthly indicators is crucial in order to nowcast GDP accurately.

For inflation, Monteforte and Moretti (2010) extract factors from daily financial series, which are then plugged into an equation (predetermination) for forecasting inflation. These daily factors are then averaged using a MIDAS regression.

  ⇒ do not use the full co-movement to extract the signal
  ⇒ daily variables are predetermined: no unique framework for forecasting all the variables, no News

slide-20
SLIDE 20

News concept

New release for several groups of variables ⇒ forecast is revised. Why and how is it revised:

  • What is the “new” information in this data release?
  • How does it revise the forecast?
  • Which variables carry the biggest news (revise the forecast the most)?
  • Do different (groups of) variables pull the revision in different directions (we have both positive and negative news)?

slide-23
SLIDE 23

News concept, definition and link to forecast revision

Note: only the surprise component revises the forecast.

NEWS = ACTUAL − EXPECTED

$$I_j(\Omega^{new}) = y_{j,t_j} - P\left( y_{j,t_j} \mid \Omega^{old} \right)$$

e.g. WOB diesel price lower than expected:

Actual: $y_{j,t_j} = 0.6$, Expected: $P\left( y_{j,t_j} \mid \Omega^{old} \right) = 0.8$, News: −0.2

$$\text{REVISION} = \sum_{j} A(j) \cdot \text{NEWS}(j)$$

Bańbura and Modugno (2010) show how to derive the A(j)'s for state space models.
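
The arithmetic of the decomposition can be sketched as below. All numbers are hypothetical apart from the diesel example (actual 0.6, expected 0.8); in particular the weights `A` are made up for illustration, whereas in the paper they are derived from the state space model.

```python
import numpy as np

# Hypothetical releases: actual values and the model's expectations given the old data
actual = np.array([0.6, 1.1, -0.3])
expected = np.array([0.8, 1.0, -0.3])
news = actual - expected              # NEWS = ACTUAL - EXPECTED

# Hypothetical weights A(j); not derived from any model here
A = np.array([0.5, 0.2, 0.1])
contributions = A * news              # contribution of each release to the revision
revision = contributions.sum()        # REVISION = sum_j A(j) * NEWS(j)
print(news, contributions, revision)
```

A release that matches its expectation (the third one here) carries zero news and therefore does not move the forecast, whatever its weight.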

slide-24
SLIDE 24

Data

  • Detailed HICP, 6 series, transformed into monthly growth rates
  • Weekly Commission Oil Bulletin Price Statistics (WOB), 3 series, transformed into weekly growth rates
  • World Market Price of Raw Materials (RMP), 10 series, transformed into trading-day growth rates

slide-25
SLIDE 25

Framework

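Idiosyncratic dynamics and restrictions on the parameters:

$$\begin{pmatrix} y^m_t \\ y^w_t \\ y^d_t \end{pmatrix} = \begin{pmatrix} \Lambda_m & 0 & 0 & I_{N_m} & 0 & 0 \\ 0 & \Lambda_w & 0 & 0 & I_{N_w} & 0 \\ 0 & 0 & \Lambda_d & 0 & 0 & I_{N_d} \end{pmatrix} \begin{pmatrix} f^m_t \\ f^w_t \\ f^d_t \\ \xi^m_t \\ \xi^w_t \\ \xi^d_t \end{pmatrix}$$

where $t$ indicates trading days.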

slide-26
SLIDE 26

Framework

$$f^d_t = A f^d_{t-1} + u_t, \qquad u_t \sim N(0, Q)$$

$$f^w_t = \Xi^w_t f^w_{t-1} + f^d_t$$

$$f^m_t = \Xi^m_t f^m_{t-1} + f^d_t$$

where $\Xi^m_t$ and $\Xi^w_t$ are equal to zero the day after the reference trading day for $y^m_t$ and $y^w_t$ respectively, and equal to 1 otherwise.

And:

$$\xi^j_t = \alpha_j \xi^j_{t-1} + \epsilon^j_t, \qquad \epsilon^j_t \sim N(0, R_{jj})$$

with $j = m, w, d$.
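
The role of the $\Xi$ terms can be illustrated with a small sketch: the weekly (or monthly) factor cumulates the daily factor within the reference period and restarts on the day after a reference trading day. The function name `aggregate`, the scalar factor and the 5-trading-day weeks are assumptions made for illustration only.

```python
import numpy as np

def aggregate(f_daily, is_first_day):
    """Cumulate a scalar daily factor within each reference period.

    is_first_day[t] is True on the day after a reference trading day,
    i.e. where Xi_t = 0 and the aggregator restarts.
    """
    agg = np.zeros(len(f_daily))
    prev = 0.0
    for t, (fd, first) in enumerate(zip(f_daily, is_first_day)):
        xi = 0.0 if first else 1.0
        prev = xi * prev + fd          # f^w_t = Xi^w_t f^w_{t-1} + f^d_t
        agg[t] = prev
    return agg

f_d = np.arange(1.0, 11.0)             # hypothetical daily factor: 1, 2, ..., 10
first = np.arange(10) % 5 == 0         # hypothetical 5-trading-day weeks
f_w = aggregate(f_d, first)
print(f_w)
```

On the last trading day of each week the aggregator equals the sum of that week's daily factors, which is what the weekly series loads on.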

slide-27
SLIDE 27

Forecast exercise design

  • Evaluation of forecasts for: year-on-year growth rate of overall HICP inflation
  • Forecasts produced once per month: before the release of all HICPs (mid-month)
  • Different forecast horizons: 0 to 12 months ahead
  • Evaluation sample: Jan 2004 - Dec 2009
  • Recursive estimation, from April 1997
  • Pseudo Real Time
slide-28
SLIDE 28

Evaluated specifications

  • Model with monthly HICPs (MON)
  • Model with all available data (ALL)
  • Model with monthly HICPs and daily RMP (DAY)

Compared to

  • Random Walk for year-on-year inflation rate
slide-29
SLIDE 29

Evaluation method

Mean Squared Forecast Error (MSFE) ratios of the proposed models to the Random Walk (RW): a ratio lower than one means the proposed model does better than the RW.

We show the averages of the MSFE ratios over different specifications of the models, i.e. for different numbers of factors (from 1 to 3) and different numbers of lags in the transition equation (from 1 to 6). Moreover, we show the best model specification within each group, chosen ex post.
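
A minimal sketch of the evaluation metric follows. The function name `msfe_ratio` and all numbers are hypothetical; the random walk forecast is assumed here to be the last observed year-on-year rate at forecast time.

```python
import numpy as np

def msfe_ratio(actual, model_fc, rw_fc):
    """MSFE of the model divided by the MSFE of the random walk benchmark."""
    actual, model_fc, rw_fc = map(np.asarray, (actual, model_fc, rw_fc))
    msfe_model = np.mean((actual - model_fc) ** 2)
    msfe_rw = np.mean((actual - rw_fc) ** 2)
    return msfe_model / msfe_rw

# Hypothetical year-on-year inflation outcomes and forecasts
actual = [2.0, 2.2, 1.9, 2.4]
model_fc = [2.1, 2.1, 2.0, 2.3]
rw_fc = [1.8, 2.0, 2.2, 1.9]           # assumed random walk forecasts
ratio = msfe_ratio(actual, model_fc, rw_fc)
print(ratio < 1)
```

Averaging such ratios over the grid of specifications (1 to 3 factors, 1 to 6 lags) gives the "average model" figure, while taking the minimum within each group gives the "best model" figure.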

slide-30
SLIDE 30

Results

Figure: MSFE ratios for Overall HICP inflation at mid month: average model

[Plot: MSFE ratios (axis from 0.40 to 1.20) against steps ahead (12, 6, 3, 1) for MON, ALL and DAY.]

slide-31
SLIDE 31

Results

Figure: MSFE ratios for Overall HICP inflation at mid month: best model

[Plot: MSFE ratios (axis from 0.40 to 1.10) against steps ahead (12, 6, 3, 1) for MON, ALL and DAY.]

slide-34
SLIDE 34

Results

We have a framework for producing early estimates of HICP inflation using weekly and daily data, and it performs well! Let's see how it can be used to interpret releases:

slide-42
SLIDE 42

Results

March 2009

[Figure: daily time axis starting at 28/2/9 and running through March 2009. Legend: RMP energy, RMP non-ene, HICP releases, Nowcast, Flash Estimate Feb09, True value, First Release Feb09. Annotated releases: RMP relative to 23 Feb - 27 Feb; RMP relative to 2 Mar - 6 Mar; HICP releases relative to Feb 09; RMP relative to 9 Mar - 13 Mar; RMP relative to 16 Mar - 20 Mar; RMP relative to 23 Mar - 27 Mar.]

slide-44
SLIDE 44

Conclusions

We propose an estimation methodology for factor models that:

  • adapts EM algorithm to deal with a general pattern of missing data
  • introduces restrictions on the parameters: group specific factors
  • models dynamics of idiosyncratic component
  • indicates how to extract the News

In "Daily inflation tracking" I apply this methodology in order to obtain reliable early estimates of euro area HICP inflation, exploiting weekly and daily data, which allows us to understand how and why a given release has changed our estimate.