Maximum likelihood estimation of factor models on data sets with arbitrary pattern of missing data
Marta Bańbura, Michele Modugno
European Central Bank
6th Eurostat Colloquium, 29 September 2010
Framework
Dynamic factor model
$$ y_t = \Lambda f_t + \varepsilon_t $$
$$ f_t = A_1 f_{t-1} + \cdots + A_p f_{t-p} + u_t $$
- State space form.
- Factors are unobservable.
- Factors follow a VAR dynamic.
- $y_t$ can follow an arbitrary pattern of missing data.
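To make the setup concrete, the following is a minimal simulation sketch of the model above with one lag (p = 1). The function name `simulate_dfm`, the dimensions, the VAR coefficient of 0.7, the noise scale and the share of missing entries are all illustrative assumptions, not values from the presentation.

```python
import numpy as np

def simulate_dfm(T=200, N=10, r=2, missing_share=0.2, seed=0):
    """Simulate y_t = Lambda f_t + eps_t with VAR(1) factors, then blank out entries."""
    rng = np.random.default_rng(seed)
    Lam = rng.normal(size=(N, r))            # loadings (assumed Gaussian draws)
    A = 0.7 * np.eye(r)                      # stable VAR(1) coefficient (assumed)
    f = np.zeros((T, r))
    for t in range(1, T):
        f[t] = A @ f[t - 1] + rng.normal(size=r)
    y = f @ Lam.T + rng.normal(scale=0.5, size=(T, N))
    # Drop entries completely at random to mimic an arbitrary missing pattern.
    drop = rng.random((T, N)) < missing_share
    y_obs = np.where(drop, np.nan, y)
    return y_obs, f, Lam

y_obs, f, Lam = simulate_dfm()
```

Real data sets usually have structured gaps (ragged edges, mixed frequencies) rather than entries missing completely at random; the random mask is only the simplest way to produce NaN entries for experimentation.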
Arbitrary pattern of missing data
[Table: example data matrix in which numeric observations are interleaved with missing entries (NaN) in no regular pattern.]
Estimation of dynamic factor model
For small scale models: Maximum likelihood
- Use of optimisation methods:
  - In the time domain (Engle and Watson, 1981; Quah and Sargent, 1992)
  - In the frequency domain (Geweke, 1977; Sargent and Sims, 1977)
- EM algorithm (Watson and Engle, 1983; Shumway and Stoffer, 1982)
With missing data: Shumway and Stoffer, 1982
- Only if the loading matrix is known!
Estimation of dynamic factor model
For large scale models:
- Principal components (Connor and Korajczyk (1986, 1988), Forni and Reichlin (1996, 1998), Forni, Hallin, Lippi and Reichlin (2000), Stock and Watson (2002))
With missing data: Giannone, Reichlin and Small (2008)
- Only at the end of the sample!
- Maximum likelihood with EM algorithm
Doz, Giannone, and Reichlin (2006): the likelihood is that of a misspecified model (no serial or cross-correlation of the idiosyncratic components), but if N, T → +∞, no problem!
Contributions of this paper
Generalize Doz, Giannone, and Reichlin (2006) in order to:
- adapt the EM algorithm to deal with a general pattern of missing data
- introduce restrictions on the parameters: group specific factors
- model the dynamics of the idiosyncratic component
Moreover:
- Monte Carlo evaluation
- News concept: helps to understand the sources of forecast revisions in case of multiple data releases
- Application to the euro area data: real-time forecasting and backdating of GDP
EM, how does it work
A solution to problems in which latent or missing data make the likelihood intractable. Write the likelihood as if all the data were observed, in terms of both observables and factors. Iterate:
- E-step: replace the sufficient statistics by their expectations (given the parameters from the previous iteration)
- M-step: maximise the “expected likelihood”
EM, dynamic factor model
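$$ y_t = \Lambda f_t + \xi_t, \quad \xi_t \sim N(0, R) $$
$$ f_t = A f_{t-1} + u_t, \quad u_t \sim N(0, Q) $$
$f_t$ observed and $y_t$ available:
$$ \hat{\Lambda} = \left( \sum_{t=1}^{T} y_t f_t' \right) \left( \sum_{t=1}^{T} f_t f_t' \right)^{-1} $$
$$ \hat{R} = \mathrm{diag}\left( \frac{1}{T} \sum_{t=1}^{T} \left( y_t - \hat{\Lambda} f_t \right) \left( y_t - \hat{\Lambda} f_t \right)' \right) = \mathrm{diag}\left( \frac{1}{T} \left( \sum_{t=1}^{T} y_t y_t' - \hat{\Lambda} \sum_{t=1}^{T} f_t y_t' \right) \right) $$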
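The two closed-form estimators for the case of observed factors can be sketched in a few lines of NumPy. The function name `mstep_observed_factors` and the array layout (time in rows) are illustrative choices, not part of the presentation.

```python
import numpy as np

def mstep_observed_factors(y, f):
    """y: (T, N) data, f: (T, r) factors, both fully observed."""
    T = y.shape[0]
    S_yf = y.T @ f                          # sum_t y_t f_t'
    S_ff = f.T @ f                          # sum_t f_t f_t'
    # Lambda_hat = S_yf S_ff^{-1}; S_ff is symmetric, so solve and transpose.
    Lam = np.linalg.solve(S_ff, S_yf.T).T
    resid = y - f @ Lam.T                   # rows are (y_t - Lambda_hat f_t)'
    R = np.diag(np.diag(resid.T @ resid) / T)
    return Lam, R

# Illustrative usage on synthetic data (all values are assumptions).
rng = np.random.default_rng(0)
f_demo = rng.normal(size=(100, 2))
y_demo = f_demo @ rng.normal(size=(5, 2)).T + 0.1 * rng.normal(size=(100, 5))
Lam_hat, R_hat = mstep_observed_factors(y_demo, f_demo)
```

Because the residuals of this regression are orthogonal to the factors, the diagonal of `R_hat` also matches the second expression for the variance estimator, which uses only the sums of $y_t y_t'$ and $f_t y_t'$.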
EM, dynamic factor model
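$f_t$ unobserved and $y_t$ available (EM, $\Omega_T$ = Data):
$$ \Lambda(r+1) = \left( \sum_{t=1}^{T} E_{\theta(r)}\left[ y_t f_t' \mid \Omega_T \right] \right) \left( \sum_{t=1}^{T} E_{\theta(r)}\left[ f_t f_t' \mid \Omega_T \right] \right)^{-1} $$
$$ R(r+1) = \mathrm{diag}\left( \frac{1}{T} \left( \sum_{t=1}^{T} E_{\theta(r)}\left[ y_t y_t' \mid \Omega_T \right] - \Lambda(r+1) \sum_{t=1}^{T} E_{\theta(r)}\left[ f_t y_t' \mid \Omega_T \right] \right) \right) $$
Analogously for A and Q in the state equation.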
Expectations of the sufficient statistics without missing data (Watson and Engle, 1983)
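$$ E_{\theta(r)}\left[ y_t f_t' \mid \Omega_T \right] = y_t E_{\theta(r)}\left[ f_t' \mid \Omega_T \right] $$
$$ E_{\theta(r)}\left[ y_t y_t' \mid \Omega_T \right] = y_t y_t' $$
$E_{\theta(r)}\left[ f_t' \mid \Omega_T \right]$ and $E_{\theta(r)}\left[ f_{t-1} f_{t-1}' \mid \Omega_T \right]$ from the Kalman smoother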
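As a sketch of how these expectations are assembled in practice, the snippet below takes smoother output as given and builds the sums needed in the M-step. It relies on the standard identity that a conditional second moment is the outer product of the smoothed mean plus the smoothed covariance. The function name `sufficient_stats` and the arrays `f_smooth` and `P_smooth` are hypothetical stand-ins for the output of whatever Kalman smoother is used.

```python
import numpy as np

def sufficient_stats(y, f_smooth, P_smooth):
    """
    y        : (T, N) fully observed data
    f_smooth : (T, r) smoothed means E[f_t | Omega_T]            (assumed given)
    P_smooth : (T, r, r) smoothed covariances Var[f_t | Omega_T] (assumed given)
    """
    S_yf = y.T @ f_smooth                                 # sum_t y_t E[f_t' | Omega_T]
    # E[f_t f_t' | Omega_T] = mean outer product + smoothed covariance
    S_ff = f_smooth.T @ f_smooth + P_smooth.sum(axis=0)
    S_yy = y.T @ y                                        # y_t is observed, so no expectation needed
    return S_yf, S_ff, S_yy
```

The sums for the lagged factors used in the state equation follow the same pattern, with the additional lag-one smoothed covariance for the cross term.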
EM for the factor model, yt contains missing values
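$y_t$ with missing data ⇒ we can no longer use Watson and Engle (1983). Instead:
$$ \mathrm{vec}\left( \Lambda(r+1) \right) = \left( \sum_{t=1}^{T} E_{\theta(r)}\left[ f_t f_t' \mid \Omega_T \right] \otimes W_t \right)^{-1} \mathrm{vec}\left( \sum_{t=1}^{T} W_t y_t E_{\theta(r)}\left[ f_t' \mid \Omega_T \right] \right) $$
$$ R(r+1) = \mathrm{diag}\Bigg( \frac{1}{T} \sum_{t=1}^{T} \Big( W_t y_t y_t' W_t' - W_t y_t E_{\theta(r)}\left[ f_t' \mid \Omega_T \right] \Lambda(r+1)' W_t - W_t \Lambda(r+1) E_{\theta(r)}\left[ f_t \mid \Omega_T \right] y_t' W_t' + W_t \Lambda(r+1) E_{\theta(r)}\left[ f_t f_t' \mid \Omega_T \right] \Lambda(r+1)' W_t + (I - W_t) R(r) (I - W_t) \Big) \Bigg) $$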
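The loading update above can be sketched directly from the Kronecker formula. The slide does not define $W_t$; the code assumes it is the diagonal 0/1 matrix that selects the entries of $y_t$ observed at time t. The function name `mstep_loadings_missing` and the input arrays `Ef` and `Eff` (smoothed first and second moments of the factors) are hypothetical names, and the smoother that produces them is not shown.

```python
import numpy as np

def mstep_loadings_missing(y, Ef, Eff):
    """
    y   : (T, N) data with np.nan marking missing entries
    Ef  : (T, r) smoothed means E[f_t | Omega_T]                  (assumed given)
    Eff : (T, r, r) smoothed second moments E[f_t f_t' | Omega_T] (assumed given)
    """
    T, N = y.shape
    r = Ef.shape[1]
    obs = ~np.isnan(y)
    y0 = np.where(obs, y, 0.0)               # W_t y_t: missing entries contribute zero
    lhs = np.zeros((N * r, N * r))
    rhs = np.zeros((N, r))
    for t in range(T):
        W = np.diag(obs[t].astype(float))    # assumed selection matrix W_t
        lhs += np.kron(Eff[t], W)            # E[f_t f_t'] kron W_t
        rhs += np.outer(y0[t], Ef[t])        # W_t y_t E[f_t']
    # vec() stacks columns, hence Fortran order on both reshapes.
    vec_lam = np.linalg.solve(lhs, rhs.reshape(-1, order="F"))
    return vec_lam.reshape((N, r), order="F")
```

With no missing entries every $W_t$ is the identity and the result collapses to the complete-data formula. Since $W_t$ is diagonal, the system also decouples by series, so a row-by-row solve over each series' observed periods would give the same loadings at lower cost; the Kronecker form is kept here only because it mirrors the slide.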
Application: Daily Inflation Tracking
This application proposes a framework to nowcast and forecast euro area HICP inflation exploiting weekly and daily data. HICP data are published 15 days after the end of the reference month. Only for total HICP is there a flash estimate for the current month, released at the end of the month. Until the end of the month, however, weekly and daily data are available. We want to exploit this information in order to know where inflation stands before the flash estimate is published.
Data
We focus on two groups of series. The first is the Weekly Commission Oil Bulletin Price Statistics (WOB).
Member states initiated a survey to gather data on petroleum prices. Every Monday they communicate to the European Commission the consumer prices of petroleum products, both net of duties and taxes and inclusive of all duties and taxes in force. These data are averages over the period.
We focus on the consumer prices. In comparison to raw oil prices, they account for distribution and retail margins. (Released 2 or 3 days after the reference period.)
Data
The second group of data is the World Market Prices of Raw Materials (RMP).
Weighted according to the commodity imports of euro area countries in 1999-2001, excluding intra-EU trade. Published in the middle of the following week.
Raw material prices capture some of the global price dynamics as well as prices at the early part of the pricing chain. (Released 2 or 3 days after the reference period.)
Why RMP and WOB? Coincident!
[Figure: overall HICP plotted together with Diesel, Euro Super 95, Gas oil and total RMP; horizontal axis in quarterly steps from 1/4/2005 to 1/10/2009, vertical axis from −3.5 to 2.5.]
Why RMP and WOB? More timely!
March 2010: data releases in calendar order
- RMP for 22-26 Feb
- WOB for 23 Feb - 1 Mar
- RMP for 1-5 Mar
- WOB for 2-8 Mar
- HICP releases for Feb 10
- RMP for 8-12 Mar
- WOB for 9-15 Mar
- RMP for 15-19 Mar
- WOB for 16-22 Mar
- RMP for 22-26 Mar
- WOB for 23-29 Mar
- Flash Estimate for Mar 10
Literature Review
The use of higher frequency indicators to nowcast/forecast lower frequency indicators started with monthly data for GDP. Among others, Giannone, Reichlin and Small (2008) for the U.S. and Banbura and Runstler (2007) for the euro area show that using monthly indicators is crucial in order to nowcast GDP accurately.
For inflation, Monteforte and Moretti (2010) extract factors from daily financial series, which are then plugged into an equation (predetermination) for forecasting inflation. These daily factors are then averaged using a MIDAS regression.
⇒ they do not use the full co-movement to extract the signal
⇒ daily variables are predetermined: no single framework for forecasting all the variables, no News
News concept
New release for several groups of variables ⇒ forecast is revised. Why and how is it revised:
- What is the “new” information in this data release?
- How does it revise the forecast?
- Which variables carry the biggest news (revise the forecast the most)?
- Do different (groups of) variables pull the revision in different directions (we have both positive and negative news)?
News concept, definition and link to forecast revision
Note: only the surprise component revises the forecast
NEWS = ACTUAL − EXPECTED
$$ I_j(\Omega_{new}) = y_{j,t_j} - P\left[ y_{j,t_j} \mid \Omega_{old} \right] $$
e.g. WOB diesel price lower than expected
Actual: $y_{j,t_j} = 0.6$, Expected: $P\left[ y_{j,t_j} \mid \Omega_{old} \right] = 0.8$, News: −0.2
$$ \text{REVISION} = \sum_{j} A(j) \cdot \text{NEWS}(j) $$
Banbura and Modugno (2010) show how to derive the A(j)’s for state space models.
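The decomposition can be illustrated with a small numerical sketch. The news value reuses the diesel example from the slide; the weight of 0.5 is a made-up placeholder, since the actual A(j) come from the state space model and are not given here. The function name `forecast_revision` is likewise illustrative.

```python
import numpy as np

def forecast_revision(actual, expected, weights):
    """Return the news per series, each series' contribution, and the total revision."""
    news = np.asarray(actual, dtype=float) - np.asarray(expected, dtype=float)
    contributions = np.asarray(weights, dtype=float) * news   # A(j) * NEWS(j)
    return news, contributions, contributions.sum()

# Diesel example from the slide: actual 0.6, expected 0.8.
# The weight 0.5 is hypothetical, for illustration only.
news, contrib, revision = forecast_revision([0.6], [0.8], [0.5])
```

Keeping the per-series contributions separate is what allows the revision to be attributed to individual releases, including cases where different series pull in opposite directions.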
Data
- Detailed HICP, 6 series transformed into monthly growth rates
- Weekly Commission Oil Bulletin Price Statistics (WOB), 3 series transformed into weekly growth rates
- World Market Price of Raw Materials (RMP), 10 series transformed into trading day growth rates
Framework
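Idiosyncratic dynamics and restrictions on the parameters
$$
\begin{pmatrix} y^m_t \\ y^w_t \\ y^d_t \end{pmatrix}
=
\begin{pmatrix}
\Lambda_m & 0 & 0 & I_{N_m} & 0 & 0 \\
0 & \Lambda_w & 0 & 0 & I_{N_w} & 0 \\
0 & 0 & \Lambda_d & 0 & 0 & I_{N_d}
\end{pmatrix}
\begin{pmatrix} f^m_t \\ f^w_t \\ f^d_t \\ \xi^m_t \\ \xi^w_t \\ \xi^d_t \end{pmatrix}
$$
where t indicates trading days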
Framework
$$ f^d_t = A f^d_{t-1} + u_t, \quad u_t \sim N(0, Q) $$
$$ f^w_t = \Xi^w_t f^w_{t-1} + f^d_t $$
$$ f^m_t = \Xi^m_t f^m_{t-1} + f^d_t $$
where $\Xi^m_t$ and $\Xi^w_t$ are equal to zero on the day after the reference trading day for $y^m_t$ and $y^w_t$ respectively, and equal to 1 otherwise.
And:
$$ \xi^j_t = \alpha_j \xi^j_{t-1} + \epsilon^j_t, \quad \epsilon^j_t \sim N(0, R_{jj}) $$
with j = m, w, d
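The weekly and monthly factors are running sums of the daily factor that restart whenever the indicator is zero. A minimal sketch of that recursion follows; the function name `accumulate` and the example indicator sequence (a reset every five trading days) are illustrative assumptions, not the calendar used in the application.

```python
import numpy as np

def accumulate(f_daily, xi):
    """
    f_daily : (T, r) daily factors
    xi      : (T,) indicators; 0 restarts the running sum, 1 continues it
    """
    f_agg = np.zeros_like(f_daily, dtype=float)
    prev = np.zeros(f_daily.shape[1])
    for t in range(len(f_daily)):
        prev = xi[t] * prev + f_daily[t]    # f_t = Xi_t f_{t-1} + f^d_t
        f_agg[t] = prev
    return f_agg

# Hypothetical example: ten trading days, sum restarted every five days.
f_d = np.ones((10, 1))
xi_w = np.array([0, 1, 1, 1, 1, 0, 1, 1, 1, 1])
f_w = accumulate(f_d, xi_w)
```

In this example the aggregated factor counts up from 1 to 5 within each block of five days, so its value on the last day of a block equals the sum of the daily factors over that block.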
Forecast exercise design
- Evaluation of forecasts for: year-on-year growth rate of overall HICP inflation
- Forecasts produced once per month: before the release of all HICPs (mid-month)
- Different forecast horizons: 0 to 12 months ahead
- Evaluation sample: Jan 2004 - Dec 2009
- Recursive estimation, from April 1997
- Pseudo Real Time
Evaluated specifications
- Model with monthly HICPs (MON)
- Model with all available data (ALL)
- Model with monthly HICPs and daily RMP (DAY)
Compared to
- Random Walk for year-on-year inflation rate
Evaluation method
Mean Squared Forecast Error (MSFE) ratios of the proposed models to the Random Walk (RW). A ratio lower than one means that the proposed model is better than the RW. We show the averages of the MSFE ratios across different specifications of the models, i.e. for different numbers of factors (from 1 to 3) and different numbers of lags in the transition equation (from 1 to 6). Moreover, we show the best model specification within each group, chosen ex-post.
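The evaluation metric itself is simple to compute. The sketch below uses made-up numbers purely for illustration; the function name `msfe_ratio` and the sample values are not from the presentation.

```python
import numpy as np

def msfe_ratio(actual, model_fc, benchmark_fc):
    """MSFE of the model divided by MSFE of the benchmark (values below 1 favour the model)."""
    actual = np.asarray(actual, dtype=float)
    msfe_model = np.mean((np.asarray(model_fc, dtype=float) - actual) ** 2)
    msfe_bench = np.mean((np.asarray(benchmark_fc, dtype=float) - actual) ** 2)
    return msfe_model / msfe_bench

# Hypothetical outcomes and forecasts, for illustration only.
actual = [2.0, 2.2, 2.1]
model_fc = [2.1, 2.1, 2.1]
rw_fc = [1.8, 2.0, 2.2]     # stand-in for random walk forecasts
ratio = msfe_ratio(actual, model_fc, rw_fc)
```

Averaging such ratios over the grid of specifications (1 to 3 factors, 1 to 6 lags) gives the "average model" figures, while taking the minimum within each group gives the ex-post best specification.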
Results
Figure: MSFE ratios for Overall HICP inflation at mid month: average model
[Plot: MSFE ratios (vertical axis, 0.40 to 1.20) at 12, 6, 3 and 1 steps ahead (horizontal axis) for the MON, ALL and DAY models.]
Results
Figure: MSFE ratios for Overall HICP inflation at mid month: best model
[Plot: MSFE ratios (vertical axis, 0.40 to 1.10) at 12, 6, 3 and 1 steps ahead (horizontal axis) for the MON, ALL and DAY models.]
Results
We have a framework for producing early estimates of HICP inflation using weekly and daily data!!! And it performs well!!! Let’s see how it can be used to interpret releases:
Results
March 2009
[Figure: nowcast for March 2009 updated with each data release, on a daily axis starting 28/2/09 and running through March 2009. Legend: RMP energy, RMP non-ene, HICP releases, Nowcast, Flash Estimate Feb09, True value, First Release Feb09. Axis scales: 0.5 to 1.3 and −0.3 to 0.5. Releases annotated in sequence: RMP relative to 23 Feb - 27 Feb; RMP relative to 2 Mar - 6 Mar; HICP releases relative to Feb 09; RMP relative to 9 Mar - 13 Mar; RMP relative to 16 Mar - 20 Mar; RMP relative to 23 Mar - 27 Mar.]
Conclusions
We propose an estimation methodology for factor models that:
- adapts the EM algorithm to deal with a general pattern of missing data
- introduces restrictions on the parameters: group specific factors
- models the dynamics of the idiosyncratic component
- indicates how to extract the News