Analysis of Multiple Time Series
Kevin Sheppard
http://www.kevinsheppard.com
Oxford MFE
This version: February 24, 2020

This week's material: Vector Autoregressions
Vector Autoregressions
◮ Basic examples
◮ Properties
  ⊲ Stationarity
◮ Revisiting univariate ARMA processes
◮ Forecasting
◮ Granger Causality
◮ Impulse Response functions

Cointegration
◮ Examining long-run relationships
◮ Determining whether a VAR is cointegrated
◮ Error Correction Models
◮ Testing for Cointegration
  ⊲ Engle-Granger
Stationary VARs
◮ Determine whether variables feed back into one another
◮ Improve forecasts
◮ Model the effect of a shock in one series on another
◮ Differentiate between short-run and long-run dynamics

Cointegration
◮ Link random walks
◮ Uncover long-run relationships
◮ Can substantially improve medium- to long-term forecasting
Pth order autoregression, AR(P):

yt = φ0 + φ1yt−1 + φ2yt−2 + . . . + φP yt−P + ǫt

Pth order vector autoregression, VAR(P):

yt = Φ0 + Φ1yt−1 + Φ2yt−2 + . . . + ΦP yt−P + ǫt

where yt and ǫt are k by 1 vectors and each Φi is a k by k matrix.

Bivariate VAR(1):

[ y1,t ]   [ φ01 ]   [ φ11 φ12 ] [ y1,t−1 ]   [ ǫ1,t ]
[ y2,t ] = [ φ02 ] + [ φ21 φ22 ] [ y2,t−1 ] + [ ǫ2,t ]
Stationarity is a statistically meaningful form of regularity.
AR(1) stationarity: yt = φyt−1 + ǫt
◮ |φ| < 1
◮ ǫt is white noise

AR(P) stationarity: yt = φ1yt−1 + . . . + φP yt−P + ǫt
◮ Roots of z^P − φ1z^{P−1} − φ2z^{P−2} − . . . − φP−1z − φP all less than 1 in absolute value
◮ ǫt is white noise

No dependence on t
AR(1)
VAR(1)
Stationarity
◮ AR(1): |φ1| < 1
◮ VAR(1): |λi| < 1 where λi are the eigenvalues of Φ1
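The VAR(1) eigenvalue condition is easy to check numerically. A minimal sketch with numpy, using a hypothetical coefficient matrix (the values below are illustrative, not estimates from these slides):

```python
import numpy as np

# Hypothetical bivariate VAR(1) coefficient matrix (illustrative values only)
Phi1 = np.array([[0.5, 0.2],
                 [0.1, 0.6]])

# Stationary iff every eigenvalue of Phi1 lies inside the unit circle
lam = np.linalg.eigvals(Phi1)
is_stationary = bool(np.all(np.abs(lam) < 1))
```

Here the eigenvalues are 0.7 and 0.4, so this VAR(1) is stationary.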
◮ VWM from CRSP
◮ TERM constructed as the 10-year bond return minus the 1-year bond return
February 1962 until December 2018 (683 months)
Long bond model
Estimates
Estimates from VAR
Estimates from AR
Standard tool in monetary policy analysis
◮ Unemployment rate (differenced)
◮ Federal Funds rate
◮ Inflation rate (differenced)
Variable scale affects cross-parameter estimates
◮ Not an issue in ARMA analysis

Standardizing data can improve interpretation when scales differ

Other important measures: statistical significance, persistence, . . .
Companion form:

zt = [y′t, y′t−1, . . . , y′t−P+1]′

Reform into a single VAR(1) in the stacked vector zt, where the companion matrix has Φ1, . . . , ΦP in its first k rows and an identity block below.
◮ All results can be directly applied to the companion form.
◮ Can also be used to transform an AR(P) into a VAR(1).
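A sketch of the companion-form construction (the helper name `companion` and the AR(2) parameter values are my own, for illustration):

```python
import numpy as np

def companion(phis):
    """Stack the VAR(P) coefficient matrices [Phi_1, ..., Phi_P] (each k x k)
    into the kP x kP companion matrix of the equivalent VAR(1)."""
    k = phis[0].shape[0]
    P = len(phis)
    top = np.hstack(phis)
    if P == 1:
        return top
    eye_block = np.hstack([np.eye(k * (P - 1)), np.zeros((k * (P - 1), k))])
    return np.vstack([top, eye_block])

# AR(2) written as a VAR(1) (k = 1): y_t = 1.3 y_{t-1} - 0.4 y_{t-2} + eps_t
F = companion([np.array([[1.3]]), np.array([[-0.4]])])
lam = np.linalg.eigvals(F)
is_stationary = bool(np.all(np.abs(lam) < 1))
```

The companion eigenvalues here are 0.8 and 0.5, so this AR(2) is stationary, exactly the eigenvalue check from the VAR(1) case applied to the stacked system.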
Consider a standard AR(1): yt = φ0 + φ1yt−1 + ǫt

Optimal 1-step ahead forecast:

Et[yt+1] = φ0 + φ1yt

Optimal 2-step ahead forecast:

Et[yt+2] = φ0 + φ1φ0 + φ1^2 yt

Optimal h-step ahead forecast:

Et[yt+h] = Σ_{i=0}^{h−1} φ1^i φ0 + φ1^h yt
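The AR(1) recursion and the closed form give identical forecasts; a minimal sketch with hypothetical parameter values:

```python
# Hypothetical AR(1) parameters and current value (not from the slides)
phi0, phi1, y_t = 0.5, 0.8, 2.0

def forecast(h):
    """Iterate E_t[y_{t+j}] = phi0 + phi1 * E_t[y_{t+j-1}] out h steps."""
    f = y_t
    for _ in range(h):
        f = phi0 + phi1 * f
    return f

# Closed form: sum_{i=0}^{h-1} phi1^i phi0 + phi1^h y_t
h = 3
closed = phi0 * sum(phi1 ** i for i in range(h)) + phi1 ** h * y_t
```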
Identical to the univariate case

Optimal 1-step ahead forecast:

Et[yt+1] = Φ0 + Φ1yt + . . . + ΦP yt−P+1

Optimal h-step ahead forecast:

Et[yt+h] = Φ0 + Φ1Et[yt+h−1] + . . . + ΦP Et[yt+h−P], where Et[yt+h−i] = yt+h−i whenever h − i ≤ 0

Higher order forecasts can be recursively computed
Forecast residuals

Residuals are not white noise; they can contain an MA(h − 1) component
◮ Forecast error for yt+1: yt+1 − ŷt+1|t

◮ Plot your residuals
◮ Residual ACF
◮ Mincer-Zarnowitz regressions

Three-sample procedure
◮ Training sample: used to build the model
◮ Validation sample: used to refine the model
◮ Evaluation sample: ultimate test, ideally one shot
Two methods

Iterative method
◮ Build a model for 1-step ahead forecasts
◮ Iterate the forecast out to period h; for a VAR(1),

Et[yt+h] = Σ_{i=0}^{h−1} Φ1^i Φ0 + Φ1^h yt

◮ Makes efficient use of information
◮ Imposes a lot of structure on the problem

Direct method
◮ Build a model for h-step ahead forecasts
◮ Directly forecast using a pseudo 1-step ahead method
◮ Robust to some nonlinearities
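The iterative method for a VAR(1) can be sketched as follows (the coefficient values are hypothetical, for illustration):

```python
import numpy as np

# Hypothetical stationary VAR(1): y_t = Phi0 + Phi1 y_{t-1} + eps_t
Phi0 = np.array([0.1, 0.2])
Phi1 = np.array([[0.5, 0.2],
                 [0.1, 0.6]])
y_t = np.array([1.0, -1.0])

def iterated_forecast(h):
    """Apply the 1-step rule h times, replacing future shocks with zero."""
    f = y_t
    for _ in range(h):
        f = Phi0 + Phi1 @ f
    return f

# Closed form for h = 2: Phi0 + Phi1 Phi0 + Phi1^2 y_t
closed = Phi0 + Phi1 @ Phi0 + Phi1 @ Phi1 @ y_t
```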
Multistep forecast evaluation is identical to one-step ahead forecast evaluation

h-step ahead forecast errors may be correlated with any forecast error made in the previous h − 1 periods

Leads to an MA(h − 1) structure in the forecast errors

Solutions:
◮ Use a regular GMZ regression with a Newey-West covariance estimator
◮ Explicitly model the MA(h − 1) and use a standard covariance estimator
◮ Forecasts produced iteratively for 1 to 8 quarters ahead
◮ Random walk (FF) or constant mean benchmark
◮ AR and VAR select lag length using BIC
◮ Restricted models force reversion to the in-sample mean using a 2-step procedure
◮ Evaluation based on relative MSE
Univariate identification: Box-Jenkins
◮ Use the ACF and PACF to determine AR and MA lag order
◮ Examine residuals
◮ Parsimony principle

The autocorrelation of a scalar process is defined as ρs = Cov[yt, yt−s] / V[yt]
◮ Regression coefficient: the slope from regressing yt on yt−s

Partial autocorrelation ψs
◮ Regression interpretation of the sth partial autocorrelation: the coefficient on yt−s in yt = φ0 + φ1yt−1 + . . . + φs−1yt−s+1 + ψs yt−s + ǫt
◮ ψs is the sth partial autocorrelation
Multivariate equivalents
◮ ACF and PACF have the same regression definitions
◮ Cross-correlation function ρxy,s
◮ Generally different from ρyx,s
◮ Cross-partial-correlation function ψxy,s
  ⊲ Can help identify the VAR order

Deeper issue: too many and too complicated

Simple solution: model selection
y has HAR dynamics (lags 1, 5, and 22) and spills over to x
Step 1: Pick a maximum lag length
◮ Information criteria, e.g. BIC(P) = ln |Σ(P)| + (ln T / T) × (number of parameters)
  ⊲ Σ(P) is the covariance of the residuals using P lags
  ⊲ | · | is the determinant
◮ Hypothesis testing based
  ⊲ General to Specific
  ⊲ Specific to General
◮ Likelihood Ratio

T (ln |Σ(P1)| − ln |Σ(P2)|) ∼A χ2 with (P2 − P1)k2 degrees of freedom
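Lag selection by information criterion can be sketched as follows. The DGP, the helper name `var_bic`, and the exact penalty form are illustrative assumptions, not taken from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a bivariate VAR(1) so the selected lag length has a known target
Phi1 = np.array([[0.5, 0.2], [0.1, 0.6]])
T, k = 500, 2
y = np.zeros((T, k))
for t in range(1, T):
    y[t] = Phi1 @ y[t - 1] + rng.standard_normal(k)

def var_bic(P):
    """ln|Sigma(P)| plus a BIC-style penalty for a VAR(P) with a constant."""
    Y = y[P:]
    X = np.hstack([np.ones((T - P, 1))] + [y[P - p:T - p] for p in range(1, P + 1)])
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ B
    Sigma = resid.T @ resid / (T - P)
    n_params = k * (1 + k * P)
    return np.log(np.linalg.det(Sigma)) + np.log(T - P) * n_params / (T - P)

best_P = min(range(1, 5), key=var_bic)
```

Since the true process is a VAR(1), adding lags improves the fit only marginally while the penalty grows, so the criterion favors short lag lengths.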
Maximum lag: 12 (1 year)
First fundamentally new concept

Examines whether lags of one variable are helpful in predicting another variable
Translates directly into a restriction in a VAR: the unrestricted model includes lags of both variables in each equation, while the restricted model excludes the lags of the variable being tested.
In a P-lag model, H0 : φij,1 = φij,2 = . . . = φij,P = 0

Alternative is that at least one φij,p ≠ 0

Likelihood Ratio test:

(T − Pk2)(ln |Σr| − ln |Σu|) ∼A χ2_P

Σu is the covariance of the errors from the unrestricted model
Σr is the covariance of the errors from the restricted model
T − Pk2 is the number of observations minus the number of free parameters
◮ Why χ2_P? The null imposes P restrictions, one per lag of series j in equation i.
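A single-equation sketch of the likelihood-ratio logic behind a Granger causality test. The DGP (x causes y, not the reverse) and the helper `lr_stat` are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated DGP (illustrative): x Granger-causes y, y does not cause x
T = 1000
x = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):
    x[t] = 0.5 * x[t - 1] + rng.standard_normal()
    y[t] = 0.3 * y[t - 1] + 0.4 * x[t - 1] + rng.standard_normal()

def lr_stat(dep, X_r, X_u):
    """T (ln SSR_r - ln SSR_u); asymptotically chi2 with dof = #restrictions."""
    def ssr(X):
        b, *_ = np.linalg.lstsq(X, dep, rcond=None)
        e = dep - X @ b
        return e @ e
    return len(dep) * (np.log(ssr(X_r)) - np.log(ssr(X_u)))

const = np.ones(T - 1)
# Does x Granger-cause y? Restrict the lag of x out of the y equation.
stat_xy = lr_stat(y[1:], np.column_stack([const, y[:-1]]),
                  np.column_stack([const, y[:-1], x[:-1]]))
# Reverse direction: does y Granger-cause x?
stat_yx = lr_stat(x[1:], np.column_stack([const, x[:-1]]),
                  np.column_stack([const, x[:-1], y[:-1]]))
```

With one restriction, each statistic is compared to the χ2 with 1 degree of freedom (5% critical value 3.84); only the x → y direction should reject.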
Standard tool in monetary policy analysis
◮ Unemployment rate (differenced)
◮ Federal Funds rate
◮ Inflation rate (differenced)
Using a model with 3 lags (HQIC)

H0 : φij,1 = φij,2 = φij,3 = 0
H1 : φij,1 ≠ 0 or φij,2 ≠ 0 or φij,3 ≠ 0

i represents the series being affected by lags of series j
Second fundamentally new concept

The complicated dynamics of a VAR make direct interpretation of its parameters difficult
◮ Hard to decipher

The solution is to examine impulse responses: the impulse response function of yi with respect to a shock in ǫj traces the effect of the shock through time

As long as yt is covariance stationary it must have a VMA(∞) representation:

yt = μ + Σ_{j=0}^{∞} Ξj ǫt−j

The Ξj are the impulse responses! Why?
◮ They directly measure the effect in period j of any shock
Any stationary AR(P) has an MA(∞) representation

AR(1):

yt = φ0 / (1 − φ1) + Σ_{i=0}^{∞} φ1^i ǫt−i

Stationary VAR(P)s have the same relationship to the VMA(∞)
Easy in a VAR(1):

Ξj = Φ1^j

In the general VAR(P), use the recursion Ξ0 = Ik, Ξj = Σ_{i=1}^{min(j,P)} Φi Ξj−i
◮ In a VAR(2),
  ⊲ Ξ0 = Ik, Ξ1 = Φ1, Ξ2 = Φ1^2 + Φ2, and Ξ3 = Φ1^3 + Φ1Φ2 + Φ2Φ1.

Confidence intervals are also somewhat painful
◮ Explained in notes
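The VAR(2) recursion can be sketched directly (coefficient matrices below are hypothetical):

```python
import numpy as np

# Hypothetical VAR(2) coefficient matrices
Phi = [np.array([[0.4, 0.1], [0.0, 0.3]]),
       np.array([[0.2, 0.0], [0.1, 0.1]])]
k, P = 2, len(Phi)

def impulse_responses(H):
    """VMA coefficients via Xi_0 = I and Xi_j = sum_{i=1}^{min(j,P)} Phi_i Xi_{j-i}."""
    Xi = [np.eye(k)]
    for j in range(1, H + 1):
        Xi.append(sum(Phi[i - 1] @ Xi[j - i] for i in range(1, min(j, P) + 1)))
    return Xi

Xi = impulse_responses(3)
```

The recursion reproduces the closed forms on the slide: Ξ2 = Φ1² + Φ2 and Ξ3 = Φ1³ + Φ1Φ2 + Φ2Φ1.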
Simple bivariate VAR(1)

Depends on the correlation between ǫ1,t and ǫ2,t

3 methods
◮ Ignore correlation and just shock ǫj,t with a 1 standard deviation impulse
◮ Use the Cholesky factor of Σ and shock with Σ_C^{1/2} ej, where ej is a vector of zeros with a 1 in the jth position
◮ "Generalized" impulse response that uses a projection method
Define the error covariance Σ = E[ǫtǫ′t] and its Cholesky factor Σ_C^{1/2}, so that Σ = Σ_C^{1/2} (Σ_C^{1/2})′
Federal Funds ordered first

Response to Federal Funds shock, Cholesky factorization
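The Cholesky orthogonalization and the role of ordering can be sketched as follows (the covariance matrix is a hypothetical example):

```python
import numpy as np

# Hypothetical error covariance with correlated shocks
Sigma = np.array([[1.0, 0.5],
                  [0.5, 2.0]])

# Lower-triangular Cholesky factor: Sigma = C @ C.T
C = np.linalg.cholesky(Sigma)

# Orthogonalized impact of a shock to variable j is column j of C. With
# variable 0 ordered first, its shock moves variable 1 on impact, but a
# shock to variable 1 has no impact effect on variable 0.
impact_0 = C @ np.array([1.0, 0.0])
impact_1 = C @ np.array([0.0, 1.0])
```

This asymmetry is why the ordering of the variables matters for Cholesky-based impulse responses.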
Cointegration is the VAR version of unit roots

Establishes long-run relationships between two unit-root variables
◮ Consumption has a unit root, income has a unit root
◮ Consumption − Income: ????
Strong link between xt and yt

Both are random walks but the difference is mean reverting

Mean reversion to the (stochastic) trend
An eigenvalue condition determines whether a VAR(1) is cointegrated

If all eigenvalues are less than 1: stationary
If both are 1: two independent unit roots
If one is 1 and the other is less than 1: cointegrated
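The eigenvalue condition can be checked numerically; a sketch with a hypothetical VAR(1) that has exactly one unit eigenvalue:

```python
import numpy as np

# Hypothetical VAR(1) with one unit eigenvalue: a cointegrated system
Phi1 = np.array([[0.8, 0.2],
                 [0.2, 0.8]])
lam = np.linalg.eigvals(Phi1)

# pi = Phi1 - I has reduced rank (here 1): one cointegrating vector
pi = Phi1 - np.eye(2)
r = np.linalg.matrix_rank(pi)
```

The eigenvalues are 1.0 and 0.6, so the system has a single common unit root and one cointegrating relationship.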
Major point of cointegration
◮ Cointegrated ⇔ Error correction model

What is an error correction model?
◮ Cointegrated VAR in error-correction form: ∆yt = αβ′yt−1 + ǫt
◮ In the example, α = [−.2 .2]′ measures the speed of adjustment
α measures the speed of convergence

β contains the cointegrating vectors

The number of cointegrating vectors is rank(αβ′)

How many?
Put π in row echelon form
Recall π = αβ′
Two tests for cointegration
◮ Engle-Granger
◮ Johansen

We will focus on Engle-Granger
◮ Simple and intuitive
◮ Only applicable with 1 cointegrating relationship

Tests the key property of cointegration: the difference is I(0)

Most of the work is a simple OLS regression; the rest is testing whether the estimated residual contains a unit root

Johansen tests the eigenvalues of π = αβ′ directly.
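The first Engle-Granger step is plain OLS. A sketch on a simulated cointegrated pair (the DGP is an illustrative assumption; a real test would apply Engle-Granger critical values to a unit-root test on the residual):

```python
import numpy as np

rng = np.random.default_rng(7)

# Simulated cointegrated pair: x is a random walk, y = x + I(0) noise
T = 1000
x = np.cumsum(rng.standard_normal(T))
y = x + rng.standard_normal(T)

# Step 1: OLS of y on a constant and x (most of the work)
X = np.column_stack([np.ones(T), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

# Step 2 tests resid for a unit root, using Engle-Granger critical values
# rather than standard Dickey-Fuller ones. As a rough indication, the
# residual's AR(1) coefficient is far below 1 when the pair is cointegrated.
rho = (resid[1:] @ resid[:-1]) / (resid[:-1] @ resid[:-1])
```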
Deterministic terms
◮ No deterministic terms: only in special circumstances
◮ Constant: standard case
◮ Time trend and constant: allows different growth rates/time trends in the series

Critical values
◮ Critical values depend on the deterministics in the CI regression
  ⊲ Models with more deterministics have lower (more negative) critical values
◮ Critical values depend on the number of RHS I(1) variables
  ⊲ Larger models have lower critical values
Consumption-Aggregate Wealth has been an interesting application

Has revived the CCAPM

Three components:
◮ Consumption (c)
◮ Asset Wealth (a)
◮ Labor Income (Human Wealth) (y)

Deviation from the long run is related to expected returns

Cointegrating relationship: ct + .643 − 0.249at − 0.785yt
[Figure: Consumption, Asset Prices, and Labor Income, 1960–2010]

[Figure: c residual, a residual (neg.), and y residual (neg.), 1960–2010]
VECM estimated using the residuals from the cointegrating regression

[Table of VECM parameter estimates, p-values in parentheses]

Estimation of the cointegrating relationship has no effect on the standard errors of the VECM parameters
◮ The cointegrating vector converges fast (rate T)
◮ VECM parameters converge at rate √T
Caution is needed when working with I(1) data
◮ I(0) on I(0): the usual case; standard asymptotic arguments apply
◮ I(1) on I(0): this regression is unbalanced
◮ I(1) on I(1): cointegration or spurious regression
◮ I(0) on I(1): this regression is unbalanced

Spurious regression can lead to large t-stats when the series are unrelated
◮ Two unrelated I(1) processes, xt and yt
◮ When T = 50, approx 80% of t-stats are significant
◮ Always check for I(1) when using time-series data
◮ If both are I(1), make sure they are cointegrated
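The roughly-80% figure can be reproduced by simulation; a sketch (sample size, replication count, and seed are choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# Regress one random walk on another, unrelated one, many times, and
# record how often the slope's t-stat exceeds 1.96 in absolute value.
T, reps = 50, 1000
significant = 0
for _ in range(reps):
    x = np.cumsum(rng.standard_normal(T))
    y = np.cumsum(rng.standard_normal(T))
    X = np.column_stack([np.ones(T), x])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ b
    s2 = e @ e / (T - 2)
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    significant += abs(b[1] / se) > 1.96
frac = significant / reps
```

Even though x and y are independent by construction, the rejection rate is far above the nominal 5%.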
It is common to run regressions using time-series data

Using time-series data in a cross-sectional regression may require a modification to the covariance estimator

Modification is needed if the scores {xtǫt} are autocorrelated
◮ Usually occurs when the errors ǫt are autocorrelated due to mis- or under-specified dynamics
Consider the estimation of the mean when yt has white noise errors:

yt = μ + ǫt

Obviously, the sample mean is unbiased: E[ȳ] = E[T^{-1} Σ_{t=1}^{T} yt] = μ

The variance of the sample mean:

V[ȳ] = E[(T^{-1} Σ_{t=1}^{T} yt − μ)^2]
     = E[(T^{-1} Σ_{t=1}^{T} ǫt)^2]
     = T^{-2} Σ_{t=1}^{T} E[ǫt^2] + T^{-2} Σ_{t=1}^{T} Σ_{s≠t} E[ǫtǫs]
     = T^{-2} · T σ^2
     = σ^2 / T

Due to white noise, E[ǫiǫj] = 0 whenever i ≠ j. This is the usual result.
Now suppose that the error follows an MA(1):

yt = μ + ηt,   ηt = θǫt−1 + ǫt

The error has mean 0 and so the sample mean is still unbiased

The variance of the sample mean is different since ηt is autocorrelated
◮ E[ηtηt−1] ≠ 0

V[ȳ] = T^{-2} Σ_{t=1}^{T} E[ηt^2] + 2 T^{-2} Σ_{t=1}^{T−1} E[ηtηt+1]
     = T^{-2} T γ0 + 2 T^{-2} (T − 1) γ1

In terms of autocovariances,

γ0 = V[ηt] = (1 + θ^2) σ^2
γ1 = E[ηtηt−1] = θ σ^2

An MA(1) has 1 non-zero autocovariance, γ1.
Putting it all together:

V[ȳ] = T^{-1} (γ0 + 2 (T − 1)/T · γ1) ≈ T^{-1} (γ0 + 2γ1)

When the scores are uncorrelated (a Martingale Difference Sequence), no adjustment is needed
White's estimator is only heteroskedasticity robust, not autocorrelation robust:

T^{-1} Σ_{t=1}^{T} ǫ̂t^2 xtx′t →p S only when the scores are uncorrelated
The solution is to use a Newey-West covariance for the scores (xtǫt)

Define z∗t = zt − z̄ where z̄ = T^{-1} Σ_{t=1}^{T} zt. The L-lag Newey-West covariance estimator is

Σ̂_NW = Γ̂0 + Σ_{l=1}^{L} wl (Γ̂l + Γ̂′l)

where Γ̂l = T^{-1} Σ_{t=l+1}^{T} z∗t z∗′t−l and wl = 1 − l/(L + 1).
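The Newey-West formula translates directly into code; a sketch (the function name `newey_west` is my own):

```python
import numpy as np

def newey_west(z, L):
    """L-lag Newey-West long-run covariance of a T x k array of scores z."""
    z = np.asarray(z, dtype=float)
    if z.ndim == 1:
        z = z[:, None]
    zs = z - z.mean(axis=0)                # z*_t = z_t - zbar
    T = zs.shape[0]
    S = zs.T @ zs / T                      # Gamma_0
    for l in range(1, L + 1):
        w = 1 - l / (L + 1)                # Bartlett weight w_l
        G = zs[l:].T @ zs[:-l] / T         # Gamma_l
        S = S + w * (G + G.T)
    return S
```

The Bartlett weights guarantee the estimate is positive semi-definite, which a plain truncated sum of autocovariances does not.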
Applied to a cross-sectional regression with time-series data, the Newey-West estimator of the score covariance is

Ŝ = T^{-1} Σ_{t=1}^{T} ǫ̂t^2 xtx′t + Σ_{l=1}^{L} wl T^{-1} Σ_{s=l+1}^{T} ǫ̂sǫ̂s−l (xsx′s−l + xs−lx′s)

with wl = 1 − l/(L + 1)
Is a Newey-West estimator needed? Complex estimators have costs.

It must be the case that L → ∞ as T → ∞
◮ Even if the scores follow an MA(1)!

The optimal rate is O(T^{1/3}), so L ∝ T^{1/3}, or L = cT^{1/3} for some (unknown) c

Other HAC estimators are available and may work well if the scores are persistent
◮ Den Haan-Levin

An alternative is to include lagged regressand(s) in the regression
◮ Not popular when the focus is on the cross-sectional component of the model