Non-Stationary Time Series and Unit Root Tests, Heino Bohn Nielsen

SLIDE 1

Econometrics 2

Non-Stationary Time Series and Unit Root Tests

Heino Bohn Nielsen

1 of 28

SLIDE 2

Outline

(1) Deviations from stationarity:

  • Trends.
  • Level shifts.
  • Variance changes.
  • Unit roots.

(2) More on autoregressive unit root processes.

(3) Dickey-Fuller unit root test.

  • AR(1).
  • AR(p).
  • Deterministic terms.

(4) Further issues.

SLIDE 3

Stationarity

  • The main assumption on the time series data so far has been stationarity.

Recall the definition: A time series is called weakly stationary if

$$\begin{aligned}
E[y_t] &= \mu \\
V[y_t] &= E[(y_t - \mu)^2] = \gamma_0 \\
\mathrm{Cov}[y_t, y_{t-k}] &= E[(y_t - \mu)(y_{t-k} - \mu)] = \gamma_k
\end{aligned}$$

for k = 1, 2, ...

  • This can be violated in different ways.

Examples of non-stationarity:

(A) Deterministic trends (trend stationarity).
(B) Level shifts.
(C) Variance changes.
(D) Unit roots (stochastic trends).

SLIDE 4

Four Non-Stationary Time Series

[Figure: four simulated series, each with 200 observations. (A) Stationary and trend-stationary process ($\tilde{x}_t$ and $x_t$). (B) Process with a level shift. (C) Process with a change in the variance. (D) Unit root process.]

SLIDE 5

(A) Trend-Stationarity

  • Observation: Many macro-economic variables are trending.

We need a model for a trending variable, yt.

  • Assume that xt is stationary and that yt is xt plus a deterministic linear trend, e.g.

$$x_t = \theta x_{t-1} + \epsilon_t, \quad |\theta| < 1,$$
$$y_t = x_t + \mu_0 + \mu_1 t.$$

  • Remarks:

(1) $y_t$ has a trending mean, $E[y_t] = \mu_0 + \mu_1 t$, and is non-stationary.

(2) The de-trended variable, $\hat{x}_t = y_t - \hat{\mu}_0 - \hat{\mu}_1 t$, is stationary. We say that $y_t$ is trend-stationary.

(3) The stochastic part is stationary and standard asymptotics apply to $\hat{x}_t$.

(4) Solution: Extend the regression with a deterministic trend, e.g.

$$y_t = \beta_0 + \beta_1 \cdot z_t + \beta_3 \cdot t + \epsilon_t.$$
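A minimal sketch of the de-trending in remark (2), using only the Python standard library. The parameter values (θ = 0.5, µ0 = 1, µ1 = 0.05, T = 200) and the seed are illustrative assumptions, not values from the slides; the regression contains only a constant and a trend (no $z_t$).

```python
import random

random.seed(0)
T, theta, mu0, mu1 = 200, 0.5, 1.0, 0.05  # illustrative values, not from the slides

# Stationary AR(1) part x_t and the trending series y_t = x_t + mu0 + mu1 * t
x, y = 0.0, []
for t in range(1, T + 1):
    x = theta * x + random.gauss(0.0, 1.0)
    y.append(x + mu0 + mu1 * t)

# OLS of y_t on a constant and t (closed-form simple regression)
ts = list(range(1, T + 1))
t_bar, y_bar = sum(ts) / T, sum(y) / T
sxx = sum((t - t_bar) ** 2 for t in ts)
mu1_hat = sum((t - t_bar) * (v - y_bar) for t, v in zip(ts, y)) / sxx
mu0_hat = y_bar - mu1_hat * t_bar

# De-trended variable: x_hat_t = y_t - mu0_hat - mu1_hat * t
x_hat = [v - mu0_hat - mu1_hat * t for t, v in zip(ts, y)]
print(round(mu1_hat, 3), round(sum(x_hat) / T, 6))
```

The residuals `x_hat` average to zero by construction; whether they look stationary is what the later unit root tests address.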

SLIDE 6

(B) Level Shifts and Structural Breaks

  • Another type of non-stationarity is due to changes in parameters, e.g. a level shift:

$$E[y_t] = \begin{cases} \mu_1 & \text{for } t = 1, 2, ..., T_0 \\ \mu_2 & \text{for } t = T_0 + 1, T_0 + 2, ..., T. \end{cases}$$

  • If each sub-sample is stationary, then there are two modelling approaches:

(1) Include a dummy variable

$$D_t = \begin{cases} 0 & \text{for } t = 1, 2, ..., T_0 \\ 1 & \text{for } t = T_0 + 1, T_0 + 2, ..., T \end{cases}$$

in the regression model,

$$y_t = \beta_0 + \beta_1 \cdot z_t + \beta_3 \cdot D_t + \epsilon_t.$$

If $y_t - \beta_3 \cdot D_t$ is stationary, standard asymptotics apply.

(2) Analyze the two sub-samples separately.

This is particularly relevant if we think that more parameters have changed.
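A small sketch of approach (1), assuming a known break date and, for simplicity, i.i.d. noise and no $z_t$ regressor. The values T = 200, T0 = 100, µ1 = 1 and µ2 = 4 are hypothetical. With only a constant and the dummy, OLS reduces to the two sub-sample means.

```python
import random

rng = random.Random(6)
T, T0, mu1, mu2 = 200, 100, 1.0, 4.0      # illustrative values, not from the slides

# y_t has mean mu1 up to T0 and mu2 afterwards (i.i.d. noise for simplicity)
y = [(mu1 if t <= T0 else mu2) + rng.gauss(0.0, 1.0) for t in range(1, T + 1)]
D = [0 if t <= T0 else 1 for t in range(1, T + 1)]

# OLS of y_t on a constant and D_t: closed form for one binary regressor
mean_pre = sum(v for v, d in zip(y, D) if d == 0) / D.count(0)
mean_post = sum(v for v, d in zip(y, D) if d == 1) / D.count(1)
beta0_hat = mean_pre                  # intercept = pre-break mean
beta3_hat = mean_post - mean_pre      # dummy coefficient = size of the shift

# Shift-adjusted series y_t - beta3_hat * D_t, which has a constant mean
adjusted = [v - beta3_hat * d for v, d in zip(y, D)]
print(round(beta0_hat, 2), round(beta3_hat, 2))
```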

SLIDE 7

(C) Changing Variances

  • A third type of non-stationarity is related to changes in the variance.

An example is

$$y_t = 0.5 \cdot y_{t-1} + \epsilon_t,$$

where

$$\epsilon_t \sim \begin{cases} N(0, 1) & \text{for } t = 1, 2, ..., T_0 \\ N(0, 5) & \text{for } t = T_0 + 1, T_0 + 2, ..., T. \end{cases}$$

The interpretation is that the time series covers different regimes.

  • A natural solution is to model the regimes separately.
  • Alternatively we can try to model the variance.

We return to so-called ARCH models for changing variance later.

SLIDE 8

(D) Unit Roots

  • If there is a unit root in an autoregressive model, no standard asymptotics apply!

Consider the DGP

$$y_t = \theta y_{t-1} + \epsilon_t, \quad \epsilon_t \sim N(0, 1),$$

for t = 1, 2, ..., 500, and $y_0 = 0$. Consider the distribution of $\hat{\theta}$.

  • Note: the shape, location and variance of the distributions.

[Figure: (C) Distribution of $\hat{\theta}$ for θ = 0.5, shown with a normal density, N(s = 0.0389). (D) Distribution of $\hat{\theta}$ for θ = 1, shown with a normal density, N(s = 0.00588).]
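The experiment behind the figure can be sketched as a small Monte Carlo. The DGP (T = 500, $y_0 = 0$, standard normal shocks) is from the slide; the number of replications and the seed are assumptions, since the slides do not state them.

```python
import random
import statistics

def theta_hat(theta, T, rng):
    """OLS estimate of theta in y_t = theta * y_{t-1} + eps_t with y_0 = 0."""
    y_prev, num, den = 0.0, 0.0, 0.0
    for _ in range(T):
        y = theta * y_prev + rng.gauss(0.0, 1.0)
        num += y_prev * y
        den += y_prev * y_prev
        y_prev = y
    return num / den

rng = random.Random(1)
reps = 500  # assumed number of replications
summary = {}
for theta in (0.5, 1.0):
    draws = [theta_hat(theta, 500, rng) for _ in range(reps)]
    summary[theta] = (statistics.mean(draws), statistics.stdev(draws))
    print(theta, round(summary[theta][0], 4), round(summary[theta][1], 4))
```

For θ = 0.5 the spread should be close to the normal approximation in panel (C); for θ = 1 the estimates are much more tightly concentrated and centred slightly below one.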

SLIDE 9

Properties of a Stationary AR(1)

  • Consider the AR(1) model

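$$y_t = \theta y_{t-1} + \epsilon_t, \quad t = 1, 2, ..., T.$$

The characteristic polynomial is $\theta(z) = 1 - \theta z$, with characteristic root $z_1 = \theta^{-1}$ and inverse root $\phi_1 = z_1^{-1} = \theta$. The stationarity condition is that $|\theta| < 1$.

  • Recall the solution

$$y_t = \epsilon_t + \theta \epsilon_{t-1} + \theta^2 \epsilon_{t-2} + ... + \theta^{t-1} \epsilon_1 + \theta^t y_0,$$

where $\theta^s \to 0$. Shocks have only transitory effects; $y_t$ has an attractor.

  • Note the properties

$$\begin{aligned}
E[y_t] &= \theta^t y_0 \to 0 \\
V[y_t] &= \sigma^2 + \theta^2 \sigma^2 + \theta^4 \sigma^2 + ... + \theta^{2(t-1)} \sigma^2 \to \frac{\sigma^2}{1 - \theta^2} \\
\rho_s &= \mathrm{Corr}(y_t, y_{t-s}) = \theta^s
\end{aligned}$$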
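A short numerical check of these properties, assuming θ = 0.8 and σ² = 1 (illustrative values): the finite-sample variance is the partial geometric sum and approaches σ²/(1 − θ²), while the weight θ^t on the initial value (or on a shock t periods back) dies out.

```python
theta, sigma2 = 0.8, 1.0   # illustrative values

def var_y(t):
    """V[y_t] = sigma2 * (1 + theta^2 + ... + theta^(2(t-1))) for fixed y_0."""
    return sigma2 * sum(theta ** (2 * i) for i in range(t))

limit = sigma2 / (1 - theta ** 2)
for t in (1, 5, 20, 100):
    # variance after t periods, and the weight theta^t left on y_0
    print(t, round(var_y(t), 4), round(theta ** t, 6))
print("limit", round(limit, 4))
```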

SLIDE 10

Simulated Example

[Figure: (A) Shock to a stationary process, θ = 0.8. (B) Shock to a unit root process, θ = 1. (C) ACF for stationary process, θ = 0.8. (D) ACF for unit root process, θ = 1.]

SLIDE 11

Autoregression with a Unit Root

  • Consider the AR(1) with θ = 1, i.e.

$$y_t = y_{t-1} + \epsilon_t.$$

The characteristic polynomial is θ(z) = 1 − z. There is a unit root, θ(1) = 0.

  • The solution is given by

$$y_t = y_0 + \Delta y_1 + \Delta y_2 + ... + \Delta y_t = y_0 + \epsilon_1 + \epsilon_2 + ... + \epsilon_t = y_0 + \sum_{i=1}^{t} \epsilon_i.$$

  • Note the remarkable differences between θ = 1 and a stationary process, |θ| < 1 :

(1) The effect of the initial value, $y_0$, stays in the process. And $E[y_t \mid y_0] = y_0$.

(2) Shocks, $\epsilon_t$, have permanent effects. They accumulate to a random walk component, $\sum_i \epsilon_i$, called a stochastic trend.

SLIDE 12

(3) The variance increases,

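$$V[y_t] = V\left[\sum_{i=1}^{t} \epsilon_i\right] = t \cdot \sigma^2.$$

The process is clearly non-stationary.

(4) The covariance, $\mathrm{Cov}(y_t, y_{t-s})$, is given by

$$E[(y_t - y_0)(y_{t-s} - y_0)] = E[(\epsilon_1 + \epsilon_2 + ... + \epsilon_t)(\epsilon_1 + \epsilon_2 + ... + \epsilon_{t-s})] = (t - s)\sigma^2.$$

The autocorrelation is

$$\mathrm{Corr}(y_t, y_{t-s}) = \frac{\mathrm{Cov}(y_t, y_{t-s})}{\sqrt{V[y_t] \cdot V[y_{t-s}]}} = \frac{(t - s)\sigma^2}{\sqrt{t\sigma^2 \cdot (t - s)\sigma^2}} = \sqrt{\frac{t - s}{t}},$$

which dies out very slowly with s.

(5) The first difference, $\Delta y_t = \epsilon_t$, is stationary.

$y_t$ is called integrated of first order, I(1).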
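The variance and autocorrelation formulas in (3) and (4) can be checked by simulation. This is a sketch under assumed settings (2000 random walks, t = 100, s = 50, σ² = 1, $y_0 = 0$), none of which come from the slides.

```python
import random

rng = random.Random(2)
N, t, s = 2000, 100, 50   # assumed simulation sizes; sigma^2 = 1 and y_0 = 0
y_t, y_ts = [], []
for _ in range(N):
    level = 0.0
    for i in range(1, t + 1):
        level += rng.gauss(0.0, 1.0)     # accumulate shocks: random walk
        if i == t - s:
            y_ts.append(level)           # store y_{t-s}
    y_t.append(level)                    # store y_t

def cov(a, b):
    """Sample covariance across the simulated paths."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / (len(a) - 1)

var_t = cov(y_t, y_t)
corr = cov(y_t, y_ts) / (cov(y_t, y_t) * cov(y_ts, y_ts)) ** 0.5
print(round(var_t, 1), "vs t * sigma^2 =", t)
print(round(corr, 3), "vs sqrt((t - s) / t) =", round(((t - s) / t) ** 0.5, 3))
```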

SLIDE 13

Deterministic Terms

  • To model actual time series we include deterministic terms, e.g.

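$$y_t = \delta + \theta y_{t-1} + \epsilon_t.$$

With a unit root, the terms accumulate!

  • If |θ| < 1, the solution is

$$y_t = \theta^t y_0 + \sum_{i=0}^{t-1} \theta^i \epsilon_{t-i} + (1 + \theta + \theta^2 + ... + \theta^{t-1})\delta,$$

where the mean converges, $(1 + \theta + \theta^2 + ... + \theta^{t-1})\delta \to \delta/(1 - \theta)$.

  • If θ = 1, the solution is

$$y_t = y_0 + \sum_{i=1}^{t} (\delta + \epsilon_i) = y_0 + \delta t + \sum_{i=1}^{t} \epsilon_i.$$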

The constant term produces a deterministic linear trend: Random walk with drift. Note the parallel between a deterministic and a stochastic trend.
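To see the accumulation of the constant, the mean of $y_t$ can be iterated directly, since the shocks have mean zero. The values δ = 0.5, $y_0 = 0$ and θ = 0.8 for the stationary case are illustrative assumptions.

```python
def mean_path(theta, delta, y0, T):
    """Iterate E[y_t] = delta + theta * E[y_{t-1}] (shocks have mean zero)."""
    m, path = y0, []
    for _ in range(T):
        m = delta + theta * m
        path.append(m)
    return path

delta, y0, T = 0.5, 0.0, 200   # illustrative values, not from the slides
stationary = mean_path(0.8, delta, y0, T)
unit_root = mean_path(1.0, delta, y0, T)
print(round(stationary[-1], 6), "vs delta / (1 - theta) =", delta / (1 - 0.8))
print(unit_root[-1], "vs y0 + delta * T =", y0 + delta * T)
```

In the stationary case the mean settles at a level; with θ = 1 the same constant generates a linear trend with slope δ.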

SLIDE 14

Unit Root Testing

  • Estimate an autoregressive model and test whether

θ(1) = 0,

i.e. whether z = 1 is a root in the autoregressive polynomial.

  • This is a straightforward hypothesis test!

We compare two relevant models: H0 and HA. The complication is that the asymptotic distributions are not standard.

  • Issues:

(1) Remember to specify the statistical model carefully!

(2) What kinds of deterministic components are relevant: constant or trend?

(3) What are the properties of the model under the null and under the alternative? Are both H0 and HA relevant?

(4) What is the relevant distribution of the test statistic?

SLIDE 15

Dickey-Fuller Test in an AR(1)

  • Consider an AR(1) model

$$y_t = \theta y_{t-1} + \epsilon_t.$$

The unit root hypothesis is $\theta(1) = 1 - \theta = 0$. The one-sided test against stationarity is

$$H_0: \theta = 1 \quad \text{against} \quad H_A: -1 < \theta < 1.$$

  • An equivalent formulation is

$$\Delta y_t = \pi y_{t-1} + \epsilon_t,$$

where $\pi = \theta - 1 = -\theta(1)$. The hypothesis $\theta(1) = 0$ translates into

$$H_0: \pi = 0 \quad \text{against} \quad H_A: -2 < \pi < 0.$$

  • The Dickey-Fuller (DF) test statistic is simply the t-ratio, i.e.

$$\hat{\tau} = \frac{\hat{\theta} - 1}{\mathrm{se}(\hat{\theta})} = \frac{\hat{\pi}}{\mathrm{se}(\hat{\pi})}.$$

The asymptotic distribution is Dickey-Fuller, DF, and not N(0, 1).

SLIDE 16

Quantile Distribution

Quantile        1%      2.5%    5%      10%
N(0, 1)         −2.33   −1.96   −1.64   −1.28
DF              −2.56   −2.23   −1.94   −1.62
DFc             −3.43   −3.12   −2.86   −2.57
DFl             −3.96   −3.66   −3.41   −3.13

[Figure: (A) Dickey-Fuller distributions. Densities of N(0,1), DF, DFc and DFl.]
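The DF row of the table can be approximated by simulating the t-ratio on π under the null. This is a rough sketch: the sample size (T = 200), number of replications (2000) and seed are assumptions, and the finite-sample quantile will only be close to the asymptotic value in the table.

```python
import random

def df_tstat(y):
    """t-ratio on pi in  dy_t = pi * y_{t-1} + eps_t  (no deterministic terms)."""
    lag = y[:-1]
    dy = [b - a for a, b in zip(y[:-1], y[1:])]
    sxx = sum(v * v for v in lag)
    pi_hat = sum(v * d for v, d in zip(lag, dy)) / sxx
    rss = sum((d - pi_hat * v) ** 2 for v, d in zip(lag, dy))
    se = (rss / (len(dy) - 1) / sxx) ** 0.5
    return pi_hat / se

rng = random.Random(3)
T, reps = 200, 2000            # assumed simulation sizes
stats = []
for _ in range(reps):
    y = [0.0]
    for _ in range(T):
        y.append(y[-1] + rng.gauss(0.0, 1.0))   # random walk: H0 is true
    stats.append(df_tstat(y))
stats.sort()
q05 = stats[int(0.05 * reps)]  # empirical 5% quantile
print(round(q05, 2))           # compare with the DF row (-1.94), not N(0,1) (-1.64)
```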

SLIDE 17

Dickey-Fuller Test in an AR(p)

  • The DF distribution is derived under the assumption that $\epsilon_t$ is IID.

For the AR(p) process we derive the Augmented Dickey-Fuller (ADF) test.

  • Consider the case of p = 3 lags:

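$$y_t = \theta_1 y_{t-1} + \theta_2 y_{t-2} + \theta_3 y_{t-3} + \epsilon_t.$$

A unit root in $\theta(z) = 1 - \theta_1 z - \theta_2 z^2 - \theta_3 z^3$ corresponds to $\theta(1) = 0$. To avoid testing a restriction on $1 - \theta_1 - \theta_2 - \theta_3$, the model is rewritten as

$$\begin{aligned}
y_t - y_{t-1} &= (\theta_1 - 1) y_{t-1} + \theta_2 y_{t-2} + \theta_3 y_{t-3} + \epsilon_t \\
y_t - y_{t-1} &= (\theta_1 - 1) y_{t-1} + (\theta_2 + \theta_3) y_{t-2} + \theta_3 (y_{t-3} - y_{t-2}) + \epsilon_t \\
y_t - y_{t-1} &= (\theta_1 + \theta_2 + \theta_3 - 1) y_{t-1} + (\theta_2 + \theta_3)(y_{t-2} - y_{t-1}) + \theta_3 (y_{t-3} - y_{t-2}) + \epsilon_t \\
\Delta y_t &= \pi y_{t-1} + c_1 \Delta y_{t-1} + c_2 \Delta y_{t-2} + \epsilon_t,
\end{aligned}$$

where

$$\pi = \theta_1 + \theta_2 + \theta_3 - 1 = -\theta(1), \quad c_1 = -(\theta_2 + \theta_3), \quad c_2 = -\theta_3.$$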
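The rewriting is purely algebraic, which can be confirmed numerically: for any AR(3) coefficients, the level form and the ADF form give the same $\Delta y_t$. The coefficients and starting values below are arbitrary illustrative choices.

```python
import random

theta1, theta2, theta3 = 0.5, 0.2, 0.1     # illustrative AR(3) coefficients
pi = theta1 + theta2 + theta3 - 1          # = -theta(1)
c1 = -(theta2 + theta3)
c2 = -theta3

rng = random.Random(4)
y = [0.3, -0.1, 0.4]                       # arbitrary starting values
max_gap = 0.0
for t in range(3, 200):
    eps = rng.gauss(0.0, 1.0)
    # AR(3) in levels
    level = theta1 * y[t - 1] + theta2 * y[t - 2] + theta3 * y[t - 3] + eps
    # ADF form: dy_t = pi*y_{t-1} + c1*dy_{t-1} + c2*dy_{t-2} + eps_t
    dy_adf = pi * y[t - 1] + c1 * (y[t - 1] - y[t - 2]) + c2 * (y[t - 2] - y[t - 3]) + eps
    max_gap = max(max_gap, abs((level - y[t - 1]) - dy_adf))
    y.append(level)
print(max_gap)   # difference between the two forms, zero up to rounding error
```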

SLIDE 18
  • The hypothesis for θ(1) = 0 is unchanged:

H0 : π = 0

against

HA : −2 < π < 0.

The t-test statistic $\tau_{\pi=0}$ again follows the DF-distribution.

  • Remarks:

(1) It is only the test for π = 0 that follows the DF distribution.

Tests on c1 and c2 are N(0, 1).

(2) We use the normal tools to determine the appropriate lag-length: general-to-specific testing or information criteria.

(3) Verbeek suggests calculating the DF test for all values of p.

... but why should we look at inferior or misspecified models? Find the best model and test in that.

SLIDE 19

Dickey-Fuller Test with a Constant Term

  • We need deterministic variables to model actual time series, $E[y_t] \neq 0$.

The DF regression with a constant term (and p = 3 lags again) is

$$\Delta y_t = \delta + \pi y_{t-1} + c_1 \Delta y_{t-1} + c_2 \Delta y_{t-2} + \epsilon_t. \qquad (*)$$

The hypothesis is unchanged, $H_0: \pi = 0$, and as a test statistic we can use

$$\hat{\tau}_c = \frac{\hat{\pi}}{\mathrm{se}(\hat{\pi})}.$$

  • Remarks:

(1) The constant term in the regression changes the asymptotic distribution.

The relevant distribution, DFc, is shifted to the left of DF.

(2) Under the null hypothesis, π = 0, the constant gives a trend in $y_t$. We have

$$y_t = \begin{cases} \mu + \text{stationary process} & \text{for } |\theta| < 1 \\ y_0 + \text{random walk} + \delta t & \text{for } \theta = 1. \end{cases}$$

That is not a natural comparison. We assume that δ = 0 if θ = 1.
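A sketch of how the t-ratio in the DF regression with a constant could be computed without external libraries. The helper names `ols` and `adf_tstat_const` are hypothetical, the simulated series (a stationary AR(1) around a mean of 5) is an assumed example, and the small normal-equations solver is only meant for a handful of regressors.

```python
import random

def ols(X, y):
    """OLS via Gauss-Jordan on the normal equations; returns (coefficients, standard errors)."""
    n, k = len(X), len(X[0])
    # Augmented system [X'X | X'y | I]; elimination leaves [I | beta | (X'X)^-1]
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * v for row, v in zip(X, y))]
         + [float(i == j) for j in range(k)]
         for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))   # partial pivoting
        A[c], A[p] = A[p], A[c]
        A[c] = [v / A[c][c] for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][k] for i in range(k)]
    rss = sum((v - sum(b * x for b, x in zip(beta, row))) ** 2 for row, v in zip(X, y))
    s2 = rss / (n - k)
    se = [(s2 * A[i][k + 1 + i]) ** 0.5 for i in range(k)]
    return beta, se

def adf_tstat_const(y, n_lagged_diffs):
    """t-ratio on pi in  dy_t = delta + pi*y_{t-1} + sum_j c_j*dy_{t-j} + eps_t."""
    dy = [b - a for a, b in zip(y[:-1], y[1:])]      # dy[i] is the change into y[i+1]
    X, lhs = [], []
    for t in range(n_lagged_diffs + 1, len(y)):
        X.append([1.0, y[t - 1]] + [dy[t - 1 - j] for j in range(1, n_lagged_diffs + 1)])
        lhs.append(dy[t - 1])
    beta, se = ols(X, lhs)
    return beta[1] / se[1]

rng = random.Random(5)
y = [5.0]
for _ in range(500):                                  # stationary AR(1) around a mean of 5
    y.append(2.5 + 0.5 * y[-1] + rng.gauss(0.0, 1.0))
tau_c = adf_tstat_const(y, n_lagged_diffs=2)          # p = 3 lags in levels
print(round(tau_c, 2), tau_c < -2.86)                 # -2.86: 5% quantile of DFc
```

The statistic is compared with the DFc quantiles, not with N(0, 1) or the DF row without a constant.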

SLIDE 20
  • A more satisfactory hypothesis is $H_0^*: \pi = \delta = 0$, i.e. compare (∗) with

$$\Delta y_t = c_1 \Delta y_{t-1} + c_2 \Delta y_{t-2} + \epsilon_t. \qquad (**)$$

  • The joint hypothesis can be tested by a LR test,

$$LR(\pi = \delta = 0) = -2 \cdot (\log L_0 - \log L_A),$$

where $\log L_0$ and $\log L_A$ denote the log-likelihood values from the restricted model (∗∗) and the unrestricted model (∗), respectively.

  • The LR statistic follows the square of the DFc distribution, $DF_c^2$, under the null.

Instead of tabulating critical values for $DF_c^2$, we can use the signed LR test,

$$\hat{\omega}_c = \mathrm{sign}(\hat{\pi}) \cdot \sqrt{LR(\pi = \delta = 0)}.$$

This statistic follows the same DFc distribution as $\hat{\tau}_c$.

SLIDE 21

Empirical Example: Danish Bond Rate

[Figure: Danish bond rate, $r_t$, and its first difference, $\Delta r_t$, 1970 to 2005.]

SLIDE 22
  • An AR(4) model for 1972:1 to 2005:2 gives (t-values in parentheses):

$$\Delta r_t = \underset{(-0.58)}{-0.0076}\, r_{t-1} + \underset{(4.50)}{0.3955}\, \Delta r_{t-1} - \underset{(-0.18)}{0.0169}\, \Delta r_{t-2} - \underset{(-0.84)}{0.0744}\, \Delta r_{t-3} + \underset{(0.29)}{0.0005}.$$

Removing insignificant terms produces a model

$$\Delta r_t = \underset{(-0.81)}{-0.0103}\, r_{t-1} + \underset{(4.72)}{0.3847}\, \Delta r_{t-1} + \underset{(0.52)}{0.0008},$$

with $\log L_A = 479.80$. The DF t-test is $\hat{\tau}_c = -0.81$. We do not reject a unit root.

  • To test the joint hypothesis, $H_0^*: \pi = \delta = 0$, we run the regression under the null,

$$\Delta r_t = \underset{(4.72)}{0.3791}\, \Delta r_{t-1},$$

with $\log L_0 = 479.28$. The LR test is given by

$$LR(\pi = \delta = 0) = -2 \cdot (\log L_0 - \log L_A) = -2 \cdot (479.28 - 479.80) = 1.03,$$

and the signed LR test is

$$\hat{\omega}_c = -\sqrt{LR(\pi = \delta = 0)} = -\sqrt{1.03} = -1.02.$$

This is again clearly larger than the 5% critical value of −2.89 in DFc.
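The arithmetic of the signed LR test can be sketched as below; the function name `signed_lr` is hypothetical and the inputs are the rounded values reported on the slide. With these rounded log-likelihoods the LR comes out at about 1.04 rather than the reported 1.03, presumably because the slide used unrounded values; the signed statistic still rounds to −1.02.

```python
import math

def signed_lr(loglik_null, loglik_alt, pi_hat):
    """Return (LR, signed square root) for the joint unit-root hypothesis."""
    lr = -2.0 * (loglik_null - loglik_alt)
    return lr, math.copysign(math.sqrt(lr), pi_hat)

# Bond rate: log L0 = 479.28, log LA = 479.80, pi_hat = -0.0103 (as reported on the slide)
lr, omega_c = signed_lr(479.28, 479.80, -0.0103)
cv_5pct = -2.89                     # 5% critical value quoted on the slide
reject = omega_c < cv_5pct          # one-sided test: reject only for large negative values
print(round(lr, 2), round(omega_c, 2), reject)
```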

SLIDE 23

Dickey-Fuller Test with a Trend Term

  • For trending variables, the relevant alternative is often trend-stationarity. We use

$$\Delta y_t = \delta + \gamma t + \pi y_{t-1} + c_1 \Delta y_{t-1} + c_2 \Delta y_{t-2} + \epsilon_t. \qquad (\#)$$

The hypothesis is still $H_0: \pi = 0$, and the DF t-test is

$$\hat{\tau}_l = \frac{\hat{\pi}}{\mathrm{se}(\hat{\pi})}.$$

The presence of a trend shifts the asymptotic distribution, DFl, further to the left.

  • To avoid the accumulation of the trend under π = 0, we may consider the joint hypothesis, $H_0^*: \pi = \gamma = 0$, i.e. to compare (#) with

$$\Delta y_t = \delta + c_1 \Delta y_{t-1} + c_2 \Delta y_{t-2} + \epsilon_t. \qquad (\#\#)$$

The LR test is $LR(\pi = \gamma = 0) = -2 \cdot (\log L_0 - \log L_A)$, which follows a $DF_l^2$ distribution.

Again we can use the signed square root,

$$\hat{\omega}_l = \mathrm{sign}(\hat{\pi}) \cdot \sqrt{LR(\pi = \gamma = 0)},$$

which follows the same DFl distribution as the t-test.

SLIDE 24

Empirical Examples: Trend-Stationarity

[Figure: (A) Log of Danish productivity. (B) Log of Danish private consumption. (C) Productivity, deviation from trend. (D) Consumption, deviation from trend. Sample 1970 to 2000s.]

SLIDE 25

Empirical Example: Danish Productivity

  • To test whether log-productivity is trend-stationary we use an AR(1) regression

$$\Delta \mathrm{LPROD}_t = \underset{(-6.22)}{-0.439}\, \mathrm{LPROD}_{t-1} + \underset{(6.58)}{0.091} + \underset{(6.15)}{0.0024}\, t + \epsilon_t,$$

with $\log L_A = 366.09$. The DF t-test is given by $\hat{\tau}_l = -6.22 \ll -3.96$ (1% cv).

Here we reject a unit root and conclude that productivity is trend-stationary.

  • To test the joint hypothesis, $H_0^*: \pi = \gamma = 0$, we run the regression under the null

$$\Delta \mathrm{LPROD}_t = \underset{(3.48)}{0.0057} + \epsilon_t,$$

with $\log L_0 = 348.63$. The LR test is

$$LR(\pi = \gamma = 0) = -2 \cdot (\log L_0 - \log L_A) = -2 \cdot (348.63 - 366.09) = 34.92,$$

and the signed LR test is

$$\hat{\omega}_l = -\sqrt{LR(\pi = \gamma = 0)} = -\sqrt{34.92} = -5.91.$$

This is smaller than the critical value in DFl, and we reject the unit root hypothesis.

SLIDE 26

Empirical Example: Danish Consumption

  • To test if log-consumption is trend-stationary we use the regression

$$\Delta \mathrm{LCONS}_t = \underset{(-2.56)}{-0.129}\, \mathrm{LCONS}_{t-1} - \underset{(-2.43)}{0.209}\, \Delta \mathrm{LCONS}_{t-1} + \underset{(2.57)}{0.764} + \underset{(2.58)}{0.0004}\, t + \epsilon_t,$$

with $\log L_A = 359.23$. The Dickey-Fuller t-test is given by $\hat{\tau}_l = -2.56$, which is not significant in the DFl distribution. We conclude that private consumption seems to have a unit root.

  • To test the joint hypothesis, $H_0^*: \pi = \gamma = 0$, we use the regression under the null,

$$\Delta \mathrm{LCONS}_t = \underset{(-3.29)}{-0.274}\, \Delta \mathrm{LCONS}_{t-1} + \underset{(2.97)}{0.0046} + \epsilon_t,$$

with $\log L_0 = 355.87$. The LR test for a unit root is given by

$$LR(\pi = \gamma = 0) = -2 \cdot (\log L_0 - \log L_A) = -2 \cdot (355.87 - 359.23) = 6.72,$$

and the signed LR test is

$$\hat{\omega}_l = -\sqrt{LR(\pi = \gamma = 0)} = -\sqrt{6.72} = -2.59.$$

Again we conclude in favour of a unit root in consumption.
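The two trend examples can be set side by side against the DFl quantiles from the quantile table. This is only a bookkeeping sketch using the rounded numbers reported on the slides; the dictionary names are hypothetical.

```python
import math

DF_L = {"1%": -3.96, "5%": -3.41, "10%": -3.13}   # DFl quantiles from the quantile table

examples = {
    # name: (log L0, log LA, DF t-ratio), all as reported on the slides
    "productivity": (348.63, 366.09, -6.22),
    "consumption": (355.87, 359.23, -2.56),
}
results = {}
for name, (ll0, lla, tau_l) in examples.items():
    lr = -2.0 * (ll0 - lla)
    omega_l = -math.sqrt(lr)            # pi_hat is negative in both regressions
    # store LR, signed LR, and the 5% decisions of the signed LR test and the t-test
    results[name] = (round(lr, 2), round(omega_l, 2),
                     omega_l < DF_L["5%"], tau_l < DF_L["5%"])
    print(name, results[name])
```

In both cases the t-test and the signed LR test point the same way: a rejection for productivity and no rejection for consumption.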

SLIDE 27

The Problem of Low Power

  • It is difficult to distinguish unit roots from large stationary roots.

Always be careful in conclusions.

  • Consider time series generated from the two models

$$\Delta y_t = -0.2 \cdot y_{t-1} + 0.05 \cdot t + \epsilon_t$$
$$\Delta x_t = 0.25 + \epsilon_t.$$

Hard to tell apart in practice. We need many observations to be sure.

[Figure: (A) Trend-stationary and unit root process, 100 observations. (B) Trend-stationary and unit root process, 500 observations. Both panels plot $\Delta Y_t = -0.2 \cdot Y_{t-1} + 0.05 \cdot t + \epsilon_t$ and $\Delta Y_t = 0.25 + \epsilon_t$.]
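One reason the two models are hard to tell apart is that their means grow at the same rate. Writing the first model as $y_t = 0.8 \cdot y_{t-1} + 0.05 \cdot t + \epsilon_t$, its mean converges to the line $-1 + 0.25 t$, which has the same slope as the drift of the random walk. The sketch below iterates the means, assuming starting values of zero (not stated on the slide).

```python
def trend_stationary_mean(T, y0=0.0):
    """E[y_t] for  dy_t = -0.2*y_{t-1} + 0.05*t + eps_t, i.e. y_t = 0.8*y_{t-1} + 0.05*t + eps_t."""
    m, path = y0, []
    for t in range(1, T + 1):
        m = 0.8 * m + 0.05 * t
        path.append(m)
    return path

def unit_root_mean(T, x0=0.0):
    """E[x_t] for  dx_t = 0.25 + eps_t."""
    return [x0 + 0.25 * t for t in range(1, T + 1)]

ts_mean, ur_mean = trend_stationary_mean(100), unit_root_mean(100)
slope_ts = ts_mean[-1] - ts_mean[-2]      # per-period growth of the trend-stationary mean
slope_ur = ur_mean[-1] - ur_mean[-2]      # drift of the random walk
gap = ur_mean[-1] - ts_mean[-1]           # constant level difference between the two means
print(round(slope_ts, 6), round(slope_ur, 6), round(gap, 6))
```

The difference between the processes lies in how shocks behave around these trends, which is what takes many observations to detect.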

SLIDE 28

Special Events

  • Unit root tests assess whether shocks have transitory or permanent effects.

The conclusions are sensitive to a few large shocks.

  • Consider a one-time change in the mean of the series, a so-called break.

This is one large shock with a permanent effect. Even if the series is stationary, such that normal shocks have transitory effects, the presence of a break will make it look like the shocks have permanent effects. That may bias the conclusion towards a unit root.

  • Consider a few large outliers, i.e. single strange observations.

The series may look more mean reverting than it actually is. That may bias the results towards stationarity.
