
Econometrics 2 — Fall 2004

Univariate Time Series Analysis; ARIMA Models

Heino Bohn Nielsen


Outline of the Lecture

(1) Introduction to univariate time series analysis.
(2) Stationarity.
(3) Characterizing time dependence: ACF and PACF.
(4) Modelling time dependence: ARMA(p,q).
(5) Lag operators, lag polynomials and invertibility.
(6) Examples:

  • AR(1).
  • AR(2).
  • MA(1).

(7) Model selection.
(8) Estimation.
(9) Forecasting.


Univariate Time Series Analysis

  • Consider a single time series: y1, y2, ..., yT.

Simple models for yt as a function of the past.

  • Univariate models are used for

— Analyzing the properties of a time series: the dynamic adjustment after a shock, transitory or permanent effects, the presence of unit roots.
— Forecasting. A model for E[yt | xt] is only useful for forecasting yt+1 if we know (or can forecast) xt+1.
— Introducing the tools necessary for analyzing more complicated models.


Stationarity

  • A time series, y1, y2, ..., yt, ..., yT, is (strictly) stationary if the joint distributions

(yt1, yt2, ..., ytn)

and

(yt1+h, yt2+h, ..., ytn+h)

are the same for all h.

  • A time series is called weakly stationary or covariance stationary if

E[yt] = µ
V[yt] = E[(yt − µ)²] = γ0
Cov[yt, yt−k] = E[(yt − µ)(yt−k − µ)] = γk

for k = 1, 2, ... Often µ and γ0 are assumed finite.

  • On these slides we consider only stationary processes.

Later we consider non-stationary processes.


The Autocorrelation Function (ACF)

  • For a stationary time series we define the autocorrelation function (ACF) as

ρk = Corr(yt, yt−k) = γk/γ0 = Cov(yt, yt−k)/V(yt) = Cov(yt, yt−k)/√(V(yt)·V(yt−k)).

Note that −1 ≤ ρk ≤ 1, ρ0 = 1, and ρk = ρ−k.

  • Recall that the ACF can (e.g.) be estimated by OLS in the regression model

yt = c + ρkyt−k + residual.

  • Under the assumption of white noise, ρ1 = ρ2 = ... = 0, it holds that

V(ρ̂k) = T⁻¹,

and 95% confidence bands are given by ±2/√T.


The Partial Autocorrelation Function (PACF)

  • An alternative measure is the partial autocorrelation function (PACF), which is

the correlation conditional on the intermediate values, i.e. Corr(yt, yt−k | yt−1, ..., yt−k+1).

  • The PACF can be estimated as the OLS estimator θ̂k in the regression

yt = c + θ1yt−1 + ... + θkyt−k + residual,

where the intermediate lags are included.

  • Under the assumption of white noise, θ1 = θ2 = ... = 0, it holds that

V(θ̂k) = T⁻¹.

95% confidence bands are given by ±2/√T.
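As a concrete illustration of these two regression-based estimators, here is a minimal sketch (an addition to the slides; numpy is assumed, and the simulated AR(1) test series and its parameter are purely illustrative):

```python
import numpy as np

def acf_ols(y, k):
    """Estimate rho_k by OLS in: y_t = c + rho_k * y_{t-k} + residual."""
    X = np.column_stack([np.ones(len(y) - k), y[:-k]])
    beta, *_ = np.linalg.lstsq(X, y[k:], rcond=None)
    return beta[1]

def pacf_ols(y, k):
    """Estimate theta_k by OLS in: y_t = c + theta_1 y_{t-1} + ... + theta_k y_{t-k} + residual."""
    X = np.column_stack([np.ones(len(y) - k)] + [y[k - j:len(y) - j] for j in range(1, k + 1)])
    beta, *_ = np.linalg.lstsq(X, y[k:], rcond=None)
    return beta[k]                      # coefficient on the longest lag

rng = np.random.default_rng(0)
T = 500
y = np.zeros(T)
for t in range(1, T):                   # simulated AR(1) with theta = 0.8, just for the demo
    y[t] = 0.8 * y[t - 1] + rng.standard_normal()

print([round(acf_ols(y, k), 2) for k in range(1, 5)])    # decays roughly like 0.8**k
print([round(pacf_ols(y, k), 2) for k in range(1, 5)])   # ~0.8 at lag 1, then ~0
print("95% white-noise band: +/-", round(2 / np.sqrt(T), 3))
```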


[Figure: (A) Danish income, log of constant prices, 1970–2000; (B) deviation from trend; (C) estimated ACF; (D) estimated PACF.]


The ARMA(p,q) Model

  • We consider two simple models for yt:

The autoregressive AR(p) model and the moving average MA(q) model.

  • First define a white noise process, εt ∼ i.i.d.(0, σ²).
  • The AR(p) model is defined as

yt = θ1yt−1 + θ2yt−2 + ... + θpyt−p + εt.

The systematic part of yt is a linear function of p lagged values.

  • The MA(q) model is defined as

yt = εt + α1εt−1 + α2εt−2 + ... + αqεt−q.

yt is a moving average of past shocks to the process.

  • They can be combined into the ARMA(p,q) model

yt = θ1yt−1 + ... + θpyt−p + εt + α1εt−1 + ... + αqεt−q.
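A small simulation sketch of this recursion (not from the slides; numpy is assumed and the parameter values are illustrative):

```python
import numpy as np

def simulate_arma(theta, alpha, T, sigma=1.0, seed=0):
    """Simulate y_t = sum_i theta_i y_{t-i} + e_t + sum_j alpha_j e_{t-j}, e_t ~ N(0, sigma^2)."""
    rng = np.random.default_rng(seed)
    p, q, m = len(theta), len(alpha), max(len(theta), len(alpha))
    e = rng.normal(0.0, sigma, size=T + m)
    y = np.zeros(T + m)
    for t in range(m, T + m):
        ar = sum(theta[i] * y[t - 1 - i] for i in range(p))
        ma = sum(alpha[j] * e[t - 1 - j] for j in range(q))
        y[t] = ar + e[t] + ma
    return y[m:]                        # drop the start-up values

y = simulate_arma(theta=[0.5, 0.3], alpha=[0.4], T=200)   # an ARMA(2,1) path
print(y[:5])
```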


The Lag and Difference Operators

  • Define the lag-operator, L, to have the property that

Lyt = yt−1.

Note that

L²yt = L(Lyt) = Lyt−1 = yt−2.

  • Also define the first difference operator, ∆ = 1 − L, such that

∆yt = (1 − L) yt = yt − Lyt = yt − yt−1.

Note that

∆²yt = ∆(∆yt) = ∆(yt − yt−1) = ∆yt − ∆yt−1 = (yt − yt−1) − (yt−1 − yt−2) = yt − 2yt−1 + yt−2.
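In code the difference operator is just np.diff (an illustrative aside, numpy assumed):

```python
import numpy as np

y = np.array([1.0, 3.0, 6.0, 10.0, 15.0])
print(np.diff(y))                        # Delta y_t = y_t - y_{t-1}   -> [2. 3. 4. 5.]
print(np.diff(y, n=2))                   # Delta^2 y_t                 -> [1. 1. 1.]
print(y[2:] - 2 * y[1:-1] + y[:-2])      # y_t - 2 y_{t-1} + y_{t-2}, the same numbers
```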


Lag Polynomials

  • Consider as an example the AR(2) model

yt = θ1yt−1 + θ2yt−2 + εt.

That can be written as

yt − θ1yt−1 − θ2yt−2 = εt
(1 − θ1L − θ2L²)yt = εt
θ(L)yt = εt,

where

θ(L) = 1 − θ1L − θ2L²

is a polynomial in L denoted a lag-polynomial.

  • Standard rules for calculating with polynomials also hold for polynomials in L.


Characteristic Equations and Roots

  • For a model

yt − θ1yt−1 − θ2yt−2 = θ(L)yt = εt,

we define the characteristic equation as

θ(z) = 1 − θ1z − θ2z² = 0.

The solutions, z1 and z2, are denoted characteristic roots.

  • An AR(p) has p roots.

Some of them may be complex values, h ± v · i, where i = √−1.

  • The roots can be used for factorizing the polynomial

θ(z) = 1 − θ1z − θ2z² = (1 − φ1z)(1 − φ2z),

where φ1 = z1⁻¹ and φ2 = z2⁻¹.


Invertibility of Polynomials

  • Define the inverse, θ⁻¹(L), of a polynomial θ(L) so that

θ⁻¹(L)θ(L) = 1.

  • Consider the AR(1) case, θ(L) = 1 − θL, and look at the product

(1 − θL)(1 + θL + θ²L² + θ³L³ + ... + θᵏLᵏ)
= (1 − θL) + (θL − θ²L²) + (θ²L² − θ³L³) + (θ³L³ − θ⁴L⁴) + ...
= 1 − θᵏ⁺¹Lᵏ⁺¹.

If |θ| < 1, it holds that θᵏ⁺¹Lᵏ⁺¹ → 0 as k → ∞, implying that

θ⁻¹(L) = (1 − θL)⁻¹ = 1/(1 − θL) = 1 + θL + θ²L² + θ³L³ + ... = Σ_{i=0}^∞ θⁱLⁱ.

  • If θ(L) is a finite polynomial, the inverse polynomial, θ⁻¹(L), is infinite.
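A numerical check of the telescoping product above (an aside, numpy assumed; the values of θ and k are arbitrary):

```python
import numpy as np
from numpy.polynomial import polynomial as P

theta, k = 0.6, 20
trunc = theta ** np.arange(k + 1)        # 1, theta, theta^2, ..., theta^k
prod = P.polymul([1.0, -theta], trunc)   # (1 - theta L) times the truncated inverse
print(prod[0], prod[-1])                 # 1.0 and -theta**(k+1), which is tiny
print(np.abs(prod[1:-1]).max())          # all middle coefficients cancel to 0
```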


ARMA Models in AR and MA form

  • Using lag polynomials we can rewrite the stationary ARMA(p,q) model as

yt − θ1yt−1 − ... − θpyt−p = εt + α1εt−1 + ... + αqεt−q    (∗)

θ(L)yt = α(L)εt,

where θ(L) and α(L) are finite polynomials.

  • If θ(L) is invertible, (∗) can be written as the infinite MA(∞) model

yt = θ⁻¹(L)α(L)εt
yt = εt + γ1εt−1 + γ2εt−2 + ...

This is called the MA representation.

  • If α(L) is invertible, (∗) can be written as an infinite AR(∞) model

α⁻¹(L)θ(L)yt = εt
yt − γ1yt−1 − γ2yt−2 − ... = εt.

This is called the AR representation.


Invertibility and Stationarity

  • A finite order MA process is stationary by construction.

— It is a linear combination of stationary white noise terms.
— Invertibility is sometimes convenient for estimation and prediction.

  • An infinite MA process is stationary if the coefficients, αi, converge to zero.

— We require that Σ_{i=1}^∞ αi² < ∞.

  • An AR process is stationary if θ(L) is invertible.

— This is important for interpretation and inference.
— In the case of a root at unity, standard results no longer hold. We return to unit roots later.

  • Consider again the AR(2) model

θ(L) = 1 − θ1L − θ2L² = (1 − φ1L)(1 − φ2L).

The polynomial is invertible if the factors (1 − φiL) are invertible, i.e. if

|φ1| < 1 and |φ2| < 1.

  • In general a polynomial, θ(L), is invertible if the characteristic roots, z1, ..., zp, are larger than one in absolute value.

In complex cases, this corresponds to the roots being outside the complex unit circle (modulus larger than one).
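A sketch of this root check in Python (an addition to the slides; numpy assumed, coefficients illustrative):

```python
import numpy as np
from numpy.polynomial import polynomial as P

theta1, theta2 = 0.5, 0.3                      # illustrative AR(2) coefficients
roots = P.polyroots([1.0, -theta1, -theta2])   # solves 1 - theta1*z - theta2*z^2 = 0
print(roots, np.abs(roots))                    # np.abs gives the modulus
print("invertible AR polynomial:", bool(np.all(np.abs(roots) > 1.0)))
```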


ARMA Models and Common Roots

  • Consider the stationary ARMA(p,q) model

yt − θ1yt−1 − ... − θpyt−p = εt + α1εt−1 + ... + αqεt−q
θ(L)yt = α(L)εt
(1 − φ1L)(1 − φ2L) ··· (1 − φpL)yt = (1 − ξ1L)(1 − ξ2L) ··· (1 − ξqL)εt.

  • If φi = ξj for some i, j, they are denoted common roots or canceling roots.

The ARMA(p,q) model is equivalent to an ARMA(p−1,q−1) model.

  • As an example, consider

yt − yt−1 + 0.25yt−2 = εt − 0.5εt−1
(1 − L + 0.25L²)yt = (1 − 0.5L)εt
(1 − 0.5L)(1 − 0.5L)yt = (1 − 0.5L)εt
(1 − 0.5L)yt = εt.
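The cancellation can be verified by polynomial division (an aside, numpy assumed):

```python
import numpy as np
from numpy.polynomial import polynomial as P

# divide the AR polynomial 1 - L + 0.25 L^2 by the MA polynomial 1 - 0.5 L
quot, rem = P.polydiv([1.0, -1.0, 0.25], [1.0, -0.5])
print(quot)    # [ 1.  -0.5] -> the remaining AR(1) factor (1 - 0.5 L)
print(rem)     # ~0, so the common root cancels exactly
```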


Unit Roots and ARIMA Models

  • A root at one is denoted a unit root, and has important consequences for the analysis. We consider tests for unit roots and unit root econometrics later.
  • Consider an ARMA(p,q) model

θ(L)yt = α(L)εt.

If there is a unit root in the AR polynomial, we can factorize into

θ(L) = (1 − L)(1 − φ2L) ··· (1 − φpL) = (1 − L)θ∗(L),

and we can write the model as

θ∗(L)(1 − L)yt = α(L)εt
θ∗(L)∆yt = α(L)εt.

  • An ARMA(p,q) model for ∆ᵈyt is denoted an ARIMA(p,d,q) model for yt.


Example: Danish Real House Prices

  • Consider the Danish real house prices in logs, pt. An AR(2) model yields

pt = 1.551·pt−1 − 0.5734·pt−2 + 0.003426    (t-values: 20.7, −7.56, 1.30)

The lag polynomial is given by

θ(L) = 1 − 1.551·L + 0.5734·L²,

with inverse roots given by 0.9422 and 0.6086.

  • One root is close to unity and we estimate an ARIMA(2,1,0) model for pt:

∆pt = 1.323·∆pt−1 − 0.4853·∆pt−2 + 0.0009959    (t-values: 16.6, −6.12, 0.333)

The lag polynomial is given by

θ(L) = 1 − 1.323·L + 0.4853·L²,

with complex (inverse) roots given by 0.66140 ± 0.21879·i, where i = √−1.
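The quoted inverse roots can be reproduced, up to rounding of the reported coefficients, from the fitted lag polynomials (an aside, numpy assumed):

```python
import numpy as np
from numpy.polynomial import polynomial as P

for b1, b2 in [(1.551, -0.5734),     # AR(2) for p_t
               (1.323, -0.4853)]:    # AR(2) for the first difference of p_t
    inv_roots = 1.0 / P.polyroots([1.0, -b1, -b2])   # reciprocals of the characteristic roots
    print(np.round(inv_roots, 4))
# first model : real inverse roots near 0.9422 and 0.6086
# second model: complex inverse roots near 0.6614 +/- 0.2188i
```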


AR(1) Model

  • Consider the AR(1) model

Yt = δ + θYt−1 + εt
(1 − θL)Yt = δ + εt,

where Y1 is the initial value.

  • Provided |θ| < 1, the solution for Yt can be found as

Yt = (1 − θL)⁻¹(δ + εt)
   = (1 + θL + θ²L² + θ³L³ + ...)(δ + εt)
   = (1 + θ + θ² + θ³ + ...)δ + εt + θεt−1 + θ²εt−2 + θ³εt−3 + ...
   = δ/(1 − θ) + εt + θεt−1 + θ²εt−2 + θ³εt−3 + ...

This is the MA-representation.

  • In finite samples there is an effect of the initial value, θᵗY1.


  • We calculate the expectation

E[Yt] = E[δ/(1 − θ) + εt + θεt−1 + θ²εt−2 + ...] = δ/(1 − θ) = µ.

The effect of the constant depends on the autoregressive parameter.

  • Now define the deviation from mean, yt = Yt − µ, so that

Yt = δ + θYt−1 + εt
Yt = (1 − θ)µ + θYt−1 + εt
Yt − µ = θ(Yt−1 − µ) + εt
yt = θyt−1 + εt.

  • Note that γ0 = V[Yt] = V[yt], and under stationarity

V[yt] = V[θyt−1 + εt]
V[yt] = θ²V[yt−1] + V[εt]
(1 − θ²)V[yt] = σ²
γ0 = σ²/(1 − θ²).

  • The covariances are given by

γ1 = Cov[yt, yt−1] = E[ytyt−1] = E[(θyt−1 + εt)yt−1]
   = θE[y²t−1] + E[yt−1εt]
   = θ·σ²/(1 − θ²) = θγ0

γ2 = Cov[yt, yt−2] = E[ytyt−2] = E[(θyt−1 + εt)yt−2]
   = E[(θ(θyt−2 + εt−1) + εt)yt−2]
   = θ²E[y²t−2] + θE[yt−2εt−1] + E[yt−2εt]
   = θ²·σ²/(1 − θ²) = θ²γ0

γk = Cov[yt, yt−k] = θᵏ·σ²/(1 − θ²) = θᵏγ0

  • The ACF is given by

ρk = γk/γ0 = θᵏγ0/γ0 = θᵏ.

  • The PACF is simply the autoregressive coefficients: θ, 0, 0, ...
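A simulation sketch of the result ρk = θᵏ (an aside; numpy assumed, sample size and θ illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
theta, T = 0.8, 100_000
y = np.zeros(T)
for t in range(1, T):                     # a long AR(1) path
    y[t] = theta * y[t - 1] + rng.standard_normal()

yc = y - y.mean()
g0 = yc @ yc / T                          # sample gamma_0
for k in range(1, 5):
    rho_k = (yc[k:] @ yc[:-k] / T) / g0   # sample autocorrelation at lag k
    print(k, round(rho_k, 3), round(theta ** k, 3))   # sample vs theoretical theta**k
```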


Examples of AR(1) Models

[Figure: Simulated AR(1) series with estimated ACF and PACF for yt = εt, yt = 0.8·yt−1 + εt, and yt = −0.8·yt−1 + εt.]



Examples of AR(1) Models

[Figure: Simulated AR(1) series for yt = 0.50·yt−1 + εt, yt = 0.95·yt−1 + εt, yt = yt−1 + εt, and yt = 1.05·yt−1 + εt.]

AR(2) Model

  • Consider the AR(2) model given by

Yt = δ + θ1Yt−1 + θ2Yt−2 + εt.

  • Again we can find the mean

E[Yt] = δ + θ1E[Yt−1] + θ2E[Yt−2] + E[εt]
E[Yt] = δ/(1 − θ1 − θ2) = µ,

and define the process yt = Yt − µ for which it holds that

yt = θ1yt−1 + θ2yt−2 + εt.

  • Multiplying both sides with yt and taking expectations yields

E[y²t] = θ1E[yt−1yt] + θ2E[yt−2yt] + E[εtyt]
γ0 = θ1γ1 + θ2γ2 + σ²

Multiplying instead with yt−1 yields

E[ytyt−1] = θ1E[yt−1yt−1] + θ2E[yt−2yt−1] + E[εtyt−1]
γ1 = θ1γ0 + θ2γ1

Multiplying instead with yt−2 yields

E[ytyt−2] = θ1E[yt−1yt−2] + θ2E[yt−2yt−2] + E[εtyt−2]
γ2 = θ1γ1 + θ2γ0

Multiplying instead with yt−3 yields

E[ytyt−3] = θ1E[yt−1yt−3] + θ2E[yt−2yt−3] + E[εtyt−3]
γ3 = θ1γ2 + θ2γ1

  • These are the so-called Yule-Walker equations.


  • Formulating in terms of the ACF, we get

ρ1 = θ1 + θ2ρ1
ρ2 = θ1ρ1 + θ2
ρk = θ1ρk−1 + θ2ρk−2, k ≥ 3,

  • or alternatively that

ρ1 = θ1/(1 − θ2)
ρ2 = θ1²/(1 − θ2) + θ2
ρk = θ1ρk−1 + θ2ρk−2, k ≥ 3.
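A sketch that evaluates these formulas for an illustrative stationary AR(2) (an aside, numpy assumed):

```python
import numpy as np

theta1, theta2 = 0.5, 0.3                          # an illustrative stationary AR(2)
rho = [1.0, theta1 / (1 - theta2)]                 # rho_0 and rho_1
rho.append(theta1 * rho[1] + theta2)               # rho_2
for k in range(3, 11):                             # extend by the recursion
    rho.append(theta1 * rho[k - 1] + theta2 * rho[k - 2])
print(np.round(rho, 3))
```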


Examples of AR(2) Models

[Figure: Simulated AR(2) series with estimated ACF and PACF for yt = 0.5·yt−1 + 0.4·yt−2 + εt, yt = −0.8·yt−2 + εt, and yt = 1.3·yt−1 − 0.8·yt−2 + εt.]


MA(1) Model

  • Consider the MA(1) model

Yt = µ + εt + αεt−1, t = 2, 3, ..., T.

If the polynomial α(L) = 1 + αL is invertible, the MA(1) model can be written as an infinite AR(∞) model.

  • The mean and variance are given by

E[Yt] = E[µ + εt + αεt−1] = µ
V[Yt] = E[(Yt − µ)²] = E[(εt + αεt−1)²] = E[ε²t] + α²E[ε²t−1] + 2αE[εtεt−1] = (1 + α²)σ².

  • The covariances are given by

γ1 = Cov[Yt, Yt−1] = E[(Yt − µ)(Yt−1 − µ)]
   = E[(εt + αεt−1)(εt−1 + αεt−2)]
   = E[εtεt−1 + αεtεt−2 + αε²t−1 + α²εt−1εt−2]
   = ασ²

γ2 = Cov[Yt, Yt−2] = E[(Yt − µ)(Yt−2 − µ)]
   = E[(εt + αεt−1)(εt−2 + αεt−3)]
   = E[εtεt−2 + αεtεt−3 + αεt−1εt−2 + α²εt−1εt−3]
   = 0

γk = 0 for k = 3, 4, ...

  • The ACF is given by

ρ1 = γ1/γ0 = ασ²/((1 + α²)σ²) = α/(1 + α²), and ρk = 0 for k ≥ 2.
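A simulation check of ρ1 = α/(1 + α²) (an aside; numpy assumed, α and the sample size are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
alpha, T = 0.9, 100_000
e = rng.standard_normal(T + 1)
y = e[1:] + alpha * e[:-1]               # Y_t = e_t + alpha * e_{t-1}, with mu = 0

yc = y - y.mean()
g0 = yc @ yc / T
print("theory rho_1 =", round(alpha / (1 + alpha ** 2), 3))
for k in range(1, 4):
    print("sample rho_%d =" % k, round((yc[k:] @ yc[:-k] / T) / g0, 3))
```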


Examples of MA Models

[Figure: Simulated MA series with estimated ACF and PACF for yt = εt − 0.9·εt−1, yt = εt + 0.9·εt−1, and yt = εt − 0.6·εt−1 + 0.5·εt−2 + 0.5·εt−3.]



ARIMA(p,d,q) Model Selection

  • Find a transformation of the process that is stationary, e.g. ∆ᵈYt.
  • Recall that for the stationary AR(p) model

— The ACF is infinite but convergent.
— The PACF is zero for lags larger than p.

  • For the MA(q) model

— The ACF is zero for lags larger than q.
— The PACF is infinite but convergent.

  • The ACF and PACF contain information about p and q.

They can be used to select relevant models.


  • If alternative models are nested, they can be tested.
  • Model selection can be based on information criteria

IC = log σ̂² + penalty(T, #parameters),

where log σ̂² measures the likelihood and the penalty term punishes the number of parameters.

The information criteria should be minimized!

  • Three important criteria

AIC = log σ̂² + 2·k/T
HQ  = log σ̂² + 2·k·log(log(T))/T
BIC = log σ̂² + k·log(T)/T,

where k is the number of estimated parameters, e.g. k = p + q.
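A sketch of the criteria in code (an aside, numpy assumed). The slides state them in terms of log σ̂²; the table below reports the equivalent likelihood-based form IC = −2·log L/T + penalty, which differs only by a constant under normality and so ranks models the same way. The ARMA(2,0) row is used as the worked example:

```python
import numpy as np

def criteria(loglik, T, k):
    """AIC, HQ and BIC/SC in the likelihood-based form -2*logL/T + penalty."""
    base = -2 * loglik / T
    return (base + 2 * k / T,                      # AIC
            base + 2 * k * np.log(np.log(T)) / T,  # HQ
            base + k * np.log(T) / T)              # BIC (= SC)

aic, hq, sc = criteria(loglik=300.38908, T=130, k=3)   # the ARMA(2,0) row below
print(round(sc, 4), round(hq, 4), round(aic, 4))       # -4.509 -4.5483 -4.5752
```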


Example: Consumption-Income Ratio

[Figure: (A) Consumption (c) and income (y), logs, 1970–2000; (B) consumption-income ratio, logs; (C) ACF for the series in (B); (D) PACF for the series in (B).]


Model      T    p  log-lik    SC       HQ       AIC
ARMA(2,2)  130  5  300.82151  −4.4408  −4.5063  −4.5511
ARMA(2,1)  130  4  300.39537  −4.4717  −4.5241  −4.5599
ARMA(2,0)  130  3  300.38908  −4.5090  −4.5483  −4.5752
ARMA(1,2)  130  4  300.42756  −4.4722  −4.5246  −4.5604
ARMA(1,1)  130  3  299.99333  −4.5030  −4.5422  −4.5691
ARMA(1,0)  130  2  296.17449  −4.4816  −4.5078  −4.5258
ARMA(0,0)  130  1  249.82604  −3.8060  −3.8191  −3.8281


---- Maximum likelihood estimation of ARFIMA(1,0,1) model ----
The estimation sample is: 1971 (1) - 2003 (2)
The dependent variable is: cy (ConsumptionData.in7)

           Coefficient  Std.Error  t-value  t-prob
AR-1          0.857361    0.05650     15.2   0.000
MA-1         −0.300821    0.09825    −3.06   0.003
Constant     −0.0934110   0.009898   −9.44   0.000

log-likelihood 299.993327   sigma 0.0239986   sigma^2 0.000575934

---- Maximum likelihood estimation of ARFIMA(2,0,0) model ----
The estimation sample is: 1971 (1) - 2003 (2)
The dependent variable is: cy (ConsumptionData.in7)

           Coefficient  Std.Error  t-value  t-prob
AR-1          0.536183    0.08428     6.36   0.000
AR-2          0.250548    0.08479     2.95   0.004
Constant     −0.0935407   0.009481   −9.87   0.000

log-likelihood 300.389084   sigma 0.0239238   sigma^2 0.000572349


Estimation of ARMA Models

  • The natural estimator is maximum likelihood. With normal errors

log L(θ, α, σ²) = −(T/2)·log(2πσ²) − Σ_{t=1}^T ε²t/(2σ²),

where εt is the residual.

  • For an AR(1) model we can write the residual as

εt = Yt − δ − θ1·Yt−1,

and OLS coincides with ML.

  • It is usual to condition on the initial values. Alternatively we can postulate a distribution for the first observation, e.g.

Y1 ∼ N(δ/(1 − θ), σ²/(1 − θ²)),

where the mean and variance are chosen as implied by the model for the rest of the observations. We say that Y1 is chosen from the invariant distribution.

  • For the MA(1) model

Yt = µ + εt + αεt−1,

the residuals can be found recursively as a function of the parameters:

ε1 = Y1 − µ
ε2 = Y2 − µ − αε1
ε3 = Y3 − µ − αε2
...

Here, the initial value is ε0 = 0, but that could be relaxed if required by using the invariant distribution.

  • The likelihood function can be maximized with respect to α and µ.
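A minimal sketch of this conditional ML procedure for the MA(1) (an aside; numpy and scipy assumed, data simulated with illustrative parameters):

```python
import numpy as np
from scipy.optimize import minimize

def neg_loglik(params, Y):
    """Minus the Gaussian log-likelihood with residuals built recursively, e_0 = 0."""
    mu, alpha, log_s2 = params
    s2 = np.exp(log_s2)                    # parameterize via log(sigma^2) to keep it positive
    e = np.zeros(len(Y))
    for t in range(len(Y)):
        e[t] = Y[t] - mu - alpha * (e[t - 1] if t > 0 else 0.0)
    return 0.5 * len(Y) * np.log(2 * np.pi * s2) + np.sum(e ** 2) / (2 * s2)

rng = np.random.default_rng(3)
eps = rng.standard_normal(501)
Y = 1.0 + eps[1:] + 0.5 * eps[:-1]         # simulated MA(1): mu = 1, alpha = 0.5

res = minimize(neg_loglik, x0=np.zeros(3), args=(Y,))
print(np.round([res.x[0], res.x[1], np.exp(res.x[2])], 2))   # estimates of mu, alpha, sigma^2
```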


Forecasting

  • Easy to forecast with ARMA models.

The main drawback is that there is no economic insight.

  • We want to predict yT+k given all information up to time T, i.e.

given the information set

IT = {y−∞, ..., yT−1, yT}.

The optimal predictor is the conditional expectation

yT+k|T = E[yT+k | IT].

  • Consider the ARMA(1,1) model

yt = θ·yt−1 + εt + αεt−1, t = 1, 2, ..., T.

  • To forecast we

— Substitute the estimated parameters for the true ones.
— Use estimated residuals up to time T. Hereafter, the best forecast of the shocks is zero.

  • The optimal forecasts will be

yT+1|T = E[θ·yT + εT+1 + α·εT | IT] = θ̂·yT + α̂·ε̂T
yT+2|T = E[θ·yT+1 + εT+2 + α·εT+1 | IT] = θ̂·yT+1|T.
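A sketch of this forecast recursion (an aside; numpy assumed, all input values illustrative):

```python
import numpy as np

def arma11_forecast(y_T, eps_T, theta_hat, alpha_hat, horizon):
    """y_{T+1|T} uses the last residual; beyond that the forecast decays at rate theta_hat."""
    f = [theta_hat * y_T + alpha_hat * eps_T]       # one step ahead
    for _ in range(horizon - 1):
        f.append(theta_hat * f[-1])                 # k >= 2 steps ahead
    return np.array(f)

print(np.round(arma11_forecast(y_T=1.2, eps_T=0.3, theta_hat=0.85,
                               alpha_hat=-0.30, horizon=5), 3))
```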


[Figure: Forecasts and actual values, 1990–2008.]
