
Time Series Analysis

Henrik Madsen

hm@imm.dtu.dk

Informatics and Mathematical Modelling, Technical University of Denmark, DK-2800 Kgs. Lyngby

Reference: H. Madsen, Time Series Analysis, Chapman & Hall


Outline of the lecture

Identification of univariate time series models:

  • Introduction, Sec. 6.1
  • Estimation of auto-covariance and -correlation, Sec. 6.2.1 (and the intro to 6.2)
  • Using the SACF, SPACF, and SIACF for suggesting model structure, Sec. 6.3
  • Estimation of model parameters, Sec. 6.4
  • Examples...

Cursory material: the extended linear model class in Sec. 6.4.2 (we'll come back to the extended model class later)


Model building in general

  • 1. Identification (specifying the model order)
  • 2. Estimation (of the model parameters)
  • 3. Model checking: is the model OK?

The inputs are data, physical insight, and theory. If the model is not OK, return to the identification step; if it is OK, proceed to applications using the model (prediction, simulation, etc.).


Identification of univariate time series models

What ARIMA structure would be appropriate for the data at hand? (If any)

[Figure: example time series plot]

Given the structure, we will then consider how to estimate the parameters (next lecture). What do we know about ARIMA models that could help us?


Estimation of the autocovariance function

Estimate of $\gamma(k)$:

$$C_{YY}(k) = C(k) = \hat{\gamma}(k) = \frac{1}{N} \sum_{t=1}^{N-|k|} (Y_t - \bar{Y})(Y_{t+|k|} - \bar{Y})$$

Since $C(-k) = C(k)$, it is enough to consider $k \ge 0$.

S-PLUS: acf(x, type = "covariance")
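To make the estimator concrete, here is a minimal R sketch (our illustration, not from the book; the helper name `autocov` is ours) computing $C(k)$ directly and checking it against the built-in estimate:

```r
# Sample autocovariance C(k) with the 1/N normalisation used above
autocov <- function(y, k) {
  N <- length(y)
  ybar <- mean(y)
  sum((y[1:(N - k)] - ybar) * (y[(1 + k):N] - ybar)) / N
}

y <- as.numeric(arima.sim(model = list(ar = 0.7), n = 200))
autocov(y, 1)
acf(y, type = "covariance", plot = FALSE)$acf[2]   # same value at lag 1
```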


Some properties of C(k)

The estimate is a non-negative definite function (as $\gamma(k)$ is).

The estimator is biased (non-central):

$$E[C(k)] = \frac{1}{N} \sum_{t=1}^{N-|k|} \gamma(k) = \left(1 - \frac{|k|}{N}\right)\gamma(k)$$

It is asymptotically unbiased (and consistent) for fixed $k$: $E[C(k)] \to \gamma(k)$ for $N \to \infty$.

The estimates are themselves autocorrelated, so don't trust apparent correlation at high lags too much.
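The bias factor $(1 - |k|/N)$ can be seen in a small simulation; a hedged sketch assuming a known AR(1) process (note the formula above ignores the additional small bias from estimating the mean):

```r
set.seed(1)
N <- 50; k <- 5; phi <- 0.7
gamma_k <- phi^k / (1 - phi^2)   # theoretical AR(1) autocovariance, unit noise variance
Ck <- replicate(2000, {
  y <- as.numeric(arima.sim(model = list(ar = phi), n = N))
  ybar <- mean(y)
  sum((y[1:(N - k)] - ybar) * (y[(1 + k):N] - ybar)) / N
})
mean(Ck)                # clearly below gamma_k ...
(1 - k / N) * gamma_k   # ... roughly as the bias formula above predicts
```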


How does C(k) behave for non-stationary series?

$$C(k) = \frac{1}{N} \sum_{t=1}^{N-|k|} (Y_t - \bar{Y})(Y_{t+|k|} - \bar{Y})$$


[Figure: C(k) for the simulated series arima.sim(model = list(ar = 0.9, ndiff = 1), n = 500); the estimated autocovariance remains large at all lags shown]


Autocorrelation and Partial Autocorrelation

Sample autocorrelation function (SACF):

$$\hat{\rho}(k) = r_k = C(k)/C(0)$$

For white noise and $k \neq 0$ it holds that $E[\hat{\rho}(k)] \simeq 0$ and $V[\hat{\rho}(k)] \simeq 1/N$. This gives the bounds $\pm 2/\sqrt{N}$ for deciding when a value cannot be distinguished from zero. S-PLUS: acf(x)

Sample partial autocorrelation function (SPACF): use the Yule-Walker equations on $\hat{\rho}(k)$ (exactly as for the theoretical relations). It turns out that $\pm 2/\sqrt{N}$ is also appropriate for deciding when the SPACF is zero (more in the next lecture). S-PLUS: acf(x, type="partial")
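In R the same quantities and bounds can be inspected directly; a minimal sketch with simulated data:

```r
y <- as.numeric(arima.sim(model = list(ar = 0.7), n = 100))
N <- length(y)

acf(y)    # SACF; R draws bounds at +/- qnorm(0.975)/sqrt(N), close to 2/sqrt(N)
pacf(y)   # SPACF, same as acf(y, type = "partial")

2 / sqrt(N)   # the rule-of-thumb bound from the slide
```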

What would be an appropriate structure?

[Figures: seven simulated example series, each shown with its time series plot, sample ACF, and sample PACF; the exercise is to suggest a model structure for each]


Example of data from an MA(2)-process

[Figure: simulated MA(2) series with its sample ACF and PACF]


Example of data from a non-stationary process

[Figure: simulated non-stationary series with its sample ACF and PACF]


Same series; analysing $\nabla Y_t = (1 - B)Y_t = Y_t - Y_{t-1}$

[Figure: the differenced series with its sample ACF and PACF]


Identification of the order of differencing

  • Select the order of differencing d as the first order for which the autocorrelation decreases sufficiently fast towards 0 (see the sketch after this list)
  • In practice d is 0, 1, or maybe 2
  • Sometimes a periodic difference is required, e.g. $Y_t - Y_{t-12}$
  • Remember to consider the practical application... it may be that the system is stationary, but you measured over too short a period
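A quick R sketch of this workflow (our example data, not from the slides):

```r
y <- as.numeric(arima.sim(model = list(order = c(1, 1, 0), ar = 0.9), n = 500))

acf(y)          # decays very slowly: difference the series
acf(diff(y))    # d = 1 decays quickly, so d = 1 suffices here
# For seasonal behaviour a periodic difference may be needed, e.g.:
# acf(diff(y, lag = 12))
```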


Stationarity vs. length of measuring period

[Figure: US/CA 30 day interest rate differential, plotted for 1976–1996 (top) and for the subperiod 1990–1996 (bottom)]


Identification of the ARMA-part

Characteristics of the autocorrelation functions:

| Process | ACF $\rho(k)$ | PACF $\phi_{kk}$ |
|---|---|---|
| AR(p) | Damped exponential and/or sine functions | $\phi_{kk} = 0$ for $k > p$ |
| MA(q) | $\rho(k) = 0$ for $k > q$ | Dominated by damped exponential and/or sine functions |
| ARMA(p, q) | Damped exponential and/or sine functions after lag $q - p$ | Dominated by damped exponential and/or sine functions after lag $p - q$ |

The IACF is similar to the PACF; see the book page 133.
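The cutoff properties in the table are easy to verify by simulation; a small R sketch with arbitrarily chosen coefficients:

```r
set.seed(42)
ar2 <- arima.sim(model = list(ar = c(0.5, 0.3)), n = 500)
ma2 <- arima.sim(model = list(ma = c(0.8, 0.4)), n = 500)

pacf(ar2)   # approximately zero beyond lag p = 2
acf(ma2)    # approximately zero beyond lag q = 2
```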


Behaviour of the SACF $\hat{\rho}(k)$ (based on $N$ obs.)

If the process is white noise, then

$$\pm \frac{2}{\sqrt{N}}$$

is an approximate 95% confidence interval for the SACF at lags different from 0.

If the process is an MA(q)-process, then

$$\pm 2\sqrt{\frac{1 + 2\left(\hat{\rho}^2(1) + \cdots + \hat{\rho}^2(q)\right)}{N}}$$

is an approximate 95% confidence interval for the SACF at lags larger than $q$.
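A sketch of the wider MA(q) bound in R; the helper name `ma_bound` is ours:

```r
ma_bound <- function(y, q) {
  N <- length(y)
  r <- acf(y, plot = FALSE)$acf[2:(q + 1)]   # rho_hat(1), ..., rho_hat(q)
  2 * sqrt((1 + 2 * sum(r^2)) / N)
}

y <- as.numeric(arima.sim(model = list(ma = c(0.8, 0.4)), n = 500))
ma_bound(y, 2)   # compare SACF values at lags > 2 against +/- this bound
2 / sqrt(500)    # the narrower white-noise bound, for reference
```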


Behaviour of the SPACF $\hat{\phi}_{kk}$ (based on $N$ obs.)

If the process is an AR(p)-process, then

$$\pm \frac{2}{\sqrt{N}}$$

is an approximate 95% confidence interval for the SPACF at lags larger than $p$.


Model building in general

  • 1. Identification (specifying the model order)
  • 2. Estimation (of the model parameters)
  • 3. Model checking: is the model OK?

The inputs are data, physical insight, and theory. If the model is not OK, return to the identification step; if it is OK, proceed to applications using the model (prediction, simulation, etc.).


Estimation

We have an appropriate model structure, AR(p), MA(q), ARMA(p, q), or ARIMA(p, d, q), with $p$, $d$, and $q$ known.

Task: based on the observations, find appropriate values of the parameters.

The book describes many methods:

  • Moment estimates
  • LS-estimates
  • Prediction error estimates (conditioned or unconditioned)
  • ML-estimates (conditioned or unconditioned (exact))

Example

Using the autocorrelation functions we agreed that

$$\hat{y}_{t+1|t} = a_1 y_t + a_2 y_{t-1}$$

and that we would select $a_1$ and $a_2$ so that the sum of the squared prediction errors, when using the model on the data at hand, becomes as small as possible.

To comply with the notation of the book we write the 1-step forecasts as

$$\hat{y}_{t+1|t} = -\phi_1 y_t - \phi_2 y_{t-1}$$


The errors given the parameters ($\phi_1$ and $\phi_2$)

Observations: $y_1, y_2, \ldots, y_N$

Errors:

$$e_{t+1|t} = y_{t+1} - \hat{y}_{t+1|t} = y_{t+1} - (-\phi_1 y_t - \phi_2 y_{t-1})$$

so that

$$\begin{aligned}
e_{3|2} &= y_3 + \phi_1 y_2 + \phi_2 y_1 \\
e_{4|3} &= y_4 + \phi_1 y_3 + \phi_2 y_2 \\
e_{5|4} &= y_5 + \phi_1 y_4 + \phi_2 y_3 \\
&\vdots \\
e_{N|N-1} &= y_N + \phi_1 y_{N-1} + \phi_2 y_{N-2}
\end{aligned}$$

In matrix form:

$$\begin{bmatrix} y_3 \\ \vdots \\ y_N \end{bmatrix} =
\begin{bmatrix} -y_2 & -y_1 \\ \vdots & \vdots \\ -y_{N-1} & -y_{N-2} \end{bmatrix}
\begin{bmatrix} \phi_1 \\ \phi_2 \end{bmatrix} +
\begin{bmatrix} e_{3|2} \\ \vdots \\ e_{N|N-1} \end{bmatrix}$$

Or just: $Y = X\theta + \varepsilon$


Solution

To minimize the sum of the squared 1-step prediction errors $\varepsilon^T\varepsilon$ we use the result for the General Linear Model from Chapter 3:

$$\hat{\theta} = (X^T X)^{-1} X^T Y$$

with

$$X = \begin{bmatrix} -y_2 & -y_1 \\ \vdots & \vdots \\ -y_{N-1} & -y_{N-2} \end{bmatrix}
\quad \text{and} \quad
Y = \begin{bmatrix} y_3 \\ \vdots \\ y_N \end{bmatrix}$$

The method is called the LS-estimator for dynamical systems. It is also in the class of prediction error methods, since it minimizes the sum of the squared 1-step prediction errors.

How does it generalize to AR(p)-models? (A sketch follows below.)
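A minimal sketch of the generalization in R: for an AR(p) model the regressor matrix gets $p$ columns of lagged values and the same normal equations apply. The function name `ls_ar` is ours, not from the book:

```r
# LS-estimation of AR(p): regress y_t on -y_{t-1}, ..., -y_{t-p}
ls_ar <- function(y, p) {
  N <- length(y)
  Y <- y[(p + 1):N]
  X <- sapply(1:p, function(j) -y[(p + 1 - j):(N - j)])
  solve(t(X) %*% X, t(X) %*% Y)   # estimates of phi_1, ..., phi_p
}

# R's arima.sim uses the opposite sign convention for the AR part, so
# ar = c(0.5, 0.3) corresponds to (phi_1, phi_2) = (-0.5, -0.3) in the book
y <- as.numeric(arima.sim(model = list(ar = c(0.5, 0.3)), n = 500))
ls_ar(y, 2)   # approx. (-0.5, -0.3)
```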


Small illustrative example using S-PLUS

```
> obs
[1] -3.51 -3.81 -1.85 -2.02 -1.91 -0.88
> N <- length(obs); Y <- obs[3:N]
> Y
[1] -1.85 -2.02 -1.91 -0.88
> X <- cbind(-obs[2:(N-1)], -obs[1:(N-2)])
> X
     [,1] [,2]
[1,] 3.81 3.51
[2,] 1.85 3.81
[3,] 2.02 1.85
[4,] 1.91 2.02
> solve(t(X) %*% X, t(X) %*% Y)  # Estimates
           [,1]
[1,] -0.1474288
[2,] -0.4476040
```


Maximum likelihood estimates

ARMA(p, q)-process:

$$Y_t + \phi_1 Y_{t-1} + \cdots + \phi_p Y_{t-p} = \varepsilon_t + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q}$$

Notation: $\theta^T = (\phi_1, \ldots, \phi_p, \theta_1, \ldots, \theta_q)$ and $\mathcal{Y}_t^T = (Y_t, Y_{t-1}, \ldots, Y_1)$

The likelihood function is the joint probability distribution function for all observations for given values of $\theta$ and $\sigma^2_\varepsilon$:

$$L(\mathcal{Y}_N; \theta, \sigma^2_\varepsilon) = f(\mathcal{Y}_N \mid \theta, \sigma^2_\varepsilon)$$

Given the observations $\mathcal{Y}_N$ we estimate $\theta$ and $\sigma^2_\varepsilon$ as the values for which the likelihood is maximized.


The likelihood function for ARMA(p, q)-models

The random variable $Y_N \mid \mathcal{Y}_{N-1}$ only contains $\varepsilon_N$ as a random component. Since $\varepsilon_N$ is white noise at time $N$, it does not depend on anything earlier. We therefore know that the random variables $Y_N \mid \mathcal{Y}_{N-1}$ and $\mathcal{Y}_{N-1}$ are independent, hence:

$$f(\mathcal{Y}_N \mid \theta, \sigma^2_\varepsilon) = f(Y_N \mid \mathcal{Y}_{N-1}, \theta, \sigma^2_\varepsilon)\, f(\mathcal{Y}_{N-1} \mid \theta, \sigma^2_\varepsilon)$$

Repeating these arguments:

$$L(\mathcal{Y}_N; \theta, \sigma^2_\varepsilon) = \left[\prod_{t=p+1}^{N} f(Y_t \mid \mathcal{Y}_{t-1}, \theta, \sigma^2_\varepsilon)\right] f(\mathcal{Y}_p \mid \theta, \sigma^2_\varepsilon)$$


The conditional likelihood function

Evaluation of $f(\mathcal{Y}_p \mid \theta, \sigma^2_\varepsilon)$ requires special attention.

It turns out that the conditional likelihood function

$$L(\mathcal{Y}_N; \theta, \sigma^2_\varepsilon) = \prod_{t=p+1}^{N} f(Y_t \mid \mathcal{Y}_{t-1}, \theta, \sigma^2_\varepsilon)$$

results in the same estimates as the exact likelihood function when many observations are available. For small samples there can be some difference.

Software: the S-PLUS function arima.mle calculates conditional estimates; the R function arima calculates exact estimates.
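For reference, a hedged R sketch of fitting by exact versus conditional likelihood (method = "CSS" minimizes the conditional sum of squares):

```r
y <- as.numeric(arima.sim(model = list(ar = 0.7, ma = 0.4), n = 500))

arima(y, order = c(1, 0, 1), method = "ML")    # exact maximum likelihood
arima(y, order = c(1, 0, 1), method = "CSS")   # conditional sum of squares
```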


Evaluating the conditional likelihood function

Task: find the conditional densities for given values of the parameters $\theta$ and $\sigma^2_\varepsilon$.

The mean of the random variable $Y_t \mid \mathcal{Y}_{t-1}$ is the 1-step forecast $\hat{Y}_{t|t-1}$. The prediction error $\varepsilon_t = Y_t - \hat{Y}_{t|t-1}$ has variance $\sigma^2_\varepsilon$.

We assume that the process is Gaussian:

$$f(Y_t \mid \mathcal{Y}_{t-1}, \theta, \sigma^2_\varepsilon) = \frac{1}{\sigma_\varepsilon\sqrt{2\pi}}\, e^{-\left(Y_t - \hat{Y}_{t|t-1}(\theta)\right)^2 / 2\sigma^2_\varepsilon}$$

And therefore:

$$L(\mathcal{Y}_N; \theta, \sigma^2_\varepsilon) = \left(2\pi\sigma^2_\varepsilon\right)^{-\frac{N-p}{2}} \exp\left(-\frac{1}{2\sigma^2_\varepsilon}\sum_{t=p+1}^{N}\varepsilon_t^2(\theta)\right)$$


ML-estimates

The (conditional) ML-estimate $\hat{\theta}$ is a prediction error estimate, since it is obtained by minimizing

$$S(\theta) = \sum_{t=p+1}^{N} \varepsilon_t^2(\theta)$$

By differentiating w.r.t. $\sigma^2_\varepsilon$ it can be shown that the ML-estimate of $\sigma^2_\varepsilon$ is

$$\hat{\sigma}^2_\varepsilon = S(\hat{\theta})/(N - p)$$

The estimate $\hat{\theta}$ is asymptotically "good", and its variance-covariance matrix is approximately $2\sigma^2_\varepsilon H^{-1}$, where $H$ contains the 2nd order partial derivatives of $S(\theta)$ at the minimum.


Finding the ML-estimates using the PE-method

1-step predictions:

$$\hat{Y}_{t|t-1} = -\phi_1 Y_{t-1} - \cdots - \phi_p Y_{t-p} + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q}$$

If we use $\varepsilon_p = \varepsilon_{p-1} = \cdots = \varepsilon_{p+1-q} = 0$ we can find

$$\hat{Y}_{p+1|p} = -\phi_1 Y_p - \cdots - \phi_p Y_1 + \theta_1 \varepsilon_p + \cdots + \theta_q \varepsilon_{p+1-q}$$

which gives us $\varepsilon_{p+1} = Y_{p+1} - \hat{Y}_{p+1|p}$. We can then calculate $\hat{Y}_{p+2|p+1}$ and $\varepsilon_{p+2}$, and so on, until we have all the 1-step prediction errors we need. We use numerical optimization to find the parameters which minimize the sum of squared prediction errors (a sketch follows below).
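A minimal R sketch of this recursion for an ARMA(1,1), with optim doing the numerical minimization and its Hessian giving the approximate covariance $2\sigma^2_\varepsilon H^{-1}$ from the previous slide. This is our illustration, not the book's code:

```r
# Sum of squared 1-step prediction errors for an ARMA(1,1) in the book's
# convention: Y_t + phi * Y_{t-1} = eps_t + theta * eps_{t-1}
S <- function(par, y) {
  phi <- par[1]; theta <- par[2]
  N <- length(y)
  eps <- numeric(N)                                # eps[1] = 0: conditioning as above
  for (t in 2:N) {
    yhat <- -phi * y[t - 1] + theta * eps[t - 1]   # 1-step prediction
    eps[t] <- y[t] - yhat                          # prediction error
  }
  sum(eps[2:N]^2)
}

# R's arima.sim uses the opposite sign for the AR part, so ar = 0.7 below
# corresponds to phi = -0.7 in the book's notation
y <- as.numeric(arima.sim(model = list(ar = 0.7, ma = 0.4), n = 500, sd = 0.25))
fit <- optim(c(0, 0), S, y = y, hessian = TRUE)
fit$par                                   # approx. (-0.7, 0.4)
sigma2 <- fit$value / (length(y) - 1)     # S(theta_hat) / (N - p), p = 1
2 * sigma2 * solve(fit$hessian)           # approximate variance-covariance matrix
```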


$S(\theta)$ for $(1 + 0.7B)Y_t = (1 - 0.4B)\varepsilon_t$ with $\sigma^2_\varepsilon = 0.25^2$

[Figure: contour plot of $S(\theta)$ over the MA-parameter / AR-parameter plane; data: arima.sim(model=list(ar=−0.7, ma=0.4), n=500, sd=0.25)]


Moment estimates

Given the model structure:

  • Find formulas for the theoretical autocorrelation or autocovariance as functions of the parameters in the model
  • Estimate, e.g. calculate the SACF
  • Solve the equations using the lowest lags necessary

Complicated! General properties of the estimator are unknown!


Moment estimates for AR(p)-processes

In this case moment estimates are simple to find due to the Yule-Walker equations. We simply plug the estimated autocorrelation function at lags 1 to $p$ into

$$\begin{bmatrix} \hat{\rho}(1) \\ \hat{\rho}(2) \\ \vdots \\ \hat{\rho}(p) \end{bmatrix} =
\begin{bmatrix}
1 & \hat{\rho}(1) & \cdots & \hat{\rho}(p-1) \\
\hat{\rho}(1) & 1 & \cdots & \hat{\rho}(p-2) \\
\vdots & \vdots & \ddots & \vdots \\
\hat{\rho}(p-1) & \hat{\rho}(p-2) & \cdots & 1
\end{bmatrix}
\begin{bmatrix} -\phi_1 \\ -\phi_2 \\ \vdots \\ -\phi_p \end{bmatrix}$$

and solve w.r.t. the $\phi$'s. The function ar in S-PLUS does this.
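A hedged R sketch of solving the Yule-Walker system directly, checked against the built-in ar; the helper name `yw_ar` is ours:

```r
yw_ar <- function(y, p) {
  r <- acf(y, lag.max = p, plot = FALSE)$acf[, 1, 1]   # rho_hat(0), ..., rho_hat(p)
  R <- toeplitz(r[1:p])        # the matrix above, with 1's on the diagonal
  -solve(R, r[2:(p + 1)])      # the phi's in the book's sign convention
}

y <- as.numeric(arima.sim(model = list(ar = c(0.5, 0.3)), n = 500))
yw_ar(y, 2)                                                    # approx. (-0.5, -0.3)
ar(y, order.max = 2, aic = FALSE, method = "yule-walker")$ar   # R's sign: approx. (0.5, 0.3)
```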