SLIDE 1

Comments on Choice of ARMA model

  • Keep it simple! Use small p and q.
  • Some systems have autoregressive-like structure.
  • E.g. first order dynamics:

dx(t)/dt = −αx(t)

or in stochastic form,

dx(t) = −αx(t)dt + dW(t),

where W(t) is a Wiener process, the continuous-time limit of the random walk.

SLIDE 2
  • Discrete time approximation:

δx(t) = x(t + δt) − x(t) = −αx(t)δt + δW(t)

or

x(t + δt) = x(t) − αx(t)δt + δW(t) = (1 − αδt)x(t) + δW(t),

an AR(1) (causal if α > 0 and δt is small).

  • Similarly a second order system leads to AR(2).
  • Since many real-world systems can be approximated by first or second order dynamics, this suggests using p = 1 or 2, and q = 0.
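The discretization above can be checked by simulation; a minimal sketch (the values of α and δt are illustrative choices, not from the slides):

```python
import numpy as np

# Discretize dx(t) = -alpha*x(t)*dt + dW(t) with an Euler step:
# x(t + dt) = (1 - alpha*dt)*x(t) + dW(t), i.e. an AR(1) with phi = 1 - alpha*dt.
# alpha and dt below are illustrative choices, not values from the slides.
rng = np.random.default_rng(0)
alpha, dt, n = 2.0, 0.01, 100_000
phi = 1.0 - alpha * dt                       # implied AR(1) coefficient (0.98 here)
dW = rng.normal(scale=np.sqrt(dt), size=n)   # Wiener increments, Var = dt
x = np.empty(n + 1)
x[0] = 0.0
for t in range(n):
    x[t + 1] = phi * x[t] + dW[t]

# The lag-1 sample autocorrelation of the simulated path should be close to phi
phi_hat = np.corrcoef(x[:-1], x[1:])[0, 1]
print(phi, round(phi_hat, 3))
```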

SLIDE 3
  • Some systems have more dimensions. E.g. first order vector autoregression, VARp(1):

xt = Φ xt−1 + wt,

where xt and wt are p × 1 vectors and Φ is a p × p matrix.

  • Here each component time series is typically ARMA(p, p−1).
  • This suggests using q < p, especially q = p − 1.
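The component-wise ARMA(p, p−1) claim can be checked numerically for p = 2: applying the scalar AR polynomial det(I − ΦB) = 1 − tr(Φ)B + det(Φ)B² to one component should leave an MA(1) residual. A sketch with an arbitrary stable Φ (the coefficients are illustrative assumptions):

```python
import numpy as np

# Each component of a p-dimensional VAR(1) should be ARMA(p, p-1); here p = 2,
# so a component is ARMA(2, 1).  Applying the scalar AR polynomial
# det(I - Phi*B) = 1 - tr(Phi)*B + det(Phi)*B^2 to one component should leave
# an MA(1) residual: autocorrelation nonzero at lag 1, near zero at lag 2.
# The coefficient matrix Phi below is an arbitrary stable example.
rng = np.random.default_rng(1)
Phi = np.array([[0.5, 0.2],
                [0.1, 0.4]])
n = 100_000
x = np.zeros((n, 2))
for t in range(1, n):
    x[t] = Phi @ x[t - 1] + rng.normal(size=2)

tr, det = np.trace(Phi), np.linalg.det(Phi)
x1 = x[:, 0]
u = x1[2:] - tr * x1[1:-1] + det * x1[:-2]    # det(I - Phi*B) applied to x1

def acf(z, lag):
    z = z - z.mean()
    return (z[:-lag] * z[lag:]).mean() / (z * z).mean()

print(round(acf(u, 1), 3), round(acf(u, 2), 3))
```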

SLIDE 4
  • Added noise: if yt is ARMA(p, q) with q < p, but we observe

xt = yt + w′t,

where w′t is white noise, uncorrelated with yt, then xt is ARMA(p, p).

  • This suggests using q = p.
  • Summary: you’ll often find that you can use small p and q ≤ p, perhaps q = 0, q = p − 1, or q = p, depending on the background of the series.
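The added-noise result can be illustrated by simulation: filtering the observed series with the AR polynomial of yt should leave an MA(1) residual (autocorrelation nonzero at lag 1, near zero beyond). A sketch with illustrative parameter values:

```python
import numpy as np

# Observe x_t = y_t + w'_t where y_t is AR(1) and w'_t is independent white
# noise; x_t should then be ARMA(1, 1).  Filtering x_t with the AR polynomial
# of y_t leaves an MA(1) residual: autocorrelation nonzero at lag 1, near zero
# at lag 2 and beyond.  phi and the noise variances are illustrative.
rng = np.random.default_rng(6)
phi, n = 0.8, 100_000
e = rng.normal(size=n)
y = np.empty(n)
y[0] = 0.0
for t in range(1, n):
    y[t] = phi * y[t - 1] + e[t]
x = y + rng.normal(size=n)             # AR(1) plus added white noise

v = x[1:] - phi * x[:-1]               # apply the AR(1) polynomial of y

def acf(z, lag):
    z = z - z.mean()
    return (z[:-lag] * z[lag:]).mean() / (z * z).mean()

print(round(acf(v, 1), 3), round(acf(v, 2), 3))
```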

SLIDE 5

Estimation

  • Current methods are likelihood-based:

f1,2,...,n(x1, x2, . . . , xn) = f1(x1) × f2|1(x2 | x1) × · · · × fn|n−1,...,1(xn | xn−1, xn−2, . . . , x1).

  • If xt is AR(p) and n > p, then

fn|n−1,...,1(xn | xn−1, xn−2, . . . , x1) = fn|n−1,...,n−p(xn | xn−1, xn−2, . . . , xn−p).
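This Markov reduction can be verified with the Gaussian conditioning formula: for an AR(1), the regression coefficients of xn on the whole past should be (0, . . . , 0, φ). A sketch (the values φ = 0.7, n = 6 are arbitrary):

```python
import numpy as np

# For a Gaussian AR(1), the conditional density of x_n given the whole past
# should depend only on x_{n-1}.  Check via the Gaussian conditioning formula:
# the regression coefficients of x_n on (x_1, ..., x_{n-1}) are
# Sigma11^{-1} Sigma21, and for an AR(1) they should be (0, ..., 0, phi).
phi, n = 0.7, 6                        # illustrative values
# Stationary AR(1) autocovariance with sigma_w^2 = 1: gamma(h) = phi^|h|/(1 - phi^2)
lags = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
Gamma = phi ** lags / (1 - phi ** 2)
Sigma11 = Gamma[:-1, :-1]              # covariance of the past (x_1, ..., x_{n-1})
Sigma21 = Gamma[-1, :-1]               # covariance of x_n with the past
beta = np.linalg.solve(Sigma11, Sigma21)
print(np.round(beta, 6))               # expect zeros except phi in the last slot
```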

SLIDE 6
  • Assume xt is Gaussian. E.g. AR(1):

ft|t−1(xt | xt−1) is N[(1 − φ)µ + φxt−1, σ²_w] for t > 1,

and f1(x1) is N[µ, σ²_w / (1 − φ²)].

  • So the likelihood, still for AR(1), is

L(µ, φ, σ²_w) = (2πσ²_w)^(−n/2) (1 − φ²)^(1/2) exp[−S(µ, φ) / (2σ²_w)],

where

S(µ, φ) = (1 − φ²)(x1 − µ)² + Σ_{t=2}^{n} [(xt − µ) − φ(xt−1 − µ)]².
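As a sanity check, the closed-form likelihood above should agree with the product of f1 and the one-step conditionals from the previous slide. A sketch (the data and parameter values are arbitrary illustrations):

```python
import numpy as np

# Evaluate the exact AR(1) log-likelihood via S(mu, phi) and check it against
# the sum of the log one-step conditional densities plus log f1.
def loglik_S(x, mu, phi, s2):
    n = len(x)
    S = (1 - phi**2) * (x[0] - mu)**2 \
        + np.sum(((x[1:] - mu) - phi * (x[:-1] - mu))**2)
    return -0.5 * n * np.log(2 * np.pi * s2) + 0.5 * np.log(1 - phi**2) - S / (2 * s2)

def loglik_factored(x, mu, phi, s2):
    def lognorm(z, m, v):              # log N(m, v) density at z
        return -0.5 * np.log(2 * np.pi * v) - (z - m)**2 / (2 * v)
    ll = lognorm(x[0], mu, s2 / (1 - phi**2))                     # stationary f1
    for t in range(1, len(x)):
        ll += lognorm(x[t], (1 - phi) * mu + phi * x[t - 1], s2)  # f_{t|t-1}
    return ll

rng = np.random.default_rng(2)
x = rng.normal(size=50)
a = loglik_S(x, 0.3, 0.6, 1.2)
b = loglik_factored(x, 0.3, 0.6, 1.2)
print(a, b)
```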

SLIDE 7

Methods in proc arima

  • method = ml: maximize the likelihood.
  • method = uls: minimize the unconditional sum of squares S(µ, φ).

  • method = cls: minimize the conditional sum of squares Sc(µ, φ):

Sc(µ, φ) = S(µ, φ) − (1 − φ²)(x1 − µ)² = Σ_{t=2}^{n} [(xt − µ) − φ(xt−1 − µ)]².

This is essentially least squares regression of xt on xt−1.
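A sketch showing that minimizing Sc(µ, φ) is the same as OLS of xt on xt−1, with the intercept mapped back via c = µ(1 − φ) (the simulation settings are illustrative):

```python
import numpy as np

# Conditional least squares for AR(1) is just OLS of x_t on x_{t-1}:
# Sc(mu, phi) = sum_t (x_t - mu(1 - phi) - phi*x_{t-1})^2, so the OLS
# intercept c and slope phi map back to mu = c / (1 - phi).
rng = np.random.default_rng(3)
n, mu_true, phi_true = 5000, 2.0, 0.6
x = np.empty(n)
x[0] = mu_true
for t in range(1, n):
    x[t] = (1 - phi_true) * mu_true + phi_true * x[t - 1] + rng.normal()

A = np.column_stack([np.ones(n - 1), x[:-1]])       # design: intercept, lag
(c, phi_hat), *_ = np.linalg.lstsq(A, x[1:], rcond=None)
mu_hat = c / (1 - phi_hat)

def Sc(mu, phi):
    return np.sum(((x[1:] - mu) - phi * (x[:-1] - mu))**2)

# The OLS solution minimizes Sc: nearby parameter values do no better
print(round(phi_hat, 2), round(mu_hat, 2))
```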

SLIDE 8
  • AR(p), p > 1, can be handled similarly.
  • ARMA(p, q) with q > 0 is more complicated; state space methods can be used to calculate the exact likelihood.
  • proc arima implements the same three methods in all cases.
  • All three methods give estimators with the same large-sample normal distribution; all are asymptotically optimal.

SLIDE 9

Brute Force

  • Above methods fail (or need serious modification) if any data are missing.
  • Can always fall back to brute force:

x1, x2, . . . , xn ∼ Nn(µ1, Γ), where

Γ (n × n) =

  ⎡ γ(0)     γ(1)     γ(2)     · · ·  γ(n−1) ⎤
  ⎢ γ(1)     γ(0)     γ(1)     · · ·  γ(n−2) ⎥
  ⎢ γ(2)     γ(1)     γ(0)     · · ·  γ(n−3) ⎥
  ⎢   ⋮        ⋮        ⋮       ⋱       ⋮    ⎥
  ⎣ γ(n−1)   γ(n−2)   γ(n−3)   · · ·  γ(0)   ⎦

9
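A brute-force sketch for the AR(1) case, where γ(h) = σ²_w φ^|h| / (1 − φ²) is known in closed form, cross-checked against the exact AR(1) likelihood from the earlier slide (the parameter values are arbitrary):

```python
import numpy as np

# Build the n x n Toeplitz matrix Gamma from the AR(1) autocovariance
# gamma(h) = s2 * phi^|h| / (1 - phi^2) and evaluate the multivariate
# normal log-density directly; compare with the closed-form AR(1) answer.
rng = np.random.default_rng(4)
n, mu, phi, s2 = 40, 0.5, 0.7, 1.3
x = rng.normal(size=n)                      # any data vector will do here

lags = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
Gamma = s2 * phi ** lags / (1 - phi ** 2)   # Gamma[i, j] = gamma(|i - j|)

d = x - mu
sign, logdet = np.linalg.slogdet(2 * np.pi * Gamma)
ll_brute = -0.5 * logdet - 0.5 * d @ np.linalg.solve(Gamma, d)

# Exact AR(1) formula via S(mu, phi)
S = (1 - phi**2) * (x[0] - mu)**2 + np.sum(((x[1:] - mu) - phi * (x[:-1] - mu))**2)
ll_exact = -0.5 * n * np.log(2 * np.pi * s2) + 0.5 * np.log(1 - phi**2) - S / (2 * s2)
print(ll_brute, ll_exact)
```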

slide-10
SLIDE 10
  • Write γ(h) = σ²_w γ∗(h), and use e.g. R’s ARMAacf(...) to compute γ∗(h).

  • Likelihood is

(1 / √det(2πΓ)) exp[−(1/2)(x − µ1)′ Γ⁻¹ (x − µ1)]
  = (1 / √det(2πσ²_w Γ∗)) exp[−(1/(2σ²_w))(x − µ1)′ Γ∗⁻¹ (x − µ1)].

  • Can maximize analytically with respect to µ and σ²_w, then numerically with respect to φ and θ.
  • Missing data? Just leave out the corresponding rows and columns of Γ∗.
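A sketch of the missing-data recipe: evaluate the Gaussian likelihood after deleting the rows and columns of Γ at the missing time points (the AR(1) closed-form autocovariance and the missing indices are illustrative assumptions):

```python
import numpy as np

# Missing data under the brute-force approach: drop the rows and columns of
# Gamma (and the entries of x) at the missing time points, then evaluate the
# Gaussian log-likelihood of the remaining observations.
rng = np.random.default_rng(5)
n, mu, phi, s2 = 30, 0.0, 0.6, 1.0
lags = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
Gamma = s2 * phi ** lags / (1 - phi ** 2)   # AR(1) autocovariance matrix
x = rng.normal(size=n)

obs = np.setdiff1d(np.arange(n), [4, 11, 12, 25])   # indices actually observed
G_obs = Gamma[np.ix_(obs, obs)]             # delete missing rows and columns
d = x[obs] - mu
sign, logdet = np.linalg.slogdet(2 * np.pi * G_obs)
ll = -0.5 * logdet - 0.5 * d @ np.linalg.solve(G_obs, d)
print(round(ll, 3))
```

The submatrix of a positive definite Toeplitz matrix is still positive definite, so the reduced likelihood is well defined.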
