Time Series Regression (PowerPoint Presentation)


SLIDE 1

Time Series Regression

  • A regression model relates a response xt to inputs zt,1, zt,2, . . . , zt,q:

xt = β1zt,1 + β2zt,2 + · · · + βqzt,q + error.

  • Time domain modeling: the inputs often include lagged values of the same series, xt−1, xt−2, . . . , xt−p.

  • Frequency domain modeling: the inputs include sine and cosine functions.
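The time-domain setup above can be sketched in Python with NumPy (the slides themselves use R; `lagged_design` is a hypothetical helper invented for this illustration):

```python
import numpy as np

def lagged_design(x, p):
    """Design matrix whose columns are the lagged values x[t-1], ..., x[t-p]."""
    n = len(x)
    Z = np.column_stack([x[p - j : n - j] for j in range(1, p + 1)])
    return Z, x[p:]  # inputs and the aligned response x_t

rng = np.random.default_rng(0)
x = rng.standard_normal(100).cumsum()            # toy series (not globtemp)
Z, resp = lagged_design(x, p=2)                  # columns: x[t-1], x[t-2]
beta, *_ = np.linalg.lstsq(Z, resp, rcond=None)  # OLS fit of x_t on its own lags
```

For frequency-domain modeling the columns of `Z` would instead be sine and cosine functions of t.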

SLIDE 2

Fitting a Trend

> g1900 = window(globtemp, start = 1900)
> plot(g1900)

[Plot of window(globtemp, start = 1900): Time on the x-axis (1900–2000), temperature anomaly on the y-axis (−0.4 to 0.4)]

SLIDE 3
  • possible model:

xt = β1 + β2t + wt, where the error (“noise”) is white noise (unlikely!).

  • fit using ordinary least squares (OLS):

> lmg1900 = lm(g1900 ~ time(g1900)); summary(lmg1900)

Call:
lm(formula = g1900 ~ time(g1900))

Residuals:
     Min       1Q   Median       3Q      Max
-0.30352 -0.09671  0.01132  0.08289  0.33519
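The same straight-line fit can be reproduced with NumPy's least-squares solver. The series below is a simulated stand-in for g1900 (the intercept, slope, and noise level used to generate it are assumptions of this sketch, loosely based on the summary output):

```python
import numpy as np

# hypothetical stand-in for g1900: 98 yearly values, linear trend plus noise
rng = np.random.default_rng(1)
t = np.arange(1900, 1998)
x = -12.19 + 0.006209 * t + 0.13 * rng.standard_normal(t.size)

# OLS fit of x_t = beta1 + beta2 * t + w_t, as in lm(g1900 ~ time(g1900))
Z = np.column_stack([np.ones_like(t, dtype=float), t])
beta, *_ = np.linalg.lstsq(Z, x, rcond=None)
resid = x - Z @ beta  # residuals sum to zero because Z contains a constant column
```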

SLIDE 4

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.219e+01  9.032e-01  -13.49   <2e-16 ***
time(g1900)  6.209e-03  4.635e-04   13.40   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1298 on 96 degrees of freedom
Multiple R-Squared: 0.6515,     Adjusted R-squared: 0.6479
F-statistic: 179.5 on 1 and 96 DF,  p-value: < 2.2e-16

SLIDE 5

> plot(g1900)
> abline(reg = lmg1900)

[Plot of g1900 with the fitted regression line: Time on the x-axis (1900–2000), temperature anomaly on the y-axis (−0.4 to 0.4)]

SLIDE 6

Using PROC ARIMA

Program:

data globtemp;
  infile 'globtemp.dat';
  n + 1;
  input globtemp;
  year = 1855 + n;
run;

proc arima data = globtemp;
  where year >= 1900;
  identify var = globtemp crosscorr = year;
  /* The ESTIMATE statement fits a model to the
     variable in the most recent IDENTIFY statement */
  estimate input = year;
run;

and output.

SLIDE 7

Regression Review

  • the regression model:

xt = β1zt,1 + β2zt,2 + · · · + βqzt,q + wt = β′zt + wt.

  • fit by minimizing the residual sum of squares

RSS(β) = Σₜ₌₁ⁿ (xt − β′zt)²

  • find the minimum by solving the normal equations

(Σₜ₌₁ⁿ ztz′t) β̂ = Σₜ₌₁ⁿ ztxt.
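As a numerical check, the normal equations can be solved directly and compared against a QR-based least-squares solve. A NumPy sketch with simulated data (none of the numbers come from the slides' example):

```python
import numpy as np

rng = np.random.default_rng(2)
n, q = 200, 3
Z = rng.standard_normal((n, q))              # rows are z_t'
beta_true = np.array([1.0, -2.0, 0.5])
x = Z @ beta_true + 0.1 * rng.standard_normal(n)

# normal equations: (sum_t z_t z_t') beta_hat = sum_t z_t x_t
A = Z.T @ Z                                  # sum of z_t z_t'
b = Z.T @ x                                  # sum of z_t x_t
beta_hat = np.linalg.solve(A, b)

# same answer via a numerically safer least-squares solver
beta_lstsq, *_ = np.linalg.lstsq(Z, x, rcond=None)
```

Forming Z′Z squares the condition number, which is why library routines prefer QR or SVD factorizations of Z itself.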

SLIDE 8

Matrix Formulation

  • factor matrix Zn×q = (z1, z2, . . . , zn)′, response vector xn×1 = (x1, x2, . . . , xn)′
  • normal equations (Z′Z)β̂ = Z′x, with solution β̂ = (Z′Z)⁻¹Z′x
  • minimized RSS:

RSS(β̂) = (x − Zβ̂)′(x − Zβ̂)
        = x′x − β̂′Z′x
        = x′x − x′Z(Z′Z)⁻¹Z′x
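The closed-form expression for the minimized RSS can be verified numerically. A minimal NumPy sketch with simulated data:

```python
import numpy as np

rng = np.random.default_rng(3)
n, q = 50, 2
Z = rng.standard_normal((n, q))
x = Z @ np.array([2.0, -1.0]) + rng.standard_normal(n)

beta_hat = np.linalg.solve(Z.T @ Z, Z.T @ x)
resid = x - Z @ beta_hat

rss_direct = resid @ resid                      # (x - Z beta_hat)'(x - Z beta_hat)
rss_formula = x @ x - x @ Z @ beta_hat          # x'x - x'Z(Z'Z)^{-1}Z'x
```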

SLIDE 9

Distributions

  • If the (white noise) errors are normally distributed (wt ∼ iid N(0, σ²w)), then β̂ is multivariate normal, and the usual t- and F-statistics have the corresponding distributions.
  • If the errors are not normally distributed, but still iid, the same is approximately true.
  • If the errors are not white noise, none of that is true.

SLIDE 10

Choosing a Regression Model

  • We want a model that fits well without using too many parameters.
  • Two estimates of the noise variance:
    – unbiased: s²w = RSS/(n − q)
    – maximum likelihood: σ̂² = RSS/n.
  • We want small σ̂² but also small q.
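Both variance estimates in NumPy, for a simulated trend-plus-noise series (the coefficients and noise level are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)
n, q = 100, 2
Z = np.column_stack([np.ones(n), np.arange(n, dtype=float)])
x = Z @ np.array([0.1, 0.02]) + 0.5 * rng.standard_normal(n)  # true variance 0.25

beta_hat, *_ = np.linalg.lstsq(Z, x, rcond=None)
rss = np.sum((x - Z @ beta_hat) ** 2)

s2_w = rss / (n - q)       # unbiased estimate of the noise variance
sigma2_hat = rss / n       # maximum likelihood estimate (always slightly smaller)
```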

SLIDE 11

Information Criteria (smaller is better)

  • Akaike’s Information Criterion (with k variables in the model):

AIC = ln σ̂²k + (n + 2k)/n

  • bias-corrected Akaike’s Information Criterion:

AICc = ln σ̂²k + (n + k)/(n − k − 2)

  • Schwarz’s (Bayesian) Information Criterion:

SIC = ln σ̂²k + (k ln n)/n
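A sketch of the three criteria in Python, scoring polynomial trends of increasing order on a simulated series whose true trend is linear, i.e. k = 2 coefficients (the model and noise level are assumptions of this example):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 200
t = np.linspace(0, 1, n)
x = 1.0 + 2.0 * t + 0.3 * rng.standard_normal(n)   # true model: linear trend, k = 2

def criteria(k):
    """AIC, AICc, SIC for a degree-(k-1) polynomial trend with k coefficients."""
    Z = np.column_stack([t ** j for j in range(k)])
    beta, *_ = np.linalg.lstsq(Z, x, rcond=None)
    sigma2 = np.sum((x - Z @ beta) ** 2) / n       # ML variance estimate
    aic = np.log(sigma2) + (n + 2 * k) / n
    aicc = np.log(sigma2) + (n + k) / (n - k - 2)
    sic = np.log(sigma2) + k * np.log(n) / n
    return aic, aicc, sic

scores = {k: criteria(k) for k in range(1, 6)}
best_sic = min(scores, key=lambda k: scores[k][2])  # k chosen by SIC
```

SIC's heavier per-parameter penalty (ln n vs. 2) is what makes it more conservative than AIC for large n.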

SLIDE 12

Notes

  • More commonly (e.g. in SAS output and in R’s AIC function), these are all multiplied by n.
  • AIC, AICc, and SIC (also known as SBC and BIC) can be generalized to other problems where likelihood methods are used.
  • If n is large and the true k is small, minimizing BIC picks k well, but minimizing AIC tends to over-estimate it.
  • If the true k is large (or infinite), minimizing AIC picks a value that gives good predictions by trading off bias vs variance.
