Exploratory Data Analysis (or Searching for Stationarity)



SLIDE 1

Exploratory Data Analysis (or Searching for Stationarity)

  • When an observed time series appears stationary, we can calculate its sample autocorrelations, and use them to decide on a model.
  • Many time series do not appear stationary; e.g., Johnson and Johnson earnings, global temperature.
  • Often we can find a way to relate one series to a different series, for which stationarity is more plausible.

SLIDE 2

Trends and Detrending

  • Some series can be modeled as $x_t = \mu_t + y_t$, where $y_t$ is stationary.
  • If $\mu_t$ has a parametric form, we can estimate it and subtract it. That is, we use the residuals from a fitted trend.
  • The form of the trend might be linear, or a higher-degree polynomial, or some other function suggested by theory.
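As a small illustration of using the residuals from a fitted trend, the R sketch below simulates a series with a linear trend plus stationary AR(1) noise and detrends it with lm(). The simulated data, the slope 0.02 and the AR coefficient 0.5 are assumptions made for this example only; they do not come from the slides.

```r
# Simulated example (assumed values): linear trend plus stationary AR(1) noise
set.seed(1)
n  <- 200
tt <- 1:n
y  <- arima.sim(model = list(ar = 0.5), n = n)   # stationary component y_t
x  <- ts(0.02 * tt + y)                          # x_t = mu_t + y_t

# Estimate the parametric trend by least squares and subtract it
fit <- lm(x ~ time(x))
res <- ts(residuals(fit), start = start(x))      # detrended series

if (interactive()) plot(res)
```

A higher-degree polynomial trend could be fitted the same way by adding terms such as I(time(x)^2) to the formula.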

SLIDE 3

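Example: 20th Century Global Temperature

# fit a linear trend to g1900 and plot the residuals
lmg1900 = lm(g1900 ~ time(g1900));
plot(ts(residuals(lmg1900), start = 1900));

[Figure: residuals from the linear trend plotted against Time, 1900 to 2000; vertical axis "Residuals", roughly −0.3 to 0.3]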

SLIDE 4

Differencing

  • Some series still appear nonstationary after detrending.
  • E.g., the “trend” $\mu_t$ is a random walk with drift:

$$\mu_t = \delta t + \sum_{j=1}^{t} w_j.$$

Here $E(x_t) = \delta t$, but

$$x_t - E(x_t) = \sum_{j=1}^{t} w_j + y_t,$$

with a variance that grows with time.
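The growth of the variance can be made explicit under extra assumptions that the slides do not state: the $w_j$ are mutually uncorrelated with common variance $\sigma_w^2$, and they are uncorrelated with the stationary series $y_t$, whose variance is written $\gamma_y(0)$ here (notation introduced for this sketch). Then

```latex
\operatorname{Var}\bigl(x_t - E(x_t)\bigr)
  = \operatorname{Var}\Bigl(\sum_{j=1}^{t} w_j\Bigr) + \operatorname{Var}(y_t)
  = t\,\sigma_w^2 + \gamma_y(0),
```

which increases linearly in $t$, so subtracting the mean function $\delta t$ cannot by itself produce a stationary series.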

SLIDE 5
  • But now the first differences

$$\nabla x_t = x_t - x_{t-1} = \delta + w_t + y_t - y_{t-1}$$

are stationary.

  • Define the backshift operator $B$ by $B x_t = x_{t-1}$.
  • Then $\nabla x_t = (1 - B) x_t$.
  • Also second differences

$$\nabla^2 x_t = (1 - B)^2 x_t = x_t - 2x_{t-1} + x_{t-2},$$

etc. Differencing of order $d$ is easy for any positive integer $d$, and possible for fractional $d$.
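The difference identities can be checked numerically. The R sketch below simulates a random walk with drift plus noise (all parameter values are assumptions for illustration) and compares diff() with the explicit formulas for the first and second differences.

```r
# Simulated example (assumed values): random walk with drift plus noise
set.seed(2)
n     <- 300
delta <- 0.1
w     <- rnorm(n)                        # random walk innovations w_t
y     <- rnorm(n, sd = 0.5)              # stationary component y_t
x     <- delta * (1:n) + cumsum(w) + y   # x_t = delta*t + sum_{j<=t} w_j + y_t

d1 <- diff(x)                            # first differences (1 - B) x_t
d2 <- diff(x, differences = 2)           # second differences (1 - B)^2 x_t

# Explicit versions of the same quantities
d1_manual <- x[2:n] - x[1:(n - 1)]
d2_manual <- x[3:n] - 2 * x[2:(n - 1)] + x[1:(n - 2)]
all.equal(d1, d1_manual)
all.equal(d2, d2_manual)

mean(d1)                                 # estimates the drift delta
```

Note that each round of differencing shortens the series by one observation.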

SLIDE 6

Example: 20th Century Global Temperature

plot(diff(g1900));

[Figure: diff(g1900) plotted against Time, 1900 to 2000; vertical axis roughly −0.3 to 0.3]

  • Both detrending and differencing give apparently stationary results.

SLIDE 7

acf(diff(g1900));

[Figure: sample ACF of the series diff(g1900) plotted against Lag, lags up to about 15]

  • Differencing has removed almost all autocorrelation.

SLIDE 8

acf(residuals(lmg1900))

[Figure: sample ACF of the series residuals(lmg1900) plotted against Lag, lags up to about 15]

  • Removing the trend without differencing leaves more autocorrelation.
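The same contrast can be reproduced on simulated data. The sketch below uses a simulated random walk with drift, not the global temperature series, so it only illustrates why a stochastic trend can leave strong autocorrelation after linear detrending while differencing removes it. The drift 0.05 and the series length are assumed values.

```r
# Simulated example (assumed values): random walk with drift
set.seed(3)
n <- 200
x <- ts(0.05 * (1:n) + cumsum(rnorm(n)))

res <- residuals(lm(x ~ time(x)))   # detrended series
dx  <- diff(x)                      # differenced series

# Lag-1 sample autocorrelations ($acf[1] is lag 0, $acf[2] is lag 1)
r1_res <- acf(res, plot = FALSE)$acf[2]
r1_dx  <- acf(dx,  plot = FALSE)$acf[2]
c(detrended = r1_res, differenced = r1_dx)
```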

SLIDE 9

Transformation (Re-expression)

  • Some series need to be re-expressed.
  • Most commonly logarithms, sometimes square roots (especially with counted data).
  • Often re-expression improves stationarity, and other desirable features such as symmetry of distribution.
  • E.g., glacial varve thicknesses, Johnson and Johnson earnings.
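To see how a logarithm can improve stationarity, the R sketch below simulates a series with exponential growth and multiplicative noise (an assumed model, chosen only for illustration) and compares how much the spread of the first differences changes between the first and second half of the series, before and after taking logs. The helper spread_ratio() is a hypothetical function written for this example.

```r
# Simulated example (assumed values): exponential growth, multiplicative noise
set.seed(4)
n <- 200
x <- exp(0.03 * (1:n) + rnorm(n, sd = 0.1))

# Hypothetical helper: sd of first differences, second half over first half
spread_ratio <- function(z) {
  d    <- diff(z)
  half <- length(d) %/% 2
  sd(d[(half + 1):length(d)]) / sd(d[1:half])
}

ratio_raw <- spread_ratio(x)        # spread grows with the level of the series
ratio_log <- spread_ratio(log(x))   # roughly constant spread after taking logs
c(raw = ratio_raw, log = ratio_log)
```

A ratio near 1 indicates that the variability is about the same in both halves.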

SLIDE 10

Periodic Signals

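  • If a series is plausibly modeled as a cosine wave plus noise, we can fit

$$x_t = A\cos(2\pi\omega t + \phi) + w_t = (A\cos\phi)\cos(2\pi\omega t) - (A\sin\phi)\sin(2\pi\omega t) + w_t$$

by least squares.

  • If $\omega$ is known (e.g., $\omega = 1/12$ for an annual cycle in monthly data), this is a linear regression:

$$x_t = \beta_1\cos(2\pi\omega t) + \beta_2\sin(2\pi\omega t) + w_t,$$

with $\beta_1 = A\cos\phi$ and $\beta_2 = -A\sin\phi$.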
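A minimal sketch of this regression in R, on simulated monthly data with a known annual frequency. The amplitude, phase, noise level and series length are assumed values. The amplitude and phase are recovered from the fitted coefficients using $\beta_1 = A\cos\phi$ and $\beta_2 = -A\sin\phi$.

```r
# Simulated example (assumed values): annual cycle in monthly data plus noise
set.seed(5)
n     <- 240                 # 20 years of monthly observations
tt    <- 1:n
omega <- 1 / 12
A     <- 2; phi <- 0.6       # assumed true amplitude and phase
x     <- A * cos(2 * pi * omega * tt + phi) + rnorm(n)

c1  <- cos(2 * pi * omega * tt)
s1  <- sin(2 * pi * omega * tt)
fit <- lm(x ~ 0 + c1 + s1)   # no intercept, matching the two-term model
b   <- unname(coef(fit))     # b[1] estimates A cos(phi), b[2] estimates -A sin(phi)

A_hat   <- sqrt(b[1]^2 + b[2]^2)
phi_hat <- atan2(-b[2], b[1])
c(A_hat = A_hat, phi_hat = phi_hat)
```

For real data with a nonzero mean, an intercept would normally be kept in the model.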

SLIDE 11
  • If $\omega$ is of the form $j/n$ for integer $j$ with $0 < j < n/2$ ($n$ = series length), then

$$\hat\beta_1 = \frac{2}{n}\sum_{t=1}^{n} x_t\cos(2\pi t j/n), \qquad \hat\beta_2 = \frac{2}{n}\sum_{t=1}^{n} x_t\sin(2\pi t j/n).$$

  • For other $\omega$, use standard linear least squares regression.
  • If $\omega$ is unknown, either:
      – try all $\omega$s of the form $j/n$, plotting $\hat\beta_1(j/n)^2 + \hat\beta_2(j/n)^2$ against $j/n$ (the periodogram);
      – use non-linear least squares for other $\omega$.
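The closed-form coefficients and the resulting periodogram can be computed directly. The sketch below does so for a simulated cosine-plus-noise series (all parameter values are assumptions) and checks that the closed form agrees with lm() at the peak frequency. The ordinates follow the definition used above; other software, including spectrum(), may scale the periodogram differently.

```r
# Simulated example (assumed values): period-12 cosine plus noise
set.seed(6)
n  <- 240
tt <- 1:n
x  <- 2 * cos(2 * pi * tt / 12 + 0.6) + rnorm(n)

# Closed-form least squares coefficients at omega = j/n, 0 < j < n/2
js <- 1:(n / 2 - 1)
b1 <- sapply(js, function(j) 2 / n * sum(x * cos(2 * pi * tt * j / n)))
b2 <- sapply(js, function(j) 2 / n * sum(x * sin(2 * pi * tt * j / n)))
P  <- b1^2 + b2^2                # periodogram ordinates

j_peak <- js[which.max(P)]
j_peak / n                       # frequency with the largest ordinate

# The closed form agrees with lm() at a frequency of the form j/n
fit <- lm(x ~ 0 + cos(2 * pi * tt * j_peak / n) + sin(2 * pi * tt * j_peak / n))
all.equal(unname(coef(fit)), c(b1[j_peak], b2[j_peak]))

if (interactive()) plot(js / n, P, type = "h", xlab = "frequency j/n")
```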

SLIDE 12

# detrend global temperature using a quadratic fit
gtres = residuals(lm(globtemp ~ time(globtemp) + I(time(globtemp)^2)));
gtres = ts(gtres, start = start(globtemp));
par(mfcol = c(2, 1));
plot(gtres);

# use spectrum() to plot the periodogram of detrended global temperature
spectrum(gtres, log = "no");

SLIDE 13

Smoothing a Time Series

  • Smoothing a time series makes long-term behavior (low frequencies) more apparent. E.g., global temperature, Johnson and Johnson earnings.
  • Many types of smoother:
      – moving averages;
      – kernel smoothers;
      – lowess, supsmu, etc.;
      – smoothing splines.
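The smoothers listed above other than moving averages are all available in base R's stats package. The sketch below applies them to a simulated noisy series. The signal, the noise level, and the tuning values (bandwidth = 15, f = 0.2) are assumptions chosen for illustration, not recommendations.

```r
# Simulated example (assumed values): slow sinusoidal signal plus noise
set.seed(7)
n  <- 200
tt <- 1:n
x  <- sin(2 * pi * tt / 100) + rnorm(n, sd = 0.3)

ks <- ksmooth(tt, x, kernel = "normal", bandwidth = 15, x.points = tt)$y  # kernel smoother
lo <- lowess(tt, x, f = 0.2)$y                                            # lowess
su <- supsmu(tt, x)$y                                                     # Friedman's super smoother
ss <- predict(smooth.spline(tt, x), tt)$y                                 # smoothing spline

if (interactive()) {
  plot(tt, x, col = "grey")
  lines(tt, ks, col = "blue");   lines(tt, lo, col = "red")
  lines(tt, su, col = "orange"); lines(tt, ss, col = "purple")
}
```

Each smoother has a parameter controlling how much it smooths (bandwidth, span, or spline penalty); larger values emphasize lower frequencies.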

SLIDE 14

# Trailing yearly average J&J earnings
plot(jj)
lines(filter(jj, rep(1, 4)/4, sides = 1), col = "red")
title("Trailing 4-quarter averages")

# smooth global temperatures over a 30 year window
# (note half weight on end values)
plot(globtemp)
lines(filter(globtemp, c(.5, rep(1, 29), .5)/30), col = "red")
title("Centered 30 year averages")
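To make the two filters concrete, the sketch below applies the same weights to simulated series (assumed data, standing in for jj and globtemp) and checks what filter() computes. It calls stats::filter() explicitly, since other packages such as dplyr also define a function named filter().

```r
# Simulated stand-ins (assumed data) for a quarterly and an annual series
set.seed(8)
x <- ts(cumsum(rnorm(60)), start = c(2000, 1), frequency = 4)
y <- ts(cumsum(rnorm(100)))

# Trailing average: current value and the 3 previous ones
trail <- stats::filter(x, rep(1, 4) / 4, sides = 1)

# Centered average: 31 weights with half weight on the two end values
w31  <- c(0.5, rep(1, 29), 0.5) / 30
cent <- stats::filter(y, w31)            # sides = 2 (centered) is the default

sum(w31)                                 # the weights sum to one
all.equal(trail[4], mean(x[1:4]))        # 4th trailing value averages x[1..4]
sum(is.na(trail))                        # leading values with no full window
sum(is.na(cent))                         # missing values at both ends
```

The half weights make the 31-point window span exactly 30 years while staying symmetric about the center.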

SLIDE 15

Smoothing a Scatter Plot

  • Smoothing a scatter plot can also reveal behavior.
  • E.g. daily NYSE returns plotted against previous day.

SLIDE 16

# scatter plot of NYSE return against previous day,
# with lowess smooth
plot(nyse[-length(nyse)], nyse[-1], xlim = c(-0.02, 0.02), ylim = c(-0.02, 0.02))
lines(lowess(nyse[-length(nyse)], nyse[-1], f = 1/5), col = "red")
title("NYSE daily return against previous day")
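The same lagged scatter plot construction can be tried on simulated data. The sketch below uses an AR(1) series with coefficient 0.5 as a stand-in for returns. This is an assumption made so that the dependence is easy to see, and it says nothing about the actual NYSE series.

```r
# Simulated stand-in (assumed model): AR(1) "returns" with coefficient 0.5
set.seed(9)
n <- 1000
r <- as.numeric(arima.sim(model = list(ar = 0.5), n = n, sd = 0.01))

prev <- r[-n]                        # r_{t-1}
curr <- r[-1]                        # r_t
sm   <- lowess(prev, curr, f = 1/5)  # smooth of r_t against r_{t-1}

if (interactive()) {
  plot(prev, curr)
  lines(sm, col = "red")
}

# With positive dependence the smooth rises from left to right
sm$y[length(sm$y)] - sm$y[1]
```

Dropping the last value for the x-axis and the first value for the y-axis is what pairs each observation with its predecessor.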
