SLIDE 1 Exploratory Data Analysis (or Searching for Stationarity)
- When an observed time series appears stationary, we can
calculate its sample autocorrelations, and use them to decide
on a model.
- Many time series do not appear stationary; e.g., Johnson and
Johnson earnings, global temperature.
- Often we can find a way to relate one series to a different
series, for which stationarity is more plausible.
SLIDE 2 Trends and Detrending
- Some series can be modeled as
$x_t = \mu_t + y_t$, where $y_t$ is stationary.
- If $\mu_t$ has a parametric form, we can estimate it and subtract
it. That is, we use the residuals from a fitted trend.
- The form of trend might be linear, or a higher-degree polynomial,
or some other function suggested by theory.
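A minimal sketch of detrending, run on simulated data rather than the lecture's temperature series. The trend slope, the AR(1) noise, and every variable name below are assumptions chosen only for illustration.

```r
# Simulated example: linear trend plus stationary AR(1) noise.
# All numbers are illustrative assumptions, not values from the lecture.
set.seed(1)
n <- 200
y <- as.numeric(arima.sim(model = list(ar = 0.5), n = n, sd = 0.2))  # stationary y_t
x <- ts(0.01 * (1:n) + y, start = 1900)                              # x_t = mu_t + y_t

# Estimate the parametric trend by least squares and subtract it.
fit <- lm(x ~ time(x))
detrended <- ts(residuals(fit), start = start(x))

slope_hat <- unname(coef(fit)[2])   # estimate of the true slope, 0.01
```

The residual series `detrended` plays the role of the estimated $y_t$; its sample ACF can then be inspected with `acf(detrended)`.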
SLIDE 3
Example: 20th Century Global Temperature
lmg1900 = lm(g1900 ~ time(g1900));
plot(ts(residuals(lmg1900), start = 1900));
[Figure: residuals from the linear fit against time, 1900 to 2000]
SLIDE 4 Differencing
- Some series still appear nonstationary after detrending.
- E.g. the “trend” $\mu_t$ is a random walk with drift:
$\mu_t = \delta t + \sum_{j=1}^{t} w_j$.
Here $E(x_t) = \delta t$, but $x_t - E(x_t) = \sum_{j=1}^{t} w_j + y_t$, with a variance that grows with time.
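The growth of the variance can be made explicit under two extra assumptions that the slide does not state: the $w_j$ are uncorrelated with mean zero and common variance $\sigma_w^2$, and they are uncorrelated with $y_t$. The symbols $\sigma_w^2$ and $\gamma_y(0) = \operatorname{Var}(y_t)$ are introduced here for illustration.

```latex
\operatorname{Var}\bigl(x_t - E(x_t)\bigr)
  = \operatorname{Var}\Bigl(\sum_{j=1}^{t} w_j\Bigr) + \operatorname{Var}(y_t)
  = t\,\sigma_w^2 + \gamma_y(0)
```

Under these assumptions the variance increases linearly in $t$, so subtracting a deterministic trend cannot make the series stationary.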
SLIDE 5
- But now the first differences
$\nabla x_t = x_t - x_{t-1} = \delta + w_t + y_t - y_{t-1}$ are stationary.
- Define the backshift operator $B$ by $B x_t = x_{t-1}$.
- Then $\nabla x_t = (1 - B) x_t$.
- Also second differences
$\nabla^2 x_t = (1 - B)^2 x_t = x_t - 2x_{t-1} + x_{t-2}$,
etc. Easy for any positive integer $d$; possible for fractional $d$.
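The difference identities above can be checked numerically. This is a sketch on a simulated random walk with drift; the drift, the noise variances, and the choice of white noise for $y_t$ are all assumptions made for illustration.

```r
# Simulated random walk with drift plus stationary noise (illustrative values).
set.seed(2)
n <- 300
delta <- 0.1
w <- rnorm(n, sd = 1)            # random-walk increments w_t
y <- rnorm(n, sd = 0.5)          # stationary component y_t (white noise here)
x <- delta * (1:n) + cumsum(w) + y

# First difference: (1 - B) x_t = delta + w_t + y_t - y_{t-1}
dx <- diff(x)
first_ok <- all.equal(dx, delta + w[-1] + diff(y))

# Second difference: (1 - B)^2 x_t = x_t - 2 x_{t-1} + x_{t-2}
d2x <- diff(x, differences = 2)
second_ok <- all.equal(d2x, x[3:n] - 2 * x[2:(n - 1)] + x[1:(n - 2)])
```

Both checks compare R's `diff()` with the formulas written out by hand, so `first_ok` and `second_ok` confirm that `diff(x, differences = d)` applies $(1 - B)^d$.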
SLIDE 6 Example: 20th Century Global Temperature
plot(diff(g1900));
[Figure: diff(g1900) against time, 1900 to 2000]
- Both detrending and differencing give apparently stationary
results.
SLIDE 7
acf(diff(g1900));
[Figure: sample ACF of diff(g1900) against lag]
- Differencing has removed almost all auto-correlation.
SLIDE 8
acf(residuals(lmg1900))
[Figure: sample ACF of residuals(lmg1900) against lag]
- Removing the trend without differencing leaves more auto-correlation.
SLIDE 9 Transformation (Re-expression)
- Some series need to be re-expressed.
- Most commonly logarithms, sometimes square roots (especially
with counted data).
- Often re-expression improves stationarity, and other desirable
features such as symmetry of distribution.
- E.g. Glacial varve thicknesses, Johnson and Johnson earnings.
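A rough illustration of how taking logarithms can even out variability. It uses R's built-in `JohnsonJohnson` dataset as a stand-in for the lecture's earnings series; that the two hold the same figures is an assumption, and the helper `spread_ratio` is a hypothetical name introduced here.

```r
# Built-in quarterly J&J earnings per share (84 observations).
x <- JohnsonJohnson
lx <- log(x)

# Compare the spread of quarter-to-quarter changes late vs. early in the series.
half <- length(x) / 2
spread_ratio <- function(z) {
  d <- diff(as.numeric(z))
  sd(d[(half + 1):length(d)]) / sd(d[1:half])
}

ratio_raw <- spread_ratio(x)    # well above 1: variability grows with the level
ratio_log <- spread_ratio(lx)   # much closer to 1 after the log transform
```

A ratio near 1 suggests that the size of the fluctuations no longer depends on the level of the series, which is one of the features re-expression is meant to achieve.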
SLIDE 10 Periodic Signals
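- If a series is plausibly modeled as a cosine wave plus noise,
we can fit
$x_t = A\cos(2\pi\omega t + \phi) + w_t = (A\cos\phi)\cos(2\pi\omega t) - (A\sin\phi)\sin(2\pi\omega t) + w_t$
by least squares.
- If $\omega$ is known (e.g., $\omega = 1/12$ for an annual cycle in monthly
data), this is a linear regression:
$x_t = \beta_1\cos(2\pi\omega t) + \beta_2\sin(2\pi\omega t) + w_t$, with $\beta_1 = A\cos\phi$ and $\beta_2 = -A\sin\phi$.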
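A sketch of the known-frequency case on simulated monthly data. The amplitude, phase, noise level, and variable names are assumptions for illustration; the recovery of $A$ and $\phi$ uses $\beta_1 = A\cos\phi$ and $\beta_2 = -A\sin\phi$.

```r
# Simulated monthly series with an annual cycle (all values are illustrative).
set.seed(3)
n <- 240
tt <- 1:n
omega <- 1 / 12
A <- 2
phi <- 0.6
x <- A * cos(2 * pi * omega * tt + phi) + rnorm(n, sd = 1)

# Linear regression on cos and sin; no intercept because the model has none.
fit <- lm(x ~ 0 + cos(2 * pi * omega * tt) + sin(2 * pi * omega * tt))
b <- unname(coef(fit))

# Recover amplitude and phase from beta1 = A cos(phi), beta2 = -A sin(phi).
A_hat <- sqrt(b[1]^2 + b[2]^2)
phi_hat <- atan2(-b[2], b[1])
```

For real data with a nonzero mean, an intercept would normally be kept in the regression.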
SLIDE 11
- If $\omega$ is of the form $j/n$ for integer $j$ with $0 < j < n/2$ ($n$ = series length), then
$\hat\beta_1 = \frac{2}{n}\sum_{t=1}^{n} x_t\cos(2\pi t j/n)$, $\hat\beta_2 = \frac{2}{n}\sum_{t=1}^{n} x_t\sin(2\pi t j/n)$.
- For other $\omega$, use standard linear least squares regression.
- If $\omega$ is unknown, either:
  – try all $\omega$s of the form $j/n$, plotting $\hat\beta_1(j/n)^2 + \hat\beta_2(j/n)^2$ against $j/n$ (the periodogram);
  – use non-linear least squares for other $\omega$.
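The search over frequencies $j/n$ can be sketched directly from the closed-form coefficients. The series is simulated with a single cycle at a frequency of the form $j/n$; the amplitude, phase, and variable names are assumptions for illustration.

```r
# Simulated series with one cycle at frequency j0/n (illustrative values).
set.seed(4)
n <- 128
tt <- 1:n
j0 <- 10                                   # true frequency is j0 / n
x <- 3 * cos(2 * pi * tt * j0 / n + 1) + rnorm(n)

# Closed-form least-squares coefficients at each frequency j/n, 0 < j < n/2.
js <- 1:(n / 2 - 1)
beta1 <- sapply(js, function(j) 2 / n * sum(x * cos(2 * pi * tt * j / n)))
beta2 <- sapply(js, function(j) 2 / n * sum(x * sin(2 * pi * tt * j / n)))
P <- beta1^2 + beta2^2                     # squared amplitude at each j/n

j_peak <- js[which.max(P)]                 # frequency index with the largest value

# Check the closed form against lm() at the peak frequency.
fit <- lm(x ~ cos(2 * pi * tt * j_peak / n) + sin(2 * pi * tt * j_peak / n))
lm_coefs <- unname(coef(fit)[2:3])
```

Plotting `P` against `js / n`, for example with `plot(js / n, P, type = "h")`, gives the picture described in the bullet; the `lm()` fit agrees with the closed form because the cosine and sine regressors at these frequencies are orthogonal to each other and to the constant.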
SLIDE 12
# detrend global temperature using a quadratic fit
gtres = residuals(lm(globtemp ~ time(globtemp) + I(time(globtemp)^2)));
gtres = ts(gtres, start = start(globtemp));
par(mfcol = c(2, 1));
plot(gtres);
# use spectrum() to plot the periodogram of detrended global temperature
spectrum(gtres, log = "no");
SLIDE 13 Smoothing a Time Series
- Smoothing a time series makes long-term behavior (low frequencies)
more apparent. E.g. global temperature, Johnson and Johnson earnings.
  – moving averages;
  – kernel smoothers;
  – lowess, supsmu, etc.;
  – smoothing splines.
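The smoothers listed above are all available in base R's `stats` package. The sketch below applies four of them to a simulated noisy series; the signal, the noise level, and every tuning value (window length, bandwidth, span) are arbitrary assumptions, not recommendations.

```r
# Simulated noisy series with a slow underlying signal (illustrative values only).
set.seed(5)
n <- 200
tt <- 1:n
signal <- sin(2 * pi * tt / n)
xv <- signal + rnorm(n, sd = 0.5)
x <- ts(xv)

# Four smoothers from the stats package; tuning values are arbitrary choices.
ma <- stats::filter(x, rep(1, 21) / 21, sides = 2)          # centered moving average
ks <- ksmooth(tt, xv, kernel = "normal", bandwidth = 20)    # kernel smoother
lo <- lowess(tt, xv, f = 1 / 4)                             # lowess
ss <- smooth.spline(tt, xv)                                 # smoothing spline

plot(x, col = "grey")
lines(ma, col = "red")
lines(ks, col = "blue")
lines(lo, col = "darkgreen")
lines(ss, col = "purple")
```

The centered moving average is undefined for the first and last 10 points, which `filter()` reports as `NA`; the other three smoothers return a value at every time point.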
SLIDE 14
# Trailing yearly average J&J earnings
plot(jj)
lines(filter(jj, rep(1, 4)/4, sides = 1), col = "red")
title("Trailing 4-quarter averages")
# smooth global temperatures over a 30 year window
# (note half weight on end values)
plot(globtemp)
lines(filter(globtemp, c(.5, rep(1, 29), .5)/30), col = "red")
title("Centered 30 year averages")
SLIDE 15 Smoothing a Scatter Plot
- Smoothing a scatter plot can also reveal behavior.
- E.g. daily NYSE returns plotted against previous day.
SLIDE 16
# scatter plot of NYSE return against previous day,
# with lowess smooth
plot(nyse[-length(nyse)], nyse[-1], xlim = c(-0.02, 0.02), ylim = c(-0.02, 0.02))
lines(lowess(nyse[-length(nyse)], nyse[-1], f = 1/5), col = "red")
title("NYSE daily return against previous day")