Time series: Decomposing a series into meaningful components
R.W. Oldford
Time series data - Sunspot activity
Recall the measurements of monthly mean relative sunspot numbers from 1749 to 1983 plotted as
[Figure: Monthly sunspot activity: sunspot numbers (0 to 200) vs date, 1750 to 1950]
The principal feature here was the rise and fall of sunspot activity roughly following an eleven year cycle.
Time series data - Sunspot activity
Working with the square root of the sunspot numbers, the plot can be coloured according to 11 year blocks:
[Figure: Sunspot activity: mean sqrt monthly sunspots vs time, 1750 to 1950, coloured points]
There are 22 different colours (randomized order here) each corresponding to a different 11 year cycle.
Time series data - Sunspot activity
And the putative 11 year cycles overlaid to see how well they match:
[Figure: 11 year sequences: mean sqrt monthly sunspots vs month number in 11 year sequence]
One reason we could overlay these is that the sunspot pattern is relatively consistent, or stationary, over time. It does not seem to be the case, for example, that the number of sunspots has grown (or declined) dramatically over the past few centuries. This is not always the case for time series.
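The overlaid cycles can be sketched directly from R's built-in sunspots series. This is a sketch, not the slides' code: matplot and the cut into complete 132-month blocks are my choices here.

```r
# Cut the square-root sunspot series into putative 11-year (132-month)
# cycles and overlay them; sunspots is R's built-in monthly series.
s <- sqrt(sunspots)                  # monthly, 1749-1983
block <- 11 * 12                     # 132 months per putative cycle
ncycles <- length(s) %/% block       # keep complete cycles only
m <- matrix(s[seq_len(ncycles * block)], nrow = block)
matplot(m, type = "l", lty = 1,
        xlab = "Month number in 11 year sequence",
        ylab = "Mean sqrt monthly sunspots")
```

Each column of m is one 11-year block, so matplot draws them on a common month axis.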
Atmospheric CO2 concentrations
Let’s consider another physical time series: the concentration of carbon dioxide, or CO2 (a “greenhouse gas”), over the Hawaii Island volcano Mauna Loa.
These measurements have been made since at least March 1958. The R monthly time series co2 contains atmospheric concentrations (in parts per million) over Mauna Loa of CO2 for the years 1959 to 1997. Three months (Feb., March, April, 1964) were missing and have simply had their values interpolated.
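The documentation only says those three months were interpolated. A sketch of how a linear fill works on a toy gap with approx() (the numbers here are made up, not the actual 1964 values):

```r
# Toy example: fill three missing monthly values by linear interpolation
x <- c(315, NA, NA, NA, 317)    # made-up values with a gap
ok <- !is.na(x)
filled <- approx(seq_along(x)[ok], x[ok], xout = seq_along(x))$y
filled                          # 315.0 315.5 316.0 316.5 317.0
```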
plot(co2)
[Figure: co2 concentration (320 to 360 ppm) vs time, 1960 to 1990]
What happens when you change the aspect ratio? (Try with zoom in RStudio, or with interactive loon plots as shown on the next slide.)
Atmospheric CO2 concentrations
We could again explore this interactively, using loon
# loon requires at least R 3.4.1
# install loon from CRAN
install.packages("loon")
# or follow install instructions at
# http://waddella.github.io/loon/install.html
library(loon)

# First a helper function to get nice dates out of a monthly time series
getDates <- function(ts) {
  mapply(FUN = function(x, y) paste(x, y),
         floor(time(ts)), month.name)
}

data <- co2
p <- l_plot(data, title = "Mauna Loa Atmospheric CO2",
            ylabel = "concentration (ppm)",
            xlabel = "Date",
            linkingGroup = "co2",
            color = "grey20",
            showScales = TRUE, showGuides = TRUE,
            itemLabel = getDates(data),
            showItemLabels = TRUE)

# Another way to get the x and y coordinates
xy.raw <- xy.coords(data)
l_layer_line(p, x = xy.raw$x, y = xy.raw$y,
             color = "grey50", linewidth = 3,
             index = "end", label = "connect the dots")
Unlike the sunspots data, there are clearly two patterns in the CO2 data: an overall rising trend, and a shorter pattern repeating within each year.
Atmospheric CO2 concentrations
We could try to capture these patterns by looking at averages:
◮ first over all months for each year
◮ second over all years for each month
and then plot these averages over time.
# First we need a couple of helper functions
# As before we need to get the years from our time-series object
# We can use time(...)
head(time(co2))
## [1] 1959.000 1959.083 1959.167 1959.250 1959.333 1959.417

getYears <- function(ts) {unique(floor(time(ts)))}
getYears(co2)
## [1] 1959 1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 1971 1972 1973
## [16] 1974 1975 1976 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988
## [31] 1989 1990 1991 1992 1993 1994 1995 1996 1997

# And get the number of the month.
# This will actually work more generally using frequency()
getMonthNos <- function(ts) {1:frequency(ts)}
getMonthNos(co2)
## [1] 1 2 3 4 5 6 7 8 9 10 11 12
Atmospheric CO2 concentrations
We can now compute the averages:
◮ first over all months for each year
◮ second over all years for each month
as follows:
getAves <- function(x, by = "year") {
  years <- getYears(x)
  monthNos <- getMonthNos(x)
  nyears <- length(years)
  nmonths <- length(monthNos)
  if (!(by %in% c("year", "month"))) {
    stop("unknown value for by = ", by,
         "; by must be one of {\"year\", \"month\"}")
  }
  if (by == "year") {
    aves <- sapply(1:nyears,
                   FUN = function(i) {
                     mean(window(x,
                                 start = c(years[i], monthNos[1]),
                                 end = c(years[i], monthNos[nmonths])))
                   })
    aves <- data.frame(aves = aves, row.names = years)
  } else {
    aves <- sapply(1:nmonths,
                   FUN = function(i) {
                     mean(window(x,
                                 start = c(years[1], monthNos[i]),
                                 end = c(years[nyears], monthNos[i]),
                                 frequency = 1))
                   })
    aves <- data.frame(aves = aves, row.names = month.abb[monthNos])
  }
  aves
}
Atmospheric CO2 concentrations
We now compute the averages: first by year over all months
# over months (by year)
head(getAves(co2, by = "year"))
##          aves
## 1959 315.8258
## 1960 316.7475
## 1961 317.4850
## 1962 318.2975
## 1963 318.8325
## 1964 319.4625
Now compute the averages: second by month over all years
# over years (by month)
head(getAves(co2, by = "month"))
##         aves
## Jan 336.4308
## Feb 337.2033
## Mar 338.0546
## Apr 339.2944
## May 339.8821
## Jun 339.3282
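As a cross-check (not the slides' code), the same two sets of averages drop out of base R's aggregate() for time series and tapply() with cycle():

```r
# Annual means: aggregate a monthly ts down to one value per year
annual <- aggregate(co2, FUN = mean)
head(annual)
# Monthly means over all years: group values by position within the year
monthly <- tapply(co2, cycle(co2), mean)
round(monthly, 4)
```

These should reproduce the getAves() values above (e.g. 315.8258 for 1959, 336.4308 for January).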
Atmospheric CO2 concentrations
And we can plot them
◮ first by year (over all months)
◮ second by month (over all years)
## over months (by year)
plot(getYears(co2), getAves(co2, by = "year")$aves,
     type="l", xlab="Year",
     ylab="Annual average over all months")
## over years (by month)
plot(getAves(co2, by = "month")$aves,
     type="l", xlab="Month", xaxt="n",
     ylab="Monthly average over all years")
axis(1, at=getMonthNos(co2), labels=month.abb)
Atmospheric CO2 concentrations
And we can plot them
◮ first by year (over all months)

[Figure: Annual average over all months (320 to 360 ppm) vs year, 1960 to 1990]
We have smoothed out all of the bumps given each season to reveal the overall trend in the data.
Atmospheric CO2 concentrations
And we can plot them
◮ second by month (over all years)

[Figure: Monthly average over all years (334 to 340 ppm) vs month, January to December]
We have smoothed out the data over all years to show the average seasonal pattern.
Atmospheric CO2 concentrations
In some sense, we have decomposed the series into two pieces: an overall trend (across years), and a seasonal (within year) pattern. We could plot these together with the data as follows.
# Get the number of years
nyears <- length(getYears(co2))
# Reduce white space between plots
savePar <- par(mfrow = c(3,1), mar = c(2, 4, 1.5, 0.5), oma = rep(0.1, 4))
# Original data
plot(co2, main = "Atmospheric CO2 (in ppm)")
# The trend (annual averages)
plot(getYears(co2), getAves(co2)$aves, type = "l", ylab = "Annual average")
# The seasonal pattern (month average over all years)
# We repeat the pattern here so that every season is shown (identically)
seasonalAves <- getAves(co2, by = "month")$aves
plot(time(co2), rep(seasonalAves, nyears), type = "l",
     ylab = "Monthly averages")
par(savePar)
Atmospheric CO2 concentrations
[Figure: three stacked panels: Atmospheric CO2 (in ppm), Annual average, and Monthly averages vs time, 1960 to 1990]
Atmospheric CO2 concentrations
There exist functions in R that will produce this decomposition of a time series for you.
# The default decomposition is an additive one
# (multiplicative also available)
co2_decomp <- decompose(co2)
# having structure
str(co2_decomp)
# which can be plotted as
plot(co2_decomp)
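As a cross-check on what decompose() is doing, its additive trend for a frequency-12 series is a centred moving average; this reconstruction with stats::filter is mine, based on the documented filter weights:

```r
# decompose()'s additive trend for monthly data is a centred 13-term
# moving average with half weight on the two end points
co2_decomp <- decompose(co2)
wts <- c(0.5, rep(1, 11), 0.5) / 12
trend_byhand <- filter(co2, wts)   # stats::filter, two-sided (centred)
# agreement (up to floating point) wherever both are defined
max(abs(trend_byhand - co2_decomp$trend), na.rm = TRUE)
```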
Atmospheric CO2 concentrations
[Figure: Decomposition of additive time series: observed, trend, seasonal (about -3 to 3), and random (about -0.5 to 0.5) panels vs time]

The three bottom pieces sum to the top one. Notice the y-axis labels on each panel and what this says about the range of possible values for each component.
Atmospheric CO2 concentrations
The visual decomposition is itself a model. Namely, that for time t

Y(t) = Trend(t) + Seasonality(t) + Random(t)

◮ the time series is visually decomposed into the sum of several smooth components and one “random” component.
◮ each smooth component is interpretable and captures an important feature of the series.
◮ the “random” component consists of the set of residuals from the fitted model; we expect to see no pattern in the residuals.

Questions:
1. The smooths are simple averages (over months within each year; over years within each month).
◮ As we did with density estimates, can we do better than simple averages?
2. This visual decomposition seems like a good idea:
◮ each piece summarizes important visual features of the data
◮ can it be used for more than time series? If so, how?
Summarizing a y on x relationship
[Figure: Relating y to x: scatterplot of y (-1.5 to 1.0) vs x (0 to 1)]
We often model the values yi as yi = µ(xi) + ri, where µ(x) is a simple summary of the yis at x = xi.
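The x and y used in the following slides are not shown in this extract. To make the later code self-contained, here is a hypothetical generator in the same spirit. The function mu, the seed, and the two-level shape are my assumptions; only n = 150 is suggested by the 148 residual degrees of freedom reported later.

```r
# Hypothetical stand-in for the slides' data: n = 150 points with a
# smooth two-level mu(x) plus N(0, 0.2^2) noise. This mu is assumed,
# not the slides' actual generator.
set.seed(314)
n <- 150
mu <- function(x) -1.5 + 2 * pnorm(x, mean = 0.5, sd = 0.1)
x <- runif(n)
y <- mu(x) + rnorm(n, sd = 0.2)
ourdata <- data.frame(x = x, y = y)
plot(ourdata$x, ourdata$y, col = "grey50", pch = 19,
     main = "Relating y to x")
```

The rise from the low to the high level sits roughly between x = 0.3 and x = 0.7, matching the shape described later in the slides.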
Summarizing a y on x relationship
For example, a straight-line model: µ(x) = α + βx, which can be fitted via least-squares. In R these estimates can be had immediately by using the linear-model fitting function lm(...):
ourdata <- data.frame(x=x, y=y)
#
# Fitting a straight line:
#
fit <- lm(y~x, data=ourdata)
summary(fit)
##
## Call:
## lm(formula = y ~ x, data = ourdata)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -1.01095 -0.21882  0.06357  0.19656  0.95727
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.39441    0.05215  -26.74   <2e-16 ***
## x            2.53435    0.09738   26.02   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3442 on 148 degrees of freedom
## Multiple R-squared: 0.8207, Adjusted R-squared: 0.8195
## F-statistic: 677.3 on 1 and 148 DF, p-value: < 2.2e-16
Summarizing a y on x relationship
# The coefficients are
fit$coefficients
## (Intercept)           x
##   -1.394412    2.534347

[Figure: Relating y to x: scatterplot with the fitted straight line]
Summarizing a y on x relationship
Could model this relationship as a more complicated function of x, say as

µ(x) = β0 + β1x + β2x^2 + ... + βpx^p

or any other linear (in the coefficients) model. Any of these can be fitted by least-squares.

Unfortunately, these models require us to specify a parametric form (e.g. a polynomial) that fits the data everywhere (globally, for all x).

Alternatively, we could try fitting many simple functions of x locally, different at every value of x. Connecting the fitted values together produces an estimated µ(x). For example, while we might not be willing to have one line fit all points, we might be willing to have different lines fitted in separate (and contiguous) regions of x. That is, we could fit lines locally within each region of x.

We can fit locally by using weighted least squares, which minimizes

Σ_{i=1}^{n} wi(x) (yi − µ(xi))^2

where wi(x) depends on the location x where we are fitting µ(x). We fit µ(x) for every x in the range of the data. We could also make the weight function wi(x) equal to 1 for those xi near x and 0 for those far away. In this way, the weights determine which xi values contribute to fitting µ(x) and which do not. The smaller these regions are, the less structure we impose on the underlying function µ(x).
Summarizing a y on x relationship
This can be illustrated through a series of pictures. First we need some unit weight function.

# We can see how the local fits behave as we change various elements.
# xloc is the location where we are fitting;
# proportionRange stands for the proportion of the x range
# which determines what is meant by local.
# All weights are 1 or 0.
unitWeight <- function(xloc, x, proportionRange = 0.5) {
  xhalf <- diff(range(x)) * proportionRange/2
  wts <- rep(0, length(x))
  wts[x < xloc + xhalf & x > xloc - xhalf] <- 1
  wts
}

# We have our data with x and y as before and need weights
#
xloc <- 0.5
weights <- unitWeight(xloc, x, proportionRange = 1)
# And use these to fit a line
fitw <- lm(y ~ x, data = ourdata, weights = weights)

We could now plot the data again and the fitted line. We do this on the next slide, showing as well which points contributed (by having non-zero weight) to the fit and which did not.
Summarizing a y on x relationship
Here, we are estimating µ(x) at the point x = 0.5 and nearly all points are contributing equally to this estimate. (Why not all?)
[Figure: Relating y to x: local fit at x = 0.5 with proportionRange = 1, contributing points highlighted]
We can reduce the number of points contributing by reducing the value of proportionRange.
Summarizing a y on x relationship
proportionRange = 0.5
[Figure: Relating y to x: local fit at x = 0.5 with proportionRange = 0.5]
Summarizing a y on x relationship
proportionRange = 0.2
[Figure: Relating y to x: local fit at x = 0.5 with proportionRange = 0.2]
Summarizing a y on x relationship
To get a complete µ(x), choose a value for proportionRange, determine the fit at all values x, and “connect the dots”. This is a moving average estimate. This locally weighted least squares fit is called “lowess” or “loess”.

As with kernel density estimation, we might consider using a weight function that diminishes more smoothly as xi gets farther away from the x location of interest. For example, we could again use a Gaussian density as a weight function.

# Construct a weight function shaped like a Gaussian
# or Normal density.
GaussWeight <- function(xloc, x, h=1) {
  # Normal density
  dnorm(x, mean=xloc, sd=h)
}

As before, we can look at the weights and the resulting fits.
Summarizing a y on x relationship
Gaussian weights: h=0.4 (σ = h)
[Figure: Relating y to x: local fit with Gaussian weights, h = 0.4]
Summarizing a y on x relationship
Gaussian weights: h=0.2 (σ = h)
[Figure: Relating y to x: local fit with Gaussian weights, h = 0.2]
Summarizing a y on x relationship
Gaussian weights: h=0.1 (σ = h)
[Figure: Relating y to x: local fit with Gaussian weights, h = 0.1]
Summarizing a y on x relationship
We can see how the entire µ(x) is constructed by moving the x location from left to right at selected points across the range of the xis.
[Figure: nine panels showing the local fits as the x location moves from left to right across the range]
Summarizing a y on x relationship
When we connect the dots (only nine of them), we get a fitted curve for µ(x):
[Figure: Relating y to x: fitted curve connecting the nine locally fitted values]
We could repeat this for different values of h.
Summarizing a y on x relationship
Halving the span, or bandwidth:
[Figure: nine panels of local fits with the bandwidth halved]
Summarizing a y on x relationship
And the entire µ(x):
[Figure: Relating y to x: the entire fitted µ(x) with the halved bandwidth]
Summarizing a y on x relationship
Halving the span, or bandwidth, again:
[Figure: nine panels of local fits with the bandwidth halved again]
Summarizing a y on x relationship
And the entire µ(x):
[Figure: Relating y to x: the entire fitted µ(x) with the bandwidth halved again]
Summarizing a y on x relationship
Let’s look at this last bandwidth with a few more points to get a more complete µ(x)
[Figure: twenty-five panels of local fits at a finer grid of x locations]
Summarizing a y on x relationship
And the entire µ(x):
[Figure: Relating y to x: the fitted µ(x) from the finer grid of locations]
So the thing to do is make a function that just does this.
Summarizing a y on x relationship - putting it all together
Here is one such function that will produce a smooth curve from minimizing the Locally Weighted Sum of Squares, or LoWeSS.

# Given a weight function the corresponding
# weighted least squares estimate at any point(s) x
# is easily constructed.
#
# It requires:
# x, y     - the data
# xloc     - x locations at which the estimate
#            is to be calculated
# span     - a bandwidth
# weightFn - a weighting function, default will be GaussWeight
# nlocs    - number of equi-spaced locations at which the estimate
#            will be calculated if xloc=NULL, ignored otherwise.
#
# CODE GIVEN ON NEXT SLIDE
Summarizing a y on x relationship - putting it all together
Here is one such function that will produce a smooth curve from minimizing the Locally Weighted Sum of Squares, or LoWeSS.
LoWeSS <- function(x, y, xloc = NULL, span = 0.1,
                   weightFn = GaussWeight, nlocs = 200) {
  data <- data.frame(x = x, y = y)
  xrange <- extendrange(x)
  bandwidth <- (diff(xrange) * span)/4
  if (is.null(xloc)) {
    xloc <- seq(xrange[1], xrange[2], length.out = nlocs)
  }
  estimates <- vector("numeric", length = length(xloc))
  for (i in 1:length(xloc)) {
    weights <- weightFn(xloc[i], x, bandwidth)
    fit <- lm(y ~ x, data = data, weights = weights)
    pred <- predict(fit, newdata = data.frame(x = xloc[i]))
    estimates[i] <- pred
  }
  # return the estimates
  list(x = xloc, y = estimates)
}
Summarizing a y on x relationship - putting it all together
We can now apply this function to ourdata
smooth <- LoWeSS(x, y, nlocs = 1000, span = 0.75)
plot(x, y, col="grey50", pch=19, main = "Our data")
lines(smooth$x, smooth$y, col="steelblue", lwd=3)
[Figure: Our data: scatterplot with the LoWeSS smooth overlaid]
Summarizing a y on x relationship
Now ourdata was actually generated as Yi = µ(xi) + Ri with Ri ∼ N(0, 0.2). So we are in the completely artificial position of being able to compare the fitted smooth with the true µ(xi) that was used to produce these data.

# The data
plot(x, y, col="grey50", pch=19, main = "Relating y to x")
#
# The LoWeSS estimate
#
lines(smooth$x, smooth$y, col="steelblue", lwd=3)
#
# And the true mu(x)
#
xordered <- sort(x)
lines(xordered, mu(xordered), col="firebrick", lwd=3, lty=2)
Summarizing a y on x relationship
Comparing the fitted smooth with the true µ(xi) that was used to produce these data.
[Figure: Relating y to x: the LoWeSS smooth (solid) and the true µ(x) (dashed)]
Summarizing a y on x relationship
What we have constructed here is a naive locally weighted sum of squares estimator. Clearly, there are some difficulties that require attention, such as:
◮ choosing the bandwidth. What value? Also, ours only looked at x locations; another approach might choose a proportion of the nearest x values.
◮ what weight function? Perhaps a more robust choice that actually gives zero weight to points that are far away.
◮ what about the ends? It seems that you can only estimate using data from one side. Should that affect the choice of bandwidth there? Or the weight function?
Summarizing a y on x relationship
loess: locally weighted sums of squares
In R there is a function called loess that fits a locally weighted sum of squares estimate that pays a little more attention to some of these problems. As its default weight function loess uses Tukey’s tri-cube weight

K(w) = (1 − |w|^3)^3 for w ∈ [−1, 1], and K(w) = 0 otherwise.

[Figure: Weight functions: tricube and Gaussian weights vs w]

This is generally more resistant to outlying xis than, say, the Gaussian weight function. It is also not restricted to fitting local lines. loess can fit any degree polynomial locally (though typically only degree 1 or 2 is used in practice).
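The tri-cube weight is easy to transcribe directly. A sketch only: inside loess the argument is first rescaled by the local bandwidth before this kernel is applied.

```r
# Tukey's tri-cube weight: (1 - |w|^3)^3 on [-1, 1], zero outside
tricube <- function(w) ifelse(abs(w) <= 1, (1 - abs(w)^3)^3, 0)
tricube(c(0, 1, 2))   # 1 at the centre, 0 at and beyond the edge
curve(tricube, from = -2, to = 2, ylab = "weight")
```

Because the weight reaches exactly zero at |w| = 1, far-away points get no influence at all, unlike the Gaussian weight which never quite vanishes.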
Summarizing a y on x relationship
loess: locally weighted sums of squares For example,
fit <- loess(y~x, data=ourdata)
plot(x, y, col="grey50", pch=19, main = "Relating y to x")
#
# Get the values predicted by the loess
pred <- predict(fit)
#
# Get the order of the x
ord <- order(x)
#
lines(x[ord], pred[ord], lwd=2, col="steelblue")
Summarizing a y on x relationship
loess: locally weighted sums of squares
[Figure: Relating y to x: default loess fit overlaid on the data]
Summarizing a y on x relationship
We could also reduce the span being used here (default is 0.75):
fit <- loess(y~x, data=ourdata, span=0.15)

[Figure: Relating y to x: loess fit with span = 0.15]
Using spans to decompose - large to small
Small spans follow detailed and very local patterns; large spans follow large and more general patterns. This suggests that we might use different spans to find different features. In fact, if we are careful about it, we might separate these features into different components.

Suppose we first fit a smooth with a large span. Call this fitted smooth s_Large(x). Then we can imagine that we have a fitted model value at each data point:

y = s_Large(x) + r_Large

For each point i (i = 1, . . . , n) there will then be a corresponding estimated residual that represents what remains in yi that has not been explained by the large scale smooth s_Large(x):

r_Large(i) = yi − s_Large(xi)

From these estimated residuals, we might consider fitting another smooth with a smaller span. That is, we now fit

r_Large = s_Small(x) + r_Small

Here’s how this could proceed for our data.
Using spans to decompose
# First fit the large span
fit_Large <- loess(y~x, data=ourdata, span=0.75)
[Figure: s_Large fit of y on x: the data with the large-span (0.75) loess smooth]
Using spans to decompose
The fitted smooth tells the story of the large scale (low-frequency) changes seen in the data. As the fit indicates, there appear to be two levels in y, with a rise from low to high y in the range of x from about 0.3 to 0.7. The residuals from this fit might now be explored for any smaller scale (higher frequency) data patterns.
# From this fit we extract the estimated residuals
r_Large <- fit_Large$residuals
# This is another data set that we can plot
#
plot(x, r_Large, col="grey50", pch=19,
     main = "s_Small fit of r_Large on x")
# And we can fit this with a smaller span
data_resids <- data.frame(r = r_Large, x = x)
fit_Small <- loess(r~x, data=data_resids, span=0.2)
# Get the values predicted by the loess
pred_Small <- predict(fit_Small)
# Need the order of x
ord <- order(x)
lines(x[ord], pred_Small[ord], lwd=2, col="steelblue")
Using spans to decompose
[Figure: s_Small fit of r_Large on x: the residuals with the small-span smooth overlaid]
Using spans to decompose
Note that the range of these residuals, r_Large(i), will necessarily be much less than that of the original observations yi if the large scale patterns were substantive.

Secondly, note that the dominant pattern here is a sinusoidal one in the centre of the x range, again from about x = 0.3 to x = 0.7 (the transition section detected by the first fitted smooth). The remaining residuals should also be plotted and examined for any substantive patterns.
#
# Plot the final remaining residuals
#
r_Remains <- fit_Small$residuals
plot(x, r_Remains, col="grey50", pch=19,
     main = "Remaining residuals on x")
# Which we could fit again with an even smaller span
#
data_remains <- data.frame(r = r_Remains, x = x)
fit_Tiny <- loess(r~x, data=data_remains, span=0.15)
# Get the values predicted by the loess
pred_Tiny <- predict(fit_Tiny)
#
lines(x[ord], pred_Tiny[ord], lwd=2, col="steelblue")
We are hoping for no major patterns remaining in these residuals.
Using spans to decompose
[Figure: Remaining residuals on x: residuals from the small-span fit, with an even smaller-span smooth overlaid]
Using spans to decompose
Putting together the various fitted smooths we have a decomposition of the observed values as
opt <- par(mfrow=c(3,1), mar=c(0,4,4,2) + 0.1)
# predictions from the large span fit (needed below)
pred_Large <- predict(fit_Large)
plot(x[ord], pred_Large[ord], xlab="", ylab="", xaxt="n",
     col="grey50", pch=19, cex=0.5, main = "Large span smooth")
# add a range of residuals rectangle
xleft <- 1.01
xright <- 1.02
yDelta <- sd(r_Remains)
rectCol <- adjustcolor("red", 0.5)
ymiddle <- mean(pred_Large)
rect(xleft, ymiddle - yDelta, xright, ymiddle + yDelta,
     border="grey50", col=rectCol)

plot(x[ord], pred_Small[ord], xlab="", ylab="", xaxt="n",
     col="grey50", pch=19, cex=0.5, main = "Small span smooth")
# add the same range rectangle
ymiddle <- mean(pred_Small)
rect(xleft, ymiddle - yDelta, xright, ymiddle + yDelta,
     border="grey50", col=rectCol)

plot(x[ord], r_Remains[ord], xlab="", ylab="", xaxt="n",
     col="grey50", pch=19, cex=0.5, main = "Remaining residuals")
abline(h=0, col="lightgrey")
# add the same range rectangle
ymiddle <- mean(r_Remains)
rect(xleft, ymiddle - yDelta, xright, ymiddle + yDelta,
     border="grey50", col=rectCol)
par(opt)
Using spans to decompose
Putting together the various fitted smooths we have a decomposition of the observed values as
[Figure: three panels: Large span smooth, Small span smooth, and Remaining residuals, each with its range rectangle]
Using spans to decompose - small to large?
What happens if we start with the small span and then move to a larger one?
# First fit the small span
fit_Small <- loess(y~x, data=ourdata, span=0.3)
#
# Get the values
pred_Small <- predict(fit_Small)
r_Small <- fit_Small$residuals
data_resids <- data.frame(r = r_Small, x = x)
# Now fit the larger span
fit_Large <- loess(r~x, data=data_resids, span=0.75)
#
# Get the values from the fit
pred_Large <- predict(fit_Large)
r_Remains <- fit_Large$residuals
data_remains <- data.frame(r = r_Remains, x = x)
Then plot the results as before.
Using spans to decompose - small to large?
What happens if we start with the small span and then move to a larger one?
[Figure: three panels: Small span first, Then large span (note its tiny vertical range, about -0.005 to 0.015), and Remaining residuals]
Using spans to decompose
Note the size of the range rectangle for the various spans. The small span is now capturing both low and high frequency variation. That which remains for the large span to capture has a very small vertical range. For decomposition, large spans need to go first.

An iterative process is typically followed. First the large span is removed, then the small. If we denote a large span smooth by L(x) and a small span smooth by S(x), then we could proceed as follows. Begin with an initial estimate

y = L_0(x) + r_L^0
r_L^0 = S_0(x) + r_S^0

Then, given estimates L_i(x), S_i(x), etc.,

y − S_i(x) = L_{i+1}(x) + r_L^{i+1}
y − L_{i+1}(x) = S_{i+1}(x) + r_S^{i+1}

One could iterate back and forth over these, alternately getting a large span and a small span smooth, until convergence.
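A minimal sketch of that alternating iteration, assuming data vectors x and y and using loess for both smooths. The spans, the fixed iteration count, and the function name backfit are my choices, not from the slides.

```r
# Alternate a large-span and a small-span loess, each refit to what
# the other component has not yet explained; return both smooth
# components and the final residual.
backfit <- function(x, y, spanL = 0.75, spanS = 0.2, niter = 5) {
  d <- data.frame(x = x, y = y)
  sS <- rep(0, length(y))          # start with no small-span smooth
  for (i in seq_len(niter)) {
    # y - S_i(x) = L_{i+1}(x) + residual
    sL <- predict(loess(I(y - sS) ~ x, data = d, span = spanL))
    # y - L_{i+1}(x) = S_{i+1}(x) + residual
    sS <- predict(loess(I(y - sL) ~ x, data = d, span = spanS))
  }
  list(large = sL, small = sS, residual = y - sL - sS)
}
```

A fixed number of passes stands in for a real convergence check; in practice one would stop when successive smooths change negligibly.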
Decomposing time series
Recall the co2 time series:
saveOptions <- options(width = 100)
window(co2, start=c(1959,1), end=c(1963, 12))
##         Jan    Feb    Mar    Apr    May    Jun    Jul    Aug    Sep    Oct    Nov    Dec
## 1959 315.42 316.31 316.50 317.56 318.13 318.00 316.39 314.65 313.68 313.18 314.66 315.43
## 1960 316.27 316.81 317.42 318.87 319.87 319.43 318.01 315.74 314.00 313.68 314.84 316.03
## 1961 316.73 317.54 318.38 319.31 320.42 319.61 318.42 316.63 314.83 315.16 315.94 316.85
## 1962 317.78 318.40 319.53 320.42 320.85 320.45 319.45 317.25 316.11 315.27 316.53 317.53
## 1963 318.58 318.92 319.70 321.22 322.08 321.31 319.58 317.61 316.05 315.83 316.91 318.20
options(saveOptions)
We decomposed this series into a “trend” component, a “seasonal” component, and residual component (which hopefully is patternless) in at least two ways.
1. The first way involved computing
◮ the average for each row (i.e. for each year across months), providing an estimate of the trend, and
◮ the average for each column (i.e. for each month over all years), providing a seasonality estimate.

However these do not sum to the data. We could perhaps subtract the annual average from each month for that year, and then average those differences to get a seasonal component that we could add to the trend. The residuals would then be the difference between the data and the sum of these two components. Note that the seasonality is identical for all years.
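That adjustment (subtract each year's average, then average the differences by month) can be sketched in a few lines; the intermediate names here are mine:

```r
# Seasonal component from the adjusted first method: remove each
# year's mean, then average the de-trended values by month.
yr <- floor(time(co2))
yearly_mean <- ave(as.numeric(co2), yr)   # each year's mean, repeated
seasonal <- tapply(as.numeric(co2) - yearly_mean, cycle(co2), mean)
round(seasonal, 2)
# by construction the seasonal effects average to zero over a year
mean(seasonal)
```

Because each year's deviations from its own mean sum to zero, the twelve seasonal effects average to zero, so trend plus seasonal now reproduces the data up to the residual.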
Decomposing time series
2. The second way was to call the decompose function

fit <- decompose(co2)

This separates the time series into the three components in the same way as the first except that
◮ the trend is determined by a moving average (as wide as the frequency + 1) of values on either side of the month being estimated,
◮ the window for the averaging is symmetric about the month in question, and
◮ the average uses equal weights except at the outermost points (which each get half the weight).

The seasonal and residual components are defined as with the first method.

Given that we have the loess smoother, we might have a third approach:

3. stl - seasonal trend decomposition by loess
◮ instead of simply averaging over years for each month and then using that average for that month for every year, we could smooth the values for that month over all years and use the smoothed values as the estimate for that month for each year.
◮ similarly, instead of a simple moving average of the years for the trend component, we could smooth the years (after the seasonality component has been removed).
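The periodic variant of that third approach is a one-liner in R; the call with s.window = "periodic" is my inference for the figure that follows, based on the later remark that "periodic" is the alternative to a numeric s.window.

```r
# Seasonal-trend decomposition by loess; "periodic" forces the
# seasonal component to repeat identically every year
co2_stl <- stl(co2, s.window = "periodic")
colnames(co2_stl$time.series)   # "seasonal" "trend" "remainder"
plot(co2_stl)
```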
Decomposing time series - stl
[Figure: stl decomposition of co2: data, seasonal (about -3 to 3), trend, and remainder (about -0.5 to 0.5) panels vs time, with range bars at right]
Decomposing time series - stl
Note that range bars appear on the right of each component’s plot. These are all of the same length in data coordinates (here in ppm). Note also that the seasonal component is identically repeated across the years. This is because it is simply an average (for each month) across all years.

This can be generalized by giving an odd integer value to the argument s.window rather than the term “periodic”. This numerical value is the span of the loess window (in lags). We can also specify whether we would like to fit averages over this window with s.degree=0, or fit a local line with s.degree=1. For example,

plot(stl(co2, s.window=7, s.degree=1))
Decomposing time series - stl
[Figure: stl of co2 with s.window=7, s.degree=1: data, seasonal, trend, and remainder panels]
Decomposing time series - stl
The seasonal components now vary a little more over the years. There are similar arguments t.window and t.degree for the loess smoothing of the “trend” component.

plot(stl(co2, s.window=7, s.degree=1, t.degree=1))
Decomposing time series - stl
[Figure: stl of co2 with s.window=7, s.degree=1, t.degree=1: data, seasonal, trend, and remainder panels]
Interactive plots
Exploration of time series through the interactive graphics available from the package loon.

library("loon")
l_plot(stl(co2, s.window=7, s.degree=1, t.degree=1))
# Or
demo("l_timeseries")
CO2 updated
Up to date measurements
[Figure: up to date CO2 measurements (320 to 400+ ppm) vs time, 1960 to the 2010s]
Decomposing time series - stl
Back to the sunspots.
◮ One of the strengths of stl is that it identifies the “seasonal” component of the decomposition with the built-in frequency of the series.
◮ The weakness of this is with a series like the sunspots data, whose frequency is 12 (being monthly) but whose repeating cycle is expected to have a length of 11 years!

For example, we can apply stl to sunspots as

plot(stl(sunspots, s.window=7, s.degree=1, t.degree=1))
# or
# l_plot(stl(sunspots, s.window=7, s.degree=1, t.degree=1))
Decomposing time series - stl
[Figure: stl decomposition of sunspots: data, seasonal, trend, and remainder panels, 1750 to 1950]
Comments?
Decomposing time series - stl
To try to get at the 11 year cycle, we would have to change the frequency of the sunspots series from 12 months to 11 years, or 132 months.

newsunspots <- ts(sunspots, frequency = 132)
plot(stl(newsunspots, s.window=7, s.degree=1, t.degree=1))
Decomposing time series - stl
[Figure: stl decomposition of newsunspots (frequency 132): data, seasonal, trend, and remainder panels over the 21 cycles]