Lecture 6: Discrete Time Series (9/21/2018)
Discrete Time Series
Stationary Processes
A stochastic process (i.e. a time series) is considered to be strictly stationary if the properties of the process are not changed by a shift in origin. In the time series context this means that the joint distribution of
$\{y_{t_1}, \ldots, y_{t_n}\}$ must be identical to the distribution of $\{y_{t_1+k}, \ldots, y_{t_n+k}\}$
for any value of $n$ and $k$.
Weak Stationarity
Strict stationarity is unnecessarily strong / restrictive for many applications, so instead we often opt for weak stationarity, which requires the following:
1. The process has finite variance
$E(y_t^2) < \infty$ for all $t$
2. The mean of the process is constant
$E(y_t) = \mu$ for all $t$
3. The second moment only depends on the lag
$\mathrm{Cov}(y_t, y_s) = \mathrm{Cov}(y_{t+k}, y_{s+k})$ for all $t, s, k$
When we say stationary in class we will almost always mean weakly stationary.
Autocorrelation
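For a stationary time series, where $E(y_t) = \mu$ and $\mathrm{Var}(y_t) = \sigma^2$ for all $t$, we define the autocorrelation at lag $k$ as
$$\rho_k = \mathrm{Cor}(y_t, y_{t+k}) = \frac{\mathrm{Cov}(y_t, y_{t+k})}{\sqrt{\mathrm{Var}(y_t)\,\mathrm{Var}(y_{t+k})}} = \frac{E\left((y_t - \mu)(y_{t+k} - \mu)\right)}{\sigma^2}$$
this is also sometimes written in terms of the autocovariance function ($\gamma_k$) as
$$\gamma_k = \gamma(t, t+k) = \mathrm{Cov}(y_t, y_{t+k})$$
$$\rho_k = \frac{\gamma(t, t+k)}{\sqrt{\gamma(t, t)\,\gamma(t+k, t+k)}} = \frac{\gamma(k)}{\gamma(0)}$$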
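As a rough illustration of the definition above, the following R sketch computes the sample autocorrelation at lag $k$ as $\gamma(k)/\gamma(0)$ and compares it with the built-in acf(). The helper names sample_acov and sample_acor, the seed, and the white-noise test series are assumptions made for this example; the 1/n divisor is the convention acf() uses.

```r
# Sample autocovariance at lag k, using the 1/n divisor (the convention used by acf())
sample_acov <- function(y, k) {
  n <- length(y)
  mu_hat <- mean(y)
  sum((y[1:(n - k)] - mu_hat) * (y[(1 + k):n] - mu_hat)) / n
}

# Sample autocorrelation: rho_k = gamma(k) / gamma(0)
sample_acor <- function(y, k) sample_acov(y, k) / sample_acov(y, 0)

set.seed(1)        # arbitrary seed, only for reproducibility
y <- rnorm(200)    # hypothetical test series (white noise)

by_hand  <- sapply(0:5, function(k) sample_acor(y, k))
built_in <- as.numeric(acf(y, lag.max = 5, plot = FALSE)$acf)

# The two columns should agree up to floating point error
print(round(cbind(lag = 0:5, by_hand, built_in), 4))
```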
Covariance Structure
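Our definition of a (weakly) stationary process implies a covariance matrix with the following structure,
$$
\boldsymbol{\Sigma} = \begin{pmatrix}
\gamma(0) & \gamma(1) & \gamma(2) & \gamma(3) & \cdots & \gamma(n-1) & \gamma(n) \\
\gamma(1) & \gamma(0) & \gamma(1) & \gamma(2) & \cdots & \gamma(n-2) & \gamma(n-1) \\
\gamma(2) & \gamma(1) & \gamma(0) & \gamma(1) & \cdots & \gamma(n-3) & \gamma(n-2) \\
\gamma(3) & \gamma(2) & \gamma(1) & \gamma(0) & \cdots & \gamma(n-4) & \gamma(n-3) \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
\gamma(n-1) & \gamma(n-2) & \gamma(n-3) & \gamma(n-4) & \cdots & \gamma(0) & \gamma(1) \\
\gamma(n) & \gamma(n-1) & \gamma(n-2) & \gamma(n-3) & \cdots & \gamma(1) & \gamma(0)
\end{pmatrix}
$$
(Notation used for the partial autocorrelation below: $P_{t,k}(y)$ is the projection of $y$ onto the space spanned by $y_{t+1}, \ldots, y_{t+k-1}$.)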
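A matrix with this banded, symmetric pattern is a Toeplitz matrix, and base R can build one from the first row. The sketch below is only illustrative: the autocovariance values in gamma are hypothetical numbers chosen for the example, not values from the lecture.

```r
# Hypothetical autocovariances gamma(0), gamma(1), gamma(2), gamma(3)
gamma <- c(2, 1, 0.5, 0.25)

# toeplitz() places gamma(|i - j|) in row i, column j
Sigma <- toeplitz(gamma)
print(Sigma)

# Dividing by gamma(0) gives the matching autocorrelation matrix
R <- Sigma / gamma[1]
print(R)
```

Every diagonal of Sigma is constant because the covariance between two observations depends only on the lag between them.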
Example - Random walk
Let $y_t = y_{t-1} + w_t$ with $y_0 = 0$ and $w_t \sim \mathcal{N}(0, 1)$.
[Figure: Random walk, y against t for t = 1 to 1000]
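A minimal sketch of how a series like this could be simulated in R. The seed and the data frame layout (columns t and y in an object named rw, matching the rw$y label on the plots) are assumptions, not the code used to make the slides.

```r
set.seed(20180921)   # arbitrary seed (assumption)
n <- 1000
w <- rnorm(n)        # w_t ~ N(0, 1)

# With y_0 = 0, y_t = y_{t-1} + w_t is the cumulative sum of the noise terms
rw <- data.frame(t = 1:n, y = cumsum(w))

plot(rw$t, rw$y, type = "l", xlab = "t", ylab = "y", main = "Random walk")
acf(rw$y, lag.max = 50)    # sample autocorrelation
pacf(rw$y, lag.max = 50)   # sample partial autocorrelation
```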
ACF + PACF
[Figure: rw$y series with its ACF and PACF, lags up to 50]
Stationary?
Is $y_t$ stationary?
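One way to approach the question for the random walk, sketched under the assumption that the $w_t$ are independent, is to write the process as a sum and check the weak stationarity conditions directly:

$$y_t = \sum_{i=1}^{t} w_i, \qquad E(y_t) = 0, \qquad \mathrm{Var}(y_t) = \sum_{i=1}^{t} \mathrm{Var}(w_i) = t, \qquad \mathrm{Cov}(y_t, y_{t+k}) = \mathrm{Var}(y_t) = t \quad (k \ge 0)$$

The mean is constant, but the variance and covariances grow with $t$ rather than depending only on the lag, so the third condition fails.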
Partial Autocorrelation - pACF
Given these types of patterns in the autocorrelation, we often want to examine the relationship between $y_t$ and $y_{t+k}$ with the (linear) dependence on the intermediate values $y_{t+1}$ through $y_{t+k-1}$ removed. This is done through the calculation of a partial autocorrelation ($\alpha(k)$), which is defined as follows:
$$
\begin{aligned}
\alpha(0) &= 1 \\
\alpha(1) &= \rho(1) = \mathrm{Cor}(y_t, y_{t+1}) \\
&\;\;\vdots \\
\alpha(k) &= \mathrm{Cor}\left(y_t - P_{t,k}(y_t),\; y_{t+k} - P_{t,k}(y_{t+k})\right)
\end{aligned}
$$
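The definition can be mimicked in a sample by regressing both end points on the intermediate values and correlating the residuals. The sketch below does this for $k = 2$ on a hypothetical AR(1) test series (the series, seed, and variable names are assumptions for the example). The result should be close to, but not identical to, the value from pacf(), which is computed from the sample autocorrelations rather than by regression.

```r
set.seed(6)                                            # arbitrary seed (assumption)
y <- as.numeric(arima.sim(list(ar = 0.7), n = 2000))   # hypothetical AR(1) test series

k <- 2
# embed() returns rows of the form (y_{t+2}, y_{t+1}, y_t)
e <- embed(y, k + 1)
y_lead <- e[, 1]   # y_{t+k}
y_mid  <- e[, 2]   # y_{t+1}, the only intermediate value when k = 2
y_base <- e[, 3]   # y_t

# Remove the linear dependence on the intermediate value from both end points
res_base <- resid(lm(y_base ~ y_mid))
res_lead <- resid(lm(y_lead ~ y_mid))

alpha_2_regression <- cor(res_base, res_lead)
alpha_2_builtin    <- pacf(y, lag.max = k, plot = FALSE)$acf[k]

print(c(regression = alpha_2_regression, builtin = alpha_2_builtin))
```

For an AR(1) series both numbers should be near zero, since all of the dependence between $y_t$ and $y_{t+2}$ passes through $y_{t+1}$.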
Example - Random walk with drift
Let $y_t = \delta + y_{t-1} + w_t$ with $y_0 = 0$ and $w_t \sim \mathcal{N}(0, 1)$.
[Figure: Random walk with trend, y against t for t = 1 to 1000]
ACF + PACF
[Figure: rwt$y series with its ACF and PACF, lags up to 50]
Stationary?
Is $y_t$ stationary?
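For the random walk with drift, a short calculation (again assuming independent $w_t$) suggests the answer:

$$y_t = \delta t + \sum_{i=1}^{t} w_i, \qquad E(y_t) = \delta t, \qquad \mathrm{Var}(y_t) = t$$

When $\delta \ne 0$ the mean changes with $t$, and the variance grows with $t$ for any $\delta$, so the constant-mean and lag-only covariance conditions both fail.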
Example - Moving Average
Let $w_t \sim \mathcal{N}(0, 1)$ and $y_t = w_{t-1} + w_t$.
[Figure: Moving Average, y against t for t = 1 to 100]
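A small R sketch of how this moving average series might be simulated. The seed and the object name ma with columns t and y (chosen to match the ma$y label on the plots) are assumptions for illustration.

```r
set.seed(13)          # arbitrary seed (assumption)
n <- 100
w <- rnorm(n + 1)     # w_0, w_1, ..., w_n, so that w_{t-1} exists for t = 1

# w[-1] is (w_1, ..., w_n) and w[-(n + 1)] is (w_0, ..., w_{n-1})
ma <- data.frame(t = 1:n, y = w[-1] + w[-(n + 1)])

plot(ma$t, ma$y, type = "l", xlab = "t", ylab = "y", main = "Moving Average")
acf(ma$y, lag.max = 50)
pacf(ma$y, lag.max = 50)
```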
ACF + PACF
[Figure: ma$y series with its ACF and PACF, lags up to 50]
Stationary?
Is $y_t$ stationary?
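For the moving average process the moments can be worked out directly, assuming the $w_t$ are independent:

$$E(y_t) = 0, \qquad \gamma(0) = \mathrm{Var}(w_{t-1} + w_t) = 2, \qquad \gamma(1) = \mathrm{Cov}(w_{t-1} + w_t,\; w_t + w_{t+1}) = 1, \qquad \gamma(k) = 0 \;\text{ for } k \ge 2$$

None of these depend on $t$, so all three weak stationarity conditions hold, and the autocorrelation is $\rho(1) = \gamma(1)/\gamma(0) = 1/2$ with $\rho(k) = 0$ for $k \ge 2$.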
Autoregressive
Let $w_t \sim \mathcal{N}(0, 1)$ and $y_t = y_{t-1} - 0.9 y_{t-2} + w_t$ with $y_t = 0$ for $t < 1$.
[Figure: Autoregressive, y against t for t = 1 to 500]
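The recursion above can be simulated with an explicit loop. This is a sketch only: the seed and the object name ar with columns t and y (matching the ar$y label on the plots) are assumptions.

```r
set.seed(16)    # arbitrary seed (assumption)
n <- 500
w <- rnorm(n)   # w_t ~ N(0, 1)
y <- numeric(n)

for (i in 1:n) {
  # y_t = 0 for t < 1, so missing lags at the start are treated as zero
  y_lag1 <- if (i > 1) y[i - 1] else 0
  y_lag2 <- if (i > 2) y[i - 2] else 0
  y[i] <- y_lag1 - 0.9 * y_lag2 + w[i]
}

ar <- data.frame(t = 1:n, y = y)

plot(ar$t, ar$y, type = "l", xlab = "t", ylab = "y", main = "Autoregressive")
acf(ar$y, lag.max = 50)
pacf(ar$y, lag.max = 50)
```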
ACF + PACF
[Figure: ar$y series with its ACF and PACF, lags up to 50]
Example - Australian Wine Sales
Australian total wine sales by wine makers in bottles <= 1 litre. Jan 1980 – Aug 1994.
aus_wine = readRDS("../data/aus_wine.rds")
aus_wine
## # A tibble: 176 x 2
##     date sales
##    <dbl> <dbl>
##  1 1980  15136
##  2 1980. 16733
##  3 1980. 20016
##  4 1980. 17708
##  5 1980. 18019
##  6 1980. 19227
##  7 1980. 22893
##  8 1981. 23739
##  9 1981. 21133
## 10 1981. 22591
## # ... with 166 more rows
Time series
[Figure: sales against date, 1980 to 1995]
Basic Model Fit
[Figure: sales against date, 1980 to 1995, with linear and quadratic model fits overlaid]
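A sketch of how linear and quadratic trend fits of this kind could be produced with lm(). Because aus_wine.rds is not available here, the data frame d is a synthetic stand-in with the same shape (176 monthly observations, decimal-year date, numeric sales); its values, the seed, and the model object names l_lin and l_quad are hypothetical. The residual column names follow the lin_resid and quad_resid labels on the slides.

```r
set.seed(20)   # arbitrary seed (assumption)

# Synthetic stand-in for aus_wine: 176 monthly observations starting Jan 1980
d <- data.frame(date = 1980 + (0:175) / 12)
yrs <- d$date - 1980
d$sales <- 15000 + 1500 * yrs - 40 * yrs^2 +
  3000 * sin(2 * pi * d$date) + rnorm(176, sd = 1500)

# Linear and quadratic trends in time; poly(date, 2) spans the same
# quadratic as date + I(date^2) but is numerically better behaved
l_lin  <- lm(sales ~ date, data = d)
l_quad <- lm(sales ~ poly(date, 2), data = d)

d$lin_resid  <- resid(l_lin)
d$quad_resid <- resid(l_quad)

plot(d$date, d$sales, type = "l", xlab = "date", ylab = "sales")
lines(d$date, fitted(l_lin), lty = 2)    # linear fit
lines(d$date, fitted(l_quad), lty = 3)   # quadratic fit
```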
Residuals
[Figure: lin_resid and quad_resid against date, 1980 to 1995, one panel per residual type]
Autocorrelation Plot
[Figure: d$quad_resid series with its ACF and PACF, lags up to 35]
[Figure: quad_resid against lag_value, one panel for each of lag1 through lag12]
Autoregressive errors
##
## Call:
## lm(formula = quad_resid ~ lag_12, data = d_ar)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -12286.5  -1380.5     73.4   1505.2   7188.1
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) 83.65080  201.58416   0.415    0.679
## lag_12       0.89024    0.04045  22.006   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2581 on 162 degrees of freedom
##   (12 observations deleted due to missingness)
## Multiple R-squared:  0.7493, Adjusted R-squared:  0.7478
## F-statistic: 484.3 on 1 and 162 DF,  p-value: < 2.2e-16
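The call shown in the output regresses each residual on its value 12 months earlier. A sketch of how the lagged column and the fit might be set up in base R follows. The data here are a synthetic stand-in with a 12-month pattern, and the seed and the way lag_12 is constructed are assumptions; only the names quad_resid, lag_12, d_ar, and l_ar come from the slides.

```r
set.seed(24)   # arbitrary seed (assumption)
n <- 176

# Synthetic stand-in for the quadratic-fit residuals, with a 12-month cycle
d <- data.frame(quad_resid = 4000 * sin(2 * pi * (1:n) / 12) + rnorm(n, sd = 1000))

# lag_12[t] holds quad_resid[t - 12]; the first 12 rows have no lagged value
d_ar <- d
d_ar$lag_12 <- c(rep(NA, 12), head(d$quad_resid, -12))

# lm() drops the 12 rows with NA by default, leaving 164 observations
l_ar <- lm(quad_resid ~ lag_12, data = d_ar)
print(summary(l_ar))
```

With 176 observations, 12 lost to the lag, and two estimated coefficients, the fit has 162 residual degrees of freedom, which matches the summary output above.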
Residual residuals
[Figure: resid against date, 1980 to 1995]
Residual residuals - acf
[Figure: l_ar$residuals series with its ACF and PACF, lags up to 35]
[Figure: lag plots, one panel for each of lag1 through lag12]