

  1. Lecture 6 Discrete Time Series 9/21/2018

  2. Discrete Time Series

  3. Stationary Processes: A stochastic process (i.e. a time series) is considered to be strictly stationary if the properties of the process are not changed by a shift in origin. In the time series context this means that the joint distribution of {y_{t_1}, …, y_{t_n}} must be identical to the distribution of {y_{t_1 + k}, …, y_{t_n + k}} for any value of n and k.


  5. Weak Stationarity: Strict stationarity is unnecessarily strong / restrictive for many applications, so instead we often opt for weak stationarity, which requires the following:
     1. The process has finite variance: E(y_t^2) < ∞ for all t.
     2. The mean of the process is constant: E(y_t) = μ for all t.
     3. The second moment only depends on the lag: Cov(y_t, y_s) = Cov(y_{t+k}, y_{s+k}) for all t, s, k.
  When we say stationary in class we will almost always mean weakly stationary.


  7. Autocorrelation: For a stationary time series, where E(y_t) = μ and Var(y_t) = σ^2 for all t, we define the autocorrelation at lag k as

     ρ_k = Cor(y_t, y_{t+k})
         = Cov(y_t, y_{t+k}) / √(Var(y_t) Var(y_{t+k}))
         = E((y_t − μ)(y_{t+k} − μ)) / σ^2

  This is also sometimes written in terms of the autocovariance function, γ_k = γ(t, t + k) = Cov(y_t, y_{t+k}) = γ(k), as

     ρ_k = γ(t, t + k) / √(γ(t, t) γ(t + k, t + k)) = γ(k) / γ(0)
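The estimator behind the ACF plots on the following slides can be sketched in a few lines. This is a minimal illustration in Python (the slides themselves use R); the helper name `acf` and the toy series are my own, not from the lecture.

```python
def acf(y, max_lag):
    """Sample autocorrelation rho_hat(k), k = 0..max_lag: the sample
    autocovariance at lag k divided by the lag-0 autocovariance."""
    n = len(y)
    mean = sum(y) / n

    def gamma_hat(k):
        # (1/n) * sum of (y_t - mean)(y_{t+k} - mean) over the n - k pairs
        return sum((y[t] - mean) * (y[t + k] - mean) for t in range(n - k)) / n

    g0 = gamma_hat(0)
    return [gamma_hat(k) / g0 for k in range(max_lag + 1)]

# A period-4 series: strong positive correlation at lag 4, negative at lag 2.
y = [0.0, 1.0, 0.0, -1.0] * 25
print(acf(y, 4))  # rho_hat(0) = 1 by construction
```

Dividing by n (rather than n − k) is the conventional choice because it keeps the implied covariance matrix positive semi-definite.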


  9. Covariance Structure: Based on our definition of a (weakly) stationary process, it implies a covariance of the following structure,

     Σ = | γ(0)     γ(1)     γ(2)    ...  γ(n−1) |
         | γ(1)     γ(0)     γ(1)    ...  γ(n−2) |
         | γ(2)     γ(1)     γ(0)    ...  γ(n−3) |
         |  ...      ...      ...    ...   ...   |
         | γ(n−1)   γ(n−2)   γ(n−3)  ...  γ(0)   |
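As a sketch of how this structure can be materialized (Python here, while the slides use R), the helper below fills an n x n matrix from an autocovariance function; the MA(1) autocovariances used as an example anticipate the moving-average process later in the deck.

```python
def stationary_cov(gamma, n):
    """Covariance matrix implied by weak stationarity:
    the (i, j) entry depends only on the lag |i - j|."""
    return [[gamma(abs(i - j)) for j in range(n)] for i in range(n)]

# Autocovariances of y_t = w_{t-1} + w_t with Var(w_t) = 1:
# gamma(0) = 2, gamma(1) = 1, gamma(k) = 0 for k > 1.
g = lambda k: {0: 2.0, 1: 1.0}.get(k, 0.0)
Sigma = stationary_cov(g, 4)
for row in Sigma:
    print(row)  # a symmetric Toeplitz (here tridiagonal) matrix
```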

  10. Example - Random walk: Let y_t = y_{t−1} + w_t with y_0 = 0 and w_t ∼ N(0, 1). [Figure: a simulated random walk, y vs. t for t = 0 to 1000]
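The simulation behind a figure like this is a short loop; here is a sketch in Python (the slides use R, and the function name and seed below are my own):

```python
import random

def random_walk(n, seed=1):
    """Simulate y_t = y_{t-1} + w_t with y_0 = 0 and w_t ~ N(0, 1)."""
    rng = random.Random(seed)
    y, cur = [], 0.0
    for _ in range(n):
        cur += rng.gauss(0, 1)  # accumulate the white-noise increments
        y.append(cur)
    return y

y = random_walk(1000)
# Var(y_t) = t grows without bound, which is the first hint that a
# random walk cannot be (weakly) stationary.
```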

  11. ACF + PACF: [Figure: the random walk series rw$y with its ACF and PACF, lags 0 to 50]

  12. Stationary? Is y_t stationary?

  13. Partial Autocorrelation - pACF: Given these types of patterns in the autocorrelation, we often want to examine the relationship between y_t and y_{t+k} with the (linear) dependence of y_t on y_{t+1} through y_{t+k−1} removed. This is done through the calculation of a partial autocorrelation (α(k)), which is defined as follows:

     α(0) = 1
     α(1) = ρ(1) = Cor(y_t, y_{t+1})
     ⋮
     α(k) = Cor(y_t − P_{t,k}(y_t), y_{t+k} − P_{t,k}(y_{t+k}))

  where P_{t,k}(y) is the projection of y onto the space spanned by y_{t+1}, …, y_{t+k−1}.
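In practice α(k) is usually not computed from the projection directly. One standard route is the Durbin-Levinson recursion, sketched below in Python under the assumption that the autocorrelations ρ(0), ..., ρ(k) are already in hand; the function name is my own.

```python
def pacf_from_acf(rho, max_lag):
    """Partial autocorrelations alpha(1)..alpha(max_lag) from the
    autocorrelations rho[0..max_lag] via the Durbin-Levinson recursion."""
    alpha, phi_prev = [], []
    for k in range(1, max_lag + 1):
        if k == 1:
            a = rho[1]
        else:
            # alpha(k) = (rho(k) - sum_j phi_{k-1,j} rho(k-j))
            #          / (1 - sum_j phi_{k-1,j} rho(j)),  j = 1..k-1
            num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
            den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
            a = num / den
        # update the order-k prediction coefficients phi_{k,j}
        phi_prev = [phi_prev[j] - a * phi_prev[k - 2 - j] for j in range(k - 1)] + [a]
        alpha.append(a)
    return alpha

rho = [0.5 ** k for k in range(6)]  # ACF of an AR(1) with coefficient 0.5
print(pacf_from_acf(rho, 5))        # 0.5 at lag 1, zero at every higher lag
```

The AR(1) example shows the defining feature exploited on the ACF + PACF slides: for an AR(p) process the partial autocorrelation cuts off after lag p.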

  14. Example - Random walk with drift: Let y_t = δ + y_{t−1} + w_t with y_0 = 0 and w_t ∼ N(0, 1). [Figure: a simulated random walk with trend, y vs. t for t = 0 to 1000]

  15. ACF + PACF: [Figure: the drift series rwt$y with its ACF and PACF, lags 0 to 50]

  16. Stationary? Is y_t stationary?

  17. Example - Moving Average: Let w_t ∼ N(0, 1) and y_t = w_{t−1} + w_t. [Figure: a simulated moving average series, y vs. t for t = 0 to 100]
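For this process the second moments can be worked out by hand: γ(0) = Var(w_{t−1} + w_t) = 2, γ(1) = Cov(w_{t−1} + w_t, w_t + w_{t+1}) = Var(w_t) = 1, and γ(k) = 0 for k > 1, so ρ(1) = 0.5. A quick simulation check in Python (the slides use R; the seed and sample size are arbitrary choices of mine):

```python
import random

rng = random.Random(42)
w = [rng.gauss(0, 1) for _ in range(5001)]
y = [w[t - 1] + w[t] for t in range(1, 5001)]  # y_t = w_{t-1} + w_t

n = len(y)
m = sum(y) / n
def gamma_hat(k):
    # sample autocovariance at lag k
    return sum((y[t] - m) * (y[t + k] - m) for t in range(n - k)) / n

print(gamma_hat(1) / gamma_hat(0))  # close to the theoretical rho(1) = 0.5
print(gamma_hat(2) / gamma_hat(0))  # close to the theoretical rho(2) = 0
```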

  18. ACF + PACF: [Figure: the moving average series ma$y with its ACF and PACF, lags 0 to 50]

  19. Stationary? Is y_t stationary?

  20. Autoregressive: Let w_t ∼ N(0, 1) and y_t = y_{t−1} − 0.9 y_{t−2} + w_t with y_t = 0 for t < 1. [Figure: a simulated autoregressive series, y vs. t for t = 0 to 500]
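Whether this AR(2) is stationary can be checked from its characteristic polynomial φ(z) = 1 − z + 0.9 z^2: the process is (causal and) stationary when both roots lie outside the unit circle. A quick check in Python (the slides use R):

```python
import cmath

# Roots of 0.9 z^2 - z + 1 = 0 via the quadratic formula.
a, b, c = 0.9, -1.0, 1.0
disc = cmath.sqrt(b * b - 4 * a * c)  # discriminant is negative: complex roots
roots = [(-b + disc) / (2 * a), (-b - disc) / (2 * a)]
print([abs(z) for z in roots])        # both moduli ~1.054 > 1, so stationary
```

Complex roots sitting just outside the unit circle are also what produce the pseudo-periodic oscillation visible in the simulated series.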

  21. ACF + PACF: [Figure: the autoregressive series ar$y with its ACF and PACF, lags 0 to 50]

  22. Example - Australian Wine Sales: Australian total wine sales by wine makers in bottles <= 1 litre, Jan 1980 - Aug 1994.

     aus_wine = readRDS("../data/aus_wine.rds")
     aus_wine
     ## # A tibble: 176 x 2
     ##     date sales
     ##    <dbl> <dbl>
     ##  1 1980  15136
     ##  2 1980. 16733
     ##  3 1980. 20016
     ##  4 1980. 17708
     ##  5 1980. 18019
     ##  6 1980. 19227
     ##  7 1980. 22893
     ##  8 1981. 23739
     ##  9 1981. 21133
     ## 10 1981. 22591
     ## # ... with 166 more rows

  23. Time series: [Figure: aus_wine sales vs. date, 1980 to 1995]

  24. Basic Model Fit: [Figure: sales vs. date with linear and quadratic model fits overlaid]

  25. Residuals: [Figure: linear-model (lin_resid) and quadratic-model (quad_resid) residuals vs. date]

  26. Autocorrelation Plot: [Figure: the quadratic-model residuals d$quad_resid with their ACF and PACF, lags 0 to 35]

  27. [Figure: scatterplots of quad_resid against its lag-1 through lag-12 values]

  28. Autoregressive errors:

     ## Call:
     ## lm(formula = quad_resid ~ lag_12, data = d_ar)
     ##
     ## Residuals:
     ##      Min       1Q   Median       3Q      Max
     ## -12286.5  -1380.5     73.4   1505.2   7188.1
     ##
     ## Coefficients:
     ##              Estimate Std. Error t value Pr(>|t|)
     ## (Intercept)  83.65080  201.58416   0.415    0.679
     ## lag_12        0.89024    0.04045  22.006   <2e-16 ***
     ## ---
     ## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
     ##
     ## Residual standard error: 2581 on 162 degrees of freedom
     ##   (12 observations deleted due to missingness)
     ## Multiple R-squared: 0.7493, Adjusted R-squared: 0.7478
     ## F-statistic: 484.3 on 1 and 162 DF, p-value: < 2.2e-16
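The fit above is just an ordinary least-squares regression of the quadratic-model residuals on their own lag-12 values. A minimal sketch of the same idea in Python (the function name and toy series are mine; the actual fit uses R's lm):

```python
def lag_regression(y, lag):
    """Least-squares fit of y_t = b0 + b1 * y_{t-lag} + e_t, dropping the
    first `lag` observations (as lm does via the missing lagged values)."""
    x, z = y[:-lag], y[lag:]  # predictor: lagged series; response: current
    n = len(z)
    mx, mz = sum(x) / n, sum(z) / n
    b1 = (sum((xi - mx) * (zi - mz) for xi, zi in zip(x, z))
          / sum((xi - mx) ** 2 for xi in x))
    b0 = mz - b1 * mx
    return b0, b1

# On an exactly period-12 series the lag-12 slope is 1 and the intercept 0:
y = [float(t % 12) for t in range(120)]
print(lag_regression(y, 12))  # approximately (0.0, 1.0)
```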

  29. Residual residuals: [Figure: residuals (resid) of the lag-12 regression vs. date]

  30. Residual residuals - acf: [Figure: l_ar$residuals with their ACF and PACF, lags 0 to 35]

  31. [Figure: scatterplots of the lag-12 regression residuals against their lag-1 through lag-12 values]

  32. Writing down the model? So, is our EDA suggesting that we fit the following model?

     sales(t) = β_0 + β_1 t + β_2 t^2 + w_t

  where

     w_t = φ w_{t−12} + ε_t

  The model we actually fit is,

     sales(t) = β_0 + β_1 t + β_2 t^2 + β_3 sales(t − 12) + ε_t
