Lecture 11: Fitting ARIMA Models (10/10/2018)


slide-1
SLIDE 1

Lecture 11

Fitting ARIMA Models

10/10/2018

1

slide-2
SLIDE 2

Model Fitting

slide-3
SLIDE 3

Fitting ARIMA

For an ARIMA(p, d, q) model:

  • Requires that the data be stationary after differencing.
  • Handling d is straightforward: just difference the original data d times (leaving n − d observations),

    y′_t = Δ^d y_t

  • After differencing, fit an ARMA(p, q) model to y′_t.
  • To keep things simple we'll assume w_t ~ iid N(0, σ²_w).

2
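The differencing step above can be sketched in base R using `diff()`; the series and d = 2 below are made-up values for illustration:

```r
# Differencing d times: y'_t = Delta^d y_t, leaving n - d observations.
set.seed(1)
x <- cumsum(cumsum(rnorm(100)))      # toy series with two unit roots, so d = 2
y_prime <- diff(x, differences = 2)  # same as diff(diff(x))

length(y_prime)  # 98, i.e. n - d
```

After differencing, `y_prime` is what an ARMA(p, q) model would be fit to.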

slide-4
SLIDE 4

MLE - Stationarity & iid normal errors

If both of these conditions are met, then the time series y_t will also be normal. In general, the vector y = (y_1, y_2, …, y_n)′ will have a multivariate normal distribution with mean {μ}_i = E(y_i) = E(y_t) and covariance Σ, where

{Σ}_ij = γ(i − j).

The joint density of y is given by

f(y) = (2π)^(−n/2) det(Σ)^(−1/2) × exp(−(1/2) (y − μ)′ Σ⁻¹ (y − μ))

3


slide-6
SLIDE 6

AR

slide-7
SLIDE 7

Fitting AR(1)

y_t = δ + φ y_{t−1} + w_t

We need to estimate three parameters: δ, φ, and σ²_w. We know

E(y_t) = δ / (1 − φ)
Var(y_t) = σ²_w / (1 − φ²)
γ_h = φ^|h| σ²_w / (1 − φ²)

Using these properties it is possible to write the distribution of y as a MVN, but that does not make it easy to write down a (simplified) closed form for the MLE estimates of δ, φ, and σ²_w.

4
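These moment formulas can be checked by simulation; a sketch in R, using made-up parameter values (the same ones as in the example a few slides ahead):

```r
# Simulate a long AR(1) and compare sample moments to the theoretical ones:
# E(y_t) = delta/(1-phi), Var(y_t) = sigma2_w/(1-phi^2), corr(y_t, y_{t-1}) = phi.
set.seed(42)
delta <- 0.5; phi <- 0.75; sigma2_w <- 1
y <- delta / (1 - phi) + arima.sim(list(ar = phi), n = 1e5, sd = sqrt(sigma2_w))

mean(y)                                   # should be near delta/(1-phi) = 2
var(y)                                    # should be near 1/(1-0.75^2), about 2.29
acf(y, lag.max = 1, plot = FALSE)$acf[2]  # should be near phi = 0.75
```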

slide-8
SLIDE 8

Conditional Density

We can also rewrite the density as follows,

f(y) = f(y_n, y_{n−1}, …, y_2, y_1)
     = f(y_n | y_{n−1}, …, y_2, y_1) f(y_{n−1} | y_{n−2}, …, y_2, y_1) ⋯ f(y_2 | y_1) f(y_1)
     = f(y_n | y_{n−1}) f(y_{n−1} | y_{n−2}) ⋯ f(y_2 | y_1) f(y_1)

where,

y_1 ∼ N(δ / (1 − φ), σ²_w / (1 − φ²))
y_t | y_{t−1} ∼ N(δ + φ y_{t−1}, σ²_w)

f(y_t | y_{t−1}) = (2π σ²_w)^(−1/2) exp(−(y_t − δ − φ y_{t−1})² / (2 σ²_w))

5
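This factorization is exact: for a simulated AR(1), the product of the conditionals matches the joint MVN density from the earlier slide. A sketch in base R (simulated data, made-up parameters):

```r
# Compare the joint MVN log-density (mean delta/(1-phi), covariance
# {Sigma}_ij = sigma2_w * phi^|i-j| / (1-phi^2)) against the conditional
# factorization f(y_1) * prod_t f(y_t | y_{t-1}).
set.seed(1)
n <- 50; delta <- 0.5; phi <- 0.75; sigma2_w <- 1
y <- as.numeric(delta / (1 - phi) + arima.sim(list(ar = phi), n = n))

mu    <- rep(delta / (1 - phi), n)
Sigma <- sigma2_w * phi^abs(outer(1:n, 1:n, "-")) / (1 - phi^2)
ll_joint <- -n/2 * log(2*pi) - 0.5 * as.numeric(determinant(Sigma)$modulus) -
            0.5 * sum((y - mu) * solve(Sigma, y - mu))

ll_cond <- dnorm(y[1], delta / (1 - phi), sqrt(sigma2_w / (1 - phi^2)), log = TRUE) +
           sum(dnorm(y[-1], delta + phi * y[-n], sqrt(sigma2_w), log = TRUE))

all.equal(ll_joint, ll_cond)  # TRUE
```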

slide-9
SLIDE 9

Log likelihood of AR(1)

log f(y_t | y_{t−1}) = −(1/2) (log 2π + log σ²_w + (1/σ²_w)(y_t − δ − φ y_{t−1})²)

ℓ(δ, φ, σ²_w) = log f(y) = log f(y_1) + Σ_{t=2}^{n} log f(y_t | y_{t−1})

= −(1/2) (log 2π + log σ²_w − log(1 − φ²) + ((1 − φ²)/σ²_w)(y_1 − δ/(1 − φ))²)
  − (1/2) ((n − 1) log 2π + (n − 1) log σ²_w + (1/σ²_w) Σ_{t=2}^{n} (y_t − δ − φ y_{t−1})²)

= −(1/2) (n log 2π + n log σ²_w − log(1 − φ²)
  + (1/σ²_w) ((1 − φ²)(y_1 − δ/(1 − φ))² + Σ_{t=2}^{n} (y_t − δ − φ y_{t−1})²))

6
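Since there is no simple closed form, one option is to hand this log-likelihood to a numerical optimizer. A sketch using `optim()` on simulated data (seed, sample size, and starting values are made up):

```r
# Negative AR(1) log-likelihood, written via the conditional factorization
# (equivalent to the expression above), minimized with optim().
set.seed(42)
y <- as.numeric(0.5 / (1 - 0.75) + arima.sim(list(ar = 0.75), n = 500))
n <- length(y)

neg_ll <- function(par) {
  delta <- par[1]; phi <- par[2]; sigma2_w <- par[3]
  if (abs(phi) >= 1 || sigma2_w <= 0) return(Inf)  # keep the search stationary
  -(dnorm(y[1], delta / (1 - phi), sqrt(sigma2_w / (1 - phi^2)), log = TRUE) +
    sum(dnorm(y[-1], delta + phi * y[-n], sqrt(sigma2_w), log = TRUE)))
}

fit <- optim(c(0, 0.5, 1), neg_ll)
fit$par  # estimates of (delta, phi, sigma2_w); should be near (0.5, 0.75, 1)
```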

slide-10
SLIDE 10

AR(1) Example

with φ = 0.75, δ = 0.5, and σ²_w = 1,

[Figure: simulated AR(1) series (n = 500) with its sample ACF and PACF (lags 1-25)]

7

slide-11
SLIDE 11

Arima

ar1_arima = forecast::Arima(ar1, order = c(1,0,0))
summary(ar1_arima)
## Series: ar1
## ARIMA(1,0,0) with non-zero mean
##
## Coefficients:
##          ar1    mean
##       0.7312  1.8934
## s.e.  0.0309  0.1646
##
## sigma^2 estimated as 0.994:  log likelihood=-707.35
## AIC=1420.71   AICc=1420.76   BIC=1433.35
##
## Training set error measures:
##                       ME      RMSE       MAE       MPE     MAPE      MASE
## Training set 0.005333274 0.9950158 0.7997576 -984.9413 1178.615 0.9246146
##                     ACF1
## Training set -0.04437489

8

slide-12
SLIDE 12

lm

d = data_frame(y = ar1 %>% strip_attrs(), t = seq_along(ar1))
ar1_lm = lm(y ~ lag(y), data = d)
summary(ar1_lm)
##
## Call:
## lm(formula = y ~ lag(y), data = d)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -3.2772 -0.6880  0.0785  0.6819  2.5704
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  0.52347    0.07328   7.144 3.25e-12 ***
## lag(y)       0.72817    0.03093  23.539  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9949 on 497 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.5272, Adjusted R-squared:  0.5262
## F-statistic: 554.1 on 1 and 497 DF,  p-value: < 2.2e-16

9

slide-13
SLIDE 13

Bayesian AR(1) Model

ar1_model = "model{
  # likelihood
  y[1] ~ dnorm(delta/(1-phi), (sigma2_w/(1-phi^2))^-1)
  y_hat[1] ~ dnorm(delta/(1-phi), (sigma2_w/(1-phi^2))^-1)

  for (t in 2:length(y)) {
    y[t] ~ dnorm(delta + phi*y[t-1], 1/sigma2_w)
    y_hat[t] ~ dnorm(delta + phi*y[t-1], 1/sigma2_w)
  }

  mu = delta/(1-phi)

  # priors
  delta ~ dnorm(0,1/1000)
  phi ~ dnorm(0,1)
  tau ~ dgamma(0.001,0.001)
  sigma2_w <- 1/tau
}"

10

slide-14
SLIDE 14

Chains

[Figure: MCMC trace plots over 5000 iterations for delta, phi, and sigma2_w]

11

slide-15
SLIDE 15

Posteriors

[Figure: posterior densities for delta, phi, and sigma2_w, with the ARIMA and lm estimates and the true values overlaid]

12

slide-16
SLIDE 16

Predictions

[Figure: observed series y (n = 500) with fitted values from the lm, ARIMA, and Bayesian models]

13

slide-17
SLIDE 17

Faceted

[Figure: the same series and fits, faceted by model (y, lm, ARIMA, bayes)]

14

slide-18
SLIDE 18

Fitting AR(p) - Lagged Regression

We can rewrite the density as follows,

f(y) = f(y_n, y_{n−1}, …, y_2, y_1) = f(y_n | y_{n−1}, …, y_{n−p}) ⋯ f(y_{p+1} | y_p, …, y_1) f(y_p, …, y_1)

Regressing y_t on y_{t−p}, …, y_{t−1} gets us an approximate solution, but it ignores the f(y_1, y_2, …, y_p) part of the likelihood. How much does this matter (vs. using the full likelihood)?

  • If p is close to n, probably a lot
  • If p ≪ n, probably not much

15
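This trade-off can be seen directly by comparing `ar.ols()` (a conditional, lagged-regression fit) with `ar.mle()` (the full likelihood); the data below are simulated with made-up AR(2) coefficients:

```r
# With p = 2 and n = 1000 (p << n), the conditional and full-likelihood
# estimates should be nearly identical.
set.seed(7)
y <- arima.sim(list(ar = c(0.6, 0.2)), n = 1000)

fit_ols <- ar.ols(y, order.max = 2, aic = FALSE)  # ignores f(y_1, y_2)
fit_mle <- ar.mle(y, order.max = 2, aic = FALSE)  # full likelihood

as.numeric(fit_ols$ar)
as.numeric(fit_mle$ar)  # nearly the same coefficients
```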


slide-20
SLIDE 20

Fitting AR(p) - Method of Moments

Recall for an AR(p) process,

γ(0) = σ²_w + φ_1 γ(1) + φ_2 γ(2) + … + φ_p γ(p)
γ(h) = φ_1 γ(h − 1) + φ_2 γ(h − 2) + … + φ_p γ(h − p)

We can rewrite the first equation in terms of σ²_w,

σ²_w = γ(0) − φ_1 γ(1) − φ_2 γ(2) − … − φ_p γ(p)

These are called the Yule-Walker equations.

16

slide-21
SLIDE 21

Yule-Walker

These equations can be rewritten in matrix notation as follows:

Γ_p φ = γ_p        σ²_w = γ(0) − φ′ γ_p

where Γ_p is p × p and φ and γ_p are p × 1, with

Γ_p = {γ(j − k)}_{j,k}
φ = (φ_1, φ_2, …, φ_p)′
γ_p = (γ(1), γ(2), …, γ(p))′

If we estimate the covariance structure from the data we obtain Γ̂_p and γ̂_p, which we can plug in and solve for φ and σ²_w:

φ̂ = Γ̂_p⁻¹ γ̂_p
σ̂²_w = γ̂(0) − γ̂_p′ Γ̂_p⁻¹ γ̂_p

17
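The plug-in step can be sketched in base R by building Γ̂_p and γ̂_p from `acf()` and solving; the AR(2) parameters below are made up (they match the AR part of the later ARMA example):

```r
# Yule-Walker by hand: estimate gamma(0..p) from the data, build Gamma_p and
# gamma_p, then solve for phi and sigma2_w.
set.seed(3)
p <- 2
y <- arima.sim(list(ar = c(1.3, -0.5)), n = 500)

g <- acf(y, lag.max = p, type = "covariance", plot = FALSE)$acf[, 1, 1]
Gamma_hat <- matrix(g[abs(outer(1:p, 1:p, "-")) + 1], p, p)  # {gamma(j-k)}_{j,k}
gamma_hat <- g[2:(p + 1)]                                    # (gamma(1), ..., gamma(p))'

phi_hat    <- solve(Gamma_hat, gamma_hat)
sigma2_hat <- g[1] - sum(gamma_hat * phi_hat)

phi_hat  # should agree with ar.yw(y, order.max = p, aic = FALSE)$ar
```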


slide-23
SLIDE 23

ARMA

slide-24
SLIDE 24

Fitting ARMA(2, 2)

y_t = δ + φ_1 y_{t−1} + φ_2 y_{t−2} + θ_1 w_{t−1} + θ_2 w_{t−2} + w_t

We need to estimate six parameters: δ, φ_1, φ_2, θ_1, θ_2, and σ²_w.

We could work out E(y_t), Var(y_t), and Cov(y_t, y_{t+h}), but the last two are going to be pretty nasty, and the full MVN likelihood is similarly unpleasant to work with. As with the AR(1) and AR(p) processes, we want to use conditioning to simplify things:

y_t | δ, y_{t−1}, y_{t−2}, w_{t−1}, w_{t−2} ∼ N(δ + φ_1 y_{t−1} + φ_2 y_{t−2} + θ_1 w_{t−1} + θ_2 w_{t−2}, σ²_w)

18


slide-27
SLIDE 27

ARMA(2,2) Example

with φ = (1.3, −0.5), θ = (0.5, 0.2), δ = 0, and σ²_w = 1, using the same models:

[Figure: simulated ARMA(2,2) series y (n = 500) with its sample ACF and PACF (lags 1-25)]

19

slide-28
SLIDE 28

ARIMA

forecast::Arima(y, order = c(2,0,2), include.mean = FALSE) %>% summary()
## Series: y
## ARIMA(2,0,2) with zero mean
##
## Coefficients:
##          ar1      ar2     ma1     ma2
##       1.2318  -0.4413  0.5888  0.2400
## s.e.  0.0843   0.0767  0.0900  0.0735
##
## sigma^2 estimated as 0.9401:  log likelihood=-693.82
## AIC=1397.64   AICc=1397.77   BIC=1418.72
##
## Training set error measures:
##                      ME     RMSE       MAE      MPE     MAPE      MASE
## Training set 0.00952552 0.965688 0.7680317 14.16358 142.9927 0.6321085
##                       ACF1
## Training set -0.0007167096

20

slide-29
SLIDE 29

AR only lm

lm(y ~ lag(y,1) + lag(y,2)) %>% summary()
##
## Call:
## lm(formula = y ~ lag(y, 1) + lag(y, 2))
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -2.7407 -0.6299  0.0012  0.7195  3.2872
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  0.01196    0.04600    0.26    0.795
## lag(y, 1)    1.57234    0.03057   51.43   <2e-16 ***
## lag(y, 2)   -0.73298    0.03059  -23.96   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.026 on 495 degrees of freedom
##   (2 observations deleted due to missingness)
## Multiple R-squared:  0.9181, Adjusted R-squared:  0.9178
## F-statistic:  2775 on 2 and 495 DF,  p-value: < 2.2e-16

21

slide-30
SLIDE 30

Hannan-Rissanen Algorithm

  • 1. Estimate a high order AR (remember AR ⇔ MA when stationary + invertible)
  • 2. Use the AR fit to estimate values for the unobserved w_t
  • 3. Regress y_t onto y_{t−1}, …, y_{t−p}, ŵ_{t−1}, …, ŵ_{t−q}
  • 4. Update ŵ_{t−1}, …, ŵ_{t−q} based on the current model, refit, and repeat until convergence

22
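The four steps can be sketched compactly in base R (simulated ARMA(2,2) data with made-up parameters; `lagv` is a hypothetical lag helper defined here, and 5 refit iterations is an arbitrary choice):

```r
set.seed(5)
y <- as.numeric(arima.sim(list(ar = c(1.3, -0.5), ma = c(0.5, 0.2)), n = 500))
n <- length(y)
lagv <- function(x, k) c(rep(NA, k), x)[1:n]  # lag a vector by k positions

# Steps 1 & 2: a high-order AR fit gives estimates of the unobserved w_t
w_hat <- as.numeric(residuals(arima(y, order = c(10, 0, 0))))

# Steps 3 & 4: regress on lagged y and lagged w_hat, update w_hat, repeat
for (i in 1:5) {
  fit <- lm(y ~ lagv(y, 1) + lagv(y, 2) + lagv(w_hat, 1) + lagv(w_hat, 2))
  w_hat <- c(rep(0, 2), as.numeric(residuals(fit)))  # pad the two dropped rows
}

coef(fit)[2:3]  # phi estimates; should be roughly (1.3, -0.5)
coef(fit)[4:5]  # theta estimates; should be roughly (0.5, 0.2)
```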

slide-31
SLIDE 31

Hannan-Rissanen - Step 1 & 2

ar = ar.mle(y, order.max = 10)
ar = forecast::Arima(y, order = c(10,0,0))
ar
## Series: y
## ARIMA(10,0,0) with non-zero mean
##
## Coefficients:
##          ar1      ar2     ar3     ar4      ar5     ar6     ar7      ar8
##       1.8069  -1.2555  0.3071  0.1379  -0.2025  0.0932  0.0266  -0.0665
## s.e.  0.0446   0.0924  0.1084  0.1097   0.1100  0.1099  0.1101   0.1096
##          ar9     ar10    mean
##       0.0687  -0.0634  0.0673
## s.e.  0.0935   0.0451  0.2910
##
## sigma^2 estimated as 0.9402:  log likelihood=-690.36
## AIC=1404.72   AICc=1405.36   BIC=1455.3

23

slide-32
SLIDE 32

Residuals

forecast::ggtsdisplay(ar$resid)

[Figure: AR(10) residuals with their sample ACF and PACF; no autocorrelation remains visible]

24

slide-33
SLIDE 33

Hannan-Rissanen - Step 3

d = data_frame(
  y = y %>% strip_attrs(),
  w_hat1 = ar$resid %>% strip_attrs()
)
(lm1 = lm(y ~ lag(y,1) + lag(y,2) + lag(w_hat1,1) + lag(w_hat1,2), data=d)) %>% summary()
##
## Call:
## lm(formula = y ~ lag(y, 1) + lag(y, 2) + lag(w_hat1, 1) + lag(w_hat1,
##     2), data = d)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -2.62536 -0.59631  0.00282  0.69023  3.02714
##
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)
## (Intercept)     0.01405    0.04390   0.320    0.749
## lag(y, 1)       1.31110    0.06062  21.629  < 2e-16 ***
## lag(y, 2)      -0.50714    0.05227  -9.702  < 2e-16 ***
## lag(w_hat1, 1)  0.50136    0.07587   6.608 1.01e-10 ***
## lag(w_hat1, 2)  0.15136    0.07567   2.000    0.046 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9792 on 493 degrees of freedom
##   (2 observations deleted due to missingness)
## Multiple R-squared:  0.9258, Adjusted R-squared:  0.9252
## F-statistic:  1537 on 4 and 493 DF,  p-value: < 2.2e-16

25

slide-34
SLIDE 34

Hannan-Rissanen - Step 4.1

d = modelr::add_residuals(d, lm1, "w_hat2")
(lm2 = lm(y ~ lag(y,1) + lag(y,2) + lag(w_hat2,1) + lag(w_hat2,2), data=d)) %>% summary()
##
## Call:
## lm(formula = y ~ lag(y, 1) + lag(y, 2) + lag(w_hat2, 1) + lag(w_hat2,
##     2), data = d)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -2.48938 -0.63467 -0.02144  0.64875  3.06169
##
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)
## (Intercept)     0.01732    0.04379   0.396   0.6926
## lag(y, 1)       1.28435    0.06181  20.780  < 2e-16 ***
## lag(y, 2)      -0.48458    0.05322  -9.106  < 2e-16 ***
## lag(w_hat2, 1)  0.52941    0.07545   7.017 7.58e-12 ***
## lag(w_hat2, 2)  0.17436    0.07553   2.308   0.0214 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9748 on 491 degrees of freedom
##   (4 observations deleted due to missingness)
## Multiple R-squared:  0.9267, Adjusted R-squared:  0.9261
## F-statistic:  1553 on 4 and 491 DF,  p-value: < 2.2e-16

26

slide-35
SLIDE 35

Hannan-Rissanen - Step 4.2

d = modelr::add_residuals(d, lm2, "w_hat3")
(lm3 = lm(y ~ lag(y,1) + lag(y,2) + lag(w_hat3,1) + lag(w_hat3,2), data=d)) %>% summary()
##
## Call:
## lm(formula = y ~ lag(y, 1) + lag(y, 2) + lag(w_hat3, 1) + lag(w_hat3,
##     2), data = d)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -2.58530 -0.61844 -0.03113  0.66855  2.94198
##
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)
## (Intercept)     0.01411    0.04373   0.323  0.74712
## lag(y, 1)       1.26447    0.06200  20.395  < 2e-16 ***
## lag(y, 2)      -0.46888    0.05335  -8.788  < 2e-16 ***
## lag(w_hat3, 1)  0.55598    0.07631   7.285 1.29e-12 ***
## lag(w_hat3, 2)  0.20607    0.07650   2.694  0.00731 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9714 on 489 degrees of freedom
##   (6 observations deleted due to missingness)
## Multiple R-squared:  0.9274, Adjusted R-squared:  0.9268
## F-statistic:  1562 on 4 and 489 DF,  p-value: < 2.2e-16

27

slide-36
SLIDE 36

Hannan-Rissanen - Step 4.3

d = modelr::add_residuals(d, lm3, "w_hat4")
(lm4 = lm(y ~ lag(y,1) + lag(y,2) + lag(w_hat4,1) + lag(w_hat4,2), data=d)) %>% summary()
##
## Call:
## lm(formula = y ~ lag(y, 1) + lag(y, 2) + lag(w_hat4, 1) + lag(w_hat4,
##     2), data = d)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -2.47793 -0.62637 -0.02602  0.67771  3.01388
##
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)
## (Intercept)    0.006615   0.043826   0.151   0.8801
## lag(y, 1)      1.270581   0.062110  20.457  < 2e-16 ***
## lag(y, 2)     -0.474331   0.053406  -8.882  < 2e-16 ***
## lag(w_hat4, 1) 0.546706   0.076724   7.126 3.75e-12 ***
## lag(w_hat4, 2) 0.195828   0.077098   2.540   0.0114 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9719 on 487 degrees of freedom
##   (8 observations deleted due to missingness)
## Multiple R-squared:  0.9269, Adjusted R-squared:  0.9263
## F-statistic:  1543 on 4 and 487 DF,  p-value: < 2.2e-16

28

slide-37
SLIDE 37

Hannan-Rissanen - Step 4.4

d = modelr::add_residuals(d, lm4, "w_hat5")
(lm5 = lm(y ~ lag(y,1) + lag(y,2) + lag(w_hat5,1) + lag(w_hat5,2), data=d)) %>% summary()
##
## Call:
## lm(formula = y ~ lag(y, 1) + lag(y, 2) + lag(w_hat5, 1) + lag(w_hat5,
##     2), data = d)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -2.52482 -0.61986 -0.01755  0.65755  3.01411
##
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)
## (Intercept)    0.002506   0.043950   0.057  0.95456
## lag(y, 1)      1.261588   0.062919  20.051  < 2e-16 ***
## lag(y, 2)     -0.468001   0.053869  -8.688  < 2e-16 ***
## lag(w_hat5, 1) 0.555958   0.077424   7.181 2.62e-12 ***
## lag(w_hat5, 2) 0.202817   0.077796   2.607  0.00941 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9728 on 485 degrees of freedom
##   (10 observations deleted due to missingness)
## Multiple R-squared:  0.9261, Adjusted R-squared:  0.9255
## F-statistic:  1519 on 4 and 485 DF,  p-value: < 2.2e-16

29

slide-38
SLIDE 38

RMSEs

modelr::rmse(lm1, data = d)
## [1] 0.9743158
modelr::rmse(lm2, data = d)
## [1] 0.9698713
modelr::rmse(lm3, data = d)
## [1] 0.9665092
modelr::rmse(lm4, data = d)
## [1] 0.9669626
modelr::rmse(lm5, data = d)
## [1] 0.9678429

30

slide-39
SLIDE 39

JAGS - Model

arma22_model = "model{
  # Likelihood
  for (t in 1:length(y)) {
    y[t] ~ dnorm(mu[t], 1/sigma2_e)
  }
  mu[1] = phi[1] * y_0  + phi[2] * y_n1 + w[1] + theta[1]*w_0  + theta[2]*w_n1
  mu[2] = phi[1] * y[1] + phi[2] * y_0  + w[2] + theta[1]*w[1] + theta[2]*w_0
  for (t in 3:length(y)) {
    mu[t] = phi[1] * y[t-1] + phi[2] * y[t-2] + w[t] + theta[1] * w[t-1] + theta[2] * w[t-2]
  }

  # Priors
  for(t in 1:length(y)){
    w[t] ~ dnorm(0,1/sigma2_w)
  }
  sigma2_w = 1/tau_w; tau_w ~ dgamma(0.001, 0.001)
  sigma2_e = 1/tau_e; tau_e ~ dgamma(0.001, 0.001)
  for(i in 1:2) {
    phi[i] ~ dnorm(0,1)
    theta[i] ~ dnorm(0,1)
  }

  # Latent errors and series values
  w_0  ~ dnorm(0, 1/sigma2_w)
  w_n1 ~ dnorm(0, 1/sigma2_w)
  y_0  ~ dnorm(0, 1/4)
  y_n1 ~ dnorm(0, 1/4)
}"

31

slide-40
SLIDE 40

JAGS - Fit

[Figure: MCMC trace plots over 1000 iterations for phi[1], phi[2], theta[1], and theta[2]]

32

slide-41
SLIDE 41

JAGS - Posteriors

[Figure: posterior densities for phi[1], phi[2], theta[1], and theta[2], with the true values, Arima, and Hannan-Rissanen (HR) estimates overlaid]

33

slide-42
SLIDE 42

Using brms

library(brms)
b = brm(y ~ 1, data = d, autocor = cor_arma(p = 2, q = 2), chains = 2)

## Compiling the C++ model
## Start sampling
##
## SAMPLING FOR MODEL '90ac5aa650e5db28db8b9b83a5eb3c07' NOW (CHAIN 1).
##
## Gradient evaluation took 0.001742 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 17.42 seconds.
## Adjust your expectations accordingly!
##
## Iteration:    1 / 2000 [  0%]  (Warmup)
## Iteration:  200 / 2000 [ 10%]  (Warmup)
## Iteration:  400 / 2000 [ 20%]  (Warmup)
## Iteration:  600 / 2000 [ 30%]  (Warmup)
## Iteration:  800 / 2000 [ 40%]  (Warmup)
## Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Iteration: 2000 / 2000 [100%]  (Sampling)
##
##  Elapsed Time: 13.2358 seconds (Warm-up)
##                9.79355 seconds (Sampling)
##                23.0293 seconds (Total)

34

slide-43
SLIDE 43

BRMS - Chains

[Figure: brms trace plots for ar[1], ar[2], ma[1], and ma[2]]

35

slide-44
SLIDE 44

BRMS - Posteriors

[Figure: brms posterior densities for ar[1], ar[2], ma[1], and ma[2], with the true values, Arima, and HR estimates overlaid]

36

slide-45
SLIDE 45

BRMS - Predictions

[Figure: observed series with brms posterior predictions]

37

slide-46
SLIDE 46

BRMS - Stancode

stancode(b)
## // generated with brms 2.5.0
## functions {
## }
## data {
##   int<lower=1> N;  // total number of observations
##   vector[N] Y;  // response variable
##   // data needed for ARMA correlations
##   int<lower=0> Kar;  // AR order
##   int<lower=0> Kma;  // MA order
##   int<lower=0> J_lag[N];
##   int prior_only;  // should the likelihood be ignored?
## }
## transformed data {
##   int max_lag = max(Kar, Kma);
## }
## parameters {
##   real temp_Intercept;  // temporary intercept
##   real<lower=0> sigma;  // residual SD
##   vector<lower=-1,upper=1>[Kar] ar;  // autoregressive effects
##   vector<lower=-1,upper=1>[Kma] ma;  // moving-average effects
## }
## transformed parameters {
## }
## model {
##   vector[N] mu = temp_Intercept + rep_vector(0, N);
##   // objects storing residuals
##   matrix[N, max_lag] E = rep_matrix(0, N, max_lag);
##   vector[N] e;
##   for (n in 1:N) {
##     mu[n] += head(E[n], Kma) * ma;
##     // computation of ARMA correlations
##     e[n] = Y[n] - mu[n];
##     for (i in 1:J_lag[n]) {
##       E[n + 1, i] = e[n + 1 - i];
##     }
##     mu[n] += head(E[n], Kar) * ar;
##   }
##   // priors including all constants
##   target += student_t_lpdf(temp_Intercept | 3, 0, 10);
##   target += student_t_lpdf(sigma | 3, 0, 10)
##     - 1 * student_t_lccdf(0 | 3, 0, 10);
##   // likelihood including all constants
##   if (!prior_only) {
##     target += normal_lpdf(Y | mu, sigma);
##   }
## }
## generated quantities {
##   // actual population-level intercept
##   real b_Intercept = temp_Intercept;
## }

38