Estimation of Autoregressive Processes with Sparse Parameters
Abbas Kazemipour
MAST Group Meeting, University of Maryland, College Park — kaazemi@umd.edu
November 18, 2015
1 AR(p): x_k = θ_1 x_{k−1} + θ_2 x_{k−2} + ⋯ + θ_p x_{k−p} + w_k.
2 {w_k}: i.i.d. innovation sequence.
3 Assumption 1: the innovations are zero-mean with variance σ².
4 {x_k} is the output of an LTI system driven by {w_k}, with the z-transform of its impulse response given by H(z) = 1/(1 − θ_1 z^{−1} − ⋯ − θ_p z^{−p}).

1 Assumption 2: ‖θ‖₁ = Σ_i |θ_i| ≤ 1 − ǫ < 1.
2 The AR(p) process {x_i}_{i=−∞}^{∞} is then stable and stationary.
3 By (2), the power spectral density of the process is S(ω) = σ² / |1 − Σ_{k=1}^{p} θ_k e^{−ikω}|².
4 Assumption 3: θ is an s-sparse vector with s ≪ p (a simulation sketch follows).

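A minimal simulation sketch (ours, not the slides'): generate a stable, s-sparse AR(p) process satisfying Assumptions 1–3. All names and parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
p, s, sigma, eps = 100, 5, 1.0, 0.1

# s-sparse theta with ||theta||_1 = 1 - eps  (Assumptions 2 and 3)
theta = np.zeros(p)
support = rng.choice(p, size=s, replace=False)
coeffs = rng.standard_normal(s)
theta[support] = (1 - eps) * coeffs / np.abs(coeffs).sum()

def simulate_ar(theta, n, sigma, rng, burn_in=1000):
    """Return n samples of x_k = sum_j theta_j x_{k-j} + w_k after a burn-in."""
    p = len(theta)
    x = np.zeros(burn_in + n)
    w = sigma * rng.standard_normal(burn_in + n)  # i.i.d. innovations {w_k}
    for k in range(p, burn_in + n):
        x[k] = theta @ x[k - p:k][::-1] + w[k]    # (x_{k-1}, ..., x_{k-p})
    return x[burn_in:]                            # discard the transient

x = simulate_ar(theta, n=2000, sigma=sigma, rng=rng)
```
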
1 We observe n consecutive snapshots of length p (a total of n + p samples, since consecutive snapshots overlap).
2 Questions: can θ be estimated stably from these observations, and how many samples are needed?

1 Yule-Walker equations: Rθ = r.
2 R := R_{p×p} = E[x_p^1 (x_p^1)ᵀ], where x_p^1 := (x_1, …, x_p)ᵀ.
3 r_k = E[x_i x_{i+k}] is the k-th autocorrelation, and r := (r_1, …, r_p)ᵀ.
4 If n ≫ p ⇒ estimate R and the r_k's from the data and solve the Yule-Walker equations (sketch below).
5 Usually biased estimates of the r_k's are used, at the cost of distorting the solution, because they guarantee a positive semidefinite R̂.

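A sketch of this classical estimator, using the biased autocorrelation estimates of item 5 and exploiting the Toeplitz structure of R̂ (the function name is ours):

```python
import numpy as np
from scipy.linalg import solve_toeplitz

def yule_walker(x, p):
    """Solve R_hat @ theta = r_hat for an AR(p) fit."""
    n = len(x)
    xc = x - x.mean()
    # biased estimates r_k = (1/n) sum_i x_i x_{i+k}: keep R_hat PSD
    r = np.array([xc[:n - k] @ xc[k:] / n for k in range(p + 1)])
    return solve_toeplitz(r[:p], r[1:p + 1])

theta_yw = yule_walker(x, p=100)   # x from the simulation sketch above
```
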
1 LASSO-type estimator given by a conditional log-likelihood: for Gaussian innovations the negative conditional log-likelihood is a least-squares cost, so the estimator takes the form θ̂ = argmin_θ (1/n)‖x − Xθ‖₂² + λ‖θ‖₁ (sketch below).
2 X is a Toeplitz matrix with highly correlated entries, so standard compressed-sensing guarantees for i.i.d. designs do not apply directly.

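A sketch of this estimator; the value of lam stands in for the theoretically prescribed λ:

```python
import numpy as np
from sklearn.linear_model import Lasso

def lasso_ar(x, p, lam):
    """Regress x_k on (x_{k-1}, ..., x_{k-p}); rows of X form a Toeplitz design."""
    n = len(x) - p
    X = np.column_stack([x[p - j - 1:p - j - 1 + n] for j in range(p)])
    y = x[p:]
    model = Lasso(alpha=lam, fit_intercept=False, max_iter=50_000)
    return model.fit(X, y).coef_

theta_l1 = lasso_ar(x, p=100, lam=0.05)
```
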
1 n can be much less than p.
2 Better performance than Yule-Walker.

1 Essentially requires the eigenvalues of all n × s submatrices of X to be bounded away from zero.

1 Step 1: …
2 Step 2: …

[Figure: minimum and maximum eigenvalues λ_min(k) and λ_max(k) of the k × k autocorrelation matrix R, plotted for k up to 300, alongside the power spectral density as a function of frequency f.]

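A sketch reproducing these curves for the simulated process, computing the extreme eigenvalues of the k × k Toeplitz autocorrelation matrix for growing k:

```python
import numpy as np
from scipy.linalg import toeplitz

def eig_extremes(x, ks):
    n = len(x)
    xc = x - x.mean()
    r = np.array([xc[:n - k] @ xc[k:] / n for k in range(max(ks))])
    out = []
    for k in ks:
        lams = np.linalg.eigvalsh(toeplitz(r[:k]))  # real, ascending
        out.append((lams[0], lams[-1]))             # (lambda_min, lambda_max)
    return np.array(out)

extremes = eig_extremes(x, ks=range(10, 301, 10))
```
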
1 The RE condition also holds for any stationary process satisfying two-sided bounds on its power spectral density.

1 This holds for every t.
2 We will be interested in t = λ_min/(2s⋆).
3 Noting that …
4 To complete this bound, it only remains to show that t can be chosen this large.

1 Choose t⋆ = …
2 For p ≫ n: choosing n > c·s·p^{2/3}(log p)^{2/3} ⇒ the desired bound on κ.

1 Gradient of the objective function at the true θ: ∇L(θ) = −(2/n) Xᵀ(x − Xθ) = −(2/n) Xᵀw.
2 Lemmas 8 and 4 suggest a suitable choice of the regularization parameter λ.

1 First, it is easy to check from the uncorrelatedness of the w_k's that E[∇L(θ)] = 0 at the true θ.
2 In linear regression terminology, (17) is known as the normal equations.
3 We show that ∇L(θ) is concentrated around its mean.

1 We have ∇L(θ) = −(2/n) Xᵀw.
2 The jth element in this expansion is of the form −(2/n) Σ_i y_i, with y_i := w_i x_{i−j}.
3 It is easy to check that the sequence of the y_i's …

1 We will now state the following concentration result (Proposition 1) for sums of products of subgaussian random variables.

1 Since the y_j's are a product of two independent subgaussian random variables, they are subexponential (see the standard facts below).
2 Proposition 1 implies that …

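Two standard facts (e.g., from Vershynin's lecture notes) fill in this step; we state them for independent terms, while the slides' Proposition 1 presumably handles the dependence present here:

```latex
% a product of sub-Gaussians is sub-exponential:
\|XY\|_{\psi_1} \;\le\; \|X\|_{\psi_2}\,\|Y\|_{\psi_2},
% Bernstein-type bound for sums of sub-exponential variables:
\mathbb{P}\!\left(\Big|\tfrac{1}{n}\textstyle\sum_{i=1}^{n}(y_i-\mathbb{E}\,y_i)\Big|\ge t\right)
\;\le\; 2\exp\!\left(-c\,n\min\!\Big(\tfrac{t^2}{K^2},\,\tfrac{t}{K}\Big)\right),
\qquad K=\max_i\|y_i\|_{\psi_1}.
```
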
1 Hence γ_n ≤ d_2 …
2 Combined with the result of Corollary 12, for n > d_1·s·p^{2/3}(log p)^{2/3} the claim follows.

1 Greedy methods
2 Penalized Yule-Walker
3 Dynamic ℓ1 reconstruction
4 Dynamic Durbin-Levinson

1 Penalized Yule-Walker.
2 Instead, try an ℓ1-penalized fit of the Yule-Walker equations (one natural formulation is sketched below).

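The slide's exact objective did not survive extraction; one natural formulation, taken as an assumption here, is minimize_θ ‖R̂θ − r̂‖₂² + λ‖θ‖₁, which can be solved by treating R̂ as a design matrix:

```python
import numpy as np
from scipy.linalg import toeplitz
from sklearn.linear_model import Lasso

def penalized_yule_walker(x, p, lam):
    """l1-penalized fit of R_hat @ theta ~ r_hat (assumed formulation)."""
    n = len(x)
    xc = x - x.mean()
    r = np.array([xc[:n - k] @ xc[k:] / n for k in range(p + 1)])
    R_hat, r_hat = toeplitz(r[:p]), r[1:p + 1]
    model = Lasso(alpha=lam, fit_intercept=False, max_iter=50_000)
    return model.fit(R_hat, r_hat).coef_   # R_hat as design, r_hat as target

theta_pyw = penalized_yule_walker(x, p=100, lam=1e-3)
```
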
1 Regularized ML
2 Yule-Walker ℓ2,1
3 Yule-Walker ℓ1,1
4 Least-squares solutions to Yule-Walker and maximum likelihood

[Figures, six slides: estimated parameter vectors over lags 1–200 for True Parameters, Regularized ML, ML Least Squares, OMP, Yule-Walker ℓ2,1, Yule-Walker Least Squares, and Yule-Walker ℓ1,1 under several experimental settings. The axis ranges indicate that the unregularized Yule-Walker least-squares estimates can be wildly off scale (up to ±500), while the regularized estimates remain near the true ±0.2 range.]

[Figure, log-log axes (x: 10¹–10⁵, y: 10⁻²–10²): estimation error as the sample size grows, for Regularized ML, Least Squares, Yule-Walker ℓ1,1, Yule-Walker, Regularized ML + OMP, Yule-Walker ℓ2,1, and OMP + Yule-Walker.]

1 Crude oil price: Cushing, OK WTI Spot Price FOB dataset.
2 The dataset consists of 7429 daily values.
3 Outliers removed by visual inspection; n = 4000.
4 Long-memory time series → first-order differencing (sketch below).
5 Model order selection is of low importance here.

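A sketch of this preprocessing; the file name is hypothetical and the slice stands in for the visual outlier removal:

```python
import numpy as np

prices = np.loadtxt("wti_spot_price.txt")   # hypothetical path, 7429 values
prices = prices[:4000]                      # stand-in for the cleaned n = 4000
dx = np.diff(prices)                        # first-order differencing
theta_oil = lasso_ar(dx, p=100, lam=0.05)   # reuse the earlier LASSO sketch
```
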
1 First-order differences show Gaussian behavior.
2 Given no outliers, our method predicts a sudden change in prices.
3 Yule-Walker is bad!
4 Greedy is good!

1 Minimax estimation risk over the class of stationary processes above: …
2 The minimax estimator: …
3 Typically cannot be constructed → interested in estimators that are optimal in order.
4 Can also define the minimax prediction risk (both risks are written out below): …

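A standard way to write these two risks; the loss and the class Θ are assumptions on our part, since the slide's formulas did not survive:

```latex
\mathcal{R}_{\mathrm{est}}(\Theta)
  = \inf_{\widehat{\theta}}\,\sup_{\theta\in\Theta}\;
    \mathbb{E}\,\bigl\|\widehat{\theta}-\theta\bigr\|_2^2,
\qquad
\mathcal{R}_{\mathrm{pred}}(\Theta)
  = \inf_{\widehat{\theta}}\,\sup_{\theta\in\Theta}\;
    \mathbb{E}\,\bigl(x_{n+1}-\widehat{\theta}^{\top}\mathbf{x}_{n-p+1}^{\,n}\bigr)^{2},
```

where x_{n−p+1}^n collects the last p observed samples.
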
1 ℓ2-regularized LS problem: [Goldenshluger & Zeevi 2001].
2 Slightly weaker exponential inequality.
3 p⋆ = ⌊−1/2 log(1 − ǫ) log n⌋ is minimax optimal.
4 Requires a sample size exponentially large in p.
5 Our result: the ℓ1-regularized LS estimator is minimax optimal.
6 Can afford higher orders.

1 For large n, the prediction error variance is very close to the variance of the innovations.

1 Define the event: …
2 …

1 …
2 For n > c_ǫ·s·p^{2/3}(log p)^{2/3}, the first term will be the dominant factor.

1 Assumption: Gaussian innovations
1 Class Z of AR processes defined over a fixed subset of coordinates.
2 Add the all-zero vector θ to Z.
3 …

1 …
2 …
3 Arbitrary estimate …

1 Markov's inequality: …

1 f_{θ_i}: joint pdf of {x_k}_{k=1}^n conditioned on the initial samples {x_k}_{k=−p+1}^0.
2 With Gaussian innovations, for i ≠ j: …

1 Using Fano's inequality: …
2 Choose η = ǫ² and N = log n, for large enough s and n.
3 Any θ ∈ Z satisfies ‖θ‖₁ ≤ 1 − ǫ.

1 The residuals (estimated innovations) of the process: e_k = x_k − Σ_j θ̂_j x_{k−j}.
2 Goal: quantify how close the sequence {e_i}_{i=1}^n is to an i.i.d. Gaussian sequence.
3 …

1 Kolmogorov-Smirnov (KS) test statistic
2 Cramér-von Mises (CvM) statistic
3 Anderson-Darling (AD) statistic (a sketch applying all three follows)

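All three tests are available in SciPy; a sketch applying them to the fitted residuals (the studentization step is our choice, not necessarily the slides'):

```python
import numpy as np
from scipy import stats

def residuals(x, theta):
    """e_k = x_k - sum_j theta_j x_{k-j}."""
    p = len(theta)
    n = len(x) - p
    X = np.column_stack([x[p - j - 1:p - j - 1 + n] for j in range(p)])
    return x[p:] - X @ theta

e = residuals(x, theta_l1)
e_std = (e - e.mean()) / e.std()
print(stats.kstest(e_std, "norm"))           # Kolmogorov-Smirnov
print(stats.cramervonmises(e_std, "norm"))   # Cramer-von Mises
print(stats.anderson(e_std, dist="norm"))    # Anderson-Darling
```
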
1 KS: …
2 CvM: …
3 AD: …

1 Based on the similarities of the spectrogram of the data and the fitted model, one can also define:
2 Spectral KS, CvM, and AD tests …