  1. Spiked Eigenvalues of High Dimensional Separable Sample Covariance Matrices
Guangming Pan, Nanyang Technological University, Singapore
November 19, 2019
Guangming Pan, (USTC) Spiked Eigenvalues of High Dimensional Separable Sample Covariance Matrices November 19, 2019 1 / 75

  2. Outline
1. Motivation
   Misleading PCA on Simulated Data
2. High Dimensional Separable Covariance Model
3. Asymptotic Performance of Largest Eigenvalues
4. Inference on High Dimensional Time Series
   Implementing Factor Analysis on Our Model
   Unit Root Models Satisfying Assumption 4
   A New Test for Unit Root against Factor Model
   More Thoughts about Panel Data Structures
5. Simulations
   The Simulation about Proposition 3
   The Simulation about Proposition 4
6. Reference


  4. The model
y_{it} = ℓ_{i1} f_{1t} + ℓ_{i2} f_{2t} + ε_{it} = ℓ_i^* f_t + ε_{it},  i = 1, 2, ..., n;  t = 1, 2, ..., T,  (1)
where f_t = (f_{1t}, f_{2t})^* collects the two common factors, ℓ_i = (ℓ_{i1}, ℓ_{i2})^* collects the corresponding factor loadings, and ε_{it} is the error component; the symbol "^*" denotes the conventional conjugate transpose.
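As a quick sketch (my own toy draws, not data from the talk), model (1) can be simulated and its spiked sample eigenvalues inspected; the Gaussian loadings, factors, errors, and the seed below are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
n, T, k = 40, 20, 2

# Illustrative (assumed) draws: Gaussian loadings, factors, and errors.
loadings = rng.standard_normal((n, k))    # rows are l_i = (l_i1, l_i2)
factors = rng.standard_normal((k, T))     # columns are f_t = (f_1t, f_2t)
errors = rng.standard_normal((n, T))

Y = loadings @ factors + errors           # y_it = l_i^* f_t + eps_it

# With genuine common factors, the sample covariance matrix typically
# shows k spiked eigenvalues separated from the bulk.
S = Y @ Y.T / T
ev = np.sort(np.linalg.eigvalsh(S))[::-1]
print(ev[:4])
```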

  5. Scenario: no true common factors
In this case the factor loadings are generated as ℓ_i = (0, 0)^*. When the original data follow an AR(1) model (γ = 0.2), Figures 1 and 2 display all eigenvalues of the sample covariance matrix for (T, n) = (20, 40) and (T, n) = (40, 20), respectively. There are no spiked eigenvalues in these graphs, which correctly reflects the fact that there are no common factors in the original data.
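This scenario can be reproduced in a few lines; the numpy sketch below (the seed and burn-in length are my choices, not the talk's) simulates the factor-free AR(1) panel and lists the top sample eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(0)

def ar1_panel(T, n, gamma, burn=200):
    """Simulate n independent AR(1) series y_{it} = gamma*y_{i,t-1} + eps_{it},
    i.e. a panel with no common factors; returns an n x T matrix."""
    eps = rng.standard_normal((n, T + burn))
    y = np.zeros((n, T + burn))
    for t in range(1, T + burn):
        y[:, t] = gamma * y[:, t - 1] + eps[:, t]
    return y[:, burn:]                 # discard the burn-in

T, n, gamma = 20, 40, 0.2
Y = ar1_panel(T, n, gamma)
S = Y @ Y.T / T                        # n x n sample covariance matrix
ev = np.sort(np.linalg.eigvalsh(S))[::-1]
print(ev[:5])                          # largest eigenvalues: no isolated spike
```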

  6. Figure 1: T = 20, n = 40, γ = 0.2

  7. Figure 2: T = 40, n = 20, γ = 0.2

  8. Figure 3: T = 20, n = 40, γ = 1

  9. Figure 4: T = 40, n = 20, γ = 1

  10. Scenario: no true common factors
However, when the data observations are nonstationary (γ = 1), Figures 3 and 4 show one spiked eigenvalue of the sample covariance matrix, even though the true number of common factors is 0. This example demonstrates that PCA may give inaccurate information on high dimensional data with dependent sample observations.
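A minimal sketch of the nonstationary case under assumed Gaussian errors; with γ = 1 each series is a random walk, and the top eigenvalue ratio is typically large even though there are no factors at all:

```python
import numpy as np

rng = np.random.default_rng(0)

T, n = 20, 40
# gamma = 1: each series is a pure random walk started at 0, no common factors.
Y = np.cumsum(rng.standard_normal((n, T)), axis=1)   # y_{it} = y_{i,t-1} + eps_{it}

S = Y @ Y.T / T
ev = np.sort(np.linalg.eigvalsh(S))[::-1]
print(ev[0] / ev[1])     # this ratio is typically large: one spurious "spike"
                         # despite the true number of factors being zero
```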


  12. High Dimensional Separable Covariance Model
Consider an n-dimensional random vector y with observations y_1, y_2, ..., y_T. Pool all observations together into a T × n matrix Y = (y_1, y_2, ..., y_T)^*. The data matrix Y has the structure
Y = ΓXΩ^{1/2},  (2)
where X = (x_1, ..., x_n) = (x_{ij}) is a (T + L) × n random matrix with i.i.d. elements, Γ is a T × (T + L) deterministic matrix, and Σ = ΓΓ^* and Ω are T × T and n × n deterministic non-negative definite matrices, respectively.
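The structure in (2) can be sketched numerically; the particular choices below (a random Γ and an AR(1)-type Ω) are illustrative assumptions, not choices made in the talk:

```python
import numpy as np

rng = np.random.default_rng(1)
T, L, n = 40, 5, 30

# Illustrative (assumed) choices: a random Gamma and an AR(1)-type Omega.
Gamma = rng.standard_normal((T, T + L)) / np.sqrt(T + L)    # T x (T+L)
idx = np.arange(n)
Omega = 0.5 ** np.abs(np.subtract.outer(idx, idx))          # n x n, pos. definite

# i.i.d. entries scaled so that E|sqrt(n) x_ij|^2 = 1, i.e. Var(x_ij) = 1/n.
X = rng.standard_normal((T + L, n)) / np.sqrt(n)

# Omega^{1/2} via the eigendecomposition of the nonnegative definite Omega.
w, V = np.linalg.eigh(Omega)
Omega_half = (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

Y = Gamma @ X @ Omega_half      # the T x n data matrix of (2)
S = Y @ Y.T                     # separable sample covariance Gamma X Omega X* Gamma*
print(S.shape)
```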

  13. Separable covariance matrix
The matrix Γ describes dependence among the sample observations, while the matrix Ω measures cross-sectional dependence of y. Under this setting the sample covariance matrix of y can be expressed as ΓXΩX^*Γ^*, which is also called a separable covariance matrix.

  14. Largest spiked eigenvalues
We are interested in the largest spiked eigenvalues of the matrix Ω, which describes the cross-sectional dependence. In the classical PCA procedure, spiked empirical eigenvalues of the sample covariance matrix ΓXΩX^*Γ^* are used to approximate those of the matrix Ω. In this paper we investigate the spiked empirical eigenvalues from a new angle: how do the spiked eigenvalues of the matrix Σ (arising from the dependent sample) affect the spiked sample eigenvalues? To this end, we do not impose any spiked structure on the matrix Ω.

  15. Spikiness of the matrix Σ
We assume spikiness of the matrix Σ through the following decomposition. Let the spectral decomposition of Γ be VΛ^{1/2}U, where V and U are T × T and T × (T + L) orthogonal matrices respectively (VV^* = UU^* = I), and Λ is the diagonal matrix of the eigenvalues of Σ = ΓΓ^* in descending order. Moreover, we write Λ = diag(Λ_S, Λ_P) in block-diagonal form, where Λ_S = diag(µ_1, ..., µ_K), Λ_P = diag(µ_{K+1}, ..., µ_T), and µ_1, ..., µ_K are referred to as the spiked eigenvalues, significantly bigger than the rest. In addition, we partition U = (U_1; U_2) conformally (stacking U_1, the first K rows, above U_2) and write Σ_2 = U_2^* Λ_P U_2.
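The decomposition above can be illustrated numerically; the spiked spectrum (two spikes at 50 and 30 above a flat bulk) and the random orthogonal V and U below are hypothetical choices for the sketch:

```python
import numpy as np

rng = np.random.default_rng(2)
T, L, K = 30, 5, 2

# Hypothetical spiked spectrum: K large eigenvalues, then a moderate bulk.
mu = np.concatenate([[50.0, 30.0], np.linspace(2.0, 1.0, T - K)])
Lam = np.diag(mu)

# Random orthogonal V (T x T) via QR, and U as the first T rows of a
# (T+L) x (T+L) orthogonal matrix, so that U U* = I_T.
V, _ = np.linalg.qr(rng.standard_normal((T, T)))
Q, _ = np.linalg.qr(rng.standard_normal((T + L, T + L)))
U = Q[:T, :]                             # T x (T+L)

Gamma = V @ np.sqrt(Lam) @ U             # Gamma = V Lam^{1/2} U
Sigma = Gamma @ Gamma.T                  # equals V Lam V*, spectrum mu
eig = np.sort(np.linalg.eigvalsh(Sigma))[::-1]
print(eig[:3])                           # approx [50, 30, 2]: spikes survive
```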


  17. Asymptotic Performance of Largest Eigenvalues
This section establishes the asymptotic distribution of the largest spiked empirical eigenvalues. First, we make the following assumptions.
Assumption 1 (Moment Conditions). {x_ij : i = 1, ..., T + L; j = 1, ..., n} are i.i.d. random variables such that E x_ij = 0, E|√n x_ij|² = 1, and E|√n x_ij|⁴ = γ_4 < ∞.

  18. Assumption 2
Assumption 2 (Dependent Sample Structure). α_L = µ_K = ··· = µ_{K−n_L+1} < α_{L−1} = µ_{K−n_L} = ··· < ··· < α_1 = µ_{n_1} = ··· = µ_1, where n_1, ..., n_L are finite. Moreover, there exists a small constant c > 0 such that α_{i−1} − α_i ≥ c α_i for i = 2, ..., L, and µ_K − µ_{K+1} ≥ c µ_K.

  19. Assumption 3
Assumption 3 (Cross-sectional Structure). The matrix Ω is nonnegative definite and its effective rank r*(Ω) = tr(Ω)/‖Ω‖₂ → ∞, where ‖Ω‖₂ denotes the spectral norm.
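A small sketch of the effective-rank condition; the AR(1)-type Ω below is an assumed example for which r*(Ω) grows linearly with n, so Assumption 3 holds along this sequence:

```python
import numpy as np

def effective_rank(Omega):
    """Effective rank r*(Omega) = tr(Omega) / ||Omega||_2 (spectral norm)."""
    return np.trace(Omega) / np.linalg.norm(Omega, 2)

# Assumed AR(1)-type cross-sectional covariance: trace = n while the
# spectral norm stays bounded, so r*(Omega) -> infinity as n grows.
for n in (50, 200, 800):
    idx = np.arange(n)
    Omega = 0.5 ** np.abs(np.subtract.outer(idx, idx))
    print(n, effective_rank(Omega))
```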

  20. Assumption 4
Assumption 4 (Spiked Dependent Sample Structure). The spiked eigenvalues of the population covariance matrix are much bigger than the remaining eigenvalues. More precisely, for every ε > 0 there is a K_ε, independent of n and T, such that when n and T are large enough,
(Σ_{i=K_ε}^{T} µ_i) / µ_K < ε².  (3)
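Condition (3) can be checked numerically for a hypothetical spectrum; the two spikes and the i^{-2} tail below are assumptions chosen purely for illustration:

```python
import numpy as np

T, K = 1000, 2
# Hypothetical spectrum: two spikes, then a fast-decaying tail mu_i ~ i^{-2}.
mu = np.concatenate([[50.0, 30.0], 1.0 / np.arange(1, T - K + 1) ** 2])

eps = 0.1
mu_K = mu[K - 1]
# tails[i] = sum_{j >= i} mu_j (0-based); find the smallest 1-indexed K_eps
# with sum_{i = K_eps}^{T} mu_i < eps^2 * mu_K, as required by (3).
tails = np.cumsum(mu[::-1])[::-1]
K_eps = int(np.argmax(tails < eps**2 * mu_K)) + 1
print(K_eps, tails[K_eps - 1] / mu_K)   # tail mass is below eps^2 from K_eps on
```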
