

  1. Lecture 2. Random Matrix Theory and Phase Transitions of PCA. Yuan Yao, Hong Kong University of Science and Technology, February 26, 2020

  2. Outline
  ◮ Recall: Horn's Parallel Analysis of PCA
  ◮ Random Matrix Theory
  ◮ Phase Transitions of PCA

  3. How many components of PCA?
  ◮ Data matrix: X = [x_1 | x_2 | ··· | x_n] ∈ R^{p×n}
  ◮ Centered data matrix: Y = XH, where H = I − (1/n) 1·1^T
  ◮ PCA is given by the top left singular vectors of Y = USV^T (called loading vectors), with projections z_j = u_j^T Y
  ◮ MDS is given by the top right singular vectors of Y = USV^T, used as Euclidean embedding coordinates of the n sample points
  ◮ But how many components shall we keep?
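  To make this concrete, here is a minimal Python sketch (NumPy only; the matrix names X, H, Y, U, S, V follow the slide, while the sizes and all other names are illustrative) of centering, PCA loadings/scores, and MDS coordinates via one SVD:

      import numpy as np

      rng = np.random.default_rng(0)
      p, n = 10, 200
      X = rng.standard_normal((p, n))      # data matrix, columns are samples

      # Centering: Y = X H with H = I - (1/n) 1 1^T
      H = np.eye(n) - np.ones((n, n)) / n
      Y = X @ H                            # equals X - X.mean(axis=1, keepdims=True)

      # One SVD gives both PCA and MDS
      U, S, Vt = np.linalg.svd(Y, full_matrices=False)
      k = 2
      loadings = U[:, :k]                  # top-k left singular vectors (loading vectors)
      scores = loadings.T @ Y              # projections z_j = u_j^T Y, one row per component
      mds = S[:k, None] * Vt[:k, :]        # scaled top right singular vectors: embedding of n points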

  4. Recall: Horn's Parallel Analysis
  ◮ Data matrix: X = [x_1 | x_2 | ··· | x_n] ∈ R^{p×n},

        X = [ X_{1,1}  X_{1,2}  ···  X_{1,n} ]
            [ X_{2,1}  X_{2,2}  ···  X_{2,n} ]
            [   ···      ···    ···    ···   ]
            [ X_{p,1}  X_{p,2}  ···  X_{p,n} ]

  ◮ Compute its principal eigenvalues {λ̂_i}_{i=1,...,p}

  5. Recall: Horn's Parallel Analysis
  ◮ Randomly take p permutations of the n numbers, π_1, ..., π_p ∈ S_n (usually π_1 is set to the identity), and permute each row of X separately, noting that sample means are permutation invariant:

        X^1 = [ X_{1,π_1(1)}  X_{1,π_1(2)}  ···  X_{1,π_1(n)} ]
              [ X_{2,π_2(1)}  X_{2,π_2(2)}  ···  X_{2,π_2(n)} ]
              [      ···           ···      ···       ···     ]
              [ X_{p,π_p(1)}  X_{p,π_p(2)}  ···  X_{p,π_p(n)} ]

  ◮ Compute its principal eigenvalues {λ̂_i^1}_{i=1,...,p}.
  ◮ Repeating this procedure r times, we get r sets of principal eigenvalues {λ̂_i^k}_{i=1,...,p} for k = 1, ..., r

  6. Recall: Horn's Parallel Analysis (continued)
  ◮ For each i = 1, ..., p, define the i-th p-value as the fraction of random eigenvalues {λ̂_i^k}_{k=1,...,r} that exceed the i-th principal eigenvalue λ̂_i of the original data X:

        pval_i = (1/r) #{ k = 1, ..., r : λ̂_i^k > λ̂_i }.

  ◮ Set up a threshold q, e.g. q = 0.05, and only keep those principal eigenvalues λ̂_i such that pval_i < q
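  A minimal NumPy sketch of this whole procedure (the function name parallel_analysis and the defaults r = 100, q = 0.05 are illustrative, not from the slides):

      import numpy as np

      def parallel_analysis(X, r=100, q=0.05, seed=0):
          """Horn's Parallel Analysis: keep components whose eigenvalues
          beat the permutation null in at least a (1 - q) fraction of r trials."""
          rng = np.random.default_rng(seed)
          p, n = X.shape

          def cov_eigs(A):
              Y = A - A.mean(axis=1, keepdims=True)   # center each variable
              s = np.linalg.svd(Y, compute_uv=False)  # singular values, sorted descending
              return s**2 / n                         # eigenvalues of (1/n) Y Y^T

          lam = cov_eigs(X)                           # principal eigenvalues of the data
          exceed = np.zeros_like(lam)
          for _ in range(r):
              # permute each row independently (sample means are permutation invariant)
              Xperm = np.array([row[rng.permutation(n)] for row in X])
              exceed += cov_eigs(Xperm) > lam
          pvals = exceed / r                          # pval_i = (1/r) #{ k : lam_i^k > lam_i }
          return np.flatnonzero(pvals < q), pvals     # indices kept, and all p-values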

  7. Example
  ◮ Let's look at an example of Parallel Analysis
    – R: https://github.com/yuany-pku/2017_CSIC5011/blob/master/slides/paran.R
    – Matlab: papca.m
    – Python:

  8. How does it work?
  ◮ We are going to introduce an analysis of the rank-one spike model based on Random Matrix Theory
  ◮ There is a phase transition in principal component analysis
    – If the signal is strong, the principal eigenvalues go beyond the random spectrum and the principal components are correlated with the signal
    – If the signal is weak, all eigenvalues in PCA are due to random noise

  12. Outline
  ◮ Recall: Horn's Parallel Analysis of PCA
  ◮ Random Matrix Theory
  ◮ Phase Transitions of PCA

  13. Marčenko-Pastur Distribution of Noise Eigenvalues
  ◮ Let x_i ∼ N(0, I_p) (i = 1, ..., n) and X = [x_1, x_2, ..., x_n] ∈ R^{p×n}.
  ◮ The sample covariance matrix Σ̂_n = (1/n) X X^T is called a Wishart (random) matrix.
  ◮ When both n and p grow with p/n → γ ≠ 0, the distribution of the eigenvalues of Σ̂_n follows the Marčenko-Pastur (MP) law

        μ_MP(dt) = (1 − 1/γ) I(γ > 1) δ_0(dt) + ( √((b − t)(t − a)) / (2πγt) ) I(t ∈ [a, b]) dt,

    where a = (1 − √γ)² and b = (1 + √γ)².

  14. Illustration of MP Law
  ◮ If γ ≤ 1, the MP distribution is supported on [a, b];
  ◮ if γ > 1, it has an additional point mass 1 − 1/γ at the origin.
  [Figure (Matlab): (a) Marčenko-Pastur distribution with γ = 2; (b) Marčenko-Pastur distribution with γ = 0.5.]
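  A short Python sketch (NumPy/Matplotlib; all names are illustrative) reproduces such a figure by overlaying the MP density on the eigenvalue histogram of a Wishart matrix:

      import numpy as np
      import matplotlib.pyplot as plt

      def mp_density(t, gamma):
          """Marchenko-Pastur density on [a, b] for aspect ratio gamma = p/n."""
          a, b = (1 - np.sqrt(gamma))**2, (1 + np.sqrt(gamma))**2
          dens = np.zeros_like(t)
          inside = (t >= a) & (t <= b)
          dens[inside] = np.sqrt((b - t[inside]) * (t[inside] - a)) / (2 * np.pi * gamma * t[inside])
          return dens

      rng = np.random.default_rng(0)
      n, gamma = 2000, 0.5
      p = int(gamma * n)
      X = rng.standard_normal((p, n))
      eigs = np.linalg.eigvalsh(X @ X.T / n)        # eigenvalues of the Wishart matrix

      t = np.linspace(1e-3, (1 + np.sqrt(gamma))**2 + 0.5, 400)
      plt.hist(eigs, bins=50, density=True, alpha=0.5, label="empirical")
      plt.plot(t, mp_density(t, gamma), label="MP density, gamma = 0.5")
      plt.legend(); plt.show()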

  15. Outline
  ◮ Recall: Horn's Parallel Analysis of PCA
  ◮ Random Matrix Theory
  ◮ Phase Transitions of PCA

  16. Rank-one Spike Model
  Consider the following rank-1 signal-plus-noise model:

        Y = X + ε,

  where
  ◮ the signal lies in a one-dimensional subspace, X = αu with α ∼ N(0, σ_X²);
  ◮ the noise ε ∼ N(0, σ_ε² I_p) is i.i.d. Gaussian.
  Therefore Y ∼ N(0, Σ), where the covariance matrix Σ is a rank-one matrix plus a multiple of the identity:

        Σ = σ_X² uu^T + σ_ε² I_p.

  17. When does PCA work?
  ◮ Can we recover the signal direction u by principal component analysis on the noisy measurements Y?
  ◮ It depends on the signal-to-noise ratio, defined as

        SNR = R := σ_X² / σ_ε².

    For simplicity we assume σ_ε² = 1, without loss of generality.

  18. Phase Transition of PCA
  ◮ Consider the proportional-growth scenario

        γ = lim_{p,n→∞} p/n,                                              (1)

    reflecting that in applications one never has an infinite number of samples relative to the dimensionality.
  ◮ A fundamental result by I. Johnstone in 2006 shows a phase transition of PCA:

  19. Phase Transitions
  ◮ The primary (largest) eigenvalue of the sample covariance matrix satisfies

        λ_max(Σ̂_n) → (1 + √γ)² = b               if σ_X² ≤ √γ,
        λ_max(Σ̂_n) → (1 + σ_X²)(1 + γ/σ_X²)      if σ_X² > √γ.            (2)

  ◮ The primary eigenvector (principal component) v_max associated with the largest eigenvalue satisfies

        |⟨u, v_max⟩|² → 0                                   if σ_X² ≤ √γ,
        |⟨u, v_max⟩|² → (1 − γ/σ_X⁴) / (1 + γ/σ_X²)         if σ_X² > √γ.  (3)

  20. Phase Transitions (continued)
  In other words,
  ◮ if the signal is strong, SNR = σ_X² > √γ, the primary eigenvalue goes beyond the random spectrum (the upper bound of the MP distribution), and the primary eigenvector is correlated with the signal (it lies in a cone around the signal direction whose deviation angle goes to 0 as σ_X²/γ → ∞);
  ◮ if the signal is weak, SNR = σ_X² ≤ √γ, the primary eigenvalue is buried in the random spectrum, and the primary eigenvector is random, carrying no correlation with the signal.
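  A quick simulation of the rank-one spike model confirms both limits; the sketch below (NumPy, σ_ε² = 1, all names illustrative) compares the empirical top eigenvalue and squared correlation against the predictions of (2) and (3) in the strong-signal regime:

      import numpy as np

      rng = np.random.default_rng(0)
      p, n = 500, 1000                      # gamma = p/n = 0.5
      gamma = p / n
      sigma2_X = 2.0                        # above the threshold sqrt(gamma) ~ 0.707

      u = np.zeros(p); u[0] = 1.0           # unit signal direction
      alpha = rng.normal(0, np.sqrt(sigma2_X), n)
      Y = np.outer(u, alpha) + rng.standard_normal((p, n))   # rank-one spike + noise

      Sigma_n = Y @ Y.T / n
      lam, V = np.linalg.eigh(Sigma_n)
      lam_max, v_max = lam[-1], V[:, -1]

      # Predictions from the phase-transition formulas (2) and (3)
      lam_pred = (1 + sigma2_X) * (1 + gamma / sigma2_X)
      cos2_pred = (1 - gamma / sigma2_X**2) / (1 + gamma / sigma2_X)

      print(f"lambda_max: empirical {lam_max:.3f} vs predicted {lam_pred:.3f}")
      print(f"|<u, v_max>|^2: empirical {np.dot(u, v_max)**2:.3f} vs predicted {cos2_pred:.3f}")

  Taking σ_X² below √γ instead shows the other phase: the top eigenvalue stays near b = (1 + √γ)² and the squared correlation drops to about 0.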

  21. Proof in Sketch
  ◮ Following the rank-1 model, consider random vectors y_i ∼ N(0, Σ) (i = 1, ..., n), where Σ = σ_X² uu^T + σ_ε² I_p and u is an arbitrarily chosen unit vector (‖u‖₂ = 1) giving the signal direction.
  ◮ The sample covariance matrix is Σ̂_n = (1/n) Σ_{i=1}^n y_i y_i^T = (1/n) Y Y^T, where Y = [y_1, ..., y_n] ∈ R^{p×n}. Suppose one of its eigenvalues is λ̂ with corresponding unit eigenvector v̂, so that Σ̂_n v̂ = λ̂ v̂.
  ◮ First of all, we relate λ̂ to the MP distribution by the whitening trick

        z_i = Σ^{−1/2} y_i ∼ N(0, I_p).                                   (4)

    Then S_n = (1/n) Σ_{i=1}^n z_i z_i^T = (1/n) Z Z^T (with Z = [z_1, ..., z_n]) is a Wishart random matrix whose eigenvalues follow the Marčenko-Pastur distribution.

  22. Proof in Sketch
  ◮ Notice that Σ̂_n = (1/n) Y Y^T = Σ^{1/2} ((1/n) Z Z^T) Σ^{1/2} = Σ^{1/2} S_n Σ^{1/2}, and (λ̂, v̂) is an eigenvalue-eigenvector pair of Σ̂_n. Therefore

        Σ^{1/2} S_n Σ^{1/2} v̂ = λ̂ v̂  ⟹  S_n Σ (Σ^{−1/2} v̂) = λ̂ (Σ^{−1/2} v̂).   (5)

    In other words, λ̂ and Σ^{−1/2} v̂ are an eigenvalue and eigenvector of the matrix S_n Σ.
  ◮ Let v = c Σ^{−1/2} v̂, where the constant c makes v a unit vector; since Σ^{1/2} v = c v̂, it satisfies

        c² = c² v̂^T v̂ = v^T Σ v = v^T (σ_X² uu^T + σ_ε² I_p) v = σ_X² (u^T v)² + σ_ε² = R (u^T v)² + 1.   (6)

  23. Proof in Sketch
  Now we have

        S_n Σ v = λ̂ v.                                                    (7)

  Plugging in the expression for Σ gives

        S_n (σ_X² uu^T + σ_ε² I_p) v = λ̂ v.

  Rearranging the term with u to one side, we get

        (λ̂ I_p − σ_ε² S_n) v = σ_X² S_n u (u^T v).

  Assuming that λ̂ I_p − σ_ε² S_n is invertible, multiplying both sides by its inverse gives

        v = σ_X² (λ̂ I_p − σ_ε² S_n)^{−1} S_n u (u^T v).                   (8)

  24. Primary Eigenvalue λ̂
  ◮ Multiplying (8) by u^T on both sides,

        u^T v = σ_X² u^T (λ̂ I_p − σ_ε² S_n)^{−1} S_n u · (u^T v),

    that is, if u^T v ≠ 0,

        1 = σ_X² u^T (λ̂ I_p − σ_ε² S_n)^{−1} S_n u.                       (9)
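  Identity (9) is exact finite-sample linear algebra (not only asymptotic), so it can be checked numerically; a minimal sketch under the model of slide 21 with σ_ε² = 1 (all variable names illustrative):

      import numpy as np

      rng = np.random.default_rng(0)
      p, n, sigma2_X = 50, 200, 2.0

      u = np.zeros(p); u[0] = 1.0                       # unit signal direction
      Sigma = sigma2_X * np.outer(u, u) + np.eye(p)     # sigma_eps^2 = 1

      # Matrix square roots via the spectral decomposition of Sigma
      w, Q = np.linalg.eigh(Sigma)
      Sigma_half = Q @ np.diag(np.sqrt(w)) @ Q.T

      Z = rng.standard_normal((p, n))                   # z_i ~ N(0, I_p)
      Y = Sigma_half @ Z                                # y_i ~ N(0, Sigma)
      S_n = Z @ Z.T / n                                 # Wishart matrix with MP spectrum

      Sigma_n = Y @ Y.T / n
      lam_hat = np.linalg.eigvalsh(Sigma_n)[-1]         # primary eigenvalue

      # Identity (9): 1 = sigma_X^2 u^T (lam I_p - S_n)^{-1} S_n u
      lhs = sigma2_X * u @ np.linalg.solve(lam_hat * np.eye(p) - S_n, S_n @ u)
      print(lhs)                                        # ~ 1.0 up to numerical error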
