SLIDE 1

Lecture 2. Random Matrix Theory and Phase Transitions of PCA

Yuan Yao
Hong Kong University of Science and Technology
February 26, 2020

slide-2
SLIDE 2

Outline

Recall: Horn’s Parallel Analysis of PCA
Random Matrix Theory
Phase Transitions of PCA

Recall: Horn’s Parallel Analysis of PCA 2

SLIDE 3

How many components of PCA?

◮ Data matrix: $X = [x_1|x_2|\cdots|x_n] \in \mathbb{R}^{p\times n}$

◮ Centered data matrix: $Y = XH$, where
$$H = I - \frac{1}{n}\mathbf{1}\cdot\mathbf{1}^T$$

◮ PCA is given by the top left singular vectors of $Y = USV^T$ (called loading vectors), with projections $z_j = u_j^T Y$

◮ MDS is given by the top right singular vectors of $Y = USV^T$, as the Euclidean embedding coordinates of the $n$ sample points

◮ But how many components shall we keep?

SLIDE 4

Recall: Horn’s Parallel Analysis

◮ Data matrix: $X = [x_1|x_2|\cdots|x_n] \in \mathbb{R}^{p\times n}$,
$$X = \begin{pmatrix} X_{1,1} & X_{1,2} & \cdots & X_{1,n} \\ X_{2,1} & X_{2,2} & \cdots & X_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ X_{p,1} & X_{p,2} & \cdots & X_{p,n} \end{pmatrix}.$$

◮ Compute its principal eigenvalues $\{\hat\lambda_i\}_{i=1,\dots,p}$

SLIDE 5

Recall: Horn’s Parallel Analysis

◮ Randomly take $p$ permutations of $n$ numbers, $\pi_1,\dots,\pi_p \in S_n$ (usually $\pi_1$ is set to the identity), noting that sample means are permutation invariant:
$$X^1 = \begin{pmatrix} X_{1,\pi_1(1)} & X_{1,\pi_1(2)} & \cdots & X_{1,\pi_1(n)} \\ X_{2,\pi_2(1)} & X_{2,\pi_2(2)} & \cdots & X_{2,\pi_2(n)} \\ \vdots & \vdots & \ddots & \vdots \\ X_{p,\pi_p(1)} & X_{p,\pi_p(2)} & \cdots & X_{p,\pi_p(n)} \end{pmatrix}.$$

◮ Compute its principal eigenvalues $\{\hat\lambda^1_i\}_{i=1,\dots,p}$.

◮ Repeat this procedure $r$ times to get $r$ sets of principal eigenvalues $\{\hat\lambda^k_i\}_{i=1,\dots,p}$ for $k = 1,\dots,r$.

SLIDE 6

Recall: Horn’s Parallel Analysis (continued)

◮ For each $i = 1,\dots,p$, define the $i$-th p-value as the fraction of the random eigenvalues $\{\hat\lambda^k_i\}_{k=1,\dots,r}$ that exceed the $i$-th principal eigenvalue $\hat\lambda_i$ of the original data $X$:
$$\mathrm{pval}_i = \frac{1}{r}\,\#\{k = 1,\dots,r : \hat\lambda^k_i > \hat\lambda_i\}.$$

◮ Set up a threshold $q$, e.g. $q = 0.05$, and keep only those principal eigenvalues $\hat\lambda_i$ such that $\mathrm{pval}_i < q$.
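The whole procedure fits in a few lines of NumPy (a minimal sketch; the function name `parallel_analysis` and the defaults `r=100`, `q=0.05` are illustrative, not taken from the lecture's own scripts):

```python
import numpy as np

def parallel_analysis(X, r=100, q=0.05, seed=0):
    """Horn's parallel analysis: keep components whose eigenvalue
    beats the eigenvalues of row-permuted (signal-destroyed) data."""
    rng = np.random.default_rng(seed)
    p, n = X.shape
    Y = X - X.mean(axis=1, keepdims=True)                 # center each variable
    lam = np.sort(np.linalg.eigvalsh(Y @ Y.T / n))[::-1]  # principal eigenvalues
    exceed = np.zeros(p)
    for _ in range(r):
        # permute each row independently, destroying cross-row correlation
        Xp = np.array([row[rng.permutation(n)] for row in X])
        Yp = Xp - Xp.mean(axis=1, keepdims=True)
        lam_k = np.sort(np.linalg.eigvalsh(Yp @ Yp.T / n))[::-1]
        exceed += lam_k > lam
    pval = exceed / r                                     # pval_i as on the slide
    return int(np.sum(pval < q))                          # number of components kept
```

On data with one strong rank-1 spike, the first p-value is essentially zero and the component is kept.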

SLIDE 7

Example

◮ Let’s look at an example of Parallel Analysis

– R: https://github.com/yuany-pku/2017_CSIC5011/blob/master/slides/paran.R
– Matlab: papca.m
– Python:

SLIDE 8

How does it work?

◮ We are going to introduce an analysis based on Random Matrix Theory for the rank-one spike model.

◮ There is a phase transition in principal component analysis:

– If the signal is strong, principal eigenvalues go beyond the random spectrum and the principal components are correlated with the signal.
– If the signal is weak, all eigenvalues in PCA are due to random noise.

SLIDE 12

Outline

Recall: Horn’s Parallel Analysis of PCA
Random Matrix Theory
Phase Transitions of PCA

Random Matrix Theory 9

SLIDE 13

Marčenko-Pastur Distribution of Noise Eigenvalues

◮ Let $x_i \sim N(0, I_p)$ $(i = 1,\dots,n)$ and $X = [x_1, x_2, \dots, x_n] \in \mathbb{R}^{p\times n}$.

◮ The sample covariance matrix
$$\hat\Sigma_n = \frac{1}{n} X X^T$$
is called a Wishart (random) matrix.

◮ When both $n$ and $p$ grow with $p/n \to \gamma \in (0,\infty)$, the distribution of the eigenvalues of $\hat\Sigma_n$ follows the Marčenko-Pastur (MP) law
$$\mu_{MP}(dt) = \left(1 - \frac{1}{\gamma}\right)\delta(t)\, I(\gamma > 1) + \frac{\sqrt{(b-t)(t-a)}}{2\pi\gamma t}\, I(t \in [a,b])\, dt,$$
where $a = (1 - \sqrt\gamma)^2$, $b = (1 + \sqrt\gamma)^2$.

SLIDE 14

Illustration of MP Law

◮ If $\gamma \le 1$, the MP distribution has support on $[a, b]$;
◮ if $\gamma > 1$, it has an additional point mass $1 - 1/\gamma$ at the origin.

Figure (shown in Matlab): (a) Marčenko-Pastur distribution with $\gamma = 2$. (b) Marčenko-Pastur distribution with $\gamma = 0.5$.
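The figure can be reproduced by simulation (a Python/NumPy sketch; plotting is omitted, only the empirical spectrum and the bulk density are computed, and `mp_density` / `wishart_eigs` are illustrative helper names):

```python
import numpy as np

def mp_density(t, gamma):
    """Continuous (bulk) part of the Marchenko-Pastur density on [a, b]."""
    a, b = (1 - np.sqrt(gamma))**2, (1 + np.sqrt(gamma))**2
    return np.sqrt(np.maximum((b - t) * (t - a), 0.0)) / (2 * np.pi * gamma * t)

def wishart_eigs(p, n, seed=0):
    """Eigenvalues of the Wishart matrix (1/n) X X^T, X with i.i.d. N(0,1) entries."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((p, n))
    return np.linalg.eigvalsh(X @ X.T / n)

gamma = 0.5
p = 400
eigs = wishart_eigs(p, int(p / gamma))
a, b = (1 - np.sqrt(gamma))**2, (1 + np.sqrt(gamma))**2
print(eigs.min(), eigs.max())   # the spectral edges concentrate near a and b
```

Histogramming `eigs` against `mp_density` reproduces panels (a) and (b) for the two values of $\gamma$.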

SLIDE 15

Outline

Recall: Horn’s Parallel Analysis of PCA
Random Matrix Theory
Phase Transitions of PCA

Phase Transitions of PCA 12

SLIDE 16

Rank-one Spike Model

Consider the following rank-1 signal-plus-noise model: $Y = X + \varepsilon$, where

◮ the signal lies in a one-dimensional subspace, $X = \alpha u$ with $\alpha \sim N(0, \sigma_X^2)$;
◮ the noise $\varepsilon \sim N(0, \sigma_\varepsilon^2 I_p)$ is i.i.d. Gaussian.

Therefore $Y \sim N(0, \Sigma)$, where the covariance matrix $\Sigma$ is a rank-one matrix plus a scaled identity:
$$\Sigma = \sigma_X^2 u u^T + \sigma_\varepsilon^2 I_p.$$

SLIDE 17

When does PCA work?

◮ Can we recover the signal direction $u$ by principal component analysis on the noisy measurements $Y$?

◮ It depends on the signal-to-noise ratio, defined as
$$\mathrm{SNR} = R := \frac{\sigma_X^2}{\sigma_\varepsilon^2}.$$
For simplicity we assume $\sigma_\varepsilon^2 = 1$ without loss of generality.

SLIDE 18

Phase Transition of PCA

◮ Consider the proportional-growth scenario
$$\gamma = \lim_{p,n\to\infty} \frac{p}{n}, \qquad (1)$$
since in applications one never has an infinite number of samples at fixed dimensionality.

◮ A fundamental result by I. Johnstone in 2006 shows a phase transition of PCA:

SLIDE 19

Phase Transitions

◮ The primary (largest) eigenvalue of the sample covariance matrix satisfies
$$\lambda_{\max}(\hat\Sigma_n) \to \begin{cases} (1 + \sqrt\gamma)^2 = b, & \sigma_X^2 \le \sqrt\gamma, \\[4pt] (1 + \sigma_X^2)\left(1 + \dfrac{\gamma}{\sigma_X^2}\right), & \sigma_X^2 > \sqrt\gamma. \end{cases} \qquad (2)$$

◮ The primary eigenvector (principal component) associated with the largest eigenvalue converges to
$$|\langle u, v_{\max}\rangle|^2 \to \begin{cases} 0, & \sigma_X^2 \le \sqrt\gamma, \\[4pt] \dfrac{1 - \gamma/\sigma_X^4}{1 + \gamma/\sigma_X^2}, & \sigma_X^2 > \sqrt\gamma. \end{cases} \qquad (3)$$
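Both limits in (2) and (3) are easy to probe numerically (a Python sketch; finite $p, n$ only approximate the limits, and `spike_experiment` is an illustrative name):

```python
import numpy as np

def spike_experiment(R, gamma, p=600, seed=0):
    """Top eigenvalue and overlap |<u, v_max>| for the rank-1 spike model (sigma_eps = 1)."""
    rng = np.random.default_rng(seed)
    n = int(p / gamma)
    u = np.zeros(p); u[0] = 1.0                           # signal direction
    alpha = np.sqrt(R) * rng.standard_normal(n)           # alpha ~ N(0, R)
    Y = np.outer(u, alpha) + rng.standard_normal((p, n))  # y_i = alpha_i u + eps_i
    lam, V = np.linalg.eigh(Y @ Y.T / n)                  # ascending eigenvalues
    return lam[-1], abs(V[:, -1] @ u)

gamma, R = 0.1, 2.0                                       # strong signal: R > sqrt(gamma)
lam, overlap = spike_experiment(R, gamma)
print(lam, (1 + R) * (1 + gamma / R))                     # compare with (2)
print(overlap**2, (1 - gamma / R**2) / (1 + gamma / R))   # compare with (3)
```

Rerunning with `R` below $\sqrt\gamma$ shows the top eigenvalue stuck near $b = (1+\sqrt\gamma)^2$ and a vanishing overlap.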

SLIDE 20

Phase Transitions (continued)

In other words,

◮ If the signal is strong, $\mathrm{SNR} = \sigma_X^2 > \sqrt\gamma$: the primary eigenvalue goes beyond the random spectrum (the upper edge of the MP distribution), and the primary eigenvector is correlated with the signal (it lies in a cone around the signal direction whose deviation angle goes to $0$ as $\sigma_X^2/\gamma \to \infty$);

◮ If the signal is weak, $\mathrm{SNR} = \sigma_X^2 \le \sqrt\gamma$: the primary eigenvalue is buried in the random spectrum, and the primary eigenvector is random, with no correlation with the signal.

SLIDE 21

Proof in Sketch

◮ Following the rank-1 model, consider random vectors $y_i \sim N(0, \Sigma)$ $(i = 1,\dots,n)$, where $\Sigma = \sigma_X^2 u u^T + \sigma_\varepsilon^2 I_p$ and $u$ is an arbitrarily chosen unit vector ($\|u\|_2 = 1$) giving the signal direction.

◮ The sample covariance matrix is
$$\hat\Sigma_n = \frac{1}{n}\sum_{i=1}^n y_i y_i^T = \frac{1}{n} Y Y^T,$$
where $Y = [y_1,\dots,y_n] \in \mathbb{R}^{p\times n}$. Suppose one of its eigenvalues is $\hat\lambda$ with corresponding unit eigenvector $\hat v$, so $\hat\Sigma_n \hat v = \hat\lambda \hat v$.

◮ First of all, we relate $\hat\lambda$ to the MP distribution by the whitening trick:
$$z_i = \Sigma^{-1/2} y_i \sim N(0, I_p). \qquad (4)$$
Then $S_n = \frac{1}{n}\sum_{i=1}^n z_i z_i^T = \frac{1}{n} Z Z^T$ ($Z = [z_1,\dots,z_n]$) is a Wishart random matrix whose eigenvalues follow the Marčenko-Pastur distribution.

SLIDE 22

Proof in Sketch

◮ Notice that
$$\hat\Sigma_n = \frac{1}{n} Y Y^T = \Sigma^{1/2}\left(\frac{1}{n} Z Z^T\right)\Sigma^{1/2} = \Sigma^{1/2} S_n \Sigma^{1/2},$$
and $(\hat\lambda, \hat v)$ is an eigenvalue-eigenvector pair of the matrix $\hat\Sigma_n$. Therefore
$$\Sigma^{1/2} S_n \Sigma^{1/2} \hat v = \hat\lambda \hat v \;\Rightarrow\; S_n \Sigma (\Sigma^{-1/2} \hat v) = \hat\lambda (\Sigma^{-1/2} \hat v). \qquad (5)$$
In other words, $\hat\lambda$ and $\Sigma^{-1/2}\hat v$ are an eigenvalue and eigenvector of the matrix $S_n \Sigma$.

◮ Suppose $c\,\Sigma^{-1/2}\hat v = v$, where the constant $c$ makes $v$ a unit vector; it thus satisfies
$$c^2 = c^2\,\hat v^T \hat v = v^T \Sigma v = v^T(\sigma_X^2 u u^T + \sigma_\varepsilon^2 I_p)v = \sigma_X^2 (u^T v)^2 + \sigma_\varepsilon^2 = R(u^T v)^2 + 1, \qquad (6)$$
using $\sigma_\varepsilon^2 = 1$ in the last equality.

SLIDE 23

Proof in Sketch

Now we have
$$S_n \Sigma v = \hat\lambda v. \qquad (7)$$
Plugging in the expression for $\Sigma$ gives
$$S_n(\sigma_X^2 u u^T + \sigma_\varepsilon^2 I_p)v = \hat\lambda v.$$
Rearranging the term with $u$ to one side, we get
$$(\hat\lambda I_p - \sigma_\varepsilon^2 S_n)v = \sigma_X^2 S_n u (u^T v).$$
Assuming that $\hat\lambda I_p - \sigma_\varepsilon^2 S_n$ is invertible, multiplying both sides by its inverse gives
$$v = \sigma_X^2 \cdot (\hat\lambda I_p - \sigma_\varepsilon^2 S_n)^{-1} S_n u (u^T v). \qquad (8)$$

SLIDE 24

Primary Eigenvalue $\hat\lambda$

◮ Multiplying (8) by $u^T$ on both sides,
$$u^T v = \sigma_X^2 \cdot u^T (\hat\lambda I_p - \sigma_\varepsilon^2 S_n)^{-1} S_n u \cdot (u^T v);$$
that is, if $u^T v \neq 0$,
$$1 = \sigma_X^2 \cdot u^T (\hat\lambda I_p - \sigma_\varepsilon^2 S_n)^{-1} S_n u. \qquad (9)$$

SLIDE 25

Primary Eigenvalue $\hat\lambda$

◮ Assume $S_n$ has the eigenvalue decomposition $S_n = W \Lambda W^T$, where $\Lambda = \mathrm{diag}(\lambda_i : i = 1,\dots,p)$ and $W W^T = W^T W = I_p$ ($W = [w_1,\dots,w_p] \in \mathbb{R}^{p\times p}$). Define $\alpha_i = w_i^T u$ and $\alpha = (\alpha_i) \in \mathbb{R}^p$, so that $u = \sum_{i=1}^p \alpha_i w_i = W\alpha$. Now (9) leads to
$$1 = \sigma_X^2 \cdot u^T \left[W(\hat\lambda I_p - \sigma_\varepsilon^2 \Lambda)^{-1} W^T\right]\left[W \Lambda W^T\right] u = \sigma_X^2 \cdot \alpha^T (\hat\lambda I_p - \sigma_\varepsilon^2 \Lambda)^{-1} \Lambda\, \alpha,$$
which is
$$1 = \sigma_X^2 \sum_{i=1}^p \frac{\lambda_i}{\hat\lambda - \sigma_\varepsilon^2 \lambda_i}\, \alpha_i^2, \qquad (10)$$
where $\sum_{i=1}^p \alpha_i^2 = 1$.

◮ For large $p$, the $\lambda_i$ follow the distribution $\mu_{MP}$ and $\alpha_i^2 \approx 1/p$ (as $\alpha$ is nearly uniform on the unit sphere), so the sum (10) can be approximated by
$$1 = \sigma_X^2 \cdot \frac{1}{p} \sum_{i=1}^p \frac{\lambda_i}{\hat\lambda - \sigma_\varepsilon^2 \lambda_i} \sim \sigma_X^2 \int_a^b \frac{t}{\hat\lambda - \sigma_\varepsilon^2 t}\, d\mu_{MP}(t), \qquad (11)$$
where $\sigma_\varepsilon^2 = 1$ by assumption.

SLIDE 26

Primary Eigenvalue $\hat\lambda$

◮ Using the Stieltjes transform,
$$1 = \sigma_X^2 \int_a^b \frac{t}{\hat\lambda - t}\cdot\frac{\sqrt{(b-t)(t-a)}}{2\pi\gamma t}\, dt = \frac{\sigma_X^2}{4\gamma}\left[2\hat\lambda - (a+b) - 2\sqrt{\left|(\hat\lambda - a)(b - \hat\lambda)\right|}\right]. \qquad (12)$$

◮ For $\hat\lambda \ge b$ and $R = \sigma_X^2 \ge \sqrt\gamma$, we have
$$1 = \frac{\sigma_X^2}{4\gamma}\left[2\hat\lambda - (a+b) - 2\sqrt{(\hat\lambda - a)(\hat\lambda - b)}\right] \;\Rightarrow\; \hat\lambda = \sigma_X^2 + \frac{\gamma}{\sigma_X^2} + 1 + \gamma = (1 + \sigma_X^2)\left(1 + \frac{\gamma}{\sigma_X^2}\right).$$
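The closed form for $\hat\lambda$ can be checked by solving (12) numerically (a Python sketch with a hand-rolled bisection; `solve_lambda` is an illustrative name):

```python
import numpy as np

def rhs(lam, R, gamma):
    # right-hand side of (12) for lam >= b, where |(lam-a)(b-lam)| = (lam-a)(lam-b)
    a, b = (1 - np.sqrt(gamma))**2, (1 + np.sqrt(gamma))**2
    return R / (4 * gamma) * (2 * lam - (a + b) - 2 * np.sqrt((lam - a) * (lam - b)))

def solve_lambda(R, gamma):
    # rhs decreases from R/sqrt(gamma) at lam = b to 0 as lam -> infinity,
    # so rhs(lam) = 1 has a unique root when R > sqrt(gamma)
    lo, hi = (1 + np.sqrt(gamma))**2, 1e6
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if rhs(mid, R, gamma) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

R, gamma = 2.0, 0.5
print(solve_lambda(R, gamma), (1 + R) * (1 + gamma / R))  # both ≈ 3.75
```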

SLIDE 27

Primary Eigenvalue $\hat\lambda$

Here we observe the following phase transition for the primary eigenvalue:

◮ If $\hat\lambda \in [a, b]$, then $\hat\Sigma_n$ has its primary eigenvalue within $\mathrm{supp}(\mu_{MP})$, so it is indistinguishable from the noise.

◮ Hence $\hat\lambda = b$ is the phase transition point where PCA starts to pop out the signal rather than noise. Plugging $\hat\lambda = b$ into (12), and using $2b - (a+b) = b - a = 4\sqrt\gamma$, we get
$$1 = \sigma_X^2 \cdot \frac{1}{4\gamma}\left[2b - (a+b)\right] = \frac{\sigma_X^2}{\sqrt\gamma} \;\Leftrightarrow\; \sigma_X^2 = \sqrt\gamma = \sqrt{\frac{p}{n}}. \qquad (13)$$
Hence, in order to make PCA work, we need the signal-to-noise ratio $R \ge \sqrt{p/n}$.

SLIDE 28

Primary Eigenvector $\hat v$

◮ From Equation (8), we obtain
$$1 = v^T v = \sigma_X^4 \cdot v^T u\, u^T S_n (\hat\lambda I_p - \sigma_\varepsilon^2 S_n)^{-2} S_n u\, u^T v = \sigma_X^4 \cdot |u^T v|^2 \left[u^T S_n (\hat\lambda I_p - \sigma_\varepsilon^2 S_n)^{-2} S_n u\right],$$
which implies that
$$|u^T v|^{-2} = \sigma_X^4 \left[u^T S_n (\hat\lambda I_p - \sigma_\varepsilon^2 S_n)^{-2} S_n u\right]. \qquad (14)$$

◮ Using the same trick as for equation (9), we reach the following Monte-Carlo-type approximation:
$$|u^T v|^{-2} = \sigma_X^4 \left[u^T S_n (\hat\lambda I_p - \sigma_\varepsilon^2 S_n)^{-2} S_n u\right] \sim \sigma_X^4 \int_a^b \frac{t^2}{(\hat\lambda - \sigma_\varepsilon^2 t)^2}\, d\mu_{MP}(t). \qquad (15)$$

SLIDE 29

Primary Eigenvector $\hat v$

◮ For $\hat\lambda \ge b$, using the Stieltjes transform introduced below, one can compute the integral:
$$|u^T v|^{-2} = \sigma_X^4 \int_a^b \frac{t^2}{(\hat\lambda - \sigma_\varepsilon^2 t)^2}\, d\mu_{MP}(t) = \sigma_X^4 \left[\hat\lambda^2 s'(\hat\lambda) + 2\hat\lambda s(\hat\lambda) + 1\right],$$
from which it can be computed (using $\hat\lambda = (1 + \sigma_X^2)(1 + \gamma/\sigma_X^2)$ obtained above, with $R = \sigma_X^2$) that
$$|u^T v|^2 = \frac{1 - \dfrac{\gamma}{\sigma_X^4}}{1 + \gamma + \dfrac{2\gamma}{\sigma_X^2}}.$$

SLIDE 30

Primary Eigenvector $\hat v$

◮ Now we can compute the inner product of $u$ and $\hat v$ that we are really interested in:
$$|u^T \hat v|^2 = \left(\frac{1}{c}\, u^T \Sigma^{1/2} v\right)^2 = \frac{1}{c^2}\left((\Sigma^{1/2} u)^T v\right)^2 = \frac{1}{c^2}\left(\left((\sigma_X^2 u u^T + I_p)^{1/2} u\right)^T v\right)^2$$
$$\overset{(*)}{=} \frac{1}{c^2}\left(\left(\sqrt{1 + \sigma_X^2}\, u\right)^T v\right)^2 \overset{(**)}{=} \frac{(1 + \sigma_X^2)(u^T v)^2}{R(u^T v)^2 + 1}, \qquad R = \sigma_X^2,$$
$$= \frac{1 + R - \dfrac{\gamma}{R} - \dfrac{\gamma}{R^2}}{1 + R + \gamma + \dfrac{\gamma}{R}} = \frac{1 - \dfrac{\gamma}{R^2}}{1 + \dfrac{\gamma}{R}},$$
where equality $(*)$ uses $\Sigma^{1/2} u = \sqrt{1 + R}\, u$, and equality $(**)$ is due to the formula for $c^2$ (Equation (6) above). Note that this identity holds under the condition $R \ge \sqrt\gamma$, which ensures the numerator is non-negative.

SLIDE 31

Stieltjes Transform

Define the Stieltjes transform of the MP density $\mu_{MP}$ to be
$$s(z) := \int_{\mathbb{R}} \frac{1}{t - z}\, d\mu_{MP}(t), \qquad z \in \mathbb{C}. \qquad (16)$$

Lemma (Bai-Silverstein 2011, Lemma 3.11)
$$s(z) = \frac{(1 - \gamma) - z + \sqrt{(z - 1 - \gamma)^2 - 4\gamma}}{2\gamma z} = \frac{(1 - \gamma) - z + \sqrt{(z - a)(z - b)}}{2\gamma z}. \qquad (17)$$

SLIDE 32

Stieltjes Transform (continued)

Lemma
1. $\displaystyle\int_a^b \frac{t}{\lambda - t}\, d\mu_{MP}(t) = -\lambda s(\lambda) - 1$;
2. $\displaystyle\int_a^b \frac{t^2}{(\lambda - t)^2}\, d\mu_{MP}(t) = \lambda^2 s'(\lambda) + 2\lambda s(\lambda) + 1$.
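Both identities can be verified by numerical quadrature against the closed form (17) (a Python sketch; the square root is taken as $\sqrt{(z-a)(z-b)}$ with the branch that is positive for real $z > b$, and $s'$ is approximated by a finite difference):

```python
import numpy as np

def mp_density(t, gamma):
    # continuous part of the MP density on [a, b]
    a, b = (1 - np.sqrt(gamma))**2, (1 + np.sqrt(gamma))**2
    return np.sqrt(np.maximum((b - t) * (t - a), 0.0)) / (2 * np.pi * gamma * t)

def stieltjes(z, gamma):
    # closed form (17), branch positive for real z > b
    a, b = (1 - np.sqrt(gamma))**2, (1 + np.sqrt(gamma))**2
    return ((1 - gamma) - z + np.sqrt((z - a) * (z - b))) / (2 * gamma * z)

def trap(f, t):
    # simple trapezoidal rule
    return float(np.sum((f[1:] + f[:-1]) * np.diff(t)) / 2)

gamma, lam = 0.5, 4.0                   # lam > b, so the integrands are smooth
a, b = (1 - np.sqrt(gamma))**2, (1 + np.sqrt(gamma))**2
t = np.linspace(a, b, 200001)
w = mp_density(t, gamma)

lhs1 = trap(t / (lam - t) * w, t)
rhs1 = -lam * stieltjes(lam, gamma) - 1
s_prime = (stieltjes(lam + 1e-6, gamma) - stieltjes(lam - 1e-6, gamma)) / 2e-6
lhs2 = trap(t**2 / (lam - t)**2 * w, t)
rhs2 = lam**2 * s_prime + 2 * lam * stieltjes(lam, gamma) + 1
print(lhs1, rhs1)                       # identity 1
print(lhs2, rhs2)                       # identity 2
```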

SLIDE 33

Open Problems

◮ If one can estimate the noise models, such as the rank-1 model here,

then we can use random matrix theory (universality) or by simulations to find the number of principal components.

◮ Such a random matrix theory can not fully explain why Horn’s

Parallel Analysis, whose proof is open.

◮ In applications, noise models might not be homogeneous σ2 εIp. How

to deal with heterogeneous noise models is open (Wang-Owen’2015 attacked this problem).

◮ Distributive PCA can exploit random matrix theory to decide the

number of samples in local clients (Fan-Wang et al. 2019).
