Edgeworth and confidence interval correction in spiked PCA

Iain Johnstone & Jeha Yang, Statistics & Biomedical Data Science, Stanford & Two Sigma. Shanghai, December 10, 2019.


  1. Edgeworth and confidence interval correction in spiked PCA. Iain Johnstone & Jeha Yang, Statistics & Biomedical Data Science, Stanford & Two Sigma. Shanghai, December 10, 2019.

  3. Viral protein mutations and spiked models. Quadeer et al., PLOS Comp. Bio. 2018.

  5. A suggestive simulation on correlation matrices [David Morales, Matt McKay]
     2nd eigenvalue; $\rho_1 = 0.2$; $\rho_2 = 0.1$
     [Figure: two histograms of the 2nd leading sample eigenvalue.
      Left: c = 0.2, N = 300, N1 = 10, simple spks = [2.8, 1.9], deg. spks = [0.8, 0.9]; mean = 2.305 [2.322], std = 0.053 [0.054] (0.89 xPaul).
      Right: c = 0.2, N = 300, N1 = 30, simple spks = [6.8, 3.9], deg. spks = [0.8, 0.9]; mean = 4.145 [4.169], std = 0.126 [0.125] (0.89 xPaul).]
     Theoretical variance is pretty accurate, but there seems to be a shift in the mean (similar to what we've seen before in the eigenvector projections of sample covariance when spikes were close to each other).

  6. Outline
     ◮ Background on spiked covariance model
     ◮ Edgeworth correction: single spike
     ◮ Edgeworth for multiple spikes
     ◮ Explaining the repulsion correction
     ◮ Confidence intervals after selection

  7. High dimensional spiked PCA model
     ◮ Data: $X = [x_1 \cdots x_n]'$ with i.i.d. $x_1, \dots, x_n \sim N_{p+1}(0, \Sigma)$
     ◮ Large dimensional asymptotic regime: as $n \to \infty$, $\gamma_n := p/n \to \gamma \in (0, \infty)$
     ◮ Spiked eigenstructure of $\Sigma$: for a fixed $r$, $\ell_1 > \cdots > \ell_r > 1 = \ell_{r+1} = \cdots = \ell_{p+1}$, where $\ell_1, \dots, \ell_r$ are the spikes
     ◮ Statistics: eigenvalues $\hat\rho_1 \ge \cdots \ge \hat\rho_{p+1}$ of the sample covariance matrix $X'X/n$
     → w.l.o.g. $\Sigma$ is diagonal
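
The model on this slide can be simulated directly. The following is a minimal sketch, assuming NumPy; the function name `sample_spiked_eigenvalues`, the seed, and the spike values 3.0 and 2.0 are illustrative choices, not taken from the talk. Only p = 200, n = 800 match the numerical illustration in the slides.

```python
import numpy as np

def sample_spiked_eigenvalues(n, p, spikes, rng):
    """Eigenvalues (descending) of X'X/n for rows x_i ~ N_{p+1}(0, Sigma).

    Sigma is taken diagonal (w.l.o.g.): the given spikes followed by ones.
    """
    dim = p + 1
    ell = np.ones(dim)
    ell[: len(spikes)] = spikes
    # Each row of X is an independent N(0, diag(ell)) vector.
    X = rng.standard_normal((n, dim)) * np.sqrt(ell)
    S = X.T @ X / n
    # eigvalsh returns ascending eigenvalues; reverse to get rho_hat_1 first.
    return np.linalg.eigvalsh(S)[::-1]

rng = np.random.default_rng(0)  # hypothetical seed
n, p = 800, 200
gamma_n = p / n
rho_hat = sample_spiked_eigenvalues(n, p, spikes=[3.0, 2.0], rng=rng)

print("gamma_n =", gamma_n)
print("two largest sample eigenvalues:", rho_hat[:2])
print("bulk edge (1 + sqrt(gamma_n))^2 =", (1 + np.sqrt(gamma_n)) ** 2)
```

Both illustrative spikes exceed $1 + \sqrt{\gamma_n} = 1.5$, so the two leading sample eigenvalues are expected to separate from the bulk.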

  8. Largest eigenvalue $\hat\rho_1$: numerical illustration
     $p = 200$, $n = 800$ [i.e. $\gamma_n = p/n = 0.25$]
     Spike $h = \ell - 1$: 0, 0.25 (subcritical); $h_+ = 0.5$ (critical); 0.75, 1 (supercritical)
     [Figure: distribution of $\hat\rho_1$ for the five spike values; horizontal axis from 2.1 to 2.9.]

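  9. Finite rank model, K = 1: phase transition
     $\Sigma = \mathrm{diag}(\ell_1, 1, \dots, 1)$, $p/n \to \gamma$. Interior point transition at $\ell_1 = 1 + \sqrt{\gamma}$ [Baik-Ben Arous-Péché, 05]
     [Figure: largest-eigenvalue limit $\lambda$ against $\ell_1$, with the critical point $1 + \sqrt{\gamma}$ and the level $(1 + \sqrt{\gamma})^2$ marked; annotation: Tracy-Widom, fluctuation of order $n^{-2/3}$.]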

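  10. Finite rank model, K = 1: phase transition
      $\Sigma = \mathrm{diag}(\ell_1, 1, \dots, 1)$, $p/n \to \gamma$. Interior point transition at $\ell_1 = 1 + \sqrt{\gamma}$ [Baik-Ben Arous-Péché, 05]
      [Figure: largest-eigenvalue limit $\lambda(\ell_1)$ against $\ell_1$, with the critical point $1 + \sqrt{\gamma}$ and the level $(1 + \sqrt{\gamma})^2$ marked; annotations: Gaussian, fluctuation of order $n^{-1/2}$, bias.]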

  11. Largest eigenvalue $\hat\rho_1$: numerical illustration
      $p = 200$, $n = 800$ [i.e. $\gamma_n = p/n = 0.25$]
      Spike $h$ = 0, 0.25 (subcritical); $h_+ = 0.5$ (critical); 0.75, 1 (supercritical)
      Edge: $(1 + \sqrt{\gamma_n})^2 = 2.25$
      [Figure: distribution of $\hat\rho_1$ for the five spike values; horizontal axis from 2.1 to 2.9.]

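  12. Largest eigenvalue: Phase transition
      Different rates, limit distributions:
      For $h < \sqrt{\gamma}$: $n^{2/3} \, \frac{\hat\rho_1 - \mu(\gamma_n)}{\tau(\gamma_n)} \overset{D}{\Rightarrow} TW_\beta$
      For $h > \sqrt{\gamma}$: $n^{1/2} \, \frac{\hat\rho_1 - \rho(h, \gamma_n)}{\sigma(h, \gamma_n)} \overset{D}{\Rightarrow} N(0, 1)$
      [Figure: $\rho(\ell)$ against $\ell = 1 + h$, with $1 + \sqrt{\gamma}$ and $(1 + \sqrt{\gamma})^2$ marked.]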

  13. Largest eigenvalue: Phase transition
      Different rates, limit distributions:
      For $h < \sqrt{\gamma}$: $n^{2/3} \, \frac{\hat\rho_1 - \mu(\gamma_n)}{\tau(\gamma_n)} \overset{D}{\Rightarrow} TW_\beta$
      For $h > \sqrt{\gamma}$: $n^{1/2} \, \frac{\hat\rho_1 - \rho(h, \gamma_n)}{\sigma(h, \gamma_n)} \overset{D}{\Rightarrow} N(0, 1)$
      with $\rho(h, \gamma) = (1 + h) \left( 1 + \frac{\gamma}{h} \right)$, $\sigma^2(h, \gamma) = 2 (1 + h)^2 \left( 1 - \frac{\gamma}{h^2} \right)$
      [Figure: $\rho(\ell)$ against $\ell = 1 + h$, with $1 + \sqrt{\gamma}$, $(1 + \sqrt{\gamma})^2$, the bias and the bulk marked.]
      Statistical physics lit, 94-; Baik-Ben Arous-Péché (05), Paul (07), Baik-Silverstein (06), Bloemendal-Virág (11), Mo (11), Wang (12), Benaych-Georges-Guionnet-Maida (11)
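
As a quick check on the centering and scale formulas of this slide, here is a small sketch in plain Python. The function names `rho` and `sigma` simply mirror the slide's symbols, and the value $\gamma = 0.25$ is borrowed from the numerical illustration.

```python
import math

def rho(h, gamma):
    """Centering: rho(h, gamma) = (1 + h) * (1 + gamma / h)."""
    return (1.0 + h) * (1.0 + gamma / h)

def sigma(h, gamma):
    """Scale: sigma^2(h, gamma) = 2 * (1 + h)^2 * (1 - gamma / h^2)."""
    return math.sqrt(2.0 * (1.0 + h) ** 2 * (1.0 - gamma / h ** 2))

gamma = 0.25
h_crit = math.sqrt(gamma)

# At the critical spike h = sqrt(gamma), the centering meets the bulk edge
# (1 + sqrt(gamma))^2 and the Gaussian scale vanishes.
print(rho(h_crit, gamma), (1.0 + math.sqrt(gamma)) ** 2)
print(sigma(h_crit, gamma))

# Supercritical example, h = 1 (ell = 2): rho exceeds ell, which is the bias.
h = 1.0
print(rho(h, gamma), rho(h, gamma) - (1.0 + h), sigma(h, gamma))
```

The vanishing of $\sigma$ at $h = \sqrt{\gamma}$ is consistent with the change of rate from $n^{1/2}$ to $n^{2/3}$ at the transition.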

  14. Normal approximation: multiple spikes
      ◮ Assume that all spikes are simple, supercritical: $\ell_1 > \cdots > \ell_r > 1 + \sqrt{\gamma}$
      ◮ Asymptotic mutual independence: with $\rho_{kn} := \rho(\ell_k, \gamma_n)$, $\sigma_{kn} := \sigma(\ell_k, \gamma_n)$,
        $(\hat z_{kn})_{k=1,\dots,r} := \left( n^{1/2} (\hat\rho_k - \rho_{kn}) / \sigma_{kn} \right)_{k=1,\dots,r} \Rightarrow N(0, I_r)$
        Shi (2013)

  15. Edgeworth approximations

  16. Inaccuracy of approximations: $\hat z_{kn}$ associated with $\ell_k = 2.7$
      [Figure: four panels comparing the density of $\hat z_{kn}$ with the Normal density, for $(n, \gamma_n, \ell)$ = (400, 1, (2.7)) [$\hat z_{1n}$], (400, 1, (2.7, 2.2)) [$\hat z_{1n}$], (400, 1, (3.2, 2.7)) [$\hat z_{2n}$], (400, 1, (2.7, 2.4)) [$\hat z_{1n}$]; horizontal axis from −3 to 4, vertical axis Density from 0 to 0.5.]

  17. Traditional Edgeworth
      (Smooth function of) means model: Petrov, 1975; Hall, 1992
      $S_n = \frac{1}{\sqrt{n \kappa_{2n}}} \sum_{i=1}^{n} X_{ni}$, with $X_{ni}$ indep, mean 0, $\in \mathbb{R}^d$, $d$ fixed
      moments: $\kappa_{jn} = \frac{1}{n} \sum_{i=1}^{n} E X_{ni}^{j}$
      First order expansion: $P(S_n \le x) = \Phi(x) + n^{-1/2} p(x) \phi(x) + o(n^{-1/2})$
      skewness correction: $p(x) = -\frac{\kappa_{3n}}{6 \kappa_{2n}^{3/2}} H_2(x)$, $H_2(x) = x^2 - 1$
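
To see the skewness correction at work in the classical setting, here is a minimal sketch using centered Exp(1) variables. That example is an assumption chosen for illustration, not one used in the talk. For it, $\kappa_2 = 1$ and $\kappa_3 = 2$, and the exact distribution of the sum is Erlang, so the comparison needs only the standard library.

```python
import math

def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def exact_cdf(x, n):
    """P(S_n <= x) for S_n = (E_1 + ... + E_n - n) / sqrt(n), E_i ~ Exp(1)."""
    t = n + x * math.sqrt(n)
    if t <= 0.0:
        return 0.0
    # Erlang CDF: 1 - exp(-t) * sum_{k=0}^{n-1} t^k / k!
    term, total = 1.0, 1.0
    for k in range(1, n):
        term *= t / k
        total += term
    return 1.0 - math.exp(-t) * total

def edgeworth_cdf(x, n, kappa2=1.0, kappa3=2.0):
    """One-term expansion with p(x) = -kappa3 / (6 kappa2^{3/2}) * H_2(x)."""
    p = -kappa3 / (6.0 * kappa2 ** 1.5) * (x * x - 1.0)
    return Phi(x) + p * phi(x) / math.sqrt(n)

n = 20
for x in (0.0, 2.0):
    print(x, exact_cdf(x, n), Phi(x), edgeworth_cdf(x, n))
```

At $x = \pm 1$ the polynomial $H_2$ vanishes, so the one-term expansion coincides with the Normal approximation there; the sketch therefore evaluates at $x = 0$ and $x = 2$.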

  18. Single spike, first order expansion for $\hat\rho_1$
      $\hat z_{1n} = n^{1/2} (\hat\rho_1 - \rho_{1n}) / \sigma_{1n}$
      Theorem. In spiked model, $h_1 = \ell_1 - 1 > \sqrt{\gamma}$, $\gamma_n = p/n$,
      $P(\hat z_{1n} \le x) = \Phi(x) + n^{-1/2} p_{1n}(x) \phi(x) + o(n^{-1/2})$, uniformly in $x \in \mathbb{R}$, with
      $p_{1n}(x) = -\alpha_{2n} H_2(x) - \alpha_{0n}$
      $\alpha_{2n} = \alpha_2(h_1, \gamma_n) = \frac{\sqrt{2}}{3} \frac{h_1^3 + \gamma_n}{(h_1^2 - \gamma_n)^{3/2}}$, $\alpha_{0n} = \alpha_0(h_1, \gamma_n) = \frac{1}{\sqrt{2}} \frac{\gamma_n (h_1 + 1)}{(h_1^2 - \gamma_n)^{3/2}}$
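
The single-spike expansion is easy to evaluate numerically. The sketch below codes $\alpha_2$, $\alpha_0$ and the corrected CDF as read off this slide; the function names are hypothetical, and the parameter choice $(n, \gamma_n, \ell_1) = (400, 1, 2.7)$ is borrowed from the simulation panels elsewhere in the deck.

```python
import math

def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def alpha2(h, gamma):
    """alpha_2 = (sqrt(2)/3) * (h^3 + gamma) / (h^2 - gamma)^{3/2}."""
    return math.sqrt(2.0) / 3.0 * (h ** 3 + gamma) / (h * h - gamma) ** 1.5

def alpha0(h, gamma):
    """alpha_0 = gamma * (h + 1) / (sqrt(2) * (h^2 - gamma)^{3/2})."""
    return gamma * (h + 1.0) / (math.sqrt(2.0) * (h * h - gamma) ** 1.5)

def edgeworth_cdf_single(x, n, h, gamma_n):
    """Phi(x) + n^{-1/2} p_{1n}(x) phi(x), p_{1n} = -alpha_2 H_2 - alpha_0."""
    p = -alpha2(h, gamma_n) * (x * x - 1.0) - alpha0(h, gamma_n)
    return Phi(x) + p * phi(x) / math.sqrt(n)

n, gamma_n, h1 = 400, 1.0, 1.7  # ell_1 = 2.7
print("alpha2 =", alpha2(h1, gamma_n), "alpha0 =", alpha0(h1, gamma_n))
for x in (-2.0, 0.0, 2.0):
    print(x, Phi(x), edgeworth_cdf_single(x, n, h1, gamma_n))
```

Setting `gamma` to 0 gives $\alpha_2 = \sqrt{2}/3$ and $\alpha_0 = 0$, the fixed-$p$ values.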

  19. Coefficients of Edgeworth expansion for single-spike
      $\alpha_2(h_1, \gamma_n) = \frac{\sqrt{2}}{3} \frac{h_1^3 + \gamma_n}{(h_1^2 - \gamma_n)^{3/2}}$, $\alpha_0(h_1, \gamma_n) = \frac{1}{\sqrt{2}} \frac{\gamma_n (h_1 + 1)}{(h_1^2 - \gamma_n)^{3/2}}$
      ◮ Larger for "harder" cases, i.e. larger $\gamma$ and smaller $h$ ($> \sqrt{\gamma}$)
      ◮ Larger than the fixed $p$ case, i.e. $\gamma = 0$, $\alpha_2 = \sqrt{2}/3$, $\alpha_0 = 0$; Muirhead-Chikuse (1975)
      ◮ Empirically reasonable if $\alpha_{2n}^2 = \frac{2}{9} \frac{(h_1^3 + \gamma)^2}{(h_1^2 - \gamma)^3} \le 0.2\, n$
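
The rule of thumb in the last bullet can be turned into a rough sample-size check. This is only a sketch: the helper `min_n_for_rule` and the spike values are hypothetical, and the threshold 0.2 is the empirical constant quoted on the slide.

```python
import math

def alpha2_squared(h, gamma):
    """alpha_2^2 = (2/9) * (h^3 + gamma)^2 / (h^2 - gamma)^3."""
    return 2.0 / 9.0 * (h ** 3 + gamma) ** 2 / (h * h - gamma) ** 3

def min_n_for_rule(h, gamma, threshold=0.2):
    """Smallest integer n with alpha_2^2 <= threshold * n."""
    return math.ceil(alpha2_squared(h, gamma) / threshold)

gamma = 1.0
# Spikes approaching the critical value sqrt(gamma) = 1 from above need a
# much larger n before the rule is satisfied.
for h in (3.0, 1.7, 1.1):
    print(h, alpha2_squared(h, gamma), min_n_for_rule(h, gamma))
```

The required $n$ grows quickly as $h$ decreases toward $\sqrt{\gamma}$, matching the remark that the coefficients are larger in "harder" cases.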

  20. Single spike simulation
      [Figure: four panels showing the density of $\hat\ell$ with the Edgeworth and Normal approximations and the upper edge marked, for (n, γ, l-factor) = (50, 0.1, 0.3), (50, 1, 0.3), (100, 0.1, 0.3), (100, 1, 0.3).]

  21. Edgeworth for multiple spikes

  22. Eigenvalues are repulsive!
      [Figure: three panels comparing the density of $\hat z_{kn}$ with the Normal density, for $(n, \gamma_n, \ell)$ = (400, 1, (2.7, 2.2)) [$\hat z_{1n}$], (400, 1, (2.7, 2.4)) [$\hat z_{1n}$], (400, 1, (3.2, 2.7)) [$\hat z_{2n}$]; horizontal axis from −3 to 4, vertical axis Density from 0 to 0.5.]
      ◮ joint density of $(\hat\rho_1, \dots, \hat\rho_{n \wedge (p+1)})$ has a Jacobian factor $\prod_{i<j} |\hat\rho_i - \hat\rho_j|$ → pushes eigenvalues apart
      ◮ But, not visible at leading order (for supercritical spikes): $(\hat z_{kn})_{k=1,\dots,r} \Rightarrow N(0, I_r)$

  23. Multi spike, first order expansion for $\hat\rho_k$
      $\hat z_{kn} = n^{1/2} (\hat\rho_k - \rho_{kn}) / \sigma_{kn}$
      Theorem. In spiked model, $h_k = \ell_k - 1 > \sqrt{\gamma}$, $\gamma_n = p/n$,
      $P(\hat z_{kn} \le x) = \Phi(x) + n^{-1/2} p_{kn}(x) \phi(x) + o(n^{-1/2})$, uniformly in $x \in \mathbb{R}$, with
      $p_{kn}(x) = -\alpha_2(h_k, \gamma_n) H_2(x) - \alpha_{0,k}(h, \gamma_n)$
      $\alpha_2(h_k, \gamma_n) = \frac{\sqrt{2}}{3} \frac{h_k^3 + \gamma_n}{(h_k^2 - \gamma_n)^{3/2}}$
      $\alpha_{0,k}(h, \gamma) = \frac{1}{\sqrt{2}} \frac{h_k + 1}{(h_k^2 - \gamma)^{1/2}} \left( \frac{\gamma}{h_k^2 - \gamma} + \sum_{j \ne k} \frac{h_j}{h_k - h_j} \right)$
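
A short sketch of the multi-spike coefficient $\alpha_{0,k}$ follows, using spike configurations that appear in the simulation panels of the deck. The function name `alpha0_multi` and the 0-based index convention are assumptions made for the example.

```python
import math

def alpha0_multi(k, h, gamma):
    """alpha_{0,k}(h, gamma) for spikes h = (h_1, ..., h_r); k is 0-based.

    (1/sqrt(2)) * (h_k + 1) / (h_k^2 - gamma)^{1/2}
        * ( gamma / (h_k^2 - gamma) + sum_{j != k} h_j / (h_k - h_j) )
    """
    hk = h[k]
    d = hk * hk - gamma
    repulsion = sum(hj / (hk - hj) for j, hj in enumerate(h) if j != k)
    return (hk + 1.0) / math.sqrt(2.0 * d) * (gamma / d + repulsion)

gamma_n = 1.0
# ell = (2.7): single spike, the repulsion sum is empty.
print(alpha0_multi(0, [1.7], gamma_n))
# ell = (2.7, 2.2): a smaller neighbour increases alpha_0 for the 2.7 spike.
print(alpha0_multi(0, [1.7, 1.2], gamma_n))
# ell = (3.2, 2.7): a larger neighbour decreases alpha_0 for the 2.7 spike.
print(alpha0_multi(1, [2.2, 1.7], gamma_n))
```

With a single spike the sum over $j \ne k$ is empty and the expression reduces to the single-spike $\alpha_0(h_1, \gamma_n)$.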

  24. Interpretation
      Edgeworth corrected density: $\phi + n^{-1/2} (\alpha_2 H_3 + \alpha_0 H_1) \phi$
      Relative to single spike case: $\alpha_2$ unchanged, but
      $\Delta\alpha_0 = \alpha_{0,k}(h, \gamma_n) - \alpha_0(h_k, \gamma_n) = \frac{1}{\sqrt{2}} \frac{h_k + 1}{(h_k^2 - \gamma_n)^{1/2}} \sum_{j \ne k} \frac{h_j}{h_k - h_j}$
      ◮ $\Delta\alpha_0 > 0$, e.g. smaller spikes $h_j < h_k$, push density to right; conversely for $\Delta\alpha_0 < 0$
      ◮ closer spikes ⇒ larger effect
      ◮ additive in $\ell_j$, $j \ne k$
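
One way to see why $\alpha_0$ acts as a location shift: $H_1(x) = x$ and $H_3(x) = x^3 - 3x$ are odd, so the correction leaves the total mass at 1, while $\int x H_1 \phi = 1$ and $\int x H_3 \phi = 0$ give the corrected density mean $\alpha_0 / \sqrt{n}$. The sketch below checks this numerically, assuming NumPy; the coefficient values are illustrative inputs, not results from the talk.

```python
import numpy as np

def corrected_density(x, n, alpha2, alpha0):
    """phi(x) * (1 + n^{-1/2} * (alpha2 * H_3(x) + alpha0 * H_1(x)))."""
    phi = np.exp(-0.5 * x ** 2) / np.sqrt(2.0 * np.pi)
    H1 = x
    H3 = x ** 3 - 3.0 * x
    return phi * (1.0 + (alpha2 * H3 + alpha0 * H1) / np.sqrt(n))

n, alpha2, alpha0 = 400, 1.07, 4.07  # illustrative values only
x = np.linspace(-10.0, 10.0, 20001)
dx = x[1] - x[0]
f = corrected_density(x, n, alpha2, alpha0)

# Plain Riemann sums; the integrand is negligible at the grid endpoints.
total_mass = np.sum(f) * dx
mean = np.sum(x * f) * dx

print("total mass:", total_mass)
print("mean:", mean, "alpha0 / sqrt(n):", alpha0 / np.sqrt(n))
```

A positive $\Delta\alpha_0$ therefore moves the mean of the corrected density to the right by $\Delta\alpha_0 / \sqrt{n}$, which is the direction described in the first bullet.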

  25. Repulsion example 1: $\hat z_{kn}$ associated with $\ell_k = 2.7$
      [Figure: Density of $\hat z_{kn}$ associated with $\ell_k = 2.7$, with Normal and Edgeworth approximations. Four panels, for $(n, \gamma_n, \ell)$ = (400, 1, (2.7)) [$\hat z_{1n}$], (400, 1, (2.7, 2.2)) [$\hat z_{1n}$], (400, 1, (3.2, 2.7)) [$\hat z_{2n}$], (400, 1, (2.7, 2.4)) [$\hat z_{1n}$]; horizontal axis from −3 to 4, vertical axis Density from 0 to 0.5.]
