Spectral distributions of high-dimensional sample correlation - PowerPoint PPT Presentation

Spectral distributions of high-dimensional sample correlation matrices under infinite variance Johannes Heiny Ruhr-University Bochum Joint work with Jianfeng Yao (HKU), Thomas Mikosch and Jorge Yslas (Copenhagen). Random Matrices and Complex Data Analysis Workshop, December 10-12, 2019, Shanghai J. Heiny Sample correlation & off-diagonal 1 / 30

Normalized histogram of eigenvalues and MP density 0.9 Histogram of eigenvalues 0.8 y = f γ (x) 0.7 0.6 0.5 y 0.4 0.3 0.2 0.1 0 0 1 2 3 4 5 6 7 8 9 x Figure: These are NOT spikes! J. Heiny Sample correlation & off-diagonal 2 / 30

Setup for the picture Data matrix X = X n : p × n matrix with iid centered entries and generic variable X d = X 11 . X = ( X it ) i =1 ,...,p ; t =1 ,...,n Sample covariance matrix S = 1 n XX ′ Ordered eigenvalues of S λ 1 ( S ) ≥ λ 2 ( S ) ≥ · · · ≥ λ p ( S ) Sample correlation matrix R = (diag( S )) − 1 / 2 S (diag( S )) − 1 / 2 . J. Heiny Sample correlation & off-diagonal 3 / 30

Regular variation Regular variation with index α > 0 : P ( | X | > x ) = x − α L ( x ) , where L is a slowly varying function. This implies E [ | X | α + ε ] = ∞ for any ε > 0 . Normalizing sequence ( a 2 np ) such that np P ( X 2 > a 2 np x ) → x − α/ 2 , as n → ∞ for x > 0 . 1 / α ℓ ( np ) for a slowly varying function ℓ . Then a np = ( np ) J. Heiny Sample correlation & off-diagonal 4 / 30

Reduction to Diagonal Diagonal X with iid regularly varying entries α ∈ (0 , 4) and p = n β with β ∈ [0 , 1] . We have np � XX ′ − diag( XX ′ ) � P a − 2 → 0 , where � · � denotes the spectral norm. n � ( XX ′ ) ij = X it X jt . t =1 J. Heiny Sample correlation & off-diagonal 5 / 30

Eigenvalues Weyl’s inequality � � � λ i ( A + B ) − λ i ( A ) � ≤ � B � . max i =1 ,...,p Choose A + B = XX ′ and A = diag( XX ′ ) to obtain � � � P a − 2 � λ i ( XX ′ ) − λ i (diag( XX ′ )) max → 0 , n → ∞ . np i =1 ,...,p Note: Limit theory for ( λ i ( S )) reduced to ( S ii ) . J. Heiny Sample correlation & off-diagonal 6 / 30

Heavy-tailed case Theorem (Heiny and Mikosch, 2016) X with iid regularly varying entries α ∈ (0 , 4) and p n = n β ℓ ( n ) with β ∈ [0 , 1] . 1 If β ∈ [0 , 1] , then � � � P a − 2 � λ i ( XX ′ ) − λ i (diag( XX ′ )) max → 0 . np i =1 ,...,p 2 If β ∈ (( α/ 2 − 1) + , 1] , then � � � P a − 2 � λ i ( XX ′ ) − X 2 max → 0 . np ( i ) ,np i =1 ,...,p J. Heiny Sample correlation & off-diagonal 7 / 30

Example: Eigenvalues Figure: Smoothed histogram based on 20000 simulations of the approximation error for the normalized eigenvalue a − 2 np λ 1 ( S ) for entries X it with α = 1 . 6 , β = 1 , n = 1000 and p = 200 . J. Heiny Sample correlation & off-diagonal 8 / 30

Eigenvectors v k unit eigenvector of S associated to λ k ( S ) Unit eigenvectors of diag( S ) are canonical basisvectors e j . Eigenvectors X with iid regularly varying entries with index α ∈ (0 , 4) and p n = n β ℓ ( n ) with β ∈ [0 , 1] . Then for any fixed k ≥ 1 , P � v k − e L k � ℓ 2 → 0 , n → ∞ . J. Heiny Sample correlation & off-diagonal 9 / 30

Localization vs. Delocalization Pareto data Normal Data 1.0 ● ● 0.15 ● ● ● ● ● ● ● ● ● ● 0.10 ● ● ● 0.8 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 0.05 ● ● ● ● ● ● ● ● ● ● ● ● ● Size of components Size of Components ● ● ● ● ● ● ● ● ● ● ● ● 0.6 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 0.00 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 0.4 −0.05 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● −0.10 ● ● ● ● ● ● ● ● ● ● ● ● ● ● 0.2 ● ● ● ● ● ● −0.15 ● ● ● ● 0.0 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 0 50 100 150 200 0 50 100 150 200 Indices of components Indices of Components Figure: X ∼ Pareto (0 . 8) Figure: X ∼ N (0 , 1) Components of eigenvector v 1 . p = 200 , n = 1000 . J. Heiny Sample correlation & off-diagonal 10 / 30

Point Process of Normalized Eigenvalues Point process convergence p ∞ � � d N n = δ a − 2 → δ Γ − 2 /α = N np λ i ( XX ′ ) i i =1 i =1 The limit is a PRM on (0 , ∞ ) with mean measure µ ( x, ∞ ) = x − α/ 2 , x > 0 , and Γ i = E 1 + · · · + E i , ( E i ) iid standard exponential . J. Heiny Sample correlation & off-diagonal 11 / 30

Point Process of Normalized Eigenvalues Limiting distribution: For k ≥ 1 , n →∞ P ( a − 2 lim np λ k ≤ x ) = lim n →∞ P ( N n ( x, ∞ ) < k ) = P ( N ( x, ∞ ) < k ) � x − α/ 2 � s k − 1 � e − x − α/ 2 , = x > 0 . s ! s =0 J. Heiny Sample correlation & off-diagonal 12 / 30

Point Process of Normalized Eigenvalues Limiting distribution: For k ≥ 1 , n →∞ P ( a − 2 lim np λ k ≤ x ) = lim n →∞ P ( N n ( x, ∞ ) < k ) = P ( N ( x, ∞ ) < k ) � x − α/ 2 � s k − 1 � e − x − α/ 2 , = x > 0 . s ! s =0 Largest eigenvalue n λ 1 ( S ) d → Γ − α/ 2 , 1 a 2 np where the limit has a Fr´ echet distribution with parameter α/ 2 . Soshnikov ( 2006 ), Auffinger et al. ( 2009 ), Auffinger and Tang ( 2016 ), Davis et al. ( 2014 , 2016 2 ), JH and Mikosch ( 2016 ) J. Heiny Sample correlation & off-diagonal 12 / 30

α = 3 . 99 α = 3 . 99 , n = 2000 , p = 1000 J. Heiny Sample correlation & off-diagonal 13 / 30

α = 3 α = 3 , n = 2000 , p = 1000 J. Heiny Sample correlation & off-diagonal 14 / 30

α = 2 . 1 α = 2 . 1 , n = 10000 , p = 1000 J. Heiny Sample correlation & off-diagonal 15 / 30

Infinite variance, α < 2 Limiting spectral distribution of ( XX ′ ) under E [ X 2 ] = ∞ : Regular variation with α < 2 : n + p XX ′ → G γ F a − 2 α weakly , whose density g γ α satisfies α ( x ) ∼ c x − 1 − α/ 2 , g γ x → ∞ . Ben Arous and Guionnet (2008), Belinschi et al. (2009) J. Heiny Sample correlation & off-diagonal 16 / 30

Moments of LSD Assumption: X symmetric and regularly varying with index α ∈ (0 , 2) . Goal: For k ≥ 1 , find the limit of � � � = 1 x k F R ( dx ) p E [tr( R k )] E J. Heiny Sample correlation & off-diagonal 17 / 30

Moments of LSD One has p n � � E [tr( R k )] = E [ Y i 1 t 1 Y i 2 t 1 · · · Y i k t k Y i 1 t k ] . i 1 ,...,i k =1 t 1 ,...,t k =1 � �� := F ( i 1 ,...,i k ) Assumption: X symmetric ⇒ Y ij symmetric X ij √ � n Y ij = t =1 X 2 it J. Heiny Sample correlation & off-diagonal 18 / 30

Moments of LSD k − 2 r − 2 � � p E [tr( R k )] → β k ( γ ) + 2 1 γ r − 1 (Γ(1 − α/ 2)) − r + q +1 α r =2 q =0 � r − q � t ⋆ ( � I ) � � s � � � � Γ( d i ( � α/ 2 I, T )) Γ(1 − α/ 2) Γ( N i ( � I )) s =1 i =1 I ∈C ( q ) I | ( � T ∈C s, | � I ) r,k � m it ( � � � I, T ) − α Γ . 2 ( i,t ) ∈ ∆( � I,T ) J. Heiny Sample correlation & off-diagonal 19 / 30

J. Heiny Sample correlation & off-diagonal 20 / 30

J. Heiny Sample correlation & off-diagonal 21 / 30

Motivation Random walk S n = X 1 + · · · + X n , n ≥ 1 . ( X i ) are iid random variables with generic element X . 1 E [ X ] = 0 and E [ X 2 ] = 1 . 2 Dimension p = p n → ∞ Consider iid copies ( S ( i ) n ) i ≤ p of S n and define the point process p � N n = δ d p ( S ( i ) n / √ n − d p ) . i =1 J. Heiny Sample correlation & off-diagonal 22 / 30

We want to prove: p � d N n = δ d p ( S ( i ) → N , n → ∞ , n / √ n − d p ) i =1 where N is a Poisson random measure with mean measure µ ( x, ∞ ) = e − x , x ∈ R , and � 2 log p − log log p + log 4 π d p = . 2(2 log p ) 1 / 2 J. Heiny Sample correlation & off-diagonal 23 / 30

Spectral distributions of high-dimensional sample correlation - PowerPoint PPT Presentation

Spectral distributions of high-dimensional sample correlation matrices under infinite variance Johannes Heiny Ruhr-University Bochum Joint work with Jianfeng Yao (HKU), Thomas Mikosch and Jorge Yslas (Copenhagen). Random Matrices and Complex

10 slides that always work Simple text boxes (I) Sample text Sample text Sample text

Spectral Clustering Spectral Clustering? Spectral methods Methods using eigenvectors of

Formal Modeling in Cognitive Science 1 Distributions Lecture 20: Joint, Marginal, and Conditional

Sample 2 Inlet in western (Sunset) Bay 0 Sample 3 Inlet behind Christian Island 1 Sample

? ? ? ? Basic Charts Outline - Distributions & Histograms - Mean, Mode, Average - Chart

Agglomeration of Ash Particles due to Flue Gas Conditioning (a) Sample CA8S12F1 (b) Sample

Statistics and Data Analysis Distributions and Sampling (2) Ling-Chieh Kung Department of

An Introduction to Spectral Learning Hanxiao Liu November 8, 2013 An Introduction to Spectral

Stat 5102 Lecture Slides: Deck 1 Empirical Distributions, Exact Sampling Distributions,

Create Distributions Empirically using Excel V0E 10/11/2014 0E 2014 Schield Creating

Input Distributions Reading: Chapter 6 in Law Input Distributions Overview Probability Theory

Lecture 5: Probability Distributions Random Variables Probability Distributions

Outline Power Law Size Distributions Distributions Power Law Size Distributions Overview

n -dimensional manifold M with T := TM n -dimensional manifold M with T := TM T n -dimensional

Spiked Eigenvalues of High Dimensional Separable Sample Covariance Matrices Guangming Pan,

SEM Photographs of Activated ash samples SEM Micrographs (Original ash samples) (a) Sample S1F1

Volatility is rough Jim Gatheral 1 , Thibault Jaisson 2 and Mathieu Rosenbaum 3 2 1 City

Assessing the dependence of high-dimensional time series via sample autocovariances and

Spectral Analysis of Stationary Stochastic Process Hanxiao Liu hanxiaol@cs.cmu.edu February 20,

r ts rs

CS6220: DATA MINING TECHNIQUES Mining Time Series Data Instructor: Yizhou Sun yzsun@ccs.neu.edu

Time Series Modelling and Kalman Filters Chris Williams School of Informatics, University of

Definitions and Examples I (10.1) Definitions and Examples I (10.1) Definitions and Examples II

Towards an information-theoretic model of the Allison mixture A canonical measure of dependence

Spectral distributions of high-dimensional sample correlation - PowerPoint PPT Presentation

Spectral distributions of high-dimensional sample correlation matrices under infinite variance Johannes Heiny Ruhr-University Bochum Joint work with Jianfeng Yao (HKU), Thomas Mikosch and Jorge Yslas (Copenhagen). Random Matrices and Complex

10 slides that always work Simple text boxes (I) Sample text Sample text Sample text

Spectral Clustering Spectral Clustering? Spectral methods Methods using eigenvectors of

Formal Modeling in Cognitive Science 1 Distributions Lecture 20: Joint, Marginal, and Conditional

Sample 2 Inlet in western (Sunset) Bay 0 Sample 3 Inlet behind Christian Island 1 Sample

? ? ? ? Basic Charts Outline - Distributions &amp; Histograms - Mean, Mode, Average - Chart

Agglomeration of Ash Particles due to Flue Gas Conditioning (a) Sample CA8S12F1 (b) Sample

Statistics and Data Analysis Distributions and Sampling (2) Ling-Chieh Kung Department of

An Introduction to Spectral Learning Hanxiao Liu November 8, 2013 An Introduction to Spectral

Stat 5102 Lecture Slides: Deck 1 Empirical Distributions, Exact Sampling Distributions,

Create Distributions Empirically using Excel V0E 10/11/2014 0E 2014 Schield Creating

Input Distributions Reading: Chapter 6 in Law Input Distributions Overview Probability Theory

Lecture 5: Probability Distributions Random Variables Probability Distributions

Outline Power Law Size Distributions Distributions Power Law Size Distributions Overview

n -dimensional manifold M with T := TM n -dimensional manifold M with T := TM T n -dimensional

Spiked Eigenvalues of High Dimensional Separable Sample Covariance Matrices Guangming Pan,

SEM Photographs of Activated ash samples SEM Micrographs (Original ash samples) (a) Sample S1F1

Volatility is rough Jim Gatheral 1 , Thibault Jaisson 2 and Mathieu Rosenbaum 3 2 1 City

Assessing the dependence of high-dimensional time series via sample autocovariances and

Spectral Analysis of Stationary Stochastic Process Hanxiao Liu hanxiaol@cs.cmu.edu February 20,

r ts rs

CS6220: DATA MINING TECHNIQUES Mining Time Series Data Instructor: Yizhou Sun yzsun@ccs.neu.edu

Time Series Modelling and Kalman Filters Chris Williams School of Informatics, University of

Definitions and Examples I (10.1) Definitions and Examples I (10.1) Definitions and Examples II

Towards an information-theoretic model of the Allison mixture A canonical measure of dependence

? ? ? ? Basic Charts Outline - Distributions & Histograms - Mean, Mode, Average - Chart