The Stieltjes Transform and its Role in Eigenvalue Behavior of Large Dimensional Random Matrices — PowerPoint PPT Presentation
SLIDE 1

The Stieltjes Transform and its Role in Eigenvalue Behavior of Large Dimensional Random Matrices

Jack W. Silverstein Department of Mathematics North Carolina State University

SLIDE 2

1. Introduction. Let $M(\mathbb{R})$ denote the collection of all subprobability distribution functions on $\mathbb{R}$. We say for $\{F_n\} \subset M(\mathbb{R})$, $F_n$ converges vaguely to $F \in M(\mathbb{R})$ (written $F_n \xrightarrow{v} F$) if for all $[a,b]$, $a, b$ continuity points of $F$, $\lim_{n\to\infty} F_n\{[a,b]\} = F\{[a,b]\}$. We write $F_n \xrightarrow{D} F$ when $F_n$, $F$ are probability distribution functions (equivalent to $\lim_{n\to\infty} F_n(a) = F(a)$ for all continuity points $a$ of $F$).

For $F \in M(\mathbb{R})$,
\[ m_F(z) \equiv \int \frac{1}{x - z}\, dF(x), \qquad z \in \mathbb{C}^+ \equiv \{z \in \mathbb{C} : \Im z > 0\}, \]
is defined as the Stieltjes transform of $F$.

Properties:

1. $m_F$ is an analytic function on $\mathbb{C}^+$.
2. $\Im m_F(z) > 0$.
3. $|m_F(z)| \le \dfrac{1}{\Im z}$.
4. For continuity points $a < b$ of $F$,
\[ F\{[a,b]\} = \frac{1}{\pi} \lim_{\eta \to 0^+} \int_a^b \Im m_F(\xi + i\eta)\, d\xi, \]
since the right-hand side equals
\[ \frac{1}{\pi} \lim_{\eta \to 0^+} \int_a^b \int \frac{\eta}{(x - \xi)^2 + \eta^2}\, dF(x)\, d\xi = \frac{1}{\pi} \lim_{\eta \to 0^+} \int \int_a^b \frac{\eta}{(x - \xi)^2 + \eta^2}\, d\xi\, dF(x) \]

SLIDE 3

\[ = \frac{1}{\pi} \lim_{\eta \to 0^+} \int \left[ \mathrm{Tan}^{-1}\Big(\frac{b - x}{\eta}\Big) - \mathrm{Tan}^{-1}\Big(\frac{a - x}{\eta}\Big) \right] dF(x) = \int I_{[a,b]}\, dF(x) = F\{[a,b]\}. \]

5. If, for $x_0 \in \mathbb{R}$, $\Im m_F(x_0) \equiv \lim_{z \in \mathbb{C}^+ \to x_0} \Im m_F(z)$ exists, then $F$ is differentiable at $x_0$ with value $\frac{1}{\pi}\Im m_F(x_0)$ (S. and Choi (1995)).

Let $S \subset \mathbb{C}^+$ be countable with a cluster point in $\mathbb{C}^+$. Using 4., the fact that $F_n \xrightarrow{v} F$ is equivalent to
\[ \int f(x)\, dF_n(x) \to \int f(x)\, dF(x) \]
for all continuous $f$ vanishing at $\pm\infty$, and the fact that an analytic function defined on $\mathbb{C}^+$ is uniquely determined by the values it takes on $S$, we have
\[ F_n \xrightarrow{v} F \iff m_{F_n}(z) \to m_F(z) \text{ for all } z \in S. \]

The fundamental connection to random matrices: For any Hermitian $n \times n$ matrix $A$, we let $F^A$ denote the empirical distribution function (e.d.f.) of its eigenvalues:
\[ F^A(x) = \frac{1}{n}(\text{number of eigenvalues of } A \le x). \]
Then
\[ m_{F^A}(z) = \frac{1}{n}\, \mathrm{tr}\,(A - zI)^{-1}. \]
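The identity $m_{F^A}(z) = (1/n)\mathrm{tr}(A - zI)^{-1}$ and properties 2-3 are easy to check numerically. A minimal sketch; the symmetric test matrix and the point $z$ are arbitrary choices, not from the lecture:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
G = rng.standard_normal((n, n))
A = (G + G.T) / np.sqrt(2 * n)   # an arbitrary Hermitian (real symmetric) matrix

z = 0.3 + 0.5j                   # any z in C+

# m_{F^A}(z) via the resolvent trace ...
m_trace = np.trace(np.linalg.inv(A - z * np.eye(n))) / n

# ... and directly as the Stieltjes transform of the e.d.f. of the eigenvalues.
lam = np.linalg.eigvalsh(A)
m_eig = np.mean(1.0 / (lam - z))

assert abs(m_trace - m_eig) < 1e-8   # the two expressions agree
assert m_trace.imag > 0              # property 2
assert abs(m_trace) <= 1 / z.imag    # property 3
```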

SLIDE 4

So, if we have a sequence $\{A_n\}$ of Hermitian random matrices, to show, with probability one, $F^{A_n} \xrightarrow{v} F$ for some $F \in M(\mathbb{R})$, it is equivalent to show for any $z \in \mathbb{C}^+$
\[ \frac{1}{n}\, \mathrm{tr}\,(A_n - zI)^{-1} \to m_F(z) \quad \text{a.s.} \]

The main goal of the lecture is to show the importance of the Stieltjes transform to the limiting behavior of certain classes of random matrices. We will begin with an attempt at providing a systematic way to show a.s. convergence of the e.d.f.'s of the eigenvalues of three classes of large dimensional random matrices via the Stieltjes transform approach. Essential properties involved will be emphasized in order to better understand where randomness comes in and where basic properties of matrices are used. Then it will be shown, via the Stieltjes transform, how the limiting distribution can be numerically constructed, how it can be explicitly (mathematically) derived in some cases, and, in general, how important qualitative information can be inferred. Other results will be reviewed, namely the exact separation properties of eigenvalues, and the distributional behavior of linear spectral statistics. It is hoped that with this knowledge other ensembles can be explored for possible limiting behavior.

Each theorem below corresponds to a matrix ensemble. For each one the random quantities are defined on a common probability space. They all assume: for $n = 1, 2, \ldots$, $X_n = (X^n_{ij})$, $n \times N$, $X^n_{ij} \in \mathbb{C}$, identically distributed for all $n, i, j$, independent across $i, j$ for each $n$, $\mathrm{E}|X^1_{11} - \mathrm{E}X^1_{11}|^2 = 1$, and $N = N(n)$ with $n/N \to c > 0$ as $n \to \infty$.

Theorem 1.1 (Marčenko and Pastur (1967), S. and Bai (1995)). Assume:

SLIDE 5

a) $T_n = \mathrm{diag}(t^n_1, \ldots, t^n_n)$, $t^n_i \in \mathbb{R}$, and the e.d.f. of $\{t^n_1, \ldots, t^n_n\}$ converges weakly, with probability one, to a nonrandom probability distribution function $H$ as $n \to \infty$.

b) $A_n$ is an $N \times N$ Hermitian random matrix for which $F^{A_n} \xrightarrow{v} A$, where $A$ is nonrandom (possibly defective).

c) $X_n$, $T_n$, and $A_n$ are independent.

Let $B_n = A_n + (1/N)X_n^* T_n X_n$. Then, with probability one, $F^{B_n} \xrightarrow{v} \hat F$ as $n \to \infty$, where for each $z \in \mathbb{C}^+$, $m = m_{\hat F}(z)$ satisfies
\[ (1.1)\qquad m = m_A\!\left( z - c \int \frac{t}{1 + tm}\, dH(t) \right). \]
It is the only solution to (1.1) with positive imaginary part.

SLIDE 6

Theorem 1.2 (Yin (1986), S. (1995)). Assume: $T_n$, $n \times n$, is random Hermitian non-negative definite, independent of $X_n$, with $F^{T_n} \xrightarrow{D} H$ a.s. as $n \to \infty$, $H$ nonrandom. Let $T_n^{1/2}$ denote any Hermitian square root of $T_n$, and define $B_n = (1/N)T_n^{1/2} X_n X_n^* T_n^{1/2}$. Then, with probability one, $F^{B_n} \xrightarrow{D} F$ as $n \to \infty$, where for each $z \in \mathbb{C}^+$, $m = m_F(z)$ satisfies
\[ (1.2)\qquad m = \int \frac{1}{t(1 - c - czm) - z}\, dH(t). \]
It is the only solution to (1.2) in the set $\{m \in \mathbb{C} : -(1 - c)/z + cm \in \mathbb{C}^+\}$.

Theorem 1.3 (Dozier and S. a)). Assume: $R_n$, $n \times N$, is random, independent of $X_n$, with $F^{(1/N)R_n R_n^*} \xrightarrow{D} H$ a.s. as $n \to \infty$, $H$ nonrandom. Let $B_n = (1/N)(R_n + \sigma X_n)(R_n + \sigma X_n)^*$, where $\sigma > 0$ is nonrandom. Then, with probability one, $F^{B_n} \xrightarrow{D} F$ as $n \to \infty$, where for each $z \in \mathbb{C}^+$, $m = m_F(z)$ satisfies
\[ (1.3)\qquad m = \int \frac{1}{\dfrac{t}{1 + \sigma^2 c m} - (1 + \sigma^2 c m)z + \sigma^2(1 - c)}\, dH(t). \]
It is the only solution to (1.3) in the set $\{m \in \mathbb{C}^+ : \Im(mz) \ge 0\}$.

Remark: In Theorem 1.1, if $A_n = 0$ for all large $n$, then $m_A(z) = -1/z$ and we find that $m_{\hat F}$ has an inverse
\[ (1.4)\qquad z = -\frac{1}{m} + c \int \frac{t}{1 + tm}\, dH(t). \]
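As a sanity check on Theorem 1.2, its limiting equation can be solved numerically and compared against a simulated matrix. A sketch for the simplest case, $H$ a point mass at $1$ (i.e. $T_n = I$), where (1.2) collapses to a quadratic; the dimensions and the test point $z$ are arbitrary choices:

```python
import numpy as np

c, z = 0.5, 1.0 + 1.0j           # aspect ratio n/N and a test point in C+

# With H a point mass at t = 1, (1.2) reads m = 1/((1 - c - c z m) - z),
# i.e. the quadratic  c z m^2 - (1 - c - z) m + 1 = 0.
roots = np.roots([c * z, -(1 - c - z), 1.0])
# Pick the root whose companion transform -(1-c)/z + c m lies in C+
# (the uniqueness condition in Theorem 1.2).
m = max(roots, key=lambda r: (-(1 - c) / z + c * r).imag)

# Monte Carlo: B_n = (1/N) X X^T with T_n = I.
rng = np.random.default_rng(0)
n, N = 400, 800
X = rng.standard_normal((n, N))
lam = np.linalg.eigvalsh(X @ X.T / N)
m_mc = np.mean(1.0 / (lam - z))

assert m.imag > 0
assert abs(m - m_mc) < 0.05
```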

SLIDE 7

Since
\[ F^{(1/N)X_n^* T_n X_n} = \Big(1 - \frac{n}{N}\Big) I_{[0,\infty)} + \frac{n}{N}\, F^{(1/N)T_n^{1/2} X_n X_n^* T_n^{1/2}}, \]
we have
\[ (1.5)\qquad m_{F^{(1/N)X_n^* T_n X_n}}(z) = -\frac{1 - n/N}{z} + \frac{n}{N}\, m_{F^{(1/N)T_n^{1/2} X_n X_n^* T_n^{1/2}}}(z), \qquad z \in \mathbb{C}^+, \]
so we have
\[ (1.6)\qquad m_{\hat F}(z) = -\frac{1 - c}{z} + c\, m_F(z). \]
Using this identity, it is easy to see that (1.2) and (1.4) are equivalent.

2. Why these theorems are true. We begin with three facts which account for most of why the limiting results are true, and for the appearance of the limiting equations for the Stieltjes transforms.

Lemma 2.1. For $n \times n$ $A$, $q \in \mathbb{C}^n$, and $t \in \mathbb{C}$, with $A$ and $A + tqq^*$ invertible, we have
\[ q^*(A + tqq^*)^{-1} = \frac{1}{1 + t q^* A^{-1} q}\, q^* A^{-1} \]
(since $q^* A^{-1}(A + tqq^*) = (1 + t q^* A^{-1} q)q^*$).

Corollary 2.1. For $q = a + b$, $t = 1$, we have
\[ a^*(A + (a+b)(a+b)^*)^{-1} = a^* A^{-1} - \frac{a^* A^{-1}(a+b)}{1 + (a+b)^* A^{-1}(a+b)}\,(a+b)^* A^{-1} \]

SLIDE 8

\[ = \frac{1 + b^* A^{-1}(a+b)}{1 + (a+b)^* A^{-1}(a+b)}\, a^* A^{-1} - \frac{a^* A^{-1}(a+b)}{1 + (a+b)^* A^{-1}(a+b)}\, b^* A^{-1}. \]

Proof: Using Lemma 2.1 we have
\[ (A + (a+b)(a+b)^*)^{-1} - A^{-1} = -(A + (a+b)(a+b)^*)^{-1}(a+b)(a+b)^* A^{-1} = -\frac{1}{1 + (a+b)^* A^{-1}(a+b)}\, A^{-1}(a+b)(a+b)^* A^{-1}. \]
Multiplying both sides on the left by $a^*$ gives the result.

Lemma 2.2. For $n \times n$ $A$ and $B$, with $B$ Hermitian, $z \in \mathbb{C}^+$, $t \in \mathbb{R}$, and $q \in \mathbb{C}^n$, we have
\[ \big|\mathrm{tr}\,[(B - zI)^{-1} - (B + tqq^* - zI)^{-1}]A\big| = \left| \frac{t\, q^*(B - zI)^{-1} A (B - zI)^{-1} q}{1 + t\, q^*(B - zI)^{-1} q} \right| \le \frac{\|A\|}{\Im z}. \]

Proof. The identity follows from Lemma 2.1. We have
\[ \left| \frac{t\, q^*(B - zI)^{-1} A (B - zI)^{-1} q}{1 + t\, q^*(B - zI)^{-1} q} \right| \le \frac{\|A\|\, |t|\, \|(B - zI)^{-1} q\|^2}{|1 + t\, q^*(B - zI)^{-1} q|}. \]
Write $B = \sum_i \lambda_i e_i e_i^*$, its spectral decomposition. Then
\[ \|(B - zI)^{-1} q\|^2 = \sum_i \frac{|e_i^* q|^2}{|\lambda_i - z|^2} \]
and
\[ |1 + t\, q^*(B - zI)^{-1} q| \ge |t|\, \Im\big(q^*(B - zI)^{-1} q\big) = |t|\, \Im z \sum_i \frac{|e_i^* q|^2}{|\lambda_i - z|^2}. \]
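Lemma 2.1 is a form of the Sherman-Morrison rank-one update and can be verified numerically in a few lines; the random test data below are arbitrary, not from the lecture:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
q = rng.standard_normal(n) + 1j * rng.standard_normal(n)
t = 2.0 + 0.5j

# Left side: q* (A + t q q*)^{-1}
lhs = q.conj() @ np.linalg.inv(A + t * np.outer(q, q.conj()))

# Right side: q* A^{-1} / (1 + t q* A^{-1} q)
qA = q.conj() @ np.linalg.inv(A)
rhs = qA / (1 + t * (qA @ q))

assert np.allclose(lhs, rhs)
```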

SLIDE 9

Lemma 2.3. For $X = (X_1, \ldots, X_n)^T$ with i.i.d. standardized entries, $C$ $n \times n$, we have for any $p \ge 2$
\[ \mathrm{E}|X^* C X - \mathrm{tr}\, C|^p \le K_p \Big[ \big( \mathrm{E}|X_1|^4\, \mathrm{tr}\, CC^* \big)^{p/2} + \mathrm{E}|X_1|^{2p}\, \mathrm{tr}\,(CC^*)^{p/2} \Big], \]
where the constant $K_p$ does not depend on $n$, $C$, nor on the distribution of $X_1$. (Proof given in Bai and S. (1998).)

From these properties, roughly speaking, we can make observations like the following: for $n \times n$ Hermitian $A$, $q = (1/\sqrt{n})(X_1, \ldots, X_n)^T$, with $X_i$ i.i.d. standardized and independent of $A$, and $z \in \mathbb{C}^+$, $t \in \mathbb{R}$:
\[ t\, q^*(A + tqq^* - zI)^{-1} q = \frac{t\, q^*(A - zI)^{-1} q}{1 + t\, q^*(A - zI)^{-1} q} = 1 - \frac{1}{1 + t\, q^*(A - zI)^{-1} q} \approx 1 - \frac{1}{1 + t\,(1/n)\mathrm{tr}\,(A - zI)^{-1}} \approx 1 - \frac{1}{1 + t\, m_{F^{A + tqq^*}}(z)}. \]

Making this and other observations rigorous requires technical considerations, the first being truncation and centralization of the elements of $X_n$, and truncation of the eigenvalues of $T_n$ in Theorem 1.2 (not needed in Theorem 1.1) and of $(1/N)R_n R_n^*$ in Theorem 1.3, all at a rate slower than $\sqrt{n}$ ($a \ln n$ for some positive $a$ is sufficient). The truncation and centralization steps will be outlined later. We are at this stage able to go through the algebraic manipulations, keeping in mind the above three lemmas, and intuitively derive the equations appearing in each of the three theorems. At the same time we can see what technical details need to be worked out. Before continuing, two more basic properties of matrices are included here.
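The concentration $q^*(A - zI)^{-1} q \approx (1/n)\mathrm{tr}(A - zI)^{-1}$ that drives these derivations shows up clearly in a quick experiment (Lemma 2.3 with $p = 2$ makes the difference $O(n^{-1/2})$). The sizes and the point $z$ are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
G = rng.standard_normal((n, n))
A = (G + G.T) / np.sqrt(2 * n)           # Hermitian, independent of q below
z = 0.5 + 1.0j
C = np.linalg.inv(A - z * np.eye(n))     # resolvent; spectral norm <= 1/Im z

q = rng.standard_normal(n) / np.sqrt(n)  # q = (1/sqrt n)(X_1,...,X_n), standardized X_i
quad = q @ C @ q                         # q is real, so q* C q = q C q
trace = np.trace(C) / n

# Lemma 2.3 (p = 2): E|X* C X - tr C|^2 = O(tr CC*) = O(n), so after the
# 1/n scaling the fluctuation is O(n^{-1/2}); 0.3 is a generous margin.
assert abs(quad - trace) < 0.3
```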

SLIDE 10

Lemma 2.4. Let $z_1, z_2 \in \mathbb{C}^+$ with $\max(\Im z_1, \Im z_2) \ge v > 0$, $A$ and $B$ $n \times n$ with $A$ Hermitian, and $q \in \mathbb{C}^n$. Then
\[ \big|\mathrm{tr}\, B\big((A - z_1 I)^{-1} - (A - z_2 I)^{-1}\big)\big| \le |z_2 - z_1|\, n\, \|B\|\, \frac{1}{v^2} \]
and
\[ \big|q^* B (A - z_1 I)^{-1} q - q^* B (A - z_2 I)^{-1} q\big| \le |z_2 - z_1|\, \|q\|^2\, \|B\|\, \frac{1}{v^2}. \]

Consider first the $B_n$ in Theorem 1.1. Let $q_i$ denote $1/\sqrt{N}$ times the $i$th column of $X_n^*$. Then
\[ (1/N)\, X_n^* T_n X_n = \sum_{i=1}^n t_i q_i q_i^*. \]
Let $B_{(i)} = B_n - t_i q_i q_i^*$. For any $z \in \mathbb{C}^+$ and $x \in \mathbb{C}$ we write
\[ B_n - zI = A_n - (z - x)I + (1/N)\, X_n^* T_n X_n - xI. \]
Taking inverses we have
\[ (A_n - (z - x)I)^{-1} = (B_n - zI)^{-1} + (A_n - (z - x)I)^{-1}\big((1/N)\, X_n^* T_n X_n - xI\big)(B_n - zI)^{-1}. \]
Dividing by $N$, taking traces and using Lemma 2.1 we find
\[ m_{F^{A_n}}(z - x) - m_{F^{B_n}}(z) = \frac{1}{N}\, \mathrm{tr}\, (A_n - (z - x)I)^{-1} \Big( \sum_{i=1}^n t_i q_i q_i^* - xI \Big)(B_n - zI)^{-1} \]

SLIDE 11

\[ = \frac{1}{N} \sum_{i=1}^n \frac{t_i\, q_i^*(B_{(i)} - zI)^{-1}(A_n - (z - x)I)^{-1} q_i}{1 + t_i\, q_i^*(B_{(i)} - zI)^{-1} q_i} - x\, \frac{1}{N}\, \mathrm{tr}\,(B_n - zI)^{-1}(A_n - (z - x)I)^{-1}. \]

Notice when $x$ and $q_i$ are independent, Lemmas 2.2, 2.3 give us
\[ q_i^*(B_{(i)} - zI)^{-1}(A_n - (z - x)I)^{-1} q_i \approx \frac{1}{N}\, \mathrm{tr}\,(B_n - zI)^{-1}(A_n - (z - x)I)^{-1}. \]
Letting
\[ x = x_n = \frac{1}{N} \sum_{i=1}^n \frac{t_i}{1 + t_i\, m_{F^{B_n}}(z)} \]
we have
\[ m_{F^{A_n}}(z - x_n) - m_{F^{B_n}}(z) = \frac{1}{N} \sum_{i=1}^n \frac{t_i}{1 + t_i\, m_{F^{B_n}}(z)}\, d_i, \]
where
\[ d_i = \frac{1 + t_i\, m_{F^{B_n}}(z)}{1 + t_i\, q_i^*(B_{(i)} - zI)^{-1} q_i}\; q_i^*(B_{(i)} - zI)^{-1}(A_n - (z - x_n)I)^{-1} q_i - \frac{1}{N}\, \mathrm{tr}\,(B_n - zI)^{-1}(A_n - (z - x_n)I)^{-1}. \]
In order to use Lemma 2.3, for each $i$, $x_n$ is replaced by
\[ x_{(i)} = \frac{1}{N} \sum_{j=1}^n \frac{t_j}{1 + t_j\, m_{F^{B_{(i)}}}(z)}. \]

SLIDE 12

An outline of the remainder of the proof is given. It is easy to argue that if $A$ is the zero measure on $\mathbb{R}$ (that is, almost surely, only $o(N)$ eigenvalues of $A_n$ remain bounded), then the Stieltjes transforms of $F^{A_n}$ and $F^{B_n}$ converge a.s. to zero, the limits obviously satisfying (1.1). So we assume $A$ is not the zero measure. One can then show
\[ \delta = \inf_n \Im\big(m_{F^{B_n}}(z)\big) \]
is positive almost surely. Using Lemma 2.3 ($p = 6$ is sufficient) and the fact that all matrix inverses encountered are bounded in spectral norm by $1/\Im z$, we have from standard arguments using Boole's and Chebyshev's inequalities, almost surely,
\[ (2.1)\qquad \max_{i \le n} \max\Big[\, \big|\|q_i\|^2 - 1\big|,\; \big|q_i^*(B_{(i)} - zI)^{-1} q_i - m_{F^{B_{(i)}}}(z)\big|, \]
\[ \big|q_i^*(B_{(i)} - zI)^{-1}(A_n - (z - x_{(i)})I)^{-1} q_i - \tfrac{1}{N}\,\mathrm{tr}\,(B_{(i)} - zI)^{-1}(A_n - (z - x_{(i)})I)^{-1}\big| \,\Big] \to 0 \]
as $n \to \infty$. Consider now a realization for which (2.1) holds, $\delta > 0$, $F^{T_n} \xrightarrow{D} H$, and $F^{A_n} \xrightarrow{v} A$. From Lemma 2.2 and (2.1) we have
\[ (2.2)\qquad \max_{i \le n} \max\Big[ \big|m_{F^{B_n}}(z) - m_{F^{B_{(i)}}}(z)\big|,\; \big|m_{F^{B_n}}(z) - q_i^*(B_{(i)} - zI)^{-1} q_i\big| \Big] \to 0, \]
and subsequently
\[ (2.3)\qquad \max_{i \le n} \max\left[ \left| \frac{1 + t_i\, m_{F^{B_n}}(z)}{1 + t_i\, q_i^*(B_{(i)} - zI)^{-1} q_i} - 1 \right|,\; |x_n - x_{(i)}| \right] \to 0. \]

SLIDE 13

Therefore, from Lemmas 2.2, 2.4, and (2.1)-(2.3), we get $\max_{i \le n} |d_i| \to 0$, and since
\[ \left| \frac{t_i}{1 + t_i\, m_{F^{B_n}}(z)} \right| \le \frac{1}{\delta}, \]
we conclude that $m_{F^{A_n}}(z - x_n) - m_{F^{B_n}}(z) \to 0$. Consider a subsequence $\{n_i\}$ on which $m_{F^{B_{n_i}}}(z)$ converges to a number $m$. It follows that
\[ x_{n_i} \to c \int \frac{t}{1 + tm}\, dH(t). \]
Therefore, $m$ satisfies (1.1). Uniqueness (to be discussed later) gives us, for this realization, $m_{F^{B_n}}(z) \to m$. This event occurs with probability one.

3. The other equations. Let us now derive the equation for the matrix $B_n = (1/N)\, T_n^{1/2} X_n X_n^* T_n^{1/2}$, after the truncation steps have been taken. Let $c_n = n/N$, $q_j = (1/\sqrt{n}) X_{\cdot j}$ (the $j$th column of $X_n$), $r_j = (1/\sqrt{N})\, T_n^{1/2} X_{\cdot j}$, and $B_{(j)} = B_n - r_j r_j^*$.

Fix $z \in \mathbb{C}^+$ and let $m_n(z) = m_{F^{B_n}}(z)$, $\underline{m}_n(z) = m_{F^{(1/N) X_n^* T_n X_n}}(z)$. By (1.5) we have
\[ (3.1)\qquad \underline{m}_n(z) = -\frac{1 - c_n}{z} + c_n\, m_n(z). \]
We first derive an identity for $\underline{m}_n(z)$. Write
\[ B_n - zI + zI = \sum_{j=1}^N r_j r_j^*. \]
SLIDE 14

Taking the inverse of $B_n - zI$ on the right on both sides and using Lemma 2.1 we find
\[ I + z(B_n - zI)^{-1} = \sum_{j=1}^N \frac{1}{1 + r_j^*(B_{(j)} - zI)^{-1} r_j}\, r_j r_j^* (B_{(j)} - zI)^{-1}. \]
Taking the trace on both sides and dividing by $N$ we have
\[ c_n + z\, c_n\, m_n(z) = \frac{1}{N} \sum_{j=1}^N \frac{r_j^*(B_{(j)} - zI)^{-1} r_j}{1 + r_j^*(B_{(j)} - zI)^{-1} r_j} = 1 - \frac{1}{N} \sum_{j=1}^N \frac{1}{1 + r_j^*(B_{(j)} - zI)^{-1} r_j}. \]
Therefore
\[ (3.2)\qquad \underline{m}_n(z) = -\frac{1}{N} \sum_{j=1}^N \frac{1}{z\big(1 + r_j^*(B_{(j)} - zI)^{-1} r_j\big)}. \]
Write
\[ B_n - zI - \big({-z\underline{m}_n(z) T_n - zI}\big) = \sum_{j=1}^N r_j r_j^* - \big({-z\underline{m}_n(z)}\big) T_n. \]
Taking inverses and using Lemma 2.1 and (3.2) we have
\[ \big({-z\underline{m}_n(z) T_n - zI}\big)^{-1} - (B_n - zI)^{-1} = \big({-z\underline{m}_n(z) T_n - zI}\big)^{-1} \Big( \sum_{j=1}^N r_j r_j^* - \big({-z\underline{m}_n(z)}\big) T_n \Big)(B_n - zI)^{-1} \]
\[ = \sum_{j=1}^N \frac{-1}{z\big(1 + r_j^*(B_{(j)} - zI)^{-1} r_j\big)} \Big[ \big(\underline{m}_n(z) T_n + I\big)^{-1} r_j r_j^* (B_{(j)} - zI)^{-1} - \frac{1}{N}\big(\underline{m}_n(z) T_n + I\big)^{-1} T_n (B_n - zI)^{-1} \Big]. \]

SLIDE 15

Taking the trace and dividing by $n$ we find
\[ \frac{1}{n}\, \mathrm{tr}\, \big({-z\underline{m}_n(z) T_n - zI}\big)^{-1} - m_n(z) = \frac{1}{N} \sum_{j=1}^N \frac{-1}{z\big(1 + r_j^*(B_{(j)} - zI)^{-1} r_j\big)}\, d_j, \]
where
\[ d_j = q_j^* T_n^{1/2} (B_{(j)} - zI)^{-1} \big(\underline{m}_n(z) T_n + I\big)^{-1} T_n^{1/2} q_j - \frac{1}{n}\, \mathrm{tr}\,\big(\underline{m}_n(z) T_n + I\big)^{-1} T_n (B_n - zI)^{-1}. \]

The derivation for Theorem 1.3 will proceed in a constructive way. Here we let $x_j$ and $r_j$ denote, respectively, the $j$th columns of $X_n$ and $R_n$ (after truncation). As before $m_n = m_{F^{B_n}}$, and let $\underline{m}_n(z) = m_{F^{(1/N)(R_n + \sigma X_n)^*(R_n + \sigma X_n)}}(z)$. We have again the relationship (3.1). Notice then that equation (1.3) can be written
\[ (3.3)\qquad m = \int \frac{1}{\dfrac{t}{1 + \sigma^2 c m} - \sigma^2 z \underline{m} - z}\, dH(t), \]
where $\underline{m} = -\dfrac{1 - c}{z} + cm$. Let $B_{(j)} = B_n - (1/N)(r_j + \sigma x_j)(r_j + \sigma x_j)^*$. Then, as in (3.2), we have
\[ (3.4)\qquad \underline{m}_n(z) = -\frac{1}{N} \sum_{j=1}^N \frac{1}{z\big(1 + (1/N)(r_j + \sigma x_j)^*(B_{(j)} - zI)^{-1}(r_j + \sigma x_j)\big)}. \]

SLIDE 16

Pick $z \in \mathbb{C}^+$. For any $n \times n$ $Y_n$ we write
\[ B_n - zI - (Y_n - zI) = \frac{1}{N} \sum_{j=1}^N (r_j + \sigma x_j)(r_j + \sigma x_j)^* - Y_n. \]
Taking inverses, dividing by $n$ and using Lemma 2.1 we get
\[ \frac{1}{n}\, \mathrm{tr}\,(Y_n - zI)^{-1} - m_n(z) = \frac{1}{N} \sum_{j=1}^N \frac{(1/n)(r_j + \sigma x_j)^*(B_{(j)} - zI)^{-1}(Y_n - zI)^{-1}(r_j + \sigma x_j)}{1 + (1/N)(r_j + \sigma x_j)^*(B_{(j)} - zI)^{-1}(r_j + \sigma x_j)} - \frac{1}{n}\, \mathrm{tr}\,(Y_n - zI)^{-1} Y_n (B_n - zI)^{-1}. \]
The goal is to determine $Y_n$ so that each term goes to zero. Notice first that
\[ \frac{1}{n}\, x_j^*(B_{(j)} - zI)^{-1}(Y_n - zI)^{-1} x_j \approx \frac{1}{n}\, \mathrm{tr}\,(B_n - zI)^{-1}(Y_n - zI)^{-1}, \]
so from (3.4) we see that $Y_n$ should have a term $-\sigma^2 z \underline{m}_n(z) I$. Since for any $n \times n$ $C$ bounded in norm
\[ |(1/n)\, x_j^* C r_j|^2 = (1/n^2)\, x_j^* C r_j r_j^* C^* x_j \]

SLIDE 17

we have from Lemma 2.3
\[ (3.5)\qquad |(1/n)\, x_j^* C r_j|^2 \approx (1/n^2)\, \mathrm{tr}\, C r_j r_j^* C^* = (1/n^2)\, r_j^* C^* C r_j = o(1) \]
(from truncation, $(1/N)\|r_j\|^2 \le \ln n$), so the cross terms are negligible. This leaves us with
\[ (1/n)\, r_j^*(B_{(j)} - zI)^{-1}(Y_n - zI)^{-1} r_j. \]
Recall Corollary 2.1:
\[ a^*(A + (a+b)(a+b)^*)^{-1} = \frac{1 + b^* A^{-1}(a+b)}{1 + (a+b)^* A^{-1}(a+b)}\, a^* A^{-1} - \frac{a^* A^{-1}(a+b)}{1 + (a+b)^* A^{-1}(a+b)}\, b^* A^{-1}. \]
Identify $a$ with $(1/\sqrt{N}) r_j$, $b$ with $(1/\sqrt{N}) \sigma x_j$, and $A$ with $B_{(j)}$. Using Lemmas 2.2, 2.3 and (3.5), we have
\[ \frac{1}{n}\, r_j^*(B_n - zI)^{-1}(Y_n - zI)^{-1} r_j \approx \frac{1 + \sigma^2 c_n m_n(z)}{1 + \frac{1}{N}(r_j + \sigma x_j)^*(B_{(j)} - zI)^{-1}(r_j + \sigma x_j)}\; \frac{1}{n}\, r_j^*(B_{(j)} - zI)^{-1}(Y_n - zI)^{-1} r_j. \]
Therefore
\[ \frac{1}{N} \sum_{j=1}^N \frac{(1/n)\, r_j^*(B_{(j)} - zI)^{-1}(Y_n - zI)^{-1} r_j}{1 + \frac{1}{N}(r_j + \sigma x_j)^*(B_{(j)} - zI)^{-1}(r_j + \sigma x_j)} \approx \frac{1}{N} \sum_{j=1}^N \frac{(1/n)\, r_j^*(B_n - zI)^{-1}(Y_n - zI)^{-1} r_j}{1 + \sigma^2 c_n m_n(z)} \]

SLIDE 18

\[ = \frac{1}{n}\, \frac{1}{1 + \sigma^2 c_n m_n(z)}\, \mathrm{tr}\, (1/N) R_n R_n^* (B_n - zI)^{-1}(Y_n - zI)^{-1}. \]
So we should take
\[ Y_n = \frac{1}{1 + \sigma^2 c_n m_n(z)}\, (1/N) R_n R_n^* - \sigma^2 z \underline{m}_n(z) I. \]
Then $(1/n)\, \mathrm{tr}\,(Y_n - zI)^{-1}$ will approach the right-hand side of (3.3).

SLIDE 19

4. Proof of uniqueness of (1.1). For $m \in \mathbb{C}^+$ satisfying (1.1) with $z \in \mathbb{C}^+$ we have
\[ m = \int \frac{1}{\tau - \big( z - c\int \frac{t}{1+tm}\, dH(t) \big)}\, dA(\tau) = \int \frac{1}{\tau - \Re\big( z - c\int \frac{t}{1+tm}\, dH(t) \big) - i\Big( \Im z + c\int \frac{t^2 \Im m}{|1+tm|^2}\, dH(t) \Big)}\, dA(\tau). \]
Therefore
\[ (4.1)\qquad \Im m = \Big( \Im z + c\int \frac{t^2 \Im m}{|1+tm|^2}\, dH(t) \Big) \int \frac{1}{\big| \tau - z + c\int \frac{t}{1+tm}\, dH(t) \big|^2}\, dA(\tau). \]
Suppose $\bar m \in \mathbb{C}^+$ also satisfies (1.1). Then
\[ (4.2)\qquad m - \bar m = \int \frac{c\int \big( \frac{t}{1+t\bar m} - \frac{t}{1+tm} \big)\, dH(t)}{\big( \tau - z + c\int \frac{t}{1+tm}\, dH(t) \big)\big( \tau - z + c\int \frac{t}{1+t\bar m}\, dH(t) \big)}\, dA(\tau) \]
\[ = (m - \bar m)\; c\int \frac{t^2}{(1+tm)(1+t\bar m)}\, dH(t) \times \int \frac{1}{\big( \tau - z + c\int \frac{t}{1+tm}\, dH(t) \big)\big( \tau - z + c\int \frac{t}{1+t\bar m}\, dH(t) \big)}\, dA(\tau). \]
Using Cauchy-Schwarz and (4.1) we have

SLIDE 20

\[ \left| c\int \frac{t^2}{(1+tm)(1+t\bar m)}\, dH(t) \times \int \frac{dA(\tau)}{\big( \tau - z + c\int \frac{t}{1+tm}\, dH(t) \big)\big( \tau - z + c\int \frac{t}{1+t\bar m}\, dH(t) \big)} \right| \]
\[ \le \left( c\int \frac{t^2}{|1+tm|^2}\, dH(t) \int \frac{dA(\tau)}{\big| \tau - z + c\int \frac{t}{1+tm}\, dH(t) \big|^2} \right)^{1/2} \times \left( c\int \frac{t^2}{|1+t\bar m|^2}\, dH(t) \int \frac{dA(\tau)}{\big| \tau - z + c\int \frac{t}{1+t\bar m}\, dH(t) \big|^2} \right)^{1/2} \]
\[ = \left( c\int \frac{t^2}{|1+tm|^2}\, dH(t)\; \frac{\Im m}{\Im z + c\int \frac{t^2 \Im m}{|1+tm|^2}\, dH(t)} \right)^{1/2} \times \left( c\int \frac{t^2}{|1+t\bar m|^2}\, dH(t)\; \frac{\Im \bar m}{\Im z + c\int \frac{t^2 \Im \bar m}{|1+t\bar m|^2}\, dH(t)} \right)^{1/2} < 1. \]
Therefore, from (4.2) we must have $m = \bar m$.

SLIDE 21

5. Truncation and Centralization. We outline here the steps taken to enable us to assume, in the proof of Theorem 1.1, that for each $n$ the $X_{ij}$'s are bounded by a multiple of $\ln n$. The following lemmas are needed.

Lemma 5.1. Let $X_1, \ldots, X_n$ be i.i.d. Bernoulli with $p = P(X_1 = 1) < 1/2$. Then for any $\epsilon > 0$ such that $p + \epsilon \le 1/2$ we have
\[ P\Big( \frac{1}{n}\sum_{i=1}^n X_i - p \ge \epsilon \Big) \le e^{-\frac{n\epsilon^2}{2(p+\epsilon)}}. \]

Lemma 5.2. Let $A$ be $N \times N$ Hermitian, $Q$, $\bar Q$ both $n \times N$, and $T$, $\bar T$ both $n \times n$ Hermitian. Then
a) $\|F^{A + Q^* T Q} - F^{A + \bar Q^* T \bar Q}\| \le \frac{2}{N}\, \mathrm{rank}(Q - \bar Q)$, and
b) $\|F^{A + Q^* T Q} - F^{A + Q^* \bar T Q}\| \le \frac{1}{N}\, \mathrm{rank}(T - \bar T)$.

Lemma 5.3. For rectangular $A$, $\mathrm{rank}(A) \le$ the number of nonzero entries of $A$.

Lemma 5.4. For Hermitian $N \times N$ matrices $A$, $B$,
\[ \sum_{i=1}^N \big( \lambda_i^A - \lambda_i^B \big)^2 \le \mathrm{tr}\,(A - B)^2. \]

SLIDE 22

Lemma 5.5. Let $\{f_i\}$ be an enumeration of all continuous functions that take a constant value $\frac{1}{m}$ ($m$ a positive integer) on $[a,b]$, where $a, b$ are rational, $0$ on $(-\infty, a - \frac{1}{m}] \cup [b + \frac{1}{m}, \infty)$, and are linear on each of $[a - \frac{1}{m}, a]$, $[b, b + \frac{1}{m}]$. Then
a) for $F_1, F_2 \in M(\mathbb{R})$
\[ D(F_1, F_2) \equiv \sum_{i=1}^\infty \Big| \int f_i\, dF_1 - \int f_i\, dF_2 \Big|\, 2^{-i} \]
is a metric on $M(\mathbb{R})$ inducing the topology of vague convergence.
b) For $F_N, G_N \in M(\mathbb{R})$,
\[ \lim_{N\to\infty} \|F_N - G_N\| = 0 \implies \lim_{N\to\infty} D(F_N, G_N) = 0. \]
c) For empirical distribution functions $F$, $G$ on the (respective) sets $\{x_1, \ldots, x_N\}$, $\{y_1, \ldots, y_N\}$,
\[ D^2(F, G) \le \Big( \frac{1}{N} \sum_{j=1}^N |x_j - y_j| \Big)^2 \le \frac{1}{N} \sum_{j=1}^N (x_j - y_j)^2. \]

Let $p_n = P(|X_{11}| \ge \sqrt{n})$. Since the second moment of $X_{11}$ is finite we have
\[ (5.1)\qquad n p_n = o(1). \]

SLIDE 23

Let $\hat X_{ij} = X_{ij} I_{(|X_{ij}| < \sqrt{n})}$ and $\hat B_n = A_n + (1/N)\, \hat X_n^* T_n \hat X_n$, where $\hat X_n = (\hat X_{ij})$. Then from Lemmas 5.2 a), 5.3, for any positive $\epsilon$,
\[ P\big( \|F^{B_n} - F^{\hat B_n}\| \ge \epsilon \big) \le P\Big( \frac{2}{N} \sum_{ij} I_{(|X_{ij}| \ge \sqrt{n})} \ge \epsilon \Big) = P\Big( \frac{1}{Nn} \sum_{ij} I_{(|X_{ij}| \ge \sqrt{n})} - p_n \ge \frac{\epsilon}{2n} - p_n \Big). \]
Then by Lemma 5.1, for all $n$ large,
\[ P\big( \|F^{B_n} - F^{\hat B_n}\| \ge \epsilon \big) \le e^{-\frac{N\epsilon}{16}}, \]
which is summable. Therefore $\|F^{B_n} - F^{\hat B_n}\| \xrightarrow{\text{a.s.}} 0$.

Let $\tilde B_n = A_n + (1/N)\, \tilde X_n^* T_n \tilde X_n$, where $\tilde X_n = \hat X_n - \mathrm{E}\hat X_n$. Since $\mathrm{rank}(\mathrm{E}\hat X_n) \le 1$, we have from Lemma 5.2 a) $\|F^{\hat B_n} - F^{\tilde B_n}\| \to 0$.

For $\alpha > 0$ define $T_\alpha = \mathrm{diag}\big( t^n_1 I_{(|t^n_1| \le \alpha)}, \ldots, t^n_n I_{(|t^n_n| \le \alpha)} \big)$, and let $Q$ be any $n \times N$ matrix. If $\alpha$ and $-\alpha$ are continuity points of $H$, we have by Lemma 5.2 b)
\[ \|F^{A_n + Q^* T_n Q} - F^{A_n + Q^* T_\alpha Q}\| \le \frac{1}{N}\, \mathrm{rank}(T_n - T_\alpha) = \frac{1}{N} \sum_{i=1}^n I_{(|t^n_i| > \alpha)} \xrightarrow{\text{a.s.}} c\, H\{[-\alpha, \alpha]^c\}. \]

SLIDE 24

It follows that if $\alpha = \alpha_n \to \infty$, then
\[ \|F^{A_n + Q^* T_n Q} - F^{A_n + Q^* T_\alpha Q}\| \xrightarrow{\text{a.s.}} 0. \]
Let $\hat X_{ij} = X_{ij} I_{(|X_{ij}| < \ln n)} - \mathrm{E} X_{ij} I_{(|X_{ij}| < \ln n)}$, $\hat X_n = \big( (1/\sqrt{N}) \hat X_{ij} \big)$, $\tilde X_{ij} = X_{ij} - \hat X_{ij}$, and $\tilde X_n = \big( (1/\sqrt{N}) \tilde X_{ij} \big)$. Then, from Lemmas 5.5 c) and 5.4 and simple applications of Cauchy-Schwarz we have
\[ D^2\big( F^{A_n + \hat X_n^* T_\alpha \hat X_n},\; F^{A_n + X_n^* T_\alpha X_n} \big) \le \frac{1}{N}\, \mathrm{tr}\big( \hat X_n^* T_\alpha \hat X_n - X_n^* T_\alpha X_n \big)^2 \]
\[ \le \frac{1}{N}\Big[ \mathrm{tr}\big( \tilde X_n^* T_\alpha \tilde X_n \big)^2 + 4\, \mathrm{tr}\big( \hat X_n^* T_\alpha \hat X_n\, \tilde X_n^* T_\alpha \tilde X_n \big) + 4\Big( \mathrm{tr}\big( \hat X_n^* T_\alpha \hat X_n\, \tilde X_n^* T_\alpha \tilde X_n \big)\, \mathrm{tr}\big( \tilde X_n^* T_\alpha \tilde X_n \big)^2 \Big)^{1/2} \Big]. \]
We have
\[ \mathrm{tr}\big( \tilde X_n^* T_\alpha \tilde X_n \big)^2 \le \alpha^2\, \mathrm{tr}\big( \tilde X_n \tilde X_n^* \big)^2 \quad\text{and}\quad \mathrm{tr}\big( \hat X_n^* T_\alpha \hat X_n\, \tilde X_n^* T_\alpha \tilde X_n \big) \le \Big( \alpha^4\, \mathrm{tr}\big( \hat X_n \hat X_n^* \big)^2\, \mathrm{tr}\big( \tilde X_n \tilde X_n^* \big)^2 \Big)^{1/2}. \]
Therefore, to verify $D\big( F^{A_n + \hat X_n^* T_\alpha \hat X_n}, F^{A_n + X_n^* T_\alpha X_n} \big) \xrightarrow{\text{a.s.}} 0$ it is sufficient to find a sequence $\{\alpha_n\}$ increasing to $\infty$ so that
\[ \alpha_n^4\, \frac{1}{N}\, \mathrm{tr}\big( \tilde X_n \tilde X_n^* \big)^2 \xrightarrow{\text{a.s.}} 0 \quad\text{and}\quad \frac{1}{N}\, \mathrm{tr}\big( \hat X_n \hat X_n^* \big)^2 = O(1) \text{ a.s.} \]

SLIDE 25

The details are omitted. Notice the matrix $\mathrm{diag}\big( \mathrm{E}|\hat X_{11}|^2 t^n_1, \ldots, \mathrm{E}|\hat X_{11}|^2 t^n_n \big)$ also satisfies assumption a) of Theorem 1.1. Just substitute this matrix for $T_n$, and replace $\hat X_n$ by $\big( 1/\sqrt{\mathrm{E}|\hat X_{11}|^2} \big) \hat X_n$. Therefore we may assume

1) $X_{ij}$ are i.i.d. for fixed $n$,
2) $|X_{11}| \le a \ln n$ for some positive $a$,
3) $\mathrm{E}X_{11} = 0$, $\mathrm{E}|X_{11}|^2 = 1$.

SLIDE 26

6. The limiting distributions. The Stieltjes transform provides a great deal of information about the nature of the limiting distribution $\hat F$ when $A_n = 0$ in Theorem 1.1, and of $F$ in Theorems 1.2, 1.3. For the first two,
\[ z = -\frac{1}{m} + c \int \frac{t}{1 + tm}\, dH(t) \]
is the inverse of $m = m_{\hat F}(z)$, the limiting Stieltjes transform of $F^{(1/N)X_n^* T_n X_n}$. Recall, when $T_n$ is nonnegative definite, the relationships between $F$, the limit of $F^{(1/N)T_n^{1/2} X_n X_n^* T_n^{1/2}}$, and $\hat F$:
\[ \hat F(x) = (1 - c)\, I_{[0,\infty)}(x) + c\, F(x), \]
and between $m_F$ and $m_{\hat F}$:
\[ m_{\hat F}(z) = -\frac{1 - c}{z} + c\, m_F(z). \]
Based solely on the inverse of $m_{\hat F}$, the following is shown in S. and Choi (1995):

1. For all $x \in \mathbb{R}$, $x \ne 0$,
\[ \lim_{z \in \mathbb{C}^+ \to x} m_{\hat F}(z) \equiv m_0(x) \]
exists. The function $m_0$ is continuous on $\mathbb{R} \setminus \{0\}$. Consequently, by property 5 of Stieltjes transforms, $\hat F$ has a continuous derivative $\hat f$ on $\mathbb{R} \setminus \{0\}$ given by $\hat f(x) = \frac{1}{\pi} \Im m_0(x)$ ($F$ subsequently has derivative $f = \frac{1}{c}\hat f$). The density $\hat f$ is analytic (possesses a power series expansion) at every $x \ne 0$ for which $\hat f(x) > 0$. Moreover, for these $x$, $\pi \hat f(x)$ is the imaginary part of the unique $m \in \mathbb{C}^+$ satisfying
\[ x = -\frac{1}{m} + c \int \frac{t}{1 + tm}\, dH(t). \]

SLIDE 27

2. Let $x_{\hat F}$ denote the above function of $m$. It is defined and analytic on
\[ B \equiv \{ m \in \mathbb{R} : m \ne 0,\; -m^{-1} \in S_H^c \} \]
($S_G^c$ denoting the complement of the support of the distribution $G$). Then if $x \in S_{\hat F}^c$, we have $m = m_0(x) \in B$ and $x_{\hat F}'(m) > 0$. Conversely, if $m \in B$ and $x_{\hat F}'(m) > 0$, then $x = x_{\hat F}(m) \in S_{\hat F}^c$.

We see then a systematic way of determining the support of $\hat F$: plot $x_{\hat F}(m)$ for $m \in B$. Remove all intervals on the vertical axis corresponding to places where $x_{\hat F}$ is increasing. What remains is $S_{\hat F}$, the support of $\hat F$.

Let us look at an example where $H$ places mass at 1, 3, and 10, with respective probabilities .2, .4, and .4, and $c = .1$.
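The support-determination recipe is easy to put in code. As a verifiable special case, the sketch below takes $H$ a point mass at $1$ (the Marčenko-Pastur case), where the extreme values of $x_{\hat F}$ are known in closed form to be $(1 \mp \sqrt{c})^2$; the example $H$ above works the same way with more terms in $x_{\hat F}$. The grid endpoints and iteration counts are arbitrary choices:

```python
import numpy as np

c = 0.1

def x_of_m(m):
    # x_{F^}(m) = -1/m + c * t/(1+tm) with H a point mass at t = 1
    return -1.0 / m + c / (1.0 + m)

def dx_of_m(m):
    return 1.0 / m**2 - c / (1.0 + m)**2

def bisect(f, lo, hi, iters=200):
    # plain bisection: f changes sign on [lo, hi]
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# B = {m real, m != 0, -1/m outside supp H}.  The critical points of x_{F^}
# on B give the support edges; one lies in (-2, -1), one in (-1, -0.1).
m_left = bisect(dx_of_m, -2.0, -1.0 - 1e-9)
m_right = bisect(dx_of_m, -1.0 + 1e-9, -0.1)

edges = sorted([x_of_m(m_left), x_of_m(m_right)])
expected = [(1 - np.sqrt(c))**2, (1 + np.sqrt(c))**2]
assert np.allclose(edges, expected, atol=1e-6)
```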

SLIDE 28

[Figure: graphs (a) and (b) for this example; described on the next slide.]

SLIDE 29

Figure (b) is the graph of
\[ x_{\hat F}(m) = -\frac{1}{m} + .1\left( .2\,\frac{1}{1 + m} + .4\,\frac{3}{1 + 3m} + .4\,\frac{10}{1 + 10m} \right). \]
We see the support boundaries occur at relative extreme values. These values were estimated, and for values of $x \in S_{\hat F}$, $f(x) = \frac{1}{c\pi} \Im m_0(x)$ was computed using Newton's method on $x = x_{\hat F}(m)$, resulting in figure (a).

It is possible for a support boundary to occur at a boundary of the support of $B$, which would only happen for a nondiscrete $H$. However, we have

3. Suppose support boundary $a$ is such that $m_{\hat F}(a) \in B$, and $a$ is a left-endpoint in the support of $\hat F$. Then for $x > a$ and near $a$,
\[ f(x) = \Big( \int_a^x g(t)\, dt \Big)^{1/2}, \]
where $g(a) > 0$ (an analogous statement holds for $a$ a right-endpoint in the support of $\hat F$). Thus, near support boundaries, $f$ and the square root function share common features, as can be seen in figure (a). It is remarked here that similar results have been obtained for the matrices in Theorem 1.3; see Dozier and S. b).

Explicit solutions can be derived in a few cases. Consider the Marčenko-Pastur distribution, where $T_n = I$. Then $m = m_0(x)$ solves
\[ x = -\frac{1}{m} + c\, \frac{1}{1 + m}, \]

SLIDE 30

resulting in the quadratic equation $xm^2 + m(x + 1 - c) + 1 = 0$ with solution
\[ m = \frac{-(x + 1 - c) \pm \sqrt{(x + 1 - c)^2 - 4x}}{2x} = \frac{-(x + 1 - c) \pm \sqrt{x^2 - 2x(1 + c) + (1 - c)^2}}{2x} = \frac{-(x + 1 - c) \pm \sqrt{\big(x - (1 - \sqrt{c})^2\big)\big(x - (1 + \sqrt{c})^2\big)}}{2x}. \]
We see the imaginary part of $m$ is zero when $x$ lies outside the interval $[(1 - \sqrt{c})^2, (1 + \sqrt{c})^2]$, and we conclude that
\[ f(x) = \begin{cases} \dfrac{\sqrt{\big(x - (1 - \sqrt{c})^2\big)\big((1 + \sqrt{c})^2 - x\big)}}{2\pi c x}, & x \in \big((1 - \sqrt{c})^2, (1 + \sqrt{c})^2\big) \\[2mm] 0, & \text{otherwise.} \end{cases} \]

The Stieltjes transform in the multivariate $F$ matrix case, that is, when $T_n = \big( (1/N') \bar X_n \bar X_n^* \big)^{-1}$, with $\bar X_n$ $n \times N'$ containing i.i.d. standardized entries, $n/N' \to c' \in (0,1)$, also satisfies a quadratic equation. Indeed, $H$ now is the distribution of the reciprocal of a Marčenko-Pastur distributed random variable, which we'll denote by $X_{c'}$, the Stieltjes transform of its distribution denoted by $m_{X_{c'}}$. We have
\[ x = -\frac{1}{m} + c\, \mathrm{E}\left( \frac{1/X_{c'}}{1 + m/X_{c'}} \right) = -\frac{1}{m} + c\, \mathrm{E}\left( \frac{1}{X_{c'} + m} \right) \]
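The explicit Marčenko-Pastur density above can be checked against a simulated e.d.f.; a minimal sketch in which the dimensions and tolerances are arbitrary choices:

```python
import numpy as np

c = 0.5
a_edge, b_edge = (1 - np.sqrt(c))**2, (1 + np.sqrt(c))**2

def f_mp(x):
    # Marcenko-Pastur density on ((1-sqrt c)^2, (1+sqrt c)^2)
    return np.sqrt((x - a_edge) * (b_edge - x)) / (2 * np.pi * c * x)

def trapezoid(y, x):
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))

# The density integrates to 1 over its support.
xs = np.linspace(a_edge, b_edge, 20001)
total = trapezoid(f_mp(xs), xs)
assert abs(total - 1.0) < 1e-3

# e.d.f. of a simulated B_n = (1/N) X X^T vs. the limiting CDF at x = 1.
rng = np.random.default_rng(0)
n, N = 300, 600
X = rng.standard_normal((n, N))
lam = np.linalg.eigvalsh(X @ X.T / N)

xs1 = np.linspace(a_edge, 1.0, 10001)
cdf_at_1 = trapezoid(f_mp(xs1), xs1)
assert abs(np.mean(lam <= 1.0) - cdf_at_1) < 0.05
```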
SLIDE 31

\[ = -\frac{1}{m} + c\, m_{X_{c'}}(-m). \]
From above we have
\[ m_{X_{c'}}(z) = \frac{1 - c'}{c' z} + \frac{-(z + 1 - c') + \sqrt{(z + 1 - c')^2 - 4z}}{2zc'} = \frac{-z + 1 - c' + \sqrt{(z + 1 - c')^2 - 4z}}{2zc'} \]
(the square root defined so that the expression is a Stieltjes transform), so that $m = m_0(x)$ satisfies
\[ x = -\frac{1}{m} + c\, \frac{m + 1 - c' + \sqrt{(-m + 1 - c')^2 + 4m}}{-2mc'}. \]
It follows that $m$ satisfies
\[ m^2(c'x^2 + cx) + m\big( 2c'x - c^2 + c + cx(1 - c') \big) + c' + c(1 - c') = 0. \]
Solving for $m$ we conclude that, with
\[ b_1 = \left( \frac{1 - \sqrt{1 - (1 - c)(1 - c')}}{1 - c'} \right)^2, \qquad b_2 = \left( \frac{1 + \sqrt{1 - (1 - c)(1 - c')}}{1 - c'} \right)^2, \]
\[ f(x) = \begin{cases} \dfrac{(1 - c')\sqrt{(x - b_1)(b_2 - x)}}{2\pi x (xc' + c)}, & b_1 < x < b_2 \\[2mm] 0, & \text{otherwise.} \end{cases} \]

SLIDE 32

7. Other uses of the Stieltjes transform. We conclude this lecture with two results requiring Stieltjes transforms. The first concerns the eigenvalues of matrices in Theorem 1.2 outside the support of the limiting distribution. The results mentioned so far clearly say nothing about the possibility of some eigenvalues lingering in this region. Consider the example with $T_n$ given earlier, but now with $c = .05$. Below is a scatterplot of the eigenvalues from a simulation with $n = 200$ ($N = 4000$), superimposed on the limiting density.

SLIDE 33

[Figure: scatterplot of the $n = 200$ eigenvalues superimposed on the limiting density; $H$ has masses .2, .4, .4 at 1, 3, 10; $c = .05$, $n = 200$.]

Here the entries of $X_n$ are N(0,1). All the eigenvalues appear to stay close to the limiting support. Such simulations were the prime motivation to prove

SLIDE 34

Theorem 7.1 (Bai and S. (1998)). Let, for any $d > 0$ and d.f. $G$, $\hat F_{d,G}$ denote the limiting e.d.f. of $(1/N) X_n^* T_n X_n$ corresponding to limiting ratio $d$ and limiting $F^{T_n}$ $G$. Assume, in addition to the previous assumptions:

a) $\mathrm{E}X_{11} = 0$, $\mathrm{E}|X_{11}|^2 = 1$, and $\mathrm{E}|X_{11}|^4 < \infty$.
b) $T_n$ is nonrandom and $\|T_n\|$ is bounded in $n$.
c) The interval $[a,b]$ with $a > 0$ lies in an open interval outside the support of $\hat F_{c_n,H_n}$ for all large $n$, where $H_n = F^{T_n}$.

Then $P(\text{no eigenvalue of } B_n \text{ appears in } [a,b] \text{ for all large } n) = 1$.
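The conclusion of Theorem 7.1 is easy to observe in simulation. A sketch for the null case $T_n = I$, where the limiting support is $[(1 - \sqrt{c})^2, (1 + \sqrt{c})^2]$; the dimensions and the margin 0.15 are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
n, N = 400, 1600                 # c_n = 0.25
X = rng.standard_normal((n, N))
lam = np.linalg.eigvalsh(X @ X.T / N)

c = n / N
a_edge, b_edge = (1 - np.sqrt(c))**2, (1 + np.sqrt(c))**2   # [0.25, 2.25]

# With probability one, no eigenvalues eventually appear in any fixed
# interval outside the support; check with a modest margin.
assert lam.min() > a_edge - 0.15
assert lam.max() < b_edge + 0.15
```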

SLIDE 35

Steps in the proof:

1. Let $B_n = (1/N) X_n^* T_n X_n$, $m_n = m_{F^{B_n}}$, and $m_n^0 = m_{\hat F_{c_n,H_n}}$. Then for $z = x + i v_n$,
\[ \sup_{x \in [a,b]} |m_n(z) - m_n^0(z)| = o(1/(N v_n)) \quad \text{a.s.} \]
when $v_n = N^{-1/68}$.

2. The proof of 1. allows 1. to hold for $\Im(z) = \sqrt{2}\, v_n, \sqrt{3}\, v_n, \ldots, \sqrt{34}\, v_n$. Then almost surely
\[ \max_{k \in \{1,\ldots,34\}} \sup_{x \in [a,b]} \big| m_n(x + i\sqrt{k}\, v_n) - m_n^0(x + i\sqrt{k}\, v_n) \big| = o(v_n^{67}). \]
We take the imaginary part of these Stieltjes transforms and get
\[ \max_{k \in \{1,\ldots,34\}} \sup_{x \in [a,b]} \left| \int \frac{d\big( F^{B_n}(\lambda) - \hat F_{c_n,H_n}(\lambda) \big)}{(x - \lambda)^2 + k v_n^2} \right| = o(v_n^{66}) \quad \text{a.s.} \]
Upon taking differences we find, with probability one,
\[ \max_{k_1 \ne k_2} \sup_{x \in [a,b]} \left| \int \frac{v_n^2\; d\big( F^{B_n}(\lambda) - \hat F_{c_n,H_n}(\lambda) \big)}{\big( (x - \lambda)^2 + k_1 v_n^2 \big)\big( (x - \lambda)^2 + k_2 v_n^2 \big)} \right| = o(v_n^{66}), \]
\[ \max_{k_1,k_2,k_3 \text{ distinct}} \sup_{x \in [a,b]} \left| \int \frac{(v_n^2)^2\; d\big( F^{B_n}(\lambda) - \hat F_{c_n,H_n}(\lambda) \big)}{\big( (x - \lambda)^2 + k_1 v_n^2 \big)\big( (x - \lambda)^2 + k_2 v_n^2 \big)\big( (x - \lambda)^2 + k_3 v_n^2 \big)} \right| = o(v_n^{66}), \]
\[ \ldots \]

SLIDE 36

\[ \sup_{x \in [a,b]} \left| \int \frac{(v_n^2)^{33}\; d\big( F^{B_n}(\lambda) - \hat F_{c_n,H_n}(\lambda) \big)}{\big( (x - \lambda)^2 + v_n^2 \big)\big( (x - \lambda)^2 + 2 v_n^2 \big) \cdots \big( (x - \lambda)^2 + 34 v_n^2 \big)} \right| = o(v_n^{66}). \]
Thus with probability one
\[ \sup_{x \in [a,b]} \left| \int \frac{d\big( F^{B_n}(\lambda) - \hat F_{c_n,H_n}(\lambda) \big)}{\big( (x - \lambda)^2 + v_n^2 \big)\big( (x - \lambda)^2 + 2 v_n^2 \big) \cdots \big( (x - \lambda)^2 + 34 v_n^2 \big)} \right| = o(1). \]
Let $0 < a' < a$, $b' > b$ be such that $[a', b']$ is also in the open interval outside the support of $\hat F_{c_n,H_n}$ for all large $n$. We split up the integral and get, with probability one,
\[ \sup_{x \in [a,b]} \left| \int \frac{I_{[a',b']^c}(\lambda)\; d\big( F^{B_n}(\lambda) - \hat F_{c_n,H_n}(\lambda) \big)}{\big( (x - \lambda)^2 + v_n^2 \big) \cdots \big( (x - \lambda)^2 + 34 v_n^2 \big)} + \sum_{\lambda_j \in [a',b']} \frac{v_n^{68}}{\big( (x - \lambda_j)^2 + v_n^2 \big) \cdots \big( (x - \lambda_j)^2 + 34 v_n^2 \big)} \right| = o(1). \]
Now if, for each term in a subsequence satisfying the above, there is at least one eigenvalue contained in $[a,b]$, then the sum, with $x$ evaluated at these eigenvalues, will be uniformly bounded away from 0. Thus, at these same $x$ values, the integral must also stay uniformly bounded away from 0. But the integral MUST converge to zero a.s., since the integrand is bounded and, with probability one, both $F^{B_n}$ and $\hat F_{c_n,H_n}$ converge weakly to the same limit having no mass on $\{a', b'\}$. Contradiction!

SLIDE 37

The last result is on the rate of convergence of linear statistics of the eigenvalues of $B_n$, that is, quantities of the form
\[ \int f(x)\, dF^{B_n}(x) = \frac{1}{n} \sum_{i=1}^n f(\lambda_i), \]
where $f$ is a function defined on $[0,\infty)$ and the $\lambda_i$'s are the eigenvalues of $B_n$. The result establishes the rate to be $1/n$ for analytic $f$. It considers integrals of functions with respect to
\[ G_n(x) = n\big[ F^{B_n}(x) - F_{c_n,H_n}(x) \big], \]
where for any $d > 0$ and d.f. $G$, $F_{d,G}$ is the limiting e.d.f. of $B_n = (1/N) T_n^{1/2} X_n X_n^* T_n^{1/2}$ corresponding to limiting ratio $d$ and limiting $F^{T_n}$ $G$.

Theorem 7.2. Under the assumptions in Theorem 7.1, let $f_1, \ldots, f_r$ be $C^1$ functions on $\mathbb{R}$ with bounded derivatives, analytic on an open interval containing
\[ \Big[ \liminf_n \lambda_{\min}^{T_n}\, I_{(0,1)}(c)(1 - \sqrt{c})^2,\; \limsup_n \lambda_{\max}^{T_n}(1 + \sqrt{c})^2 \Big]. \]
Let $m = m_{\hat F}$. Then

(1) the random vector
\[ (7.1)\qquad \Big( \int f_1(x)\, dG_n(x), \ldots, \int f_r(x)\, dG_n(x) \Big) \]
forms a tight sequence in $n$.

SLIDE 38

(2) If $X_{11}$ and $T_n$ are real and $\mathrm{E}(X_{11}^4) = 3$, then (7.1) converges weakly to a Gaussian vector $(X_{f_1}, \ldots, X_{f_r})$, with means
\[ (7.2)\qquad \mathrm{E}X_f = -\frac{1}{2\pi i} \oint f(z)\, \frac{c \displaystyle\int \frac{m(z)^3 t^2\, dH(t)}{(1 + t m(z))^3}}{\Big( 1 - c \displaystyle\int \frac{m(z)^2 t^2\, dH(t)}{(1 + t m(z))^2} \Big)^2}\, dz \]
and covariance function
\[ (7.3)\qquad \mathrm{Cov}(X_f, X_g) = -\frac{1}{2\pi^2} \oint\!\!\oint \frac{f(z_1) g(z_2)}{(m(z_1) - m(z_2))^2}\, \frac{d}{dz_1} m(z_1)\, \frac{d}{dz_2} m(z_2)\, dz_1\, dz_2 \]
($f, g \in \{f_1, \ldots, f_r\}$). The contours in (7.2) and (7.3) (two in (7.3), which we may assume to be non-overlapping) are closed and are taken in the positive direction in the complex plane, each enclosing the support of $F_{c,H}$.

(3) If $X_{11}$ is complex with $\mathrm{E}(X_{11}^2) = 0$ and $\mathrm{E}(|X_{11}|^4) = 2$, then (2) also holds, except the means are zero and the covariance function is $1/2$ the function given in (7.3).

(4) If the assumptions in (2) or (3) were to hold, then $G_n$, considered as a random element in $D[0,\infty)$ (the space of functions on $[0,\infty)$ that are right-continuous with left-hand limits, together with the Skorohod metric), cannot form a tight sequence in $D[0,\infty)$.

SLIDE 39

The proof relies on the identity
\[ \int f(x)\, dG(x) = -\frac{1}{2\pi i} \oint f(z)\, m_G(z)\, dz \]
($f$ analytic on the support of $G$, contour positively oriented around the support), and establishes the following results on
\[ M_n(z) = n\big[ m_{F^{B_n}}(z) - m_{F_{c_n,H_n}}(z) \big]. \]

a) $\{M_n(z)\}$ forms a tight sequence for $z$ on a sufficiently large contour about the origin.

b) If $X_{11}$ is complex with $\mathrm{E}(X_{11}^2) = 0$ and $\mathrm{E}(|X_{11}|^4) = 2$, then for $z_1, \ldots, z_r$ with nonzero imaginary parts,
\[ \big( \mathrm{Re}\, M_n(z_1), \mathrm{Im}\, M_n(z_1), \ldots, \mathrm{Re}\, M_n(z_r), \mathrm{Im}\, M_n(z_r) \big) \]
converges weakly to a mean zero Gaussian vector. It follows that $M_n$, viewed as a random element in the metric space of continuous $\mathbb{R}^2$-valued functions with domain restricted to a contour in the complex plane, converges weakly to a (2-dimensional) Gaussian process $M$. The limiting covariance function can be derived from the formula
\[ \mathrm{E}\big( M(z_1) M(z_2) \big) = \frac{m'(z_1)\, m'(z_2)}{(m(z_1) - m(z_2))^2} - \frac{1}{(z_1 - z_2)^2}. \]

c) If $X_{11}$ is real and $\mathrm{E}(X_{11}^4) = 3$, then b) still holds, except the limiting mean can be derived from

SLIDE 40

\[ \mathrm{E}M(z) = \frac{c \displaystyle\int \frac{m^3 t^2\, dH(t)}{(1 + tm)^3}}{\Big( 1 - c \displaystyle\int \frac{m^2 t^2\, dH(t)}{(1 + tm)^2} \Big)^2} \]
and the "covariance function" is twice that of the above function.

SLIDE 41

The difference between (2) and (3), and the difficulty in extending beyond these two cases, arise from
\[ \mathrm{E}\big( X_{\cdot 1}^* A X_{\cdot 1} - \mathrm{tr}\, A \big)\big( X_{\cdot 1}^* B X_{\cdot 1} - \mathrm{tr}\, B \big) = \big( \mathrm{E}(|X_{11}|^4) - |\mathrm{E}(X_{11}^2)|^2 - 2 \big) \sum_i a_{ii} b_{ii} + |\mathrm{E}(X_{11}^2)|^2\, \mathrm{tr}\, AB^T + \mathrm{tr}\, AB, \]
valid for square matrices $A$ and $B$.

One can show
\[ (7.2) = \frac{1}{2\pi} \int f'(x)\, \arg\Big( 1 - c \int \frac{t^2 m^2(x)}{(1 + t m(x))^2}\, dH(t) \Big)\, dx \]
and
\[ (7.3) = \frac{1}{\pi^2} \int\!\!\int f'(x)\, g'(y)\, \ln \left| \frac{m(x) - \overline{m(y)}}{m(x) - m(y)} \right| dx\, dy = \frac{1}{2\pi^2} \int\!\!\int f'(x)\, g'(y)\, \ln\Big( 1 + 4\, \frac{m_i(x)\, m_i(y)}{|m(x) - m(y)|^2} \Big)\, dx\, dy, \]
where $m_i = \Im m$.

SLIDE 42

For case (2) with $H = I_{[1,\infty)}$ we have, for $f(x) = \ln x$ and $c \in (0,1)$,
\[ \mathrm{E}X_{\ln} = \frac{1}{2} \ln(1 - c) \qquad\text{and}\qquad \mathrm{Var}\, X_{\ln} = -2 \ln(1 - c). \]
Also, for $c > 0$,
\[ \mathrm{E}X_{x^r} = \frac{1}{4}\Big( (1 - \sqrt{c})^{2r} + (1 + \sqrt{c})^{2r} \Big) - \frac{1}{2} \sum_{j=0}^r \binom{r}{j}^2 c^j \]
and
\[ \mathrm{Cov}\big( X_{x^{r_1}}, X_{x^{r_2}} \big) = 2\, c^{r_1 + r_2} \sum_{k_1=0}^{r_1 - 1} \sum_{k_2=0}^{r_2} \binom{r_1}{k_1} \binom{r_2}{k_2} \Big( \frac{1 - c}{c} \Big)^{k_1 + k_2} \sum_{\ell=1}^{r_1 - k_1} \ell \binom{2r_1 - 1 - (k_1 + \ell)}{r_1 - 1} \binom{2r_2 - 1 - k_2 + \ell}{r_2 - 1} \]
(see Jonsson (1982)).
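These constants can be spot-checked by simulation. For $f(x) = x$ with $T_n = I$ the formulas above give $\mathrm{E}X_x = \frac{1}{4}(2 + 2c) - \frac{1}{2}(1 + c) = 0$ and $\mathrm{Var}\,X_x = 2c$, and $\int x\, dG_n(x) = \mathrm{tr}\,B_n - n$ is cheap to sample (the mean of $F_{c,H}$ is 1 here). Dimensions and replication count are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
n, N = 100, 200                  # c = 0.5; real Gaussian entries, so E X^4 = 3
c = n / N
reps = 2000

stats = np.empty(reps)
for k in range(reps):
    X = rng.standard_normal((n, N))
    # integral of f(x) = x against G_n:  tr B_n - n * (mean of F_{c,H}) = tr B_n - n
    stats[k] = np.sum(X * X) / N - n

# Theorem 7.2 constants for f(x) = x, H a point mass at 1:
assert abs(stats.mean()) < 0.15          # E X_x = 0
assert abs(stats.var() - 2 * c) < 0.2    # Var X_x = 2c = 1
```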