A CLT for Information-Theoretic Statistics of Gram Random Matrices

Malika Kharouf
Joint work with W. Hachem, J. Najim and J. Silverstein
October 12, 2010
Workshop "Large Random Matrices and their applications", October 11-13, 2010

The Model: A Non-Centered Random Matrix
Consider a $p \times n$ random matrix
$$\Sigma_n = \frac{1}{\sqrt{n}} X_n + A_n ,$$
where
◮ the entries $X^n_{ij}$, $1 \le i \le p$, $1 \le j \le n$, are i.i.d., centered, with unit variance and $\mathbb{E}|X_{11}|^{16} < \infty$;
◮ $A_n$ is a $p \times n$ deterministic matrix with uniformly bounded spectral norm.

The Model: Information-Theoretic Statistics of Gram Random Matrices
Linear spectral statistics:
$$I_n(\rho) = \frac{1}{p} \sum_{i=1}^{p} \log\left(\lambda_i^{(n)} + \rho\right),$$
where $\lambda_i^{(n)}$, $i = 1, \dots, p$, are the eigenvalues of the Gram random matrix $\Sigma_n \Sigma_n^*$ and $\rho$ is a nonnegative parameter.
Objective: understand the asymptotic distribution of the fluctuations of $I_n(\rho)$ when the dimensions of the matrix $\Sigma_n$ go to infinity at the same pace, and obtain a simple form of the variance.

Plan
◮ Motivations: Mutual Information for Multiple Antenna Radio Channels
◮ Asymptotic behavior of $I_n(\rho)$: First-order results
    ◮ Fundamental system of equations
    ◮ Deterministic equivalents
◮ Study of the fluctuations
    ◮ Definition of the variance
    ◮ The Central Limit Theorem
◮ Outline of the proof of the CLT
    ◮ The approach: REFORM method
    ◮ Main steps of the proof
◮ The bias
    ◮ Outline of the proof of the bias term

Motivations: Mutual Information for Multiple Antenna Radio Channels

Multi-user MIMO scheme
[Figure: MIMO Systems]

MIMO System: Mathematical Model
The $p$-dimensional received vector $r_n$ is given by
$$r_n = \Sigma_n t_n + b_n ,$$
where
◮ $\Sigma_n$ is the channel matrix, which is assumed to be random;
◮ $t_n$ is the $n$-dimensional transmitted vector;
◮ $b_n$ is an additive white Gaussian noise with covariance matrix $\mathbb{E}\, b_n b_n^* = \rho I_p$.
Performance indicator: the mutual information
$$I_n(\rho) = \frac{1}{p} \log\det\left(\Sigma_n \Sigma_n^* + \rho I_p\right) = \frac{1}{p} \sum_{i=1}^{p} \log(\lambda_i + \rho).$$
Question: what is the asymptotic behavior of $I_n(\rho)$ when $n, p \to \infty$ at the same rate?
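The mutual information above can be evaluated numerically on a simulated channel. The following is a minimal sketch, not the authors' code: it assumes real Gaussian entries for $X_n$ (the slides allow general i.i.d. entries), requires $\rho > 0$, and uses only the standard library, computing the log-determinant through a Cholesky factorization. The function names `sample_sigma`, `gram_plus_rho`, `log_det_spd` and `mutual_information` are hypothetical.

```python
import math
import random


def sample_sigma(a, rng):
    """Draw Sigma = X / sqrt(n) + A for a real p x n mean matrix A (list of rows).

    Assumption: the entries of X are i.i.d. standard real Gaussians.
    """
    p, n = len(a), len(a[0])
    return [[rng.gauss(0.0, 1.0) / math.sqrt(n) + a[i][j] for j in range(n)]
            for i in range(p)]


def gram_plus_rho(sigma, rho):
    """Return the p x p matrix Sigma Sigma^T + rho * I."""
    p = len(sigma)
    return [[sum(x * y for x, y in zip(sigma[i], sigma[j])) + (rho if i == j else 0.0)
             for j in range(p)]
            for i in range(p)]


def log_det_spd(m):
    """log det of a symmetric positive definite matrix via Cholesky (m = L L^T)."""
    p = len(m)
    low = [[0.0] * p for _ in range(p)]
    total = 0.0
    for i in range(p):
        for j in range(i + 1):
            s = m[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][i] = math.sqrt(s)
                total += 2.0 * math.log(low[i][i])  # det(m) = prod of L_ii^2
            else:
                low[i][j] = s / low[j][j]
    return total


def mutual_information(sigma, rho):
    """I_n(rho) = (1/p) log det(Sigma Sigma^T + rho I_p), for rho > 0."""
    return log_det_spd(gram_plus_rho(sigma, rho)) / len(sigma)


# Usage: one draw of a centered 2 x 4 channel (A = 0) at rho = 1.
rng = random.Random(0)
sigma = sample_sigma([[0.0] * 4 for _ in range(2)], rng)
print(mutual_information(sigma, 1.0))
```

For matrices of realistic size a numerical linear algebra library would normally replace the hand-written Cholesky step; the pure-Python version is kept here only so that the sketch has no dependencies.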

Asymptotic behavior of $I_n(\rho)$: First-order results

First-order results
Let $f_n$ denote the Stieltjes transform (ST) of $\mu^{\Sigma_n \Sigma_n^*}$, the spectral measure of the eigenvalues of $\Sigma_n \Sigma_n^*$. Then
$$I_n(\rho) = - \int_\rho^\infty f_n(-\omega)\, d\omega .$$
Hence the asymptotic behavior of $I_n(\rho)$ is closely linked to the asymptotic behavior of $f_n$ as $p, n \to \infty$ at the same pace.
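A short worked derivation of the link between $I_n$ and $f_n$, under the assumption (not stated on the slide) that the Stieltjes transform is defined as $f_n(z) = \int \frac{\mu(d\lambda)}{\lambda - z}$ and that $\rho > 0$. Since $I_n(\omega)$ grows like $\log \omega$, the integral of $f_n(-\omega)$ alone diverges logarithmically, so the relation above is best read formally; one convergent way of writing it subtracts the $1/\omega$ behavior at infinity:

```latex
\begin{align*}
f_n(-\omega) &= \frac{1}{p} \sum_{i=1}^{p} \frac{1}{\lambda_i + \omega}
              = \frac{d}{d\omega}\, I_n(\omega), \\
\int_\rho^R \Bigl( \frac{1}{\omega} - f_n(-\omega) \Bigr) d\omega
             &= \bigl( \log R - I_n(R) \bigr) - \bigl( \log\rho - I_n(\rho) \bigr), \\
I_n(R) - \log R &= \frac{1}{p} \sum_{i=1}^{p} \log\Bigl( 1 + \frac{\lambda_i}{R} \Bigr)
             \xrightarrow[R \to \infty]{} 0, \\
\text{hence}\quad
I_n(\rho) &= \log\rho + \int_\rho^\infty \Bigl( \frac{1}{\omega} - f_n(-\omega) \Bigr) d\omega .
\end{align*}
```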

State of the art
◮ Dozier and Silverstein (04): if $F^{A_n A_n^*} \to H$, where $H$ is a deterministic probability measure, then
$$F^{\Sigma_n \Sigma_n^*} \xrightarrow{\text{weakly}} F ,$$
where $F$ is a deterministic probability measure whose Stieltjes transform is the unique solution of a given coupled equation.
◮ V. L. Girko (91), Hachem-Loubaton-Najim (07): look for a deterministic approximation of the Stieltjes transform $f_n$ of $F^{\Sigma_n \Sigma_n^*}$. There exists a $p \times p$ deterministic matrix-valued function $T_n$ such that
$$f_n(-\rho) - \frac{1}{p} \operatorname{Tr} T_n(-\rho) \xrightarrow[n \to \infty]{\text{a.s.}} 0 .$$

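Fundamental equations
Theorem (Girko ’91, Hachem-Loubaton-Najim ’07). The following system of two equations
$$
\begin{cases}
\delta_n(\rho) = \dfrac{1}{n} \operatorname{Tr} \left[ \rho \bigl( 1 + \tilde\delta_n(\rho) \bigr) I_p + \dfrac{A_n A_n^*}{1 + \delta_n(\rho)} \right]^{-1} \triangleq \dfrac{1}{n} \operatorname{Tr} T_n(\rho), \\[2ex]
\tilde\delta_n(\rho) = \dfrac{1}{n} \operatorname{Tr} \left[ \rho \bigl( 1 + \delta_n(\rho) \bigr) I_n + \dfrac{A_n^* A_n}{1 + \tilde\delta_n(\rho)} \right]^{-1} \triangleq \dfrac{1}{n} \operatorname{Tr} \tilde T_n(\rho),
\end{cases}
$$
admits a unique solution $(\delta_n, \tilde\delta_n)$ in $\mathcal{S}(\mathbb{R}^+)^2$. Moreover,
$$\forall f \in C_B(\mathbb{R}^+), \quad \int_{\mathbb{R}^+} f(\lambda)\, dF^{\Sigma_n \Sigma_n^*}(\lambda) - \int_{\mathbb{R}^+} f(\lambda)\, \pi_n(d\lambda) \xrightarrow[n \to \infty]{\text{a.s.}} 0 ,$$
where $\pi_n$ is the positive measure whose Stieltjes transform is $\delta_n$.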
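As a sanity check, the system can be solved by hand in the centered case $A_n = 0$, which the slides do not treat separately. The ratio $c_n = p/n$ is notation introduced here for the illustration. Both matrices inside the traces are then multiples of the identity, and the system collapses to a quadratic equation in $\delta_n$:

```latex
\begin{align*}
\delta_n &= \frac{c_n}{\rho\,(1 + \tilde\delta_n)}, \qquad
\tilde\delta_n = \frac{1}{\rho\,(1 + \delta_n)}, \qquad c_n = \frac{p}{n}, \\
\rho\,(1 + \tilde\delta_n) &= \frac{\rho\,(1 + \delta_n) + 1}{1 + \delta_n}
\;\Longrightarrow\;
\delta_n = \frac{c_n\,(1 + \delta_n)}{\rho\,(1 + \delta_n) + 1}, \\
0 &= \rho\,\delta_n^2 + (\rho + 1 - c_n)\,\delta_n - c_n, \\
\delta_n &= \frac{-(\rho + 1 - c_n) + \sqrt{(\rho + 1 - c_n)^2 + 4 \rho c_n}}{2\rho} .
\end{align*}
```

Only the root with the plus sign is positive, in line with the uniqueness statement of the theorem.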

First order result: Deterministic equivalents
Theorem (Hachem-Loubaton-Najim ’07). Let $V_n(\rho) = \int_{\mathbb{R}^+} \log(\lambda + \rho)\, \pi_n(d\lambda)$. Then
$$\mathbb{E} I_n(\rho) - V_n(\rho) \xrightarrow[n, p \to \infty,\ p/n \to c > 0]{} 0 .$$
Moreover, $V_n(\rho)$ admits a closed-form expression:
$$V_n(\rho) = \frac{1}{p} \sum_{i=1}^{p} \log\left( \rho \bigl( 1 + \tilde\delta_n \bigr) + \frac{\mu_{n,i}^2}{1 + \delta_n} \right) + \frac{n}{p} \log(1 + \delta_n) - \rho\, \frac{n}{p}\, \delta_n \tilde\delta_n ,$$
where $\mu_{n,i}$ are the singular values of the mean matrix $A_n$.

In the non-centered case, the first-order asymptotic study of the mutual information depends mainly on the limiting behavior of the singular values of the mean matrix $A_n$.
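Because $A_n$ enters the fundamental system and the closed form of $V_n(\rho)$ only through its singular values, both can be evaluated from the list of singular values alone. The sketch below is one possible way to do it, not the authors' procedure: it uses a plain fixed-point iteration, whose convergence is assumed rather than proved here, and it treats eigenvalues of $A_n A_n^*$ and $A_n^* A_n$ not covered by the supplied singular values as zeros. The names `solve_fundamental_system` and `deterministic_equivalent` are hypothetical.

```python
import math


def solve_fundamental_system(mu, p, n, rho, tol=1e-12, max_iter=100000):
    """Solve for (delta, delta_tilde) by fixed-point iteration.

    mu  : singular values of the p x n mean matrix A (at most min(p, n) values)
    rho : positive parameter
    Assumption: the iteration converges for the chosen parameters.
    """
    if len(mu) > min(p, n):
        raise ValueError("at most min(p, n) singular values expected")
    sq = [m * m for m in mu]
    d, dt = 0.0, 0.0
    for _ in range(max_iter):
        # (1/n) Tr [rho (1 + dt) I_p + A A^* / (1 + d)]^{-1}
        d_new = (sum(1.0 / (rho * (1.0 + dt) + s / (1.0 + d)) for s in sq)
                 + (p - len(sq)) / (rho * (1.0 + dt))) / n
        # (1/n) Tr [rho (1 + d) I_n + A^* A / (1 + dt)]^{-1}
        dt_new = (sum(1.0 / (rho * (1.0 + d_new) + s / (1.0 + dt)) for s in sq)
                  + (n - len(sq)) / (rho * (1.0 + d_new))) / n
        done = abs(d_new - d) + abs(dt_new - dt) < tol
        d, dt = d_new, dt_new
        if done:
            return d, dt
    raise RuntimeError("fixed-point iteration did not converge")


def deterministic_equivalent(mu, p, n, rho):
    """Closed-form V_n(rho) evaluated at the solution of the system."""
    d, dt = solve_fundamental_system(mu, p, n, rho)
    sq = [m * m for m in mu]
    log_sum = (sum(math.log(rho * (1.0 + dt) + s / (1.0 + d)) for s in sq)
               + (p - len(sq)) * math.log(rho * (1.0 + dt)))
    return log_sum / p + (n / p) * math.log(1.0 + d) - rho * (n / p) * d * dt


# Usage: p = 2, n = 4, singular values (1.0, 0.5), rho = 1.
print(solve_fundamental_system([1.0, 0.5], 2, 4, 1.0))
print(deterministic_equivalent([1.0, 0.5], 2, 4, 1.0))
```

Working with singular values keeps every trace a scalar sum, so no matrix inversion is needed; a matrix-valued $T_n(\rho)$ would only be required for quantities that also depend on the singular vectors of $A_n$.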

Study of the fluctuations

CLT for $p\,(I_n(\rho) - V_n(\rho))$
To establish the CLT for $p\,(I_n(\rho) - V_n(\rho))$, we study two quantities separately:
◮ the random quantity $p\,(I_n(\rho) - \mathbb{E} I_n(\rho))$, from which the fluctuations arise;
◮ the deterministic quantity $p\,(\mathbb{E} I_n(\rho) - V_n(\rho))$, which yields a bias.

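Asymptotic distribution of the fluctuations: Definition of the variance
Theorem (Hachem-Kharouf-Najim-Silverstein ’10). Let $\vartheta = \mathbb{E} X_{11}^2$, $\kappa = \mathbb{E}|X_{11}|^4 - 2 - |\vartheta|^2$, and let
$$\gamma = \frac{1}{n} \operatorname{Tr} T^2, \quad \tilde\gamma = \frac{1}{n} \operatorname{Tr} \tilde T^2, \quad \underline{\gamma} = \frac{1}{n} \operatorname{Tr} T \bar T, \quad \underline{\tilde\gamma} = \frac{1}{n} \operatorname{Tr} \tilde T \bar{\tilde T} .$$
Denote by
$$
\begin{aligned}
\Theta_n^2 ={}& - \log\left( \left( 1 - \frac{1}{n} \frac{\operatorname{Tr} T A A^* T}{(1 + \delta)^2} \right)^2 - \rho^2 \gamma \tilde\gamma \right) \\
& - \log\left( \left| 1 - \vartheta\, \frac{1}{n} \frac{\operatorname{Tr} \bar T \bar A A^* T}{(1 + \delta)^2} \right|^2 - |\vartheta|^2 \rho^2 \underline{\gamma}\, \underline{\tilde\gamma} \right) \\
& + \kappa\, \frac{\rho^2}{n^2} \sum_i t_{ii}^2 \sum_j \tilde t_{jj}^2 ,
\end{aligned}
$$
where $t_{ii}$ and $\tilde t_{jj}$ denote the diagonal entries of $T$ and $\tilde T$. Then $\Theta_n^2$ is well defined.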

Some remarks
◮ The variance is the sum of three terms: the first term would be the same in the Gaussian case.
◮ The variance depends on the singular values of the mean matrix as well as on its singular vectors.
◮ In the circular case ($X_{ij} \overset{\mathcal{D}}{=} X_{ij} e^{i\alpha}$ for all $\alpha$), the second term disappears.

Asymptotic distribution of the fluctuations: The CLT
Theorem (Hachem-Kharouf-Najim-Silverstein ’10). The following convergence holds true:
$$\frac{p}{\Theta_n} \bigl( I_n(\rho) - \mathbb{E} I_n(\rho) \bigr) \xrightarrow[p, n \to \infty]{\mathcal{D}} \mathcal{N}(0, 1) ,$$
where $\mathcal{D}$ stands for convergence in distribution.

Proof of the CLT: The approach
REFORM (REsolvent FORmula and Martingale).
◮ Write $I_n(\rho) - \mathbb{E} I_n(\rho)$ as a sum of martingale increments.
◮ Identify the variance.
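The first step rests on a standard telescoping argument, sketched here with notation assumed for the illustration: $x_1, \dots, x_n$ are the columns of $X_n$, $\mathcal{F}_j = \sigma(x_1, \dots, x_j)$, $\mathbb{E}_j = \mathbb{E}[\,\cdot \mid \mathcal{F}_j]$ and $\mathbb{E}_0 = \mathbb{E}$. Since $I_n(\rho)$ is a function of all the columns, $\mathbb{E}_n I_n(\rho) = I_n(\rho)$, and the tower property shows that each summand has zero conditional mean:

```latex
\begin{align*}
I_n(\rho) - \mathbb{E} I_n(\rho)
  &= \mathbb{E}_n I_n(\rho) - \mathbb{E}_0 I_n(\rho)
   = \sum_{j=1}^{n} \bigl( \mathbb{E}_j - \mathbb{E}_{j-1} \bigr) I_n(\rho), \\
\mathbb{E}_{j-1} \bigl[ ( \mathbb{E}_j - \mathbb{E}_{j-1} ) I_n(\rho) \bigr]
  &= \mathbb{E}_{j-1} I_n(\rho) - \mathbb{E}_{j-1} I_n(\rho) = 0 .
\end{align*}
```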

CLT for martingales
Theorem. Let $\Gamma_1^{(n)}, \dots, \Gamma_n^{(n)}$ be a sequence of martingale increments with respect to a given filtration $\mathcal{F}_1^{(n)}, \dots, \mathcal{F}_n^{(n)}$. Assume that there exists a sequence of nonnegative real numbers $(\Theta_n^2)_n$, uniformly bounded away from zero and from infinity, such that:
◮ $$\sum_{j=1}^{n} \mathbb{E}\left[ \Gamma_j^{(n)\,2} \,\middle|\, \mathcal{F}_{j-1}^{(n)} \right] - \Theta_n^2 \xrightarrow[n \to \infty]{\mathcal{P}} 0 ;$$
◮ Lyapunov's condition holds:
$$\exists\, \alpha > 0, \quad \frac{1}{\Theta_n^{2(1+\alpha)}} \sum_{j=1}^{n} \mathbb{E} \bigl| \Gamma_j^{(n)} \bigr|^{2+\alpha} \xrightarrow[n \to \infty]{} 0 .$$
Then $\Theta_n^{-1} \sum_{j=1}^{n} \Gamma_j^{(n)}$ converges in distribution to $\mathcal{N}(0, 1)$.

Sum of martingale differences
We have
$$I_n - \mathbb{E} I_n = \sum_{j=1}^{n} (\mathbb{E}_j - \mathbb{E}_{j-1}) \bigl( -\log(1 + \xi_j) \bigr) \triangleq \sum_{j=1}^{n} \Gamma_j ,$$
where
$$\xi_j = \frac{\eta_j^* Q_j \eta_j - \left( \frac{1}{n} \operatorname{Tr} Q_j + a_j^* Q_j a_j \right)}{1 + \frac{1}{n} \operatorname{Tr} Q_j + a_j^* Q_j a_j} ,$$
$\eta_j$ and $a_j$ are the $j$th columns of the matrices $\Sigma_n$ and $A_n$ respectively, $Q_j$ is the resolvent of the matrix $\Sigma_j \Sigma_j^*$, and $\mathbb{E}_j$ stands for the conditional expectation with respect to the $\sigma$-algebra $\mathcal{F}_j^{(n)} = \sigma(x_1, \dots, x_j)$.

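Sum of the conditional variances
Using some properties of the function $\log$,
$$\sum_{j=1}^{n} \mathbb{E}_{j-1} \bigl( (\mathbb{E}_j - \mathbb{E}_{j-1}) \log(1 + \xi_j) \bigr)^2 - \sum_{j=1}^{n} \mathbb{E}_{j-1} (\mathbb{E}_j \xi_j)^2 \xrightarrow[p, n \to \infty]{\mathcal{P}} 0 ,$$
where (recall)
$$\xi_j = \frac{\eta_j^* Q_j \eta_j - \left( \frac{1}{n} \operatorname{Tr} Q_j + a_j^* Q_j a_j \right)}{1 + \frac{1}{n} \operatorname{Tr} Q_j + a_j^* Q_j a_j} .$$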

Study of the sum of conditional variances
Standard calculations reduce the problem to the study of the asymptotic behavior of the quantities
$$\frac{1}{n} \operatorname{Tr} (\mathbb{E}_j Q_n)^2 \quad \text{and} \quad a_j^* (\mathbb{E}_j Q_n)^2 a_j ,$$
where $Q_n$ is the resolvent of the matrix $\Sigma_n \Sigma_n^*$.

Outline of the proof
A good understanding of the asymptotic behavior of these terms requires a specific study of bilinear forms of the type $u_n^* Q(\rho) v_n$, where at least one of $u_n$, $v_n$ is a given column of the deterministic mean matrix $A_n$. If $u_n$ and $v_n$ are deterministic, Hachem-Loubaton-Najim-Vallet (preprint ’10) show that
$$u_n^* Q(\rho) v_n \approx u_n^* T(\rho) v_n .$$

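Asymptotic behavior of the bias
Theorem (Hachem-Kharouf-Najim-Silverstein ’10). We have
$$p\,\bigl( \mathbb{E} I_n(\rho) - V_n(\rho) \bigr) - B_n(\rho) \xrightarrow[p, n \to \infty]{} 0 ,$$
where $B_n(\rho) = \kappa\, \mathrm{Cte}(\rho, \delta, \tilde\delta)$, with $\mathrm{Cte}(\rho, \delta, \tilde\delta)$ a constant depending on $\rho$, $\delta$ and $\tilde\delta$, and $\kappa = \mathbb{E}|X_{11}|^4 - 2 - |\vartheta|^2$.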

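Outline of the proof of the bias term
The bias term is given by
$$
\begin{aligned}
\chi_n(\rho) &= p\,\bigl( \mathbb{E} I_n(\rho) - V_n(\rho) \bigr) \\
&= - \int_\rho^\infty \frac{d}{d\omega} \left[ \mathbb{E} \log\det\left( \Sigma_n \Sigma_n^* + \omega I_p \right) - p \int_{\mathbb{R}^+} \log(\lambda + \omega)\, \pi_n(d\lambda) \right] d\omega \\
&= - \int_\rho^\infty \operatorname{Tr} \bigl( \mathbb{E} Q_n(\omega) - T_n(\omega) \bigr)\, d\omega .
\end{aligned}
$$
It then remains to study the asymptotic behavior of $\operatorname{Tr} ( \mathbb{E} Q_n(\omega) - T_n(\omega) )$. We prove that
$$\operatorname{Tr} \bigl( \mathbb{E} Q_n(\omega) - T_n(\omega) \bigr) - \kappa\, \mathrm{Cte}(\rho, \delta, \tilde\delta) \xrightarrow[n \to \infty]{} 0 .$$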

Case of a non-centered separable random matrix model
