A CLT for Wishart Tensors
Dan Mikulincer, Weizmann Institute of Science (PowerPoint presentation transcript)


  1. A CLT for Wishart Tensors. Dan Mikulincer, Weizmann Institute of Science.

  2. Wishart Tensors. Let $\{X_i\}_{i=1}^d$ be i.i.d. copies of an isotropic random vector $X \sim \mu$ in $\mathbb{R}^n$. Denote by $W^p_{n,d}(\mu)$ the law of $$\frac{1}{\sqrt{d}} \sum_{i=1}^{d} \left( X_i^{\otimes p} - \mathbb{E}\left[ X_i^{\otimes p} \right] \right).$$ We are interested in the behavior as $d \to \infty$. Specifically, when is it true that $W^p_{n,d}(\mu)$ is approximately Gaussian?

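For concreteness, the rescaled sum defining $W^p_{n,d}(\mu)$ can be simulated directly. The sketch below is my own illustration (not the speaker's code), taking $\mu$ to be the standard Gaussian and handling only $p = 2$, where $\mathbb{E}[X^{\otimes 2}] = \mathrm{Id}$, and odd $p$, where the mean tensor vanishes.

```python
import numpy as np

def wishart_tensor_sample(n, d, p, rng):
    """One draw from W^p_{n,d}(mu) with mu = N(0, Id):
    (1/sqrt(d)) * sum_i (X_i^{(x)p} - E[X^{(x)p}]),
    flattened to a vector of length n**p.
    Only p = 2 (mean = identity) and odd p (mean = 0) are handled."""
    X = rng.standard_normal((d, n))  # rows are i.i.d. isotropic vectors
    total = np.zeros(n ** p)
    for x in X:
        out = x
        for _ in range(p - 1):
            out = np.multiply.outer(out, x)  # builds x^{(x)p} entry by entry
        total += out.ravel()
    if p == 2:
        mean = np.eye(n).ravel()   # E[X X^T] = Id for an isotropic vector
    elif p % 2 == 1:
        mean = np.zeros(n ** p)    # odd Gaussian moments vanish
    else:
        raise NotImplementedError("even p > 2 needs the Gaussian moment tensor")
    return (total - d * mean) / np.sqrt(d)

rng = np.random.default_rng(0)
W = wishart_tensor_sample(n=3, d=2000, p=2, rng=rng)
```

For $p = 2$ the sample, reshaped to an $n \times n$ matrix, is symmetric; this is the redundancy that motivates projecting onto principal tensors.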

  4. Technicalities. $W^p_{n,d}(\mu)$ is a measure on the tensor space $(\mathbb{R}^n)^{\otimes p}$, which we identify with $\mathbb{R}^{n^p}$ through the basis $\{ e_{i_1} \otimes \cdots \otimes e_{i_p} \mid 1 \le i_1, \ldots, i_p \le n \}$. For simplicity we will focus on the subspace of 'principal' tensors, with basis $\{ e_{i_1} \otimes \cdots \otimes e_{i_p} \mid 1 \le i_1 < \cdots < i_p \le n \}$. The projection of $W^p_{n,d}(\mu)$ onto this subspace will be denoted by $\widehat{W}^p_{n,d}(\mu)$.


  7. Wishart Matrices. When $p = 2$ and $X \sim \mu$ is isotropic, $W^2_{n,d}(\mu)$ can be realized as the law of $$\frac{\mathbb{X}\mathbb{X}^T - d \cdot \mathrm{Id}}{\sqrt{d}}.$$ Here, $\mathbb{X}$ is an $n \times d$ matrix whose columns are i.i.d. copies of $X$. In this case, $\widehat{W}^2_{n,d}(\mu)$ is the law of the upper-triangular part.
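The matrix realization is easy to check numerically. A minimal sketch (my illustration, assuming $\mu = \mathcal{N}(0, \mathrm{Id})$):

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 4, 10_000

X = rng.standard_normal((n, d))              # n x d, columns i.i.d. isotropic
W2 = (X @ X.T - d * np.eye(n)) / np.sqrt(d)  # one sample from W^2_{n,d}

# Projection onto principal tensors for p = 2: keep the entries with i < j,
# i.e. the strictly upper-triangular part.
W2_hat = W2[np.triu_indices(n, k=1)]
```

Each diagonal entry is a centered, normalized sum of $d$ squares and each off-diagonal entry a centered sum of $d$ products, so for fixed $n$ every entry is asymptotically Gaussian as $d \to \infty$.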

  8. Some Observations. Let us restrict our attention to the case $p = 2$.
  - For fixed $n$, by the central limit theorem, $W^2_{n,d}(\mu) \to \mathcal{N}(0, \Sigma)$ as $d \to \infty$.
  - If $n = d$, then the spectral measure of $\mathbb{X}\mathbb{X}^T$ converges to the Marchenko-Pastur distribution. In particular, $W^2_{n,d}(\mu)$ is not Gaussian.
  Question: How should $n$ depend on $d$ so that $W^p_{n,d}(\mu)$ is approximately Gaussian?

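The second observation can be illustrated numerically: when $n = d$, the eigenvalues of $\mathbb{X}\mathbb{X}^T / d$ spread over approximately the Marchenko-Pastur support $[0, 4]$ rather than concentrating near $1$, as a small Gaussian perturbation of the identity would. A quick sketch (my illustration, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(2)
n = d = 300                      # aspect ratio n/d = 1

X = rng.standard_normal((n, d))
evals = np.linalg.eigvalsh(X @ X.T / d)

# Marchenko-Pastur with ratio 1 has support [0, (1 + sqrt(1))^2] = [0, 4];
# the empirical spectrum should roughly fill this interval.
```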

  12. Random Geometric Graphs. From now on, let $\gamma$ stand for the standard Gaussian, in different dimensions. In (Bubeck, Ding, Eldan, Rácz '15) and independently in (Jiang, Li '15) it was shown:
  - If $n^3/d \to 0$, then $\mathrm{TV}\left(\widehat{W}^2_{n,d}(\gamma), \gamma\right) \to 0$.
  This is tight, in the sense:
  - If $n^3/d \to \infty$, then $\mathrm{TV}\left(\widehat{W}^2_{n,d}(\gamma), \gamma\right) \to 1$.
  (Rácz, Richey '16) shows that the phase transition is smooth.

  13. Extensions. (Bubeck, Ganguly '15) extended the result to any log-concave product measure. That is, the entries $\mathbb{X}_{i,j}$ are i.i.d. with law $e^{-\varphi(x)}\,dx$ for some convex $\varphi$.
  - The original motivation came from random geometric graphs.
  - (Fang, Koike '20) removed the log-concavity assumption.

  14. Extensions. (Nourdin, Zheng '18) gave the following results, answering questions raised in (Bubeck, Ganguly '15):
  - If the rows of $\mathbb{X}$ are i.i.d. $\mathcal{N}(0, \Sigma)$ for some positive definite $\Sigma$, then $$W_1\left(\widehat{W}^2_{n,d}, \gamma\right) \lesssim \sqrt{\frac{n^3}{d}}.$$ (See also (Eldan, M. '16).)
  - $$W_1\left(\widehat{W}^p_{n,d}(\gamma), \gamma\right) \lesssim \sqrt{\frac{n^{2p-1}}{d}}.$$


  17. Main Result. Today: Theorem. If $\mu$ is a measure on $\mathbb{R}^n$ which is uniformly log-concave and unconditional, then $$\mathrm{dist}\left(\widehat{W}^p_{n,d}(\mu), \gamma\right) \lesssim \sqrt{\frac{n^{2p-1}}{d}}.$$
  - dist stands for some notion of distance, to be introduced soon, but it could be replaced with $W_2$.
  - The assumptions of uniform log-concavity and unconditionality may be relaxed.
  - The result also holds for a large class of product measures.


  21. The Challenge. By considering $\frac{1}{\sqrt{d}} \sum_{i=1}^{d} \left( X_i^{\otimes p} - \mathbb{E}\left[ X_i^{\otimes p} \right] \right)$, one may hope to apply an estimate of the high-dimensional central limit theorem. Optimistically, such estimates give: $$\mathrm{dist}\left(W^p_{n,d}(\mu), \gamma\right) \le \frac{\mathbb{E}\left[ \left\| X^{\otimes p} \right\|^3 \right]}{\sqrt{d}}.$$ Thus, to obtain optimal convergence rates, we need to exploit the low-dimensional structure of $\widehat{W}^p_{n,d}(\mu)$.

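To see why the naive bound is lossy, note that the Frobenius norm of a pure tensor power factorizes: $\|X^{\otimes p}\| = \|X\|^p$, so $\mathbb{E}\|X^{\otimes p}\|^3 = \mathbb{E}\|X\|^{3p}$ is of order $n^{3p/2}$ for an isotropic $X$, far above the $n^{(2p-1)/2}$ in the optimal rate. A small numerical check of the factorization (my sketch, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10
x = rng.standard_normal(n)

T2 = np.multiply.outer(x, x)    # x^{(x)2}, an n x n matrix
T3 = np.multiply.outer(T2, x)   # x^{(x)3}, an n x n x n tensor

# ||x^{(x)p}||_F = ||x||^p, so E||X^{(x)p}||^3 = E||X||^{3p} ~ n^{3p/2}
norm2 = np.linalg.norm(T2)
norm3 = np.linalg.norm(T3)
```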

  24. Stein's Method. Basic observation: if $G \sim \gamma$ on $\mathbb{R}^n$, then for any smooth test function $f : \mathbb{R}^n \to \mathbb{R}^n$, $$\mathbb{E}\left[ \langle G, f(G) \rangle \right] = \mathbb{E}\left[ \mathrm{div}\, f(G) \right].$$ Moreover, the Gaussian is the only measure which satisfies this relation. Stein's idea: $$\mathbb{E}\left[ \langle X, f(X) \rangle \right] \simeq \mathbb{E}\left[ \mathrm{div}\, f(X) \right] \implies X \simeq G.$$

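The Gaussian integration-by-parts identity is easy to verify by simulation; in one dimension it reads $\mathbb{E}[G f(G)] = \mathbb{E}[f'(G)]$. A quick Monte Carlo sanity check with $f = \sin$ (my sketch, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(4)
G = rng.standard_normal(1_000_000)

# Stein identity in 1D with f = sin: E[G sin(G)] = E[cos(G)];
# both sides equal exp(-1/2) for a standard Gaussian.
lhs = np.mean(G * np.sin(G))
rhs = np.mean(np.cos(G))
```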
