High-Dimensional Covariance Decomposition into Sparse Markov and Independence Domains


  1. High-Dimensional Covariance Decomposition into Sparse Markov and Independence Domains. Majid Janzamin and Anima Anandkumar, U.C. Irvine.

  2. High-Dimensional Covariance Estimation
- n i.i.d. samples, p variables X := [X₁, ..., X_p]ᵀ.
- High-dimensional regime: both n, p → ∞ with n ≪ p.
- Covariance estimation: Σ∗ := E[XXᵀ].
- Challenge: the empirical (sample) covariance Σ̂ⁿ := (1/n) ∑_{k=1}^n x⁽ᵏ⁾ x⁽ᵏ⁾ᵀ is ill-posed when n ≪ p.
- Solution: impose sparsity for tractable high-dimensional estimation.
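
A minimal numpy sketch (my illustration, not from the slides) of why Σ̂ⁿ is ill-posed when n ≪ p: the sample covariance has rank at most n, so it is singular and cannot be inverted.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 20, 100                      # n << p: high-dimensional regime
X = rng.standard_normal((n, p))     # n i.i.d. samples of p variables

# Empirical covariance: (1/n) * sum_k x^(k) x^(k)^T
Sigma_hat = (X.T @ X) / n           # p x p matrix

# Rank is at most n, so Sigma_hat is singular whenever n < p.
print(np.linalg.matrix_rank(Sigma_hat))     # 20, far below p = 100
print(np.linalg.eigvalsh(Sigma_hat).min())  # ~0 up to floating point: not invertible
```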

  3. Incorporating Sparsity in High Dimensions
- Two sparse regimes: sparse covariance (Σ∗ = Σ∗_R) vs. sparse inverse covariance (Σ∗ = (J∗_M)⁻¹).
- Relationship with statistical properties (Gaussian): sparse covariance = independence model (marginal independence); sparse inverse covariance = Markov model (conditional independence). A numeric illustration follows below.
- Guarantees under sparsity constraints in high dimensions: consistent estimation when n = Ω(log p), hence even with n ≪ p.
- Going beyond sparsity in high dimensions?
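
A small numpy illustration (my sketch, not the authors' code) of the Gaussian correspondence: zeros in the precision matrix J = Σ⁻¹ encode conditional independence but not marginal independence, since the covariance Σ itself can be dense.

```python
import numpy as np

# Markov chain X1 - X2 - X3 - X4: sparse (tridiagonal) precision matrix J.
J = np.array([[ 2., -1.,  0.,  0.],
              [-1.,  2., -1.,  0.],
              [ 0., -1.,  2., -1.],
              [ 0.,  0., -1.,  2.]])
Sigma = np.linalg.inv(J)

# J[0, 2] == 0: X1 and X3 are conditionally independent given the rest;
# yet Sigma[0, 2] != 0: X1 and X3 remain marginally dependent.
print(J[0, 2], np.round(Sigma[0, 2], 3))
```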

  4. Going Beyond Sparse Models
- Motivation: sparsity constraints are too restrictive for a faithful representation; real data need not be sparse in any single domain.
- Solution: sparsity in multiple domains.
- One possibility: sparse Markov + sparse independence models, Σ∗ = (J∗_M)⁻¹ + Σ∗_R. Sparsity in multiple domains captures multiple statistical relationships; a synthetic construction is sketched below.
- Open questions: efficient decomposition and estimation in high dimensions? Unique decomposition? Good sample requirements?
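
A hypothetical numpy construction (mine, not the authors') of a covariance with this structure: a sparse Markov component J_M plus a sparse residual Σ∗_R, where neither Σ∗ nor its inverse is sparse on its own.

```python
import numpy as np

p = 6

# Sparse Markov component: tridiagonal precision matrix (a chain graph).
J_M = 2.0 * np.eye(p)
J_M += np.diag([-0.5] * (p - 1), k=1) + np.diag([-0.5] * (p - 1), k=-1)

# Sparse residual (independence) component: one extra marginal correlation.
Sigma_R = np.zeros((p, p))
Sigma_R[0, 5] = Sigma_R[5, 0] = 0.3

# Combined model: Sigma* = (J_M)^{-1} + Sigma_R.
Sigma_star = np.linalg.inv(J_M) + Sigma_R

# The decomposition is sparse in each domain, and Sigma* stays positive definite.
print(np.linalg.eigvalsh(Sigma_star).min() > 0)  # True: a valid covariance
```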

  5. Summary of Results
- Model: Σ∗ = (J∗_M)⁻¹ + Σ∗_R.
- Contribution 1, a novel method for decomposition: decomposition into Markov and residual domains; unification of sparse covariance and sparse inverse covariance estimation.
- Contribution 2, guarantees for estimation: conditions for unique decomposition (exact statistics); sparsistency and norm guarantees in both the Markov and independence domains (sample analysis); sample requirement n = Ω(log p) for p variables.
- Net result: an efficient method for covariance decomposition and estimation.

  6. Related Works
- Sparse covariance/inverse covariance estimation:
  - Sparse covariance estimation via covariance thresholding (Bickel & Levina; Wagaman & Levina; Cai et al.).
  - Sparse inverse covariance estimation: ℓ₁ penalization (Meinshausen & Bühlmann; Ravikumar et al.); non-convex methods (Anandkumar et al.; Zhang).
- Beyond sparse models, decomposition issues:
  - Sparse + low rank (Chandrasekaran et al.; Candès et al.).
  - Decomposable regularizers (Negahban et al.).
  - Multi-resolution Markov + independence models (Choi et al.): decomposition is in the inverse covariance domain and lacks theoretical guarantees.
- Our contribution: guaranteed decomposition and estimation.

  7. Outline
1 Introduction
2 Algorithm
3 Guarantees
4 Experiments
5 Conclusion

  8. Some Intuitions and Ideas
- Setting: Σ∗ = (J∗_M)⁻¹ + Σ∗_R, with Σ̂ⁿ the sample covariance from n i.i.d. samples.
- Strategy: review ideas for the special cases, sparse covariance and sparse inverse covariance.
- Sparse covariance estimation (independence model), Σ∗ = Σ∗_R:
  - Sample covariance Σ̂ⁿ from n samples of p variables, p ≫ n.
  - Thresholding estimator for the off-diagonal entries (Bickel & Levina), with the threshold chosen as √(log p / n); see the sketch after this slide.
  - Sparsistency (support recovery) and norm guarantees when n = Ω(log p), hence n ≪ p suffices.
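
A minimal sketch of a Bickel-Levina style hard-thresholding estimator (my illustration; the tuning constant c is an assumption, not from the slides): keep off-diagonal sample-covariance entries whose magnitude exceeds c·√(log p / n).

```python
import numpy as np

def threshold_covariance(X, c=1.0):
    """Hard-threshold the off-diagonal entries of the sample covariance.

    X : (n, p) array of n i.i.d. samples (mean assumed 0); c is a tuning constant.
    """
    n, p = X.shape
    sigma_hat = (X.T @ X) / n            # sample covariance
    tau = c * np.sqrt(np.log(p) / n)     # threshold ~ sqrt(log p / n)
    keep = np.abs(sigma_hat) >= tau
    np.fill_diagonal(keep, True)         # diagonal entries are kept untouched
    return np.where(keep, sigma_hat, 0.0)

# Example: independent variables, so the true covariance is the identity.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 200))       # n = 50 << p = 200
print(np.count_nonzero(threshold_covariance(X, c=2.0)))  # ~200: essentially the diagonal
```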

  9. Recap of Inverse Covariance (Markov) Estimation
- Setting: Σ∗ = (J∗_M)⁻¹ + Σ∗_R, with Σ̂ⁿ the sample covariance from n i.i.d. samples.
- ℓ₁-MLE for sparse inverse covariance (Ravikumar et al. '08):
  Ĵ_M := argmin_{J_M ≻ 0} ⟨Σ̂ⁿ, J_M⟩ − log det(J_M) + γ ‖J_M‖₁,off
- Max-entropy formulation (Lagrangian dual):
  Σ̂_M := argmax_{Σ_M ≻ 0, Σ_R} log det(Σ_M) − λ ‖Σ_R‖₁,off
  s.t. ‖Σ̂ⁿ − Σ_M‖_{∞,off} ≤ γ, (Σ_M)_d = (Σ̂ⁿ)_d, (Σ_R)_d = 0
- Consistent estimation under certain conditions when n = Ω(log p).
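
For the ℓ₁-MLE step, a hedged sketch using scikit-learn's GraphicalLasso, one standard off-the-shelf solver for this penalized estimator (the slides do not prescribe a particular implementation; alpha plays the role of the penalty γ above).

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

# Sample from a chain-structured Gaussian: sparse precision, dense covariance.
rng = np.random.default_rng(0)
p = 10
J_true = 2.0 * np.eye(p)
J_true += np.diag([-0.8] * (p - 1), 1) + np.diag([-0.8] * (p - 1), -1)
X = rng.multivariate_normal(np.zeros(p), np.linalg.inv(J_true), size=500)

# Fit the l1-penalized MLE; precision_ is the estimated inverse covariance.
model = GraphicalLasso(alpha=0.05).fit(X)
J_hat = model.precision_

# The estimate should be (nearly) tridiagonal, matching the chain graph.
print(np.round(J_hat, 2))
```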
