High-Dimensional Covariance Decomposition into Sparse Markov and Independence Domains
Majid Janzamin and Anima Anandkumar, U.C. Irvine
High-Dimensional Covariance Estimation

n i.i.d. samples, p variables X := [X_1, . . . , X_p]^T.
Covariance estimation: Σ* := E[XX^T].
High-dimensional regime: both n, p → ∞ and n ≪ p.
Challenge: the empirical (sample) covariance

    Σ̂^n := (1/n) Σ_{k=1}^{n} x^(k) x^(k)T

is ill-posed when n ≪ p.
Solution: impose sparsity for tractable high-dimensional estimation.
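As a quick numerical sanity check (my minimal numpy sketch, not from the slides; sizes are illustrative), the sample covariance has rank at most n, so it is singular whenever n < p:

```python
import numpy as np

# Minimal illustration of ill-posedness: rank(Sigma^n) <= n < p.
rng = np.random.default_rng(0)
n, p = 50, 200
X = rng.standard_normal((n, p))          # n i.i.d. samples of p variables
Sigma_n = (X.T @ X) / n                  # Sigma^n = (1/n) sum_k x^(k) x^(k)^T
print(np.linalg.matrix_rank(Sigma_n))    # 50: rank-deficient, not invertible
```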
Incorporating Sparsity in High Dimensions

Sparse Covariance: Σ* = Σ*_R, i.e., the covariance matrix itself is sparse.
Sparse Inverse Covariance: Σ* = (J*_M)^{-1}, i.e., the inverse covariance J*_M is sparse.

Relationship with Statistical Properties (Gaussian)
Sparse Covariance (Independence Model): marginal independence.
Sparse Inverse Covariance (Markov Model): conditional independence.
Local Markov Property: X_i ⊥ X_{V ∖ (nbd(i) ∪ {i})} | X_{nbd(i)}.
For Gaussians: J_ij = 0 ⇔ (i, j) ∉ E.

Guarantees under Sparsity Constraints in High Dimensions
Consistent estimation when n = Ω(log p), so n ≪ p is allowed.
Consistent: sparsistent and satisfying reasonable norm guarantees.

Going beyond sparsity in high dimensions?
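To make the Gaussian Markov property concrete, here is a small illustrative check (my example, not from the slides): a zero entry J_ij = 0 gives zero partial correlation between X_i and X_j given the rest, even though the marginal correlation is nonzero.

```python
import numpy as np

# Chain graph X1 - X2 - X3: no edge between X1 and X3, so J[0, 2] = 0.
J = np.array([[1.0, 0.4, 0.0],
              [0.4, 1.0, 0.4],
              [0.0, 0.4, 1.0]])
Sigma = np.linalg.inv(J)
print(Sigma[0, 2])                            # nonzero: marginally dependent
print(-J[0, 2] / np.sqrt(J[0, 0] * J[2, 2]))  # 0: conditionally independent given X2
```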
Going Beyond Sparse Models

Motivation
Sparsity constraints are too restrictive for a faithful representation: data may not be sparse in any single domain.
Solution: sparsity in multiple domains.
Challenge: it is hard to impose sparsity in different domains simultaneously.

One Possibility (This Work): a sparse Markov model plus a sparse residual perturbation:

    Σ* + Σ*_R = (J*_M)^{-1}.

Efficient decomposition and estimation in high dimensions? Unique decomposition? Good sample requirements?
Summary of Results

Model: Σ* + Σ*_R = (J*_M)^{-1}.

Contribution 1: Novel Model for Decomposition
Decomposition into Markov and residual domains.
Statistically meaningful model.
Unification of sparse covariance and sparse inverse covariance estimation.

Contribution 2: Methods and Guarantees
Conditions for unique decomposition (exact statistics).
Sparsistency and norm guarantees in both the Markov and independence domains (sample analysis).
Sample requirement: n = Ω(log p) samples for p variables.
Efficient method for covariance decomposition and estimation in high dimensions.
Related Works

Sparse Covariance / Inverse Covariance Estimation
Sparse covariance estimation: covariance thresholding.
◮ (Bickel & Levina), (Wagaman & Levina), (Cai et al.)
Sparse inverse covariance estimation:
◮ ℓ1 penalization (Meinshausen & Bühlmann), (Ravikumar et al.)
◮ Non-convex methods (Anandkumar et al.), (Zhang)

Beyond Sparse Models: Decomposition Issues
Sparse + low rank (Chandrasekaran et al.), (Candès et al.)
Decomposable regularizers (Negahban et al.)
Multi-resolution Markov + independence models (Choi et al.): decomposition in the inverse covariance domain; lacks theoretical guarantees.

Our contribution: guaranteed decomposition and estimation.
Outline
1 Introduction
2 Algorithm
3 Guarantees
4 Experiments
5 Proof Techniques
6 Conclusion
Some Intuitions and Ideas

Review ideas for the special cases: sparse covariance / sparse inverse covariance.

Sparse Covariance Estimation (Independence Model)
Model: Σ* = Σ*_I.
Σ̂^n: sample covariance using n samples of p variables, p ≫ n.
Hard-threshold the off-diagonal entries of Σ̂^n (Bickel & Levina), with threshold chosen as √(log p / n).
Sparsistency (support recovery) and norm guarantees when n = Ω(log p), so n ≪ p is allowed.
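A minimal numpy sketch of this hard-thresholding estimator (assuming centered data and a unit constant in the threshold; Bickel & Levina analyze a constant times √(log p / n)):

```python
import numpy as np

def hard_threshold_covariance(X):
    """Hard-threshold the off-diagonal entries of the sample covariance."""
    n, p = X.shape                        # assumes rows are centered samples
    S = (X.T @ X) / n                     # sample covariance
    t = np.sqrt(np.log(p) / n)            # threshold ~ sqrt(log p / n)
    off = ~np.eye(p, dtype=bool)
    S[off & (np.abs(S) < t)] = 0.0        # keep diagonal, zero small off-diagonals
    return S
```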
Recap of Inverse Covariance (Markov) Estimation

Model: Σ* = (J*_M)^{-1}.
Σ̂^n: sample covariance using n i.i.d. samples.

ℓ1-MLE for Sparse Inverse Covariance (Ravikumar et al. '08)

    Ĵ_M := argmin_{J_M ≻ 0} ⟨Σ̂^n, J_M⟩ − log det J_M + γ‖J_M‖_{1,off},

where ‖J_M‖_{1,off} := Σ_{i≠j} |(J_M)_{ij}|.

Max-entropy Formulation (Lagrangian Dual)

    Σ̂_M := argmax_{Σ_M ≻ 0} log det Σ_M
    s.t. ‖Σ̂^n − Σ_M‖_{∞,off} ≤ γ,  (Σ_M)_d = (Σ̂^n)_d.

Consistent estimation under certain conditions, with n = Ω(log p).
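For concreteness, the ℓ1-MLE above can be written as a convex program, e.g. in cvxpy (a sketch I am adding for illustration, not the authors' implementation; the penalty γ is a placeholder):

```python
import cvxpy as cp
import numpy as np

def l1_mle(Sigma_n, gamma):
    """l1-penalized Gaussian MLE with an off-diagonal penalty."""
    p = Sigma_n.shape[0]
    off = 1.0 - np.eye(p)                              # mask for off-diagonal entries
    J = cp.Variable((p, p), PSD=True)
    obj = (cp.trace(Sigma_n @ J) - cp.log_det(J)
           + gamma * cp.sum(cp.abs(cp.multiply(off, J))))
    cp.Problem(cp.Minimize(obj)).solve()
    return J.value
```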
Extension to Markov + Independence Models?

Model: Σ* + Σ*_R = (J*_M)^{-1}.

Sparse Covariance Estimation: hard-thresholding the off-diagonal entries of Σ̂^n.
Sparse Inverse Covariance Estimation: add an ℓ1 penalty to the maximum-likelihood program (over the inverse covariance matrix).

Is it possible to unify the above methods and guarantees?

Challenges and Insights
The penalties in the above methods live in different domains.
Insight: consider the dual program of the MLE; for the Markov model, the dual program is in the covariance domain.
Our Algorithm: Covariance Decomposition

Model: Σ* + Σ*_R = (J*_M)^{-1}. Approach: extend the ℓ1-penalized MLE.

ℓ1-MLE for Sparse Inverse Covariance (Ravikumar et al.)

    Ĵ_M := argmin_{J_M ≻ 0} ⟨Σ̂^n, J_M⟩ − log det J_M + γ‖J_M‖_{1,off}.

Its Lagrangian dual is a max-entropy program in the covariance domain:

    Σ̂_M := argmax_{Σ_M ≻ 0} log det Σ_M
    s.t. ‖Σ̂^n − Σ_M‖_{∞,off} ≤ γ,  (Σ_M)_d = (Σ̂^n)_d.

Max-entropy Formulation + ℓ1-penalized Residuals (This Work)

    (Σ̂_M, Σ̂_R) := argmax_{Σ_M ≻ 0, Σ_R} log det Σ_M − λ‖Σ_R‖_{1,off}
    s.t. ‖Σ̂^n − Σ_M + Σ_R‖_{∞,off} ≤ γ,  (Σ_M)_d = (Σ̂^n)_d,  (Σ_R)_d = 0.

ℓ1-ℓ∞-penalized MLE (This Work): the corresponding primal

    Ĵ_M := argmin_{J_M ≻ 0} ⟨Σ̂^n, J_M⟩ − log det J_M + γ‖J_M‖_{1,off}
    s.t. ‖J_M‖_{∞,off} ≤ λ.
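Here is an illustrative cvxpy sketch of the primal ℓ1-ℓ∞ program (my rendering, not the authors' code). The Markov component is read off as Σ̂_M = Ĵ_M^{-1}, and, following the dual constraint, the residual is approximated by Σ̂_R ≈ Σ̂_M − Σ̂^n on the off-diagonals:

```python
import cvxpy as cp
import numpy as np

def l1_linf_mle(Sigma_n, gamma, lam):
    """l1-l_inf penalized MLE; returns (J_M, Sigma_M, Sigma_R) estimates."""
    p = Sigma_n.shape[0]
    off = 1.0 - np.eye(p)
    J = cp.Variable((p, p), PSD=True)
    obj = (cp.trace(Sigma_n @ J) - cp.log_det(J)
           + gamma * cp.sum(cp.abs(cp.multiply(off, J))))
    cons = [cp.max(cp.abs(cp.multiply(off, J))) <= lam]   # ||J_M||_{inf,off} <= lam
    cp.Problem(cp.Minimize(obj), cons).solve()
    J_M = J.value
    Sigma_M = np.linalg.inv(J_M)                # Markov covariance component
    Sigma_R = off * (Sigma_M - Sigma_n)         # approximate residual (off-diagonals)
    return J_M, Sigma_M, Sigma_R
```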
Observations regarding the Proposed Method

ℓ1-ℓ∞-penalized MLE (Primal)

    Ĵ_M := argmin_{J_M ≻ 0} ⟨Σ̂^n, J_M⟩ − log det J_M + γ‖J_M‖_{1,off},  s.t. ‖J_M‖_{∞,off} ≤ λ.

Max-entropy Markov + ℓ1-penalized Residuals (Dual)

    (Σ̂_M, Σ̂_R) := argmax_{Σ_M ≻ 0, Σ_R} log det Σ_M − λ‖Σ_R‖_{1,off}
    s.t. ‖Σ̂^n − Σ_M + Σ_R‖_{∞,off} ≤ γ,  (Σ_M)_d = (Σ̂^n)_d,  (Σ_R)_d = 0.

Case λ → 0 (Sparse Covariance Estimation): with λ = √(log p / n), the method reduces to an approximate shrinkage estimator.
Case λ → ∞ (Sparse Inverse Covariance Estimation): the residual matrix Σ̂_R = 0, recovering the ℓ1-penalized MLE of Ravikumar et al.
Unification of the sparse covariance and sparse inverse covariance models.
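The two limiting regimes can be exercised directly with the l1_linf_mle sketch above (a hypothetical helper from the previous sketch; the γ settings here are placeholders, not the tuned values from the analysis):

```python
import numpy as np

# X: a centered n x p data matrix, e.g. from the earlier sketches.
n, p = X.shape
Sigma_n = (X.T @ X) / n
lam = np.sqrt(np.log(p) / n)             # lambda = Theta(sqrt(log p / n)): shrinkage-like
J_cov, _, _ = l1_linf_mle(Sigma_n, gamma=0.5 * lam, lam=lam)
J_mkv, _, S_R = l1_linf_mle(Sigma_n, gamma=lam, lam=1e6)   # lambda -> infinity:
                                                           # S_R ~ 0, plain l1-MLE recovered
```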
Analysis under Exact Statistics

The same program as with sample statistics, but using the exact Σ* and γ = 0 (no ℓ1 penalization):

    Ĵ_M := argmin_{J_M ≻ 0} ⟨Σ*, J_M⟩ − log det J_M,  s.t. ‖J_M‖_{∞,off} ≤ λ.

The KKT conditions yield identifiability conditions.
The main identifiability condition: Supp(Σ*_R) ⊆ Supp(J*_M).

The node pairs are partitioned as follows:
S_M := Supp(J*_M),  S_R := Supp(Σ*_R),  S := S_M ∖ S_R.
[Figure: partition of the node pairs into S and S_R (together forming S_M) and the complement S_M^c.]
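The support sets and the partition can be computed mechanically from the true matrices (an illustrative helper I am adding; `tol` is an assumed numerical-zero tolerance):

```python
import numpy as np

def support_partition(J_M_star, Sigma_R_star, tol=1e-8):
    """Partition node pairs into S_M = Supp(J*_M), S_R = Supp(Sigma*_R), S = S_M \\ S_R."""
    off = ~np.eye(J_M_star.shape[0], dtype=bool)
    S_M = off & (np.abs(J_M_star) > tol)
    S_R = off & (np.abs(Sigma_R_star) > tol)
    assert not (S_R & ~S_M).any(), "identifiability requires Supp(Sigma*_R) within Supp(J*_M)"
    return S_M, S_R, S_M & ~S_R
```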
Outline
1 Introduction
2 Algorithm
3 Guarantees
4 Experiments
5 Proof Techniques
6 Conclusion
Guarantees for High-Dimensional Estimation

Model: Σ* + Σ*_R = (J*_M)^{-1}.

Conditions for Recovery
Maximum degree ∆ in the Markov graph (corresponding to J*_M).
Number of samples n and number of nodes p satisfying n = Ω(∆² log p).
Mutual-incoherence type conditions.

Theorem
The proposed method outputs estimates (Ĵ_M, Σ̂_R) such that, with high probability,
(Ĵ_M, Σ̂_R) are sparsistent and sign consistent;
they satisfy the norm guarantees ‖Ĵ_M − J*_M‖_∞, ‖Σ̂_R − Σ*_R‖_∞ = O(√(log p / n)).

Guarantees sparsistency and efficient estimation in both domains.
Observations

Corollary 1 (Sparse Covariance Estimation)
With λ = Θ(√(log p / n)), our method reduces to a shrinkage estimator (comparable to the hard-thresholding estimator of Bickel & Levina) and is sparsistent for covariance estimation.

Corollary 2 (Sparse Inverse Covariance Estimation)
With λ → ∞, our method reduces to the ℓ1-penalized MLE (Ravikumar et al.) and is sparsistent for inverse covariance estimation.

Conditions for Recovery
Mutual incoherence-type conditions.
Sample complexity n = Ω(∆² log p), comparable to inverse covariance estimation (Ravikumar et al.).
Outline
1 Introduction
2 Algorithm
3 Guarantees
4 Experiments
5 Proof Techniques
6 Conclusion
Synthetic Data

Model: Σ* + Σ*_R = (J*_M)^{-1},  J* = (Σ*)^{-1}.

Setup
8 × 8 2-D grid for the Markov model.
Mixed Markov model (both positive and negative correlations).

[Figure: ‖Ĵ − J*‖_∞ versus number of samples n (1000 to 6000), comparing the proposed ℓ1 + ℓ∞ method against the plain ℓ1 method.]
[Figure: average mean error of loopy belief propagation (LBP) over iterations, applied to the J* model versus the learned Ĵ_M model.]

The learned model is amenable to efficient inference: an advantage over existing techniques.
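A sketch of how such a synthetic model might be generated (my construction under stated assumptions: the coupling strength, diagonal loading, and sample size are illustrative, not the paper's exact values):

```python
import numpy as np

def grid_precision(m=8, coupling=0.2, seed=0):
    """J*_M for an m x m 2-D grid with randomly signed ('mixed') edge weights."""
    rng = np.random.default_rng(seed)
    p = m * m
    J = np.zeros((p, p))
    for i in range(m):
        for j in range(m):
            u = i * m + j
            for di, dj in ((0, 1), (1, 0)):            # right and down neighbors
                if i + di < m and j + dj < m:
                    v = (i + di) * m + (j + dj)
                    J[u, v] = J[v, u] = coupling * rng.choice((-1.0, 1.0))
    np.fill_diagonal(J, np.abs(J).sum(axis=1) + 0.5)   # diagonal dominance -> J > 0
    return J

J_star = grid_precision()
Sigma_star = np.linalg.inv(J_star)
X = np.random.default_rng(1).multivariate_normal(np.zeros(64), Sigma_star, size=2000)
```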
Experiments on Foreign Exchange Rate Data

Setup
Monthly foreign exchange rates against the US dollar. Apply the proposed method.

[Figure: learned graph over currencies (Malaysia, South Korea, South Africa, India, Japan, Denmark, Sweden, Taiwan, Norway, Thailand, Sri Lanka, China). Solid lines: Markov graph; dotted lines: independence graph.]
Experiments on Stock Market Data

Setup
Monthly stock returns of companies on the S&P index; companies in the divisions E.Trans, Comm, Elec&Gas, and G.Retail Trade. Apply the proposed method.

[Figure: learned graph over tickers (CBS, NSC, SNS, BNI, HD, CMCSA, TGT, MCD, WMT, CVS, FDX, ETR, EXC, T, VZ). Solid lines: Markov graph; dotted lines: independence graph.]
Outline
1 Introduction
2 Algorithm
3 Guarantees
4 Experiments
5 Proof Techniques
6 Conclusion
Analysis under Sample Statistics

    Ĵ_M := argmin_{J_M ≻ 0} ⟨Σ̂^n, J_M⟩ − log det J_M + γ‖J_M‖_{1,off},  s.t. ‖J_M‖_{∞,off} ≤ λ.

Challenges
1) Sparsistency guarantee: it is hard to show Supp(Ĵ_M) ⊆ Supp(J*_M) directly.
2) Decoupling the errors in the two domains of Σ* + Σ*_R = (J*_M)^{-1}.

We therefore propose a modified program that is easier to analyze.
Modified Program (Restricted and Relaxed)

    J̃_M := argmin_{J_M ≻ 0} ⟨Σ̂^n, J_M⟩ − log det J_M + γ‖J_M‖_{1,off}
    s.t. (J_M)_{S_M^c} = 0,  (J_M)_{S_R} = λ sign((J*_M)_{S_R}).

[Figure: partition of the node pairs into S and S_R (together forming S_M) and the complement S_M^c.]
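In cvxpy, the restriction and relaxation amount to equality constraints on the fixed entries (a sketch I am adding; note the oracle sets S_M^c and S_R and the signs of J*_M are available only inside the analysis, not to a practical estimator):

```python
import cvxpy as cp
import numpy as np

def modified_program(Sigma_n, J_star, S_R, gamma, lam, tol=1e-8):
    """Restricted/relaxed program from the analysis, given oracle supports."""
    p = Sigma_n.shape[0]
    off = ~np.eye(p, dtype=bool)
    S_M_c = off & (np.abs(J_star) <= tol)              # complement of the Markov support
    J = cp.Variable((p, p), PSD=True)
    obj = (cp.trace(Sigma_n @ J) - cp.log_det(J)
           + gamma * cp.sum(cp.abs(cp.multiply(off.astype(float), J))))
    cons = [J[i, j] == 0 for i, j in zip(*np.where(S_M_c))]
    cons += [J[i, j] == lam * np.sign(J_star[i, j]) for i, j in zip(*np.where(S_R))]
    cp.Problem(cp.Minimize(obj), cons).solve()
    return J.value
```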
Primal-Dual Witness Method

Analyze the modified program:

    J̃_M := argmin_{J_M ≻ 0} ⟨Σ̂^n, J_M⟩ − log det J_M + γ‖J_M‖_{1,off}
    s.t. (J_M)_{S_M^c} = 0,  (J_M)_{S_R} = λ sign((J*_M)_{S_R}).

Sparsistency Guarantee
Supp(J̃_M) ⊆ Supp(J*_M),  Supp(Σ̃_R) ⊆ Supp(Σ*_R).

Error Decoupling
∆_J := J̃_M − J*_M,  ∆_R := Σ̃_R − Σ*_R.
[Figure: error pattern over the partition of node pairs — ∆_J = λδ on S_R, ∆_R = 0 on S, and ∆_J = 0 on S_M^c.]

Sufficient conditions (mutual incoherence) for equivalence between the modified and original programs:

    (J̃_M, Σ̃_R) = (Ĵ_M, Σ̂_R).