High-Dimensional Covariance Decomposition into Sparse Markov and Independence Domains
Majid Janzamin and Anima Anandkumar, U.C. Irvine
High-Dimensional Covariance Estimation
n i.i.d. samples, p variables X := [X_1, ..., X_p]^T.
High-dimensional regime: both n, p → ∞, with n ≪ p.
Covariance estimation: Σ∗ := E[XX^T].
Challenge: the empirical (sample) covariance is ill-posed when n ≪ p:
$$\widehat{\Sigma}^n := \frac{1}{n} \sum_{k=1}^{n} x^{(k)} \big(x^{(k)}\big)^T.$$
Solution: impose sparsity for tractable high-dimensional estimation.
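For concreteness, a minimal numpy sketch (ours, not from the slides) of the empirical covariance; its rank is at most n, so it is singular whenever n < p:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 200                        # high-dimensional regime: n << p
X = rng.standard_normal((n, p))       # n i.i.d. samples of p variables (rows)

# Empirical covariance: (1/n) * sum_k x^(k) x^(k)^T
Sigma_n = (X.T @ X) / n

# Rank is at most n, so Sigma_n is singular (hence ill-posed) when n < p.
print(np.linalg.matrix_rank(Sigma_n))   # prints 50, not 200
```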
Incorporating Sparsity in High Dimensions

Sparse Covariance (Σ∗ = Σ∗_R): Σ∗ itself is sparse.
Sparse Inverse Covariance (Σ∗ = (J∗_M)^{-1}): J∗_M is sparse.

Relationship with Statistical Properties (Gaussian)
Sparse Covariance = Independence Model: marginal independence.
Sparse Inverse Covariance = Markov Model: conditional independence.

Guarantees under Sparsity Constraints in High Dimensions
Consistent estimation when n = Ω(log p), i.e., even with n ≪ p.

Going beyond sparsity in high dimensions?
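A small numpy illustration of this Gaussian dichotomy (our example; the chain precision matrix is an assumption): a zero in the precision matrix J encodes conditional independence, while a zero in Σ would encode marginal independence:

```python
import numpy as np

# Gaussian chain X1 - X2 - X3: the precision matrix J is sparse,
# but the covariance Sigma = J^{-1} is dense.
J = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])
Sigma = np.linalg.inv(J)

# J[0, 2] == 0: X1 and X3 are conditionally independent given X2 (Markov model).
# Sigma[0, 2] != 0: X1 and X3 are still marginally dependent, so the
# independence (sparse covariance) model does not hold here.
print(J[0, 2], Sigma[0, 2])
```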
Going Beyond Sparse Models

Motivation
Sparsity constraints are too restrictive for a faithful representation: data need not be sparse in any single domain. Solution: sparsity in multiple domains.

One Possibility: Sparse Markov + Sparse Independence Models
Sparsity in multiple domains captures multiple statistical relationships:
$$\Sigma^* = (J^*_M)^{-1} + \Sigma^*_R.$$
Efficient decomposition and estimation in high dimensions? Unique decomposition? Good sample requirements?
Summary of Results

$$\Sigma^* = (J^*_M)^{-1} + \Sigma^*_R.$$

Contribution 1: Novel Method for Decomposition
Decomposition into Markov and residual domains. Unifies sparse covariance and sparse inverse covariance estimation.

Contribution 2: Guarantees for Estimation
Conditions for unique decomposition (exact statistics). Sparsistency and norm guarantees in both the Markov and independence domains (sample analysis). Sample requirement: n = Ω(log p) samples for p variables.

An efficient method for covariance decomposition and estimation.
Related Works

Sparse Covariance/Inverse Covariance Estimation
Sparse covariance estimation: covariance thresholding.
◮ (Bickel & Levina) (Wagaman & Levina) (Cai et al.)
Sparse inverse covariance estimation:
◮ ℓ1 penalization (Meinshausen & Bühlmann) (Ravikumar et al.)
◮ Non-convex methods (Anandkumar et al.) (Zhang)

Beyond Sparse Models: Decomposition Issues
Sparse + low rank (Chandrasekaran et al.) (Candès et al.). Decomposable regularizers (Negahban et al.). Multi-resolution Markov + independence models (Choi et al.): decomposition in the inverse covariance domain; lacks theoretical guarantees.

Our contribution: guaranteed decomposition and estimation.
Outline
1. Introduction
2. Algorithm
3. Guarantees
4. Experiments
5. Conclusion
Some Intuitions and Ideas

$\Sigma^* = (J^*_M)^{-1} + \Sigma^*_R$, with $\widehat{\Sigma}^n$ the sample covariance from n i.i.d. samples.

Review of ideas for the special cases: sparse covariance / sparse inverse covariance.

Sparse Covariance Estimation (Independence Model)
Σ∗ = Σ∗_R, with p ≫ n. Thresholding estimator for the off-diagonal entries (Bickel & Levina), with threshold chosen as $\sqrt{\log p / n}$. Sparsistency (support recovery) and norm guarantees when n = Ω(log p), i.e., even with n ≪ p.
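A minimal sketch of the hard-thresholding estimator in the spirit of Bickel & Levina; the constant c in the threshold is an assumed tuning parameter (typically chosen by cross-validation):

```python
import numpy as np

def threshold_covariance(Sigma_n, n, c=1.0):
    """Hard-threshold the off-diagonal entries of the sample covariance.

    Sketch of a Bickel & Levina-style estimator; c is an assumed tuning
    constant, and the threshold scales as sqrt(log p / n).
    """
    p = Sigma_n.shape[0]
    tau = c * np.sqrt(np.log(p) / n)                  # threshold level
    Sigma_hat = np.where(np.abs(Sigma_n) >= tau, Sigma_n, 0.0)
    np.fill_diagonal(Sigma_hat, np.diag(Sigma_n))     # diagonal is not thresholded
    return Sigma_hat
```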
Recap of Inverse Covariance (Markov) Estimation

$\Sigma^* = (J^*_M)^{-1} + \Sigma^*_R$, with $\widehat{\Sigma}^n$ the sample covariance from n i.i.d. samples.

ℓ1-MLE for Sparse Inverse Covariance (Ravikumar et al. '08)
$$\widehat{J}_M := \operatorname*{argmin}_{J_M \succ 0} \; \langle \widehat{\Sigma}^n, J_M \rangle - \log\det J_M + \gamma \|J_M\|_{1,\mathrm{off}}$$

Max-entropy Formulation (Lagrangian Dual)
$$\widehat{\Sigma}_M := \operatorname*{argmax}_{\Sigma_M \succ 0} \; \log\det \Sigma_M \quad \text{s.t.} \quad \|\widehat{\Sigma}^n - \Sigma_M\|_{\infty,\mathrm{off}} \le \gamma, \quad (\Sigma_M)_d = (\widehat{\Sigma}^n)_d.$$

Consistent estimation under certain conditions, with n = Ω(log p).
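In practice the ℓ1-MLE can be solved with off-the-shelf software; for instance, scikit-learn's GraphicalLasso fits an ℓ1-penalized Gaussian MLE of this form (an illustrative stand-in, not the authors' code; the data and penalty value are placeholders):

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))        # placeholder data: n = 200, p = 10

model = GraphicalLasso(alpha=0.1).fit(X)  # alpha plays the role of gamma
J_hat = model.precision_                  # estimated sparse inverse covariance
print((np.abs(J_hat) > 1e-8).sum())       # number of nonzero entries
```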
Extension to Markov + Independence Models?

$$\Sigma^* = (J^*_M)^{-1} + \Sigma^*_R.$$

Sparse Covariance Estimation
Threshold the off-diagonal entries of $\widehat{\Sigma}^n$.

Sparse Inverse Covariance Estimation
Add an ℓ1 penalty to the maximum-likelihood program (which estimates the inverse covariance matrix).

Is it possible to unify the above methods and guarantees?

Challenges and Insights
The penalties in the above methods act in different domains. Insight: consider the dual program of the MLE; for the Markov model, the dual program lives in the covariance domain.
Our Algorithm: Covariance Decomposition

$\Sigma^* = (J^*_M)^{-1} + \Sigma^*_R$. Extend the ℓ1-penalized MLE.

Max-entropy Formulation + ℓ1-penalized Residuals (This work)
Lagrangian dual of the ℓ1-ℓ∞-penalized MLE:
$$(\widehat{\Sigma}_M, \widehat{\Sigma}_R) := \operatorname*{argmax}_{\Sigma_M \succ 0,\, \Sigma_R} \; \log\det \Sigma_M - \lambda \|\Sigma_R\|_{1,\mathrm{off}}$$
$$\text{s.t.} \quad \|\widehat{\Sigma}^n - \Sigma_M - \Sigma_R\|_{\infty,\mathrm{off}} \le \gamma, \quad (\Sigma_M)_d = (\widehat{\Sigma}^n)_d, \quad (\Sigma_R)_d = 0.$$

ℓ1-ℓ∞-penalized MLE (This work)
$$\widehat{J}_M := \operatorname*{argmin}_{J_M \succ 0} \; \langle \widehat{\Sigma}^n, J_M \rangle - \log\det J_M + \gamma \|J_M\|_{1,\mathrm{off}} \quad \text{s.t.} \quad \|J_M\|_{\infty,\mathrm{off}} \le \lambda.$$
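A minimal CVXPY sketch of the primal program above (our modeling under stated assumptions, not the authors' implementation; solver settings are left at defaults):

```python
import cvxpy as cp
import numpy as np

def l1_linf_mle(Sigma_n, gamma, lam):
    """Sketch of the l1-l_inf-penalized MLE (primal).

    Minimizes <Sigma_n, J> - log det J + gamma * ||J||_{1,off}
    subject to ||J||_{inf,off} <= lam.
    """
    p = Sigma_n.shape[0]
    J = cp.Variable((p, p), symmetric=True)
    off = 1.0 - np.eye(p)                            # off-diagonal mask
    objective = cp.Minimize(
        cp.trace(Sigma_n @ J)                        # <Sigma_n, J>
        - cp.log_det(J)                              # implicitly keeps J PD
        + gamma * cp.norm1(cp.multiply(off, J))      # gamma * ||J||_{1,off}
    )
    constraints = [cp.max(cp.abs(cp.multiply(off, J))) <= lam]
    cp.Problem(objective, constraints).solve()
    return J.value
```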
Observations regarding the Proposed Method

ℓ1-ℓ∞-penalized MLE (Primal)
$$\widehat{J}_M := \operatorname*{argmin}_{J_M \succ 0} \; \langle \widehat{\Sigma}^n, J_M \rangle - \log\det J_M + \gamma \|J_M\|_{1,\mathrm{off}}, \quad \text{s.t.} \quad \|J_M\|_{\infty,\mathrm{off}} \le \lambda$$

Max-entropy Markov + ℓ1-penalized Residuals (Dual)
$$(\widehat{\Sigma}_M, \widehat{\Sigma}_R) := \operatorname*{argmax}_{\Sigma_M \succ 0,\, \Sigma_R} \; \log\det \Sigma_M - \lambda \|\Sigma_R\|_{1,\mathrm{off}}$$
$$\text{s.t.} \quad \|\widehat{\Sigma}^n - \Sigma_M - \Sigma_R\|_{\infty,\mathrm{off}} \le \gamma, \quad (\Sigma_M)_d = (\widehat{\Sigma}^n)_d, \quad (\Sigma_R)_d = 0.$$

Case λ → 0 (Sparse Covariance Estimation)
Threshold estimator for the off-diagonals of Σ∗_R (under exact statistics). With samples, λ = $\sqrt{\log p / n}$ recovers the threshold estimator.

Case λ → ∞ (Sparse Inverse Covariance Estimation)
Residual matrix $\widehat{\Sigma}_R = 0$: the ℓ1-penalized MLE of Ravikumar et al.

Unification of sparse covariance and sparse inverse covariance estimation.
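To make the two limiting regimes concrete, a hedged usage sketch (it assumes the l1_linf_mle function from the previous section; all numeric values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 500, 30
X = rng.standard_normal((n, p))
Sigma_n = (X.T @ X) / n

# lambda ~ sqrt(log p / n): the l_inf cap binds everywhere off-diagonal,
# and the program behaves like covariance thresholding (residual domain).
lam_small = np.sqrt(np.log(p) / n)

# Very large lambda: the cap never binds, Sigma_R = 0 in the dual, and the
# program collapses to the plain l1-penalized MLE (Markov domain).
lam_large = 1e6

J_small = l1_linf_mle(Sigma_n, gamma=0.1, lam=lam_small)
J_large = l1_linf_mle(Sigma_n, gamma=0.1, lam=lam_large)
```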
Guarantees for High-Dimensional Estimation

$$\Sigma^* = (J^*_M)^{-1} + \Sigma^*_R.$$

Conditions for Recovery
Maximum degree Δ in the Markov graph (corresponding to J∗_M). Number of samples n and number of nodes p satisfying n = Ω(Δ² log p). Regularization constant: $\lambda = \max_{i \ne j} |J^*_M(i,j)| + \Theta(\sqrt{\log p / n})$.

Theorem
The proposed method outputs estimates $(\widehat{J}_M, \widehat{\Sigma}_R)$ that are sparsistent and sign-consistent, and that satisfy the norm guarantees
$$\|\widehat{J}_M - J^*_M\|_\infty, \; \|\widehat{\Sigma}_R - \Sigma^*_R\|_\infty = O\big(\sqrt{\log p / n}\big).$$

Guaranteed sparsistency and efficient estimation in both domains.
Observations

Corollary 1 (Sparse Covariance Estimation)
With λ = Θ($\sqrt{\log p / n}$), our method reduces to the threshold estimator (Bickel & Levina) and is sparsistent for covariance estimation.

Corollary 2 (Sparse Inverse Covariance Estimation)
With λ → ∞, our method reduces to the ℓ1-penalized MLE (Ravikumar et al.) and is sparsistent for inverse covariance estimation.

Conditions for Recovery
Mutual-incoherence-type conditions. Sample complexity n = Ω(Δ² log p), comparable to inverse covariance estimation (Ravikumar et al.).
Synthetic Data

$$\Sigma^* = (J^*_M)^{-1} + \Sigma^*_R, \qquad J^* = (\Sigma^*)^{-1}.$$

Setup
8 × 8 two-dimensional grid for the Markov model. Mixed Markov model (both positive and negative correlations). Arbitrary-valued sparse residuals.
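One plausible way to generate such a synthetic model in numpy (a sketch; the ±0.2 edge weights, the residual magnitudes, and the sample size are our assumptions, not the authors' exact setup):

```python
import numpy as np

rng = np.random.default_rng(0)
side = 8
p = side * side                              # 8 x 8 grid -> p = 64 variables

# Mixed Markov model: random-sign weights on the edges of the 2-D grid.
J_M = np.zeros((p, p))
for i in range(side):
    for j in range(side):
        u = i * side + j
        if j + 1 < side:                     # edge to the right neighbour
            J_M[u, u + 1] = J_M[u + 1, u] = rng.choice([-0.2, 0.2])
        if i + 1 < side:                     # edge to the bottom neighbour
            J_M[u, u + side] = J_M[u + side, u] = rng.choice([-0.2, 0.2])
np.fill_diagonal(J_M, 1.0 + np.abs(J_M).sum(axis=1))  # diagonal dominance -> PD

# Sparse residual with arbitrary-valued (small) symmetric off-diagonal entries.
Sigma_R = np.zeros((p, p))
rows, cols = np.triu_indices(p, k=1)
pick = rng.choice(rows.size, size=10, replace=False)
Sigma_R[rows[pick], cols[pick]] = rng.uniform(-0.1, 0.1, size=10)
Sigma_R += Sigma_R.T

Sigma_star = np.linalg.inv(J_M) + Sigma_R    # Sigma* = (J_M)^{-1} + Sigma_R
X = rng.multivariate_normal(np.zeros(p), Sigma_star, size=1000)
```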
J estimation
[Figure: estimation error $\|\widehat{J} - J^*\|_\infty$ versus number of samples n, comparing the proposed ℓ1 + ℓ∞ method with the plain ℓ1 method.]

Performance under LBP
[Figure: average mean error versus iteration for loopy belief propagation (LBP) applied to the J∗ model and to the J∗_M model.]

Advantage over existing techniques.
Experiments on Stock Market Data

Setup
Monthly stock returns of companies on the S&P index. Companies in the divisions E.Trans, Comm, Elec&Gas, and G.Retail Trade. Apply the proposed method.

[Figure: recovered dependency graph over company tickers, including CBS, NSC, SNS, BNI, HD, and CMCSA.]