Scale Covariance Multivariate Conclusion
Robust estimation of precision matrices under cellwise contamination
Garth Tarr, Samuel Müller and Neville Weber
COMPSTAT 2014
Outline
Robust Scale Estimator, Pn Covariance Covariance Matrix Autocovariance Long Range Dependence Short Range Dependence Inverse Covariance Matrix Estimation
Outline
- Robust scale estimation with Pn
- Robust pairwise covariance estimation
- Robust covariance and precision matrices
- Summary and key references
Pairwise mean scale estimator: Pn
- Consider the U-statistic, based on the pairwise mean kernel,
  $$U_n(X) := \binom{n}{2}^{-1} \sum_{i<j} \frac{X_i + X_j}{2}.$$
- Let $H(t) = P((X_i + X_j)/2 \le t)$ be the cdf of the kernels, with corresponding empirical distribution function
  $$H_n(t) := \binom{n}{2}^{-1} \sum_{i<j} I\left\{\frac{X_i + X_j}{2} \le t\right\}, \quad t \in \mathbb{R}.$$
Definition (Interquartile range of pairwise means)
  $$P_n = c\left[H_n^{-1}(0.75) - H_n^{-1}(0.25)\right],$$
where $c \approx 1.048$ is a correction factor to ensure $P_n$ is consistent for the standard deviation when the underlying observations are Gaussian.
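As a rough illustration of the definition above, the following Python sketch computes Pn by enumerating all pairwise means. The function name `pn_scale` is hypothetical, and `np.quantile` with its default linear interpolation stands in for the empirical quantile function of the pairwise means; this is an assumption, not the authors' implementation.

```python
import numpy as np

def pn_scale(x, c=1.048):
    """Sketch of Pn: scaled interquartile range of all pairwise means."""
    x = np.asarray(x, dtype=float)
    # Index pairs (i, j) with i < j, matching the sum over i < j.
    i, j = np.triu_indices(len(x), k=1)
    pair_means = (x[i] + x[j]) / 2.0
    # Assumption: linear-interpolation quantiles approximate H_n^{-1}.
    q25, q75 = np.quantile(pair_means, [0.25, 0.75])
    return c * (q75 - q25)
```

This enumeration needs O(n^2) memory, so it is only practical for moderate sample sizes.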
From scale to covariance: the GK device
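- Gnanadesikan and Kettenring (1972) relate scale and covariance using the following identity,
  $$\operatorname{cov}(X, Y) = \frac{1}{4\alpha\beta}\left[\operatorname{var}(\alpha X + \beta Y) - \operatorname{var}(\alpha X - \beta Y)\right],$$
  where $X$ and $Y$ are random variables.
- In general, $X$ and $Y$ can have different units, so we set $\alpha = 1/\sqrt{\operatorname{var}(X)}$ and $\beta = 1/\sqrt{\operatorname{var}(Y)}$.
- Replacing variance with $P_n^2$ we can similarly construct,
  $$\gamma_P(X, Y) = \frac{1}{4\alpha\beta}\left[P_n^2(\alpha X + \beta Y) - P_n^2(\alpha X - \beta Y)\right],$$
  where $\alpha = 1/P_n(X)$ and $\beta = 1/P_n(Y)$.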
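The GK construction can be sketched in Python as below. The names `gk_covariance` and `pn_scale` are hypothetical, and the Pn helper uses `np.quantile` interpolation as a stand-in for the empirical quantiles, so treat this as an illustration under those assumptions. Any scale estimator can be passed in; with the ordinary standard deviation the identity reproduces the sample covariance exactly.

```python
import numpy as np

def pn_scale(x, c=1.048):
    """Sketch of Pn: scaled interquartile range of pairwise means."""
    x = np.asarray(x, dtype=float)
    i, j = np.triu_indices(len(x), k=1)
    q25, q75 = np.quantile((x[i] + x[j]) / 2.0, [0.25, 0.75])
    return c * (q75 - q25)

def gk_covariance(x, y, scale=pn_scale):
    """Pairwise covariance via the Gnanadesikan-Kettenring identity."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Standardise each variable by its own scale estimate.
    alpha = 1.0 / scale(x)
    beta = 1.0 / scale(y)
    plus = scale(alpha * x + beta * y) ** 2
    minus = scale(alpha * x - beta * y) ** 2
    return (plus - minus) / (4.0 * alpha * beta)
```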
Estimating dependence
Problem:
To estimate dependence in multivariate settings with cellwise contamination.
[Figure: a data matrix of 100 observations by 30 variables, and the percentage of rows contaminated (0% to 100%) against the percentage of cells contaminated (0% to 10%) for p = 30.]
For details see Alqallaf et al. (2009).
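The propagation effect can be illustrated with a one-line calculation. The sketch below assumes each cell is contaminated independently with probability `eps`, in which case a row of `p` cells contains at least one contaminated cell with probability 1 - (1 - eps)^p. The function name is hypothetical.

```python
def row_contamination_probability(eps, p):
    """Probability that a row of p cells has at least one contaminated cell,
    assuming cells are contaminated independently with probability eps."""
    return 1.0 - (1.0 - eps) ** p
```

Under this assumption, 10% cellwise contamination with p = 30 leaves only about 4% of rows completely clean.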
Estimating dependence
Problem:
To estimate dependence in multivariate settings with cellwise contamination.
Solution:
1. pairwise covariance matrices
Pairwise approach to the rescue?
[Figure: a data matrix of 100 observations by 2 variables, and the percentage of rows contaminated (0% to 100%) against the percentage of cells contaminated (0% to 10%) for p = 30 and p = 2.]
Estimating dependence
Problem:
To estimate dependence in multivariate settings with cellwise contamination.
Solution:
1. pairwise covariance matrices
2. correct for positive definiteness
Positive definite?
- The standard approach of Maronna and Zamar (2002) suffers from outlier propagation, so it fails under cellwise contamination.
- Higham (2002) outlines the nearest positive definite (NPD) approach:
  1. Perform a spectral decomposition of the symmetric matrix of pairwise covariances.
  2. Set any negative eigenvalues to some small positive constant.
  3. Reconstruct the covariance matrix using the adjusted eigenstructure.
→ The NPD approach produces poorly conditioned covariance matrices.
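The three steps listed above can be sketched as a simple eigenvalue-clipping routine. This is a minimal illustration of those steps only, not Higham's full algorithm; the function name and the default constant `eps` are assumptions, and for simplicity the sketch floors every eigenvalue at `eps` rather than touching only the negative ones.

```python
import numpy as np

def clip_to_positive_definite(s, eps=1e-6):
    """Eigenvalue-clipping sketch of the NPD correction."""
    s = np.asarray(s, dtype=float)
    s = (s + s.T) / 2.0                      # enforce exact symmetry
    eigvals, eigvecs = np.linalg.eigh(s)     # step 1: spectral decomposition
    clipped = np.maximum(eigvals, eps)       # step 2: floor eigenvalues at eps
    return (eigvecs * clipped) @ eigvecs.T   # step 3: reconstruct the matrix
```

Because the clipped eigenvalues sit at a tiny constant, the ratio of largest to smallest eigenvalue (for example via `np.linalg.cond`) can be very large, which is the poor conditioning noted above.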
Estimating dependence
Problem:
To estimate dependence in multivariate settings with cellwise contamination.
Solution:
1. pairwise covariance matrices
2. correct for positive definiteness
3. regularisation procedure
Precision matrices
- In many practical applications, the covariance matrix is not what is really required.
- PCA, Mahalanobis distance, LDA, etc. use the inverse covariance matrix: the precision matrix, $\Theta = \Sigma^{-1}$.
- Precision matrices are also of interest in modelling Gaussian Markov random fields, where zeros in the precision matrix correspond to conditional independence between variables.
Sparsity!
In many applications, it is often useful to impose a level of sparsity on the estimated precision matrix.
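To make the link between zeros and conditional independence concrete, the sketch below converts a precision matrix into partial correlations using the standard Gaussian identity rho_ij = -Theta_ij / sqrt(Theta_ii Theta_jj). The function name is hypothetical, and any matrix passed in is purely illustrative rather than taken from the talk.

```python
import numpy as np

def partial_correlations(theta):
    """Partial correlations implied by a precision matrix theta."""
    theta = np.asarray(theta, dtype=float)
    d = np.sqrt(np.diag(theta))
    rho = -theta / np.outer(d, d)
    np.fill_diagonal(rho, 1.0)  # convention: unit diagonal
    return rho
```

For a tridiagonal precision matrix, the covariance matrix is dense, yet the partial correlation between non-adjacent variables is exactly zero.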
Regularisation techniques
- Graphical lasso (glasso) (Friedman, Hastie and Tibshirani, 2008) minimises the penalised negative Gaussian log-likelihood,
  $$f(\Theta) = \operatorname{tr}(\hat{\Sigma}\Theta) - \log|\Theta| + \lambda\|\Theta\|_1,$$
  where $\|\Theta\|_1$ is the $L_1$ norm and $\lambda$ is a tuning parameter for the amount of shrinkage.
- Quadratic Inverse Covariance (QUIC) (Hsieh et al., 2011) solves the same minimisation problem as the glasso but uses a second-order approach.
- Constrained $\ell_1$-minimisation for inverse matrix estimation (CLIME) (Cai, Liu and Luo, 2011) solves the following optimisation problem:
  $$\min \|\Theta\|_1 \quad \text{subject to} \quad \|\hat{\Sigma}\Theta - I\|_\infty \le \lambda.$$
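The two criteria above can be evaluated directly for a candidate precision matrix. The sketch below does not solve either problem; it only computes the glasso objective and checks the CLIME constraint, with the tuning parameter written as `lam`. It assumes the L1 norm is the elementwise sum of absolute values (diagonal included) and the infinity norm is the elementwise maximum; both function names are hypothetical.

```python
import numpy as np

def glasso_objective(theta, sigma_hat, lam):
    """Penalised negative Gaussian log-likelihood for a candidate theta."""
    sign, logdet = np.linalg.slogdet(theta)
    if sign <= 0:
        raise ValueError("theta must have a positive determinant")
    # Assumption: the penalty covers all entries, including the diagonal.
    return np.trace(sigma_hat @ theta) - logdet + lam * np.abs(theta).sum()

def clime_feasible(theta, sigma_hat, lam):
    """Check the CLIME constraint with an elementwise max norm."""
    p = theta.shape[0]
    return bool(np.max(np.abs(sigma_hat @ theta - np.eye(p))) <= lam)
```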
Estimating dependence
Problem:
To estimate dependence in multivariate settings with cellwise contamination.
Solution:
1. pairwise covariance matrices
2. correct for positive definiteness
3. regularisation procedure
Evaluation:
- Is the estimate “close” to the truth?
Simulation design
- sample size: n = 100
- replications: N = 100
- variables: p = 30, 60, 90
- scenarios: banded, sparse and dense precision matrices, Θ
- true data: N(0, Σ) where Σ = Θ^{-1}
- contamination: 0% to 25%, randomly scattered component-wise
[Figure: three panels showing the precision matrix patterns: 1. Banded, 2. Sparse, 3. Dense.]
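A design of this kind could be simulated along the following lines. The slides do not give the exact precision matrices or the contaminating distribution, so everything specific here is an assumption made for illustration: the tridiagonal `banded_precision` matrix, its off-diagonal value, and replacing contaminated cells by a fixed `outlier_value`.

```python
import numpy as np

def banded_precision(p, off_diag=0.4):
    """Hypothetical banded (tridiagonal) precision matrix; not from the talk."""
    theta = np.eye(p)
    idx = np.arange(p - 1)
    theta[idx, idx + 1] = off_diag
    theta[idx + 1, idx] = off_diag
    return theta

def simulate_cellwise(n, theta, eps, outlier_value=10.0, seed=0):
    """Draw N(0, inv(theta)) data, then contaminate cells independently."""
    rng = np.random.default_rng(seed)
    p = theta.shape[0]
    sigma = np.linalg.inv(theta)
    clean = rng.multivariate_normal(np.zeros(p), sigma, size=n)
    # Each cell is contaminated with probability eps (assumed mechanism).
    mask = rng.random((n, p)) < eps
    data = np.where(mask, outlier_value, clean)
    return data, mask
```

Returning the contamination mask alongside the data makes it easy to check how many rows are affected in a given replication.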
10% contamination
[Figure: scatter plots of X2 against X1 (both axes from -20 to 20) in two panels, Extreme and Moderate.]
Evaluating performance
Entropy Loss
Measures how “close” $\hat{\Theta}$ is to $\Theta$,
  $$L(\Theta, \hat{\Theta}) = \operatorname{tr}(\Theta^{-1}\hat{\Theta}) - \log|\Theta^{-1}\hat{\Theta}| - p.$$
Reported as percentage relative improvement in average loss:
  $$\mathrm{PRIAL}(\hat{\Theta}) = \frac{L(\Theta, \hat{\Theta}_0) - L(\Theta, \hat{\Theta})}{L(\Theta, \hat{\Theta}_0)} \times 100,$$
where $\hat{\Theta}_0$ is the estimated precision matrix after a regularisation technique has been applied to the classical sample covariance matrix for uncontaminated data.
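Both quantities are straightforward to compute. The sketch below uses hypothetical function names and takes the average losses (over replications) as inputs to `prial`; it is an illustration of the formulas, not the authors' code.

```python
import numpy as np

def entropy_loss(theta, theta_hat):
    """Entropy loss: tr(inv(theta) @ theta_hat) - log det(...) - p."""
    p = theta.shape[0]
    m = np.linalg.solve(theta, theta_hat)  # inv(theta) @ theta_hat
    sign, logdet = np.linalg.slogdet(m)
    return np.trace(m) - logdet - p

def prial(avg_loss_reference, avg_loss_estimate):
    """Percentage relative improvement in average loss."""
    return 100.0 * (avg_loss_reference - avg_loss_estimate) / avg_loss_reference
```

The loss is zero when the estimate equals the truth, and PRIAL is negative whenever an estimator has a larger average loss than the reference.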
Changing dimension: p = 90
[Figure: entropy loss PRIAL against percent contamination in each variable (5 to 25) for CLIME, banded, p = 90. Methods shown: Qn with NPD, τ with NPD, Pn with NPD, MCD, Pn with OGK, Classical.]
Changing dimension: p = 60
[Figure: entropy loss PRIAL against percent contamination in each variable (5 to 25) for CLIME, banded, p = 60. Methods shown: Qn with NPD, τ with NPD, Pn with NPD, MCD, Pn with OGK, Classical.]
Changing dimension: p = 30
[Figure: entropy loss PRIAL against percent contamination in each variable (5 to 25) for CLIME, banded, p = 30. Methods shown: Qn with NPD, τ with NPD, Pn with NPD, MCD, Pn with OGK, Classical.]
Changing regularisation routine: GLASSO
[Figure: entropy loss PRIAL against percent contamination in each variable (5 to 25) for GLASSO, banded, p = 60. Methods shown: Qn with NPD, τ with NPD, Pn with NPD, MCD, Pn with OGK, Classical.]
Changing regularisation routine: QUIC
[Figure: entropy loss PRIAL against percent contamination in each variable (5 to 25) for QUIC, banded, p = 60. Methods shown: Qn with NPD, τ with NPD, Pn with NPD, MCD, Pn with OGK, Classical.]
Changing experiment: banded
[Figure: entropy loss PRIAL against percent contamination in each variable (5 to 25) for QUIC, banded, p = 90. Methods shown: Qn with NPD, τ with NPD, Pn with NPD, MCD, Pn with OGK, Classical.]
Changing experiment: sparse
[Figure: entropy loss PRIAL against percent contamination in each variable (5 to 25) for QUIC, scattered, p = 90. Methods shown: Qn with NPD, τ with NPD, Pn with NPD, MCD, Pn with OGK, Classical.]
Changing experiment: dense
[Figure: entropy loss PRIAL against percent contamination in each variable (5 to 25) for QUIC, dense, p = 30. Methods shown: Qn with NPD, τ with NPD, Pn with NPD, MCD, Pn with OGK, Classical.]
Estimating dependence
Problem:
To estimate dependence in multivariate settings with cellwise contamination.
Solution:
1. pairwise covariance matrices
2. correct for positive definiteness
3. regularisation procedure
Evaluation:
- Is the estimate “close” to the truth?
- Are we able to recover the support of a Gaussian graphical model?
QUIC, uncontaminated data, N = 100 replications
[Figure: six panels (True Θ, Classic, MCD, Pn with NPD, Qn with NPD, τ with NPD) with a scale from 20 to 100.]
QUIC, 10% contamination, N = 100 replications
[Figure: six panels (True Θ, Classic, MCD, Pn with NPD, Qn with NPD, τ with NPD) with a scale from 20 to 100.]
Evaluating performance
Matthews correlation coefficient (MCC)
Takes into account the number of true positives (TP), false positives (FP), true negatives (TN) and false negatives (FN),
  $$\mathrm{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}.$$
This is the correlation between the observed and predicted binary classifications.
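For support recovery, the MCC can be computed by comparing which off-diagonal entries are non-zero in the true and estimated precision matrices. The sketch below assumes that only the upper-triangular off-diagonal entries are classified and that an entry counts as non-zero when its absolute value exceeds `tol`; the function name and both conventions are assumptions. It returns 0 when the denominator vanishes.

```python
import numpy as np

def support_mcc(theta_true, theta_hat, tol=1e-8):
    """MCC between true and estimated off-diagonal supports."""
    p = theta_true.shape[0]
    upper = np.triu_indices(p, k=1)            # each pair counted once
    truth = np.abs(theta_true[upper]) > tol
    pred = np.abs(theta_hat[upper]) > tol
    tp = float(np.sum(truth & pred))
    tn = float(np.sum(~truth & ~pred))
    fp = float(np.sum(~truth & pred))
    fn = float(np.sum(truth & ~pred))
    denom = np.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom > 0 else 0.0
```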
MCC: p = 30
[Figure: Matthews correlation coefficient (0.0 to 0.4) against percent contamination in each variable (5 to 25) for QUIC, scattered, p = 30. Methods shown: Qn with NPD, τ with NPD, Pn with NPD, MCD, Pn with OGK, Classical.]
MCC: p = 60
[Figure: Matthews correlation coefficient (0.0 to 0.4) against percent contamination in each variable (5 to 25) for QUIC, scattered, p = 60. Methods shown: Qn with NPD, τ with NPD, Pn with NPD, MCD, Pn with OGK, Classical.]
MCC: p = 90
[Figure: Matthews correlation coefficient (0.0 to 0.4) against percent contamination in each variable (5 to 25) for QUIC, scattered, p = 90. Methods shown: Qn with NPD, τ with NPD, Pn with NPD, MCD, Pn with OGK, Classical.]
Other considerations
- Contaminating model: compare with the missingness literature
- Choice of sparsity parameter: the billion euro question
- p > n: looks promising
QUIC, n = 50, p = 60, sparse precision matrix
[Figure: entropy loss PRIAL against percent contamination in each variable (5 to 25). Methods shown: Qn with NPD, τ with NPD, Pn with NPD, Classical.]
Summary
Problem:
To estimate dependence in multivariate settings with cellwise contamination.
Solution:
1. pairwise covariance matrices
2. correct for positive definiteness
3. regularisation procedure
Result:
- Performs well with moderate amounts of outliers
- Looks promising for p > n problems
References
Alqallaf, F., Van Aelst, S., Yohai, V. J. and Zamar, R. H. (2009). Propagation of outliers in multivariate data. The Annals of Statistics, 37:311–331.
Cai, T., Liu, W. and Luo, X. (2011). A constrained ℓ1 minimization approach to sparse precision matrix estimation. Journal of the American Statistical Association, 106:594–607.
Friedman, J., Hastie, T. and Tibshirani, R. (2008). Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 9(3):432–441.
Gnanadesikan, R. and Kettenring, J. R. (1972). Robust estimates, residuals and outlier detection with multiresponse data. Biometrics, 28(1):81–124.