
Robust Statistics Part 2: Multivariate location and scatter

Peter Rousseeuw

LARS-IASC School, May 2019


Multivariate location and scatter: Outline

1. Classical estimators and outlier detection
2. M-estimators
3. The Stahel-Donoho estimator
4. The MCD estimator
5. The MVE estimator
6. S-estimators
7. MM-estimators
8. Some non affine equivariant estimators
9. Software availability

Multivariate location and scatter

Data: $x_1, \dots, x_n$, where the observations $x_i$ are $p$-variate column vectors. We often combine the coordinates of the observations in an $n \times p$ matrix:

$$X = (x_1, \dots, x_n)' = \begin{pmatrix} x_{11} & x_{12} & \dots & x_{1p} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \dots & x_{np} \end{pmatrix}$$

Model for the observations: $x_i \sim N_p(\mu, \Sigma)$. More generally, we can assume that the data were generated from an elliptical distribution, whose density contours are ellipsoids too.

Outlier detection

In the multivariate setting, outliers cannot always be detected by simply applying outlier detection rules to each variable separately:

[Figure: "Bivariate Outliers" — scatterplot of X2 versus X1 with two outlying points marked]

Outlier detection

These points are not outlying in either variable:

[Figure: normal Q−Q plots of X1 and X2 (Sample Quantiles versus Theoretical Quantiles)]

We can only detect such outliers by correctly estimating the covariance structure!

Affine equivariance

We usually want estimators $\hat\mu$ and $\hat\Sigma$ that are affine equivariant:

$$\hat\mu(\{Ax_1 + b, \dots, Ax_n + b\}) = A\,\hat\mu(\{x_1, \dots, x_n\}) + b$$

$$\hat\Sigma(\{Ax_1 + b, \dots, Ax_n + b\}) = A\,\hat\Sigma(\{x_1, \dots, x_n\})\,A'$$

for any nonsingular matrix $A$ and any vector $b$. Affine equivariance implies that the estimator transforms well under any nonsingular reparametrization of the space of the $x_i$. Consequently, the data may be rotated, translated or rescaled (for example through a change of measurement units) without affecting the outlier detection diagnostics.
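The two identities are easy to check numerically for the classical mean and covariance. A minimal Python sketch (illustrative only; the slides themselves use R, and the random data here are an assumption):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))          # n = 100 observations, p = 3

A = rng.normal(size=(3, 3))            # a (generically nonsingular) matrix
b = rng.normal(size=3)
Y = X @ A.T + b                        # transformed data: y_i = A x_i + b

mu_X, S_X = X.mean(axis=0), np.cov(X, rowvar=False)
mu_Y, S_Y = Y.mean(axis=0), np.cov(Y, rowvar=False)

# mu(AX + b) = A mu(X) + b   and   Sigma(AX + b) = A Sigma(X) A'
assert np.allclose(mu_Y, A @ mu_X + b)
assert np.allclose(S_Y, A @ S_X @ A.T)
```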


Affine equivariance

A counterexample to affine equivariance is the coordinatewise median

$$\hat\mu(\{x_1, \dots, x_n\}) = \Big( \operatorname*{med}_{i=1}^{n} x_{i1}\,, \dots, \operatorname*{med}_{i=1}^{n} x_{ip} \Big)'$$

which is very easy to compute. It is not affine equivariant, and not even orthogonally equivariant, since it does not transform well under rotations. What we can do is shift the data as $\{x_1 + b, \dots, x_n + b\}$ and rescale by a diagonal matrix $A$ (that is, change the measurement units of the original variables). We will study the robustness of the coordinatewise median later.

Breakdown value

We say that a multivariate location estimator $\hat\mu$ breaks down when it can be carried outside any bounded set. Every affine equivariant location estimator satisfies

$$\varepsilon^*_n(\hat\mu, X_n) \leqslant \frac{1}{n}\left\lfloor \frac{n+1}{2} \right\rfloor.$$

The breakdown value of a scatter estimator $\hat\Sigma$ is defined as the minimum of the explosion breakdown value and the implosion breakdown value. Explosion occurs when the largest eigenvalue becomes arbitrarily large; implosion occurs when the smallest eigenvalue becomes arbitrarily small.

Breakdown value

Any affine equivariant scatter estimator $\hat\Sigma$ satisfies

$$\varepsilon^*_n(\hat\Sigma, X_n) \leqslant \frac{1}{n}\left\lfloor \frac{n-p+1}{2} \right\rfloor$$

if the sample $X_n$ is in general position:

General position

A multivariate data set of dimension $p$ is said to be in general position if at most $p$ observations lie in any $(p-1)$-dimensional hyperplane. For example, at most 2 observations lie on a line, at most 3 in a plane, etc.

Overview

Estimators of multivariate location and scatter can be divided into those that are affine equivariant or not, and those with low or high breakdown value:

             affine equivariant            non affine equivariant
  Low BV     Classical mean & covariance
             M-estimators
  High BV    Stahel-Donoho estimator       coordinatewise median
             MCD, MVE                      spatial median, sign covariance
             S-estimators                  OGK
             MM-estimators                 DetMCD


Classical estimators

The classical estimators of $\mu$ and $\Sigma$ are the empirical mean and covariance matrix:

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \qquad\qquad S_n = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})'$$

Both are affine equivariant but highly sensitive to outliers, as they have a zero breakdown value and an unbounded influence function.

Classical estimators

Consider the Animals data set containing the logarithm of the body and brain weight of 28 animals:

[Figure: "Animals" — scatterplot of log(brain) versus log(body)]

Tolerance ellipsoid

On this plot we can add the 97.5% tolerance ellipsoid. Its boundary contains the x-values with constant Mahalanobis distance to the mean.

Mahalanobis distance

$$\mathrm{MD}(x) = \sqrt{(x - \bar{x}_n)'\, S_n^{-1}\, (x - \bar{x}_n)}$$

Classical tolerance ellipsoid

$$\Big\{x;\ \mathrm{MD}(x) \leqslant \sqrt{\chi^2_{p,0.975}}\Big\}$$

with $\chi^2_{p,0.975}$ the 97.5% quantile of the $\chi^2$-distribution with $p$ degrees of freedom.

We expect (for large $n$) that about 97.5% of the observations belong to this ellipsoid. We could flag observation $x_i$ as an outlier if it does not belong to the classical tolerance ellipsoid, but...
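The flagging rule is a one-liner once the distances are computed. A Python sketch (illustrative; the slides use R, and the simulated bivariate normal sample is an assumption):

```python
import numpy as np
from scipy import stats

def classical_flags(X, quantile=0.975):
    """Mahalanobis distances and the classical tolerance-ellipsoid
    rule: flag x_i when MD(x_i) > sqrt(chi2_{p, quantile})."""
    n, p = X.shape
    D = X - X.mean(axis=0)
    S_inv = np.linalg.inv(np.cov(X, rowvar=False))
    md2 = np.einsum('ij,jk,ik->i', D, S_inv, D)   # squared Mahalanobis distances
    cutoff = stats.chi2.ppf(quantile, df=p)
    return np.sqrt(md2), md2 > cutoff

rng = np.random.default_rng(1)
X = rng.multivariate_normal([0, 0], [[1, 0.8], [0.8, 1]], size=200)
md, flagged = classical_flags(X)
# under the model, roughly 2.5% of the points fall outside the ellipsoid
```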


Tolerance ellipsoid

Based on the classical mean and covariance matrix, the outliers do not stand out:

[Figure: left, the classical tolerance ellipse on the log(brain) versus log(body) scatterplot; right, the classical Mahalanobis distances versus index]

The classical Mahalanobis distances do not flag all the outliers!

Point estimates

On all data points:

$$\bar{x}_{28} = (3.77,\ 4.425)' \qquad S_{28} = \begin{pmatrix} 14.22 & 7.05 \\ 7.05 & 5.76 \end{pmatrix}$$

This yields an estimated correlation of $r = 7.05/\sqrt{14.22 \cdot 5.76} = 0.78$.

On the reduced data set (without observations 6, 16 and 26):

$$\bar{x}_{25} = (3.03,\ 4.428)' \qquad S_{25} = \begin{pmatrix} 10.50 & 7.90 \\ 7.90 & 6.45 \end{pmatrix}$$

which yields an estimated correlation of $r = 0.96$!


M-estimators of location and scatter

At the normal model, the MLEs of $\mu$ and $\Sigma$ are given by:

$$\sum_{i=1}^{n} (x_i - \hat\mu) = 0 \qquad \text{together with} \qquad \frac{1}{n}\sum_{i=1}^{n} (x_i - \hat\mu)(x_i - \hat\mu)' = \hat\Sigma$$

M-estimators of location and scatter

An M-estimator $(\hat\mu, \hat\Sigma)$ is defined as the solution of

$$\sum_{i=1}^{n} W_1(d_i^2)(x_i - \hat\mu) = 0 \quad (1)$$

$$\frac{1}{n}\sum_{i=1}^{n} W_2(d_i^2)(x_i - \hat\mu)(x_i - \hat\mu)' = \hat\Sigma \quad (2)$$

where $d_i = \sqrt{(x_i - \hat\mu)'\,\hat\Sigma^{-1}(x_i - \hat\mu)}$ depends on $\hat\mu$ and $\hat\Sigma$ themselves. Estimating $\Sigma$ is by far the most difficult part! There is no 'easy' solution like the MAD or $Q_n$.

M-estimators of location and scatter

There are conditions on $W_1$ and $W_2$ that ensure the existence, uniqueness and consistency of the estimators. Important conditions are that $\sqrt{t}\,W_1(t)$ and $t\,W_2(t)$ are bounded. An M-estimator for which $t\,W_2(t)$ is weakly increasing is called monotone; otherwise it is called redescending. M-estimators can be computed with an iterative algorithm:

1. Start with initial choices $\hat\mu_0$ and $\hat\Sigma_0$, e.g. the coordinatewise median and the diagonal matrix with the squared coordinatewise MADs on the diagonal.
2. At iteration $k$, compute $d_{ki} = \sqrt{(x_i - \hat\mu_k)'\,\hat\Sigma_k^{-1}(x_i - \hat\mu_k)}$ and

$$\hat\mu_{k+1} = \frac{\sum_{i=1}^{n} W_1(d_{ki}^2)\,x_i}{\sum_{i=1}^{n} W_1(d_{ki}^2)}\,, \qquad \hat\Sigma_{k+1} = \frac{1}{n}\sum_{i=1}^{n} W_2(d_{ki}^2)(x_i - \hat\mu_{k+1})(x_i - \hat\mu_{k+1})'.$$

For a monotone M-estimator this algorithm always converges to the unique solution, regardless of the initial values. For a redescending M-estimator the algorithm can converge to a bad solution.
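The iteration above can be sketched in a few lines of Python (illustrative, not the slides' code; the Huber-type weight functions and the tuning constant `c2` are assumptions, chosen so that the estimator is monotone):

```python
import numpy as np

def m_estimate(X, W1, W2, n_iter=200, tol=1e-8):
    """Fixed-point iteration for an M-estimator of multivariate
    location and scatter (sketch of the slide's algorithm)."""
    n, p = X.shape
    mu = np.median(X, axis=0)                        # coordinatewise median
    mad = 1.4826 * np.median(np.abs(X - mu), axis=0)
    Sigma = np.diag(mad ** 2)                        # squared MADs on the diagonal
    for _ in range(n_iter):
        D = X - mu
        d2 = np.einsum('ij,jk,ik->i', D, np.linalg.inv(Sigma), D)
        d2 = np.maximum(d2, 1e-12)                   # guard against zero distances
        w1 = W1(d2)
        mu_new = (w1[:, None] * X).sum(axis=0) / w1.sum()
        D = X - mu_new
        Sigma = (W2(d2)[:, None, None] * np.einsum('ij,ik->ijk', D, D)).mean(axis=0)
        if np.linalg.norm(mu_new - mu) < tol:
            mu = mu_new
            break
        mu = mu_new
    return mu, Sigma

# Huber-type weights: sqrt(t)W1(t) and tW2(t) are bounded, and
# tW2(t) is weakly increasing, so the estimator is monotone.
c2 = 4.0
W1 = lambda d2: np.minimum(1.0, np.sqrt(c2 / d2))
W2 = lambda d2: np.minimum(1.0, c2 / d2)

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))
mu, Sigma = m_estimate(X, W1, W2)
```

No consistency factor is applied here, so `Sigma` is not calibrated to the normal model.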


Efficiency and robustness of M-estimators

Properties of M-estimators:

- Under some regularity conditions on $W_1$ and $W_2$, M-estimators are asymptotically normal.
- The influence function is bounded if $\sqrt{t}\,W_1(t)$ and $t\,W_2(t)$ are bounded.
- The asymptotic breakdown value of a monotone M-estimator satisfies $\epsilon^* \leqslant \frac{1}{p+1}$. Although monotone M-estimators attain the optimal value of 0.5 in the univariate case, this is no longer true in higher dimensions!
- A monotone M-estimator is thus computationally attractive, but at the cost of a rather low breakdown value. Redescending M-estimators can have a higher breakdown value, but the algorithm may converge to a wrong solution.

Affine equivariant estimators with high breakdown value

The remaining sections discuss the affine equivariant high-breakdown estimators from the overview table: the Stahel-Donoho estimator, MCD, MVE, S-estimators and MM-estimators.

The Stahel-Donoho outlyingness

Stahel (1981) and Donoho (1982) proposed a measure of how outlying a point is, based on the projection pursuit principle: a multivariate outlier should be outlying in at least one direction, but not necessarily along the coordinate axes. The Stahel-Donoho outlyingness of a point $x$ relative to the data set $\{x_1, \dots, x_n\}$ is given by

$$\mathrm{SDO}(x) = \sup_{a \in \mathbb{R}^p} \frac{|a'x - \mathrm{med}_j(a'x_j)|}{\mathrm{MAD}_j(a'x_j)}\,.$$

This projects the data on many directions $a$. The projected data are univariate, so we can compute the outlyingness of $a'x$ as its 'absolute robust z-score' relative to $\{a'x_1, \dots, a'x_n\}$. The final outlyingness is the maximum of the univariate outlyingness over all directions.
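In practice the supremum is approximated over a finite set of directions. A crude Python sketch that uses random unit directions (illustrative; the slides use R, and the affine equivariant algorithm described later samples directions from $p$-subsets instead):

```python
import numpy as np

def sdo(X, n_dir=500, seed=0):
    """Approximate Stahel-Donoho outlyingness of each row of X,
    maximizing the robust |z-score| over random unit directions."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    A = rng.normal(size=(n_dir, p))
    A /= np.linalg.norm(A, axis=1, keepdims=True)    # unit directions a
    proj = X @ A.T                                   # a'x_i for every direction
    med = np.median(proj, axis=0)
    mad = np.median(np.abs(proj - med), axis=0)
    z = np.abs(proj - med) / mad                     # robust |z-score| per direction
    return z.max(axis=1)                             # max over the sampled directions

rng = np.random.default_rng(1)
X = np.vstack([rng.multivariate_normal([0, 0], [[1, 0.9], [0.9, 1]], 200),
               [[2.5, -2.5]]])                       # a correlation outlier
out = sdo(X)
# the appended point gets by far the largest outlyingness
```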


The Stahel-Donoho outlyingness: example

Consider the following bivariate dataset with n = 50, with two outliers:

[Figure: bivariate scatterplot with two outliers]

The Stahel-Donoho outlyingness: example

Consider the observation marked in red:

[Figure: the same scatterplot, with one non-outlying observation marked in red]

The Stahel-Donoho outlyingness: example

In every direction it has a small outlyingness:

[Figure: the scatterplot with projections of the marked observation on several directions]

The Stahel-Donoho outlyingness: example

Now consider one of the outlying observations:

[Figure: the same scatterplot, with one of the two outliers marked in red]

The Stahel-Donoho outlyingness: example

In at least one direction it has a large outlyingness:

[Figure: the scatterplot with a projection direction in which the outlier stands out]

The Stahel-Donoho estimator: definition

The Stahel-Donoho estimator is defined as the weighted mean and covariance matrix of the $x_i$, with weights $w_i = W(\mathrm{SDO}_i)$, where the weight function $W$ is bounded, strictly positive and weakly decreasing. A typical weight function is

$$W(t) = \min\left(1,\ \frac{\chi^2_{p,0.95}}{t^2}\right).$$

If $t^2 W(t)$ is bounded (as here), the breakdown value of the Stahel-Donoho estimator is 50%. It was the first affine equivariant estimator of location and scatter with maximal breakdown value.


The Stahel-Donoho estimator: properties

In the formula for the outlyingness $\mathrm{SDO}_i$, other estimators of univariate location and scale can also be used, such as M-estimators of location and scale. The IF is bounded when using M-estimators of location and scale with bounded and monotone ψ and ρ functions.

To compute the Stahel-Donoho estimator, the number of directions $a$ has to be restricted to a finite set. These can be obtained by subsampling: take the directions orthogonal to hyperplanes spanned by random subsamples of size $p$. This yields an affine equivariant algorithm.

The MCD estimator

The MCD estimator (Rousseeuw, 1984) is an often used high-breakdown, affine equivariant estimator of location and scatter:

Minimum Covariance Determinant estimator

For fixed $h$ with $\lfloor (n+p+1)/2 \rfloor \leqslant h \leqslant n$:

1. $\hat\mu_0$ is the mean of the $h$ observations for which the determinant of the sample covariance matrix is minimal;
2. $\hat\Sigma_0$ is that covariance matrix (multiplied by a consistency factor).

The MCD estimator can only be computed when $h > p$; otherwise the covariance matrix of any $h$-subset will be singular. This condition is certainly satisfied when $n \geqslant 2p$. It is however recommended that $n \geqslant 5p$.

Robustness of the MCD

The influence function is bounded. The value of $h$ determines the breakdown value: at samples in general position,

$$\epsilon^*_n = \min\left(\frac{n-h+1}{n},\ \frac{h-p}{n}\right).$$

The maximal breakdown value is achieved by taking $h = \lfloor (n+p+1)/2 \rfloor$. Typical choices are $\alpha = h/n = 0.5$ or $\alpha = 0.75$, yielding breakdown values of 50% and 25% respectively.
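The formula is simple arithmetic; a quick Python check (illustrative, with $n$ and $p$ chosen arbitrarily):

```python
def mcd_breakdown(n, p, h):
    """Finite-sample breakdown value of the MCD at data in general
    position: min((n - h + 1)/n, (h - p)/n)."""
    return min(n - h + 1, h - p) / n

n, p = 100, 5
h_max = (n + p + 1) // 2                 # h = floor((n + p + 1)/2)
print(mcd_breakdown(n, p, h_max))        # 0.48, close to the 50% bound
print(mcd_breakdown(n, p, 75))           # 0.26, roughly 25% for alpha = 0.75
```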


Efficiency of the MCD

The MCD is asymptotically normal, but it has a low efficiency, which increases with increasing $\alpha$. For example, with $\alpha = 0.5$ the asymptotic relative efficiency of the diagonal elements of the MCD scatter matrix with respect to the sample covariance matrix, at the normal model, is only 6% when $p = 2$, and 20.5% when $p = 10$. With $\alpha = 0.75$ the relative efficiencies are 26.2% for $p = 2$ and 45.9% for $p = 10$.

The efficiency of the MCD can be increased by applying a reweighting step. First, compute the robust distances

$$\mathrm{RD}_i = \sqrt{(x_i - \hat\mu_0)'\,\hat\Sigma_0^{-1}(x_i - \hat\mu_0)}$$


Reweighted MCD

Then put

$$w_i = \begin{cases} 1 & \text{if } \mathrm{RD}_i \leqslant \sqrt{\chi^2_{p,0.975}} \\ 0 & \text{otherwise.} \end{cases}$$

Reweighted MCD (RMCD)

$$\hat\mu_{\mathrm{RMCD}} = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i} \qquad \hat\Sigma_{\mathrm{RMCD}} = \frac{1}{\sum_{i=1}^{n} w_i - 1} \sum_{i=1}^{n} w_i (x_i - \hat\mu_{\mathrm{RMCD}})(x_i - \hat\mu_{\mathrm{RMCD}})'$$

The reweighting step does not decrease the breakdown value, and it increases the efficiency: with $\alpha = 0.5$ the efficiency goes up to 45.5% for $p = 2$ and 82% for $p = 10$.
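One reweighting step can be sketched as follows (illustrative Python, not the slides' R code; `mu0` and `Sigma0` stand for the raw robust estimates, here replaced by the true parameters of a simulated sample):

```python
import numpy as np
from scipy import stats

def reweight(X, mu0, Sigma0):
    """One RMCD-style reweighting step from initial robust estimates
    (mu0, Sigma0), with hard 0/1 weights at the chi-squared cutoff."""
    n, p = X.shape
    D = X - mu0
    rd2 = np.einsum('ij,jk,ik->i', D, np.linalg.inv(Sigma0), D)
    w = (rd2 <= stats.chi2.ppf(0.975, df=p)).astype(float)
    mu = (w[:, None] * X).sum(axis=0) / w.sum()
    D = X - mu
    Sigma = (w[:, None, None] * np.einsum('ij,ik->ijk', D, D)).sum(axis=0) / (w.sum() - 1)
    return mu, Sigma, w

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 2))
mu, Sigma, w = reweight(X, np.zeros(2), np.eye(2))   # ~97.5% of weights equal 1
```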


MCD example with α = 0.5

> library(rrcov)
> resultMCD = CovMcd(x = log(Animals)); resultMCD
Robust Estimate of Location:
 body brain
3.029 4.276
Robust Estimate of Covariance:
      body  brain
body  18.86 14.16
brain 14.16 11.03
> covMCD = getCov(resultMCD)
> cov2cor(covMCD)
      body      brain
body  1.0000000 0.9816633
brain 0.9816633 1.0000000

MCD example with α = 0.5

We can also use the function covMcd from the robustbase package:

> library(robustbase)
> resultMCD = covMcd(x = log(Animals)); resultMCD
Robust Estimate of Location:
 body brain
3.029 4.276
Robust Estimate of Covariance:
      body  brain
body  18.86 14.16
brain 14.16 11.03
> resultMCD$cor
      body      brain
body  1.0000000 0.9816633
brain 0.9816633 1.0000000

Outlier detection

For outlier detection, recompute the robust distances (this time based on the reweighted MCD):

$$\mathrm{RD}_i = \sqrt{(x_i - \hat\mu_{\mathrm{RMCD}})'\,\hat\Sigma_{\mathrm{RMCD}}^{-1}(x_i - \hat\mu_{\mathrm{RMCD}})}$$

Flag observation $x_i$ as an outlier if $\mathrm{RD}_i > \sqrt{\chi^2_{p,0.975}}$. This is equivalent to flagging the observations that do not belong to the robust tolerance ellipsoid:

Robust tolerance ellipsoid

$$\Big\{x;\ \mathrm{RD}(x) \leqslant \sqrt{\chi^2_{p,0.975}}\Big\}$$


Outlier detection

The MCD ellipse correctly flags the outliers in the Animals data:

[Figure: left, classical and MCD tolerance ellipses on the log(brain) versus log(body) scatterplot; right, robust distances based on MCD versus index]

The MCD-based robust distances do flag all the outliers!

Distance-distance plot

In dimensions $p > 2$ we cannot draw a scatterplot or a tolerance ellipsoid. To explore the differences between a classical and a robust analysis, we can draw a distance-distance plot, which plots the points $(\mathrm{MD}_i, \mathrm{RD}_i)$:

[Figure: distance-distance plot of robust distance versus Mahalanobis distance for the Animals data; observations 6, 14, 16, 17 and 26 stand out]

The univariate MCD estimator

In the special case of univariate data ($p = 1$) the MCD becomes:

1. $\hat\mu_0$ is the mean of the $h$ observations for which the classical standard deviation is minimal;
2. $\hat\sigma_0$ is that standard deviation (multiplied by a consistency factor).

Note that the optimal $h$-subset has to be contiguous, i.e. it must consist of successive ordered observations (no 'gaps'). So, in order to compute the univariate MCD we only have to loop over the $n - h + 1$ contiguous subsets. If we use an update formula for the variance, the time complexity is only $O(n \log n)$.

However, as an estimator of univariate location and scale the MCD is outperformed by other methods (in terms of robustness and efficiency). Therefore the MCD is mainly useful for higher-dimensional data.
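The contiguous-subset loop is short to write down. A naive Python sketch (illustrative; the $O(n)$ variance update and the consistency factor are omitted, and the toy data are an assumption):

```python
import numpy as np

def univariate_mcd(x, h):
    """Raw univariate MCD: loop over the n - h + 1 contiguous subsets
    of the sorted sample and keep the one with smallest standard
    deviation (no consistency factor in this sketch)."""
    xs = np.sort(np.asarray(x, dtype=float))
    n = len(xs)
    best = None
    for start in range(n - h + 1):
        sub = xs[start:start + h]
        s = sub.std(ddof=1)
        if best is None or s < best[1]:
            best = (sub.mean(), s)
    return best  # (location, raw scale)

x = np.concatenate([np.arange(20) / 10.0, [50.0, 60.0]])  # two far outliers
mu, s = univariate_mcd(x, h=12)
# the optimal contiguous subset avoids the outliers entirely
```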


Computation of the MCD

Exact algorithm: consider all $h$-subsets, compute the mean and covariance matrix of each, and retain the subset with the smallest covariance determinant. This is infeasible for large $n$ or $p$...

Approximate algorithms consider a selected set of $h$-subsets, starting from random subsets of size $p + 1$. The most often used algorithm is FAST-MCD (Rousseeuw and Van Driessen, 1999). A faster, but not fully affine equivariant, alternative is DetMCD (Hubert et al., 2012). We will describe this later.

FAST-MCD

Computation of the raw estimates for small to moderate data sizes ($n \leqslant 600$):

1. For $m = 1$ to 500:
   - Draw a random subset of size $p + 1$ and compute its mean and covariance matrix.
   - Apply a C-step:
     1. Compute robust distances $\mathrm{RD}_i$ based on the most recent mean and covariance estimate.
     2. Take the $h$ observations with smallest robust distance.
     3. Compute the mean and covariance matrix of this $h$-subset.
   - Apply a second C-step.
2. Retain the 10 $h$-subsets with smallest covariance determinant.
3. Apply C-steps on these subsets until convergence.
4. Retain the $h$-subset with smallest covariance determinant.


FAST-MCD

C-steps always decrease the determinant of the covariance matrix! As there are only finitely many $h$-subsets, convergence to a (local) minimum is guaranteed; the algorithm is, however, not guaranteed to yield the global minimum. The fixed number of initial $(p+1)$-subsets (500) is a compromise between robustness and computation time. For larger data sets ($n > 600$), the algorithm randomly splits the data into disjoint subsets; C-steps are first applied within the subsets, and next in the full data set.
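A single C-step is compact to express. An illustrative Python sketch (not the slides' code; the contaminated sample and the choice $h = 53$ are assumptions):

```python
import numpy as np

def c_step(X, mu, Sigma, h):
    """One C-step: keep the h points with smallest robust distance
    under (mu, Sigma) and recompute their mean and covariance."""
    D = X - mu
    d2 = np.einsum('ij,jk,ik->i', D, np.linalg.inv(Sigma), D)
    keep = np.argsort(d2)[:h]             # h smallest robust distances
    sub = X[keep]
    return sub.mean(axis=0), np.cov(sub, rowvar=False)

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(size=(90, 2)),
               rng.normal(5, 0.5, size=(10, 2))])   # 10% outliers at (5, 5)
mu, Sigma = X.mean(axis=0), np.cov(X, rowvar=False)
for _ in range(10):                        # iterate a few C-steps
    mu, Sigma = c_step(X, mu, Sigma, h=53)
# the covariance determinant shrinks and the estimate moves to the clean majority
```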


FAST-MCD: Philips example

Data from Philips Mecoma about the production of thin metal plates, with $n = 677$ and $p = 9$ characteristics, for statistical process control. The classical Mahalanobis distances (and their chi-squared QQ-plot) indicate a few outliers:

[Figure: classical Mahalanobis distance versus index]

FAST-MCD: Philips example

The robust distances from FAST-MCD give a different picture:

[Figure: robust distance versus index]

The process changed after the first 100 points, and between indices 491–565 it was out of control.

FAST-MCD: Philips example

The distance-distance plot also highlights the out-of-control period:

[Figure: distance-distance plot of robust distance versus Mahalanobis distance]

FAST-MCD: Digital sky survey

The Digital Palomar Sky Survey (DPOSS) contains data about celestial objects (light sources). After removing physically impossible data, we have $n = 132402$ objects with $p = 6$ variables. The classical Mahalanobis distances (and their chi-squared QQ-plot) look homogeneous:

[Figure: classical Mahalanobis distance versus index]

FAST-MCD: Digital sky survey

The robust distances from FAST-MCD give a different picture:

[Figure: robust distance versus index]

FAST-MCD: Digital sky survey

The DD plot makes a clear distinction between stars (lower group) and galaxies:

[Figure: distance-distance plot of robust distance versus Mahalanobis distance, showing two separated groups]

Software for MCD

Implementations of the FAST-MCD algorithm are widely available:

- R: as the function CovMcd in the package rrcov, and as the function covMcd in the package robustbase
- S-PLUS: as the built-in function cov.mcd
- Matlab: as the function mcdcov in the toolbox LIBRA (wis.kuleuven.be/stat/robust), and in the PLS toolbox of Eigenvector Research (www.eigenvector.com)
- SAS: in SAS/IML Version 7+, and in PROC ROBUSTREG in SAS Version 9+
- STATA: see http://ideas.repec.org/a/tsj/stataj/v10y2010i2p259-266.html

Note that some functions use α = 0.5 as the default, yielding a breakdown value of 50%, whereas other implementations default to α = 0.75.

The MVE estimator

The MVE (Rousseeuw, 1985) is one of the oldest affine equivariant robust covariance estimators with a positive breakdown value.

Minimum Volume Ellipsoid

For fixed $h$ with $\lfloor (n+p+1)/2 \rfloor \leqslant h \leqslant n$,

$$(\hat\mu, \hat\Sigma) = \operatorname*{argmin}_{\mu,\,\Sigma} |\Sigma|$$

over all $\mu \in \mathbb{R}^p$ and symmetric positive definite $\Sigma$ that satisfy

$$\#\Big\{i;\ d_i = \sqrt{(x_i - \mu)'\,\Sigma^{-1}(x_i - \mu)} \leqslant c\Big\} \geqslant h.$$

The estimator is thus defined by the ellipsoid of minimal volume that contains (at least) $h$ observations. Its breakdown value is optimal (50%) when $h = \lfloor (n+p+1)/2 \rfloor$, but the MVE lacks asymptotic normality.


S-estimators of location and scatter

Remember the definition of an M-estimator $\hat\sigma_M$ of univariate scale:

$$\frac{1}{n}\sum_{i=1}^{n} \rho\left(\frac{x_i}{\hat\sigma_M}\right) = \delta$$

S-estimator of location and scatter (Rousseeuw and Leroy, 1987)

$$(\hat\mu, \hat\Sigma) = \operatorname*{argmin}_{\mu,\,\Sigma} |\Sigma|$$

over all $\mu \in \mathbb{R}^p$ and symmetric positive definite $\Sigma$ that satisfy

$$\frac{1}{n}\sum_{i=1}^{n} \rho(d_i) = \delta$$

with $d_i = \sqrt{(x_i - \mu)'\,\Sigma^{-1}(x_i - \mu)}$ and $\rho$ a smooth bounded ρ-function.

Efficiency of S-estimators

To obtain (Fisher-)consistency at normal distributions we set

$$\delta = E_{N_p(0,I)}\big[\rho(\|X\|)\big].$$

S-estimators are asymptotically normal. Their efficiency at the Gaussian model is somewhat better than that of the RMCD, especially in higher dimensions. For example, the diagonal element of the bisquare S scatter matrix with 50% breakdown value has an asymptotic relative efficiency of 50.2% for $p = 2$ and 92% for $p = 10$ (RMCD: 45.5% for $p = 2$ and 82% for $p = 10$). S-estimators are smoothed versions of the MVE, which corresponds to a ρ-function that only takes the values 0 and 1.

Robustness of S-estimators

If the data are in general position, the breakdown value of both the location and the scatter estimator is

$$\varepsilon^* = \min\left(\frac{\delta}{\rho(\infty)},\ 1 - \frac{\delta}{\rho(\infty)}\right).$$

The tuning parameter in $\rho_c$ thus determines the robustness as well as the efficiency. To obtain a bounded influence function, it is required that $\psi'(x)$ and $\psi(x)/x$ are bounded and continuous. The influence function of S-estimators can then be seen as a smoothed version of the MCD's influence function. To compute an S-estimator, the FAST-S algorithm can be used (Salibian-Barrera and Yohai, 2006); it is similar to FAST-MCD.

S-estimators: example

> resultS = CovSest(log(Animals)); resultS
Call: CovSest(x = log(Animals))
Method: S estimation: S-FAST
Robust Estimate of Location:
[1] 3.271 4.345
Robust Estimate of Covariance:
      body  brain
body  22.72 17.24
brain 17.24 13.36
> covS = getCov(resultS)
> cov2cor(covS)
      body      brain
body  1.0000000 0.9898186
brain 0.9898186 1.0000000

S-estimators: example

> plot(resultS, which="tolEllipse", classic=TRUE)

[Figure: classical and S tolerance ellipses on the log(brain) versus log(body) scatterplot]

MM-estimators of location and scatter

MM-estimators combine high robustness with high efficiency. They are based on two ρ-functions $\rho_0$ and $\rho_1$: the first is chosen to obtain a high breakdown value, the second to achieve a high efficiency. To construct an MM-estimator, note that a scatter matrix can be separated into a scale estimate and a shape matrix: put $\Gamma := |\Sigma|^{-1/p}\,\Sigma$, so that $|\Gamma| = 1$ and $\Sigma = |\Sigma|^{1/p}\,\Gamma$. We call $|\Sigma|^{1/2p}$ the scale estimate, and $\Gamma$ the shape matrix.
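The scale/shape decomposition is easy to verify numerically (illustrative Python; the example matrix is an arbitrary positive definite choice):

```python
import numpy as np

def scale_shape(Sigma):
    """Split a scatter matrix into scale sigma = |Sigma|^(1/2p) and
    shape Gamma = |Sigma|^(-1/p) Sigma, so that det(Gamma) = 1."""
    p = Sigma.shape[0]
    det = np.linalg.det(Sigma)
    sigma = det ** (1 / (2 * p))
    Gamma = Sigma / det ** (1 / p)
    return sigma, Gamma

Sigma = np.array([[4.0, 1.0], [1.0, 2.0]])
sigma, Gamma = scale_shape(Sigma)
# det(Gamma) == 1 and sigma**2 * Gamma recovers Sigma
```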


MM-estimators of location and scatter

MM-estimator of location and scatter (Tatsuoka and Tyler, 2000)

1. Let (µ̃, Σ̃) be an S-estimator with rho function ρ0. Denote σ̂² = |Σ̃|^(1/p).

2. The MM-estimator for location and shape (µ̂, Γ̂) minimizes

   (1/n) Σ_{i=1}^n ρ1( √((xi − µ)′ Γ^(−1) (xi − µ)) / σ̂ )    (3)

among all µ and all symmetric positive definite Γ with |Γ| = 1. The MM-estimator for the covariance matrix is then Σ̂ = σ̂² Γ̂.


MM-estimators of location and scatter

The location and shape estimates inherit the breakdown value of the auxiliary scale, so one typically chooses an S-estimator with 50% breakdown value; for a bisquare ρ0, c = 1.547 yields a 50% breakdown value. The influence functions (and thus the asymptotic variances) of MM-estimators for location and scatter equal those of M-estimators of location and scatter that use the function ρ1; for a bisquare ρ1, c = 4.685 yields 95% efficiency at the normal model. However, MM-estimators with high efficiency are less robust: in particular, they tend to give too much weight to ‘fairly nearby’ outliers, unlike methods with a ‘hard’ objective function such as MCD and MVE. The FAST-MM algorithm starts with FAST-S and then applies IRLS steps to minimize (3).


MM-estimators: example

> resultMM=CovMMest(log(Animals)); resultMM
Call: CovMMest(x = log(Animals))
Method: MM-estimates
Robust Estimate of Location:
[1] 3.086 4.427
Robust Estimate of Covariance:
       body  brain
body  12.036  9.021
brain  9.021  7.272
> covMM=getCov(resultMM)
> cov2cor(covMM)
      body      brain
body  1.0000000 0.9642449
brain 0.9642449 1.0000000


MM-estimators: example

[Figure: classical and MM tolerance ellipses for log(brain) versus log(body).]


Multivariate location and scatter Some non affine equivariant estimators

Some non affine equivariant estimators

            affine equivariant              non affine equivariant
  Low BV    classical mean & covariance,
            M-estimators
  High BV   Stahel-Donoho estimator,        coordinatewise median,
            MCD, MVE,                       spatial median, sign covariance,
            S-estimators, MM-estimators     OGK, DetMCD


The coordinatewise median

Coordinatewise median:

µ̂ = ( med_{i=1,…,n} xi1 , med_{i=1,…,n} xi2 , … , med_{i=1,…,n} xip )′ .

Easy to compute and to interpret.
50% breakdown value!
Not affine equivariant, and not even orthogonally equivariant.
µ̂ does not have to lie in the convex hull of the sample when p ≥ 3. As an example, consider the set {(1, 0, 0)′, (0, 1, 0)′, (0, 0, 1)′}, whose convex hull does not contain the coordinatewise median (0, 0, 0)′.
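The convex-hull example can be reproduced in a couple of lines (a NumPy sketch, not part of the slides):

```python
import numpy as np

# The three unit vectors from the slide's example.
X = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])

# Coordinatewise median: the median of each column separately.
mu_hat = np.median(X, axis=0)
print(mu_hat)  # [0. 0. 0.], which lies outside the plane x + y + z = 1
```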


The spatial median

The spatial median, also known as the L1 location estimator, is defined as

µ̂ = argmin_µ Σ_{i=1}^n ‖xi − µ‖ .

This is equivalent to

Σ_{i=1}^n (xi − µ̂) / ‖xi − µ̂‖ = 0 .    (4)

50% breakdown value, bounded influence function.
Not affine equivariant, but orthogonally equivariant.
Computation: equation (4) corresponds to equation (1) of M-estimators, with W1(t) = 1/√t. We can thus use the iterative algorithm with Σ = I fixed. Other algorithms are discussed in Fritz et al. (2012).
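The iterative algorithm with Σ = I fixed amounts to a Weiszfeld-type fixed-point iteration. A NumPy sketch (illustrative; spatial_median is our name and the stopping rule is ad hoc):

```python
import numpy as np

def spatial_median(X, n_iter=500, eps=1e-12):
    """Weiszfeld-type fixed-point iteration for the L1 location estimator."""
    mu = np.median(X, axis=0)                       # robust starting value
    for _ in range(n_iter):
        d = np.maximum(np.linalg.norm(X - mu, axis=1), eps)
        w = 1.0 / d                                 # weight 1/||xi - mu||
        mu_new = (w[:, None] * X).sum(axis=0) / w.sum()
        if np.linalg.norm(mu_new - mu) < 1e-12:
            break
        mu = mu_new
    return mu

# By symmetry, the spatial median of the three unit vectors is their centroid.
m = spatial_median(np.eye(3))
print(m)   # approximately [1/3, 1/3, 1/3]
```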


The spatial median

Geometric interpretation: take a point µ in Rp and project all observations onto a sphere around µ. If the mean of these projections equals µ, then µ is the spatial median.


When projecting all data points on a sphere around the star, the mean of these projections (depicted as crosses) does not equal the center of the sphere. For the triangle it does, so it is the spatial median. Observation 11 only has a small effect.


The spatial sign covariance matrix

The spatial sign covariance matrix (SSCM) is the classical covariance matrix computed on the projected data points (Visuri et al., 2000).

Spatial sign covariance estimator

Σ̂ = (1/(n − 1)) Σ_{i=1}^n [ (xi − µ̂)/‖xi − µ̂‖ ] [ (xi − µ̂)/‖xi − µ̂‖ ]′

with µ̂ the spatial median.

50% breakdown value, bounded influence function.
Not affine equivariant, only orthogonally equivariant.
The resulting scatter matrix can give a very poor fit, even if the data are uncontaminated: in that case its eigenvectors are consistent, but its eigenvalues are not.
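A NumPy sketch of the SSCM (illustrative; for brevity we center at the coordinatewise median here, whereas the definition above uses the spatial median):

```python
import numpy as np

def sscm(X, mu):
    """Spatial sign covariance matrix around a given center mu."""
    U = X - mu
    K = U / np.linalg.norm(U, axis=1, keepdims=True)   # project onto unit sphere
    return K.T @ K / (len(X) - 1)

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
S = sscm(X, np.median(X, axis=0))
print(np.allclose(S, S.T))     # symmetric
print(round(np.trace(S), 3))   # n/(n-1): each projected point has norm 1
```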


The orthogonalized Gnanadesikan-Kettenring estimator

Introduced by Maronna and Zamar (2002).
Fast to compute, also in rather high dimensions.
Not affine or orthogonally equivariant, only scale equivariant.
It is inspired by the fact that the classical correlation between two variables Yj and Yk satisfies

Cor(Yj, Yk) = [ Var(z(Yj) + z(Yk)) − Var(z(Yj) − z(Yk)) ] / [ Var(z(Yj) + z(Yk)) + Var(z(Yj) − z(Yk)) ]

where z(Yj) = (Yj − ave(Yj)) / Std(Yj) contains the classical z-scores of Yj. Gnanadesikan and Kettenring (1972) proposed to compute a robust correlation between two variables by replacing the classical z-scores and variances on the right-hand side by robust versions. This works, but the resulting correlation matrix need not be PSD...
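The identity is easy to verify numerically (a NumPy sketch on simulated data; ddof = 1 is used consistently for all variances):

```python
import numpy as np

rng = np.random.default_rng(2)
y1 = rng.normal(size=1000)
y2 = 0.8 * y1 + 0.6 * rng.normal(size=1000)

def z(y):
    """Classical z-scores."""
    return (y - y.mean()) / y.std(ddof=1)

vp = np.var(z(y1) + z(y2), ddof=1)   # Var of the sum of z-scores: 2 + 2r
vm = np.var(z(y1) - z(y2), ddof=1)   # Var of the difference: 2 - 2r
gk = (vp - vm) / (vp + vm)           # recovers r exactly
print(np.isclose(gk, np.corrcoef(y1, y2)[0, 1]))   # True
```

Replacing z(·) and Var(·) by robust counterparts gives the Gnanadesikan-Kettenring robust correlation.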


The OGK estimator: definition

OGK = orthogonalized Gnanadesikan-Kettenring estimator:

1. Let m(·) and s(·) be robust univariate estimators of location and scale.

2. Construct yi = D^(−1) xi for i = 1, …, n, with D = diag(s(X1), …, s(Xp)).

3. Compute the ‘correlation matrix’ U of the variables of Y = (Y1, …, Yp), given by

   ujk = ( s(Yj + Yk)² − s(Yj − Yk)² ) / ( s(Yj + Yk)² + s(Yj − Yk)² ) .

   This matrix is symmetric but not necessarily PSD.

4. Put the eigenvectors of U as columns in a matrix E and:
   (a) project the data on these eigenvectors, i.e. V = Y E;
   (b) compute ‘robust variances’ of V = (V1, …, Vp), i.e. Λ = diag(s²(V1), …, s²(Vp));
   (c) set the p × 1 vector µ̂(Y) = E m where m = (m(V1), …, m(Vp))′, and compute the positive definite matrix Σ̂(Y) = E Λ E′.

5. Transform back to X, i.e. µ̂(X) = D µ̂(Y) and Σ̂ = D Σ̂(Y) D′.
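A compact sketch of these steps, taking m = median and s = the MAD scaled for consistency at the normal (illustrative Python; actual OGK implementations offer smoother scale estimators and a reweighting step):

```python
import numpy as np

def mad(x):
    """Median absolute deviation, scaled to be consistent at the normal."""
    return 1.4826 * np.median(np.abs(x - np.median(x)))

def ogk(X):
    n, p = X.shape
    D = np.diag([mad(X[:, j]) for j in range(p)])
    Y = X @ np.linalg.inv(D)                        # step 2: y_i = D^-1 x_i
    U = np.eye(p)                                   # step 3: robust 'correlations'
    for j in range(p):
        for k in range(j + 1, p):
            sp = mad(Y[:, j] + Y[:, k]) ** 2
            sm = mad(Y[:, j] - Y[:, k]) ** 2
            U[j, k] = U[k, j] = (sp - sm) / (sp + sm)
    _, E = np.linalg.eigh(U)                        # step 4: eigenvectors of U
    V = Y @ E                                       # (a) project onto them
    lam = np.array([mad(V[:, j]) ** 2 for j in range(p)])   # (b) robust variances
    mu_Y = E @ np.array([np.median(V[:, j]) for j in range(p)])   # (c)
    sigma_Y = E @ np.diag(lam) @ E.T                # PSD by construction
    return D @ mu_Y, D @ sigma_Y @ D.T              # step 5: back-transform

rng = np.random.default_rng(3)
X = rng.multivariate_normal([0, 0], [[2.0, 1.0], [1.0, 2.0]], size=1000)
mu, Sigma = ogk(X)
print(np.all(np.linalg.eigvalsh(Sigma) > 0))        # True: positive definite
```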


The OGK estimator: properties

Step 4 of the method (the ‘orthogonalization’) uses the fact that the eigenvalues of the covariance matrix equal the variances of the data projected on the eigenvectors. Here the eigenvalues are estimated by a robust univariate scale estimator. As these estimates are nonnegative, the new scatter matrix EΛE′ is positive semidefinite. When high-breakdown estimators are chosen for m and s, the breakdown value of the OGK estimator is 50%. A reweighting step can also be added, which increases the efficiency. The proposed cutoff for the robust distances is

c = ( χ²_{p,0.9} / χ²_{p,0.5} ) med(d1, …, dn)

with di the robust distances from the raw OGK estimates. The reweighted estimators are ‘approximately’ affine equivariant.


The DetMCD algorithm

Deterministic algorithm for MCD (Hubert et al., 2012). Overall idea: compute several ‘promising’ h-subsets, based on

◮ transformations of variables
◮ easy-to-compute robust estimators of location and scatter.

Apply C-steps until convergence. This yields a fast algorithm which is at least as robust as FAST-MCD, but not fully affine equivariant.
Preprocessing: standardize X by subtracting the columnwise median and dividing by the columnwise Qn scale estimate. This makes the final estimates location and scale equivariant, and yields the standardized dataset Z with rows z′i and columns Zj.


The DetMCD algorithm

Construct six initial estimates µ̂k(Z) and Σ̂k(Z) for center and scatter:

◮ Obtain six preliminary estimates Sk for the covariance/correlation matrix of Z.
◮ Compute the eigenvectors E of Sk and put B = Z E.
◮ Estimate the covariance of Z by Σ̂k(Z) = E L E′ with L = diag( Qn(B1)², …, Qn(Bp)² ).
◮ Estimate the center: µ̂k(Z) = Σ̂k^(1/2) med( Z Σ̂k^(−1/2) ).

For each initial estimate do:

◮ Compute the statistical distances dik = d(zi, µ̂k(Z), Σ̂k(Z)).
◮ Initial h0-subset: the h0 = ⌈n/2⌉ observations with smallest dik.
◮ Compute the statistical distances d∗ik based on these h0 observations.
◮ Take the h points with smallest d∗ik and apply C-steps until convergence.

Retain the h-subset with smallest covariance determinant.


DetMCD: Preliminary estimates

1. Take the hyperbolic tangent (‘sigmoid’) of the standardized data: Yj = tanh(Zj) for all j = 1, …, p, and take the Pearson correlation matrix of Y: S1 = corr(Y).

2. Consider the Spearman correlation matrix: S2 = corr(R), where Rj is the rank of Zj.

3. Compute normal scores Tj from the ranks Rj:

   Tj = Φ^(−1)( (Rj − 1/3) / (n + 1/3) )

   where Φ(·) is the standard normal cdf, and put S3 = corr(T).


DetMCD: Preliminary estimates

4. Related to the spatial sign covariance matrix: define ki = zi / ‖zi‖ and let

   S4 = (1/n) Σ_{i=1}^n ki ki′ .

   (Here the center is estimated by the coordinatewise median instead of the spatial median.)

5. First step of the BACON algorithm (Billor et al., 2000): consider the ⌈n/2⌉ standardized observations zi with smallest norm, and compute their mean and covariance matrix.

6. The raw OGK estimator of location and scatter.
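Estimates 4 and 5 are equally short to sketch (illustrative Python; Z stands for the already-standardized data, here simulated, and the coordinatewise-median centering is omitted for brevity):

```python
import numpy as np

rng = np.random.default_rng(5)
Z = rng.normal(size=(200, 3))   # stand-in for the standardized data

# Estimate 4: spatial-sign type matrix S4 with k_i = z_i / ||z_i||.
K = Z / np.linalg.norm(Z, axis=1, keepdims=True)
S4 = K.T @ K / len(Z)
print(np.isclose(np.trace(S4), 1.0))   # True: each k_i has unit norm

# Estimate 5: first step of BACON -- mean and covariance of the
# ceil(n/2) observations with smallest norm.
h0 = int(np.ceil(len(Z) / 2))
idx = np.argsort(np.linalg.norm(Z, axis=1))[:h0]
mu5 = Z[idx].mean(axis=0)
S5 = np.cov(Z[idx], rowvar=False)
print(S5.shape)                        # (3, 3)
```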


DetMCD: Properties

In moderate dimensions (say, p ≤ 10): faster than FAST-MCD and equally robust.
In higher dimensions: faster than FAST-MCD and more robust, especially when there is much contamination.
Deterministic: does not depend on any random selection.
Permutation invariant.
Nearly affine equivariant.
The initial estimates do not yet depend on the value of h which determines the breakdown value. This makes it easy to compute DetMCD for several h-values, and to see whether at some h there is a substantial change in the objective function or the estimates (“monitoring”).


When to use DetMCD

When should we use FAST-MCD and when DetMCD? Recommendation: when p ≤ 10, run FAST-MCD. When p is larger than this it becomes harder or even infeasible to draw enough initial subsets, and then it is better to run DetMCD. DetMCD is useful as a building block for multivariate analysis (multivariate regression, exponential smoothing, calibration, ...).


Multivariate location and scatter Software

Robust Covariance Estimation: R

FAST-MCD: the function CovMcd in the package rrcov, and the function covMcd in the package robustbase. MVE, FAST-S: the package rrcov contains implementations of the MVE (CovMve) and S-estimators (CovSest), as well as several other robust estimators of location and scatter (MM-estimators, Stahel-Donoho, OGK). DetMCD: use the function covMcd in the package robustbase with optional argument nsamp = "deterministic".


Robust Covariance Estimation: Matlab

FAST-MCD: the function mcdcov in the toolbox LIBRA (wis.kuleuven.be/stat/robust), and the PLS toolbox of Eigenvector Research (www.eigenvector.com). Default: α = 0.75, yielding a breakdown value of 25%. Also the FSDA toolbox, available at www.riani.it/MATLAB.htm, contains implementations of FastMCD, S-, and MM-estimators. DetMCD: available in LIBRA. It has OGK as a subroutine.
