

SLIDE 1

COMPSTAT 2010

Mia Hubert, August 24, 2010

Fast and Robust Classifiers Adjusted for Skewness

Mia Hubert and Stephan Van der Veeken

Katholieke Universiteit Leuven, Department of Mathematics Mia.Hubert@wis.kuleuven.be

SLIDE 2

Outline

■ Review of some classifiers
  ◆ normally distributed data
  ◆ depth-based approaches
■ New approaches based on adjusted outlyingness
■ Simulation results
■ A real data set
■ Conclusions and outlook

SLIDE 10

Some classifiers

Setting:

■ Observations are sampled from k different classes X_j, j = 1, ..., k.
■ Data belonging to group X_j are denoted by x_i^j (i = 1, ..., n_j).
■ The dimension of the data space is p, with p ≪ n_j.
■ Outliers are possible!

Classification: construct a rule to classify a new observation into one of the k populations.

SLIDE 11

Some classifiers

Normally distributed data:

■ Classical linear discriminant analysis, when the covariance matrices of the groups are equal
■ Classical quadratic discriminant analysis (CQDA)

Both are based on the classical mean and covariance matrices. Robust versions (RLDA, RQDA) are obtained by plugging in robust covariance matrices, such as the MCD-estimator or S-estimators.

SLIDE 13

Depth based classifiers

Proposed by Ghosh and Chaudhuri (2005).

■ Consider a depth function (Tukey depth, simplicial depth, ...).
■ For a new observation, compute its depth with respect to each group.
■ Assign the new observation to the group in which it attains the maximal depth.

SLIDE 14

Depth based classifiers

Advantages:

■ does not rely on normality
■ optimality results at normal data
■ robust towards outliers (the degree of robustness depends on the depth function)
■ can handle multigroup classification, not only two-group problems

Disadvantages:

■ computation time
■ ties: observations outside the convex hull of all groups have zero depth with respect to each group
■ adaptations are necessary for unequal sample sizes; Ghosh and Chaudhuri propose methods that rely on kernel density estimates

SLIDE 16

New depth based classifiers

New proposals are based on adjusted outlyingness. First consider univariate data.

The standard boxplot has whiskers at the smallest and the largest data points that do not exceed

[Q1 − 1.5 IQR, Q3 + 1.5 IQR].

The adjusted boxplot has whiskers that end at the smallest and the largest data points that do not exceed

[Q1 − 1.5 e^(−4 MC) IQR, Q3 + 1.5 e^(3 MC) IQR]

with the medcouple

MC(X) = med_{x_i < m < x_j} h(x_i, x_j),

where m is the median of X and

h(x_i, x_j) = ((x_j − m) − (m − x_i)) / (x_j − x_i).

(Hubert and Vandervieren, CSDA, 2008)
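As a concrete illustration, the medcouple and the adjusted whiskers can be sketched in Python. This is a naive O(n²) rendering of the formulas above, not the fast O(n log n) algorithm; the mirrored exponents for MC < 0 and the simple handling of points tied with the median are assumptions of this sketch.

```python
import numpy as np

def medcouple(x):
    """Medcouple MC(X): median of h(x_i, x_j) over all pairs x_i < m < x_j,
    with m the sample median (naive O(n^2) version)."""
    x = np.sort(np.asarray(x, dtype=float))
    m = np.median(x)
    lower = x[x < m]          # points tied with the median are dropped here;
    upper = x[x > m]          # the exact algorithm treats them more carefully
    if lower.size == 0 or upper.size == 0:
        return 0.0
    xi, xj = lower[:, None], upper[None, :]
    h = ((xj - m) - (m - xi)) / (xj - xi)
    return float(np.median(h))

def adjusted_whiskers(x):
    """Fence bounds of the adjusted boxplot (Hubert & Vandervieren, 2008)."""
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    mc = medcouple(x)
    if mc >= 0:   # right-skewed or symmetric: the formula on this slide
        return q1 - 1.5 * np.exp(-4 * mc) * iqr, q3 + 1.5 * np.exp(3 * mc) * iqr
    # left-skewed: exponents mirrored (assumption, taken from the paper)
    return q1 - 1.5 * np.exp(-3 * mc) * iqr, q3 + 1.5 * np.exp(4 * mc) * iqr
```

For symmetric data MC ≈ 0 and the fences reduce to the standard boxplot fences; for right-skewed data the upper fence is pushed out and the lower fence pulled in.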

SLIDE 17

Medcouple - A robust measure of skewness

■ Robustness:
  ◆ bounded influence function: adding a small probability mass at a certain point has a bounded influence on the estimate
  ◆ high breakdown point: ε∗(MC) = 25%, i.e. 25% of the data needs to be replaced to make the estimator break down
■ Computation: a fast O(n log n) algorithm is available

SLIDE 19

Adjusted boxplot

Example: Length of stay in hospital

[Figure: comparison of the standard and adjusted boxplot of the length-of-stay data]

SLIDE 20

Adjusted outlyingness - univariate data

For univariate data, the adjusted outlyingness is defined as

AO_i^(1) = |x_i − m| / ( (w_2 − m) I[x_i > m] + (m − w_1) I[x_i < m] )

with w_1 and w_2 the whiskers of the adjusted boxplot.

[Figure: two points x_1 and x_2 on either side of the median, at distances d_1 and d_2, with one-sided scales s_1 and s_2]
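A minimal numerical illustration of this definition; the median and whisker values below are hypothetical, and in practice w_1 and w_2 come from the adjusted boxplot of the data.

```python
def ao_univ(x, m, w1, w2):
    """Univariate adjusted outlyingness AO^(1): the distance |x - m| to the
    median m, scaled by the whisker distance on the side where x lies."""
    if x == m:
        return 0.0
    scale = (w2 - m) if x > m else (m - w1)
    return abs(x - m) / scale

# Hypothetical right-skewed situation: short left side, long right side.
m, w1, w2 = 0.0, -2.0, 8.0
left = ao_univ(-1.5, m, w1, w2)   # 1.5 / 2 = 0.75
right = ao_univ(1.5, m, w1, w2)   # 1.5 / 8 = 0.1875
```

Both points lie 1.5 away from the median, yet the left one is judged four times as outlying: exactly the x_1 versus x_2 effect described on the next slide.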

SLIDE 21

Adjusted outlyingness - univariate data

■ AO^(1)(x_1) = d_1/s_1 and AO^(1)(x_2) = d_2/s_2.
■ Although x_1 and x_2 are located at the same distance from the median, x_1 has a higher adjusted outlyingness because the denominator s_1 is smaller.
■ Skewness is thus used to estimate the scale differently on both sides of the median.
■ The measure is data-driven: a point is outlying with respect to the bulk of the data. (Brys, Hubert and Rousseeuw 2005; Hubert and Van der Veeken 2008)

SLIDE 22

Adjusted outlyingness for multivariate data

Projection pursuit idea:

AO_i = AO(x_i, X) = sup_{a ∈ R^p} AO^(1)(a^t x_i, X a).

In practice we consider 250p directions a, each generated as the direction perpendicular to the subspace spanned by p observations randomly drawn from the data set.
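This projection-pursuit approximation can be sketched as follows. The direction construction (the normal to the hyperplane through p randomly drawn points, extracted from an SVD null vector) follows the slide; the univariate outlyingness function is passed in by the caller, so the simple MAD-based stand-in used in the example is an assumption, not the adjusted outlyingness itself.

```python
import numpy as np

def random_directions(X, n_dir, rng):
    """Directions perpendicular to the subspace spanned by p randomly drawn
    observations, i.e. normals of hyperplanes through p data points."""
    n, p = X.shape
    dirs = []
    while len(dirs) < n_dir:
        pts = X[rng.choice(n, size=p, replace=False)]
        A = pts[1:] - pts[0]           # (p-1) x p; its null space is the normal
        v = np.linalg.svd(A)[2][-1]    # last right-singular vector
        if np.linalg.norm(v) > 1e-12:
            dirs.append(v / np.linalg.norm(v))
    return np.array(dirs)

def ao_multivariate(x, X, ao1, n_dir=None, rng=None):
    """AO(x, X) = sup_a AO^(1)(a'x, Xa), approximated over 250*p random
    directions. `ao1(t, T)` is any univariate outlyingness of the scalar t
    w.r.t. the 1-D sample T (a placeholder for the adjusted outlyingness)."""
    p = X.shape[1]
    rng = np.random.default_rng(0) if rng is None else rng
    dirs = random_directions(X, 250 * p if n_dir is None else n_dir, rng)
    return max(ao1(d @ x, X @ d) for d in dirs)
```

With a crude |t − med| / MAD stand-in for AO^(1), a point far from the data cloud receives a much larger outlyingness than a central point, as intended.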

SLIDE 23

Adjusted outlyingness

Outlier detection (for univariate as well as multivariate data):

■ Construct the adjusted boxplot of the AO_i.
■ Outliers: observations whose AO_i exceeds the upper whisker.

Example: Length of stay, n = 201

[Figure: adjusted boxplot of the adjusted outlyingness values]

SLIDE 24

Depth classifier - minimal AO

Classifier 1: Assign the new observation y to the group for which AO(y, X_j) is minimal.

Hubert and Van der Veeken (2010)

Related to projection depth: PD(x_i, X) = 1/(1 + O(x_i, X)), with O(x_i, X) the Stahel-Donoho outlyingness (which does not use a skewness estimate).

(Zuo and Serfling 2000, Dutta and Ghosh 2009, Cui et al. 2008)

SLIDE 25

Depth classifier - minimal AO

More precisely:

■ First compute AO(x_i^j, X_j), the outlyingness of all observations from group j with respect to X_j.
■ Remove the outliers from X_j based on these values. This yields the cleaned group X̃_j with sample size ñ_j.
■ Recompute AO(x_i^j, X̃_j) for all x_i^j in X̃_j. This gives the set {ÃO^j}. Retain the median, MAD and MC computed in each direction.
■ For a new observation y, compute AO(y, X̃_j) based on the medians, MADs and MCs from the previous step.
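The steps above can be sketched generically. Here `ao` stands for any outlyingness measure AO(x, X) and `cutoff` for the outlier rule (on the previous slides: the upper whisker of the adjusted boxplot of the AO values); the distance-to-median stand-ins in the usage example are assumptions for illustration only.

```python
import numpy as np

def train_groups(groups, ao, cutoff):
    """Steps 1-2: compute the outlyingness of each training point within its
    own group and drop the outliers. `ao(x, X)` is a generic outlyingness
    measure standing in for AO; `cutoff(values)` is a generic outlier
    threshold -- both are assumptions of this sketch."""
    cleaned = []
    for X in groups:
        vals = np.array([ao(x, X) for x in X])
        cleaned.append(X[vals <= cutoff(vals)])
    return cleaned

def classify_min_ao(y, cleaned, ao):
    """Steps 3-4 / classifier 1: assign y to the group j with minimal
    AO(y, X~_j). (The real algorithm caches the per-direction medians,
    MADs and MCs of each cleaned group instead of recomputing.)"""
    return int(np.argmin([ao(y, Xt) for Xt in cleaned]))

# Illustration with a simple distance-to-median stand-in for AO:
rng = np.random.default_rng(2)
groups = [rng.normal(0, 1, size=(50, 2)), rng.normal(5, 1, size=(50, 2))]
ao = lambda x, X: float(np.linalg.norm(x - np.median(X, axis=0)))
cleaned = train_groups(groups, ao, cutoff=lambda v: np.quantile(v, 0.9))
classify_min_ao(np.array([0.2, -0.1]), cleaned, ao)   # -> 0
```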

SLIDE 26

Depth classifier - minimal AO

Illustration: three groups generated from skew-normal distributions.

[Figure: classification regions for CQDA and RLDA]

SLIDE 27

Depth classifier - minimal AO

[Figure: classification regions for RQDA and the AO classifier]

SLIDE 28

Depth classifier - minimal AO

Some simulation results:

■ Training data: n_j observations generated from three skew-normal distributions
■ For p = 2: n_j = 250; for p = 3 and p = 5: n_j = 500
■ Outliers are also introduced
■ Test data: n_j/5 observations from the same distributions
■ Misclassification errors on the test set (averages and standard errors over 100 simulations)
■ Comparison with CQDA, and with RLDA and RQDA based on the MCD-estimator

SLIDE 29

Depth classifier - minimal AO

          ε     CQDA       RLDA       RQDA       AO
2D        0%    0.0234     0.0254     0.0193     0.0117
                (0.0010)   (0.0011)   (0.0012)   (0.0012)
          5%    0.0341     0.0228     0.0170     0.0127
                (0.0015)   (0.0013)   (0.0011)   (0.0011)
3D        0%    0.0228     0.0240     0.0191     0.0120
                (0.0006)   (0.0008)   (0.0008)   (0.0008)
          5%    0.0304     0.0209     0.0181     0.0127
                (0.0010)   (0.0006)   (0.0006)   (0.0007)
5D        0%    0.0125     0.0135     0.0141     0.0106
                (0.0006)   (0.0008)   (0.0007)   (0.0007)
          5%    0.0179     0.0140     0.0144     0.0114
                (0.0008)   (0.0008)   (0.0008)   (0.0007)

SLIDE 30

Depth classifier - minimal AO

Simulation results for elliptical data:

■ Training data: n_j observations generated from two normal distributions
■ For p = 2: n_j = 250; for p = 3 and p = 5: n_j = 500
■ Outliers are also introduced
■ Test data: n_j/5 observations from the same distributions
■ Misclassification errors on the test set (averages and standard errors over 100 simulations)
■ Comparison with CQDA, RLDA, RQDA and LS-SVM with RBF kernel

SLIDE 31

Depth classifier - minimal AO

          ε     CQDA       RLDA       RQDA       AO         LS-SVM
2D        0%    0.0763     0.0762     0.0777     0.0821     0.0801
                (0.0028)   (0.0027)   (0.0026)   (0.0029)   (0.0024)
          10%   0.1545     0.0808     0.0795     0.0839     0.0825
                (0.0052)   (0.0026)   (0.0026)   (0.0026)   (0.0025)
3D        0%    0.0421     0.0426     0.0430     0.0448     0.0435
                (0.0015)   (0.0014)   (0.0014)   (0.0015)   (0.0015)
          10%   0.1327     0.0432     0.0429     0.0452     0.0430
                (0.0036)   (0.0014)   (0.0014)   (0.0014)   (0.0014)
5D        0%    0.1310     0.1308     0.1325     0.1465     0.1339
                (0.0025)   (0.0025)   (0.0024)   (0.0026)   (0.0024)
          10%   0.2122     0.1340     0.1363     0.1572     0.1390
                (0.0038)   (0.0025)   (0.0025)   (0.0025)   (0.0025)

SLIDE 32

Adjustments for unequal group sizes

Inspired by Billor et al. (2008): assign the new observation to the group in which its depth has the highest rank.

Classifier 2: Let r_y^j be the empirical distribution function of the {ÃO^j}, evaluated at AO(y, X̃_j):

r_y^j = (1/ñ_j) Σ_{i=1}^{ñ_j} I( ÃO_i^j ≤ AO(y, X̃_j) ).

Assign observation y to the group j for which r_y^j is minimal. (In case of ties, use classifier 1.)

The e.d.f. is a way to measure the position of AO(y, X̃_j) within the {ÃO^j}.
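This rank-based rule can be sketched as follows, again under the assumption that `ao(x, X)` is a generic outlyingness measure standing in for AO (a distance-to-median placeholder in the test) and that the training groups have already been cleaned of outliers.

```python
import numpy as np

def classify_rank(y, cleaned, ao):
    """Classifier 2: compute r_y^j, the fraction of training outlyingness
    values of group j that do not exceed AO(y, X~_j) (the e.d.f. evaluated
    at AO(y, X~_j)); assign y to the group with minimal r_y^j. Ties are
    broken by classifier 1 (minimal raw outlyingness)."""
    r, raw = [], []
    for Xt in cleaned:
        train_vals = np.array([ao(x, Xt) for x in Xt])
        a_y = ao(y, Xt)
        r.append(np.mean(train_vals <= a_y))   # (1/n~_j) * sum of indicators
        raw.append(a_y)
    r, raw = np.array(r), np.array(raw)
    tied = np.flatnonzero(r == r.min())
    return int(tied[np.argmin(raw[tied])])
```

Because each group is judged on the rank of y within its own outlyingness values, a small group is no longer swamped by a large one: this is exactly the adjustment for unequal group sizes motivated above.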

SLIDE 33

Adjustments for unequal group sizes

Classifier 3: To measure the position of AO(y, X̃_j), we use a distance related to the definition of the univariate AO. In general, let

SAO^(1)(x, X) = AO^(1)(x, X) sign(x − med(X))

be the signed adjusted outlyingness of an observation x with respect to a univariate data set X, and let

s_y^j = SAO^(1)( AO(y, X̃_j), {ÃO^j} ).

Assign observation y to the group j for which s_y^j is minimal.
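This rule can be sketched similarly; `ao1` below is a univariate outlyingness standing in for AO^(1) (an assumption; a MAD-based placeholder in the test), applied to position AO(y, X~_j) within the training values {ÃO^j}.

```python
import numpy as np

def classify_sao(y, cleaned, ao, ao1):
    """Classifier 3: s_y^j = SAO^(1)(AO(y, X~_j), {AO~^j}), where
    SAO^(1)(x, T) = AO^(1)(x, T) * sign(x - med(T)); assign y to the
    group with minimal signed outlyingness s_y^j. `ao` and `ao1` are
    generic stand-ins for the multivariate and univariate AO."""
    s = []
    for Xt in cleaned:
        train_vals = np.array([ao(x, Xt) for x in Xt])
        a_y = ao(y, Xt)
        s.append(ao1(a_y, train_vals) * np.sign(a_y - np.median(train_vals)))
    return int(np.argmin(s))
```

An observation well inside a group sits below the median of that group's outlyingness values and gets a negative s_y^j, while a distant group yields a large positive value, so the argmin picks the enclosing group.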

SLIDE 34

Adjustments for unequal group sizes

Simulation results for equal sample sizes (n1 = n2 = 500):

          ε     Classifier 1   Classifier 2   Classifier 3
2D        0%    0.0737         0.0751         0.0758
                (0.0018)       (0.0019)       (0.0019)
          5%    0.0744         0.0751         0.0756
                (0.0021)       (0.0021)       (0.0021)
3D        0%    0.0440         0.0449         0.0451
                (0.0015)       (0.0016)       (0.0016)
          5%    0.0425         0.0437         0.0425
                (0.0015)       (0.0015)       (0.0015)
5D        0%    0.0737         0.0749         0.0758
                (0.0015)       (0.0017)       (0.0018)
          5%    0.0736         0.0735         0.0767
                (0.0016)       (0.0016)       (0.0019)

SLIDE 35

Adjustments for unequal group sizes

Simulation results for unequal sample sizes (n1 = 100, n2 = 500):

          ε     Classifier 1   Classifier 2   Classifier 3
2D        0%    0.1047         0.0882         0.0876
                (0.0033)       (0.0026)       (0.0026)
          5%    0.0991         0.0797         0.0818
                (0.0032)       (0.0024)       (0.0023)
3D        0%    0.0986         0.0527         0.0534
                (0.0032)       (0.0015)       (0.0015)
          5%    0.0965         0.0533         0.0499
                (0.0032)       (0.0018)       (0.0017)
5D        0%    0.2298         0.0930         0.0909
                (0.0042)       (0.0026)       (0.0028)
          5%    0.2284         0.0956         0.0916
                (0.0041)       (0.0023)       (0.0028)

SLIDE 36

Example

Data from the Belgian Household Survey of 2005:

X1: income
X2: expenditure on durable consumer goods

To avoid correcting factors for family size, only single persons are considered. This group consists of 174 unemployed and 706 (at least partially) employed persons.

Goal: classify a person as employed or unemployed based on income and expenditure on durable consumer goods.

SLIDE 37

Example

[Figure: income versus expenditure on durable consumer goods for the employed and unemployed groups, with a zoomed-in view]

Both groups are randomly split into a training set and a test set of 10 data points. Average misclassification errors (over 100 replications):

Classifier 1: 0.2580 (s.e. 0.0099)
Classifier 2: 0.1655 (s.e. 0.0082)
Classifier 3: 0.1855 (s.e. 0.0086)

SLIDE 38

Conclusion and outlook

■ Classifiers that adjust for skewness and sample sizes yield lower misclassification errors.
■ The classifiers can be computed fast in any dimension (the cost depends on the number of directions considered).
■ They could also be used in the DD-plot (depth-versus-depth plot) proposed by Li et al. (2010).
■ Programs will soon be available in LIBRA, the Matlab LIBrary for Robust Analysis, at wis.kuleuven.be/stat/robust.
■ Extensions are available for high-dimensional data, combining robust PCA for skewed data with RSIMCA.

SLIDE 39

Some references

■ Hubert, M. and Van der Veeken, S. (2008). Outlier detection for skewed data. Journal of Chemometrics 22, 235–246.
■ Hubert, M. and Van der Veeken, S. (2010). Robust classification for skewed data. Advances in Data Analysis and Classification, in press.
■ Hubert, M. and Van der Veeken, S. (2010). Fast and robust classifiers adjusted for skewness. Proceedings of COMPSTAT 2010.
■ Billor, N., Abebe, A., Turkmen, A. and Nudurupati, S.V. (2008). Classification based on depth transvariations. Journal of Classification 25, 249–260.
■ Dutta, S. and Ghosh, A.K. (2009). On robust classification using projection depth. Indian Statistical Institute, Technical Report R11/2009.
■ Ghosh, A.K. and Chaudhuri, P. (2005). On maximum depth and related classifiers. Scandinavian Journal of Statistics 32, 327–350.
■ Li, J., Cuesta-Albertos, J.A. and Liu, R.Y. (2010). DD-classifier: nonparametric classification procedure based on DD-plot. Submitted.