Selecting Variables in Two-Group Robust Linear Discriminant Analysis

Stefan Van Aelst and Gert Willems
Department of Applied Mathematics and Computer Science
Ghent University, Belgium
COMPSTAT’2010

Linear discriminant analysis setting
p-dimensional data set
Group 1: $x_{11}, \ldots, x_{1n_1} \in \Pi_1 \sim F_1 = F_{\mu_1,\Sigma}$
Group 2: $x_{21}, \ldots, x_{2n_2} \in \Pi_2 \sim F_2 = F_{\mu_2,\Sigma}$
Common covariance matrix $\Sigma$
$P(X \in \Pi_1) = P(X \in \Pi_2)$
$$d_j^L(x) = \mu_j^t \Sigma^{-1} x - \tfrac{1}{2}\, \mu_j^t \Sigma^{-1} \mu_j; \quad j = 1, 2$$
Linear Bayes rule: classify $x \in \mathbb{R}^p$ into $\Pi_1$ if $d_1^L(x) > d_2^L(x)$, and into $\Pi_2$ otherwise.
Robust Variable Selection in Discriminant Analysis, Van Aelst & Willems
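The linear Bayes rule above can be sketched in a few lines of NumPy. This is only an illustration: the function names and the numeric parameters are made up for the example, and the true $\mu_1$, $\mu_2$ and $\Sigma$ are assumed to be known.

```python
import numpy as np

def linear_discriminant_score(x, mu, sigma):
    """d_j^L(x) = mu_j^t Sigma^{-1} x - 1/2 mu_j^t Sigma^{-1} mu_j."""
    # Sigma is symmetric, so solve(Sigma, mu) gives Sigma^{-1} mu_j.
    sigma_inv_mu = np.linalg.solve(sigma, mu)
    return sigma_inv_mu @ x - 0.5 * sigma_inv_mu @ mu

def classify(x, mu1, mu2, sigma):
    """Return 1 if d_1^L(x) > d_2^L(x), otherwise 2 (equal priors)."""
    d1 = linear_discriminant_score(x, mu1, sigma)
    d2 = linear_discriminant_score(x, mu2, sigma)
    return 1 if d1 > d2 else 2

# Illustrative (made-up) population parameters.
mu1 = np.array([0.0, 0.0])
mu2 = np.array([2.0, 2.0])
sigma = np.eye(2)

label_near_mu1 = classify(np.array([0.1, -0.2]), mu1, mu2, sigma)
label_near_mu2 = classify(np.array([1.9, 2.3]), mu1, mu2, sigma)
```

With an identity covariance matrix the rule reduces to assigning each point to the nearer center.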

Linear discriminant analysis: Discriminant coordinate
Direction $a$ that best separates the two populations:
$$a = \Sigma^{-1} (\mu_1 - \mu_2)$$
The projection $a^t x$ is called the canonical variate or discriminant coordinate.

Linear discriminant analysis: Sample LDA
Estimate the centers $\mu_1$ and $\mu_2$ and the scatter $\Sigma$ from the data.
Standard LDA uses the sample means $\bar{x}_1$ and $\bar{x}_2$, and the pooled sample covariance matrix
$$S_n = \frac{(n_1 - 1) S_1 + (n_2 - 1) S_2}{n_1 + n_2 - 2}$$
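As a rough sketch of sample LDA, the pooled covariance matrix and the estimated discriminant direction $a = S_n^{-1}(\bar{x}_1 - \bar{x}_2)$ could be computed as follows. The function names and the simulated data are assumptions made for this example; they are not taken from the slides.

```python
import numpy as np

def pooled_covariance(x1, x2):
    """S_n = ((n1 - 1) S_1 + (n2 - 1) S_2) / (n1 + n2 - 2)."""
    n1, n2 = len(x1), len(x2)
    s1 = np.cov(x1, rowvar=False)  # rows are observations, divisor n1 - 1
    s2 = np.cov(x2, rowvar=False)
    return ((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2)

def discriminant_direction(x1, x2):
    """Plug-in estimate of a = Sigma^{-1} (mu_1 - mu_2)."""
    s_n = pooled_covariance(x1, x2)
    return np.linalg.solve(s_n, x1.mean(axis=0) - x2.mean(axis=0))

# Simulated (made-up) two-group data with a shift in the first variable.
rng = np.random.default_rng(0)
x1 = rng.normal(loc=[0.0, 0.0], size=(30, 2))
x2 = rng.normal(loc=[2.0, 0.0], size=(40, 2))

s_n = pooled_covariance(x1, x2)
a = discriminant_direction(x1, x2)
scores1 = x1 @ a  # discriminant coordinates a^t x for group 1
scores2 = x2 @ a  # discriminant coordinates a^t x for group 2
```

By construction the mean discriminant coordinate of group 1 exceeds that of group 2, since their difference is a positive quadratic form in $\bar{x}_1 - \bar{x}_2$.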

Robust LDA
Use robust estimators of the centers $\mu_1$ and $\mu_2$ and the common scatter $\Sigma$:
→ S-estimators
→ MM-estimators

Robust LDA: One-sample S-estimators
Observations $\{x_1, \ldots, x_n\} \subset \mathbb{R}^p$
$\rho_0 : [0, \infty[ \to [0, \infty[$ is bounded, increasing and smooth
S-estimates of the location $\hat\mu_n$ and scatter $\hat\Sigma_n$ minimize $|C|$ subject to
$$\frac{1}{n} \sum_{i=1}^{n} \rho_0\left( \left[ (x_i - T)^t C^{-1} (x_i - T) \right]^{1/2} \right) = b$$
among all $T \in \mathbb{R}^p$ and $C \in \mathrm{PDS}(p)$
(Davies 1987, Rousseeuw and Leroy 1987, Lopuhaä 1989)

Robust LDA: ρ functions
A popular family of loss functions is the Tukey biweight (bisquare) family of ρ functions:
$$\rho_c(t) = \begin{cases} \dfrac{t^2}{2} - \dfrac{t^4}{2c^2} + \dfrac{t^6}{6c^4} & \text{if } |t| \le c \\[1ex] \dfrac{c^2}{6} & \text{if } |t| \ge c. \end{cases}$$
The constant $c$ can be tuned for robustness (breakdown point)
The choice of $c$ also determines the efficiency of the S-estimator
→ Trade-off robustness vs efficiency
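A minimal NumPy sketch of the Tukey biweight ρ function defined above may help to see its shape. The function name is chosen for this example only.

```python
import numpy as np

def tukey_biweight_rho(t, c):
    """Tukey biweight rho_c(t): polynomial for |t| <= c, constant c^2/6 beyond."""
    t = np.asarray(t, dtype=float)
    inside = t**2 / 2 - t**4 / (2 * c**2) + t**6 / (6 * c**4)
    return np.where(np.abs(t) <= c, inside, c**2 / 6)

# Evaluate rho_c on a few points for c = 2: the function flattens at c^2 / 6.
values = tukey_biweight_rho([0.0, 1.0, 2.0, 5.0], c=2.0)
```

The two branches agree at $|t| = c$, so the function is continuous, and its boundedness is what limits the influence of far-away observations.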

[Figure: Tukey biweight ρ functions. ρ(t) plotted against t for c = 2, c = 3 and c = ∞.]

Robust LDA: One-sample MM-estimates
Put $\tilde\sigma_n = \det(\hat\Sigma_n)^{1/(2p)}$, the S-estimate of scale
Then the MM-estimates of the location $\hat\mu_n$ and shape $\hat\Gamma_n$ minimize
$$\frac{1}{n} \sum_{i=1}^{n} \rho_1\left( \left[ (x_i - T)^t G^{-1} (x_i - T) \right]^{1/2} / \tilde\sigma_n \right)$$
among all $T \in \mathbb{R}^p$ and $G \in \mathrm{PDS}(p)$ for which $\det(G) = 1$
(Tatsuoka and Tyler 2000)
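To illustrate the scale and shape parametrization used here, the sketch below splits a scatter matrix $\Sigma$ into a scale $\sigma = \det(\Sigma)^{1/(2p)}$ and a shape matrix with determinant 1, so that $\Sigma = \sigma^2 \Gamma$. It only shows the decomposition; it does not compute S- or MM-estimates, and the function name and example matrix are assumptions.

```python
import numpy as np

def scale_and_shape(scatter):
    """Split a positive definite scatter matrix into (scale, shape), det(shape) = 1."""
    p = scatter.shape[0]
    # slogdet is numerically safer than det for larger p.
    _, logdet = np.linalg.slogdet(scatter)
    scale = np.exp(logdet / (2 * p))  # det(scatter)^(1 / (2p))
    shape = scatter / scale**2
    return scale, shape

# Made-up scatter matrix with determinant 4 in dimension p = 2.
scatter = np.array([[4.0, 0.0], [0.0, 1.0]])
scale, shape = scale_and_shape(scatter)
```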

Robust LDA: ρ functions
Both $\rho_0$ and $\rho_1$ are taken from the same family
The constant $c$ in $\rho_0$ can be tuned for robustness (breakdown point)
MM-estimator inherits its robustness from the S-scale
The constant $c$ in $\rho_1$ can be tuned for efficiency of locations

[Figure: Tukey biweight ρ functions $\rho_0$ and $\rho_1$ with tuning constants $c_0$ and $c_1$, in two panels: p = 2 and p = 5.]

Robust LDA: Robust two-sample estimates
Pool the scatter estimates $\hat\Sigma_{1n_1}$ and $\hat\Sigma_{2n_2}$ of both groups:
$$\hat\Sigma_n = \frac{n_1 \hat\Sigma_{1n_1} + n_2 \hat\Sigma_{2n_2}}{n_1 + n_2}$$
Calculate simultaneous S-estimates of the two locations and the common scatter matrix:
$\hat\mu_{1n}$, $\hat\mu_{2n}$ and $\hat\Sigma_n$ minimize $|C|$ subject to
$$\frac{1}{n_1 + n_2} \sum_{j=1}^{2} \sum_{i=1}^{n_j} \rho_0\left( \left[ (x_{ji} - T_j)^t C^{-1} (x_{ji} - T_j) \right]^{1/2} \right) = b$$
among all $T_1, T_2 \in \mathbb{R}^p$ and $C \in \mathrm{PDS}(p)$
(He and Fung 2000)
Similarly, simultaneous MM-estimates can be calculated
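The two formulas on this slide can be sketched as follows, assuming a Tukey biweight $\rho_0$. The code only pools two given scatter matrices and evaluates the left-hand side of the two-sample S-constraint for candidate values $T_1$, $T_2$ and $C$; it does not perform the minimization. All names and the tiny data set are hypothetical.

```python
import numpy as np

def tukey_biweight_rho(t, c):
    """Tukey biweight rho_c(t)."""
    t = np.asarray(t, dtype=float)
    inside = t**2 / 2 - t**4 / (2 * c**2) + t**6 / (6 * c**4)
    return np.where(np.abs(t) <= c, inside, c**2 / 6)

def mahalanobis(x, center, scatter):
    """Distances [(x_i - T)^t C^{-1} (x_i - T)]^{1/2} for the rows of x."""
    diff = x - center
    solved = np.linalg.solve(scatter, diff.T).T
    return np.sqrt(np.sum(diff * solved, axis=1))

def pooled_scatter(sigma1, n1, sigma2, n2):
    """Weighted pooling (n1 Sigma_1 + n2 Sigma_2) / (n1 + n2)."""
    return (n1 * sigma1 + n2 * sigma2) / (n1 + n2)

def two_sample_constraint(x1, x2, t1, t2, scatter, c):
    """Left-hand side of the two-sample S-constraint for candidates (t1, t2, scatter)."""
    d = np.concatenate([mahalanobis(x1, t1, scatter),
                        mahalanobis(x2, t2, scatter)])
    return tukey_biweight_rho(d, c).mean()  # average over all n1 + n2 points

# Made-up example: group 1 sits on its center, group 2 is far from its center.
x1 = np.zeros((3, 2))
x2 = np.full((2, 2), 100.0)
pooled = pooled_scatter(np.eye(2), 10, 3 * np.eye(2), 30)
lhs = two_sample_constraint(x1, x2, np.zeros(2), np.zeros(2), np.eye(2), c=2.0)
```

In the example, the three points on their center contribute 0 and the two distant points each contribute the bound $c^2/6$.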

Robust LDA: Bootstrap inference
Advantages of bootstrap:
  Few assumptions
  Wide range of applications
Bootstrapping robust estimators:
  High computational cost
  Robustness not guaranteed

Fast and robust bootstrap principle
For each bootstrap sample:
  Calculate an approximation for the estimates
  Use the estimating equations
Fast to compute approximations
Inherit robustness of initial solution

Fast and robust bootstrap
Consider estimates that are the solution of a fixed point equation
$$\hat\Theta_n = g_n(\hat\Theta_n)$$
For a bootstrap sample
$$\hat\Theta_n^{*} = g_n^{*}(\hat\Theta_n^{*})$$
consider the one-step approximation
$$\hat\Theta_n^{1*} = g_n^{*}(\hat\Theta_n)$$
Take a Taylor expansion about the estimand $\Theta$:
$$\hat\Theta_n = g_n(\Theta) + \nabla g_n(\Theta)(\hat\Theta_n - \Theta) + O_P(n^{-1})$$
which can be rewritten as:
$$\sqrt{n}\,(\hat\Theta_n - \Theta) = [I - \nabla g_n(\Theta)]^{-1} \sqrt{n}\,(g_n(\Theta) - \Theta) + O_P(n^{-1/2})$$
We then obtain
$$\sqrt{n}\,(\hat\Theta_n^{*} - \hat\Theta_n) = [I - \nabla g_n(\hat\Theta_n)]^{-1} \sqrt{n}\,(g_n^{*}(\hat\Theta_n) - \hat\Theta_n) + O_P(n^{-1/2})$$
which yields the FRB estimate
$$\hat\Theta_n^{R*} = \hat\Theta_n + [I - \nabla g_n(\hat\Theta_n)]^{-1} (\hat\Theta_n^{1*} - \hat\Theta_n)$$
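The FRB recipe above can be illustrated on a deliberately simplified problem. The sketch below is not the two-group MM setting of the slides: it assumes a one-dimensional location M-estimator with Tukey biweight weights, a fixed known scale of 1, the commonly used tuning constant c = 4.685, and a numerical derivative in place of an analytic $\nabla g_n$. All function names are hypothetical.

```python
import numpy as np

def biweight_weights(r, c=4.685):
    """Tukey biweight weights w(r) = (1 - (r/c)^2)^2 for |r| <= c, else 0."""
    return np.where(np.abs(r) <= c, (1 - (r / c) ** 2) ** 2, 0.0)

def g(theta, x, scale=1.0):
    """Fixed-point map g_n(theta): weighted mean with weights evaluated at theta."""
    w = biweight_weights((x - theta) / scale)
    return np.sum(w * x) / np.sum(w)

def m_location(x, scale=1.0, tol=1e-10, max_iter=200):
    """Iterate theta <- g_n(theta), starting from the median."""
    theta = np.median(x)
    for _ in range(max_iter):
        new = g(theta, x, scale)
        if abs(new - theta) < tol:
            return new
        theta = new
    return theta

def frb_location(x, n_boot=500, scale=1.0, seed=0):
    """FRB replicates: theta_hat + [1 - g_n'(theta_hat)]^{-1} (one_step - theta_hat)."""
    rng = np.random.default_rng(seed)
    theta_hat = m_location(x, scale)
    # Central-difference approximation of the derivative of g_n at theta_hat
    # (an assumption of this sketch; an analytic gradient could be used instead).
    h = 1e-5
    grad = (g(theta_hat + h, x, scale) - g(theta_hat - h, x, scale)) / (2 * h)
    correction = 1.0 / (1.0 - grad)
    n = len(x)
    reps = np.empty(n_boot)
    for b in range(n_boot):
        xb = x[rng.integers(0, n, size=n)]   # bootstrap resample
        one_step = g(theta_hat, xb, scale)   # g*_n evaluated at theta_hat
        reps[b] = theta_hat + correction * (one_step - theta_hat)
    return theta_hat, reps

# Made-up contaminated sample: 95 standard normal points plus 5 outliers at 50.
rng_data = np.random.default_rng(1)
x = np.concatenate([rng_data.normal(size=95), np.full(5, 50.0)])
theta_hat, reps = frb_location(x)
```

Each replicate reuses the weights evaluated at the original estimate, so the outliers get zero weight in every bootstrap sample, and no iterative fit is rerun on the resamples.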

Fast and robust bootstrap: Properties of fast robust bootstrap
Computational efficiency: The FRB estimates are solutions of a system of linear equations
Robustness: The FRB estimates use the weights of the MM-estimates at the original sample
Consistency: Under regularity conditions, the FRB distribution of $\hat\Theta_n$ and the sampling distribution of $\hat\Theta_n$ converge to the same limiting distribution
Smooth mappings: FRB commutes with smooth functions, such as $a = \Sigma^{-1} (\mu_1 - \mu_2)$

Fast and robust bootstrap: Variable selection in robust LDA
Two group robust LDA
Selection criterion: test for significance of the discriminant coordinate coefficients
Use FRB distribution to estimate p-values
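One possible way to turn bootstrap replicates of the discriminant coefficients into p-values is sketched below. The slides do not specify how the p-values are computed, so the percentile-based two-sided p-value used here is an assumption, and the replicates are simulated stand-ins rather than actual FRB output.

```python
import numpy as np

def bootstrap_p_values(replicates):
    """Two-sided p-values for H0: coefficient = 0, one per column.

    Assumed definition: twice the smaller of the bootstrap proportions
    of replicates on either side of zero, capped at 1.
    """
    replicates = np.asarray(replicates, dtype=float)
    below = np.mean(replicates <= 0.0, axis=0)
    above = np.mean(replicates >= 0.0, axis=0)
    return np.minimum(1.0, 2.0 * np.minimum(below, above))

# Stand-in replicates (made up): the first coefficient is clearly nonzero,
# the second is centered at zero.
rng = np.random.default_rng(0)
replicates = np.column_stack([
    rng.normal(loc=3.0, scale=0.5, size=1000),
    rng.normal(loc=0.0, scale=0.5, size=1000),
])
p_values = bootstrap_p_values(replicates)
selected = p_values < 0.05  # keep variables whose coefficient tests as significant
```

In this sketch a variable is retained when its coefficient is significantly different from zero at the 5% level; the threshold is likewise an illustrative choice.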

Examples: Biting Flies
Two groups of 35 flies (Leptoconops torrens and Leptoconops carteri)
Measurements of:
  wing length
  wing width
  third palp length
  third palp width
  fourth palp length

[Figure: Biting Flies: outliers. Wing width for groups 1 and 2.]
