SLIDE 1

Setup Results Discussion

Variable selection and parameter tuning in high-dimensional prediction

Christoph Bernau and Anne-Laure Boulesteix

Institut für Medizinische Informationsverarbeitung, Biometrie und Epidemiologie, Ludwig-Maximilians-Universität München

COMPSTAT 2010, 23 August 2010

Bernau and Boulesteix Variable selection and tuning 1/14

SLIDE 2

Prediction based on high-dimensional data

X: an n × p matrix containing n observations of p variables, possibly with n ≪ p. Examples: microarray data, chemometric data, proteomic data, metabolomic data.

[Schematic: data matrix with rows Pat 1, …, Pat n and columns X1, …, Xp]

Y: a response variable to be predicted. Examples: responder/non-responder, diseased/healthy.

SLIDE 3

Variable selection

◮ Many variables are irrelevant for the prediction problem.
◮ Variable selection is often useful as a preliminary step to model selection.
◮ Example:
  1. Rank the variables according to the absolute value of the t-statistic.
  2. Select the p∗ = 100 top-ranking variables and use them for model selection.

Boulesteix et al, 2008. Evaluating microarray-based classifiers. Cancer Informatics 6:77–97.
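The two-step example above can be sketched as follows. This is an illustrative sketch, not the authors' code: `select_top_t` and the toy data are invented for illustration, and p∗ is set to 5 rather than 100 to keep the example small.

```python
import numpy as np

def select_top_t(X, y, p_star):
    """Rank variables by the absolute two-sample t-statistic (pooled
    variance) and return the indices of the p_star top-ranking ones."""
    X0, X1 = X[y == 0], X[y == 1]
    n0, n1 = len(X0), len(X1)
    sp2 = ((n0 - 1) * X0.var(0, ddof=1) +
           (n1 - 1) * X1.var(0, ddof=1)) / (n0 + n1 - 2)
    t = (X0.mean(0) - X1.mean(0)) / np.sqrt(sp2 * (1 / n0 + 1 / n1))
    return np.argsort(-np.abs(t))[:p_star]

# Toy data: 20 "patients", 1000 variables; only variable 0 is informative.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 1000))
y = np.repeat([0, 1], 10)
X[y == 1, 0] += 4.0                   # class 1 shifted on variable 0
top = select_top_t(X, y, p_star=5)    # variable 0 should rank on top
```

The selected indices would then be passed on to whatever classifier is fit in the model-selection step.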

SLIDE 4

Variable selection and cross-validation

◮ In small-sample settings, prediction error rates are often estimated through cross-validation (CV) or related approaches (repeated subsampling, bootstrap).
◮ It is then essential to consider variable selection as a part of model selection and to perform it anew in each CV iteration.
◮ Otherwise the error rate may be considerably underestimated (Ambroise and McLachlan 2002).

A.-L. Boulesteix, 2007. WilcoxCV: an R package for fast variable selection in cross-validation. Bioinformatics 23:1702–1704.
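The selection bias described by Ambroise and McLachlan can be reproduced with a small simulation (a sketch with invented helper names, using a nearest-centroid classifier rather than the methods from the slides): on pure-noise data the true error rate is 50%, but selecting variables once on the full data before leave-one-out CV makes the estimate far too optimistic.

```python
import numpy as np

def top_t(X, y, k):
    """Indices of the k variables with the largest absolute
    standardized mean difference between the two classes."""
    d = X[y == 0].mean(0) - X[y == 1].mean(0)
    return np.argsort(-np.abs(d / (X.std(0, ddof=1) + 1e-12)))[:k]

def nearest_centroid(Xtr, ytr, Xte):
    """Predict the class whose training centroid is closer."""
    c0, c1 = Xtr[ytr == 0].mean(0), Xtr[ytr == 1].mean(0)
    return (((Xte - c1) ** 2).sum(1) < ((Xte - c0) ** 2).sum(1)).astype(int)

rng = np.random.default_rng(1)
n, p, k = 40, 2000, 10
X = rng.normal(size=(n, p))   # pure noise: no variable is truly predictive
y = np.repeat([0, 1], n // 2)

err_wrong = err_right = 0
sel_full = top_t(X, y, k)     # WRONG: selection sees all n observations
for i in range(n):            # leave-one-out CV
    tr = np.arange(n) != i
    # selection fixed before CV -> optimistic estimate
    yhat = nearest_centroid(X[tr][:, sel_full], y[tr], X[[i]][:, sel_full])
    err_wrong += int(yhat[0] != y[i])
    # selection repeated inside each CV iteration -> close to the true 50%
    sel = top_t(X[tr], y[tr], k)
    yhat = nearest_centroid(X[tr][:, sel], y[tr], X[[i]][:, sel])
    err_right += int(yhat[0] != y[i])

print(err_wrong / n, err_right / n)
```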

SLIDE 5

Parameter tuning

◮ Many classification methods involve a parameter that has to be tuned.
◮ Examples:
  ◮ the number k of nearest neighbors in the kNN algorithm
  ◮ the penalty λ in penalized regression
  ◮ the number of components in PLS-DA
◮ It is common practice to choose the value of the parameter through internal cross-validation.

SLIDE 6

Internal cross-validation (CV)

◮ Error rates are estimated via external CV corresponding to the partition S = ∪k Sk.
◮ In each learning set S \ Sk:
  ◮ Internal CV is performed with different candidate values θ1, . . . , θm of the parameter.
  ◮ The value θ∗ yielding the lowest internal error rate is selected.
  ◮ θ∗ is used for model selection based on S \ Sk.
◮ In internal CV, error rates are calculated, but the goal is only to determine θ∗, not to estimate the error rates.
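The nested scheme can be sketched as follows. This is an illustrative toy implementation, not the study's code: the kNN classifier is hand-rolled, the data are simulated, and the fold counts and candidate values are chosen arbitrarily.

```python
import numpy as np

def knn_predict(Xtr, ytr, Xte, k):
    """Plain k-nearest-neighbour majority vote (Euclidean distance)."""
    d = ((Xte[:, None, :] - Xtr[None, :, :]) ** 2).sum(-1)
    return (ytr[np.argsort(d, axis=1)[:, :k]].mean(1) > 0.5).astype(int)

def make_folds(n, n_folds, rng):
    return np.array_split(rng.permutation(n), n_folds)

def cv_error(X, y, k, folds):
    """Mean misclassification rate of kNN over the given CV folds."""
    errs = []
    for te in folds:
        tr = np.setdiff1d(np.arange(len(y)), te)
        errs.append((knn_predict(X[tr], y[tr], X[te], k) != y[te]).mean())
    return float(np.mean(errs))

rng = np.random.default_rng(0)
n = 60
X = rng.normal(size=(n, 5))
y = np.repeat([0, 1], n // 2)
X[y == 1] += 1.5                      # moderate class separation

thetas = [1, 3, 5, 7]                 # candidate values theta_1..theta_m
outer_errs = []
for te in make_folds(n, 5, rng):      # external CV: estimates the error
    tr = np.setdiff1d(np.arange(n), te)
    # internal CV on the learning set only; its error rates are used
    # solely to pick theta*, then discarded
    inner = make_folds(len(tr), 3, rng)
    theta_star = thetas[int(np.argmin(
        [cv_error(X[tr], y[tr], k, inner) for k in thetas]))]
    outer_errs.append(
        (knn_predict(X[tr], y[tr], X[te], theta_star) != y[te]).mean())
print(np.mean(outer_errs))
```

Note that only the external-CV errors are reported; the internal-CV errors serve exclusively to choose θ∗ within each learning set.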

SLIDE 7

Research question

Should we perform variable selection before internal CV (V1) or repeat variable selection for each internal CV iteration (V2)?

◮ For external CV, variable selection must always be repeated for each iteration, but for internal CV the answer is not obvious.
◮ V2 is time-consuming: in LOO-CV, for example, variable selection has to be performed n × (n − 1) times.
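The difference between the two variants is easiest to see in code. This is a minimal sketch with invented helpers: kNN with θ = k stands in for the tuned classifier, a standardized mean difference stands in for the t-statistic, and `tune` runs internal CV on one given learning set.

```python
import numpy as np

def top_t(X, y, m):
    """m variables with the largest absolute standardized mean difference
    (a stand-in for the t-statistic ranking)."""
    d = X[y == 0].mean(0) - X[y == 1].mean(0)
    return np.argsort(-np.abs(d / (X.std(0, ddof=1) + 1e-12)))[:m]

def knn_err(Xtr, ytr, Xte, yte, k):
    d = ((Xte[:, None, :] - Xtr[None, :, :]) ** 2).sum(-1)
    yhat = (ytr[np.argsort(d, axis=1)[:, :k]].mean(1) > 0.5).astype(int)
    return (yhat != yte).mean()

def tune(X, y, folds, thetas, variant):
    """Internal CV on one learning set. V1: variables selected once,
    before internal CV. V2: variables re-selected in each internal fold."""
    sel_v1 = top_t(X, y, 20)
    errs = np.zeros(len(thetas))
    for te in folds:
        tr = np.setdiff1d(np.arange(len(y)), te)
        sel = sel_v1 if variant == "V1" else top_t(X[tr], y[tr], 20)
        for j, k in enumerate(thetas):
            errs[j] += knn_err(X[tr][:, sel], y[tr], X[te][:, sel], y[te], k)
    return thetas[int(np.argmin(errs))]

rng = np.random.default_rng(2)
X = rng.normal(size=(40, 500))
y = np.repeat([0, 1], 20)
X[y == 1, :5] += 1.0                  # 5 informative variables out of 500
folds = np.array_split(rng.permutation(40), 3)
theta_v1 = tune(X, y, folds, [1, 3, 5, 7], "V1")
theta_v2 = tune(X, y, folds, [1, 3, 5, 7], "V2")
```

In V2 the extra cost is visible directly: `top_t` runs once per internal fold rather than once per learning set.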

SLIDE 8

Our empirical study

◮ Two real microarray data sets
◮ Two classification methods: kNN and PLS+LDA
◮ Two variable selection methods: t-statistic and RFE
◮ 100 times 5-fold CV for error estimation (external CV)
◮ 5 times 3-fold CV for parameter tuning (internal CV)

SLIDE 9

Result 1: V2 selects more complex models than V1

SLIDE 10

Result 2: The error rates of V1 and V2 are similar

Error rates for kNN (mean and standard deviation):

                     Golub data             colon cancer data
                   t-test       RFE        t-test        RFE
                  V1    V2    V1    V2    V1     V2     V1     V2
20 genes, mean   7.8%  7.4%  5.8%  6.1%  16.8%  18.8%  21.6%  23.3%
20 genes, sd     2.6%  2.8%  2.5%  2.9%   1.9%   2.4%   3.3%   4.1%
50 genes, mean   5.9%  5.5%  1.9%  2.2%  16.4%  19.9%  16.9%  18.5%
50 genes, sd     2.4%  2.7%  1.8%  1.7%   1.6%   1.9%   3.3%   3.0%

No clear difference between V1 and V2 in terms of error rate (variances are high!)

SLIDE 11

Why does V2 lead to more complex models?

◮ In V1 the variables are selected based on the external learning

set S \ Sk.

◮ In V2 the variables are selected based the smaller learning set

(S \ Sk) \ Skj, on which the models are fit in internal CV. → In V2 the variables better discriminate the two classes in the learning set (S \ Sk) \ Skj than in V1. → In V2 complex models perform better. → In V1 complex models are fit to “bad variables” and thus lead to worse results.

SLIDE 12

Why does V2 lead to more complex models?

SLIDE 13

Further remarks

◮ V2 possibly leads to overly complex models: since the internal learning sets are small, it is easier to find variables that separate the classes perfectly (and that lead to comparatively good performance for complex models).
◮ A problem of V2 is that the parameter is chosen based on one set of variables but applied to another set of variables.
◮ A problem of V1 is that, for well-separated data sets, all parameter values yield an error rate of 0% → no tuning is performed in this case.

SLIDE 14

Conclusion and outlook

◮ No definitive answer in terms of error rate.
◮ V2 is more intuitive but has some inconveniences and is time-consuming.
◮ Outlook: methods with intrinsic variable selection (such as the lasso) are implicitly based on V2. Do they also lead to overly complex models?
