A Study of Cross-Validation and Bootstrap for Accuracy Estimation - PowerPoint PPT Presentation

A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection (Kohavi, 1995)
Panagiotis Adamopoulos
Department of Information, Operations and Management Sciences, Stern School of Business, NYU
padamopo@stern.nyu.edu
February 27, 2012
Outline: Introduction, Methods for Accuracy Estimation, Methodology, Results and Discussion, Summary

Goal and Motivation
Review accuracy estimation methods and compare the two most common ones:
◮ cross-validation and
◮ bootstrap.
Estimating the accuracy of a classifier is important in order to
◮ predict its future prediction accuracy, where we would like low bias and low variance, and
◮ choose a classifier from a given set, where we are willing to trade off bias for low variance.

Accuracy
The accuracy of a classifier $C$ is the probability of correctly classifying a randomly selected instance,
◮ i.e., $\mathrm{acc} = \Pr(C(v) = y)$ for a randomly selected instance $\langle v, y \rangle \in X$.

Holdout
The holdout method partitions the data into a training set and a test set (or holdout set). The holdout estimated accuracy is defined as
$$\mathrm{acc}_h = \frac{1}{h} \sum_{\langle v_i, y_i \rangle \in D_h} \delta\big(I(D_t, v_i),\, y_i\big),$$
where $I(D, v)$ is the label assigned to an unlabeled instance $v$ by the classifier built by inducer $I$ on dataset $D$; $D_h$ is the holdout set, a subset of $D$ of size $h$; $D_t = D \setminus D_h$ is the training set; and $\delta(i, j) = 1$ if $i = j$ and 0 otherwise.
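The holdout definition can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not code from the paper: `majority_inducer` is a hypothetical stand-in for the inducer $I$, the toy dataset is invented, and the one-third test fraction is just a common choice.

```python
import random
from collections import Counter

def majority_inducer(train):
    """Hypothetical inducer I: builds a classifier that always predicts
    the most frequent label seen in the training set."""
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda v: label

def holdout_accuracy(data, inducer, test_fraction=1/3, seed=0):
    """Train on D_t, test on the holdout set D_h, return acc_h."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    h = max(1, round(len(shuffled) * test_fraction))   # size of D_h
    holdout, train = shuffled[:h], shuffled[h:]        # D_h and D_t = D \ D_h
    classifier = inducer(train)
    # Sum of delta(I(D_t, v_i), y_i) over the holdout set, divided by h.
    correct = sum(1 for v, y in holdout if classifier(v) == y)
    return correct / h

# Invented toy data: 45 instances labeled "a", 15 labeled "b".
data = [((i,), "a" if i % 4 else "b") for i in range(60)]
acc_h = holdout_accuracy(data, majority_inducer)
print(f"holdout accuracy estimate: {acc_h:.3f}")
```

Any function that maps a training list to a classifier can be passed in place of `majority_inducer`.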

Holdout
The more instances we leave for the test set, the higher the bias of our estimate, because the inducer sees less training data. However, fewer test set instances mean a wider confidence interval for the accuracy.
The holdout estimate depends on how the data is divided into a training set and a test set.
◮ In random subsampling, the holdout method is repeated k times and the estimated accuracy is derived by averaging the k runs.
◮ Random subsampling violates the assumption that instances in the test set are independent of those in the training set.
In practice, the dataset size is always finite, and usually smaller than we would like it to be. The holdout method makes inefficient use of the data: with the common choice of holding out a third of the dataset, that third is never used for training the inducer.
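To make the link between test set size and confidence interval width concrete, here is a rough sketch. It assumes the simple normal approximation to the binomial, which is not necessarily the exact interval used in the paper; the function name, the accuracy value 0.8, and the test set sizes are all illustrative.

```python
import math

def holdout_confidence_interval(acc_h, h, z=1.96):
    """Approximate interval for the true accuracy given a holdout estimate
    acc_h measured on h test instances (simple normal approximation)."""
    half_width = z * math.sqrt(acc_h * (1.0 - acc_h) / h)
    return max(0.0, acc_h - half_width), min(1.0, acc_h + half_width)

# The interval narrows as the test set grows (illustrative sizes).
for h in (50, 200, 1000):
    low, high = holdout_confidence_interval(0.8, h)
    print(f"h={h:5d}: [{low:.3f}, {high:.3f}]  width={high - low:.3f}")
```

The width shrinks roughly as one over the square root of h, which is why a small holdout set gives a noisy estimate even when its bias is low.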

Cross-Validation, Leave-one-out, and Stratification
In $k$-fold cross-validation, the dataset $D$ is randomly split into $k$ mutually exclusive folds $D_1, D_2, \ldots, D_k$ of approximately equal size. The inducer is trained and tested $k$ times: each time it is trained on $D \setminus D_t$ and tested on $D_t$, $t \in \{1, 2, \ldots, k\}$. Leave-one-out is the special case in which $k$ equals the number of instances.
The cross-validation estimate of accuracy is the overall number of correct classifications, divided by the number of instances in the dataset.
Repeating cross-validation multiple times using different splits into folds provides a better Monte Carlo estimate of complete cross-validation (the average over all possible choices of the held-out instances), at an added cost.
In stratified cross-validation, the folds are stratified so that they contain approximately the same proportions of labels as the original dataset.
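The procedure can be sketched as follows. This is an illustrative implementation under assumptions, not the paper's code: `majority_inducer` is a hypothetical inducer, the dataset is invented, and stratification is done by a simple sort-then-deal scheme, which is one of several reasonable ways to do it.

```python
import random
from collections import Counter, defaultdict

def majority_inducer(train):
    """Hypothetical inducer: always predicts the most frequent training label."""
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda v: label

def make_folds(data, k, stratified=False, seed=0):
    """Split the indices 0..n-1 into k mutually exclusive folds."""
    rng = random.Random(seed)
    indices = list(range(len(data)))
    rng.shuffle(indices)
    if stratified:
        # Group indices by label, then deal them round-robin so each fold
        # receives roughly the same share of every label.
        by_label = defaultdict(list)
        for i in indices:
            by_label[data[i][1]].append(i)
        indices = [i for label in sorted(by_label) for i in by_label[label]]
    return [indices[j::k] for j in range(k)]

def cross_validation_accuracy(data, inducer, k, stratified=False, seed=0):
    """Total correct classifications over all folds, divided by n."""
    correct = 0
    for fold in make_folds(data, k, stratified, seed):
        held_out = set(fold)
        train = [data[i] for i in range(len(data)) if i not in held_out]
        classifier = inducer(train)
        correct += sum(1 for i in fold if classifier(data[i][0]) == data[i][1])
    return correct / len(data)

# Invented toy data: 45 instances labeled "a", 15 labeled "b".
data = [((i,), "a" if i % 4 else "b") for i in range(60)]
acc_cv = cross_validation_accuracy(data, majority_inducer, k=10)
acc_strat = cross_validation_accuracy(data, majority_inducer, k=10, stratified=True)
print(acc_cv, acc_strat)
```

Setting `k=len(data)` turns the same function into leave-one-out.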

Cross-Validation, Leave-one-out, and Stratification
Proposition (Variance in $k$-fold cross-validation). If the inducer is stable under the perturbations caused by deleting the instances for the folds in $k$-fold cross-validation, the cross-validation estimate will be unbiased and the variance of the estimated accuracy will be approximately $\mathrm{acc}_{cv} \times (1 - \mathrm{acc}_{cv}) / n$, where $n$ is the number of instances in the dataset.
Corollary (Variance in cross-validation). If the inducer is stable under the perturbations caused by deleting the test instances for the folds in $k$-fold cross-validation for various values of $k$, then the variance of the estimates will be the same.
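As a rough worked example of the proposition, suppose the inducer is stable, so that the number of correct classifications over the $n$ instances behaves approximately like a binomial count. The values $\mathrm{acc}_{cv} = 0.8$ and $n = 500$ below are illustrative assumptions, not results from the paper.

```latex
\mathrm{Var}(\mathrm{acc}_{cv}) \approx \frac{\mathrm{acc}_{cv}\,(1 - \mathrm{acc}_{cv})}{n}
  = \frac{0.8 \times 0.2}{500} = 0.00032,
\qquad
\sigma \approx \sqrt{0.00032} \approx 0.018
```

The expression does not involve $k$, which is the content of the corollary: under the stability assumption, the number of folds does not change the variance.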

Bootstrap
Given a dataset of size $n$, a bootstrap sample is created by sampling $n$ instances uniformly from the data (with replacement). The probability that a given instance is never chosen is $(1 - 1/n)^n \approx e^{-1} \approx 0.368$, so a bootstrap sample is expected to contain about $0.632n$ distinct instances.
Given a number $b$, the number of bootstrap samples, let $\epsilon 0_i$ be the accuracy estimate for bootstrap sample $i$, obtained by training on the sample and testing on the instances left out of it, and let $\mathrm{acc}_s$ be the resubstitution accuracy on the full dataset. The 0.632 bootstrap estimate is defined as
$$\mathrm{acc}_{boot} = \frac{1}{b} \sum_{i=1}^{b} \left( 0.632 \times \epsilon 0_i + 0.368 \times \mathrm{acc}_s \right).$$
The assumptions made by bootstrap are basically the same as those of cross-validation, i.e., stability of the algorithm on the dataset.
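A minimal sketch of the 0.632 bootstrap estimate follows. It is an illustration under assumptions rather than the paper's implementation: `majority_inducer` is a hypothetical inducer, the dataset is invented, and the guard that skips a bootstrap sample with no left-out instances is an added safety choice.

```python
import random
from collections import Counter

def majority_inducer(train):
    """Hypothetical inducer: always predicts the most frequent training label."""
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda v: label

def accuracy(classifier, instances):
    return sum(1 for v, y in instances if classifier(v) == y) / len(instances)

def bootstrap_632(data, inducer, b=50, seed=0):
    """Average of 0.632 * eps0_i + 0.368 * acc_s over b bootstrap samples."""
    rng = random.Random(seed)
    n = len(data)
    # acc_s: resubstitution accuracy (train and test on the full dataset).
    acc_s = accuracy(inducer(data), data)
    terms = []
    for _ in range(b):
        sample_idx = [rng.randrange(n) for _ in range(n)]   # with replacement
        in_sample = set(sample_idx)
        left_out = [data[i] for i in range(n) if i not in in_sample]
        if not left_out:      # assumption: skip the (very unlikely) empty test set
            continue
        classifier = inducer([data[i] for i in sample_idx])
        eps0 = accuracy(classifier, left_out)               # eps0_i
        terms.append(0.632 * eps0 + 0.368 * acc_s)
    return sum(terms) / len(terms)

# Invented toy data: 45 instances labeled "a", 15 labeled "b".
data = [((i,), "a" if i % 4 else "b") for i in range(60)]
acc_boot = bootstrap_632(data, majority_inducer)
# Expected fraction of distinct instances in one bootstrap sample.
distinct_fraction = 1 - (1 - 1 / len(data)) ** len(data)
print(f"acc_boot={acc_boot:.3f}, distinct fraction={distinct_fraction:.3f}")
```

The weight 0.368 on the resubstitution accuracy is what can make the estimate optimistic when the inducer fits its training data very closely.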

Methodology
The authors use C4.5 and a Naive-Bayesian classifier to conduct a large-scale experiment.
Because the target concept is unknown for real-world concepts, the holdout method was used to estimate the quality of the cross-validation and bootstrap estimates.
Six datasets from a wide variety of domains, chosen so that the learning curve for both algorithms did not flatten out too early, plus a no-information dataset were used.
To see how well an accuracy estimation method performs, they sampled a training set from the dataset, ran the induction algorithm on it, and tested the classifier on the rest of the instances in the dataset.
This was repeated 50 times at points where the learning curve was sloping up.
The same folds in cross-validation and the same samples in bootstrap were used for both algorithms compared.
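The reference-accuracy protocol described above might look roughly like this in code. Everything here is a sketch under assumptions: `majority_inducer` stands in for C4.5 or Naive-Bayes, the dataset and training set size are invented, and only the figure of 50 repetitions is taken from the slide.

```python
import random
import statistics
from collections import Counter

def majority_inducer(train):
    """Hypothetical inducer standing in for a real learning algorithm."""
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda v: label

def reference_accuracy(data, inducer, train_size, repeats=50, seed=0):
    """Sample a training set without replacement, test on the remaining
    instances, and repeat; returns the mean and standard deviation."""
    rng = random.Random(seed)
    n = len(data)
    accuracies = []
    for _ in range(repeats):
        train_idx = set(rng.sample(range(n), train_size))
        train = [data[i] for i in range(n) if i in train_idx]
        rest = [data[i] for i in range(n) if i not in train_idx]
        classifier = inducer(train)
        accuracies.append(sum(1 for v, y in rest if classifier(v) == y) / len(rest))
    return statistics.mean(accuracies), statistics.stdev(accuracies)

# Invented toy data: 45 instances labeled "a", 15 labeled "b".
data = [((i,), "a" if i % 4 else "b") for i in range(60)]
ref_mean, ref_std = reference_accuracy(data, majority_inducer, train_size=40)
print(f"reference accuracy: {ref_mean:.3f} (std {ref_std:.3f})")
```

The bias of an estimation method can then be read as the gap between its estimate on the training set and this reference value.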

The Bias
Figure: (a) C4.5: The bias of cross-validation with varying folds. (b) C4.5: The bias of bootstrap with varying samples.

The Bias
The diagrams clearly show that $k$-fold cross-validation is pessimistically biased, especially for two and five folds.
Most of the estimates are reasonably good at 10 folds, and at 20 folds they are almost unbiased.
Stratified cross-validation behaved similarly, but was less pessimistic.

The Variance
Figure: (a) Cross-validation and (b) .632 Bootstrap: standard deviation of accuracy (population).

The Variance
Cross-validation has high variance at 2 folds on both C4.5 and Naive-Bayes.
On C4.5, there is high variance at the high end too (at leave-one-out and leave-two-out) for three of the seven datasets.
Stratification reduces the variance slightly, and thus seems to be uniformly better than regular cross-validation, both for bias and variance.

Summary
The results indicate that:
◮ stratification is generally a better scheme, both in terms of bias and variance, when compared to regular cross-validation,
◮ bootstrap has low variance, but extremely large bias on some problems, and
◮ the best method to use for model selection is ten-fold stratified cross-validation, even if computation power allows using more folds.

Thank you!

References
Kohavi, R. (1995). A study of cross-validation and bootstrap for accuracy estimation and model selection. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, Volume 2, San Francisco, CA, USA, pp. 1137–1143. Morgan Kaufmann Publishers Inc.
