  1. Introduction to Machine Learning CART: Stopping Criteria & Pruning compstat-lmu.github.io/lecture_i2ml

  2. OVERFITTING TREES

     The recursive partitioning procedure used to grow a CART would run until every leaf contains only a single observation.

     Problem 1: This would take a very long time, as the number of splits we have to try grows exponentially with the number of leaves in the tree.

     Problem 2: Well before that point we should stop splitting nodes into ever smaller child nodes: very complex trees with many branches and leaves will overfit the training data.

     Problem 3: However, it is very hard to tell where we should stop while we are growing the tree: before we actually try all possible additional splits further down a branch, we cannot know whether any of them will reduce the risk substantially (horizon effect).
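     The overfitting effect is easy to reproduce. The following is a minimal sketch (not part of the lecture; it assumes scikit-learn and synthetic data): a tree grown without any stopping criterion fits the training data almost perfectly but does worse on held-out data than a tree restricted by a simple leaf-size criterion.

    # Hypothetical sketch: an unrestricted CART memorizes the training data.
    import numpy as np
    from sklearn.tree import DecisionTreeRegressor
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import mean_squared_error

    rng = np.random.RandomState(0)
    X = rng.uniform(0, 10, size=(300, 1))
    y = np.sin(X[:, 0]) + rng.normal(scale=0.3, size=300)   # noisy target

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    # Grown until every leaf is pure (here: one observation per leaf)
    full = DecisionTreeRegressor(random_state=0).fit(X_tr, y_tr)
    # Grown with a simple stopping criterion on the minimum leaf size
    restricted = DecisionTreeRegressor(min_samples_leaf=20, random_state=0).fit(X_tr, y_tr)

    for name, tree in [("full", full), ("restricted", restricted)]:
        print(name, "leaves:", tree.get_n_leaves(),
              "train MSE:", round(mean_squared_error(y_tr, tree.predict(X_tr)), 3),
              "test MSE:", round(mean_squared_error(y_te, tree.predict(X_te)), 3))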

  3. STOPPING CRITERIA

     Problems 1 and 2 can be “solved” by defining different stopping criteria:

     - Stop once the tree has reached a certain number of leaves.
     - Don't try to split a node further if it contains too few observations.
     - Don't perform a split that results in child nodes with too few observations.
     - Don't perform a split unless it achieves a certain minimal improvement of the empirical risk in the child nodes compared to the empirical risk in the parent node.
     - Obviously: stop once all observations in a node have the same target value (pure node) or identical values for all features.
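     Each of these criteria corresponds to a hyperparameter in common CART implementations. As a hedged illustration (the lecture names no particular library), scikit-learn's DecisionTreeRegressor exposes them roughly as follows; the values are arbitrary examples.

    from sklearn.tree import DecisionTreeRegressor

    tree = DecisionTreeRegressor(
        max_leaf_nodes=16,           # stop once the tree has a certain number of leaves
        min_samples_split=20,        # don't try to split a node with too few observations
        min_samples_leaf=10,         # don't create child nodes with too few observations
        min_impurity_decrease=0.01,  # require a minimal risk improvement per split
    )
    # Pure nodes (all observations share the same target value) are never
    # split further, regardless of these settings.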

  4. PRUNING

     We try to solve Problem 3 by pruning: a method to select the optimal size of a tree.

     - Finding a combination of suitably strict stopping criteria (“pre-pruning”) is a hard problem: there are many different stopping criteria and it is hard to find the best combination (see the chapter on tuning).
     - Better: grow a large tree, then remove branches so that the resulting smaller tree has optimal cross-validation risk.
     - Feasible without cross-validation: grow a large tree, then remove branches so that the resulting smaller tree strikes a good balance between training set performance (risk) and complexity (i.e., number of terminal nodes). The trade-off between complexity and accuracy is governed by a complexity parameter.
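     As a hedged sketch of the cross-validation variant (not the lecture's own code): scikit-learn implements cost-complexity pruning through the ccp_alpha parameter, cost_complexity_pruning_path returns the candidate complexity parameters of a fully grown tree, and cross-validation picks among them. The synthetic dataset is only for illustration.

    import numpy as np
    from sklearn.datasets import make_friedman1
    from sklearn.tree import DecisionTreeRegressor
    from sklearn.model_selection import cross_val_score

    X, y = make_friedman1(n_samples=500, noise=1.0, random_state=0)

    # Grow a large tree and extract the complexity parameters on its pruning path
    path = DecisionTreeRegressor(random_state=0).fit(X, y).cost_complexity_pruning_path(X, y)

    best_alpha, best_score = 0.0, -np.inf
    for alpha in np.unique(np.clip(path.ccp_alphas, 0, None)):
        score = cross_val_score(
            DecisionTreeRegressor(ccp_alpha=alpha, random_state=0), X, y, cv=5
        ).mean()
        if score > best_score:
            best_alpha, best_score = alpha, score

    # Refit the pruned tree with the complexity parameter chosen by cross-validation
    pruned = DecisionTreeRegressor(ccp_alpha=best_alpha, random_state=0).fit(X, y)
    print("chosen alpha:", round(best_alpha, 4), "leaves:", pruned.get_n_leaves())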

  5. PRUNING

     [Figure: full regression tree. The root node (n=506, 100%, prediction 23) splits on rm < 6.9; the left child (n=430, 85%) is split on lstat >= 14 and the right child (n=76, 15%) on rm < 7.4, giving four leaves with predictions 15, 23, 32 and 45.]

     Full tree

  6. PRUNING

     [Figure: pruned tree. The root node (n=506, 100%) splits on rm < 6.9; only the left child (n=430, 85%) is split further, on lstat >= 14, giving three leaves with predictions 15, 23 and 37.]

     Pruning with complexity parameter = 0.072.

  7. PRUNING

     [Figure: pruned tree with a single split on rm < 6.9, giving two leaves: n=430 (85%, prediction 20) and n=76 (15%, prediction 37).]

     Pruning with complexity parameter = 0.171.

  8. PRUNING

     [Figure: only the root node remains (n=506, 100%, prediction 23).]

     Pruning with complexity parameter = 0.453.
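     The figures above show the same tree pruned at increasing values of the complexity parameter until only the root is left. A hedged sketch of the same effect on synthetic data, using scikit-learn's ccp_alpha (an analogous but differently scaled parameter, so its values are not comparable to the ones above):

    from sklearn.datasets import make_friedman1
    from sklearn.tree import DecisionTreeRegressor

    X, y = make_friedman1(n_samples=500, random_state=0)

    # Larger complexity parameters prune away more branches,
    # eventually leaving only the root node.
    for alpha in [0.0, 0.1, 1.0, 10.0]:
        tree = DecisionTreeRegressor(ccp_alpha=alpha, random_state=0).fit(X, y)
        print(f"ccp_alpha={alpha:>5}: {tree.get_n_leaves()} leaves")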
