

SLIDE 1

Introduction to Machine Learning CART: Stopping Criteria & Pruning

compstat-lmu.github.io/lecture_i2ml

SLIDE 2

OVERFITTING TREES

The recursive partitioning procedure used to grow a CART would run until every leaf contains only a single observation.

  • Problem 1: This would take a very long time, as the number of splits we have to try grows exponentially with the number of leaves in the tree.

  • Problem 2: At some point before that, we should stop splitting nodes into ever smaller child nodes: very complex trees with lots of branches and leaves will overfit the training data.

  • Problem 3: However, it is very hard to tell where we should stop while we are growing the tree: before we actually try all possible additional splits further down a branch, we cannot know whether any one of them will be able to reduce the risk by a lot (horizon effect).
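As a quick illustration of Problem 2 (a minimal sketch in Python/scikit-learn, not part of the original slides; the simulated data set stands in for the lecture's housing data):

```python
# Minimal sketch (assumption: simulated Friedman #1 data in place of the
# lecture's housing data): a fully grown CART fits the training data
# (almost) perfectly but generalizes worse -- it overfits.
from sklearn.datasets import make_friedman1
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

X, y = make_friedman1(n_samples=500, noise=1.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# With default settings, splitting continues until every leaf is pure,
# i.e. until (almost) every leaf contains a single observation.
full_tree = DecisionTreeRegressor(random_state=0).fit(X_train, y_train)

print("leaves:   ", full_tree.get_n_leaves())
print("train MSE:", mean_squared_error(y_train, full_tree.predict(X_train)))
print("test MSE: ", mean_squared_error(y_test, full_tree.predict(X_test)))
# Expect a training MSE near 0 and a clearly larger test MSE.
```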
SLIDE 3

STOPPING CRITERIA

Problems 1 and 2 can be "solved" by defining different stopping criteria:

  • Stop once the tree has reached a certain number of leaves.

  • Don't try to split a node further if it contains too few observations.

  • Don't perform a split that results in child nodes with too few observations.

  • Don't perform a split unless it achieves a certain minimal improvement of the empirical risk in the child nodes, compared to the empirical risk in the parent node.

  • Obviously: stop once all observations in a node have the same target value (pure node) or identical values for all features.
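For concreteness, here is how these criteria map onto hyperparameters of scikit-learn's CART implementation (a minimal sketch, not from the lecture; the parameter values are arbitrary assumptions, not recommendations):

```python
# Minimal sketch: each stopping criterion above corresponds to a
# hyperparameter of scikit-learn's CART (values are arbitrary examples).
from sklearn.tree import DecisionTreeRegressor

tree = DecisionTreeRegressor(
    max_leaf_nodes=20,           # stop once the tree has 20 leaves
    min_samples_split=10,        # don't split a node with < 10 observations
    min_samples_leaf=5,          # don't create child nodes with < 5 observations
    min_impurity_decrease=0.01,  # require a minimal risk improvement per split
    random_state=0,
)
# Pure nodes (identical target values) and nodes with identical feature
# values always stop splitting automatically; no parameter is needed.
```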
SLIDE 4

PRUNING

We try to solve Problem 3 by pruning, a method to select the optimal size of a tree:

  • Finding a combination of suitably strict stopping criteria ("pre-pruning") is a hard problem: there are many different stopping criteria, and it is hard to find the best combination (see the chapter on tuning).

  • Better: grow a large tree, then remove branches so that the resulting smaller tree has optimal cross-validation risk.

  • Feasible without cross-validation: grow a large tree, then remove branches so that the resulting smaller tree has a good balance between training set performance (risk) and complexity (i.e., number of terminal nodes). The trade-off between complexity and accuracy is governed by a complexity parameter.
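A minimal sketch of the post-pruning idea, using scikit-learn's minimal cost-complexity pruning (here ccp_alpha plays the role of the complexity parameter; the simulated data set is an assumption): grow a large tree, compute its pruning path, and select the complexity parameter by cross-validation.

```python
# Minimal sketch (assumption: simulated data): grow a large tree, compute
# the cost-complexity pruning path, and pick alpha by cross-validation.
import numpy as np
from sklearn.datasets import make_friedman1
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeRegressor

X, y = make_friedman1(n_samples=500, noise=1.0, random_state=0)

# The pruning path of the fully grown tree yields all candidate alphas:
# each alpha corresponds to one subtree of the large tree.
path = DecisionTreeRegressor(random_state=0).cost_complexity_pruning_path(X, y)
alphas = np.unique(path.ccp_alphas)

# 5-fold cross-validation over the candidate alphas selects the subtree
# with the best estimated generalization risk.
search = GridSearchCV(
    DecisionTreeRegressor(random_state=0),
    param_grid={"ccp_alpha": alphas},
    scoring="neg_mean_squared_error",
    cv=5,
).fit(X, y)

pruned = search.best_estimator_
print("chosen alpha:", search.best_params_["ccp_alpha"])
print("leaves in pruned tree:", pruned.get_n_leaves())
```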
SLIDE 5

PRUNING

[Figure: regression tree diagram. Root node: prediction 23, n = 506, 100%; split rm < 6.9 (yes/no). Left child: 20, n = 430, 85%; split lstat >= 14 into leaves 15 (n = 175, 35%) and 23 (n = 255, 50%). Right child: 37, n = 76, 15%; split rm < 7.4 into leaves 32 (n = 46, 9%) and 45 (n = 30, 6%).]

Full tree
SLIDE 6

PRUNING

[Figure: regression tree diagram. Root node: prediction 23, n = 506, 100%; split rm < 6.9 (yes/no). Left child: 20, n = 430, 85%; split lstat >= 14 into leaves 15 (n = 175, 35%) and 23 (n = 255, 50%). Right child: now a leaf, 37, n = 76, 15%.]

Pruning with complexity parameter = 0.072.
SLIDE 7

PRUNING

[Figure: regression tree diagram. Root node: prediction 23, n = 506, 100%; split rm < 6.9 (yes/no) into two leaves: 20 (n = 430, 85%) and 37 (n = 76, 15%).]

Pruning with complexity parameter = 0.171.
SLIDE 8

PRUNING

[Figure: the tree has been pruned down to a single root node: prediction 23, n = 506, 100%.]

Pruning with complexity parameter = 0.453.
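The sequence in Slides 5 through 8 can be mimicked as follows (a minimal sketch on simulated data; note that rpart-style cp values like those on the slides are normalized differently from scikit-learn's ccp_alpha, so the alpha values below are arbitrary, merely increasing assumptions): as the complexity parameter grows, the pruned tree shrinks until only the root remains.

```python
# Minimal sketch (assumptions: simulated data; alpha values chosen only to
# be increasing, not converted from the slides' cp values): larger
# complexity parameters prune the tree further, down to a single root node.
from sklearn.datasets import make_friedman1
from sklearn.tree import DecisionTreeRegressor

X, y = make_friedman1(n_samples=506, noise=1.0, random_state=0)

for alpha in [0.0, 0.5, 2.0, 20.0]:
    tree = DecisionTreeRegressor(ccp_alpha=alpha, random_state=0).fit(X, y)
    print(f"alpha = {alpha:5.1f}: {tree.get_n_leaves():4d} leaves, "
          f"depth {tree.get_depth()}")
```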