

  1. Introduction to Machine Learning CART: Advantages & Disadvantages (compstat-lmu.github.io/lecture_i2ml)

  2. ADVANTAGES
  - Fairly easy to understand, interpret, and visualize.
  - Little preprocessing required:
    - automatic handling of non-numeric features
    - automatic handling of missing values via surrogate splits
    - no problems with outliers in features
    - monotone transformations of features change nothing, so feature scaling is irrelevant (see the sketch below)
  - Interaction effects between features are easily modeled, even at higher orders.
  - Can model discontinuities and non-linearities (but see the disadvantages).
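A minimal sketch of the invariance to monotone transformations, using scikit-learn and synthetic data (an illustration, not part of the slides): splits depend only on the ordering of feature values, so fitting on log-transformed features yields the same partition of the training points.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.uniform(0.1, 10, size=(200, 2))
y = (X[:, 0] * X[:, 1] > 20).astype(int)  # label via an interaction effect

tree_raw = DecisionTreeClassifier(random_state=0).fit(X, y)
tree_log = DecisionTreeClassifier(random_state=0).fit(np.log(X), y)

# log() is strictly monotone, so the candidate partitions and their
# impurities are identical; with the same seed, the fitted trees make
# the same predictions on the training points.
assert (tree_raw.predict(X) == tree_log.predict(np.log(X))).all()
```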

  3. ADVANTAGES
  - Performs automatic feature selection (illustrated below).
  - Quite fast; scales well to larger data.
  - Flexible through the definition of custom split criteria or leaf-node prediction rules: clustering trees, semi-supervised trees, density estimation, etc.
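A small illustration of automatic feature selection (a sketch on synthetic data, assuming scikit-learn; not from the slides): noise features that never improve a split end up with (near-)zero importance.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 5))
y = 3 * X[:, 0] + np.sin(3 * X[:, 1])  # only features 0 and 1 are informative

tree = DecisionTreeRegressor(max_depth=4, random_state=0).fit(X, y)
# Features 2-4 receive (near-)zero importance, e.g. [0.93 0.06 0. 0. 0.]
print(tree.feature_importances_.round(3))
```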

  4. DISADVANTAGE: LINEAR DEPENDENCIES
  [Figure: scatter plot of x2 vs. x1, points colored by class y ∈ {0, 1}; the two classes are separated by a linear (diagonal) boundary.]
  Linear dependencies must be modeled over several splits. Logistic regression would model this easily.
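A hypothetical sketch of this point (synthetic data, scikit-learn assumed): a diagonal boundary y = 1(x1 > x2) forces a tree to approximate it with many axis-parallel splits, while logistic regression captures it with a single linear term.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
X_train = rng.uniform(size=(1000, 2))
y_train = (X_train[:, 0] > X_train[:, 1]).astype(int)
X_test = rng.uniform(size=(1000, 2))
y_test = (X_test[:, 0] > X_test[:, 1]).astype(int)

# A shallow tree approximates the diagonal with a coarse staircase ...
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)
print(tree.score(X_test, y_test))    # noticeably below 1.0

# ... while a single linear decision function suffices.
logreg = LogisticRegression().fit(X_train, y_train)
print(logreg.score(X_test, y_test))  # close to 1.0
```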

  5. DISADVANTAGE: SMOOTH FUNCTIONS
  [Figure: regression tree fit to data points (x, y); the prediction is a step function.]
  Prediction functions of trees are never smooth, as they are always step functions.
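A minimal sketch (assuming scikit-learn; not from the slides): fitting a regression tree to a smooth target makes the step-function nature of the prediction directly visible.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

x = np.linspace(0, 2 * np.pi, 50).reshape(-1, 1)
y = np.sin(x).ravel()

tree = DecisionTreeRegressor(max_depth=2).fit(x, y)
# A depth-2 tree has at most 4 leaves, so the prediction takes at most
# 4 distinct values -- a step function, no matter how smooth sin(x) is.
print(np.unique(tree.predict(x)))
```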

  6. DISADVANTAGES
  - Empirically not the best predictor: combine trees with bagging (random forests) or boosting!
  - High instability (variance) of the trees: small changes in the training data can lead to completely different trees. This reduces trust in interpretations and is one reason why the prediction error of single trees is usually not the best.
  - In regression: trees define piecewise constant functions, so they often do not extrapolate well (see the sketch below).
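A sketch of the extrapolation issue (synthetic data, scikit-learn assumed): outside the training range, a regression tree keeps predicting the value of its nearest leaf.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

x = np.linspace(0, 10, 100).reshape(-1, 1)
y = 2 * x.ravel()  # a simple linear trend

tree = DecisionTreeRegressor(max_depth=5, random_state=0).fit(x, y)
# All three predictions are roughly 20: any x beyond the training range
# falls into the rightmost leaf, whose value is fixed at training time.
print(tree.predict(np.array([[10.0], [50.0], [1000.0]])))
```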

  7. FURTHER TREE METHODOLOGIES
  - AID (Sonquist and Morgan, 1964)
  - CHAID (Kass, 1980)
  - CART (Breiman et al., 1984)
  - C4.5 (Quinlan, 1993)
  - Unbiased Recursive Partitioning (Hothorn et al., 2006)

  8. CART: SYNOPSIS
  - Hypothesis space: CART models are step functions over a rectangular partition of the feature space X. Their maximal complexity is controlled by the stopping criteria and the pruning method.
  - Risk: Trees can use any kind of loss function for regression or classification.
  - Optimization: Exhaustive search over all possible splits in each node, minimizing the empirical risk in the child nodes. Most of the literature on CART is based on "impurity reduction", which is mathematically equivalent to empirical risk minimization: Gini impurity ≅ Brier score loss, entropy impurity ≅ Bernoulli loss, variance impurity ≅ L2 loss (a numerical check follows below).
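A small numerical check of the Gini ≅ Brier correspondence (a sketch, not from the slides): for a node with class proportions p, the Gini impurity sum_k p_k(1 - p_k) equals the mean Brier score of predicting p for every observation in the node.

```python
import numpy as np

y = np.array([0, 0, 0, 1, 1, 2])        # class labels in one node
p = np.bincount(y) / len(y)             # class proportions in the node

gini = np.sum(p * (1 - p))              # Gini impurity

onehot = np.eye(3)[y]                   # one-hot encoding of the labels
brier = np.mean(np.sum((p - onehot) ** 2, axis=1))  # mean Brier score

print(gini, brier)  # both equal 0.6111...
```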
