  1. Introduction to Machine Learning Random Forest: Benchmarking Trees, Forests, and Bagging K-NN (compstat-lmu.github.io/lecture_i2ml)

  2. BENCHMARK: RANDOM FOREST VS. (BAGGED) CART VS. (BAGGED) K-NN
     Goal: compare the performance of a random forest against (bagged) stable and (bagged) unstable methods.
     Algorithms:
     - classification tree (CART, implemented in rpart; maxdepth: 30, minsplit: 20, cp: 0.01)
     - bagged classification tree using 50 bagging iterations (bagged.rpart)
     - k-nearest neighbors (k-NN, implemented in kknn; k = 7)
     - bagged k-nearest neighbors using 50 bagging iterations (bagged.knn)
     - random forest with 50 trees (implemented in randomForest)
     Method to evaluate performance: 10-fold cross-validation
     Performance measure: mean misclassification error on the test sets
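The slides do not show the benchmark code itself; the following is a minimal R sketch of the setup described above, shown for one dataset. The packages and hyperparameters are the ones named on the slide; the seed, the fold assignment, and the simple majority-vote bagging wrapper are illustrative assumptions, not the lecture's implementation.

```r
library(rpart)          # CART
library(kknn)           # k-nearest neighbors
library(randomForest)   # random forest
library(mlbench)        # benchmark datasets

data(Glass)                                         # one of the four datasets; target: Type
dat <- Glass
set.seed(1)                                         # illustrative seed
folds <- sample(rep(1:10, length.out = nrow(dat)))  # random 10-fold CV assignment

# Majority-vote bagging wrapper (illustrative): refit the base learner on
# bootstrap samples of the training set and aggregate test-set predictions.
bagged_predict <- function(fit_fun, train, test, iters = 50) {
  votes <- sapply(seq_len(iters), function(i) {
    boot <- train[sample(nrow(train), replace = TRUE), ]
    as.character(fit_fun(boot, test))
  })
  factor(apply(votes, 1, function(v) names(which.max(table(v)))),
         levels = levels(dat$Type))
}

# Base learners with the hyperparameters stated on the slide
fit_cart <- function(train, test) {
  mod <- rpart(Type ~ ., data = train,
               control = rpart.control(maxdepth = 30, minsplit = 20, cp = 0.01))
  predict(mod, test, type = "class")
}
fit_knn <- function(train, test) {
  kknn(Type ~ ., train = train, test = test, k = 7)$fitted.values
}

learners <- c("rpart", "rpart.bagged", "kknn", "kknn.bagged", "rf")
mmce <- matrix(NA, nrow = 10, ncol = 5, dimnames = list(NULL, learners))
for (f in 1:10) {
  train <- dat[folds != f, ]
  test  <- dat[folds == f, ]
  mmce[f, "rpart"]        <- mean(fit_cart(train, test) != test$Type)
  mmce[f, "rpart.bagged"] <- mean(bagged_predict(fit_cart, train, test) != test$Type)
  mmce[f, "kknn"]         <- mean(fit_knn(train, test) != test$Type)
  mmce[f, "kknn.bagged"]  <- mean(bagged_predict(fit_knn, train, test) != test$Type)
  rf <- randomForest(Type ~ ., data = train, ntree = 50)
  mmce[f, "rf"]           <- mean(predict(rf, test) != test$Type)
}
colMeans(mmce)   # mean misclassification error per learner
```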

  3. BENCHMARK: RANDOM FOREST VS. (BAGGED) CART VS. (BAGGED) K-NN
     Datasets from mlbench:
     - Glass (glass identification data, n = 214, p = 10): predict the type of glass (6 levels) on the basis of the chemical analysis of the glasses, represented by the 10 features.
     - Ionosphere (radar data, n = 351, p = 35): predict whether the radar returns show evidence of some type of structure in the ionosphere ("good") or not ("bad").
     - Sonar (sonar data, n = 208, p = 61): discriminate between sonar signals bounced off a metal cylinder ("M") and those bounced off a cylindrical rock ("R").
     - Waveform (artificial data, n = 100, p = 21): simulated 3-class problem which is considered to be a difficult pattern recognition problem; each class is generated by the waveform generator.
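For reference, the four datasets can be obtained from mlbench as below (a sketch; the Waveform sample size of 100 follows the table above, with mlbench.waveform used as the generator):

```r
library(mlbench)
data(Glass)       # 214 obs.; target Type (6 levels), features from chemical analysis
data(Ionosphere)  # 351 obs.; target Class ("good"/"bad")
data(Sonar)       # 208 obs.; target Class ("M"/"R")

# Waveform is a generator rather than a fixed data set: draw 100 observations
# from the 3-class waveform simulator.
wf <- mlbench.waveform(n = 100)
Waveform <- data.frame(wf$x, Class = wf$classes)
str(Waveform)     # 21 numeric features plus the class label
```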

  4. BENCHMARK: RANDOM FOREST VS. (BAGGED) CART VS. (BAGGED) K-NN
     [Figure: boxplots of the mean misclassification error across the 10 cross-validation folds for rpart, rpart.bagged, kknn, kknn.bagged, and rf; one panel per dataset (Glass, Ionosphere, Sonar, Waveform).]

  5. BENCHMARK: RANDOM FOREST VS. (BAGGED) CART VS. (BAGGED) K-NN
     Bagging k-NN does not improve performance because k-NN is stable w.r.t. perturbations of the training data. In a 2-class problem, a nearest-neighbor classification only changes under bagging if two things happen simultaneously:
     - the nearest neighbor in the learning set is missing from at least half of the bootstrap samples; however, the probability that any given observation appears in a bootstrap sample is 1 - (1 - 1/n)^n ≈ 1 - e^(-1) ≈ 63%, which is greater than 50%; and
     - in those bootstrap samples, the new nearest neighbor(s) all have a different label than the missing nearest neighbor, which is unlikely for most regions of X × Y.
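This argument can be checked numerically. Below is an illustrative R simulation (not part of the lecture material) that verifies the bootstrap inclusion probability and estimates how often majority-vote bagging actually flips a 1-NN prediction; the synthetic 2-class data and all constants are made-up choices for illustration.

```r
set.seed(1)
n <- 214   # e.g., the size of the Glass data

# Probability that a fixed observation appears in a bootstrap sample
1 - (1 - 1/n)^n                                           # analytic, ~0.632
mean(replicate(1e4, 1 %in% sample(n, replace = TRUE)))    # simulated check

# How often does majority-vote bagging flip a 1-NN prediction on toy data?
library(class)                                 # for knn()
X  <- matrix(rnorm(2 * n), n, 2)
y  <- factor(X[, 1] + rnorm(n, sd = 0.5) > 0)  # synthetic 2-class labels
Xt <- matrix(rnorm(2 * 100), 100, 2)           # 100 test points
plain <- knn(X, Xt, y, k = 1)
votes <- replicate(50, {                       # 50 bagging iterations
  b <- sample(n, replace = TRUE)
  as.character(knn(X[b, ], Xt, y[b], k = 1))
})
bagged <- factor(apply(votes, 1, function(v) names(which.max(table(v)))),
                 levels = levels(y))
mean(plain != bagged)   # small flip rate: bagging barely changes 1-NN
```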
