Introduction to Machine Learning
Random Forest: Introduction
compstat-lmu.github.io/lecture_i2ml
RANDOM FORESTS
Modification of bagging for trees proposed by Breiman (2001):
- Tree base learners on bootstrap samples of the data
- Uses decorrelated trees by randomizing splits (see below)
- Tree base learners are usually fully expanded, without aggressive early stopping or pruning, so each tree has low bias; the resulting high variance of the individual trees is reduced by averaging over the ensemble
- Introduction to Machine Learning – 1 / 7
RANDOM FEATURE SAMPLING
From our analysis of bagging risk we can see that decorrelating trees improves the ensemble.
Simple randomized approach: at each node of each tree, randomly draw mtry ≤ p candidate features to consider for splitting.
Recommended values:
- Classification: mtry = ⌊√p⌋
- Regression: mtry = ⌊p/3⌋
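The per-node feature sampling above can be sketched in a few lines of Python. This is an illustrative sketch, not the lecture's implementation; `mtry_default` and `candidate_features` are hypothetical helper names.

```python
import math
import numpy as np

def mtry_default(p, task="classification"):
    # Recommended defaults from the slide (hypothetical helper):
    # floor(sqrt(p)) for classification, floor(p / 3) for regression.
    return math.floor(math.sqrt(p)) if task == "classification" else math.floor(p / 3)

def candidate_features(p, mtry, rng):
    # At a single node: draw mtry of the p features, without replacement.
    return rng.choice(p, size=mtry, replace=False)

rng = np.random.default_rng(0)
p = 16
print(mtry_default(p))                # classification default: floor(sqrt(16)) = 4
print(mtry_default(p, "regression"))  # regression default: floor(16 / 3) = 5
print(candidate_features(p, 4, rng))  # 4 distinct feature indices in [0, 16)
```

Each node draws its own candidate set, which is what decorrelates the trees: two trees grown on similar bootstrap samples will still split on different features.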
EFFECT OF ENSEMBLE SIZE

[Figure: decision boundary of 1 tree on the Iris dataset (Sepal.Width vs. Sepal.Length; classes setosa, versicolor, virginica)]

[Figure: decision boundary of 10 trees on the Iris dataset (same axes and classes)]

[Figure: decision boundary of 500 trees on the Iris dataset (same axes and classes)]
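The figures above can be re-created with a few lines of code. A minimal sketch, assuming scikit-learn is available (the lecture's own plots come from a different toolchain): refit the forest with a growing number of trees on the two Iris features shown in the figures.

```python
# Sketch (assumes scikit-learn is installed): grow forests of increasing size
# on the two Iris features from the figures; the decision boundary smooths out
# as the ensemble grows.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
X = X[:, :2]  # Sepal.Length, Sepal.Width -- the two features shown above

for n_trees in (1, 10, 500):
    rf = RandomForestClassifier(n_estimators=n_trees, random_state=1)
    rf.fit(X, y)
    # Training accuracy; the point of more trees is a smoother, more stable
    # boundary, not necessarily higher training accuracy.
    print(n_trees, rf.score(X, y))
```

Plotting the predicted class over a grid of (Sepal.Length, Sepal.Width) values for each fitted forest reproduces the three panels.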
OUT-OF-BAG ERROR ESTIMATE
With random forests it is possible to obtain approximately unbiased estimates of the generalization error directly during training, based on the out-of-bag observations for each tree:
[Figure: OOB estimate of the mean misclassification error (MCE) vs. number of trees for the spam dataset (classes spam / nonspam)]
[Schematic: bootstrap samples for Tree 1, Tree 2, Tree 3, ..., Tree M, each drawn from observations 1, 2, 3, 4, ..., n]

In-bag observations are used to build the trees (remember: the same observation can enter the in-bag sample more than once).
Out-of-bag observations are used to evaluate prediction performance.
OOB size: for each tree, the probability that a given observation is not drawn into its bootstrap sample is
P(not drawn) = (1 − 1/n)^n → 1/e ≈ 0.37 as n → ∞
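A quick numeric check of this limit (illustrative only):

```python
# The probability that a fixed observation is never drawn in n draws with
# replacement is (1 - 1/n)^n, which approaches 1/e as n grows.
import math

for n in (10, 100, 10_000):
    print(n, (1 - 1 / n) ** n)

print(math.exp(-1))  # limit: ~0.3679
```

So each tree sees roughly 63% of the observations in-bag and leaves roughly 37% out-of-bag.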
Predict each observation with the trees that did not use it for training and compute the average loss of these predictions.
Since each tree sees about 2/3 of the data, this is similar to 3-fold CV and can be used for quick model selection.
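In practice the OOB estimate comes for free at training time. A sketch assuming scikit-learn, whose `RandomForestClassifier` exposes it via `oob_score=True`:

```python
# Sketch (assumes scikit-learn): request the OOB estimate during training;
# oob_score_ is the OOB accuracy, so 1 - oob_score_ is the OOB error.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
rf = RandomForestClassifier(n_estimators=100, oob_score=True, random_state=1)
rf.fit(X, y)
print(rf.oob_score_)      # OOB accuracy estimate
print(1 - rf.oob_score_)  # OOB misclassification error estimate
```

No separate hold-out set or CV loop is needed, which is what makes OOB attractive for quick model selection.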