The intuition behind tree-based methods


  1. The intuition behind tree-based methods
     SUPERVISED LEARNING IN R: REGRESSION
     Nina Zumel and John Mount, Win-Vector, LLC

  2. Example: Predict animal intelligence from Gestation Time and Litter Size

  3. Decision Trees
     Rules of the form: if a AND b AND c THEN y
     Trees express non-linear concepts: intervals, non-monotonic relationships, non-additive interactions
     AND: similar to multiplication

  4. Decision Trees
     IF Litter < 1.15 AND Gestation ≥ 268 → intelligence = 0.315
     IF Litter IN [1.15, 4.3) → intelligence = 0.131
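     Rules like these come from a fitted regression tree. A minimal sketch, assuming a data frame animals with columns intelligence, Gestation, and Litter (names are illustrative, not the course's exact code):

         library(rpart)

         # Fit a regression tree for intelligence as a function of
         # gestation time and litter size
         tree <- rpart(intelligence ~ Gestation + Litter, data = animals)

         # Printing the tree shows the learned IF/THEN splits,
         # one rule per leaf, like the two rules above
         print(tree)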

  5. Decision Trees
     Pro: Trees Have an Expressive Concept Space

     Model   RMSE
     linear  0.1200419
     tree    0.1072732

  6. Decision Trees
     Con: Coarse-Grained Predictions

  7. It's Hard for Trees to Express Linear Relationships
     Trees Predict Axis-Aligned Regions

  8. It's Hard for Trees to Express Linear Relationships
     It's Hard to Express Lines with Steps

  9. Other Issues with Trees
     Tree with too many splits (deep tree): too complex, danger of overfit
     Tree with too few splits (shallow tree): predictions too coarse-grained
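     In rpart, for example, this trade-off is controlled through the complexity and depth settings. A hedged sketch (parameter values are illustrative):

         library(rpart)

         # A deep tree: no complexity penalty, maximum allowed depth.
         # More expressive, but risks overfit.
         deep <- rpart(y ~ ., data = train,
                       control = rpart.control(cp = 0, maxdepth = 30))

         # A shallow tree: only two levels of splits.
         # Safer, but predictions will be coarse-grained.
         shallow <- rpart(y ~ ., data = train,
                          control = rpart.control(maxdepth = 2))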

  10. Ensembles of Trees
      Ensembles Give Finer-Grained Predictions than Single Trees

  11. Ensembles of Trees
      An Ensemble Model Fits the Animal Intelligence Data Better than a Single Tree

      Model          RMSE
      linear         0.1200419
      tree           0.1072732
      random forest  0.0901681

  12. Let's practice!

  13. Random forests
      SUPERVISED LEARNING IN R: REGRESSION
      Nina Zumel and John Mount, Win-Vector, LLC

  14. Random Forests
      Multiple diverse decision trees averaged together:
      reduces overfit
      increases model expressiveness
      gives finer-grained predictions

  15. Building a Random Forest Model
      1. Draw a bootstrapped sample from the training data.
      2. For each sample, grow a tree. At each node, pick the best variable to split on (from a random subset of all variables). Continue until the tree is grown.
      3. To score a datum, evaluate it with all the trees and average the results.
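      A minimal hand-rolled sketch of steps 1-3, assuming a data frame train with outcome y. (rpart considers all variables at every split, so the per-node random variable subset of step 2 is omitted; this is plain bagging, which ranger() extends to the full algorithm.)

          library(rpart)

          n_trees <- 100
          trees <- lapply(seq_len(n_trees), function(i) {
            # 1. Draw a bootstrapped sample from the training data
            boot <- train[sample(nrow(train), replace = TRUE), ]
            # 2. Grow a tree on the sample
            rpart(y ~ ., data = boot)
          })

          # 3. To score a datum, evaluate it with all the trees
          #    and average the results
          score <- function(newdata) {
            rowMeans(sapply(trees, predict, newdata = newdata))
          }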

  16. Example: Bike Rental Data

      cnt ~ hr + holiday + workingday + weathersit + temp + atemp + hum + windspeed
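      One way to build such a formula programmatically (a sketch; fmla is the name used on the next slide):

          outcome <- "cnt"
          vars <- c("hr", "holiday", "workingday", "weathersit",
                    "temp", "atemp", "hum", "windspeed")

          # Paste the pieces together and convert to a formula object
          (fmla <- as.formula(paste(outcome, "~", paste(vars, collapse = " + "))))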

  17. Random Forests with ranger()

      model <- ranger(fmla, bikesJan,
                      num.trees = 500,
                      respect.unordered.factors = "order")

      formula, data: the model formula and the training data
      num.trees (default 500): use at least 200
      mtry: number of variables to try at each node; default: square root of the total number of variables
      respect.unordered.factors: recommend setting to "order" ("safe" hashing of categorical variables)

  18. Random Forests with ranger(): the model object

      Ranger result
      ...
      OOB prediction error (MSE): 3103.623
      R squared (OOB):            0.7837386

      The random forest algorithm returns estimates of out-of-sample performance.
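      The OOB estimates printed above can also be read directly off the fitted object (field names as documented by the ranger package):

          model$prediction.error        # OOB prediction error (MSE)
          model$r.squared               # OOB R-squared
          sqrt(model$prediction.error)  # OOB RMSE, on the scale of the outcome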

  19. Predicting with a ranger() model

      bikesFeb$pred <- predict(model, bikesFeb)$predictions

      predict() inputs: the model and the data
      Predictions can be accessed in the element predictions.

  20. Evaluating the model

      Calculate RMSE:

      bikesFeb %>%
        mutate(residual = pred - cnt) %>%
        summarize(rmse = sqrt(mean(residual^2)))

            rmse
      1 67.15169

      Model           RMSE
      Quasipoisson    69.3
      Random forests  67.15

  21. Evaluating the model (figure)

  22. Evaluating the model (figure)

  23. Let's practice!

  24. One-Hot-Encoding Categorical Variables
      SUPERVISED LEARNING IN R: REGRESSION
      Nina Zumel and John Mount, Win-Vector, LLC

  25. Why Convert Categoricals Manually?
      Most R functions manage the conversion for you, via model.matrix()
      xgboost() does not: you must convert categorical variables to a numeric representation yourself
      Conversion to indicators: one-hot encoding
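      To see what the indicator representation looks like, base R's model.matrix() can be run by hand on a toy data frame (a sketch; the data values are illustrative):

          df <- data.frame(x = c("one", "two", "three", "two"))

          # "- 1" drops the intercept so every level gets its own 0/1 column
          model.matrix(~ x - 1, data = df)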

  26. One-hot-encoding and data cleaning with vtreat
      Basic idea:
      designTreatmentsZ() designs a treatment plan from the training data
      prepare() then creates "clean" data: all numerical, no missing values
      Use prepare() with the treatment plan on all future data

  27. A Small vtreat Example

      Training Data               Test Data
      x      u    y               x      u    y
      one   44    0.4855671       one    5    2.6488148
      two   24    1.3683726       three 12    1.5012938
      three 66    2.0352837       one   56    0.1993731
      two   22    1.6396267       two   28    1.2778516

  28. Create the Treatment Plan

      vars <- c("x", "u")
      treatplan <- designTreatmentsZ(dframe, vars, verbose = FALSE)

      Inputs to designTreatmentsZ():
      dframe: training data
      varlist: list of input variable names (here, vars)
      set verbose = FALSE to suppress progress messages

  29. Get the New Variables

      The scoreFrame describes the variable mapping and types:

      (scoreFrame <- treatplan$scoreFrame %>%
         select(varName, origName, code))

              varName origName  code
      1   x_lev_x.one        x   lev
      2 x_lev_x.three        x   lev
      3   x_lev_x.two        x   lev
      4        x_catP        x  catP
      5       u_clean        u clean

      Get the names of the new lev and clean variables:

      (newvars <- scoreFrame %>%
         filter(code %in% c("clean", "lev")) %>%
         use_series(varName))

      "x_lev_x.one" "x_lev_x.three" "x_lev_x.two" "u_clean"

  30. Prepare the Training Data for Modeling

      training.treat <- prepare(treatplan, dframe, varRestriction = newvars)

      Inputs to prepare():
      treatmentplan: the treatment plan
      dframe: the data frame to treat
      varRestriction: list of variables to prepare (optional; default: prepare all variables)

  31. Before and After Data Treatment

      Training Data               Treated Training Data
      x      u    y               x_lev_x.one x_lev_x.three x_lev_x.two u_clean
      one   44    0.4855671       1           0             0           44
      two   24    1.3683726       0           0             1           24
      three 66    2.0352837       0           1             0           66
      two   22    1.6396267       0           0             1           22

  32. Prepare the Test Data Before Model Application

      (test.treat <- prepare(treatplan, test, varRestriction = newvars))

        x_lev_x.one x_lev_x.three x_lev_x.two u_clean
      1           1             0           0       5
      2           0             1           0      12
      3           1             0           0      56
      4           0             0           1      28

  33. vtreat Treatment is Robust

      Previously unseen x level: four
      four encodes to (0, 0, 0)

      prepare(treatplan, toomany, ...)

      Data (toomany)              Treated Data
      x      u    y               x_lev_x.one x_lev_x.three x_lev_x.two u_clean
      one    4    0.2331301       1           0             0            4
      two   14    1.9331760       0           0             1           14
      three 66    3.1251029       0           1             0           66
      four  25    4.0332491       0           0             0           25

  34. Let's practice!

  35. Gradient boosting machines
      SUPERVISED LEARNING IN R: REGRESSION
      Nina Zumel and John Mount, Win-Vector, LLC

  36. How Gradient Boosting Works
      1. Fit a shallow tree T_1 to the data: M_1 = T_1

  37. How Gradient Boosting Works
      1. Fit a shallow tree T_1 to the data: M_1 = T_1
      2. Fit a tree T_2 to the residuals. Find γ such that M_2 = M_1 + γ T_2 is the best fit to the data.

  38. How Gradient Boosting Works
      Regularization: learning rate η ∈ (0, 1)

          M_2 = M_1 + η γ T_2

      Larger η: faster learning
      Smaller η: less risk of overfit

  39. How Gradient Boosting Works
      1. Fit a shallow tree T_1 to the data: M_1 = T_1
      2. Fit a tree T_2 to the residuals: M_2 = M_1 + η γ_2 T_2
      3. Repeat (2) until the stopping condition is met.

      Final model: M = M_1 + η ∑_i γ_i T_i
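      A hand-rolled sketch of this loop for squared-error loss, assuming a data frame df with outcome y. (For regression trees fit to residuals, the leaf means already play the role of γ; η and the round count are illustrative choices.)

          library(rpart)

          eta <- 0.1                           # learning rate
          n_rounds <- 50
          pred <- rep(mean(df$y), nrow(df))    # start from the mean
          trees <- list()

          for (i in seq_len(n_rounds)) {
            df$resid <- df$y - pred            # residuals of the current model
            # Fit a shallow tree T_i to the residuals
            trees[[i]] <- rpart(resid ~ . - y, data = df,
                                control = rpart.control(maxdepth = 2))
            # Damped update: M_i = M_{i-1} + η T_i
            pred <- pred + eta * predict(trees[[i]], df)
          }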

  40. Cross-Validation to Guard Against Overfit
      Training error keeps decreasing, but test error doesn't.

  41. Best Practice (with xgboost())
      1. Run xgb.cv() with a large number of rounds (trees).

  42. Best Practice (with xgboost())
      1. Run xgb.cv() with a large number of rounds (trees).
      2. xgb.cv()$evaluation_log records the estimated RMSE for each round. Find the number of trees that minimizes the estimated RMSE: n_best.
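      A hedged sketch of this two-step recipe (object names and hyperparameter values are illustrative, xgboost() needs the numeric, treated inputs from the vtreat step, and the calls follow the classic xgboost R API):

          library(xgboost)

          # Step 1: cross-validate with a deliberately large number of rounds
          cv <- xgb.cv(data = as.matrix(bikesJan.treat),  # treated numeric inputs
                       label = bikesJan$cnt,
                       objective = "reg:squarederror",
                       nrounds = 500, nfold = 5,
                       eta = 0.1, max_depth = 6,
                       verbose = FALSE)

          # Step 2: evaluation_log holds per-round CV RMSE; pick the minimizer
          elog <- cv$evaluation_log
          (n_best <- which.min(elog$test_rmse_mean))

          # Refit on all the training data with exactly n_best rounds
          model <- xgboost(data = as.matrix(bikesJan.treat),
                           label = bikesJan$cnt,
                           objective = "reg:squarederror",
                           nrounds = n_best, eta = 0.1, max_depth = 6,
                           verbose = FALSE)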
