DataCamp Supervised Learning in R: Case Studies
Surveying Catholic sisters in 1967
SUPERVISED LEARNING IN R: CASE STUDIES
Julia Silge
Data Scientist at Stack Overflow
Conference of Major
Response                      Code
Disagree very much            1
Disagree somewhat             2
Neither agree nor disagree    3
Agree somewhat                4
Agree very much               5
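The coding above can be sketched in base R as a lookup from response label to integer code. The names `likert_levels` and `responses` are hypothetical and introduced only for illustration; the example responses are not taken from the survey data.

```r
# Response labels listed in the order of their survey codes (1 to 5).
likert_levels <- c(
  "Disagree very much",
  "Disagree somewhat",
  "Neither agree nor disagree",
  "Agree somewhat",
  "Agree very much"
)

# Hypothetical example responses (not real survey answers).
responses <- c("Agree somewhat", "Disagree very much", "Agree very much")

# An ordered factor keeps the label; as.integer() recovers the numeric code.
coded <- as.integer(factor(responses, levels = likert_levels, ordered = TRUE))
coded
```

Storing the answers as integers, as `sisters67` does, lets them be used directly as numeric predictors.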
> sisters67
# A tibble: 77,112 x 67
     age sister  v116  v117  v118  v119  v120  v121  v122  v123  v124  v125
   <dbl>  <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int>
 1  60.0      1     1     1     3     5     1     1     3     5     3     1
 2  70.0      2     2     2     4     4     1     3     1     5     4     1
 3  60.0      3     1     1     3     2     2     3     1     1     3     1
 4  60.0      4     5     1     2     4     1     3     4     3     3     4
 5  50.0      5     2     3     3     3     2     2     1     5     2     5
 6  40.0      7     4     3     2     5     4     3     1     5     2     5
 7  50.0      9     5     4     5     4     4     5     3     5     4     2
 8  40.0     10     5     4     3     5     1     3     5     5     5     4
 9  30.0     11     2     2     3     5     1     3     3     5     1     3
10  30.0     12     4     1     5     5     1     4     3     5     1     5
# ... with 77,102 more rows, and 55 more variables: v126 <int>, v127 <int>,
#   v128 <int>, v129 <int>, v130 <int>, v131 <int>, v132 <int>, v133 <int>,
#   v134 <int>, v135 <int>, v136 <int>, v137 <int>, v138 <int>, v139 <int>,
#   v140 <int>, v141 <int>, v142 <int>, v143 <int>, v144 <int>, v145 <int>,
#   v146 <int>, v147 <int>, v148 <int>, v149 <int>, v150 <int>, v151 <int>,
#   v152 <int>, v153 <int>, v154 <int>, v155 <int>, v156 <int>, v157 <int>,
#   v158 <int>, v159 <int>, v160 <int>, v161 <int>, v162 <int>, v163 <int>,
#   v164 <int>, v165 <int>, v166 <int>, v167 <int>, v168 <int>, v169 <int>,
#   v170 <int>, v171 <int>, v172 <int>, v173 <int>, v174 <int>, v175 <int>,
#   v176 <int>, v177 <int>, v178 <int>, v179 <int>, v180 <int>
> sisters67 %>%
+     select(-sister) %>%
+     gather(key, value, -age)
# A tibble: 5,012,280 x 3
     age key   value
   <dbl> <chr> <int>
 1  60.0 v116      1
 2  70.0 v116      2
 3  60.0 v116      1
 4  60.0 v116      5
 5  50.0 v116      2
 6  40.0 v116      4
 7  50.0 v116      5
 8  40.0 v116      5
 9  30.0 v116      2
10  30.0 v116      4
# ... with 5,012,270 more rows
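The reshaping step can be tried on a miniature data set. This is a minimal sketch, assuming dplyr and tidyr are installed; `toy` and `tidy_toy` are hypothetical names, and the values are made up rather than drawn from `sisters67`. After `sister` is dropped, each remaining question column contributes one row per respondent, which is why 77,112 rows with 65 question columns become 77,112 × 65 = 5,012,280 rows.

```r
library(dplyr)
library(tidyr)

# Hypothetical miniature stand-in for sisters67: 3 respondents, 2 questions.
toy <- tibble(
  age    = c(60, 70, 30),
  sister = 1:3,
  v116   = c(1L, 2L, 4L),
  v117   = c(1L, 2L, 1L)
)

# Same pipeline as on the slide: drop the ID, then stack the question columns.
tidy_toy <- toy %>%
  select(-sister) %>%
  gather(key, value, -age)

# 3 respondents x 2 questions = 6 rows, with columns age, key, value.
dim(tidy_toy)

# Once tidy, per-question summaries are a simple grouped operation.
tidy_toy %>%
  group_by(key) %>%
  summarise(mean_value = mean(value))
```

In current tidyr releases `gather()` is superseded; `pivot_longer(-age, names_to = "key", values_to = "value")` gives the same columns, though the rows come out in a different order.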
"rpart" "xgbLinear" "gbm"
library(caret)

## CART
sisters_cart <- train(age ~ ., method = "rpart", data = training)

## xgboost
sisters_xg <- train(age ~ ., method = "xgbLinear", data = training)

## gbm
sisters_gbm <- train(age ~ ., method = "gbm", data = training)
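The models are fit on `training` and later evaluated on `validation` and `testing`, but these slides do not show how the three sets were built. One possible way to make such a split in base R is sketched here. The 60/20/20 proportions, the seed, and the stand-in data frame `toy` are all assumptions made for illustration, not the course's actual procedure.

```r
set.seed(1234)  # arbitrary seed, chosen only for reproducibility

# Hypothetical stand-in data: decade-valued ages and one 1-5 survey item.
n <- 1000
toy <- data.frame(
  age  = sample(seq(20, 90, by = 10), n, replace = TRUE),
  v116 = sample(1:5, n, replace = TRUE)
)

# Shuffle the row indices once, then cut them into three disjoint blocks.
shuffled <- sample(n)
n_train  <- floor(0.6 * n)  # assumed 60% for training
n_valid  <- floor(0.2 * n)  # assumed 20% for validation; the rest is testing

training   <- toy[shuffled[1:n_train], ]
validation <- toy[shuffled[(n_train + 1):(n_train + n_valid)], ]
testing    <- toy[shuffled[(n_train + n_valid + 1):n], ]

# Sizes of the three sets.
c(nrow(training), nrow(validation), nrow(testing))
```

Cutting one shuffled index vector guarantees that no row lands in more than one set, so the validation and testing scores are computed on data the models never saw.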
> validation %>%
+     mutate(prediction = predict(sisters_xg, validation)) %>%
+     rmse(truth = age, estimate = prediction)
[1] 13.27101

> testing %>%
+     mutate(prediction = predict(sisters_xg, testing)) %>%
+     rmse(truth = age, estimate = prediction)
[1] 13.36945
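The `rmse()` values above are root mean squared errors, expressed in the same units as `age` (years). A minimal base R sketch of the calculation follows; `rmse_manual` and the small vectors are hypothetical and not part of the course code.

```r
# Root mean squared error: square the errors, average them, take the root.
rmse_manual <- function(truth, estimate) {
  sqrt(mean((truth - estimate)^2))
}

# Hypothetical ages and predictions, each off by exactly 10 years.
truth    <- c(60, 70, 30, 40)
estimate <- c(50, 60, 40, 50)

rmse_manual(truth, estimate)  # every error is 10 years, so the RMSE is 10
```

Read this way, a validation RMSE of about 13.3 means the predictions are typically off by a little more than a decade, and the similar testing value suggests the model is not badly overfit to the validation set.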
> metrics(model_results, truth = age, estimate = CART)
# A tibble: 1 x 2
   rmse   rsq
  <dbl> <dbl>
1  14.8 0.170

> metrics(model_results, truth = age, estimate = XGB)
# A tibble: 1 x 2
   rmse   rsq
  <dbl> <dbl>
1  13.3 0.338

> metrics(model_results, truth = age, estimate = GBM)
# A tibble: 1 x 2
   rmse   rsq
  <dbl> <dbl>
1  12.8 0.382
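The `rsq` column can be reproduced by hand. The sketch below assumes the definition given in the yardstick documentation, where `rsq` is the squared correlation between truth and estimate; if your installed version defines it differently, the numbers will not match. `rsq_manual` and the example vectors are hypothetical.

```r
# R-squared as the squared Pearson correlation of truth and estimate
# (assumed to match yardstick's rsq; see the note above).
rsq_manual <- function(truth, estimate) {
  cor(truth, estimate)^2
}

# Hypothetical ages and predictions that track each other closely.
truth    <- c(30, 40, 50, 60, 70)
estimate <- c(35, 45, 50, 55, 65)

rsq_manual(truth, estimate)
```

Under this definition, lower `rmse` and higher `rsq` point the same way in the comparison above: GBM has both the smallest error (12.8) and the largest `rsq` (0.382) of the three models.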