Welcome! Supervised Learning in R: Case Studies (DataCamp), presented by Julia Silge, Data Scientist at Stack Overflow


  1. DataCamp, Supervised Learning in R: Case Studies. Welcome! Julia Silge, Data Scientist at Stack Overflow

  2. In this course, you will:
     - use exploratory data analysis to prepare for predictive modeling
     - explore which modeling approaches to use for different kinds of data
     - practice implementing supervised machine learning for classification and regression

  3. Supervised machine learning: regression and classification

  4. Case studies:
     - Fuel efficiency for cars
     - Stack Overflow Developer Survey
     - Voter turnout
     - Predict age of nuns from survey responses

  5. Fuel efficiency

  6. Fuel efficiency. From the US Department of Energy:

         > cars2018
         # A tibble: 1,144 x 15
            Model          `Model Index` Displacement Cylinders Gears Transmission   MPG
            <chr>                  <dbl>        <dbl>     <dbl> <dbl> <chr>        <dbl>
          1 Acura NSX               57.0         3.50      6.00  9.00 Manual        21.0
          2 ALFA ROMEO 4C            410         1.80      4.00  6.00 Manual        28.0
          3 Audi R8 AWD             65.0         5.20      10.0  7.00 Manual        17.0
          4 Audi R8 RWD             71.0         5.20      10.0  7.00 Manual        18.0
          5 Audi R8 Spyde…          66.0         5.20      10.0  7.00 Manual        17.0
          6 Audi R8 Spyde…          72.0         5.20      10.0  7.00 Manual        18.0
          7 Audi TT Roads…          46.0         2.00      4.00  6.00 Manual        26.0
          8 BMW M4 DTM Ch…           488         3.00      6.00  7.00 Manual        20.0
          9 Bugatti Chiron          38.0         8.00      16.0  7.00 Manual        11.0
         10 Chevrolet COR…           278         6.20      8.00  8.00 Automatic     18.0
         # ... with 1,134 more rows, and 8 more variables: Aspiration <chr>, `Lockup
         #   Torque Converter` <chr>, Drive <chr>, `Max Ethanol` <dbl>, `Recommended
         #   Fuel` <fct>, `Intake Valves Per Cyl` <dbl>, `Exhaust Valves Per Cyl` <dbl>,
         #   `Fuel injection` <chr>

  7. Fuel efficiency. From the US Department of Energy:

         > names(cars2018)
          [1] "Model"                   "Model Index"
          [3] "Displacement"            "Cylinders"
          [5] "Gears"                   "Transmission"
          [7] "MPG"                     "Aspiration"
          [9] "Lockup Torque Converter" "Drive"
         [11] "Max Ethanol"             "Recommended Fuel"
         [13] "Intake Valves Per Cyl"   "Exhaust Valves Per Cyl"
         [15] "Fuel injection"

  8. Special characters in variable names

         > cars2018 %>%
         +     select(`Fuel injection`)
         # A tibble: 1,144 x 1
            `Fuel injection`
            <chr>
          1 Direct ignition
          2 Direct ignition
          3 Direct ignition
          4 Direct ignition
          5 Direct ignition
          6 Direct ignition
          7 Direct ignition
          8 Direct ignition
          9 Multipoint/sequential ignition
         10 Direct ignition
         # ... with 1,134 more rows
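
     The backticks on this slide are needed because `Fuel injection` contains a space, which makes it a non-syntactic name in R. A minimal base R sketch of the same idea follows. The data frame `cars_demo` and its values are made up for illustration and are not the real `cars2018` data.

     ```r
     # Hypothetical stand-in for cars2018; values are made up for illustration.
     cars_demo <- data.frame(
       Model = c("Car A", "Car B", "Car C"),
       `Fuel injection` = c("Direct ignition", "Direct ignition",
                            "Multipoint/sequential ignition"),
       check.names = FALSE,       # keep the space in the column name
       stringsAsFactors = FALSE
     )

     # Backticks let R parse a name that contains a space.
     injection <- cars_demo$`Fuel injection`

     # A quoted string works for [[ ]] indexing as well.
     same_column <- cars_demo[["Fuel injection"]]

     # make.names() shows the syntactic name R would create without check.names = FALSE.
     syntactic <- make.names("Fuel injection")
     ```

     Tibbles keep such names as they are, which is why the course data needs backticks inside `select()`.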

  9. Exploratory data analysis

  10. Exploratory data analysis with library(tidyverse): ggplot2, dplyr, tidyr, and others! To learn more about the tidyverse, visit this page.

  11. Time to train some models!

  12. Getting started with caret. Julia Silge, Data Scientist at Stack Overflow

  13. Predicting fuel efficiency

  14. Tools for predictive modeling: the caret package

  16. Training data and testing data with caret

          > library(caret)
          >
          > in_train <- createDataPartition(cars_vars$Aspiration,
          +                                 p = 0.8, list = FALSE)
          > training <- cars_vars[in_train,]
          > testing <- cars_vars[-in_train,]
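
      The split on this slide can be tried on data that ships with R. The sketch below assumes the caret package is installed and uses `mtcars` as a stand-in for `cars_vars`, with the `am` column playing the role of `Aspiration`. The seed and the factor labels are choices made for this illustration only.

      ```r
      library(caret)  # assumes the caret package is installed

      set.seed(1234)  # make the random split reproducible

      # mtcars ships with R; it stands in for cars_vars here.
      cars_demo <- mtcars
      cars_demo$am <- factor(cars_demo$am, labels = c("automatic", "manual"))

      # Sampling within the levels of a factor keeps both groups
      # represented in each partition.
      in_train <- createDataPartition(cars_demo$am, p = 0.8, list = FALSE)

      training <- cars_demo[in_train, ]
      testing  <- cars_demo[-in_train, ]

      # Share of each transmission type in the two partitions.
      prop.table(table(training$am))
      prop.table(table(testing$am))
      ```

      With `list = FALSE` the function returns a one-column matrix of row indices, so negative indexing gives the complementary testing set.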

  17. Training data and testing data with caret
      - Build your model with your training data
      - Choose your model with your validation data
      - Evaluate your model with your testing data

  18. Training a model

          > fit_lm <- train(log(MPG) ~ ., method = "lm", data = training,
          +                 trControl = trainControl(method = "none"))

      - Train a model
      - Evaluate that model using yardstick

  19. Evaluating a model: the yardstick package
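
      The two steps named on the slides, training with caret and evaluating with yardstick, might be combined as in the sketch below. It assumes both packages are installed and uses `mtcars` with the predictors `wt` and `hp` as stand-ins, since the course data is not reproduced here. The column names `truth` and `estimate` in `results` are also chosen for this illustration.

      ```r
      library(caret)      # assumes caret is installed
      library(yardstick)  # assumes yardstick is installed

      set.seed(1234)
      in_train <- createDataPartition(mtcars$mpg, p = 0.8, list = FALSE)
      training <- mtcars[in_train, ]
      testing  <- mtcars[-in_train, ]

      # Fit a single linear model with no resampling, as on the slide.
      fit_lm <- train(log(mpg) ~ wt + hp, method = "lm", data = training,
                      trControl = trainControl(method = "none"))

      # Collect observed and predicted values, both on the log scale.
      results <- data.frame(
        truth    = log(testing$mpg),
        estimate = predict(fit_lm, newdata = testing)
      )

      # For a numeric outcome, metrics() reports RMSE, R-squared, and MAE.
      lm_metrics <- yardstick::metrics(results, truth = truth, estimate = estimate)
      lm_metrics
      ```

      Because the model is fit to `log(mpg)`, the metrics are on the log scale too; the observed values are logged before comparison for that reason.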

  20. Let's practice!

  21. Training a model with resampling. Julia Silge, Data Scientist at Stack Overflow

  22. Bootstrap resampling: sample with replacement from the original dataset
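
      Sampling with replacement can be shown in a few lines of base R. This is a sketch of a single bootstrap resample, not of what caret does internally; `mtcars`, the predictors `wt` and `hp`, and the seed are assumptions made for illustration.

      ```r
      set.seed(1234)

      n <- nrow(mtcars)  # mtcars ships with R and stands in for the training data

      # One bootstrap resample: draw n row indices with replacement,
      # so some rows appear more than once and others not at all.
      boot_idx <- sample(n, size = n, replace = TRUE)
      boot_sample <- mtcars[boot_idx, ]

      # Rows never drawn form the "out-of-bag" set, usable for assessment.
      oob_idx <- setdiff(seq_len(n), boot_idx)
      oob_sample <- mtcars[oob_idx, ]

      # Fit on the resample, then assess on the held-out rows.
      fit <- lm(log(mpg) ~ wt + hp, data = boot_sample)
      oob_rmse <- sqrt(mean((log(oob_sample$mpg) - predict(fit, oob_sample))^2))
      ```

      Repeating this many times and averaging the held-out error gives a more stable performance estimate than a single fit.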

  25. Bootstrap resampling with caret

          > cars_rf_bt <- train(log(MPG) ~ ., method = "rf",
          +                     data = training,
          +                     trControl = trainControl(method = "boot"))

  26. Comparing predicted to real values

             `log(MPG)` `Linear regression` `Random forest`
                  <dbl>               <dbl>           <dbl>
           1       2.89                2.79            2.83
           2       2.89                3.00            2.89
           3       3.26                3.22            3.26
           4       3.14                3.09            3.10
           5       3.26                3.22            3.26
           6       2.89                3.11            2.98
           7       2.48                2.59            2.51
           8       2.71                2.81            2.82
           9       3.37                3.29            3.27
          10       2.83                2.90            2.90
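
      The ten rows on this slide are enough to illustrate how such a comparison can be summarized. The sketch below types them in by hand and computes the root mean squared error for each model; `rmse()` here is a helper defined for this example, and the column names are shortened for convenience. Ten rows are only a small sample, so the numbers say little about the full testing set.

      ```r
      # The ten rows shown on the slide (all values on the log(MPG) scale).
      compare <- data.frame(
        truth = c(2.89, 2.89, 3.26, 3.14, 3.26, 2.89, 2.48, 2.71, 3.37, 2.83),
        lm    = c(2.79, 3.00, 3.22, 3.09, 3.22, 3.11, 2.59, 2.81, 3.29, 2.90),
        rf    = c(2.83, 2.89, 3.26, 3.10, 3.26, 2.98, 2.51, 2.82, 3.27, 2.90)
      )

      # Root mean squared error; a helper for this example, not a package function.
      rmse <- function(truth, estimate) sqrt(mean((truth - estimate)^2))

      rmse_lm <- rmse(compare$truth, compare$lm)
      rmse_rf <- rmse(compare$truth, compare$rf)

      # exp() undoes the log transform to give predictions in MPG.
      compare$rf_mpg <- exp(compare$rf)
      ```

      On these ten rows the random forest predictions sit closer to the observed values than the linear regression predictions do.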

  27. Visualizing model predictions

  28. Let's practice!
