Welcome and Introduction: Supervised Learning in R: Regression

  1. Welcome and Introduction
     Supervised Learning in R: Regression
     Nina Zumel and John Mount, Data Scientists, Win-Vector LLC

  2. What is Regression?
     Regression: predict a numerical outcome ("dependent variable") from a set of inputs ("independent variables").
     Statistical sense: predicting the expected value of the outcome.
     Casual sense: predicting a numerical outcome, rather than a discrete one.

  3. What is Regression?
     How many units will we sell? (Regression)
     Will this customer buy our product (yes/no)? (Classification)
     What price will the customer pay for our product? (Regression)

  4. Example: Predict Temperature from Chirp Rate

  5. Predict Temperature from Chirp Rate

  6. Predict Temperature from Chirp Rate

  7. Regression from a Machine Learning Perspective
     Scientific mindset: modeling to understand the data generation process.
     Engineering mindset: modeling to predict accurately.
     Machine learning: engineering mindset.

  8. Let's practice!

  9. Linear regression - the fundamental method
     Nina Zumel and John Mount, Win-Vector LLC

  10. Linear Regression
      $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots$
      y is linearly related to each $x_i$.
      Each $x_i$ contributes additively to y.

  11. Linear Regression in R: lm()
          cmodel <- lm(temperature ~ chirps_per_sec, data = cricket)
      formula: temperature ~ chirps_per_sec
      data frame: cricket
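The `lm()` call above can be tried on any data frame with a numeric outcome. The `cricket` data frame from the slides is not available here, so the following minimal sketch substitutes R's built-in `cars` data set (columns `speed` and `dist`); the variable name `cars_model` is only illustrative.

```r
# Stand-in for the slides' cricket data: the built-in cars data set.
# Outcome: stopping distance (dist); input: speed.
cars_model <- lm(dist ~ speed, data = cars)

# Printing an lm object shows the call and the fitted coefficients.
print(cars_model)

# coef() returns the intercept and slope as a named numeric vector.
print(coef(cars_model))
```

The first argument is the formula (outcome on the left of `~`, inputs on the right) and `data` names the data frame in which the formula's variables are looked up.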

  12. Formulas
          fmla_1 <- temperature ~ chirps_per_sec
          fmla_2 <- blood_pressure ~ age + weight
      LHS: outcome.
      RHS: inputs; use + for multiple inputs.
      A formula can also be created from a string:
          fmla_1 <- as.formula("temperature ~ chirps_per_sec")
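Building a formula from strings is mostly useful when the variable names are held in character vectors. The sketch below is one way this might look, reusing the `blood_pressure ~ age + weight` example from the slide; the variable names `outcome`, `inputs`, `fmla`, and `fmla_alt` are hypothetical.

```r
outcome <- "blood_pressure"
inputs  <- c("age", "weight")

# Paste the pieces into "blood_pressure ~ age + weight", then convert.
fmla <- as.formula(paste(outcome, "~", paste(inputs, collapse = " + ")))
print(fmla)

# stats::reformulate() builds the same formula without manual pasting.
fmla_alt <- reformulate(inputs, response = outcome)
print(fmla_alt)

# all.vars() lists the variables a formula refers to, outcome first.
print(all.vars(fmla))
```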

  13. Looking at the Model
      $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots$
          cmodel

          Call:
          lm(formula = temperature ~ chirps_per_sec, data = cricket)

          Coefficients:
             (Intercept)  chirps_per_sec
                  25.232           3.291

  14. More Information about the Model
          summary(cmodel)

          Call:
          lm(formula = fmla, data = cricket)

          Residuals:
             Min     1Q Median     3Q    Max
          -6.515 -1.971  0.490  2.807  5.001

          Coefficients:
                         Estimate Std. Error t value Pr(>|t|)
          (Intercept)     25.2323    10.0601   2.508 0.026183 *
          chirps_per_sec   3.2911     0.6012   5.475 0.000107 ***
          Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

          Residual standard error: 3.829 on 13 degrees of freedom
          Multiple R-squared:  0.6975, Adjusted R-squared:  0.6742
          F-statistic: 29.97 on 1 and 13 DF,  p-value: 0.0001067
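The numbers printed by `summary()` can also be pulled out programmatically. As a rough sketch, again using the built-in `cars` data as a stand-in for the slides' `cricket` data frame (the names `cars_model` and `s` are illustrative):

```r
cars_model <- lm(dist ~ speed, data = cars)
s <- summary(cars_model)

# Coefficient table as a matrix: Estimate, Std. Error, t value, Pr(>|t|).
print(s$coefficients)

# Residual standard error and the two R-squared values.
print(s$sigma)
print(s$r.squared)
print(s$adj.r.squared)

# F-statistic as a named vector: value, numdf, dendf.
print(s$fstatistic)
```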

  15. More Information about the Model
          broom::glance(cmodel)
          sigr::wrapFTest(cmodel)
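`broom` and `sigr` are add-on packages and may not be installed. For readers without them, the overall F-test p-value can be reproduced in base R. The sketch below plugs in the F-statistic and degrees of freedom printed by `summary(cmodel)` in the slides; the helper `f_test_p_value()` is a hypothetical name, not a function from either package.

```r
# Values copied from the summary(cmodel) output shown in the slides.
f_value <- 29.97
num_df  <- 1
den_df  <- 13

# The p-value is the upper-tail probability of the F distribution.
p_value <- pf(f_value, num_df, den_df, lower.tail = FALSE)
print(signif(p_value, 4))

# Hypothetical helper: the same calculation for any fitted lm model.
f_test_p_value <- function(model) {
  f <- summary(model)$fstatistic
  pf(f[["value"]], f[["numdf"]], f[["dendf"]], lower.tail = FALSE)
}

# Demonstrated on the built-in cars data, since cricket is not available here.
print(f_test_p_value(lm(dist ~ speed, data = cars)))
```

The first result should land close to the `p-value: 0.0001067` reported in the summary output, up to the rounding of the printed F-statistic.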

  16. Let's practice!

  17. Predicting once you fit a model
      Nina Zumel and John Mount, Win-Vector LLC

  18. Predicting From the Training Data
          cricket$prediction <- predict(cmodel)
      predict() by default returns training data predictions.
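A quick way to see what "training data predictions" means: calling `predict()` with no `newdata` gives the same values as `fitted()`, and the same values as predicting on the training frame explicitly. A small sketch, with the built-in `cars` data standing in for `cricket` (the names `cars_model` and `train` are illustrative):

```r
cars_model <- lm(dist ~ speed, data = cars)

# Copy the training data and attach the model's predictions to it.
train <- cars
train$prediction <- predict(cars_model)

# Same values as the fitted values stored in the model ...
print(all.equal(unname(train$prediction), unname(fitted(cars_model))))

# ... and as predicting explicitly on the training data.
explicit <- predict(cars_model, newdata = cars)
print(all.equal(unname(train$prediction), unname(explicit)))
```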

  19. Looking at the Predictions
          ggplot(cricket, aes(x = prediction, y = temperature)) +
            geom_point() +
            geom_abline(color = "darkblue") +
            ggtitle("temperature vs. linear model prediction")

  20. Predicting on New Data
          newchirps <- data.frame(chirps_per_sec = 16.5)
          newchirps$prediction <- predict(cmodel, newdata = newchirps)
          newchirps

            chirps_per_sec prediction
          1           16.5   79.53537
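For a linear model, a prediction on new data is just the fitted equation applied to the new inputs. With the coefficients printed earlier in the slides, 25.232 + 3.291 × 16.5 is about 79.53, which agrees with the prediction above up to rounding of the coefficients. The sketch below checks the same relationship on the built-in `cars` data (a stand-in for `cricket`; the names `cars_model`, `new_obs`, and `manual` are illustrative).

```r
cars_model <- lm(dist ~ speed, data = cars)

# New observations must use the same input column name as the training data.
new_obs <- data.frame(speed = c(10, 21.5))
new_obs$prediction <- predict(cars_model, newdata = new_obs)
print(new_obs)

# Reproduce the predictions by hand: intercept + slope * input.
b <- coef(cars_model)
manual <- b[["(Intercept)"]] + b[["speed"]] * new_obs$speed
print(manual)
```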

  21. Let's practice!

  22. Wrapping up linear regression
      Nina Zumel and John Mount, Win-Vector LLC

  23. Pros and Cons of Linear Regression
      Pros: easy to fit and to apply; concise; less prone to overfitting.

  24. Pros and Cons of Linear Regression
      Pros: easy to fit and to apply; concise; less prone to overfitting; interpretable.

          Call:
          lm(formula = blood_pressure ~ age + weight, data = bloodpressure)

          Coefficients:
          (Intercept)          age       weight
              30.9941       0.8614       0.3349

  25. Pros and Cons of Linear Regression
      Pros: easy to fit and to apply; concise; less prone to overfitting; interpretable.
      Cons: can only express linear and additive relationships.

  26. Collinearity
      Collinearity: when input variables are partially correlated.

          Call:
          lm(formula = blood_pressure ~ age + weight, data = bloodpressure)

          Coefficients:
          (Intercept)          age       weight
              30.9941       0.8614       0.3349

  27. Collinearity
      Collinearity: when variables are partially correlated.
      Coefficients might change sign.

  28. Collinearity
      Collinearity: when variables are partially correlated.
      Coefficients might change sign.
      High collinearity: coefficients (or standard errors) look too large; the model may be unstable.
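The effect of high collinearity on standard errors can be seen in a small simulation. This is only a sketch on simulated data (not the `bloodpressure` data from the slides); the seed, sample size, and true coefficients are arbitrary assumptions. It also computes a variance inflation factor (VIF), a standard diagnostic that the slides do not mention.

```r
set.seed(42)                      # arbitrary seed so the simulation is repeatable
n  <- 50
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.1)     # x2 is almost a copy of x1
y  <- 1 + 2 * x1 + 3 * x2 + rnorm(n)
sim <- data.frame(y, x1, x2)

# The two inputs are very highly correlated.
print(cor(sim$x1, sim$x2))

# Fit with x1 alone, then with both collinear inputs.
fit_alone <- lm(y ~ x1, data = sim)
fit_both  <- lm(y ~ x1 + x2, data = sim)

# The standard error of the x1 coefficient grows when x2 is added.
se_alone <- summary(fit_alone)$coefficients["x1", "Std. Error"]
se_both  <- summary(fit_both)$coefficients["x1", "Std. Error"]
print(c(alone = se_alone, both = se_both))

# VIF for x1: 1 / (1 - R^2) from regressing x1 on the other input.
vif_x1 <- 1 / (1 - summary(lm(x1 ~ x2, data = sim))$r.squared)
print(vif_x1)
```

The individual coefficients in `fit_both` are poorly determined even though the model as a whole can still predict well, which is why the slides warn about unstable or oddly signed coefficients.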

  29. Coming Next
      Evaluating a regression model.
      Properly training a model.

  30. Let's practice!
