Workshop 7: (Generalized) Linear models
Murray Logan
July 19, 2017

Table of contents
1 Linear model assumptions
2 Multiple (generalized) linear regression
3 Centering data
4 Assumptions
5 Multiple linear models in R
6 Model selection
7 Worked examples
8 ANOVA parameterization
9 Partitioning of variance (ANOVA)
10 Worked examples

1. Linear model assumptions

1.1. Assumptions
• Independence - unbiased, scale of treatment
• Normality - residuals
• Homogeneity of variance - residuals
• Linearity

1.2. Assumptions
1.2.1. Normality
[Figure: scatterplots of y against x illustrating the assumption that the residuals are normally distributed]
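Normality of the residuals can be checked directly in R. The sketch below uses simulated data; the variable names and simulation settings are illustrative only, not from the workshop:

> set.seed(1)                          # illustrative simulated data
> x <- 1:30
> y <- 2 + 0.5*x + rnorm(30, sd=2)     # linear trend with Gaussian noise
> fit <- lm(y ~ x)
> qqnorm(resid(fit))                   # residuals should fall near a line
> qqline(resid(fit))
> shapiro.test(resid(fit))             # formal test of residual normality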

1.3. Assumptions
1.3.1. Homogeneity of variance
[Figure: scatterplots of Y against X paired with plots of residuals (res) against predicted values, contrasting homogeneous and heterogeneous variance]

1.4. Assumptions
1.4.1. Linearity
Trendline
[Figure: scatterplots of y against x with straight-line trendlines]

1.5. Assumptions
1.5.1. Linearity
Loess (lowess) smoother
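A lowess smoother, as in the figure below, gives a quick visual check of linearity: systematic departures of the smoother from the straight-line fit suggest the assumption is violated. A minimal sketch, with x and y standing in for your own variables:

> plot(y ~ x)                      # raw scatterplot
> lines(lowess(x, y), col="red")   # locally weighted smoother
> abline(lm(y ~ x), lty=2)         # straight-line fit for comparison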

[Figure: scatterplots of y against x with loess smoothers overlaid]

1.6. Assumptions
1.6.1. Linearity
Spline smoother
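A spline smoother can be overlaid in the same way; smooth.spline() from base R is one option (x and y are again placeholder names):

> plot(y ~ x)
> lines(smooth.spline(x, y), col="blue")  # penalized regression spline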

[Figure: scatterplots of y against x with spline smoothers overlaid]

1.7.-1.8. Assumptions
yᵢ = β₀ + β₁xᵢ,  εᵢ ~ N(0, σ²)

1.9. The linear predictor
y = β₀ + β₁x

   Y   X
 3.0   0
 2.5   1
 6.0   2
 5.5   3
 9.0   4
 8.6   5
12.0   6

 3.0 = β₀ × 1 + β₁ × 0
 2.5 = β₀ × 1 + β₁ × 1
 6.0 = β₀ × 1 + β₁ × 2
 5.5 = β₀ × 1 + β₁ × 3
 9.0 = β₀ × 1 + β₁ × 4
 8.6 = β₀ × 1 + β₁ × 5
12.0 = β₀ × 1 + β₁ × 6
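This small data set can be entered directly in R; the name DATA matches the data frame used throughout the calls below:

> DATA <- data.frame(X = 0:6,
+                    Y = c(3.0, 2.5, 6.0, 5.5, 9.0, 8.6, 12.0))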

In matrix form:

| 3.0 |   | 1 0 |
| 2.5 |   | 1 1 |
| 6.0 |   | 1 2 |
| 5.5 | = | 1 3 | × | β₀ |
| 9.0 |   | 1 4 |   | β₁ |
| 8.6 |   | 1 5 |
|12.0 |   | 1 6 |

Response    Model     Parameter
 values     matrix     vector

1.10. Linear models in R
> lm(formula, data=DATAFRAME)

Model            R formula   Description
yᵢ = β₀ + β₁xᵢ   y~1+x       Full model
                 y~x
yᵢ = β₀          y~1         Null model
yᵢ = β₁xᵢ        y~-1+x      Through origin

> lm(Y~X, data=DATA)

Call:
lm(formula = Y ~ X, data = DATA)

Coefficients:
(Intercept)            X
      2.136        1.507

1.11. Linear models in R
> Xmat = model.matrix(~X, data=DATA)
> Xmat
  (Intercept) X
1           1 0
2           1 1
3           1 2
4           1 3
5           1 4
6           1 5
7           1 6
attr(,"assign")
[1] 0 1
> lm.fit(Xmat, DATA$Y)
$coefficients
(Intercept)           X
   2.135714    1.507143
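The other formulas in the table are fitted the same way; a quick sketch (output omitted):

> lm(Y ~ 1, data=DATA)       # null model: intercept only
> lm(Y ~ -1 + X, data=DATA)  # regression through the origin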

$residuals
[1]  0.8642857 -1.1428571  0.8500000 -1.1571429  0.8357143 -1.0714286  0.8214286

$effects
(Intercept)           X
-17.6131444   7.9750504   0.6507186  -1.5697499   0.2097817  -1.9106868  -0.2311552

$rank
[1] 2

$fitted.values
[1]  2.135714  3.642857  5.150000  6.657143  8.164286  9.671429 11.178571

$assign
[1] 0 1

$qr
$qr
  (Intercept)           X
1  -2.6457513 -7.93725393
2   0.3779645  5.29150262
3   0.3779645  0.03347335
4   0.3779645 -0.15550888
5   0.3779645 -0.34449112
6   0.3779645 -0.53347335
7   0.3779645 -0.72245559
attr(,"assign")
[1] 0 1

$qraux
[1] 1.377964 1.222456

$pivot
[1] 1 2

$tol
[1] 1e-07

$rank
[1] 2

attr(,"class")
[1] "qr"

$df.residual
[1] 5

1.12. Example
1.12.1. Linear Model
yᵢ = β₀ + β₁xᵢ,  εᵢ ~ N(0, σ²)
> DATA.lm <- lm(Y~X, data=DATA)
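The same coefficients can be reproduced by hand from the model matrix via the ordinary least squares normal equations, β̂ = (XᵀX)⁻¹XᵀY; a minimal sketch:

> solve(t(Xmat) %*% Xmat) %*% t(Xmat) %*% DATA$Y
> # returns 2.135714 and 1.507143, matching coef(DATA.lm)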

1.12.2. Generalized Linear Model
g(yᵢ) = β₀ + β₁xᵢ,  with N(0, σ²) errors
> DATA.glm <- glm(Y~X, data=DATA, family='gaussian')

1.13.-1.14. Model diagnostics
Residuals
[Figure: scatterplot of y against x showing the residuals as vertical deviations from the fitted line]
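With the gaussian family and its default identity link, the GLM is the ordinary linear model, so the two fits agree; a quick check:

> coef(DATA.lm)
> coef(DATA.glm)  # identical estimates: 2.135714 and 1.507143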

1.15. Model diagnostics
1.15.1. Leverage
[Figure: scatterplot of y against x illustrating observations with high leverage]
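Leverage values (the diagonal of the hat matrix) can be extracted directly; observations with leverage well above the average of p/n warrant a closer look. A minimal sketch:

> hatvalues(DATA.lm)        # leverage of each observation
> mean(hatvalues(DATA.lm))  # average leverage = p/n = 2/7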

1.16. Model diagnostics
1.16.1. Cook's D
[Figure: scatterplot of y against x illustrating observations with high Cook's D]

1.17. Example
1.17.1. Model evaluation

Extractor     Description
residuals()   Extracts residuals from model

[Figure: scatterplot of y against x with the fitted line and residuals]

> residuals(DATA.lm)
         1          2          3          4          5          6          7
 0.8642857 -1.1428571  0.8500000 -1.1571429  0.8357143 -1.0714286  0.8214286
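Cook's D combines residual size and leverage into a single influence measure; values near or above 1 (or above 4/n as a stricter rule of thumb) flag influential observations. A minimal sketch:

> cooks.distance(DATA.lm)                        # Cook's D per observation
> which(cooks.distance(DATA.lm) > 4/nrow(DATA))  # flag influential points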

1.18. Example
1.18.1. Model evaluation

Extractor     Description
residuals()   Extracts residuals from model
fitted()      Extracts the predicted values

[Figure: scatterplot of y against x with the fitted values marked along the fitted line]

> fitted(DATA.lm)
        1         2         3         4         5         6         7
 2.135714  3.642857  5.150000  6.657143  8.164286  9.671429 11.178571

1.19. Example
1.19.1. Model evaluation

Extractor     Description
residuals()   Extracts residuals from model
fitted()      Extracts the predicted values
plot()        Series of diagnostic plots

> plot(DATA.lm)
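By default plot() on an lm object steps through the diagnostic plots one at a time; a common idiom is to display the four defaults together:

> par(mfrow=c(2,2))  # 2 x 2 plotting grid
> plot(DATA.lm)
> par(mfrow=c(1,1))  # restore the default layout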

[Figure: the four default diagnostic plots for DATA.lm: Residuals vs Fitted, Normal Q-Q, Scale-Location, and Residuals vs Leverage with Cook's distance contours]

1.20. Example
1.20.1. Model evaluation

Extractor              Description
residuals()            Residuals
fitted()               Predicted values
plot()                 Diagnostic plots
influence.measures()   Leverage (hat) and Cook's D

1.21. Example
1.21.1. Model evaluation
> influence.measures(DATA.lm)
Influence measures of
lm(formula = Y ~ X, data = DATA) :

    dfb.1_     dfb.X  dffit cov.r cook.d   hat inf
1   0.9603 -7.99e-01  0.960  1.82 0.4553 0.464
2  -0.7650  5.52e-01 -0.780  1.15 0.2756 0.286
3   0.3165 -1.63e-01  0.365  1.43 0.0720 0.179
4  -0.2513 -7.39e-17 -0.453  1.07 0.0981 0.143
5   0.0443  1.60e-01  0.357  1.45 0.0696 0.179
6   0.1402 -5.06e-01 -0.715  1.26 0.2422 0.286
7  -0.3466  7.50e-01  0.901  1.91 0.4113 0.464

1.22. Example
1.22.1. Model evaluation

Extractor              Description
residuals()            Residuals
fitted()               Predicted values
plot()                 Diagnostic plots
influence.measures()   Leverage, Cook's D
summary()              Summarizes important output from model

1.23. Example
1.23.1. Model evaluation
> summary(DATA.lm)

Call:
lm(formula = Y ~ X, data = DATA)

Residuals:
      1       2       3       4       5       6       7
 0.8643 -1.1429  0.8500 -1.1571  0.8357 -1.0714  0.8214

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   2.1357     0.7850   2.721 0.041737 *
X             1.5071     0.2177   6.923 0.000965 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.152 on 5 degrees of freedom
Multiple R-squared:  0.9055,  Adjusted R-squared:  0.8866
F-statistic: 47.92 on 1 and 5 DF,  p-value: 0.0009648

1.24. Example
1.24.1. Model evaluation

Extractor              Description
residuals()            Residuals
fitted()               Predicted values
plot()                 Diagnostic plots
influence.measures()   Leverage, Cook's D
summary()              Model output
confint()              Confidence intervals of parameters

1.25. Example
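confint() rounds out the extractor table; by default it returns 95% intervals of the form estimate ± t(0.975, df) × SE. A minimal sketch:

> confint(DATA.lm)
> # from the estimates and standard errors in summary() above, the 95%
> # intervals are roughly 0.12 to 4.15 (intercept) and 0.95 to 2.07 (slope),
> # since t(0.975, 5) ≈ 2.571
> confint(DATA.lm, level=0.90)  # other levels via the 'level' argument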
