
STAT 401A - Statistical Methods for Research Workers: Case statistics



  1. STAT 401A - Statistical Methods for Research Workers: Case statistics
     Jarad Niemi (Dr. J), Iowa State University
     Last updated: November 17, 2014

  2. Influential observations: Case statistics
     Definition: Leverage ($h_i$) is a measure of the distance between an observation’s explanatory variable values and the average of the explanatory variable values in the entire data set.
     Rule of thumb: possible concern when leverage $> 2p/n$, where $p$ is the number of regression coefficients and $n$ is the number of observations.
     Definition: Cook’s distance ($D$) is a measure of the overall effect on the estimated regression coefficients when removing an observation.
     Rule of thumb: concern when Cook’s $D \approx 1$.
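The two definitions above can be sketched numerically. This is a minimal illustration for simple linear regression only, where leverage reduces to $h_i = 1/n + (x_i - \bar{x})^2 / \sum_j (x_j - \bar{x})^2$ and Cook's distance can be written as $D_i = r_i^2 h_i / (p \hat{\sigma}^2 (1 - h_i)^2)$. The data and the function names `fit_line` and `case_statistics` are hypothetical and do not come from the slides.

```python
def fit_line(x, y):
    """Return the least-squares (intercept, slope) for y = b0 + b1 * x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def case_statistics(x, y):
    """Return (leverages, Cook's distances) for simple linear regression."""
    n, p = len(x), 2  # p = 2 coefficients: intercept and slope
    b0, b1 = fit_line(x, y)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sigma2 = sum(r ** 2 for r in resid) / (n - p)  # estimated error variance
    leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    cooks = [r ** 2 / (p * sigma2) * h / (1 - h) ** 2
             for r, h in zip(resid, leverage)]
    return leverage, cooks


# Hypothetical data: the last point sits far from the other x values.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 7.2, 7.9, 9.1, 12.0]

leverage, cooks = case_statistics(x, y)
n, p = len(x), 2
flagged = [i for i, h in enumerate(leverage) if h > 2 * p / n]
for i, (h, d) in enumerate(zip(leverage, cooks)):
    note = "  <- leverage > 2p/n" if i in flagged else ""
    print(f"case {i + 1}: leverage={h:.3f}, Cook's D={d:.3f}{note}")
```

The leverages always sum to $p$, so the $2p/n$ threshold amounts to flagging observations with more than twice the average leverage.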

  3. Influential observations: Leverage and influence
     Consider simple linear regression (the point of interest is the open circle).
     [Figure: 2 x 2 grid of scatterplots; rows are low and high influence, columns are low and high leverage.]
     Low influence, low leverage: Leverage = 0.05, Cook's D = 0
     Low influence, high leverage: Leverage = 0.42, Cook's D = 0.05
     High influence, low leverage: Leverage = 0.05, Cook's D = 0.36
     High influence, high leverage: Leverage = 0.42, Cook's D = 4.11

  4. Influential observations: Residuals
     Residual (observed minus predicted): $r_i = \hat{e}_i = Y_i - \hat{\mu}_i$
     (Internally) studentized residual: $\dfrac{r_i}{\widehat{SD}(r_i)} = \dfrac{r_i}{\hat{\sigma}\sqrt{1 - h_i}}$
     Externally studentized residual: $\dfrac{r_i}{\hat{\sigma}_{(i)}\sqrt{1 - h_i}}$, where $\hat{\sigma}_{(i)}$ is the estimate of the standard deviation about the regression line from the fit that excludes observation $i$.
     95% of studentized residuals should be between -2 and 2.
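The residual formulas above can be sketched as follows for simple linear regression. The externally studentized version is computed here by literally refitting without each observation, mirroring the definition of $\hat{\sigma}_{(i)}$. The data and function names are hypothetical, not taken from the slides.

```python
import math


def fit_line(x, y):
    """Return the least-squares (intercept, slope) for y = b0 + b1 * x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def studentized_residuals(x, y):
    """Return (internally, externally) studentized residuals as two lists."""
    n, p = len(x), 2  # p = 2 coefficients: intercept and slope
    b0, b1 = fit_line(x, y)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sigma = math.sqrt(sum(r ** 2 for r in resid) / (n - p))
    internal, external = [], []
    for i, (xi, ri) in enumerate(zip(x, resid)):
        h = 1 / n + (xi - x_bar) ** 2 / sxx  # leverage of observation i
        # Refit without observation i to estimate sigma_(i).
        x_loo, y_loo = x[:i] + x[i + 1:], y[:i] + y[i + 1:]
        a0, a1 = fit_line(x_loo, y_loo)
        sse_loo = sum((yj - (a0 + a1 * xj)) ** 2
                      for xj, yj in zip(x_loo, y_loo))
        sigma_i = math.sqrt(sse_loo / (n - 1 - p))
        internal.append(ri / (sigma * math.sqrt(1 - h)))
        external.append(ri / (sigma_i * math.sqrt(1 - h)))
    return internal, external


# Hypothetical data for illustration only.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 7.2, 7.9, 9.1, 12.0]

internal, external = studentized_residuals(x, y)
for i, (s, t) in enumerate(zip(internal, external)):
    flag = "  <- outside (-2, 2)" if abs(t) > 2 else ""
    print(f"case {i + 1}: internal={s:.2f}, external={t:.2f}{flag}")
```

Refitting $n$ times is wasteful for large data sets; the loop is kept here only because it follows the definition directly.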

  5. Influential observations: Residuals
     SAT residuals after adjusting for % taking and median class rank.
     [Figure: three panels (Residuals, Studentized residuals, Externally studentized residuals), each plotting value against case number.]

  6. Influential observations: Residuals

     /* Read the SAT data, log-transform takers, and drop Alaska */
     DATA case1201;
       INFILE 'case1201.csv' DSD FIRSTOBS=2;
       INPUT state $ sat takers income years public expend rank;
       ltakers = log(takers);
       IF state='Alaska' THEN DELETE;
     RUN;

     /* Regress SAT on log(takers) and rank */
     PROC GLM DATA=case1201;
       MODEL sat = ltakers rank;
     RUN;

  7. Influential observations: Residuals
     SAS diagnostics: [figure not preserved]

  8. Influential observations: Residuals

     mod = lm(SAT~log(Takers)+Rank, case1201)
     opar = par(mfrow=c(2,3)); plot(mod, 1:6, ask=FALSE); par(opar)

     [Figure: six diagnostic panels (Residuals vs Fitted, Normal Q-Q, Scale-Location, Cook's distance, Residuals vs Leverage, Cook's dist vs Leverage $h_{ii}/(1 - h_{ii})$); observations 16, 48, and 50 are labeled.]

  9. Influential observations: Summary
     Summary of case statistics
     Leverage: identifies observations that might be influential.
     Cook’s distance: identifies observations that had a large overall influence on their own.
     If an observation is influential, fit the model with and without it to determine the impact on the questions of interest.
     Residuals: identify observations that are not being fit accurately by the model.
     Check out this app (on campus or VPN): http://shiny1.stat.iastate.edu/_Statistics/14-outlier/
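The advice to fit with and without an influential observation can be sketched as a simple comparison of coefficients. This is a minimal example on hypothetical data with a hypothetical helper `fit_line`; in practice the comparison would focus on whichever estimates answer the questions of interest.

```python
def fit_line(x, y):
    """Return the least-squares (intercept, slope) for y = b0 + b1 * x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical data: the last case is the suspected influential observation.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 7.2, 7.9, 9.1, 12.0]
suspect = len(x) - 1  # index of the case to leave out

# Fit once with every case and once with the suspect case removed.
b0_with, b1_with = fit_line(x, y)
x_without = x[:suspect] + x[suspect + 1:]
y_without = y[:suspect] + y[suspect + 1:]
b0_without, b1_without = fit_line(x_without, y_without)

print(f"with case {suspect + 1}:    intercept={b0_with:.3f}, slope={b1_with:.3f}")
print(f"without case {suspect + 1}: intercept={b0_without:.3f}, slope={b1_without:.3f}")
print(f"change in slope: {b1_without - b1_with:+.3f}")
```

If the two fits lead to the same conclusions, the observation is not a practical concern even when its case statistics are large.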
