Outline Key Ideas: Last Time Simulation Approaches Partitioning Variability
STAT 213 Regression Inference II
Colin Reimer Dawson
Oberlin College
18 February 2016
library(mosaic)
BrainBodyWeight <- read.file("http://colinreimerdawson.com/data/BrainBodyWeight.csv")
xyplot(brain.weight.grams ~ body.weight.kilograms,
       data = BrainBodyWeight, type = c("p", "r"))
[Figure: scatterplot of brain.weight.grams against body.weight.kilograms with the fitted regression line]
brain.model <- lm(brain.weight.grams ~ body.weight.kilograms, data = BrainBodyWeight)
par(mfrow = c(1, 2))          # create a 1-by-2 plotting grid
plot(brain.model, which = 1)  # residuals by predicted
plot(brain.model, which = 2)  # quantile-quantile
[Figure: Residuals vs. Fitted values (left) and Normal Q-Q plot of standardized residuals against theoretical quantiles (right); observations 5, 34, and 1 are labeled]
xyplot(log(brain.weight.grams) ~ log(body.weight.kilograms),
       data = BrainBodyWeight, type = c("p", "r"))
[Figure: scatterplot of log(brain.weight.grams) against log(body.weight.kilograms) with the fitted regression line]
log.brain.model <- lm(log(brain.weight.grams) ~ log(body.weight.kilograms), data = BrainBodyWeight)
par(mfrow = c(1, 2))
plot(log.brain.model, which = 1)  # residuals by predicted
plot(log.brain.model, which = 2)  # quantile-quantile
[Figure: Residuals vs. Fitted values (left) and Normal Q-Q plot of standardized residuals against theoretical quantiles (right); observations 34, 61, and 50 are labeled]
library(mosaic)
# percent.brain is brain weight as a proportion of body weight (both in grams)
transform(BrainBodyWeight,
          percent.brain = brain.weight.grams / (body.weight.kilograms * 1000)) %>%
  xyplot(log(percent.brain) ~ log(body.weight.kilograms),
         data = ., type = c("p", "r"))
[Figure: scatterplot of log(percent.brain) against log(body.weight.kilograms) with the fitted regression line]
[Figure: Residuals vs. Fitted values (left) and Normal Q-Q plot of standardized residuals against theoretical quantiles (right); observations 34, 61, and 50 are labeled]
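The residual plots for the percent.brain model appear to label the same observations (34, 61, 50) as the log-log brain-weight model. That is what the algebra predicts: log(brain / (1000 * body)) = log(brain) - log(body) - log(1000), so regressing it on log(body) only shifts the slope by -1 and the intercept by -log(1000), leaving the residuals unchanged. The sketch below checks this on simulated stand-in data; the data frame `animals` and its columns are hypothetical, not the BrainBodyWeight data.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

# Simulated stand-in data (hypothetical; NOT the BrainBodyWeight data)
n <- 50
body.kg <- exp(rnorm(n, mean = 2, sd = 2))
brain.g <- exp(2 + 0.75 * log(body.kg) + rnorm(n, sd = 0.7))
animals <- data.frame(body.kg, brain.g)
animals$percent.brain <- animals$brain.g / (animals$body.kg * 1000)

# Fit both models on the log-log scale
fit.brain   <- lm(log(brain.g) ~ log(body.kg), data = animals)
fit.percent <- lm(log(percent.brain) ~ log(body.kg), data = animals)

# Differences between the two sets of coefficients
slope.shift     <- coef(fit.percent)[[2]] - coef(fit.brain)[[2]]  # should be -1
intercept.shift <- coef(fit.percent)[[1]] - coef(fit.brain)[[1]]  # should be -log(1000)

# The residuals of the two fits should agree up to rounding error
same.residuals <- isTRUE(all.equal(unname(resid(fit.brain)),
                                   unname(resid(fit.percent))))

slope.shift
intercept.shift
same.residuals
```

So switching the response to a proportion changes how the slope is read (a slope below zero means relative brain size shrinks as body weight grows) but not how well the line fits.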
library(Stat2Data)
data(LongJumpOlympics)
xyplot(Gold ~ Year, data = LongJumpOlympics, type = c("p", "r"),
       groups = (Year == 1968))  # highlight the outlier
[Figure: scatterplot of Gold against Year (1900 to 2000) with the fitted regression line; the 1968 point is highlighted]
long.jump.model <- lm(Gold ~ Year, data = LongJumpOlympics)
par(mfrow = c(1, 2))
plot(long.jump.model, which = 1)  # residuals by predicted
plot(long.jump.model, which = 2)  # quantile-quantile
[Figure: Residuals vs. Fitted values (left) and Normal Q-Q plot of standardized residuals against theoretical quantiles (right); observations 16, 12, and 26 are labeled]
[Figure: Husband's Age against Wife's Age in four panels: Sample 1, Sample 2, Sample 3, Sample 4]
[Figure: density of the slope predicting male partner's age from female partner's age]
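A density like this one can be approximated by simulation: draw many samples from a population, fit the regression in each, and collect the slopes. The sketch below assumes a made-up population; the data frame `population`, its parameters, the sample size of 10, and the helper `one.slope` are all hypothetical and are not the data or code behind the slides.

```r
set.seed(213)  # arbitrary seed so the sketch is reproducible

# Hypothetical population of couples (NOT the data behind the slides)
N <- 5000
wife <- round(runif(N, min = 20, max = 70))
husband <- 2 + 1.0 * wife + rnorm(N, mean = 0, sd = 4)
population <- data.frame(Wife = wife, Husband = husband)

# Draw one random sample of size n and return its fitted slope
one.slope <- function(n) {
  rows <- sample(nrow(population), size = n)
  coef(lm(Husband ~ Wife, data = population[rows, ]))[["Wife"]]
}

# Repeat 1000 times to approximate the sampling distribution of the slope
slopes <- replicate(1000, one.slope(10))

mean(slopes)  # center of the simulated sampling distribution
sd(slopes)    # simulated standard error of the slope
```

Calling `plot(density(slopes))` afterwards draws a curve of the same kind as the one on the slide; its spread is what the standard error in the regression output estimates from a single sample.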
[Figure: density of the standardized slope $(\hat{\beta}_1 - \beta_1) / SE_{\hat{\beta}_1}$]
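Under the regression model conditions, $(\hat{\beta}_1 - \beta_1) / SE_{\hat{\beta}_1}$ follows a $t$ distribution with $n-2$ degrees of freedom, written $t_{n-2}$.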
sample.model <- lm(Husband ~ Wife, data = sample1)
summary(sample.model)

Call:
lm(formula = Husband ~ Wife, data = sample1)

Residuals:
    Min      1Q  Median      3Q     Max
      …       …  0.7395  3.3295  7.5107

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -13.7455     7.1773       …   0.0918 .
Wife          1.5486     0.1989   7.786  5.3e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 5.774 on 8 degrees of freedom
Multiple R-squared: 0.8834, Adjusted R-squared: 0.8689
F-statistic: 60.63 on 1 and 8 DF, p-value: 5.304e-05

MoE.95 <- qt(0.975, df = 8) * 0.1989
(CI.95 <- c(1.5486 - MoE.95, 1.5486 + MoE.95))
[1] 1.089936 2.007264
confint(sample.model, level = 0.95)
                 2.5 %   97.5 %
(Intercept) -30.296476 2.805433
Wife          1.089947 2.007218
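The t value and p-value reported for Wife can be rebuilt from the estimate and its standard error in the same way as the interval. This is a minimal sketch using the rounded numbers printed in the summary output, so the results match R's own only up to rounding; the variable names are illustrative.

```r
# Values copied from the summary(sample.model) output
b1    <- 1.5486   # estimated slope for Wife
se.b1 <- 0.1989   # standard error of the slope
df    <- 8        # residual degrees of freedom (n - 2)

# Test statistic for H0: beta1 = 0, and its two-sided p-value
t.stat  <- (b1 - 0) / se.b1
p.value <- 2 * pt(-abs(t.stat), df = df)

# 95% confidence interval: estimate +/- t quantile * standard error
moe <- qt(0.975, df = df) * se.b1
ci  <- c(lower = b1 - moe, upper = b1 + moe)

t.stat
p.value
ci
```

The small gap between the hand-computed interval and the `confint()` output comes from typing in the rounded estimate and standard error, whereas `confint()` uses the full-precision values stored in the model.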
brain.model <- lm(log(brain.weight.grams) ~ log(body.weight.kilograms), data = BrainBodyWeight)
anova(brain.model)

Analysis of Variance Table

Response: log(brain.weight.grams)
                           Df Sum Sq Mean Sq F value    Pr(>F)
log(body.weight.kilograms)  1 336.19  336.19  697.42 < 2.2e-16 ***
Residuals                  60  28.92    0.48
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
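The ANOVA table splits the total variability in the response into a model part and a residual part, and the other summary quantities follow from those two sums of squares. This is a rough sketch using the rounded table entries, so the F statistic comes out close to, but not exactly, the 697.42 that R reports; the variable names are illustrative.

```r
# Entries copied from the anova(brain.model) table
ss.model <- 336.19   # Sum Sq for log(body.weight.kilograms)
ss.error <- 28.92    # Sum Sq for Residuals
df.model <- 1
df.error <- 60

# Partition: total variability = explained + unexplained
ss.total <- ss.model + ss.error

# Proportion of variability explained by the model
r.squared <- ss.model / ss.total

# Mean squares and the F statistic
ms.model <- ss.model / df.model
ms.error <- ss.error / df.error
f.stat   <- ms.model / ms.error
p.value  <- pf(f.stat, df.model, df.error, lower.tail = FALSE)

# Residual standard error is the square root of the mean squared error
resid.se <- sqrt(ms.error)

c(r.squared = r.squared, f.stat = f.stat, resid.se = resid.se)
```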
summary(brain.model)

Call:
lm(formula = log(brain.weight.grams) ~ log(body.weight.kilograms), data = BrainBodyWeight)

Residuals:
     Min       1Q   Median       3Q      Max
       …        …        …  0.43597  1.94829

Coefficients:
                           Estimate Std. Error t value Pr(>|t|)
(Intercept)                 2.13479    0.09604   22.23   <2e-16 ***
log(body.weight.kilograms)  0.75169    0.02846   26.41   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.6943 on 60 degrees of freedom
Multiple R-squared: 0.9208, Adjusted R-squared: 0.9195
F-statistic: 697.4 on 1 and 60 DF, p-value: < 2.2e-16
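Two things can be read off this output. First, because both variables are logged, the fitted line corresponds to a power law on the original scale: brain weight is roughly exp(b0) * body weight^b1. Second, with a single predictor the squared t value for the slope matches the F statistic up to rounding. The sketch below uses the rounded coefficients from the output; the helper `predict.brain.grams` and the example body weights are hypothetical.

```r
# Coefficients copied from the summary(brain.model) output
b0 <- 2.13479   # intercept on the log scale
b1 <- 0.75169   # slope on the log scale

# Back-transform the fitted line: brain (g) = exp(b0) * body (kg)^b1
predict.brain.grams <- function(body.kg) {
  exp(b0 + b1 * log(body.kg))
}

# Example body weights in kilograms (arbitrary illustrative values)
predictions <- predict.brain.grams(c(1, 10, 100))

# Multiplicative change in predicted brain weight when body weight doubles
doubling.ratio <- predict.brain.grams(2) / predict.brain.grams(1)

# In simple linear regression, the slope's t value squared equals F
t.slope   <- 26.41
t.squared <- t.slope^2   # compare with the reported F-statistic of 697.4

predictions
doubling.ratio
t.squared
```

A slope below 1 on the log-log scale means predicted brain weight grows more slowly than body weight, which agrees with the downward trend in the percent.brain plot.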