Introduction to Multiple Regression
James H. Steiger
Department of Psychology and Human Development Vanderbilt University
1. The Multiple Regression Model
2. Some Key Regression Terminology
3. The Kids Data Example
   Visualizing the Data – The Scatterplot Matrix
   Regression Models for Predicting Weight
4. Understanding Regression Coefficients
5. Statistical Testing in the Fixed Regressor Model
   Introduction
   Partial F-Tests: A General Approach
   Partial F-Tests: Overall Regression
   Partial F-Tests: Adding a Single Term
6. Variable Selection in Multiple Regression
   Introduction
   Forward Selection
   Backward Elimination
   Stepwise Regression
   Automatic Single-Term Sequential Testing in R
7. Variable Selection in R
   Problems with Statistical Testing in the Variable Selection Context
8. Information-Based Selection Criteria
   The Active Terms
   Information Criteria
9. (Estimated) Standard Errors
10. Standard Errors for Predicted and Fitted Values
The Multiple Regression Model
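For reference, the model developed in this section can be written, for n observations and k regressors, as

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_k x_{ik} + \epsilon_i, \qquad i = 1, \ldots, n,$$

or in matrix form $y = X\beta + \epsilon$, where the errors are assumed independent with mean zero and common variance $\sigma^2$. This is the standard statement of the model, supplied here as a reference point rather than transcribed from the slides.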
Some Key Regression Terminology
The Kids Data Example
The Kids Data Example Visualizing the Data – The Scatterplot Matrix
> pairs(kids.data)

[Figure: scatterplot matrix of WGT, HGT, and AGE]
The Kids Data Example Regression Models for Predicting Weight
Call:
lm(formula = WGT ~ HGT)

Residuals:
   Min     1Q Median     3Q    Max
   ...    ...    ...   2.26  11.84

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)    6.190     12.849    0.48   0.6404
HGT            1.072      0.242    4.44   0.0013 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 5.47 on 10 degrees of freedom
Multiple R-squared: 0.663, Adjusted R-squared: 0.629
F-statistic: 19.7 on 1 and 10 DF,  p-value: 0.00126

Call:
lm(formula = WGT ~ HGT + AGE)

Residuals:
   Min     1Q Median     3Q    Max
   ...    ...  0.345  1.464 10.234

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)    6.553     10.945    0.60    0.564
HGT            0.722      0.261    2.77    0.022 *
AGE            2.050      0.937    2.19    0.056 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 4.66 on 9 degrees of freedom
Multiple R-squared: 0.78, Adjusted R-squared: 0.731
F-statistic: 16 on 2 and 9 DF,  p-value: 0.0011
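These summaries can be reproduced with a few lines of R. A minimal sketch, assuming kids.data is a data frame with columns WGT, HGT, and AGE (the anova() call previews the partial F-test developed later):

# Fit the simple and the two-regressor models for the kids data
model.1 <- lm(WGT ~ HGT, data = kids.data)
model.2 <- lm(WGT ~ HGT + AGE, data = kids.data)
summary(model.1)
summary(model.2)
anova(model.1, model.2)  # partial F-test: does adding AGE improve the fit?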
Understanding Regression Coefficients
> library(car)
> avPlots(model.2)

[Figure: added-variable plots of WGT | others against HGT | others and against AGE | others]
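What avPlots() displays can be reproduced by hand, which makes the axes explicit: each panel plots the part of WGT unexplained by the other regressors against the corresponding part of one regressor. A minimal sketch for the HGT panel, assuming model.2 and kids.data from above:

# Added-variable plot for HGT, constructed manually
res.y <- resid(lm(WGT ~ AGE, data = kids.data))  # WGT | others
res.x <- resid(lm(HGT ~ AGE, data = kids.data))  # HGT | others
plot(res.x, res.y, xlab = "HGT | others", ylab = "WGT | others")
abline(lm(res.y ~ res.x))  # this slope equals the HGT coefficient in model.2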
Statistical Testing in the Fixed Regressor Model Introduction
Statistical Testing in the Fixed Regressor Model Partial F-Tests: A General Approach
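As a standard reference for the tests used below (supplied here, not transcribed from the slides): if a full model with $p_F$ estimated parameters contains a reduced model with $p_R < p_F$ parameters, the partial F statistic is

$$F = \frac{\left(\mathrm{SSE}_{R} - \mathrm{SSE}_{F}\right)/(p_F - p_R)}{\mathrm{SSE}_{F}/(n - p_F)},$$

which, under the null hypothesis that the extra coefficients are zero, has an F distribution with $p_F - p_R$ and $n - p_F$ degrees of freedom. The overall regression test and the single-term test of the following subsections are special cases.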
Statistical Testing in the Fixed Regressor Model Partial F-Tests: Overall Regression
Statistical Testing in the Fixed Regressor Model Partial F-Tests: Adding a Single Term
Variable Selection in Multiple Regression Introduction
Variable Selection in Multiple Regression Forward Selection
1. You select a group of independent variables to be examined.
2. The variable with the highest squared correlation with the criterion is entered into the regression equation.
3. The partial F statistic for each possible remaining variable is computed.
4. If the variable with the highest partial F statistic passes a criterion, it is added to the equation.
5. Keep going back to step 3, recomputing the partial F statistics, until no remaining variable passes the criterion. (A one-cycle sketch in R follows this list.)
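One cycle of this procedure can be sketched with stats::add1(), which computes a partial F-test for each candidate term. A minimal illustration, assuming the kids data from earlier; this shows a single cycle, not the full automated search:

# One cycle of forward selection via partial F-tests (illustrative only)
current <- lm(WGT ~ 1, data = kids.data)        # start from the intercept-only model
add1(current, scope = ~ HGT + AGE, test = "F")  # partial F for each candidate
# Enter the variable with the largest F if it passes the entry criterion,
# then repeat add1() on the updated model until no candidate qualifies.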
Variable Selection in Multiple Regression Backward Elimination
1. You start with all the variables you have selected as possible predictors.
2. You then compute partial F statistics for each of the variables currently in the equation.
3. Find the variable with the lowest F.
4. If this F is low enough to be below a criterion you have selected, remove the variable from the equation and refit.
5. Continue until no partial F is found that is sufficiently low. (A one-cycle sketch in R follows this list.)
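Similarly, one cycle of backward elimination can be sketched with stats::drop1(), again assuming the kids data:

# One cycle of backward elimination via partial F-tests (illustrative only)
full <- lm(WGT ~ HGT + AGE, data = kids.data)  # begin with all candidate predictors
drop1(full, test = "F")                        # partial F for each term currently in
# Remove the variable with the smallest F if it falls below the removal
# criterion, refit, and repeat until no remaining F is small enough.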
Variable Selection in Multiple Regression Stepwise Regression
Variable Selection in Multiple Regression Automatic Single-Term Sequential Testing in R
Notice also that the difference-test p-value for the last variable entered is the same as the p-value reported for that variable in the overall output for the full model; in general, the other p-values will not match.
> anova(model.2b)
Analysis of Variance Table

Response: WGT
          Df Sum Sq Mean Sq F value  Pr(>F)
AGE        1    526     526   24.24 0.00082 ***
HGT        1    166     166    7.66 0.02181 *
Residuals  9    195      22
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

> summary(model.2b)

Call:
lm(formula = WGT ~ AGE + HGT)

Residuals:
   Min     1Q Median     3Q    Max
   ...    ...  0.345  1.464 10.234

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)    6.553     10.945    0.60    0.564
AGE            2.050      0.937    2.19    0.056 .
HGT            0.722      0.261    2.77    0.022 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 4.66 on 9 degrees of freedom
Multiple R-squared: 0.78, Adjusted R-squared: 0.731
F-statistic: 16 on 2 and 9 DF,  p-value: 0.0011
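The order dependence is easy to see directly. A minimal sketch with the kids data, fitting the same model with both entry orders:

# Sequential (Type I) tests depend on the order in which terms enter
anova(lm(WGT ~ HGT + AGE, data = kids.data))  # tests HGT, then AGE given HGT
anova(lm(WGT ~ AGE + HGT, data = kids.data))  # tests AGE, then HGT given AGE
# In each table, only the F for the last term entered equals the squared
# t statistic for that coefficient in summary() of the full model.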
Variable Selection in R Problems with Statistical Testing in the Variable Selection Context
We start the forward selection procedure (which is fully automated by SPSS) by looking for the X predictor that correlates most highly with the criterion variable Y. We can examine all the predictor–criterion correlations, sorted, using the following command, which grabs the first column of the correlation matrix, sorts its entries, and restricts the output to the largest values (the last of which is Y's correlation with itself):
> sort(cor(test.data)[, 1])[88:91]
   X48    X77    X53      Y
0.2568 0.3085 0.3876 1.0000
Since we have been privileged to examine all the data and select the best predictor, the probability model on which the F-test for overall regression is based is no longer valid. We can see that X53 has a sample correlation of .388 with Y, despite the fact that the population correlation is zero.
> summary(lm(Y ~ X53))

Call:
lm(formula = Y ~ X53)

Residuals:
   Min     1Q Median     3Q    Max
   ...    ...  0.106  0.601  1.998

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)    0.189      0.144    1.31   0.1979
X53            0.448      0.154    2.91   0.0054 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.02 on 48 degrees of freedom
Multiple R-squared: 0.15, Adjusted R-squared: 0.133
F-statistic: 8.49 on 1 and 48 DF,  p-value: 0.00541
The regression is “significant” beyond the .01 level.
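This selection artifact is easy to reproduce by simulation. A minimal sketch, assuming (as the 48 residual degrees of freedom above imply) 50 cases and 90 pure-noise candidate predictors; the seed and names are arbitrary:

# Reproduce the selection artifact: with many pure-noise predictors,
# the best sample correlation yields a "significant" F-test far too often.
set.seed(1)                        # arbitrary seed, for reproducibility
n <- 50; p <- 90                   # sizes matching the test.data example
X <- matrix(rnorm(n * p), n, p)    # predictors unrelated to the response
Y <- rnorm(n)                      # all population correlations are zero
r <- cor(X, Y)                     # sample predictor-criterion correlations
best <- which.max(abs(r))          # post hoc choice of the "best" predictor
summary(lm(Y ~ X[, best]))         # nominal p-value no longer has its stated meaning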
The next largest correlation is with X77. Adding that variable to the equation produces a “significant” improvement and an R2 value of 0.26.
> summary(lm(Y ~ X53 + X77))

Call:
lm(formula = Y ~ X53 + X77)

Residuals:
   Min     1Q Median     3Q    Max
   ...    ...  0.100  0.614  1.788

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)    0.211      0.136    1.55   0.1284
X53            0.471      0.145    3.24   0.0022 **
X77            0.363      0.137    2.65   0.0110 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.963 on 47 degrees of freedom
Multiple R-squared: 0.261, Adjusted R-squared: 0.229
F-statistic: 8.28 on 2 and 47 DF,  p-value: 0.00083
It is precisely because F-tests perform so poorly under these conditions that alternative methods have been sought. Although R implements stepwise procedures in its step() function, it does not use the F statistic, instead employing information-based criteria such as AIC. The leaps package implements an “all-possible-subsets” search for the best model. We shall examine the performance of some of these selection procedures in Homework 5.
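A minimal sketch of such an AIC-driven forward search, assuming test.data is a data frame containing the response Y and the candidate predictors:

# Forward search driven by AIC rather than F-tests
null.model <- lm(Y ~ 1, data = test.data)
upper <- formula(terms(Y ~ ., data = test.data))  # expand "." to all predictors
step(null.model, scope = upper, direction = "forward")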
Information-Based Selection Criteria The Active Terms
Information-Based Selection Criteria Information Criteria
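For reference, the standard definitions (supplied here, not transcribed from the slides): for a model with $k$ estimated parameters and maximized likelihood $\hat{L}$ fit to $n$ observations,

$$\mathrm{AIC} = -2\ln\hat{L} + 2k, \qquad \mathrm{BIC} = -2\ln\hat{L} + k\ln n,$$

with smaller values preferred; BIC penalizes additional parameters more heavily than AIC once $n \geq 8$. In R, these are available for fitted models via AIC(model) and BIC(model).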
(Estimated) Standard Errors
Standard Errors for Predicted and Fitted Values
For a regressor vector $x_*$, the fitted value is $\hat{y}_* = x_*'\hat{\beta}$, with estimated standard error

$$\widehat{\mathrm{se}}(\hat{y}_*) = \hat{\sigma}\sqrt{x_*'(X'X)^{-1}x_*}.$$

For predicting a new observation at $x_*$, the error of prediction also includes the new observation's own variability, giving

$$\widehat{\mathrm{se}}(\tilde{y}_*) = \hat{\sigma}\sqrt{1 + x_*'(X'X)^{-1}x_*}.$$
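In R, these quantities are available through predict(). A minimal sketch using model.2 from the kids example, with hypothetical new-case values:

# Standard errors for a fitted (mean) value and a predicted new observation
new.case <- data.frame(HGT = 60, AGE = 10)  # hypothetical values
predict(model.2, newdata = new.case, se.fit = TRUE)           # se of the fitted value
predict(model.2, newdata = new.case, interval = "confidence") # interval for the mean response
predict(model.2, newdata = new.case, interval = "prediction") # wider: adds new-observation error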