Chapter 13 Multiple Regression and Model Building

Multiple Regression Models The General Multiple Regression Model
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + \varepsilon$
where
$y$ is the dependent variable
$x_1, x_2, \dots, x_k$ are the independent variables
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k$ is the deterministic portion of the model
$\beta_i$ determines the contribution of the independent variable $x_i$

Multiple Regression Models Analyzing a Multiple Regression Model
1. Hypothesize the deterministic component of the model
2. Use sample data to estimate $\beta_0, \beta_1, \beta_2, \dots, \beta_k$
3. Specify the probability distribution of $\varepsilon$ and estimate $\sigma$
4. Check that the assumptions on $\varepsilon$ are satisfied
5. Statistically evaluate model usefulness
6. If the model is useful, use it for prediction, estimation, and other purposes

The First-Order Model: Estimating and Interpreting the β-Parameters
For $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5$,
the chosen fitted model $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \dots + \hat{\beta}_k x_k$
minimizes $SSE = \sum (y - \hat{y})^2$
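To make "minimizes SSE" concrete, here is a minimal sketch that fits a first-order model by solving the normal equations in plain Python. The function names (`solve_linear_system`, `fit_least_squares`, `predict`, `sse`) and the toy data are assumptions made for this illustration and do not come from the slides; in practice a statistics package would carry out this step.

```python
def solve_linear_system(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting.

    Assumes the matrix a is square and nonsingular.
    """
    n = len(b)
    # Augmented matrix [a | b] so row operations apply to both sides.
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (m[i][n] - tail) / m[i][i]
    return z


def fit_least_squares(xs, ys):
    """Return [b0, b1, ..., bk] minimizing SSE; each row of xs is [x1, ..., xk]."""
    design = [[1.0] + list(row) for row in xs]  # leading 1 for the intercept
    p = len(design[0])
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(design, ys)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict(betas, row):
    """Fitted value y-hat for one row of x values."""
    return betas[0] + sum(b * x for b, x in zip(betas[1:], row))


def sse(betas, xs, ys):
    """Sum of squared errors, sum of (y - y_hat)^2."""
    return sum((y - predict(betas, row)) ** 2 for row, y in zip(xs, ys))


# Toy data (made up): y is exactly 1 + 2*x1 + 1*x2, so SSE should be near 0.
xs = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1], [1, 2]]
ys = [1 + 2 * x1 + x2 for x1, x2 in xs]
betas = fit_least_squares(xs, ys)
print([round(b, 6) for b in betas], round(sse(betas, xs, ys), 6))
```

Because the toy responses lie exactly on a plane, the recovered estimates should match the generating coefficients up to rounding error; with real data SSE would be positive.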

The First-Order Model: Estimating and Interpreting the β-Parameters
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \varepsilon$
where
$y$ = Sales price (dollars)
$x_1$ = Appraised land value (dollars)
$x_2$ = Appraised improvements (dollars)
$x_3$ = Area (square feet)

The First-Order Model: Estimating and Interpreting the β-Parameters
[Figure: plot of data for sample size n = 20]

The First-Order Model: Estimating and Interpreting the β-Parameters
Fit model to data

The First-Order Model: Estimating and Interpreting the β-Parameters
Interpret β estimates
$\hat{\beta}_1 = .8145$: E(y), the mean sale price of the property, is estimated to increase .8145 dollars for every one-dollar increase in appraised land value, holding other variables constant
$\hat{\beta}_2 = .8204$: E(y), the mean sale price of the property, is estimated to increase .8204 dollars for every one-dollar increase in appraised improvements, holding other variables constant
$\hat{\beta}_3 = 13.53$: E(y), the mean sale price of the property, is estimated to increase 13.53 dollars for each additional square foot of living area, holding other variables constant

The First-Order Model: Estimating and Interpreting the β-Parameters
Given the model $E(y) = 1 + 2x_1 + x_2$, the effect of $x_2$ on $E(y)$, holding $x_1$ constant, is a 1-unit increase in $E(y)$ for every 1-unit increase in $x_2$

Model Assumptions Assumptions about Random Error ε
1. For any given set of values of $x_1, x_2, \dots, x_k$, the random error has a normal probability distribution with mean 0 and variance $\sigma^2$
2. The random errors are independent
Estimator of $\sigma^2$ for a Multiple Regression Model with k Independent Variables
$s^2 = \dfrac{SSE}{n - \text{Number of estimated } \beta \text{ parameters}} = \dfrac{SSE}{n - (k+1)}$
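The estimator of σ² is a one-line calculation once SSE is known. The sketch below is illustrative only: the function name `estimate_error_variance` and the input numbers are hypothetical, not taken from the slides.

```python
import math


def estimate_error_variance(sse, n, k):
    """s^2 = SSE / (n - (k + 1)) for a model with k independent variables."""
    df = n - (k + 1)  # n minus the number of estimated beta parameters
    if df <= 0:
        raise ValueError("need more observations than estimated beta parameters")
    return sse / df


# Hypothetical numbers: n = 20 observations, k = 3 predictors, SSE = 1600.
s2 = estimate_error_variance(sse=1600.0, n=20, k=3)
s = math.sqrt(s2)  # s estimates sigma, the standard deviation of the error
print(s2, s)  # 100.0 10.0
```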

Inferences about the β-Parameters
Two types of inferences can be made, using either confidence intervals or hypothesis testing
For any inferences to be made, the assumptions about the random error term ε (normal distribution with mean 0 and variance $\sigma^2$, independence of errors) must be met

Inferences about the β-Parameters A 100(1 − α)% Confidence Interval for a β-Parameter
$\hat{\beta}_i \pm t_{\alpha/2}\, s_{\hat{\beta}_i}$
where $t_{\alpha/2}$ is based on $n - (k+1)$ degrees of freedom and
n = Number of observations
k + 1 = Number of β parameters in the model

Inferences about the β-Parameters A Test of an Individual Parameter Coefficient
One-Tailed Test
$H_0: \beta_i = 0$
$H_a: \beta_i < 0$ (or $H_a: \beta_i > 0$)
Rejection region: $t < -t_\alpha$ (or $t > t_\alpha$ when $H_a: \beta_i > 0$)
Two-Tailed Test
$H_0: \beta_i = 0$
$H_a: \beta_i \neq 0$
Rejection region: $|t| > t_{\alpha/2}$
Test statistic: $t = \dfrac{\hat{\beta}_i}{s_{\hat{\beta}_i}}$
where $t_\alpha$ and $t_{\alpha/2}$ are based on $n - (k+1)$ degrees of freedom
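The confidence interval and the t-test for a single β use the same two ingredients, the estimate and its standard error. The sketch below assumes the critical value is looked up in a t table and passed in; the function names and the estimate and standard error are hypothetical, and the value 2.120 is the tabled $t_{.025}$ for 16 degrees of freedom (n = 20, k = 3).

```python
def t_statistic(beta_hat, std_error):
    """Test statistic t = beta_hat / s_beta_hat for H0: beta_i = 0."""
    return beta_hat / std_error


def confidence_interval(beta_hat, std_error, t_crit):
    """beta_hat +/- t_crit * s_beta_hat, with t_crit = t_{alpha/2}."""
    half_width = t_crit * std_error
    return beta_hat - half_width, beta_hat + half_width


def reject_two_tailed(t, t_crit):
    """Two-tailed rejection region: |t| > t_{alpha/2}."""
    return abs(t) > t_crit


def reject_upper_tailed(t, t_crit):
    """One-tailed rejection region for Ha: beta_i > 0, namely t > t_alpha."""
    return t > t_crit


# Hypothetical estimate and standard error; 2.120 is t_.025 with 16 df.
beta_hat, std_error, t_crit = 0.82, 0.20, 2.120
t = t_statistic(beta_hat, std_error)
low, high = confidence_interval(beta_hat, std_error, t_crit)
print(round(t, 3), (round(low, 3), round(high, 3)), reject_two_tailed(t, t_crit))
```

A two-tailed test at level α rejects $H_0$ exactly when the 100(1 − α)% confidence interval excludes 0, so the two outputs always agree.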

Inferences about the β-Parameters An Excel Analysis
[Figure: Excel regression output, annotated to show which columns to use for hypotheses about parameter coefficients and which to use for confidence intervals]

Checking the Overall Utility of a Model
3 tests:
1. Multiple coefficient of determination $R^2$
$R^2 = \dfrac{SS_{yy} - SSE}{SS_{yy}} = 1 - \dfrac{SSE}{SS_{yy}} = \dfrac{\text{Explained variability}}{\text{Total variability}}$
2. Adjusted multiple coefficient of determination
$R_a^2 = 1 - \left[\dfrac{n-1}{n-(k+1)}\right]\left(\dfrac{SSE}{SS_{yy}}\right) = 1 - \left[\dfrac{n-1}{n-(k+1)}\right]\left(1 - R^2\right)$
3. Global F-test
Test statistic: $F = \dfrac{(SS_{yy} - SSE)/k}{SSE/[n-(k+1)]} = \dfrac{R^2/k}{(1-R^2)/[n-(k+1)]}$

Checking the Overall Utility of a Model Testing Global Usefulness of the Model: The Analysis of Variance F-test
$H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$
$H_a$: At least one $\beta_i \neq 0$
Test statistic: $F = \dfrac{(SS_{yy} - SSE)/k}{SSE/[n-(k+1)]} = \dfrac{R^2/k}{(1-R^2)/[n-(k+1)]} = \dfrac{\text{Mean Square Model}}{\text{Mean Square Error}}$
where n is the sample size and k is the number of terms in the model
Rejection region: $F > F_\alpha$, with k numerator degrees of freedom and $n-(k+1)$ denominator degrees of freedom
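The three utility measures all follow from SSE, $SS_{yy}$, n and k. The sketch below computes each and confirms that the two forms of the F statistic agree; the function names and input numbers are hypothetical, chosen only so the arithmetic is easy to check by hand.

```python
def r_squared(sse, ss_yy):
    """R^2 = 1 - SSE / SS_yy."""
    return 1 - sse / ss_yy


def adjusted_r_squared(sse, ss_yy, n, k):
    """R_a^2 = 1 - [(n - 1) / (n - (k + 1))] * (SSE / SS_yy)."""
    return 1 - ((n - 1) / (n - (k + 1))) * (sse / ss_yy)


def global_f(sse, ss_yy, n, k):
    """F = [(SS_yy - SSE) / k] / [SSE / (n - (k + 1))]."""
    return ((ss_yy - sse) / k) / (sse / (n - (k + 1)))


def global_f_from_r2(r2, n, k):
    """Equivalent form: F = (R^2 / k) / [(1 - R^2) / (n - (k + 1))]."""
    return (r2 / k) / ((1 - r2) / (n - (k + 1)))


# Hypothetical values: SS_yy = 10000, SSE = 1600, n = 20, k = 3.
sse_val, ss_yy, n, k = 1600.0, 10000.0, 20, 3
r2 = r_squared(sse_val, ss_yy)                   # about 0.84
r2_adj = adjusted_r_squared(sse_val, ss_yy, n, k)  # about 0.81
f_stat = global_f(sse_val, ss_yy, n, k)          # 2800 / 100 = 28
print(round(r2, 4), round(r2_adj, 4), round(f_stat, 4))
```

The computed F would then be compared with the tabled $F_\alpha$ for k and n − (k + 1) degrees of freedom, here 3 and 16.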

Checking the Overall Utility of a Model Checking the Utility of a Multiple Regression Model
1. Conduct a test of overall model adequacy using the F-test. If $H_0$ is rejected, proceed to step 2
2. Conduct t-tests on the β parameters of particular interest

Using the Model for Estimation and Prediction
As in Simple Linear Regression, the prediction interval for an individual value of y is wider than the confidence interval for the mean E(y) at the same values of the independent variables
Most statistics packages will print out both confidence and prediction intervals

Model Building: Interaction Models An Interaction Model relating E(y) to Two Quantitative Independent Variables
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2$
where
$(\beta_1 + \beta_3 x_2)$ represents the change in E(y) for every 1-unit increase in $x_1$, holding $x_2$ fixed
$(\beta_2 + \beta_3 x_1)$ represents the change in E(y) for every 1-unit increase in $x_2$, holding $x_1$ fixed
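The slope expressions for the interaction model can be checked numerically: the change in E(y) from a 1-unit step in $x_1$ should equal $\beta_1 + \beta_3 x_2$ at whatever value $x_2$ is held. The coefficient values and function names below are hypothetical, used only for illustration.

```python
def expected_y(b0, b1, b2, b3, x1, x2):
    """E(y) = b0 + b1*x1 + b2*x2 + b3*x1*x2 (interaction model)."""
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2


def slope_in_x1(b1, b3, x2):
    """Change in E(y) per 1-unit increase in x1 with x2 held fixed."""
    return b1 + b3 * x2


def slope_in_x2(b2, b3, x1):
    """Change in E(y) per 1-unit increase in x2 with x1 held fixed."""
    return b2 + b3 * x1


# Hypothetical coefficients.
b0, b1, b2, b3 = 1.0, 2.0, -1.0, 0.5

# The slope in x1 depends on where x2 is held: 2.0 at x2 = 0, 4.0 at x2 = 4.
for x2 in (0, 4):
    step = expected_y(b0, b1, b2, b3, 4, x2) - expected_y(b0, b1, b2, b3, 3, x2)
    print(x2, slope_in_x1(b1, b3, x2), step)
```

With $\beta_3 = 0$ the slope would be the same at every $x_2$, which is the no-interaction case.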

Model Building: Interaction Models
No interaction: when the relationship between y and $x_i$ is not impacted by a second x
Interaction: when the linear relationship between y and $x_i$ depends on another x

Model Building: Interaction Models

Model Building: Quadratic and other Higher-Order Models A Quadratic (Second-Order) Model
$E(y) = \beta_0 + \beta_1 x + \beta_2 x^2$
where
$\beta_0$ is the y-intercept of the curve
$\beta_1$ is a shift parameter
$\beta_2$ is the rate of curvature

Model Building: Quadratic and other Higher-Order Models Home Size-Electrical Usage Data

Size of Home, x (sq. ft.)    Monthly Usage, y (kilowatt-hours)
1,290                        1,182
1,350                        1,172
1,470                        1,264
1,600                        1,493
1,710                        1,571
1,840                        1,711
1,980                        1,804
2,230                        1,840
2,400                        1,95
2,930                        1,954

Model Building: Quadratic and other Higher-Order Models
$\hat{y} = -1{,}216.1 + 2.3989x - .00045x^2$
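A fitted quadratic can be evaluated directly at any home size. The sketch below assumes the fitted equation is $\hat{y} = -1{,}216.1 + 2.3989x - .00045x^2$; the signs are an assumption, chosen because they are the only combination that yields usage values in the range of the data table. The function names are hypothetical.

```python
def predicted_usage(x, b0=-1216.1, b1=2.3989, b2=-0.00045):
    """y-hat = b0 + b1*x + b2*x^2; coefficient signs are assumed (see lead-in)."""
    return b0 + b1 * x + b2 * x ** 2


def turning_point(b1=2.3989, b2=-0.00045):
    """Home size at which the fitted curve stops rising: x = -b1 / (2*b2)."""
    return -b1 / (2 * b2)


# Evaluate at the smallest and largest home sizes in the data table.
for size in (1290, 2930):
    print(size, round(predicted_usage(size), 1))

print(round(turning_point(), 1))
```

Because the assumed $\hat{\beta}_2$ is negative, the curve opens downward, so predicted usage rises with home size at a decreasing rate and the model should not be trusted much beyond the largest observed home.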

Model Building: Quadratic and other Higher-Order Models A Complete Second-Order Model with Two Quantitative Independent Variables
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \beta_4 x_1^2 + \beta_5 x_2^2$
where
$\beta_0$ is the y-intercept, the value of E(y) when $x_1 = x_2 = 0$
$\beta_1, \beta_2$: changes cause the surface to shift along the $x_1$ and $x_2$ axes
$\beta_3$ controls the rotation of the surface
$\beta_4, \beta_5$ control the type of surface and the rates of curvature

Model Building: Quadratic and other Higher-Order Models

Model Building: Qualitative (Dummy) Variable Models
Dummy variables: coded, qualitative variables
• Codes are in the form of (1, 0), 1 being the presence of a condition, 0 the absence
• Create dummy variables so that there is one fewer dummy variable than categories of the qualitative variable of interest
Gender dummy variable coded as x = 1 if male, x = 0 if female
If the model is $E(y) = \beta_0 + \beta_1 x$, $\beta_1$ captures the effect of being male on the dependent variable
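The "one fewer dummy variable than categories" rule can be sketched as a small encoder. The function name `dummy_code` and the example category labels are hypothetical; the category chosen as `baseline` is the one coded with all zeros.

```python
def dummy_code(values, baseline):
    """Encode a qualitative variable as (number of categories - 1) 0/1 columns.

    Returns the non-baseline levels (one per column) and the coded rows.
    """
    levels = sorted(set(values) - {baseline})
    rows = [[1 if v == level else 0 for level in levels] for v in values]
    return levels, rows


# Two categories -> one dummy variable: x = 1 if male, x = 0 if female.
levels, rows = dummy_code(["male", "female", "female", "male"], baseline="female")
print(levels, rows)  # ['male'] [[1], [0], [0], [1]]

# Three categories (hypothetical labels) -> two dummy variables.
levels3, rows3 = dummy_code(["A", "B", "C", "A"], baseline="A")
print(levels3, rows3)  # ['B', 'C'] [[0, 0], [1, 0], [0, 1], [0, 0]]
```

Leaving out the baseline column is what keeps the dummy variables from being perfectly collinear with the intercept term.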

Model Building: Models with both Quantitative and Qualitative Variables
Start with a first-order model with one quantitative variable: $E(y) = \beta_0 + \beta_1 x_1$
Adding a qualitative variable with no interaction: $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3$, where $x_2$ and $x_3$ are the dummy variables for the qualitative variable

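Model Building: Models with both Quantitative and Qualitative Variables
Adding an interaction term: $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_1 x_2 + \beta_5 x_1 x_3$
Main effect, $x_1$: $\beta_1 x_1$
Main effect, $x_2$ and $x_3$: $\beta_2 x_2 + \beta_3 x_3$
Interaction: $\beta_4 x_1 x_2 + \beta_5 x_1 x_3$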
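One way to read the interaction model is that each level of the qualitative variable gets its own straight line in $x_1$. The sketch below assumes $x_2$ and $x_3$ are 0/1 dummy variables for a three-level qualitative variable (an interpretation, not stated outright on the slide); the coefficient values and the function name are hypothetical.

```python
def line_for_level(betas, x2, x3):
    """Intercept and slope of E(y) versus x1 for one level of the dummies.

    E(y) = b0 + b1*x1 + b2*x2 + b3*x3 + b4*x1*x2 + b5*x1*x3
         = (b0 + b2*x2 + b3*x3) + (b1 + b4*x2 + b5*x3) * x1
    """
    b0, b1, b2, b3, b4, b5 = betas
    intercept = b0 + b2 * x2 + b3 * x3
    slope = b1 + b4 * x2 + b5 * x3
    return intercept, slope


# Hypothetical coefficients (b0, b1, b2, b3, b4, b5).
betas = (10.0, 2.0, 3.0, -1.0, 0.5, 0.0)

base_line = line_for_level(betas, x2=0, x3=0)    # baseline level
second_line = line_for_level(betas, x2=1, x3=0)  # level coded by x2
third_line = line_for_level(betas, x2=0, x3=1)   # level coded by x3
print(base_line, second_line, third_line)
```

Without the interaction terms ($\beta_4 = \beta_5 = 0$) the three lines would share one slope and differ only in intercept.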

Model Building: Comparing Nested Models
Models are nested if one model contains all the terms of the other model and at least one additional term.
Complete (full) model: the more complex model
Reduced model: the simpler model
