
Presentation 7.3a: Multiple linear regression

Murray Logan

July 19, 2017

Table of contents

1 Theory
2 Centering data
3 Assumptions
4 Multiple linear models in R
5 Model selection
6 Worked Examples

1. Theory

1.1. Multiple Linear Regression

1.1.1. Additive model

growth = intercept + temperature + nitrogen

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \ldots + \beta_j x_{ij} + \epsilon_i$$

OR

$$y_i = \beta_0 + \sum_{j=1}^{N} \beta_j x_{ij} + \epsilon_i$$

1.2. Multiple Linear Regression

1.2.1. Additive model

growth = intercept + temperature + nitrogen

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \ldots + \beta_j x_{ij} + \epsilon_i$$

- effect of one predictor holding the other(s) constant

1.3. Multiple Linear Regression

1.3.1. Additive model

growth = intercept + temperature + nitrogen

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \ldots + \beta_j x_{ij} + \epsilon_i$$

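Y      X1     X2
3      22.7   0.9
2.5    23.7   0.5
6      25.7   0.6
5.5    29.1   0.7
9      22     0.8
8.6    29     1.3
12     29.4   1

1.4. Multiple Linear Regression

1.4.1. Additive model

$$
\begin{aligned}
3 &= \beta_0 + (\beta_1 \times 22.7) + (\beta_2 \times 0.9) + \epsilon_1 \\
2.5 &= \beta_0 + (\beta_1 \times 23.7) + (\beta_2 \times 0.5) + \epsilon_2 \\
6 &= \beta_0 + (\beta_1 \times 25.7) + (\beta_2 \times 0.6) + \epsilon_3 \\
5.5 &= \beta_0 + (\beta_1 \times 29.1) + (\beta_2 \times 0.7) + \epsilon_4 \\
9 &= \beta_0 + (\beta_1 \times 22) + (\beta_2 \times 0.8) + \epsilon_5 \\
8.6 &= \beta_0 + (\beta_1 \times 29) + (\beta_2 \times 1.3) + \epsilon_6 \\
12 &= \beta_0 + (\beta_1 \times 29.4) + (\beta_2 \times 1) + \epsilon_7
\end{aligned}
$$

1.5. Multiple Linear Regression

1.5.1. Multiplicative model

growth = intercept + temp + nitro + temp × nitro

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i1} x_{i2} + \ldots + \epsilon_i$$

1.6. Multiple Linear Regression

1.6.1. Multiplicative model

$$
\begin{aligned}
3 &= \beta_0 + (\beta_1 \times 22.7) + (\beta_2 \times 0.9) + (\beta_3 \times 22.7 \times 0.9) + \epsilon_1 \\
2.5 &= \beta_0 + (\beta_1 \times 23.7) + (\beta_2 \times 0.5) + (\beta_3 \times 23.7 \times 0.5) + \epsilon_2 \\
6 &= \beta_0 + (\beta_1 \times 25.7) + (\beta_2 \times 0.6) + (\beta_3 \times 25.7 \times 0.6) + \epsilon_3 \\
5.5 &= \beta_0 + (\beta_1 \times 29.1) + (\beta_2 \times 0.7) + (\beta_3 \times 29.1 \times 0.7) + \epsilon_4 \\
9 &= \beta_0 + (\beta_1 \times 22) + (\beta_2 \times 0.8) + (\beta_3 \times 22 \times 0.8) + \epsilon_5 \\
8.6 &= \beta_0 + (\beta_1 \times 29) + (\beta_2 \times 1.3) + (\beta_3 \times 29 \times 1.3) + \epsilon_6 \\
12 &= \beta_0 + (\beta_1 \times 29.4) + (\beta_2 \times 1) + (\beta_3 \times 29.4 \times 1) + \epsilon_7
\end{aligned}
$$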
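The seven equations above are the rows of what R assembles as a design matrix. As a minimal sketch, the table can be entered directly and both models fitted by least squares. The object names `toy`, `X.add`, `X.mult`, `toy.add.lm` and `toy.mult.lm` are illustrative assumptions and do not appear in the slides.

```r
# Toy data from the table above (Y = response, X1 and X2 = predictors)
toy <- data.frame(
  Y  = c(3, 2.5, 6, 5.5, 9, 8.6, 12),
  X1 = c(22.7, 23.7, 25.7, 29.1, 22, 29, 29.4),
  X2 = c(0.9, 0.5, 0.6, 0.7, 0.8, 1.3, 1)
)

# Additive model: one column per beta (intercept, X1, X2)
X.add <- model.matrix(~ X1 + X2, data = toy)

# Multiplicative model: the same columns plus the X1:X2 product column
X.mult <- model.matrix(~ X1 * X2, data = toy)

# Least-squares estimates of the betas for each model
toy.add.lm  <- lm(Y ~ X1 + X2, data = toy)
toy.mult.lm <- lm(Y ~ X1 * X2, data = toy)

coef(toy.add.lm)   # beta_0, beta_1, beta_2
coef(toy.mult.lm)  # beta_0, beta_1, beta_2, beta_3
```

Each row of `X.mult` holds the multipliers of the betas in the corresponding equation of the multiplicative model, so the `X1:X2` column is simply the product of the `X1` and `X2` columns. With only seven observations the estimates are for illustration only.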

2. Centering data

2.1. Multiple Linear Regression

2.1.1. Centering

[Figure: scatterplot of y (axis from −20 to 20) against x (axis from 0 to 60)]

2.2. Multiple Linear Regression

2.2.1. Centering

[Figure: the x values on an axis from 47 to 54]

2.3. Multiple Linear Regression

2.3.1. Centering

[Figure: the x values on an axis from 47 to 54]

2.4. Multiple Linear Regression

2.4.1. Centering

[Figure: the x values on the original axis (47 to 54) with the centered axis (−3 to 4) beneath]
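Centering subtracts the mean from each predictor value, which is the shift from the 47 to 54 scale to a scale around zero shown above. The following is a minimal sketch on simulated data; the seed, the generated values, and the names `x`, `y`, `cx`, `raw.lm` and `cen.lm` are hypothetical and not taken from the slides.

```r
set.seed(1)                          # arbitrary seed, for reproducibility only
x <- runif(20, min = 47, max = 54)   # hypothetical predictor on the 47 to 54 scale
y <- 2 + 0.4 * x + rnorm(20)         # hypothetical response

# Center the predictor: subtract its mean so the new variable has mean zero
cx <- x - mean(x)
# An equivalent base R call: cx <- as.numeric(scale(x, scale = FALSE))

raw.lm <- lm(y ~ x)    # intercept is the expected y at x = 0
cen.lm <- lm(y ~ cx)   # intercept is the expected y at the mean of x

coef(raw.lm)
coef(cen.lm)
```

The slope is the same in both fits. Only the intercept changes: with the centered predictor it refers to the middle of the observed data rather than to x = 0, which lies far outside the observed range.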

2.5. Multiple Linear Regression

2.5.1. Centering

[Figure: scatterplot of y (axis from 16 to 24) against the centered predictor cx1 (axis from −4 to 4)]

3. Assumptions

3.1. Multiple Linear Regression

3.1.1. Assumptions

Normality, homogeneity of variance, linearity

3.2. Multiple Linear Regression

3.2.1. Assumptions

(multi)collinearity

3.3. Multiple Linear Regression

3.3.1. Variance inflation

Strength of a relationship: $R^2$

Strong when $R^2 \geq 0.8$

3.4. Multiple Linear Regression

3.4.1. Variance inflation

$$\text{var.inf} = \frac{1}{1 - R^2}$$

Collinear when var.inf ≥ 5

Some prefer > 3

3.5. Multiple Linear Regression

3.5.1. Assumptions

(multi)collinearity

    library(car)
    # additive model - scaled predictors
    vif(lm(y ~ cx1 + cx2, data))

         cx1      cx2
    1.743817 1.743817

3.6. Multiple Linear Regression

3.6.1. Assumptions

(multi)collinearity

    library(car)
    # additive model - scaled predictors
    vif(lm(y ~ cx1 + cx2, data))

         cx1      cx2
    1.743817 1.743817

    # multiplicative model - raw predictors
    vif(lm(y ~ x1 * x2, data))

           x1        x2     x1:x2
     7.259729  5.913254 16.949468

3.7. Multiple Linear Regression

3.7.1. Assumptions

    # multiplicative model - raw predictors
    vif(lm(y ~ x1 * x2, data))

           x1        x2     x1:x2
     7.259729  5.913254 16.949468

    # multiplicative model - scaled predictors
    vif(lm(y ~ cx1 * cx2, data))

         cx1      cx2  cx1:cx2
    1.769411 1.771994 1.018694

4. Multiple linear models in R

4.1. Model fitting

Additive model

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \epsilon_i$$

    data.add.lm <- lm(y~cx1+cx2, data)

4.2. Model fitting

Additive model

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \epsilon_i$$

    data.add.lm <- lm(y~cx1+cx2, data)

Multiplicative model

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i1} x_{i2} + \epsilon_i$$

    data.mult.lm <- lm(y~cx1+cx2+cx1:cx2, data)
    #OR
    data.mult.lm <- lm(y~cx1*cx2, data)

4.3. Model evaluation

Additive model

    plot(data.add.lm)

[Figure: diagnostic plots for data.add.lm. Panels: Residuals vs Fitted; Normal Q−Q (standardized residuals against theoretical quantiles); Scale−Location; Residuals vs Leverage (with Cook's distance)]

4.4. Model evaluation

Multiplicative model

    plot(data.mult.lm)

[Figure: diagnostic plots for data.mult.lm. Panels: Residuals vs Fitted; Normal Q−Q (standardized residuals against theoretical quantiles); Scale−Location; Residuals vs Leverage (with Cook's distance)]
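Sections 2 to 4 can be tied together in one short sketch. It assumes simulated data: the seed, sample size, coefficients, and every name (`sim`, `vif.by.hand`, `vif.raw`, `vif.cen`, `sim.add.lm`, `sim.mult.lm`) are hypothetical and not taken from the slides. Variance inflation is computed directly from var.inf = 1/(1 − R²) in base R, rather than with `car::vif`, to show where the numbers come from.

```r
set.seed(42)                                   # arbitrary seed
n  <- 100
x1 <- rnorm(n, mean = 50, sd = 5)              # hypothetical raw predictors,
x2 <- 0.3 * x1 + rnorm(n, mean = 10, sd = 2)   # moderately correlated
y  <- 1 + 0.5 * x1 - 0.8 * x2 + 0.05 * x1 * x2 + rnorm(n)
sim <- data.frame(y, x1, x2)

# Centered versions of the predictors
sim$cx1 <- sim$x1 - mean(sim$x1)
sim$cx2 <- sim$x2 - mean(sim$x2)

# Variance inflation for each term: regress its design-matrix column
# on all the other columns and apply 1 / (1 - R^2)
vif.by.hand <- function(formula, data) {
  X <- model.matrix(formula, data)[, -1, drop = FALSE]  # drop the intercept column
  sapply(colnames(X), function(term) {
    others <- X[, colnames(X) != term, drop = FALSE]
    r2 <- summary(lm(X[, term] ~ others))$r.squared
    1 / (1 - r2)
  })
}

vif.raw <- vif.by.hand(~ x1 * x2, sim)     # multiplicative model, raw predictors
vif.cen <- vif.by.hand(~ cx1 * cx2, sim)   # multiplicative model, centered predictors
vif.raw
vif.cen

# Fit the additive and multiplicative models on the centered predictors
sim.add.lm  <- lm(y ~ cx1 + cx2, data = sim)
sim.mult.lm <- lm(y ~ cx1 * cx2, data = sim)
summary(sim.mult.lm)$coefficients

# Diagnostic panels as in section 4.4 (uncomment to draw them):
# par(mfrow = c(2, 2)); plot(sim.mult.lm)
```

In this simulation the raw product term is strongly related to its component predictors, so its variance inflation is large, while centering brings all three values down. This mirrors the pattern in the `vif` output of section 3.7. For models like these, where every term has a single coefficient, the hand calculation should agree with what `car::vif` reports.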
