SLIDE 1

Bus 701: Advanced Statistics

Harald Schmidbauer

© Harald Schmidbauer & Angi Rösch, 2008
SLIDE 2

Chapter 14: Multiple Regression

SLIDE 3

14.1 Introduction

SLR and Multiple Linear Regression.

  • Goal of SLR: explain the variability in Y, using a single variable X.
  • Goal of multiple linear regression: explain the variability in Y, using a set of variables X1, X2, . . . , Xk.

SLIDE 4

14.1 Introduction

The problem. Given are points $(x_{1i}, x_{2i}, \dots, x_{ki}, y_i)$, where:

  • $y_i$: observations from a variable Y, the dependent variable;
  • $x_{ji}$: observations from a variable Xj, which is an independent variable.

Given a (k+1)-dimensional cloud of points, how can we fit a hyperplane?

SLIDE 5

14.1 Introduction

Outlook on Chapter 14.

  • 14.2 An Intuitive Approach: three-dimensional scatterplots and a regression plane
  • 14.3 The Regression Plane: the method of least squares
  • 14.4 Explanatory Power of the Model: decomposition of variance; coefficient of determination
  • 14.5 A Stochastic Model of Multiple Regression: stochastic model and statistical inference
  • 14.6 Examples
  • 14.7 Prediction Based on Multiple Regression: point prediction and prediction intervals

SLIDE 6

14.2 An Intuitive Approach

The case of three variables: X1, X2, Y. We shall now see a three-dimensional scatterplot in two perspectives with:

  • black points, representing the observations,
  • a plane, which somehow fits these points,
  • red points, the projection of the black points onto the plane,
  • the distance between the black and the red points.

SLIDE 7

14.2 An Intuitive Approach

Observed points and their projections onto the plane.

SLIDE 8

14.2 An Intuitive Approach

Observed points and their projections onto the plane.

SLIDE 9

14.2 An Intuitive Approach

How to find that plane. In order to find a "good" plane to represent the cloud of points, we need:

  • the equation of a plane, depending on parameters,
  • a distance function,
  • to find the parameter values such that the distance function is minimized.

SLIDE 10

14.3 The Regression Plane

A plane and the observations.

  • Plane in 3-dimensional space: $y = a + b_1 x_1 + b_2 x_2$
  • With observations $(x_{1i}, x_{2i}, y_i)$, $i = 1, \dots, n$:

$$\begin{aligned}
\hat y_1 &= a + b_1 x_{11} + b_2 x_{21}, & e_1 &= y_1 - \hat y_1 \\
\hat y_2 &= a + b_1 x_{12} + b_2 x_{22}, & e_2 &= y_2 - \hat y_2 \\
&\;\;\vdots & &\;\;\vdots \\
\hat y_n &= a + b_1 x_{1n} + b_2 x_{2n}, & e_n &= y_n - \hat y_n
\end{aligned}$$

  • The $\hat y_i$ are called the fitted values.
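A minimal R sketch (invented toy numbers, purely illustrative) of fitted values and residuals for a candidate plane:

x1 <- c(1, 2, 3, 4); x2 <- c(2, 1, 4, 3)   # toy observations
y  <- c(3.1, 3.9, 7.2, 7.8)
a <- 1; b1 <- 1; b2 <- 0.5                 # some candidate plane y = a + b1*x1 + b2*x2
y.hat <- a + b1*x1 + b2*x2                 # fitted values
e     <- y - y.hat                         # residuals e_i = y_i - y.hat_i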

SLIDE 11

14.3 The Regression Plane

Using matrices. The last relations can be written as

$$\hat y = Xb, \qquad e = y - \hat y = y - Xb,$$

where

$$\hat y = \begin{pmatrix} \hat y_1 \\ \hat y_2 \\ \vdots \\ \hat y_n \end{pmatrix}, \quad
X = \begin{pmatrix} 1 & x_{11} & x_{21} \\ 1 & x_{12} & x_{22} \\ \vdots & \vdots & \vdots \\ 1 & x_{1n} & x_{2n} \end{pmatrix}, \quad
b = \begin{pmatrix} a \\ b_1 \\ b_2 \end{pmatrix}, \quad
y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}, \quad
e = \begin{pmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{pmatrix}.$$

SLIDE 12

14.3 The Regression Plane

Definition.

  • Define $\hat y_i = a + b_1 x_{1i} + b_2 x_{2i}$ and $e_i = y_i - \hat y_i$.
  • The regression plane of Y with respect to X1 and X2 is the plane $y = a + b_1 x_1 + b_2 x_2$ with a, b1 and b2 such that

$$Q(a, b_1, b_2) = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \hat y_i)^2 = \sum_{i=1}^{n} (y_i - a - b_1 x_{1i} - b_2 x_{2i})^2$$

    attains its minimum.
  • b1 and b2: regression coefficients.
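A hedged R sketch (toy numbers from before) of Q and its direct numerical minimization; the closed-form solution follows later in this section:

x1 <- c(1, 2, 3, 4); x2 <- c(2, 1, 4, 3)
y  <- c(3.1, 3.9, 7.2, 7.8)
Q <- function(par) sum((y - par[1] - par[2]*x1 - par[3]*x2)^2)   # par = (a, b1, b2)
optim(c(0, 0, 0), Q)$par   # numerically close to coef(lm(y ~ x1 + x2))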

SLIDE 13

14.3 The Regression Plane

Regression: some first comments.

  • This procedure is asymmetric, like SLR!
  • It conforms to the idea: Given X1 and X2, what is Y?
  • X1, X2: "independent variables"; Y: "dependent variable".
  • This procedure can be easily generalized to k > 2 independent variables.
  • The case k > 2 cannot be easily visualized in terms of a scatterplot.

SLIDE 14

14.3 The Regression Plane

Example: Used cars.

  • For a set of used cars, consider these variables:
    – mileage (km)
    – age (months)
    – price (€)
  • A natural choice is:
    – dependent variable: price
    – independent variables: mileage, age

SLIDE 15

14.3 The Regression Plane

Example: Used cars.

  • Important: The so-called "independent variables" need not be uncorrelated.
  • For our sample of 400 cars (VW Golf 1.8):

[Figure: scatterplot of age (months, roughly 60–180) against mileage (1000 km, roughly 50–200); correlation: 0.43; red points: cars with a/c.]

SLIDE 16

14.3 The Regression Plane

Computing the regression plane.

  • Minimizing Q leads to the following vector equation:

$$b = (X'X)^{-1}X'y$$

  • The fitted values are:

$$\hat y = Xb = X(X'X)^{-1}X'y$$

  • These formulas apply to any number k of independent variables.
  • For k = 1, the formulas of SLR are obtained.
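A hedged R sketch (same toy numbers as above) of the closed-form solution:

x1 <- c(1, 2, 3, 4); x2 <- c(2, 1, 4, 3)
y  <- c(3.1, 3.9, 7.2, 7.8)
X <- cbind(1, x1, x2)                       # design matrix, rows (1, x1i, x2i)
b <- solve(crossprod(X), crossprod(X, y))   # solves (X'X) b = X'y
drop(b)                                     # same values as coef(lm(y ~ x1 + x2))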

SLIDE 17

14.3 The Regression Plane

Multiple regression: some properties in the context of descriptive statistics.

  • The vector of arithmetic means $(\bar x_1, \bar x_2, \bar y)$ is on the regression plane.
  • The average error $\bar e$ equals zero.
  • The matrix $X(X'X)^{-1}X'$ in $\hat y = Xb = X(X'X)^{-1}X'y$ is a projection matrix: y is projected onto a subspace of $\mathbb{R}^n$.
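A hedged R sketch (toy numbers again) checking these properties numerically:

x1 <- c(1, 2, 3, 4); x2 <- c(2, 1, 4, 3)
y  <- c(3.1, 3.9, 7.2, 7.8)
X <- cbind(1, x1, x2)
H <- X %*% solve(crossprod(X)) %*% t(X)   # the projection ("hat") matrix
all.equal(H %*% H, H)                     # idempotent: projecting twice changes nothing
e <- y - drop(H %*% y)                    # residuals
mean(e)                                   # average error: zero up to rounding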

SLIDE 18

14.3 The Regression Plane

Example: Used cars.

  • Data from 400 used cars (VW Golf 1.8, age at least 5 years, mileage at most 200000 km).
  • The fitted regression plane is:

    price = 14146.2 − 24.61 · mileage − 49.13 · age

    (Price in €, mileage in 1000 km, age in months.)
  • According to this result: What is the average price of a car with mileage 100000 km, age 10 years?
  • How much will this decrease if the car is used for another year, for another 12000 km? (A worked answer follows below.)
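Worked answer (plain arithmetic on the fitted plane; mileage enters in 1000 km, age in months):

  price(100, 120) = 14146.2 − 24.61 · 100 − 49.13 · 120 = 5789.6 €

Another year adds 12 months of age and, here, 12 (thousand km) of mileage, so the average price drops by 24.61 · 12 + 49.13 · 12 = 884.88 €.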

SLIDE 19

14.3 The Regression Plane

Example: Used cars. Scatterplot:

SLIDE 20

14.3 The Regression Plane

Example: Used cars. Scatterplot:

SLIDE 21

14.4 Explanatory Power of the Model

Decomposition of variance. As in SLR, it holds that:

$$\sum_{i=1}^{n} (y_i - \bar y)^2 = \sum_{i=1}^{n} (\hat y_i - \bar y)^2 + \sum_{i=1}^{n} (y_i - \hat y_i)^2,$$

$$SST = SSR + SSE$$

where
  SST: total sum of squares,
  SSR: regression sum of squares,
  SSE: error sum of squares.
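A hedged R sketch (simulated toy data) verifying the decomposition and previewing the next slide's ratio:

set.seed(1)
x1 <- rnorm(50); x2 <- rnorm(50)
y  <- 2 + x1 - 0.5*x2 + rnorm(50)
fit <- lm(y ~ x1 + x2)
SST <- sum((y - mean(y))^2)
SSR <- sum((fitted(fit) - mean(y))^2)
SSE <- sum(residuals(fit)^2)
all.equal(SST, SSR + SSE)   # TRUE
SSR / SST                   # coefficient of determination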

SLIDE 22

14.4 Explanatory Power of the Model

The coefficient of determination. It is defined as:

$$R^2 = \frac{SSR}{SST}$$

  • The coefficient of determination is the share of variability in the data which is explained by the regression.
  • In contrast to SLR, the coefficient of determination cannot be computed as the square of a coefficient of correlation.
  • R² = 100% if and only if all observed points are on the regression plane.
  • R² = 0% means that no linear combination of independent variables contributes to explaining Y.

SLIDE 23

14.4 Explanatory Power of the Model

Example: Used cars. Compare the following fitted models and their R²s:

  • Model 1 (R² = 0.434): price = 8984.41 − 38.20 · mileage
  • Model 2 (R² = 0.528): price = 13160.68 − 65.61 · age
  • Model 3 (R² = 0.675): price = 14146.2 − 24.61 · mileage − 49.13 · age
  • According to each model: What is the average price of a car with mileage 100000 km, age 10 years? (Worked answers below.)
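Worked answers (mileage = 100 in 1000 km, age = 120 in months):

  Model 1: 8984.41 − 38.20 · 100 = 5164.41 €
  Model 2: 13160.68 − 65.61 · 120 = 5287.48 €
  Model 3: 14146.2 − 24.61 · 100 − 49.13 · 120 = 5789.60 €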

SLIDE 24

14.5 A Stochastic MLR Model

Multiple regression in descriptive and inductive statistics.

  • So far, we have seen multiple regression from a purely descriptive point of view. (There were no probabilities, no stochastic models.)
  • A stochastic model is needed to
    – obtain insight into the mechanism which created the data,
    – make reliable statements about out-of-sample cases.
  • We shall now see this model, written out for k = 2 independent variables.

SLIDE 25

14.5 A Stochastic MLR Model

A stochastic multiple linear regression model.

$$Y_i = \alpha + \beta_1 x_{1i} + \beta_2 x_{2i} + \epsilon_i, \qquad i = 1, \dots, n$$

  • The random variable $Y_i$ represents the observation belonging to $x_{1i}$ and $x_{2i}$.
  • α, β1 and β2 are unknown parameters (to be estimated).
  • $x_{ji}$ is the observation of the independent variable Xj.
  • $\epsilon_i$ is a random variable; it contains everything not accounted for in the equation $y = \alpha + \beta_1 x_1 + \beta_2 x_2$.

SLIDE 26

14.5 A Stochastic MLR Model

Matrix form of the stochastic model. The system $Y_i = \alpha + \beta_1 x_{1i} + \beta_2 x_{2i} + \epsilon_i$, $i = 1, \dots, n$, can be written as

$$Y = X\beta + \epsilon,$$

where

$$Y = \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix}, \quad
X = \begin{pmatrix} 1 & x_{11} & x_{21} \\ 1 & x_{12} & x_{22} \\ \vdots & \vdots & \vdots \\ 1 & x_{1n} & x_{2n} \end{pmatrix}, \quad
\beta = \begin{pmatrix} \alpha \\ \beta_1 \\ \beta_2 \end{pmatrix}, \quad
\epsilon = \begin{pmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_n \end{pmatrix}.$$

The generalization to k independent variables is straightforward.

SLIDE 27

14.5 A Stochastic MLR Model

Assumptions in the stochastic multiple linear regression model. For statistical inference, we assume:

  • The matrix X has full rank.
  • The matrix X is considered fixed (non-stochastic).
  • $\epsilon_i \sim N(0, \sigma_\epsilon^2)$, iid for $i = 1, \dots, n$.

With the last assumption, it holds that $E(Y_i \mid x_{1i}, x_{2i}) = \alpha + \beta_1 x_{1i} + \beta_2 x_{2i}$, $i = 1, \dots, n$.

SLIDE 28

14.5 A Stochastic MLR Model

Computing estimators.

  • The method of least squares leads to the following estimator for β:

$$\hat\beta = (X'X)^{-1}X'Y$$

  • As a random vector, $\hat\beta$ has a covariance matrix. It is given by

$$\mathrm{var}(\hat\beta) = \sigma_\epsilon^2 \cdot (X'X)^{-1}.$$

  • The residual error variance can be estimated as

$$s_\epsilon^2 = \frac{SSE}{n - k - 1}$$
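A hedged R sketch (simulated toy data) computing these estimators by hand:

set.seed(1)
n <- 50; k <- 2
x1 <- rnorm(n); x2 <- rnorm(n)
y  <- 2 + x1 - 0.5*x2 + rnorm(n)
X  <- cbind(1, x1, x2)
beta.hat <- solve(crossprod(X), crossprod(X, y))   # (X'X)^{-1} X'y
SSE <- sum((y - X %*% beta.hat)^2)
s2  <- SSE / (n - k - 1)                           # estimate of sigma_eps^2
V   <- s2 * solve(crossprod(X))                    # estimated var(beta.hat)
sqrt(diag(V))   # standard errors, as in coef(summary(lm(y ~ x1 + x2)))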

SLIDE 29

14.5 A Stochastic MLR Model

Statistical inference about the parameters.

  • Statistical inference about βj is based on the following property:

$$\frac{\hat\beta_j - \beta_j}{s_{\beta_j}} \sim t_{n-k-1},$$

    where $s_{\beta_j}$ is the standard error of $\hat\beta_j$.

  • The standard error $s_{\beta_j}$ can be obtained from

$$\widehat{\mathrm{var}}(\hat\beta) = s_\epsilon^2 \cdot (X'X)^{-1}.$$

    (This may be tedious to compute, but it is standard output in statistical software packages.)
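A hedged R sketch (simulated toy data) of the t statistics and p-values behind such output:

set.seed(1)
x1 <- rnorm(50); x2 <- rnorm(50)
y  <- 2 + x1 - 0.5*x2 + rnorm(50)
fit <- lm(y ~ x1 + x2)
se    <- sqrt(diag(vcov(fit)))                       # standard errors s_beta_j
t.val <- coef(fit) / se                              # tests of H0: beta_j = 0
p.val <- 2 * pt(-abs(t.val), df = df.residual(fit))
cbind(t.val, p.val)                                  # matches summary(fit)$coefficients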

SLIDE 30

14.5 A Stochastic MLR Model

Which variables to include?

  • We prefer models with large R² and small $s_\epsilon^2$.
  • Should an additional variable be included as independent variable in the model?
  • Including an additional variable will always
    – increase R²,
    – reduce SSE,
    – decrease the degrees of freedom.
  • This is why including an additional variable need not reduce $s_\epsilon^2$; care needs to be taken! (Compare Models 1 and 2 in the OSG example below: adding export raises R² from 0.2919 to 0.2923, yet the residual standard error rises from 8.962 to 9.041.)

SLIDE 31

14.6 Examples

Example: Returns on OSG stock. Overseas Shipholding Group, Inc. ("OSG") is a marine transportation company whose stock is listed at the New York Stock Exchange (NYSE). Let variables be defined as:

  • osg.ret = monthly return on OSG stock;
  • nyse.ret = monthly return on the NYSE Composite Index;
  • sop.ret = monthly change in spot oil price (WTI);
  • export = exported goods (from USA), in million USD.

Question: Which variables can explain returns on OSG stock?
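A hedged sketch of the fits shown on the next three slides, assuming the four monthly series are columns of a data frame osg (the name is illustrative, not from the lecture):

fit1 <- lm(osg.ret ~ nyse.ret, data = osg)            # Model 1
fit2 <- lm(osg.ret ~ nyse.ret + export, data = osg)   # Model 2
fit3 <- lm(osg.ret ~ nyse.ret + sop.ret, data = osg)  # Model 3
summary(fit1)   # prints output of the kind shown below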

SLIDE 32

14.6 Examples

Example: Returns on OSG stock. Model 1:

Coefficients:
            Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)   1.4989      1.1801    1.270     0.209
nyse.ret      1.4737      0.3067    4.805   1.2e-05 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 8.962 on 56 degrees of freedom
Multiple R-Squared: 0.2919, Adjusted R-squared: 0.2793
F-statistic: 23.09 on 1 and 56 DF, p-value: 1.200e-05

SLIDE 33

14.6 Examples

Example: Returns on OSG stock. Model 2:

Coefficients:
              Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)  3.592e+00   1.167e+01    0.308     0.759
nyse.ret     1.478e+00   3.101e-01    4.764  1.43e-05 ***
export      -3.319e-05   1.841e-04   -0.180     0.858
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 9.041 on 55 degrees of freedom
Multiple R-Squared: 0.2923, Adjusted R-squared: 0.2666
F-statistic: 11.36 on 2 and 55 DF, p-value: 7.419e-05

SLIDE 34

14.6 Examples

Example: Returns on OSG stock. Model 3:

Coefficients:
            Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)   0.9753      1.1812    0.826    0.4125
nyse.ret      1.5615      0.3024    5.163  3.45e-06 ***
sop.ret       0.3025      0.1536    1.970    0.0539 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 8.74 on 55 degrees of freedom
Multiple R-Squared: 0.3386, Adjusted R-squared: 0.3145
F-statistic: 14.08 on 2 and 55 DF, p-value: 1.156e-05

SLIDE 35

14.6 Examples

Example: Life expectancy, literacy, GDP.

What is the relation between literacy (lit), the expectation of life (lifeEx), and (doubly logged) GDP per capita (loglogGDPpc)?

[Figure: scatterplot matrix of lit, loglogGDPpc and lifeEx, points colored by continent: Africa, America, Asia, Australia, Europe.]
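A hedged sketch of the three fits on the next slides, assuming the variables are columns of a data frame countries (the name is illustrative, not from the lecture):

m1 <- lm(lifeEx ~ log(log(GDPpc)), data = countries)        # Model 1
m2 <- lm(lifeEx ~ lit, data = countries)                    # Model 2
m3 <- lm(lifeEx ~ log(log(GDPpc)) + lit, data = countries)  # Model 3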

SLIDE 36

14.6 Examples

Example: Life expectancy, literacy, GDP. Model 1:

Coefficients:
                  Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)      -103.386       9.158   -11.29    <2e-16 ***
log(log(GDPpc))    78.875       4.253    18.55    <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 6.538 on 119 degrees of freedom
Multiple R-Squared: 0.743, Adjusted R-squared: 0.7408
F-statistic: 344 on 1 and 119 DF, p-value: < 2.2e-16

SLIDE 37

14.6 Examples

Example: Life expectancy, literacy, GDP. Model 2:

Coefficients:
            Estimate  Std. Error  t value  Pr(>|t|)
(Intercept) 27.66047     3.55972     7.77  3.08e-12 ***
lit          0.46619     0.04199    11.10   < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 9.038 on 119 degrees of freedom
Multiple R-Squared: 0.5088, Adjusted R-squared: 0.5046
F-statistic: 123.2 on 1 and 119 DF, p-value: < 2.2e-16

SLIDE 38

14.6 Examples

Example: Life expectancy, literacy, GDP. Model 3:

Coefficients:
                  Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)      -90.64350    11.36348   -7.977  1.09e-12 ***
log(log(GDPpc))   69.62269     6.51710   10.683   < 2e-16 ***
lit                0.08656     0.04655    1.860    0.0654 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 6.471 on 118 degrees of freedom
Multiple R-Squared: 0.7503, Adjusted R-squared: 0.7461
F-statistic: 177.3 on 2 and 118 DF, p-value: < 2.2e-16

SLIDE 39

14.7 Prediction Based on MLR

Point prediction vs. interval prediction. (Case k = 2.) Let x1, x2 be given. The outcome of the random variable $Y = \alpha + \beta_1 x_1 + \beta_2 x_2 + \epsilon$ can be predicted in terms of. . .

  • a single point: $\hat Y = \hat\alpha + \hat\beta_1 x_1 + \hat\beta_2 x_2$
    – This has disadvantages similar to those of a point estimate.
  • a prediction interval. It has to cope with two sources of uncertainty:
    – The parameters α, β1, β2 are unknown.
    – There is a random error ε, which has an unknown variance $\sigma_\epsilon^2$.

SLIDE 40

14.7 Prediction Based on MLR

Prediction intervals. (Case k = 2.) Given a vector $x_0 = (1, x_{1,n+1}, x_{2,n+1})'$ with out-of-sample values $x_{1,n+1}$ and $x_{2,n+1}$, a 95% prediction interval for the corresponding $Y_{n+1}$ has bounds

$$\hat Y_{n+1} \pm t_{n-k-1,\,0.975} \cdot s_\epsilon \cdot \sqrt{1 + x_0'(X'X)^{-1}x_0}$$

These are the bounds of an interval which will contain the random variable $Y_{n+1} = \alpha + \beta_1 x_{1,n+1} + \beta_2 x_{2,n+1} + \epsilon$ with probability 95%. Here, $\hat Y_{n+1}$ is a point prediction, obtained as $\hat Y_{n+1} = \hat\alpha + \hat\beta_1 x_{1,n+1} + \hat\beta_2 x_{2,n+1}$.
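A hedged R sketch (simulated toy data): predict() returns exactly these bounds.

set.seed(1)
x1 <- rnorm(50); x2 <- rnorm(50)
y  <- 2 + x1 - 0.5*x2 + rnorm(50)
fit <- lm(y ~ x1 + x2)
x.new <- data.frame(x1 = 0.5, x2 = -1)   # illustrative out-of-sample values
predict(fit, newdata = x.new, interval = "prediction", level = 0.95)
# Internally: Yhat +/- t_{n-k-1, 0.975} * s_eps * sqrt(1 + x0' (X'X)^{-1} x0)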

SLIDE 41

14.7 Prediction Based on MLR

Prediction intervals. (Case k = 2.) An approximation formula for the interval bounds is

$$\hat Y_{n+1} \pm t_{n-k-1,\,0.975} \cdot s_\epsilon \cdot \sqrt{1 + \frac{1}{n} + \frac{(x_{1,n+1} - \bar x_1)^2}{\sum_i (x_{1i} - \bar x_1)^2} + \frac{(x_{2,n+1} - \bar x_2)^2}{\sum_i (x_{2i} - \bar x_2)^2}}$$

  • This formula may be used if the independent variables are uncorrelated and n is large.
  • The generalization to k > 2 is straightforward.

  • The generalization to k > 2 is straightforward.

SLIDE 42

14.7 Prediction Based on MLR

Example: Used cars.

  • Based on a sample of size n = 400, the fitted model is:

    price = 14146.2 − 24.61 · mileage − 49.13 · age

  • Point forecast of the price of a car with mileage 100000 km, age 10 years:

    14146.2 − 24.61 · 100 − 49.13 · 120 = 5789.6

SLIDE 43

14.7 Prediction Based on MLR

Example: Used cars.

  • Bounds of a 95% prediction interval:

    exact formula:        5789.6 ± 1.966 · 1240 · 1.002807
    approximate formula:  5789.6 ± 1.966 · 1240 · 1.003476

  • Corresponding 95% prediction intervals:

    exact formula:        [3345.0, 8234.3]
    approximate formula:  [3343.4, 8235.9]
