
Linear Regression 18.05 Spring 2014

Agenda
• Fitting curves to bivariate data
• Measuring the goodness of fit
• The fit vs. complexity tradeoff
• Regression to the mean
• Multiple linear regression
January 1, 2017

Modeling bivariate data as a function + noise
Ingredients
• Bivariate data $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$.
• Model: $y_i = f(x_i) + E_i$, where $f(x)$ is some function and $E_i$ is random error.
• Total squared error: $\sum_{i=1}^n E_i^2 = \sum_{i=1}^n (y_i - f(x_i))^2$
The model allows us to predict the value of y for any given value of x.
• x is called the independent or predictor variable.
• y is the dependent or response variable.
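As a rough illustration of the total squared error, the sketch below scores two candidate functions f on the three points (1, 3), (2, 1), (4, 4) used in the board questions of these slides. The name total_squared_error and the two candidate models are illustrative choices, not part of the original slides.

```python
def total_squared_error(f, xs, ys):
    """Sum of (y_i - f(x_i))^2 over all data points."""
    return sum((y - f(x)) ** 2 for x, y in zip(xs, ys))

xs = [1.0, 2.0, 4.0]
ys = [3.0, 1.0, 4.0]

# Two hypothetical candidate models; the smaller total is the better fit.
err_line = total_squared_error(lambda x: 0.5 * x + 1.5, xs, ys)
err_flat = total_squared_error(lambda x: 2.0, xs, ys)
print(err_line, err_flat)
```

The line scores lower than the constant function, so under the least squares criterion it is the better of these two models.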

Examples of f(x)
• lines: $y = ax + b + E$
• polynomials: $y = ax^2 + bx + c + E$
• other: $y = a/x + b + E$
• other: $y = a\sin(x) + b + E$

Simple linear regression: finding the best fitting line
Bivariate data $(x_1, y_1), \ldots, (x_n, y_n)$.
Simple linear regression: fit a line to the data
$y_i = a x_i + b + E_i$, where $E_i \sim N(0, \sigma^2)$
and where $\sigma$ is a fixed value, the same for all data points.
Total squared error: $\sum_{i=1}^n E_i^2 = \sum_{i=1}^n (y_i - a x_i - b)^2$
Goal: Find the values of a and b that give the ‘best fitting line’.
Best fit (least squares): the values of a and b that minimize the total squared error.
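One way to carry out the minimization is with calculus. The sketch below writes the total squared error as S(a, b) (the name S is introduced here for convenience) and sets both partial derivatives to zero.

```latex
S(a, b) = \sum_{i=1}^n (y_i - a x_i - b)^2

\frac{\partial S}{\partial a} = -2 \sum_{i=1}^n x_i (y_i - a x_i - b) = 0
\frac{\partial S}{\partial b} = -2 \sum_{i=1}^n (y_i - a x_i - b) = 0

% Rearranged: two linear equations in the two unknowns a and b
a \sum_{i=1}^n x_i^2 + b \sum_{i=1}^n x_i = \sum_{i=1}^n x_i y_i
a \sum_{i=1}^n x_i + n b = \sum_{i=1}^n y_i
```

The data values are known numbers, so this is a pair of simultaneous linear equations in a and b.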

Linear regression: finding the best fitting polynomial
Bivariate data: $(x_1, y_1), \ldots, (x_n, y_n)$.
Linear regression: fit a parabola to the data
$y_i = a x_i^2 + b x_i + c + E_i$, where $E_i \sim N(0, \sigma^2)$
and where $\sigma$ is a fixed value, the same for all data points.
Total squared error: $\sum_{i=1}^n E_i^2 = \sum_{i=1}^n (y_i - a x_i^2 - b x_i - c)^2$.
Goal: Find the values of a, b, c that give the ‘best fitting parabola’.
Best fit (least squares): the values of a, b, c that minimize the total squared error.
Can also fit higher order polynomials.

Stamps
[Figure: scatter plot of stamp cost y (cents) vs. time x (years since 1960).]
(Red dot = 49 cents is predicted cost in 2016.)
(Actual cost of a stamp dropped from 49 to 47 cents on 4/8/16.)

Parabolic fit
[Figure: parabolic fit, y vs. x.]

Board question: make it fit
Bivariate data: (1, 3), (2, 1), (4, 4)
1. Do (simple) linear regression to find the best fitting line. Hint: minimize the total squared error by taking partial derivatives with respect to a and b.
2. Do linear regression to find the best fitting parabola.
3. Set up the linear regression to find the best fitting cubic, but don’t take derivatives.
4. Find the best fitting exponential $y = e^{ax + b}$. Hint: take ln(y) and do simple linear regression.
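The hint in part 4 can be sketched as follows: taking logs turns $y = e^{ax+b}$ into $\ln(y) = ax + b$, so a line is fit to the points $(x_i, \ln y_i)$. This is only a sketch; note that it minimizes the squared error in ln(y), not in y itself. The helper name predict is an illustrative choice.

```python
import math

xs = [1.0, 2.0, 4.0]
ys = [3.0, 1.0, 4.0]
log_ys = [math.log(y) for y in ys]  # transformed response ln(y)

n = len(xs)
x_bar = sum(xs) / n
ly_bar = sum(log_ys) / n

# Simple linear regression of ln(y) on x (the 1/(n-1) factors cancel in the ratio).
a = (sum((x - x_bar) * (ly - ly_bar) for x, ly in zip(xs, log_ys))
     / sum((x - x_bar) ** 2 for x in xs))
b = ly_bar - a * x_bar

def predict(x):
    """Fitted exponential curve e^(a*x + b)."""
    return math.exp(a * x + b)

print(a, b)
```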

What is linear about linear regression?
Linear in the parameters a, b, ...
• $y = ax + b$.
• $y = ax^2 + bx + c$.
It is not because the curve being fit has to be a straight line, although this is the simplest and most common case.
Notice: in the board question you had to solve a system of simultaneous linear equations.
Fitting a line is called simple linear regression.
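To see what "linear in the parameters" buys us, the sketch below fits the curved model $y = a\sin(x) + b$ with nothing more than a line fit: the predictor is simply replaced by u = sin(x). The data are made up and noise-free (generated with a = 2, b = 1), and fit_line is an illustrative helper name.

```python
import math

def fit_line(us, ys):
    """Least-squares a, b for y = a*u + b, using a = s_uy / s_uu and b = y_bar - a*u_bar."""
    n = len(us)
    u_bar = sum(us) / n
    y_bar = sum(ys) / n
    s_uy = sum((u - u_bar) * (y - y_bar) for u, y in zip(us, ys))
    s_uu = sum((u - u_bar) ** 2 for u in us)
    a = s_uy / s_uu  # the 1/(n-1) factors cancel in the ratio
    return a, y_bar - a * u_bar

xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
ys = [2.0 * math.sin(x) + 1.0 for x in xs]  # made-up, noise-free data

# The model is nonlinear in x but linear in (a, b): regress y on u = sin(x).
a, b = fit_line([math.sin(x) for x in xs], ys)
print(a, b)
```

Because the data lie exactly on the curve, the fit recovers a and b up to floating-point rounding.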

Homoscedastic
BIG ASSUMPTIONS: the $E_i$ are independent with the same variance $\sigma^2$.
[Figure: regression line, y vs. x (left), and residuals e vs. x (right).]
Homoscedasticity = uniform spread of errors around regression line.

Heteroscedastic
[Figure: heteroscedastic data, y vs. x.]
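A small simulation can make the contrast between the two cases concrete. The sketch below is built on made-up assumptions: a true line y = x + 5, x uniform on [0, 10], constant noise (sd 1) for the homoscedastic case and noise growing with x (sd 0.3x) for the heteroscedastic case. It compares the mean squared residual for x ≥ 5 with that for x < 5.

```python
import random

def fit_line(xs, ys):
    """Least-squares slope and intercept."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    a = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    return a, y_bar - a * x_bar

def spread_ratio(xs, ys):
    """Mean squared residual for x >= 5 divided by the same for x < 5."""
    a, b = fit_line(xs, ys)
    left = [(y - a * x - b) ** 2 for x, y in zip(xs, ys) if x < 5]
    right = [(y - a * x - b) ** 2 for x, y in zip(xs, ys) if x >= 5]
    return (sum(right) / len(right)) / (sum(left) / len(left))

rng = random.Random(0)  # fixed seed so the run is repeatable
xs = [rng.uniform(0, 10) for _ in range(2000)]
homo_ys = [x + 5 + rng.gauss(0, 1.0) for x in xs]        # same spread everywhere
hetero_ys = [x + 5 + rng.gauss(0, 0.3 * x) for x in xs]  # spread grows with x

ratio_homo = spread_ratio(xs, homo_ys)
ratio_hetero = spread_ratio(xs, hetero_ys)
print(ratio_homo, ratio_hetero)
```

A ratio near 1 is what uniform spread looks like; a ratio well above 1 signals that the errors fan out as x grows.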

Formulas for simple linear regression
Model: $y_i = a x_i + b + E_i$, where $E_i \sim N(0, \sigma^2)$.
Using calculus or algebra:
$\hat{a} = \frac{s_{xy}}{s_{xx}}$ and $\hat{b} = \bar{y} - \hat{a}\bar{x}$,
where
$\bar{x} = \frac{1}{n} \sum x_i$, $s_{xx} = \frac{1}{n-1} \sum (x_i - \bar{x})^2$,
$\bar{y} = \frac{1}{n} \sum y_i$, $s_{xy} = \frac{1}{n-1} \sum (x_i - \bar{x})(y_i - \bar{y})$.
WARNING: This is just for simple linear regression. For polynomials and other functions you need other formulas.
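The formulas translate almost line for line into code. The sketch below applies them to the points (1, 3), (2, 1), (4, 4) from the board questions, using exact fractions so no rounding is involved. The function name simple_linear_regression is an illustrative choice.

```python
from fractions import Fraction

def simple_linear_regression(xs, ys):
    """Return (a_hat, b_hat) from a_hat = s_xy / s_xx and b_hat = y_bar - a_hat * x_bar."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs) / (n - 1)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    a_hat = s_xy / s_xx
    b_hat = y_bar - a_hat * x_bar
    return a_hat, b_hat

xs = [Fraction(1), Fraction(2), Fraction(4)]
ys = [Fraction(3), Fraction(1), Fraction(4)]
a_hat, b_hat = simple_linear_regression(xs, ys)
print(a_hat, b_hat)  # prints: 1/2 3/2
```

So for these three points the fitted line is y = x/2 + 3/2.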

Board Question: using the formulas plus some theory
Bivariate data: (1, 3), (2, 1), (4, 4)
1.(a) Calculate the sample means for x and y.
1.(b) Use the formulas to find a best-fit line in the xy-plane.
$\hat{a} = \frac{s_{xy}}{s_{xx}}$, $\hat{b} = \bar{y} - \hat{a}\bar{x}$
$s_{xy} = \frac{1}{n-1} \sum (x_i - \bar{x})(y_i - \bar{y})$, $s_{xx} = \frac{1}{n-1} \sum (x_i - \bar{x})^2$.
2. Show the point $(\bar{x}, \bar{y})$ is always on the fitted line.
3. Under the assumption $E_i \sim N(0, \sigma^2)$ show that the least squares method is equivalent to finding the MLE for the parameters (a, b).
Hint: $f(y_i \mid x_i, a, b) \sim N(a x_i + b, \sigma^2)$.
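For part 3, one possible line of argument is sketched below. It assumes the $E_i$ are independent and treats $\sigma$ as fixed; the symbol L for the likelihood is introduced here.

```latex
% Likelihood of the data given the x_i and the parameters a, b
L(a, b) = \prod_{i=1}^n \frac{1}{\sigma\sqrt{2\pi}}
          \exp\left( -\frac{(y_i - a x_i - b)^2}{2\sigma^2} \right)

% Log likelihood
\ln L(a, b) = -n \ln\left(\sigma\sqrt{2\pi}\right)
              - \frac{1}{2\sigma^2} \sum_{i=1}^n (y_i - a x_i - b)^2

% The first term does not depend on a or b, and 1/(2\sigma^2) > 0, so
\arg\max_{a, b} \ln L(a, b) = \arg\min_{a, b} \sum_{i=1}^n (y_i - a x_i - b)^2
```

Maximizing the likelihood over (a, b) is therefore the same problem as minimizing the total squared error.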

Measuring the fit
$y = (y_1, \ldots, y_n)$ = data values of the response variable.
$\hat{y} = (\hat{y}_1, \ldots, \hat{y}_n)$ = ‘fitted values’ of the response variable.
TSS = $\sum (y_i - \bar{y})^2$ = total sum of squares = total variation.
RSS = $\sum (y_i - \hat{y}_i)^2$ = residual sum of squares.
RSS = squared error unexplained by the model (due to random fluctuation).
RSS/TSS = unexplained fraction of the total error.
$R^2 = 1 - \text{RSS}/\text{TSS}$ is a measure of goodness-of-fit.
$R^2$ is the fraction of the variance of y explained by the model.
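As a quick worked example, the sketch below computes TSS, RSS and $R^2$ for the points (1, 3), (2, 1), (4, 4) from the board questions. It takes the fitted line to be y = 0.5x + 1.5, which is what the simple linear regression formulas give for these points.

```python
xs = [1.0, 2.0, 4.0]
ys = [3.0, 1.0, 4.0]
fitted = [0.5 * x + 1.5 for x in xs]  # fitted values from the least-squares line

y_bar = sum(ys) / len(ys)
tss = sum((y - y_bar) ** 2 for y in ys)              # total variation
rss = sum((y - f) ** 2 for y, f in zip(ys, fitted))  # unexplained by the model
r_squared = 1 - rss / tss
print(tss, rss, r_squared)
```

Here RSS = 3.5 and TSS = 14/3, so $R^2$ = 0.25 up to floating-point rounding: the line explains only a quarter of the variation in y.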

Overfitting a polynomial
• Increasing the degree of the polynomial increases $R^2$ (it can never decrease it).
• Increasing the degree of the polynomial increases the complexity of the model.
• The optimal degree is a tradeoff between goodness of fit and complexity.
• If all data points lie on the fitted curve, then $y = \hat{y}$ and $R^2 = 1$.
R demonstration!
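The effect of the degree on $R^2$ can be sketched on the three board-question points (1, 3), (2, 1), (4, 4). This is not the R demonstration from the lecture; it is a stand-in that fits polynomials of degree 0, 1 and 2 by solving the normal equations with exact fractions. The helper names solve and r_squared are illustrative.

```python
from fractions import Fraction

def solve(A, rhs):
    """Gauss-Jordan elimination on exact Fractions (assumes A is invertible)."""
    n = len(rhs)
    M = [list(row) + [r] for row, r in zip(A, rhs)]  # augmented matrix
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [v - factor * w for v, w in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]

def r_squared(xs, ys, degree):
    """R^2 of the least-squares polynomial of the given degree."""
    X = [[x ** p for p in range(degree + 1)] for x in xs]  # design matrix
    k = degree + 1
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * y for row, y in zip(X, ys)) for i in range(k)]
    coef = solve(XtX, Xty)
    fitted = [sum(c * v for c, v in zip(coef, row)) for row in X]
    y_bar = sum(ys) / len(ys)
    rss = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    tss = sum((y - y_bar) ** 2 for y in ys)
    return 1 - rss / tss

xs = [Fraction(1), Fraction(2), Fraction(4)]
ys = [Fraction(3), Fraction(1), Fraction(4)]
results = [r_squared(xs, ys, degree) for degree in range(3)]
print(results)
```

The values come out as 0, 1/4 and 1: the parabola passes through all three points, so the perfect $R^2$ reflects the model's flexibility rather than any real structure in the data.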

Outliers and other troubles
Question: Can one point change the regression line significantly?
Use mathlet http://mathlets.org/mathlets/linear-regression/
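Without the mathlet at hand, the question can be explored with a few made-up numbers. The sketch below takes five points lying exactly on y = x, moves the last one up by 10, and compares the least-squares slopes. The data and the helper name slope are illustrative assumptions.

```python
def slope(xs, ys):
    """Least-squares slope s_xy / s_xx (the 1/(n-1) factors cancel)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    return s_xy / s_xx

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
clean_ys = [0.0, 1.0, 2.0, 3.0, 4.0]     # points exactly on y = x
outlier_ys = [0.0, 1.0, 2.0, 3.0, 14.0]  # last point moved up by 10

slope_clean = slope(xs, clean_ys)
slope_outlier = slope(xs, outlier_ys)
print(slope_clean, slope_outlier)
```

In this toy case a single point at the edge of the x range changes the slope from 1 to 3, so the answer to the question is yes.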

Regression to the mean
• Suppose a group of children is given an IQ test at age 4. One year later the same children are given another IQ test.
• Children’s IQ scores at age 4 and age 5 should be positively correlated.
• Those who did poorly on the first test (e.g., bottom 10%) will tend to show improvement (i.e. regress to the mean) on the second test.
• A completely useless intervention with the poor-performing children might be misinterpreted as causing an increase in their scores.
• Conversely, a reward for the top-performing children might be misinterpreted as causing a decrease in their scores.
This example is from Rice, Mathematical Statistics and Data Analysis.
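The effect is easy to reproduce in a simulation. The sketch below rests on made-up assumptions: each child has a fixed ability drawn from N(100, 10²), and each test score is that ability plus independent noise from N(0, 10²). Nothing is done to the children between the two tests.

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable
n = 10000
ability = [rng.gauss(100, 10) for _ in range(n)]
test1 = [a + rng.gauss(0, 10) for a in ability]  # score at age 4
test2 = [a + rng.gauss(0, 10) for a in ability]  # score at age 5, no intervention

# Indices of the children sorted by first-test score.
order = sorted(range(n), key=lambda i: test1[i])
bottom = order[: n // 10]    # bottom 10% on the first test
top = order[-(n // 10):]     # top 10% on the first test

def mean(values):
    return sum(values) / len(values)

bottom_change = mean([test2[i] for i in bottom]) - mean([test1[i] for i in bottom])
top_change = mean([test2[i] for i in top]) - mean([test1[i] for i in top])
print(bottom_change, top_change)
```

The bottom group improves on average and the top group declines on average, even though no intervention or reward was applied to either.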

A brief discussion of multiple linear regression
Multivariate data: $(x_{i,1}, x_{i,2}, \ldots, x_{i,m}, y_i)$ (n data points: i = 1, ..., n)
Model: $\hat{y}_i = a_1 x_{i,1} + a_2 x_{i,2} + \ldots + a_m x_{i,m}$
$x_{i,j}$ are the explanatory (or predictor) variables.
$y_i$ is the response variable.
The total squared error is
$\sum_{i=1}^n (y_i - \hat{y}_i)^2 = \sum_{i=1}^n (y_i - a_1 x_{i,1} - a_2 x_{i,2} - \ldots - a_m x_{i,m})^2$
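For the smallest interesting case, m = 2, minimizing the total squared error gives two linear equations in $a_1$ and $a_2$, which can be solved in closed form. The sketch below does this for made-up, noise-free data generated with $a_1 = 2$ and $a_2 = -1$; the function name fit_two_predictors is an illustrative choice, and larger m would call for a general linear solver.

```python
def fit_two_predictors(x1, x2, ys):
    """Least-squares a1, a2 for y_hat = a1*x1 + a2*x2 (no intercept term).

    Setting the partial derivatives of the total squared error to zero gives
        s11*a1 + s12*a2 = t1
        s12*a1 + s22*a2 = t2
    which is solved here by Cramer's rule.
    """
    s11 = sum(u * u for u in x1)
    s22 = sum(v * v for v in x2)
    s12 = sum(u * v for u, v in zip(x1, x2))
    t1 = sum(u * y for u, y in zip(x1, ys))
    t2 = sum(v * y for v, y in zip(x2, ys))
    det = s11 * s22 - s12 * s12  # nonzero unless x1 and x2 are proportional
    a1 = (s22 * t1 - s12 * t2) / det
    a2 = (s11 * t2 - s12 * t1) / det
    return a1, a2

x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.0, 0.0, 2.0, 1.0]
ys = [2.0 * u - 1.0 * v for u, v in zip(x1, x2)]  # made-up, noise-free data

a1, a2 = fit_two_predictors(x1, x2, ys)
print(a1, a2)
```

Because the data contain no noise, the fit recovers the coefficients used to generate them.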

MIT OpenCourseWare https://ocw.mit.edu 18.05 Introduction to Probability and Statistics Spring 2014 For information about citing these materials or our Terms of Use, visit: https://ocw.mit.edu/terms.
