

SLIDE 1

ST 430/514 Introduction to Regression Analysis/Statistics for Management and the Social Sciences II

Simple Linear Regression

Recall: A regression model describes how a dependent variable (or response) Y is affected, on average, by one or more independent variables (or factors, or covariates) x1, x2, . . . , xk.

1 / 20 Simple Linear Regression Introduction

SLIDE 2

The Straight-Line Probabilistic Model

Simplest case of a regression model: one independent variable (k = 1, x1 ≡ x); linear dependence. Model equation:

E(Y) = β0 + β1x,

or equivalently

Y = β0 + β1x + ε.


SLIDE 3

Interpreting the parameters: β0 is the intercept (so called because it is where the graph of y = β0 + β1x meets the y-axis, x = 0); β1 is the slope, that is, the change in E(Y) as x is changed to x + 1. Note: if β1 = 0, x has no effect on Y; that will often be an interesting hypothesis to test.


SLIDE 4

Advertising and Sales example: x = monthly advertising expenditure, in hundreds of dollars; y = monthly sales revenue, in thousands of dollars; β0 = expected revenue with no advertising; β1 = expected revenue increase per $100 increase in advertising, in thousands of dollars.

Sample data for five months:

Advertising (x): 1  2  3  4  5
Revenue (y):     1  1  2  2  4


SLIDE 5

What do these data tell about β0 and β1?

[Figure: scatterplot of revenue y against advertising x for the five sample months.]


SLIDE 6

We could try various values of β0 and β1. For given values of β0 and β1, we get predictions pi = β0 + β1xi, i = 1, 2, 3, 4, 5. The difference between the observed value yi and the prediction pi is the residual ri = yi − pi, i = 1, 2, 3, 4, 5. A good choice of β0 and β1 gives accurate predictions, and generally small residuals.


SLIDE 7

One candidate line (β0 = −0.1, β1 = 0.7):

[Figure: the same scatterplot with the candidate line y = −0.1 + 0.7x drawn through it.]
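As a quick sketch (mine, not the slides'; the variable names are made up), the predictions and residuals for this candidate line can be computed from the slide-4 data:

```python
# Sample data from slide 4: x = advertising (hundreds of $), y = revenue (thousands of $)
x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]

# Candidate line from slide 7
b0, b1 = -0.1, 0.7

# Predictions p_i and residuals r_i = y_i - p_i
p = [b0 + b1 * xi for xi in x]
r = [yi - pi for yi, pi in zip(y, p)]

# Sum of squared residuals S(b0, b1); here it comes out to about 1.10
sse = sum(ri ** 2 for ri in r)
print([round(pi, 2) for pi in p])  # predictions 0.6, 1.3, 2.0, 2.7, 3.4
print(round(sse, 2))
```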


SLIDE 8

Fitting the Model

How to measure the overall size of the residuals? Most common measure (but not the only possibility): the sum of squared residuals

Σ ri² = Σ (yi − pi)² = Σ {yi − (β0 + β1xi)}² = S(β0, β1).

The least squares line is the one with the smallest sum of squares. Note: the least squares line has the property that Σ ri = 0; Definition 3.1 (page 95) does not need to impose that as a constraint.


SLIDE 9

The least squares estimates of β0 and β1 are the coefficients of the least squares line. Some algebra shows that the least squares estimates are

β̂1 = Σ (xi − x̄)(yi − ȳ) / Σ (xi − x̄)² = (Σ xiyi − n x̄ ȳ) / (Σ xi² − n x̄²)

and

β̂0 = ȳ − β̂1 x̄.

With a little luck, you will never need to use these formulæ.
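To sanity-check these formulæ on the slide-4 data (a sketch with my own variable names), the estimates can be computed directly:

```python
x = [1, 2, 3, 4, 5]   # advertising, from slide 4
y = [1, 1, 2, 2, 4]   # revenue
n = len(x)

xbar = sum(x) / n     # 3.0
ybar = sum(y) / n     # 2.0

# beta1_hat = sum (xi - xbar)(yi - ybar) / sum (xi - xbar)^2
num = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))   # 7.0
den = sum((xi - xbar) ** 2 for xi in x)                        # 10.0
beta1_hat = num / den                  # 0.7
beta0_hat = ybar - beta1_hat * xbar    # about -0.1

print(beta0_hat, beta1_hat)
```

For these data the least squares line is y = −0.1 + 0.7x, which is exactly the candidate line of slide 7.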


SLIDE 10

Other criteria

Why square the residuals? We could use least absolute deviations estimates, minimizing

S1(β0, β1) = Σ |yi − (β0 + β1xi)|.

Convenience: we have equations for the least squares estimates, but to find the least absolute deviations estimates we have to solve a linear programming problem. Optimality: least squares estimates are BLUE if the errors ε are uncorrelated with constant variance, and MVUE if additionally ε is normal.
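To illustrate the computational contrast (my sketch; a real solver would set this up as a linear program), S1 can be minimized by brute force over a coarse grid:

```python
x = [1, 2, 3, 4, 5]   # slide-4 data
y = [1, 1, 2, 2, 4]

def s1(b0, b1):
    """Sum of absolute deviations S1(b0, b1)."""
    return sum(abs(yi - (b0 + b1 * xi)) for xi, yi in zip(x, y))

# Brute-force grid search with step 0.01; crude, but it shows there is
# no closed-form expression to plug into.
best = min(
    ((i / 100, j / 100) for i in range(-200, 201) for j in range(-200, 201)),
    key=lambda b: s1(*b),
)
print(best, round(s1(*best), 4))  # minimum S1 here is 2.0; the minimizer is not unique
```

Unlike least squares, the least absolute deviations fit need not be unique: for these data, both y = −1 + x and y = 0.5 + 0.5x achieve S1 = 2.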


SLIDE 11

Model Assumptions

The least squares line gives point estimates of β0 and β1, and these estimates are always unbiased. To use the other forms of statistical inference (interval estimates such as confidence intervals, and hypothesis tests), we need some assumptions about the random errors ε.
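The unbiasedness claim can be checked by simulation (a sketch under assumed true values β0 = −0.1, β1 = 0.7 and normal errors; all names here are mine):

```python
import random

random.seed(1)
x = [1, 2, 3, 4, 5]
b0_true, b1_true = -0.1, 0.7    # assumed "true" parameters for the simulation
xbar = sum(x) / len(x)
den = sum((xi - xbar) ** 2 for xi in x)

def least_squares(y):
    """Least squares estimates via the slide-9 formulas."""
    ybar = sum(y) / len(y)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / den
    return ybar - b1 * xbar, b1

reps = 20000
sum_b0 = sum_b1 = 0.0
for _ in range(reps):
    # Generate Y = b0 + b1 x + eps with eps ~ N(0, 0.5^2)
    y = [b0_true + b1_true * xi + random.gauss(0, 0.5) for xi in x]
    b0_hat, b1_hat = least_squares(y)
    sum_b0 += b0_hat
    sum_b1 += b1_hat

# Averages over many replications land close to the true values
print(sum_b0 / reps, sum_b1 / reps)
```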


SLIDE 12

1. Zero mean: E(εi) = 0; as noted earlier, this is not really an assumption, but a consequence of the definition ε = Y − E(Y).

2. Constant variance: V(εi) = σ²; this is a nontrivial assumption, often violated in practice.

3. Normality: εi ∼ N(0, σ²); this is also a nontrivial assumption, always violated in practice, but sometimes a useful approximation.

4. Independence: εi and εj are statistically independent; another nontrivial assumption, often true in practice, but typically violated with time series and spatial data.


SLIDE 13

Notes: Assumptions 2 and 4 are the conditions under which least squares estimates are BLUE (Best Linear Unbiased Estimators); Assumptions 2, 3, and 4 are the conditions under which least squares estimates are MVUE (Minimum Variance Unbiased Estimators).


SLIDE 14

Estimating σ²

Recall that σ² is the variance of εi, which we have assumed to be the same for all i. That is, σ² = V(εi) = V[Yi − E(Yi)] = V[Yi − (β0 + β1xi)], i = 1, 2, . . . , n. We observe Yi = yi and xi; if we knew β0 and β1, we would estimate σ² by

(1/n) Σ {yi − (β0 + β1xi)}² = (1/n) S(β0, β1).


SLIDE 15

We do not know β0 and β1, but we have least squares estimates β̂0 and β̂1. So we could use S(β̂0, β̂1) as an approximation to S(β0, β1). But we know that S(β̂0, β̂1) < S(β0, β1), so (1/n) S(β̂0, β̂1) would be a biased estimate of σ².


SLIDE 16

We can show that, under Assumptions 2 and 4,

E[S(β̂0, β̂1)] = (n − 2)σ².

So

s² = S(β̂0, β̂1) / (n − 2) = (1/(n − 2)) Σ (yi − ŷi)²,

where ŷi = β̂0 + β̂1xi, is an unbiased estimate of σ². This is sometimes written

s² = Mean Square for Error = MSE = Sum of Squares for Error / degrees of freedom for Error = SSE / dfE.
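On the slide-4 data (a sketch; the fitted coefficients come from the slide-9 formulas), s² works out as follows:

```python
x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]
n = len(x)

# Least squares fit for these data (the slide-9 formulas give -0.1 and 0.7)
b0_hat, b1_hat = -0.1, 0.7

# Fitted values and the error sum of squares SSE
y_hat = [b0_hat + b1_hat * xi for xi in x]
sse = sum((yi - yhi) ** 2 for yi, yhi in zip(y, y_hat))   # about 1.10

# s^2 = MSE = SSE / dfE with dfE = n - 2 = 3
s2 = sse / (n - 2)
print(round(s2, 4))   # about 0.3667
```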


SLIDE 17

Inferences about the line

We are often interested in the question of whether x has any effect on E(Y). Since E(Y) = β0 + β1x, the independent variable x has some effect whenever β1 ≠ 0. So we need to test the null hypothesis H0 : β1 = 0.


SLIDE 18

We also need to construct a confidence interval for β1, to indicate how precisely we know its value. For both purposes, we need the standard error

σ_β̂1 = σ / √SSxx, where SSxx = Σ (xi − x̄)².

As always, since σ is unknown, we replace it by its estimate s, to get the estimated standard error

σ̂_β̂1 = s / √SSxx.


SLIDE 19

A confidence interval for β1 is

β̂1 ± tα/2,n−2 × σ̂_β̂1.

Note that we use the t-distribution with n − 2 degrees of freedom, because that is the degrees of freedom associated with s². To test H0 : β1 = 0, we use the test statistic

t = β̂1 / σ̂_β̂1,

and reject H0 at the significance level α if |t| > tα/2,n−2.
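Putting slides 18 and 19 together on the slide-4 data (a sketch; the critical value 3.182 is the standard t-table entry for 3 degrees of freedom, two-sided α = 0.05):

```python
import math

x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]
n = len(x)
xbar = sum(x) / n

# Least squares fit and s from the earlier slides
b0_hat, b1_hat = -0.1, 0.7
y_hat = [b0_hat + b1_hat * xi for xi in x]
s = math.sqrt(sum((yi - yhi) ** 2 for yi, yhi in zip(y, y_hat)) / (n - 2))

ss_xx = sum((xi - xbar) ** 2 for xi in x)   # SSxx = 10
se_b1 = s / math.sqrt(ss_xx)                # estimated standard error of b1_hat

t_crit = 3.182                   # t_{0.025, n-2} with n - 2 = 3 df, from a t table
t_stat = b1_hat / se_b1          # about 3.66

ci = (b1_hat - t_crit * se_b1, b1_hat + t_crit * se_b1)
print(round(t_stat, 2), [round(c, 3) for c in ci])
```

Since |t| ≈ 3.66 > 3.182, H0 : β1 = 0 is rejected at α = 0.05; equivalently, the 95% confidence interval (roughly 0.09 to 1.31) excludes 0.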


SLIDE 20

Compare the Confidence Interval and the Hypothesis Test

Note that we reject H0 at level α if and only if the corresponding 100(1 − α)% confidence interval does not include 0.
