Welcome Back! EDUC 7610 Chapter 2 The Simple Regression Model - PowerPoint PPT Presentation


  1. Welcome Back!

  2. EDUC 7610 Chapter 2 The Simple Regression Model, Fall 2018, Tyson S. Barrett, PhD
     Yᵢ = β₀ + β₁X₁ᵢ + εᵢ

  3. Let’s start with Scatterplots
     • Each point represents a single observation
     • The red line is the line of best fit
     • The line happens to go through each Conditional Mean: it goes through the mean of y at each value of x
     • E.g., when x = 1, the mean of y = 2.5 (the conditional mean of y at x = 1 is 2.5)
     [Scatterplot: y plotted against x with the red line of best fit]

  4. Conditional Means and Prediction
     • The open circles are where the Conditional Means are
     • In this case, all conditional means run along the line
     • When this happens (or approximately happens) we have linearity
     • The line is the linear model’s predicted level of y for each level of x
     [Scatterplot: conditional means shown as open circles falling on the fitted line]

  5. Why is that line the “best”?
     • That line is the line that minimizes the error between the predicted values and the observed values, i.e., the “residual” or “error”
     • SS_residual = Σᵢ₌₁ⁿ (Yᵢ − Ŷᵢ)²
     • This approach is called Ordinary Least Squares (OLS) regression
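
The OLS idea above can be sketched in a few lines of Python (a minimal illustration with made-up data, not the data from the slides): score any candidate line by its sum of squared residuals, and the OLS line scores lowest.

```python
# Sketch of the OLS criterion: among candidate lines, the "best" line is the
# one that minimizes SS_residual = sum((y_i - yhat_i)^2).
# The data below are made up for illustration.

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

def ss_residual(b0, b1):
    """Sum of squared residuals for the candidate line yhat = b0 + b1*x."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# The OLS solution for these data (b0 = 2.2, b1 = 0.6) vs. two nearby lines:
ss_ols = ss_residual(2.2, 0.6)
ss_alt1 = ss_residual(2.0, 0.7)
ss_alt2 = ss_residual(2.5, 0.5)
# ss_ols is the smallest of the three
```

Perturbing the intercept or slope in any direction can only increase the sum of squared residuals, which is what "least squares" means.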

  6. Features of the “Best” Line (Simple Regression)
     • Slope = β₁, Intercept = β₀
     • β₀ = Ȳ − β₁X̄
     • β₁ = Σᵢ₌₁ⁿ (Xᵢ − X̄)(Yᵢ − Ȳ) / Σᵢ₌₁ⁿ (Xᵢ − X̄)² = Cov(X, Y) / Var(X)
     • The Line: Ŷᵢ = β₀ + β₁Xᵢ
     • Cov(X, Y) = Σᵢ₌₁ⁿ (Xᵢ − X̄)(Yᵢ − Ȳ) / n,  Var(X) = Σᵢ₌₁ⁿ (Xᵢ − X̄)² / n
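
The slope and intercept formulas translate directly into code (a minimal sketch with made-up data):

```python
# Slope and intercept of the OLS line, computed from the formulas:
#   b1 = sum((x_i - xbar)(y_i - ybar)) / sum((x_i - xbar)^2) = Cov(X,Y)/Var(X)
#   b0 = ybar - b1 * xbar
# Illustrative data, not from the slides.

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n

sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))  # n * Cov(X, Y)
sxx = sum((xi - xbar) ** 2 for xi in x)                       # n * Var(X)

b1 = sxy / sxx          # slope: the n's cancel, so the ratio equals Cov/Var
b0 = ybar - b1 * xbar   # intercept
```

Note that whichever divisor convention is used for Cov and Var (n or n − 1), it cancels in the ratio, so the slope is the same either way.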

  7. The “Best” Line and Correlation
     β₁ = r_xy (s_y / s_x)
     • We unstandardized the r_xy by s_y / s_x
     • r_xy has no scale but β₁ is in the units of the outcome
     • r_xy is affected by the range of the variables measured
     • β₁ is only affected by variables that influence both X and Y, while r_xy is affected by variables that only influence Y
     • β₁ is the effect of X on Y, while r_xy is the relative importance of X on Y

  8. We unstandardized the r_xy by s_y / s_x
     • That is, r_xy is the standardized version of β₁
     • If we standardize our variables before using regression, both r_xy and β₁ are the same
     • Why? zᵢ = (xᵢ − x̄) / s_x
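
This claim is easy to verify numerically (a minimal sketch with made-up data): z-score both variables, refit the line, and the slope comes out equal to the correlation.

```python
# After z-scoring x and y, the OLS slope equals r_xy.
# Illustrative data, not from the slides.
from math import sqrt

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

def zscores(v):
    """Standardize a variable: subtract the mean, divide by the sample SD."""
    m = sum(v) / n
    s = sqrt(sum((vi - m) ** 2 for vi in v) / (n - 1))
    return [(vi - m) / s for vi in v]

def slope(xs, ys):
    """OLS slope of ys on xs."""
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    den = sum((a - mx) ** 2 for a in xs)
    return num / den

# Correlation of the raw variables
xbar, ybar = sum(x) / n, sum(y) / n
sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
sxx = sum((a - xbar) ** 2 for a in x)
syy = sum((b - ybar) ** 2 for b in y)
r_xy = sxy / sqrt(sxx * syy)

# Slope after standardizing both variables
b1_std = slope(zscores(x), zscores(y))
# b1_std equals r_xy
```

Standardizing removes the units (s_y / s_x becomes 1), so the "unstandardizing" factor drops out and only the correlation remains.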

  9. r_xy has no scale but β₁ is in the units of the outcome
     • r_xy has a range of −1 to 1
     • β₁ is in the range of the outcome (approximately), often from −∞ to ∞
     • “For a one unit increase in X there is an associated increase of β₁ units in the outcome”

  10. r_xy is affected by the range of the variables measured
     • The value of β₁ is not affected by the range of X (the significance is…)
     • r_xy is affected by having a less-than-representative range of X
     • Why?

  11. r_xy is affected by the range of the variables measured
     [Two scatterplots: the same relationship over the full range of x and over a restricted range of x]
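
The range-restriction effect in the two plots can be reproduced numerically (a minimal sketch with made-up, fixed "noise" so the result is deterministic): restricting x to a narrow window leaves β₁ alone but shrinks r_xy.

```python
# Range restriction: y = x + noise. Keeping only a narrow band of x
# leaves the OLS slope unchanged but shrinks the correlation.
# Illustrative data, not from the slides.
from math import sqrt

def slope_and_r(x, y):
    """Return (OLS slope, correlation) for paired data."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    syy = sum((b - ybar) ** 2 for b in y)
    return sxy / sxx, sxy / sqrt(sxx * syy)

x_full = [1, 2, 3, 4, 5, 6, 7, 8]
noise = [-1, 1, 1, -1, -1, 1, 1, -1]            # fixed "noise" with mean 0
y_full = [a + e for a, e in zip(x_full, noise)]

# Keep only observations with 3 <= x <= 6 (a less-than-representative range)
pairs = [(a, b) for a, b in zip(x_full, y_full) if 3 <= a <= 6]
x_sub = [a for a, _ in pairs]
y_sub = [b for _, b in pairs]

b1_full, r_full = slope_and_r(x_full, y_full)
b1_sub, r_sub = slope_and_r(x_sub, y_sub)
# b1 is 1.0 in both cases, while r drops in the restricted sample
```

The slope is a ratio of covariance to variance, and restricting the range shrinks both proportionally; the correlation also depends on the spread of Y relative to the noise, so it falls.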

  12. β₁ is only affected by variables that influence both X and Y, while r_xy is affected by variables that only influence Y; β₁ is the effect of X on Y, while r_xy is the relative importance of X on Y
     • r_xy is a measure of relative importance compared to other variables
     • If other variables are important, r_xy will be relatively smaller
     • β₁ is a measure of the effect of X on Y and therefore shouldn’t change much based on the range of X
     • The standard error is affected though (we’ll discuss later)

  13. Back to Residuals
     • The estimate of β₁ depends on minimizing the residuals, so they are kind of a big deal
     • SS_residual = Σᵢ₌₁ⁿ (Yᵢ − Ŷᵢ)²
     [Scatterplot: residuals shown as vertical distances from each point to the fitted line]

  14. Back to Residuals
     • Our Yᵢ values can be separated into three parts:
       Yᵢ = Ȳ + (Ŷᵢ − Ȳ) + (Yᵢ − Ŷᵢ)
     • Ȳ: the same for everyone (a constant)
     • (Ŷᵢ − Ȳ): explained component
     • (Yᵢ − Ŷᵢ): unexplained component (the residuals)
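
The three-part decomposition can be checked directly (a minimal sketch with made-up data): adding the constant, explained, and unexplained pieces back together reproduces each Yᵢ exactly.

```python
# Decompose each Y_i into ybar + (yhat_i - ybar) + (Y_i - yhat_i)
# and verify the pieces add back up. Illustrative data, not from the slides.

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

xbar, ybar = sum(x) / n, sum(y) / n
b1 = (sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
      / sum((a - xbar) ** 2 for a in x))
b0 = ybar - b1 * xbar
yhat = [b0 + b1 * a for a in x]

constant = [ybar] * n                                 # the same for everyone
explained = [yh - ybar for yh in yhat]                # explained component
residual = [yi - yh for yi, yh in zip(y, yhat)]       # unexplained component

rebuilt = [c + e + u for c, e, u in zip(constant, explained, residual)]
# rebuilt reproduces y
```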

  16. Properties of the Residuals
     1. The mean is exactly zero.
     2. The correlation with X is exactly zero.
     3. The variance is: Var(Y.X) = Var(Y)(1 − r²_xy)
     • Var(Y.X) / Var(Y) = 1 − r²_xy: the proportion of variance in Y not explained by X

  17. Properties of the Residuals
     1. The mean is exactly zero.
     2. The correlation with X is exactly zero.
     3. The variance is: Var(Y.X) = Var(Y)(1 − r²_xy)
     • r²_xy is the proportion of variance in Y explained by X
     • Var(Y.X) / Var(Y) = 1 − r²_xy: the proportion of variance in Y not explained by X
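
All three properties can be verified on any OLS fit (a minimal sketch with made-up data):

```python
# Check the three residual properties of an OLS fit:
#   1. mean of residuals is zero
#   2. correlation (covariance) of residuals with X is zero
#   3. Var(residuals) = Var(Y) * (1 - r^2)
# Illustrative data, not from the slides.

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

xbar, ybar = sum(x) / n, sum(y) / n
sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
sxx = sum((a - xbar) ** 2 for a in x)
syy = sum((b - ybar) ** 2 for b in y)

b1 = sxy / sxx
b0 = ybar - b1 * xbar
resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

mean_resid = sum(resid) / n                                   # property 1: 0
cov_resid_x = sum((a - xbar) * e for a, e in zip(x, resid))   # property 2: 0

r2 = sxy ** 2 / (sxx * syy)
var_resid = sum(e ** 2 for e in resid) / n    # dividing by n (convention must match)
var_y = syy / n
# var_resid equals var_y * (1 - r2)
```

Properties 1 and 2 hold by construction: they are exactly the conditions the OLS minimization imposes on the residuals.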

  18. Residuals tell us stuff
     1. Partial relationships, because the residual is what is remaining in Y after adjusting for X
     2. Residual analysis to detect anomalies
     3. Detect non-linearities
     4. Assess the homoskedasticity assumption
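
Point 3 can be illustrated concretely (a minimal sketch with made-up, deliberately curved data): fit a straight line to a quadratic relationship and the residuals show a systematic U-shaped pattern rather than random scatter.

```python
# Detecting non-linearity from residuals: fit a line to curved data and
# inspect the residual pattern. Illustrative data, not from the slides.

x = [1, 2, 3, 4, 5]
y = [xi ** 2 for xi in x]       # a clearly non-linear relationship
n = len(x)

xbar, ybar = sum(x) / n, sum(y) / n
b1 = (sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
      / sum((a - xbar) ** 2 for a in x))
b0 = ybar - b1 * xbar
resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# The residuals form a U shape (positive at the ends, negative in the middle),
# flagging curvature that the straight line missed, even though they still
# average to zero.
```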
