Lecture 1: Introduction to Regression
An Example: Explaining State Homicide Rates
What kinds of variables might we use to
explain/predict state homicide rates?
Let’s consider just one predictor for now:
poverty
Ignore omitted variables, measurement error
How might this be related to homicide rates?
Poverty and Homicide
These data are located here:
http://www.public.asu.edu/~gasweete/crj604/data/hom_pov.dta
Download these data and create a
scatterplot in Stata.
Does there appear to be a relationship
between poverty and homicide? What is the correlation?
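A minimal sketch of the Stata steps, assuming the variables in hom_pov.dta are named homrate and poverty (the names used with twoway later in this lecture):
. use http://www.public.asu.edu/~gasweete/crj604/data/hom_pov.dta, clear
. scatter homrate poverty
. corr homrate poverty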
Scatterplots and correlations
Scatterplots with correlations of a) +1.00; b) –0.50; c) +0.85; and d) +0.15.
Poverty and Homicide
There appears to be some relationship
between poverty and homicide rates, but it’s not perfect.
But there is a lot of “noise” which we
will attribute to unobserved factors and random error.
Poverty and Homicide, cont.
There is some nonzero value of expected homicides in the absence of poverty ($\beta_0$).
We expect homicide rates to increase as poverty rates increase ($\beta_1 > 0$).
Thus: $Y = \beta_0 + \beta_1 X + u$. This is the Population Regression Function.
Poverty and Homicide, Sample Regression Function
$y_i = \hat{\beta}_0 + \hat{\beta}_1 x_i + \hat{u}_i$
$y_i$ is the dependent variable, homicide rate, which we are trying to explain.
$\hat{\beta}_0$ represents our estimate of what the homicide rate would be in the absence of poverty*
$\hat{\beta}_1$ is our estimate of the "effect" of a higher poverty rate on homicide
$\hat{u}_i$ is a "noise" term reflecting other things that influence homicide rates
*This is extrapolation outside the range of data. Not recommended.
Poverty and Homicide, cont.
Only yi and xi are directly observable in the
equation above. The task of a regression analysis is to provide estimates of the slope and intercept terms.
The relationship is assumed to be linear. An
increase in x is associated with an increase in y.
Same expected change in homicide going from 6 to 7% poverty as from 15 to 16%.
$y_i = \hat{\beta}_0 + \hat{\beta}_1 x_i + \hat{u}_i = -0.973 + 0.475\,x_i + \hat{u}_i$
. twoway (scatter homrate poverty) (lfit homrate poverty)
Ordinary Least Squares
$y_i = -0.973 + 0.475\,x_i + \hat{u}_i$
Substantively, what do these estimates mean?
-.973 is the expected homicide rate if poverty rates were zero. This is never the case, except perhaps in the case of a zombie apocalypse, so it's not a meaningful estimate.
.475 is the effect of a 1 unit increase in the poverty rate on the homicide rate. You need to know how you are measuring poverty. In this case, 1 unit increase is an increase of 1 percentage point.
So a 1 percentage point increase (not “percent increase”) in the poverty rate is associated with an increase of .475 homicides per 100,000 people in the state.
In AZ, with a population of roughly 6.5 million (65 hundred-thousands), this would be ~31 additional homicides: .475 × 65 ≈ 31.
Ordinary Least Squares
$y_i = -0.973 + 0.475\,x_i + \hat{u}_i$
How did we arrive at this estimate? Why did
we draw the line exactly where we did?
Minimize the sum of the “squared error”, aka Ordinary Least Squares (OLS) estimation
Why squared error? Why vertical error? (Not perpendicular).
$\min \sum_{i=1}^{n} \left(Y_i - \hat{Y}_i\right)^2$
Ordinary Least Squares Estimates
Solving for the minimum requires calculus (set
derivative with respect to β to 0 and solve)
The book shows how we can go from some
basic assumptions to estimates for β0 and β1 without using calculus.
I will go through two different ways to obtain
these estimates: Wooldridge’s and Khan’s (khanacademy.org)
$\min_{\hat{\beta}_0,\,\hat{\beta}_1} \sum_{i=1}^{n} \left(y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i)\right)^2$
Ordinary Least Squares: Estimating the intercept (Wooldridge’s method)
Assuming that the average value of the error term is zero, it is a trivial matter to calculate $\hat{\beta}_0$ once we know $\hat{\beta}_1$:
$E(u) = 0 \;\Rightarrow\; \bar{u} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\right) = 0 \;\Rightarrow\; \hat{\beta}_0 = \bar{y} - \hat{\beta}_1\bar{x}$
Ordinary Least Squares: Estimating the intercept (Wooldridge)
Incidentally, these last sets of equations also
imply that the regression line passes through the point that corresponds to the mean of x and the mean of y:
$\bar{y} = \hat{\beta}_0 + \hat{\beta}_1\bar{x}$, i.e., the line passes through $(\bar{x}, \bar{y})$.
Ordinary Least Squares: Estimating the slope (Wooldridge)
First, we use the fact that the expected value of the error term is zero to generate a new equation equal to zero.
We saw this before, but here I use the exact formula used in the book.
$E(u) = 0 \;\Rightarrow\; \frac{1}{n}\sum_{i=1}^{n}\hat{u}_i = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\right) = 0$
Ordinary Least Squares: Estimating the slope (Wooldridge)
Because the covariance between x and u is assumed to be zero, and the term in parentheses is equal to u, we can multiply this last equation by xi.
Next, we plug in our formula for the intercept and simplify
$\mathrm{Cov}(x,u) = E(xu) = \frac{1}{n}\sum_{i=1}^{n} x_i\left(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\right) = 0$
$\sum_{i=1}^{n} x_i\left(y_i - (\bar{y} - \hat{\beta}_1\bar{x}) - \hat{\beta}_1 x_i\right) = 0$
Ordinary Least Squares: Estimating the slope (Wooldridge)
Re-arranging . . .
$\sum_{i=1}^{n} x_i (y_i - \bar{y}) = \hat{\beta}_1 \sum_{i=1}^{n} x_i (x_i - \bar{x})$
$\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = \hat{\beta}_1 \sum_{i=1}^{n} (x_i - \bar{x})^2$
Ordinary Least Squares: Estimating the slope (Wooldridge)
Re-arranging . . .
Interestingly, the final result is the ratio of the covariance of x and y to the variance of x:
$\hat{\beta}_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2} = \frac{\mathrm{cov}(x,y)}{\mathrm{var}(x)}$
Ordinary Least Squares: Estimates (Khan’s method)
Khan starts with the actual points and shows how they are related to the squared error: the squared vertical distance between each point $(x_n, y_n)$ and the line $y = mx + b = \beta_1 x + \beta_0$.
Ordinary Least Squares: Estimates (Khan’s method)
The vertical distance between any point (xn,yn), and the regression line y= β1x+β0 is simply yn-(β1xn+β0)
It would be trivial to minimize the total error: we could set β1 (the slope) equal to zero and β0 equal to the mean of y, and the positive and negative errors would cancel, making the total error zero.
Another approach is to minimize the sum of the absolute differences, but this creates thornier math problems than squaring the differences and can result in situations where there is not a unique solution.
The total (signed) error is
$\text{Total Error} = (y_1 - (\beta_1 x_1 + \beta_0)) + (y_2 - (\beta_1 x_2 + \beta_0)) + \cdots + (y_n - (\beta_1 x_n + \beta_0))$
In short, what we want is the sum of the squared error (SE), which means we have to square every term in this equation.
Ordinary Least Squares: Estimates (Khan’s method)
We need to find the β1 and β0 that minimize the SE. Let’s expand this out.
To be clear, the subscripts for the β estimates just refer to our two regression line estimates, whereas the subscripts for our x's and y's refer to the first observation, second observation, and so on.
$SE = (y_1 - (\beta_1 x_1 + \beta_0))^2 + (y_2 - (\beta_1 x_2 + \beta_0))^2 + \cdots + (y_n - (\beta_1 x_n + \beta_0))^2$
Expanding each squared term:
$SE = (y_1^2 - 2\beta_1 x_1 y_1 - 2\beta_0 y_1 + \beta_1^2 x_1^2 + 2\beta_1\beta_0 x_1 + \beta_0^2) + \cdots + (y_n^2 - 2\beta_1 x_n y_n - 2\beta_0 y_n + \beta_1^2 x_n^2 + 2\beta_1\beta_0 x_n + \beta_0^2)$
Ordinary Least Squares: Estimates (Khan’s method)
Summing these columns . . .
Everything but the regression line coefficients are known entities here.
This equation represents a 3D surface, where different values of β1 and β0 correspond to different values of the squared error. We just need to pick the values of β1 and β0 that minimize the SE.
$SE = n\,\mathrm{mean}(y^2) - 2\beta_1 n\,\mathrm{mean}(xy) - 2\beta_0 n\,\mathrm{mean}(y) + \beta_1^2 n\,\mathrm{mean}(x^2) + 2\beta_1\beta_0 n\,\mathrm{mean}(x) + n\beta_0^2$
Ordinary Least Squares: Estimates (Khan’s method)
Those familiar with calculus will know that the minimum of the squared error surface occurs where the partial derivative (slope) with respect to β1 is equal to zero and the partial derivative with respect to β0 is equal to zero.
We’ve seen that before. How about the other derivative?
$\frac{\partial SE}{\partial \beta_0} = -2n\,\mathrm{mean}(y) + 2\beta_1 n\,\mathrm{mean}(x) + 2n\beta_0 = 0 \;\Rightarrow\; \beta_0 = \bar{y} - \beta_1\bar{x}$
Ordinary Least Squares: Estimates (Khan’s method)
Replacing β0 . . .
$\frac{\partial SE}{\partial \beta_1} = -2n\,\mathrm{mean}(xy) + 2\beta_0 n\,\mathrm{mean}(x) + 2\beta_1 n\,\mathrm{mean}(x^2) = 0$
$\Rightarrow\; \beta_1 = \frac{\mathrm{mean}(xy) - \mathrm{mean}(x)\,\mathrm{mean}(y)}{\mathrm{mean}(x^2) - \mathrm{mean}(x)^2} = \frac{\mathrm{cov}(x,y)}{\mathrm{var}(x)}$
Ordinary Least Squares Estimates
Hopefully it is reassuring to know that we can obtain the same answers from two very different methods.
These formulas allow us, in a bivariate regression, to calculate the regression line “by hand” without using fancy statistical packages. All we need to do is find the mean of x, the mean of y, the mean of the products of x and y, and the mean of the squares of x, and then we can plug this into the formulas and crank out our solutions.
OLS by hand, example
Let’s look at a set of 5 points, and see how to calculate a regression line “by hand”.
Here are our five points: (4,2) (7,6) (0,1) (6,3) (2,4)
OLS by hand, example
We can generally guess that the slope will be positive, but we can find the slope exactly if we calculate four things: the mean of x, the mean of y, the mean of the products of x and y, and the mean of the squares of x
The x’s are 4,7,0,6, and 2. Their mean is 19/5=3.8
The y’s are 2,6,1,3, and 4. Their mean is 16/5=3.2
The products are 8,42,0,18 and 8. Their mean is 76/5=15.2.
The squared x’s are 16,49,0,36, and 4. Their mean is 105/5=21.
OLS by hand, example
Recall the formula for the slope:
$\beta_1 = \frac{\mathrm{mean}(xy) - \bar{x}\,\bar{y}}{\mathrm{mean}(x^2) - \bar{x}^2} = \frac{15.2 - 3.2 \times 3.8}{21 - 3.8 \times 3.8} = \frac{3.04}{6.56} = .463$
Once we have the slope, the intercept is trivial:
$\beta_0 = \bar{y} - \beta_1\bar{x} = 3.2 - .463 \times 3.8 = 1.44$
And our regression line that minimizes the sum of squared differences:
$y_i = 1.44 + .463\,x_i + \hat{u}_i$
OLS by hand, example
Checking our work . . .
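One way to check the arithmetic (a minimal sketch, assuming a fresh Stata session):
. input x y
4 2
7 6
0 1
6 3
2 4
end
. reg y x
The coefficient on x should come out near .463 and the constant near 1.44.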
Analysis of Variance
Once we have our regression line, we can define a "fitted value" as follows:
$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$
This is our estimated value for y given our slope and intercept estimates and the value of x. It's also sometimes called a "predicted value."
All of the "y-hats" fall on the regression line. For purposes of evaluating our regression, it makes sense to compare the y-hats to the actual values of y.
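In Stata, a minimal sketch of obtaining fitted values and residuals after running the regression (yhat and uhat are arbitrary new variable names):
. reg homrate poverty
. predict yhat, xb
. predict uhat, residuals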
Analysis of Variance
The total variation in Y is partitioned into two parts:
$y_i - \bar{y} = (\hat{y}_i - \bar{y}) + (y_i - \hat{y}_i)$
The first term is the variation explained by the model; the second is the residual (variation not explained by the model).
Of course, in order to assess variance, we square all of these terms: SST = SSE + SSR
$\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$
Where SST is the total sum of squares, SSE is the explained sum of squares, and SSR is the residual sum of squares.
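We can verify this decomposition in Stata from the stored regression results, where e(mss) is the explained (model) sum of squares and e(rss) is the residual sum of squares:
. quietly reg homrate poverty
. display e(mss), e(rss), e(mss) + e(rss)
With these data this returns approximately 100.18, 225.11, and 325.28, matching the ANOVA table in the output shown later.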
R² ("R-squared")
R2 represents the portion of the variance in y that
is “explained” by the model.
Typically, in social science applications, our
standards for R2 are pretty low. Individual-level regressions rarely exceed .3
$R^2 = \frac{SSE}{SST} = \frac{\sum (\hat{y}_i - \bar{y})^2}{\sum (y_i - \bar{y})^2}$
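Using the sums of squares from the regression output shown later in this lecture:
. display 100.175656/325.284999
returns approximately .3080, matching the reported R-squared.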
Ordinary Least Squares Estimates by hand
See Excel file: “bivariate regression by hand.xls”
http://www.public.asu.edu/~gasweete/crj604/misc/
state        hom    poverty   xi-x̄     yi-ȳ    (xi-x̄)(yi-ȳ)   (xi-x̄)²
Alabama      8.3    16.7       4.61    3.53    16.27          21.3
Alaska       5.4    10.0      -2.09    0.63    -1.32           4.37
Arizona      7.5    15.2       3.11    2.73     8.49           9.67
Arkansas     7.3    13.8       1.71    2.53     4.326          2.92
California   6.8    13.2       1.11    2.03     2.253          1.23
(first five of the 50 states shown)
Ordinary Least Squares Estimates by hand, cont.
We can also get β1 from the covariance matrix in Stata (". corr hom pov, c"), which shows that the covariance of homicide and poverty is 4.304 and the variance of poverty is 9.06.
β1 = 4.304/9.06 = .475
The mean of homicide rates is 4.77, and the mean of poverty rates is 12.09.
β0 = 4.77 - 12.09 × .475 = -.973
Or, in Stata: ". reg hom pov"
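A minimal check in Stata, assuming the full variable names homrate and poverty (Stata accepts the abbreviations hom and pov when they are unambiguous):
. corr homrate poverty, covariance
. display 4.304/9.06
. display 4.77 - 12.09*.475
The two display commands return approximately .475 and -.973.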
Stata output
. reg hom pov

      Source |       SS       df       MS              Number of obs =      50
-------------+------------------------------           F(  1,    48) =   21.36
       Model |  100.175656     1  100.175656           Prob > F      =  0.0000
    Residual |  225.109343    48  4.68977798           R-squared     =  0.3080
-------------+------------------------------           Adj R-squared =  0.2935
       Total |  325.284999    49  6.63846936           Root MSE      =  2.1656

------------------------------------------------------------------------------
     homrate |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     poverty |    .475025   .1027807     4.62   0.000     .2683706    .6816795
       _cons |  -.9730529   1.279803    -0.76   0.451     -3.54627    1.600164
------------------------------------------------------------------------------

β1 = 4.304/9.06 = .475
β0 = 4.77 - 12.09 × .475 = -.973
Assumptions of the Classical Linear Regression Model
1) X & Y are linearly related in the population.
2) We have a random sample of size n from the population.
3) The values of x1 through xn are not all the same.
4) The error has an expected value of zero for all values of x: E(u|x) = 0 (zero conditional mean)
5) The error term has a constant variance for all values of x: Var(u|x) = σ² (homoscedasticity)
1) Linearity
If X and Y are not linearly related, the
estimates will be incorrect. Look at your data!
For example, how do these data compare?
. summ

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          x1 |        11           9    3.316625          4         14
          x2 |        11           9    3.316625          4         14
          x3 |        11           9    3.316625          4         14
          x4 |        11           9    3.316625          8         19
          y1 |        11    7.500909    2.031568       4.26      10.84
          y2 |        11    7.500909    2.031657        3.1       9.26
          y3 |        11         7.5    2.030424       5.39      12.74
          y4 |        11    7.500909    2.030579       5.25       12.5
1) Linearity, cont.
How do these models compare? All four produce the same regression line: β0 = 3, β1 = .5. Let's look at each of them separately.
1) Linearity, cont., Regression 1
1) Linearity, cont., Regression 2
1) Linearity, cont., Regression 3
1) Linearity, cont., Regression 4
3) Sample variation
If there is no variation in the values of x, it is
not possible to estimate a regression line. The line of best fit would point straight up and pass through every point.
Minimal variation in x is sometimes
problematic as well, as it makes regression estimates very unstable.
This assumption is easy to check by looking
at summary statistics.
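For these data there is ample variation: the covariance matrix shown earlier gives the variance of poverty as 9.06, a standard deviation of about 3 percentage points. A quick check:
. summarize poverty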
4) Zero conditional mean E(ui|x) = 0
In practical terms, this means that the sum of
the unobserved variables is not related to x.
Also, it means that variation in our estimates of the intercept and slope is all due to variation in the error terms.
Should this assumption hold true, our
estimates of the slope and intercept are unbiased, meaning that on average we’re going to get the right answer.
5) Var(u|x) = σ² (homoscedasticity)
In practical terms, this means that the
variance of the error term is unrelated to the independent variables.
Root Mean Squared Error (RMSE)
Root mean squared error gives us an indication of how well the regression line fits the data.
This is the square root of the residual sum of
squares divided by the sample size minus the number of parameters being estimated (k=2 in simple bivariate regression).
$RMSE = \sqrt{\frac{SSR}{n-k}}$
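We can verify this against the regression output above (residual SS = 225.109343, n − k = 48):
. display sqrt(225.109343/48)
returns approximately 2.1656, matching the reported Root MSE.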
Root Mean Squared Error, cont.
Provided the error term is distributed normally,
the RMSE tells us:
68.3% of the observations fall within the band
that is ±1*RMSE of the regression line
95.4% of the observations fall within the band
that is ±2*RMSE of the regression line
99.7% of the observations fall within the band
that is ±3*RMSE of the regression line
RMSE is also an element in calculating the
standard errors of β0 and β1
Regression estimates, standard errors
$SE(\hat{\beta}_1) = \frac{RMSE}{\sqrt{\sum_i (x_i - \bar{x})^2}}$
$SE(\hat{\beta}_0) = RMSE\,\sqrt{\frac{1}{n} + \frac{\bar{x}^2}{\sum_i (x_i - \bar{x})^2}}$
Regression estimates, standard errors, cont.
While these two standard error formulas may not appear very intuitive, we can glean some important information from them:
1. As uncertainty about the regression line increases (RMSE increases), the standard errors of both β0 and β1 increase.
2. As the variability of x increases, the standard errors of both β0 and β1 decrease.
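As a check on the slope formula, using the output above: RMSE = 2.1656 and $\sum (x_i - \bar{x})^2 = \mathrm{var}(x)\times(n-1) = 9.06 \times 49$:
. display 2.1656/sqrt(9.06*49)
returns approximately .1028, matching the reported standard error of .1027807 for poverty.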
Formal test of model fit, F-test
$F_{k-1,\,n-k} = \frac{SSE/(k-1)}{SSR/(n-k)}$
Where k = the number of parameters in the model, and n is the sample size.
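Checking against the regression output above (SSE = 100.175656, SSR = 225.109343, k = 2, n = 50):
. display (100.175656/1)/(225.109343/48)
returns approximately 21.36, matching the F(1, 48) statistic Stata reports.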