Poli 5D Social Science Data Analytics: Regression in Stata


  1. Poli 5D Social Science Data Analytics: Regression in Stata
     Shane Xinyang Xuan, ShaneXuan.com
     February 10, 2017

  2. Contact Information
     Shane Xinyang Xuan, xxuan@ucsd.edu
     The teaching staff is a team!
     – Professor Roberts: M 1600-1800 (SSB 299)
     – Jason Bigenho: Th 1000-1200 (Econ 116)
     – Shane Xuan: M 1100-1150 (SSB 332), Th 1200-1250 (SSB 332)
     Supplemental Materials
     – UCLA STATA starter kit: http://www.ats.ucla.edu/stat/stata/sk/
     – Princeton data analysis: http://dss.princeton.edu/training/

  3. Road map
     Some quick notes before we start today's section:
     – Make sure that you pass around the attendance sheet
     – Open a .do file
     – Import your data ("h1 fams data.xlsx")
     – I will be using my slides, and you will need to type the code in your .do file

  4. Regression: Examples!
     [Figure: Data points]
     [Figure: Bad fit]
     [Figure: Good fit]

  9. Model
     – Population: $y_i = \beta_0 + \beta_1 x_i$
     – Estimation: $y_i = \hat{\beta}_0 + \hat{\beta}_1 x_i + \hat{e}_i$
     – (You don't need to memorize this.) The regression coefficient is calculated by
       $$\hat{\beta}_1 = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}$$
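To make the slope formula concrete, here is a small worked example on three made-up data points (the numbers are illustrative and do not come from the course data). Here $\bar{x}$ and $\bar{y}$ denote the sample means, and the intercept uses the standard least-squares result $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$, which the slide does not state.

```latex
% Hypothetical data: (x, y) = (1, 2), (2, 3), (3, 7)
\begin{aligned}
\bar{x} &= \tfrac{1 + 2 + 3}{3} = 2, \qquad \bar{y} = \tfrac{2 + 3 + 7}{3} = 4 \\
\sum_i (x_i - \bar{x})(y_i - \bar{y}) &= (-1)(-2) + (0)(-1) + (1)(3) = 5 \\
\sum_i (x_i - \bar{x})^2 &= (-1)^2 + 0^2 + 1^2 = 2 \\
\hat{\beta}_1 &= \tfrac{5}{2} = 2.5 \\
\hat{\beta}_0 &= \bar{y} - \hat{\beta}_1 \bar{x} = 4 - 2.5 \cdot 2 = -1
\end{aligned}
```

The fitted line is therefore $\hat{y}_i = -1 + 2.5\,x_i$, which gives fitted values 1.5, 4, 6.5 and residuals 0.5, -1, 0.5 that sum to zero.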

  12. Interpretation of regression coefficient
      Suppose we have the model $y = \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \hat{\beta}_0 + \hat{e}$.
      ▶ A 1-unit change in $x_1$ is associated with a $\hat{\beta}_1$-unit change in $y$, all else equal.
      ▶ A 1-unit change in $x_2$ is associated with a $\hat{\beta}_2$-unit change in $y$, all else equal.

  16. Application
      ▶ Suppose consumption (cons) is a function of family income (inc):
        $$\text{cons} = \beta_0 + \beta_1\,\text{inc} + u$$
        where $u$ contains other factors affecting consumption. What change do you expect to see in cons with a two-unit increase in inc?
      ▶ With a two-unit increase in inc, the new level of consumption is
        $$\begin{aligned}
        \text{cons}_{\text{new}} &= \beta_0 + \beta_1(\text{inc} + 2) + u \\
        &= \beta_0 + (\beta_1\,\text{inc} + 2\beta_1) + u \\
        &= (\beta_0 + \beta_1\,\text{inc} + u) + 2\beta_1
        \end{aligned}$$
        Thus, holding $u$ fixed, we see a $2\beta_1$ increase in cons with a 2-unit increase in inc!
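The same argument works for any change in income, not just two units. As a sketch, write $\Delta\text{inc}$ for the change in income and hold $u$ fixed; the value $\beta_1 = 0.8$ in the last line is a made-up number used only for illustration.

```latex
\begin{aligned}
\Delta\text{cons}
  &= \bigl[\beta_0 + \beta_1(\text{inc} + \Delta\text{inc}) + u\bigr]
   - \bigl[\beta_0 + \beta_1\,\text{inc} + u\bigr] \\
  &= \beta_1\,\Delta\text{inc} \\
% Hypothetical value: beta_1 = 0.8, two-unit increase in inc
\Delta\text{cons} &= 0.8 \times 2 = 1.6
\end{aligned}
```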

  20. Code
      ▶ Scatter plot:
          twoway (scatter povertyratio_mom age_mom, mlabsize(tiny) msize(tiny))
      ▶ Regression:
          regress povertyratio_mom age_mom
      ▶ Visualization:
          twoway (scatter povertyratio_mom age_mom, mlabsize(tiny) msize(tiny)) (lfit povertyratio_mom age_mom)
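Put together, the three commands above might sit in a .do file along these lines. This is a sketch under several assumptions: the workbook is in the current working directory under the name given on the road map slide, its first row holds the variable names, and the two variables are called `povertyratio_mom` and `age_mom` (Stata variable names cannot contain spaces, so the underscores are assumed to have been lost in the slide text).

```stata
* Sketch of a section .do file (file location and variable names are assumptions)
clear all

* Read the Excel workbook; firstrow treats row 1 as variable names
import excel using "h1 fams data.xlsx", firstrow clear

* Step 1: look at the raw relationship
twoway (scatter povertyratio_mom age_mom, msize(tiny))

* Step 2: regress the poverty ratio on mother's age
regress povertyratio_mom age_mom

* Step 3: overlay the fitted line on the scatter plot
* (/// continues a command onto the next line inside a .do file)
twoway (scatter povertyratio_mom age_mom, msize(tiny)) ///
       (lfit povertyratio_mom age_mom)
```

Running the scatter plot before the regression is a deliberate ordering: it lets you see whether a straight line is a reasonable summary before you fit one.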

  27. Residuals
      ▶ Fitted values
        – Manually: gen fitted = -1.091357 + .1305531 * age_mom
        – Stata command: predict fv
      ▶ Residuals
        – Manually: gen resid = povertyratio_mom - fv
        – Stata command: predict e, residual
      [Figure: Similar results for fitted values, and residuals]
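Typing the coefficients by hand invites copy errors, so one alternative is to read them from the stored results with `_b[]` right after the regression. The sketch below assumes the data are already loaded and that the variables are named `povertyratio_mom` and `age_mom`; the new variable names (`fitted_b`, `fv_check`, `resid_b`, `e_check`) are hypothetical and chosen so they do not clash with `fitted`, `fv`, `resid`, or `e`.

```stata
* Re-run the regression so the stored coefficients are current
regress povertyratio_mom age_mom

* Fitted values built from the stored intercept and slope
gen fitted_b = _b[_cons] + _b[age_mom] * age_mom

* Fitted values and residuals from predict
predict fv_check, xb
predict e_check, residuals

* Manual residuals: observed value minus fitted value
gen resid_b = povertyratio_mom - fitted_b

* The manual and predict versions should agree up to rounding error
summarize fitted_b fv_check resid_b e_check
assert abs(fitted_b - fv_check) < 1e-4 if !missing(fitted_b, fv_check)
```

If the `assert` line passes silently, the two sets of fitted values match within the tolerance; Stata stops with an error if any observation differs by more than that.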

  31. What else can you do using regressions?
      ▶ Suppose you run a regression of $y$ on $x_1$ and get residuals $\hat{e}$. You can then make a scatter plot of the residuals ($\hat{e}$) against a different variable ($x_2$) to see how much of the remaining variation can be explained by that variable:
        – twoway scatter e x2
      ▶ You can run a multiple regression:
        – regress y1 x1 x2 ...
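Both ideas can be tried on Stata's built-in `auto` example dataset. That dataset and its variables (`price`, `weight`, `mpg`) are stand-ins chosen for illustration, not part of the course data, and `e_hat` is a hypothetical variable name.

```stata
* Load the example dataset that ships with Stata
sysuse auto, clear

* Bivariate regression of y (price) on x1 (weight)
regress price weight

* Save the residuals from that regression
predict e_hat, residuals

* Plot the residuals against a second variable x2 (mpg)
twoway scatter e_hat mpg

* Multiple regression: y on x1 and x2 together
regress price weight mpg
```

A visible pattern in the residual plot suggests the second variable carries information the first regression missed. In the multiple regression, each coefficient is read as the change in `price` associated with a 1-unit change in that variable, all else equal.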
