Poli 5D Social Science Data Analytics
Regression in Stata
Shane Xinyang Xuan
ShaneXuan.com
February 10, 2017

Contact Information
Shane Xinyang Xuan, xxuan@ucsd.edu
The teaching staff is a team!
– Professor Roberts: M 1600-1800 (SSB 299)
– Jason Bigenho: Th 1000-1200 (Econ 116)
– Shane Xuan: M 1100-1150 (SSB 332), Th 1200-1250 (SSB 332)
Supplemental Materials
– UCLA STATA starter kit: http://www.ats.ucla.edu/stat/stata/sk/
– Princeton data analysis: http://dss.princeton.edu/training/

Road map
Some quick notes before we start today’s section:
– Make sure that you pass around the attendance sheet
– Open a .do file
– Import your data (“h1 fams data.xlsx”)
– I will be using my slides, and you will need to type the code in your .do file

Regression: Examples!
Figure: Data points
Figure: Bad fit
Figure: Good fit

Model
– Population: $y_i = \beta_0 + \beta_1 x_i$
– Estimation: $y_i = \hat{\beta}_0 + \hat{\beta}_1 x_i + \hat{e}_i$
– (You don’t need to memorize this) The regression coefficient is calculated by
$\hat{\beta}_1 = \dfrac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}$
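The slope formula can be checked by hand in Stata. The sketch below is only an illustration: it uses Stata's bundled auto dataset rather than the section data, with price standing in for y and weight for x, and the scalar and variable names (xbar, ybar, num, den, b1) are made up for this example.

```stata
* Illustration on Stata's bundled auto dataset (not the section data)
sysuse auto, clear

* Sample means of x (weight) and y (price)
quietly summarize weight
scalar xbar = r(mean)
quietly summarize price
scalar ybar = r(mean)

* Terms of the numerator and denominator of the slope formula
generate double num = (weight - xbar) * (price - ybar)
generate double den = (weight - xbar)^2

* Sum each column and take the ratio
quietly summarize num
scalar sum_num = r(sum)
quietly summarize den
scalar sum_den = r(sum)
scalar b1 = sum_num / sum_den
display "slope by hand: " b1

* Compare with the coefficient that regress reports
regress price weight
display "slope from regress: " _b[weight]
assert reldif(b1, _b[weight]) < 1e-8
```

If the assert line runs without an error, the hand-computed slope matches the one from regress up to rounding.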

Interpretation of regression coefficient
Suppose we have the model $y = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \hat{e}$
◮ A 1-unit change in $x_1$ is associated with a $\hat{\beta}_1$-unit change in $y$, all else equal.
◮ A 1-unit change in $x_2$ is associated with a $\hat{\beta}_2$-unit change in $y$, all else equal.

Application
◮ Suppose consumption (cons) is a function of family income (inc):
$\mathit{cons} = \beta_0 + \beta_1 \mathit{inc} + u$
where $u$ contains other factors affecting consumption. What change do you expect to see in cons with a two-unit increase in inc?
◮ With a two-unit increase in inc, holding $u$ fixed, the new level of consumption is
$\beta_0 + \beta_1 (\mathit{inc} + 2) + u = \beta_0 + (\beta_1 \mathit{inc} + 2\beta_1) + u = (\beta_0 + \beta_1 \mathit{inc} + u) + 2\beta_1$
The term in parentheses is the original level of cons. Thus, we see a $2\beta_1$ increase in cons with a 2-unit increase in inc!
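The same calculation can be done on estimated coefficients. This is a minimal sketch, assuming Stata's bundled auto dataset as a stand-in: price plays the role of cons and weight the role of inc, neither of which comes from the section data.

```stata
* Illustration on Stata's bundled auto dataset (not the section data)
sysuse auto, clear

* price stands in for cons, weight stands in for inc
regress price weight

* Predicted change in price for a 2-unit increase in weight: 2 * b1
display "2 * b1 = " 2 * _b[weight]

* lincom reports the same quantity with a standard error and confidence interval
lincom 2 * weight
assert reldif(r(estimate), 2 * _b[weight]) < 1e-8
```

The display line gives only the point estimate. lincom is convenient when you also want to know how precisely that change is estimated.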

Code
◮ Scatter plot:
    twoway (scatter povertyratio_mom age_mom, mlabsize(tiny) msize(tiny))
◮ Regression:
    regress povertyratio_mom age_mom
◮ Visualization (scatter plot with the fitted line on top):
    twoway (scatter povertyratio_mom age_mom, mlabsize(tiny) msize(tiny)) (lfit povertyratio_mom age_mom)

Residuals
◮ Fitted values
– Manually (coefficients copied from the regress output):
    gen fitted = -1.091357 + .1305531 * age_mom
– Stata command:
    predict fv
◮ Residuals
– Manually:
    gen resid = povertyratio_mom - fv
– Stata command:
    predict e, residuals
Figure: Similar results for fitted values, and residuals
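The manual and predict routes can be compared directly. The sketch below is an illustration on Stata's bundled auto dataset (price on weight), not the section data. It reads the coefficients from the stored results `_b[]` instead of typing rounded numbers, which is one reason the two routes agree here to within rounding rather than being only "similar".

```stata
* Illustration on Stata's bundled auto dataset (not the section data)
sysuse auto, clear
regress price weight

* Fitted values by hand, reading the coefficients from the stored results
generate double fitted = _b[_cons] + _b[weight] * weight

* Fitted values from predict (xb is the default); double avoids float rounding
predict double fv, xb

* Residuals by hand and from predict
generate double resid = price - fv
predict double e, residuals

* Both routes should agree up to rounding
assert reldif(fitted, fv) < 1e-8
assert abs(resid - e) < 1e-6
summarize fitted fv resid e
```

Hard-coding coefficients, as in the manual command on the slide, still works. It introduces small differences because the regress output shows rounded values.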

What else can you do using regressions?
◮ Suppose you run a regression of $y$ on $x_1$ and save the residuals $\hat{e}$. You can then do a scatterplot of the residuals ($\hat{e}$) against a different variable ($x_2$) to see whether $x_2$ helps explain the variation in $y$ that $x_1$ leaves unexplained:
    twoway scatter e x2
◮ You can do a multiple regression:
    regress y x1 x2 ...
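Both ideas can be tried end to end. This is a sketch under stated assumptions: it uses Stata's bundled auto dataset, with price as y, weight as x1, and mpg as x2. These are stand-ins chosen for illustration and are not the section's variables.

```stata
* Illustration on Stata's bundled auto dataset (not the section data)
sysuse auto, clear

* Step 1: regress y on x1 and store the residuals
regress price weight
predict double e, residuals

* Step 2: plot the residuals against a second variable (x2)
twoway scatter e mpg

* Step 3: include both variables in a multiple regression
regress price weight mpg
```

A visible pattern in the residual plot suggests that the second variable carries information the first regression missed. The multiple regression then estimates each coefficient holding the other variable fixed.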
