2/18/20 & 2/19/20 POL 144A: Eastern European Democratization - PowerPoint PPT Presentation

Section 5: Regression
2/18/20 & 2/19/20
POL 144A: Eastern European Democratization
Isaac Hale
Winter 2020

Hale Regression
Outline
1. Regression Crash Course
2. Regression in Excel

What Is Regression Analysis?
• A statistical technique for estimating the relationship between:
  – A dependent variable (“Y”)
  – One or more independent variables (“X”)
• It is not a coincidence that we call these Y and X: you should think of the dependent variable as the y-axis of a graph
• While there are many kinds of regression, we will be using a simple type of linear regression called Ordinary Least Squares (“OLS”) in this class

Regression Basics
• When we run a regression, we get back a series of numbers
• Here are the ones you should pay attention to:
  – Coefficients for each of our independent variables (our X’s)
  – P-values for each independent variable
  – An “intercept”
  – Number of observations
  – R-squared
• Remember the equation for a line? (think high school math!)
• Y = mX + b

Regression Basics
• The equation for a simple regression is very similar:
  – Y = a + B_1 X_1 + E
• Y is our dependent variable, “a” is the intercept, “X_1” is our first independent variable, “B_1” is the coefficient for that independent variable, and “E” is our error term
  – Think of the coefficient like the slope in our line equation (“m”)
  – Why an error term? Regression is an estimate, not reality
• If we have multiple independent variables, our regression might look like this:
• Y = a + B_1 X_1 + B_2 X_2 + B_3 X_3 + E
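To make the equation Y = a + B_1 X_1 + E concrete, here is a minimal Python sketch of a one-variable OLS fit. The data are invented for illustration, the function name `fit_simple_ols` is hypothetical, and the course itself uses Excel rather than Python. The residuals at the end play the role of the error term E.

```python
def fit_simple_ols(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariation of X and Y divided by variation of X
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up data: X is an independent variable, Y is a dependent variable
x_values = [1.0, 2.0, 3.0, 4.0, 5.0]
y_values = [51.2, 51.9, 53.1, 53.8, 55.0]

a, b1 = fit_simple_ols(x_values, y_values)

# Residuals: the part of each Y the line does not explain (the "E" term)
residuals = [y - (a + b1 * x) for x, y in zip(x_values, y_values)]

print(f"intercept a = {a:.2f}, coefficient B_1 = {b1:.2f}")
print("residuals:", [round(e, 2) for e in residuals])
```

The points do not sit exactly on the fitted line, which is why the residuals are not all zero: the line is an estimate of the relationship, not a perfect description of it.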

A Simple Regression: Theory First!
• Let’s do an imaginary regression together, with made-up data
• Let’s imagine that we’re looking at data on literacy and K-12 education spending for several Eastern European countries across many years
• How might we expect education spending and literacy to be related?
• Hypothesis: more education spending causes higher literacy
• NOTE: unlike a correlation, a regression has a direction: it treats X as explaining Y. The regression by itself cannot prove that X causes Y.
  – You, the researcher, must justify this!

A Simple Regression: Interpretation
• IMPORTANT: how exactly is each variable measured?
  – Let’s say our dependent variable, literacy, is the percent of the population that is literate
  – Let’s imagine that our independent variable, education spending, is millions of dollars spent by the country on K-12 education
• Let’s imagine we run the regression, and we get results like this:
  – Intercept = 50
  – Education Spending B_1 = 1
• What does this mean?
  – When education spending is zero, we expect literacy to be 50%
  – For each additional million dollars spent on K-12 education, we expect literacy to be 1 percentage point higher

A Simple Regression: Interpretation
• What would happen if our “X” variable were thousands of dollars, not millions?
• Our Education Spending B_1 would be 0.001, and our intercept would be unchanged at 50
• What does this mean?
  – When education spending is zero, we expect literacy to be 50%
  – For each additional thousand dollars spent on K-12 education, we expect literacy to be 0.001 percentage points higher
• This is why knowing the measurement of our variables is critical for interpreting a regression!
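The effect of changing units can be checked directly. The sketch below fits the same made-up literacy data twice, once with spending in millions of dollars and once in thousands. The data are invented to lie exactly on the line literacy = 50 + 1 × spending (in millions), and `fit_simple_ols` is a hypothetical helper written for this illustration, not part of any library.

```python
def fit_simple_ols(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Made-up data: literacy is exactly 50 + 1 * (spending in millions)
spending_millions = [10.0, 20.0, 30.0, 40.0]
literacy_pct = [60.0, 70.0, 80.0, 90.0]

# Same spending, expressed in thousands of dollars instead
spending_thousands = [x * 1000 for x in spending_millions]

a_m, b_m = fit_simple_ols(spending_millions, literacy_pct)
a_k, b_k = fit_simple_ols(spending_thousands, literacy_pct)

# Predicted literacy for $20 million of spending, computed in each unit
pred_m = a_m + b_m * 20
pred_k = a_k + b_k * 20_000

print(f"millions:  intercept = {a_m:.1f}, coefficient = {b_m:.3f}")
print(f"thousands: intercept = {a_k:.1f}, coefficient = {b_k:.3f}")
print(f"predicted literacy at $20 million: {pred_m:.1f} vs {pred_k:.1f}")
```

The coefficient shrinks by a factor of 1,000 when the unit shrinks by a factor of 1,000, but the intercept and the predictions stay the same. The regression describes the same relationship either way; only the unit attached to "one more X" changes.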

Another Simple Regression Example
• Let’s try another!
• Let’s imagine that we’re looking at data on life expectancy and poverty in Eastern Europe
• What might we hypothesize?
• Hypothesis: higher levels of poverty cause lower life expectancy
  – What is our dependent variable (Y)?
  – What is our independent variable (X)?

A Simple Regression: Interpretation
• MEASUREMENT
  – Let’s say life expectancy is measured in years
  – Let’s say poverty is the percent of the country’s population living in poverty
• Let’s imagine we run the regression, and we get results like this:
  – Intercept = 80
  – Poverty Level B_1 = -0.5
• What does this mean?
  – When poverty is zero, we expect life expectancy to be 80 years
  – For each percentage-point increase in a country’s poverty rate, we expect life expectancy to be half a year lower
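Once we have an intercept and a coefficient, a prediction is just arithmetic. This short Python sketch plugs the imaginary results from this slide (intercept 80, coefficient -0.5) into the regression equation. The function name `predict_life_expectancy` is made up for this illustration.

```python
def predict_life_expectancy(poverty_pct, intercept=80.0, slope=-0.5):
    """Predicted life expectancy in years for a given poverty rate in percent."""
    return intercept + slope * poverty_pct


# Predictions at a few poverty rates (percent of the population)
for poverty in (0, 10, 20):
    years = predict_life_expectancy(poverty)
    print(f"poverty = {poverty:>2}%  ->  predicted life expectancy = {years} years")
```

Going from 10% to 20% poverty is a 10-point increase, so the prediction drops by 10 × 0.5 = 5 years, from 75 to 70.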

P-Values & “Stars”
• Something else our model will tell us is the p-value for each coefficient
• Basically, the p-value tells us how surprising our coefficient would be if the independent variable really had no effect at all
• If a coefficient has a p-value of 0.05 or lower, we generally say the variable is statistically significant
  – This means that, if the true effect were zero, we would see a coefficient this far from zero less than 5% of the time
• This is the same thing the “stars” in the regression tables from the readings were telling us
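One way to build intuition for a p-value is to ask: if poverty and life expectancy were unrelated, how often would a slope this steep appear by chance? The Python sketch below answers that with a permutation test on made-up data. This is a different technique from the t-test that regression software such as Excel typically reports, so the number will not match Excel's p-value, but the logic is the same. The function name `ols_slope` and all data values are invented for this illustration.

```python
from itertools import permutations


def ols_slope(xs, ys):
    """Least-squares slope of Y on X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Made-up data for six imaginary countries
poverty = [5.0, 10.0, 15.0, 20.0, 25.0, 30.0]            # percent
life_expectancy = [77.6, 74.9, 72.8, 69.7, 67.6, 65.1]   # years

observed = ols_slope(poverty, life_expectancy)

# Reassign the life expectancy values to countries in every possible order.
# Each reordering is a world where poverty has no link to life expectancy.
extreme = 0
total = 0
for shuffled in permutations(life_expectancy):
    total += 1
    # Count reorderings whose slope is at least as far from zero as ours
    # (small tolerance guards against floating-point rounding)
    if abs(ols_slope(poverty, shuffled)) >= abs(observed) - 1e-9:
        extreme += 1

p_value = extreme / total

print(f"observed slope = {observed:.2f}")
print(f"{extreme} of {total} reorderings are this extreme, p = {p_value:.4f}")
```

Here only 2 of the 720 possible reorderings produce a slope as far from zero as the observed one, so the p-value is well below 0.05. With only six observations it is feasible to try every ordering; for larger datasets one would sample random shuffles instead.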

Excel Time!
• Let’s see how to do this in Excel!
• You will be doing your own regression for the homework for next week
• If you get lost, here is a YouTube link to walk you through how to do a regression in Excel: https://youtu.be/0lpfmFnlDHI
