Assessment Option 1: Take-home exam - PowerPoint PPT Presentation



slide-1
SLIDE 1

Assessment

  • Option 1: Take-home exam
    – Replicate an analysis we provide and answer questions
    – Will be uploaded by Friday, December 2, 2016
  • Option 2: Essay involving data analysis
    – Replicate (or do) an analysis of your choice and write an essay
    – Consultation emails by Friday, December 9, 2016 (not required)
  • Deadline for both:
    – 12:00 p.m. on Friday 27 January 2017 (HT Week 2) via Weblearn

slide-2
SLIDE 2

Drop-in sessions

MT Week 8 (Q-Step Lab):

  • Tuesday, 10.00-12.00: Anna Petherick (Comp Gov)
  • Wednesday, 10.00-12.00: Giacomo Arrighini (Pol Soc)
  • Thursday, 10.00-12.00: Jeffrey Wright (IR)
  • Friday, 10.00-12.00: Gerda Hooijer (Pol Soc)

HT Week 1 (Q-Step Lab):

  • 15-min. slot bookings
  • Alice E. will send you an email with a link to sign up

slide-3
SLIDE 3

Last week: influence

Outlier = big residuals.

  • Problem = limited, but can increase standard errors.
  • Detect with studentized residuals.

High leverage = usually big or small value(s) of independent variable(s).

  • Problem = can change coefficients.
  • Detect with hat values.

Influence = Outlierness * Leverage
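
A minimal sketch of the three diagnostics above for a regression with a single X, in plain Python. The data are made up for illustration. Cook's distance is used here as one common way to combine outlierness and leverage into a single influence number; the slide itself does not name a specific measure, so treat that choice as an assumption.

```python
import math

def simple_ols_diagnostics(x, y):
    """Hat values, studentized residuals and Cook's distance for y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    # Leverage (hat value): how far each x sits from the mean of x.
    hat = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    k = 2  # estimated parameters: intercept and slope
    s2 = sum(e ** 2 for e in resid) / (n - k)
    # Internally studentized residual: residual divided by its own standard error.
    student = [e / math.sqrt(s2 * (1 - h)) for e, h in zip(resid, hat)]
    # Cook's distance: large when a point is both outlying and high-leverage.
    cooks = [t ** 2 * h / (k * (1 - h)) for t, h in zip(student, hat)]
    return hat, student, cooks

# Made-up data: nine points near y = 1 + 2x, plus one unusual observation at the end.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20]
y = [3.1, 4.9, 7.2, 8.8, 11.1, 13.0, 14.9, 17.2, 18.9, 10.0]

hat, student, cooks = simple_ols_diagnostics(x, y)
worst = max(range(len(x)), key=lambda i: cooks[i])
print("most influential observation (index):", worst)
```

A high-leverage point pulls the fitted line toward itself, so its raw residual can understate how unusual it is. That is why the residual is rescaled using the hat value before being judged.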

slide-4
SLIDE 4

If you find high-influence obs…

  • Don’t pretend it doesn’t exist
  • Check for a coding mistake
  • Investigate (think of Pat!)
  • Dummy out + compare
slide-5
SLIDE 5

Checking OLS:

Multicollinearity is high correlation between two or more independent variables.

  • Problem: larger standard errors, unstable coefficients
  • Detect: variance-inflation factor (VIF)

What to check:

Heteroscedasticity, Uncorrelated error, Mean independence, (And) Normality, E, lineaRity, MulticollineaRity, O, R
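
For the VIF check on this slide, a rough sketch in plain Python. In general, the VIF of a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on the other predictors. With exactly two predictors, that R² is the squared correlation between them, which is the case assumed here. The variable names and values are invented for illustration.

```python
def pearson_r(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / (var_a * var_b) ** 0.5

def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1 / (1 - r ** 2)

# Hypothetical predictors: income and education move together, age does not.
income = [20, 25, 31, 38, 44, 52, 60, 71]
education = [9, 10, 12, 12, 14, 15, 17, 18]
age = [34, 61, 25, 47, 52, 29, 66, 40]

vif_collinear = vif_two_predictors(income, education)
vif_unrelated = vif_two_predictors(income, age)
print(round(vif_collinear, 1), round(vif_unrelated, 2))
```

A VIF close to 1 means the predictor shares almost no variance with the other one; the larger the VIF, the more the standard error of its coefficient is inflated.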

slide-6
SLIDE 6

Interpretation of Regression Results

anna.petherick@politics.ox.ac.uk

Political Analysis 2, Lab 6 (Just one more to go after this!)


slide-7
SLIDE 7

Linearity

  • Whether X is really big or really tiny in value, its relationship with Y (its coefficient) is constant.
  • “The predicted change in Y for a one-unit increase in X holding the other variables in the model constant.”

slide-8
SLIDE 8

For example…

Predicting an incumbent party’s vote share in U.S. elections, using model C:

Predicted vote share = 48.24 + (0.57*Growth) + (0.67*Good News)

  • Growth is “previous year’s economic growth rate” (%)
  • Good News is headlines: ‘no. of consecutive quarters of growth’.

Kellstedt and Whitten, p. 203
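
A short worked example of reading predictions off model C. The coefficients are the ones on the slide; the input values (2% growth, 4 consecutive quarters of growth) are hypothetical and chosen only to show the arithmetic.

```python
def predicted_vote_share(growth, good_news):
    """Model C from the slide; the coefficients are taken as given."""
    return 48.24 + 0.57 * growth + 0.67 * good_news

# Hypothetical inputs: 2% growth and 4 consecutive quarters of growth.
example = predicted_vote_share(growth=2.0, good_news=4)

# Effect of one extra percentage point of growth, holding Good News constant.
growth_effect = predicted_vote_share(3.0, 4) - predicted_vote_share(2.0, 4)

print(round(example, 2), round(growth_effect, 2))  # prints 52.06 0.57
```

The second number is just the Growth coefficient: a one-unit increase in X changes the prediction by its coefficient, whatever value the other variable is held at.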

slide-9
SLIDE 9

Interpreting coefficients

  • Statistical significance: p-value
  • Substantive significance: effect size
  • Independent variables are measured in their own units
    – Age (in years)
    – Gender (1=female, 0=male)
    – Income (in thousands $)
    → Calculate along the range (min-max) of X
    → Standardized coefficients (see homework section)
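
The two ways of judging effect size listed above can be sketched as follows. The standardized coefficient is computed with the textbook formula b * sd(X) / sd(Y); whether the homework section uses exactly this formula is an assumption. The age values and the coefficient are invented.

```python
from statistics import stdev

def effect_over_range(coef, x_values):
    """Predicted change in Y when X moves from its minimum to its maximum."""
    return coef * (max(x_values) - min(x_values))

def standardized_coefficient(coef, x_values, y_values):
    """Change in Y, in standard deviations, per one-standard-deviation change in X."""
    return coef * stdev(x_values) / stdev(y_values)

# Hypothetical example: age in years with an invented coefficient of 0.05.
age_years = [18, 25, 40, 62, 80]
age_coef = 0.05
swing = effect_over_range(age_coef, age_years)  # 0.05 * (80 - 18)

# Sanity check on a perfect line y = 2x: the standardized slope equals 1.
beta_std = standardized_coefficient(2.0, [1, 2, 3, 4], [2, 4, 6, 8])
print(round(swing, 2), round(beta_std, 2))
```

A coefficient of 0.05 looks tiny per year of age, but over the full observed range it adds up, which is why the min-max calculation is useful for variables measured in small units.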

slide-10
SLIDE 10

Think sensibly

  • Always think substantively.
  • Sometimes predictions are logically impossible or irrelevant.

Credit: Catherine de Vries

slide-11
SLIDE 11

(non)-Linearity

slide-12
SLIDE 12

Transformations of X

  • In OLS, the coefficients of the independent variables are not squared, cubed, square-rooted or logged.
  • But non-linear relationships between X and Y can still be modelled by transforming X.
  • For example, you could have something like:

Y = B0 + B1*X + B2*X^2 + B3*log_e(Z)
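
A small sketch of what the example model means in practice: the prediction is still a plain weighted sum, but the weights multiply transformed versions of X and Z. The coefficient values below are hypothetical, chosen only so that the arithmetic is easy to follow.

```python
import math

def predict(x, z, b0, b1, b2, b3):
    """Y = B0 + B1*X + B2*X^2 + B3*log_e(Z): linear in the B's, curved in X and Z."""
    features = [1.0, x, x ** 2, math.log(z)]
    coefs = [b0, b1, b2, b3]
    return sum(b * f for b, f in zip(coefs, features))

# Hypothetical coefficients (not estimated from any data).
B = dict(b0=1.0, b1=2.0, b2=-0.1, b3=0.5)

# The effect of a one-unit increase in X now depends on where X starts.
step_low = predict(2, 1, **B) - predict(1, 1, **B)
step_high = predict(10, 1, **B) - predict(9, 1, **B)
print(round(step_low, 2), round(step_high, 2))
```

With a negative coefficient on X^2, each extra unit of X adds less to Y the larger X already is, so the fitted relationship bends even though OLS is still estimating ordinary coefficients.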

slide-13
SLIDE 13

Transformations of X cont…

  • Examples from the literature: district magnitude is often logged; so are area and GDP per capita… age is often squared.
  • Predictions using logged or squared variables are generated as before, just remember to transform your value of X before multiplying by the coefficient.

Brazil’s 5,570 municipalities:

  • Altamira = 159 696 km²
  • Santa Cruz de Minas = 2.8 km²
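
To illustrate "transform your value of X before multiplying by the coefficient", here is a sketch using the two municipality areas from the slide. The intercept and coefficient are hypothetical placeholders, not estimates from any real model.

```python
import math

# Areas from the slide, in square kilometres.
areas_km2 = {"Altamira": 159696, "Santa Cruz de Minas": 2.8}

# Hypothetical model: Y = B0 + B1 * log_e(area). The values are invented.
B0, B1 = 10.0, 1.5

def predict_from_area(area_km2):
    logged_area = math.log(area_km2)  # transform first...
    return B0 + B1 * logged_area      # ...then multiply by the coefficient

predictions = {name: predict_from_area(a) for name, a in areas_km2.items()}

raw_ratio = areas_km2["Altamira"] / areas_km2["Santa Cruz de Minas"]
log_gap = math.log(areas_km2["Altamira"]) - math.log(areas_km2["Santa Cruz de Minas"])
print(round(raw_ratio), round(log_gap, 2))
```

On the raw scale Altamira is tens of thousands of times larger; on the log scale the two are about 11 units apart, which is why logging keeps a handful of huge municipalities from dominating the fit.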

slide-14
SLIDE 14

Exponential: exp(X) = e^X

[Figure: axes labelled Risk War and Geographic Proximity]

Note: the number e = 2.718281.

slide-15
SLIDE 15

Logarithmic

[Figure: axes labelled GDP and Democracy]

log(Y) is defined as the power that you would need to raise e to in order to end up with Y; i.e., e^log(Y) = Y. For example, log(7) = 1.945910149 because e^1.945910149 = 7.
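
The definition of the natural log can be checked directly with Python's standard `math` module, where `math.log` with one argument is the base-e logarithm and `math.exp` raises e to a power.

```python
import math

log7 = math.log(7)            # the power e must be raised to in order to get 7
back_to_7 = math.exp(log7)    # raising e to that power recovers 7

print(round(log7, 9), round(back_to_7, 9))
```

Because exp and log undo each other, a prediction made on a logged outcome can be converted back to the original units by exponentiating it.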

slide-16
SLIDE 16

Stretching and compressing data

To make these data linear…

  • Either stretch out values of X:
  • Or compress values of Y:
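
A sketch of both options on made-up data where Y grows with the square of X. Squaring X is used as the "stretch" and taking the square root of Y as the "compress"; which transformations the slide's plots actually used is not recoverable from the text, so both are assumptions.

```python
import math

def pearson_r(a, b):
    """Pearson correlation: 1.0 means the points lie exactly on a rising line."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)

# Made-up curved data: Y = X^2.
x = list(range(1, 11))
y = [xi ** 2 for xi in x]

r_raw = pearson_r(x, y)                                    # curved, so below 1
r_stretch_x = pearson_r([xi ** 2 for xi in x], y)          # stretch out X
r_compress_y = pearson_r(x, [math.sqrt(yi) for yi in y])   # compress Y

print(round(r_raw, 3), round(r_stretch_x, 3), round(r_compress_y, 3))
```

Both routes straighten the relationship, but they change interpretation differently: transforming X keeps Y in its original units, while transforming Y means predictions have to be converted back.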

slide-17
SLIDE 17

Over to you…

?

slide-18
SLIDE 18

When to transform X

  • Plot X against Y.
  • Apply the ‘bulging rule’ (Tukey and Mosteller):

[Figure: bulging-rule diagram, with labels X and X²]

slide-19
SLIDE 19

When to transform X cont…

  • Look for patterns in the residuals.

Hmm…that’s hardly a random distribution around zero!

Credit: statwing.com
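
A minimal illustration of a residual pattern, using invented data. A straight line is fitted to a relationship that is really curved (Y = X^2), and the signs of the residuals are printed in order of X.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    return y_bar - b * x_bar, b

# Made-up curved data, deliberately fitted with a straight line.
x = list(range(11))
y = [xi ** 2 for xi in x]

a, b = fit_line(x, y)
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
signs = "".join("+" if e > 0 else "-" for e in residuals)
print(signs)  # prints ++-------++
```

Residuals that are positive at both ends and negative in the middle form a U shape rather than random scatter around zero, which suggests a missing squared term or another transformation of X.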

slide-20
SLIDE 20

Fails and Krieckhaus (2010)
