

SLIDE 1

PS 405 – Week 8 Section: Non-Linear Transformations, Outliers, and Heteroskedasticity

D.J. Flynn
March 4, 2014

SLIDE 2

Announcements

  • 1. Yanna reviewed everyone’s dataset for the final and they’re fine. Just make sure the DV is (quasi-)continuous.
  • 2. Today’s plan: briefly review transformations (Yanna is talking about them Thursday) and outliers (Jay does an entire week on outliers/missing data).
  • 3. Questions on the final problem set or anything else.
SLIDE 3

What the linearity assumption does (and does not) mean

◮ first assumption of OLS: linearity
◮ formally, we say Y is a linear function of the data:

  Ŷᵢ = Xᵢβ

◮ parameters/coefficients are linear
◮ we can transform the IVs and DV to improve our model (e.g., remove heteroskedasticity), but parameters must remain linear in order to use OLS
◮ lots of models eschew linearity. An example is the logit model:

  Ŷᵢ = 1 / (1 + e^(−Xᵢβ))
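
A minimal R sketch contrasting the two (simulated data, variable names are mine): lm() fits the model that is linear in its parameters, while glm() with a binomial family fits the logit.

set.seed(7)
x <- rnorm(200)
y_cont <- 1 + 2 * x + rnorm(200)               # continuous DV for OLS
y_bin  <- rbinom(200, 1, plogis(1 + 2 * x))    # binary DV from a logistic process
lm(y_cont ~ x)                                 # OLS: linear in parameters
glm(y_bin ~ x, family = binomial)              # logit: non-linear link function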

SLIDE 4

Acceptable transformations¹

Y = α + βX²
Y = α + β ln(X)
ln(Y) = α + βX
ln(Y) = α + β ln(X)

¹ More on this in 407.
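
All four forms stay linear in the parameters, so each can be fit with lm(). A sketch, assuming hypothetical vectors x and y (positive values wherever a log is taken):

lm(y ~ I(x^2))        # Y = α + βX²
lm(y ~ log(x))        # Y = α + β ln(X)
lm(log(y) ~ x)        # ln(Y) = α + βX
lm(log(y) ~ log(x))   # ln(Y) = α + β ln(X)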

SLIDE 5

Unacceptable transformation

Y = ln(α + βX)

Here the parameters sit inside the log, so the model is no longer linear in α and β; OLS cannot estimate it (you would need non-linear least squares, e.g., nls()).

SLIDE 6

Transforming data

◮ key point: linear transformations change units of measure (e.g., ounces to pounds) but don’t change the distribution. Re-coding is a common example. So if right-skewed data are transformed linearly, the new data will still have right skew.
◮ same goes for relationships between 2+ variables: linear transformations won’t change anything
◮ non-linear transformations will change the distribution. Sometimes we use logs to make linear regression more appropriate (see the sketch below).
◮ Example: Jacobson (1990): “...it is clear that linear models of campaign spending are inadequate because diminishing returns must apply to campaign spending. Green and Krasno recognize this and offer an alternative model which uses log transformations...”
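
A minimal sketch of the distribution point, using simulated right-skewed data (the names and numbers are mine, not from the slides):

set.seed(42)
x <- rexp(1000)                     # right-skewed data
x_lin <- 16 * x                     # linear transformation (e.g., pounds to ounces): skew unchanged
x_log <- log(x)                     # non-linear transformation: distribution changes
hist(x); hist(x_lin); hist(x_log)   # compare the three distributions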

SLIDE 7

Reasons for log transformations²

◮ make relationships more linear (Jacobson 1990)
◮ reduce heteroskedasticity or skew
◮ hard sciences do this for certain natural patterns (e.g., exponential processes)
◮ easier interpretation (%s)
◮ key point: transformations change the interpretation of coefficients (e.g., in linear-log models, divide the coefficient on the logged variable by 100)

² Yanna will talk more about logs.
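
A hedged illustration of the interpretation point on simulated data: in a linear-log model, the coefficient on log(x) divided by 100 approximates the change in Y for a 1% increase in X.

set.seed(1)
x <- runif(200, 1, 100)
y <- 5 + 2 * log(x) + rnorm(200)
m <- lm(y ~ log(x))
coef(m)[["log(x)"]] / 100   # ≈ change in y for a 1% increase in x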

SLIDE 8

Outliers

Determining whether an outlier is influential:

Influence = Leverage × Discrepancy

where leverage is the distance of a given Xᵢ from the center of the distribution (the mean or centroid), and discrepancy is the distance of Yᵢ from the regression line fitted without that observation. In the end, we care about influence: are there one (or two, or three) observations that are changing our entire estimated effect?
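
In R, both pieces can be inspected directly for any fitted lm object, here called m (a sketch, not from the slides):

hatvalues(m)   # leverage: how far each observation's X values sit from the center
rstudent(m)    # discrepancy: studentized residual, computed with the observation deleted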
SLIDE 9

Quantifying influence

  • 1. DFBETA
  • 2. Cook’s Distance

This is a subjective standard, but most say that if either statistic is > 1, the observation is problematic.

SLIDE 10

You’ll need these...

library(car)
install.packages("nnet")
library(nnet)
install.packages("MASS")
library(MASS)
library(stats)   # stats is a base package; no install needed
install.packages("zoo")
library(zoo)

SLIDE 11

DFBETA

A measure of how much a coefficient changes with an observation included vs. excluded, scaled by the standard error computed with the observation deleted.

From Yanna’s lecture:

a <- c(4, 3, 2, 1, 5, 2, 3, 4, 5, 1, 3, 2, 1, 1500)   # note the extreme value, 1500
b <- c(1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 1)
c <- c(10, 11, 25, 20, 18, 17, 10, 11, 12, 33, 38, 12, 14, 17)
plot(a)
model <- lm(c ~ a + b)
dfbetasPlots(model)        # from car
influence.measures(model)
dfbetas(model)
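
To tie this to the > 1 rule of thumb from the previous slide, one can flag offending observations directly (a sketch):

which(abs(dfbetas(model)) > 1, arr.ind = TRUE)   # rows = observations, columns = coefficients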

SLIDE 12

[Figure: dfbetas plots for the coefficients on a and b, plotted against observation index]

SLIDE 13

Cook’s Distance

Similar idea as DFBETA: Cook’s Distance quantifies how much the estimated coefficients move, within the range of plausible values, when a given observation is excluded.

cutoff <- 4 / nrow(model$model)   # one common rule of thumb; the slide leaves the cutoff undefined
plot(model, which = 4, cook.levels = cutoff)
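
The distances themselves come from cooks.distance(); combined with the > 1 standard from the earlier slide (a sketch):

cooks.distance(model)              # one distance per observation
which(cooks.distance(model) > 1)   # flag observations past the > 1 rule of thumb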

SLIDE 14

[Figure: Cook's distance plot for lm(c ~ a + b), by observation number; observations 14, 11, and 10 are flagged]

SLIDE 15

Testing for heteroskedasticity

Recap: heteroskedasticity is non-constant error variance, which means a loss of efficiency.

Tests:

  • 1. Breusch-Pagan/Cook-Weisberg (“BP test”)
  • 2. White’s test
SLIDE 16

BP Test

◮ we assume that error variances are equal (the null), and test the alternative that they are unequal
◮ idea: regress squared residuals on the IVs and see if they predict the size of the residuals
◮ the test statistic is distributed χ², so the critical value depends on degrees of freedom (R will report significance)
◮ some simulated heteroskedastic data is on BB if you want to practice
◮ the command is easy:

library(lmtest)
bptest(model)
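
A minimal sketch of practicing on simulated heteroskedastic data (my own simulation, not the BB dataset):

set.seed(123)
x <- runif(500, 1, 10)
y <- 1 + 2 * x + rnorm(500, sd = 0.5 * x)   # error sd grows with x
m <- lm(y ~ x)
library(lmtest)
bptest(m)   # small p-value: reject the null of constant error variance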

SLIDE 17

White’s test

◮ similar idea as the BP test, but instead regresses squared residuals on the IVs, squared versions of the IVs, and cross-products of the regressors
◮ again, the distribution is χ², so the critical value depends on degrees of freedom (R will report significance)
◮ there’s now a package for running White’s test:

install.packages("bstats")
library(bstats)
white.test(model)
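
If bstats won’t install from your CRAN mirror, a known alternative is to run White’s test through lmtest::bptest by supplying the auxiliary regression yourself; a sketch for a one-regressor model m <- lm(y ~ x):

library(lmtest)
bptest(m, ~ x + I(x^2))   # auxiliary regression on x and x²; cross-products enter with 2+ IVs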

SLIDE 18

Questions?