

SLIDE 1

PS 405 – Week 8 Section: Non-Linear Transformations, Outliers, and Heteroskedasticity

D.J. Flynn
March 4, 2014

SLIDE 2

Announcements

  • 1. Yanna reviewed everyone’s dataset for the final and they’re fine. Just make sure the DV is (quasi-)continuous.
  • 2. Today’s plan: briefly review transformations (Yanna is talking about them Thursday) and outliers (Jay does an entire week on outliers/missing data).
  • 3. Questions on the final problem set or anything else.
SLIDE 3

What the linearity assumption does (and does not) mean

◮ first assumption of OLS: linearity
◮ formally, we say Y is a linear function of the data:

  Ŷᵢ = Xᵢβ

◮ parameters/coefficients are linear
◮ we can transform the IVs and DV to improve our model (e.g., remove heteroskedasticity), but parameters must remain linear in order to use OLS
◮ lots of models eschew linearity. An example is the logit model:

  Ŷᵢ = 1 / (1 + e^(−Xᵢβ))
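
A minimal R sketch contrasting the two (simulated data, variable names are mine): lm() fits the model that is linear in its parameters, while glm() with a binomial family fits the logit.

set.seed(7)
x <- rnorm(200)
y_cont <- 1 + 2 * x + rnorm(200)               # continuous DV for OLS
y_bin  <- rbinom(200, 1, plogis(1 + 2 * x))    # binary DV from a logistic process
lm(y_cont ~ x)                                 # OLS: linear in parameters
glm(y_bin ~ x, family = binomial)              # logit: non-linear link function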

SLIDE 4

Acceptable transformations¹

Y = α + βX²
Y = α + β ln(X)
ln(Y) = α + βX
ln(Y) = α + β ln(X)

¹ More on this in 407.
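
All four forms stay linear in the parameters, so each can be fit with lm(). A sketch, assuming hypothetical vectors x and y (positive values wherever a log is taken):

lm(y ~ I(x^2))        # Y = α + βX²
lm(y ~ log(x))        # Y = α + β ln(X)
lm(log(y) ~ x)        # ln(Y) = α + βX
lm(log(y) ~ log(x))   # ln(Y) = α + β ln(X)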

SLIDE 5

Unacceptable transformation

Y = ln(α + βX)

Here the parameters sit inside the log, so the model is no longer linear in α and β; OLS cannot estimate it (you would need non-linear least squares, e.g., nls()).

SLIDE 6

Transforming data

◮ key point: linear transformations change units of measure (e.g., ounces to pounds) but don’t change the distribution. Re-coding is a common example. So if right-skewed data are transformed linearly, the new data will still have right skew.
◮ same goes for relationships between 2+ variables: linear transformations won’t change anything
◮ non-linear transformations will change the distribution. Sometimes we use logs to make linear regression more appropriate (see the sketch below).
◮ Example: Jacobson (1990): “...it is clear that linear models of campaign spending are inadequate because diminishing returns must apply to campaign spending. Green and Krasno recognize this and offer an alternative model which uses log transformations...”
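
A minimal sketch of the distribution point, using simulated right-skewed data (the names and numbers are mine, not from the slides):

set.seed(42)
x <- rexp(1000)                     # right-skewed data
x_lin <- 16 * x                     # linear transformation (e.g., pounds to ounces): skew unchanged
x_log <- log(x)                     # non-linear transformation: distribution changes
hist(x); hist(x_lin); hist(x_log)   # compare the three distributions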

SLIDE 7

Reasons for log transformations²

◮ make relationships more linear (Jacobson 1990)
◮ reduce heteroskedasticity or skew
◮ hard sciences do this for certain natural patterns (e.g., exponential processes)
◮ easier interpretation (%s)
◮ key point: transformations change the interpretation of coefficients (e.g., in linear-log models, divide the coefficient on the logged variable by 100)

² Yanna will talk more about logs.
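
A hedged illustration of the interpretation point on simulated data: in a linear-log model, the coefficient on log(x) divided by 100 approximates the change in Y for a 1% increase in X.

set.seed(1)
x <- runif(200, 1, 100)
y <- 5 + 2 * log(x) + rnorm(200)
m <- lm(y ~ log(x))
coef(m)[["log(x)"]] / 100   # ≈ change in y for a 1% increase in x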

SLIDE 8

Outliers

Determining whether an outlier is influential:

Influence = Leverage × Discrepancy

where leverage is the distance of a given Xᵢ from the center of the distribution (the mean or centroid), and discrepancy is the distance of Yᵢ from the regression line fitted without that observation. In the end, we care about influence: are there one (or two, or three) observations that are changing our entire estimated effect?
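
In R, both pieces can be inspected directly for any fitted lm object, here called m (a sketch, not from the slides):

hatvalues(m)   # leverage: how far each observation's X values sit from the center
rstudent(m)    # discrepancy: studentized residual, computed with the observation deleted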
SLIDE 9

Quantifying influence

  • 1. DFBETA
  • 2. Cook’s Distance

This is a subjective standard, but most say that if either statistic is > 1, the observation is problematic.

SLIDE 10

You’ll need these...

library(car)
install.packages("nnet")
library(nnet)
install.packages("MASS")
library(MASS)
library(stats)   # stats is a base package; no install needed
install.packages("zoo")
library(zoo)

SLIDE 11

DFBETA

A measure of how much a coefficient changes with an observation included vs. excluded, scaled by the standard error computed with the observation deleted.

From Yanna’s lecture:

a <- c(4, 3, 2, 1, 5, 2, 3, 4, 5, 1, 3, 2, 1, 1500)   # note the extreme value, 1500
b <- c(1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 1)
c <- c(10, 11, 25, 20, 18, 17, 10, 11, 12, 33, 38, 12, 14, 17)
plot(a)
model <- lm(c ~ a + b)
dfbetasPlots(model)        # from car
influence.measures(model)
dfbetas(model)
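
To tie this to the > 1 rule of thumb from the previous slide, one can flag offending observations directly (a sketch):

which(abs(dfbetas(model)) > 1, arr.ind = TRUE)   # rows = observations, columns = coefficients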

SLIDE 12

[Figure: dfbetas plots for the coefficients on a and b, plotted against observation index]

SLIDE 13

Cook’s Distance

Similar idea as DFBETA: Cook’s Distance quantifies how much the estimated coefficients move, within the range of plausible values, when a given observation is excluded.

cutoff <- 4 / nrow(model$model)   # one common rule of thumb; the slide leaves the cutoff undefined
plot(model, which = 4, cook.levels = cutoff)
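
The distances themselves come from cooks.distance(); combined with the > 1 standard from the earlier slide (a sketch):

cooks.distance(model)              # one distance per observation
which(cooks.distance(model) > 1)   # flag observations past the > 1 rule of thumb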

SLIDE 14

[Figure: Cook's distance plot for lm(c ~ a + b), by observation number; observations 14, 11, and 10 are flagged]

SLIDE 15

Testing for heteroskedasticity

Recap: heteroskedasticity is non-constant error variance, which means a loss of efficiency.

Tests:

  • 1. Breusch-Pagan/Cook-Weisberg (“BP test”)
  • 2. White’s test
SLIDE 16

BP Test

◮ we assume that error variances are equal (the null), and test the alternative that they are unequal
◮ idea: regress squared residuals on the IVs and see if they predict the size of the residuals
◮ the test statistic is distributed χ², so the critical value depends on degrees of freedom (R will report significance)
◮ some simulated heteroskedastic data is on BB if you want to practice
◮ the command is easy:

library(lmtest)
bptest(model)
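
A minimal sketch of practicing on simulated heteroskedastic data (my own simulation, not the BB dataset):

set.seed(123)
x <- runif(500, 1, 10)
y <- 1 + 2 * x + rnorm(500, sd = 0.5 * x)   # error sd grows with x
m <- lm(y ~ x)
library(lmtest)
bptest(m)   # small p-value: reject the null of constant error variance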

SLIDE 17

White’s test

◮ similar idea as the BP test, but instead regresses squared residuals on the IVs, squared versions of the IVs, and cross-products of the regressors
◮ again, the distribution is χ², so the critical value depends on degrees of freedom (R will report significance)
◮ there’s now a package for running White’s test:

install.packages("bstats")
library(bstats)
white.test(model)
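
If bstats won’t install from your CRAN mirror, a known alternative is to run White’s test through lmtest::bptest by supplying the auxiliary regression yourself; a sketch for a one-regressor model m <- lm(y ~ x):

library(lmtest)
bptest(m, ~ x + I(x^2))   # auxiliary regression on x and x²; cross-products enter with 2+ IVs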

SLIDE 18

Questions?