
Contrast Coding Or: One of These Levels is Not Like the Others

Contrast Coding Or: One of These Levels is Not Like the Others Scott Fraundorf (and Tuan Lam) MLM Reading Group 03.10.11 Administrivia 3/10 (TODAY): Contrast coding overview 4/7: Simple vs main effects 4/21: Principal components


  1. Contrast Coding Or: One of These Levels is Not Like the Others Scott Fraundorf (and Tuan Lam) MLM Reading Group – 03.10.11

  2. Administrivia ● 3/10 (TODAY): Contrast coding overview ● 4/7: Simple vs main effects ● 4/21: Principal components analysis ● 1st week of May: Harald Baayen visit

  3. Outline ● Why use contrast coding? ● Example contrasts ● Contrast estimates ● Contrasts in R ● Multiple comparisons ● How does it work? ● Other kinds of coding ● Interactions

  4. Why Use Contrast Coding? ● Scott's example study: examining recall memory for spoken discourse as a function of: – Location of disfluencies (categorical variable) – Prior story knowledge (continuous variable) [Diagram: recall modeled as ITEM + LOCATION OF DISFLUENCY + PRIOR KNOWLEDGE + SUBJECT]

  5. Why Use Contrast Coding? ● Regression equation: predicts values ● Could use this to predict whether or not something will be remembered [Diagram: recall modeled as ITEM + LOCATION OF DISFLUENCY + PRIOR KNOWLEDGE + SUBJECT] ● But in cognitive psych: often interested in the effect of specific levels ● Test which ones differ significantly

  6. Outline ● Why use contrast coding? ● Example contrasts ● Contrast estimates ● Contrasts in R ● Multiple comparisons ● How does it work? ● Other kinds of coding ● Interactions

  7. Contrast Coding [Figure: bar chart of % of story recalled (y-axis from 0.4 to 0.8) for Typical, Atypical, and Fluent] ● Example: Fluent vs. disfluencies in typical locations vs. in atypical locations ● Which ones differ significantly?

  8. Contrast Coding ● Contrasts: Test differences between specific levels – Same as a planned comparison in an ANOVA – Also analogous to a post-hoc test ● Planned comparisons vs post-hoc tests – If we are deciding tests post-hoc, greater chance of capitalizing on chance / spurious effect – Contrasts are set before you fit the model, but it would be possible to go back and change the contrasts afterwards – We are basically on the honor system here—no way to prove the comparison was planned ahead of time

  9. Contrasts! ● Contrasts are like weighted sums of means – In a multiple regression / MLM context, they are also subject to the other variables in the model ● Using your scale to test what's different

  10. Contrast Coding ● It looks like the Fluent stories might not be remembered as well. ● Let's use a contrast to test this. [Figure: bar chart of % of story recalled (y-axis from 0.4 to 0.8) for Typical, Atypical, and Fluent]

  11. Contrasts ● Question 1: Do disfluencies affect recall? [Diagram: the three levels TYPICAL, ATYPICAL, and FLUENT]

  12. Contrasts ● Contrast weights are assigned: TYPICAL = .33, ATYPICAL = .33, FLUENT = -.66 ● One side positive. One side negative. This determines which levels are being compared (+ versus -) ● Doesn't really matter which side you choose as the + side. It just affects the sign of the result, but not magnitude or statistical significance

  13. Contrasts ● Contrast weights are assigned: TYPICAL = .33, ATYPICAL = .33, FLUENT = -.66 ● One side positive. One side negative. ● Codes add up to zero. ● Also nice to have the absolute values of the + code and the - code sum to 1. (We'll see why later.) abs(.33) + abs(-.66) = .99 with these rounded weights, and exactly 1 with 1/3 and -2/3

  14. Contrasts ● Weights: TYPICAL = .33, ATYPICAL = .33, FLUENT = -.66 ● One side positive. One side negative. Codes add up to zero. ● Can conceptualize the comparison as: Contrast 1: .33(Typical) + .33(Atypical) - .66(Fluent) (holding other variables constant) ● Does the contrast differ significantly from zero? If so, the difference between levels is significant.
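
The weighted-sum view of Contrast 1 can be sketched in a few lines of R. The condition means below are made up for illustration (they are not the study's data), and exact thirds stand in for the rounded .33 / -.66 weights so that the codes sum to exactly zero.

```r
# Hypothetical condition means (proportion of story recalled); NOT the study's data.
cell_means <- c(Typical = 0.70, Atypical = 0.66, Fluent = 0.55)

# Contrast 1 weights: disfluent conditions (+) versus fluent (-).
# Exact thirds are an assumption replacing the rounded .33 / -.66 on the slide.
w1 <- c(Typical = 1/3, Atypical = 1/3, Fluent = -2/3)

# The contrast value is the weighted sum of the condition means.
contrast1 <- sum(w1 * cell_means)

# Equivalent view: (2/3) * (mean of the two disfluent conditions - fluent mean).
check <- (2/3) * (mean(cell_means[c("Typical", "Atypical")]) - cell_means[["Fluent"]])

print(c(contrast1 = contrast1, check = check))
```

In a fitted model the estimate also depends on the other predictors, so this is only the "holding other variables constant" intuition, not the model's exact computation.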

  15. Contrasts ● Weights: TYPICAL = .33, ATYPICAL = .33, FLUENT = -.66 ● Contrast 1: .33(Typical) + .33(Atypical) - .66(Fluent) is significant (*)

  16. Contrast Coding [Figure: bar chart of % of story recalled (y-axis from 0.4 to 0.8) for Typical, Atypical, and Fluent, annotated with *] ● Our first contrast reveals that fluent stories are remembered worse. ● Now let's look at Typical vs Atypical ● We always have j – 1 contrasts, where j = the # of levels of the factor ● So, here 2 contrasts are needed to fully describe the factor

  17. Contrasts TYPICAL ATYPICAL Question 2: Does location of disfluencies matter?

  18. Contrasts ● One side positive. One side negative. ● Codes add up to zero. Sum of absolute values of codes is 1. ● Contrast 2: .50(Typical) - .50(Atypical) + 0(Fluent) ● FLUENT gets a weight of 0 (zeroed out here!)

  19. Contrast Coding [Figure: bar chart of % of story recalled (y-axis from 0.4 to 0.8) for Typical, Atypical, and Fluent, annotated with * and n.s.]

  20. One Important Point! ● Choice of contrasts doesn't affect total variance accounted for by the variable ● Only about differences between levels ● Can divide this up in multiple different ways and still account for same total variance [Diagram label: LOCATION IN STORY]
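
A quick way to convince yourself of this point is to fit the same model under two different codings and compare the variance explained. The sketch below uses simulated data and plain `lm()` rather than `lmer()` so that it runs in base R; the variable names and the simulated means are assumptions for illustration only.

```r
set.seed(1)

# Simulated data for illustration only (not the study's data).
d <- data.frame(
  location = factor(rep(c("Typical", "Atypical", "Fluent"), each = 20),
                    levels = c("Typical", "Atypical", "Fluent"))
)
d$recall <- rnorm(60, mean = c(0.70, 0.66, 0.55)[as.integer(d$location)], sd = 0.1)

# Coding A: the two contrasts from the slides (exact thirds instead of .33 / -.66).
dA <- d
contrasts(dA$location) <- cbind(c(1/3, 1/3, -2/3), c(0.5, -0.5, 0))

# Coding B: treatment (dummy) coding, a different way to carve up the same factor.
dB <- d
contrasts(dB$location) <- contr.treatment(3)

r2A <- summary(lm(recall ~ location, data = dA))$r.squared
r2B <- summary(lm(recall ~ location, data = dB))$r.squared

print(c(r2A = r2A, r2B = r2B))  # the two values match
```

The individual coefficients differ between the two fits because each coding asks different questions, but the factor as a whole explains the same amount of variance.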

  21. Outline ● Why use contrast coding? ● Example contrasts ● Contrast estimates ● Contrasts in R ● Multiple comparisons ● How does it work? ● Other kinds of coding ● Interactions

  22. Why -.5 and .5? ● Why [-.5 .5] instead of [-1 1]? ● Doesn't affect significance test ● Does affect β weight (estimate) – Std error is also scaled accordingly [Model output shown for FILLER LOCATION: [-.5 .5] and for FILLER LOCATION: [-1 1]]

  23. Contrast Estimates ● Contrast codes: ATYPICAL LOCATION = .5, TYPICAL LOCATION = -.5 (the codes are 1 unit apart) ● Beta weight (estimate) represents the effect of a 1-unit change in the contrast, holding everything else constant ● In this case, a 1-unit change in contrast IS the difference between the levels' codes ● Thus, the contrast correctly represents .04825 as the difference between the conditions

  24. Contrast Estimates ● Contrast codes: ATYPICAL LOCATION = 1, TYPICAL LOCATION = -1 (the codes are 2 units apart) ● Here, the total difference between the levels' codes is 2 ● So, a 1-unit change in the contrast is only HALF the difference between the levels' codes ● Thus, the estimate of the contrast is .024 … only half the difference between the conditions

  25. Contrast Estimates ● Beta weight (estimate) represents the effect of a 1-unit change in the contrast ● Codes ATYPICAL LOCATION = .5, TYPICAL LOCATION = -.5 (1 unit apart): 1 unit change in contrast IS the difference between levels (.04825 in this case) ● Codes ATYPICAL LOCATION = 1, TYPICAL LOCATION = -1 (2 units apart): 1 unit change in contrast IS only half the difference between levels

  26. So Why -.5 and .5? ● Better tells you about the difference in means! – The actual difference between conditions is .048 – It would be perfectly correct to describe .024 as half the difference between levels, and you could even put a CI around it … it's just less intuitive for your readers [Model output shown for FILLER LOCATION: [-.5 .5] and for FILLER LOCATION: [-1 1]]

  27. So Why -.5 and .5? ● Better tells you about the difference in means! – The actual difference between conditions is .048 – It would be perfectly correct to describe .024 as half the difference between levels, and you could even put a CI around it … it's just less intuitive for your readers ● Both contrasts would account for the same amount of variance ● This is just another case of deciding the scale of a variable – Akin to measuring temperature in C versus F … both account for the same variance, but the numbers are on different scales
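
The scaling argument can be checked directly. This sketch fits the same two-level comparison with [-.5 .5] and with [-1 1] codes on simulated data, using `lm()` instead of `lmer()` to stay in base R. The helper `fit_with_codes()` and the simulated means are hypothetical, introduced only for this illustration.

```r
set.seed(2)

# Simulated two-level data for illustration only (not the study's data).
d <- data.frame(
  location = factor(rep(c("Typical", "Atypical"), each = 30),
                    levels = c("Typical", "Atypical"))
)
d$recall <- rnorm(60, mean = c(0.60, 0.65)[as.integer(d$location)], sd = 0.1)

# Hypothetical helper: refit the model with a given pair of codes
# (first code goes to Typical, second to Atypical) and return the
# coefficient row for the contrast.
fit_with_codes <- function(codes) {
  dd <- d
  contrasts(dd$location) <- matrix(codes, ncol = 1)
  coef(summary(lm(recall ~ location, data = dd)))[2, ]
}

half <- fit_with_codes(c(-0.5, 0.5))  # codes are 1 unit apart
full <- fit_with_codes(c(-1, 1))      # codes are 2 units apart

# Raw difference in condition means (Atypical minus Typical).
diff_in_means <- unname(diff(tapply(d$recall, d$location, mean)))

print(rbind(half = half, full = full))
print(diff_in_means)
```

With [-.5 .5] the estimate equals the difference in means; with [-1 1] both the estimate and its standard error are halved, so the t value and p value are unchanged.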

  28. Imbalanced Designs ● You may have an unequal number of observations per cell – e.g. some data lost, or responses not codable ● Correct for this in your contrast codes if you want things centered – Ask Tuan or Scott about how to do this :)
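
The slides do not say which correction the presenters have in mind. One common approach, shown here purely as an assumption, is to subtract the observation-weighted mean of the codes so that the code column averages to zero in the actual data. The cell counts below are made up.

```r
# Hypothetical unbalanced counts: 40 Typical observations, 20 Atypical.
location <- factor(rep(c("Typical", "Atypical"), times = c(40, 20)),
                   levels = c("Typical", "Atypical"))

codes <- c(Typical = -0.5, Atypical = 0.5)

# With unequal cell sizes, the code column no longer averages to zero.
uncentered_mean <- mean(codes[as.character(location)])

# Assumed correction: subtract the observation-weighted mean of the codes.
# The two codes stay 1 unit apart, so the estimate is still the full
# difference between the levels.
props <- prop.table(table(location))
centered <- codes - sum(props[names(codes)] * codes)

contrasts(location) <- matrix(centered[levels(location)], ncol = 1)
centered_mean <- mean(centered[as.character(location)])

print(centered)
print(c(uncentered_mean = uncentered_mean, centered_mean = centered_mean))
```

Whether weighted centering is appropriate depends on whether the imbalance is meaningful or accidental, so treat this as a starting point rather than the recommended method.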

  29. Outline ● Why use contrast coding? ● Example contrasts ● Contrast estimates ● Contrasts in R ● Multiple comparisons ● How does it work? ● Other kinds of coding ● Interactions

  30. Contrasts in R ● To check what the current contrasts are: – contrasts(YourDataFrame$VariableName) ● To set the contrasts: – contrasts(YourDataFrame$VariableName) = cbind(c(.33,.33,-.66),c(.50,-.50,0)) ● Each c(xx,yy,zz) is the weights for one of the contrasts you want to run ● e.g. (.33, .33, -.66) is one contrast ● After setting contrasts, run lmer model to get the results of the contrasts
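
One detail worth checking before setting contrasts: the weights are matched to the factor's levels in the order given by `levels()`, and `factor()` sorts character values alphabetically by default. The sketch below uses a hypothetical data frame `d`, sets the level order explicitly to match the order of the weights on the slides, and uses exact thirds in place of .33 / -.66. The column names are illustrative labels, not anything from the slides.

```r
# Hypothetical data frame; only the factor matters for this sketch.
d <- data.frame(
  location = c("Fluent", "Typical", "Atypical", "Typical", "Fluent", "Atypical")
)

# Default level order would be alphabetical (Atypical, Fluent, Typical),
# so set it explicitly to match the order of the weights below.
d$location <- factor(d$location, levels = c("Typical", "Atypical", "Fluent"))

# One column per contrast; rows follow levels(d$location).
contrasts(d$location) <- cbind(
  DisfluentVsFluent = c(1/3, 1/3, -2/3),
  TypicalVsAtypical = c(0.5, -0.5, 0)
)

cm <- contrasts(d$location)
print(cm)  # rows are labeled by level, so the mapping can be verified by eye
```

Printing the contrast matrix after setting it is a cheap guard against assigning a weight to the wrong level. A model fitted afterwards reports one coefficient per contrast column.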

  31. Contrasts in R ● Should have j – 1 contrasts, where j = # of levels of the factor ● If using a subset of data, some levels of the factor may no longer be present – e.g. you dropped a condition – But, R still “remembers” that these levels exist and will get mad you didn't specify enough contrasts – Fix this by reconverting to a factor: ● YourDataFrame$Variable = factor(YourDataFrame$Variable)
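
The "remembered" levels are easy to see in a small example. The data frame below is made up; the point is only that subsetting keeps unused levels until the factor is rebuilt.

```r
# Made-up data for illustration.
d <- data.frame(
  location = factor(c("Typical", "Atypical", "Fluent", "Typical", "Atypical", "Fluent")),
  recall   = c(0.7, 0.6, 0.5, 0.8, 0.7, 0.6)
)

# Drop the Fluent condition; the factor still carries all three levels.
sub <- d[d$location != "Fluent", ]
levels_before <- nlevels(sub$location)

# Rebuild the factor so that only the levels actually present remain.
# droplevels(sub$location) is an equivalent base-R call.
sub$location <- factor(sub$location)
levels_after <- nlevels(sub$location)

# Now a single contrast (j - 1 = 1) is the right number.
# Level order here is alphabetical: Atypical = -.5, Typical = .5.
contrasts(sub$location) <- matrix(c(-0.5, 0.5), ncol = 1)

print(c(before = levels_before, after = levels_after))
```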

  32. Another R Tip ● To see the mean of each level of an I.V.: – tapply(YourDataFrame$DVName, YourDataFrame$IVName,mean) – Could also do median, sd, etc. ● For a 2-way (or more!) table – tapply(YourDataFrame$DVName, list(YourDataFrame$IVName1, YourDataFrame$IVName2), mean) ● Doesn't work if you have missing values – But Tuan has made a version of tapply that fixes this problem
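
For the missing-values case, one base-R option is to pass `na.rm = TRUE` through `tapply()` to the summary function. This is not necessarily what Tuan's version does; it is simply a sketch of a built-in alternative, using a made-up data frame with one missing recall score.

```r
# Made-up data with one missing value.
d <- data.frame(
  recall   = c(0.70, NA, 0.60, 0.50, 0.80, 0.40),
  location = c("Typical", "Typical", "Atypical", "Fluent", "Typical", "Fluent"),
  age      = c("Younger", "Older", "Younger", "Older", "Older", "Younger")
)

# Without na.rm, any level containing an NA gets an NA mean.
means_default <- tapply(d$recall, d$location, mean)

# Extra arguments after the function are passed on to it.
means_na_rm <- tapply(d$recall, d$location, mean, na.rm = TRUE)

# Two-way table of means; combinations with no observations come out as NA.
two_way <- tapply(d$recall, list(d$location, d$age), mean, na.rm = TRUE)

print(means_default)
print(means_na_rm)
print(two_way)
```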

  33. Outline ● Why use contrast coding? ● Example contrasts ● Contrast estimates ● Contrasts in R ● Multiple comparisons ● How does it work? ● Other kinds of coding ● Interactions

  34. Multiple Comparisons (Here Comes Trouble!)

  35. Multiple Comparisons ● Lots of comparisons you can run ● Suppose we tested both young & older adults on the disfluency task: [Six cells: ATYPICAL / YOUNGER, FLUENT / YOUNGER, TYPICAL / YOUNGER, ATYPICAL / OLDER, FLUENT / OLDER, TYPICAL / OLDER]
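
To get a feel for how quickly the comparisons pile up, the six cells above can be enumerated and paired off. This is only a counting sketch; it says nothing about which comparisons are worth running.

```r
# The six location-by-age cells from the slide.
cells <- as.vector(outer(c("Atypical", "Fluent", "Typical"),
                         c("Younger", "Older"),
                         paste, sep = " / "))

# Every possible pairwise comparison between two cells (one pair per column).
pairs <- combn(cells, 2)
n_pairs <- ncol(pairs)

print(cells)
print(n_pairs)
```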
