Course Business Midterm grades next week Two new datasets on - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Course Business

  • Midterm grades next week
  • Two new datasets on CourseWeb: vocab.csv and relationship.csv
  • Also, packages to download: languageR and lattice
  • Four lectures to go
  • Today: Specialized designs
  • Then: Troubleshooting & data management
  • Switch: Missing data next week & statistical power the following week

slide-2
SLIDE 2

Week 10: Longitudinal Designs

  • Overview
  • Growth Curve Analysis
      • Main Effect
      • Random Slopes
      • Other Variables
      • Quadratic & Higher Degrees
  • Autocorrelation
      • Testing for Autocorrelation
      • Visualizing Autocorrelation
      • Direction & Lag
      • Why Does Autocorrelation Matter?
  • Cross-Lagged Models
      • Cross-Lagged Models
      • Establishing Causality

slide-3
SLIDE 3

Distributed Practice

  • Elika is running an experiment in which subjects envision themselves in a number of hypothetical dating scenarios (items) and rate their relationship satisfaction in each scenario. Elika is interested both in features of the scenarios and in how those features may interact with the participants’ gender. Her initial model, with a maximal random effects structure, is:

model1 <- lmer(Rating ~ 1 +
    SubjectGender * PhysicalIntimacy * EmotionalIntimacy +
    (1 + PhysicalIntimacy * EmotionalIntimacy|Subject) +
    (1 + SubjectGender|Item), data=dating)

  • Unfortunately, this model did not converge. What could Elika do as her next step?

slide-4
SLIDE 4

Distributed Practice

  • Elika’s maximal model did not converge. What could she do as her next step?
  • Centering the variables / using effects coding (if justified by the research question):

model1 <- lmer(Rating ~ 1 +
    SubjectGender * PhysicalIntimacy.cen * EmotionalIntimacy.cen +
    (1 + PhysicalIntimacy.cen * EmotionalIntimacy.cen|Subject) +
    (1 + SubjectGender|Item), data=dating)
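The centering step can be sketched in base R as follows. The toy data frame and its values are invented for illustration (the real dating data are not shown in the slides); only the `.cen` naming follows the model above. Sum-to-zero contrasts via `contr.sum()` are one common way to get effects coding, assuming SubjectGender is a two-level factor.

```r
# Hypothetical toy data; the actual 'dating' data frame is not shown in the slides.
dating <- data.frame(
  SubjectGender     = factor(c("F", "M", "F", "M")),
  PhysicalIntimacy  = c(1, 3, 5, 7),
  EmotionalIntimacy = c(2, 2, 6, 6)
)

# Mean-center each continuous predictor so that 0 corresponds to the average.
dating$PhysicalIntimacy.cen  <- dating$PhysicalIntimacy  - mean(dating$PhysicalIntimacy)
dating$EmotionalIntimacy.cen <- dating$EmotionalIntimacy - mean(dating$EmotionalIntimacy)

# Effects (sum-to-zero) coding for the two-level gender factor.
contrasts(dating$SubjectGender) <- contr.sum(2)

print(dating)
print(contrasts(dating$SubjectGender))
```

Centered predictors have a mean of zero, which often reduces the correlation between main effects and their interactions and can make the optimizer's job easier.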

slide-5
SLIDE 5

Distributed Practice

  • Elika’s maximal model did not converge. What could she do as her next step?
  • Try taking out the correlation parameters by using || instead of |:

model1 <- lmer(Rating ~ 1 +
    SubjectGender * PhysicalIntimacy.cen * EmotionalIntimacy.cen +
    (1 + PhysicalIntimacy.cen * EmotionalIntimacy.cen||Subject) +
    (1 + SubjectGender||Item), data=dating)

slide-6
SLIDE 6

Distributed Practice

  • Elika’s maximal model did not converge. What could she do as her next step?
  • Variance across items is often smaller. Try removing the slope by items:

model1 <- lmer(Rating ~ 1 +
    SubjectGender * PhysicalIntimacy.cen * EmotionalIntimacy.cen +
    (1 + PhysicalIntimacy.cen * EmotionalIntimacy.cen|Subject) +
    (1|Item), data=dating)

  • And, use anova() to compare this model, which has only random intercepts by item, against the model with the item slope, to verify that the item slope does not contribute to model fit

slide-7
SLIDE 7

Distributed Practice

  • Elika’s maximal model did not converge. What could she do as her next step?
  • Could just add more iterations (but this probably won’t be helpful):

model1 <- lmer(Rating ~ 1 +
    SubjectGender * PhysicalIntimacy.cen * EmotionalIntimacy.cen +
    (1 + PhysicalIntimacy.cen * EmotionalIntimacy.cen|Subject) +
    (1|Item), data=dating,
    control=lmerControl(optCtrl=list(maxfun=20000)))


slide-9
SLIDE 9

Longitudinal Designs

Time

slide-10
SLIDE 10

Longitudinal Designs

  • There are many methods available for analyzing longitudinal data
  • Aidan Wright’s class on Applied Longitudinal Data Analysis
  • Mixed-effects models are one effective solution
  • Hierarchical models naturally account for longitudinal data
  • Include all of the other features we’ve looked at (multiple random effects, mix of categorical & continuous variables, non-normal DVs, etc.)

slide-11
SLIDE 11

Vocabulary Size at 2 Years Old

[Diagram: sampled NEIGHBORHOODS (Level 2: Neighborhood 1, Neighborhood 2) with sampled CHILDREN (Level 1: Child 1 to Child 4) nested inside them]

  • What kind of random-effects structure is this?

slide-12
SLIDE 12

Vocabulary Size at 2 Years Old

[Diagram: sampled NEIGHBORHOODS (Level 2: Neighborhood 1, Neighborhood 2) with sampled CHILDREN (Level 1: Child 1 to Child 4) nested inside them]

  • What kind of random-effects structure is this?
  • Two levels of nesting: sample neighborhoods, then sample children inside each neighborhood

slide-13
SLIDE 13

vocab.csv: Vocabulary Size in 2nd Year of Life

[Diagram: sampled NEIGHBORHOODS (Level 2) and sampled CHILDREN (Level 1), now with repeated assessments (Assessment 1, Assessment 2) shown under each child]

  • Now imagine we observed each child several different times
  • e.g., every month over the course of a year

slide-14
SLIDE 14

vocab.csv: Vocabulary Size in 2nd Year of Life

[Diagram: sampled NEIGHBORHOODS (Level 3), sampled CHILDREN (Level 2), and sampled TIME POINTS (Level 1: Assessment 1, Assessment 2 for each child)]

  • This is just another level of nesting
  • Sample neighborhoods
  • Sample children within each neighborhood
  • Sample time points within each child

slide-15
SLIDE 15

Longitudinal Designs

  • Two big questions mixed-effects models can help us answer about longitudinal data:
      1) What is the overall trajectory of change across time?
      2) How does an observation at one time point relate to the next time point?


slide-18
SLIDE 18

Time as a Predictor Variable

  • How does Time affect vocabulary?
  • The simple way to answer this question: Add Time as a variable in our model

  • Nothing “special” about time as a predictor
slide-19
SLIDE 19

Time as a Predictor Variable

  • Let’s add the effect of Time to our model
  • Here: Months since the study started (at 18 mos.)
  • Fixed effect because we’re interested in time effects
  • model1 <- lmer(VocabWords ~ 1 + Time + (1|Child) + (1|Neighborhood), data=vocab)
  • We need to account for the nested random effects structure … can you add appropriate random intercepts?
  • Tip #1: There are two levels of nesting here
  • Tip #2: Individual observations are nested within children, and children are nested within neighborhoods

  • Tip #3: Include both Child and Neighborhood differences
slide-20
SLIDE 20

Time as a Predictor Variable

  • Let’s add the effect of Time to our model
  • Here: Months since the study started (at 18 mos.)
  • Fixed effect because we’re interested in time effects
  • model1 <- lmer(VocabWords ~ 1 + Time + (1|Child) + (1|Neighborhood), data=vocab)

  • How would you interpret the two estimates?
  • Intercept:
  • Slope:
slide-21
SLIDE 21

Time as a Predictor Variable

  • Let’s add the effect of Time to our model
  • Here: Months since the study started (at 18 mos.)
  • Fixed effect because we’re interested in time effects
  • model1 <- lmer(VocabWords ~ 1 + Time + (1|Child) + (1|Neighborhood), data=vocab)
  • How would you interpret the two estimates?
  • Intercept: Average vocab knowledge of ~4 words at 18 months (Time = 0)
  • Slope: Gain of ~55 words per month
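As a worked illustration of that interpretation, the two fixed-effect estimates imply a straight-line prediction. The sketch below uses the rounded values quoted on the slide (intercept of about 4, slope of about 55); the particular time points are chosen arbitrarily for illustration.

```r
# Rounded fixed-effect estimates as quoted on the slide.
intercept <- 4    # approx. vocabulary at Time = 0 (i.e., at 18 months)
slope     <- 55   # approx. words gained per month

# Illustrative time points (months since the study started); an assumption.
months <- c(0, 1, 6)

# Linear prediction: intercept + slope * Time
predicted <- intercept + slope * months

print(data.frame(Time = months, PredictedWords = predicted))
```

The prediction at Time = 0 is just the intercept, and each additional month adds one slope's worth of words.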
slide-22
SLIDE 22

Time as a Predictor Variable

  • Not necessary to have every time point represented
  • Dependent variable should be on same scale across time points for this to be meaningful
  • Time units don’t matter as long as they’re consistent
  • Could be hours, days, years …

slide-24
SLIDE 24

Longitudinal Data: Random Slopes

[Diagram: sampled NEIGHBORHOODS (Level 3), sampled CHILDREN (Level 2), and sampled TIME POINTS (Level 1), with children annotated as having different growth rates (Growth Rate 1, Growth Rate 2)]

  • So far, we assume the same growth rate for all kids
  • Almost certainly not true!
  • At level 2, we’re sampling kids with both different starting points (intercepts) and different growth rates (slopes)

slide-25
SLIDE 25

Longitudinal Data: Random Slopes

RANDOM INTERCEPTS MODEL: Kids vary in starting point, but all acquire vocabulary at the same rate over this period

WITH RANDOM SLOPES: Allows rate of vocab acquisition to vary across kids (as well as intercept)

slide-26
SLIDE 26

Longitudinal Data: Random Slopes

[Diagram: sampled NEIGHBORHOODS (Level 3), sampled CHILDREN (Level 2), and sampled TIME POINTS (Level 1), with children annotated as having different growth rates (Growth Rate 1, Growth Rate 2)]

  • Can you update the model to allow the Time effect to be different for each Child?
  • Tip 1: This involves some type of random slope…
  • Tip 2: We want a random slope of Time by Child

slide-27
SLIDE 27

Longitudinal Data: Random Slopes

  • model.Slope <- lmer(VocabWords ~ 1 + Time + (1+Time|Child) + (1|Neighborhood), data=vocab)

In fact, LOTS of variability in the Time slope: the SD is 20 words per month! The mean slope is 53 words/mo, but some kids might have a slope of 73 or 33
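The 73 and 33 figures are simply the mean slope plus or minus one standard deviation of the random slopes. A small sketch of that arithmetic follows; the two-SD range and the normal-distribution share are added for illustration and rest on the usual mixed-model assumption that random slopes are normally distributed.

```r
# Values quoted on the slide.
mean_slope <- 53   # average growth in words per month
sd_slope   <- 20   # SD of the by-child random slopes

# Slopes one SD below and above the mean (the 33 and 73 on the slide).
one_sd_range <- mean_slope + c(-1, 1) * sd_slope

# Two SDs out, for illustration (not on the slide).
two_sd_range <- mean_slope + c(-2, 2) * sd_slope

# Share of children expected within one SD, assuming normally distributed slopes.
share_within_1sd <- pnorm(1) - pnorm(-1)

print(one_sd_range)
print(two_sd_range)
print(round(share_within_1sd, 3))
```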

slide-28
SLIDE 28

Longitudinal Data: Random Slopes

  • Would also be possible to have a random slope of Time by Neighborhood
  • If there’s clustering of growth rates at the neighborhood level
  • model.TwoSlopes <- lmer(VocabWords ~ 1 + Time + (1+Time|Child) + (1+Time|Neighborhood), data=vocab)
  • Is there any evidence for this clustering?
  • anova(model.Slope, model.TwoSlopes)
  • χ²(2) = 0.61, p = .74, n.s.
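The reported p-value can be checked by hand from the chi-square statistic and its degrees of freedom. This sketch uses only the numbers on the slide; the explanation of why df = 2 (one added slope variance plus one added intercept-slope covariance at the neighborhood level) is an inference from the two model formulas rather than something the slide states.

```r
# Likelihood-ratio test statistic and degrees of freedom reported on the slide.
chisq_stat <- 0.61
df         <- 2

# Upper-tail probability of the chi-square distribution.
p_value <- pchisq(chisq_stat, df = df, lower.tail = FALSE)

print(round(p_value, 2))
```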


slide-30
SLIDE 30

Other Variables

  • We may want to include other variables in a longitudinal model:
  • Do parents frequently read picture books to the child?
  • Time + Reading
      • Effect of Reading invariant across time
      • Can only affect the intercept (parallel lines)
  • Time * Reading
      • Effect of Reading varies with time
      • Can affect intercept & slope
slide-31
SLIDE 31

Other Variables: Results

  • Fixed-effect results with the interaction:
  • Parental reading doesn’t affect vocab at time 0
  • But, results in faster vocab growth (amplifies + Time effect)
  • Growth rate for “No” group: 50.040 words / month
  • Growth rate for “Yes” group: 50.040 + 6.703 = 56.743 words / month

e.g., Huttenlocher et al., 1991
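The growth-rate arithmetic can be written out as a short calculation. The coefficients are the ones quoted on the slide; the 6-month horizon is an arbitrary choice for illustration, and the gap calculation takes the slide's statement that reading has no effect at time 0 at face value.

```r
# Coefficients quoted on the slide.
time_slope_no   <- 50.040   # Time slope for the "No" reading group (words/month)
interaction_est <- 6.703    # Time x Reading interaction: extra words/month for "Yes"

# Growth rate for the "Yes" group = baseline slope + interaction.
time_slope_yes <- time_slope_no + interaction_est

# Illustrative horizon (an assumption): expected gap between groups after 6 months,
# treating the groups as equal at Time = 0.
months    <- 6
gap_words <- interaction_est * months

print(time_slope_yes)
print(gap_words)
```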

slide-32
SLIDE 32

Other Variables

  • Can be either:
  • Time-Invariant Predictor: Same across all time points within a subject
      • e.g., race/ethnicity
      • Level 2 or Level 3 variables
  • Time-Varying Predictor: Varies even within a subject, from one time point to another
      • e.g., hours of sleep
      • Level-1 variable

[Diagram: sampled NEIGHBORHOODS (Level 3), sampled CHILDREN (Level 2), and sampled TIME POINTS (Level 1)]

  • Since R automatically figures out what’s a level-1 vs. level-2 variable, we don’t have to do anything special for either kind of variable


slide-34
SLIDE 34

Quadratic & Higher Degrees

  • We’ve been assuming a linear effect of time
  • But, it looks like vocab growth may accelerate
  • Growth between +2 mo. and +4 mo. is much smaller than growth between +6 mo. and +8 mo.
  • Suggests a curve / quadratic equation
slide-35
SLIDE 35

Quadratic & Higher Degrees

  • Add quadratic effect (Time^2):
  • model.poly <- lmer(VocabWords ~ 1 + poly(Time,degree=2,raw=TRUE) + (1 + poly(Time,degree=2,raw=TRUE)|Child) + (1|Neighborhood), data=vocab)
  • degree=2 because we want Time^2
  • raw=TRUE to keep the original scale of the variables (time measured in months)
  • poly() automatically adds lower-order terms as well
  • i.e., the linear term (Time)
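To see what poly() hands to the model, it can be called on its own. The sketch below uses a made-up Time vector (0 to 6 months) and compares raw=TRUE, which returns Time and Time^2 on their original scale, with the default orthogonal polynomials, whose columns are rescaled to be uncorrelated.

```r
# Hypothetical time points in months; not taken from vocab.csv.
Time <- 0:6

# Raw polynomial terms: column 1 is Time, column 2 is Time^2.
raw_terms <- poly(Time, degree = 2, raw = TRUE)

# Default (raw = FALSE): orthogonal polynomials on a different scale.
orth_terms <- poly(Time, degree = 2)

print(cbind(Time, raw_terms))
print(round(orth_terms, 3))
```

With raw=TRUE the coefficients are in words per month and words per month squared, which is what makes the implied equation directly usable for prediction.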
slide-36
SLIDE 36

Quadratic & Higher Degrees

  • Results: [model output, annotated with Intercept, Linear term, and Quadratic term]
  • Implied equation (approximate):
  • VocabWords = 40 + 11*Time + 7*Time^2
  • What are predicted values if…
  • Time=0?
  • Time=1?
  • Time=2?

slide-37
SLIDE 37

Quadratic & Higher Degrees

  • Results: [model output, annotated with Intercept, Linear term, and Quadratic term]
  • Implied equation (approximate):
  • VocabWords = 40 + 11*Time + 7*Time^2
  • What are predicted values if…
  • Time=0? VocabWords = 40 + (11*0) + (7*0^2) = 40
  • Time=1? VocabWords = 40 + (11*1) + (7*1^2) = 58
  • Time=2? VocabWords = 40 + (11*2) + (7*2^2) = 90
  • Vocab growth is accelerating (larger change from time 1 to time 2 than from time 0 to time 1)
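The same predictions can be reproduced with a small helper function. The function name is made up for illustration; the coefficients are the approximate ones from the implied equation.

```r
# Approximate implied equation from the slide: 40 + 11*Time + 7*Time^2.
# 'predict_vocab' is a hypothetical helper name.
predict_vocab <- function(time) {
  40 + 11 * time + 7 * time^2
}

predicted <- predict_vocab(0:2)

# Month-to-month change; increasing differences indicate accelerating growth.
monthly_change <- diff(predicted)

print(predicted)
print(monthly_change)
```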

slide-38
SLIDE 38

Different patterns of quadratic & linear effects describe different curves:

  • POSITIVE QUADRATIC TREND & NO LINEAR TREND
  • NEGATIVE QUADRATIC TREND & NO LINEAR TREND
  • POSITIVE QUADRATIC TREND & POSITIVE LINEAR TREND
  • NEGATIVE QUADRATIC TREND & POSITIVE LINEAR TREND

slide-39
SLIDE 39

Quadratic & Higher Degrees

  • The Yerkes-Dodson law (Yerkes & Dodson, 1908) describes the optimal level of physiological arousal to perform a task: Low arousal results in poor performance because you are not alert enough, medium arousal results in strong performance, and high arousal results in poor performance because you are too anxious.
  • Linear trend:
  • Quadratic trend:

slide-40
SLIDE 40

Quadratic & Higher Degrees

  • The Yerkes-Dodson law (Yerkes & Dodson, 1908) describes the optimal level of physiological arousal to perform a task: Low arousal results in poor performance because you are not alert enough, medium arousal results in strong performance, and high arousal results in poor performance because you are too anxious.
  • Linear trend: NONE
  • Quadratic trend: NEGATIVE

slide-41
SLIDE 41

Quadratic & Higher Degrees

  • In adulthood, working memory declines the older you get (Park et al., 2002).
  • Linear trend:
  • Quadratic trend:

slide-42
SLIDE 42

Quadratic & Higher Degrees

  • In adulthood, working memory declines the older you get (Park et al., 2002).
  • Linear trend: NEGATIVE
  • Quadratic trend: NONE

slide-43
SLIDE 43

Quadratic & Higher Degrees

  • Aphasia is a neuropsychological disorder that disrupts speech, often resulting from a stroke. After initially acquiring aphasia, patients often experience rapid recovery in much of their language performance (dependent variable) as time (independent variable) increases. But, this recovery eventually slows down, and language performance doesn’t get much better no matter how much more time passes (e.g., Demeurisse et al., 1980).
  • Linear trend:
  • Quadratic trend:

slide-44
SLIDE 44

Quadratic & Higher Degrees

  • Aphasia is a neuropsychological disorder that disrupts speech, often resulting from a stroke. After initially acquiring aphasia, patients often experience rapid recovery in much of their language performance (dependent variable) as time (independent variable) increases. But, this recovery eventually slows down, and language performance doesn’t get much better no matter how much more time passes (e.g., Demeurisse et al., 1980).
  • Linear trend: POSITIVE
  • Quadratic trend: NEGATIVE

slide-45
SLIDE 45

Quadratic & Higher Degrees

  • Studies of practice and expertise (e.g., Logan, 1988) show that people learning to do a task, such as arithmetic, initially show a quick decrease in response time (dependent variable) as the amount of practice increases. However, eventually they hit the point where the task can’t possibly be done any faster, and response time reaches an asymptote and stops decreasing.
  • Linear trend:
  • Quadratic trend:

slide-46
SLIDE 46

Quadratic & Higher Degrees

  • Studies of practice and expertise (e.g., Logan, 1988) show that people learning to do a task, such as arithmetic, initially show a quick decrease in response time (dependent variable) as the amount of practice increases. However, eventually they hit the point where the task can’t possibly be done any faster, and response time reaches an asymptote and stops decreasing.
  • Linear trend: NEGATIVE
  • Quadratic trend: POSITIVE

slide-47
SLIDE 47

Quadratic & Higher Degrees

  • Could go up to even higher degrees (Time^3, Time^4…)
  • degree=3 if highest exponent is 3
  • Degree minus 1 = Number of bends in the curve

[Figure: three panels showing x^1 (0 bends), x^2 (1 bend), and x^3 (2 bends)]

slide-48
SLIDE 48

Quadratic & Higher Degrees

  • Maximum degree of polynomial: # of time points minus 1
  • Example: 2 time points perfectly fit by a line (degree 1). Nothing left for a quadratic term to explain.
  • But, don’t want to overfit
  • Probably not the case that the real underlying (population) trajectory has 6 bends in it
  • What degree should we include?
  • Theoretical considerations
  • If comparing conditions, look at mean trajectory across conditions (Mirman et al., 2008)
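The "time points minus 1" limit can be seen with ordinary regression on a tiny made-up example: a line reproduces 2 points exactly, and a quadratic reproduces 3 points exactly, leaving no residual for any higher term to explain. All values below are hypothetical, and lm() is used instead of lmer() only to keep the sketch self-contained.

```r
# Hypothetical scores; chosen arbitrarily for illustration.
time2  <- c(0, 1)
score2 <- c(10, 30)
time3  <- c(0, 1, 2)
score3 <- c(10, 30, 35)

# Two time points: a straight line (degree 1) fits perfectly.
fit_line <- lm(score2 ~ time2)

# Three time points: a quadratic (degree 2) fits perfectly.
fit_quad <- lm(score3 ~ poly(time3, degree = 2, raw = TRUE))

# Largest absolute residual in each fit; both are zero up to rounding error.
max_resid_line <- max(abs(resid(fit_line)))
max_resid_quad <- max(abs(resid(fit_quad)))

print(c(line = max_resid_line, quadratic = max_resid_quad))
```

A perfect fit here says nothing about the population trajectory, which is why the slide warns against overfitting.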


slide-51
SLIDE 51

Autocorrelation

  • So far, we’ve looked at general trajectory across time
  • e.g., kids who are 24 months old have larger vocabs than kids who are 18 months old
  • But, there may be relations among specific observations
  • Does reading to your kids a lot one month increase their vocab growth the next month?
  • Does being in a positive mood one day carry over to the next day?
  • Does purchasing a particular brand once make you more likely to purchase it again?

slide-52
SLIDE 52

relationship.csv

  • One member of a dating couple rates their warmth towards the partner
  • For each of 10 consecutive days
  • First, let’s see if WarmthToday consistently increases or decreases across the 10 Days of the study
  • Remember to account for nesting within couples
  • model.time <- lmer(WarmthToday ~ 1 + Day + (1|Couple), data=relationship)
  • Or:
  • model.time <- lmer(WarmthToday ~ 1 + Day + (1 + Day|Couple), data=relationship)

slide-53
SLIDE 53

relationship.csv

  • No linear increase or decrease in warmth over the 10 days
  • Makes sense … this is a relatively short timescale
  • Check out the descriptives
  • tapply(relationship$WarmthToday, relationship$Day, mean)

slide-54
SLIDE 54

Autocorrelation

  • We found no overall increase/decrease
  • Nevertheless, succeeding days might be more similar, even if there is no overall trend

[Figure: timeline of Days 1 to 5, labeled "LINEAR TREND"]

slide-59
SLIDE 59

Autocorrelation

  • We found no overall increase/decrease
  • Nevertheless, succeeding days might be more similar, even if there is no overall trend
  • If you have warm feelings towards your partner one day, maybe warmer the next, too
  • If you have negative feelings, maybe less warm the next day

[Figure: timeline of days, with the first two days marked "+"]

slide-60
SLIDE 60

Autocorrelation

  • head(relationship, n=10)

Couple 01

slide-61
SLIDE 61

Autocorrelation

  • These are examples of autocorrelation

Couple 01 Couple 02

slide-62
SLIDE 62

Autocorrelation

"Maybe in order to understand mankind, we have to look at that word itself: MANKIND. Basically, it's made up of two separate words, 'mank' and 'ind.' What do these words mean? It's a mystery, and that's why so is mankind.”

- Jack Handey
slide-63
SLIDE 63

Autocorrelation

To understand autocorrelation, we have to look at the two separate words that make it up: auto (self) + correlation

  • Autocorrelation refers to a variable correlating with itself, over time
  • Examples:
  • Positive mood in the morning → Positive mood in the afternoon
  • RT on the previous trial → RT on this trial due to waxing & waning of attention
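"Correlating with itself" can be made concrete by correlating a series with a copy of itself shifted by one time point. The sketch below simulates a hypothetical mood series in which each value carries over part of the previous one; the carry-over strength of 0.6 and the series length are arbitrary assumptions, not values from the lecture.

```r
set.seed(1)

# Simulate a hypothetical series with carry-over from one time point to the next.
n    <- 200
mood <- numeric(n)
mood[1] <- rnorm(1)
for (t in 2:n) {
  mood[t] <- 0.6 * mood[t - 1] + rnorm(1)   # 0.6 = assumed carry-over strength
}

# Lag-1 autocorrelation: correlate each value with the one that follows it.
earlier  <- mood[-n]   # time points 1 .. n-1
later    <- mood[-1]   # time points 2 .. n
lag1_cor <- cor(earlier, later)

print(lag1_cor)
```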


slide-66
SLIDE 66

Testing for Autocorrelation

  • Is there autocorrelation in relationship warmth?
  • We want to test whether yesterday’s warmth predicts today’s warmth
  • So, we need to add WarmthYesterday as a variable
  • With package languageR…
  • relationship$WarmthYesterday <- lags.fnc(relationship, time='Day', depvar='WarmthToday', group='Couple', lag=1)
  • For each observation, get the value of WarmthToday from 1 day previous and store in WarmthYesterday
  • Arguments: first, the dataframe name; time = variable that identifies the time point; depvar = dependent variable; group = variable with the level-2 grouping factor (e.g., subjects or schools)

Baayen & Milin, 2010

slide-67
SLIDE 67

Testing for Autocorrelation

  • Is there autocorrelation in relationship warmth?
  • We want to test whether yesterday’s warmth predicts today’s warmth
  • So, we need to add WarmthYesterday as a variable
  • head(relationship, n=13)
  • Day 1 for each couple gets the couple’s mean

Baayen & Milin, 2010
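For intuition about what the lagged variable contains, here is a base-R sketch that builds a similar column without languageR. It is not the lags.fnc() implementation; it only mimics the behavior described above (previous day's value within each couple, with Day 1 filled by the couple's mean). The toy data and the helper name `lag_within` are made up for illustration.

```r
# Hypothetical toy data: 2 couples x 4 days (the real relationship.csv has 10 days).
relationship <- data.frame(
  Couple      = rep(c("C01", "C02"), each = 4),
  Day         = rep(1:4, times = 2),
  WarmthToday = c(5, 6, 4, 5,   2, 3, 3, 4)
)

# Make sure rows are in time order within each couple before lagging.
relationship <- relationship[order(relationship$Couple, relationship$Day), ]

# Hypothetical helper: shift values down by one; fill the first slot with the mean.
lag_within <- function(x) c(mean(x), head(x, -1))

# Apply the shift separately within each couple.
relationship$WarmthYesterday <- ave(relationship$WarmthToday,
                                    relationship$Couple,
                                    FUN = lag_within)

print(relationship)
```

Lagging within couples matters: without the grouping, Day 1 of one couple would inherit the last day of the previous couple.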

slide-68
SLIDE 68

Testing for Autocorrelation

  • Is there autocorrelation in relationship warmth?
  • Now, include WarmthYesterday in our model
  • model.auto <- lmer(WarmthToday ~ 1 + Day + WarmthYesterday + (1 + Day + WarmthYesterday|Couple), data=relationship)

slide-69
SLIDE 69

Testing for Autocorrelation

  • Non-significant linear effect: No consistent increase/decrease in warmth over these 10 days
  • But, significant autocorrelation: Adjacent days are more similar in warmth towards partner
  • Random walk

[Figure: timeline of Days 1 to 5, with "LINEAR TREND" crossed out]


slide-72
SLIDE 72

Visualization of Autocorrelation

  • Visualization of the autocorrelation
  • With packages languageR and lattice:
  • acf.fnc(xxx, time='xxx ', x=' ', group=' ')
  • Arguments: first, the dataframe name; time = variable that identifies the time point; x = dependent variable; group = variable with the level-2 grouping factor (e.g., subjects or schools)

Baayen & Milin, 2010

slide-73
SLIDE 73

Visualization of Autocorrelation

  • Visualization of the autocorrelation
  • With packages languageR and lattice:
  • acf.fnc(relationship, time='Day', x='WarmthToday', group='Couple')
  • Arguments: first, the dataframe name; time = variable that identifies the time point; x = dependent variable; group = variable with the level-2 grouping factor (e.g., subjects or schools)

Baayen & Milin, 2010

slide-74
SLIDE 74

[Figure: autocorrelation function (Acf, from -0.5 to 1.0) plotted against Lag (0 to 8), one panel per couple, Couple01 to Couple40]

  • One box per couple
  • Height of the second line indicates strength of autocorrelation for that couple


slide-76
SLIDE 76

Direction of Autocorrelation

  • Positive autocorrelation: Having more of something at time t means you have more of it at time t+1
      • More warmth on day 3 → More warmth on day 4
      • Longer RT on one trial → Longer RT on the next trial
  • Negative autocorrelation: Having more at time t means you have less at t+1
      • e.g., homeostatic processes
      • Very hungry at time t → Get something to eat → Not hungry at time t+1
      • Stock prices: Price increases → People sell the stock → Price decreases
  • Both analyzed the same way: just look at the sign to understand the effect
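The sign of the lag-1 autocorrelation can be illustrated with two made-up series and base R's acf(), used here as a stand-in for the languageR tools in the lecture. One series flips between high and low at every step (like hunger before and after eating); the other stays high or low for several steps in a row.

```r
# Hypothetical series: alternates high/low at every time point.
alternating <- rep(c(1, -1), times = 10)

# Hypothetical series: stays high for 5 time points, then low for 5, and so on.
persistent <- rep(rep(c(1, -1), each = 5), times = 2)

# acf() returns lag 0 first, so element 2 is the lag-1 autocorrelation.
lag1_alternating <- acf(alternating, lag.max = 1, plot = FALSE)$acf[2, 1, 1]
lag1_persistent  <- acf(persistent,  lag.max = 1, plot = FALSE)$acf[2, 1, 1]

print(c(alternating = lag1_alternating, persistent = lag1_persistent))
```

The alternating series yields a negative value and the persistent series a positive one; the code path is identical, and only the sign differs.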

slide-77
SLIDE 77

[Figure: per-couple autocorrelation plot (Acf by Lag, Couple01 to Couple40), with one bar highlighted]

Positive autocorrelation. This bar would be < 0 if it were a negative autocorrelation.

slide-78
SLIDE 78

Autocorrelation: Lag

  • Autocorrelation can happen at different time scales or lags
  • Most common is lag 1: an observation correlates with the next one

[Figure: timeline of time points 1 to 5]

slide-79
SLIDE 79

Autocorrelation: Lag

  • Autocorrelation can happen at different time scales or lags
  • Most common is lag 1: an observation correlates with the next one
  • Lag 2: an observation correlates not with the next observation, but the one two time points later
  • Like the (false) idea that twins “skip a generation”
  • Effects that recur, but not immediately (e.g., earthquakes)

[Figure: timeline of time points 1 to 5]

slide-80
SLIDE 80

Autocorrelation: Lag

  • Autocorrelation can happen at different time scales or lags
  • Most common is lag 1: an observation correlates with the next one
  • Lag 2: an observation correlates not with the next observation, but the one two time points later
  • Like the (false) idea that twins “skip a generation”
  • Lag 3, Lag 4, etc…
  • Mood might have a lag 7 autocorrelation: weekly change (sad Monday, happy Friday)
  • But, be careful with autocorrelations at lags > 1: is there a theoretically plausible reason to expect them?
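The weekly-mood idea can be sketched with a made-up series that repeats every 7 days. Base R's acf() is used as a stand-in for the per-couple plots in the lecture; the mood values are invented (low on Monday, peaking on Friday) and carry no empirical meaning.

```r
# Hypothetical daily mood over 8 weeks, Monday to Sunday, repeating each week.
weekly_pattern <- c(1, 2, 3, 4, 5, 4, 3)
mood <- rep(weekly_pattern, times = 8)

# Autocorrelations for lags 0 through 7; element k + 1 holds lag k.
acf_values <- acf(mood, lag.max = 7, plot = FALSE)$acf[, 1, 1]
names(acf_values) <- paste0("lag", 0:7)

print(round(acf_values, 3))
```

Lag 0 is the series correlated with itself, lag 7 is strongly positive because each day resembles the same weekday one week earlier, and intermediate lags such as lag 3 are much lower.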

slide-81
SLIDE 81

[Figure: autocorrelation plot with lines labeled Lag 1 autocorrelation, Lag 2 autocorrelation, and Lag 3 autocorrelation]

Leftmost line is lag 0, i.e., the correlation of an observation with itself. This will always be 1. It’s shown for purposes of comparison.

slide-82
SLIDE 82

[Figure: autocorrelation (Acf) plotted against Lag, one panel per couple (Couple01 to Couple40); x-axis 0 to 8, y-axis -0.5 to 1.0]

slide-83
SLIDE 83

Testing Autocorrelation, continued

  • Does our lag 1 autocorrelation account for the sequential dependency?
  • Examine the autocorrelation of the model residuals
  • relationship$resid <- resid(model.auto)
  • Now run acf.fnc() on resid rather than WarmthToday

Baayen & Milin, 2010
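
The residual check can be illustrated with a small simulation. This is only a sketch under stated assumptions: the series `y` is simulated with a known lag 1 dependency, and plain `lm()` stands in for the mixed model (`model.auto`) so that the example runs with base R alone.

```r
# Hypothetical series with a true lag 1 autocorrelation of 0.6
set.seed(42)
n <- 200
y <- as.numeric(arima.sim(model = list(ar = 0.6), n = n))

# Lag 1 predictor: the previous observation (NA for the first time point)
y_prev <- c(NA, y[-n])

# Stand-in for model.auto: regress each observation on the previous one
fit <- lm(y ~ y_prev)

# Compare lag 1 autocorrelation in the raw series and in the residuals
raw_lag1 <- acf(y, lag.max = 1, plot = FALSE)$acf[2]
res_lag1 <- acf(resid(fit), lag.max = 1, plot = FALSE)$acf[2]
print(c(raw = raw_lag1, residuals = res_lag1))
```

If the lag 1 term accounts for the sequential dependency, the residual autocorrelation should be much closer to zero than the raw one.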

slide-84
SLIDE 84

[Figure: autocorrelation (Acf) of the model residuals plotted against Lag, one panel per couple (Couple01 to Couple40); x-axis 0 to 8, y-axis -0.5 to 1.0]

No consistent autocorrelation remaining in the data

slide-85
SLIDE 85

Week 10: Longitudinal Designs

  • Overview
  • Growth Curve Analysis
      • Main Effect
      • Random Slopes
      • Other Variables
      • Quadratic & Higher Degrees
  • Autocorrelation
      • Testing for Autocorrelation
      • Visualizing Autocorrelation
      • Direction & Lag
      • Why Does Autocorrelation Matter?
  • Cross-Lagged Models
      • Cross-Lagged Models
      • Establishing Causality

slide-86
SLIDE 86

Why Does Autocorrelation Matter?

  • Autocorrelation can be an interesting research question in its own right
  • Important for model assumptions
  • A simple growth-curve model:
  • Assumes that time matters only insofar as there’s a general increase/decrease with time
  • Otherwise, all observations from a couple are assumed to be equally similar or dissimilar, i.e., the error terms are independent

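$$
\underbrace{Y_{i(j)}}_{\text{Warmth}} = \underbrace{\gamma_{00}}_{\text{Overall baseline}} + \underbrace{\gamma_{10}\,x_{1(j)}}_{\text{Time}} + \underbrace{U_{0j}}_{\text{Random intercept for couple}} + \underbrace{E_{i(j)}}_{\text{Error term}}
$$

Error term: independent, identically distributed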

slide-87
SLIDE 87

Why Does Autocorrelation Matter?

  • Autocorrelation can be an interesting research question in its own right
  • Important for model assumptions
  • Assumption: all observations from a couple are equally similar or dissimilar, i.e., the error terms are independent
  • Not true if there’s autocorrelation: observations close in time depend on one another
  • Underestimates standard error → inflates Type I error

[Diagram: observations 1 2 3 4 5 along a LINEAR TREND]
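
The Type I error inflation can be seen in a small simulation. This is a sketch, not the course analysis: the data are simulated with no true time trend, the autocorrelation value of 0.7 is an arbitrary assumption, and `lm()` is used in place of a mixed model to keep the example in base R.

```r
# Hypothetical simulation: how often does a model that assumes independent
# errors find a "significant" time trend when no trend exists?
set.seed(123)
n_days <- 20
n_sims <- 500
day <- 1:n_days

false_positive_rate <- function(ar_coef) {
  p_values <- replicate(n_sims, {
    # Errors are either independent (ar_coef = 0) or lag 1 autocorrelated
    errors <- if (ar_coef == 0) {
      rnorm(n_days)
    } else {
      as.numeric(arima.sim(list(ar = ar_coef), n = n_days))
    }
    summary(lm(errors ~ day))$coefficients["day", "Pr(>|t|)"]
  })
  mean(p_values < 0.05)
}

rate_iid <- false_positive_rate(0)    # independent errors
rate_ar  <- false_positive_rate(0.7)  # autocorrelated errors
print(c(independent = rate_iid, autocorrelated = rate_ar))
```

With independent errors the false positive rate should sit near the nominal 5%; with autocorrelated errors it should be noticeably higher, because the standard error of the trend is underestimated.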

slide-89
SLIDE 89

Cross-Lagged Models

  • Our diary study also includes records of support attempts from the partner
  • Do these cause increased warmth towards the partner?

slide-90
SLIDE 90

Cross-Lagged Models

  • Our diary study also includes records of support attempts from the partner
  • Do these cause increased warmth towards the partner?

Wait just a darn minute! Correlation does not imply causation! You didn’t experimentally manipulate these support attempts, so you don’t know which caused which! I’VE FINALLY GOT YOU, FRAUNDORF!!

slide-91
SLIDE 91

Cross-Lagged Models

  • Problem: Relation between support attempts & warmth is ambiguous
  • Support attempts could increase warmth towards partner
  • Warmth towards partner could motivate support attempts
  • A third variable could explain both

[Diagram: Perceived warmth and Support attempt, both at TIME t, linked with a question mark (?); Relationship commitment shown as a possible third variable]

slide-92
SLIDE 92

Cross-Lagged Models

  • Problem: Relation between support attempts & warmth is ambiguous
  • But: Causes precede effects in time
  • Support attempt on a previous day should influence warmth now

[Diagram: Support attempt at TIME t-1; Perceived warmth and Support attempt at TIME t]

Duckworth, Tsukayama, & May, 2010

slide-93
SLIDE 93

Cross-Lagged Models

  • Use lags.fnc() to create a SupportYesterday variable
  • relationship$SupportYesterday <- lags.fnc(relationship, time='Day', group='Couple', depvar='PartnerSupport', lag=1)
  • Then, use that in a model:
  • model.lagged <- lmer(WarmthToday ~ 1 + Day + SupportYesterday + (1|Couple), data=relationship)

[Diagram: Support attempt at TIME t-1; Perceived warmth and Support attempt at TIME t]

Duckworth, Tsukayama, & May, 2010
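
For intuition about what a lagged variable contains, the following base-R sketch builds one by hand. It is not the `lags.fnc()` implementation: `relationship_demo` and `lag_within()` are hypothetical names made up for this illustration, and the first day of each couple is set to `NA` here, which may differ from how `lags.fnc()` fills that value.

```r
# Hypothetical toy data: two couples observed on four days
relationship_demo <- data.frame(
  Couple = rep(c("Couple01", "Couple02"), each = 4),
  Day = rep(1:4, times = 2),
  PartnerSupport = c(1, 0, 1, 1, 0, 0, 1, 0)
)

# Sort by couple and day so that "previous row" means "previous day"
relationship_demo <- relationship_demo[
  order(relationship_demo$Couple, relationship_demo$Day), ]

# Shift a vector down by k positions, padding the start with NA
lag_within <- function(x, k = 1) c(rep(NA, k), head(x, -k))

# Apply the shift separately within each couple so that one couple's
# last day never leaks into the next couple's first day
relationship_demo$SupportYesterday <- ave(
  relationship_demo$PartnerSupport,
  relationship_demo$Couple,
  FUN = lag_within
)
print(relationship_demo)
```

The key design point is the grouping: lagging must restart for every couple, which is why `lags.fnc()` takes a `group` argument.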

slide-94
SLIDE 94

Cross-Lagged Models

  • model.lagged <- lmer(WarmthToday ~ 1 + Day + SupportYesterday + (1|Couple), data=relationship)

[Diagram: Support attempt at TIME t-1; Perceived warmth and Support attempt at TIME t]

Duckworth, Tsukayama, & May, 2010

slide-95
SLIDE 95

Cross-Lagged Models

  • Warmth at t can’t be the cause of support at t-1
  • Helps clarify which is the cause and which is the effect

[Diagram: Support attempt at TIME t-1; Perceived warmth and Support attempt at TIME t; one path crossed out (X)]

Duckworth, Tsukayama, & May, 2010

slide-96
SLIDE 96

Cross-Lagged Models

  • Warmth at t can’t be the cause of support at t-1
  • But, warmth at time t-1 could still function as a 3rd variable
  • Causes support attempts at time t-1
  • Leads to greater warmth at time t (autocorrelation)

[Diagram: Support attempt and Perceived warmth at TIME t-1; Perceived warmth and Support attempt at TIME t; two paths marked X]

Duckworth, Tsukayama, & May, 2010

slide-97
SLIDE 97

Cross-Lagged Models

  • To rule this out, we need to include the autocorrelative effect of perceived warmth (our DV)
  • model.lagged2 <- lmer(WarmthToday ~ 1 + Day + SupportYesterday + WarmthYesterday + (1|Couple), data=relationship)

[Diagram: Support attempt and Perceived warmth at TIME t-1; Perceived warmth and Support attempt at TIME t; one path marked X]

Duckworth, Tsukayama, & May, 2010

slide-98
SLIDE 98

Cross-Lagged Models

  • model.lagged2 <- lmer(WarmthToday ~ 1 + Day + SupportYesterday + WarmthYesterday + (1|Couple), data=relationship)

[Diagram: Support attempt and Perceived warmth at TIME t-1; Perceived warmth and Support attempt at TIME t; one path marked X]

Duckworth, Tsukayama, & May, 2010
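
Why the previous day's warmth has to be in the model can be shown with a simulation. This is a sketch under loud assumptions: all data are simulated, support is treated as a continuous score for simplicity, the effect sizes are arbitrary, and `lm()` stands in for `lmer()`. In this made-up world, support has no effect on warmth at all; yesterday's warmth drives both.

```r
# Hypothetical scenario: yesterday's warmth is a third variable
set.seed(2024)
n <- 2000
warmth_yesterday  <- rnorm(n)
# Yesterday's warmth motivates yesterday's support
support_yesterday <- 0.8 * warmth_yesterday + rnorm(n)
# Today's warmth depends only on yesterday's warmth (autocorrelation)
warmth_today      <- 0.6 * warmth_yesterday + rnorm(n)

# Without the autocorrelative term, support looks like it predicts warmth
naive <- coef(lm(warmth_today ~ support_yesterday))["support_yesterday"]

# With the autocorrelative term, the spurious support effect shrinks
adjusted <- coef(
  lm(warmth_today ~ support_yesterday + warmth_yesterday)
)["support_yesterday"]

print(c(naive = unname(naive), adjusted = unname(adjusted)))
```

A support effect that survives the inclusion of WarmthYesterday, as in model.lagged2, therefore cannot be explained by this particular third variable.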

slide-99
SLIDE 99

Cross-Lagged Models

  • Now, we are seeing a time-lagged effect of support attempts over and above what can be predicted by previous warmth
  • No way to explain this in a model where the causation only works in reverse
  • Strong evidence against the reverse direction of causation

[Diagram: Support attempt and Perceived warmth at TIME t-1; Perceived warmth and Support attempt at TIME t; two paths marked X]

Duckworth, Tsukayama, & May, 2010

slide-101
SLIDE 101

Establishing Causality

  • Between-person variation in support attempts predicts within-couple change in warmth

[Diagram: Support attempt and Perceived warmth at TIME t-1; Perceived warmth and Support attempt at TIME t; two paths marked X]

Duckworth, Tsukayama, & May, 2010

slide-102
SLIDE 102

Establishing Causality

  • But, there’s still the possibility of a third variable that really drives this between-person difference
  • e.g., relationship commitment could explain variation in previous support attempts and increase in warmth

[Diagram: Support attempt and Perceived warmth at TIME t-1; Perceived warmth and Support attempt at TIME t; two paths marked X; Relationship commitment shown as a possible third variable]

Duckworth, Tsukayama, & May, 2010

slide-103
SLIDE 103

Establishing Causality

  • If the relationship is driven by an underlying 3rd variable, then warmth & support don’t have a cause/effect relation
  • Should see the same relation regardless of their order

[Diagram: Support attempt and Perceived warmth at TIME t-1; Perceived warmth and Support attempt at TIME t; two paths marked X; Relationship commitment shown as a possible third variable]

Duckworth, Tsukayama, & May, 2010

slide-104
SLIDE 104

Establishing Causality

  • To establish causality, show that the direction of the relationship matters
  • Run the inverse model where support attempts are the DV and previous warmth is the predictor
  • Hint: Support attempts are a categorical outcome. What do you have to do differently?

[Diagram: Support attempt and Perceived warmth at TIME t-1; Perceived warmth and Support attempt at TIME t; two paths marked X]

slide-105
SLIDE 105

Establishing Causality

  • model.lagged3 <- glmer(PartnerSupport ~ 1 + Day + WarmthYesterday + SupportYesterday + (1|Couple), data=relationship, family=binomial)

[Diagram: Support attempt and Perceived warmth at TIME t-1; Perceived warmth and Support attempt at TIME t; two paths marked X]

Duckworth, Tsukayama, & May, 2010

slide-106
SLIDE 106

Establishing Causality

  • model.lagged3 <- glmer(PartnerSupport ~ 1 + Day + WarmthYesterday + SupportYesterday + (1|Couple), data=relationship, family=binomial)
  • No significant effects
  • Earlier support attempts predict later warmth (model.lagged2)
  • But earlier warmth doesn’t predict later support attempts (model.lagged3)
  • Evidence for directionality of effect
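
The logic of comparing the two directions can be sketched on simulated data. Everything here is an assumption for illustration: a single simulated series rather than 40 couples, arbitrary effect sizes, and `lm()` / `glm()` standing in for `lmer()` / `glmer()` so that only base R is needed. In this made-up world, support raises next-day warmth, but warmth has no influence on later support.

```r
# Hypothetical series in which causation runs one way only
set.seed(7)
n_days  <- 3000
support <- rbinom(n_days, 1, 0.5)   # generated independently of warmth
warmth  <- numeric(n_days)
warmth[1] <- rnorm(1)
for (t in 2:n_days) {
  # Today's warmth: autocorrelation plus an effect of yesterday's support
  warmth[t] <- 0.4 * warmth[t - 1] + 0.5 * support[t - 1] + rnorm(1)
}

d <- data.frame(
  WarmthToday      = warmth[-1],
  PartnerSupport   = support[-1],
  WarmthYesterday  = warmth[-n_days],
  SupportYesterday = support[-n_days]
)

# Forward direction: earlier support predicting later warmth
forward <- lm(WarmthToday ~ SupportYesterday + WarmthYesterday, data = d)
# Reverse direction: earlier warmth predicting later support
# (binary outcome, so a logistic model)
reverse <- glm(PartnerSupport ~ WarmthYesterday + SupportYesterday,
               data = d, family = binomial)

p_forward <- summary(forward)$coefficients["SupportYesterday", "Pr(>|t|)"]
p_reverse <- summary(reverse)$coefficients["WarmthYesterday", "Pr(>|z|)"]
print(c(forward = p_forward, reverse = p_reverse))
```

An asymmetry between the two models, a clear effect in the forward direction and little or none in the reverse, is the pattern that supports directionality.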
slide-107
SLIDE 107

Establishing Causality

  • This kind of evidence is called Granger causality
  • Still one kind of 3rd variable not ruled out: one with an immediate effect on support attempts & a delayed effect on warmth
  • However, much less likely
  • So, not quite as good as a randomized experiment
  • But, effective when experimental control is not possible (e.g., economics, neuroscience)

[Diagram: Support attempt and Perceived warmth at TIME t-1; Perceived warmth and Support attempt at TIME t; two paths marked X; an unknown third variable marked ???]

slide-108
SLIDE 108

Establishing Causality

  • This kind of evidence is called Granger causality
  • Still one kind of 3rd variable not ruled out: one with an immediate effect on support attempts & a delayed effect on warmth
  • However, much less likely
  • So, not quite as good as a randomized experiment
  • But, effective when experimental control is not possible (e.g., economics, neuroscience)

Adapted from Kaminski et al., 2011
