Week 8, Lectures 1 & 2: Fixed-, Random-, and Mixed-Effects models

1. The repeated measures design, where each of n Ss is measured k times, is a popular one in Psych. We approach this design in 2 ways:
   1. As a generalisation of the paired t-test
   2. As an expansion from 1-way to 2-way designs
2. Fixed & random effects; nested & repeated measures designs
3. Using EMS tables to define appropriate F ratios for certain designs
4. Options for modeling random effects in R
Outline of Lectures 1 & 2 on Mixed-Effects models

5. ‘kv0.csv’; ‘skv1.r’
6. Expanded HW-5:
   1. Range of possible models
   2. Use AIC and log(Likelihood) to compare models
7. Refs: Howell, Chap. 14;
   http://www.ats.ucla.edu/stat/R/seminars/Repeated_Measures/repeated_measures.htm (‘ucla’)
   http://www.personality-project.org/R/r.anova.html (‘nwu’)
Reprise of paired t-test

• Each S gives 2 scores, one in Group = 0, and the other in Group = 1.
• Wrong analysis would be
    res1 = t.test(x0, x1, var.equal = T),
  because it treats the 2 groups as independent!
• Correct analysis is
    res2 = t.test(x0, x1, var.equal = T, paired = T).
  This latter analysis aptly finesses the problem introduced by the correlation across Ss between $x_{0i}$ and $x_{1i}$.
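The contrast between the two calls can be tried on made-up data. The sketch below is only an illustration: the sample size, the means, and the names `subj_effect` and `res2b` are assumptions, not part of the course files.

```r
# Simulated illustration only: 10 hypothetical Ss, each measured in both groups.
set.seed(1)
n <- 10
subj_effect <- rnorm(n, mean = 50, sd = 10)   # stable S-to-S differences
x0 <- subj_effect + rnorm(n, sd = 2)          # scores in Group = 0
x1 <- subj_effect + 3 + rnorm(n, sd = 2)      # scores in Group = 1

res1 <- t.test(x0, x1, var.equal = TRUE)      # wrong: treats the groups as independent
res2 <- t.test(x0, x1, paired = TRUE)         # correct: uses the pairing within S

# The paired test is a one-sample t-test on the differences d_i.
d <- x1 - x0
res2b <- t.test(d)

cat("cor(x0, x1)      =", round(cor(x0, x1), 3), "\n")
cat("independent: df  =", res1$parameter, " t =", round(res1$statistic, 3), "\n")
cat("paired:      df  =", res2$parameter, " t =", round(res2$statistic, 3), "\n")
```

The paired analysis has fewer df (n - 1 rather than 2n - 2), but it removes the S-to-S variation from the error term, which is why it is usually the more sensitive test when the two scores are positively correlated.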
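Reprise of paired t-test

• For the i’th S, compute the difference: $d_i = x_{1i} - x_{0i}$.
• Let $s$ = s.d. of the $\{d_i\}$, and note that $\bar{d} = \bar{x}_1 - \bar{x}_0$. Then
  $$t_{n-1} = \frac{\bar{d}}{s/\sqrt{n}}; \quad \text{or} \quad F_{1,\,n-1} = t_{n-1}^2 = \frac{n\bar{d}^2}{s^2}.$$
• Grand mean is $(\bar{x}_1 + \bar{x}_0)/2 \equiv \bar{x}$, so that
  $$SS_{Group} = n\left[(\bar{x}_1 - \bar{x})^2 + (\bar{x}_0 - \bar{x})^2\right] = \frac{n(\bar{x}_1 - \bar{x}_0)^2}{2} = \frac{n\bar{d}^2}{2},$$
  after simplifying (each deviation from the grand mean equals $\pm\bar{d}/2$).
• Because k – 1 = 1, $MS_{Group} = SS_{Group}$. Thus,
  $$F_{1,\,n-1} = \frac{n\bar{d}^2}{s^2} = \frac{MS_{Group}}{s^2/2}.$$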
Reprise of paired t-test

$$F_{1,\,n-1} = \frac{n\bar{d}^2}{s^2} = \frac{MS_{Group}}{s^2/2}$$

• How best to interpret $s^2$?
• $d_i$ is also the slope of each line in the Fig. Thus variation in $d_i$ is an index of a Subject x Group interaction. That is,
• $s^2/2 = \mathrm{var}(d_i)/2$ = the Subj x Group interaction MS (with k = 2, each S’s interaction residual is $\pm(d_i - \bar{d})/2$).

$$F_{1,\,n-1} = \frac{n\bar{d}^2}{s^2} = \frac{MS_{Group}}{MS_{Sub*Group}}$$

This expression for F generalises to the case k > 2.
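The identities on this slide can be checked numerically. The following is a minimal sketch on simulated data; the data and the names `long`, `tab`, `MS_group` and `MS_res` are assumptions made for the illustration. With k = 2 and one score per cell, the residual of the additive model `score ~ suid + Group` is the Subject x Group interaction.

```r
# Simulated illustration: 8 hypothetical Ss, 2 scores each.
set.seed(2)
n <- 8
s_eff <- rnorm(n, mean = 20, sd = 5)
x0 <- s_eff + rnorm(n)
x1 <- s_eff + 1 + rnorm(n)
d  <- x1 - x0

t_paired <- unname(t.test(x1, x0, paired = TRUE)$statistic)
F_from_d <- n * mean(d)^2 / var(d)            # n * dbar^2 / s^2

# Long form: one row per score, with S and Group as factors.
long <- data.frame(
  score = c(x0, x1),
  Group = factor(rep(c(0, 1), each = n)),
  suid  = factor(rep(seq_len(n), times = 2))
)
tab <- anova(lm(score ~ suid + Group, data = long))
MS_group <- tab["Group", "Mean Sq"]
MS_res   <- tab["Residuals", "Mean Sq"]       # S x Group interaction MS when k = 2

cat("t^2                =", t_paired^2, "\n")
cat("n*dbar^2/s^2       =", F_from_d, "\n")
cat("MS_Group / MS_SxG  =", MS_group / MS_res, "\n")
cat("var(d)/2 vs MS_SxG =", var(d) / 2, MS_res, "\n")
```

In this balanced layout the three versions of F agree, and the interaction MS equals var(d)/2, so the factor of 1/2 in the numerator and the denominator cancels.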
Introduction to Mixed Models

• An alternative way to finesse the problem of cor($x_{0i}$, $x_{1i}$) is to use the package lme4 (= Linear Mixed Effects, with S4 classes), and the function lmer().
  First, arrange data in ‘long form’, d1 (as for lm()).
• res3 = lmer(score ~ Group + (1 | suid), data = d1).
• NOT res3i = lm(score ~ Group, data = d1), which is incorrect, because it ignores the grouping/correlation resulting from the repeated measures design!
• Of course the linear mixed model, lmer(), can do much more than paired t-tests! [Recent articles tout its efficacy: JPSP 2012, by Judd, Westfall & Kenny; Science, Oct 2013, “Biology’s Dry Future”, by R. Service.]
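A minimal sketch of the two fits on simulated long-form data follows. The data are made up, the column names follow the slide (`score`, `Group`, `suid`), and the lmer() step assumes that the lme4 package is installed, so it is guarded and skipped otherwise.

```r
# Simulated long-form data: 12 hypothetical Ss, one row per score.
set.seed(3)
n <- 12
s_eff <- rnorm(n, mean = 50, sd = 8)
d1 <- data.frame(
  suid  = factor(rep(seq_len(n), times = 2)),
  Group = factor(rep(c(0, 1), each = n)),
  score = c(s_eff + rnorm(n, sd = 2), s_eff + 2 + rnorm(n, sd = 2))
)

# Incorrect: lm() ignores the fact that each S contributes 2 scores.
res3i <- lm(score ~ Group, data = d1)
t_lm  <- coef(summary(res3i))["Group1", "t value"]

# Reference value: the paired t-test on the same scores.
x0 <- d1$score[d1$Group == "0"]
x1 <- d1$score[d1$Group == "1"]
t_paired <- unname(t.test(x1, x0, paired = TRUE)$statistic)

cat("t from lm() (pairing ignored):", round(t_lm, 3), "\n")
cat("t from paired t-test:         ", round(t_paired, 3), "\n")

# Mixed model with a random intercept for each S (only if lme4 is available).
if (requireNamespace("lme4", quietly = TRUE)) {
  res3   <- lme4::lmer(score ~ Group + (1 | suid), data = d1)
  t_lmer <- coef(summary(res3))["Group1", "t value"]
  cat("t from lmer():                ", round(t_lmer, 3), "\n")
}
```

For balanced data like these, with a clearly positive between-S variance, the t value for Group from lmer() should be very close to the paired t, while the lm() value is smaller in absolute size because the S-to-S variation stays in its error term.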
Passage from 1- to 2- & 3-way designs

• Example 1: How does the Score on a memory test depend on the length of the study period, T (= 1, 2, or 3 units)? We could use a 1-way between-groups design in which, say, 24 participants (Ss) are randomly assigned, n = 8 to each level of T.

  Data:
    T = 1   T = 2   T = 3
      3       4       3
      5       5       7
      …       …       …
      1       6       6

  ANOVA table:
    Source    df   SS   MS      F
    Between    2        MS_b    MS_b/MS_w
    Within    21        MS_w
    Total     23

• Even though the data in the table are in rows, there is no ‘Row’ factor because the 3 Ss in each row have nothing in common.
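The df in the ANOVA table for Example 1 can be reproduced with lm() and anova(). This is a sketch on simulated scores; the group means and the names `d`, `rs1` and `tab1` are assumptions, since the slide shows only a few data values.

```r
# Simulated 1-way between-groups data: 24 hypothetical Ss, n = 8 per level of T.
set.seed(4)
d <- data.frame(
  time  = factor(rep(1:3, each = 8)),
  score = rnorm(24, mean = rep(c(4, 5, 6), each = 8), sd = 1.5)
)

rs1  <- lm(score ~ time, data = d)
tab1 <- anova(rs1)
print(tab1)                                   # Between (time): df = 2; Within: df = 21

# F is the ratio MS_b / MS_w.
F_manual <- tab1["time", "Mean Sq"] / tab1["Residuals", "Mean Sq"]
cat("MS_b / MS_w =", F_manual, "\n")
```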
• Example 2: Same as Ex. 1, except that you now worry that Score might also depend on Ss' verbal Ability (A). So divide Ss into 2 levels of A, ‘lo’ and ‘hi’, 12 Ss at each level. At each level of ability, randomly assign n = 4 Ss to each level of T. This is a 2-way between-groups factorial design.

  Data:
    A     T = 1     T = 2     T = 3
    lo    3,5,…     4,5,…     3,7,…
    hi    5,1,…     5,6,…     7,6,…

  ANOVA table:
    Source    df   SS   MS   F
    T          2             MS_T/MS_w
    A          1             MS_A/MS_w
    A*T        2             MS_A*T/MS_w
    Within    18        MS_w
    Total     23

• We have a ‘Row’ (A) & a ‘Col’ (T) factor, so we can define the A*T interaction. The 4 independent obs in each of the 6 cells are used to estimate MS_w, which is the denominator in all 3 F ratios.
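A sketch of the corresponding 2-way fit follows, again on simulated scores. The effect sizes and the names `d`, `rs2` and `tab2` are assumptions for the illustration; the factor names `A` and `time` follow the slides.

```r
# Simulated 2-way between-groups data: 2 levels of A x 3 levels of T, n = 4 Ss per cell.
set.seed(5)
d <- expand.grid(
  rep  = 1:4,
  time = factor(1:3),
  A    = factor(c("lo", "hi"), levels = c("lo", "hi"))
)
d$score <- 4 + 0.8 * as.integer(d$time) + ifelse(d$A == "hi", 1, 0) +
  rnorm(nrow(d), sd = 1.5)

print(table(d$A, d$time))                     # 4 independent obs in each of the 6 cells

rs2  <- lm(score ~ A * time, data = d)        # main effects plus the A*T interaction
tab2 <- anova(rs2)
print(tab2)                                   # df: A = 1, time = 2, A:time = 2, Within = 18
```

All 3 F ratios in the printed table use the same denominator, the Residuals (within-cell) mean square.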
• Example 3: Same as Ex. 2, except that you now decide that the best way to control for Ability (A) is to use each S as her or his own control, and to measure each S’s score at all 3 levels of T. Suppose we have 8 Ss. This is a 2-way within-group design, with S and T as the 2 factors.

  Data:
    S     T = 1   T = 2   T = 3
    1       3       4       7
    2       5       5       6
    …       …       …       …
    8       1       2       5

  ANOVA table:
    Source      df   SS   MS       F
    T            2                 MS_T/MS_res
    S            7                 MS_S/MS_res
    Residual    14        MS_res
    Total       23

• Because n = 1 obs per cell, we cannot estimate the within-cell variance, MS_w, separately from the interaction MS. Hence we lump the two sources of variation into MS_res, and use this as the denominator in the F ratios for the 2 main effects.
• Example 3 (cont’d): One way to finesse the problem that the interaction MS and MS_w are confounded is to assume that the S x T interaction is 0 and, therefore, that MS_res = MS_w. Assuming that the interaction is 0 is equivalent to assuming that S and T have additive effects. The lm() model would be:
    rs3 = lm(score ~ subid + time, data=d0),
  rather than
    rs3a = lm(score ~ subid * time, data=d0),
  which would leave no residual df, so that no F ratio can be computed (anova() warns of an essentially perfect fit)!

  ANOVA table:
    Source      df   SS   MS       F
    T            2                 MS_T/MS_res
    S            7                 MS_S/MS_res
    Residual    14        MS_res
    Total       23

  Data:
    S     T = 1   T = 2   T = 3
    1       3       4       7
    2       5       5       6
    …       …       …       …
    8       1       2       5
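The additive model can be tried on simulated data laid out like Example 3. In this sketch the scores are made up; `d0`, `subid`, `time` and `rs3` follow the names on the slide, while `s_eff` and `tab3` are assumptions.

```r
# Simulated S by T data: 8 hypothetical Ss, each measured once at T = 1, 2, 3.
set.seed(6)
d0 <- expand.grid(subid = factor(1:8), time = factor(1:3))
s_eff <- rnorm(8, sd = 2)                     # each S's own baseline level
d0$score <- 4 + s_eff[as.integer(d0$subid)] + as.integer(d0$time) +
  rnorm(nrow(d0))

# Additive model: S and T main effects only; the interaction goes into the residual.
rs3  <- lm(score ~ subid + time, data = d0)
tab3 <- anova(rs3)
print(tab3)                                   # df: subid = 7, time = 2, Residuals = 14
```

The Residuals line of the printed table is MS_res, the denominator for both main-effect F ratios.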
ANOVA Table for the additive model (n obs per cell)

  Source                  df                               MS                    F                df of F
  Row, e.g., gift         r-1                              MS_r                  MS_r/MS_error    r-1, df_error
  Column, e.g., income    c-1                              MS_c                  MS_c/MS_error    c-1, df_error
  Within Cell, or         df_error = rc(n-1)+(r-1)(c-1)    MS_error or MS_resid
    Error, or Residual
  Total                   N - 1 = rcn - 1

The formulae for the expected MS (EMS) tell us how to define F for testing each effect.
ANOVA Table for the interactive model

  Source             df                       MS          F                 df of F
  Row                r-1                      MS_r        MS_r/MS_error     r-1, df_error
  Column             c-1                      MS_c        MS_c/MS_error     c-1, df_error
  RxC interaction    (r-1)(c-1)               MS_rc       MS_rc/MS_error    (r-1)(c-1), df_error
  Within cell        df_error = rc(n-1) **    MS_error
  Total              N - 1 = rcn - 1

**Note reduction in df_error when we test for RxC interaction; and that df_error = 0 if n = 1.
CAVEAT: When n = 1, one cannot test for interaction!

    cat('Interactive Model')
    rs3a = lm(score ~ subid * time, data=d0)
    print(anova(rs3a))

    Response: score
                 Df  Sum Sq Mean Sq F value Pr(>F)
    ability       2  2.0000  1.0000
    time          3 25.6667  8.5556
    ability:time  6  3.3333  0.5556                 #Note
    Residuals     0  0.0000                         #Note

    Residuals:
    ALL 12 residuals are 0: no residual degrees of freedom!

    Warning message:
    In anova.lm(rs3a) :
      ANOVA F-tests on an essentially perfect fit are unreliable
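The same caveat can be reproduced on simulated data with 1 obs per cell. This sketch uses a made-up 8 x 3 layout (not the data behind the output above); the model name `rs3a` follows the slide, and everything else is an assumption.

```r
# Simulated S by T data with n = 1 obs per cell (8 hypothetical Ss x 3 levels of T).
set.seed(7)
d0 <- expand.grid(subid = factor(1:8), time = factor(1:3))
d0$score <- 4 + rnorm(8, sd = 2)[as.integer(d0$subid)] +
  as.integer(d0$time) + rnorm(nrow(d0))

# Interactive model: one parameter per cell, so the fit is saturated.
rs3a <- lm(score ~ subid * time, data = d0)

cat("number of obs        =", nrow(d0), "\n")
cat("number of parameters =", length(coef(rs3a)), "\n")
cat("residual df          =", df.residual(rs3a), "\n")
cat("largest |residual|   =", max(abs(residuals(rs3a))), "\n")
```

With as many parameters as observations, every fitted value equals its observed score, so there is nothing left over from which to estimate an error MS.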
Design Differences between Ex. 2 & 3

  Ex. 2 (A by T):
    A     T = 1     T = 2     T = 3
    lo    3,5,…     4,5,…     3,7,…
    hi    5,1,…     5,6,…     7,6,…

  Ex. 3 (S by T):
    S     T = 1   T = 2   T = 3
    1       3       4       7
    2       5       5       6
    …       …       …       …
    8       1       2       5

• Both are 2-way factorial designs: A by T, and S by T. However, because we are rarely interested in S per se as an explanatory factor (Why?), but are interested in A, we refer to the S by T design as a 1-factor, within-S design!
• The number of levels of A is ‘small’; that of S is ‘large’ (more a symptom than a principled difference).
Design Differences between Ex. 2 & 3 (cont’d)

• We are rarely interested in the levels of S per se (i.e., in ‘john’ vs ‘mary’ vs …); these are merely random selections (or ‘effects’) from a ‘large’ population of possible values. We wish to make inferences about this population of S levels.
• We are interested in the levels of A (‘lo’ vs ‘hi’). These are fixed effects to be interpreted.
Design Differences between Ex. 2 & 3 (cont’d)

• In the A by T design, all obs at each level of A come from different Ss and, therefore, are statistically independent (i.e., uncorrelated).
• In the S by T design, all obs at each level of S come from the same S and, therefore, are correlated. That is, the correlation across Ss between scores at, e.g., T = 1 and T = 2 should be positive.
Design Differences between Ex. 2 & 3 (cont’d)

• The S by T design is also called a repeated measures design. The within-S correlation between scores introduces complexities into the calculation of F ratios for this design.
• To resolve these complexities, we assume that the within-S correlations satisfy certain simplifying conditions, e.g., Compound Symmetry.
Defn of Compound Symmetry (CS)

• We assume that the correlation across Ss between scores is the same for every pair of levels of T: cor($x_{1i}$, $x_{2i}$) = cor($x_{1i}$, $x_{3i}$) = cor($x_{3i}$, $x_{2i}$). (CS also requires the variance of the scores to be the same at each level of T.)
• If there is a between-Ss factor, A, we also assume that the correlations, cor($x_{1i}$, $x_{2i}$), cor($x_{1i}$, $x_{3i}$) and cor($x_{3i}$, $x_{2i}$), are the same when A = 1 as when A = 2.
• A closely related, slightly weaker condition, known as sphericity, is that the variance of the differences between scores at T = i and T = j is the same for all i and j.
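The link between CS and sphericity can be made explicit for the case of 3 levels of T. The short derivation below assumes the usual full definition of CS, i.e., a common variance $\sigma^2$ at each level of T in addition to a common correlation $\rho$; the symbols $\Sigma$, $\sigma^2$ and $\rho$ are introduced here for illustration.

```latex
% Covariance matrix of (x_{1i}, x_{2i}, x_{3i}) under compound symmetry
\Sigma = \sigma^2
\begin{pmatrix}
1    & \rho & \rho \\
\rho & 1    & \rho \\
\rho & \rho & 1
\end{pmatrix}

% Variance of the difference between scores at any two levels j \neq j' of T
\operatorname{var}(x_{ji} - x_{j'i})
  = \sigma^2 + \sigma^2 - 2\rho\sigma^2
  = 2\sigma^2(1 - \rho)
```

Since $2\sigma^2(1-\rho)$ does not depend on which pair of levels is chosen, CS implies sphericity. The converse need not hold, which is why sphericity is the weaker of the two conditions.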