A First Look at Multilevel and Longitudinal Models York University Statistical Consulting Service
(Revision 2, March 22, 2005)
March 2005
Georges Monette, with help from Ernest Kwan, Alina Rivilis, Qing Shao and Ye Sun
Contents

1 Introduction
2 Preliminaries
    2.1 Is $\beta_1$ the effect of $X_1$?
    2.2 Visualizing multivariate variance
    2.3 Data example: Simpson and Robinson
    2.4 Matrix formulation of regression
3 General regression analysis
    3.1 Detailed description of data set
    3.2 Looking at school 4458
    3.3 Model to compare two schools
    3.4 Comparing two Sectors
    3.5 What's wrong with 3.3?
4 The Hierarchical Model
    4.1 Within School model
    4.2 Between School model
    4.3 A simulated example
    4.4 Between-School Model: What γ means
5 Combined (composite) model
    5.1 From the multilevel to the combined form
    5.2 GLS form of the model
    5.3 Matrix form
    5.4 Notational Babel
    5.5 The GLS fit
6 The simplest models
    6.1 One-way ANOVA with random effects
    6.2 Estimating the one-way ANOVA model
        6.2.1 Mixed model approach
    6.3 EBLUPs
7 Slightly more complex models
    7.1 Means as outcomes regression
    7.2 One-way ANCOVA with random effects
    7.3 Random coefficients model
    7.4 Intercepts and Slopes as outcomes
    7.5 Nonrandom slopes
    7.6 Asking questions: CONTRAST and ESTIMATE statement
8 A second look at multilevel models
    8.1 What is a mixed model really estimating
    8.2 … and T
        8.2.1 Random slope model
        8.2.2 Two random predictors
        8.2.3 Interpreting Chol(T)
        8.2.4 Recentering and balancing the model
        8.2.5 Random slopes and variance components parametrization
        8.2.6 Testing hypotheses about T
    8.3 Examples
    8.4 Fitting a multilevel model: contextual effects
        8.4.1 Example
9 Longitudinal Data
    9.1 The basic model
    9.2 Analyzing longitudinal data
        9.2.1 Classical or Mixed models
    9.3 Potthoff and Roy
        9.3.1 Univariate ANOVA
        9.3.2 MANOVA repeated measures
        9.3.3 Random Intercept Model with Autocorrelation
        9.3.4 Comparing Different Covariance Models
        9.3.5 Exercises on Potthoff and Roy
10 Bibliography (to be changed)
11 Outputs and pictures
    11.1 what
    11.2 Looking at a single school: Public School P4458
The last two decades have seen rapid growth in the development and use of models suitable for multilevel or longitudinal data. We can identify at least four broad approaches: the use of derived variables, econometric models, latent trajectory models using structural equation models, and mixed (fixed and random effects) models. This course focuses on the use of mixed models for multilevel and longitudinal data.

Mixed models have wide potential for application to data that are otherwise awkward or impossible to model. Some key applications are to nested data structures, e.g. students within classes within schools within school boards, and to longitudinal data, especially 'messy' data where measurements are taken at irregular times, e.g. clinical data from patients measured at irregular times or panel data with changing membership. Another application is to panel data with time-varying covariates, e.g. age, where each cohort has a variety of ages and the researcher is interested in studying age effects.

New books and articles on multilevel and longitudinal models are being released much faster than one can read or afford them. One way to get started for someone who intends to start with SAS would be to read:

Judith D. Singer and John B. Willett (2003). Applied Longitudinal Data Analysis. Oxford University Press, New York. An accessible and comprehensive treatment of methods for longitudinal analysis. In addition to the use of mixed models, this book includes material on latent growth models and on event history analysis.

Stephen W. Raudenbush and Anthony S. Bryk (2002). Hierarchical Linear Models: Applications and Data Analysis Methods (2nd edition). Sage, Thousand Oaks, CA.

Peter J. Diggle, Patrick Heagerty, Kung-Yee Liang and Scott L. Zeger (2002). Analysis of Longitudinal Data (2nd edition). Oxford University Press, Oxford.

Tom A. B. Snijders and Roel J. Bosker (1999). Multilevel Analysis: An Introduction to Basic and Advanced Multilevel Modeling. Sage, London. An excellent book with a broad conceptual coverage.

Judith D. Singer (1998). "Using SAS PROC MIXED to Fit Multilevel Models, Hierarchical Models, and Individual Growth Models".

Pinheiro and Bates (2000). Mixed Effects Models in S and S-Plus.

The choice of software to fit mixed models continues to grow. Many packages are in active development and offer new functionality. It is safe to hypothesize that no package is uniformly superior to all the others.
PROC MIXED in SAS is a solid workhorse, although it has been relatively static for some years. The ability to specify a wide variety of structures for both the within-cluster covariance matrix and the random effects covariance matrix is an important feature. Graphics in SAS are improving, and the description of version 9 mentions the ability to produce some important diagnostic plots automatically. NLMIXED is a more recent and interesting addition for non-linear models and for modeling binary and Poisson outcomes.

SPSS 11.5 has a MIXED command to fit mixed models. See Alastair Leyland (2004), A review of multilevel modeling in SPSS.

An excellent special-purpose programme is MLwiN, developed by a group associated with Harvey Goldstein at the Institute of Education at the University of London. It uses a graphical interface that is very appealing for multilevel modelling and produces some very interesting plots quite easily. Its basic linear methods are well integrated with more advanced methods (logistic mixed models, MCMC estimation), which makes the transition relatively easy. One shortcoming is its unstructured parametrization of the variance matrix of random effects. MLwiN seems to be able to handle very large data sets as long as they fit in RAM.

NLME by Bates and Pinheiro is a library in R and S-Plus. This is my favourite working environment, but it is best for those who enjoy … R and S-Plus are very strong for graphics and for programmability, which allows you to easily refine and reuse analytic solutions.

GLLAMM in STATA is a very interesting program that is in active development and can fit a wide variety of models, including those with latent variables.

The current version of HLM, like MLwiN, has a graphical interface. It is developed by Bryk and Raudenbush and is a powerful and popular program.

Some resources on the web include:

UCLA Academic Technology Services, at http://www.ats.ucla.edu/stat/seminars/, has an excellent collection of seminars.
Two multilevel statistics books are available free online. The Internet Edition of Harvey Goldstein's extensive but challenging Multilevel Statistical Models can be downloaded from http://www.arnoldpublishers.com/support/goldstein.htm. Applied Multilevel Analysis by Joop Hox is available at http://www.fss.uu.nl/ms/jh/publist/amaboek.pdf.

Multilevel Modelling Newsletters are produced twice yearly by The Multilevel Models Project in the Institute of Education at the University of London.

The web site for the Multilevel Models Project is http://www.ioe.ac.uk/multilevel/. One can subscribe to an email discussion list at http://www.mailbase.ac.uk/lists/multilevel/.
We begin by reviewing a few concepts from regression and multivariate data analysis. Our emphasis is on concepts that are frequently omitted,
Consider the familiar regression equation:

$$E(Y) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 \qquad (1)$$

To make this more concrete, suppose we are studying the relationship between $Y = \text{Health}$, $X_1 = \text{Weight}$ and $X_2 = \text{Height}$:

$$E(\text{Health}) = \beta_0 + \beta_1\,\text{Weight} + \beta_2\,\text{Height} \qquad (2)$$

so that $\beta_1$ is the 'effect' of changing Weight. What if we are really interested in the 'effect' of ExcessWeight? Maybe we should replace Weight with ExcessWeight. Let's suppose that $\phi_0 + \phi_1\,\text{Height}$ is the 'normal' Weight for a given Height. What happens if we fit the model?
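One way to explore this question is to write the reparametrized model out. The following is only a sketch: the definition of ExcessWeight as Weight minus 'normal' Weight and the coefficient names $\gamma_0, \gamma_1, \gamma_2$ are assumptions introduced here for illustration.

```latex
\begin{aligned}
\text{ExcessWeight} &= \text{Weight} - (\phi_0 + \phi_1\,\text{Height}) \\
E(\text{Health}) &= \gamma_0 + \gamma_1\,\text{ExcessWeight} + \gamma_2\,\text{Height} \\
  &= (\gamma_0 - \gamma_1\phi_0) + \gamma_1\,\text{Weight}
     + (\gamma_2 - \gamma_1\phi_1)\,\text{Height}
\end{aligned}
```

Matching terms with equation (2) gives $\beta_1 = \gamma_1$, $\beta_0 = \gamma_0 - \gamma_1\phi_0$ and $\beta_2 = \gamma_2 - \gamma_1\phi_1$. Under these assumptions the coefficient of ExcessWeight is the same as the coefficient of Weight; only the intercept and the coefficient of Height change.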
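$$Y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \varepsilon_i$$

Written out for each of the $N$ observations:

$$\begin{aligned}
Y_1 &= \beta_0 + \beta_1 x_{11} + \beta_2 x_{21} + \varepsilon_1 \\
Y_2 &= \beta_0 + \beta_1 x_{12} + \beta_2 x_{22} + \varepsilon_2 \\
&\;\;\vdots \\
Y_i &= \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \varepsilon_i \\
&\;\;\vdots \\
Y_N &= \beta_0 + \beta_1 x_{1N} + \beta_2 x_{2N} + \varepsilon_N
\end{aligned}$$

Note that the $\beta$s remain the same from line to line but the $Y$s, $x$s and $\varepsilon$s change. Using vectors and matrices and exploiting the rules for multiplying matrices:

$$\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_N \end{bmatrix}
= \begin{bmatrix} 1 & x_{11} & x_{21} \\ 1 & x_{12} & x_{22} \\ \vdots & \vdots & \vdots \\ 1 & x_{1N} & x_{2N} \end{bmatrix}
\begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{bmatrix}
+ \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_N \end{bmatrix}$$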
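For the $j$th cluster, with $n_j$ observations:

$$\begin{bmatrix} Y_{1j} \\ Y_{2j} \\ \vdots \\ Y_{n_j j} \end{bmatrix}
= \begin{bmatrix} 1 & x_{11j} & x_{21j} \\ 1 & x_{12j} & x_{22j} \\ \vdots & \vdots & \vdots \\ 1 & x_{1 n_j j} & x_{2 n_j j} \end{bmatrix}
\begin{bmatrix} \beta_{0j} \\ \beta_{1j} \\ \beta_{2j} \end{bmatrix}
+ \begin{bmatrix} \varepsilon_{1j} \\ \varepsilon_{2j} \\ \vdots \\ \varepsilon_{n_j j} \end{bmatrix}$$

or, in short, $Y_j = X_j \beta_j + \varepsilon_j$ for $j = 1, \ldots, J$. Stacking the $J$ clusters:

$$\begin{bmatrix} Y_1 \\ \vdots \\ Y_j \\ \vdots \\ Y_J \end{bmatrix}
= \begin{bmatrix} X_1 & & & & \\ & \ddots & & & \\ & & X_j & & \\ & & & \ddots & \\ & & & & X_J \end{bmatrix}
\begin{bmatrix} \beta_1 \\ \vdots \\ \beta_j \\ \vdots \\ \beta_J \end{bmatrix}
+ \begin{bmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_j \\ \vdots \\ \varepsilon_J \end{bmatrix}$$

With $\operatorname{Var}(\varepsilon) = \sigma^2 I$, the least-squares estimator and its variance are

$$\hat\beta = (X'X)^{-1} X'Y, \qquad \operatorname{Var}(\hat\beta) = \sigma^2 (X'X)^{-1}.$$

With a general variance matrix $\operatorname{Var}(\varepsilon) = V$, the generalized least-squares (GLS) estimator and its variance are

$$\hat\beta_{GLS} = (X'V^{-1}X)^{-1} X'V^{-1}Y, \qquad \operatorname{Var}(\hat\beta_{GLS}) = (X'V^{-1}X)^{-1}.$$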
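As a minimal sketch of the matrix formulation, the following standard-library Python code builds a design matrix X with a column of ones and one predictor, forms X'X and X'Y, and solves the normal equations (X'X)b = X'Y. The six (ses, mathach) pairs are the first six rows of the school 1317 listing in this document; the function names are illustrative and are not part of the course materials.

```python
# Illustrative sketch: ordinary least squares through the normal equations.
# Data: first six rows of the school 1317 listing (ses, mathach).
ses = [0.062, 0.132, 0.502, 0.542, 0.582, 0.702]
mathach = [18.827, 14.752, 23.355, 18.939, 15.784, 13.677]


def design_matrix(x):
    """Return X as a list of rows: a column of ones and the predictor."""
    return [[1.0, xi] for xi in x]


def normal_equations(X, y):
    """Form X'X and X'y for a design matrix given as a list of rows."""
    p = len(X[0])
    xtx = [[sum(row[a] * row[b] for row in X) for b in range(p)] for a in range(p)]
    xty = [sum(row[a] * yi for row, yi in zip(X, y)) for a in range(p)]
    return xtx, xty


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        z[i] = (M[i][n] - sum(M[i][c] * z[c] for c in range(i + 1, n))) / M[i][i]
    return z


X = design_matrix(ses)
xtx, xty = normal_equations(X, mathach)
beta_hat = solve(xtx, xty)  # [intercept, slope]
print(beta_hat)
```

Solving the normal equations directly is fine for a small illustration like this; statistical software generally uses more numerically stable decompositions.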
   school minority female     ses mathach  size sector meanses
 1   1317        0      1   0.062  18.827   455      1   0.351
 2   1317        0      1   0.132  14.752   455      1   0.351
 3   1317        0      1   0.502  23.355   455      1   0.351
 4   1317        0      1   0.542  18.939   455      1   0.351
 5   1317        0      1   0.582  15.784   455      1   0.351
 6   1317        0      1   0.702  13.677   455      1   0.351
 7   1317        0      1   0.812  20.236   455      1   0.351
 8   1317        0      1   0.882  12.862   455      1   0.351
 9   1317        0      1   0.992  11.621   455      1   0.351
10   1317        0      1   1.122   4.508   455      1   0.351
11   1317        0      1   1.372  20.748   455      1   0.351
12   1317        0      1  -0.098  11.064   455      1   0.351
13   1317        0      1  -0.578  11.502   455      1   0.351
14   1317        1      1   0.082  13.373   455      1   0.351
15   1317        1      1   0.122   7.142   455      1   0.351
16   1317        1      1   0.132  18.362   455      1   0.351
17   1317        1      1   0.132  20.261   455      1   0.351
18   1317        1      1   0.272   3.220   455      1   0.351
19   1317        1      1   0.302   6.973   455      1   0.351
20   1317        1      1   0.322  10.394   455      1   0.351
21   1317        1      1   0.362  21.405   455      1   0.351
22   1317        1      1   0.472   9.257   455      1   0.351
23   1317        1      1   0.482  12.283   455      1   0.351
24   1317        1      1   0.482  21.405   455      1   0.351
25   1317        1      1   0.492   8.382   455      1   0.351
26   1317        1      1   0.612  10.956   455      1   0.351
27   1317        1      1   0.642  17.246   455      1   0.351
28   1317        1      1   0.722  11.027   455      1   0.351
29   1317        1      1   0.832   4.810   455      1   0.351
30   1317        1      1   0.932   8.961   455      1   0.351
31   1317        1      1   0.942  23.736   455      1   0.351
32   1317        1      1   0.952   9.337   455      1   0.351
33   1317        1      1   0.972  11.794   455      1   0.351
34   1317        1      1   1.152  20.039   455      1   0.351
35   1317        1      1   1.462  10.661   455      1   0.351
36   1317        1      1  -0.008  10.066   455      1   0.351
37   1317        1      1  -0.008  11.290   455      1   0.351
38   1317        1      1  -0.028   7.031   455      1   0.351
39   1317        1      1  -0.068  17.869   455      1   0.351
40   1317        1      1  -0.088   8.057   455      1   0.351
41   1317        1      1  -0.108  10.121   455      1   0.351
42   1317        1      1  -0.108  10.493   455      1   0.351
43   1317        1      1  -0.108  17.203   455      1   0.351
44   1317        1      1  -0.158   4.756   455      1   0.351
45   1317        1      1  -0.258  15.555   455      1   0.351
46   1317        1      1  -0.288  21.405   455      1   0.351
47   1317        1      1  -0.848  11.531   455      1   0.351
48   1317        1      1  -1.248   8.253   455      1   0.351
49   1374        0      0   0.322  16.663  2400      0  -0.007
50   1374        0      0   0.362  24.041  2400      0  -0.007
19
0.0 0.2 0.4 0.6 0.8 1.0 2000 6000 Fraction of 1720 (NA: 0 )
0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.4 0.8 Fraction of 1720 (NA: 0 )
0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.4 0.8 Fraction of 1720 (NA: 0 )
0.0 0.2 0.4 0.6 0.8 1.0
1 Fraction of 1720 (NA: 0 )
0.0 0.2 0.4 0.6 0.8 1.0 5 15 25 Fraction of 1720 (NA: 0 )
0.0 0.2 0.4 0.6 0.8 1.0 500 1500 Fraction of 1720 (NA: 0 )
0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.4 0.8 Fraction of 1720 (NA: 0 )
0.0 0.2 0.4 0.6 0.8 1.0
0.0 0.5 Fraction of 1720 (NA: 0 )
N = 1720 Nmiss = 0 10 20 30 40 50 60
C4511 C6074 C6366 C6469 C9021 C9104 C9347 C9586 P4642 P6415 P9158
Figure 1: Uniform quantile plots of high school data
Figure 2: "Trellis" plot of high school data (mathach against ses, one panel per school) with least-squares lines for each school.
Figure 3: Trellis plot of high school data (mathach against ses, one panel per school) with sex of students (male, female).
[Figure: Public School 4458, MathAch plotted against SES.]

Let $Y_i$ and $X_i$ be the math achievement score and SES, respectively, of the $i$th student. We can formulate the model:

$$Y_i = \beta_0 + \beta_1 X_i + r_i$$

where $\beta_1$ is the average change in math achievement for a unit change in SES, $\beta_0$ is the expected math achievement at SES = 0, and $r_i$ is the random deviation of the $i$th student from the expected (linear) pattern. We assume the $r_i$ to be iid $N(0, \sigma^2)$.
> summary(lm(MathAch ~ SES, zz1))
Coefficients:
              Value Std. Error t value Pr(>|t|)
(Intercept)  6.9992     1.2380  5.6534   0.0000
        SES  1.1318     1.0074  1.1235   0.2671

Residual standard error: 4.463 on 46 degrees of freedom
Multiple R-Squared: 0.02671
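The quantities in this output are tied together by simple arithmetic, which the short Python sketch below checks using only the printed values. The relation R² = t² / (t² + residual df) holds for a regression with a single predictor; the variable names are illustrative.

```python
# Values copied from the summary(lm(MathAch ~ SES, zz1)) output.
estimate = {"(Intercept)": 6.9992, "SES": 1.1318}
std_error = {"(Intercept)": 1.2380, "SES": 1.0074}
df_resid = 46

# Each t value is the estimate divided by its standard error.
t_value = {name: estimate[name] / std_error[name] for name in estimate}

# With one predictor, R^2 can be recovered from the slope's t value.
t_ses = t_value["SES"]
r_squared = t_ses ** 2 / (t_ses ** 2 + df_resid)

print(t_value)
print(r_squared)
```

Small discrepancies in the last printed digit are expected because the inputs are themselves rounded.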
… $\beta_1 = 0$, so we would accept …

$$Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + r_{ij}$$

[Figure: b0 plotted against b(SES).]
> summary(lm(MathAch ~ SES * Sector, dataset))
Coefficients:
               Value Std. Error t value Pr(>|t|)
(Intercept)   6.9992     1.1618  6.0244   0.0000
        SES   1.1318     0.9454  1.1972   0.2348
     Sector  11.3997     1.6096  7.0823   0.0000
 SES:Sector   0.7225     1.5340  0.4710   0.6390

Residual standard error: 4.188 on 79 degrees of freedom
Multiple R-Squared: 0.7418
F-statistic: 75.67 on 3 and 79 degrees of freedom, the p-value is 0

Correlation of Coefficients:
           (Intercept)     SES  Sector
       SES      0.8540
    Sector     -0.7218 -0.6164
SES:Sector     -0.5263 -0.6163 -0.0410

> fit$contrasts
$Sector:
         Catholic
  Public        0
Catholic        1
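To read the interaction model, it can help to turn the four coefficients into one fitted line per school. The Python sketch below does this with the printed estimates and the coding shown by fit$contrasts (Public = 0, Catholic = 1); the function names are illustrative.

```python
# Coefficients copied from summary(lm(MathAch ~ SES * Sector, dataset)).
coef = {"(Intercept)": 6.9992, "SES": 1.1318, "Sector": 11.3997, "SES:Sector": 0.7225}

# Coding shown by fit$contrasts: Public = 0, Catholic = 1.
sector_code = {"Public": 0, "Catholic": 1}


def school_line(sector):
    """Return (intercept, slope) of the fitted line for one sector."""
    s = sector_code[sector]
    intercept = coef["(Intercept)"] + s * coef["Sector"]
    slope = coef["SES"] + s * coef["SES:Sector"]
    return intercept, slope


def predict(sector, ses):
    """Expected MathAch at a given SES for one sector."""
    intercept, slope = school_line(sector)
    return intercept + slope * ses


print(school_line("Public"))
print(school_line("Catholic"))
```

The Public line has the same intercept and slope as the single-school fit for school 4458 reported earlier, so the Sector and SES:Sector coefficients are the differences in intercept and slope between the two schools.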
[Figure: MathAch plotted against SES.]

Figure 4: The figure on the left (MathAch against SES) shows the mean lines from each sector. The figure on the right (b0 against b(SES)) shows each estimated intercept and slope from each school (i.e. $\hat\beta_{0j}$ and $\hat\beta_{1j}$).
1 In some hierarchical modeling traditions the numbering of levels is reversed, going from the top down instead of from the bottom up. One needs to check which approach an author is using.
Figure 14: The blue dispersion ellipse with matrix … is almost coincident with the dispersion ellipse with matrix $T$.
2 Between-school variables are not limited to indicator variables. Any variable suitable as a predictor in a linear model can be used as long as it is a function of schools, i.e. has the same value for every subject within each school.
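As a sketch of how a between-school variable enters the model, write $W_j$ for any school-level predictor. The symbols $W_j$, $u_{0j}$ and $u_{1j}$ are notation assumed here, following common multilevel conventions; the $\gamma$ coefficients follow the numbering used in these notes.

```latex
\begin{aligned}
\text{Within school:}\quad  & Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + r_{ij} \\
\text{Between schools:}\quad & \beta_{0j} = \gamma_{00} + \gamma_{01} W_j + u_{0j} \\
                             & \beta_{1j} = \gamma_{10} + \gamma_{11} W_j + u_{1j} \\
\text{Combined:}\quad & Y_{ij} = \gamma_{00} + \gamma_{01} W_j + \gamma_{10} X_{ij}
   + \gamma_{11} W_j X_{ij} + u_{0j} + u_{1j} X_{ij} + r_{ij}
\end{aligned}
```

With $W_j$ equal to the sector indicator and $X_{ij}$ equal to SES, the four $\gamma$ terms correspond to the Intercept, SECTOR, SES and SES*SECTOR rows of the fixed-effects output. A continuous school-level variable such as meanses could take the place of $W_j$ in the same way.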
3 One ironic twist concerns small estimated values of …
4 To use the HS data set, download the self-extracting file from the course website. Save it in a convenient directory. Click on its icon to create the SAS data set HS.SD2. From SAS, create a library named MIXED that points to this directory. You can then use the data set using the syntax in this example.
                         Standard
Effect        Estimate      Error     DF   t Value   Pr > |t|
Intercept      11.6997     0.4282   31.9     27.32     <.0001
SES             2.8919     0.2659   44.7     10.88     <.0001
SECTOR          2.4715     0.7033   31.2      3.51     0.0014
SES*SECTOR     -1.1004     0.4492   14.8     -2.45     0.0273

Type 3 Tests of Fixed Effects
              Num    Den
Effect         DF     DF   F Value   Pr > F
SES             1   44.7    118.30   <.0001
SECTOR          1   31.2     12.35   0.0014
SES*SECTOR      1   14.8      6.00   0.0273

                                              Standard
Label                              Estimate      Error     DF   t Value   Pr > |t|
Sector diff at 10-pctile of SES      3.4497     0.8596   40.7      4.01     0.0003

Estimates
Label                              Alpha    Lower    Upper
Sector diff at 10-pctile of SES     0.05   1.7132   5.1862
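An ESTIMATE statement of this kind reports a linear combination of the fixed effects. The Python sketch below shows how the sector difference at a chosen SES value is assembled from the SECTOR and SES*SECTOR estimates, and checks the internal arithmetic of the reported row. The SES value behind "10-pctile" is not printed in the output, so the code backs it out from the estimate; that value is an inference, not a figure from the source.

```python
# Fixed-effect estimates copied from the PROC MIXED output.
gamma = {"Intercept": 11.6997, "SES": 2.8919, "SECTOR": 2.4715, "SES*SECTOR": -1.1004}


def sector_difference(ses):
    """Difference in expected MATHACH (SECTOR = 1 minus SECTOR = 0) at a given SES."""
    return gamma["SECTOR"] + ses * gamma["SES*SECTOR"]


# Reported ESTIMATE row and confidence limits.
estimate, std_error = 3.4497, 0.8596
lower, upper = 1.7132, 5.1862

# SES value implied by the reported estimate (inferred, not stated in the output).
implied_ses = (estimate - gamma["SECTOR"]) / gamma["SES*SECTOR"]

# t value = estimate / standard error.
t_value = estimate / std_error

# The interval half-width divided by the standard error recovers the t multiplier.
t_multiplier = (upper - lower) / (2 * std_error)

print(implied_ses, t_value, t_multiplier)
```

Because the SES*SECTOR coefficient is negative, the estimated sector difference shrinks as SES increases, which is why the difference at a low SES value exceeds the SECTOR coefficient itself.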
$\ldots_{k+1,k+1}$, can only take non-negative values. The …
Critical values for mixture of two Chi-Squares with df = q and q-1

 df     0.1    0.05    0.01   0.005   0.001  0.0005  0.0001  1e-005
  1    1.64    2.71    5.41    6.63    9.55   10.83   13.83   18.19
  2    3.81    5.14    8.27    9.63   12.81   14.18   17.37   21.94
  3    5.53    7.05   10.50   11.97   15.36   16.80   20.15   24.91
  4    7.09    8.76   12.48   14.04   17.61   19.13   22.61   27.54
  5    8.57   10.37   14.32   15.97   19.69   21.27   24.88   29.96
  6   10.00   11.91   16.07   17.79   21.66   23.29   27.02   32.24
  7   11.38   13.40   17.76   19.54   23.55   25.23   29.06   34.41
  8   12.74   14.85   19.38   21.23   25.37   27.10   31.03   36.51
  9   14.07   16.27   20.97   22.88   27.13   28.91   32.94   38.53

Critical values for a single Chi-Square with the given df

 df     0.1    0.05    0.01   0.005   0.001  0.0005  0.0001  1e-005
  1    2.71    3.84    6.63    7.88   10.83   12.12   15.14   19.51
  2    4.61    5.99    9.21   10.60   13.82   15.20   18.42   23.03
  3    6.25    7.81   11.34   12.84   16.27   17.73   21.11   25.90
  4    7.78    9.49   13.28   14.86   18.47   20.00   23.51   28.47
  5    9.24   11.07   15.09   16.75   20.52   22.11   25.74   30.86
  6   10.64   12.59   16.81   18.55   22.46   24.10   27.86   33.11
  7   12.02   14.07   18.48   20.28   24.32   26.02   29.88   35.26
  8   13.36   15.51   20.09   21.95   26.12   27.87   31.83   37.33
  9   14.68   16.92   21.67   23.59   27.88   29.67   33.72   39.34
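The first table can be checked with a short calculation. The Python sketch below assumes the mixture is an equal (50:50) mix of a chi-square with q degrees of freedom and one with q-1 degrees of freedom, with df = 0 treated as a point mass at zero; the equal weights are an assumption that appears consistent with the tabulated values. The function names are illustrative, and the upper-tail formulas are the standard closed forms for integer degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(chi-square with df > x) for integer df >= 0."""
    if df == 0:
        return 1.0 if x < 0 else 0.0  # point mass at zero
    if x <= 0:
        return 1.0
    half = x / 2
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: twice the normal upper tail plus a finite series.
    chi = math.sqrt(x)
    density = math.exp(-half) / math.sqrt(2 * math.pi)  # normal density at chi
    total = math.erfc(chi / math.sqrt(2))
    term = chi
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += 2 * density * term
    return total


def mixture_p_value(x, q):
    """P-value under an equal mixture of chi-square(q) and chi-square(q - 1)."""
    return 0.5 * chi2_sf(x, q) + 0.5 * chi2_sf(x, q - 1)


# Tabulated 0.05 critical values for q = 1, 2, 3 should give p-values near 0.05.
for q, crit in [(1, 2.71), (2, 5.14), (3, 7.05)]:
    print(q, crit, mixture_p_value(crit, q))
```

For a given test statistic the mixture p-value is smaller than the p-value from a chi-square with q degrees of freedom, which is why the mixture critical values in the first table are lower than the ordinary ones in the second.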
The GLM Procedure
Dependent Variable: MATHACH

                                  Sum of
Source               DF          Squares     Mean Square   F Value   Pr > F
Model                 …
Error              1679      62701.74930        37.34470
Corrected Total    1719      82428.61743

R-Square     Coeff Var     Root MSE     MATHACH Mean
…

Source     DF        Type I SS     Mean Square   F Value   Pr > F
SES         1      11770.23946     11770.23946    315.18   <.0001
SCHOOL     39       7956.62868       204.01612      5.46   <.0001

Source     DF      Type III SS     Mean Square   F Value   Pr > F
SES         1      4171.108312     4171.108312    111.69   <.0001
SCHOOL     39      7956.628677      204.016120      5.46   <.0001

                                    Standard
Parameter           Estimate           Error   t Value   Pr > |t|
Intercept        13.41999619 B    0.80723095     16.62     <.0001
SES               2.32422573      0.21992118     10.57     <.0001
SCHOOL 1317      -1.04494131 B    1.18939271     -0.88     0.3798
SCHOOL 1374      -3.66214705 B    1.40930069     -2.60     0.0094

Estimated G Matrix
Row   Effect      Subject      Col1       Col2
  1   Intercept         1    3.5693     0.5808
  2   SES               1    0.5808

Estimated Inv(G) Matrix
Row   Effect      Subject      Col1       Col2
  1   Intercept         1               1.7218
  2   SES               1    1.7218   -10.5818

Estimated Chol(G) Matrix
Row   Effect      Subject      Col1       Col2
  1   Intercept         1    1.8893
  2   SES               1    0.3074

Estimated G Correlation Matrix
Row   Effect      Subject      Col1       Col2
  1   Intercept         1    1.0000     1.0000
  2   SES               1    1.0000     1.0000


Estimated G Matrix
Row   Effect      Subject      Col1       Col2
  1   SES               1   0.07119     0.5014
  2   Intercept         1    0.5014     3.5309

Estimated Inv(G) Matrix
Row   Effect      Subject      Col1       Col2
  1   SES               1   14.0467
  2   Intercept         1

Estimated Chol(G) Matrix
Row   Effect      Subject      Col1       Col2
  1   SES               1    0.2668
  2   Intercept         1    1.8791

Estimated G Correlation Matrix
Row   Effect      Subject      Col1       Col2
  1   SES               1    1.0000     1.0000
  2   Intercept         1    1.0000     1.0000
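
A G correlation of exactly 1 means the estimated G matrix is singular: the Cholesky factor has only one non-zero column. The following sketch, which is not part of the original notes, rebuilds G from the second Chol(G) listing above. It assumes the blank cells in the listing are zeros.

```python
import math

# Chol(G) from the listing (rows: SES, Intercept); blank cells assumed to be 0
chol = [[0.2668, 0.0],
        [1.8791, 0.0]]

# G = L L'
G = [[sum(chol[i][k] * chol[j][k] for k in range(2)) for j in range(2)]
     for i in range(2)]

# Correlation between the random SES slope and the random intercept
corr = G[0][1] / math.sqrt(G[0][0] * G[1][1])

for row in G:
    print([round(v, 5) for v in row])
print("correlation:", round(corr, 4))
```

Up to rounding, this reproduces the Estimated G Matrix (0.07119, 0.5014, 3.5309). The data do not support two separate random effects here, because the fitted G has rank 1.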

Solution for Fixed Effects
                            Standard
Effect          Estimate       Error      DF   t Value   Pr > |t|   Alpha
Intercept        11.6996      0.4285    1225     27.31     <.0001    0.05
SES               2.8917      0.2659    1716     10.88     <.0001    0.05
SECTOR            2.4716      0.7037    1250      3.51     0.0005    0.05
SES*SECTOR       -1.1003      0.4492    1716     -2.45     0.0144    0.05

5 Models fitted with METHOD = REML, the default, can be compared only if they do not differ in their MODEL statements, i.e. only the RANDOM and REPEATED models differ. If the models also differ in their MODEL statements, full likelihood must be used with METHOD = ML.

Estimated Chol(G) Matrix
Row   Effect      Subject      Col1       Col2
  1   ses_adj           1    0.1351
  2   Intercept         1    1.3193    0.02311

Estimated G Correlation Matrix
Row   Effect      Subject      Col1       Col2
  1   ses_adj           1    1.0000     0.9998
  2   Intercept         1    0.9998     1.0000

                                 Standard
Effect               Estimate       Error      DF   t Value   Pr > |t|   Alpha
Intercept             11.7392      0.3511    1130     33.44     <.0001    0.05
ses_adj                2.6815      0.2736    1672      9.80     <.0001    0.05
ses_mean               6.5073      0.9292    1115      7.00     <.0001    0.05
SECTOR                 1.4300      0.7730    1044      1.85     0.0646    0.05
ses_adj*SECTOR        -1.0102      0.4600    1672     -2.20     0.0282    0.05
ses_mean*SECTOR       -1.9568      1.7190    1022     -1.14     0.2552    0.05

Contrast
           Num    Den
Label       DF     DF   F Value   Pr > F
sector       3   1201      2.75   0.0415

[Figure 15 plot: Y against Age (8 to 14), one panel per child. Boys: M12 to M27; girls: F1 to F11.]
Figure 15: Potthoff and Roy (1964) data on growth of jaw sizes of 16 boys and 11 girls. Note possibly anomalous case M20.

[Figure 16 plot: viq against years post coma.]
Figure 16: Recovery of verbal IQ after coma.

[Figure 17 plot: viq against months post coma.]
Figure 17: Recovery of verbal IQ post coma (first four years).
6 Spatial covariance structures are designed for geographic applications where the correlation between observations is a function of their spatial distance. This model applied to the
7 Note that the times and the number of times, hence the indices, can change from subject to subject, but σ² and ρ have the same value.
9 From a SAS FAQ at the University of Texas: http://www.utexas.edu/cc/faqs/stat/sas/sas94.html

Estimated G Matrix
Row   Effect      Person      Col1
  1   Intercept        1    3.0306

Estimated Chol(G) Matrix
Row   Effect      Person      Col1
  1   Intercept        1    1.7409

Covariance Parameter Estimates
                                  Standard         Z
Cov Parm    Subject   Estimate       Error     Value     Pr Z
FA(1,1)     Person      1.7409      0.2744      6.35   <.0001
Residual                1.8746      0.2946      6.36   <.0001

Fit Statistics
AIC (smaller is better)     440.6
AICC (smaller is better)    441.5
BIC (smaller is better)     448.4

Estimated V Matrix for Person 1
Row      Col1      Col2      Col3      Col4
  1    4.9052    3.0306    3.0306    3.0306
  2    3.0306    4.9052    3.0306    3.0306
  3    3.0306    3.0306    4.9052    3.0306
  4    3.0306    3.0306    3.0306    4.9052

Estimated V Correlation Matrix for Person 1
Row      Col1      Col2      Col3      Col4
  1    1.0000    0.6178    0.6178    0.6178
  2    0.6178    1.0000    0.6178    0.6178
  3    0.6178    0.6178    1.0000    0.6178
  4    0.6178    0.6178    0.6178    1.0000

Null Model Likelihood Ratio Test
DF    Chi-Square    Pr > ChiSq
 1         49.60        <.0001

Solution for Fixed Effects
                                     Standard
Effect        Gender   Estimate         Error     DF   t Value   Pr > |t|
Intercept               16.3406        0.9631     25     16.97     <.0001
Gender        F          1.0321        1.5089     79      0.68     0.4960
Gender        M               0             .      .         .          .
Age                      0.7844       0.07654     79     10.25     <.0001
Age*Gender    F         -0.3048        0.1199     79     -2.54     0.0130
Age*Gender    M               0             .      .         .          .

Estimate
                          Standard
Label         Estimate       Error     DF   t Value   Pr > |t|
gap ap 14      -3.2355      0.8162     79     -3.96     0.0002
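
The V matrix in this listing is what a random-intercept model implies: every pair of occasions shares the intercept variance, and the residual variance is added on the diagonal. The sketch below is not part of the original notes. It rebuilds V and the common correlation from the printed estimates, with variable names chosen for illustration.

```python
chol_g = 1.7409      # FA(1,1): Cholesky factor of G
tau00 = 3.0306       # Estimated G Matrix (intercept variance)
sigma2 = 1.8746      # Residual variance
n = 4                # occasions per person

# V = tau00 * J + sigma2 * I  (compound symmetry)
V = [[tau00 + (sigma2 if i == j else 0.0) for j in range(n)] for i in range(n)]

# Common within-person correlation (intraclass correlation)
icc = tau00 / (tau00 + sigma2)

print("Chol(G)^2 :", round(chol_g ** 2, 4))
print("V diagonal:", round(V[0][0], 4), " off-diagonal:", round(V[0][1], 4))
print("V corr    :", round(icc, 4))
```

The squared FA(1,1) estimate matches the G matrix entry up to rounding. The diagonal and the correlation agree with the Estimated V and V Correlation matrices above.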

Estimated R Matrix for Person 1
Row      Col1      Col2      Col3      Col4
  1    5.1192    2.4409    3.6105    2.5222
  2    2.4409    3.9279    2.7175    3.0624
  3    3.6105    2.7175    5.9798    3.8235
  4    2.5222    3.0624    3.8235    4.6180

Estimated R Correlation Matrix for Person 1
Row      Col1      Col2      Col3      Col4
  1    1.0000    0.5443    0.6526    0.5188
  2    0.5443    1.0000    0.5607    0.7190
  3    0.6526    0.5607    1.0000    0.7276
  4    0.5188    0.7190    0.7276    1.0000

Covariance Parameter Estimates
                                  Standard         Z
Cov Parm    Subject   Estimate       Error     Value     Pr Z
UN(1,1)     Person      5.1192      1.4169      3.61   0.0002
UN(2,1)     Person      2.4409      0.9835      2.48   0.0131
UN(2,2)     Person      3.9279      1.0824      3.63   0.0001
UN(3,1)     Person      3.6105      1.2767      2.83   0.0047
UN(3,2)     Person      2.7175      1.0740      2.53   0.0114
UN(3,3)     Person      5.9798      1.6279      3.67   0.0001
UN(4,1)     Person      2.5222      1.0649      2.37   0.0179
UN(4,2)     Person      3.0624      1.0135      3.02   0.0025
UN(4,3)     Person      3.8235      1.2508      3.06   0.0022
UN(4,4)     Person      4.6180      1.2573      3.67   0.0001

Fit Statistics
AIC (smaller is better)     447.5
AICC (smaller is better)    452.0
BIC (smaller is better)     465.6

Null Model Likelihood Ratio Test
DF    Chi-Square    Pr > ChiSq
 9         58.76        <.0001

Solution for Fixed Effects
                                     Standard
Effect        Gender   Estimate         Error     DF   t Value   Pr > |t|
Intercept               15.8423        0.9356     25     16.93     <.0001
Gender        F          1.5831        1.4658     25      1.08     0.2904
Gender        M               0             .      .         .          .
Age                      0.8268       0.07911     25     10.45     <.0001
Age*Gender    F         -0.3504        0.1239     25     -2.83     0.0091
Age*Gender    M               0             .      .         .          .

Type 3 Tests of Fixed Effects
              Num    Den
Effect         DF     DF   F Value   Pr > F
Gender          1     25      1.17   0.2904
Age             1     25    110.54   <.0001
Age*Gender      1     25      7.99   0.0091

Estimates
                          Standard
Label         Estimate       Error     DF   t Value   Pr > |t|
gap at 14      -3.3231      0.8403     25     -3.95     0.0006
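
The unstructured model can be compared with the simpler random-intercept fit (Null Model Likelihood Ratio Test: 1 df, chi-square 49.60) because the two fits differ only in their covariance structure. The sketch below is not part of the original notes. It assumes both Null Model Likelihood Ratio Tests are computed against the same null model (same fixed effects, independent errors, REML), so that the difference of the two statistics is the likelihood ratio statistic for the random-intercept model against the unstructured model. The p-value uses an ordinary chi-square reference.

```python
import math

def chi2_sf(x, df):
    """P(X > x) for a chi-square with integer df >= 1 and x > 0."""
    y = x / 2.0
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(y))
    else:
        a, q = 0.0, 0.0
    while 2 * a < df:
        # Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
        q += y ** a * math.exp(-y) / math.gamma(a + 1)
        a += 1
    return q

# Null Model Likelihood Ratio Tests copied from the two listings
lrt_cs, df_cs = 49.60, 1    # random intercept (compound symmetry)
lrt_un, df_un = 58.76, 9    # unstructured

lrt = lrt_un - lrt_cs       # statistic for CS within UN
df = df_un - df_cs
p_value = chi2_sf(lrt, df)

print("LRT =", round(lrt, 2), " df =", df, " p =", round(p_value, 3))
```

Under these assumptions the eight extra covariance parameters are not supported by the data, which agrees with the smaller AIC of the random-intercept model (440.6 against 447.5).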

Estimated R Matrix for Person 1
Row        Col1        Col2        Col3        Col4
  1      1.8174     -0.1179    0.007655    -0.00050
  2     -0.1179      1.8174     -0.1179    0.007655
  3    0.007655     -0.1179      1.8174     -0.1179
  4    -0.00050    0.007655     -0.1179      1.8174

Estimated R Correlation Matrix for Person 1
Row        Col1        Col2        Col3        Col4
  1      1.0000    -0.06490    0.004212    -0.00027
  2    -0.06490      1.0000    -0.06490    0.004212
  3    0.004212    -0.06490      1.0000    -0.06490
  4    -0.00027    0.004212    -0.06490      1.0000

Estimated G Matrix
Row   Effect      Person      Col1
  1   Intercept        1    3.0904

Estimated V Matrix for Person 1
Row      Col1      Col2      Col3      Col4
  1    4.9078    2.9724    3.0980    3.0899
  2    2.9724    4.9078    2.9724    3.0980
  3    3.0980    2.9724    4.9078    2.9724
  4    3.0899    3.0980    2.9724    4.9078

Estimated V Correlation Matrix for Person 1
Row      Col1      Col2      Col3      Col4
  1    1.0000    0.6057    0.6312    0.6296
  2    0.6057    1.0000    0.6057    0.6312
  3    0.6312    0.6057    1.0000    0.6057
  4    0.6296    0.6312    0.6057    1.0000

Covariance Parameter Estimates
                                  Standard         Z
Cov Parm    Subject   Estimate       Error     Value     Pr Z
FA(1,1)     Person      1.7579      0.2744      6.41   <.0001
AR(1)       Person    -0.06490      0.1612     -0.40   0.6872
Residual                1.8174      0.3078      5.91   <.0001

Asymptotic Correlation Matrix of Estimates
Row   Cov Parm      CovP1      CovP2      CovP3
  1   FA(1,1)      1.0000    -0.1410    -0.1147
  2   AR(1)       -0.1410     1.0000     0.3728
  3   Residual    -0.1147     0.3728     1.0000

Fit Statistics
AIC (smaller is better)     442.5
AICC (smaller is better)    443.6

                                     Standard
Effect        Gender   Estimate         Error     DF   t Value   Pr > |t|
Intercept               16.3140        0.9388     25     17.38     <.0001
Gender        F          1.0648        1.4709     79      0.72     0.4713
Gender        M               0             .      .         .          .
Age                      0.7862       0.07400     79     10.62     <.0001
Age*Gender    F         -0.3072        0.1159     79     -2.65     0.0097
Age*Gender    M               0             .      .         .          .

Type 3 Tests of Fixed Effects
              Num    Den
Effect         DF     DF   F Value   Pr > F
Gender          1     79      0.52   0.4713
Age             1     79    119.11   <.0001
Age*Gender      1     79      7.02   0.0097

Estimates
                          Standard
Label         Estimate       Error     DF   t Value   Pr > |t|
gap at 14      -3.2358      0.8113     79     -3.99     0.0001
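
The R and V matrices of this fit can be rebuilt from the three covariance parameter estimates. The sketch below is not part of the original notes. It uses the usual AR(1) form, with entry (i, j) of R equal to the residual variance times the autocorrelation raised to the power |i - j|, and adds the random-intercept variance to every entry to obtain V. The variable names are illustrative.

```python
sigma2 = 1.8174     # Residual
rho = -0.06490      # AR(1)
tau00 = 3.0904      # Estimated G Matrix (intercept variance)
n = 4

# R[i][j] = sigma2 * rho ** |i - j|
R = [[sigma2 * rho ** abs(i - j) for j in range(n)] for i in range(n)]

# V = Z G Z' + R; with a random intercept, Z G Z' adds tau00 to every entry
V = [[tau00 + R[i][j] for j in range(n)] for i in range(n)]

print("R row 1:", [round(v, 6) for v in R[0]])
print("V row 1:", [round(v, 4) for v in V[0]])
```

The results agree with the listing to within the rounding of the printed estimates. Because the AR(1) estimate is small relative to its standard error, V is close to the compound-symmetric matrix of the plain random-intercept model.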

/* Dropping the 3rd occasion of the first subject */
data mixed.prmiss;
  set mixed.pr;
  if _n_ = 3 then delete;
run;

proc print data = mixed.prmiss;
run;

/* Now we try to use repeated in various ways */

/* Just using repeated with type = AR(1) will use the 3 occasions for
   subject 1 as if they were occasions 1, 2 and 3. This can be seen by
   checking the output for VCORR, the correlation matrix for the 1st subject. */
proc mixed data = mixed.prmiss;
  class Person Gender;
  model y = Gender Age Gender*Age / s influence (effect = Person iter = 3) residual;
  repeated / subject = Person type = AR(1);
  random intercept / type = un subject = Person g gc v vcorr;
  estimate 'gap ap 14' Gender 1 -1 Age*Gender 14 -14;
run;

/* Using a categorical version of Age to label occasions:
   We create a new variable AGEC which is declared as a CLASS variable
   in PROC MIXED. It appears as the "effect" in the REPEATED statement.
   Check the output of VCORR and compare it with the output
   for the first program above. */
data prmiss;
  set mixed.prmiss;
  agec = age;  /* REPEATED statement requires a categorical variable to identify occasions */
run;

proc mixed data = prmiss;
  class Person Gender agec;
  model y = Gender Age Gender*Age / s influence (effect = Person iter = 3) residual;
  repeated agec / subject = Person type = AR(1);
  random intercept / type = un subject = Person g gc v vcorr;
  estimate 'gap ap 14' Gender 1 -1 Age*Gender 14 -14;
run;

/* Mathematically equivalent model using continuous AR(1).
   Note that the categorical variable AGEC is not needed here. */
proc mixed data = prmiss;
  class Person Gender;
  model y = Gender Age Gender*Age / s influence (effect = Person iter = 3) residual;
  repeated / subject = Person type = SP(POW)(Age);
  random intercept / type = un subject = Person g gc v vcorr;
  estimate 'gap ap 14' Gender 1 -1 Age*Gender 14 -14;
run;
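
The difference between the three REPEATED specifications can be seen by writing out the residual correlation matrix they imply for subject 1, who is observed at ages 8, 10 and 14 once the third occasion is dropped. The sketch below is not part of the original notes. The lag-one correlation phi = 0.5 is a hypothetical value chosen only to make the pattern visible, and the function names are illustrative.

```python
import math

ages_full = [8, 10, 12, 14]      # occasion levels in the complete data
ages_subject1 = [8, 10, 14]      # subject 1 after dropping age 12
phi = 0.5                        # hypothetical correlation between adjacent occasions

def ar1_by_position(n, phi):
    """AR(1) indexed by order of appearance (REPEATED with no effect)."""
    return [[phi ** abs(i - j) for j in range(n)] for i in range(n)]

def ar1_by_level(ages, levels, phi):
    """AR(1) indexed by the level of a CLASS variable (REPEATED agec)."""
    idx = [levels.index(a) for a in ages]
    return [[phi ** abs(i - j) for j in idx] for i in idx]

def sp_pow(times, rho):
    """Continuous AR(1): correlation rho ** |t_i - t_j| (SP(POW)(Age))."""
    return [[rho ** abs(s - t) for t in times] for s in times]

naive = ar1_by_position(len(ages_subject1), phi)
by_level = ar1_by_level(ages_subject1, ages_full, phi)
# Occasions are 2 years apart, so rho per year = sqrt(phi) gives the same matrix
continuous = sp_pow(ages_subject1, math.sqrt(phi))

for name, m in (("by position", naive), ("by level", by_level), ("SP(POW)", continuous)):
    print(name)
    for row in m:
        print("   ", [round(v, 4) for v in row])
```

Indexing by position treats ages 10 and 14 as adjacent occasions, whereas the other two forms treat them as two occasions apart. The last two matrices coincide when the SP(POW) parameter is the square root of the AR(1) parameter, which requires a non-negative autocorrelation.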
10 SAS documentation uses G instead of T, R instead of Σ, β instead of γ, and γ instead of u. Fortunately, Y, X and Z have the same meaning.