HIERARCHICAL LINEAR MODELLING
Expectation
■ After completing the workshop you will be able to:
- understand the data structure for multilevel data analysis;
- develop the appropriate models to answer specific research questions;
- utilize the HLM software to perform the relevant analyses; and
- interpret the outputs/results.
“ONCE YOU KNOW THAT HIERARCHIES EXIST, YOU SEE THEM EVERYWHERE”
Kreft and de Leeuw (1998)
Reflection:
How do we collect data for our research in education?
Nested/Hierarchical Data Structure
■ Behavioral and social data commonly have a nested or hierarchical data structure.
■ For example:
- We have variables describing individuals, and individuals are also grouped into larger units, each unit consisting of a number of individuals. We may also have variables describing the higher-level units.
- In education, students (micro-level) are grouped in classes (macro-level), and there are variables describing students and other variables describing classes.
Nested Data Structure
■ Each person might be nested within some organizational unit, such as a school or workplace.
■ These organizational units may in turn be nested within a geographical location such as a community, state, or country.
What are nested data?
■ “Nested data” refers to sub-units that are grouped (or “nested”) within larger units.
■ Often the data are observations of individuals nested within groups.
– Key: individuals within groups are more similar to one another than to individuals in other groups.
– We can empirically verify this.
■ Sometimes data are multiple observations nested within an individual.
■ Data stemming from such research designs have a multilevel or hierarchical structure.
Nested/Hierarchical Structure
■ Where else can we find hierarchical or nested structure?
Some examples of units at the macro and micro level
Three Levels of Data
Level 1
- Denotes observations at the most detailed level of the data.
- In a clustered data set, Level 1 represents the units of analysis (or subjects) in the study.
- In a repeated measures or longitudinal data set, Level 1 represents the repeated measures made on the same unit.
Note:
- The continuous dependent variable is always measured at Level 1 of the data.
Three Levels of Data
Level 2
- Represents the next level of the hierarchy.
- In clustered data sets, Level 2 observations represent clusters of units.
- In repeated measures and longitudinal data sets, Level 2 represents the units of analysis.
Three Levels of Data
Level 3
- Represents the next level of the hierarchy.
- Generally refers to clusters of units in clustered longitudinal data sets.
- Refers to clusters of Level 2 units (clusters of clusters) in three-level clustered data sets.
Section 1: What and Why HLM?
Why HLM?
There are some problems with traditional one-level approaches:
■ 1. Individual-level analysis (ignore the group)
■ 2. Group-level analysis (aggregate the data and ignore individuals)
Individual-level analysis:
■ Violates the assumption of independence, a statistical assumption required for regression or ANOVA.
■ Students’ responses within the same school are likely to be more correlated than the responses of students in different schools because they share the same environment.
■ When this assumption is violated, traditional methods produce excessive Type I errors and biased parameter estimates; hence the need for multilevel modelling (MLM).
Group-level analysis:
■ Aggregation bias: the meaning of a variable at Level 1 (e.g., individual-level SES) may not be the same as its meaning at Level 2 (e.g., school-level SES).
Why HLM?
■ Ordinary Least Squares (OLS) regression assumes each unit in a sample is an independent observation, but subjects are often not independent from their context.
■ Those from a particular setting tend to be more like each other than like those in other settings.
– Why are they in that setting?
– What characteristics do they share?
– What are their shared experiences?
■ So should the unit of analysis be the individual or the setting in which they cluster?
Why multilevel modeling?
■ Nested data are very common in education.
■ Analysis of nested data poses a unit-of-analysis problem: should we analyze the individual or the group? Unfortunately, we often can’t choose one over the other.
■ Traditional linear models offer a simple view of a complex world; they generally assume the same effects across groups.
■ If effects do differ across groups, we can explain these differences with multilevel modeling.
Unit of analysis problem: individual, group or both?
■ Example: studying what affects student retention (1000 students per school) in a group of schools (n=50). Total dataset N=50,000.
■ We can assign school-level variables to each individual, but …
– We end up estimating the standard errors for school-level variables using N=50,000.
– Yet we only have 50 different school observations, so N really equals 50.
Unit of analysis problem: individual, group or both?
■ Alternatively, we can average student data for each school so that we have 1 observation per school (N=50).
– Now we have reduced variance on our student-level variables.
– We also have variables which measure both individual student characteristics (SES) and school environment (average SES).
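The two traditional options can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records below are made-up toy values (not the retention data described above), and the names `records` and `aggregate_by_school` are hypothetical.

```python
from statistics import mean, pvariance

# Hypothetical toy records: (school_id, student_ses). Values are illustrative only.
records = [
    ("A", -1.0), ("A", -0.5), ("A", 0.0),
    ("B", -0.2), ("B", 0.3), ("B", 0.8),
    ("C", 0.4), ("C", 0.9), ("C", 1.4),
]

def aggregate_by_school(rows):
    """Collapse student rows to one mean SES per school (aggregation)."""
    by_school = {}
    for school, ses in rows:
        by_school.setdefault(school, []).append(ses)
    return {school: mean(values) for school, values in by_school.items()}

school_means = aggregate_by_school(records)

# Disaggregation treats every student as an independent observation.
n_disaggregated = len(records)
# Aggregation leaves one observation per school.
n_aggregated = len(school_means)

# Aggregation discards the within-school spread of the student-level variable.
var_students = pvariance([ses for _, ses in records])
var_school_means = pvariance(list(school_means.values()))

print(n_disaggregated, n_aggregated)
print(var_students > var_school_means)
```

The variance of the school means is smaller than the variance of the student scores because all within-school variation is averaged away, which is the "reduced variance" cost of aggregation.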
Summary
■ Methods of dealing with nested data:
– Disaggregation
– Aggregation
■ To handle dependency, use HLM
– HLM simultaneously investigates relationships within and between hierarchical levels of grouped data, making it more efficient at accounting for variance among variables at different levels than other existing analyses.
So, what is HLM?
What Is HLM?
■ HLM = Hierarchical Linear Model
■ HLM is also the name of a software package; the technique is multilevel modelling (MLM), also known as:
- 1. Multilevel linear models (in sociological research)
- 2. Mixed-effects models/random-effects models (in biometric applications)
- 3. Random-coefficient regression models (in econometrics literature)
- 4. Covariance components models (in statistical literature)
What is HLM?
■ Hierarchical Linear Modeling
– The name of a software package
– Used as a description for a broader class of models
■ Random coefficient models
■ Models designed for hierarchically nested data structures
■ Typical applications
– Hierarchically nested data structures
– Outcome at the lowest level
– Independent variables at the lowest and higher levels
When to Use HLM?
■ Nested data structure
Example:
- students nested within classrooms or schools
- employees nested within a firm
- multiple observations within individuals
■ Clustered data
This refers to data sets in which the dependent variable is measured once for each subject (the unit of analysis), and the units of analysis are grouped into, or nested within, clusters of units.
More examples….?
HLM2 and HLM3
[Diagram: two-level model (students within classrooms); three-level model (students within classrooms within schools)]
HLM Models
[Diagram: contextual model (students within classrooms); growth model (repeated measures within students)]
Prerequisite Knowledge
Prerequisite for HLM Data Analysis
■ Statistical assumptions and techniques to examine these assumptions
■ ANOVA: understand the SPSS output
■ Regression: especially multiple regression (Mreg), and the ability to interpret its results
■ Knowledge of how to aggregate data
Linear Regression Analysis
- Variables:
X = Predictor variable (we provide this)
Y = Outcome variable (we observe this)
- Parameters:
β0 = Y-intercept
β1 = Slope
ε ~ Normal random variable (με = 0, σε = ???)
Least Squares Line…
The differences between the observed and predicted values of Y are called residuals or errors.
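A minimal sketch of the least squares line and its residuals, using only the Python standard library. The data values and the function names `fit_least_squares` and `residuals` are hypothetical and chosen purely for illustration.

```python
from statistics import mean

def fit_least_squares(x, y):
    """Return (b0, b1) that minimize the sum of squared residuals."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx              # slope
    b0 = y_bar - b1 * x_bar     # Y-intercept
    return b0, b1

def residuals(x, y, b0, b1):
    """Observed Y minus predicted Y for every observation."""
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_least_squares(x, y)
res = residuals(x, y, b0, b1)
print(b0, b1)
print(sum(res))  # residuals of a least squares line sum to (numerically) zero
```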
HLM Implications for Research Design
■ Observations are not independent within classes/schools
– Students within schools tend to share similar characteristics (e.g., socioeconomic background and instructional setting)
■ Traditional linear regression (OLS) assumes:
– Correlation(ei, ej) = 0, i.e., the differences between observed and predicted Y (the residuals) are uncorrelated
■ Ignoring dependency of observations may lead to wrong conclusions
Basic Approach to Performing HLM Analysis
- 1. The researcher first specifies a model based on theory.
- 2. The researcher then determines how to measure constructs, collects data, and inputs the data into the HLM software package.
- 3. The package fits the data to the specified model and produces the results, which include:
- overall model fit statistics, and
- parameter estimates.
Steps
- 1. Clarifying the research questions
- 2. Choosing appropriate parameter estimator
- 3. Assessing the need for MLM
- 4. Building the level-1 model
- 5. Building the level-2 model
- 6. Multilevel effect size reporting
- 7. Likelihood ratio model testing
Steps in Running HLM Analysis
Three models are typically run:
- 1. Fully unconditional model
– No independent variables are specified
– Used to determine if there is sufficient variance among groups to justify using HLM (intraclass correlation)
- 2. Partially conditional model
– Predictors are added at level 1
- 3. Fully conditional model
– Level 2 (and 3) predictors are modeled on the intercept and/or slopes to determine their effects on the outcome measure or on relationships between predictors and outcome
Time for some Algebra!
■ You must learn some of the basic mathematical notation used in multilevel modeling.
– As we will see, the program HLM uses this notation to express the models that you estimate.
– Understanding these basic symbols and expressions will allow you to tackle more complex analyses, and understand other researchers’ more complex analyses.
A level-1 model: multiple students in one school (familiar OLS equation)
$Y_i = \beta_0 + \beta_1 X_i + r_i$
■ The student is viewed as having the average achievement in the school, plus a positive deviation due to SES, plus a positive or negative deviation due to the unique circumstances of the student.
■ $Y_i$ is the student’s math achievement score
■ $\beta_0$ is the average achievement within the school (intercept)
■ $\beta_1$ is the average effect of SES on achievement (slope)
■ $X_i$ is the student’s standardized SES (independent variable)
■ $r_i$ is the unique effect for student i (error term)
A level-1 model: multiple students in multiple schools
$Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + r_{ij}$
■ $Y_{ij}$ is the achievement of student i in school j
■ $\beta_{0j}$ is the average achievement within school j
■ $\beta_{1j}$ is the average effect of SES on achievement for school j
■ $X_{ij}$ is the standardized SES of student i in school j
■ $r_{ij}$ is the unique effect for student i in school j
- Now we are estimating the equation from before for each school. Each school can have a different average achievement (or intercept), and a different impact of SES on achievement (or slope).
Need to make some additional assumptions about the coefficients, because they vary
■ Student-level errors are normally distributed.
■ Gammas: we expect the average achievement for school j to be equal to the average school mean across all schools, and the slope of SES for school j to equal the average of the slopes across all schools.
■ Taus: these are the variances of the intercepts and slopes, and the covariance between them.
$r_{ij} \sim N(0, \sigma^2)$
$E(\beta_{0j}) = \gamma_{00}$, $\mathrm{Var}(\beta_{0j}) = \tau_{00}$
$E(\beta_{1j}) = \gamma_{10}$, $\mathrm{Var}(\beta_{1j}) = \tau_{11}$
$\mathrm{Cov}(\beta_{0j}, \beta_{1j}) = \tau_{01}$
Level-2 model: explaining the Level-1 coefficients
■ Since our intercepts and slopes vary by school, we can now model why they vary.
■ Suppose we hypothesize that levels of achievement and impact of SES are related to whether a school is public or Catholic.
■ We need equations for the intercept and slope to describe our hypothesis:
$\beta_{0j} = \gamma_{00} + \gamma_{01} W_j + u_{0j}$ (intercept)
$\beta_{1j} = \gamma_{10} + \gamma_{11} W_j + u_{1j}$ (slope coefficient)
where $\beta_{0j}$ is the average achievement within school j, $\beta_{1j}$ is the average effect of SES on achievement for school j, and $W_j$ indicates whether school j is Catholic.
Level-2 model (continued)
So math achievement of an individual student in school j is explained by …
■ mean achievement in public schools, plus the impact of a school being Catholic on mean achievement (if j is Catholic);
■ the effect of SES on achievement, plus the impact of a school being Catholic on how SES affects achievement (again, if j is Catholic);
■ student- and school-specific error terms.
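The verbal description above corresponds to substituting the two level-2 equations into the level-1 equation. A sketch of that substitution follows, assuming the level-1 model $Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + r_{ij}$ and a school-level indicator $W_j$ coded 1 for Catholic and 0 for public schools (the 0/1 coding is an assumption, chosen to match the sector variable in the HSB data):

```latex
\begin{aligned}
Y_{ij} &= \beta_{0j} + \beta_{1j} X_{ij} + r_{ij} \\
       &= (\gamma_{00} + \gamma_{01} W_j + u_{0j})
          + (\gamma_{10} + \gamma_{11} W_j + u_{1j}) X_{ij} + r_{ij} \\
       &= \underbrace{\gamma_{00} + \gamma_{01} W_j}_{\text{mean achievement}}
          + \underbrace{(\gamma_{10} + \gamma_{11} W_j) X_{ij}}_{\text{effect of SES}}
          + \underbrace{u_{0j} + u_{1j} X_{ij} + r_{ij}}_{\text{error terms}}
\end{aligned}
```

With this coding, $\gamma_{00}$ and $\gamma_{10}$ describe public schools, while $\gamma_{01}$ and $\gamma_{11}$ are the shifts that apply only when school j is Catholic.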
Multilevel Regression Model
Lowest (individual) level:
■ Yij = b0j + b1jXij + eij
and at the second (group) level:
■ b0j = g00 + g01Zj + u0j
■ b1j = g10 + g11Zj + u1j
Combining:
■ Yij = g00 + g10Xij + g01Zj + g11ZjXij + u1jXij + u0j + eij
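To see that the combined equation is the same model as the two-step version, the sketch below simulates data both ways and compares the results. All parameter values are hypothetical, the function name `simulate_two_level` is illustrative, and u0j and u1j are drawn independently, which is a simplifying assumption (the model itself allows them to covary).

```python
import random

def simulate_two_level(n_groups=5, n_per_group=4, g00=12.0, g10=2.0,
                       g01=1.0, g11=-0.5, sd_u0=1.5, sd_u1=0.5, sd_e=6.0,
                       seed=0):
    """Generate rows from the two-level model in two equivalent ways."""
    rng = random.Random(seed)
    rows = []
    for j in range(n_groups):
        z = rng.choice([0, 1])          # group-level predictor Zj
        u0 = rng.gauss(0.0, sd_u0)      # group deviation in the intercept
        u1 = rng.gauss(0.0, sd_u1)      # group deviation in the slope
        b0 = g00 + g01 * z + u0         # level-2 equation for the intercept
        b1 = g10 + g11 * z + u1         # level-2 equation for the slope
        for _ in range(n_per_group):
            x = rng.gauss(0.0, 1.0)     # individual-level predictor Xij
            e = rng.gauss(0.0, sd_e)    # individual-level error eij
            y_levels = b0 + b1 * x + e  # level-1 equation
            y_combined = (g00 + g10 * x + g01 * z + g11 * z * x
                          + u1 * x + u0 + e)  # combined equation
            rows.append({"group": j, "z": z, "x": x,
                         "y_levels": y_levels, "y_combined": y_combined})
    return rows

rows = simulate_two_level()
max_gap = max(abs(r["y_levels"] - r["y_combined"]) for r in rows)
print(len(rows), max_gap)  # the gap is zero up to floating-point rounding
```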
Some examples from multilevel regression modeling:
Hands-on Session with HLM Software: HLM 7 Student Version
Starting HLM
- Prepare data;
- Identify variables to be included in the model;
- Develop hypothesized model; and
- Install HLM program
HSB DATA
Our data file is a subsample from the 1982 High School and Beyond Survey and is used extensively in Hierarchical Linear Models by Raudenbush and Bryk. The data file, called hsb, consists of 7185 students nested in 160 schools. The outcome variable of interest is the student-level (level 1) math achievement score (mathach). The variable ses is the socio-economic status of a student and therefore is at the student level. The variable meanses is the school mean of ses and therefore is at the school level (level 2). The variable sector is an indicator variable showing whether a school is public or Catholic and is therefore a school-level variable. There are 90 public schools (sector=0) and 70 Catholic schools (sector=1) in the sample.
Example
■ Using HSB data
■ Questions:
- Is multilevel modeling needed for mathematics achievement scores?
- Is there a relationship between SES and student-level mathematics achievement scores?
- Does the effect of SES on mathematics achievement scores vary significantly across schools?
- Is the effect of SES on MATHACH moderated by MEANSES and SECTOR?
Inform HLM of the Input and Make MDM File
Inform HLM of the data and analysis commands
[Screenshot: steps 1 to 10 for creating the MDM file]
Choose Variables for Level_1
Data file : HSB1.sav
Choose Variables for Level_2
Data file : HSB2.sav
Specify the Model
NULL/UNCONDITIONAL MODEL
Also known as Random Effect Model
Fully Unconditional Model
■ The fully unconditional model is run with no predictors to determine whether a significant portion of the variance in achievement is between schools, which would indicate that HLM should be used to analyze these data.
Purpose of Null Model
■ It is used as the baseline model against which the results of more elaborate models are compared.
■ It can estimate the grand mean of mathematics achievement (γ00), with adjustment for clustering of students within schools and for different sample sizes across schools.
■ It can estimate variance components at the student level (σ²) and the school level (τ00).
Null/Unconditional Model
■ The null model is used for two purposes:
(1) It is the basis for calculating the intra-class correlation coefficient (ICC), which is the usual test of whether multilevel modelling is needed; and
(2) It outputs the deviance statistic (-2LL) and other coefficients used as a baseline for comparing later, more complex models.
Null Model
The level-1 model:
Yij = β0j + rij (1)
where
Yij = mathematics achievement for student i in school j,
β0j = the average mathematics achievement for school j,
rij* = the error term, representing a unique effect associated with student i in school j.
* Assumed to have a normal distribution with a mean of zero and a level-1 variance, σ².
The level-2 model:
β0j = γ00 + u0j (2)
where
γ00 = the intercept, representing the grand mean or overall average of mathematics achievement,
u0j* = the error term, representing a unique effect associated with school j.
* Assumed to have a normal distribution with a mean of zero and a level-2 variance, τ00.
Null Model
■ Combine the two equations (mixed model):
Yij = γ00 + u0j + rij
Null Model
The level-1 outcome (Yij) is a function of a random intercept (expressed as β0j in the output) and a level-1 residual error term (rij). The level-1 intercept, in turn, is a function of the grand mean (γ00) across level-2 units, which are schools in this example, plus a random error term (u0j), signifying that the intercept is modelled as a random effect. Substituting the right-hand side of the level-2 equation into the level-1 equation gives the mixed-model equation for the null random-intercept model.
Null or Unconditional Model
■ Reliability in HLM ≠ ordinary reliability
■ Reliability for the intercept in HLM indicates to what extent the intercept estimates can discriminate among schools in their average achievement.
■ Low reliability does not mean lack of precision.
Annotated Results 1
■ The reliability of the random effect of the level-1 intercept is the average reliability across the level-2 units.
■ It measures the overall reliability of the OLS estimates for each of the intercepts. The reliability estimate for this model is .901.
■ This indicates that the sample means tend to be quite reliable as indicators of the true school means.
Annotated Results 2
■ In the final estimation of fixed effects: the intercept is 12.64 (SE = .24) and differs from zero. [This value indicates the grand mean of mathematics achievement.]
■ A 95% confidence interval for the grand mean, based on its standard error, is 12.64 ± 1.96*(0.24) = (12.17, 13.11).
Annotated Results 3
■ The estimated between-school variance, τ00, corresponds to the term intercept1 in the output of the final estimation of variance components, and the estimated within-school variance, σ², corresponds to the term level-1 in the same output section. For this model, τ00 is 8.61 and σ² is 39.15.
Annotated Results 4
At the school level, τ00 is the variance of the true school means, β0j, around the grand mean, γ00. The estimated variability in these school means is τ00 = 8.61. To measure the magnitude of the variation among schools in their mean achievement levels, we can calculate the plausible values range for these means. Under the normality assumption, we would expect 95% of the school means to fall within the range:
γ00 ± 1.96 (τ00)^(1/2)
which yields 12.64 ± 1.96 (8.61)^(1/2) = (6.89, 18.39).
This indicates a substantial range in average achievement levels among schools in the sample data.
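The two ranges quoted for the null model differ only in the spread that is plugged in: the standard error of the grand mean gives a confidence interval for the grand mean, while the square root of the between-school variance gives the plausible values range for school means. A small sketch using the estimates reported above (the helper name `normal_range` is hypothetical):

```python
import math

def normal_range(center, spread, z=1.96):
    """Return (lower, upper) = center -/+ z * spread."""
    half_width = z * spread
    return center - half_width, center + half_width

gamma00 = 12.64     # grand mean from the null model output
se_gamma00 = 0.24   # standard error of the grand mean
tau00 = 8.61        # between-school variance

# 95% confidence interval for the grand mean (uses the standard error).
ci_grand_mean = normal_range(gamma00, se_gamma00)

# 95% plausible values range for school means (uses sqrt of tau00).
plausible_school_means = normal_range(gamma00, math.sqrt(tau00))

print([round(v, 2) for v in ci_grand_mean])           # [12.17, 13.11]
print([round(v, 2) for v in plausible_school_means])  # [6.89, 18.39]
```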
Final estimation of Variance
Annotated Results 5
■ Statistically significant between-school variance (variance at school level) indicates that school average mathematics achievement varies significantly across schools.
Annotated Results 6
Effective sample size
■ A higher ICC value indicates greater dependence among observations within schools
– Effective sample size is smaller than observed sample size
■ Effective n = mk / (1 + ICC*(m-1))
– where n = sample size, m = number of students per school and k = number of schools
■ If ICC=1, effective n is equal to the number of schools (k)
■ If ICC=0, effective n is equal to the observed n (i.e., mk)
■ In general, effective n lies between k and mk
■ Based on the variance component estimates, we can compute the intra-class correlation: 8.61431/(8.61431 + 39.14831) = .18.
■ This tells us the portion of the total variance that occurs between schools.
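Both formulas on this slide are short enough to check by hand or in code. The sketch below computes the ICC from the reported variance components and then the effective sample size; the function names `icc` and `effective_n` and the balanced design (m = 20, k = 50) are hypothetical choices for illustration.

```python
def icc(tau00, sigma2):
    """Share of total variance that lies between groups."""
    return tau00 / (tau00 + sigma2)

def effective_n(m, k, icc_value):
    """Effective sample size for k groups of m individuals each."""
    return (m * k) / (1 + icc_value * (m - 1))

# Variance components reported for the null model.
rho = icc(8.61431, 39.14831)
print(round(rho, 2))  # 0.18

# Hypothetical balanced design: 50 schools with 20 students each.
m, k = 20, 50
print(effective_n(m, k, 0.0))   # equals m * k when observations are independent
print(effective_n(m, k, 1.0))   # equals k when observations within a school are identical
print(effective_n(m, k, rho))   # falls between k and m * k
```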
Calculating ICC
Calculating the Intra-class Correlation coefficient (ICC)
ADDING VARIABLE AT THE STUDENT LEVEL
Random Coefficient Model
Adding Predictors at Level_1
Notes on the Results 1
■ The model we fit was
mathachij = β0j + β1j(SES - meanses) + rij
β0j = γ00 + u0j
β1j = γ10 + u1j
■ Filling in the parameter estimates we get
mathachij = β0j + β1j(SES - meanses) + rij
β0j = 12.64 + u0j
β1j = 2.19 + u1j
V(u0j) = 8.68
V(u1j) = .68
V(rij) = 36.7
■ In a single equation our model would be written as:
mathachij = γ00 + u0j + (γ10 + u1j)(SES - meanses) + rij
= γ00 + γ10*(SES - meanses) + u0j + u1j*(SES - meanses) + rij
■ The estimate for the variance of the slope for group-centered SES is 0.68. The p-value is .003. Because the test is statistically significant, we reject the hypothesis that there is no difference in slopes among schools.
■ The 95% plausible value range for the school means is 12.64 ± 1.96*(8.68)^(1/2) = (6.87, 18.41).
■ The 95% plausible value range for the SES-achievement slope is 2.19 ± 1.96*(.68)^(1/2) = (.57, 3.81).
Notes on the Results 2
■ The coefficient for the constant is the predicted math achievement when all predictors are 0; hence, when group-centered SES is 0 (that is, when a student's SES equals the school mean SES), the student's math achievement is predicted to be 12.65
Notes on the Results 3
■ Notice that the residual variance is now 36.70, compared to the residual variance of 39.15 in the one-way ANOVA with random effects (unconditional means) model.
■ We can compute the proportion of variance explained at level 1 as (39.15 - 36.70) / 39.15 = .063. This suggests that using student-level SES as a predictor of math achievement reduced the within-school variance by 6.3%.
■ The correlation between the intercept and the slope is .019. It seems that they are not highly correlated.
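The proportion of variance explained is the relative drop in a variance component between a baseline model and a model with added predictors. A minimal check of the level-1 figure above (the function name `proportion_reduction` is hypothetical):

```python
def proportion_reduction(var_baseline, var_model):
    """Relative reduction in a variance component versus the baseline model."""
    return (var_baseline - var_model) / var_baseline

sigma2_null = 39.15   # level-1 variance in the unconditional model
sigma2_ses = 36.70    # level-1 variance after adding group-centered SES

explained_level1 = proportion_reduction(sigma2_null, sigma2_ses)
print(round(explained_level1, 3))  # 0.063
```

The same function can be applied to a level-2 variance component (for example τ00) once the corresponding estimates from two models are available.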
Notes on the Results 4
Calculating Proportion of Variance Explained
THE INTERCEPT AND SLOPE AS THE OUTCOME MODEL
Final Model
This model is referred to as an intercepts- and slopes-as-outcomes model
Research Questions
■ Do MEANSES and SECTOR significantly predict the intercept?
■ Do MEANSES and SECTOR significantly predict the within-school slope?
■ How much variation in the intercepts and slopes is explained using SECTOR and MEANSES as predictors?
Interpreting the Final Results
For intercept:
■ MEANSES is positively related to school mean math achievement.
■ Catholic schools have higher mean achievement than do public schools, after controlling for the effect of MEANSES.
For slope:
■ Schools with higher MEANSES have a larger SES slope than schools with low MEANSES.
■ Catholic schools have significantly weaker SES slopes on average than do public schools.
Interpreting the Final Results
Reporting in Table
■ The estimate for the variance of the SES slope is .15 with p-value .369; hence, we fail to reject the null hypothesis that no significant variation among the SES slopes remains unexplained after controlling for the MEANSES and SECTOR effects.
■ The correlation between the level-1 intercept and the slope for SES is given as .32 from the earlier part of the output.
■ Some variation remains unexplained even after controlling for the MEANSES and SECTOR effects.
Final Model
■ Using deviance statistics
■ Using proportion of variance explained
■ Using other indicators: AIC and BIC
Assessing Model Fit
■ REML (restricted maximum likelihood) versus FML (full maximum likelihood)
– REML and FML will usually produce similar results for the level-1 residual variance (σ²), but there can be noticeable differences for the variance-covariance matrix of the random effects.
– REML is the default estimation method in HLM.
– If the number of level-2 units is large, then the difference will be small.
– If the number of level-2 units is small, then FML variance estimates will be smaller than REML estimates, leading to artificially short confidence intervals and problematic significance tests.
■ Nested models
– If the fixed effects are the same, and there are fewer random effects in the reduced model, then either REML or FML is fine.
– If one model has fewer fixed effects and possibly fewer random effects, then use FML to compare models.
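A deviance comparison of two nested models can be sketched as follows. This is an illustration under assumptions, not HLM's own routine: the deviances and parameter counts are hypothetical, the function names are invented for the example, and AIC and BIC use the conventional definitions (deviance plus a penalty). Which sample size to use in BIC for multilevel data is a matter of convention, so it is left as an argument.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        term, total = 1.0, 1.0            # i = 0 term of the finite series
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    total = 0.0
    for i in range(1, (df - 1) // 2 + 1):
        total += half ** (i - 0.5) / math.gamma(i + 0.5)
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total

def deviance_test(dev_reduced, dev_full, params_reduced, params_full):
    """Likelihood ratio test: difference in deviance (-2LL) between nested models."""
    diff = dev_reduced - dev_full
    df = params_full - params_reduced
    return diff, df, chi2_sf(diff, df)

def aic(deviance, n_params):
    return deviance + 2 * n_params

def bic(deviance, n_params, n_obs):
    return deviance + n_params * math.log(n_obs)

# Hypothetical deviances and parameter counts, for illustration only.
diff, df, p_value = deviance_test(dev_reduced=1000.0, dev_full=990.0,
                                  params_reduced=3, params_full=5)
print(diff, df, p_value)
print(aic(990.0, 5), aic(1000.0, 3))
```

As the slide notes, when the two models differ in their fixed effects the deviances being compared should come from FML estimation.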
Estimation Specification
SOME PRACTICAL ASPECTS OF MULTILEVEL MODELING
Questions to Answer
■ Can you use multilevel techniques to study your dependent variable?
■ Should you use multilevel techniques to study your dependent variable?
■ How will you center your level-1 and level-2 predictors?
■ Which of the level-1 coefficients will be explained at level-2? I.e., are they fixed or random?
■ How does my model perform?
Can I use HLM?
■ HLM requires a large amount of data.
■ Minimum:
- number of groups: 30, but most recommend 50+
- number of individuals within groups: 5-10, but can be as low as 1.
- average group size: 10; obviously more is better.
Should I Use HLM?
■ How much of the variance in your dependent variable is explained by group membership?
■ Intra-class correlation coefficient (ICC) = var between groups / (var between groups + var within groups)
■ $\rho = \tau_{00} / (\tau_{00} + \sigma^2)$
Remember, $\tau_{00}$ is the variance of the intercepts, or the school means, and $\sigma^2$ is the student-level variance.
Centering variables
■ Whether and how you center is a very important decision: interpretation of results depends on your choice.
■ Important because the intercept at level-1 is also a dependent variable.
■ Centering
– Refers to subtracting a mean from your independent variables.
– The transformed value for an individual measures how much they deviate (+/-) from the mean.
■ Suppose we center verbal SAT scores around a student mean of 500.
■ How would we interpret a regression coefficient if all variables were similarly transformed?
         Actual score   Centered score
Steve    800            300
Claire   750            250
Bill     500            0
Paul     200            -300
Centering variables
■ Why would we want to center?
– Variable may lack a natural zero point, such as SAT score.
– Stability of estimates at level-1 is affected by the location of variables.
– Location at level-2 is less important.
Centering variables
- Centering in multilevel models presents a unique challenge because different centering choices have a significant impact on how the parameter estimates are interpreted.
■ Generally two types of centering are used in HLM for a specific variable:
– Grand mean centering: subtract the mean for the entire sample from each observation in the sample.
– Group mean centering: subtract the mean for each group from each member of the group.
■ To fully understand the implications of centering, see the discussion in Raudenbush and Bryk (2002), pp. 134-149.
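The two centering options differ only in which mean is subtracted. A minimal sketch with made-up SES values for two schools (the data and the function names `grand_mean_center` and `group_mean_center` are hypothetical):

```python
from statistics import mean

def grand_mean_center(values):
    """Subtract the mean of the entire sample from each observation."""
    overall = mean(values)
    return [v - overall for v in values]

def group_mean_center(values, groups):
    """Subtract each group's own mean from the members of that group."""
    by_group = {}
    for v, g in zip(values, groups):
        by_group.setdefault(g, []).append(v)
    group_means = {g: mean(vs) for g, vs in by_group.items()}
    return [v - group_means[g] for v, g in zip(values, groups)]

# Hypothetical SES scores for three students in each of two schools.
ses = [1.0, 2.0, 3.0, 5.0, 7.0, 9.0]
school = ["A", "A", "A", "B", "B", "B"]

grand = grand_mean_center(ses)          # overall mean is 4.5
group = group_mean_center(ses, school)  # school means are 2.0 and 7.0

print(grand)  # [-3.5, -2.5, -1.5, 0.5, 2.5, 4.5]
print(group)  # [-1.0, 0.0, 1.0, -2.0, 0.0, 2.0]
```

After group mean centering every school has a mean of zero on the predictor, so differences between schools are removed from the level-1 variable; grand mean centering only shifts the scale and keeps those differences.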
Centering variables
■ Grand mean centering is scoring variables as deviations from their sample means.
■ An example would be scoring occupational status as a deviation from the mean occupational status in the entire sample, that is, scoring how high or low people are relative to the average.
■ In multivariate analyses, predictor variables that are grand-mean centered generate mathematically identical predicted values to those from the same model estimated on the original, conventionally scored variables.
Grand Mean Centering
■ Some writers still claim that grand mean centering reduces multicollinearity, particularly when the regression includes many interactions, and most especially when these are cross-level interactions (Bickel 2007; Preacher 2003).
■ Another advantage of grand mean centering is that it allows one to interpret the intercept as the predicted mean on the dependent variable when all the centered predictors are zero, that is, when the predictors are at their grand means (Paccagnella 2006).
Grand Mean Centering
■ It is also sometimes said that grand-mean centering facilitates regression coefficient interpretation, particularly for cross-level interactions when a variable is continuous (Bickel 2007; Kenny et al. 1998; Hox 2010).
■ Hox (2010) reports that convergence tends to be achieved more frequently and analyses run faster using grand-mean centering.
Grand Mean Centering
■ Group mean centering refers to scoring variables in multilevel models as deviations from the mean of their macro-level group.
■ An example would be scoring occupational status as a deviation from the mean status in the respondent’s own country.
■ In group-mean centering, Nigerian clerks would be high (because they are high compared to the average Nigerian) while Swiss clerks would be low (because they are low compared to the average Swiss).
Group Mean Centering
■ Bryk (2002) also posits that group-mean centering can reduce bias in random component variance estimates.
■ Paccagnella (2006) points to the benefit of group-mean centering for researchers interested in ‘‘separating the between-group and the within-group components from the total variation to investigate how groups (contexts) affect the student performances, explicitly accounting for the group structure into the model’’.
■ Group-mean centering (also known as within-group deviation scoring) is widely used in many disciplines, and widely recommended.
Group Mean Centering
■ Methodologists have cautioned that centering decisions should be made carefully and based on both a theoretical and a statistical rationale.
■ In general, group-mean centering of individual-level variables can create large and varied biases for higher-level variables.
■ Centering decisions in longitudinal models take on a different role than in cross-sectional models, and often involve centering around a constant rather than group or overall mean values.
■ Centering choices have differing effects on the interpretation of the intercept term.
Mean Centering
■ Under grand mean centering, the variance in the intercept term (β0j) represents the between-group variance in the outcome variable adjusted for the level-1 variables.
■ In group mean centered models, the intercept variance simply represents the between-group variance in the outcome measure.
■ While centering around the grand mean constitutes a simple linear transformation, the scores resulting from group mean centering are a nonlinear (and discontinuous) function of two variables, namely the variable which is being 'centered' and the categorical variable which expresses the grouping.
Mean Centering
■ When to use grand mean centering:
– If you are more interested in effects on individuals’ performance than in group effects.
– When the raw score does not allow a meaningful interpretation of the intercept.
■ When to use group mean centering:
– Theory says that individual and group effects are separate.
– It yields a smaller correlation between random intercept and random slope.
– It yields a smaller correlation between level-1 and level-2 variables and cross-level interactions. This will stabilize the model (coefficients are more or less independent estimates).
Grand or Mean Centering
■ It would be nice to have everything random; that is, a different set of coefficients for each group.
■ But due to HLM demands on data, usually only the intercept and a few variables can be random.
■ Important: if you randomize gender and you have a group without females, that group will be dropped.
■ Generally you should run parallel models for intercept and slopes, as in our theory example.
Fixed or Random?
■ Goodness of fit:
– Proportion of variance explained at level-1: $[\sigma^2(\text{null}) - \sigma^2(\text{full})] / \sigma^2(\text{null})$
– Variance explained at level-2: $[\tau_{00}(\text{null}) - \tau_{00}(\text{full})] / \tau_{00}(\text{null})$
Model Statistics
Some thoughts about building your models
■ Before using HLM, run OLS regressions for the whole sample and for each group.
■ Building the null model:
– This should be your first step.
– Calculate the ICC
■ Building the level-1 models:
– Should be theory driven
– Step-up approach
– Be cautious about what you leave as random; it’s often difficult to leave more than the intercept and one variable as random
■ Building the level-2 models
– Rule of thumb: 10 observations/variable
– Parallel models
■ Many scholars drop insignificant variables at both levels. (I disagree with this.)