HIERARCHICAL LINEAR MODELLING



slide-1
SLIDE 1

HIERARCHICAL LINEAR MODELLING

slide-2
SLIDE 2

Expectation

■ After completing the workshop you will be able to:

  • understand the data structure for multilevel data analysis;
  • develop the appropriate models to answer specific research questions;
  • utilize the HLM software to perform the relevant analyses; and
  • interpret the outputs/results.
slide-3
SLIDE 3

“ONCE YOU KNOW THAT HIERARCHIES EXIST, YOU SEE THEM EVERYWHERE”

Kreft and de Leeuw (1998)

slide-4
SLIDE 4

Reflection:

How do we collect data for our research in education?

slide-5
SLIDE 5

Nested/Hierarchical Data Structure

■ Behavioral and social data commonly have a nested or hierarchical data structure.
■ For example:

  • We have variables describing individuals, and individuals are also grouped into larger units, each unit consisting of a number of individuals. We may also have variables describing the higher-level units.
  • In education, students (micro-level) are grouped in classes (macro-level), and there are variables describing students and other variables describing classes.

slide-6
SLIDE 6

Nested Data Structure

■ Each person might be nested within some organizational unit, such as a school or workplace.
■ These organizational units may in turn be nested within a geographical location such as a community, state, or country.

slide-7
SLIDE 7

What are nested data?

■ “Nested data” refers to sub-units that are grouped (or “nested”) within larger units.
■ Often the data are observations of individuals nested within groups.
  – Key: individuals within groups are more similar to one another than to individuals in other groups.
  – We can empirically verify this.
■ Sometimes data are multiple observations nested within an individual.
■ Data stemming from such research designs have a multilevel or hierarchical structure.

slide-8
SLIDE 8

Nested/Hierarchical Structure

■ Where else can we find hierarchical or nested structure?

slide-9
SLIDE 9

Some examples of units at the macro and micro level
slide-10
SLIDE 10

Three Levels of Data

Level 1

  • Denotes observations at the most detailed level of the data.
  • In a clustered data set, Level 1 represents the units of analysis (or subjects) in the study.
  • In a repeated measures or longitudinal data set, Level 1 represents the repeated measures made on the same unit.

Note:

  • The continuous dependent variable is always measured at Level 1 of the data.

slide-11
SLIDE 11

Three Levels of Data

Level 2

  • Represents the next level of the hierarchy.
  • In clustered data sets, Level 2 observations represent clusters of units.
  • In repeated measures and longitudinal data sets, Level 2 represents the units of analysis.

slide-12
SLIDE 12

Three Levels of Data

Level 3

  • Represents the next level of the hierarchy.
  • Generally refers to clusters of units in clustered longitudinal data sets.
  • Refers to clusters of Level 2 units (clusters of clusters) in three-level clustered data sets.

slide-13
SLIDE 13

Section 1: What and Why HLM?

slide-14
SLIDE 14

Why HLM?

There are some problems with traditional single-level approaches:
■ 1. Individual level analysis (ignore group)
■ 2. Group level analysis (aggregate data and ignore individuals)

At group level analysis:
■ Aggregation bias = the meaning of a variable at Level-1 (e.g., individual level SES) may not be the same as the meaning at Level-2 (e.g., school level SES)

slide-15
SLIDE 15

Why HLM?

■ Individual level analysis violates the assumption of independence, a statistical assumption required if we want to perform an analysis using regression or ANOVA.
■ Students’ responses in the same school are likely to be more correlated than the responses of students in different schools because they share the same environment.
■ This violation creates the need for MLM: traditional methods produce excessive Type I errors and biased parameter estimates.

slide-16
SLIDE 16

Why HLM?

■ Ordinary Least Squares (OLS) regression assumes each unit in a sample is an independent observation, but subjects are often not independent from their context.
■ Those from a particular setting tend to be more like each other than like those in other settings.
  – Why are they in that setting?
  – What characteristics do they share?
  – What are their shared experiences?
■ So should the unit of analysis be the individual or the setting in which they cluster?

slide-17
SLIDE 17

Why multilevel modeling?

■ Nested data are very common in education.
■ Analysis of nested data poses a unit of analysis problem: should we analyze the individual or the group? Unfortunately, we often can’t choose one over the other.
■ Traditional linear models offer a simple view of a complex world: they generally assume the same effects across groups.
■ If effects do differ across groups, we can explain these differences with multilevel modeling.

slide-18
SLIDE 18

Unit of analysis problem: individual, group or both?

■ Example: studying what affects student retention (1000 students per school) in a group of schools (n=50). Total dataset N=50,000.
■ We can assign school-level variables to each individual, but …
  – We end up estimating the standard errors for school-level variables using N=50,000.
  – Yet we only have 50 different school observations, so N really equals 50.
slide-19
SLIDE 19

Unit of analysis problem: individual, group or both?

■ Alternatively, we can average student data for each school so that we have 1 observation per school (N=50).
  – Now we have reduced variance on our student-level variables.
  – We also have variables which measure both individual student characteristics (SES) and school environment (average SES).

slide-20
SLIDE 20

Summary

■ Methods of dealing with nested data:
  – Disaggregation
  – Aggregation
■ Dependency: use HLM
  – HLM simultaneously investigates relationships within and between hierarchical levels of grouped data, thereby making it more efficient at accounting for variance among variables at different levels than other existing analyses.

slide-21
SLIDE 21

So, what is HLM?

slide-22
SLIDE 22

What Is HLM?

■ HLM = Hierarchical Linear Model
■ HLM is also a software name. The approach is a form of multilevel modelling (MLM), also known as:

  • 1. Multilevel linear models (in sociological research)
  • 2. Mixed-effects models/Random-effects models (in biometric applications)
  • 3. Random-coefficient regression models (in econometrics literature)
  • 4. Covariance components models (in statistical literature)

slide-23
SLIDE 23

What is HLM?

■ Hierarchical Linear Modeling
  – The name of a software package
  – Used as a description for a broader class of models
    ■ Random coefficient models
    ■ Models designed for hierarchically nested data structures
■ Typical applications
  – Hierarchically nested data structures
  – Outcome at lowest level
  – Independent variables at the lowest + higher levels

slide-24
SLIDE 24

When to Use HLM?

■ Nested data structure

Example:

  – students nested within classrooms or schools
  – employees nested within a firm

■ Clustered data

This refers to data sets in which the dependent variable is measured once for each subject (the unit of analysis), and the units of analysis are grouped into, or nested within, clusters of units.

More examples?

  • multiple observations within individuals
slide-25
SLIDE 25

HLM2 and HLM3

Two-level model: Student nested within Classroom
Three-level model: Student nested within Classroom nested within School

slide-26
SLIDE 26

HLM Models

Contextual model: Student nested within Classroom
Growth model: Repeated measure nested within Student

slide-27
SLIDE 27

Prerequisite Knowledge

slide-28
SLIDE 28

Pre-requisite for HLM Data Analysis

■ Statistical assumptions and techniques to examine these assumptions
■ ANOVA: understand the SPSS output
■ Regression: especially multiple regression, including how to interpret its results
■ Knowledge of how to aggregate the data

slide-29
SLIDE 29

Linear Regression Analysis

  • Variables:
    X = Predictor Variable (we provide this)
    Y = Outcome Variable (we observe this)

  • Parameters:
    β0 = Y-Intercept
    β1 = Slope
    ε ~ Normal Random Variable (με = 0, σε = ???)

slide-30
SLIDE 30

Least Squares Line…

these differences are called residuals or errors

slide-31
SLIDE 31

HLM Implications for Research Design

■ Observations are not independent within classes/schools
  – Students within schools tend to share similar characteristics (e.g., socioeconomic background and instructional setting)
■ Traditional linear regression (OLS) assumes:
  – Correlation (ei, ej) = 0, i.e., the differences between observed and predicted Y (the residuals) are uncorrelated
■ Ignoring dependency of observations may lead to wrong conclusions

slide-32
SLIDE 32

Basic Approach to Performing HLM Analysis

  • 1. The researcher first specifies a model based on theory.
  • 2. The researcher then determines how to measure constructs, collects data, and then inputs the data into the HLM software package.
  • 3. The package fits the data to the specified model and produces the results, which include:
      • Overall model fit statistics, and
      • Parameter estimates.
slide-33
SLIDE 33

Steps

  • 1. Clarifying the research questions
  • 2. Choosing appropriate parameter estimator
  • 3. Assessing the need for MLM
  • 4. Building the level-1 model
  • 5. Building the level-2 model
  • 6. Multilevel effect size reporting
  • 7. Likelihood ratio model testing
slide-34
SLIDE 34

Steps in Running HLM Analysis

Three models are typically run:

  • 1. Fully unconditional model
      – No independent variables are specified
      – Used to determine if there is sufficient variance among groups to justify using HLM (intraclass correlation)
  • 2. Partially conditional model
      – Predictors are added at level 1
  • 3. Fully conditional model
      – Level 2 (and 3) predictors are modeled on the intercept and/or slopes to determine their effects on the outcome measure or on relationships between predictors and outcome

slide-35
SLIDE 35

Time for some Algebra!

■ You must learn some of the basic mathematical notations used in multilevel modeling.
  – As we will see, the program HLM uses these notations to express the models that you estimate.
  – Understanding these basic symbols and expressions will allow you to tackle more complex analyses, and understand other researchers’ more complex analyses.

slide-36
SLIDE 36

A level-1 model: multiple students in one school (familiar OLS equation)

$$Y_i = \beta_0 + \beta_1 X_i + r_i$$

  • $Y_i$ is the student’s Math achievement score
  • $\beta_0$ is average achievement within the school (intercept)
  • $\beta_1$ is the average effect of SES on achievement (slope)
  • $X_i$ is the student’s standardized SES (independent variable)
  • $r_i$ is the unique effect for student i (error term)

■ The student is viewed as having the average achievement in the school, plus a positive deviation due to SES, plus a positive or negative deviation due to the unique circumstances of the student.

slide-37
SLIDE 37

A level-1 model: multiple students in multiple schools

$$Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + r_{ij}$$

  • $Y_{ij}$ is the achievement of student i in school j
  • $\beta_{0j}$ is average achievement within school j
  • $\beta_{1j}$ is the average effect of SES on achievement for school j
  • $X_{ij}$ is the standardized SES of student i in school j
  • $r_{ij}$ is the unique effect for student i in school j

  • Now we are estimating the equation from before for each school. Each school can have a different average achievement (or intercept), and a different impact of SES on achievement (or slope).

slide-38
SLIDE 38

Need to make some additional assumptions about the coefficients, because they vary

■ Student-level errors are normally distributed.
■ Gammas: we expect the average achievement for school j to be equal to the average school mean for all J schools, and the slope of SES for school j to equal the average of the slopes for all J schools.
■ Taus: these are the variances of the intercepts and slopes, and the covariance between them.

$$r_{ij} \sim N(0, \sigma^2)$$
$$E(\beta_{0j}) = \gamma_{00}, \quad \mathrm{Var}(\beta_{0j}) = \tau_{00}$$
$$E(\beta_{1j}) = \gamma_{10}, \quad \mathrm{Var}(\beta_{1j}) = \tau_{11}$$
$$\mathrm{Cov}(\beta_{0j}, \beta_{1j}) = \tau_{01}$$

slide-39
SLIDE 39

Level-2 model: explaining the Level-1 coefficients

■ Since our intercepts and slopes vary by school, we can now model why they vary.
■ Suppose we hypothesize that levels of achievement and impact of SES are related to whether a school is public or Catholic.
■ We need equations for the intercept and slope to describe our hypothesis:

$$\beta_{0j} = \gamma_{00} + \gamma_{01} W_j + u_{0j} \quad \text{(intercept)}$$
$$\beta_{1j} = \gamma_{10} + \gamma_{11} W_j + u_{1j} \quad \text{(slope coefficient)}$$

  • $\beta_{0j}$ is average achievement within school j
  • $\beta_{1j}$ is the average effect of SES on achievement for school j
  • $W_j$ is the school-level predictor (here, whether school j is public or Catholic)

slide-40
SLIDE 40

Level-2 model (continued)

slide-41
SLIDE 41

So math achievement of an individual student in school j is explained by …

  • mean achievement in public schools, plus the impact of a school being Catholic on mean achievement (if j is Catholic)
  • the effect of SES on achievement, plus the impact of a school being Catholic on how SES affects achievement (again, if j is Catholic)
  • student- and school-specific error terms

slide-42
SLIDE 42

Multilevel Regression Model

Lowest (individual) level:
■ Yij = b0j + b1jXij + eij

and at the second (group) level:
■ b0j = g00 + g01Zj + u0j
■ b1j = g10 + g11Zj + u1j

Combining:
■ Yij = g00 + g10Xij + g01Zj + g11ZjXij + u1jXij + u0j + eij
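The combined equation follows from substituting the two group-level equations into the individual-level one. A worked version of that substitution, using the same symbols as the slide (b, g, u, e, X, Z), is sketched here; the grouping into "fixed part" and "random part" is a conventional reading, not something the slide states.

```latex
\begin{align*}
Y_{ij} &= b_{0j} + b_{1j} X_{ij} + e_{ij} \\
       &= (g_{00} + g_{01} Z_j + u_{0j}) + (g_{10} + g_{11} Z_j + u_{1j}) X_{ij} + e_{ij} \\
       &= \underbrace{g_{00} + g_{10} X_{ij} + g_{01} Z_j + g_{11} Z_j X_{ij}}_{\text{fixed part}}
        + \underbrace{u_{1j} X_{ij} + u_{0j} + e_{ij}}_{\text{random part}}
\end{align*}
```

The term g11 Zj Xij is the cross-level interaction: it appears only because the slope b1j is itself modeled with the group-level predictor Zj.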

Some examples from multilevel regression modeling:

slide-43
SLIDE 43

Hands-on Session with HLM Software: HLM 7 Student Version

slide-44
SLIDE 44

Starting HLM

  • Prepare data;
  • Identify variables to be included in the model;
  • Develop hypothesized model; and
  • Install HLM program
slide-45
SLIDE 45

HSB DATA

Our data file is a subsample from the 1982 High School and Beyond Survey and is used extensively in Hierarchical Linear Models by Raudenbush and Bryk. The data file, called hsb, consists of 7185 students nested in 160 schools. The outcome variable of interest is the student-level (level 1) math achievement score (mathach). The variable ses is the socio-economic status of a student and therefore is at the student level. The variable meanses is the school mean of ses and therefore is at the school level (level 2). The variable sector is an indicator variable indicating if a school is public or Catholic and is therefore a school-level variable. There are 90 public schools (sector=0) and 70 Catholic schools (sector=1) in the sample.

slide-46
SLIDE 46

Example

■ Using HSB data
■ Questions:

  • Is multilevel modeling needed for mathematics achievement scores?
  • Is there a relationship between SES and student-level mathematics achievement scores?
  • Does the effect of SES on mathematics achievement scores vary significantly across schools?
  • Is the effect of SES on MATHACH moderated by MEANSES and SECTOR?

slide-47
SLIDE 47

Inform HLM of the input and Make MDM file

slide-48
SLIDE 48

Inform HLM with the data and analysis command

slide-49
SLIDE 49

STEPS 1 to 10 (numbered callouts from the slide figure)

slide-50
SLIDE 50

Choose Variables for Level_1

Data file : HSB1.sav

slide-51
SLIDE 51

Choose Variables for Level_2

Data file : HSB2.sav

slide-52
SLIDE 52

Specify the Model

slide-53
SLIDE 53

NULL/UNCONDITIONAL MODEL

Also known as Random Effect Model

slide-54
SLIDE 54

Fully Unconditional Model

■ Fully unconditional model is run with no predictors to determine if a significant portion of the variance in achievement is between schools – indicating HLM should be used to analyze these data.

slide-55
SLIDE 55

Purpose of Null Model

■ It is used as the baseline model to compare the results of more elaborate models,
■ It can estimate the grand mean of mathematics achievement (γ00) with adjustment for clustering of students within schools and for different sample sizes across schools,
■ It can estimate variance components at the student level (σ2) and school level (τ00).

slide-56
SLIDE 56

Null/Unconditional Model

■ Null model is used for two purposes: (1) It is the basis for calculating the intra-class correlation coefficient (ICC), which is the usual test of whether multilevel modelling is needed; and (2) It outputs the deviance statistic (-2LL) and other coefficients used as a baseline for comparing later, more complex models.

slide-57
SLIDE 57

Null Model

The level-1 model

Yij = β0j + rij (1)

Where
Yij = Mathematics achievement for student i in school j,
β0j = The average mathematics achievement for school j,
*rij = Error term representing a unique effect associated with student i in school j.

* Assumed to have a normal distribution with a mean of zero and a level-one variance, σ2

slide-58
SLIDE 58

Null Model

The level-2 model

β0j = γ00 + u0j (2)

Where
γ00 = The intercept, representing the grand mean or overall average of mathematics achievement,
*u0j = The error term, representing a unique effect associated with school j.

* Assumed to have a normal distribution with a mean of zero and a level-two variance, τ00

slide-59
SLIDE 59

Null Model

■ Combine the two equations (mixed model):

Yij = γ00 + u0j + rij
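To make the mixed-model equation concrete, the following sketch simulates data from Yij = γ00 + u0j + rij and recovers the two variance components with a simple balanced one-way ANOVA (method-of-moments) estimator. This is only an illustration: the function names are hypothetical, the design is assumed balanced, and the HLM program itself uses maximum likelihood estimation rather than this estimator.

```python
import random
import statistics


def simulate_null_model(n_schools, n_students, gamma00, tau00, sigma2, seed=0):
    """Draw Y_ij = gamma00 + u_0j + r_ij for a balanced design (illustrative)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_schools):
        u0j = rng.gauss(0.0, tau00 ** 0.5)  # school-level random effect
        scores = [gamma00 + u0j + rng.gauss(0.0, sigma2 ** 0.5)
                  for _ in range(n_students)]
        data.append(scores)
    return data


def anova_variance_components(data):
    """Method-of-moments estimates (grand mean, tau00, sigma2), balanced data only."""
    n_schools = len(data)
    n_students = len(data[0])
    school_means = [statistics.fmean(scores) for scores in data]
    grand_mean = statistics.fmean(school_means)
    # Within-school mean square estimates sigma2.
    ms_within = sum((y - m) ** 2
                    for scores, m in zip(data, school_means)
                    for y in scores) / (n_schools * (n_students - 1))
    # Between-school mean square estimates sigma2 + n * tau00.
    ms_between = n_students * sum((m - grand_mean) ** 2
                                  for m in school_means) / (n_schools - 1)
    tau00_hat = (ms_between - ms_within) / n_students
    return grand_mean, tau00_hat, ms_within


if __name__ == "__main__":
    sample = simulate_null_model(160, 45, gamma00=12.64, tau00=8.61, sigma2=39.15)
    grand, tau, sigma = anova_variance_components(sample)
    print(f"grand mean={grand:.2f}, tau00={tau:.2f}, sigma2={sigma:.2f}")
```

The simulated values will differ from run to run with the seed; the point is that the school-level and student-level variances are separable once the grouping is known.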

slide-60
SLIDE 60

Null or Unconditional Model

The level-1 outcome (Yij) is a function of a random intercept (β0j) and a level-1 residual error term (rij). The level-1 intercept, in turn, is a function of the grand mean (γ00) across level-2 units, which are schools in this example, plus a random error term (u0j), signifying that the intercept is modelled as a random effect. Substituting the right-hand side of the level-2 equation into the level-1 equation gives the mixed model equation for the null random intercept model.

slide-61
SLIDE 61

Annotated Results 1

■ Reliability in HLM ≠ ordinary reliability
■ Reliability for the intercept in HLM indicates to what extent the intercept measures can discriminate among schools in their average achievement.
■ Low reliability does not mean lack of precision.

slide-62
SLIDE 62

Annotated Results 2

■ The reliability of the random effect of the level-1 intercept is the average reliability of the level-2 units.
■ It measures the overall reliability of the OLS estimates for each of the intercepts. The reliability estimate for this model is .901.
■ This indicates that the sample means tend to be quite reliable as indicators of the true school means.

slide-63
SLIDE 63

Annotated Results 3

■ In the final estimation of fixed effects: the intercept is 12.64 (SE=.24) and differs from zero. [This value indicates the grand mean of Mathematics Achievement.]
■ To express the precision of this grand mean estimate, we can calculate a 95% confidence interval from its standard error: 12.64 ± 1.96*(0.24) = (12.17, 13.11).

slide-64
SLIDE 64

Annotated Results 4

■ The estimated between variance, τ00, corresponds to the term intercept1 in the output of final estimation of variance components, and the estimated within variance, σ2, corresponds to the term level-1 in the same output section. For this model, τ00 is 8.61 and σ2 is 39.15.

slide-65
SLIDE 65

Annotated Results 5

Final estimation of Variance

At the school level, τ00 is the variance of the true school means, β0j, around the grand mean, γ00. The estimated variability in these school means is τ00 = 8.61. To measure the magnitude of the variation among schools in their mean achievement levels, we can calculate the plausible values range for these means. Under the normality assumption, we would expect 95% of the school means to fall within the range:

γ00 ± 1.96 (τ00)1/2

which yields 12.64 ± 1.96 (8.61)1/2 = (6.89, 18.39).

This indicates a substantial range in average achievement levels among schools in the sample data.
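The plausible values range is simple arithmetic on two numbers from the output: the fixed-effect estimate and the variance of its random effect. A minimal sketch, with a hypothetical helper name, that reproduces the slide's figures:

```python
import math


def plausible_value_range(estimate, variance, z=1.96):
    """Range expected to hold 95% of unit-specific values under normality."""
    half_width = z * math.sqrt(variance)
    return (estimate - half_width, estimate + half_width)


# Grand mean 12.64 and between-school variance 8.61, as reported on the slide.
low, high = plausible_value_range(12.64, 8.61)
print(f"({low:.2f}, {high:.2f})")  # (6.89, 18.39)
```

Note that this range uses the square root of the variance component, not the standard error of the fixed effect, so it describes the spread of school means rather than the precision of the grand mean estimate.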

slide-66
SLIDE 66

Annotated Results 6

■ Statistically significant between-school variance (variance at school level) indicates that school average mathematics achievement varies significantly across schools.

slide-67
SLIDE 67

Effective sample size

■ A higher ICC value indicates greater dependence among observations within schools
  – Effective sample size is smaller than observed sample size
■ Effective n = mk / (1 + ICC*(m-1))
  – where n = sample size, m = number of students per school and k = number of schools
■ If ICC=1, effective n is equal to the number of schools (k)
■ If ICC=0, effective n is equal to the observed n (i.e., mk)
■ In general, effective n lies between k and mk
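The formula can be checked against its two limiting cases. The sketch below uses a hypothetical function name and, purely as an illustration, a design of 50 schools with 1000 students each:

```python
def effective_sample_size(m, k, icc):
    """Effective n = mk / (1 + ICC * (m - 1)) for k groups of m members each."""
    return (m * k) / (1 + icc * (m - 1))


# Illustrative design: k = 50 schools, m = 1000 students per school.
print(effective_sample_size(1000, 50, 0.0))   # 50000.0, the observed n
print(effective_sample_size(1000, 50, 1.0))   # 50.0, the number of schools
print(round(effective_sample_size(1000, 50, 0.18), 1))
```

Even a moderate ICC shrinks the effective sample size dramatically when groups are large, which is why standard errors computed from the observed n are too small.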

slide-68
SLIDE 68

Calculating ICC

■ Based on the variance component estimates, we can compute the intra-class correlation: 8.61431/(8.61431 + 39.14831) = .18.
■ This tells us the portion of the total variance that occurs between schools.
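As a quick check of the arithmetic, the ICC is the between-school variance divided by the total variance. The helper name below is hypothetical; the two inputs are the variance components quoted on the slide.

```python
def intraclass_correlation(tau00, sigma2):
    """ICC = between-group variance / (between-group + within-group variance)."""
    return tau00 / (tau00 + sigma2)


icc = intraclass_correlation(8.61431, 39.14831)
print(round(icc, 2))  # 0.18
```

So roughly 18% of the variance in mathematics achievement lies between schools, and the remaining 82% lies among students within schools.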

slide-69
SLIDE 69

Calculating the Intra-class Correlation coefficient (ICC)

slide-70
SLIDE 70

ADDING VARIABLE AT THE STUDENT LEVEL

Random Coefficient Model

slide-71
SLIDE 71

Adding Predictors at Level_1

slide-72
SLIDE 72

Notes on the Results 1

■ The model we fit was
    mathachij = β0j + β1j (SES - meanses) + rij
    β0j = γ00 + u0j
    β1j = γ10 + u1j
■ Filling in the parameter estimates we get
    mathachij = β0j + β1j (SES - meanses) + rij
    β0j = 12.64 + u0j
    β1j = 2.19 + u1j
    V(u0j) = 8.68
    V(u1j) = .68
    V(rij) = 36.7
■ In a single equation our model would be written as:
    mathachij = γ00 + u0j + (γ10 + u1j)(SES - meanses) + rij
              = γ00 + γ10 *(SES - meanses) + u0j + u1j *(SES - meanses) + rij

slide-73
SLIDE 73

Notes on the Results 2

■ The estimate for the variance of the slope for group-centered SES is 0.68. The p-value is .003. Because the test is statistically significant, we reject the hypothesis that there is no difference in slopes among schools.
■ The 95% plausible value range for the school means is 12.64 ± 1.96 *(8.68)1/2 = (6.87, 18.41).
■ The 95% plausible value range for the SES-achievement slope is 2.19 ± 1.96 *(.68)1/2 = (.57, 3.81).

slide-74
SLIDE 74

Notes on the Results 3

■ The coefficient for the constant is the predicted math achievement when all predictors are 0; hence, for a student whose SES equals the school mean SES (group-centered SES = 0), math achievement is predicted to be 12.65.

slide-75
SLIDE 75

Notes on the Results 4

■ Notice that the residual variance is now 36.70, compared to the residual variance of 39.15 in the one-way ANOVA with random effects (unconditional means) model.
■ We can compute the proportion of variance explained at level 1 as (39.15 - 36.70) / 39.15 = .063. This suggests that using student-level SES as a predictor of math achievement reduced the within-school variance by 6.3%.
■ The correlation between the intercept and the slope is .019. It seems that they are not highly correlated.
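The proportion of variance explained compares a variance component from the baseline (null) model with the same component after predictors are added. A minimal sketch with a hypothetical helper name, using the two residual variances quoted above:

```python
def proportion_variance_explained(var_baseline, var_conditional):
    """Proportional reduction in a variance component relative to the baseline model."""
    return (var_baseline - var_conditional) / var_baseline


# Level-1 residual variance: 39.15 in the null model, 36.70 with SES added.
print(round(proportion_variance_explained(39.15, 36.70), 3))  # 0.063
```

The same calculation can be applied to a level-2 variance component, as long as both numbers refer to the same component in the two models being compared.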

slide-76
SLIDE 76

Calculating Proportion of Variance Explained

slide-77
SLIDE 77

THE INTERCEPT AND SLOPE AS THE OUTCOME MODEL

Final Model

slide-78
SLIDE 78

This model is referred to as an intercepts- and slopes-as-outcomes model

slide-79
SLIDE 79

Research Questions

■ Do MEANSES and SECTOR significantly predict the intercept?
■ Do MEANSES and SECTOR significantly predict the within-school slope?
■ How much variation in the intercepts and slopes is explained using SECTOR and MEANSES as predictors?

slide-80
SLIDE 80

Interpreting the Final Results

For intercept:
■ MEANSES is positively related to school mean math achievement.
■ Catholic schools have higher mean achievement than do public schools, after controlling for the effect of MEANSES.

slide-81
SLIDE 81

Interpreting the Final Results

For slope:
■ Schools with higher MEANSES have a larger slope than schools with low MEANSES.
■ Catholic schools have significantly weaker SES slopes on average than do public schools.

slide-82
SLIDE 82

Reporting in Table

slide-83
SLIDE 83

Final Model

■ The estimate for the variance of the SES slope is .15 with p-value .369; hence, we fail to reject the null hypothesis that no variation among the SES slopes remains unexplained after controlling for the MEANSES and SECTOR effects.
■ The correlation between the level-1 intercept and the slope for SES is given as .32 from the earlier part of the output.
■ Some variation remains unexplained even after controlling for the MEANSES and SECTOR effects.

slide-84
SLIDE 84

Assessing Model Fit

■ Using deviance statistics
■ Using proportion of variance explained
■ Using other indicators: AIC and BIC

slide-85
SLIDE 85

Estimation Specification

■ REML (restricted maximum likelihood) versus FML (full maximum likelihood)
  – REML and FML will usually produce similar results for the level-1 residual variance (σ2), but there can be noticeable differences for the variance-covariance matrix of the random effects.
  – REML is the default estimation method in HLM.
  – If the number of level-2 units is large, then the difference will be small.
  – If the number of level-2 units is small, then FML variance estimates will be smaller than REML estimates, leading to artificially short confidence intervals and problematic significance tests.
■ Nested models
  – If the fixed effects are the same, and there are fewer random effects in the reduced model, then either REML or FML is fine.
  – If one model has fewer fixed effects and possibly fewer random effects, then use FML to compare models.
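Comparing nested models with the deviance statistic amounts to taking the difference in deviance (-2LL) and referring it to a chi-square distribution whose degrees of freedom equal the number of extra parameters. The sketch below is an illustration only: the function name and the deviance values are hypothetical (no deviances are reported on these slides), and the table holds the standard chi-square critical values at the .05 level for 1 to 3 degrees of freedom.

```python
# Standard chi-square critical values at alpha = .05, keyed by degrees of freedom.
CHI2_CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815}


def deviance_test(deviance_reduced, deviance_full, df):
    """Return (deviance difference, whether it exceeds the .05 critical value)."""
    difference = deviance_reduced - deviance_full
    return difference, difference > CHI2_CRITICAL_05[df]


# Hypothetical deviances: the fuller model adds 2 parameters.
diff, significant = deviance_test(1000.0, 990.0, df=2)
print(diff, significant)  # 10.0 True
```

Both deviances must come from the same estimation method, and per the nested-model guidance above, FML deviances are the ones to use when the models differ in their fixed effects.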

slide-86
SLIDE 86

SOME PRACTICAL ASPECTS OF MULTILEVEL MODELING

slide-87
SLIDE 87

Questions to Answer

■ Can you use multilevel techniques to study your dependent variable?
■ Should you use multilevel techniques to study your dependent variable?
■ How will you center your level-1 and level-2 predictors?
■ Which of the level-1 coefficients will be explained at level-2? I.e., are they fixed or random?
■ How does my model perform?

slide-88
SLIDE 88

Can I use HLM?

■ HLM requires a large amount of data.
■ Minimum:

  • number of groups: 30, but most recommend 50+
  • number of individuals within groups: 5-10, but can be as low as 1.
  • average group size: 10; more is better.

slide-89
SLIDE 89

Should I Use HLM?

■ How much of the variance in your dependent variable is explained by group membership?
■ Intra-class correlation coefficient (ICC) = var between groups / (var between groups + var within groups)
■ $$\rho = \tau_{00} / (\tau_{00} + \sigma^2)$$

Remember, $\tau_{00}$ is the variance of the intercepts, or the school means, and $\sigma^2$ is the student-level variance.

slide-90
SLIDE 90

Centering variables

■ Whether and how you center is a very important decision: interpretation of results depends on your choice.
■ Important because the intercept at level-1 is also a dependent variable.
■ Centering
  – Refers to subtracting a mean from your independent variables.
  – The transformed value for an individual measures how much they deviate (+/-) from the mean.

slide-91
SLIDE 91

Centering variables

■ Suppose we center verbal SAT scores around a student mean of 500.
■ How would we interpret a regression coefficient if all variables were similarly transformed?

Student   Actual score   Centered score
Steve     800            300
Claire    750            250
Bill      500            0
Paul      200            -300

slide-92
SLIDE 92

Centering variables

■ Why would we want to center?
  – Variable may lack a natural zero point, such as SAT score.
  – Stability of estimates at level-1 is affected by location of variables.
  – Location at level-2 is less important.

  • Centering in multilevel models presents a unique challenge because different centering choices have a significant impact on how the parameter estimate is interpreted.

slide-93
SLIDE 93

Centering variables

■ Generally two types of centering are used in HLM for a specific variable:
  – Grand mean centering: subtract the mean for the entire sample from each observation in the sample.
  – Group mean centering: subtract the mean for each group from each member of the group.
■ To fully understand the implications of centering, see the discussion in Bryk and Raudenbush (2002), pp. 134-149.
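The two centering schemes differ only in which mean is subtracted. A minimal sketch in plain Python, with hypothetical function names and made-up SES values for two schools:

```python
from collections import defaultdict
from statistics import fmean


def grand_mean_center(values):
    """Subtract the overall sample mean from every observation."""
    grand = fmean(values)
    return [v - grand for v in values]


def group_mean_center(values, groups):
    """Subtract each group's own mean from the members of that group."""
    by_group = defaultdict(list)
    for value, group in zip(values, groups):
        by_group[group].append(value)
    group_means = {g: fmean(vs) for g, vs in by_group.items()}
    return [v - group_means[g] for v, g in zip(values, groups)]


# Made-up SES values for four students in two schools, A and B.
ses = [1.0, 3.0, 4.0, 8.0]
school = ["A", "A", "B", "B"]
print(grand_mean_center(ses))          # [-3.0, -1.0, 0.0, 4.0]
print(group_mean_center(ses, school))  # [-1.0, 1.0, -2.0, 2.0]
```

The third student scores exactly at the grand mean (0.0) but below the mean of school B (-2.0), which shows how the same raw value takes on a different meaning under each scheme.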

slide-94
SLIDE 94

Grand Mean Centering

■ Grand mean centering is scoring variables as deviations from their sample means.
■ An example would be scoring occupational status as a deviation from the mean occupational status in the entire sample, that is, scoring how high or low people are relative to the average.
■ In multivariate analyses, predictor variables that are grand-mean centered generate mathematically identical predicted values to those from the same model estimated on the original, conventionally scored variables.

slide-95
SLIDE 95

Grand Mean Centering

■ Some writers still claim that grand mean centering reduces multicollinearity, particularly when the regression includes many interactions, and most especially when these are cross-level interactions (Bickel 2007; Preacher 2003).
■ Another advantage of grand mean centering is that it allows one to interpret the intercept as the predicted mean on the dependent variable when all the predictors are set to zero (Paccagnella 2006).

slide-96
SLIDE 96

Grand Mean Centering

■ It is also sometimes said that grand-mean centering facilitates regression coefficient interpretation, particularly for cross-level interactions when a variable is continuous (Bickel 2007; Kenny et al. 1998; Hox 2010).
■ Hox (2010) reports that convergence tends to be achieved more frequently and analyses run faster using grand-mean centering.

slide-97
SLIDE 97

Grand Mean Centering

slide-98
SLIDE 98

■ Group mean centering refers to scoring variables in multi-level models as deviations from the mean of their macro-level group. ■ An example would be scoring occupational status as a deviation from the mean status in respondent’s own country. ■ In group-mean centering, Nigerian clerks would be high (because they are high compared to the average Nigerian) while Swiss clerks would be low (because they are low compared to the average Swiss).

Group Mean Centering

slide-99
SLIDE 99

■ Bryk (2002) also posits that group-mean centering can reduce bias in random component variance estimates.
■ Paccagnella (2006) notes the benefit of group-mean centering for researchers interested in ‘‘separating the between-group and the within-group components from the total variation to investigate how groups (contexts) affect the student performances, explicitly accounting for the group structure into the model’’.
■ Group-mean centering (also known as within-group deviation scoring) is widely used in many disciplines, and widely recommended.

Group Mean Centering

slide-100
SLIDE 100

Group Mean Centering

slide-101
SLIDE 101

■ Methodologists have cautioned that centering decisions should be undertaken warily and based on both a theoretical and statistical rationale.
■ In general, group-mean centering individual-level variables can create large and varied biases for higher-level variables.
■ Centering decisions in longitudinal models take on a different role than in cross-sectional models, and often involve centering around a constant rather than group or overall mean values.
■ Centering choices have differing effects on the interpretation of the intercept term.

Mean Centering

slide-102
SLIDE 102

■ Under grand mean centering, the variance in the intercept term (β0j) represents the between-group variance in the outcome variable adjusted for the level-1 variables.
■ In group-mean-centered models, the intercept variance simply represents the between-group variance in the outcome measure.
■ While centering around the grand mean constitutes a simple linear transformation, the scores resulting from group mean centering are a nonlinear (and discontinuous) function of two variables, namely the variable which is being 'centered' and the categorical variable which expresses the grouping.

Mean Centering

slide-103
SLIDE 103

■ When to use grand mean centering:
– If you are more interested in effects on individuals’ performance than in group effects.
– When the raw score does not allow a meaningful interpretation of the intercept.
■ When to use group mean centering:
– Theory says that individual and group effects are separate.
– You want a smaller correlation between the random intercept and the random slope.
– You want a smaller correlation between level-1 and level-2 variables and cross-level interactions. This will stabilize the model (the coefficients are more or less independent estimates).

Grand or Group Mean Centering

slide-104
SLIDE 104

■ It would be nice to have everything random; that is, a different set of coefficients for each group.
■ But due to the demands HLM places on the data, usually only the intercept and a few variables can be random.
■ Important: if you make the effect of gender random and you have a group without females, that group will be dropped.
■ Generally you should run parallel models for intercept and slopes, as in our theory example.
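Before making a slope random, it can help to check which groups have no variation in that predictor. The sketch below is a hypothetical pre-check written in plain Python, not a feature of the HLM software; the school labels and the 0/1 `female` indicator are invented.

```python
from collections import defaultdict


def groups_without_variation(values, groups):
    """Return the groups in which the predictor takes only one value."""
    seen = defaultdict(set)
    for v, g in zip(values, groups):
        seen[g].add(v)
    return sorted(g for g, vals in seen.items() if len(vals) < 2)


# Hypothetical data: school membership and a 0/1 indicator for female.
school = ["A", "A", "A", "B", "B", "B", "C", "C"]
female = [0, 1, 1, 0, 0, 0, 1, 0]

# School B has no females, so a gender slope cannot be estimated within it.
flagged = groups_without_variation(female, school)
```

Groups flagged this way are the ones at risk of being dropped when the corresponding slope is specified as random.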

Fixed or Random?

slide-105
SLIDE 105

■ Goodness of fit:
– Proportion of variance explained at level-1: $\dfrac{\sigma^2(\text{null}) - \sigma^2(\text{full})}{\sigma^2(\text{null})}$
– Variance explained at level-2: $\dfrac{\tau(\text{null}) - \tau(\text{full})}{\tau(\text{null})}$
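Both goodness-of-fit measures compare a variance component from the null model with the same component from the full model, as a proportional reduction. A minimal sketch, assuming the variance components have already been read off the model output; the function name and all numbers are hypothetical.

```python
def proportion_variance_explained(null_variance, full_variance):
    """Proportional reduction in a variance component: (null - full) / null."""
    if null_variance <= 0:
        raise ValueError("null-model variance must be positive")
    return (null_variance - full_variance) / null_variance


# Hypothetical variance components taken from a null and a full model.
sigma2_null, sigma2_full = 40.0, 30.0   # level-1 residual variance
tau_null, tau_full = 10.0, 4.0          # level-2 intercept variance

r2_level1 = proportion_variance_explained(sigma2_null, sigma2_full)
r2_level2 = proportion_variance_explained(tau_null, tau_full)
```

Note that this quantity can come out negative when a variance component is larger in the full model than in the null model, so it is best read as a descriptive index rather than a strict R-squared.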

Model Statistics

slide-106
SLIDE 106

Some thoughts about building your models

■ Before using HLM, run OLS regressions for the whole sample and for each group.
■ Building the null model:
– This should be your first step.
– Calculate the ICC.
■ Building the level-1 models:
– Should be theory driven
– Step-up approach
– Be cautious about what you leave as random; it’s often difficult to leave more than the intercept and one variable as random
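The ICC (intraclass correlation coefficient) from the null model is the share of total outcome variance that lies between groups: the level-2 intercept variance divided by the sum of the level-2 intercept variance and the level-1 residual variance. A minimal sketch with hypothetical variance components; the function and argument names are illustrative.

```python
def intraclass_correlation(between_variance, within_variance):
    """ICC = between-group variance / (between-group + within-group variance)."""
    total = between_variance + within_variance
    if total <= 0:
        raise ValueError("total variance must be positive")
    return between_variance / total


# Hypothetical null-model estimates.
tau00 = 8.0     # level-2 (between-group) intercept variance
sigma2 = 32.0   # level-1 (within-group) residual variance

icc = intraclass_correlation(tau00, sigma2)
```

With these invented values, one fifth of the outcome variance lies between groups.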

slide-107
SLIDE 107

■ Building the level-2 models:
– Rule of thumb: 10 observations/variable
– Parallel models
■ Many scholars drop insignificant variables at both levels. (I disagree with this.)

Some thoughts about building your models

slide-108
SLIDE 108

Thank you