Apr 10, 2023

SLIDE 1

Key ideas, terms & concepts in SEM

Professor Patrick Sturgis

SLIDE 2

Plan

Path diagrams
Exogenous, endogenous variables
Variance/covariance matrices
Maximum likelihood estimation
Parameter constraints
Nested Models and Model fit
Model identification

SLIDE 3

Path diagrams

An appealing feature of SEM is the representation of equations diagrammatically, e.g. bivariate regression Y = bX + e
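To make the bivariate regression concrete, here is a minimal sketch that simulates Y = bX + e and recovers b from the variance of X and the covariance of X and Y. The true value b = 0.5, the sample size and the seed are illustrative assumptions, not values from the slides.

```python
import random


def simulate_and_estimate(b_true=0.5, n=5000, seed=1):
    """Simulate Y = b*X + e and estimate b as cov(X, Y) / var(X)."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [b_true * xi + rng.gauss(0, 1) for xi in x]

    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((xi - mean_x) ** 2 for xi in x) / (n - 1)
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)

    # The least-squares slope depends on the data only through var(X) and cov(X, Y).
    return cov_xy / var_x


b_hat = simulate_and_estimate()
print(round(b_hat, 2))
```

The estimate uses only a variance and a covariance, which is why SEM can work from the variance/covariance matrix rather than the raw data.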

SLIDE 4

Path Diagram conventions

Error variance / disturbance term
Measured latent variable
Observed / manifest variable
Covariance / non-directional path
Regression / directional path

SLIDE 5

Reading path diagrams

A latent variable
Causes / is measured by 3 observed variables
With 3 error variances
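Written as equations, a diagram of this kind corresponds to three measurement equations, one per observed variable. The notation is an assumption for illustration: η is the latent variable, x1 to x3 are the observed variables, λ1 to λ3 are the factor loadings and ε1 to ε3 are the error terms.

```latex
\begin{aligned}
x_1 &= \lambda_1 \eta + \varepsilon_1 \\
x_2 &= \lambda_2 \eta + \varepsilon_2 \\
x_3 &= \lambda_3 \eta + \varepsilon_3
\end{aligned}
```

Each directional arrow from the latent variable to an observed variable is one loading, and each error term has its own variance.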

SLIDE 6

Reading path diagrams

2 latent variables, each measured by 3 observed variables
Correlated

SLIDE 7

Reading path diagrams

2 latent variables, each measured by 3 observed variables
Regression of LV1 on LV2
Error/disturbance

SLIDE 8

Exogenous/Endogenous variables

Endogenous (dependent)

– caused by variables in the system

Exogenous (independent)

– caused by variables outside the system

In SEM a variable can be both a predictor and an outcome (a mediating variable)

SLIDE 9

2 (correlated) exogenous variables

SLIDE 10

η1 endogenous, η2 exogenous


SLIDE 11

Data for SEM

In SEM we analyse the variance/covariance matrix (S) of the observed variables, not raw data
Some SEMs also analyse means
The goal is to summarise S by specifying a simpler underlying structure: the SEM
The SEM yields an implied var/covar matrix which can be compared to S
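As a sketch of what an "implied" matrix means, the code below builds the covariance matrix implied by a one-factor model and subtracts it from an observed matrix. Everything here is assumed for illustration: the one-factor structure, the parameter values and the small observed matrix `s_observed`.

```python
def implied_covariance(loadings, factor_variance, error_variances):
    """Covariance matrix implied by a one-factor model.

    Off-diagonal: loading_i * loading_j * factor_variance.
    Diagonal: loading_i**2 * factor_variance + error_variance_i.
    """
    k = len(loadings)
    sigma = [[loadings[i] * loadings[j] * factor_variance for j in range(k)]
             for i in range(k)]
    for i in range(k):
        sigma[i][i] += error_variances[i]
    return sigma


def residual_matrix(s, sigma):
    """Element-wise difference between observed and implied matrices."""
    return [[s[i][j] - sigma[i][j] for j in range(len(s))] for i in range(len(s))]


# Hypothetical parameter values and observed matrix (not from the slides).
sigma = implied_covariance([1.0, 0.8, 0.6], 0.5, [0.4, 0.5, 0.6])
s_observed = [[0.90, 0.42, 0.28],
              [0.42, 0.82, 0.25],
              [0.28, 0.25, 0.78]]
residuals = residual_matrix(s_observed, sigma)

for row in residuals:
    print([round(v, 2) for v in row])
```

Small residuals mean the simpler structure reproduces S well; estimation chooses the parameter values that make the implied matrix as close to S as possible.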

SLIDE 12

Variance/Covariance Matrix (S)

      x1    x2    x3    x4    x5    x6
x1  0.91  0.37  0.05  0.04  0.34  0.31
x2  0.37  1.01  0.11  0.03  0.22  0.23
x3  0.05  0.11  0.84  0.29  0.14  0.11
x4  0.04  0.03  0.29  1.13  0.11  0.06
x5  0.34  0.22  0.14  0.11  1.12  0.34
x6  0.31  0.23  0.11  0.06  0.34  0.96
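The matrix on this slide can be entered directly and checked for symmetry. The sketch below also rescales it to correlations, which can make the pattern of associations easier to read. The helper names are illustrative, not part of any SEM package.

```python
import math

# Variance/covariance matrix S for x1..x6, as shown on the slide.
S = [
    [0.91, 0.37, 0.05, 0.04, 0.34, 0.31],
    [0.37, 1.01, 0.11, 0.03, 0.22, 0.23],
    [0.05, 0.11, 0.84, 0.29, 0.14, 0.11],
    [0.04, 0.03, 0.29, 1.13, 0.11, 0.06],
    [0.34, 0.22, 0.14, 0.11, 1.12, 0.34],
    [0.31, 0.23, 0.11, 0.06, 0.34, 0.96],
]


def is_symmetric(m):
    """A covariance matrix must equal its own transpose."""
    n = len(m)
    return all(m[i][j] == m[j][i] for i in range(n) for j in range(n))


def to_correlation(m):
    """Divide each covariance by the product of the two standard deviations."""
    sd = [math.sqrt(m[i][i]) for i in range(len(m))]
    return [[m[i][j] / (sd[i] * sd[j]) for j in range(len(m))]
            for i in range(len(m))]


R = to_correlation(S)
print(is_symmetric(S))
print(round(R[0][1], 3))  # correlation between x1 and x2
```

Variances sit on the diagonal and covariances off it, so only the diagonal and one triangle carry distinct information.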

SLIDE 13

Maximum Likelihood (ML)

ML estimates model parameters by maximising the likelihood, L, of the sample data
L is a mathematical function based on the joint probability of continuous sample observations
ML is asymptotically unbiased and efficient, assuming multivariate normal data
The (log)likelihood of a model can be used to test its fit against a more or less restrictive baseline model
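The idea of maximising a likelihood is easiest to see in a much simpler setting than SEM. The sketch below assumes a single normally distributed variable and made-up data: it evaluates the log-likelihood at the ML estimates (sample mean, and variance with divisor n) and at other parameter values, which always score lower.

```python
import math


def normal_loglik(data, mu, sigma2):
    """Log-likelihood of independent normal observations with mean mu, variance sigma2."""
    n = len(data)
    ss = sum((x - mu) ** 2 for x in data)
    return -0.5 * n * math.log(2 * math.pi * sigma2) - ss / (2 * sigma2)


def ml_estimates(data):
    """Closed-form ML estimates for a normal sample (variance uses divisor n)."""
    n = len(data)
    mu = sum(data) / n
    sigma2 = sum((x - mu) ** 2 for x in data) / n
    return mu, sigma2


data = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7]  # hypothetical observations
mu_hat, s2_hat = ml_estimates(data)

ll_max = normal_loglik(data, mu_hat, s2_hat)
ll_other = normal_loglik(data, mu_hat + 0.5, s2_hat * 1.5)
print(ll_max > ll_other)
```

SEM software does the same thing with a multivariate likelihood and an iterative search, because closed-form solutions are generally not available.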

SLIDE 14

Parameter constraints

An important part of SEM is fixing or constraining model parameters
We fix some model parameters to particular values, commonly 0 or 1
We constrain other model parameters to be equal to other model parameters
Parameter constraints are important for identification

SLIDE 15

Nested Models

Two models, A & B, are said to be ‘nested’ when one is a subset of the other (A = B + parameter restrictions), e.g.

Model B:
yi = a + b1X1 + b2X2 + ei

Model A:
yi = a + b1X1 + b2X2 + ei (constraint: b1 = b2)

Model C (not nested in B):
yi = a + b1X1 + b2Z2 + ei
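A minimal sketch of Models A and B, fitted by least squares on simulated data. The data-generating values, sample size and seed are assumptions for illustration. Imposing b1 = b2 is the same as regressing y on the sum X1 + X2, and the restricted model can never have a smaller residual sum of squares than the unrestricted one.

```python
import random


def center(v):
    """Subtract the mean; with centred data the intercept a drops out."""
    m = sum(v) / len(v)
    return [a - m for a in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def fit_model_b(y, x1, x2):
    """Unrestricted model: y = a + b1*x1 + b2*x2 + e. Returns (rss, b1, b2)."""
    yc, x1c, x2c = center(y), center(x1), center(x2)
    s11, s22, s12 = dot(x1c, x1c), dot(x2c, x2c), dot(x1c, x2c)
    s1y, s2y = dot(x1c, yc), dot(x2c, yc)
    det = s11 * s22 - s12 ** 2
    # Solve the 2x2 normal equations directly.
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    resid = [yi - b1 * u - b2 * w for yi, u, w in zip(yc, x1c, x2c)]
    return dot(resid, resid), b1, b2


def fit_model_a(y, x1, x2):
    """Restricted model (b1 = b2 = b): y = a + b*(x1 + x2) + e. Returns (rss, b)."""
    yc = center(y)
    zc = center([u + w for u, w in zip(x1, x2)])
    b = dot(zc, yc) / dot(zc, zc)
    resid = [yi - b * zi for yi, zi in zip(yc, zc)]
    return dot(resid, resid), b


rng = random.Random(2)
n = 200
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [rng.gauss(0, 1) for _ in range(n)]
y = [1.0 + 0.5 * u + 0.9 * w + rng.gauss(0, 1) for u, w in zip(x1, x2)]

rss_b, b1_hat, b2_hat = fit_model_b(y, x1, x2)
rss_a, b_hat = fit_model_a(y, x1, x2)
print(rss_a >= rss_b)
```

Model C cannot be obtained from Model B by restricting parameters, because it uses a different predictor (Z2), which is why it is not nested in B.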

SLIDE 16

Model Fit

Based on (log)likelihood of model(s)
Where model A is nested in model B:
$-2(LL_A - LL_B) = \chi^2$, with $df = df_A - df_B$
Where the p-value of $\chi^2$ > 0.05, we prefer the more parsimonious model, A
Where B = observed matrix, there is no difference between the observed and implied matrices
Model ‘fits’!
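The comparison of nested models can be sketched as a likelihood ratio test. This assumes LL denotes the log-likelihood and that the test statistic is minus twice the difference between the restricted model (A) and the more general model (B). The two log-likelihood values below are made up for illustration, and `chi2_sf` is a small hand-written helper, not a library function.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed-form starting values for df = 1 (odd) or df = 2 (even).
    if df % 2 == 1:
        k, q = 1, math.erfc(math.sqrt(half))
    else:
        k, q = 2, math.exp(-half)
    # Recurrence that raises the degrees of freedom by two each step.
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q


def likelihood_ratio_test(ll_restricted, ll_general, df_diff):
    """Return (chi-square statistic, p-value) for two nested models."""
    stat = -2.0 * (ll_restricted - ll_general)
    return stat, chi2_sf(stat, df_diff)


# Hypothetical log-likelihoods: model A (restricted) and model B (general).
stat, p_value = likelihood_ratio_test(-1203.4, -1202.1, df_diff=1)
prefer_a = p_value > 0.05
print(round(stat, 2), round(p_value, 3), prefer_a)
```

With these made-up numbers the restriction does not significantly worsen fit, so the more parsimonious model A would be preferred.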



SLIDE 17

Model Identification

An equation needs enough ‘known’ pieces of information to produce unique estimates of ‘unknown’ parameters
X + 2Y = 7 (unidentified)
3 + 2Y = 7 (identified) (Y = 2)
In SEM ‘knowns’ are the variances/covariances/means of observed variables
Unknowns are the model parameters to be estimated

SLIDE 18

Identification Status

Models can be:

– Unidentified, knowns < unknowns
– Just identified, knowns = unknowns
– Over-identified, knowns > unknowns
In general, for CFA/SEM we require over-identified models
Over-identified SEMs yield a likelihood value which can be used to assess model fit

SLIDE 19

Assessing identification status

Checking identification status using the counting rule
Let s = number of observed variables in the model
number of non-redundant parameters = $\frac{1}{2}s(s+1)$
t = number of parameters to be estimated
t > $\frac{1}{2}s(s+1)$: model is unidentified
t < $\frac{1}{2}s(s+1)$: model is over-identified

SLIDE 20

Example 1 - identification

$\frac{1}{2}s(s+1)$ = 6 non-redundant parameters
Parameters to be estimated: 3 * error variance + 2 * factor loading + 1 * latent variance = 6
6 - 6 = 0 degrees of freedom, model is just-identified

SLIDE 21

Controlling Identification

We can make an under/just identified model over-identified by:
– Adding more knowns
– Removing unknowns
Including more observed variables can add more knowns
Parameter constraints remove unknowns
Constraint b1 = b2 removes one unknown from the model (gain 1 df)

SLIDE 22

Example 2 – add knowns

$\frac{1}{2}s(s+1)$ = 10 non-redundant parameters
Parameters to be estimated: 4 * error variance + 3 * factor loading + 1 * latent variance = 8
10 - 8 = 2 degrees of freedom, model is over-identified

SLIDE 23

Example 3 – remove unknowns

$\frac{1}{2}s(s+1)$ = 6 non-redundant parameters
Constrain factor loadings = 1
Parameters to be estimated: 3 * error variance + 0 * factor loading + 1 * latent variance = 4
6 - 4 = 2 degrees of freedom, model is over-identified
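The arithmetic in the three identification examples can be reproduced with a few lines of code. The function name and return format are illustrative. Note that the counting rule is a necessary check rather than a guarantee: a model can pass it and still be unidentified for other reasons.

```python
def counting_rule(n_observed, n_free_parameters):
    """Apply the counting rule: knowns = s(s+1)/2, df = knowns - unknowns."""
    knowns = n_observed * (n_observed + 1) // 2
    df = knowns - n_free_parameters
    if df < 0:
        status = "unidentified"
    elif df == 0:
        status = "just-identified"
    else:
        status = "over-identified"
    return knowns, df, status


# Example 1: 3 indicators; 3 error variances + 2 loadings + 1 latent variance.
example_1 = counting_rule(3, 3 + 2 + 1)
# Example 2: 4 indicators; 4 error variances + 3 loadings + 1 latent variance.
example_2 = counting_rule(4, 4 + 3 + 1)
# Example 3: 3 indicators, all loadings fixed to 1; 3 error variances + 1 latent variance.
example_3 = counting_rule(3, 3 + 0 + 1)

for result in (example_1, example_2, example_3):
    print(result)
```

Adding a fourth indicator raises the knowns from 6 to 10 but adds only two unknowns, while fixing the loadings removes two unknowns without changing the knowns. Both routes end with 2 degrees of freedom.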

SLIDE 24

Summary

SEM requires understanding of some ideas which are unfamiliar for many substantive researchers:
– Path diagrams
– Analysing variance/covariance matrix
– ML estimation
– Global ‘test’ of model fit
– Nested models
– Identification
– Parameter constraints/restrictions

SLIDE 25

for more information contact

www.ncrm.ac.uk