Consequences of measurement error --- Psychology 588: Covariance structure and factor models



SLIDE 1

Psychology 588: Covariance structure and factor models

Consequences of measurement error

SLIDE 2

Scaling indeterminacy of latent variables


  • Scale of a latent variable is arbitrary and “determined” by a convention for convenience
  • Typically set to a variance of one (factor analysis convention) or to be identical to the scale of an arbitrarily chosen indicator
  • By centering the indicator variables, we set the latent variables’ means to zero

  • Consider the following transformation:

x_j = ν_j + λ_j ξ + δ_j = ν*_j + λ*_j ξ* + δ_j,  j = 1, …, J,

where ξ* = a + bξ, ν*_j = ν_j − λ_j a/b, λ*_j = λ_j / b

SLIDE 3
  • If all J indicators are considered simultaneously, vector notation is more convenient:

x = ν + λξ + δ = ν* + λ*ξ* + δ,  ξ* = a + bξ

meaning that the linear transformation of ξ is exactly compensated in the accordingly transformed ν* = ν − λa/b and λ* = λ/b, leaving the errors δ unchanged (i.e., same fit)
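As a quick numerical check of this scaling indeterminacy (all parameter values below are made-up illustrations, not from the slides), the transformed ν*, λ*, ξ* reproduce exactly the same indicator data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical true parameters for J = 3 indicators
nu = np.array([1.0, 0.5, -0.2])
lam = np.array([1.0, 0.8, 1.2])
xi = rng.normal(size=1000)            # latent scores
delta = rng.normal(size=(1000, 3))    # measurement errors

x = nu + lam * xi[:, None] + delta    # x_j = nu_j + lambda_j*xi + delta_j

# Rescale the latent variable: xi* = a + b*xi
a, b = 2.0, 3.0
xi_star = a + b * xi
nu_star = nu - lam * a / b            # compensating intercepts
lam_star = lam / b                    # compensating loadings

x_star = nu_star + lam_star * xi_star[:, None] + delta

assert np.allclose(x, x_star)         # identical data, hence same fit
```

Any (a, b) with b ≠ 0 works, which is why a convention (unit variance, or fixing one loading) is needed to pin the scale down.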

SLIDE 4

What’s so great about measurement errors in equations?

  • Regression weights and correlations are interpreted under the implicit assumption that the “operationally defined” variables involve no measurement error --- hardly realistic for theoretical constructs (e.g., self-esteem, IQ, etc.)
  • Ignoring the measurement error leads to inconsistent estimates
  • We will see the consequences of ignoring measurement errors
SLIDE 5

Univariate consequences


  • Consider a mean-included equation for X (hours worked per week) as an indicator of ξ (achievement motivation)
  • Given only one indicator per latent variable, the intercept and loading (i.e., weight) are simply scaling constants for ξ
  • However, if the ξ scale is set comparable to the X scale (i.e., λ = 1), we see that var(X) is an over-estimate of ϕ = var(ξ) if δ is not included in the equation:

X = ν + λξ + δ,  E(δ) = 0,  E(ξδ) = 0

var(X) = λ²ϕ + var(δ) = ϕ + var(δ) ≥ ϕ  (with λ = 1)
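A small simulation illustrates the over-estimation; the values ϕ = 4 and var(δ) = 1 are assumed purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

phi = 4.0        # var(xi), the quantity we actually want
theta = 1.0      # var(delta), measurement-error variance (assumed)

xi = rng.normal(0.0, np.sqrt(phi), n)
delta = rng.normal(0.0, np.sqrt(theta), n)
X = 10.0 + xi + delta          # nu = 10, lambda = 1

# var(X) estimates phi + var(delta), not phi
print(X.var())                 # ≈ 5, over-estimating phi = 4
```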

SLIDE 6

Bivariate relation and simple regression


  • True data structure:

x = λ₁ξ + δ,  y = λ₂η + ε,  η = γξ + ζ

  • cov(x, y) is an unbiased estimate of cov(ξ, η) with λ₁ = λ₂ = 1, since no other variables (δ and ε) can explain cov(x, y):

cov(x, y) = cov(ξ + δ, η + ε) = cov(ξ, η) = γ var(ξ) = γϕ

[Path diagram: ξ → η (weight γ, disturbance ζ); ξ → x (loading 1, error δ); η → y (loading 1, error ε). η: job satisfaction; y: satisfaction scale]

SLIDE 7
  • From the previous equations, and by analogy with y = γ*x + ζ* if measurement errors are ignored,

cov(x, y) = γϕ

γ* = cov(x, y) / var(x) = γ ϕ/(ϕ + var(δ)) = γ ρxx,  where ρxx = ϕ / var(x)

The parenthesized ratio (reliability) becomes 1 only with no measurement error; otherwise γ* is an attenuated version of γ, and γ̂* = sxy / sxx is an inconsistent estimator of γ

  • If λ₁, λ₂ ≠ 1, the bias of the regression weight has an additional factor, as γ* = (λ₂/λ₁) γ ρxx --- but such scaling is unusual when there is only one indicator per latent variable
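The attenuation can be checked by simulation; all parameter values here (γ = 2, ϕ = 1, var(δ) = 1, unit-variance ζ and ε) are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500_000

gamma, phi, theta = 2.0, 1.0, 1.0               # assumed true values
xi = rng.normal(0.0, np.sqrt(phi), n)
eta = gamma * xi + rng.normal(size=n)           # eta = gamma*xi + zeta
x = xi + rng.normal(0.0, np.sqrt(theta), n)     # x = xi + delta (lambda1 = 1)
y = eta + rng.normal(size=n)                    # y = eta + epsilon (lambda2 = 1)

c = np.cov(x, y)
gamma_star = c[0, 1] / c[0, 0]                  # s_xy / s_xx
rho_xx = phi / (phi + theta)                    # reliability of x = 0.5

print(gamma_star)    # ≈ gamma * rho_xx = 1.0, attenuated from gamma = 2
```

With reliability 0.5, the naive slope converges to half the true γ no matter how large n grows: inconsistency, not just small-sample bias.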

SLIDE 8
  • Correlations:

ρxy = cov(x, y) / √(var(x) var(y))
    = [cov(ξ, η) / √(var(ξ) var(η))] · √(var(ξ)/var(x)) · √(var(η)/var(y))
    = ρξη √(ρxx ρyy)

which shows an attenuation of the “true” correlation due to measurement error, with the familiar correction formula:

ρ̂ξη = rxy / √(ρxx ρyy)

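A sketch of the correction for attenuation in action; the reliabilities (0.8, 0.6) and true correlation (0.5) are assumed values chosen for the demonstration:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500_000

rho_xx, rho_yy, rho_true = 0.8, 0.6, 0.5    # assumed reliabilities / true corr

xi = rng.normal(size=n)
eta = rho_true * xi + np.sqrt(1 - rho_true**2) * rng.normal(size=n)
# error variances chosen so var(xi)/var(x) = rho_xx, var(eta)/var(y) = rho_yy
x = xi + rng.normal(0.0, np.sqrt(1 / rho_xx - 1), n)
y = eta + rng.normal(0.0, np.sqrt(1 / rho_yy - 1), n)

r_xy = np.corrcoef(x, y)[0, 1]
print(r_xy)                               # attenuated: ≈ 0.5*sqrt(0.48) ≈ 0.35
print(r_xy / np.sqrt(rho_xx * rho_yy))    # corrected: ≈ 0.5
```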
SLIDE 9

Consequences in multiple regression


  • True data structure:

y = γ′ξ + ζ,  x = ξ + δ

with Λx = I and λy = 1

  • Ignoring measurement errors:

y = γ*′x + ζ*

[Path diagram: ξ₁, ξ₂, ξ₃ → η (weights γ₁, γ₂, γ₃, disturbance ζ); each ξᵢ → xᵢ (loading 1, error δᵢ); η → y (loading 1, error ε)]

σξy = cov(ξ, γ′ξ + ζ) = Φγ

σxy = cov(x, y) = cov(ξ + δ, γ′ξ + ζ) = Φγ

SLIDE 10

  • Without measurement error (Θδ = 0),

γ* = Σxx⁻¹ σxy = Φ⁻¹ Φγ = γ

  • Otherwise,

γ* = Σxx⁻¹ σxy = (Φ + Θδ)⁻¹ Φγ ≠ γ

  • Alternatively written: γ* = (Σxx⁻¹ Σxξ) γ, since Σxξ = Φ --- where Σxx⁻¹ Σxξ is the OLS estimator of B in ξ = Bx + e, i.e., the regression weights for prediction of ξ by x; again, without measurement error, Σxx⁻¹ Σxξ = I
  • Note: in Bollen (pp. 159-168), Γ and Σxy are meant to be γ and σxy, respectively, for the multiple regression model
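The population-level bias can be computed directly from the matrix expression above; the Φ, Θδ, and γ values below are hypothetical, chosen only to illustrate the calculation:

```python
import numpy as np

# Assumed population matrices for q = 3 latent IVs
Phi = np.array([[1.0, 0.3, 0.2],
                [0.3, 1.0, 0.4],
                [0.2, 0.4, 1.0]])       # cov(xi)
Theta = np.diag([0.5, 0.0, 0.0])        # measurement-error variances
gamma = np.array([1.0, 0.5, -0.3])      # true regression weights

Sigma_xx = Phi + Theta                  # cov(x) with Lambda_x = I
sigma_xy = Phi @ gamma                  # cov(x, y)

gamma_star = np.linalg.solve(Sigma_xx, sigma_xy)  # limit of naive OLS
print(gamma_star)                       # differs from gamma

# With Theta = 0 the same formula recovers gamma exactly
assert np.allclose(np.linalg.solve(Phi, Phi @ gamma), gamma)
```

Note that even though only x₁ is fallible here, every entry of γ* can differ from γ, because Σxx⁻¹ mixes the variables.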

SLIDE 11

  • As a very simplified case, suppose x₁ is the only fallible indicator:

x₁ = ξ₁ + δ₁,  xᵢ = ξᵢ,  i = 2, …, q

with the true and estimated regression equations:

y = γ₁ξ₁ + γ₂ξ₂ + … + γqξq + ζ = γ₁*x₁ + γ₂*x₂ + … + γq*xq + ζ*

  • In this special case, the regression weight matrix has a simple multiplicative form of bias (hint: use (Φ + Θδ)⁻¹Φ = I − (Φ + Θδ)⁻¹Θδ):

Σxx⁻¹ Σxξ = (Φ + Θδ)⁻¹ Φ = [ c₁  0′ ; c  I ],  c = (c₂, …, cq)′

SLIDE 12
  • The bias factor for x₁ is less than 1 in absolute value (1 without measurement error), and so γ₁* is biased toward 0 --- the bias factor is the regression weight b₁ in

ξ₁ = b₀ + b₁x₁ + b₂ξ₂ + … + bqξq

  • Consequences for xᵢ, i = 2, …, q are additive, depending on the relationships between ξ₁ and ξᵢ holding all other IVs constant, and on γ₁
  • Consequently, the resulting biased weights are:

γ₁* = b₁γ₁,  γᵢ* = γᵢ + bᵢγ₁,  i = 2, …, q

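This bias structure can be verified numerically against the population regression of ξ₁ on (x₁, ξ₂, …, ξq); the matrices below are illustrative assumptions, not values from the slides:

```python
import numpy as np

# Assumed population matrices; only x1 is fallible
Phi = np.array([[1.0, 0.4, 0.2],
                [0.4, 1.0, 0.3],
                [0.2, 0.3, 1.0]])
Theta = np.diag([0.5, 0.0, 0.0])        # var(delta_1) = 0.5 only
gamma = np.array([1.0, 0.5, -0.3])

Sigma_xx = Phi + Theta
gamma_star = np.linalg.solve(Sigma_xx, Phi @ gamma)   # biased weights

# b: population weights for predicting xi1 from (x1, xi2, xi3);
# cov(x, xi1) is the first column of Phi
b = np.linalg.solve(Sigma_xx, Phi[:, 0])

assert np.isclose(gamma_star[0], b[0] * gamma[0])              # gamma1* = b1*gamma1
assert np.allclose(gamma_star[1:], gamma[1:] + b[1:] * gamma[0])  # additive bias
```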
SLIDE 13
  • So far, all reasoning has been based on rather unrealistic assumptions:
  • Only a single indicator per latent variable, so that its loading becomes simply a scaling constant
  • Only one fallible IV
  • Without such assumptions (e.g., all IVs fallible), the consequences of measurement error become too complicated and hard to simplify algebraically --- no particular simple form of Σxx⁻¹ Σxξ
  • One clear conclusion: all estimates are inconsistent --- systematically different from what they are meant to estimate

SLIDE 14
  • The consequence for the SMC is similar to the bivariate case:

plim R̂*² ≤ plim R̂²

  • Consequence in standardization: standardized weights are biased as well, since each var(xᵢ) = ϕᵢᵢ + var(δᵢ) replaces the latent variance ϕᵢᵢ
  • What should we do with essentially omnipresent measurement error?
  • Use SEM, which allows for measurement errors in the model --- though we are limited in certain models regarding model identification (e.g., Table 5.1, p. 164)

SLIDE 15

Correlated errors of measurement

  • The consequence for regression weights is further complicated:

γ* = Σxx⁻¹ σxy = Σxx⁻¹ Σxξ γ + Σxx⁻¹ σδε

For simple regression:

γ* = cov(x, y) / var(x) = (γϕ + σδε) / (ϕ + var(δ))

Now γ* is not necessarily < γ

  • If correlated measurement errors are only within IVs (i.e., σδε = 0, Σxx = Φ + Θδ where Θδ is not diagonal), then γ* = Σxx⁻¹ Σxξ γ still holds (but the bias factor will have a more complicated form, also involving the off-diagonal entries of Θδ)
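A one-line numeric example (with assumed values for γ, ϕ, var(δ), and σδε) shows that a positive error correlation can push γ* above γ instead of attenuating it:

```python
# Population simple-regression limit with correlated errors:
# gamma* = (gamma*phi + sigma_de) / (phi + theta_delta)
gamma, phi, theta_delta = 1.0, 1.0, 0.5   # assumed values
sigma_de = 0.8                            # cov(delta, epsilon) > 0

gamma_star = (gamma * phi + sigma_de) / (phi + theta_delta)
print(gamma_star)                         # 1.2 > gamma: no longer attenuated
```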

SLIDE 16
With multi-equations

  • In path models with sequential causal paths, the consequences of measurement errors are very hard to generalize simply --- see the union sentiment (Fig. 5.2, p. 169) and SES (Fig. 5.4, p. 173) examples
  • If reliabilities are known, the corresponding error variances can be constrained; if unknown, the error variances may be modeled as free parameters, provided that they are identifiable
  • Keep in mind: we need more than one indicator per latent variable for identifiability and statistical testing --- leading to measurement models with multiple indicators, or CFA