Mixed Models II - Behind the Scenes Report (revised June 11, 2002)

SLIDE 1

Mixed Models II - Behind the Scenes Report

Revised June 11, 2002 by G. Monette

  • What is a mixed model "really" estimating?
  • Paradox lost - paradox regained - paradox lost again.

"Simple example": 4 patients, each observed at 3 dosages of a drug for depression.

Q: Is the drug effective?

Problem: the data are "observational", not "experimental". Dosage might be affected by severity of illness.

SLIDE 2

OLS approaches:

  • 1) Pooled (ignore subjects). Model: $Y \sim X$. ∴ Drug bad.
  • 2) Aggregate + regress, by subject. Model: $\bar{Y} \sim \bar{X}$. ∴ Drug even worse.

SLIDE 3

  • 3) Within subject. Model: $Y \sim X + \mathrm{Sub}$. ∴ Drug is good.

Note:

  • 2 ∼ ecological correlation
  • 2 vs 3: Robinson's Paradox, e.g. life expectancy vs. smoking aggregated by country
  • 1 vs 3: Simpson's Paradox

Call the slope in

  • 1. $\hat\gamma_P$ (Pooled)
  • 2. $\hat\gamma_B$ (Between)
  • 3. $\hat\gamma_W$ (Within)

Let $W_B$ be the weight (precision) of $\hat\gamma_B$ and $W_W$ the weight (precision) of $\hat\gamma_W$. Then

$$\hat\gamma_P = (W_B + W_W)^{-1}\,(W_B \hat\gamma_B + W_W \hat\gamma_W)$$
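The three slopes and the weighted-combination identity can be checked on a small data set. The following is a minimal sketch: the data are hypothetical (4 patients, 3 doses each, made up to mimic the situation described, not the slides' data), and the weights are taken as the between and within sums of squares of X, which are proportional to the OLS precisions when a common error variance is assumed.

```python
# Hypothetical data: patient -> (doses, depression scores). Not the slides' data.
data = {
    "p1": ([1, 2, 3], [18, 16, 15]),
    "p2": ([3, 4, 5], [24, 21, 20]),
    "p3": ([5, 6, 7], [29, 28, 26]),
    "p4": ([7, 8, 9], [36, 35, 32]),
}


def mean(values):
    return sum(values) / len(values)


all_x = [x for xs, _ in data.values() for x in xs]
all_y = [y for _, ys in data.values() for y in ys]
grand_x, grand_y = mean(all_x), mean(all_y)

# 1) Pooled slope: ordinary regression ignoring subjects.
gamma_p = (sum((x - grand_x) * (y - grand_y) for x, y in zip(all_x, all_y))
           / sum((x - grand_x) ** 2 for x in all_x))

# Between and within sums of squares and cross-products.
sb_xx = sb_xy = sw_xx = sw_xy = 0.0
for xs, ys in data.values():
    mx, my, n = mean(xs), mean(ys), len(xs)
    sb_xx += n * (mx - grand_x) ** 2
    sb_xy += n * (mx - grand_x) * (my - grand_y)
    sw_xx += sum((x - mx) ** 2 for x in xs)
    sw_xy += sum((x - mx) * (y - my) for x, y in zip(xs, ys))

gamma_b = sb_xy / sb_xx  # 2) slope of subject means on subject means
gamma_w = sw_xy / sw_xx  # 3) slope within subjects

# Weights proportional to precision (assumption: common error variance).
w_b, w_w = sb_xx, sw_xx
gamma_combo = (w_b * gamma_b + w_w * gamma_w) / (w_b + w_w)

print("pooled:", gamma_p, "between:", gamma_b, "within:", gamma_w)
print("weighted combination:", gamma_combo)
```

In this made-up example the within slope is negative while the pooled and between slopes are positive, which is the pattern of conclusions listed for approaches 1 to 3.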

SLIDE 4

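So $\hat\gamma_P$ is an optimal combination of $\hat\gamma_B$ and $\hat\gamma_W$.

Note: "optimal" only if the common model is correct. In our example $\hat\gamma_W$ is probably a "good" estimator, $\hat\gamma_B$ is NOT. So $\hat\gamma_P$ mixes the good and the bad.

What does a Mixed Model do? Does it give us $\hat\gamma_W$? Or $\hat\gamma_P$?

Answer: something in between.

With a random intercept model, Y ∼ X / ∼ 1 | Sub, $\hat\gamma_{MM}$ is also an "optimal" combination of $\hat\gamma_W$ and $\hat\gamma_B$.

If

$$Y_{ij} = \gamma_{00} + \gamma_{10} X_{ij} + u_{0j} + \varepsilon_{ij}, \qquad \mathrm{Var}(u_{0j}) = \tau_{00}, \quad \mathrm{Var}(\varepsilon_{ij}) = \sigma^2$$

then

  • Weight on $\hat\gamma_B$ = $\dfrac{\sigma^2/n}{\sigma^2/n + \tau_{00}}$ × OLS weight
  • Weight on $\hat\gamma_W$ = OLS weight

i.e.

$$\hat\gamma_{MM} = \left(\frac{\sigma^2/n}{\sigma^2/n + \tau_{00}}\,W_B + W_W\right)^{-1} \left(\frac{\sigma^2/n}{\sigma^2/n + \tau_{00}}\,W_B \hat\gamma_B + W_W \hat\gamma_W\right)$$

  • Note: here $n$ is the number of observations per subject.
  • If $\sigma^2/n$ is very small relative to $\tau_{00}$ (so that $\frac{\sigma^2/n}{\sigma^2/n + \tau_{00}} \approx 0$), then $\hat\gamma_{MM} \approx \hat\gamma_W$.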

SLIDE 5

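  • If $\sigma^2/n$ is big relative to $\tau_{00}$ (so that $\frac{\sigma^2/n}{\sigma^2/n + \tau_{00}} \approx 1$), then $\hat\gamma_{MM} \approx \hat\gamma_P$.

So $\hat\gamma_{MM}$ is between $\hat\gamma_P$ and $\hat\gamma_W$, depending on $\sigma^2/(n\tau_{00})$.

If the between-subject effect is not the same as the within-subject effect (often the case with observational data), then the Mixed Model can give biased results.

In our example: [figure not recovered]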
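The weighting rule for $\hat\gamma_{MM}$ can also be tried out numerically. This is an illustrative sketch only: the function name `mixed_model_slope` and every numeric input are hypothetical, and the variance components are treated as known rather than estimated.

```python
def mixed_model_slope(gamma_b, gamma_w, w_b, w_w, sigma2, tau00, n):
    """Random-intercept slope as a weighted combination of the between and
    within slopes; the between weight is scaled by (s2/n) / (s2/n + tau00)."""
    factor = (sigma2 / n) / (sigma2 / n + tau00)
    return (factor * w_b * gamma_b + w_w * gamma_w) / (factor * w_b + w_w)


# Hypothetical inputs (not from the slides).
gamma_b, gamma_w = 3.0, -1.75  # between and within slopes
w_b, w_w = 60.0, 8.0           # OLS weights
n = 3                          # observations per subject

# tau00 = 0: no subject-level variance, so the result is the pooled slope.
as_pooled = mixed_model_slope(gamma_b, gamma_w, w_b, w_w, sigma2=1.0, tau00=0.0, n=n)

# tau00 huge relative to sigma2/n: the result approaches the within slope.
as_within = mixed_model_slope(gamma_b, gamma_w, w_b, w_w, sigma2=1.0, tau00=1e9, n=n)

# Intermediate case: somewhere in between.
in_between = mixed_model_slope(gamma_b, gamma_w, w_b, w_w, sigma2=1.0, tau00=1.0, n=n)

print(as_pooled, in_between, as_within)
```

The intermediate value moves from the pooled slope toward the within slope as `tau00` grows relative to `sigma2 / n`.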

SLIDE 6

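To the Rescue: contextual variable and within-subject effects.

Idea: decompose $X_{ij}$:

$$X_{ij} = \bar X_{\cdot j} + (X_{ij} - \bar X_{\cdot j})$$

where $\bar X_{\cdot j}$ is the mean within subject and $X_{ij} - \bar X_{\cdot j}$ is the deviation from the within-subject mean.

Use $\bar X_{\cdot j}$ as a level-2 variable and $X_{ij} - \bar X_{\cdot j}$ as a level-1 variable.

Consider OLS:

$$E(Y) = \gamma_0 + \gamma_1 \bar X_{\cdot j} + \gamma_2 (X_{ij} - \bar X_{\cdot j})$$

What is $\gamma_1$? $\Delta E(Y) / \Delta \bar X_{\cdot j}$ when $X_{ij} - \bar X_{\cdot j} = 0$, i.e. the between-subject effect when $X_{ij} = \bar X_{\cdot j}$.

What is $\gamma_2$? Answer: $\hat\gamma_2 = \hat\gamma_W$ !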
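One way to see why $\hat\gamma_2 = \hat\gamma_W$: the deviation column sums to zero within every subject, so it is orthogonal to any column that is constant within subject (the intercept and $\bar X_{\cdot j}$). Its OLS coefficient is therefore the same as in a simple regression on the deviations alone. A minimal sketch with hypothetical data (the same made-up layout of 4 subjects with 3 observations each; all variable names are illustrative):

```python
# Hypothetical long-format data: subject id, dose x, response y.
groups = [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4]
x = [1, 2, 3, 3, 4, 5, 5, 6, 7, 7, 8, 9]
y = [18, 16, 15, 24, 21, 20, 29, 28, 26, 36, 35, 32]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Subject means of x, repeated on every row of the subject (level-2 variable).
group_mean = {
    g: sum(xi for xi, gi in zip(x, groups) if gi == g) / groups.count(g)
    for g in set(groups)
}
xbar = [group_mean[g] for g in groups]

# Deviations from the subject mean (level-1 variable).
dev = [xi - mi for xi, mi in zip(x, xbar)]
ones = [1.0] * len(x)

# Orthogonality: both dot products are zero.
print(dot(dev, ones), dot(dev, xbar))

# Because of that orthogonality, the coefficient of `dev` in the three-term
# regression equals the simple-regression slope on `dev`: the within slope.
gamma_2 = dot(dev, y) / dot(dev, dev)
print("gamma_2 (within slope):", gamma_2)
```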

SLIDE 7

So a possible approach for Mixed Models:

  • 1. Transform inner variables into outer means and inner deviations. Handy trick for small datasets:

    PROC GLM;
      CLASS SUB;
      MODEL X = SUB;  /* assumed: X is the inner variable being decomposed */
      OUTPUT OUT = dsname P = X_B R = X_W;  /* P = subject mean, R = deviation */
    RUN;

  • 2. Model with both.
  • 3. Test equality of parameters. If equal, can revert to the raw inner variable.
  • 4. If not equal, rich possibilities for interpretation.

Note: much discussion on "centering"

  • Should we use $X_{ij} - \bar X_{\cdot j}$ or $X_{ij} - \bar X$?
  • Often seems inconclusive. [See centering on $X_{\min}$ in next section]

Problem:

  • Centering $X_{ij}$ defines $\gamma$ for $\bar X_{\cdot j}$
  • has little effect on $\gamma$ for $X_{ij}$ - "center" !!

Essentially

  • 1. Controlling for $\bar X_{\cdot j}$ defines $\gamma$ for $X_{ij}$ − whatever
  • 2. Controlling for $X_{ij}$ − whatever defines $\gamma$ for $\bar X_{\cdot j}$

Compositional vs contextual effects

We have three variables to consider. (Note that we can start with GMC (grand mean centering) of $X_{ij}$. This only affects the meaning of the intercept (and the effects of other interacting variables) but does not affect the effects below.)

3 variables:

SLIDE 8
  • 1. Xij (raw variables)
  • 2. Xij − X·j (CWC: centered within contexts)
  • 3. X·j (contextual mean)

Note that we can't use all 3 variables (why not?)

3 effects:

  • 1. $\gamma_W$ within context
  • 2. $\gamma_B$ between contexts = compositional effect
  • 3. $\gamma_C$ contextual effect

Diagram: (R&B p. 140)

$$\gamma_B = \gamma_W + \gamma_C$$

How to get what:

                         Raw          CWC                            Mean
  Variables:             $X_{ij}$     $X_{ij} - \bar X_{\cdot j}$    $\bar X_{\cdot j}$
  Contextual model:      $\gamma_W$   -                              $\gamma_C$
  Compositional model:   -            $\gamma_W$                     $\gamma_B$
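The relation $\gamma_B = \gamma_W + \gamma_C$ can be checked by fitting both parameterizations with OLS. The sketch below is an illustration on hypothetical data, not the slides' analysis. The helpers `solve` and `ols` are small hand-written routines, not library functions, and the fits are plain OLS rather than mixed models.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(columns, y):
    """Least-squares coefficients of y on the given columns (normal equations)."""
    k = len(columns)
    xtx = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(p * v for p, v in zip(columns[i], y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical data: context (group) id, raw X, response Y.
groups = [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4]
x = [1, 2, 3, 3, 4, 5, 5, 6, 7, 7, 8, 9]
y = [18, 16, 15, 24, 21, 20, 29, 28, 26, 36, 35, 32]

group_mean = {
    g: sum(xi for xi, gi in zip(x, groups) if gi == g) / groups.count(g)
    for g in set(groups)
}
xbar = [group_mean[g] for g in groups]        # contextual mean
cwc = [xi - mi for xi, mi in zip(x, xbar)]    # centered within contexts
ones = [1.0] * len(x)

# Contextual model: raw X plus the mean -> (intercept, gamma_W, gamma_C).
_, ctx_w, gamma_c = ols([ones, x, xbar], y)

# Compositional model: CWC plus the mean -> (intercept, gamma_W, gamma_B).
_, comp_w, gamma_b = ols([ones, cwc, xbar], y)

print("gamma_W:", ctx_w, comp_w)
print("gamma_C:", gamma_c, "gamma_B:", gamma_b)
```

Both fits give the same coefficient for the level-1 variable, and the coefficient of the mean differs between them by exactly that within slope.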

SLIDE 9

Hausman specification test: (S&B p. 87)

  • Random vs Fixed: Is $\gamma_X$ in the random intercept model an unbiased estimator of $\gamma_W$?
  • Idea: equivalent to testing $\gamma_C = 0$.
      - If so, then $\gamma_B = \gamma_W$ and the random intercept model is ok.
      - If not, then use the 'fixed effects' model.
      - Third choice: use the contextual or compositional model.

Contextual or compositional?

  • Fixed part is equivalent (same except for labelling of effects) since $X_C = X_B A$ for some non-singular matrix $A$. Equivalently, we note that $\mathrm{span}(X_C) = \mathrm{span}(X_B)$.
  • But the random part is different. Two random models are equivalent if there is a single $A$ such that $Z_{Cj} = Z_{Bj} A$ for every $j$. Consider using $X_{ij} - \bar X_{\cdot j}$ and 1 in the random model instead of $X_{ij}$ and 1:

Two solutions:

SLIDE 10
  • 1. Choose according to the desired random model. (S&B prefer contextual, p. 80ff.) Then estimate the desired parameters with ESTIMATE statements.
  • 2. Choose variables for the fixed part to get the estimates you want. Choose variables for the random part according to the desired pattern of variance. Variables don't have to be the same! As long as the fixed part is equivalent to that of a model with the same random effects, the models are equivalent.
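The equivalence of the fixed parts can be written out explicitly. As a sketch, assume the column ordering (1, level-1 variable, contextual mean) for both design matrices; this ordering is a choice made here for illustration.

```latex
X_C = \begin{pmatrix} 1 & X_{ij} & \bar X_{\cdot j} \end{pmatrix},
\qquad
X_B = \begin{pmatrix} 1 & X_{ij} - \bar X_{\cdot j} & \bar X_{\cdot j} \end{pmatrix},
\qquad
X_C = X_B A,
\quad
A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix},
\quad \det A = 1 .

% Since X_C \beta_C = X_B (A \beta_C), the compositional coefficients are
\beta_B = A \beta_C :
\qquad
\begin{pmatrix} \gamma_0 \\ \gamma_W \\ \gamma_B \end{pmatrix}
=
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}
\begin{pmatrix} \gamma_0 \\ \gamma_W \\ \gamma_C \end{pmatrix}
\;\Longrightarrow\;
\gamma_B = \gamma_W + \gamma_C .
```

Read this way, an effect that a chosen parameterization does not report directly is a linear combination of the ones it does report. For example, $\gamma_B$ in the contextual model is the sum of the coefficients of $X_{ij}$ and $\bar X_{\cdot j}$, which is the kind of quantity an ESTIMATE statement is meant to produce.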

SLIDE 11

Multivariate EBLUPs

e.g.

$$\beta_{0j} = \gamma_{00} - u_{0j}, \qquad \beta_{1j} = \gamma_{10} - u_{1j}$$

with: […]

Let […] be the OLS estimate of […]

SLIDE 12

Combine OLS with Empirical Prior

Amount and direction of shrinkage depend on the shape of T.

Note what happens if we drop a random effect: […]

Note: Shrinkage of […] works the same way. Just centre the picture at […]

Note: If T is "small" in some direction, collapse into T.
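How the shape of T drives the shrinkage can be illustrated with a two-coefficient case. This is a sketch under explicit assumptions: T and the sampling variance V of the OLS estimate are treated as known, the combination uses the usual normal-theory form $\gamma + T(T+V)^{-1}(\hat\beta_j - \gamma)$, and all names and numbers below are hypothetical.

```python
def inv2(m):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul2(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def shrink(beta_hat, gamma, t, v):
    """Combine an OLS estimate with the prior mean: gamma + T (T+V)^-1 (beta_hat - gamma)."""
    t_plus_v = [[t[i][j] + v[i][j] for j in range(2)] for i in range(2)]
    weight = matmul2(t, inv2(t_plus_v))
    diff = [beta_hat[0] - gamma[0], beta_hat[1] - gamma[1]]
    return [gamma[i] + weight[i][0] * diff[0] + weight[i][1] * diff[1]
            for i in range(2)]


gamma = [10.0, 2.0]            # hypothetical population intercept and slope
beta_hat = [14.0, 5.0]         # hypothetical OLS estimate for one subject
v = [[1.0, 0.0], [0.0, 1.0]]   # hypothetical sampling variance of beta_hat

# "Round" T: both coefficients shrink by the same factor, 4 / (4 + 1).
t_round = [[4.0, 0.0], [0.0, 4.0]]
print(shrink(beta_hat, gamma, t_round, v))

# T "small" in the slope direction: the slope collapses onto gamma[1],
# while the intercept shrinks as before.
t_flat = [[4.0, 0.0], [0.0, 1e-8]]
print(shrink(beta_hat, gamma, t_flat, v))
```

Letting a diagonal entry of T go to zero is the limiting case of dropping that random effect: the corresponding coefficient is pulled all the way to its population value.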

SLIDE 13

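'Appendix': Creating contextual variables in large data sets

Suppose we want to create a contextual variable and a within-cluster deviation for a variable X:

    /* One X value for GROUP 2 is absent from the source listing; */
    /* it is entered here as a SAS missing value (.).             */
    DATA MYDATA;
      INPUT GROUP X;
      CARDS;
    1 10
    1 20
    1 30
    2 .
    2 10
    2 20
    3 40
    3 50
    ;
    RUN;

    /* Group means of X; the data must already be sorted by GROUP. */
    PROC MEANS DATA=MYDATA;
      BY GROUP;
      VAR X;
      OUTPUT OUT=NEW MEAN=M;
    RUN;

    /* Keep only the group identifier and the group mean. */
    DATA NEW;
      SET NEW;
      KEEP M GROUP;
    RUN;

    /* Attach the group mean to every observation. */
    DATA TOGETHER;
      MERGE MYDATA NEW;
      BY GROUP;
      X_W = X - M;  /* within-cluster deviation (added; not in the source listing) */
    RUN;

    PROC PRINT DATA=TOGETHER;
    RUN;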
