Estimating Distributional Parameters in Hierarchical Models



slide-1
SLIDE 1

Estimating Distributional Parameters in Hierarchical Models

slide-2
SLIDE 2

Introduction: Variability in Hierarchical Models

slide-3
SLIDE 3

Linear Models

  • Modelling central tendency
  • Response ($y_{ij}$) is a sum of intercept ($\beta_0$), slopes ($\beta_1, \beta_2, \dots$), and error ($e_{ij}$)
  • Error is assumed to be normally distributed around zero

$y_{ij} = \beta_0 + \beta_1 X_{ij} + e_{ij}, \quad e_{ij} \sim N(0, \sigma^2)$

slide-4
SLIDE 4

Linear Models

  • Modelling central tendency
  • Response (y) is a sum of intercept (implicit), slopes (pred), and error (implicit)
  • Error is assumed to be normally distributed around zero

lm(y ~ pred)
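To make the implicit pieces concrete, here is a minimal base R sketch that simulates data from the linear model and recovers the parameters with lm(). The sample size and the true values (b0, b1, sigma_e) are illustrative assumptions, not values from the talk.

```r
# Simulate y = b0 + b1 * pred + e, with e ~ N(0, sigma_e^2).
# All numbers below are made up for illustration.
set.seed(1)
n       <- 500
b0      <- 2.0   # intercept (implicit in the lm formula)
b1      <- 0.5   # slope for pred
sigma_e <- 1.0   # residual standard deviation (error is implicit in lm)

pred <- runif(n, min = 0, max = 10)
y    <- b0 + b1 * pred + rnorm(n, mean = 0, sd = sigma_e)

fit <- lm(y ~ pred)

coef(fit)    # estimated intercept and slope
sigma(fit)   # estimated residual standard deviation
```

Note that lm() estimates a single residual standard deviation for the whole data set: the model describes how the mean changes with pred, not how the spread changes.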

slide-5
SLIDE 5

Linear Mixed Effects Models

  • Modelling central tendency
  • Response ($y_{ij}$) is a sum of intercept ($\beta_0$), slopes ($\beta_1, \beta_2, \dots$), random unit intercepts ($\mu_{0i}$), random unit slopes ($\mu_{1i}$), and error ($e_{ij}$)
  • Error, random intercepts, and random slopes are assumed to be normally distributed around zero

$y_{ij} = \beta_0 + \mu_{0i} + (\beta_1 + \mu_{1i}) X_{ij} + e_{ij}$
$\mu_{0i} \sim N(0, \sigma_{\mu_0}^2), \quad \mu_{1i} \sim N(0, \sigma_{\mu_1}^2), \quad e_{ij} \sim N(0, \sigma^2)$

slide-6
SLIDE 6

Linear Mixed Effects Models

  • Modelling central tendency
  • Response (y) is a sum of intercept (implicit), slopes (pred), random unit intercepts (implicit in (pred | rand_unit)), random unit slopes (pred | rand_unit), and error (implicit)
  • Error, random intercepts, and random slopes are assumed to be normally distributed around zero

lmer(y ~ pred + (pred | rand_unit))
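As a rough sketch of what this formula assumes, the following simulates data with unit-level intercepts and slopes in base R and, if lme4 happens to be installed, fits the model above. The number of units, observations per unit, and all true values are assumptions chosen for illustration. The simulation draws intercepts and slopes independently, whereas (pred | rand_unit) also estimates their correlation.

```r
# Simulate y = b0 + u0[unit] + (b1 + u1[unit]) * pred + e.
# All numbers below are made up for illustration.
set.seed(2)
n_units  <- 30    # number of random units (e.g. subjects)
n_obs    <- 20    # observations per unit
b0       <- 2.0   # fixed intercept
b1       <- 0.5   # fixed slope
sd_int   <- 1.0   # SD of random intercepts
sd_slope <- 0.3   # SD of random slopes
sd_err   <- 0.5   # residual SD

unit <- rep(seq_len(n_units), each = n_obs)
u0   <- rnorm(n_units, mean = 0, sd = sd_int)    # one intercept offset per unit
u1   <- rnorm(n_units, mean = 0, sd = sd_slope)  # one slope offset per unit
pred <- runif(n_units * n_obs, min = 0, max = 10)
y    <- b0 + u0[unit] + (b1 + u1[unit]) * pred +
  rnorm(n_units * n_obs, mean = 0, sd = sd_err)

dat <- data.frame(y = y, pred = pred, rand_unit = factor(unit))

# Fit only if lme4 is available; the simulation itself needs base R only.
if (requireNamespace("lme4", quietly = TRUE)) {
  m <- lme4::lmer(y ~ pred + (pred | rand_unit), data = dat)
  print(lme4::VarCorr(m))  # estimated SDs of intercepts, slopes, and residual
}
```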

slide-7
SLIDE 7

Example Non-Gaussian Data: RT

  • 2AFC: does the word match the picture?
  • Congruency (2) x Predictability (12%–100%)
  • 35 Subjects, 200 trials

[Figure: example stimuli (fixation cross "+"; words "bandage", "sardine")]

slide-8
SLIDE 8

Gamma Family GLMM

m_glmer <- glmer(
  rt ~ cong * pred + (cong * pred | subj) + (cong | image) + (1 | word),
  family = Gamma(identity),
  control = glmerControl(
    optimizer = "bobyqa",
    optCtrl = list(maxfun = 2e5)
  )
)

slide-9
SLIDE 9

GLMM Results

slide-10
SLIDE 10

GLMM Results – Random Effects

summary(m_glmer)

slide-11
SLIDE 11

GLMM Results – Random Effects

ranef(m_glmer)

slide-12
SLIDE 12

GLMM Results – Random Effects

m_glmer %>% ranef() %>% as.data.frame()

slide-13
SLIDE 13

GLMM Results – Random Effects

ranef(m_glmer) %>%
  as_tibble() %>%
  filter(grpvar == "subj") %>%
  mutate(grp = fct_reorder2(grp, term, condval)) %>%
  ggplot(aes(
    x = grp, y = condval,
    ymin = condval - condsd, ymax = condval + condsd
  )) +
  geom_pointrange(size = 0.25) +
  facet_wrap(vars(term), scales = "free", nrow = 2)

slide-14
SLIDE 14

GLMM Results – Random Effects – Subject

slide-15
SLIDE 15

GLMM Results – Random Effects – Image

slide-16
SLIDE 16

GLMM Results – Random Effects – Word

slide-17
SLIDE 17
slide-18
SLIDE 18

Estimating Distributional Parameters in Hierarchical Models

slide-19
SLIDE 19

What if Meaningful Effects on Variance?

  • All GLM variants model single parameters (i.e. central tendency)
  • What if your effect looks like this?
slide-20
SLIDE 20

What if Meaningful Effects on Variance?

  • Mu is higher: F(1, 1998) = 3237, p < .001
  • Sigma is higher: Levene's F(1, 1998) = 550, p < .001
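Two separate tests like these can be sketched in base R as follows. The data are simulated under assumed group means and SDs (so the F values will not match those on the slide), with 1000 observations per group to give the same degrees of freedom. Levene's test is computed by hand as a one-way ANOVA on absolute deviations from each group's mean.

```r
# Two groups that differ in both mean and spread.
# Group sizes give df = (1, 1998); means and SDs are illustrative assumptions.
set.seed(3)
n <- 1000
dat <- data.frame(
  group = factor(rep(c("a", "b"), each = n)),
  value = c(rnorm(n, mean = 10, sd = 1), rnorm(n, mean = 12, sd = 2))
)

# Test on mu: one-way ANOVA on the raw values.
f_mu <- anova(lm(value ~ group, data = dat))

# Test on sigma (Levene's test): one-way ANOVA on the absolute
# deviations of each value from its own group mean.
dat$abs_dev <- abs(dat$value - ave(dat$value, dat$group))
f_sigma <- anova(lm(abs_dev ~ group, data = dat))

f_mu[1, c("Df", "F value", "Pr(>F)")]
f_sigma[1, c("Df", "F value", "Pr(>F)")]
```

Both tests come out significant here, but they are two unconnected analyses: neither says how the mean and the spread relate, and neither handles random effects.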
slide-21
SLIDE 21

Assumption-free Distribution Comparison

  • Within a single model?
  • Assumption-free distribution comparison (e.g. Kolmogorov–Smirnov) could be one approach!
  • Overlapping index (Pastore & Calcagni, 2019) ranges from 0 (no overlap) to 1 (identical distribution)

slide-22
SLIDE 22

Assumption-free Distribution Comparison

x <- rnorm(1000, 10, 1)
y <- rnorm(1000, 10.5, 1.5)
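A minimal base R sketch of both comparisons on samples like these: ks.test() for Kolmogorov–Smirnov, and a hand-rolled approximation of an overlap index as the area under the pointwise minimum of two kernel density estimates. The helper overlap_index() is hypothetical and written for illustration; it is not the implementation from the cited paper, and the seed and grid settings are assumptions.

```r
set.seed(4)
x <- rnorm(1000, 10, 1)
y <- rnorm(1000, 10.5, 1.5)

# Kolmogorov-Smirnov: tests whether the two samples share one distribution.
ks <- ks.test(x, y)
ks$p.value

# Hypothetical helper: approximate overlap of two samples as the integral
# of min(f_a, f_b), where f_a and f_b are kernel density estimates
# evaluated on a shared grid.
overlap_index <- function(a, b, n_grid = 1024) {
  rng  <- range(c(a, b))
  pad  <- 0.1 * diff(rng)          # pad so the density tails are included
  from <- rng[1] - pad
  to   <- rng[2] + pad
  da <- density(a, from = from, to = to, n = n_grid)
  db <- density(b, from = from, to = to, n = n_grid)
  step <- da$x[2] - da$x[1]        # grid spacing for a Riemann sum
  sum(pmin(da$y, db$y)) * step
}

ov <- overlap_index(x, y)
ov   # somewhere between 0 (no overlap) and 1 (identical distributions)
```

Either number says that the distributions differ, but not whether the difference sits in the location, the spread, or both.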

slide-23
SLIDE 23

Assumption-free Distribution Comparison

slide-24
SLIDE 24

Overlap Index Mu * Sigma Parameter Space

slide-25
SLIDE 25

Overlap Index Mu * Sigma Parameter Space

slide-26
SLIDE 26

Overlap Index Mu * Sigma Parameter Space

slide-27
SLIDE 27

Weirder Distribution Example

slide-28
SLIDE 28

Weirder Distribution Example

slide-29
SLIDE 29

Summary so far

  • Assumption-free approaches are flexible but don't allow us to test/make any specific predictions
  • Equivalent of shrugging and saying "yeah idk probs something going on there" (though useful for very weird distributions)
  • Explicitly modelling multiple parameters of an assumed distribution can give us more meaningful info

slide-30
SLIDE 30

Distributional Parameters in brms

brm(
  bf(
    dv ~ 0 + Intercept + iv + (iv | rand_unit),
    sigma ~ 0 + Intercept + iv + (iv | rand_unit)
  ),
  control = list(adapt_delta = 0.999, max_treedepth = 12),
  save_pars = save_pars(all = TRUE)
)
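Written out, a model of this shape corresponds roughly to the following, assuming the default Gaussian family, for which brms puts a log link on sigma once sigma is given its own formula. The symbols are labels chosen here for illustration: $u$ and $v$ are unit-level deviations for the mean and for sigma, and $\gamma$ are the population-level coefficients for sigma.

```latex
\begin{aligned}
dv_{ij} &\sim \mathcal{N}(\mu_{ij},\, \sigma_{ij}) \\
\mu_{ij} &= \beta_0 + u_{0i} + (\beta_1 + u_{1i})\, iv_{ij} \\
\log \sigma_{ij} &= \gamma_0 + v_{0i} + (\gamma_1 + v_{1i})\, iv_{ij}
\end{aligned}
```

The second and third lines have the same structure, so an effect of iv on the spread is estimated in the same way as an effect on the mean, including its variation across random units.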

slide-31
SLIDE 31

Shifted Log-Normal Distribution

slide-32
SLIDE 32

Shifted Log-Normal Distribution
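As a sketch of what the distribution looks like in code: a shifted log-normal is a log-normal moved to the right by a fixed shift (the non-decision time, ndt, in the brms parameterisation). The helper names d_shifted_lognormal() and r_shifted_lognormal() and all parameter values below are assumptions for illustration.

```r
# Density: log-normal density evaluated at (x - shift).
# Values at or below the shift have zero density.
d_shifted_lognormal <- function(x, meanlog, sdlog, shift) {
  dlnorm(x - shift, meanlog = meanlog, sdlog = sdlog)
}

# Random draws: log-normal draws plus the shift.
r_shifted_lognormal <- function(n, meanlog, sdlog, shift) {
  shift + rlnorm(n, meanlog = meanlog, sdlog = sdlog)
}

# Illustrative RT-like values in seconds (made-up parameters).
set.seed(5)
rt <- r_shifted_lognormal(2000, meanlog = -1, sdlog = 0.5, shift = 0.2)

min(rt) > 0.2   # no simulated RT can be faster than the shift

# The density still integrates to 1 over (shift, Inf).
area <- integrate(
  d_shifted_lognormal, lower = 0.2, upper = Inf,
  meanlog = -1, sdlog = 0.5, shift = 0.2
)$value
area
```

The three parameters play different roles: meanlog moves the bulk of the distribution, sdlog changes its spread and skew, and shift sets the earliest possible response.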

slide-33
SLIDE 33

Bayesian Shifted Log-Normal Mixed Effects Model with Distributional Parameters

brms::bf(
  rt ~ 0 + Intercept + cong * pred + (cong * pred | subj) + (cong | image) + (1 | word),
  sigma ~ 0 + Intercept + cong * pred + (cong * pred | subj) + (cong | image) + (1 | word),
  ndt ~ 0 + Intercept + cong * pred + (cong * pred | subj) + (cong | image) + (1 | word)
)

slide-34
SLIDE 34
slide-35
SLIDE 35
slide-36
SLIDE 36
slide-37
SLIDE 37
slide-38
SLIDE 38
slide-39
SLIDE 39
slide-40
SLIDE 40

Bayesian Results – Random Effects

ranef(m_bme)

ID (e.g. subj_01, subj_02, …) × value (est, err, Q2.5, Q97.5) × fixed parameter
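To show how a three-dimensional structure like this is indexed, here is a base R mock-up of one grouping factor's array. It is not output from a fitted model: the values are zero placeholders, and the subject IDs and the coefficient names ("Intercept", "cong") are hypothetical.

```r
# Mock array with the same layout: ID x summary value x parameter.
# Zeros are placeholders, not estimates.
re_subj <- array(
  0,
  dim = c(3, 4, 2),
  dimnames = list(
    c("subj_01", "subj_02", "subj_03"),            # ID
    c("Estimate", "Est.Error", "Q2.5", "Q97.5"),   # summary value
    c("Intercept", "cong")                         # parameter (hypothetical names)
  )
)

# All four summary values for one subject's intercept deviation.
re_subj["subj_01", , "Intercept"]

# Point estimates for every subject and parameter (a 3 x 2 matrix).
re_subj[, "Estimate", ]
```

With a fitted brms model, the corresponding array for subjects would be extracted as ranef(m_bme)$subj.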

slide-41
SLIDE 41
slide-42
SLIDE 42
slide-43
SLIDE 43
slide-44
SLIDE 44
slide-45
SLIDE 45

Caveats

  • Computationally intensive if using non-informative priors for complex hierarchical formulae
  • Have to avoid the temptation to over-infer about mechanisms unless using more cognitively informed models (e.g. drift diffusion)

slide-46
SLIDE 46

Summary

Hierarchical models with maximal structures for distributional parameters are a robust and appropriate way of looking at or accounting for subject/item/etc. variability in fixed effects when you're interested in more than central tendency. But if you can assume no systematic differences in distributional parameters, GLMMs will suffice (and save you a lot of time and effort)!