Mixed Effect Models - Danielle Quinn, PhD Candidate, Memorial University



slide-1
SLIDE 1

Regression Modeling in R: Case Studies

Mixed Effect Models

REGRESSION MODELING IN R: CASE STUDIES

Danielle Quinn

PhD Candidate, Memorial University

slide-2
SLIDE 2

Before you start modeling...

slide-3
SLIDE 3

Before you start modeling...

orchids

  site abundance richness humidity tree_age
1    a        11        7     59.5       14
2    a        10        4     70.4       12
3    a        13        4     73.4        9

slide-4
SLIDE 4

What is the research question?

How does tree age influence the number of orchid species present?

slide-5
SLIDE 5

Grouped data

Data belonging to the same group are correlated
This violates assumptions about the independence of observations
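One way to see the problem is to fit a model that ignores the grouping and look at its residuals site by site. The sketch below uses simulated data as a stand-in for orchids; the values and the names sim, site_shift and pooled_fit are assumptions made for illustration, not the course data.

```r
# Simulated stand-in for the orchids data (hypothetical values)
set.seed(42)
n_sites <- 8
n_per_site <- 20
sim <- data.frame(
  site = factor(rep(letters[1:n_sites], each = n_per_site)),
  tree_age = runif(n_sites * n_per_site, min = 5, max = 20)
)

# Each site gets its own shift in richness, shared by all of its observations
site_shift <- rnorm(n_sites, mean = 0, sd = 1)
sim$richness <- 1.6 + 0.08 * sim$tree_age +
  site_shift[as.integer(sim$site)] +
  rnorm(nrow(sim), sd = 1.3)

# Fit a model that ignores site, then average its residuals within each site
pooled_fit <- lm(richness ~ tree_age, data = sim)
site_mean_resid <- tapply(resid(pooled_fit), sim$site, mean)
round(site_mean_resid, 2)
```

If observations were independent, the site averages of the residuals would all sit close to zero. Averages that drift away from zero for whole sites are the correlation within groups that the slide describes.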

slide-6
SLIDE 6

Fixed effects

Fixed effect: an unknown constant that is estimated from the data

linear_glm <- glm(richness ~ tree_age + site, data = orchids, family = "gaussian")

slide-7
SLIDE 7

The cost of fixed effects

Treating site as a fixed effect means that we are estimating parameters for each of the eight individual sites
Need to have sufficient data for each group
Adding parameters costs us degrees of freedom

linear_glm$coefficients

(Intercept)    tree_age       siteb       sitec       sited       sitee
 2.77185153  0.09323541  0.78985312 -2.06398531 -2.21142636 -0.68811751
      sitef       siteg       siteh
-1.41014688 -2.17285272 -1.79616157
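
The parameter count can be checked without fitting anything by looking at the design matrix that R builds for this formula. This is a small sketch with placeholder data (the tree_age values here are made up); it assumes R's default treatment contrasts, under which the first site is absorbed into the intercept as the reference level.

```r
# Eight sites, two placeholder observations each
site <- factor(rep(letters[1:8], each = 2))
tree_age <- seq_along(site)  # made-up ages, only needed to build the matrix

# Design matrix for richness ~ tree_age + site
X <- model.matrix(~ tree_age + site)

colnames(X)  # intercept, tree_age, then one column per non-reference site
ncol(X)      # number of coefficients the model has to estimate
```

With eight sites this gives nine columns: the intercept, the slope for tree_age, and seven site contrasts. Every additional site adds another parameter.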
slide-8
SLIDE 8

What are we actually trying to do with our data?

How does tree age influence the number of orchid species present?

slide-9
SLIDE 9

Random effects

Random effect: the model aims to estimate the distribution of the effect rather than estimate the effect itself as a constant
Concerned with the wider population rather than the individuals sampled

slide-10
SLIDE 10

Random intercept model

Estimates a distribution of the random effect of site on intercept

random = ~1 | site specifies a random intercept model where site will act as a random effect to influence the intercept of the linear model
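
A minimal sketch of what this looks like in practice, assuming the nlme package and simulated data in place of orchids (the data values and the names sim, site_shift and sim_model are hypothetical):

```r
library(nlme)

# Simulated stand-in for the orchids data (hypothetical values)
set.seed(1)
sim <- data.frame(
  site = factor(rep(letters[1:8], each = 20)),
  tree_age = runif(160, min = 5, max = 20)
)
site_shift <- rnorm(8, mean = 0, sd = 1)
sim$richness <- 1.6 + 0.08 * sim$tree_age +
  site_shift[as.integer(sim$site)] +
  rnorm(160, sd = 1.3)

# Random intercept model: one shared slope, intercept allowed to vary by site
sim_model <- lme(richness ~ tree_age, random = ~ 1 | site,
                 data = sim, method = "REML")

fixef(sim_model)  # two fixed parameters: intercept and slope
ranef(sim_model)  # predicted intercept shift for each of the eight sites
```

Only two fixed coefficients are estimated. The site level shifts returned by ranef() are predictions drawn from the estimated distribution of site effects, not separately estimated constants.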

slide-11
SLIDE 11

Time to practice!

REGRESSION MODELING IN R: CASE STUDIES

slide-12
SLIDE 12

Model Selection and Interpretation

REGRESSION MODELING IN R: CASE STUDIES

Danielle Quinn

PhD Candidate, Memorial University

slide-13
SLIDE 13

Model comparison

How do we know if we need a random intercept model?

slide-14
SLIDE 14

Fit methods for comparing models

Gaussian GLM: Iteratively Reweighted Least Squares (IRLS)
GLS linear model: Generalized Least Squares (GLS)
Mixed effects model: Restricted Maximum Likelihood (REML)

# gls() and lme() come from the nlme package
library(nlme)

# Gaussian GLM, fit by IRLS
gaussian_glm <- glm(richness ~ tree_age, data = orchids, family = "gaussian")

# GLS linear model (gls() uses method = "REML" by default)
gls_model <- gls(richness ~ tree_age, data = orchids)

# Random intercept model, fit by REML
random_int_model <- lme(richness ~ tree_age, random = ~1 | site, data = orchids, method = "REML")

slide-15
SLIDE 15

Applying a likelihood ratio test

What does the p-value tell us?
Has the AIC value improved?

# Apply likelihood ratio test to compare models
anova(gls_model, random_int_model)

                 Model df      AIC      BIC    logLik   Test  L.Ratio p-val
gls_model            1  3 623.9579 633.1456 -308.9789
random_int_model     2  4 577.7961 590.0464 -284.8980 1 vs 2 48.16181  <.00

slide-16
SLIDE 16

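Correcting the p-value

                 Model df      AIC      BIC    logLik   Test  L.Ratio p-val
gls_model            1  3 623.9579 633.1456 -308.9789
random_int_model     2  4 577.7961 590.0464 -284.8980 1 vs 2 48.16181  <.00

# Likelihood ratio statistic from the two log-likelihoods in the table
LR <- ((-308.9789) - (-284.8980)) * -2

# Halve the chi-square p-value (1 df): the variance being tested cannot be
# negative, so the null value of zero lies on the boundary of the parameter space
(1 - pchisq(LR, 1)) * 0.5

1.962319e-12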
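
The same correction can be wrapped in a small helper. This is a sketch only: boundary_lrt is a hypothetical name, not a function from nlme, and it assumes the two models differ by a single variance component. It uses lower.tail = FALSE instead of 1 - pchisq(), which avoids rounding error when the p-value is very small.

```r
# Halved likelihood ratio test for one variance component (hypothetical helper)
# loglik_simple: log-likelihood of the model without the random intercept
# loglik_mixed:  log-likelihood of the random intercept model
boundary_lrt <- function(loglik_simple, loglik_mixed) {
  lr <- -2 * (loglik_simple - loglik_mixed)
  p_naive <- pchisq(lr, df = 1, lower.tail = FALSE)
  c(L.Ratio = lr, p_naive = p_naive, p_corrected = 0.5 * p_naive)
}

# Log-likelihoods taken from the anova() table on the slide
res <- boundary_lrt(-308.9789, -284.8980)
res
```

The corrected p-value agrees with the 1.962319e-12 shown on the slide to the printed precision.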

slide-17
SLIDE 17

Calculating variance of random intercept

Estimated variance of random intercept (d) = 1.019076 ^ 2 = 1.038516

random_int_model

Linear mixed-effects model fit by REML
  Data: orchids
  Log-restricted-likelihood: -284.898
  Fixed: richness ~ tree_age
(Intercept)    tree_age
 1.63093599  0.08444773

Random effects:
 Formula: ~1 | site
        (Intercept) Residual
StdDev:    1.019076  1.33274

Number of Observations: 160
Number of Groups: 8

slide-18
SLIDE 18

Extracting variance of random intercept

VarCorr(random_int_model)

site = pdLogChol(1)
            Variance StdDev
(Intercept) 1.038515 1.019076
Residual    1.776197 1.332740

# Extract variance value from position 1,1
VarCorr(random_int_model)[1,1]

1.038515
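
One practical detail when extracting these values: for an lme fit, VarCorr() returns a matrix of character strings, so the value needs as.numeric() before it can be used in arithmetic. The sketch below shows this on simulated data standing in for orchids (the data values and the names sim and sim_model are hypothetical).

```r
library(nlme)

# Simulated stand-in for the orchids data (hypothetical values)
set.seed(2)
sim <- data.frame(
  site = factor(rep(letters[1:8], each = 20)),
  tree_age = runif(160, min = 5, max = 20)
)
sim$richness <- 1.6 + 0.08 * sim$tree_age +
  rnorm(8, sd = 1)[as.integer(sim$site)] +
  rnorm(160, sd = 1.3)
sim_model <- lme(richness ~ tree_age, random = ~ 1 | site,
                 data = sim, method = "REML")

vc <- VarCorr(sim_model)
is.character(vc)  # the entries are stored as text

# Convert to numbers, indexing by row and column name instead of position
site_var <- as.numeric(vc["(Intercept)", "Variance"])
site_sd  <- as.numeric(vc["(Intercept)", "StdDev"])
c(variance = site_var, sd_squared = site_sd^2)
```

Indexing by name does the same job as [1,1] and is a little easier to read back later.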

slide-19
SLIDE 19

Take a closer look at your random intercept model

REGRESSION MODELING IN R: CASE STUDIES

slide-20
SLIDE 20

Visualizing a random intercept model

REGRESSION MODELING IN R: CASE STUDIES

Danielle Quinn

PhD Candidate, Memorial University

slide-21
SLIDE 21

Where do we start?

library(ggplot2)

# Plot the raw observations, colored by site
ggplot(orchids) +
  geom_jitter(aes(x = tree_age, y = richness, col = site))

slide-22
SLIDE 22

Population level predictions

Generated from the fixed component of the model
Overall relationship between response and fixed predictor variables

slide-23
SLIDE 23

Generating population level predictions

pred_df.fixed <- data.frame(tree_age = seq(from = 5, to = 20, length.out = 10))

# Generate population level predictions
pred_df.fixed$predicted <- predict(random_int_model, pred_df.fixed, level = 0)

head(pred_df.fixed)

   tree_age predicted
1  5.000000  2.053175
2  6.666667  2.193921
3  8.333333  2.334667
4 10.000000  2.475413
5 11.666667  2.616159
6 13.333333  2.756906
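
Because level = 0 uses only the fixed component, these predictions can be reproduced by hand from the two fixed coefficients. The sketch below copies the intercept and slope from the model printout shown earlier; the names intercept, slope and by_hand are illustrative.

```r
# Fixed-effect estimates as printed for random_int_model
intercept <- 1.63093599
slope     <- 0.08444773

# Same grid of tree ages as pred_df.fixed
tree_age <- seq(from = 5, to = 20, length.out = 10)

# Population level prediction: intercept + slope * tree_age
by_hand <- intercept + slope * tree_age
head(data.frame(tree_age = tree_age, predicted = by_hand))
```

The values agree with the predict() output above to the printed precision.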

slide-24
SLIDE 24

Visualizing population level predictions

# Raw observations plus the population level prediction line
ggplot(orchids) +
  geom_jitter(aes(x = tree_age, y = richness, col = site)) +
  geom_line(aes(x = tree_age, y = predicted), size = 2, data = pred_df.fixed)

slide-25
SLIDE 25

Visualizing population level predictions

slide-26
SLIDE 26

Generating random effect predictions

# Every combination of tree age and site
pred_df.random <- expand.grid(tree_age = seq(from = 5, to = 20, length.out = 10),
                              site = unique(orchids$site))

# Generate random effect predictions
pred_df.random$random <- predict(random_int_model, pred_df.random, level = 1)

pred_df.random

   tree_age site   random
1  5.000000    a 3.202164
2  6.666667    a 3.342910
3  8.333333    a 3.483656
...
78 16.66667    h 2.519369
79 18.33333    h 2.660115
80 20.00000    h 2.800861
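
In a random intercept model, the level = 1 prediction for a site is the population level prediction shifted by that site's predicted intercept. This can be checked directly from the printed output. The sketch below copies the first three rows for site a and the matching population level predictions; the variable names are illustrative.

```r
# Values copied from the printed predictions (site "a", first three tree ages)
tree_age   <- c(5.000000, 6.666667, 8.333333)
population <- c(2.053175, 2.193921, 2.334667)  # level = 0
site_a     <- c(3.202164, 3.342910, 3.483656)  # level = 1

# Gap between the two prediction levels at each tree age
offset_a <- site_a - population
offset_a

# Change in prediction from one tree age to the next, at each level
diff(site_a)
diff(population)
```

The gap is the same at every tree age, and both levels change by the same amount between tree ages, so the site line is parallel to the population line and differs only in its intercept.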

slide-27
SLIDE 27

Visualizing the random effect

# Raw observations, population level line, and one line per site
ggplot(orchids) +
  geom_jitter(aes(x = tree_age, y = richness, col = site)) +
  geom_line(aes(x = tree_age, y = predicted), size = 2, data = pred_df.fixed) +
  geom_line(aes(x = tree_age, y = random, col = site), data = pred_df.random)

slide-28
SLIDE 28

Visualizing the random effect

slide-29
SLIDE 29

Time to practice!

REGRESSION MODELING IN R: CASE STUDIES