
Exploring and Modeling Dichotomous Outcomes, by Brandon LeBeau


  1. DataCamp: Longitudinal Analysis in R
     Exploring and Modeling Dichotomous Outcomes
     Brandon LeBeau, Assistant Professor

  2. Dichotomous outcomes
     Dichotomous (binary) outcomes take two values. Examples:
     - 0 = No, 1 = Yes
     - 0 = Not Present, 1 = Present
     - 0 = Not Proficient, 1 = Proficient
     - 0 = No symptoms, 1 = Symptoms

  3. Exploring data with dichotomous outcomes

         library(HSAUR2)
         head(toenail, n = 10)

            patientID            outcome    treatment       time visit
         1          1 moderate or severe  terbinafine  0.0000000     1
         2          1 moderate or severe  terbinafine  0.8571429     2
         3          1 moderate or severe  terbinafine  3.5357140     3
         4          1       none or mild  terbinafine  4.5357140     4
         5          1       none or mild  terbinafine  7.5357140     5
         6          1       none or mild  terbinafine 10.0357100     6
         7          1       none or mild  terbinafine 13.0714300     7
         8          2       none or mild itraconazole  0.0000000     1
         9          2       none or mild itraconazole  0.9642857     2
         10         2 moderate or severe itraconazole  2.0000000     3

  4. Generalized linear mixed model (GLMM)
     - Models the log-odds of success
     - Success refers to the outcome coded as 1
     - Models for continuous outcomes are not appropriate because:
       - predictions often fall outside the 0 to 1 range
       - the mean and variance of a binary outcome are related
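To make the log-odds scale concrete, here is a minimal base R sketch that converts between probabilities and log-odds. The probabilities are made-up illustrative values, not estimates from the toenail data; `qlogis()` and `plogis()` are the base R logit and inverse-logit functions.

```r
# Illustrative probabilities of "success" (outcome coded 1); not from the toenail data
p <- c(0.10, 0.50, 0.90)

# Logit: log-odds = log(p / (1 - p)); qlogis() computes the same quantity
log_odds <- log(p / (1 - p))
all.equal(log_odds, qlogis(p))

# The inverse logit maps any real number back into (0, 1),
# which is why predictions from a logistic model stay in bounds
eta <- c(-5, 0, 5)
prob <- plogis(eta)
prob
```

A probability of 0.5 corresponds to a log-odds of 0, and the transformation is symmetric around that point.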

  5. Changes in the outcome variable over time

         library(dplyr)  # provides %>%, mutate(), group_by(), summarise()

         # Code "none or mild" as 1 and start the visit counter at 0
         toenail <- toenail %>%
           mutate(outcome_dich = ifelse(outcome == "none or mild", 1, 0),
                  visit_0 = visit - 1)

         # Proportion of outcomes coded 1 and number of observations at each visit
         toenail %>%
           group_by(visit_0) %>%
           summarise(prop_outcome = mean(outcome_dich), num = n())

           visit_0 prop_outcome   num
             <dbl>        <dbl> <int>
         1       0        0.629   294
         2       1        0.663   288
         3       2        0.703   283
         4       3        0.787   272
         5       4        0.916   263
         6       5        0.926   244
         7       6        0.924   264

  6. Fitting GLMM with lme4
     Fitting a GLMM with lme4 is similar to fitting the models in previous chapters. Two additions:
     - use glmer() instead of lmer()
     - specify the family = binomial argument

         library(lme4)

         # Random-intercept logistic model: visit and treatment as fixed effects
         toe_output <- glmer(outcome_dich ~ 1 + visit_0 + treatment + (1 | patientID),
                             data = toenail, family = binomial)
         summary(toe_output)

  7. GLMM output

         Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) [glmerMod]
          Family: binomial  ( logit )
         Formula: outcome_dich ~ 1 + visit_0 + treatment + (1 | patientID)
            Data: toenail

              AIC      BIC   logLik deviance df.resid
           1260.3   1282.6   -626.2   1252.3     1904

         Random effects:
          Groups    Name        Variance Std.Dev.
          patientID (Intercept) 21.97    4.687
         Number of obs: 1908, groups:  patientID, 294

         Fixed effects:
                              Estimate Std. Error z value Pr(>|z|)
         (Intercept)           1.96335    0.81901   2.397   0.0165 *
         visit_0               0.91153    0.07433  12.263   <2e-16 ***
         treatmentterbinafine  0.69688    0.68696   1.014   0.3104
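The fixed effects in this output are on the log-odds scale. The sketch below, which is not part of the original slides, copies the reported estimates by hand and converts them to odds ratios and a baseline probability. The last step uses a common latent-scale approximation for the intraclass correlation in logistic mixed models (residual variance taken as pi^2 / 3); treat that step as an assumption added for illustration.

```r
# Estimates copied from the summary output above (log-odds scale)
fixed <- c("(Intercept)" = 1.96335,
           visit_0 = 0.91153,
           treatmentterbinafine = 0.69688)

# Exponentiating a coefficient gives an odds ratio, conditional on the
# patient's random intercept (a subject-specific interpretation)
odds_ratios <- exp(fixed[c("visit_0", "treatmentterbinafine")])
round(odds_ratios, 2)

# Probability of "none or mild" at the first visit for the reference
# treatment (itraconazole) and a patient whose random intercept is 0
p_baseline <- plogis(fixed[["(Intercept)"]])
round(p_baseline, 3)

# Assumed latent-scale ICC: patient variance / (patient variance + pi^2 / 3)
var_patient <- 21.97
icc_latent <- var_patient / (var_patient + pi^2 / 3)
round(icc_latent, 2)
```

Under these assumptions, each additional visit multiplies a given patient's odds of a "none or mild" outcome by roughly 2.5, and most of the latent-scale variation lies between patients.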

  8. Time to practice!

  9. Generalized Estimating Equations (GEE)
     Brandon LeBeau, Assistant Professor

  10. Introduction to geepack
      Let's fit a first GEE model using the geepack package. geeglm() is the model-fitting function.

          library(dplyr)
          library(geepack)

          toenail <- toenail %>%
            mutate(outcome_dich = ifelse(outcome == "none or mild", 1, 0),
                   visit_0 = visit - 1)

          # Fit GEE model
          gee_toe <- geeglm(outcome_dich ~ 1 + visit_0, data = toenail,
                            id = patientID, family = binomial, scale.fix = TRUE)

          # Extract model summary
          summary(gee_toe)

  11. geeglm output

          Call:
          geeglm(formula = outcome_dich ~ 1 + visit_0, family = binomial,
              data = toenail, id = patientID, scale.fix = TRUE)

           Coefficients:
                      Estimate Std.err    Wald Pr(>|W|)
          (Intercept)  0.35522 0.13122   7.328  0.00679 **
          visit_0      0.38319 0.03728 105.673  < 2e-16 ***
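As a reading aid for this table, the sketch below (an addition, not from the slides) recomputes the Wald column from the printed estimates and standard errors, assuming the usual one-degree-of-freedom form Wald = (Estimate / Std.err)^2. Small differences from the printed values come from rounding in the displayed numbers.

```r
# Values copied from the geeglm coefficient table above
est <- c("(Intercept)" = 0.35522, visit_0 = 0.38319)
se  <- c("(Intercept)" = 0.13122, visit_0 = 0.03728)

# Wald chi-square statistic with 1 degree of freedom per coefficient
wald <- (est / se)^2
p_value <- pchisq(wald, df = 1, lower.tail = FALSE)
round(wald, 2)
signif(p_value, 3)

# GEE coefficients are population-averaged: exp() gives the odds ratio
# for the average change per visit across patients
or_visit <- exp(est[["visit_0"]])
round(or_visit, 2)
```

The population-averaged slope here is noticeably smaller than the subject-specific slope from the GLMM, which is expected when the between-patient variance is large.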

  12. Specifying working correlations
      - An optional argument, corstr, controls the working correlation matrix
      - It accounts for the dependency due to repeated measures
      - The default is independence

          # Fit GEE model with an exchangeable working correlation
          gee_toe <- geeglm(outcome_dich ~ 1 + visit_0, data = toenail,
                            id = patientID, family = binomial,
                            corstr = 'exchangeable', scale.fix = TRUE)

          # Extract model summary
          summary(gee_toe)

  13. GEE exchangeable output
      Here is the exchangeable output:

          Call:
          geeglm(formula = outcome_dich ~ 1 + visit_0, family = binomial,
              data = toenail, id = patientID, corstr = "exchangeable",
              scale.fix = TRUE)

           Coefficients:
                      Estimate Std.err   Wald Pr(>|W|)
          (Intercept)   0.3332  0.1345   6.14    0.013 *
          visit_0       0.3797  0.0363 109.29   <2e-16 ***

  14. Other working correlation structures

      corstr = "ar1" (example: correlation = 0.5)

                 [,1]  [,2] [,3]  [,4]   [,5]
          [1,] 1.0000 0.500 0.25 0.125 0.0625
          [2,] 0.5000 1.000 0.50 0.250 0.1250
          [3,] 0.2500 0.500 1.00 0.500 0.2500
          [4,] 0.1250 0.250 0.50 1.000 0.5000
          [5,] 0.0625 0.125 0.25 0.500 1.0000

      corstr = "unstructured"

                [,1]  [,2]  [,3]  [,4]  [,5]
          [1,] 1.000 0.559 0.492 0.363 0.082
          [2,] 0.559 1.000 0.398 0.250 0.139
          [3,] 0.492 0.590 1.000 0.071 0.209
          [4,] 0.398 0.493 0.629 1.000 0.166
          [5,] 0.363 0.313 0.426 0.604 1.000
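The structured working correlations can be built by hand to see what each corstr option assumes. This base R sketch is an illustration added here, using five visits and a correlation of 0.5 as in the AR(1) example above; the variable names are hypothetical.

```r
n_visits <- 5   # number of repeated measures per patient (illustrative)
rho <- 0.5      # correlation parameter (illustrative)

# Independence: no correlation between visits (the geeglm default)
ind <- diag(n_visits)

# Exchangeable: every pair of visits shares the same correlation
exch <- matrix(rho, nrow = n_visits, ncol = n_visits)
diag(exch) <- 1

# AR(1): correlation decays as rho^|lag| with the distance between visits
lag <- abs(outer(seq_len(n_visits), seq_len(n_visits), "-"))
ar1 <- rho^lag

ar1
```

The AR(1) matrix uses one parameter, as does the exchangeable matrix, whereas an unstructured matrix estimates a separate correlation for every pair of visits.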

  15. Try GEE models!

  16. Model Selection
      Brandon LeBeau, Assistant Professor

  17. QIC
      - QIC = quasi-likelihood under the independence model criterion
      - Unlike GLMM, GEE does not use maximum likelihood estimation, so QIC is needed for GEE
      - The MuMIn package calculates this statistic

          library(dplyr)
          library(geepack)
          library(MuMIn)

          toenail <- toenail %>%
            mutate(outcome_dich = ifelse(outcome == "none or mild", 1, 0),
                   visit_0 = visit - 1)

          # Fit GEE model
          gee_toe <- geeglm(outcome_dich ~ 1 + visit_0, data = toenail,
                            id = patientID, family = binomial, scale.fix = TRUE)

          QIC(gee_toe)

               QIC
          1828.552

  18. Evaluating working correlation
      QIC can help select the working correlation matrix.

          # Fit GEE models with three working correlation structures
          gee_ind <- geeglm(outcome_dich ~ 1 + visit_0, data = toenail,
                            id = patientID, family = binomial, scale.fix = TRUE)
          gee_exch <- geeglm(outcome_dich ~ 1 + visit_0, data = toenail,
                             id = patientID, family = binomial, scale.fix = TRUE,
                             corstr = 'exchangeable')
          gee_ar1 <- geeglm(outcome_dich ~ 1 + visit_0, data = toenail,
                            id = patientID, family = binomial, scale.fix = TRUE,
                            corstr = 'ar1')

          QIC(gee_ind, gee_exch, gee_ar1)

                        QIC
          gee_ind  1828.552
          gee_exch 1828.564
          gee_ar1  1827.805
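A small helper step, sketched here with the QIC values typed in from the output above, ranks the candidate structures by their distance from the smallest QIC. Smaller QIC is preferred; how large a difference has to be before it matters is a judgment call that this sketch does not settle.

```r
# QIC values copied from the output above
qic <- c(gee_ind = 1828.552, gee_exch = 1828.564, gee_ar1 = 1827.805)

# Difference of each model from the best (smallest) QIC
delta_qic <- qic - min(qic)
sort(delta_qic)

# Name of the working correlation with the smallest QIC
best <- names(which.min(qic))
best
```

Here all three structures are within one QIC unit of each other, so the data give only weak grounds for preferring one over the others.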

  19. Model selection for GLMM
      The aictab() function from the AICcmodavg package can be used for GLMM.

          library(lme4)
          library(AICcmodavg)

          # Baseline model without the treatment effect
          toe_baseline <- glmer(outcome_dich ~ 1 + visit_0 + (1 | patientID),
                                data = toenail, family = binomial)
          # Model adding the treatment effect
          toe_output <- glmer(outcome_dich ~ 1 + visit_0 + treatment + (1 | patientID),
                              data = toenail, family = binomial)

          aictab(list(toe_baseline, toe_output),
                 c("no treatment", "treatment"))

          Model selection based on AICc:

                       K    AICc Delta_AICc AICcWt Cum.Wt      LL
          no treatment 3 1259.40       0.00   0.62   0.62 -626.69
          treatment    4 1260.36       0.97   0.38   1.00 -626.17
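The AICcWt column can be reproduced by hand from the AICc values, which helps in reading the table. The sketch below is an addition that uses the standard Akaike weight formula, w_i = exp(-Delta_i / 2) / sum_j exp(-Delta_j / 2), with the AICc values typed in from the output above and labeled by the model object names.

```r
# AICc values copied from the aictab() output above
aicc <- c(toe_baseline = 1259.40, toe_output = 1260.36)

# Difference from the best-supported (smallest AICc) model
delta <- aicc - min(aicc)

# Akaike weights: relative likelihoods normalized to sum to 1
rel_lik <- exp(-0.5 * delta)
weights <- rel_lik / sum(rel_lik)
round(weights, 2)
```

The weights match the table: the model without treatment carries about 62% of the weight, so adding treatment is not clearly supported by AICc.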

  20. Time to practice model selection!

  21. Interpreting and Visualizing Model Results
      Brandon LeBeau, Assistant Professor

  22. Visualize GLMM
      Generate predicted values with the predict() function. By default these are on the log-odds (link) scale.

          library(dplyr)
          library(lme4)
          library(ggplot2)

          toe_output <- glmer(outcome_dich ~ 1 + visit_0 + treatment + (1 | patientID),
                              data = toenail, family = binomial)

          # Add patient-specific predictions to the data
          toenail <- toenail %>%
            mutate(pred_values = predict(toe_output))

          # One dashed line per patient
          ggplot(toenail, aes(x = visit_0, y = pred_values)) +
            geom_line(aes(group = patientID), linetype = 2) +
            theme_bw(base_size = 16) +
            xlab("Visit Number") +
            ylab("Predicted Values")

  23. (Plot: predicted values by visit number, one dashed line per patient.)

  24. Visualize GLMM - probabilities
      - Often the probability metric is more intuitive
      - The predict() function with the argument type = "response" gives probabilities

          toenail <- toenail %>%
            mutate(pred_values = predict(toe_output, type = "response"))

          ggplot(toenail, aes(x = visit_0, y = pred_values)) +
            geom_line(aes(group = patientID), linetype = 2) +
            theme_bw(base_size = 16) +
            xlab("Visit Number") +
            ylab("Prob of none or mild separation")
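To see how the link and response scales relate without refitting the model, the following sketch (an illustration, not from the slides) uses the GLMM estimates reported earlier in the deck for the reference treatment and three hypothetical patients: one with a random intercept of zero and two at one standard deviation below and above it.

```r
# Estimates copied from the GLMM summary earlier in the deck
intercept  <- 1.96335   # fixed intercept (log-odds, itraconazole reference)
slope      <- 0.91153   # change in log-odds per visit
sd_patient <- 4.687     # standard deviation of the patient random intercept

visit_0 <- 0:6

# Hypothetical patients: random intercepts at -1 SD, 0, and +1 SD
u <- c(low = -sd_patient, typical = 0, high = sd_patient)

# Link scale: straight, parallel lines in log-odds
eta <- sapply(u, function(u_i) intercept + u_i + slope * visit_0)

# Response scale: plogis() is the inverse logit, giving probabilities
prob <- plogis(eta)
round(cbind(visit_0, prob), 3)
```

The lines are parallel on the log-odds scale but not on the probability scale, because the inverse logit compresses values near 0 and 1.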

  25. (Plot: predicted probability of none or mild separation by visit number, one dashed line per patient.)

  26. Visualize GEE
      predict() can again be used here, as with GLMMs.

          gee_toe <- geeglm(outcome_dich ~ 1 + visit_0 + treatment, data = toenail,
                            id = patientID, family = binomial,
                            corstr = 'exchangeable', scale.fix = TRUE)

          toenail_gee <- toenail %>%
            mutate(pred_gee = predict(gee_toe, type = "response"))

          ggplot(toenail_gee, aes(x = visit_0, y = pred_gee)) +
            geom_line(aes(color = treatment)) +
            theme_bw(base_size = 16) +
            xlab("Visit Number") +
            ylab("Probability of none or mild separation")

  27. (Plot: GEE predicted probability of none or mild separation by visit number, colored by treatment.)
