mixed effect probit regression
play

Mixed effect probit regression Genotypic fungal resistance Dr. - PowerPoint PPT Presentation

Mixed effect probit regression Genotypic fungal resistance Dr. Jarad Niemi STAT 544 - Iowa State University April 28, 2016 Jarad Niemi (STAT544@ISU) Mixed effect probit regression April 28, 2016 1 / 31 Outline Probit regression Bayesian


  1. Mixed effect probit regression Genotypic fungal resistance Dr. Jarad Niemi STAT 544 - Iowa State University April 28, 2016 Jarad Niemi (STAT544@ISU) Mixed effect probit regression April 28, 2016 1 / 31

  2. Outline Probit regression Bayesian probit regression Data augmentation Bayesian mixed effect probit regression Extensions Ordinal categorical data Nominal categorical data Bayesian logistic regression Jarad Niemi (STAT544@ISU) Mixed effect probit regression April 28, 2016 2 / 31

  3. Probit regression Consider the model ind ∼ Ber ( θ i ) Y i where, for the i th observation, Y i is binary indicating success and θ i is the probability of success. A probit regression model assumes θ i = Φ ( X ⊤ i β ) where X i are the explanatory variables for the i th observation, Φ is the standard normal cumulative distribution function, and β is the vector of parameters to be estimated. Jarad Niemi (STAT544@ISU) Mixed effect probit regression April 28, 2016 3 / 31

  4. Low birth weight low age lwt race smoke ptl ht Min. :0.0000 Min. :14.00 Min. : 80.0 1:96 Min. :0.0000 Min. :0.0000 Min. :0.00000 1st Qu.:0.0000 1st Qu.:19.00 1st Qu.:110.0 2:26 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:0.00000 Median :0.0000 Median :23.00 Median :121.0 3:67 Median :0.0000 Median :0.0000 Median :0.00000 Mean :0.3122 Mean :23.24 Mean :129.8 Mean :0.3915 Mean :0.1958 Mean :0.06349 3rd Qu.:1.0000 3rd Qu.:26.00 3rd Qu.:140.0 3rd Qu.:1.0000 3rd Qu.:0.0000 3rd Qu.:0.00000 Max. :1.0000 Max. :45.00 Max. :250.0 Max. :1.0000 Max. :3.0000 Max. :1.00000 ui ftv bwt Min. :0.0000 Min. :0.0000 Min. : 709 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:2414 Median :0.0000 Median :0.0000 Median :2977 Mean :0.1481 Mean :0.7937 Mean :2945 3rd Qu.:0.0000 3rd Qu.:1.0000 3rd Qu.:3487 Max. :1.0000 Max. :6.0000 Max. :4990 Jarad Niemi (STAT544@ISU) Mixed effect probit regression April 28, 2016 4 / 31

  5. m = glm(low~., family=binomial(link=probit), data=birthwt[,-10]); summary(m) Call: glm(formula = low ~ ., family = binomial(link = probit), data = birthwt[, -10]) Deviance Residuals: Min 1Q Median 3Q Max -1.8848 -0.8271 -0.5217 0.9903 2.2445 Coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept) -1.31431 0.24893 -5.280 1.29e-07 *** age -0.09774 0.11482 -0.851 0.39466 lwt -0.27281 0.12217 -2.233 0.02555 * race2 0.74961 0.31431 2.385 0.01708 * race3 0.52183 0.25557 2.042 0.04117 * smoke 0.56910 0.23469 2.425 0.01531 * ptl 0.31968 0.20835 1.534 0.12495 ht 1.11161 0.41664 2.668 0.00763 ** ui 0.46517 0.27930 1.665 0.09581 . ftv 0.02832 0.10161 0.279 0.78050 --- Signif. codes: 0 ✬ *** ✬ 0.001 ✬ ** ✬ 0.01 ✬ * ✬ 0.05 ✬ . ✬ 0.1 ✬ ✬ 1 (Dispersion parameter for binomial family taken to be 1) Null deviance: 234.67 on 188 degrees of freedom Residual deviance: 201.03 on 179 degrees of freedom AIC: 221.03 Number of Fisher Scoring iterations: 5 Jarad Niemi (STAT544@ISU) Mixed effect probit regression April 28, 2016 5 / 31

  6. Bayesian analysis Bayesian probit regression Consider the model ind Y i ∼ Ber ( θ i ) = Φ ( X ⊤ θ i i β ) with prior β ∼ N ( b , B ) The posterior distribution is p ( β | y ) ∝ p ( y | β ) p ( β ) �� n e − ( β − b ) ⊤ B − 1 ( β − b ) / 2 i β )] 1 − y i � i =1 Φ ( X ′ i β ) y i [1 − Φ ( X ′ ∝ But neither p ( β | y ) nor p ( β p | y , β − p ) are a known distribution. Jarad Niemi (STAT544@ISU) Mixed effect probit regression April 28, 2016 6 / 31

  7. Bayesian analysis Data augmentation Data augmentation An alternative construction of the model is = I ( ζ i > 0) Y i ind ∼ N ( X ′ ζ i i β, 1) Note that θ i = P ( Y i = 1) = P ( ζ i > 0) = P ( X ′ i β + ǫ > 0) ǫ ∼ N (0 , 1) = P ( ǫ > − X ′ i β ) = P ( ǫ < X ′ i β ) symmetry of standard normal = Φ ( X ′ i β ) Thus, this is equivalent to the probit regression model. Jarad Niemi (STAT544@ISU) Mixed effect probit regression April 28, 2016 7 / 31

  8. Bayesian analysis Data augmentation Posterior distribution Now, the likelihood is n � p ( y | ζ ) ∝ [ I ( ζ i > 0) I ( y i = 1) + I ( ζ i ≤ 0) I ( y i = 0)] i =1 and ind ∼ N ( X ′ ζ i i β, 1) β ∼ N ( b , B ) Therefore the complete data likelihood is n � N ( ζ i | X ′ p ( y , ζ | β ) ∝ i β, 1) [ I ( ζ i > 0) I ( y i = 1) + I ( ζ i ≤ 0) I ( y i = 0)] i =1 Thus the posterior distribution is p ( β, ζ | y ) ∝ p ( y | ζ, β ) p ( ζ, β ) = p ( y | ζ ) p ( ζ | β ) p ( β ) = p ( y , ζ | β ) p ( β ) and we will derive the full conditionals for p ( β | ζ, y ) and p ( ζ | β, y ). Jarad Niemi (STAT544@ISU) Mixed effect probit regression April 28, 2016 8 / 31

  9. Bayesian analysis Data augmentation Full conditional for β The full conditional for β is p ( β | . . . ) ∝ p ( y | ζ ) p ( ζ | β ) p ( β ) ∝ p ( ζ | β ) p ( β ) [ � n i =1 N ( ζ i | X ′ = i β, 1)] N ( β | b , B ) = N ( ζ | X β, I ) N ( β | b , B ) and thus β | . . . ∼ N (ˆ β, ˆ Σ β ) with = [ B − 1 + X ⊤ X ] − 1 ˆ Σ β ˆ = ˆ Σ β [ B − 1 b + X ⊤ ζ ] β Jarad Niemi (STAT544@ISU) Mixed effect probit regression April 28, 2016 9 / 31

  10. Bayesian analysis Data augmentation Full conditional for ζ The full conditional for ζ is p ( ζ | . . . ) ∝ p ( y | ζ ) p ( ζ | β ) p ( β ) ∝ p ( y | ζ ) p ( ζ | β ) � n i =1 N ( ζ i | X ′ = i β, 1) [ I ( ζ i > 0) I ( y i = 1) + I ( ζ i ≤ 0) I ( y i = 0)] Thus the ζ i are conditionally independent with distribution � N ( ζ i | X ′ i β, 1) I ( ζ i > 0) if y i = 1 p ( ζ i | y i , β ) = N ( ζ i | X ′ i β, 1) I ( ζ i ≤ 0) if y i = 0 These can be drawn using the modified inverse cdf method. Jarad Niemi (STAT544@ISU) Mixed effect probit regression April 28, 2016 10 / 31

  11. Bayesian analysis Data augmentation mcmc = function(n_iter, y, X, beta0, Sigma_beta) { n = nrow(X) p = ncol(X) # Precalculate quantities y = (as.numeric(y)==1) n1 = sum( y) n0 = sum(!y) XX = t(X)%*%X Si = solve(Sigma_beta) Sib = Si%*%beta0 # Saving structures beta_keep = matrix(NA, n_iter, p) zeta_keep = matrix(NA, n_iter, n) # Initial values m = glm(y~X-1, family=binomial(probit)) beta = coef(m) zeta = rep(NA,n) for (i in 1:n_iter) { # Sample zeta Xb = X%*%beta cut = pnorm(0,Xb) zeta[ y] = qnorm(runif(n1, cut[ y], 1), Xb[ y], 1) zeta[!y] = qnorm(runif(n0, 0, cut[!y]), Xb[!y], 1) # Sample beta S_hat = solve(Si+XX) b_hat = S_hat %*% (Sib+t(X)%*%zeta) beta = mvrnorm(1, b_hat, S_hat) # Record values beta_keep[i,] = beta zeta_keep[i,] = zeta } Jarad Niemi (STAT544@ISU) Mixed effect probit regression April 28, 2016 11 / 31

  12. Bayesian analysis Data augmentation Run the MCMC X = model.matrix(m) # Constructs the design matrix p = ncol(X) n_iter = 10000 system.time(out <- mcmc(n_iter, birthwt$low, X, rep(0,p), 3*diag(p))) user system elapsed 2.763 0.009 2.775 Jarad Niemi (STAT544@ISU) Mixed effect probit regression April 28, 2016 12 / 31

  13. Bayesian analysis Data augmentation (Intercept) age lwt race2 0.25 −0.5 0.0 1.5 0.00 −0.2 −1.0 1.0 −0.4 0.5 −0.25 −1.5 −0.6 0.0 −0.50 −2.0 0 2500 5000 7500 10000 0 2500 5000 7500 10000 0 2500 5000 7500 10000 0 2500 5000 7500 10000 race3 smoke ptl ht 2.5 1.0 1.0 2.0 1.0 1.5 value 0.5 0.5 0.5 1.0 0.5 0.0 0.0 0.0 0.0 −0.5 0 2500 5000 7500 10000 0 2500 5000 7500 10000 0 2500 5000 7500 10000 0 2500 5000 7500 10000 ui ftv 0.4 1.0 0.2 0.5 0.0 0.0 −0.2 −0.5 0 2500 5000 7500 10000 0 2500 5000 7500 10000 iteration Jarad Niemi (STAT544@ISU) Mixed effect probit regression April 28, 2016 13 / 31

  14. Bayesian analysis Data augmentation (Intercept) age lwt race2 0.8 0.8 0.8 0.8 ACF ACF ACF ACF 0.0 0.0 0.0 0.0 0 20 40 0 20 40 0 20 40 0 20 40 Lag Lag Lag Lag race3 smoke ptl ht 0.8 0.8 0.8 0.8 ACF ACF ACF ACF 0.0 0.0 0.0 0.0 0 20 40 0 20 40 0 20 40 0 20 40 Lag Lag Lag Lag ui ftv 0.8 0.8 ACF ACF 0.0 0.0 0 20 40 0 20 40 Lag Lag Jarad Niemi (STAT544@ISU) Mixed effect probit regression April 28, 2016 14 / 31

  15. Bayesian analysis Data augmentation Credible intervals Source: local data frame [10 x 4] variable ess lb ub (fctr) (dbl) (dbl) (dbl) 1 (Intercept) 2958 -1.75 -0.81 2 age 2773 -0.33 0.12 3 lwt 2516 -0.52 -0.05 4 race2 4766 0.11 1.32 5 race3 3389 -0.01 0.98 6 smoke 3069 0.09 1.01 7 ptl 5416 -0.07 0.72 8 ht 3692 0.26 1.87 9 ui 4269 -0.08 0.98 10 ftv 2910 -0.18 0.22 Jarad Niemi (STAT544@ISU) Mixed effect probit regression April 28, 2016 15 / 31

  16. Probit regression with random effects Probit regression with random effects Consider the probit regression model = I ( ζ i > 0) Y i ∼ N ( ˜ X ˜ ζ β, 1) where ˜ ˜ β = ( β, α ) ⊤ X = [ X Zm ] where X is the design matrix for fixed effects and Zm is the design matrix for the random effects. A common assumption is that the random effects are α ∼ N (0 , σ 2 I ). Thus the distribution on ˜ β is � β �� b � B � � �� 0 ˜ β = ∼ N , σ 2 I α 0 0 where the precision is � B � B − 1 � − 1 � 0 0 = σ 2 I 1 0 0 σ 2 I Jarad Niemi (STAT544@ISU) Mixed effect probit regression April 28, 2016 16 / 31

Download Presentation
Download Policy: The content available on the website is offered to you 'AS IS' for your personal information and use only. It cannot be commercialized, licensed, or distributed on other websites without prior consent from the author. To download a presentation, simply click this link. If you encounter any difficulties during the download process, it's possible that the publisher has removed the file from their server.

Recommend


More recommend