Bayesian Analysis of Choice Data
Simon Jackman, Stanford University


  1. Bayesian Analysis of Choice Data
     Simon Jackman, Stanford University
     http://jackman.stanford.edu/BASS
     February 3, 2012

  2. Discrete Choice
     - binary (e.g., the probit model, which we looked at with data augmentation)
     - ordinal (ordinal logit or probit)
     - multinomial models for unordered choices: e.g., multinomial logit (MNL), multinomial probit (MNP)
     We won't consider models for "tree-like" choice structures (nested logit, GEV, etc.).

  3. Binary Choices: logit or probit
     - For "standard" models (e.g., no "fancy" hierarchical structure, no concerns about missing data), there are avenues besides BUGS/JAGS, e.g., MCMCpack; a minimal sketch follows below.
     - Implementations in BUGS/JAGS don't use data augmentation à la Albert & Chib (1993): the outcome is specified directly via dbern or dbin, and the conditional distributions are sampled using Metropolis-within-Gibbs or slice sampling.
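     A minimal sketch of the MCMCpack route, assuming a data frame dat with a 0/1 outcome y and a covariate educ (hypothetical names, not from the slides):

     ## Bayesian logit without BUGS/JAGS: MCMCpack samples beta directly
     library(MCMCpack)

     ## `dat`, `y`, and `educ` are assumed names for illustration
     posterior <- MCMClogit(y ~ educ + I(educ^2), data = dat,
                            b0 = 0, B0 = 0.001,  ## diffuse normal prior (mean, precision)
                            burnin = 1000, mcmc = 10000)
     summary(posterior)  ## an mcmc object, summarized via coda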

  4. Binary Choices: logit or probit
     Voter turnout example.

     JAGS code:

     model{
       for (i in 1:N){              ## loop over observations
         y[i] ~ dbern(p[i])         ## binary outcome
         logit(p[i]) <- ystar[i]    ## logit link
         ystar[i] <- beta[1] +      ## regression structure for covariates
                     beta[2]*educ[i] +
                     beta[3]*(educ[i]*educ[i]) +
                     beta[4]*age[i] +
                     beta[5]*(age[i]*age[i]) +
                     beta[6]*south[i] +
                     beta[7]*govelec[i] +
                     beta[8]*closing[i] +
                     beta[9]*(closing[i]*educ[i]) +
                     beta[10]*(educ[i]*educ[i]*closing[i])
       }

       ## priors
       beta[1:10] ~ dmnorm(mu[], B[,])  ## diffuse multivariate normal prior;
                                        ## mu and B are supplied in the data file
     }
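     A sketch of driving this model from R via rjags; the file name turnout.bug, the nagler data frame, and its column names are assumptions for illustration:

     library(rjags)  ## assumes the model block above is saved as turnout.bug

     turnoutData <- list(N = nrow(nagler), y = nagler$turnout,
                         educ = nagler$educ, age = nagler$age,
                         south = nagler$south, govelec = nagler$govelec,
                         closing = nagler$closing,
                         mu = rep(0, 10),       ## prior mean for beta
                         B = diag(10) * 0.001)  ## prior precision for beta
     m <- jags.model("turnout.bug", data = turnoutData, n.chains = 2)
     update(m, 1000)  ## burn-in
     betaSamples <- coda.samples(m, variable.names = "beta", n.iter = 10000)
     summary(betaSamples)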

  5. Binary Data Is Binomial Data when Grouped (§8.1.4)
     - With big, micro-level data sets of binary responses (e.g., the CPS), MCMC gets slow.
     - Collapse the data into covariate classes and treat them as binomial data: a much smaller data set, much shorter run times.
     - Binary model: $y_i \mid x_i \sim \text{Bernoulli}(F[x_i \beta])$, where $x_i$ is a vector of covariates.
     - Covariate class: a set $\mathcal{C} = \{ i : x_i = x_{\mathcal{C}} \}$, i.e., the set of respondents who share the covariate vector $x_{\mathcal{C}}$. Probability assignments over $y_i$ for all $i \in \mathcal{C}$ are conditionally exchangeable given their common $x_i$ and $\beta$.
     - Binomial model: $r_{\mathcal{C}} \sim \text{Binomial}(p_{\mathcal{C}}; n_{\mathcal{C}})$, where $p_{\mathcal{C}} = F(x_{\mathcal{C}} \beta)$, $r_{\mathcal{C}} = \sum_{i \in \mathcal{C}} y_i$ is the number of "successes" in $\mathcal{C}$, and $n_{\mathcal{C}}$ is the cardinality of $\mathcal{C}$.
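     The collapse loses nothing for inference on $\beta$: within a covariate class, the Bernoulli log-likelihood equals the binomial log-likelihood up to a constant in $p$. A quick numerical check, with illustrative values not from the book:

     p <- 0.37                 ## any common success probability
     y <- c(1, 0, 1, 1, 0, 1)  ## binary outcomes sharing one covariate vector
     bern <- sum(dbinom(y, size = 1, prob = p, log = TRUE))
     binl <- dbinom(sum(y), size = length(y), prob = p, log = TRUE)
     bern - binl  ## equals -log(choose(n, r)), a constant in p that
                  ## drops out of the posterior for beta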

  6. Example 8.5: binomial model for grouped binary data
     Form covariate classes and a groupedData object; the original data set has n ≈ 99,000, but only 636 unique covariate classes.

     R code:

     ## collapse by covariate classes
     X <- cbind(nagler$age, nagler$educYrs)
     X <- apply(X, 1, paste, collapse=":")            ## one string key per respondent
     covClasses <- match(X, unique(X))                ## class index for each respondent
     covX <- matrix(unlist(strsplit(unique(X), ":")), ncol=2, byrow=TRUE)
     r <- tapply(nagler$turnout, covClasses, sum)     ## successes per class
     n <- tapply(nagler$turnout, covClasses, length)  ## class sizes
     groupedData <- list(n=n, r=r,
                         age=as.numeric(covX[,1]),
                         educYrs=as.numeric(covX[,2]),
                         NOBS=length(n))

  7. Example 8.5: binomial model for grouped binary data
     We then pass the groupedData list to JAGS. We specify the binomial model $r_i \sim \text{Binomial}(p_i; n_i)$ with $p_i = F(x_i \beta)$ and vague normal priors on $\beta$ with the following code:

     JAGS code:

     model{
       for (i in 1:NOBS){
         logit(p[i]) <- beta[1] + age[i]*beta[2] +
                        pow(age[i],2)*beta[3] +
                        educYrs[i]*beta[4] +
                        pow(educYrs[i],2)*beta[5]
         r[i] ~ dbin(p[i], n[i])  ## binomial model for each covariate class
       }

       beta[1:5] ~ dmnorm(b0[], B0[,])  ## vague multivariate normal prior
     }

  8. Ordinal Responses
     e.g., the 7-point scale measuring party identification in the U.S., assigning the numerals $y_i \in \{0, \dots, 6\}$ to the categories {"Strong Republican", "Weak Republican", ..., "Strong Democrat"}.
     Censored, latent-variable representation:
     $y_i^* = x_i \beta + \epsilon_i$, $\epsilon_i \sim N(0, \sigma^2)$, $i = 1, \dots, n$
     $y_i = 0 \iff y_i^* < \tau_1$
     $y_i = j \iff \tau_j < y_i^* \le \tau_{j+1}$, $j = 1, \dots, J-1$
     $y_i = J \iff y_i^* > \tau_J$
     The threshold parameters obey the ordering constraint $\tau_1 < \tau_2 < \dots < \tau_J$.
     The assumption of normality for $\epsilon_i$ generates the probit version of the model; a logistic density generates the ordinal logistic model.
     Bayesian analysis: we want $p(\beta, \tau \mid y, X) \propto p(y \mid X, \beta, \tau)\, p(\beta, \tau)$.
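     A small simulation of this censored latent-variable representation, with assumed values for $\beta$ and the thresholds (J = 4, so five observed categories):

     set.seed(1)
     n    <- 1000
     x    <- rnorm(n)
     beta <- 1.0
     tau  <- c(-1.5, -0.5, 0.5, 1.5)  ## tau_1 < ... < tau_J (assumed values)
     ystar <- x * beta + rnorm(n)     ## latent variable, sigma fixed at 1
     y <- findInterval(ystar, tau)    ## cut at the thresholds: y in {0, ..., 4}
     table(y)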

  9. Ordinal responses, $y_i^* \sim N(x_i \beta, \sigma^2)$
     $\Pr[y_i = j] = \Phi[(\tau_{j+1} - x_i \beta)/\sigma] - \Phi[(\tau_j - x_i \beta)/\sigma]$
     [Figure: normal density of $y_i^*$ partitioned by the six thresholds into the regions Pr(y=0) through Pr(y=6).]
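     The same formula computed directly in R for one observation, with illustrative values, $\sigma = 1$, and the conventions $\tau_0 = -\infty$, $\tau_{J+1} = +\infty$:

     xb  <- 0.3                              ## x_i * beta (assumed value)
     tau <- c(-Inf, -1.5, -0.5, 0.5, 1.5, Inf)
     p   <- pnorm(tau[-1] - xb) - pnorm(tau[-length(tau)] - xb)
     p       ## Pr(y = 0), ..., Pr(y = 4)
     sum(p)  ## the differences telescope to 1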

  10. Identification
      Recall the latent-variable representation $y_i^* = x_i \beta + \epsilon_i$, $\epsilon_i \sim N(0, \sigma^2)$, with observed $y_i$ determined by the thresholds $\tau_1 < \dots < \tau_J$ as above.
      The model needs identification constraints; a numerical illustration follows below. Options:
      - set one of the $\tau$ to a point (zero) and set $\sigma$ to a constant (one);
      - drop the intercept and fix $\sigma$;
      - fix two of the $\tau$ parameters.
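      A numerical illustration (values assumed) of why constraints are needed: shifting the intercept and all thresholds by a common constant, or rescaling everything by $\sigma$, leaves the category probabilities, and hence the likelihood, untouched.

      xb  <- 0.3; tau <- c(-1.0, 0.0, 1.0); c0 <- 5; s <- 2
      p1 <- diff(pnorm(c(-Inf, tau, Inf), mean = xb))
      p2 <- diff(pnorm(c(-Inf, tau + c0, Inf), mean = xb + c0))
      p3 <- diff(pnorm(c(-Inf, tau * s, Inf), mean = xb * s, sd = s))
      all.equal(p1, p2)  ## TRUE: a location shift is unidentified
      all.equal(p1, p3)  ## TRUE: the scale is unidentified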

  11. Priors on thresholds
      $\tau_j \sim N(0, 10^2)$, subject to the ordering constraint $\tau_j > \tau_{j-1}$, $\forall j = 2, \dots, J$.
      In JAGS only, use the nifty sort function:

      JAGS code:

      for (j in 1:4){
        tau0[j] ~ dnorm(0, .01)
      }
      tau[1:4] <- sort(tau0)  ## JAGS only, not in WinBUGS!

      In BUGS, build the thresholds from positive increments:
      $\tau_1 \sim N(t_1, T_1)$
      $\delta_j \sim \text{Exponential}(d)$, $j = 2, \dots, J$
      $\tau_j = \tau_{j-1} + \delta_j$, $j = 2, \dots, J$

      BUGS code:

      tau[1] ~ dnorm(0, .01)
      for (j in 1:3){
        delta[j] ~ dexp(2)
        tau[j+1] <- tau[j] + delta[j]
      }
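      A quick R check, using the same illustrative rates as the code above, that the BUGS construction respects the ordering constraint by design, since the increments are positive:

      set.seed(1)
      tau1  <- rnorm(1, 0, 10)  ## tau[1] ~ dnorm(0, .01): precision .01 = sd 10
      delta <- rexp(3, rate = 2)  ## positive increments
      tau   <- cumsum(c(tau1, delta))
      all(diff(tau) > 0)  ## TRUE by construction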

  12. Example 8.6: interviewer ratings of respondents
      A 5-point rating scale used by interviewers to assess respondents' levels of political information, in the 2000 ANES:

      Label        y    n    %
      Very Low     0  105    6
      Fairly Low   1  334   19
      Average      2  586   33
      Fairly High  3  450   25
      Very High    4  325   18

      Covariates: education, gender, age, home ownership, public-sector employment.

  13. Ordinal Logistic Model

      JAGS code:

      model{
        for (i in 1:N){  ## loop over observations
          ## form the linear predictor (no intercept)
          mu[i] <- x[i,1]*beta[1] +
                   x[i,2]*beta[2] +
                   x[i,3]*beta[3] +
                   x[i,4]*beta[4] +
                   x[i,5]*beta[5] +
                   x[i,6]*beta[6]

          ## cumulative logistic probabilities
          logit(Q[i,1]) <- tau[1] - mu[i]
          p[i,1] <- Q[i,1]
          for (j in 2:4){
            logit(Q[i,j]) <- tau[j] - mu[i]
            ## trick to get the slice of the cdf we need
            p[i,j] <- Q[i,j] - Q[i,j-1]
          }
          p[i,5] <- 1 - Q[i,4]
          y[i] ~ dcat(p[i,1:5])  ## p[i,] sums to 1 for each i
        }

        ## priors over betas
        beta[1:6] ~ dmnorm(b0[], B0[,])

        ## thresholds
        for (j in 1:4){
          tau0[j] ~ dnorm(0, .01)
        }
        tau[1:4] <- sort(tau0)  ## JAGS only, not in BUGS!
      }

  14. Redundant Parameterization
      - Exploit the lack of identification: run the MCMC algorithm in the space of the unidentified parameters.
      - Post-processing: map the MCMC output back to the identified parameters.
      - This typically yields a better-mixing Markov chain than an MCMC algorithm run in the space of the identified parameters.
      - In the ordinal-model case, exploit the lack of identification between the thresholds and the intercept parameter.
      - Take care! (A post-processing sketch follows below.)
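      A sketch of the post-processing step, with stand-in draws where real MCMC output would go: only the differences between the thresholds and an unidentified intercept $\alpha$ are identified, so map each draw back before summarizing.

      draws <- data.frame(alpha = rnorm(1000),      ## stand-ins for
                          tau1  = rnorm(1000, -1),  ## actual MCMC output
                          tau2  = rnorm(1000,  1))
      tauStar1 <- draws$tau1 - draws$alpha  ## identified: tau_1 - alpha
      tauStar2 <- draws$tau2 - draws$alpha  ## identified: tau_2 - alpha
      quantile(tauStar1, c(.025, .5, .975))  ## summarize identified quantities only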

  15. Interviewer heterogeneity in scale use
      Different interviewers use the rating scale differently: e.g., interviewer $k$ is a tougher grader than interviewer $k'$. We tap this with a set of interviewer effects varying over interviewers $k = 1, \dots, K$, augmenting the usual ordinal model as follows:
      $\Pr(y_i \le j) = F(\tau_{j+1} - \lambda_i)$, $j = 0, \dots, J-1$
      $\Pr(y_i = J) = 1 - F(\tau_J - \lambda_i)$
      $\lambda_i = x_i \beta + \gamma_k$
      $\gamma_k \sim N(0, \sigma_\gamma^2)$, $k = 1, \dots, K$
      A positive $\gamma_k$ is equivalent to the thresholds being shifted down (i.e., interviewer $k$ is an easier-than-average grader).
      Zero-mean restriction on $\gamma_k$: why?
      Alternative model: each interviewer gets their own set of thresholds, perhaps fit hierarchically.
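      An illustration, with assumed thresholds, of how a positive interviewer effect $\gamma_k$ shifts mass toward higher ratings under the logistic version of the model:

      tau   <- c(-1.5, -0.5, 0.5, 1.5)  ## assumed thresholds, 5 categories
      probs <- function(xb, gamma)
        diff(plogis(c(-Inf, tau, Inf) - (xb + gamma)))  ## Pr(y = 0), ..., Pr(y = 4)
      rbind(average = probs(0, 0),  ## gamma_k = 0
            easier  = probs(0, 1))  ## gamma_k = 1: mass moves to high categories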
