Workshop 10.6b: Analysis of count data (Bayesian) Murray Logan - - PDF document

workshop 10 6b analysis of count data bayesian
SMART_READER_LITE
LIVE PREVIEW

Workshop 10.6b: Analysis of count data (Bayesian) Murray Logan - - PDF document

-1- Workshop 10.6b: Analysis of count data (Bayesian) Murray Logan September 13, 2016 Table of contents 1 Poisson regression 1 2 Helper Functions 6 3 Worked Examples 7 1. Poisson regression 1.1. Poisson regression Probability


slide-1
SLIDE 1
  • 1-

Workshop 10.6b: Analysis

  • f

count data (Bayesian)

Murray Logan

September 13, 2016

Table of contents

1 Poisson regression 1 2 Helper Functions 6 3 Worked Examples 7

  • 1. Poisson regression

1.1. Poisson regression

Probability density function

λ = 25 λ = 15 λ = 3 5 10 15 20 25 30 35 40

Cumulative density function

5 10 15 20 25 30 35 40

p(Yi) = e−λλx x! log(µ) = β0 + β1xi + ... + βpxp

1.2. Dispersion

Spread assumed to be equal to mean. (φ = 1)

slide-2
SLIDE 2
  • 2-

Probability density function

λ = 25 λ = 15 λ = 3 5 10 15 20 25 30 35 40

Cumulative density function

5 10 15 20 25 30 35 40

1.3. Dispersion

1.3.1. Over-dispersion Sample more varied than expected from its mean

  • variability due to other unmeasured (latent) influences

– quasi-poisson model – negative binomial – observation level random effect

1.4. Dispersion

1.4.1. Over-dispersion Sample more varied than expected from its mean

  • variability due to other unmeasured (latent) influences
  • clumpiness

– negative binomial model

  • due to more zeros than expected

– zero-inflated model

1.5. Residuals

  • difficult to interpret
slide-3
SLIDE 3
  • 3-

1.6. Simulated Residuals

  • residuals uniform when model good

height! height!

  • 1.68

1.70 1.72 1.74 1.76 1.78 −2 −1 1 2 3 4

Raw

  • 1.68

1.70 1.72 1.74 1.76 −1 1 2 3

Pearson

1.7. Simulated Residuals

  • residuals uniform when model good
slide-4
SLIDE 4
  • 4-

height! height!

  • 1.68

1.70 1.72 1.74 1.76 1.78 −2 −1 1 2 3 4

Raw

  • 1.68

1.70 1.72 1.74 1.76 −1 1 2 3

Pearson height! height!

  • 1.68

1.70 1.72 1.74 1.76 1.78 −2 −1 1 2 3 4

Raw

  • 1.68

1.70 1.72 1.74 1.76 −1 1 2 3

Pearson

1.8. Simulated Residuals

  • residuals uniform when model good
slide-5
SLIDE 5
  • 5-

height! height!

  • 1.68

1.70 1.72 1.74 1.76 1.78 −2 −1 1 2 3 4

Raw

  • 1.68

1.70 1.72 1.74 1.76 −1 1 2 3

Pearson height! height!

  • 1.68

1.70 1.72 1.74 1.76 1.78 −2 −1 1 2 3 4

Raw

  • 1.68

1.70 1.72 1.74 1.76 −1 1 2 3

Pearson

1.9. Simulated Residuals

  • residuals uniform when model good
slide-6
SLIDE 6
  • 6-

height! height!

  • 1.68

1.70 1.72 1.74 1.76 1.78 −2 −1 1 2 3 4

Raw

  • 1.68

1.70 1.72 1.74 1.76 −1 1 2 3

Pearson height! height!

  • 1.68

1.70 1.72 1.74 1.76 1.78 −2 −1 1 2 3 4

Raw

  • 1.68

1.70 1.72 1.74 1.76 −1 1 2 3

Pearson

  • 2. Helper Functions
slide-7
SLIDE 7
  • 7-

2.1. Simulated Residuals

> simRes <- function(lambda, Y,n=250, plot=T, family='negbin', size=NULL,theta=NULL) { + require(gap) + N = length(Y) + sim = switch(family, + 'poisson' = matrix(rpois(n*N,apply(lambda,2,mean)),ncol=N, byrow=TRUE), + 'negbin' = matrix(MASS:::rnegbin(n*N,apply(lambda,2,mean),size),ncol=N, byrow=TRUE), + 'zip' = matrix(gamlss.dist:::rZIP(n*N,apply(lambda,2,mean),theta),ncol=N, byrow=TRUE), + 'zinb' = matrix(gamlss.dist:::rZINBI(n*N,apply(lambda,2,mean),sigma=theta,nu=size),ncol=N, + byrow=TRUE) + ) + a = apply(sim + runif(n,-0.5,0.5),2,ecdf) + resid<-NULL + for (i in 1:length(Y)) resid<-c(resid,a[[i]](Y[i] + runif(1 ,-0.5,0.5))) + if (plot==T) { + par(mfrow=c(1,2)) + gap::qqunif(resid,pch = 2, bty = "n", + logscale = F, col = "black", cex = 0.6, main = "QQ plot residuals", + cex.main = 1, las=1) + plot(resid~apply(lambda,2,mean), xlab='Predicted value', ylab='Standardized residual', las=1) + } + resid + }

2.2. ggplot goodness

> library(ggplot2) > theme_goodness <- theme_classic() + + theme(axis.line.x=element_line(), axis.line.y=element_line())

2.3. MCMC summaries

> MCMCsum <- function(x) { + data.frame(Mean=mean(x), median=median(x), HPDinterval(as.mcmc(x))) + } > MCMCsummaries <- function(fit) { + plyr:::adply(fit, 2, MCMCsum) + }

  • 3. Worked Examples

3.1. Poisson t-test

Format of limpets.csv data files Count Shore 1 sheltered 3 sheltered 2 sheltered 1 sheltered 4 sheltered . . .

slide-8
SLIDE 8
  • 8-

. . . Shore Categorical description of the shore type (sheltered or exposed) - Predictor variable Count Number of limpets per quadrat - Response variable

3.2. Poisson regression

Format of day.csv data files TREAT BARNACLE ALG1 27 .. .. ALG2 24 .. .. NB 9 .. .. S 12 .. .. TREAT Categorical listing of surface types. ALG1 = algal species 1, ALG2 = algal species 2, NB = naturally bare surface, S = scraped bare surface. BARNACLE The number of newly recruited barnacles on each plot after 4 weeks.

3.3. Negative Binomial

p(yi) = Γ(yi + ω) Γ(ω)y! × µyi

i ωω

(µi + ω)µi+ω

slide-9
SLIDE 9
  • 9-
  • 1.68

1.70 1.72 1.74 1.76 1.78 −2 −1 1 2 3 4

Raw

  • 1.68

1.70 1.72 1.74 1.76 1.78 −1 1 2 3

Pearson

  • 1.68

1.70 1.72 1.74 1.76 1.78 −2 −1 1 2

Deviance

3.4. Negative Binomial

p(yi) = Γ(yi + ω) Γ(ω)y! × µyi

i ωω

(µi + ω)µi+ω

  • mixture model
  • ηi ∼ Gamma(µi, ω)
  • E(y) ∼ Poisson(ηi)

3.4. Worked Example