SLIDE 1 Workshop 11.1: Generalized linear models
Murray Logan 26-011-2013 Other data types
- Binary - only 0 and 1 (dead/alive) (present/absent)
- Proportional abundance - range from 0 to 100
- Count data - min of zero
General linear models General linear models
Residuali = yi − E(Yi)
General linear models General linear models
E(Y )
Link function
= β0 + β1x1 + ... + βpxp
+ε, ε ∼ Dist(...)
General linear models
E(Y )
Link function
= β0 + β1x1 + ... + βpxp
+ e
Random
E(Yi) ∼ N(µi, σ2)
β0 + β1x1 + ... + βpxp 1
SLIDE 2
µ = β0 + β1x1 + ... + βpxp
Generalized linear models
g(µ)
function
= β0 + β1x1 + ... + βpxp
+ e
Random
- Random component. A nominated distribution (Gaussian, Pois-
son, Binomial, Gamma, Beta,. . . )
- Systematic component
- Link function
Generalized linear models
Response variable Probability distribution Link function Model name Continuous measurements Gaussian identity: µ Linear regression Binary Binomial logit: log
1−π
probit:
1 √ 2π
α+β.X
−∞
exp
2Z2
dZ Probit regression 2
SLIDE 3 Complimentary log-log: log(−log(1 − π)) Logistic regression Counts Poisson log: logµ Poisson regressionlog-linear model Negative binomial log
µ+θ
- Negative biomial regression
Quasi-poisson logµ Poisson regression
OLS Maximum Likelihood Logistic regression
Binary data Link: log
1−π
- Transform scale of linear predictor (−∞, ∞) into that of the response (0,1)
Logistic regression
E(Y ) = n x
Dispersion
Spread assumed to be equal to mean. (φ = 1) 3
SLIDE 4 Dispersion
Over-dispersion Sample more varied than expected from its mean
- variability due to other unmeasured influences
– quasi- model
- due to more zeros than expected
– zero-inflated model
Logistic regression
Binary data
dat.glmL <- glm(y ~ x, data = dat, family = "binomial")
Logistic regression
Binary data
dat.glmL <- glm(y ~ x, data = dat, family = "binomial")
plot(dat.glmL)
Logistic regression
Binary data
dat.glmL <- glm(y ~ x, data = dat, family = "binomial")
4
SLIDE 5 plot(dat.glmL)
– Pearson’s χ2 residuals – Deviance (G2)
Logistic regression
Binary data
summary(data.glmL} Slope parameter is on log odds-ratio scale
Logistic regression
Binary data
summary(data.glmL} Slope parameter is on log odds-ratio scale
quasiR2 = 1 −
null deviance
Binary data
LD50 = −intercept slope 5
SLIDE 6
Worked Example
polis <- read.table("../data/polis.csv", header = T, sep = ",", strip.white = T) head(polis) ISLAND RATIO PA 1 Bota 15.41 1 2 Cabeza 5.63 1 3 Cerraja 25.92 1 4 Coronadito 15.17 5 Flecha 13.04 1 6 Gemelose 18.85 plot(PA ~ RATIO, data = polis) abline(lm(PA ~ RATIO, data = polis)) with(polis, lines(lowess(PA ~ RATIO))) polis.glm <- glm(PA ~ RATIO, family = binomial, data = polis) par(mfrow = c(2, 3)) plot(polis.glm) ## Check the model for lack of fit via: Pearson ## chisq polis.resid <- sum(resid(polis.glm, type = "pearson")^2) 1 - pchisq(polis.resid, polis.glm$df.resid) [1] 0.5715 # No evidence of a lack of fit # Deviance 1 - pchisq(polis.glm$deviance, polis.glm$df.resid) [1] 0.6514 # No evidence of a lack of fit # Estimate dispersal polis.resid/polis.glm$df.resid 6
SLIDE 7
Figure 1: plot of chunk unnamed-chunk-8 7
SLIDE 8 [1] 0.9019 # OR polis.glm$deviance/polis.glm$df.resid [1] 0.8365 # Number of zeros polis.tab <- table(polis$PA == 0) polis.tab/sum(polis.tab) FALSE TRUE 0.5263 0.4737 summary(polis.glm) Call: glm(formula = PA ~ RATIO, family = binomial, data = polis) Deviance Residuals: Min 1Q Median 3Q Max
0.237 0.433 2.099 Coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept) 3.606 1.695 2.13 0.033 RATIO
0.101
0.029 (Intercept) * RATIO *
0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1 (Dispersion parameter for binomial family taken to be 1) Null deviance: 26.287
degrees of freedom Residual deviance: 14.221
degrees of freedom AIC: 18.22 Number of Fisher Scoring iterations: 6 8
SLIDE 9 exp(-0.296) [1] 0.7438 # R2 1 - (polis.glm$deviance/polis.glm$null) [1] 0.459 anova(polis.glm, test = "Chisq") Analysis of Deviance Table Model: binomial, link: logit Response: PA Terms added sequentially (first to last) Df Deviance Resid. Df Resid. Dev Pr(>Chi) NULL 18 26.3 RATIO 1 12.1 17 14.2 0.00051 NULL RATIO ***
0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1 par(mar = c(4, 5, 0, 0)) plot(PA ~ RATIO, data = polis, type = "n", ann = F, axes = F) points(PA ~ RATIO, data = polis, pch = 16) xs <- seq(min(polis$RATIO), max(polis$RATIO), len = 1000) ys <- predict(polis.glm, newdata = data.frame(RATIO = xs), type = "link", se = T) ys$lwr <- polis.glm$family$linkinv(ys$fit - ys$se.fit) ys$upr <- polis.glm$family$linkinv(ys$fit + ys$se.fit) ys$fit <- polis.glm$family$linkinv(ys$fit) 9
SLIDE 10 lines(ys$fit ~ xs, col = "black") lines(ys$lwr ~ xs, col = "black", lty = 2) lines(ys$upr ~ xs, col = "black", lty = 2) axis(1) mtext("Island perimeter to area ratio", 1, cex = 1.5, line = 3) axis(2, las = 2) mtext(expression(paste(italic(Uta), " lizard presence/absence")), 2, cex = 1.5, line = 3) box(bty = "l") # LD50 ld50 <- -polis.glm$coef[1]/polis.glm$coef[2]
Poisson regression
p(Yi) = e−λλx x! log(µ) = β0 + β1xi + ... + βpxp
Dispersion
Spread assumed to be equal to mean. (φ = 1)
Dispersion
Over-dispersion Sample more varied than expected from its mean
- variability due to other unmeasured influences
– quasi-poisson model
– negative binomial model
- due to more zeros than expected
– zero-inflated model 10
SLIDE 11
Figure 2: plot of chunk unnamed-chunk-8 11
SLIDE 12
Figure 3: Figure 4: 12
SLIDE 13
Worked Example
data.pois <- read.table("../data/data.pois.csv", header = T, sep = ",", strip.white = T) head(data.pois) y x 1 1 1.025 2 2 2.697 3 3 3.626 4 2 4.949 5 4 6.025 6 8 6.254 library(car) scatterplot(y ~ x, data = data.pois) scatterplot(y ~ x, data = data.pois, log = "y") rug(jitter(data.pois$y), side = 2) hist(data.pois$y) boxplot(data.pois$y, horizontal = TRUE) rug(jitter(data.pois$y), side = 1) # plot(y~x, data.pois, log=’y’) with(data.pois, # lines(lowess(y~x))) library(car) # scatterplot(y~x,data=data.pois, log=’y’) # rug(jitter(data.pois$y), side=2) # Zero inflation proportion of 0’s in the data data.pois.tab <- table(data.pois$y == 0) data.pois.tab/sum(data.pois.tab) FALSE 1 13
SLIDE 14
Figure 5: plot of chunk unnamed-chunk-9 14
SLIDE 15
Figure 6: plot of chunk unnamed-chunk-9 15
SLIDE 16
Figure 7: plot of chunk unnamed-chunk-9 16
SLIDE 17
Figure 8: plot of chunk unnamed-chunk-9 17
SLIDE 18
# proportion of 0’s expected from a Poisson # distribution mu <- mean(data.pois$y) cnts <- rpois(1000, mu) data.pois.tab <- table(cnts == 0) data.pois.tab/sum(data.pois.tab) FALSE TRUE 0.997 0.003 # fit model data.pois.glm <- glm(y ~ x, data = data.pois, family = "poisson") data.pois.glm <- glm(y ~ x, data = data.pois, family = poisson(link = "log")) # Model validation par(mfrow = c(2, 3)) plot(data.pois.glm, ask = F, which = 1:6) # goodness of fit data.pois.resid <- sum(resid(data.pois.glm, type = "pearson")^2) 1 - pchisq(data.pois.resid, data.pois.glm$df.resid) [1] 0.4897 # Deviance based 1 - pchisq(data.pois.glm$deviance, data.pois.glm$df.resid) [1] 0.5076 data.pois.glmG <- glm(y ~ x, data = data.pois, family = "gaussian") AIC(data.pois.glm, data.pois.glmG) df AIC data.pois.glm 2 90.32 data.pois.glmG 3 98.28 # Poisson deviance data.pois.glm$deviance 18
SLIDE 19
Figure 9: plot of chunk unnamed-chunk-9 19
SLIDE 20 [1] 17.23 # Gaussian deviance data.pois.glmG$deviance [1] 118.1 data.pois.resid/data.pois.glm$df.resid [1] 0.9717 data.pois.glm$deviance/data.pois.glm$df.resid [1] 0.957 # Or alternatively, via Pearson’s residuals Resid <- resid(data.pois.glm, type = "pearson") sum(Resid^2)/(nrow(data.pois) - length(coef(data.pois.glm))) [1] 0.9717 summary(data.pois.glm) Call: glm(formula = y ~ x, family = poisson(link = "log"), data = data.pois) Deviance Residuals: Min 1Q Median 3Q Max
0.044 0.562 2.046 Coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept) 0.5600 0.2539 2.21 0.027 x 0.1115 0.0186 6.00 2e-09 (Intercept) * x ***
0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1 20
SLIDE 21 (Dispersion parameter for poisson family taken to be 1) Null deviance: 55.614
degrees of freedom Residual deviance: 17.226
degrees of freedom AIC: 90.32 Number of Fisher Scoring iterations: 4 library(gmodels) ci(data.pois.glm) Estimate CI lower CI upper Std. Error (Intercept) 0.5600 0.02649 1.0935 0.25395 x 0.1115 0.07247 0.1506 0.01858 p-value (Intercept) 2.744e-02 x 1.971e-09 1 - (data.pois.glm$deviance/data.pois.glm$null) [1] 0.6903 par(mfrow = c(1, 1), mar = c(4, 5, 0, 0)) plot(y ~ x, data = data.pois, type = "n", ann = F, axes = F) points(y ~ x, data = data.pois, pch = 16) xs <- seq(min(data.pois$x), max(data.pois$x), len = 1000) # ys <- predict(data.pois.glm, newdata = # data.frame(x = xs), type = ’response’, se = T) ys <- predict(data.pois.glm, newdata = data.frame(x = xs), type = "link", se = T) ys$lwr <- exp(ys$fit - 2 * ys$se.fit) ys$upr <- exp(ys$fit + 2 * ys$se.fit) ys$fit <- exp(ys$fit) lines(ys$fit ~ xs, col = "black") lines(ys$lwr ~ xs, col = "black", lty = 2) lines(ys$upr ~ xs, col = "black", lty = 2) 21
SLIDE 22
polygon(c(xs, rev(xs)), c(ys$lwr, rev(ys$upr)), border = NA, col = "#0000FF50") # lines(ys$fit - 2 * ys$se.fit ~ xs, col = ’black’, # type = ’l’, lty = 2) lines(ys$fit + 2 * ys$se.fit # ~ xs, col = ’black’, type = ’l’, lty = 2) axis(1) mtext("X", 1, cex = 1.5, line = 3) axis(2, las = 2) mtext("Abundance of Y", 2, cex = 1.5, line = 3) box(bty = "l") Figure 10: plot of chunk unnamed-chunk-9 22
SLIDE 23 Quasi-Poisson
- specify var(y) = φµ
- φ (scale parameter) estimated
- parameter estimates same
- Standard errors = √φ × SE
Worked Example
data.qp <- read.table("../data/data.qp.csv", header = T, sep = ",", strip.white = T) head(data.qp) y x 1 0 1.025 2 3 2.697 3 0 3.626 4 5 4.949 5 6 6.025 6 1 6.254 ## library(car) scatterplot(y~x,data=data.qp) ## scatterplot(y~x,data=data.qp, log=’y’) plot(y ~ x, data.qp) with(data.qp, lines(lowess(y ~ x))) rug(jitter(data.qp$y), side = 2) plot(y ~ x, data.qp, log = "y") with(data.qp, lines(lowess(y ~ x))) rug(jitter(data.qp$y), side = 2) hist(data.qp$y) boxplot(data.qp$y, horizontal = TRUE) rug(jitter(data.qp$y), side = 1) 23
SLIDE 24
Figure 11: plot of chunk unnamed-chunk-10 24
SLIDE 25
Figure 12: plot of chunk unnamed-chunk-10 25
SLIDE 26
Figure 13: plot of chunk unnamed-chunk-10 26
SLIDE 27
Figure 14: plot of chunk unnamed-chunk-10 27
SLIDE 28
# Zero inflation proportion of 0’s in the data data.qp.tab <- table(data.qp$y == 0) data.qp.tab/sum(data.qp.tab) FALSE TRUE 0.85 0.15 # proportion of 0’s expected from a Poisson # distribution mu <- mean(data.qp$y) cnts <- rpois(1000, mu) data.qp.tab <- table(cnts == 0) data.qp.tab/sum(data.qp.tab) FALSE TRUE 0.998 0.002 # fit model data.qp.glm <- glm(y ~ x, data = data.qp, family = "quasipoisson") # Model validation par(mfrow = c(2, 3)) plot(data.qp.glm, ask = F, which = 1:6) data.qp.glm$deviance/data.qp.glm$df.resid [1] 3.888 # Or alternatively, via Pearson’s residuals Resid <- resid(data.qp.glm, type = "pearson") sum(Resid^2)/(nrow(data.qp) - length(coef(data.qp.glm))) [1] 3.699 summary(data.qp.glm) 28
SLIDE 29
Figure 15: plot of chunk unnamed-chunk-10 29
SLIDE 30 Call: glm(formula = y ~ x, family = "quasipoisson", data = data.qp) Deviance Residuals: Min 1Q Median 3Q Max
0.744 3.933 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 0.6997 0.4746 1.47 0.158 x 0.1005 0.0353 2.85 0.011 (Intercept) x *
0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1 (Dispersion parameter for quasipoisson family taken to be 3.699) Null deviance: 101.521
degrees of freedom Residual deviance: 69.987
degrees of freedom AIC: NA Number of Fisher Scoring iterations: 5 library(gmodels) ci(data.qp.glm) Estimate CI lower CI upper Std. Error (Intercept) 0.6997 -0.29744 1.6967 0.4746 x 0.1005 0.02632 0.1746 0.0353 p-value (Intercept) 0.15770 x 0.01071 1 - (data.qp.glm$deviance/data.qp.glm$null) [1] 0.3106 par(mar = c(4, 5, 0, 0)) 30
SLIDE 31 plot(y ~ x, data = data.qp, type = "n", ann = F, axes = F) points(y ~ x, data = data.qp, pch = 16) xs <- seq(0, 20, l = 1000) ys <- predict(data.pois.glm, newdata = data.frame(x = xs), type = "link", se = T) ys$lwr <- exp(ys$fit - 2 * ys$se.fit) ys$upr <- exp(ys$fit + 2 * ys$se.fit) ys$fit <- exp(ys$fit) lines(ys$fit ~ xs, col = "black") lines(ys$lwr ~ xs, col = "black", lty = 2) lines(ys$upr ~ xs, col = "black", lty = 2) axis(1) mtext("X", 1, cex = 1.5, line = 3) axis(2, las = 2) mtext("Abundance of Y", 2, cex = 1.5, line = 3) box(bty = "l")
Negative Binomial
p(yi) = Γ(yi + ω) Γ(ω)y! × µyi
i ωω
(µi + ω)µi+ω
Negative Binomial
p(yi) = Γ(yi + ω) Γ(ω)y! × µyi
i ωω
(µi + ω)µi+ω
- mixture model
- ηi ∼ Gamma(µi, ω)
- E(y) ∼ Poisson(ηi)
Worked Example
data.nb <- read.table("../data/data.nb.csv", header = T, sep = ",", strip.white = T) head(data.nb) y x 1 1 1.042 2 7 2.498 31
SLIDE 32
Figure 16: plot of chunk unnamed-chunk-10 32
SLIDE 33
Figure 17: 33
SLIDE 34
3 2 3.677 4 4 3.798 5 1 4.871 6 9 7.690 scatterplot(y ~ x, data.nb, log = "y") Error: NA/NaN/Inf in ’y’ Figure 18: plot of chunk unnamed-chunk-11 34
SLIDE 35
plot(y ~ x, data.nb) with(data.nb, lines(lowess(y ~ x))) Figure 19: plot of chunk unnamed-chunk-11 plot(y ~ x, data.nb, log = "y") with(data.nb, lines(lowess(y ~ x))) rug(jitter(data.nb$y), side = 2) hist(data.nb$y, br = 10) 35
SLIDE 36
Figure 20: plot of chunk unnamed-chunk-11 36
SLIDE 37
Figure 21: plot of chunk unnamed-chunk-11 37
SLIDE 38
boxplot(data.nb$y, horizontal = TRUE) rug(jitter(data.nb$y), side = 1) Figure 22: plot of chunk unnamed-chunk-11 # Zero inflation proportion of 0’s in the data data.nb.tab <- table(data.nb$y == 0) data.nb.tab/sum(data.nb.tab) FALSE TRUE 0.95 0.05 38
SLIDE 39
# proportion of 0’s expected from a Poisson # distribution mu <- mean(data.nb$y) cnts <- rpois(1000, mu) hist(cnts) Figure 23: plot of chunk unnamed-chunk-11 tab <- table(cnts == 0) tab/sum(tab) FALSE TRUE 0.996 0.004 39
SLIDE 40
# fit model library(MASS) data.nb.glm <- glm.nb(y ~ x, data = data.nb) # Model validation par(mfrow = c(2, 3)) plot(data.nb.glm, ask = F, which = 1:6) Figure 24: plot of chunk unnamed-chunk-11 data.ss <- sum(resid(data.nb1, type = "pearson")^2) Error: object ’data.nb1’ not found 40
SLIDE 41
data.ss/data.nb1$df.resid Error: object ’data.ss’ not found data.nb1 <- glm(y ~ x, data = data.nb, family = "poisson") plot(data.nb1, ask = F, which = 1:6) Figure 25: plot of chunk unnamed-chunk-11 1 - pchisq(data.nb1$deviance, data.nb1$df.resid) [1] 8.6e-08 41
SLIDE 42 # goodness of fit data.nb.resid <- sum(resid(data.nb.glm, type = "pearson")^2) 1 - pchisq(data.nb.resid, data.nb.glm$df.resid) [1] 0.5825 # Deviance based 1 - pchisq(data.nb.glm$deviance, data.nb.glm$df.resid) [1] 0.2407 data.nb.glmG <- glm(y ~ x, data = data.nb, family = "gaussian") AIC(data.nb.glm, data.nb.glmG) df AIC data.nb.glm 3 115.8 data.nb.glmG 3 128.2 # Poisson deviance data.nb.glm$deviance [1] 21.81 # Gaussian deviance data.nb.glmG$deviance [1] 527.6 summary(data.nb.glm) Call: glm.nb(formula = y ~ x, data = data.nb, init.theta = 2.359878187, link = log) Deviance Residuals: Min 1Q Median 3Q Max
0.442 1.366 42
SLIDE 43 Coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept) 0.7454 0.4253 1.75 0.0797 x 0.0949 0.0344 2.76 0.0058 (Intercept) . x **
0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1 (Dispersion parameter for Negative Binomial(2.36) family taken to be 1) Null deviance: 30.443
degrees of freedom Residual deviance: 21.806
degrees of freedom AIC: 115.8 Number of Fisher Scoring iterations: 1 Theta: 2.36
1.10 2 x log-likelihood:
library(gmodels) ci(data.nb.glm) Estimate CI lower CI upper Std. Error (Intercept) 0.74543 -0.14818 1.6390 0.42534 x 0.09494 0.02265 0.1672 0.03441 p-value (Intercept) 0.079682 x 0.005792 1 - (data.nb.glm$deviance/data.nb.glm$null) [1] 0.2837 par(mfrow = c(1, 1), mar = c(4, 5, 0, 0)) 43
SLIDE 44 plot(y ~ x, data = data.nb, type = "n", ann = F, axes = F) points(y ~ x, data = data.nb, pch = 16) xs <- seq(0, 20, l = 1000) ys <- predict(data.nb.glm, newdata = data.frame(x = xs), type = "link", se = T) ys$lwr <- exp(ys$fit - 2 * ys$se.fit) ys$upr <- exp(ys$fit + 2 * ys$se.fit) ys$fit <- exp(ys$fit) lines(ys$fit ~ xs, col = "black") lines(ys$lwr ~ xs, col = "black", lty = 2) lines(ys$upr ~ xs, col = "black", lty = 2) axis(1) mtext("X", 1, cex = 1.5, line = 3) axis(2, las = 2) mtext("Abundance of Y", 2, cex = 1.5, line = 3) box(bty = "l")
Zero-inflated model
– those expected by Poisson (or binomial) – false zeros
– binary logistic (rate of false zeros) – Poisson (trend in counts)
Worked Example
data.zip <- read.table("../data/data.zip.csv", header = T, sep = ",", strip.white = T) head(data.zip) y x 1 0 1.042 2 3 2.498 44
SLIDE 45
Figure 26: plot of chunk unnamed-chunk-11 45
SLIDE 46
3 1 3.677 4 4 3.798 5 5 4.871 6 1 7.690 plot(y ~ x, data.zip) with(data.zip, lines(lowess(y ~ x))) Figure 27: plot of chunk unnamed-chunk-12 plot(y ~ x, data.zip, log = "y") with(data.zip, lines(lowess(y ~ x))) rug(jitter(data.pois$y), side = 2) 46
SLIDE 47
Figure 28: plot of chunk unnamed-chunk-12 47
SLIDE 48
hist(data.zip$y, br = 20) Figure 29: plot of chunk unnamed-chunk-12 boxplot(data.zip$y, horizontal = TRUE) rug(jitter(data.zip$y), side = 1) # Zero inflation proportion of 0’s in the data data.zip.tab <- table(data.zip$y == 0) data.zip.tab/sum(data.zip.tab) 48
SLIDE 49
Figure 30: plot of chunk unnamed-chunk-12 49
SLIDE 50 FALSE TRUE 0.55 0.45 # proportion of 0’s expected from a Poisson # distribution mu <- mean(data.zip$y) cnts <- rpois(1000, mu) data.zip.tab <- table(cnts == 0) data.zip.tab/sum(data.zip.tab) FALSE TRUE 0.935 0.065 summary(glm(y ~ x, data.zip, family = "poisson")) Call: glm(formula = y ~ x, family = "poisson", data = data.zip) Deviance Residuals: Min 1Q Median 3Q Max
1.174 3.638 Coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept) 0.6705 0.3302 2.03 0.042 x 0.0296 0.0266 1.11 0.266 (Intercept) * x
0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1 (Dispersion parameter for poisson family taken to be 1) Null deviance: 85.469
degrees of freedom Residual deviance: 84.209
degrees of freedom AIC: 124 Number of Fisher Scoring iterations: 6 50
SLIDE 51
plot(glm(y ~ x, data.zip, family = "poisson")) 51
SLIDE 52
52
SLIDE 53
library(pscl) data.zip.zip <- zeroinfl(y ~ x | 1, dist = "poisson", data = data.zip) plot(resid(data.zip.zip) ~ fitted(data.zip.zip)) summary(data.zip.zip) Call: zeroinfl(formula = y ~ x | 1, data = data.zip, dist = "poisson") 53
SLIDE 54
Figure 31: plot of chunk unnamed-chunk-12 54
SLIDE 55
Figure 32: plot of chunk unnamed-chunk-12 55
SLIDE 56 Pearson residuals: Min 1Q Median 3Q Max
0.966 1.619 Count model coefficients (poisson with log link): Estimate Std. Error z value Pr(>|z|) (Intercept) 0.7047 0.3196 2.21 0.02745 x 0.0873 0.0253 3.45 0.00056 (Intercept) * x *** Zero-inflation model coefficients (binomial with logit link): Estimate Std. Error z value Pr(>|z|) (Intercept)
0.456
0.62
0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1 Number of iterations in BFGS optimization: 13 Log-likelihood: -36.2 on 3 Df data.zip.zip1 <- zeroinfl(y ~ x | x, dist = "poisson", data = data.zip) plot(resid(data.zip.zip1) ~ fitted(data.zip.zip1)) summary(data.zip.zip1) Call: zeroinfl(formula = y ~ x | x, data = data.zip, dist = "poisson") Pearson residuals: Min 1Q Median 3Q Max
0.908 2.132 Count model coefficients (poisson with log link): Estimate Std. Error z value Pr(>|z|) (Intercept) 0.6111 0.3300 1.85 0.06404 x 0.0939 0.0257 3.65 0.00027 56
SLIDE 57
Figure 33: plot of chunk unnamed-chunk-12 57
SLIDE 58 (Intercept) . x *** Zero-inflation model coefficients (binomial with logit link): Estimate Std. Error z value Pr(>|z|) (Intercept)
1.890
0.15 x 0.217 0.146 1.49 0.14
0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1 Number of iterations in BFGS optimization: 17 Log-likelihood: -34.4 on 4 Df AIC(data.zip.zip, data.zip.zip1) df AIC data.zip.zip 3 78.34 data.zip.zip1 4 76.81 par(mar = c(4, 5, 0, 0)) plot(y ~ x, data = data.zip, type = "n", ann = F, axes = F) points(y ~ x, data = data.zip, pch = 16) xs <- seq(0, 20, l = 1000) # ys <- predict(data.pois.glm, newdata = # data.frame(x = xs), type = ’response’, se = T) coefs <- coef(data.zip.zip)[1:2] mm <- model.matrix(~x, data = data.frame(x = xs)) fit <- coefs %*% t(mm) se <- sqrt(diag(mm %*% vcov(data.zip.zip)[-3, -3] %*% t(mm))) lwr <- as.vector(exp(fit - 2 * se)) upr <- as.vector(exp(fit + 2 * se)) fit <- as.vector(exp(fit)) lines(fit ~ xs, col = "black") lines(lwr ~ xs, col = "black", lty = 2) lines(upr ~ xs, col = "black", lty = 2) polygon(c(xs, rev(xs)), c(lwr, rev(upr)), border = NA, col = "#0000FF50") # lines(ys$fit - 2 * ys$se.fit ~ xs, col = ’black’, 58
SLIDE 59
# type = ’l’, lty = 2) lines(ys$fit + 2 * ys$se.fit # ~ xs, col = ’black’, type = ’l’, lty = 2) axis(1) mtext("X", 1, cex = 1.5, line = 3) axis(2, las = 2) mtext("Abundance of Y", 2, cex = 1.5, line = 3) box(bty = "l") Figure 34: plot of chunk unnamed-chunk-12
Generalized Linear Mixed effects Models (GLMM)
59