useR! 2009, Rennes 8.-10. July 2009 – p. 1
Size Estimation - Statistical Models for Underreporting
Gerhard Neubauer, Gordana Djuraš & Herwig Friedl
JOANNEUM RESEARCH and Technical University, Graz
■ criminology: crimes with an aspect of shame
■ public health: infectious (HIV) or chronic diseases
■ production: error counts in a production process
■ traffic accidents with minor damage
$R_i \sim \mathrm{Binomial}(\lambda, \pi)$
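To make the role of the two parameters concrete, the following R sketch simulates reported counts from the Binomial model and recovers both the size λ and the reporting probability π from the first two moments. The values `lambda_true`, `pi_true` and `n` are illustrative assumptions, and the method-of-moments estimator is a stand-in chosen for brevity; it is not the MLE-based implementation described in the talk.

```r
# Sketch only: all numeric settings below are illustrative assumptions.
set.seed(1)
lambda_true <- 100   # assumed true size (total number of events)
pi_true     <- 0.6   # assumed reporting probability
n           <- 5000  # assumed number of independent reported counts

# Reported counts R_i ~ Binomial(lambda, pi)
r <- rbinom(n, size = lambda_true, prob = pi_true)

# Binomial moments: E[R] = lambda * pi, Var[R] = lambda * pi * (1 - pi)
m <- mean(r)
v <- var(r)

# Method-of-moments estimates; usable only when v < m
pi_hat     <- 1 - v / m
lambda_hat <- m / pi_hat

c(mean = m, var = v, pi_hat = pi_hat, lambda_hat = lambda_hat)
```

The estimator relies on the reported counts being underdispersed relative to a Poisson distribution (variance below the mean). With real data this often fails, which is one reason to consider richer models than the plain Binomial.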
■ τ = 0:
■ 0 < τ < 1:
■ τ < 0:
              iterations    loglik  chi.sq  gradient      p1
GP 0.5                 3  -405.256  94.602     0.000  0.4594
NegBin 0.5             3  -405.195  94.411     0.000  0.4604
BetaBin 0.5           19  -402.135  80.779     0.000  0.8230
BetaPois 0.5          30  -408.170  80.811     0.000  0.9097
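The comparison above can be read more easily once it is held in a data frame. The R sketch below re-enters the printed values by hand and ranks the four distributions by log-likelihood. The data frame and column names are hypothetical; the value 0.5 printed next to each model name is left out because its meaning is not stated in the extracted text. Reading `p1` as the estimated reporting probability is an assumption, supported only by the fact that the BetaPois value 0.9097 reappears under "Reporting Probabilities" in the model summary.

```r
# Values copied from the comparison table; object names are hypothetical.
fits <- data.frame(
  model      = c("GP", "NegBin", "BetaBin", "BetaPois"),
  iterations = c(3, 3, 19, 30),
  loglik     = c(-405.256, -405.195, -402.135, -408.170),
  chi.sq     = c(94.602, 94.411, 80.779, 80.811),
  p1         = c(0.4594, 0.4604, 0.8230, 0.9097),
  stringsAsFactors = FALSE
)

# Rank models from highest to lowest log-likelihood
ranked <- fits[order(-fits$loglik), ]
best   <- ranked$model[1]

# Spread of p1 across the four fitted distributions
p1_range <- range(fits$p1)

ranked
```

The log-likelihoods differ by only about six units across the four fits, while `p1` ranges from roughly 0.46 to 0.91, so the choice of distribution matters far more for the estimated reporting probability than the fit statistics alone suggest.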
Distribution: BetaPois
Formula: y ~ beta01 + T.cos1 + T.sin1 - 1

         Estimate  Std. Error   t value  Pr(>|t|)
alpha1      2.309       0.289     8.002
beta01      5.071       0.025   204.435
T.cos1      0.060       0.016     3.673
T.sin1                  0.016

Theta 11.231

measures
loglik
chi.sq         80.811
df.residual    92.000
aic           824.341
bic           834.599

Reporting Probabilities:
          lower  estimate   upper
alpha1   0.8622    0.9097  0.9571
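The link between the coefficient `alpha1` and the reporting probability is not spelled out in the output, so the following R sketch checks one plausible reading. It assumes a logit link for the reporting probability, a delta-method standard error, and an interval of plus or minus two standard errors. It also assumes four counted parameters and takes the log-likelihood -408.170 from the BetaPois row of the model comparison. None of these choices is confirmed by the source; they are labelled assumptions that happen to reproduce the printed figures to about three decimals.

```r
# Printed values from the summary
alpha_hat <- 2.309   # estimate of alpha1
alpha_se  <- 0.289   # standard error of alpha1

# Assumption: logit link, p = exp(alpha) / (1 + exp(alpha))
p_hat <- plogis(alpha_hat)

# Assumption: delta method, se(p) ~ p * (1 - p) * se(alpha)
p_se <- p_hat * (1 - p_hat) * alpha_se

# Assumption: interval is estimate +/- 2 standard errors
ci <- p_hat + c(-2, 2) * p_se
round(c(lower = ci[1], estimate = p_hat, upper = ci[2]), 4)

# Information criteria under the assumed parameter count
loglik <- -408.170   # BetaPois row of the comparison table
k      <- 4          # assumed number of counted parameters
n_obs  <- 92 + k     # df.residual plus k
aic    <- -2 * loglik + 2 * k
bic    <- -2 * loglik + k * log(n_obs)
round(c(aic = aic, bic = bic), 3)
```

If these assumptions hold, the analysis used 96 observations, and `Theta` is not counted among the parameters entering `aic` and `bic`.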
[Figure: Distribution=BetaPois, p0=0.3. Mean function from Quasi-Poisson model (blue) and BetaPois model (red); y-axis: Frequency.]
[Figure: Distribution=BetaPois, p0=0.3. Lambda function from BetaPois model (red); y-axis: Frequency.]
[Figure: Pearson residuals. Histogram of residuals and N(0,1) density (y-axis: Density); autocorrelation of the Pearson residuals (x-axis: Lag, y-axis: ACF).]
■ Time range: 2004 - 2007
■ weekly counts
■ 132 regions
■ different crime categories
[Figure: two panels, Shop Lifting and Bicycle Theft; x-axis: Week, y-axis: Frequency.]
[Figure: the Shop Lifting and Bicycle Theft panels again (x-axis: Week, y-axis: Frequency), with a wider Frequency range in the Bicycle Theft panel.]
■ Great variety of models
■ MLE based implementation in R
■ Good performance for simulated data
■ Reasonable estimates for real data
■ Implement Conditional Poisson models
■ Non-nested Testing for more than two models