INTRO CENSUS CAPTURE-RECAPTURE ASSUMPTIONS USUAL RESIDENTS OTHER METHODS AND PRACTICES
Capture-recapture methodology for estimating hard-to-reach populations: Estimating the number of usual residents in the Netherlands 2010
S.C.
SUSANNA GERRITSE
Bachelor Clinical Psychology (University of Amsterdam)
Research Master Psychology, major in Methodology (University of Amsterdam)
PhD candidate (Utrecht University, in collaboration with Statistics Netherlands): "Capture-recapture methodology on register data for quality report for the Census"
Extracurricular: PhD Network Netherlands
PHD PROJECT
Supervisors:
- Prof. Dr. Bart F. M. Bakker: Register and Administrative data
- Prof. Dr. Peter G. M. van der Heijden: Statistics
My PhD: Capture-recapture methodology
- Sensitivity analyses on the assumptions of the methodology
- Application to the quality report of Census 2011
GOAL
What is the number of usual residents in the Netherlands?
- Usual resident: residing in the Netherlands for longer than 12 months
Why is this important to this Summer school?
- Hard-to-reach subpopulations
Solution: Census
Quality of the Census?
CENSUS
Eurostat: every ten years each European country has to conduct a Census
- Conducted by the National Statistical Institutes (NSIs)
- Complete overview of the population
- Provides information to develop policies, plan and run public services, and allocate funding
Why do we need this?
- How many homeless people are there? → Do we need more shelters?
- How many people of different ethnicities do we have? → Evaluating equal opportunities policies
- How many people live in the capital city? → Do we need to build more (student) housing?
CENSUS
How can we conduct a Census?
Traditional Census: enumerate every individual
- Either internet form or door-to-door poll officers
- Advantages: enumerates almost everyone
- Disadvantages: costly, a burden on the resident, non-response
- Countries: UK, Ireland, Portugal, Russia, Greece, etc.
Rolling Census: Traditional Census spread out
- Enumerating different characteristics over multiple years
- Advantages: higher frequency of Census data
- Disadvantages: costs, more burden
- Countries: France, US
CENSUS
Register-based, administrative Census: one or more registers
- Most important: Population register
- Advantages: lower costs, data already exist, no burden on the residents, higher frequency
- Disadvantages: data are collected for other reasons
- Countries: Finland, Norway, Sweden
Integrated Census: administrative data and enumeration
- Combination of administrative data and additional surveys
- All the advantages of admin data, plus extra information via survey
- Countries: Netherlands, Belgium, Germany, Poland, Spain, Italy, Israel
HARD-TO-REACH-POPULATIONS
Dutch Census:
- Since 1995: automated population registration system
- Most important: Population Register (PR), based on data from municipalities (municipalities own the data)
The Population Register is incomplete → Undercoverage
- Free movement and residence within the EU
- Illegal immigrants
Undercoverage PR
- What is the quality of the PR?
- How can we estimate the hard-to-reach population? → Capture-recapture estimation
Note: Overcoverage.
OVERVIEW
Overview:
- Capture-recapture methodology
- Outcomes of my PhD
- Other methods to estimate hard-to-reach populations, with examples in Europe
CAPTURE-RECAPTURE
Population size estimation technique
Originated from animal population estimation
The Tundra Vole: in the 'Turfzakken' polder, in 2009:
- On 10 different locations, in total 20 cages.
- Captured voles were noted and information was collected.
- Some hair on their back was removed as a tag.
- A week later the process was repeated.
Three counts:
- Voles caught in the first sample
- Voles caught in the second sample
- Voles caught in both samples
CAPTURE-RECAPTURE
Human estimation?
- Already existing administrative data
- Linking two (incomplete) registers
  - PR: Dutch Population Register, containing the registered population
  - CSR: Crime Suspects Register, the police register on suspects of known offenses

Table: Expected values

            CSR = 1   CSR = 0
PR = 1      m11       m10
PR = 0      m01       m00
INDEPENDENCE
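Under independence
Loglinear model: $\log m_{ij} = \lambda + \lambda^A_i + \lambda^B_j$, where $\lambda^A_0 = \lambda^B_0 = 0$.
Odds ratio: $\frac{m_{00} m_{11}}{m_{10} m_{01}} = 1$
Two ways to estimate $m_{00}$:
- Poisson loglinear regression: $\hat{m}_{00} = \exp(\hat{\lambda})$
- Maximum likelihood estimate (mle): $\hat{m}_{00} = \frac{\hat{m}_{10} \hat{m}_{01}}{\hat{m}_{11}} = \frac{n_{10} n_{01}}{n_{11}}$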
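To see why the two routes agree, here is a minimal Python sketch. It assumes that level 0 (not in the register) is the reference category, so that the intercept is the log of the unobserved cell. With three observed cells and three parameters the independence model fits the observed counts exactly, so the parameters can be solved by hand instead of running a Poisson regression. The function name is hypothetical.

```python
import math


def fit_independence_loglinear(n11, n10, n01):
    """Solve log m_ij = lam + lamA_i + lamB_j on the three observed cells.

    Assumes reference level 0 (lamA_0 = lamB_0 = 0), so exp(lam) is the
    estimate for the unobserved cell m00.
    """
    lam = math.log(n10) + math.log(n01) - math.log(n11)
    lam_a1 = math.log(n11) - math.log(n01)  # effect of being in register A
    lam_b1 = math.log(n11) - math.log(n10)  # effect of being in register B
    return lam, lam_a1, lam_b1


# Illustrative counts: 2 in both lists, 6 only in A, 6 only in B.
lam, lam_a1, lam_b1 = fit_independence_loglinear(n11=2, n10=6, n01=6)
m00_loglinear = math.exp(lam)   # intercept route
m00_closed_form = 6 * 6 / 2     # n10 * n01 / n11
print(round(m00_loglinear, 6), m00_closed_form)
```

Both routes give the same value because the intercept equals log n10 + log n01 - log n11.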
ESTIMATING THE VOLES
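Table: Population voles

                 Sample 1 = 1   Sample 1 = 0
Sample 2 = 1     2              6
Sample 2 = 0     6              ?

Odds ratio: $\frac{m_{11} m_{00}}{m_{10} m_{01}} = 1$, so $\hat{m}_{00} = \frac{\hat{m}_{10} \hat{m}_{01}}{\hat{m}_{11}} = \frac{n_{10} n_{01}}{n_{11}}$.
$\hat{m}_{00} = \frac{6 \cdot 6}{2} = 18$ → estimated population: $6 + 6 + 2 + 18 = 32$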
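The vole arithmetic can be checked with a short Python sketch. It also shows the equivalent Lincoln-Petersen form, in which the total is the product of the two sample sizes divided by the number of recaptures. The function names are hypothetical and the counts are the ones from the vole table.

```python
def dual_system_estimate(n11, n10, n01):
    """Return (estimated missed cell, estimated total) under independence."""
    m00 = n10 * n01 / n11
    return m00, n11 + n10 + n01 + m00


def lincoln_petersen(first_sample, second_sample, recaptured):
    """Total population estimate from the two sample sizes and the overlap."""
    return first_sample * second_sample / recaptured


missed, total = dual_system_estimate(n11=2, n10=6, n01=6)
print(missed, total)                 # 18.0 32.0
print(lincoln_petersen(2 + 6, 2 + 6, 2))  # 32.0
```

Each sample contains 8 voles (2 caught twice plus 6 caught once), so both forms give the same total.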
ASSUMPTIONS
Assumptions
- Independence between registers
- Perfect linkage of individuals in registers
- No erroneous captures
- Homogeneous inclusion probabilities
- Closed population
Two issues:
- How do you meet the assumptions?
- What is the effect of a violation on the population size estimate?
Literature? Sensitivity analyses
DATA
Compare two different nationality groups.
- Afghan, Iraqi, Iranian: need a visa for the Netherlands
- Polish: EU → free movement and residence

Table: Observed values and estimate $\hat{m}_{00}$

(a) Population of people with an Afghan, Iraqi and Iranian (AII) nationality residing in the Netherlands in 2007

            CSR = 1   CSR = 0
PR = 1      1,085     26,254
PR = 0      255       6,170.3

(b) Population of people with a Polish nationality residing in the Netherlands in 2009

            CSR = 1   CSR = 0
PR = 1      374       39,488
PR = 0      1,445     152,567.3
INDEPENDENCE
PERFECT LINKAGE
ERRONEOUS CAPTURES
IMPLIED COVERAGE
Why is there such a difference between the nationality groups? Implied coverage

Table: Observed counts and mle $\hat{m}_{00}$

(a) Afghan, Iraqi, Iranian nationality in the Netherlands, 2007

                 HKS (CSR) = 1   HKS (CSR) = 0
GBA (PR) = 1     1,085           26,254
GBA (PR) = 0     255             6,170.3

(b) Polish nationality in the Netherlands, 2009

                 HKS (CSR) = 1   HKS (CSR) = 0
GBA (PR) = 1     374             39,488
GBA (PR) = 0     1,445           152,567.3
Overlap between the registers
Implied coverage: coverage of the PR, given the CSR.
- Low implied coverage → High estimate → Not robust
- High implied coverage → Low estimate → Robust
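As an illustration, the implied coverage for both nationality groups can be computed from the observed counts. This sketch assumes that "coverage of the PR, given the CSR" means the share of people in the CSR who are also found in the PR, that is n11 / (n11 + n01). The function names are hypothetical.

```python
def implied_coverage(n11, n01):
    """Share of the second register (CSR) that is also in the first (PR)."""
    return n11 / (n11 + n01)


def missed_cell(n11, n10, n01):
    """Estimate of the cell missed by both registers, under independence."""
    return n10 * n01 / n11


# Observed counts from the tables: (n11, n10, n01)
groups = {
    "AII 2007": (1085, 26254, 255),
    "Polish 2009": (374, 39488, 1445),
}

for name, (n11, n10, n01) in groups.items():
    cov = implied_coverage(n11, n01)
    m00 = missed_cell(n11, n10, n01)
    print(f"{name}: implied coverage {cov:.3f}, estimated missed {m00:,.1f}")
```

Under this definition the implied coverage is about 0.81 for the AII group and about 0.21 for the Polish group, which goes together with the much larger estimate of the missed cell for the Polish group.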
METHOD
What is the number of usual residents in the Netherlands?
We used three registers:
- Population Register - PR
- Crime Suspects Register (police) - CSR
- Employee Register - ER
Important: length of stay
- PR: difference between Census date and date of registration
  - However: present before date of registration?
  - Assumption: everyone registered in the PR has the intention to stay for 12 months
- ER: job lengths
  - Assumption: residence in NL during the time of the job
  - Unemployed between jobs?
- CSR: ?
  - Impute.
UNDERCOVERAGE PR
- Three registers (PR, CSR, ER)
- Multiple covariates (sex, age, usual residence, and nationality group)
- Deterministic and probabilistic linkage
- Delete erroneous captures
- PR and ER: Census date
- CSR: period of half a year
Method:
- Impute missing values: Predictive Mean Matching (PMM) multiple imputation
- Generalized Loglinear Modeling (GLM), Poisson distribution
UNDERCOVERAGE PR
Number of Usual Residents
- 151 thousand missed by all registers
- 33,000 people in ER and CSR residing for longer than 12 months
- Total not registered in the PR: 184 thousand (151 thousand + 33 thousand)
- Confidence interval: 149 to 222 thousand
- Registered in the PR: 16,638,805
  - Assumption: everyone in the PR has the intention to stay for longer than 12 months
Total Usual Residents: 16,822 thousand, of which 0.5 to 1.1% not registered in the PR
CAPTURE-RECAPTURE METHODOLOGY
Goes by many names and many versions:
- Mark-Recapture
- Dual (/triple) system estimation
At least 2 data sources (registers, surveys, lists, etc.)
Use of covariates is very useful
- Loglinear modelling (Census vs Post Enumeration Survey (PES) at the ONS, UK)
- Item Response Theory (IRT; Census enumeration Italy, IStat)
- Rasch models (on diabetes data by Fienberg 1999, USA)
- Bayesian modelling (Quality assessment for the Census in New Zealand)
HOMELESS PEOPLE NETHERLANDS
Capture-recapture methodology
Multiple registers:
- PR
- People without address who receive social support
- Dutch register on alcohol and drug users (LADIS)
Method and results:
- Capture-recapture methodology
- Observed in all registers: 5,169
- Estimated not observed: 12,598
- Total: 17.5 thousand homeless
- CI: 15,000 - 21,000
HOMELESS PEOPLE NETHERLANDS
Repeated in 2012: 27,000
- Half of them have a foreign background and 40 percent have a non-western background.
- Nearly half of all homeless are found in Amsterdam, Utrecht, The Hague and Rotterdam.
https://www.cbs.nl/en-gb/news/2013/52/27-thousand-homeless-in-the-netherlands
HIV PREVALENCE
HIV in children, 2003-2006.
Capture-recapture methodology
- Three-source capture-recapture estimation; sources include:
  - Mandatory HIV case reporting (DOVIH)
  - Laboratory-based surveillance of HIV (LaboVIH)
- Multiple imputation for missing values of a heterogeneous catchability variable
- Loglinear modeling
Results:
- Registered: 216 new HIV diagnoses. 117 were estimated.
- The number of new HIV diagnoses in children was estimated at 387 (95% CI 271-503) during 2003-2006, among whom 60% were born abroad
HIV PREVALENCE
Heraud, et al., 2012
MULTIPLIER METHOD
Method:
- At least two data sources: a comprehensive register and a survey
- Use the distribution of people in the survey on the registers
- Multiplier methods are user-friendly because of their mathematical simplicity and the absence of linkage, and are straightforward to use
Example: number of Polish individuals in the NL.
- Assume everyone has an equal chance of going to a hospital
- We go to a hospital, address every Polish person and ask if they are in the PR
MULTIPLIER METHOD
Table: Artificial observed data for the Polish people in the hospital

            Hospital = 1   Hospital = 0   Total
GBA = 1     150            39,850         40,000
GBA = 0     50             -              -
Total       200            -              -

Example:
- There are 200 Polish people in the hospital, of which 150 are in the GBA. Thus p(GBA | Hospital) = 0.75
- In total 40,000 Polish people are registered in the PR
- Then 40,000 / 0.75 = 53,333, and we missed 53,333 − 40,000 = 13,333 people who are not registered in the PR.
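The multiplier calculation can be written as a small Python sketch. It uses the artificial hospital numbers from the example, and the function name and arguments are hypothetical. It relies on the same assumption as the example: the share of registered people in the survey equals the share in the whole population.

```python
def multiplier_estimate(register_count, surveyed, surveyed_in_register):
    """Estimate the population size and the number missed by the register.

    register_count: people in the comprehensive register (here the PR/GBA)
    surveyed: people reached by the survey (here the hospital)
    surveyed_in_register: surveyed people who say they are in the register
    """
    p_registered = surveyed_in_register / surveyed
    total = register_count / p_registered
    return p_registered, total, total - register_count


p, total, missed = multiplier_estimate(
    register_count=40_000, surveyed=200, surveyed_in_register=150
)
print(p, round(total), round(missed))  # 0.75 53333 13333
```

No record linkage is needed: only the register total and the survey proportion enter the estimate.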
DRUG USERS
Multiplier method for hard drug users
- Sample: 572 problematic drug users were interviewed in 8 Dutch cities.
- These individuals were asked whether they were registered in LADIS, a Dutch registry on drug users
- Multiplier based on the number of interviewed users who were in LADIS
- Different multiplier for sex, age and drug used.
- Opiates: 17,700 (CI: 17,300 - 18,100)
- Crack: 12,400 (CI: 10,300 - 15,600)
Cruts et al., 2010
CASEFINDING
Method
- Casefinding is a system for locating every patient, inpatient or outpatient, who is diagnosed and/or treated with a reportable diagnosis.
- Casefinding is like casting a net far and wide to "capture" all of the reportable cancer cases.
- A combination of active and passive casefinding is a commonly used system in registries today.
- Active: when researchers or users investigate all source documents for possible matches
- Passive: when cases are brought to the attention of the registry holder.
HOSPITAL REGISTRY
National Cancer Institute (USA) http://training.seer.cancer.gov/
SURVIVAL ANALYSES
Survival analyses: a very broad methodology
- Data are used on the time period until a specific event occurs.
- Nonparametric method: Kaplan-Meier analysis
- Log-rank statistical test (Mantel-Haenszel), comparing two or more generations in time
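For readers new to Kaplan-Meier, the following Python sketch shows the product-limit calculation on a tiny made-up data set. The follow-up times and event indicators are hypothetical and are not taken from any study mentioned here. At each event time the survival estimate is multiplied by one minus the share of subjects at risk who experience the event.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with at least one event.

    times: follow-up time per subject
    events: 1 if the event was observed at that time, 0 if censored
    """
    event_counts = Counter(t for t, e in zip(times, events) if e == 1)
    exit_counts = Counter(times)  # events and censorings leaving the risk set
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exit_counts):
        d = event_counts.get(t, 0)
        if d > 0:
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exit_counts[t]
    return curve


# Hypothetical toy data: five subjects, two of them censored.
toy_times = [1, 2, 2, 3, 4]
toy_events = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(toy_times, toy_events):
    print(t, round(s, 3))
```

Censored subjects stay in the risk set until their own follow-up time and then leave without lowering the survival estimate.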
VACCINATION
Vaccination coverage in 3 regions in Slovenia
- Population: hard-to-reach Roma children
- The data were obtained from health records, immunization records and the National Computerized Immunization System (CEPI 2000®)
- Total of 436 preschool children and 551 schoolchildren
- Vaccination coverage was calculated by comparing the number of children eligible for immunization with the number of vaccinated children.
- The log-rank test compared the survival curves for the two generations.
SURVIVAL ANALYSIS
Kraigher, et al. 2006.
OTHER EXAMPLES
Other methods:
- Snowballing
- Respondent-driven sampling
- Weighting
- Cumulative (Bakker, 2009)
Which to choose? Depends on your data and resources