Stats NZ Census Session: Introduction – How the new 2018 collection model worked in practice (PowerPoint presentation)



SLIDE 1

Stats NZ Census Session

Introduction

SLIDE 2

How the new 2018 collection model worked in practice

SLIDE 3


PANZ Conference 3

SLIDE 4

2018 Census


2018 – A new model built on international experience and testing

SLIDE 5

Four-phase collection model

Census Day

SLIDE 6

Prepare: Creating the Dwelling Frame

  • A list of addresses for all dwellings
  • The basis for counting dwellings and finding the people in them
  • Sources of dwelling information:
    • 2013 Census addresses
    • Building consents and LINZ data
    • NZ Post address list
  • Canvassing: field check before census
  • Very close to our estimate of dwellings
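The source-merging described above can be sketched as de-duplication over a normalised address key. A minimal sketch, with hypothetical helper names and a stand-in normalisation rule, not the actual Stats NZ process:

```python
# Illustrative sketch: merge address sources into a dwelling frame,
# de-duplicating on a crudely normalised address key.
def normalise(addr: str) -> str:
    """Stand-in address normalisation: lowercase, collapse whitespace."""
    return " ".join(addr.lower().split())

def build_dwelling_frame(*sources):
    frame = {}
    for source in sources:
        for addr in source:
            # First occurrence of an address wins; later duplicates dropped.
            frame.setdefault(normalise(addr), addr)
    return list(frame.values())

census_2013 = ["12 Aroha St", "3 Rata Rd"]
consents_linz = ["3 Rata  Rd", "77 Miro Ave"]   # duplicate, extra space
nz_post = ["77 miro ave", "9 Kowhai Cres"]      # duplicate, different case
frame = build_dwelling_frame(census_2013, consents_linz, nz_post)
print(len(frame))  # 4 unique dwellings
```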
SLIDE 7

Prepare: Early Engagement

  • Pre-engagement
  • Community Engagement
  • Assisted Completion Events
SLIDE 8

Te Reo in the 2018 Census

  • All census letters had bilingual messaging
  • Respondents completing online could choose to complete in English or Māori (and toggle at any time)
  • 3,000 households in Full Enumeration Bilingual areas
  • 80,000 households in Delivery with Contact areas
    – Enabled with a call-to-action letter and then offered a bilingual visit pack
    – Northland, Rotorua, Hastings, East Cape, Otaki, Chatham Islands

SLIDE 9

Help and support

  • 0800 CENSUS
    • Ran from mid-February to mid-May
    • Handled requests for paper forms and access codes
    • Answered census questions
    • Language options: English, Te Reo, Samoan, Cantonese, Mandarin, Korean, Tongan and Hindi
    • Automated English and Te Reo options
  • Social media channels
SLIDE 10

Internet Collection System

  • Unique access code for each dwelling address
  • Built for desktop, tablet and phone
  • Māori and English
  • Questions matched paper forms
  • Smart routing
  • Reuse of respondent-entered data
  • As-you-type capability
  • A household summary page to list occupants
  • Feedback was generally positive
  • No significant outages during operation
SLIDE 11

Household Summary Form

SLIDE 12

Delivery approaches

[Chart: delivery approach split – 83% / 17%]

SLIDE 13

[Timeline: census letter delivered to your letterbox or doorstep; Delivery and Early Visit phases lead up to Census Day, followed by the Remind and Visit phases]

SLIDE 14

Targeted Strategies – a face-to-face approach

  • Targeted People or Area Strategies
  • Delivery with attempted contact
  • Delivery with attempted contact bilingual
  • Early Visit
  • Full enumeration bilingual
  • Homeless strategy
  • Remote rural
  • Remote islands
  • Freedom campers, Marinas, Cruise Ships
SLIDE 15

Remind 6th – 16th March

  • Reminder letters were sent to most non-responding mailable dwellings, arriving on 9th and 12th March
  • Early visit teams began delivering paper
  • Non-private dwelling teams collected materials
  • Community engagement teams continued promoting the census

SLIDE 16

Non-Response Follow-Up: 16 March – 30 April

  • A knock at the door and delivery of paper forms
  • Return visits if no response
  • Anticipated 70 percent response by this time
  • Sub-national variability
SLIDE 17

Field Staff

  • Field staff had tablets
    • Online training and support
    • Workloads on tablets each day
    • Lists of addresses and maps
  • Dynamic Workload Allocation Tool (WCAT)
    • Developed in partnership with Auckland University
    • Packaged addresses into workloads
    • Efficiency and flexibility
    • As responses were received, non-response workloads were updated
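The dynamic allocation idea above can be sketched roughly as follows. The function and data names are hypothetical, and the real tool is certainly far more sophisticated about geography and travel; this shows only the core idea of repackaging outstanding addresses as responses come in:

```python
# Hypothetical sketch of dynamic workload allocation: package the
# addresses still needing a visit into fixed-size workloads, and
# rebuild the packages as responses are receipted.
def build_workloads(addresses, responded, size=30):
    outstanding = [a for a in addresses if a not in responded]
    # Chunk the outstanding addresses into workloads of `size`.
    return [outstanding[i:i + size] for i in range(0, len(outstanding), size)]

addresses = [f"addr-{n}" for n in range(1, 8)]
# After two responses are receipted, only five addresses need a visit.
workloads = build_workloads(addresses, responded={"addr-2", "addr-5"}, size=3)
print(workloads)  # [['addr-1', 'addr-3', 'addr-4'], ['addr-6', 'addr-7']]
```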

SLIDE 18

Recruitment and contracts

  • Specialist recruitment company
    • Expertise in performance management, payroll, and health and safety
  • Phase-based employment for canvassing, delivery, remind, visit
  • 30-hour contracts
  • Living wage pay rates
  • Most interaction with non-respondents
SLIDE 19

Recruitment challenges

  • We struggled to get enough staff in some places
  • Using a third party may have disconnected us from our traditional staff base
  • Difficult to accommodate a census role around another job
  • Lost some skilled census people, especially in remote and rural communities
  • The overhead of technology made getting staff trained, provisioned and active slow

SLIDE 20

SLIDE 21

When response wasn’t tracking well

  • Sent additional reminder letters (reminders 3 and 4)
  • Mailed paper forms
  • Extra advertising
  • Extended field hours and staff numbers
  • Extended engagement
  • Flying squads – redeploying staff
  • Assisted Completion events in low-response areas

SLIDE 22

Challenges of the new model

We made it hard for people who wanted or needed paper

  • The public had to phone and ask for it
  • It took a long time for us to deliver paper if they hadn’t requested it
    – The Remind phase was letter-based
    – Delivery of paper didn’t start until 10 days after Census Day in most areas

We didn’t get back to visit early enough in low-responding areas

  • Recruitment challenges in some areas
  • Higher than anticipated non-response in some areas
  • The later we visited, the less effective it was
  • A smaller field team limited our ability to respond to lower-than-expected response

SLIDE 23

Challenges of the new model


In some cases we visited dwellings that had already responded

  • Timeliness and accuracy of receipting responses varied
  • Dwelling frame addresses didn’t always match respondents’ self-described addresses, which sometimes caused duplication and confusion
  • Shared mailboxes and rigid document IDs caused pain

The communication campaign was effective for main messages

  • Awareness of the census didn’t always translate to participation
  • Hard to communicate with respondents in small area-based collection strategies

SLIDE 24

How people are counted in the 2018 Census

Use of administrative data to count people who were missed by census field collection

SLIDE 25

Outline

  • Introduction
  • How we decide which admin records should be included
  • Assessment

SLIDE 26

Census-taking is evolving

Wide variation in methods across countries recognised by the UN Statistical Commission (the peak body for official statistics)

A ‘full field enumeration’ (traditional) census asks everyone to fill in forms.

  • This is what we set out to do in 2018

A ‘register-based’ census uses only admin sources. A ‘combined’ census uses a mix of admin sources and field collection.

  • This is what we have produced for 2018

Census aims: Count everyone once, only once, and in the right place

SLIDE 27

Coverage: How many people should census count?

[Diagram: the aim is to count all NZ residents in NZ on census night; census form responses are what was achieved, and the difference is non-respondents. Net census undercount is estimated by the Post-Enumeration Survey (PES); interim estimates for 2018 use Dual System Estimation (DSE)]
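As general background on the estimation referenced above: the standard dual-system (Lincoln–Petersen) estimator combines the census count with an independent survey such as the PES. This is the textbook form, not necessarily the exact Stats NZ estimator:

```latex
\hat{N} = \frac{n_{1}\, n_{2}}{m}
```

where \(n_{1}\) is the number of people counted by the census, \(n_{2}\) the number counted by the PES, and \(m\) the number matched in both; the net undercount is then \(\hat{N}\) minus the census count.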

SLIDE 28

How people are counted in the census

[Chart: 2018 Census – 89% of people counted from census forms and 11% from admin sources (all real people); approx 1.2% missed. 2013 Census – 95% real people from census forms and 5% imputed via unit imputation; 2.4% +/- 0.5% missed (PES measure)]

SLIDE 29

2018 Census counts – individuals (June 2019)

Source                                                Number      Percent of census file
Early approximation of 2018 census usual
resident population (1) [Aim]                         4,760,000
Total counted from census forms                       4,175,000   89%
  • Individual forms received                         3,972,000   85%
  • Individuals listed on household form                203,000   4%
Admin sources [Achieved]                                525,000   11%
Census usually resident population count              4,700,000   100%
Indicative coverage gap                                  59,000   1.2%

(1) April 2019. Revised 2013-base ERP, using new 12/16 external migration measure.
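As a quick sanity check, the rounded figures above are internally consistent (note the 59,000 indicative gap is published from unrounded data; the rounded components give 60,000):

```python
# Check the rounded 2018 Census count components against the totals.
forms_total = 4_175_000    # counted from census forms (89%)
admin_sources = 525_000    # admin enumerations (11%)
census_count = forms_total + admin_sources
print(census_count)        # 4700000, the usually resident population count

target = 4_760_000         # early approximation of the usual resident population
gap = 59_000               # published indicative coverage gap (unrounded basis)
print(round(gap / target * 100, 1))  # 1.2 (percent)
```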

SLIDE 30

How we use admin data to add people to the census dataset

SLIDE 31

2018 Census: Separate data sources

[Diagram: census forms (individual forms and the household listing) contribute people, core characteristics (age, sex, place) and other variables; the admin resident population separately contributes people, age, sex, place and other variables]
SLIDE 32

Combining census forms and admin data

[Diagram: overlapping admin population and census forms] Some of these admin records are added to the census file to count people who were missed.

SLIDE 33

Admin NZ resident population – the IDI-ERP

Estimate of the NZ resident population using linked admin data

Begin with the IDI spine (‘ever-resident’)

  • Include all individuals with activity in admin data sources (tax, health, education, ACC) within the previous two years
  • Remove individuals
    • who died before the reference date
    • who migrated overseas before the reference date
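As a rough sketch, the construction above amounts to set operations over linked person identifiers. All names, dates and data here are illustrative, not the actual IDI tables or linkage process:

```python
# Illustrative IDI-ERP construction: keep spine members with recent
# admin activity, then remove known deaths and overseas migrations.
from datetime import date

REFERENCE_DATE = date(2018, 3, 6)  # 2018 Census day

def idi_erp(spine, activity, deaths, departures, window_years=2):
    cutoff = date(REFERENCE_DATE.year - window_years,
                  REFERENCE_DATE.month, REFERENCE_DATE.day)
    # Activity in tax/health/education/ACC within the previous two years.
    active = {pid for pid, when in activity if pid in spine and when >= cutoff}
    # Drop people who died or migrated overseas before the reference date.
    gone = ({pid for pid, when in deaths if when < REFERENCE_DATE} |
            {pid for pid, when in departures if when < REFERENCE_DATE})
    return active - gone

spine = {1, 2, 3, 4}
activity = [(1, date(2017, 6, 1)), (2, date(2015, 1, 1)),
            (3, date(2018, 1, 1)), (4, date(2017, 9, 9))]
deaths = [(3, date(2018, 2, 1))]
departures = [(4, date(2016, 5, 1))]
print(idi_erp(spine, activity, deaths, departures))  # {1}
```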
SLIDE 34

Quality of IDI-ERP admin population: strengths

The IDI-ERP is a good approximation of the NZ resident population. Detailed examination of the time series 2006 to 2016: age/sex (2016), geographies (2017), ethnicity (2018).

See "Experimental population estimates from linked administrative data: methods and results" on the Stats NZ website (data series and methods papers).

SLIDE 35

Quality of IDI-ERP admin population: Limitations

But it does not meet all the accuracy requirements for producing official statistics. Key limitations:

  • includes some under-coverage and over-coverage (not well quantified)
  • some marked differences in the age/sex structure for younger adults, especially males
  • geographic location from the admin data is good for larger geographies such as TALBs, but accuracy decreases at smaller geographies
  • admin households are problematic – only around half of the admin households have the same household membership as the census

Statistical methods are applied to allow for these limitations.

SLIDE 36

Admin enumerations framework

  1. Add those admin people to the census file [i.e. enumerate] who should be counted as part of the NZ census, but who we don’t have a response for
  2. Put them in private dwellings where we have good evidence for improving households
  3. Otherwise, include them in a meshblock when we are sure the person should be counted, and we have good evidence for improving small area information

SLIDE 37

Step 1: the eligible admin population

Admin people who should be counted as part of the NZ census, but who we don’t have a response for.

  • The admin NZ resident population (IDI-ERP) tells us who should be counted
  • Remove residents temporarily overseas from the IDI-ERP
    • through links to border movements data
  • Link the 2018 Census to the IDI spine so we know who has already responded
    • Match rate: 97.7%
    • Estimated missed links: 1.4%
    • Estimated incorrect links: <1%

[Diagram: admin population linked to census forms]
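In set terms, Step 1 reduces to the following. The names are hypothetical, and the real process works through probabilistic record linkage rather than exact identifiers:

```python
# Sketch of the eligible admin population: admin residents, minus those
# temporarily overseas on census night, minus people already linked to
# a census response.
def eligible_admin_population(idi_erp, overseas, linked_responders):
    return (idi_erp - overseas) - linked_responders

idi = {"a", "b", "c", "d", "e"}
overseas = {"e"}               # identified via border movements data
responded = {"a", "b"}         # census forms linked to the IDI spine
print(sorted(eligible_admin_population(idi, overseas, responded)))  # ['c', 'd']
```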

SLIDE 38

Step 2: Admin people in households

Put them in private dwellings where we have good evidence for improving households. 162,000 admin people were added to dwellings.

  • Census dwelling frame used for in-scope non-responding private dwellings
  • A statistical model predicts which non-responding dwellings we can create good whole admin households for
  • Trade-off: strictly correct membership vs the same household type
  • Threshold: 50% chance or better of the same household type
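The 50% threshold rule can be sketched as a simple cut on a model score. The scores here are invented; the real input is a statistical prediction of household-type agreement:

```python
# Sketch of the acceptance rule: use an admin household only when the
# model's predicted probability of matching the true household type is
# 50% or better.
THRESHOLD = 0.5

def accept(score: float) -> bool:
    return score >= THRESHOLD

candidate_dwellings = {"d1": 0.82, "d2": 0.47, "d3": 0.50}
accepted = sorted(d for d, p in candidate_dwellings.items() if accept(p))
print(accepted)  # ['d1', 'd3']
```

The same kind of 0.5 cut appears again for meshblock placement later in the deck.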

SLIDE 39

Step 3: Admin people in meshblocks

Otherwise, include them in meshblocks when we are sure the person should be counted, and we have good evidence for improving small area information. We adjust for admin limitations using statistical methods developed as part of a Dual System Estimation (DSE) population benchmark using the census and the IDI-ERP.

  • Remove over-coverage in the IDI-ERP (119,000)
  • Account for missing linkages between the census and the IDI (48,000)
  • Use a statistical model to predict which people are more likely to have a correct meshblock

SLIDE 40

The meshblock model and threshold

Trade-off:

  • Improving national demographic distributions

Versus

  • Even coverage patterns for small geographies

Threshold = 0.5, i.e. a 50% chance or better of being in the correct meshblock. 357,000 admin people were added to meshblocks; 68,000 were not included, mostly young adults.

SLIDE 41

Assessment

SLIDE 42

2018 Census counts and DSE population

[Chart: males by age – 2018 Census counts vs DSE population]

SLIDE 43

2018 Census ethnic group counts and DSE population

[Charts: Māori and Asian ethnic group counts by age – 2018 Census vs DSE population]

SLIDE 44

Census net coverage gap: 2018 and 2013 - Males

[Chart: 2013 Census counts vs 2013 ERP, and 2018 Census counts vs 2018 DSE (approximate)]

SLIDE 45

Census net coverage gap 2018 and 2013: TALB

[Chart: 2013 Census counts vs 2013 ERP, and 2018 Census counts vs 2018 DSE (approximate), by TALB]

Note: excludes Chatham Islands Territory

SLIDE 46

Summary 2018 Census population counts

We have a coherent statistical methodology for adding admin records to the census file when we don’t have a census form. Admin enumerations replace unit imputation – a significant quality improvement.

  • They are real people, with some characteristics from alternative sources
  • Admin data does include people who are hard to count in a census field enumeration

Stats NZ is now confident it has compiled a census dataset that will provide census usually resident population counts and electoral counts of acceptable quality.

SLIDE 47

Characteristics

SLIDE 48

Outline

  • Data sources for characteristics
  • Quality information

SLIDE 49

2018 Census dataset: Combined sources

The 2013 Census and admin data are the main sources of other variables; they also fill variable gaps in individual forms (IFs).

[Diagram: census forms provide people and core characteristics (age, sex, place); admin records add people and other variables]

SLIDE 50

The 2018 Census dataset

People counts come from: census forms and admin sources (real people).

Characteristics come from: census responses, historic census, admin sources, item imputation, or are ‘missing’.

SLIDE 51

Consistency of responses between 2013 and 2018 Census

Variable                                        Consistency between 2013 and 2018 Censuses
Country of birth                                0.99
Māori descent electoral (Yes, No)               0.99
Number of children ever born (aged 45+)         0.96
Māori descent census (Yes, No, Don’t know)      0.95
Languages spoken                                0.93
Ever-smoked                                     0.93
Regular smoker                                  0.93
Years since arrival in New Zealand              0.92
Ethnic group                                    0.90
Religious affiliation                           0.85
Highest secondary school qualification          0.85
Highest post-school level                       0.80

SLIDE 52

Quality of administrative variables

See Census Transformation programme research papers

SLIDE 53

Administrative data for 2018 Census variables

SLIDE 54

Summary 2018 Census methods

Alternative sources add real value where they are available:

  • adding people to the census dataset
  • providing information about people

A questionnaire is the only way to collect some census information. Strength comes from the combination of both census forms and admin data.

[Diagram: overlapping admin population and census forms]

SLIDE 55

Quality information for census variables

Where information for census variables comes from, and associated quality measures

SLIDE 56

Impacts on characteristic data (variables)

SLIDE 57

Priority one variables

  • Count of the population (final)
  • Count of dwellings (final)
  • Unoccupied dwellings
  • Meshblock location of each dwelling in NZ
  • Location of all respondents in NZ on census night, to meshblock level
  • Usual residence, to meshblock level, of all usually resident in NZ
  • Age of all respondents in NZ on census night
  • Sex of all respondents in NZ on census night
  • Ethnicity of all respondents in NZ on census night
  • Māori descent
SLIDE 58

Individual form source breakdown

SLIDE 59

Variables where the individual form is the only source

SLIDE 60

Impacts on iwi affiliation

  • Lower levels of participation from the Māori descent population have resulted in a significant proportion of iwi showing declines in affiliation for 2018
  • The ability to fill gaps is limited due to the lack of suitable iwi administrative data, and a classification change (2017 revised Iwi Statistical Standard)
  • DECISION: 2018 iwi counts will not be released as official statistics, but we will explore options for the provision of non-official iwi data

SLIDE 61

Data quality impacts due to missing information

Impacts on variables: families and households

  • Non-responding dwellings with no admin enumerations have no household
  • Admin enumerations in meshblocks are missing from households, so some households are incomplete

This is a key area of investigation for our evaluations team, and we are engaging with customers on this issue.

SLIDE 62

Quality Assurance and Assessment

SLIDE 63

Quality management strategy

SLIDE 64

Quality impacts on variables

The quality of census variables can be affected by:

  • Missing data where there is no alternative source and no statistical imputation
  • Quality of 2013 Census and admin values, and imputed values
  • Quality of received responses
SLIDE 65

Quality rating scale (QRS): 2013

2013 Quality Rating Scale metrics:

  • Non-response rate
  • Consistency with time series and other data sources
  • Data quality issues

Overall rating: Very High – High – Moderate – Poor – Very Poor

SLIDE 66

Quality rating scale (QRS): 2018

2018 Quality Rating Scale metrics:

  • Combined weighted score of census response, 2013 Census, admin source, imputed, and missing values
  • Data sources and coverage: 98–100 = Very High; 95–<98 = High; 90–<95 = Moderate; 75–<90 = Poor; <75 = Very Poor
  • Consistency and coherence: Very High / High / Moderate / Poor / Very Poor
  • Data quality: Very High / High / Moderate / Poor / Very Poor

Overall rating: Very High – High – Moderate – Poor – Very Poor
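The data-sources-and-coverage banding translates directly into a lookup, with the score expressed as a percentage:

```python
# Map a 2018 QRS 'data sources and coverage' score (0-100) to its band.
def coverage_rating(score: float) -> str:
    if score >= 98:
        return "Very High"
    if score >= 95:
        return "High"
    if score >= 90:
        return "Moderate"
    if score >= 75:
        return "Poor"
    return "Very Poor"

print(coverage_rating(97.3))  # High
print(coverage_rating(84.0))  # Poor
```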

SLIDE 67

2018 QRS: Data sources & coverage examples

Example 1 = ‘High’ data source and coverage rating:

Sources of data           Rating   Percent of total   Score contribution
Individual Form sourced   1.00     83%                0.830
Historic (2013 Census)    0.95     8%                 0.076
Admin data sourced        0.96     4%                 0.038
Imputation                0.57     5%                 0.029
Total                              100%               0.973

Example 2 = ‘Poor’ data source and coverage rating:

Sources of data           Rating   Percent of total   Score contribution
Individual Form sourced   1.00     84%                0.84
Missing/Non-response      0.00     16%                0.00
Total                              100%               0.84
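The score contributions in both examples are just rating × share, summed:

```python
# Reproduce the weighted-score arithmetic from the two QRS examples.
def weighted_score(sources):
    return sum(rating * share for rating, share in sources)

example_high = [(1.00, 0.83), (0.95, 0.08), (0.96, 0.04), (0.57, 0.05)]
example_poor = [(1.00, 0.84), (0.00, 0.16)]

print(round(weighted_score(example_high), 3))  # 0.973
print(round(weighted_score(example_poor), 2))  # 0.84
```

On the bands given earlier (98–100 Very High, 95–<98 High, 75–<90 Poor), 97.3 falls in ‘High’ and 84 in ‘Poor’.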

SLIDE 68

Quality rating scale (QRS): 2018

2018 Quality Rating Scale metrics:

  • Data sources and coverage: 98–100 = Very High; 95–<98 = High; 90–<95 = Moderate; 75–<90 = Poor; <75 = Very Poor
  • Consistency and coherence: Very High / High / Moderate / Poor / Very Poor
  • Data quality: Very High / High / Moderate / Poor / Very Poor

Overall rating: Very High – High – Moderate – Poor – Very Poor

SLIDE 69

Measuring consistency and coherence (time series)

  • Increased use of non-census data sources has impacted the way in which comparability can be measured
  • Many variables have higher coverage than in 2013, but the inclusion of other sources may impact time series comparability with previous census data
  • Metadata products will provide the detail behind the ratings for each quality rating scale metric

SLIDE 70

Decisions on output

  • The evaluations process is being finalised
  • Decisions to restrict or not output any variables will be guided by data evaluation and the quality rating scale
  • At-risk variables will then undergo further investigation and a thorough risk and impact assessment before any decision
  • Engaging with customers on decisions
  • If output for variables is to be restricted, we will communicate this as soon as we can before release

SLIDE 71

Next Census

  • High level design
  • Business Case
  • Iwi and Māori
  • Stakeholder engagement is underway
SLIDE 72

Stats NZ Census Session

Questions and Answers