Organizer: M. Mandy Sha, RTI International. Chair: Jenny Hunter Childs, U.S. Census Bureau.


Organizer: M. Mandy Sha RTI International Chair: Jenny Hunter Childs U.S. Census Bureau May 18, 2012 Characteristics of People Overcounted in the Census Sarah Heimel Decennial Statistical Studies Division U.S. Census Bureau May 18, 2012


SLIDE 1

Organizer: M. Mandy Sha, RTI International
Chair: Jenny Hunter Childs, U.S. Census Bureau
May 18, 2012

SLIDE 2

Characteristics of People Overcounted in the Census

Sarah Heimel
Decennial Statistical Studies Division
U.S. Census Bureau
May 18, 2012

SLIDE 3

Goals of this Paper

  • Introduce the concept of duplication
  • Describe a few characteristics of duplicates in the 2010 census

SLIDE 4

Common Living Situations Causing Duplication in the Census

  • College Housing
  • Joint Custody Arrangements
  • Moving Around Census Day
  • Vacation / Seasonal Home
  • Stay with Relatives or Friends

SLIDE 5

Census Residence Rule & Residence Situations

  • Count people at the place where they live and sleep most of the time.
  • Examples on front of each Census form.
  • Respondents sometimes have a different idea than the census does of who to count at their address.

SLIDE 6

Identifying Duplicates in the Data

  • The data for each person listed on a census questionnaire was captured and compiled.
  • Computer algorithms reviewed every pair of persons in the census and compared name, age, sex and other variables to identify persons who might be duplicates.
  • The algorithm assigned a score to every pair; the score described the strength of the match between the two person records.
  • Two records that were matched with each other are ‘links’.
  • Only links with high scores (above a clerically determined cutoff point) were considered duplicates for this study and so were eligible for follow-up research.
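
The pair-scoring idea on this slide can be illustrated with a minimal sketch. It is not the Census Bureau's matching algorithm, which the slides do not specify. The record fields, the weights, the use of `difflib.SequenceMatcher` for name similarity, and the 0.9 cutoff are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from itertools import combinations


@dataclass(frozen=True)
class PersonRecord:
    """Hypothetical person record; the real census record has more variables."""
    record_id: str
    name: str
    age: int
    sex: str


def match_score(a: PersonRecord, b: PersonRecord) -> float:
    """Score the strength of a match on a 0-to-1 scale (weights are assumed)."""
    name_sim = SequenceMatcher(None, a.name.lower(), b.name.lower()).ratio()
    if a.age == b.age:
        age_sim = 1.0
    elif abs(a.age - b.age) == 1:
        age_sim = 0.5  # tolerate an off-by-one age report
    else:
        age_sim = 0.0
    sex_sim = 1.0 if a.sex == b.sex else 0.0
    return 0.6 * name_sim + 0.25 * age_sim + 0.15 * sex_sim


def find_links(records, cutoff=0.9):
    """Compare every pair of records and keep pairs scoring above the cutoff."""
    links = []
    for a, b in combinations(records, 2):
        score = match_score(a, b)
        if score > cutoff:
            links.append((a.record_id, b.record_id, score))
    return links
```

Comparing every pair grows quadratically with the number of records, so a production system would need some way to limit the comparisons; the slides do not say how this was handled.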

SLIDE 7

Classifying Duplicate Links

  • Geographic Distance (Hierarchical)
      – Within Block
      – Within Surrounding Blocks
      – Within County
      – Within State
      – Across State
  • Housing Types
      – Housing Unit to Housing Unit (HU-HU links)
      – Housing Unit to Group Quarters (HU-GQ links)
  • No GQ to GQ links were identified
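
A minimal sketch of how a link could be assigned to these categories, assuming a hypothetical `Address` record with a nationally unique `block_id` and a caller-supplied set of block identifiers that surround the first address. The hierarchy is applied from the narrowest category outward, so each link receives exactly one label.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    """Hypothetical address record; field names are assumptions."""
    state: str
    county: str
    block_id: str  # assumed unique across the whole country
    is_group_quarters: bool = False


def geographic_category(a: Address, b: Address, surrounding=frozenset()) -> str:
    """Return the narrowest geographic category that contains both addresses.

    `surrounding` holds the block_ids adjacent to a's block (assumed input).
    """
    if a.block_id == b.block_id:
        return "Within Block"
    if b.block_id in surrounding:
        return "Within Surrounding Blocks"
    if (a.state, a.county) == (b.state, b.county):
        return "Within County"
    if a.state == b.state:
        return "Within State"
    return "Across State"


def housing_type(a: Address, b: Address) -> str:
    """Label the link by how many of its two sides are group quarters."""
    gq_sides = int(a.is_group_quarters) + int(b.is_group_quarters)
    # The study identified no GQ-GQ links; the label exists only for completeness.
    return ("HU-HU", "HU-GQ", "GQ-GQ")[gq_sides]
```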

SLIDE 8

Implications from the Geographic Distance of Duplicates

  • Within Block and Surrounding Block Links
      – Primarily Attributed to Housing Level (Address) Issues
          • For example (but not limited to):
              – Duplicate (but slightly different) Addresses on the MAF
  • Within County, Within State, Across State Links
      – Attributed to Person Level (Living Situation) Issues
          • For example (but not limited to):
              – Joint Custody Arrangements
              – College Housing
              – Moving Around Census Day

SLIDE 9

Limitations of this Research

  • Incomplete universe to identify duplicates
      – Some census questionnaires were excluded from this matching
  • Not the final census universe
      – Questionnaires might not have actually been included in the final count
  • Large clusters not researched
      – One person could be identified on three or more census questionnaires.

SLIDE 10

Who are the Duplicates?

  • How many links were identified in this computer matching?

                                          Response Level          Person Level
Type of Link                              Number      Percent     Number      Percent
Housing Unit to Housing Unit (HU-HU)      3,857,604    81.9       6,600,215    88.5
Housing Unit to Group Quarters (HU-GQ)      853,956    18.1         853,956    11.5
TOTAL                                     4,711,560   100.0       7,454,171   100.0

SLIDE 11

Geographic Proximity

  • How close were the two addresses to each other?

                              Response Level          Person Level
Geography of Link             Number      Percent     Number      Percent
Within the same block         1,200,553    25.5       2,495,776    33.5
Within surrounding blocks       373,167     7.9         786,273    10.5
Within the same county        1,480,767    31.4       2,085,442    28.0
Within the same state         1,061,878    22.5       1,304,804    17.5
Across state lines              595,195    12.6         781,876    10.5
TOTAL                         4,711,560   100.0       7,454,171   100.0

SLIDE 12

Geographic Proximity

  • How close were the two housing units to each other, by type, at the response level?

                              HU-HU                   HU-GQ
Geography of Link             Number      Percent     Number      Percent
Within the same block         1,106,807    28.7          93,746    11.0
Within surrounding blocks       360,526     9.3          12,641     1.5
Within the same county        1,236,974    32.1         243,793    28.5
Within the same state           699,928    18.1         361,950    42.4
Across state lines              453,369    11.8         141,826    16.6
TOTAL                         3,857,604   100.0         853,956   100.0

SLIDE 13

Phone Match

  • For the 3,857,604 HU-HU response links, how often did the two questionnaires report the same phone number?

Phone Comparison                                     Percent
Exact same non-blank telephone number                 24.6
Different non-blank telephone numbers                 53.8
One blank phone number and one nonblank number        19.0
Both phone numbers blank                               2.6
TOTAL                                                100.0
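
The four comparison categories can be expressed as a short sketch. The digit-only normalization step is an assumption; the slides do not describe how phone numbers were cleaned before comparison.

```python
def normalize_phone(phone):
    """Keep digits only; None or punctuation-only input becomes blank."""
    return "".join(ch for ch in (phone or "") if ch.isdigit())


def phone_category(phone_a, phone_b):
    """Place a pair of reported phone numbers into one of the four categories."""
    a, b = normalize_phone(phone_a), normalize_phone(phone_b)
    if not a and not b:
        return "Both phone numbers blank"
    if not a or not b:
        return "One blank phone number and one nonblank number"
    if a == b:
        return "Exact same non-blank telephone number"
    return "Different non-blank telephone numbers"
```

Under this sketch, `phone_category("(919) 555-0100", "919-555-0100")` falls in the exact-match category because both values normalize to the same ten digits.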

SLIDE 14

Type of GQs in Links

  • What types of GQs were duplicates counted in?

Type of GQ in the Link                                                Percent (n=853,956)
College/University Student Housing                                     51.7
Nursing Facilities/Skilled-Nursing Facilities                          17.5
Correctional Facilities for Adults                                     12.2
Soup Kitchens, Transitional Shelters, Mobile Food Vans                  4.7
Other Non-institutional Facilities                                      4.3
Group Homes and Residential Treatment Centers Intended for Adults       2.5
Military Quarters                                                       2.1
Juvenile Facilities                                                     2.1
Unknown Group Quarters Type                                             1.6
Other Institutional Facilities                                          1.4
TOTAL                                                                 100.0

SLIDE 15

Overcount Question

SLIDE 16

Overcount by Geography

  • What percent of duplicates positively marked the overcount question (on at least one side)?

Percent with an Overcount Mark
                      HU-GQ           HU-HU
                      (N=853,956)     (N=6,600,215)
Within Block           13.9            15.9
Surrounding Block      26.4            19.1
Within County          39.5            55.7
Within State           76.1            72.4
Across State           80.1            75.9
TOTAL                  58.7            41.3

SLIDE 17

Demographics

  • Demographics were compared across the two sides.
  • Each side of a link could have provided different information on the duplicated person. One side might also have provided information while the other side did not.

SLIDE 18

Age

Age Category     Percent of Dups     Percent of US
                 (N=7,454,171)       (N=308,745,538)
Under 5            6.0                 6.5
5-9                7.1                 6.6
10-14              7.8                 6.7
15-19             11.5                 7.1
20-24             11.8                 7.0
25-29              6.2                 6.8
30-34              4.7                 6.5
35-39              4.4                 6.5
40-44              4.5                 6.8
45-49              5.1                 7.4
50-54              5.4                 7.2
55-59              5.1                 6.4
60-64              4.8                 5.4
65-69              3.8                 4.0
70-74              2.9                 3.0
75-79              2.3                 2.4
80+ years          4.3                 3.6
Inconsistent       2.3
Missing            0.2
TOTAL            100.0               100.0
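
One way to read the age table is to divide each age group's share of duplicates by its share of the U.S. population; a ratio above 1 means the group is over-represented among duplicates. The sketch below uses the percentages from the table. The 1.5 threshold is an arbitrary assumption for illustration, not a figure from the study.

```python
# (percent of duplicates, percent of U.S. population), taken from the table above
AGE_SHARES = {
    "Under 5": (6.0, 6.5), "5-9": (7.1, 6.6), "10-14": (7.8, 6.7),
    "15-19": (11.5, 7.1), "20-24": (11.8, 7.0), "25-29": (6.2, 6.8),
    "30-34": (4.7, 6.5), "35-39": (4.4, 6.5), "40-44": (4.5, 6.8),
    "45-49": (5.1, 7.4), "50-54": (5.4, 7.2), "55-59": (5.1, 6.4),
    "60-64": (4.8, 5.4), "65-69": (3.8, 4.0), "70-74": (2.9, 3.0),
    "75-79": (2.3, 2.4), "80+ years": (4.3, 3.6),
}


def representation_ratios(shares):
    """Ratio of duplicate share to population share for each age group."""
    return {group: dup / us for group, (dup, us) in shares.items()}


def over_represented(shares, threshold=1.5):
    """Age groups whose ratio exceeds the (assumed) threshold, in table order."""
    ratios = representation_ratios(shares)
    return [group for group, ratio in ratios.items() if ratio > threshold]
```

With the table's figures, only the 15-19 and 20-24 groups exceed the 1.5 threshold.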

SLIDE 19

Conclusions

  • Duplication is an issue in the census.
  • There are some things the Census Bureau can do before enumeration begins to minimize duplication.
  • After the enumeration, telephone number comparison is a useful way to identify a sizable amount of duplication.
  • The overcount question could be utilized in the future to better identify and resolve duplication.
  • Youth, especially college-aged persons, have high rates of duplication.

SLIDE 20

Thank You!

Sarah.K.Heimel@census.gov
