slide-1
SLIDE 1

Traditional and newly emerging data quality problems in countries with functioning vital statistics: experience of the Human Mortality Database

United Nations Expert Group Meeting on the methodology and lessons learned to evaluate the completeness and quality of vital statistics data from civil registration New York, November 3-4, 2016

Dmitri A. Jdanov Domantas Jasilionis Vladimir M. Shkolnikov

on behalf of the HMD team

slide-2
SLIDE 2

The Human Mortality Database

  • Joint project of the Department of Demography at the University of California at Berkeley (USA) and the Max Planck Institute for Demographic Research in Rostock (Germany)

  • Work began in autumn 2000, launched online in May 2002 with 17 country series

  • Now: a leading data resource on mortality in developed countries.
  • Includes 38 countries and 8 regions, 30,000+ users

Main advantages:

  • Comparability across time and space
  • Continuous, long-term series without gaps or ruptures
  • Data by age, year, cohort, in age-time formats 1x1, 5x1, 1x5, 5x5 etc.
  • Detailed documentation on origins and quality of the data

However, one of the main principles of the HMD is to include countries with reliable population statistics, especially requiring a full coverage of registration of vital events.

slide-3
SLIDE 3

Reliable population data

First questions: Is there a civil registration system with reliable vital statistics? Are there reliable population estimates? Is it possible to get all these data?

Preliminary quality checks in the HMD:

  • Accuracy: coverage, completeness, proportion of missing data
  • Availability and relevance: access to detailed enough tabular data
  • Comparability across space and time

For example, using UN and WHO sources a naïve user might conclude that the data for Moldova, Chile and Costa Rica are O.K. (last two censuses complete, coverage of deaths and births >=90%). This may not be correct according to the HMD criteria.

slide-4
SLIDE 4

UN assessment of coverage by death registration (Dec 2014)

Source: UN Population Division (http://unstats.un.org/unsd/demographic/CRVS/CR_coverage.htm)

slide-5
SLIDE 5

Censuses and assessment of the population denominator

slide-6
SLIDE 6

Censuses and inter-censal population estimates

Assuming good quality of census data: After a new census the post-censal population estimates should be replaced by inter-censal estimates (backward from this census). Four components:

  • Census counts
  • Death counts
  • Births
  • Migration

Developed countries with high-quality vital registration systems which do NOT produce inter-censal estimates: Germany, Italy, Czech Republic, ...

Figure: Czech Republic, official population estimates as of December 31, males and females, 1947-2003, with census years marked.

slide-7
SLIDE 7

Inter-censal population estimates: the HMD approach

The standard HMD methodology (Wilmoth et al., 2007) for the cases when inter-censal population estimates are either not available or unreliable is based on the assumption of uniform distribution of migration across the entire inter-censal period. This assumption works well in many conventional situations, but may be violated in the case of special events (e.g. the collapse of the USSR and abrupt social-economic changes in Eastern Europe, the EU enlargement in 2004, the financial crisis in 2008-2009).

Figure: Lexis diagram (age x to x+6 against time t to t+5) following a cohort from P(x,t) to P(x+5,t+5), with death triangles such as DU(x+1,t+1) and DL(x+4,t+3) and the census counts C1 and C2.

$$P(x+n,\,t+n) = C_1(x) - \sum_{i=0}^{n-1}\left[D_U(x+i,\,t+i) + D_L(x+i+1,\,t+i)\right] + \frac{n}{5}\,\Delta_x$$

$$\Delta_x = C_2(x+5) - \left(C_1(x) - \sum_{i=0}^{4}\left[D_U(x+i,\,t+i) + D_L(x+i+1,\,t+i)\right]\right)$$
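The uniform-migration assumption can be illustrated with a minimal sketch for a single cohort. It is not the HMD implementation: the function name, argument names and the toy numbers are assumptions made for illustration, and both censuses are assumed to fall on January 1, exactly N years apart.

```python
def intercensal_estimates(c1, c2, deaths_upper, deaths_lower):
    """January 1 population of one cohort for each year between two censuses.

    c1           -- census count of the cohort at the first census
    c2           -- census count of the same cohort at the second census
    deaths_upper -- cohort deaths in upper Lexis triangles, one value per year
    deaths_lower -- cohort deaths in lower Lexis triangles, one value per year
    (all names are hypothetical; the interval length N is len(deaths_upper))
    """
    n_years = len(deaths_upper)
    if n_years == 0 or len(deaths_lower) != n_years:
        raise ValueError("need one upper and one lower death count per year")

    # Population expected at the second census if there were no migration/error.
    expected_c2 = c1 - sum(deaths_upper) - sum(deaths_lower)
    # Accumulated discrepancy, attributed to migration and census error.
    delta = c2 - expected_c2

    estimates = [float(c1)]
    cumulative_deaths = 0.0
    for n in range(1, n_years + 1):
        cumulative_deaths += deaths_upper[n - 1] + deaths_lower[n - 1]
        # Spread the discrepancy uniformly: share n/N after n years.
        estimates.append(c1 - cumulative_deaths + n * delta / n_years)
    return estimates


# Toy example: 1000 people at the first census, 900 at the second,
# 10 deaths per year, so 50 people are "missing" and spread over 5 years.
series = intercensal_estimates(1000, 900, [5] * 5, [5] * 5)
print(series)  # the last value equals the second census count
```

By construction the series starts at the first census count and ends at the second one; an abrupt migration episode inside the interval would violate the uniform spreading used in the loop.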

slide-8
SLIDE 8

Bulgaria: correction of population data (inter-censal estimates)

The standard HMD inter-censal method is not applicable to the period 1985-1992 because of an irregular pattern of out-migration. In 1985-1988, international migration was very restricted in Bulgaria. The collapse of communism in 1989 was followed by mass emigration (mostly of the Turkish minority) over the next several years. HMD solution: official population estimates were used for 1985-1988, but new population estimates were calculated for the later period. The year 1988 was treated as a “pseudo-census point” marking the beginning of the inter-censal interval.

Figure: Trends in the total number of males and females. Bulgaria, 1961-2003. Official population estimates (left) and HMD data (right). Census years 1985, 1992 and 2001 are marked. Source: Jasilionis D., Jdanov D.A. Human Mortality Database: Background and Documentation for Bulgaria

slide-9
SLIDE 9

Germany: three decades between censuses

Before the 2011 census, the most recent census had been taken 30 years earlier in East Germany and 24 years earlier in West Germany. Before the 2011 census, Germany's population was estimated at 81.7 million; the census corrected this down to 80.2 million, a difference of 1.5 million people (~1.8%). The statistical office of Germany decided not to produce adjusted inter-censal population estimates by age.

Figure: the difference between current population estimates and the census counts of 2011, by age (absolute and relative differences, males and females)

slide-10
SLIDE 10

External outmigration intensities by age for the German lands, males

In the 2000s statistical offices of many German lands have made efforts to eliminate from their populations non-existent residents (erroneous cases). Using information from local tax offices they removed erroneous cases by creating external outmigration events in the year of “cleaning”.

slide-11
SLIDE 11

The HMD inter-censal estimates for Germany

1) Using additional migration data and cubic spline interpolation for migration trends across cohorts, we removed the population changes due to the earlier “cleaning” by the statistical offices.
2) We distributed the accumulated error (not the net migration!) uniformly over the adjustment period of 24 years (30 years for East German lands).

slide-12
SLIDE 12

Changeable population definitions across time

slide-13
SLIDE 13

Numerator-denominator bias: case of Moldova

Source: Penina, Jdanov, Grigoriev (2015)

* Since 1998 official population counts do not include Transnistria region

The problem: systematic bias. Deaths and births refer to the de facto population (i.e. events that occurred within the country), while population estimates also include long-term emigrants (Moldovan citizens living abroad). This results in under-estimation of mortality and fertility.

The solution: population estimates were corrected using data on border crossings and additional data collected at the census of 2004.
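The direction of the numerator-denominator bias can be shown with a short sketch. All figures below are hypothetical round numbers chosen for illustration, not Moldovan data, and the function name is an assumption.

```python
def rate_per_1000(events, population):
    """Crude rate: events per 1000 population."""
    return 1000.0 * events / population


# Hypothetical values for illustration only.
deaths_in_country = 40_000        # numerator: events registered inside the country
official_population = 4_000_000   # denominator: includes long-term emigrants
emigrants_abroad = 500_000        # people counted but not actually present

biased = rate_per_1000(deaths_in_country, official_population)
corrected = rate_per_1000(deaths_in_country, official_population - emigrants_abroad)

print(f"rate with inflated denominator: {biased:.2f} per 1000")
print(f"rate with corrected denominator: {corrected:.2f} per 1000")
```

Because the emigrants inflate only the denominator, the uncorrected rate is always the lower of the two; the same logic applies to fertility rates.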

slide-14
SLIDE 14

Changes in the definition of population: Poland

Figure: Official and adjusted (Tymicki et al., 2015) estimates of the population of Poland, males and females, 1960-2014. Series shown: post-censal population estimates calculated according to the 1960, 1970 and 1988 censuses; pre- and post-censal population estimates according to the 2002 census; post-censal population estimates according to the 2011 census; unofficial inter-censal estimates based on the 2011 census.

In the 2000s, Poland faced massive out-migration following the EU enlargement of 2004. It was expected that the population counts would be corrected downward after the next population census in 2011. But Statistics Poland unexpectedly decided to change the official definition of population status from permanently resident (in force in 2010 and earlier) to usually resident (from 2011 onward). Statistics Poland did not re-estimate age-specific population counts back to the previous census. Because of the irregular migration pattern, the standard HMD inter-censal method for reconstruction of annual population estimates is not applicable.

slide-15
SLIDE 15

Change in the definition of ethnicity: New Zealand Māori

For New Zealand, the HMD has separate data series for the non-Māori and Māori populations. In the census of 1991, the definition of Māori changed from one based on the ethnicity of parents to one based on self-identification. The new definition caused a jump in the Māori population, but the death counts were not corrected simultaneously. The corresponding change in the definition of ethnicity for deaths and births was introduced only in September 1995.


Figure: Life expectancy at birth of Māori, Non-Māori, and total population of New Zealand calculated from the official (unadjusted) data (left panel) and adjusted HMD data (right panel). Source: (Jasilionis et al., 2015)
slide-16
SLIDE 16

In 1974 Portugal lost its African colonies; 500,000-700,000 people returned to Portugal in the second half of 1974 and in 1975

Impact of migration at working/reproductive ages on mortality and fertility estimates

Figure: Population exposures by age group, Portugal, females. Red line: official current estimates; blue line: HMD inter-censal estimates.

slide-17
SLIDE 17

Inter-censal population estimates and fertility indicators: Portugal in 1975-76

Figure: Population counts by age (0-100), Portugal, 1975 and 1976. Left panel: current population estimates; right panel: HMD inter-censal estimates.

Figure: Cohort childlessness at age 40 (1-CCF1_40), Portugal, cohorts 1954-1970, per cent, based on annual and on inter-censal population estimates.

slide-18
SLIDE 18

Mortality at advanced ages

slide-19
SLIDE 19

Mortality estimates at old ages

  • Internationally comparable high quality demographic data on old-age populations remain insufficient.

  • The HMD is the only major demographic database which provides such data.

Population estimates for ages 80+ in the HMD are recalculated using extinct/almost extinct cohort and survival ratio methods.

Evaluation of data quality at old ages

The standard set of methods includes tests on

  • age overstatement (e.g. the ratio of the total person-years lived above age 100 to the total person-years lived above age 80);
  • precision of age reporting, using the UN age-sex accuracy index;
  • age heaping, using Whipple's index of age accuracy.

Comparison with other countries with reliable statistics may also be used to evaluate data quality.
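Two of the tests listed above can be sketched as follows. This is an illustrative sketch, not HMD code: the function names and the input layout (a dict mapping single-year age to a count) are assumptions. Whipple's index is computed in its conventional form, for ages 23-62 and terminal digits 0 and 5.

```python
def whipple_index(pop_by_age):
    """Whipple's index of age heaping on digits 0 and 5, ages 23-62.

    pop_by_age maps single-year age -> population count (hypothetical layout).
    A value of 100 means no heaping; 500 means all ages end in 0 or 5.
    """
    ages = range(23, 63)
    total = sum(pop_by_age.get(age, 0) for age in ages)
    if total == 0:
        raise ValueError("no population in ages 23-62")
    heaped = sum(pop_by_age.get(age, 0) for age in ages if age % 5 == 0)
    return 100.0 * 5 * heaped / total


def old_age_ratio(person_years_by_age):
    """Ratio of person-years lived at ages 100+ to person-years at ages 80+.

    Unusually high values, compared with countries with reliable data,
    may point to age overstatement.
    """
    above_80 = sum(v for age, v in person_years_by_age.items() if age >= 80)
    if above_80 == 0:
        raise ValueError("no person-years at ages 80+")
    above_100 = sum(v for age, v in person_years_by_age.items() if age >= 100)
    return above_100 / above_80


# Toy check: a perfectly flat age distribution shows no heaping.
flat = {age: 100 for age in range(0, 111)}
print(whipple_index(flat))
```

Both indicators are descriptive screens rather than corrections: they flag series that need closer inspection before old-age population estimates are trusted.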

slide-20
SLIDE 20

Germany: old ages

Figure: Trends in death rates at age 90+ (m90), calculated from the official population estimates, for West and East Germany, males and females, 1956-2008.

slide-21
SLIDE 21

Germany: old ages (cont.)

Figure: Ratio of DRV records by age (based on own pensions) to HMD population estimates based on official data, West and East Germany, males and females, ages 70-100, 2009.

To correct population estimates for West Germans at older ages in 2010, the HMD team used data from the Deutsche Rentenversicherung Bund (DRV), the German Pension Scheme.

slide-22
SLIDE 22

Life expectancy and probability of death for the corrected and the original data, West Germany, 1990-2008

Figure panels: e80, q80, e90 and q90 by year, 1990-2008; corrected and original series for males and females.

slide-23
SLIDE 23

Relative difference (per cent): HMD (2011 Census) vs. official population estimates and HMD (2011 Census) vs. HMD (DRV correction at ages 90+)

slide-24
SLIDE 24

Chile: old ages

In the HMD/HFD, the data series for Chile starts in 1992

slide-25
SLIDE 25

Costa Rica: death rate ratios, males

Figure: Death rate ratios by age (5-110) and year (1970-2010), males, on a scale from 0.4 to 2.0. Panels: Costa Rica / Sweden, Costa Rica / Japan, Costa Rica / USA.

slide-26
SLIDE 26

Costa Rica: life expectancy at advanced ages

slide-27
SLIDE 27

Infant mortality

slide-28
SLIDE 28


Underestimation of infant mortality due to a restrictive definition of live birth and death undercount

Adjustment is made by correcting the monthly mortality curves. The adjustment brings these curves to a certain “gold-standard” curve.

Source: Kingkade & Sawyer, 2001.


slide-29
SLIDE 29

Underestimation of infant mortality: adjustment by mortality trend

Figure: Infant mortality rate before and after correction for the period prior to 1973, both sexes, Moldova, 1959–2014. Source: (Penina et al., 2015)

An abrupt increase in infant mortality that occurred in all of the Soviet republics at the beginning of the 1970s was interpreted by Anderson and Silver (1986) as a result of improvements in registration rather than a real deterioration in the survival of newborns.

slide-30
SLIDE 30

South Korea: preliminary analysis

Infant mortality before 2000 is not reliable.

After 2000: there are substantial differences between the population estimates for 2000 and 2005 and the census counts for the same years. Perhaps the census counts exclude foreigners? These differences are important for ages 0 and 1 and also for other ages (including adult and old ages). It can also be seen that some smoothing was used to produce the population estimates: fluctuations observed at some ages in the census data are absent from the population estimates, which cannot be explained by a simple exclusion of foreigners.

Figure: Population by single year of age (0-84), South Korea. Panels: 2005 census vs. 2005 estimate; 2000 census vs. 2000 estimate.

slide-31
SLIDE 31

Conclusion

  • Data are of high quality if they are “Fit for Use” in their intended operational, decision-making and other roles (Juran and Godfrey, 1999). This is why the understanding of problems hidden in the data is important in any demographic estimation, forecast or study.

  • We discussed several approaches which allow us to increase significantly the utility of the data even if data quality is problematic.

  • Standard demographic methods which work well with data from developing countries or historical data series are often not applicable to problematic data from countries with functioning statistical systems.

  • Country-specific approaches in combination with usage of additional and alternative data sources are needed. They should be combined with certain general principles that are applied in all countries to ensure comparability of HMD data series across time and space.

slide-32
SLIDE 32


Thank you!