
SLIDE 1

Phantoms Never Die: Living with Unreliable Mortality Data

ANDREW CAIRNS, DAVID BLAKE, AND KEVIN DOWD

Tuesday, December 10, 2013

Thought-Leading Sponsor

0255848-00001-00 Prudential Financial Inc. headquartered in the United States is not affiliated with Prudential plc in the United Kingdom

SLIDE 2

1

Phantoms Never Die: Living with Unreliable Mortality Data

Andrew Cairns, Heriot-Watt University
David Blake, City University London
Kevin Dowd, Durham University

Supported by Prudential Retirement

SLIDE 3

2

Journey of Discovery

December 2012: ONS revises population estimates
Investigation commissioned by Prudential Retirement
Process of discovery ⇒

– there is a need for fundamental reviews of all official mortality data and how people interpret this data
– stochastic mortality modelling potentially flawed until this review is complete.

SLIDE 4

3

Potential Errors in Revised Population Estimates

[Figure: heat map of estimated error (%), phi(t,x), by year t (1970 to 2010) and age x (40 to 90); colour scale from −25 to 10.]

Source data: ONS EW males deaths and revised population estimates.

SLIDE 5

4

Plan

1. Background and motivation
2. Data issues: deaths, population, exposures
3. Graphical diagnostics and signature plots
4. Model-based analysis of historical population data
5. Impact of exposure errors on future mortality projections

6. Conclusions and next steps

SLIDE 6

5

1: Background and Motivation

England and Wales data
D(t, x): Death counts and age at death are considered to be accurate
$P(t + \tfrac{1}{2}, x)$: Mid-year population estimated by the Office for National Statistics (ONS)

– No ID card system with 100% coverage
– Censuses every 10 years: ..., 1981, 1991, 2001, 2011
– Censuses ⇒ 'under-enumeration'
– Further difficulties between censuses
  ∗ imperfect migration records
  ∗ attribution of deaths to specific cohorts

SLIDE 7

6

Revisions After the 2011 Census

ONS: December 2012 ⇒

Results of 2011 census finalised
At some ages: 2011 census not consistent with post-censal estimates (2001 → 2011)
Mid-year population estimates for 2002 to 2010 revised
Significant revisions at some ages

SLIDE 8

7

Impact of Population Revisions on Mortality Rates

[Figure 1: EW males mortality rates in 2010. First panel: old rates and revised rates by age (60 to 100, log scale). Second panel: ratio of revised rates to old rates by age (20 to 100).]

SLIDE 9

8

Why Does it Matter?

Potential impact on

Population mortality forecasts
Forecasts of sub-population mortality
Calibration of multi-population models
Calculation of annuity liabilities and Value-at-Risk
Assessed levels of uncertainty in the above
Buyout pricing
Assessment of basis risk in longevity hedges
Assessment of hedges and hedging instruments

SLIDE 10

9

Types of Impact: Base Table; Central Trend; Future Uncertainty

[Figure: M7 − CBD. Death rate (log scale) by year, 1990 to 2060, for ages 54, 64, 74, 84 and 94; revised mortality versus original mortality.]

SLIDE 11

10

Questions/Issues to Address:

What is the immediate impact on forecasts of the 2012 revisions?
Be aware that the 2012 revision is not a one-off.
Be aware that issues are relevant to all countries.
Can we further clean the exposures data?
Impact of potential further anomalies on forecasts?
How do we allow for future revisions?

SLIDE 12

11

2: Data Issues: Deaths, Population, Exposures

Data: deaths, population, births

– where can errors occur?

Themes:

– facts
– conjectures
– consequences of facts and conjectures

SLIDE 13

12

2.1: Deaths

Published death counts, D(t, x)

– deaths in calendar year t
– age x last birthday at date of death

Regarded as accurate, BUT ...

– potential errors in recorded age at death

SLIDE 14

13

2.2: Population Estimates, Exposures, Death Rates

Death rate: $m(t, x) = D(t, x) / E(t, x)$

$E(t, x)$ = 'exposure' in year t (central exposed to risk)
= average value of $P(s, x)$ from t to t + 1

– $P(s, x)$ = population at exact time s aged x last birthday

England & Wales ⇒ only $P(t + \tfrac{1}{2}, x)$ reported

Common assumption: $E(t, x) = P(t + \tfrac{1}{2}, x)$

– e.g. ONS reported death rates: $m(t, x) = D(t, x) / P(t + \tfrac{1}{2}, x)$
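As a rough illustration of the common assumption, the crude death rate can be computed directly from death counts and mid-year populations. The function name and the figures below are hypothetical and chosen only for illustration.

```python
import numpy as np

def crude_death_rates(deaths, midyear_pop):
    """m(t, x) = D(t, x) / P(t + 1/2, x).

    The mid-year population is used as a stand-in for the central
    exposure E(t, x), which is the common assumption described above.
    """
    deaths = np.asarray(deaths, dtype=float)
    midyear_pop = np.asarray(midyear_pop, dtype=float)
    return deaths / midyear_pop

# Made-up figures for three ages in one calendar year.
D = [1200.0, 1500.0, 1900.0]
P_mid = [100_000.0, 95_000.0, 90_000.0]
m = crude_death_rates(D, P_mid)
```

Any error in the mid-year population, or in treating it as the exposure, passes straight through to m(t, x), since the deaths are taken as given.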

SLIDE 15

14

2.3: Where Can Errors in E(t, x) Occur?

Known errors: Inaccurate $P(t + \tfrac{1}{2}, x)$

– no ID card system
– infrequent censuses, under-enumeration
– migration etc.
– mis-reported age at census

Lesser known errors:

– inaccurate shift from census date to mid-year
– assumption that $P(t + \tfrac{1}{2}, x) \approx E(t, x)$

SLIDE 16

15

Where Can Errors in E(t, x) Occur?

[Diagram: Census, Migration, Mid-year Population and Exposures connected by arrows. Four sources of error; a legend marks the errors that can be mitigated using the CBD Exposures Methodology.]

SLIDE 17

16

2.3.1: Propagation of General Errors Through Time

Errors follow cohorts
Phantoms never die
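A toy simulation can illustrate why an error introduced at a census follows the cohort. It is only a sketch with made-up numbers, assuming no migration, a constant true death rate and accurately recorded deaths; the function name is hypothetical.

```python
def roll_forward(true_start, census_error, death_rate, years):
    """Track a cohort's true size and its post-censal estimate.

    The estimate starts at true_start + census_error and is rolled
    forward by subtracting the actual recorded deaths each year.
    """
    true_pop = float(true_start)
    est_pop = true_pop + census_error
    path = [(true_pop, est_pop)]
    for _ in range(years):
        deaths = death_rate * true_pop  # deaths arise from the true population
        true_pop -= deaths
        est_pop -= deaths               # the phantoms are never removed by deaths
        path.append((true_pop, est_pop))
    return path

path = roll_forward(true_start=1000.0, census_error=50.0, death_rate=0.15, years=10)
abs_errors = [est - true for true, est in path]
rel_errors = [(est - true) / true for true, est in path]
```

In this sketch the absolute error stays at the size of the census error while the true cohort shrinks, so the relative error in the estimate grows with age.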

SLIDE 18

17

Phantoms Never Die

[Figure (stylized): population (500 to 1500) by year (1 to 10), showing True Population, True + error, Census Estimate, Post-censal Estimate and New Census Estimate. Annotation: Phantoms Never Die.]

SLIDE 19

18

Factual Consequence: Backfilling (ONS Methodology)

[Figure (stylized): population (500 to 1500) by year (1 to 10), showing True Population, True + error, Revised Inter-censal Estimate, New Census Estimate and Inter-censal Adjustments.]

SLIDE 20

19

2.3.2: Census to Mid-year Shift

[Diagram: age (x−1, x, x+1) against time (census to mid-year), showing P(Tc, x), P(Tc, x − 1), P(Tc + ωc, x) and regions A, B, C, D.]

ONS 2001 assumption: birthdays spread evenly throughout the year

Conjecture:

– Different methodology used in earlier censuses and in 2011

SLIDE 21

20

Can We Improve on This Assumption?

The Cohort Births/Deaths (CBD) Exposures Methodology

Underlying hypothesis:

At any point in time t, pattern of birthdays at t will reflect

– actual pattern of births x years earlier
– deaths (impact at high ages)
– migration and birth patterns of immigrants

Irregular pattern of births can lead to errors in census → mid-year shift

SLIDE 22

21

[Figure: quarterly births ('000s) against time t, 1910 to 1924, with WW1 and Spanish Flu marked.]

SLIDE 23

22

[Figure: percentage difference between ONS methodology and Cohort Births/Deaths methodology, by mid-year birth cohort (1920 to 2000); ONS relative to the CBD benchmark. Labels: 1919 cohort, +9.2%; 1920, −6.3%; 1947; 1941.]

SLIDE 24

23

2.3.3: Proposal to Improve Estimates of Exposures

Death rate: $m(t, x) = D(t, x) / E(t, x)$
Current assumption: $E(t, x) = P(t + \tfrac{1}{2}, x)$

CBD Exposures Methodology:

Assume $E(t, x) = P(t + \tfrac{1}{2}, x) \times \dfrac{E(t - x, 0)}{P(t + \tfrac{1}{2} - x, 0)}$

$E(t - x, 0) / P(t + \tfrac{1}{2} - x, 0)$ = Convexity Adjustment Ratio (CAR)

CAR based on monthly pattern of births over t − x − 1 to t − x + 1
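One plausible way to turn a monthly births series into a CAR is sketched below. It is not necessarily the authors' calculation: it assumes births are spread evenly within each month and ignores infant deaths and migration, and the function name is hypothetical. Under those assumptions a birth at time u contributes to age-0 exposure in birth year c for a fraction 1 − |u − c| of the year, while the mid-year age-0 population counts the births in the 12 months before mid-year.

```python
def convexity_adjustment_ratio(monthly_births):
    """Approximate CAR = E(c, 0) / P(c + 1/2, 0) for birth year c.

    monthly_births: 24 monthly birth counts covering c - 1 to c + 1.
    Assumptions of this sketch (not from the slides): births are even
    within each month; infant deaths and migration are ignored.
    """
    if len(monthly_births) != 24:
        raise ValueError("expected 24 monthly birth counts")
    # P(c + 1/2, 0): everyone born in the 12 months before mid-year.
    midyear_pop = sum(monthly_births[6:18])
    # E(c, 0): births weighted by the fraction of year c spent aged 0.
    exposure = 0.0
    for k, births in enumerate(monthly_births):
        midpoint = (k + 0.5) / 12.0 - 1.0  # month midpoint relative to c
        exposure += births * (1.0 - abs(midpoint))
    return exposure / midyear_pop

flat = [100.0] * 24
spike = flat.copy()
spike[11] = spike[12] = 300.0   # births bunched around the start of year c
trough = flat.copy()
trough[11] = trough[12] = 50.0  # births depressed around the start of year c

car_flat = convexity_adjustment_ratio(flat)
car_spike = convexity_adjustment_ratio(spike)
car_trough = convexity_adjustment_ratio(trough)
```

With an even pattern of births the ratio is 1, so the mid-year population equals the exposure. In this sketch a spike in births pushes the ratio below 1 and a trough pushes it above 1.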

SLIDE 25

24

CBD Exposures Methodology: Convexity Adjustment Ratio

[Figure: CBD Convexity Adjustment Ratio (0.96 to 1.04) by cohort year of birth, t−x, 1910 to 1950; cohorts 1919, 1920, 1946 and 1947 labelled.]

SLIDE 26

25

2.3.4: High Age Methodology

ONS reports

– $P(t + \tfrac{1}{2}, 90+)$ only
– D(t, x) for x = 90, 91, 92, ...

$P(t + \tfrac{1}{2}, x)$ for x = 90, 91, ... derived using the Kannisto-Thatcher Method (extinct cohorts)

Conjecture: Potential for inconsistencies at the boundary between ages 89 and 90+

SLIDE 27

26

2.4: Facts and Conjectures ⇒ Consequences, Anomalies

Statistically, how significant are these anomalies?

Graphical diagnostics

– hypothesis ⇒ plot should exhibit specific characteristics

Signature plots

– what if it does not?

Model-based analysis

SLIDE 28

27

3: Graphical Diagnostics and Signature Plots

3.1: Graphical Diagnostic 1

Hypothesis: Crude death rates by age for successive cohorts should look similar.

⇒ Plot crude death rates against age.
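A minimal sketch of the data preparation for this diagnostic is to read death rates off the diagonal of a year-by-age table. Identifying a cohort by t − x is a simplification made here, and the rates are made up; the function name is hypothetical.

```python
import numpy as np

def cohort_death_rates(m, years, ages, birth_year):
    """Return (ages, rates) along the diagonal t - x = birth_year.

    m[i, j] is the crude death rate in calendar year years[i] at age
    ages[j]. Treating t - x as the cohort is an assumption of this sketch.
    """
    out_ages, out_rates = [], []
    for i, t in enumerate(years):
        for j, x in enumerate(ages):
            if t - x == birth_year:
                out_ages.append(x)
                out_rates.append(m[i, j])
    return np.array(out_ages), np.array(out_rates)

# Made-up rates: log-linear in age, with no cohort anomalies.
years = np.arange(1990, 1995)
ages = np.arange(70, 80)
m = np.exp(-10.0 + 0.1 * ages)[None, :] * np.ones((len(years), 1))
x_1919, m_1919 = cohort_death_rates(m, years, ages, birth_year=1919)
```

Plotting the rates for several adjacent birth years against age on a log scale then gives the comparison described in the hypothesis.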

SLIDE 29

28

[Figure: cohort death rates by age (70 to 100, log scale) for the 1907 to 1911 birth cohorts. ONS revised EW males data up to 2011.]

SLIDE 30

29

Signature Plot: Emergence of Phantoms

[Figure: cohort death rates by age (65 to 90) for the 1917 to 1921 birth cohorts. Annotation: phantoms emerging.]

SLIDE 31

30

3.2: Graphical Diagnostic 2

Hypothesis: Underlying log death rates are approximately linear in age

⇒ Plot concavity of log death rates: the difference between the log of one death rate and the average of its immediate neighbours:

$$C(t, x_0) = \log m(t, x_0 + t) - \tfrac{1}{2}\bigl( \log m(t, x_0 + t - 1) + \log m(t, x_0 + t + 1) \bigr)$$

If log death rates are linear then this should be close to 0.
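The concavity measure is a second difference in age within one calendar year. A short sketch, using made-up log death rates and a hypothetical function name, shows that it vanishes for exactly linear rates and picks out a single distorted rate.

```python
import numpy as np

def concavity(log_m_row):
    """Concavity at interior ages for one calendar year.

    log_m_row[j] is log m(t, x_j) at consecutive ages. Returns
    log m(t, x) - 0.5 * (log m(t, x-1) + log m(t, x+1)) for interior x.
    """
    log_m_row = np.asarray(log_m_row, dtype=float)
    return log_m_row[1:-1] - 0.5 * (log_m_row[:-2] + log_m_row[2:])

ages = np.arange(60, 70)
linear = -10.0 + 0.1 * ages      # exactly linear log death rates
c_linear = concavity(linear)

bumped = linear.copy()
bumped[4] += 0.05                # one rate too high, e.g. exposure understated
c_bumped = concavity(bumped)
```

The distorted age shows a positive concavity equal to the size of the bump, and its two neighbours show negative values of half that size, which is the kind of local pattern the cohort plots look for.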

SLIDE 32

31

[Figure: 1924 cohort. Log death rates: deviation between the 1924 cohort and the average of its nearest neighbours (concavity, −0.2 to 0.2) by year t, 1960 to 2010.]

Dots are randomly above and below 0.

SLIDE 33

32

[Figure: 1920 cohort. Log death rates: deviation between the 1920 cohort and the average of its nearest neighbours (concavity, −0.2 to 0.2) by year t, 1960 to 2010. Annotations: CBD adjustment for uneven births using Convexity Adjustment Ratio; emerging phantoms in 1919 cohort.]

Signature plot: births pattern ⇒ true $E(t, x) < P(t + \tfrac{1}{2}, x)$

SLIDE 34

33

[Figure: 1947 cohort. Log death rates: deviation between the 1947 cohort and the average of its nearest neighbours (concavity, −0.2 to 0.2) by year t, 1960 to 2010.]

Dots mostly below 0 ⇒ cause for concern

SLIDE 35

34

Same Data in 2-Dimensions: Heat Map

[Figure: heat map of concavity of log m(t,x) (−0.4 to 0.4) by year t (1970 to 2010) and age x (50 to 90).]

Sampling variation ⇒ more extremes < 50 and > 90

SLIDE 36

35

Heat Map: by Age and Calendar Year

Identifiable non-random patterns

Signatures:

Diagonals ⇒ issues with a cohort
Horizontals ⇒ anomalies in reported age at death ???
Age at death errors are more plausible than systematic age-dependent errors in exposures.

SLIDE 37

36

3.3: Graphical Diagnostic 3

Hypothesis: Changes in cohort population sizes should match pattern of reported deaths

Underlying data:

– mid-year population, $P(t + \tfrac{1}{2}, x)$
– deaths in one calendar year, D(t, x)

Define $\hat{d}(t + \tfrac{1}{2}, x) = P(t + \tfrac{1}{2}, x) - P(t + \tfrac{3}{2}, x + 1)$

Plot $\hat{d}(t + \tfrac{1}{2}, x)$ by cohort
Compare with surrounding D(t, x)
$\hat{d}$ and D should be similar if little or no net migration (e.g. high ages)
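The implied deaths are a difference along the cohort diagonal of the mid-year population table. A small sketch with made-up populations and a hypothetical function name:

```python
import numpy as np

def d_hat(P):
    """Implied deaths d_hat(t + 1/2, x) = P(t + 1/2, x) - P(t + 3/2, x + 1).

    P[i, j] is the mid-year population in year index i at age index j,
    for consecutive years and ages. Result shape: (n_years - 1, n_ages - 1).
    """
    P = np.asarray(P, dtype=float)
    return P[:-1, :-1] - P[1:, 1:]

# Made-up mid-year populations for 3 years (rows) and 3 ages (columns).
P = np.array([
    [1000.0, 900.0, 800.0],
    [ 990.0, 950.0, 840.0],
    [ 980.0, 940.0, 890.0],
])
implied = d_hat(P)
```

Each entry of `implied` spans parts of two calendar years and two ages, which is why it is compared with the surrounding death counts rather than with a single D(t, x).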

SLIDE 38

37

Standard Graphical Diagnostic 3: 1924 Cohort, Deaths Curve

[Figure: 1924 cohort deaths curve, 1970 to 2010 (ages 50 to 80): d-hat(t,x) compared with D(t,x), D(t+1,x), D(t,x+1) and D(t+1,x+1).]

SLIDE 39

38

Signature Plot: Backfilling the 1919 Cohort by ONS

[Figure: 1919 cohort deaths curve, 1970 to 2010 (ages 60 to 90): d-hat(t,x) compared with D(t,x), D(t+1,x), D(t,x+1) and D(t+1,x+1). Annotations: Age 90; Parallel shift.]

SLIDE 40

39

Possible Explanation: Census → Mid-year Pop Error

1919 cohort (stylized)

[Figure (stylized): population (500 to 1500) by year, 1991 to 2001, showing True Population, Census → Mid-year Estimate at each census, and Post-censal Estimate.]

SLIDE 41

40

Factual Consequence: Backfilling (ONS Methodology)

1919 cohort (stylized)

[Figure (stylized): population (500 to 1500) by year, 1991 to 2001, showing True Population, Revised Inter-censal Estimate, Census → Mid-year Estimate and Inter-censal Adjustments.]

SLIDE 42

41

1918, 1919 and 1920 Cohorts, Deaths Curves

[Figure, three panels: deaths curves for the 1918, 1919 and 1920 cohorts, 1970 to 2010 (ages 60 to 90), comparing d-hat(t,x) with D(t,x), D(t+1,x), D(t,x+1) and D(t+1,x+1); Age 90 marked in each panel.]

1920 cohort: similar shift in opposite direction
Age 90 anomaly for all 3 cohorts ⇒ cause for concern

SLIDE 43

42

Signature Plot: Backfilling the 1947 Cohort

[Figure: 1947 cohort deaths curve, 1970 to 2010 (ages 30 to 60): d-hat(t,x) compared with D(t,x), D(t+1,x), D(t,x+1) and D(t+1,x+1).]

Again consistent with ONS versus CBD methodologies

SLIDE 44

43

3.4: Summary

Errors remain in the ONS population data
Combination of three graphical diagnostics highlights known anomalies (e.g. 1919) and some unexpected discoveries (e.g. 1920, 1947 cohorts; age 89/90)
Anomalies characterised by cohort and by age
CBD Exposures Methodology can be used to improve estimates of exposures
CBD Exposures Methodology explains the 1919 anomaly that has emerged since 1991

SLIDE 45

44

4: Model-Based Analysis of Historical Population Data

4.1: Proposed Solution: Bayesian Adjustment of Exposures

Bayesian prior hypotheses:

A: Death counts are accurate
B: Exposures are subject to errors
– errors following cohorts are correlated through time
C: Within each calendar year:
– curve of underlying death rates is “smooth”

Adjust exposures to achieve a balance between B and C

SLIDE 46

45

4.2: Results: Assume $E(t, x) = P(t + \tfrac{1}{2}, x)$ (Mid-year Population)

[Figure: heat map of mean smoothing adjustment, phi(t,x) (−0.25 to 0.10), by year t (1970 to 2010) and age x (40 to 90). Labelled features: cohorts 1886, 1900, 1919, 1920 and 1947; age 90/91.]

SLIDE 47

46

Exposures, E(t, x), Adjusted Using CBD Convexity Adjustment Ratio

[Figure: heat map of mean smoothing adjustment, phi(t,x) (−0.25 to 0.10), by year t (1970 to 2010) and age x (40 to 90). Labelled features: cohorts 1886, 1900, 1919, 1920 and 1947; age 90/91.]

$E(t, x) = P(t + \tfrac{1}{2}, x) \times$ Convexity Adjustment Ratio

SLIDE 48

47

4.3: Results 1

Results confirm conclusions based on graphical diagnostics (e.g. problems with 1919, 1947 cohorts; age 89/90 boundary)
Bayesian approach allows us to quantify rigorously the size of the error

SLIDE 49

48

Results 2

CBD Exposures Methodology:

– convexity adjustment for $E(t, x) \neq P(t + \tfrac{1}{2}, x)$ explains 1920 anomaly
– CBD dampens other anomalies (e.g. 1947 cohort)

Other anomalies remain but we have some explanations

– 1919 cohort explained by 2001 census + backfilling
– age 89/90 ⇒ issues with Kannisto-Thatcher methodology
– e.g. ages 70, 80 ⇒ potential bias in reporting of age at death

1947 (1940-1960) cohort(s) should be seen as an issue financially

SLIDE 50

49

5: Impact of Exposure Errors on Future Mortality Projections

[Figure: mortality fan chart. Death rate (log scale) by year, 1990 to 2050, for ages 74, 84 and 94; 1919 and 1947 cohorts marked.]

Red: unadjusted exposures. Blue: adjusted exposures

SLIDE 51

50

5.2: Impact on Annuity Valuation: 2% Interest Rate

[Figure: proportional difference in 2% annuity values (−0.06 to 0.06) by cohort year of birth, 1920 to 1950; 1919 and 1947 cohorts marked.]

Significant impact on some annuity values

SLIDE 52

51

Why are 1919 annuity values so different?

[Figure: M9, 1919 birth cohort conditional survivorship from end 2010. Survival probability (log scale) by age, 92 to 104, for unadjusted and adjusted exposures. Mean 8-year survival probability: 0.102 (unadjusted exposures), 0.0912 (adjusted exposures). Annotation: fewer survivors.]

SLIDE 53

52

Annuity Valuation: 2% Interest Rate, $a_x^{2\%}$

Age x   Cohort   Unadjusted Mean   Unadjusted St.Dev.   Adjusted Mean   Percentage Difference
63      1948     17.30             2.2%                 17.23           +0.4%
64      1947     16.90             2.3%                 16.67           +1.4%
65      1946     16.04             2.4%                 16.08           −0.3%
92      1919     3.06              4.2%                 2.93            +4.4%
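The percentage differences appear to be the unadjusted mean measured relative to the adjusted mean. That reading is an inference from the tabulated figures, not something stated on the slide, and the short check below uses a hypothetical function name.

```python
def pct_difference(unadjusted, adjusted):
    """Percentage difference of the unadjusted mean relative to the adjusted mean."""
    return 100.0 * (unadjusted - adjusted) / adjusted

# (age, cohort, unadjusted mean, adjusted mean) as reported in the table.
rows = [
    (63, 1948, 17.30, 17.23),
    (64, 1947, 16.90, 16.67),
    (65, 1946, 16.04, 16.08),
    (92, 1919, 3.06, 2.93),
]
diffs = {cohort: pct_difference(u, a) for _, cohort, u, a in rows}
```

The recomputed values agree with the table to within the rounding of the reported means; for the 1946 cohort the rounded inputs give roughly −0.25%, consistent with the reported −0.3%.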

SLIDE 54

53

5.4: Future Mortality Conclusions

Adjustments to exposures ⇒

Forecasts and fitted cohort effects look more plausible
Limited impact on forecasts by age
Significant impact on some cohorts
Potentially significant impact on annuity portfolio valuation and buyout pricing

SLIDE 55

54

6: Conclusions and Next Steps

Exposure errors

– 2011 census revisions drew attention to potential for errors more widely in population data
– errors remain in ONS population data
– graphical diagnostics ⇒ powerful model-free toolkit for identifying anomalies (signature plots)
– Bayesian framework ⇒ quantification of errors
– 2-D diagnostics ⇒ possible to detect small errors in deaths and exposures of less than 1%

SLIDE 56

55

CBD Exposures Methodology can be used to explain many of the bigger errors

– census → mid-year population estimates
– mid-year population estimates → exposures
– some big errors can be attributed to change in ONS methodology in 2001
– other errors have been identified but only partially explained by CBD Exposures Methodology

SLIDE 57

56

Use of CBD Exposures Methodology to Mitigate Errors in Exposures

[Diagram: Census, Migration, Mid-year Population and Exposures connected by arrows. Four sources of error; a legend marks the errors that can be mitigated using the CBD Exposures Methodology.]

SLIDE 58

57

Conclusions (continued)

Potential small biases in reported age at death
Anomaly at ages 89/90 might be due to interface with Kannisto-Thatcher high age methodology

SLIDE 59

58

Implications

Errors can make a big difference in annuity valuation
Financially: post-WW2 cohorts (especially 1947) need special consideration.
Sources of errors and variants will apply to other countries, e.g.

– Netherlands: good registration data, but still needs careful handling of data
– US: similar to UK
  ∗ 1946 cohort

SLIDE 60

59

US Males, Ages 40-85, 1971-2010: Similar Issues

[Figure: heat map of mean smoothing adjustment, phi(t,x) (−0.10 to 0.04), by year t (1980 to 2010) and age x (40 to 80).]

Source data: Human Mortality Database

SLIDE 61

60

Next Steps

Need for fundamental reviews of all official mortality data and how users interpret this data
Stochastic mortality modelling

– potentially flawed until this review is complete
– need to stop assuming exposures = mid-year population

SLIDE 62

61

Further engagement with a range of stakeholders:

– government and supra-national agencies
– professional bodies
– specialist longevity modelling consultancies
– private sector holders of longevity risk (pension plans, insurers, reinsurers)

Better communication between government agencies

– exploit information contained in other databases

Need to collate monthly/quarterly births data: essential for error mitigation

SLIDE 63

62

Census → mid-year population methodology needs to be revisited
Revisit high-age population methodology
Software for tackling errors in population data

– module for graphical diagnostics
– model-based quantification of errors
– (revised) high-age methodology
– output: updated population and mortality tables

SLIDE 64

63

Further numerical and modelling work

– analysis of data from other countries (US, NL, Germany, ...)
– further work on England and Wales ages 90+
  ∗ Kannisto-Thatcher methodology
– model future revisions to data
– what if no further adjustments to ONS 2012 revisions?
– impact on multi-population modelling
  ∗ CMI mortality projections model
  ∗ stochastic models
  ∗ impact on basis risk assessment

SLIDE 65

The opinions expressed are those of the individual authors. Institutional Investor LLC is not responsible for the accuracy, completeness, or timeliness of the information contained in the articles herein. If you require expert advice, you should seek the services of a competent professional. No statement in this document is to be construed as a recommendation to buy or sell securities. Product names mentioned in this document may be trademarks or service marks of their respective owners.