Stochastic Model Efficiency Applications: Cluster-Distance Sampling - PowerPoint PPT Presentation

Stochastic Model Efficiency Applications: Cluster-Distance Sampling and Parametric Curve Fitting to Tackle Sampling Errors and Bias Yvonne C. Chueh, PhD, ASA Paul H. Johnson, Jr., PhD Yongxue Qi Joint work between the University of Illinois at Urbana-Champaign (UIUC) and Central Washington University (CWU) Funded by The Actuarial Foundation (Simon Fraser University) July 9, 2013 1 / 37

Introduction and Purpose Introduction Practitioners/researchers are challenged to make credible inferential statements about the population distribution of important economic variables These distributions often involve a large number of policyholders and economic scenarios Well known challenge of running a stochastic asset/liability model is the long run-time Successful projection of these population distributions is important for actuaries Pricing, reserving, budgeting risk capital (Simon Fraser University) July 9, 2013 2 / 37

Introduction and Purpose Introduction (Continued) To analyze the population distrbution of economic outcomes, model efficiency approaches are often utilized Model Efficiency : Mathematical approaches that reduce the number of economic scenarios required to achieve a given level of precision in stochastic actuarial modeling (Rosner 2011) Model efficiency approaches include (Rosner 2011): Transfer Scenario Order Importance Sampling Curve Fitting Representative Scenarios (Simon Fraser University) July 9, 2013 3 / 37

Introduction and Purpose Purpose The CWU Computer Science department engineered these two packages to implement our research and empower future model efficiency research CSTEP: Cluster Sampling for Tail Estimation of Probability (reduce sampling errors, especially at the tail of a distribution) AMOOF2: Actuarial Model Optimal Outcome Fit V2.0 (reduce sampling bias) We use CSTEP and AMOOF2 to analyze statutory ending surplus data from a real block of variable annuities, provided by Milliman (Milliman data) (Simon Fraser University) July 9, 2013 4 / 37

CSTEP CSTEP CSTEP: Cluster Sampling for Tail Estimation of Probability Distribution of Financial Model Outcome (Chueh and Johnson 2012, Johnson et al. 2013) Upgrade from SALMS (Stochastic Asset Liability Model Sampling) used since 2003 CSTEP is open source, high performance computation software Universe capacity: 8,388,608 scenarios with up to 4500 time periods each Flexible sample size, reversible and reusable sampling Rate sampling (interest rate, equity return, index) (Simon Fraser University) July 9, 2013 5 / 37

CSTEP Representative Scenarios Consider a population of N rate paths Editable distance formulas similar to Euclidean distance are used to select n representative (pivot) scenarios where n <<< N (Chueh 2002) Choose an arbitrary path out of the N paths and call it Pivot 1 Calculate the distances of the remaining N - 1 paths to Pivot 1, the path with the largest distance to Pivot 1 is Pivot 2 Assign each of remaining N - 2 paths to the closest of Pivots 1 and 2, forming two disjoint paths Calculate the distances of the remaining N - 2 paths to Pivots 1 and 2, the path with the largest distance to Pivots 1 and 2 is Pivot 3 Assign each of remaining N - 3 paths to the closest of Pivots 1, 2, and 3, forming three disjoint paths ... repeat until n Pivots A probability is then assigned to each representative scenario (Simon Fraser University) July 9, 2013 6 / 37

CSTEP Scenario Distance Formulas In order to employ the method of representative scenarios, we need to be able to calculate the distance between two scenarios and tie that distance to the model output The original version of CSTEP employed a theorem of high-dimensional continuity (Continuity Theorem 1) The new version of CSTEP employs a modified theorem of high-dimensional continuity that improves the tail fit for volative economic scenarios and equity-based insurance guarantees (Continuity Theorem 2) (Simon Fraser University) July 9, 2013 7 / 37

CSTEP Continuity Theorem 1 Consider two n-period rate scenarios: x = ( r 1 , r 2 , ..., r n ) and s = ( r ′ 1 , r ′ 2 , ..., r ′ n ) Let CF t denote the net cash flow at the end of period t under scenario x ; CF ′ t is similarly defined under scenario s Let f ( x ) = � n � t 1 + r k and f ( s ) = � n � t 1 1 t = 1 CF t t = 1 CF ′ k = 1 t k = 1 1 + r ′ k Let the distance between x and s : �� n t = 1 [ � t 1 + r k − � t 1 1 k ] 2 d X ( x , s ) = k = 1 k = 1 1 + r ′ Let the distance between f ( x ) and f ( s ) : d Y ( f ( x ) , f ( s )) = | � n � t � t 1 1 t = 1 [ CF t 1 + r k − CF ′ k ] | k = 1 t k = 1 1 + r ′ (Simon Fraser University) July 9, 2013 8 / 37

CSTEP Continuity Theorem 1 (Continued) Given an arbitrary risk scenario s ∈ X , ∀ ǫ > 0 ǫ ∃ δ = 2 √ nM ∋ ∀ scenario vectors x ∈ X : d X ( x , s ) ≤ δ is a sufficient condition ∋ d Y ( f ( x ) , f ( s )) ≤ ǫ M = max t ( | CF t | , | CF ′ t | ) The above illustrates uniform continuity (Simon Fraser University) July 9, 2013 9 / 37

CSTEP Continuity Theorem 1: Proof [ d Y ( f ( x ) , f ( s ))] 2 = | � n � t � t 1 1 k ] | 2 1 + r k − CF ′ t = 1 [ CF t k = 1 t k = 1 1 + r ′ ≤ ( 2 M ) 2 | � n t = 1 [ � t 1 + r k − � t 1 1 k ] | 2 k = 1 k = 1 1 + r ′ ≤ n ( 2 M ) 2 � n t = 1 [ � t 1 + r k − � t 1 1 k ] 2 k = 1 k = 1 1 + r ′ = n ( 2 M ) 2 [ d X ( x , s )] 2 ≤ n ( 2 M ) 2 δ 2 = ǫ 2 Then: d X ( x , s ) ≤ δ ensures that d Y ( f ( x ) , f ( s )) ≤ ǫ (Simon Fraser University) July 9, 2013 10 / 37

CSTEP Original CSTEP Distance Formulas Let p denote a pivot scenario Significance Method: �� n t = 1 [ � t 1 1 + r k ] 2 d X ( x , p ) = k = 1 Relative Present Value (RPV) Method: �� n t = 1 [ � t 1 + r k − � t 1 1 k ] 2 d X ( x , p ) = 1 + r p k = 1 k = 1 (Simon Fraser University) July 9, 2013 11 / 37

CSTEP Continuity Theorem 2 Let C t ≈ CF t and C ′ t ≈ CF ′ t (determined by historical experience for a similar block of business or a sample(s) of the population distribution) Let the distance between x and s : �� n � t 1 � t 1 k ] 2 d ∗ X ( x , s ) = t = 1 [ C t 1 + r k − C ′ k = 1 t k = 1 1 + r ′ Given an arbitrary risk scenario s ∈ X , ∀ ǫ > 0 ∃ δ = ǫ > 0 ∋ ∀ scenario vectors x ∈ X : d ∗ X ( x , s ) ≤ δ is a sufficient condition ∋ d Y ( f ( x ) , f ( s )) ≤ ǫ (Simon Fraser University) July 9, 2013 12 / 37

CSTEP Continuity Theorem 2: Proof [ d Y ( f ( x ) , f ( s ))] 2 = | � n � t � t 1 1 k ] | 2 t = 1 [ CF t 1 + r k − CF ′ k = 1 t k = 1 1 + r ′ ≈ | � n � t � t 1 1 k ] | 2 t = 1 [ C t 1 + r k − C ′ k = 1 t k = 1 1 + r ′ X ( x , s )] 2 = [ d ∗ ≤ δ 2 = ǫ 2 Then: d ∗ X ( x , s ) ≤ δ ensures that d Y ( f ( x ) , f ( s )) ≤ ǫ (Simon Fraser University) July 9, 2013 13 / 37

CSTEP New (Economic) CSTEP Distance Formulas Let p denote a pivot scenario Economic Significance Method: �� n � t 1 1 + r k ] 2 d ∗ X ( x , p ) = t = 1 [ C t k = 1 Economic Present Value (EPV) Method: �� n � t � t 1 1 k ] 2 d ∗ X ( x , p ) = t = 1 [ C t 1 + r k − C ′ k = 1 k = 1 t 1 + r ′ (Simon Fraser University) July 9, 2013 14 / 37

AMOOF2 AMOOF2 AMOOF2: Actuarial Model Optimal Outcome Fit, Version 2.0 (Chueh and Curtis 2004) Stand alone desktop software suite communicating to Microsoft Excel and incorporating formulas computed by Wolfram Mathematica 8.0 Open source, high computation software for complex probability distribution fitting for stochastic modeling (principle based approach, actuarial guideline 43) (Simon Fraser University) July 9, 2013 15 / 37

AMOOF2 AMOOF2 (Continued) Allows for efficient determination of a data set’s summary statistics (such as mean and variance) and tail metrics (such as VaR and CTE) Implements pdf selection (both 22 single and mixed distributions, Klugman (2008)), graphing features to aid user flexibility, and high-computation outcome reporting Implements small bias adjustments arising from maximum likelihood estimation (MLE) (Simon Fraser University) July 9, 2013 16 / 37

AMOOF2 Algorithms Fitting probability density functions MLE: Analytic MLEs for 22 distributions in closed form completed using Mathematica 8.0 (if exist), otherwise, Excel’s solver is used to determine MLEs Methods of Moments: First four positive and negative theoretical moments can be set equal to their corresponding sample moments Small Sample Bias-Corrected MLEs (BMLEs) Cox and Snell/Cordeiro and Klein (CSCK) analytic BMLEs for 15/22 distributions; remaining distributions do not have closed form BMLEs (Cox and Snell 1968, Cordeiro and Klein 1994) Integration: VaR and CTE High-Precision Reimann Sums (Gaussian Quadrature Integration) (Simon Fraser University) July 9, 2013 17 / 37

AMOOF2 CSCK Method: BMLEs Consider a distribution with p parameters: θ = ( θ 1 , θ 2 , ..., θ p ) ′ Define joint cumulants based on the total loglikehood function l ( θ ) with n observations for i , j , l = 1, 2, ..., p : ∂ 2 l κ ij = E [ ∂θ i ∂θ j ] ∂ 3 l κ ijl = E [ ∂θ i ∂θ j ∂θ l ] Cumulant derivative: κ ( l ) = ∂κ ij ij ∂θ l Total Fisher information Matrix of order p for θ is K = {− κ ij } ; inverse is K − 1 = {− κ ij } (Simon Fraser University) July 9, 2013 18 / 37

Stochastic Model Efficiency Applications: Cluster-Distance Sampling - PowerPoint PPT Presentation

Stochastic Model Efficiency Applications: Cluster-Distance Sampling and Parametric Curve Fitting to Tackle Sampling Errors and Bias Yvonne C. Chueh, PhD, ASA Paul H. Johnson, Jr., PhD Yongxue Qi Joint work between the University of Illinois at

Distance Education Distance education used to be about the distance. 1700s 1800s 1900s 2000s

Stochastic Model Efficiency Applications: Cluster-Distance Sampling to Tackle Sampling Errors and

Cluster Architectures Overview Cluster Computing The Problem The Solution The Anatomy

Mark-recapture distance sampling (MRDS) in Distance 7.1 Setting up Distance for MRDS

history and drivers The Aerospace Cluster The Cluster-Association The Aerospace Cluster The

Getting started on the cluster Learning Objectives Describe the structure of a compute cluster

Distance in data space Notion of distance (metrics) in data space Who is my closest neighbor?

Energy Efficiency Cluster Lending For SMEs By Indian Banks Cluster lending represents an

What is Cluster Analysis? Cluster: a collection of data objects Similar to one another

Cluster Presentation Cluster Presentation EU-EECA ICT Cluster is the joint effort of three

EDEN CLUSTER STATIONS EDEN CLUSTER STATIONS Density MUNICIPALITY SAPS STATION (inhabitants/km 2

Build Your Cluster with Rocks Build Your Cluster with Rocks Yu Fu Yu Fu University of Florida

What is Cluster Analysis? Dmitriy (Dima) Gorenshteyn Sr. Data Scientist, Memorial Sloan

Introduction to Cluster Computing Brian Vinter vinter@diku.dk Overview Cluster Computing

Reaching the Goal with the Regensburg Marathon Cluster - A NetBSD Cluster Project - Hubert Feyrer

Computing Cluster Usage Visualization Tool Compu&ng Cluster Usage Visualiza&on

Improving Acute Stroke Services by Co- Creating Patient Centred Research CIHR/SPOR Patient

SCN 2Q2019 Financial Results Opportunity Day Presentation 12 September 2019 SCAN INTER PUBLIC

Wo rth ke e ping in I p g E S: Cultura l e xpre ssio n Wo rksho p g ro ups T

A EUROPEAN SOCIAL SCIENCES RESEARCH INFRASTRUCTURE PROJECT This project is supported by funding

Partnership for Research and Innovation in the Health System (PRIHS) 5 2019/2020 Sean Dewitt,

there yet?! Emma Howarth and Ayla Humphrey Innovation and Evaluation theme Children and young

Spring INTO YOUR FUTURE Celebrating 80 years of Excellence This is Canada How many oceans does

Workshop on Sabka Vishwas (Legacy Dispute Resolution) Scheme, 2019 Questions/Issues/Concerns

Stochastic Model Efficiency Applications: Cluster-Distance Sampling - PowerPoint PPT Presentation

Stochastic Model Efficiency Applications: Cluster-Distance Sampling and Parametric Curve Fitting to Tackle Sampling Errors and Bias Yvonne C. Chueh, PhD, ASA Paul H. Johnson, Jr., PhD Yongxue Qi Joint work between the University of Illinois at

Distance Education Distance education used to be about the distance. 1700s 1800s 1900s 2000s

Stochastic Model Efficiency Applications: Cluster-Distance Sampling to Tackle Sampling Errors and

Cluster Architectures Overview Cluster Computing The Problem The Solution The Anatomy

Mark-recapture distance sampling (MRDS) in Distance 7.1 Setting up Distance for MRDS

history and drivers The Aerospace Cluster The Cluster-Association The Aerospace Cluster The

Getting started on the cluster Learning Objectives Describe the structure of a compute cluster

Distance in data space Notion of distance (metrics) in data space Who is my closest neighbor?

Energy Efficiency Cluster Lending For SMEs By Indian Banks Cluster lending represents an

What is Cluster Analysis? Cluster: a collection of data objects Similar to one another

Cluster Presentation Cluster Presentation EU-EECA ICT Cluster is the joint effort of three

EDEN CLUSTER STATIONS EDEN CLUSTER STATIONS Density MUNICIPALITY SAPS STATION (inhabitants/km 2

Build Your Cluster with Rocks Build Your Cluster with Rocks Yu Fu Yu Fu University of Florida

What is Cluster Analysis? Dmitriy (Dima) Gorenshteyn Sr. Data Scientist, Memorial Sloan

Introduction to Cluster Computing Brian Vinter vinter@diku.dk Overview Cluster Computing

Reaching the Goal with the Regensburg Marathon Cluster - A NetBSD Cluster Project - Hubert Feyrer

Computing Cluster Usage Visualization Tool Compu&amp;ng Cluster Usage Visualiza&amp;on

Improving Acute Stroke Services by Co- Creating Patient Centred Research CIHR/SPOR Patient

SCN 2Q2019 Financial Results Opportunity Day Presentation 12 September 2019 SCAN INTER PUBLIC

Wo rth ke e ping in I p g E S: Cultura l e xpre ssio n Wo rksho p g ro ups T

A EUROPEAN SOCIAL SCIENCES RESEARCH INFRASTRUCTURE PROJECT This project is supported by funding

Partnership for Research and Innovation in the Health System (PRIHS) 5 2019/2020 Sean Dewitt,

there yet?! Emma Howarth and Ayla Humphrey Innovation and Evaluation theme Children and young

Spring INTO YOUR FUTURE Celebrating 80 years of Excellence This is Canada How many oceans does

Workshop on Sabka Vishwas (Legacy Dispute Resolution) Scheme, 2019 Questions/Issues/Concerns

Computing Cluster Usage Visualization Tool Compu&ng Cluster Usage Visualiza&on