Version 10/1/11
Cutting Edge Tools for Pricing and Underwriting Seminar
Integrating External Data into the Decision Making / Predictive Modeling Process
Casualty Actuarial Society Ron Zaleski, Jr., FCAS, MAAA Fall 2011
The Casualty Actuarial Society is committed to adhering strictly to the letter and spirit of the antitrust laws. Seminars conducted under the auspices of the CAS are designed solely to provide a forum for the expression of various points of view on topics described in the programs or agendas for such meetings.
Under no circumstances shall CAS seminars be used as a means for competing companies or firms to reach any understanding – expressed or implied – that restricts competition or in any way impairs the ability of members to exercise independent business judgment regarding matters affecting competition.
It is the responsibility of all seminar participants to be aware of antitrust regulations, to prevent any written or verbal discussions that appear to violate these laws, and to adhere in every respect to the CAS antitrust compliance policy.
(e.g., chi-square tests)
make sense)
Complicate → Simplify → Review Model
Location Proxy Data
What types of data can we use to represent geographic units in a model framework?
Credibility
How can we utilize external information to provide ballast when our internal data is thin or non-existent?
Competitor Analytics
How can we identify the appropriate competitor data to use in the decision-making process?
Attributes & Attitudes Policyholder Characteristics
Industry Data
Box-Whisker, X-X Scatterplot, Two-Way Plot
Software Snapshots: JMP, SAS, and EMBLEM
[Chart: "Lift Chart: Territory Based on Competitor X". X-axis: Ventile (5% Groups); left axis: Exposure Distribution (0%-20%); right axis: Loss Ratio Relativity, excluding Territory Factors (0.00-2.00). Series: Exposure Distribution, Expected (Based on Competitor X), Actual (Based on Internal Data).]
Using a traditional model evaluation technique, such as a lift chart, you can judge the appropriateness of a competitor’s territory on your own data.
Actual performance using Competitor X matches Expected very well
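The lift-chart test described here can be sketched in a few lines: score your book with the competitor's territory relativity, cut it into 20 equal-exposure ventiles, and compare actual loss relativity by ventile. A minimal simulation in Python; all names, distributions, and parameter values below are illustrative assumptions, not the presenter's data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical policy-level data: exposures, losses, and a competitor's
# territory relativity scored onto our own book.
n = 10_000
exposure = rng.uniform(0.5, 1.5, n)
competitor_rel = rng.lognormal(0.0, 0.3, n)
losses = rng.poisson(0.1 * exposure * competitor_rel) * 5_000

# Sort by the competitor's relativity and cut into 20 ventiles,
# each holding roughly 5% of total exposure.
order = np.argsort(competitor_rel)
cum_expo = np.cumsum(exposure[order]) / exposure.sum()
ventile = np.minimum((cum_expo * 20).astype(int), 19)

# Actual loss ratio relativity per ventile (premium excluding territory
# is proxied here by exposure), normalized to the overall average.
overall = losses.sum() / exposure.sum()
for v in range(20):
    mask = ventile == v
    rel = (losses[order][mask].sum() / exposure[order][mask].sum()) / overall
    print(f"ventile {v + 1:2d}: actual relativity {rel:.2f}")
```

If the competitor's territory definitions segment your book well, the actual relativities should rise steadily across the ventiles, tracking the expected line.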
[Chart: "Lift Chart: Territory Based on Competitor Y". X-axis: Ventile (5% Groups); left axis: Exposure Distribution (0%-20%); right axis: Loss Ratio Relativity, excluding Territory Factors (0.00-2.00). Series: Exposure Distribution, Expected (Based on Competitor Y), Actual (Based on Internal Data).]
But what happens when a second competitor looks just as good? Actual performance using Competitor Y also matches Expected very well!
[Chart: "Gini / Lorenz Curve: Territory Based on Competitor X". Axes: Cumulative % of Losses Explained vs. Cumulative % of Exposures Earned (0%-100%). Gini Index = 0.202.]
An alternative view is to plot a Lorenz curve and calculate a Gini index, which provides a quantitative measure for comparing two models.
A higher Gini index implies a greater degree of loss segmentation by the selected model.
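The Lorenz-curve construction can be sketched directly: rank risks by the model score, accumulate exposure and loss shares, and take twice the area between the curve and the diagonal. A minimal numpy version with simulated data (all values illustrative):

```python
import numpy as np

def gini_index(exposure, losses, score):
    """Gini index from a Lorenz curve: rank risks from best (lowest)
    to worst predicted score, accumulate exposure vs. loss shares, and
    take twice the area between the curve and the 45-degree line."""
    order = np.argsort(score)
    x = np.cumsum(exposure[order]) / exposure.sum()  # cum % of exposures
    y = np.cumsum(losses[order]) / losses.sum()      # cum % of losses
    xs = np.concatenate(([0.0], x))
    ys = np.concatenate(([0.0], y))
    area = np.sum(np.diff(xs) * (ys[1:] + ys[:-1]) / 2.0)  # trapezoid rule
    return 1.0 - 2.0 * area

# Toy check: a score that ranks losses perfectly segments far more
# (higher Gini) than an uninformative random score.
rng = np.random.default_rng(1)
expo = np.ones(5_000)
loss = rng.gamma(0.5, 2.0, 5_000)
print(f"perfect ranking: {gini_index(expo, loss, loss):.3f}")
print(f"random ranking:  {gini_index(expo, loss, rng.random(5_000)):.3f}")
```

Computing this index for each candidate competitor model puts them on a single comparable scale.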
Ranking the performance of each of the Competitor Models by Gini Index will help guide your selection.
[Chart: "Discrepancy Graph: Competitor X vs. Competitor Y". X-axis: Average Discrepancy Factor (= X Factor / Y Factor); left axis: Exposure Distribution (0%-20%); right axis: Loss Ratio Relativity, excluding Territory Factors (0.00-2.00). Series: Exposure Distribution, Expected (Based on Competitor X), Expected (Based on Competitor Y), Actual (Based on Internal Data).]
Actual loss performance tracks better with Competitor X than Y.
Another alternative visual comparison is the discrepancy, or "X", graph, which compares the models directly against each other.
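The discrepancy-graph idea can be sketched numerically: bucket risks by the ratio of the two competitors' factors, then check whose expectation the actual experience follows where they disagree most. Everything below is a simulated illustration (the "truth" is built to follow Competitor X), not the presenter's data:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20_000
expo = np.ones(n)
factor_x = rng.lognormal(0.0, 0.25, n)             # Competitor X factors
factor_y = factor_x * rng.lognormal(0.0, 0.15, n)  # Y = X plus extra noise
losses = rng.poisson(0.5 * factor_x) * 4_000       # truth follows X here

# Bucket risks by the discrepancy factor X/Y (deciles of the ratio).
disc = factor_x / factor_y
edges = np.quantile(disc, np.linspace(0, 1, 11)[1:-1])
bucket = np.digitize(disc, edges)

overall = losses.sum() / expo.sum()
act, ex_x, ex_y = [], [], []
for b in range(10):
    m = bucket == b
    act.append((losses[m].sum() / expo[m].sum()) / overall)
    ex_x.append(factor_x[m].mean() / factor_x.mean())
    ex_y.append(factor_y[m].mean() / factor_y.mean())

# Whichever model's expectation sits closer to actual wins the comparison.
mae_x = np.mean(np.abs(np.array(act) - np.array(ex_x)))
mae_y = np.mean(np.abs(np.array(act) - np.array(ex_y)))
print(f"mean abs gap vs X: {mae_x:.3f}, vs Y: {mae_y:.3f}")
```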
predictive variables
minimum performance standards
complement
Detecting a signal is especially difficult where the data is thin. Consider the principal component analysis below.
[Chart: "Pure Premium Relativity: Internal Data Only". X-axis: Principal Component #1 (binned 0.06-3.08); left axis: Exposure Distribution (0%-12%); right axis: Pure Premium Relative to Average (0.00-3.00). Series: Exposure Distribution, Internal Pure Premium Relativity.]
There appears to be a decreasing trend across the variable, but are the spikes with low volume just noise?
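A variable like "Principal Component #1" comes from standard PCA on a set of correlated external variables. A minimal numpy sketch with simulated data; the loadings, noise level, and interpretation (e.g., demographic fields by geography) are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)
latent = rng.normal(size=(1_000, 1))                # shared latent driver
loadings = np.array([[0.9, -0.8, 0.7, 0.6, -0.5]])  # assumed loadings
X = latent @ loadings + 0.3 * rng.normal(size=(1_000, 5))

# Standardize each variable, then take principal components from the SVD
# of the centered/scaled matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
pc1 = Z @ Vt[0]                       # Principal Component #1 score
explained = s**2 / np.sum(s**2)       # variance share per component
print(f"PC1 explains {explained[0]:.0%} of total variance")
```

The single score `pc1` can then be binned and plotted against pure premium relativity, exactly as in the chart above.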
More data helps to flatten “noisy” spikes and reveal the true signal, especially where internal data is thin. It is also a good sign when trends are consistent.
[Chart: "Pure Premium Relativity: Internal vs. Industry". X-axis: Principal Component #1 (binned 0.06-3.08); left axis: Exposure Distribution (0%-12%); right axis: Pure Premium Relative to Average (0.00-3.00). Series: Exposure Distribution, Internal Pure Premium Relativity, Industry Pure Premium Relativity.]
The absence of more credible data may have resulted in a poor decision
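One common way to let industry data provide this kind of ballast is a credibility-weighted blend of internal and industry relativities. The square-root rule and every number below, including the 1,082-claim full-credibility standard and the relativities themselves, are illustrative assumptions, not values from the presentation:

```python
import numpy as np

# Classical (square-root rule) credibility: Z = sqrt(n / n_full), capped at 1.
full_cred_claims = 1_082                            # assumed standard

internal_rel = np.array([1.80, 0.70, 1.10, 0.95])   # noisy, thin cells
industry_rel = np.array([1.30, 0.85, 1.05, 0.90])   # stable complement
claims       = np.array([40, 250, 900, 1_500])      # internal claim counts

Z = np.minimum(np.sqrt(claims / full_cred_claims), 1.0)
blended = Z * internal_rel + (1 - Z) * industry_rel
for z, b in zip(Z, blended):
    print(f"Z = {z:.2f} -> blended relativity {b:.2f}")
```

Thin cells are pulled toward the industry complement, while fully credible cells keep their internal relativity, which is exactly the flattening of noisy spikes shown in the chart.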
available on the industry data to avoid OVB (Omitted Variable Bias)
your data/signal may be included in the dataset
Parameter Estimates on Industry Data (across Coverages A through D):
Principal Component 1: 0.140, 0.410, 0.230
Principal Component 2: 0.840, 0.590, 0.320, 0.690
Principal Component 3: 0.680, 0.400
Principal Component 4: 1.370, 0.070, 1.170, 3.860
Principal Component 5: 1.060, 0.160, 0.740
Principal Component 6: 0.270, 0.380