Data Cleansing for Predictive Models: The Next Level
Roosevelt C. Mosley, Jr., FCAS, MAAA CAS Ratemaking & Product Management Seminar Philadelphia, PA March 19 – 21, 2012
Experience the Pinnacle Difference!
Amount of insurance
Age of home
Billing option
Construction
Protection class
Deductible
Multiline
State/territory
Figure: Coverage A by cluster. Cluster 9: 155,509; Cluster 20: 267,415; Total: 219,585.
Figure: Age of Home by cluster. Cluster 9: 56; Cluster 20: 25; Total: 43.
Figure: Percent without Multiline Discount by cluster. Cluster 9: 35%; Cluster 20: 15%; Total: 25%.
Figure: Bill Plan. Indicated relativity by bill plan (Monthly, Semi-Annual, Pay in Full, Mortgagee) for Total, Cluster 9, and Cluster 20. Data labels in extraction order: 1.112, 1.000, 1.346, 1.076; 1.281, 1.000, 1.407, 1.116; 0.992, 1.000, 1.192, 1.035.
Figure: Deductible. Indicated relativity by deductible (50, 100, 250, 500, 1000, 2500, 5000, 10000) for Total, Cluster 9, and Cluster 20.
Figure: Multi Line. Indicated relativity for the Auto & Home multiline discount for Total, Cluster 9, and Cluster 20. Data labels in extraction order: 0.942, 0.907, 0.892.
                                     Cluster 1     Total
Average Amount of Insurance          $1,109,048    $219,585
Average Age of Home                  19.6 years    42.7 years
Percentage of Deductibles > $2500    19.9%         1.9%
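A comparison of one cluster against the total book, like the one above, can be produced by averaging each field within each cluster and over all records. The following is a minimal sketch of that idea, not the presenter's method: the function name `profile_clusters`, the field names, and the policy records are all hypothetical and made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def profile_clusters(records, fields):
    """Return {cluster_label: {field: mean}} plus a "Total" entry over all records."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["cluster"]].append(rec)
    profile = {
        label: {f: mean(r[f] for r in rows) for f in fields}
        for label, rows in groups.items()
    }
    # The "Total" row averages over the whole book, ignoring cluster labels.
    profile["Total"] = {f: mean(r[f] for r in records) for f in fields}
    return profile


# Made-up policies for illustration only (not the presentation's data).
policies = [
    {"cluster": 1, "coverage_a": 1_000_000, "home_age": 20},
    {"cluster": 1, "coverage_a": 1_200_000, "home_age": 18},
    {"cluster": 2, "coverage_a": 200_000, "home_age": 45},
    {"cluster": 2, "coverage_a": 240_000, "home_age": 41},
]
summary = profile_clusters(policies, ["coverage_a", "home_age"])
for label, stats in summary.items():
    print(label, stats)
```

A cluster whose averages sit far from the "Total" row on several fields at once is a natural candidate for closer review.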
Midpoint of the cluster: represents an average risk for that cluster
Risk that is slightly different than average, but still fits well with that cluster
Potential anomaly – data point fits best within this cluster but is actually an
This generally means it doesn’t fit well anywhere.
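One way to make the three cases above concrete is to measure each record's distance to its nearest cluster midpoint: a record that is far even from its best-fitting cluster fits poorly everywhere. The sketch below assumes Euclidean distance on already-standardized numeric features and a hand-picked cutoff; the names `nearest_centroid` and `flag_anomalies`, the centroids, the points, and the threshold are hypothetical and not taken from the presentation.

```python
import math


def nearest_centroid(point, centroids):
    """Return (index, distance) of the centroid closest to `point`."""
    distances = [math.dist(point, c) for c in centroids]
    idx = min(range(len(centroids)), key=distances.__getitem__)
    return idx, distances[idx]


def flag_anomalies(points, centroids, threshold):
    """Assign each point to its nearest centroid and flag it as a potential
    anomaly when even that nearest distance exceeds `threshold`."""
    results = []
    for p in points:
        idx, d = nearest_centroid(p, centroids)
        results.append(
            {"point": p, "cluster": idx, "distance": d, "anomaly": d > threshold}
        )
    return results


# Hypothetical cluster midpoints and records, for illustration only.
centroids = [(0.0, 0.0), (10.0, 10.0)]
points = [
    (0.0, 0.0),  # sits at the midpoint of cluster 0
    (1.0, 1.0),  # slightly different, still fits cluster 0 well
    (5.0, 4.0),  # closest to cluster 0, but far from both midpoints
]
flags = flag_anomalies(points, centroids, threshold=3.0)
for row in flags:
    print(row)
```

The threshold here is arbitrary; in practice it would more plausibly be set per cluster, for example from the distribution of distances within that cluster.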