
Predictive Modeling and By-Peril Analysis for Homeowners Insurance



  1. Predictive Modeling and By-Peril Analysis for Homeowners Insurance

     Contents
     • Case for by-peril modeling
     • By-peril model building
       – Data
       – Peril grouping
       – Variables
       – Interactions
       – By-peril territories
     • Model validation
     • Conclusion

  2. Case for by-peril modeling

     Modeling options
     Option 1: All peril losses combined
       • Regional players
       • Limited number of states with fairly constant peril mix
     Option 2: Separate by-peril estimates aggregated into single rate
       • True by-peril modeling
       • Average the true by-peril estimates
       • Practical to do if legacy systems can only implement one set of rates
     Option 3: True by-peril rating
       • Conceptually makes sense
       • Certain variables are predictive for certain perils (e.g. fire protective devices are predictive for the fire peril)
       • Responsive to state peril mix differences and changes in those

     [Figure: Root mean error (smaller is better) for True by-peril, Aggregate by-peril rate (M1), Aggregate by-peril rate (M2), and All peril losses combined. Annotations: "Significant difference in accuracy" and "Increased predictive accuracy".]
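A rough sketch of why true by-peril rating is responsive to state peril mix, while a single combined rate is not. The states, perils, loss costs, and exposures below are hypothetical numbers chosen for illustration; none of them come from the presentation.

```python
# Hypothetical per-peril loss costs (expected loss per unit of exposure).
# State B differs from state A only in its wind loss cost.
loss_cost = {
    "state_A": {"fire": 200.0, "water": 150.0, "wind": 50.0},
    "state_B": {"fire": 200.0, "water": 150.0, "wind": 450.0},
}
exposure = {"state_A": 1000.0, "state_B": 1000.0}


def all_peril_rate(loss_cost, exposure):
    """Option 1: one combined rate, the exposure-weighted average loss cost."""
    total_loss = sum(
        sum(perils.values()) * exposure[state]
        for state, perils in loss_cost.items()
    )
    return total_loss / sum(exposure.values())


def by_peril_rate(loss_cost, state):
    """Option 3: the rate is the sum of the state's own per-peril loss costs."""
    return sum(loss_cost[state].values())


combined = all_peril_rate(loss_cost, exposure)
rates = {state: by_peril_rate(loss_cost, state) for state in loss_cost}
print(combined, rates)
```

With these made-up inputs the combined rate sits halfway between the two states, so it overcharges the low-wind state and undercharges the high-wind one; the by-peril rates track each state's own mix.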

  3. By-peril model building: data staging

     Data options
     • Years used
       – Balance volume with recency to reflect an appropriate mix of business
     • Data used
       – Internal data used for non-catastrophes
       – Simulated data for catastrophes
     • Data split
       – Modeling/testing/validation
       – Modeling/validation
       – Out of time data split
       – Out of sample data split

     By-peril model building: peril grouping

     Grouping or peril separation
     • Theft
       – Disappearance/theft on premises
       – Disappearance/theft off premises
     • Liability
       – Liability/Medical payments
     • Fire
       – Human-made fire
       – Environmental fire
     • Water
       – Weather water
       – Non-weather water
     • Other
       – Glass
       – Aircraft
       – Vehicles

     Grouping or breaking of perils: considerations
     • Availability of detailed/accurate peril codes from claims
       – Is the correct cause of loss captured? Wind or hail damage to the roof?
       – Are there additional peril code break-outs available?
     • Grouping or breaking
       – Use judgment, intuition, similarity
       – Fire causes listed on the slide: electrical fire, grease fire from kitchen, fire from candles, fire from cigarette smoking, children playing with matches, fireplace fire, fire caused by electrical appliance
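The peril grouping above can be sketched as a lookup from detailed cause-of-loss codes to modeling perils. The group names follow the slide; the code labels used as keys, the claim amounts, and the "Unmapped" bucket are assumptions made for illustration, not a real claims coding scheme.

```python
# Peril groups follow the slide; the detailed code labels are hypothetical
# stand-ins for whatever a claims system actually records.
PERIL_GROUP = {
    "theft_on_premises": "Theft",
    "theft_off_premises": "Theft",
    "liability_medical_payments": "Liability",
    "human_made_fire": "Fire",
    "environmental_fire": "Fire",
    "weather_water": "Water",
    "non_weather_water": "Water",
    "glass": "Other",
    "aircraft": "Other",
    "vehicles": "Other",
}


def group_losses(claims):
    """Sum incurred loss by peril group; unknown codes stay visible as 'Unmapped'."""
    totals = {}
    for code, loss in claims:
        group = PERIL_GROUP.get(code, "Unmapped")
        totals[group] = totals.get(group, 0.0) + loss
    return totals


# Made-up claims; "hail_roof" is deliberately absent from the mapping.
claims = [
    ("human_made_fire", 12000.0),
    ("weather_water", 3000.0),
    ("non_weather_water", 4500.0),
    ("hail_roof", 8000.0),
]
totals = group_losses(claims)
print(totals)
```

Keeping unmapped codes in their own bucket, rather than silently dropping them, is one way to surface the "is the correct cause of loss captured?" question during data staging.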

  4. By-peril model building: data staging

     Practical considerations
     • Indications for appropriate rate level
     • Systems cost trade off – cost benefit analysis of by-peril implementation
     • Time constraints / speed to market will dictate some of the options
     • Impact on other departments: claims, systems, pricing/actuarial, financial reporting

     By-peril model building: variable selection

     House characteristics
     • Amount of insurance
     • Number of stories
     • Number of rooms
     • Square footage
     • Age of electrical
     • Age of home
     • Age of plumbing
     • Age of roof
     • Roof material
     • Protection class
     • Construction type
     • Protective devices

     Occupant characteristics
     • Age
     • Gender
     • Marital status
     • Insurance score
     • Occupation/retired
     • Number of occupants
     • Prior claim activity
     • Other personal lines
     • Full payment/installments
     • Billing lapse
     • Good payer

     Location and/or external variables
     • Weather
     • Temperature
     • Precipitation
     • Elevation
     • Slope
     • Geography
     • Commercial business
     • Demographics
     • Population density
     • Financial variables

     Which variables are predictive for which peril?

  5. By-peril model building: variable selection

     Univariate analysis
     • Predictability by peril
     • Shape: continuous or categorical
     • Splines
     • Transformations
     • Equal buckets

     [Figures: univariate charts for Age of Home and Amount of Insurance (bucketed), each showing Exposure, Indicated, and Indicated ±2 SE.]

     Continuous fits
     • Reduced number of parameters
     • Allows for extrapolations outside of range
     • Avoids out-of-model smoothing
     • Complex patterns fitted with piecewise splines

     [Figures: the same Age of Home and Amount of Insurance (bucketed) charts with a Cubic Fit and a Piecewise Continuous Fit overlaid on Indicated and ±2 SE.]
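One common way to get a piecewise continuous fit is a linear spline built from hinge terms. The sketch below only evaluates such a curve; the knots (ages 20 and 60), the intercept, and the coefficients are hypothetical, and the presentation does not say which spline form was actually used.

```python
def hinge_basis(x, knots):
    """Basis for a continuous piecewise-linear curve: x, then max(0, x - k) per knot."""
    return [x] + [max(0.0, x - k) for k in knots]


def piecewise_linear(x, intercept, coefs, knots):
    """Evaluate intercept + sum(coef * basis term) at x."""
    basis = hinge_basis(x, knots)
    return intercept + sum(c * b for c, b in zip(coefs, basis))


# Hypothetical age-of-home relativity curve: rises to age 20, rises more
# slowly to age 60, then stays flat (the three slopes sum to zero past 60).
knots = [20.0, 60.0]
intercept = 0.8
coefs = [0.01, -0.005, -0.005]

for age in (10, 20, 40, 80, 100):
    print(age, round(piecewise_linear(age, intercept, coefs, knots), 4))
```

This curve uses four parameters where one factor per age bucket would need dozens, and it still returns a value for ages outside the observed range, which matches the "reduced number of parameters" and "extrapolation" points on the slide.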

  6. Model building: consistency over time

     [Figure: Urban vs. Rural trend over time, by Exposure Year (2006 to 2009), with one series for Rural and one for Urban.]

     • Looking for a stable trend over time

     Variable selection: occupant characteristics

     Variables
     • Age
     • Gender
     • Marital status
     • Insurance score
     • Occupation/retired
     • Number of occupants
     • Other personal lines
     • Full payment/installments
     • Billing lapse
     • Good payer

     Options
     • Named insured
     • Policy level
     • Presence of
     • Number of
     • Maximum/minimum age
     • Composition

     Considerations
     • Data quality
     • Correlation with other variables
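The occupant "options" above (presence of, number of, maximum/minimum age) amount to different ways of summarizing a household at policy level. A minimal sketch, assuming each policy carries a list of occupant ages and retired flags; the field names and the under-18 cutoff are hypothetical.

```python
def occupant_features(ages, retired_flags):
    """Policy-level summaries of a household's occupants (ages must be non-empty)."""
    return {
        "number_of_occupants": len(ages),
        "max_age": max(ages),
        "min_age": min(ages),
        "presence_of_retired": any(retired_flags),
        # The under-18 cutoff is an illustrative assumption.
        "presence_of_minor": any(age < 18 for age in ages),
    }


# Made-up household: two adults, one of them retired, and one child.
features = occupant_features([67, 41, 9], [True, False, False])
print(features)
```

Each summary is a candidate rating variable in its own right, so the data quality and correlation considerations apply to every one of them separately.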

  7. Variable selection: occupant characteristics

     Vendor data
     • House characteristics
     • Insurance scoring
     • Prior claim activity
     • Weather
     • Demographics
     • Elevation

     Considerations
     • Cost
     • How often is the data updated by the vendor?
     • How often does the data have to be updated?
     • Regulatory support and environment
     • External data links

     Variable selection: predictiveness by peril

     [Table: Variable \ Peril. Columns: Fire, Liability, Theft, Water, Wind, Other. Rows: By-Peril Territory, Insurance Score, Age of Home, Protection class, Construction material, Amount of Insurance, Other lines, Full Pay, Square Feet, Number of Rooms, Claim Free, Retired Flag, Good Payer, Prior Claims, Secondary Residence, Fire Protective Device, Theft Protective Device, Number of Occupants. Cell markings did not survive extraction.]

     • By-peril territory, Insurance Score, Amount of Insurance, Full Pay, Age of Home, and Claims History are consistently powerful across all perils
     • Territory shows larger spread in weather perils
     • Insurance Score is predictive in weather perils

  8. Variable selection: identifying interactions

     • Looking for situations where the effect of variable x differs depending on variable y
     • Granularity can be a problem so grouping is often needed before testing for interactions

                        young household   old household
     small household    0.6               1.2
     large household    0.9               1.1

     [Figure: Factor by x (low, mid, high) and y (low, mid, high).]

     Variable selection: modeling interactions

     • Move from categorical*categorical interaction to categorical*continuous interaction

     [Figure: Factor by Age of Oldest Occupant, with separate curves for 1/2 occupants and 3+ occupants.]
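The 2x2 table of factors on this slide can be checked for an interaction by comparing the household-size effect within each age group. The sketch below uses the four factors from the slide; the variable names are illustrative.

```python
# Factors from the slide, keyed by (household size, household age).
factors = {
    ("small", "young"): 0.6,
    ("small", "old"): 1.2,
    ("large", "young"): 0.9,
    ("large", "old"): 1.1,
}

# Effect of moving from a small to a large household, within each age group.
size_effect_young = factors[("large", "young")] / factors[("small", "young")]
size_effect_old = factors[("large", "old")] / factors[("small", "old")]

# With no interaction the two effects would be equal and this ratio would be 1.
interaction_ratio = size_effect_young / size_effect_old

print(round(size_effect_young, 3), round(size_effect_old, 3), round(interaction_ratio, 3))
```

A larger household raises the factor by about 50% among young households but lowers it by about 8% among old ones, so a single multiplicative household-size factor cannot reproduce all four cells.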

  9. Variable selection: modeling interactions

     [Figures: two charts of Factor and Exposure by Age of Occupant, each with separate series for 1/2 Occupants and 3+ Occupants.]

     By-peril territories

     Practical considerations
     • Territories are a collection of units (5 digit postal code, 3 digit postal code, counties, PUMA, etc.)
     • Data at the unit level by-peril is noisy due to limited information in one area
     • Territories are correlated with other rating variables (e.g. amount of insurance, age of home)

     Modeling solutions
     • Use territories developed by third parties using industry data
     • Use residual risk based on initial models that include house information, occupant information, and external weather, demographics, geographical data
     • Use residual risk based on internal data only
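A minimal sketch of the residual-risk idea: for each geographic unit, compare actual losses with the losses expected from an initial model that excludes territory. The actual-to-expected ratio is one plausible definition of residual risk, assumed here for illustration; the unit identifiers and loss figures are made up.

```python
def residual_risk(units):
    """Map each unit to actual / expected loss; units with no expected loss are skipped."""
    return {
        unit: actual / expected
        for unit, (actual, expected) in units.items()
        if expected > 0
    }


# Hypothetical units: (actual loss, expected loss from the initial model).
units = {
    "unit_1": (120.0, 100.0),
    "unit_2": (45.0, 90.0),
    "unit_3": (0.0, 10.0),   # thin data: a ratio of 0 here is mostly noise
    "unit_4": (5.0, 0.0),    # no expected loss, so no ratio is computed
}
residuals = residual_risk(units)
print(residuals)
```

Because the initial model already accounts for variables such as amount of insurance and age of home, the ratio is intended to capture only what geography adds, which addresses the correlation concern listed above. The thin-data unit shows why these raw ratios are noisy.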

  10. By-peril territories

      [Figure: Unsmoothed residual risk.]

      By-peril territories: smoothing

      Distance based
      • Nearby units play a bigger role
      • Farther units play a smaller role
      • Each unit of distance adds the same amount of risk independent of location
      • More appropriate for weather-related perils

      Adjacency based
      • Surrounding units play a bigger role
      • Outer rings of units play a smaller role

      Clustering smoothed residuals
      • Maximize variance between clusters
      • Minimize variance within clusters
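Distance-based smoothing can be sketched as a weighted average in which nearer units get larger weights. The Gaussian kernel, the exposure weighting, the bandwidth, and the coordinates below are all assumptions for illustration; the presentation does not specify a weighting function.

```python
import math


def smooth_distance_based(units, bandwidth):
    """Smooth residuals with exposure-weighted Gaussian kernel weights.

    units: {name: (x, y, residual, exposure)}
    """
    smoothed = {}
    for name, (x0, y0, _, _) in units.items():
        weighted_sum = 0.0
        weight_total = 0.0
        for x, y, residual, exposure in units.values():
            dist_sq = (x - x0) ** 2 + (y - y0) ** 2
            # Nearby units get weights near their exposure; far units decay to 0.
            weight = exposure * math.exp(-dist_sq / (2 * bandwidth ** 2))
            weighted_sum += weight * residual
            weight_total += weight
        smoothed[name] = weighted_sum / weight_total
    return smoothed


# Hypothetical units: B is a thin, noisy unit next to A and C; D is far away.
units = {
    "A": (0.0, 0.0, 1.0, 100.0),
    "B": (1.0, 0.0, 2.0, 10.0),
    "C": (0.0, 1.0, 1.0, 100.0),
    "D": (10.0, 10.0, 0.5, 100.0),
}
smoothed = smooth_distance_based(units, bandwidth=1.0)
print({name: round(value, 3) for name, value in smoothed.items()})
```

In this toy setup the low-exposure unit B is pulled strongly toward its neighbors, while the distant unit D is left essentially unchanged. A larger bandwidth would smooth more aggressively, which is the trade-off behind the "how much smoothing?" question.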

  11. By-peril territories: how much smoothing?

      [Figures only: two slides with this title; no text survives.]

  12. By-peril territories: how much smoothing?

      [Figure only; no text survives.]

      Model validation

      A couple of options…

      Out of sample validation
      • Traditional splitting of modeling and validation of the entire dataset may not work
      • Out of sample validation might fail if the observations are not independent (weather related perils)
      • The losses coming from the same “event” would be found both in the modeling and in the validation dataset

      Out of time validation
      • Could solve the independence issue if one year is kept aside for validation
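The independence problem can be illustrated by checking whether the same weather event shows up on both sides of a split. This is a sketch on made-up claim records; the field names (`year`, `event`) and the event label are hypothetical.

```python
import random


def out_of_time_split(records, holdout_year):
    """Keep one whole year aside for validation."""
    train = [r for r in records if r["year"] != holdout_year]
    valid = [r for r in records if r["year"] == holdout_year]
    return train, valid


def out_of_sample_split(records, valid_fraction, seed=0):
    """Random split of individual records, ignoring which event they came from."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - valid_fraction))
    return shuffled[:cut], shuffled[cut:]


def shared_events(train, valid):
    """Events whose claims appear in both the modeling and validation data."""
    train_events = {r["event"] for r in train if r["event"] is not None}
    valid_events = {r["event"] for r in valid if r["event"] is not None}
    return train_events & valid_events


# Six claims from one 2008 hailstorm plus four unrelated 2009 claims.
records = [{"year": 2008, "event": "hail_2008"} for _ in range(6)]
records += [{"year": 2009, "event": None} for _ in range(4)]

random_train, random_valid = out_of_sample_split(records, valid_fraction=0.5)
time_train, time_valid = out_of_time_split(records, holdout_year=2009)

print(shared_events(random_train, random_valid))  # the hailstorm leaks across
print(shared_events(time_train, time_valid))      # no shared events
```

The toy data is built so that the six hail claims cannot all fit on one side of a 5/5 random split, which forces the leak. With real data the overlap is a matter of degree, and holding out a full year only helps when events do not span the year boundary.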

  13. Conclusion
      • Homeowners predictive modeling could be as sophisticated and innovative as auto modeling
      • By-peril modeling is an important way of achieving increased sophistication and accuracy

      Contact
      David R. MacInnis
      Senior Predictive Modeler
      Allstate Insurance Company
      dmaau@allstate.com
