
Predictive Modeling and By-Peril Analysis for Homeowners Insurance

Contents

Case for by-peril modeling

By-peril model building

  • Data
  • Peril grouping
  • Variables
  • Interactions
  • By-peril territories
  • Model validation

Conclusion


Case for by-peril modeling

Modeling options (accuracy increases from 1 to 3)

1. All peril losses combined

  • Regional players
  • Limited number of states with fairly constant peril mix

2. Separate by-peril, aggregated into a single rate

  • True by-peril modeling
  • Average the true by-peril estimates
  • Practical to do if legacy systems can only implement one set of rates

3. True by-peril rating

  • Conceptually makes sense
  • Certain variables are predictive for certain perils (e.g. fire protective devices are predictive for the fire peril)
  • Responsive to state peril mix differences and changes in those

Case for by-peril modeling

[Chart: root mean error (smaller is better) for four approaches: True by-peril, Aggregate by-peril rate (M1), Aggregate by-peril rate (M2), All peril losses combined]

Significant difference in predictive accuracy
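
The second modeling option averages by-peril estimates into a single rate. The following is a minimal sketch of one way that averaging could work, assuming the single relativity is a weighted average of the by-peril relativities with each peril's share of expected loss (the peril mix) as the weight. The function name, the peril shares, and the relativities are hypothetical illustrations, not figures from the presentation.

```python
from typing import Dict


def aggregate_relativity(by_peril_rel: Dict[str, float],
                         peril_mix: Dict[str, float]) -> float:
    """Weighted average of by-peril relativities, weighted by peril mix.

    peril_mix holds each peril's share of expected loss. The shares are
    normalized here, so they do not need to sum to exactly 1.
    """
    total = sum(peril_mix[p] for p in by_peril_rel)
    weighted = sum(by_peril_rel[p] * peril_mix[p] for p in by_peril_rel)
    return weighted / total


# Hypothetical relativities for a home with a fire protective device:
# a credit on the fire peril, no effect on the other perils.
relativities = {"fire": 0.80, "wind": 1.00, "water": 1.00}

# Two hypothetical states with different peril mixes.
fire_heavy_mix = {"fire": 0.50, "wind": 0.20, "water": 0.30}
wind_heavy_mix = {"fire": 0.10, "wind": 0.70, "water": 0.20}

print(round(aggregate_relativity(relativities, fire_heavy_mix), 3))
print(round(aggregate_relativity(relativities, wind_heavy_mix), 3))
```

The same by-peril relativities produce a different single-rate credit in each state, which is why an aggregated rate has to be re-averaged when the peril mix shifts, while a true by-peril rate responds to the mix automatically.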


By-peril model building: data staging

Data options

Years used

  • Balance volume with recency to reflect an appropriate mix of business

Data used

  • Internal data used for non-catastrophes
  • Simulated data for catastrophes

Data split

  • Modeling/testing/validation
  • Modeling/validation
  • Out-of-time data split
  • Out-of-sample data split

By-peril model building: peril grouping

Grouping or peril separation

Theft

  • Disappearance/theft on premises
  • Disappearance/theft off premises

Liability

  • Liability/Medical payments

Fire

  • Human-made fire
  • Environmental fire

Availability of detailed/accurate peril codes from claims

  • Is the correct cause of loss captured? Wind or hail damage to the roof?

  • Are there additional peril code break-outs available?

Grouping or breaking

  • Use judgment, intuition, similarity

Grouping or breaking of perils -- considerations

  • Electrical fire
  • Grease fire from kitchen
  • Fire from candles
  • Fire from cigarette smoking
  • Children playing with matches
  • Fireplace fire
  • Fire caused by electrical appliance

Water

  • Weather water
  • Non-weather water

Other

  • Glass
  • Aircraft
  • Vehicles

By-peril model building: data staging

Practical considerations

  • Indications for appropriate rate level
  • Systems cost trade-off: cost-benefit analysis of by-peril implementation
  • Time constraints and speed to market will dictate some of the options
  • Impact on other departments: claims, systems, pricing/actuarial, financial reporting

By-peril model building: variable selection

Which variables are predictive for which peril?

House characteristics

  • Amount of insurance
  • Number of stories
  • Number of rooms
  • Square footage
  • Age of electrical
  • Age of home
  • Age of plumbing
  • Age of roof
  • Roof material
  • Construction type
  • Protective devices

Occupant characteristics and financial variables

  • Age
  • Gender
  • Marital status
  • Insurance score
  • Occupation/retired
  • Number of occupants
  • Prior claim activity
  • Other personal lines
  • Full payment/installments
  • Billing lapse
  • Good payer

Location and/or external variables

  • Weather
  • Temperature
  • Precipitation
  • Elevation
  • Slope
  • Geography
  • Commercial business
  • Protection class
  • Demographics
  • Population density


By-peril model building: variable selection

Univariate analysis

  • Predictability by peril
  • Shape: continuous or categorical (splines, transformations, equal buckets)

Continuous fits

  • Reduced number of parameters
  • Allows for extrapolations outside of range
  • Avoids out-of-model smoothing
  • Complex patterns fitted with piecewise splines

[Chart: Amount of Insurance (bucketed): exposure and indicated values with +2SE and -2SE bands]

[Chart: Amount of Insurance (bucketed): exposure, indicated values, and cubic fit]

By-peril model building: variable selection

Univariate analysis

[Chart: Age of Home: indicated values with +2SE and -2SE bands]

[Chart: Age of Home: indicated values with +2SE and -2SE bands and a piecewise continuous fit]
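
A piecewise continuous fit like the one charted for Age of Home can be sketched as an exposure-weighted least squares fit on a hinge basis. This is only one possible way to build such a fit and is not necessarily the method used in the presentation. The knot location, the ages, the indicated values, and the exposures are all hypothetical.

```python
from typing import List, Sequence


def hinge_basis(x: float, knots: Sequence[float]) -> List[float]:
    """Basis for a continuous piecewise-linear curve: 1, x, one hinge per knot."""
    return [1.0, x] + [max(0.0, x - k) for k in knots]


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a * coef = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    coef = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (m[r][n] - s) / m[r][r]
    return coef


def fit_piecewise_linear(xs: Sequence[float], ys: Sequence[float],
                         weights: Sequence[float],
                         knots: Sequence[float]) -> List[float]:
    """Weighted least squares: build and solve the normal equations."""
    p = len(knots) + 2
    xtx = [[0.0] * p for _ in range(p)]
    xty = [0.0] * p
    for x, y, w in zip(xs, ys, weights):
        basis = hinge_basis(x, knots)
        for i in range(p):
            xty[i] += w * basis[i] * y
            for j in range(p):
                xtx[i][j] += w * basis[i] * basis[j]
    return solve_linear_system(xtx, xty)


def predict(x: float, coef: Sequence[float], knots: Sequence[float]) -> float:
    """Evaluate the fitted curve, including outside the observed range."""
    return sum(c * b for c, b in zip(coef, hinge_basis(x, knots)))


# Hypothetical indicated values: rising until age 40, flat afterwards.
ages = list(range(0, 101, 10))
indicated = [1.0 + 0.02 * a - 0.02 * max(0, a - 40) for a in ages]
exposure = [5.0, 8.0, 10.0, 12.0, 10.0, 8.0, 6.0, 4.0, 3.0, 2.0, 1.0]
knots = [40.0]

coef = fit_piecewise_linear(ages, indicated, exposure, knots)
print([round(c, 4) for c in coef])
print(round(predict(120.0, coef, knots), 4))  # extrapolation beyond the data
```

Three parameters describe the whole curve here, compared with one parameter per bucket in a categorical treatment, and the fitted curve can be evaluated at ages outside the observed range.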


Model building: consistency over time

  • Looking for a stable trend over time
  • Data quality
  • Correlation with other variables

[Chart: Urban/Rural trend over time: values by exposure year, 2006 to 2009, for Rural and Urban]
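
A stability check of this kind can be sketched by computing the relativity of one level against a base level separately for each exposure year and confirming that it stays on the same side of 1.0. The loss and exposure figures below are hypothetical, and the "same side of 1.0" rule is just one simple definition of consistency.

```python
from typing import Dict, Tuple

# (year, level) -> (incurred loss, exposure). All figures are hypothetical.
experience: Dict[Tuple[int, str], Tuple[float, float]] = {
    (2006, "rural"): (900.0, 1000.0), (2006, "urban"): (1000.0, 1000.0),
    (2007, "rural"): (850.0, 1000.0), (2007, "urban"): (1000.0, 1000.0),
    (2008, "rural"): (1760.0, 2000.0), (2008, "urban"): (2000.0, 2000.0),
    (2009, "rural"): (920.0, 1000.0), (2009, "urban"): (1000.0, 1000.0),
}


def yearly_relativities(data: Dict[Tuple[int, str], Tuple[float, float]],
                        level: str, base: str) -> Dict[int, float]:
    """Pure premium of `level` divided by pure premium of `base`, by year."""
    result = {}
    for year in sorted({y for (y, _) in data}):
        loss_l, exp_l = data[(year, level)]
        loss_b, exp_b = data[(year, base)]
        result[year] = (loss_l / exp_l) / (loss_b / exp_b)
    return result


def is_consistent(relativities: Dict[int, float]) -> bool:
    """True if the relativity is on the same side of 1.0 in every year."""
    values = list(relativities.values())
    return all(v > 1.0 for v in values) or all(v < 1.0 for v in values)


rural_vs_urban = yearly_relativities(experience, "rural", "urban")
for year, rel in rural_vs_urban.items():
    print(year, round(rel, 3))
print("consistent:", is_consistent(rural_vs_urban))
```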

Variable selection: occupant characteristics

  • Age
  • Gender
  • Marital status
  • Insurance score
  • Occupation/retired
  • Number of occupants
  • Other personal lines
  • Full payment/installments
  • Billing lapse
  • Good payer

Considerations

  • Named insured
  • Policy level

Options

  • Presence of
  • Number of
  • Maximum/minimum age
  • Composition
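
The options listed above (presence of, number of, maximum/minimum age, composition) can be read as ways of rolling individual occupants up to one policy-level record. A minimal sketch under that reading follows. The age thresholds and the composition labels are hypothetical choices made for illustration.

```python
from typing import Dict, List, Union

Feature = Union[int, bool, str]


def household_features(occupant_ages: List[int],
                       adult_age: int = 18,
                       senior_age: int = 65) -> Dict[str, Feature]:
    """Summarize occupant ages into policy-level rating variables.

    adult_age and senior_age are hypothetical cut-offs.
    """
    if not occupant_ages:
        raise ValueError("at least one occupant is required")
    youngest, oldest = min(occupant_ages), max(occupant_ages)
    if youngest < adult_age:
        composition = ("adults_with_children" if oldest >= adult_age
                       else "children_only")
    elif youngest >= senior_age:
        composition = "seniors_only"
    else:
        composition = "adults_only"
    return {
        "number_of_occupants": len(occupant_ages),                      # number of
        "presence_of_senior": any(a >= senior_age for a in occupant_ages),  # presence of
        "max_age": oldest,                                              # maximum age
        "min_age": youngest,                                            # minimum age
        "composition": composition,                                     # composition
    }


family = household_features([42, 39, 7])
retirees = household_features([71, 68])
print(family)
print(retirees)
```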

Variable selection: occupant characteristics

Vendor data and external data links

  • House characteristics
  • Insurance scoring
  • Prior claim activity
  • Weather
  • Demographics
  • Elevation

Considerations

  • Cost
  • How often is the data updated by the vendor?
  • How often does the data have to be updated?
  • Regulatory support and environment

Variable selection: predictiveness by peril

  • By-peril territory, Insurance Score, Amount of Insurance, Full Pay, Age of Home, and Claims History are consistently powerful across all perils

  • Territory shows larger spread in weather perils
  • Insurance Score is predictive in weather perils

[Table: variable by peril; the cell markings are not recoverable]

Perils (columns): Fire, Liability, Theft, Water, Wind, Other

Variables (rows):

  • By-Peril Territory
  • Insurance Score
  • Age of Home
  • Protection class
  • Construction material
  • Amount of Insurance
  • Other lines
  • Full Pay
  • Square Feet
  • Number of Rooms
  • Claim Free
  • Retired Flag
  • Good Payer
  • Prior Claims
  • Secondary Residence
  • Fire Protective Device
  • Theft Protective Device
  • Number of Occupants


Variable selection: identifying interactions

  • Looking for situations where the effect of variable x differs depending on variable y
  • Granularity can be a problem, so grouping is often needed before testing for interactions

[Chart: factor for x (low, mid, high), plotted separately for each level of y (low, mid, high)]

Variable selection: modeling interactions

Move from categorical*categorical interaction to categorical*continuous interaction

                   young household   old household
small household          0.6              1.2
large household          0.9              1.1

[Chart: factor by Age of Oldest Occupant, plotted separately for 1/2 occupants and 3+ occupants]
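
The move from a categorical*categorical interaction to a categorical*continuous one can be sketched with the four factors from the table above. The first function measures how much the age effect differs between household sizes. The second replaces the young/old split with a separate straight line in age for each household size. The anchor ages of 30 and 70 for "young" and "old", and the use of a straight line between them, are assumptions made for this illustration.

```python
from typing import Dict, Tuple

# Factors from the slide: (household size, household age) -> factor.
FACTORS: Dict[Tuple[str, str], float] = {
    ("small", "young"): 0.6, ("small", "old"): 1.2,
    ("large", "young"): 0.9, ("large", "old"): 1.1,
}

# Hypothetical ages at which the "young" and "old" factors apply.
YOUNG_AGE, OLD_AGE = 30.0, 70.0


def interaction_ratio(factors: Dict[Tuple[str, str], float]) -> float:
    """Age effect for small households divided by age effect for large ones.

    A value of 1.0 would mean no interaction: age would have the same
    multiplicative effect regardless of household size.
    """
    small_effect = factors[("small", "old")] / factors[("small", "young")]
    large_effect = factors[("large", "old")] / factors[("large", "young")]
    return small_effect / large_effect


def continuous_factor(age: float, size: str) -> float:
    """Categorical*continuous version: one line in age per household size."""
    low = FACTORS[(size, "young")]
    high = FACTORS[(size, "old")]
    slope = (high - low) / (OLD_AGE - YOUNG_AGE)
    return low + slope * (age - YOUNG_AGE)


print(round(interaction_ratio(FACTORS), 3))
for age in (30, 50, 70):
    print(age,
          round(continuous_factor(age, "small"), 3),
          round(continuous_factor(age, "large"), 3))
```

The slope for small households is steeper than for large households, which is the interaction expressed through two slopes rather than four cell factors.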


Variable selection: modeling interactions

[Charts, two panels: Age of Occupant: factor by age for 1/2 occupants and 3+ occupants, with exposure]

By-peril territories

Practical considerations

  • Territories are a collection of units (5-digit postal code, 3-digit postal code, counties, PUMA, etc.)

  • Data at the unit level by-peril is noisy due to limited information in one area
  • Territories are correlated with other rating variables (e.g. amount of insurance, age of home)

Modeling solutions

  • Use territories developed by third parties using industry data
  • Use residual risk based on initial models that include house information, occupant information, and external weather, demographic, and geographical data

  • Use residual risk based on internal data only
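
The residual-risk idea can be sketched as the ratio of actual loss to the loss expected from a model that excludes territory, accumulated by geographic unit. This assumes a ratio definition of the residual, and the field names, postal codes, and amounts are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

# Each record: the unit it sits in, its actual loss, and the loss expected
# from a model that does NOT include territory. All values are hypothetical.
policies: List[Dict[str, object]] = [
    {"unit": "60601", "actual_loss": 0.0, "expected_loss": 400.0},
    {"unit": "60601", "actual_loss": 1200.0, "expected_loss": 600.0},
    {"unit": "60602", "actual_loss": 300.0, "expected_loss": 500.0},
    {"unit": "60602", "actual_loss": 500.0, "expected_loss": 500.0},
]


def residual_by_unit(rows: List[Dict[str, object]]) -> Dict[str, float]:
    """Actual-to-expected loss ratio per unit (the unsmoothed residual)."""
    actual: Dict[str, float] = defaultdict(float)
    expected: Dict[str, float] = defaultdict(float)
    for row in rows:
        unit = str(row["unit"])
        actual[unit] += float(row["actual_loss"])
        expected[unit] += float(row["expected_loss"])
    # Units with no expected loss carry no information and are skipped.
    return {u: actual[u] / expected[u] for u in expected if expected[u] > 0}


residuals = residual_by_unit(policies)
print(residuals)
```

With only a handful of policies per unit these ratios are noisy, which is the motivation for smoothing them before forming territories.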

By-peril territories

[Map: unsmoothed residual risk]

By-peril territories: smoothing

Distance based

  • Nearby units play a bigger role
  • Farther units play a smaller role
  • Each unit of distance adds the same amount of risk independent of location
  • More appropriate for weather-related perils

Adjacency based

  • Surrounding units play a bigger role
  • Outer rings of units play a smaller role
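
Distance-based smoothing can be sketched as a weighted average of unit residuals in which the weight falls off with distance and grows with exposure. The Gaussian kernel, the bandwidth parameter, the planar coordinates, and the sample units below are assumptions for illustration only. The presentation does not specify a kernel.

```python
import math
from typing import Dict, List

# Hypothetical units with planar coordinates, exposure, and raw residual.
units: List[Dict[str, float]] = [
    {"x": 0.0, "y": 0.0, "exposure": 10.0, "residual": 1.4},
    {"x": 1.0, "y": 0.0, "exposure": 100.0, "residual": 0.9},
    {"x": 5.0, "y": 0.0, "exposure": 50.0, "residual": 1.0},
]


def smooth_residuals(rows: List[Dict[str, float]],
                     bandwidth: float) -> List[float]:
    """Exposure- and distance-weighted average of residuals for each unit.

    Weight of unit v when smoothing unit u:
        exposure_v * exp(-(distance(u, v) / bandwidth) ** 2)
    A larger bandwidth means more smoothing.
    """
    smoothed = []
    for u in rows:
        numerator = 0.0
        denominator = 0.0
        for v in rows:
            distance = math.hypot(u["x"] - v["x"], u["y"] - v["y"])
            weight = v["exposure"] * math.exp(-(distance / bandwidth) ** 2)
            numerator += weight * v["residual"]
            denominator += weight
        smoothed.append(numerator / denominator)
    return smoothed


for bandwidth in (0.5, 2.0, 10.0):
    print(bandwidth, [round(s, 3) for s in smooth_residuals(units, bandwidth)])
```

A very small bandwidth leaves each unit at its own residual, while a very large one pulls every unit to the exposure-weighted overall average. An adjacency-based variant would replace the distance weights with weights based on rings of neighboring units.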

Clustering smoothed residuals

  • Maximize variance between clusters
  • Minimize variance within clusters
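
Grouping smoothed residuals into territories can be sketched with a one-dimensional k-means, which minimizes the variance within clusters and, for a fixed total variance, thereby maximizes the variance between them. This is a swapped-in, simplified technique. The presentation does not name a clustering algorithm, and this sketch ignores exposure weights and geographic contiguity. The residual values are hypothetical.

```python
from typing import List, Tuple


def cluster_1d(values: List[float], k: int,
               max_iterations: int = 50) -> Tuple[List[int], List[float]]:
    """Plain k-means on scalar values with deterministic quantile starts."""
    ordered = sorted(values)
    n = len(ordered)
    # Start each center at the middle of its share of the sorted values.
    centers = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    labels: List[int] = []
    for _ in range(max_iterations):
        # Assign every value to its nearest center.
        labels = [min(range(k), key=lambda c: (v - centers[c]) ** 2)
                  for v in values]
        # Move every center to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(sum(members) / len(members) if members
                               else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


def within_cluster_ss(values: List[float], labels: List[int],
                      centers: List[float]) -> float:
    """Sum of squared deviations from each value's cluster center."""
    return sum((v - centers[lab]) ** 2 for v, lab in zip(values, labels))


# Hypothetical smoothed residuals for nine units.
smoothed = [0.80, 0.82, 0.79, 1.00, 1.02, 0.98, 1.30, 1.28, 1.31]
labels, centers = cluster_1d(smoothed, k=3)
print(labels)
print([round(c, 3) for c in centers])
print(round(within_cluster_ss(smoothed, labels, centers), 5))
```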

By-peril territories: how much smoothing?

Model validation

A couple of options:

Out-of-sample validation

  • Traditional splitting of the entire dataset into modeling and validation sets may not work
  • Out-of-sample validation might fail if the observations are not independent (weather-related perils)
  • The losses coming from the same “event” would be found both in the modeling and in the validation dataset

Out-of-time validation

  • Could solve the independence issue if one year is kept aside for validation
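
The independence problem can be illustrated with a small sketch that compares a random out-of-sample split with an out-of-time split and checks which weather events end up on both sides. The claim records, the event identifiers, and the holdout size are hypothetical. Each year is assumed to contain a single storm event with 25 claims.

```python
import random
from typing import Dict, List, Set, Tuple

Claim = Dict[str, object]

# Hypothetical weather claims: one storm per year, 25 claims per storm.
claims: List[Claim] = [
    {"year": year, "event": f"storm-{year}"}
    for year in range(2006, 2010)
    for _ in range(25)
]


def events_in(rows: List[Claim]) -> Set[str]:
    """Distinct event identifiers present in a set of claims."""
    return {str(r["event"]) for r in rows}


def out_of_sample_split(rows: List[Claim], holdout_size: int,
                        seed: int) -> Tuple[List[Claim], List[Claim]]:
    """Random split: holdout_size claims are set aside for validation."""
    rng = random.Random(seed)
    holdout = set(rng.sample(range(len(rows)), holdout_size))
    modeling = [r for i, r in enumerate(rows) if i not in holdout]
    validation = [r for i, r in enumerate(rows) if i in holdout]
    return modeling, validation


def out_of_time_split(rows: List[Claim],
                      holdout_year: int) -> Tuple[List[Claim], List[Claim]]:
    """Time split: one whole year is set aside for validation."""
    modeling = [r for r in rows if r["year"] != holdout_year]
    validation = [r for r in rows if r["year"] == holdout_year]
    return modeling, validation


oos_model, oos_valid = out_of_sample_split(claims, holdout_size=20, seed=0)
oot_model, oot_valid = out_of_time_split(claims, holdout_year=2009)

shared_oos = events_in(oos_model) & events_in(oos_valid)
shared_oot = events_in(oot_model) & events_in(oot_valid)
print("events shared by random split:", sorted(shared_oos))
print("events shared by time split:", sorted(shared_oot))
```

Because the random holdout here is smaller than any single event, every event it touches must also appear in the modeling data, so the validation set is not independent of the modeling set. Holding out a full year keeps each event entirely on one side.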

Conclusion

  • Homeowners predictive modeling could be as sophisticated and innovative as auto modeling
  • By-peril modeling is an important way of achieving increased sophistication and accuracy

Contact

David R. MacInnis
Senior Predictive Modeler
Allstate Insurance Company
dmaau@allstate.com