Semiparametric regression with hierarchical models

SLIDE 1

Semiparametric regression with hierarchical models

Yanwei (Wayne) Zhang

Statistical Research
CNA Insurance Company
New Orleans

March 17, 2011

Wayne Zhang (CNA insurance company) Semiparametric models March 17, 2011 1 / 27

SLIDE 2

Antitrust Notice

  • The Casualty Actuarial Society is committed to adhering strictly to the letter and spirit of the antitrust laws. Seminars conducted under the auspices of the CAS are designed solely to provide a forum for the expression of various points of view on topics described in the programs or agendas for such meetings.

  • Under no circumstances shall CAS seminars be used as a means for competing companies or firms to reach any understanding, expressed or implied, that restricts competition or in any way impairs the ability of members to exercise independent business judgment regarding matters affecting competition.

  • It is the responsibility of all seminar participants to be aware of antitrust regulations, to prevent any written or verbal discussions that appear to violate these laws, and to adhere in every respect to the CAS antitrust compliance policy.

SLIDE 3

Outline

  • Case study I: Review basic concepts and theories in hierarchical models
  • Case study II: Build the connection between penalized splines and hierarchical models
  • Case study III: Geo-spatial smoothing with bivariate penalized splines

SLIDE 4

Case study I: Hierarchical models

SLIDE 5

Review of hierarchical models Overview

Hierarchies in insurance data

  • Insurance data often come with an inherent hierarchy (classification)
  • Homogeneity vs. stability?

[Figure: hierarchy tree for an insurance company, branching by state (California, New York, ..., Texas), year (2008, 2009, 2010) and line of business (Property, Liability, Auto).]

SLIDE 6

Review of hierarchical models Overview

Hierarchical models

Three methods to deal with data with inherent hierarchies:

  • Complete pooling, assuming all groups are exactly the same
  • No pooling, assuming complete heterogeneity
  • Partial pooling (hierarchical), a compromise between the two extremes

Advantages of using hierarchical models:

  • Using all data to make robust inference, especially for groups with small sample sizes
  • Inference on group-level variation
  • Inclusion of group-level predictors
  • Prediction for a new group is available, and accounts for group variation
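The compromise made by partial pooling can be illustrated numerically. The sketch below assumes the simplest case, an intercept-only model with known within-group variance sigma2 and between-group variance tau2; the function name and all numbers are hypothetical and do not come from the slides. In that case the partially pooled estimate is a weighted average of the group mean and the grand mean, with weight n * tau2 / (n * tau2 + sigma2).

```python
def partial_pooling_estimate(group_mean, n, grand_mean, sigma2, tau2):
    """Shrink a group mean toward the grand mean (intercept-only sketch).

    weight -> 0 reproduces complete pooling, weight -> 1 reproduces no pooling.
    """
    weight = n * tau2 / (n * tau2 + sigma2)
    estimate = grand_mean + weight * (group_mean - grand_mean)
    return estimate, weight


# Hypothetical values on the log-loss scale.
GRAND_MEAN = 8.0   # overall mean
SIGMA2 = 1.0       # within-group (claim-level) variance
TAU2 = 0.05        # between-group variance

# Two groups with the same observed mean but very different sample sizes.
small_est, small_w = partial_pooling_estimate(9.0, 5, GRAND_MEAN, SIGMA2, TAU2)
large_est, large_w = partial_pooling_estimate(9.0, 580, GRAND_MEAN, SIGMA2, TAU2)

print(f"n=5:   weight={small_w:.3f}, estimate={small_est:.3f}")
print(f"n=580: weight={large_w:.3f}, estimate={large_est:.3f}")
```

The small group is pulled most of the way toward the grand mean, while the large group keeps an estimate close to its own mean.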

SLIDE 7

Review of hierarchical models Example

Example: contents loss due to theft/burglary

Suppose we have reported loss data (severity) for contents coverage due to theft/burglary in California:

  • $Y$: reported loss for each claim; $X$: contents insurance amount
  • Can build a simple model for severity: $\log E(Y_i) = \alpha + \beta \log X_i$
  • Exponentiating it leads to $E(Y_i) = \exp(\alpha) X_i^{\beta}$
  • $\exp(\alpha)$ is the rate per unit of insurance amount (or per 1,000, ...)
  • $\beta$ determines the curvature of the curve

[Figure: two scatter plots, log loss against log insurance amount, and loss against insurance amount (10,000s).]
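A short numerical sketch of the severity curve above. The parameter values are hypothetical and chosen only for illustration; they are not estimates from the talk.

```python
import math


def expected_loss(amount, alpha, beta):
    """E(Y) = exp(alpha) * X**beta, the exponentiated log-log model."""
    return math.exp(alpha) * amount ** beta


ALPHA = 2.0   # hypothetical intercept on the log scale
BETA = 0.5    # hypothetical slope; beta < 1 gives a concave curve

loss_10k = expected_loss(10_000, ALPHA, BETA)
loss_20k = expected_loss(20_000, ALPHA, BETA)

# Doubling the insurance amount multiplies the expected loss by 2**beta.
ratio = loss_20k / loss_10k
print(f"E(Y) at 10,000: {loss_10k:.1f}")
print(f"E(Y) at 20,000: {loss_20k:.1f} (ratio {ratio:.3f})")
```

With beta below 1 the expected loss grows more slowly than the insurance amount, which is the curvature the slide refers to.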

SLIDE 8

Review of hierarchical models Example

Data and model

How would you determine the rate $\exp(\alpha)$ for each county?

  • Run one big regression using all data? It does not fit well, and cannot give a rate for each county!
  • Run a separate regression for each county? The estimates are very volatile for small counties, and can even show slope reversal!
  • Hierarchical model (random intercepts): $\log E(Y_i) = \alpha_{j[i]} + \beta \log X_i$, where $j[i]$ is the county of claim $i$

[Figure: county-level plot; the extracted numeric labels range from 4 to 580.]

SLIDE 9

Review of hierarchical models Example

Visualization of the three models

[Figure: log insurance loss against log insurance amount in ten county panels (Butte, San Diego, Los Angeles, San Francisco, Merced, Santa Barbara, Monterey, Santa Cruz, Riverside, Ventura), comparing the complete-pooling, no-pooling and multi-level models.]

SLIDE 10

Review of hierarchical models Example

Adding group level predictor

Improve the model by adding a county-level predictor ($Z$), the crime index:

$$\log E(Y_i) = \alpha_{j[i]} + \beta \log X_i \quad \text{and} \quad \alpha_j = a + b Z_j$$

  • Reduces group-level variation
  • Makes groups conditionally exchangeable

Models:

    M3: logloss ~ 1 + logamt + (1 | county)
    M4: logloss ~ 1 + logamt + crime + (1 | county)

        Df     AIC     BIC   logLik   Chisq  Chi Df  Pr(>Chisq)
    M3   4  6228.9  6251.0  -3110.5
    M4   5  6226.1  6253.7  -3108.1  4.8153       1     0.02821 *
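The comparison of M3 and M4 is a likelihood ratio test with one degree of freedom. As a rough check, the reported numbers can be reproduced from the table with the standard library alone. The helper name chi2_sf_1df is an assumption introduced here, and the log-likelihoods are the rounded values shown on the slide, so the recomputed statistic and AIC differ slightly from the reported ones.

```python
import math

# Values copied from the model comparison table.
loglik = {"M3": -3110.5, "M4": -3108.1}
n_params = {"M3": 4, "M4": 5}
chisq_reported = 4.8153


def chi2_sf_1df(x):
    """Upper tail of a chi-square with 1 df: P(|Z| > sqrt(x)) for standard normal Z."""
    return math.erfc(math.sqrt(x / 2.0))


# Likelihood ratio statistic from the rounded log-likelihoods.
lr_from_rounded = 2.0 * (loglik["M4"] - loglik["M3"])

# p-value from the statistic as reported on the slide.
p_value = chi2_sf_1df(chisq_reported)

# AIC = -2 * logLik + 2 * (number of parameters).
aic = {m: -2.0 * loglik[m] + 2.0 * n_params[m] for m in loglik}

print(f"LR statistic from rounded logLik: {lr_from_rounded:.2f}")
print(f"p-value for reported Chisq: {p_value:.5f}")
print(f"AIC: {aic}")
```

The p-value agrees with the reported Pr(>Chisq), so the crime index improves the fit at the 5% level, while BIC (which penalizes the extra parameter more heavily) slightly prefers M3.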

SLIDE 11

Review of hierarchical models Example

Visualizing group-level regression

[Figure: estimated county intercepts (axis from 7.4 to 8.2) against the crime index (axis from 40 to 100).]

SLIDE 12

Review of hierarchical models Example

Comparison to hierarchical model with no group-level predictors

[Figure: log insurance loss against log insurance amount in ten county panels, each labeled with a number in parentheses: Butte (71), San Diego (70), Los Angeles (103), San Francisco (100), Merced (100), Santa Barbara (55), Monterey (99), Santa Cruz (49), Riverside (95), Ventura (45). Two fits are compared: no county predictor and county predictor.]

SLIDE 13

Review of hierarchical models Example

Rate map

  • Can produce rate relativities (average county loss / average state loss) for a fixed insurance amount, assuming the modeled frequency is flat
  • If a county is not available in the data, its relativity is automatically set to the state average
  • The map shows rate relativities at an insurance amount of 10,000

[Figure: California rate map by county, with relativity bands 0.7 to 0.8, 0.8 to 0.9, 0.9 to 1.0, 1.0, 1.0 to 1.15, 1.15 to 1.30 and 1.30 to 1.5.]
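A minimal sketch of how such relativities could be computed, assuming the random-intercept severity model from the earlier slides; the map in the talk may rest on a richer model. All names and values below are hypothetical. A county that does not appear in the data receives a county effect of zero, which is why its relativity equals the state average.

```python
import math

A_STATE = 2.0    # hypothetical state-level intercept
BETA = 0.5       # hypothetical common slope on log insurance amount
# Hypothetical estimated county effects (county intercept minus state intercept).
COUNTY_EFFECT = {"County A": 0.18, "County B": -0.12}


def expected_loss(amount, county=None):
    """Predicted severity; counties without data get a zero county effect."""
    effect = COUNTY_EFFECT.get(county, 0.0)
    return math.exp(A_STATE + effect + BETA * math.log(amount))


def relativity(county, amount=10_000):
    """County prediction divided by the state-average prediction."""
    return expected_loss(amount, county) / expected_loss(amount)


for name in ["County A", "County B", "County not in data"]:
    print(f"{name}: {relativity(name):.3f}")
```

Under this simplified model the common slope cancels in the ratio, so the relativity reduces to the exponentiated county effect.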

SLIDE 14

Review of hierarchical models Inference

Inference on linear models

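Suppose

$$y \mid u \sim N(X\beta + Zu, R) \tag{1}$$

$$u \sim N(0, G) \tag{2}$$

For given $G$ and $R$, maximum likelihood estimation leads to minimizing the following over $\beta$ and $u$:

$$(y - X\beta - Zu)^T R^{-1} (y - X\beta - Zu) + u^T G^{-1} u \tag{3}$$

This yields the GLS estimator

$$\hat{\beta} = (X^T V^{-1} X)^{-1} X^T V^{-1} y, \qquad V = Z G Z^T + R, \tag{4}$$

and the best linear unbiased predictor

$$\hat{u} = G Z^T V^{-1} (y - X\hat{\beta}) \tag{5}$$

Substituting these into the likelihood gives a profile likelihood, which is maximized to estimate the variance components in $G$ and $R$ (and hence $V$); the estimates are then plugged back into (4) and (5).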
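To connect the predictor in (5) with the partial-pooling idea, it may help to work out the simplest special case. The following is a sketch under stated assumptions, not part of the original slides: one random intercept per group $j$ with $n_j$ observations, $R = \sigma^2 I$, $G = \tau^2 I$, and a fixed part that reduces to a common mean $\mu$. The symbols $\mu$, $\sigma^2$, $\tau^2$ and $\bar{y}_j$ are introduced here for illustration.

```latex
% Random-intercept special case of (1)-(5):
%   R = \sigma^2 I, G = \tau^2 I, X\beta = \mu 1.
\begin{align*}
y_{ij} &= \mu + u_j + \varepsilon_{ij}, \qquad
  u_j \sim N(0, \tau^2), \quad \varepsilon_{ij} \sim N(0, \sigma^2), \\
\hat{u}_j &= \frac{n_j \tau^2}{n_j \tau^2 + \sigma^2}\,(\bar{y}_j - \hat{\mu})
           = \frac{n_j}{n_j + \sigma^2 / \tau^2}\,(\bar{y}_j - \hat{\mu}).
\end{align*}
```

The multiplier has the form $n_j / (n_j + K)$ with $K = \sigma^2 / \tau^2$, so groups with few observations are shrunk strongly toward the overall mean and large groups are left almost unchanged.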

SLIDE 15

Case study II: Semiparametric models

SLIDE 16

Penalized Splines Introduction

Motivation

Flexible modeling of nonlinear patterns

  • Hard to find a parametric nonlinear model
  • Even if found, hard to estimate

Rely on basis functions (e.g., with two knots $\kappa_1, \kappa_2$), where $(a)_+ = \max(a, 0)$

  • Linear: $1, x, (x - \kappa_1)_+, (x - \kappa_2)_+$
  • Quadratic: $1, x, x^2, (x - \kappa_1)_+^2, (x - \kappa_2)_+^2$
  • Cubic: $1, x, x^2, x^3, (x - \kappa_1)_+^3, (x - \kappa_2)_+^3$
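The basis functions listed above can be generated mechanically for any degree. The sketch below is a hypothetical helper (the name truncated_power_basis and the knot values are assumptions) that evaluates the basis at a single x.

```python
def truncated_power_basis(x, knots, degree=1):
    """Return [1, x, ..., x^p, (x-k1)_+^p, ..., (x-kK)_+^p] for a scalar x."""
    polynomial = [x ** d for d in range(degree + 1)]
    truncated = [max(x - k, 0.0) ** degree for k in knots]
    return polynomial + truncated


knots = [2.0, 4.0]   # two illustrative knots

# Linear basis at x = 3: only the first knot is "active".
linear_row = truncated_power_basis(3.0, knots, degree=1)

# Cubic basis at x = 5: both knots are active.
cubic_row = truncated_power_basis(5.0, knots, degree=3)

print(linear_row)
print(cubic_row)
```

Each truncated term is zero to the left of its knot, so it only changes the shape of the fitted curve to the right of that knot.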

SLIDE 17

Penalized Splines Inference

Penalized splines

With the basis functions, the model can be written as

$$E y_i = \beta_0 + \beta_1 x_i + \sum_{k=1}^{K} u_k (x_i - \kappa_k)_+ \tag{6}$$

Or, using matrix notation,

$$E y = X\beta + Zu \tag{7}$$

where $\beta = (\beta_0, \beta_1)'$, $u = (u_1, \dots, u_K)'$, $X_i = (1, x_i)$ and $Z_i = [(x_i - \kappa_1)_+, \dots, (x_i - \kappa_K)_+]$.

Impose the constraint $u^T u < C$ to avoid a wiggly fit. Using a Lagrange multiplier, this is equivalent to minimizing

$$(y - X\beta - Zu)^T \frac{1}{\sigma_y^2} (y - X\beta - Zu) + u^T \frac{\lambda}{\sigma_y^2} u \tag{8}$$

This is the same as the hierarchical model criterion in (3)!
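The correspondence between (8) and (3) can be spelled out by choosing the covariance matrices explicitly. The following worked step assumes independent, equal-variance errors and spline coefficients; the symbol $\sigma_u^2$ is introduced here and does not appear on the slide.

```latex
% Set R = \sigma_y^2 I and G = \sigma_u^2 I in criterion (3):
\begin{align*}
(y - X\beta - Zu)^T R^{-1} (y - X\beta - Zu) + u^T G^{-1} u
  &= \frac{1}{\sigma_y^2}\,\lVert y - X\beta - Zu \rVert^2
   + \frac{1}{\sigma_u^2}\, u^T u .
\end{align*}
% This matches (8) term by term when \lambda / \sigma_y^2 = 1 / \sigma_u^2,
% that is, \lambda = \sigma_y^2 / \sigma_u^2.
```

The smoothing parameter is therefore a ratio of variance components, which mixed-model software estimates from the data instead of requiring it to be chosen by hand.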

SLIDE 18

Penalized Splines Advantage using P-splines

Why P-splines

The above shows that P-splines can be estimated using hierarchical models, for which much software is available:

  • R: lme4, nlme
  • SAS: PROC MIXED, %GLIMMIX
  • WinBUGS for Bayesian analysis

Further advantages:

  • Compared to the Generalized Additive Model (GAM) fitted with smoothing splines, which uses all data points as knots and penalizes the second derivative, P-splines are much easier to fit
  • Compared to other spline models such as B-splines, the number and positioning of the knots in P-splines are not important, provided that the set of knots is relatively dense with respect to the x values
  • Easy generalization to include parametric components, forming semiparametric models
  • Easy generalization to other spline forms, such as $(x - \kappa_1)_+^p$ and $|x - \kappa_1|^p$

SLIDE 19

Example using P-splines

Claimant loss data

The following data are from Frees (2010). They consist of automobile injury claims from the Insurance Research Council (IRC), and contain the claimant's age, attorney involvement and the economic loss (LOSS, in thousands), among other variables.

    LOSS ATTORNEY SEATBELT CLMAGE
    34,940 1 1 50
    10,892 1 28
    330 1 5
    ...

Fit a regression model on log(LOSS):

                 Estimate  Std. Error  t value  Pr(>|t|)
    (Intercept)    7.2750      0.2884    25.23    0.0000
    CLMAGE         0.0154      0.0022     7.12    0.0000
    ATTORNEY:1     1.3667      0.0741    18.45    0.0000
    SEATBELT:1    -0.9866      0.2787    -3.54    0.0004
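Because the response is log(LOSS), each coefficient acts multiplicatively on the original loss scale. The sketch below converts two of the reported estimates into multiplicative factors, holding the other predictors fixed; the variable names are hypothetical.

```python
import math

# Coefficients copied from the regression output on log(LOSS).
coef = {"CLMAGE": 0.0154, "ATTORNEY:1": 1.3667}

# Factor applied to the typical loss when an attorney is involved.
attorney_factor = math.exp(coef["ATTORNEY:1"])

# Factor applied to the typical loss for a claimant who is 10 years older.
age_factor_10y = math.exp(10 * coef["CLMAGE"])

print(f"Attorney involvement: x{attorney_factor:.2f}")
print(f"Ten more years of age: x{age_factor_10y:.3f}")
```

Attorney involvement is associated with a loss roughly 3.9 times as large, and ten additional years of claimant age with a loss about 17% larger, under the assumed linear effect of age that the following diagnostics call into question.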

SLIDE 20

Example using P-splines

Diagnostics

The model makes sense, but what happens to the residuals?

[Figure: standardized residuals against claimant age.]

SLIDE 21

Example using P-splines

Model improvement

  • Can model the curve with linear splines
  • Knots at seq(5,85,by=5), with estimation using hierarchical models

[Figure: two panels against claimant age: standardized residuals after adding the spline term, and the estimated contribution of claimant age to log loss.]
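To show what the penalty does to a linear spline, here is a self-contained sketch that fits a truncated-line spline by penalized least squares for a fixed smoothing parameter. It is not the mixed-model fit used in the talk: a hierarchical model would estimate the amount of smoothing from the data, whereas here lam is supplied by hand. The data, knots and function names are all hypothetical.

```python
def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z


def fit_linear_pspline(xs, ys, knots, lam):
    """Minimize ||y - X beta - Z u||^2 + lam * u'u; return (beta, u)."""
    # Design rows: [1, x] for beta, truncated lines (x - k)_+ for u.
    C = [[1.0, x] + [max(x - k, 0.0) for k in knots] for x in xs]
    p = 2 + len(knots)
    A = [[sum(row[j] * row[k] for row in C) for k in range(p)] for j in range(p)]
    for k in range(2, p):          # penalize only the spline coefficients
        A[k][k] += lam
    rhs = [sum(row[j] * y for row, y in zip(C, ys)) for j in range(p)]
    theta = solve(A, rhs)
    return theta[:2], theta[2:]


# Hypothetical V-shaped data with a kink at x = 10.
xs = [float(i) for i in range(21)]
ys = [abs(x - 10.0) for x in xs]
knots = [5.0, 10.0, 15.0]

beta_light, u_light = fit_linear_pspline(xs, ys, knots, lam=1e-6)
beta_heavy, u_heavy = fit_linear_pspline(xs, ys, knots, lam=1e6)

print("light penalty u:", [round(v, 3) for v in u_light])
print("heavy penalty u:", [round(v, 6) for v in u_heavy])
```

With a light penalty the coefficient at the middle knot captures the change in slope; with a heavy penalty all spline coefficients are shrunk toward zero and the fit approaches a straight line.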

SLIDE 22

Case study III: Geo-spatial models

SLIDE 23

Geo-spatial models

Introduction

  • The univariate P-spline model can be extended to the multivariate setting, $f(\text{longitude}, \text{latitude})$
  • This can account for spatial dependency and allow spatial interpolation
  • Such an extension is more straightforward when the spline basis is radial, $|x - \kappa_1| \to \|x - \kappa_1\|$, since this distance is invariant to rotation of coordinate systems
  • Selection of knots is harder; one can resort to a space-filling algorithm
  • The computational efficiency gain from using hierarchical models is enormous
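A small sketch of a radial basis in two dimensions, with a numerical check of the rotation invariance mentioned above. It treats coordinates as points in a plane, which is a simplification for longitude and latitude; the function names, knots and point are hypothetical.

```python
import math


def radial_basis_row(point, knots):
    """One row of Z: Euclidean distance from the point to each knot."""
    return [math.dist(point, k) for k in knots]


def rotate(p, angle):
    """Rotate a 2D point about the origin by the given angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1])


knots = [(0.0, 0.0), (3.0, 4.0)]
point = (3.0, 0.0)

row = radial_basis_row(point, knots)

# Rotate the whole coordinate system: point and knots move together.
angle = 0.7
rotated_row = radial_basis_row(rotate(point, angle),
                               [rotate(k, angle) for k in knots])

print(row)
print([round(v, 12) for v in rotated_row])
```

The basis values are unchanged by the rotation, so the fitted surface does not depend on how the coordinate axes happen to be oriented.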

SLIDE 24

Geo-spatial models

Example: California house value

From Pace and Barry (1997). The goal is to predict median house value using predictors such as median income, number of bedrooms and median house age.

    log(value) ~ income + I(income^2) + I(income^3) + log(house.age) + log(rooms) + log(bedrooms) + log(population/households) + log(households)

    Multiple R-squared: 0.6078

Pace and Barry (1997) used a Spatial Autoregressive (SAR) model, in which the $R^2$ improves to 0.8594. Here, we model the spatial dependency through a spline term f(longitude, latitude).

SLIDE 25

Geo-spatial models

Data and knots

Run a space-filling algorithm to select knots (shown in red)
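The slides do not say which space-filling algorithm was used. As one simple possibility, the sketch below uses greedy farthest-point selection: starting from an arbitrary location, it repeatedly adds the data location that is farthest from all knots chosen so far. The function name and the grid of locations are hypothetical.

```python
import math


def farthest_point_knots(points, n_knots):
    """Greedy space-filling selection of n_knots locations from points."""
    knots = [points[0]]                      # arbitrary starting location
    # Distance from every point to its nearest selected knot.
    nearest = [math.dist(p, knots[0]) for p in points]
    while len(knots) < n_knots:
        idx = max(range(len(points)), key=lambda i: nearest[i])
        new_knot = points[idx]
        knots.append(new_knot)
        nearest = [min(d, math.dist(p, new_knot))
                   for d, p in zip(nearest, points)]
    return knots


# Hypothetical locations on a 5 x 5 grid.
points = [(float(x), float(y)) for x in range(5) for y in range(5)]
knots = farthest_point_knots(points, 4)
print(knots)
```

On the regular grid the four selected knots are the corners, which shows how the rule spreads knots across the region instead of clustering them where observations are dense.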

SLIDE 26

Geo-spatial models

Results

  • The bivariate spline model results in a better $R^2$ than OLS:

    lme(log(value)~-1+X,random=pdIdent(~-1+Z))

    Multiple R-squared: 0.8099

  • It also resolves the spatial dependency and allows surface estimation.

[Figure: two maps labeled NoSmoothing and Smoothing, on a common scale from -1.0 to 1.0.]

SLIDE 27

Geo-spatial models

Summary

  • Hierarchical models incorporate actuarial credibility as a compromise between two extremes: complete pooling and no pooling
  • Existing software for hierarchical models can be applied to the inference of penalized splines, so that nonparametric, nonlinear patterns in the underlying insurance data can be readily modeled
  • The multivariate extension of penalized splines can further be applied to model spatial dependencies and perform geo-spatial interpolation
