Sparsity with multi-type Lasso regularized GLMs, by Sander Devriendt

Sparsity with multi-type Lasso regularized GLMs Sander Devriendt (email: sander.devriendt@kuleuven.be) Joint work with K. Antonio, T. Reynkens, E. Frees, R. Verbelen eRum 2018, Budapest May 15, 2018

Motivation
Claim frequency and claim severity as a function of nominal / numeric ∼ ordinal / spatial features.
Sparse modeling with multi-type variables – Sander Devriendt

Research questions
◮ Generalized Linear Models (GLMs) for frequency (∼ Poisson) and severity (∼ Gamma).
◮ How to:
(1) select variables or features?
(2) cluster (or bin or fuse) levels within a variable? E.g. age groups / postal code clusters / clusters of car models.
◮ Procedure should be data driven, scalable to large (big) data.
◮ End product is interpretable, within actuarial comfort zone.

Research questions rephrased
◮ Generalized Linear Models (GLMs) for frequency (∼ Poisson) and severity (∼ Gamma).
◮ How to:
(1) avoid overfitting with too many variables or levels?
(2) avoid underfitting with a priori binning/selection?

A stepwise solution
Henckaerts, Antonio et al., 2018 (Scandinavian Actuarial Journal)
Stepwise procedure:
1 Do an exhaustive search through the variables to find the best GAM model.
2 Use a well-chosen clustering algorithm to bin the 2D spatial effect. Use evolutionary trees to bin 1D continuous effects and interactions.
3 Fit a GLM with the bins and clusters obtained in the previous steps.
4 R packages: mgcv, classInt, evtree, rpart

[Figure: fitted GAM effects (f̂) next to the corresponding binned GLM coefficients; recoverable axis labels: ageph, power.]

Sparsity with multi-type Lasso regularized GLMs Devriendt, Antonio, Reynkens, Frees, Verbelen, 2018 (in progress)

Regularization
Standard GLM: fit the data as well as possible, no constraint on parameters.
Regularized GLM: tradeoff between fit and interpretability/sparsity/stability, constraint on parameters.

Lasso
◮ Less is more (Hastie, Tibshirani & Wainwright, 2015): a sparse model is easier to estimate and interpret than a dense model.
◮ Regularize (with budget constraint $t$, or regularization parameter $\lambda$):
$$\min_{\beta_0, \beta} \{ -\mathcal{L}(\beta_0, \beta) \} \quad \text{subject to} \quad \|\beta\|_1 \le t,$$
or equivalently
$$\min_{\beta_0, \beta} \Big\{ -\mathcal{L}(\beta_0, \beta) + \lambda \cdot \sum_{j=1}^{p} |\beta_j| \Big\}.$$
This shrinks coefficients and even sets some to zero.
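As a hedged illustration of how the lasso penalty both shrinks coefficients and sets some exactly to zero: in the special case of squared-error loss with an orthonormal design (an assumption made here for simplicity, not the GLM setting of the talk), the lasso solution is the soft-thresholded unpenalized estimate. The vector `z` and the value of `lam` below are made up.

```python
import numpy as np

def soft_threshold(z, lam):
    """Elementwise soft-thresholding: sign(z) * max(|z| - lam, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def lasso_objective(beta, z, lam):
    """0.5 * ||z - beta||^2 + lam * ||beta||_1 (orthonormal-design lasso)."""
    return 0.5 * np.sum((z - beta) ** 2) + lam * np.sum(np.abs(beta))

# Illustrative unpenalized estimates (hypothetical values).
z = np.array([0.9, -0.05, 0.3, -0.6, 0.02])
lam = 0.1

beta_hat = soft_threshold(z, lam)
print(beta_hat)                         # large entries shrink by lam, small ones become zero
print(lasso_objective(beta_hat, z, lam))
print(lasso_objective(z, z, lam))       # objective of the unpenalized estimate, for comparison
```

Entries of `z` smaller than `lam` in absolute value are set to zero, which is the selection effect the slide refers to.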

Lasso visualization
Regularization = limited budget for $\beta_1, \beta_2, \beta_3$.
‘Statistical Learning with Sparsity’ - Hastie et al. (2015)

Lasso plot
Package glmnet.
[Figure: coefficient paths (coordinates of β) against λ, running from overfitting (small λ) to underfitting (large λ).]

Lasso and friends
◮ Adjust lasso regularization to the type of variable:
• Determine type (nominal / numeric ∼ ordinal / spatial);
• Allocate logical penalty.
◮ Thus, for $J$ variables, each with regularization term $P_j(\cdot)$, we want to optimize:
$$-\mathcal{L}(\beta_1, \ldots, \beta_J) + \lambda \cdot \sum_{j=1}^{J} P_j(\beta_j).$$
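To make the idea of a type-specific penalty $P_j$ concrete, the penalties usually paired with each variable type can be written as below. These are the standard forms from the fused-lasso literature, not formulas transcribed from the slides. The level index $i$, the number of levels $p_j$ and the edge set $G_j$ are notation assumed here; taking $G_j$ to be all pairs of levels for a nominal variable, and pairs of neighbouring regions for a spatial variable, is likewise an assumption.

```latex
\begin{align*}
\text{Lasso (selection):} \quad
  & P_j(\beta_j) = \sum_{i=1}^{p_j} |\beta_{j,i}| \\
\text{Fused Lasso (ordinal):} \quad
  & P_j(\beta_j) = \sum_{i=2}^{p_j} |\beta_{j,i} - \beta_{j,i-1}| \\
\text{Generalized Fused Lasso (nominal, spatial):} \quad
  & P_j(\beta_j) = \sum_{(i,l) \in G_j} |\beta_{j,i} - \beta_{j,l}|
\end{align*}
```

Penalizing differences rather than the coefficients themselves is what fuses levels: a zero difference means two levels share one coefficient.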

Lasso and friends: visualization
Different variable type → different penalty budget.
‘Statistical Learning with Sparsity’ - Hastie et al. (2015)

Fused Lasso
Package genlasso.
[Figure: ordinal penalty example; coefficient paths (coordinates of β) for var 1 to var 10 against λ, running from overfitting (small λ) to underfitting (large λ).]

Generalized Fused Lasso
Package genlasso.
[Figure: nominal penalty example; coefficient paths (coordinates of β) for var 1 to var 10 against λ, running from overfitting (small λ) to underfitting (large λ).]
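The ordinal and nominal penalties in the two examples above differ only in which coefficient differences they penalize. A minimal sketch, assuming the penalty is written as $\|D\beta\|_1$ for a difference matrix $D$: the helper names and the coefficient vector are hypothetical, and this is not the genlasso API.

```python
import numpy as np
from itertools import combinations

def fused_matrix(k):
    """(k-1) x k first-difference matrix: row i encodes beta[i+1] - beta[i]."""
    D = np.zeros((k - 1, k))
    for i in range(k - 1):
        D[i, i], D[i, i + 1] = -1.0, 1.0
    return D

def generalized_fused_matrix(k, edges=None):
    """One row per edge (i, l) encoding beta[l] - beta[i]; default: all pairs."""
    if edges is None:
        edges = list(combinations(range(k), 2))
    D = np.zeros((len(edges), k))
    for r, (i, l) in enumerate(edges):
        D[r, i], D[r, l] = -1.0, 1.0
    return D

def penalty(D, beta):
    """L1 norm of the penalized differences D @ beta."""
    return np.sum(np.abs(D @ beta))

# Hypothetical coefficients for a variable with four levels.
beta = np.array([0.0, 0.1, 0.1, 0.4])

ordinal_pen = penalty(fused_matrix(4), beta)              # adjacent levels only
nominal_pen = penalty(generalized_fused_matrix(4), beta)  # all pairs of levels
print(ordinal_pen, nominal_pen)
```

Passing a custom `edges` list (for example pairs of neighbouring regions) gives the graph-based variant one would use for a spatial variable.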

Unified GLM framework with multiple types of penalties
◮ Gertheiss & Tutz (2010) and Oelker & Gertheiss (2017):
• GLMs with various penalties.
• R package available: gvcm.cat (not maintained).
◮ Uses local quadratic approximations of penalties and PIRLS:
• non-exact selection or fusion;
• computationally intensive.

Unified GLM framework with multiple types of penalties
◮ Our contribution:
• implements an efficient algorithm (with proximal operators);
- code bottleneck in C++ (Rcpp)
- efficient linear algebra (RcppArmadillo)
- parallel computations (parallel)
• scalable to big data (splits into smaller sub-problems);
• flexible regularization
- penalty takes type of variable into account;
- works for all popular penalties;
⇒ Package under construction.
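The slides do not show the algorithm itself, so the following is only a sketch of the general idea behind proximal methods, not the package's implementation. It applies plain proximal gradient descent with backtracking to a Poisson GLM with an ordinary lasso penalty, whose proximal operator is soft-thresholding. The simulated data, the function names, leaving the intercept unpenalized and the choice `lam=20.0` are all assumptions made for illustration.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def poisson_nll(beta0, beta, X, y, offset):
    """Poisson negative log-likelihood, dropping the constant log(y!)."""
    eta = offset + beta0 + X @ beta
    with np.errstate(over="ignore", invalid="ignore"):
        return np.sum(np.exp(eta) - y * eta)

def objective(beta0, beta, X, y, offset, lam):
    return poisson_nll(beta0, beta, X, y, offset) + lam * np.sum(np.abs(beta))

def prox_grad_poisson_lasso(X, y, offset, lam, n_iter=500):
    beta0, beta, step = 0.0, np.zeros(X.shape[1]), 1.0
    for _ in range(n_iter):
        resid = np.exp(offset + beta0 + X @ beta) - y   # d(nll)/d(eta)
        g0, g = resid.sum(), X.T @ resid
        f_old = poisson_nll(beta0, beta, X, y, offset)
        while True:  # backtracking: halve the step until sufficient decrease
            new0 = beta0 - step * g0                    # intercept: plain gradient step
            new = soft_threshold(beta - step * g, step * lam)
            d0, d = new0 - beta0, new - beta
            bound = f_old + g0 * d0 + g @ d + (d0 ** 2 + d @ d) / (2.0 * step)
            f_new = poisson_nll(new0, new, X, y, offset)
            if f_new <= bound + 1e-12 * abs(f_old):
                break
            step *= 0.5
        beta0, beta = new0, new
    return beta0, beta

# Simulated data (hypothetical): 5 features, 3 of them irrelevant.
rng = np.random.default_rng(0)
n, p = 500, 5
X = rng.normal(size=(n, p))
exposure = rng.uniform(0.5, 1.0, size=n)
offset = np.log(exposure)
true_beta = np.array([0.5, 0.0, 0.0, -0.3, 0.0])
y = rng.poisson(exposure * np.exp(-1.0 + X @ true_beta))

b0, b = prox_grad_poisson_lasso(X, y, offset, lam=20.0)
print(b0, b)
```

For fused-type penalties the soft-thresholding step would be replaced by the proximal operator of the corresponding penalty, which generally has no closed form and needs its own solver.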

Case study: MTPL data
◮ Frequency (and severity) information for n = 163,234 policyholders.
◮ 14 variables: binary, ordinal and nominal.
◮ Exposure modeled as offset.
◮ Fit Poisson GLM for frequency data with different penalties.
• $N_i \sim \text{Poisson}(\mu_i)$
• $\log(\mu_i) = \log(\text{exposure}_i) + \beta_0 + \sum_{j=1}^{14} X_j \beta_j$
• $O(\beta) = -\mathcal{L}(\beta_0, \beta_1, \ldots, \beta_{14}) + \lambda \cdot \sum_{j=1}^{14} P_j(\beta_j)$
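For reference, the log-likelihood term in the objective can be spelled out for the Poisson model with exposure offset stated above. The expression below is the standard Poisson negative log-likelihood. The slides do not write it out, so the policyholder-specific notation $X_{ij}$ for the (dummy-coded) block of variable $j$ is an assumption.

```latex
\begin{align*}
\mu_i &= \text{exposure}_i \cdot \exp\Big(\beta_0 + \sum_{j=1}^{14} X_{ij} \beta_j\Big), \\
-\mathcal{L}(\beta_0, \beta_1, \ldots, \beta_{14})
  &= \sum_{i=1}^{n} \big( \mu_i - N_i \log \mu_i + \log N_i! \big).
\end{align*}
```

The term $\log N_i!$ does not depend on the parameters and can be dropped when minimizing $O(\beta)$.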

Case study: MTPL data
[Figure: payment frequency parameters against Lambda (log scale, 1 to 10000).]

Case study: MTPL data
[Figure: age parameters; parameter value against Age (20 to 90) at Lambda = 1.]

Case study: MTPL data
◮ Settings:
• Incorporate adaptive (GLM) and standardization weights for better consistency and predictive performance.
• Tune λ with out-of-sample MSE ($\hat{\lambda} = 380$).
◮ Re-estimate the final sparse GLM with standard GLM routines (from 164 to 38 parameters).
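The re-estimation step relies on reading clusters off the penalized fit: levels whose coefficients were fused become one level in the final GLM. A minimal sketch for an ordinal variable, assuming fused neighbours have numerically equal coefficients. The function name, the tolerance and the coefficient values are hypothetical and not taken from the case study.

```python
import numpy as np

def fuse_levels(beta, tol=1e-8):
    """Assign consecutive ordinal levels with equal coefficients to the same cluster id."""
    beta = np.asarray(beta, dtype=float)
    jumps = np.abs(np.diff(beta)) > tol          # True where a new cluster starts
    return np.concatenate(([0], np.cumsum(jumps)))

# Made-up penalized estimates for eight consecutive age levels.
beta_age = np.array([0.4, 0.4, 0.1, 0.1, 0.1, 0.0, 0.0, -0.2])

clusters = fuse_levels(beta_age)
print(clusters)               # cluster id per original level
print(len(set(clusters)))     # number of parameters left after fusion
```

The cluster ids can then be used as a recoded factor when refitting with a standard GLM routine.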

MTPL claim frequency with multiple types of penalties
[Figure: parameter estimates for Age, Power (kW), Bonus-Malus scale and Car age. Shown: GAM fit, penalized GLM fit, GLM refit with new clusters.]

MTPL claim frequency with multiple types of penalties
[Figure: parameter estimates for sex, use, fuel, sport, fleet, monovolume, 4x4, payfreq2, payfreq3, payfreq4, coverage2 and coverage3. Shown: GAM fit, penalized GLM fit, GLM refit with new clusters.]

Wrap-up
◮ Less is more.
◮ Flexible regularization can help predictive modeling.
◮ R package combines general framework with efficient algorithm.
◮ Package and working paper to be finalized.
