BoXHED: Boosted eXact Hazard Estimator with Dynamic covariates

SLIDE 1

BoXHED: Boosted eXact Hazard Estimator with Dynamic covariates

With Donald K.K. Lee (Emory U.), Bobak J. Mortazavi (TAMU), Arash Pakbin (TAMU), Hongyu Zhao (Yale U.)

Xiaochen Wang Yale University

SLIDE 5

Motivation

Dynamic Features in Survival Analyses

  • Longitudinal data from clinical studies
  • High-frequency health vitals in the ICU
  • Mobile data and wearable devices
  • Behavioral data in financial risk assessment

SLIDE 7

Challenges & Our contributions

  • Challenges:
    • ML survival methods mainly focus on time-static features (Ishwaran et al. 08; Ranganath et al. 16; Bellot & van der Schaar 18, 19; Lee et al. 19).
    • Methods that handle dynamic features are scarce:
      • Nonparametric: kernel smoothing, for low-dimensional covariate settings.
      • Parametric: the ‘flexsurv’ R package.
  • Contributions:
    • 1. First publicly available software for boosted hazard estimation with time-dependent features. https://github.com/BoXHED
    • 2. Novel algorithmic implementation of Lee, Chen, Ishwaran, “Boosted nonparametric hazards with time-dependent covariates” (2017).

SLIDE 8

Problem statement

Each participant $i$ is represented by a triplet $(\{X_i(t)\}_{t \in [0, T_i]}, \Delta_i, T_i)$.

  • $X_i(t)$ is a set of continuously monitored features.
  • $\Delta_i$ is a binary event indicator: 1 for an uncensored instance and 0 for a censored instance.
  • $T_i$ is the observed time, i.e.

$$T_i = \begin{cases} \text{event time} & \text{if } \Delta_i = 1 \\ \text{censoring time} & \text{if } \Delta_i = 0 \end{cases}$$

Goal: Given the above information for $n$ participants, we want to estimate the log-hazard function $F(t, x)$.
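As an illustration of the triplet described above, the following Python sketch stores one participant's feature trajectory as a piecewise-constant function of time. The class name `Participant`, its fields, and the last-value-carried-forward lookup are assumptions made for this example; they are not taken from the BoXHED software.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Participant:
    """One triplet (trajectory, event indicator, observed time); names are hypothetical."""
    change_times: List[float]      # sorted times at which the features change; first entry is 0
    values: List[Sequence[float]]  # feature vector in force from each change time onward
    delta: int                     # 1 = event observed (uncensored), 0 = censored
    T: float                       # observed time (event time or censoring time)

    def x(self, t: float) -> Sequence[float]:
        """Return the last recorded feature vector at or before time t."""
        if not 0.0 <= t <= self.T:
            raise ValueError("t must lie in [0, T]")
        return self.values[bisect_right(self.change_times, t) - 1]


# A participant whose single feature (say, SBP) is recorded at times 0, 1.5 and 3.0.
p = Participant(
    change_times=[0.0, 1.5, 3.0],
    values=[[120.0], [135.0], [128.0]],
    delta=1,
    T=4.2,
)
print(p.x(2.0))
```

Carrying the last measurement forward is only one way to turn discrete measurements into a trajectory defined on the whole interval; other interpolation schemes would fit the same interface.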

SLIDE 9

Loss function

  • Loss function: negative log-likelihood.

$$R(F) = \frac{1}{n} \sum_{i=1}^{n} \left[ \int_0^{T_i} e^{F(t, X_i(t))}\,dt - \Delta_i F(T_i, X_i(T_i)) \right]$$

  • Challenge: The likelihood risk $R(F)$ is too complex to be optimized using traditional techniques. A solution is provided in Lee, Chen, Ishwaran 17.
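To make the likelihood risk concrete, the sketch below evaluates it numerically for a user-supplied log-hazard function: for each participant it integrates the exponentiated log-hazard along the trajectory up to the observed time and subtracts the log-hazard at that time if an event was observed. The function name `likelihood_risk`, the scalar-feature trajectories, and the midpoint-rule integration are illustrative assumptions, not the procedure of Lee, Chen, Ishwaran 17.

```python
import math
from typing import Callable, Sequence, Tuple

# (trajectory t -> x_i(t), event indicator delta_i, observed time T_i); hypothetical layout
Sample = Tuple[Callable[[float], float], int, float]


def likelihood_risk(
    F: Callable[[float, float], float],
    samples: Sequence[Sample],
    n_grid: int = 1000,
) -> float:
    """Average over participants of: integral of exp(F) on [0, T_i] minus delta_i * F at T_i."""
    total = 0.0
    for x, delta, T in samples:
        h = T / n_grid
        # Midpoint rule for the integral of exp(F(t, x(t))) over [0, T].
        integral = h * sum(
            math.exp(F((j + 0.5) * h, x((j + 0.5) * h))) for j in range(n_grid)
        )
        total += integral - delta * F(T, x(T))
    return total / len(samples)


# Two toy participants: one with an event at T = 2, one censored at T = 4.
samples = [(lambda t: 1.0, 1, 2.0), (lambda t: 0.5 * t, 0, 4.0)]
print(likelihood_risk(lambda t, x: 0.0, samples))
```

With a log-hazard that is identically zero (hazard 1), the risk reduces to the mean observed time, which gives a quick sanity check on the integration.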

SLIDE 10

Algorithm Overview

SLIDE 17

Constructing the tree $g_m$

Tree Construction Demo

  • Select candidate splits based on percentiles (adjustable).
  • Splits are chosen to minimize $R(F)$. How?
    • 1. A split creates two new sub-regions $A_1$ and $A_2$.
    • 2. Split score:

$$d = \sum_{k=1}^{2} V_k \left(1 + \log\frac{U_k}{V_k}\right) - (V_1 + V_2)\left(1 + \log\frac{U_1 + U_2}{V_1 + V_2}\right),$$

where

$$U_k = \sum_{i=1}^{n} \int_0^{T_i} e^{F_m(t, X_i(t))}\, I_{A_k}(t, X_i(t))\,dt, \qquad V_k = \#\text{ of observed events in } A_k.$$

    • 3. Choose the split that minimizes $d$.
  • Choose subsequent splits to also minimize the split score.

[Figure: trajectory $X_i(t)$ plotted on axes Time and X, with candidate splits on time and feature marked by "+" and the sub-regions $A_1$, $A_2$ and the quantity $U_1$ labeled.]
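One way to read the split score: if a constant $\gamma$ is added to the current log-hazard estimate on a region with exposure total $U$ and event count $V$, the risk changes (up to an additive constant and the $1/n$ factor) by $e^{\gamma} U - \gamma V$, which is smallest at $\gamma = \log(V/U)$ with value $V(1 + \log(U/V))$. The score then compares the best value for the two sub-regions against the best value for the unsplit region. The Python sketch below follows that reading; the function names are hypothetical, and the totals for each sub-region are assumed to have been computed already.

```python
import math


def region_risk(U: float, V: float) -> float:
    """Minimum over gamma of exp(gamma) * U - gamma * V, i.e. V * (1 + log(U / V)).

    Assumes U > 0. When V == 0 the infimum 0 is approached as gamma -> -infinity.
    """
    if V == 0:
        return 0.0
    return V * (1.0 + math.log(U / V))


def split_score(U1: float, V1: float, U2: float, V2: float) -> float:
    """Risk of the two sub-regions minus risk of the merged region (never positive)."""
    return region_risk(U1, V1) + region_risk(U2, V2) - region_risk(U1 + U2, V1 + V2)


# Same event rate on both sides: the split gains nothing.
print(split_score(2.0, 1.0, 4.0, 2.0))
# Very different event rates: the score is clearly negative, so the split helps.
print(split_score(1.0, 3.0, 5.0, 1.0))
```

Under this reading, the candidate with the most negative score is the one that reduces the risk the most, which is why the split minimizing the score is chosen.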

SLIDE 18

Results

  • Simulation data
  • Framingham heart study data
SLIDE 19

Simulation Data

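Four hazard functions (Pérez et al. 13):

$$\lambda_1(t, x_t) = \mathrm{Beta}(t, 2, 2) \times \mathrm{Beta}(x_t, 2, 2), \quad t \in (0, 1];$$

$$\lambda_2(t, x_t) = \mathrm{Beta}(t, 4, 4) \times \mathrm{Beta}(x_t, 4, 4), \quad t \in (0, 1];$$

$$\lambda_3(t, x_t) = \frac{1}{t}\,\frac{\phi(\log t - x_t)}{\Phi(x_t - \log t)}, \quad t \in (0, 5];$$

$$\lambda_4(t, x_t) = \frac{3}{2}\, t^{0.5} \exp\left(-\frac{1}{2}\cos(2\pi x_t) - \frac{3}{2}\right), \quad t \in (0, 5].$$

0, 20, and 40 irrelevant features drawn from the standard normal distribution are added to each of the four hazards above.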
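The four simulation hazards can be written down directly with the Python standard library, as in the sketch below. It assumes that Beta(·, a, b) denotes the Beta density with shape parameters a and b, and that φ and Φ are the standard normal density and distribution function; the function names are chosen for this example only.

```python
import math


def beta_pdf(x: float, a: float, b: float) -> float:
    """Beta(a, b) density at x; zero outside the open interval (0, 1)."""
    if not 0.0 < x < 1.0:
        return 0.0
    const = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return const * x ** (a - 1) * (1.0 - x) ** (b - 1)


def norm_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def hazard_1(t: float, x: float) -> float:
    return beta_pdf(t, 2, 2) * beta_pdf(x, 2, 2)


def hazard_2(t: float, x: float) -> float:
    return beta_pdf(t, 4, 4) * beta_pdf(x, 4, 4)


def hazard_3(t: float, x: float) -> float:
    # Requires t > 0 because of the log and the 1/t factor.
    z = math.log(t) - x
    return norm_pdf(z) / (t * norm_cdf(-z))


def hazard_4(t: float, x: float) -> float:
    return 1.5 * math.sqrt(t) * math.exp(-0.5 * math.cos(2.0 * math.pi * x) - 1.5)


print(hazard_1(0.5, 0.5), hazard_3(1.0, 0.0))
```

The third function has the form of a log-normal hazard whose location parameter is the covariate value, which is consistent with the deck's later remark that flexsurv is correctly specified for it.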

SLIDE 20

Methods

Comparison columns: Can handle time-dependent features? / Nonparametric? / Variable selection / Parameter tuning

  • BoXHED: √ √ √; cross-validated on training data
  • Kernel Smoothing: √ √; kernel bandwidth tuned directly to test data
  • FlexSurv: √ √; best parametric family for test data
  • Black-boost: √; best parametric family and #iterations for test data

SLIDE 23

RMSE

RMSE with 95% confidence interval.

  • The kernel function is a beta density, resembling $\lambda_1$ and $\lambda_2$.
  • flexsurv is correctly specified for $\lambda_3$ (log-normal distribution).

SLIDE 24

Time-dependent AUC

AUC versus time $t$ for the estimators when applied to data simulated from $\lambda_1$. Larger AUC values are better. Left: no irrelevant covariates; right: 20 irrelevant covariates.

SLIDE 25

Framingham heart study data

  • 9,697 participants enrolled by 1975, with event follow-up through 2017.
  • Many features were measured repeatedly in physical exams, roughly every two years.
  • Risk factors: age, gender, systolic blood pressure (SBP), diastolic blood pressure (DBP), total cholesterol (TC), smoking, diabetes, and BMI.

  • Outcome: first occurrence of a CVD event.
SLIDE 26

Relationship between SBP and CVD

Conflicting clinical literature on how SBP affects CVD risk.

  • CVD risk increases with SBP;
  • CVD risk decreases with SBP, and then increases (U-shaped);
  • some more complicated interaction patterns ...

BoXHED identified novel interaction effects that may partially explain these conflicting findings.

SLIDE 27

Estimated hazard by SBP

SLIDE 28

Novel clinical finding

  • Hypothesis: The interaction effects SBP×BMI and SBP×Gender are responsible for the conflicting clinical findings reported on SBP and CVD risk.
  • Validation: The SBP×BMI interaction effect is validated using conventional odds ratio analyses.

SLIDE 29

Conclusions

  • BoXHED is the first publicly available software for boosted hazard estimation that is
    • completely nonparametric
    • able to handle time-dependent features
    • applicable to high-dimensional data
  • It uncovered a novel interaction effect that may explain conflicting findings on CVD risk in the clinical literature.

https://github.com/BoXHED

SLIDE 30

Q&A