SLIDE 1

Uplift modeling with survival data

Piotr Rzepakowski

National Institute of Telecommunications, Warsaw, Poland

Szymon Jaroszewicz

National Institute of Telecommunications, Warsaw, Poland
Polish Academy of Sciences, Warsaw, Poland

HI KDD 2014

Piotr Rzepakowski & Szymon Jaroszewicz (National Institute of Telecommunications, Warsaw), Uplift modeling with survival data, HI KDD 2014, 1 / 17

SLIDE 2

Introduction

Clinical trials - a key tool of evidence based medicine

treatment group – gets the treatment
control group – gets placebo or another treatment

Survival analysis

survival time – target variable
censoring – the survival time of some patients is longer than the observed value and is not known exactly

Personalized medicine

usually: a statistical test to decide if the treatment is effective overall
personalized medicine requires analysis at the level of individuals: for whom the treatment works best

SLIDE 3

Classification on survival data

Building classifiers on survival data:

pick a threshold θ
patients who survived at least θ: successes
remaining ones: failures
tricks to handle censoring

Problems

control group (placebo) not taken into account
censoring causes problems

SLIDE 4

Main idea of uplift modeling

We can divide patients into four groups

1 Survived because of the treatment (the people we want)
2 Survived, but would have survived anyway (unnecessary side effects and costs)
3 Did not survive and the treatment had no impact (unnecessary side effects and costs)
4 Did not survive because the treatment had a negative impact

Classification cannot discern those groups, uplift modeling can

SLIDE 5

Uplift modeling

Machine learning taking the control group into account
Model the difference between success probabilities in treatment and control:
$P^T(Y \mid X_1, \ldots, X_m) - P^C(Y \mid X_1, \ldots, X_m)$
Traditional classification predicts only the conditional probability
$P^T(Y \mid X_1, \ldots, X_m)$
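A minimal sketch of why the difference matters, using made-up success rates for two hypothetical patient segments (the segment names and numbers are illustrative assumptions, not results from the talk). Ranking by the treatment-only probability and ranking by uplift can pick different segments:

```python
# Hypothetical success probabilities per patient segment (illustrative only).
segments = {
    "A": {"p_treat": 0.90, "p_control": 0.88},  # does well with or without treatment
    "B": {"p_treat": 0.60, "p_control": 0.35},  # benefits strongly from treatment
}


def uplift(segment):
    """Difference between treatment and control success probabilities."""
    return segment["p_treat"] - segment["p_control"]


# A classifier trained on the treatment group alone ranks by p_treat.
best_by_classifier = max(segments, key=lambda name: segments[name]["p_treat"])
# An uplift model ranks by the difference between the two groups.
best_by_uplift = max(segments, key=lambda name: uplift(segments[name]))

print(best_by_classifier, best_by_uplift)
```

Segment A has the higher success probability under treatment, yet almost all of it would have happened anyway; segment B is where treating changes the outcome.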

SLIDE 6

Uplift modeling algorithms

The fundamental problem of causal inference

Our knowledge is always incomplete
For each training case we know either
  what happened after the treatment, or
  what happened if no treatment was given
Never both! This makes designing uplift algorithms challenging

Double model approach
  build separate classifiers on treatment and control
  subtract their predictions

Dedicated algorithms: try to maximize uplift directly
  Uplift decision trees
  Uplift KNN
  Uplift Random Forest
  ...
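The double model approach can be sketched in a few lines. The example below assumes a plain k-nearest-neighbour probability estimate as the base classifier and uses toy data; the function names are hypothetical, and this is the two-model baseline, not the dedicated Uplift KNN algorithm listed above.

```python
def knn_success_rate(train, x, k=3):
    """Estimate P(Y = 1 | x) as the mean label of the k nearest training cases.

    train: list of (features, label) pairs, features being tuples of floats.
    """
    by_distance = sorted(
        train,
        key=lambda case: sum((a - b) ** 2 for a, b in zip(case[0], x)),
    )
    nearest = by_distance[:k]
    return sum(label for _, label in nearest) / len(nearest)


def double_model_uplift(treated, control, x, k=3):
    """Fit one model per group and subtract the predicted probabilities."""
    return knn_success_rate(treated, x, k) - knn_success_rate(control, x, k)


# Toy data with a single feature (illustrative only).
treated = [((0.1,), 1), ((0.2,), 1), ((0.3,), 1), ((0.8,), 0), ((0.9,), 1), ((1.0,), 0)]
control = [((0.1,), 0), ((0.2,), 0), ((0.3,), 1), ((0.8,), 1), ((0.9,), 0), ((1.0,), 1)]

uplift_low = double_model_uplift(treated, control, (0.15,))
uplift_high = double_model_uplift(treated, control, (0.95,))
print(uplift_low, uplift_high)
```

In this toy data the predicted uplift is positive for small feature values and negative for large ones, so a recommendation rule would treat only the first kind of patient.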

SLIDE 7

Applying uplift modeling to survival data

Our contribution

Under certain assumptions, uplift models can be applied to survival data directly

SLIDE 8

Applying uplift modeling to survival data

$T^*$ – true survival time
$C$ – the right censoring time, i.e. the last time the patient has been observed in the study
$T$ – observed survival time

$T = \min(T^*, C) \qquad (1)$

We are interested in the event $T^* \ge \theta$, but $T^*$ is not available due to censoring
Introduce a class variable:

$Y = \begin{cases} 1, & \text{if } T \ge \theta, \\ 0, & \text{otherwise.} \end{cases}$
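As an illustration, the class variable can be computed from the observed times alone. The threshold and patient records below are hypothetical; the event indicator is carried along only to show that it plays no role in the definition of Y.

```python
theta = 1000  # hypothetical threshold in days

# (observed time T in days, event observed?) where False means censored.
patients = [(1500, True), (1200, False), (700, True), (400, False)]


def class_label(observed_time, theta):
    """Y = 1 if T >= theta, else 0; only the observed time T is used."""
    return 1 if observed_time >= theta else 0


labels = [class_label(t, theta) for t, _ in patients]
print(labels)
```

The last patient was censored at 400 days and is labelled 0 even though the true survival time may exceed the threshold; the assumptions that follow address exactly this distortion.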

SLIDE 10

Applying uplift modeling to survival data

Assumption 1

The survival time $T^*$ and the censoring time $C$ are conditionally independent given $x$

We have
$1 - P(Y = 1 \mid x) = P(T^* < \theta \mid x)\,[1 - P(C < \theta \mid x)] + P(C < \theta \mid x)$

Assumption 2

The censoring times $C$ are independent of the treatment group assignment:
$P^t(C < \theta \mid x) = P^c(C < \theta \mid x) = P(C < \theta \mid x)$

We get
$P^t(Y = 1 \mid x) - P^c(Y = 1 \mid x) = [P^t(T^* \ge \theta \mid x) - P^c(T^* \ge \theta \mid x)]\,P(C \ge \theta \mid x)$
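The intermediate steps are not spelled out on the slide; one way to fill them in, using only the definitions $T = \min(T^*, C)$ and $Y = 1 \iff T \ge \theta$ together with the two assumptions, is:

```latex
\begin{aligned}
P(Y = 1 \mid x)
  &= P(T^* \ge \theta,\; C \ge \theta \mid x)
     && \text{since } \min(T^*, C) \ge \theta \iff T^* \ge \theta \text{ and } C \ge \theta \\
  &= P(T^* \ge \theta \mid x)\, P(C \ge \theta \mid x)
     && \text{by Assumption 1} \\[4pt]
1 - P(Y = 1 \mid x)
  &= 1 - [1 - P(T^* < \theta \mid x)]\,[1 - P(C < \theta \mid x)] \\
  &= P(T^* < \theta \mid x)\,[1 - P(C < \theta \mid x)] + P(C < \theta \mid x) \\[4pt]
P^t(Y = 1 \mid x) - P^c(Y = 1 \mid x)
  &= P^t(T^* \ge \theta \mid x)\, P^t(C \ge \theta \mid x)
   - P^c(T^* \ge \theta \mid x)\, P^c(C \ge \theta \mid x) \\
  &= [P^t(T^* \ge \theta \mid x) - P^c(T^* \ge \theta \mid x)]\, P(C \ge \theta \mid x)
     && \text{by Assumption 2}
\end{aligned}
```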

SLIDE 12

Applying uplift modeling to survival data

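$P^t(Y = 1 \mid x) - P^c(Y = 1 \mid x) = [P^t(T^* \ge \theta \mid x) - P^c(T^* \ge \theta \mid x)]\,P(C \ge \theta \mid x)$

The treatment is beneficial if
$P^t(T^* \ge \theta \mid x) - P^c(T^* \ge \theta \mid x) > 0$

$P(C \ge \theta \mid x) \ge 0$ does not change the sign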

Conclusion

Applying uplift modeling to the class variable Y gives a model making correct treatment recommendations even though we used T (censored) instead of T* (true)
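A small simulation can make the conclusion concrete. The sketch below assumes exponentially distributed true survival and censoring times with hypothetical rates, independent of each other (Assumption 1) and with the same censoring rate in both groups (Assumption 2). The uplift measured on Y should then be close to the true uplift scaled by P(C ≥ θ), with the same sign as long as P(C ≥ θ) > 0.

```python
import math
import random


def simulate_success_rate(event_rate, censor_rate, theta, n, rng):
    """Fraction of simulated patients with Y = 1, i.e. min(T*, C) >= theta."""
    successes = 0
    for _ in range(n):
        true_time = rng.expovariate(event_rate)     # T*
        censor_time = rng.expovariate(censor_rate)  # C, drawn independently of T*
        if min(true_time, censor_time) >= theta:
            successes += 1
    return successes / n


rng = random.Random(0)
theta, censor_rate = 2.0, 0.3          # hypothetical threshold and censoring rate
rate_treated, rate_control = 0.2, 0.5  # hypothetical event rates of T*

observed_uplift = (
    simulate_success_rate(rate_treated, censor_rate, theta, 200_000, rng)
    - simulate_success_rate(rate_control, censor_rate, theta, 200_000, rng)
)
true_uplift = math.exp(-rate_treated * theta) - math.exp(-rate_control * theta)
p_uncensored = math.exp(-censor_rate * theta)  # P(C >= theta)

print(observed_uplift, true_uplift * p_uncensored)
```

The observed uplift is smaller in magnitude than the true one, but a rule that treats when the estimate is positive makes the same decision either way.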

SLIDE 13

More on assumptions

Assumption 1 is fairly standard
Assumption 2 (censoring independent of group assignment) is true in many cases:
  completely random censoring
  intention-to-treat analysis with all patients followed until the end of study

SLIDE 14

Classification model with survival data

The trick does not work for classification. We have
$P(T^* \ge \theta \mid x) \ge \eta \iff \frac{P(Y = 1 \mid x)}{P(C \ge \theta \mid x)} \ge \eta,$
so to decide whether the probability of interest is $\ge \eta$ we need an estimate of $P(C \ge \theta \mid x)$
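A worked example with made-up numbers shows the problem. Two hypothetical patients have the same true probability of surviving past θ but different censoring probabilities; thresholding P(Y = 1 | x) directly gives different answers, and only dividing by P(C ≥ θ | x) recovers the intended decision.

```python
eta = 0.6  # hypothetical decision threshold

# name -> (P(T* >= theta | x), P(C >= theta | x)); illustrative values only.
patients = {"light censoring": (0.8, 0.9), "heavy censoring": (0.8, 0.5)}

decisions = {}
for name, (p_true, p_uncensored) in patients.items():
    p_y = p_true * p_uncensored             # P(Y = 1 | x) under Assumption 1
    naive = p_y >= eta                      # thresholding Y directly
    corrected = p_y / p_uncensored >= eta   # needs an estimate of P(C >= theta | x)
    decisions[name] = (naive, corrected)

print(decisions)
```

In the uplift setting the same factor multiplies a difference that is compared with zero, which is why it can be ignored there.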

SLIDE 15

Colon dataset

The colon dataset available in the survival package in R
Two types of treatment and control:
  levamisole, a low-toxicity compound
  levamisole + 5-FU (Fluorouracil), a moderately toxic chemotherapy agent
  no treatment, control group subjected to observation only
Two events: recurrence, death
We skipped levamisole + 5-FU treatment (effective for everyone, no uplift)

SLIDE 16

Experimental methodology

Uplift KNN, k = 1
Threshold θ set to median and 3rd quartile of observed survival times
Five-fold cross-validation
Simulated survival curve for treatment selection based on uplift model
Compare survival curves at a specific point in time (χ²-test):
  uplift vs. treating all patients
  uplift vs. treating no patients
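The slides do not give the test formulas. A sketch of one common way to compare two survival curves at a fixed time point is shown below, assuming Kaplan-Meier estimates with Greenwood's variance formula, a "naive" statistic on the survival estimates, and a log-transformed variant. Whether these match the tests used in the experiments is an assumption; the function names and toy data are hypothetical.

```python
import math


def kaplan_meier_at(times, events, t):
    """Kaplan-Meier estimate S(t) and the Greenwood sum of d_i / (n_i (n_i - d_i)).

    times:  observed times T (event or censoring, whichever came first)
    events: 1 if the event was observed, 0 if the case was censored
    """
    survival, greenwood_sum = 1.0, 0.0
    event_times = sorted({u for u, e in zip(times, events) if e and u <= t})
    for event_time in event_times:
        at_risk = sum(1 for u in times if u >= event_time)
        failed = sum(1 for u, e in zip(times, events) if e and u == event_time)
        survival *= 1 - failed / at_risk
        if at_risk > failed:  # the term is undefined once nobody is left at risk
            greenwood_sum += failed / (at_risk * (at_risk - failed))
    return survival, greenwood_sum


def naive_chi2(s1, g1, s2, g2):
    """Squared difference of S(t) over the sum of Greenwood variances S^2 * g."""
    return (s1 - s2) ** 2 / (s1 ** 2 * g1 + s2 ** 2 * g2)


def log_transformed_chi2(s1, g1, s2, g2):
    """Same comparison on log S(t); by the delta method Var(log S) is about g."""
    return (math.log(s1) - math.log(s2)) ** 2 / (g1 + g2)


# Toy data (days); not taken from the colon dataset.
s_a, g_a = kaplan_meier_at([2, 3, 5, 7, 8], [1, 0, 1, 0, 0], t=6)
s_b, g_b = kaplan_meier_at([1, 2, 3, 4, 9], [1, 1, 1, 0, 0], t=6)
print(naive_chi2(s_a, g_a, s_b, g_b), log_transformed_chi2(s_a, g_a, s_b, g_b))
```

Both statistics would be referred to a chi-square distribution with one degree of freedom under the null hypothesis of equal survival at time t.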

SLIDE 17

Survival curves (disease recurrence)

[Figure: survival curves for disease recurrence. x-axis: survival time [days], 500 to 3000; y-axis: proportion with no recurrence, 0.4 to 1.0; curves: Uplift model, Levamisole, Observation]

SLIDE 18

Statistical tests (disease recurrence)

Recurrence free, θ = 2227 days

test type     uplift vs. treatment           uplift vs. control
              χ² stat.   p-value             χ² stat.   p-value
naive         22.514     2.086 · 10^-6       25.450     4.540 · 10^-7
log-trans.    18.841     1.420 · 10^-5       21.094     4.373 · 10^-6

Patient survival, θ = 2324 days

test type     uplift vs. treatment           uplift vs. control
              χ² stat.   p-value             χ² stat.   p-value
naive         1.605      2.051 · 10^-1       8.421      3.709 · 10^-3
log-trans.    1.527      2.166 · 10^-1       7.504      6.156 · 10^-3
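The reported p-values appear consistent with a chi-square distribution with one degree of freedom. That is an inference from the numbers, not something stated on the slide. Under that assumption a p-value can be recomputed from its statistic with the standard library alone:

```python
import math


def chi2_pvalue_1df(stat):
    """Upper-tail p-value of a chi-square statistic with one degree of freedom.

    Uses P(chi2_1 > s) = P(|Z| > sqrt(s)) = erfc(sqrt(s / 2)) for standard normal Z.
    """
    return math.erfc(math.sqrt(stat / 2))


# Statistics copied from the table: naive test, uplift vs. treatment.
for label, stat in [("recurrence free", 22.514), ("patient survival", 1.605)]:
    print(label, chi2_pvalue_1df(stat))
```

At the usual 0.05 level the recurrence comparison is significant, while the survival comparison against treating all patients (p ≈ 0.21) is not.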

SLIDE 19

Summary

Transformation for converting survival data obtained in clinical trials into classification data to which uplift modeling can be applied
Under reasonable assumptions, the recommendations by an uplift model trained on such data remain correct even though observed survival times are used instead of true survival times
Method applied to data from a clinical trial of colon cancer treatment
Applying therapy to patients selected by an uplift model improves the outcome of the therapy in terms of both patient survival and disease recurrence
