 
Uplift modeling with survival data

Piotr Rzepakowski
  National Institute of Telecommunications, Warsaw, Poland
Szymon Jaroszewicz
  National Institute of Telecommunications, Warsaw, Poland
  Polish Academy of Sciences, Warsaw, Poland

HI KDD 2014
Introduction
Clinical trials: a key tool of evidence-based medicine
  - treatment group: gets the treatment
  - control group: gets placebo or another treatment
Survival analysis
  - survival time: the target variable
  - censoring: the survival time of some patients is longer than the observed value and is not known exactly
Personalized medicine
  - usually: a statistical test decides whether the treatment is effective overall
  - personalized medicine requires analysis at the level of individuals: for whom does the treatment work best?
Classification on survival data
Building classifiers on survival data:
  - pick a threshold θ
  - patients who survived at least θ: successes
  - remaining ones: failures
  - tricks to handle censoring
Problems
  - control group (placebo) not taken into account
  - censoring causes problems
Main idea of uplift modeling
We can divide patients into four groups:
  1. Survived because of the treatment (the people we want)
  2. Survived, but would have survived anyway (unnecessary side effects and costs)
  3. Did not survive, and the treatment had no impact (unnecessary side effects and costs)
  4. Did not survive because the treatment had a negative impact
Classification cannot discern those groups; uplift modeling can.
Uplift modeling
Machine learning that takes the control group into account
Model the difference between success probabilities in treatment and control:
  $P_T(Y \mid X_1, \ldots, X_m) - P_C(Y \mid X_1, \ldots, X_m)$
Traditional classification predicts only the conditional probability
  $P_T(Y \mid X_1, \ldots, X_m)$
Uplift modeling algorithms
The fundamental problem of causal inference
  - Our knowledge is always incomplete
  - For each training case we know either what happened after the treatment, or what happened when no treatment was given
  - Never both!
  - This makes designing uplift algorithms challenging
Double model approach
  - build separate classifiers on treatment and control
  - subtract their predictions
Dedicated algorithms: try to maximize uplift directly
  - Uplift decision trees
  - Uplift KNN
  - Uplift Random Forest
  - ...
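The double model approach can be sketched in a few lines. This is only an illustration under stated assumptions: the "classifier" is a plain frequency table over a single discrete feature, and the function names and toy data are hypothetical, not taken from the slides.

```python
from collections import defaultdict


def fit_rates(rows):
    """Estimate P(Y = 1 | x) as the success frequency for each feature value x.

    A frequency table stands in for a real classifier (assumption).
    """
    counts = defaultdict(lambda: [0, 0])  # x -> [successes, total]
    for x, y in rows:
        counts[x][0] += y
        counts[x][1] += 1
    return {x: successes / total for x, (successes, total) in counts.items()}


def double_model_uplift(treatment_rows, control_rows):
    """Fit one model per group and subtract the predicted probabilities."""
    p_t = fit_rates(treatment_rows)
    p_c = fit_rates(control_rows)
    # Uplift is only defined for feature values seen in both groups.
    return {x: p_t[x] - p_c[x] for x in p_t.keys() & p_c.keys()}


# Hypothetical toy data: (feature value, success indicator).
treatment = [("young", 1), ("young", 1), ("young", 0), ("old", 0), ("old", 1)]
control = [("young", 1), ("young", 0), ("young", 0), ("old", 1), ("old", 1)]

uplift = double_model_uplift(treatment, control)
for group, value in sorted(uplift.items()):
    print(group, round(value, 3))
```

In this toy data the estimated uplift is positive for "young" and negative for "old", so a treatment rule based on it would treat only the first group.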
Applying uplift modeling to survival data
Our contribution
  Under certain assumptions, uplift models can be applied to survival data directly
Applying uplift modeling to survival data
  - $T^*$: true survival time
  - $C$: the right-censoring time, i.e. the last time the patient was observed in the study
  - $T$: observed survival time
      $T = \min(T^*, C)$    (1)
We are interested in the event $T^* \ge \theta$, but $T^*$ is not available due to censoring
Introduce a class variable:
  $Y = \begin{cases} 1, & \text{if } T \ge \theta, \\ 0, & \text{otherwise.} \end{cases}$
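A minimal sketch of how the class variable is built from observed times. The patient records and the threshold value are made up for illustration; in real data only $T$ is available, and $T^*$ and $C$ are listed here just to show the effect of censoring.

```python
def observed_time(true_time, censor_time):
    """Observed survival time T = min(T*, C), as in equation (1)."""
    return min(true_time, censor_time)


def class_label(observed, theta):
    """Class variable Y: 1 if the observed time reaches the threshold, else 0."""
    return 1 if observed >= theta else 0


# Hypothetical patients as (T*, C) pairs, in days.
patients = [(900.0, 1500.0), (1400.0, 800.0), (300.0, 2000.0), (2500.0, 1200.0)]
theta = 1000.0  # hypothetical threshold

observed = [observed_time(t_star, c) for t_star, c in patients]
labels = [class_label(t, theta) for t in observed]
print(observed)
print(labels)
```

The second patient has $T^* \ge \theta$ but is censored before θ, so the label is 0. This is the distortion that the assumptions on the following slides are meant to handle.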
Applying uplift modeling to survival data
Assumption 1
  The survival time $T^*$ and the censoring time $C$ are conditionally independent given $x$
We have
  $1 - P(Y = 1 \mid x) = P(T^* < \theta \mid x)\,[1 - P(C < \theta \mid x)] + P(C < \theta \mid x)$
Assumption 2
  The censoring times $C$ are independent of the treatment group assignment:
  $P_t(C < \theta \mid x) = P_c(C < \theta \mid x) = P(C < \theta \mid x)$
We get
  $P_t(Y = 1 \mid x) - P_c(Y = 1 \mid x) = [P_t(T^* \ge \theta \mid x) - P_c(T^* \ge \theta \mid x)]\,P(C \ge \theta \mid x)$
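The intermediate steps behind the two formulas can be written out as follows. This is a reconstruction from the definitions on the slides; it assumes that Assumption 1 holds within each of the treatment and control groups.

```latex
\begin{align*}
P(Y = 1 \mid x)
  &= P(T \ge \theta \mid x)
   = P(\min(T^*, C) \ge \theta \mid x) \\
  &= P(T^* \ge \theta,\; C \ge \theta \mid x) \\
  &= P(T^* \ge \theta \mid x)\, P(C \ge \theta \mid x)
     && \text{(Assumption 1)} \\[6pt]
P_t(Y = 1 \mid x) - P_c(Y = 1 \mid x)
  &= P_t(T^* \ge \theta \mid x)\, P_t(C \ge \theta \mid x)
   - P_c(T^* \ge \theta \mid x)\, P_c(C \ge \theta \mid x) \\
  &= \bigl[ P_t(T^* \ge \theta \mid x) - P_c(T^* \ge \theta \mid x) \bigr]\,
     P(C \ge \theta \mid x)
     && \text{(Assumption 2)}
\end{align*}
```

Taking the complement of the third line and writing $P(T^* \ge \theta \mid x) = 1 - P(T^* < \theta \mid x)$ and $P(C \ge \theta \mid x) = 1 - P(C < \theta \mid x)$ gives the expression for $1 - P(Y = 1 \mid x)$ stated under Assumption 1.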
Applying uplift modeling to survival data
  $P_t(Y = 1 \mid x) - P_c(Y = 1 \mid x) = [P_t(T^* \ge \theta \mid x) - P_c(T^* \ge \theta \mid x)]\,P(C \ge \theta \mid x)$
The treatment is beneficial if
  $P_t(T^* \ge \theta \mid x) - P_c(T^* \ge \theta \mid x) > 0$
The factor $P(C \ge \theta \mid x) \ge 0$ does not change the sign (the sign is preserved exactly whenever $P(C \ge \theta \mid x) > 0$)
Conclusion
  Applying uplift modeling to the class variable $Y$ gives a model that makes correct treatment recommendations, even though we used $T$ (censored) instead of $T^*$ (true)
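The identity can be checked numerically with a small simulation. This is a sketch under assumptions that are not in the slides: there are no covariates, survival and censoring times are exponential, and the rates are arbitrary. The censoring rate is the same in both groups (Assumption 2) and censoring is drawn independently of survival (Assumption 1).

```python
import math
import random


def simulate_label_rate(rate_true, rate_censor, theta, n, rng):
    """Fraction of simulated patients with Y = 1, i.e. min(T*, C) >= theta."""
    hits = 0
    for _ in range(n):
        t_star = rng.expovariate(rate_true)   # true survival time T*
        c = rng.expovariate(rate_censor)      # independent censoring time C
        if min(t_star, c) >= theta:
            hits += 1
    return hits / n


rng = random.Random(0)
theta = 1.0
rate_treated, rate_control, rate_censor = 0.5, 1.0, 0.3  # hypothetical rates
n = 100_000

# Uplift estimated from the censored class variable Y.
observed_uplift = (
    simulate_label_rate(rate_treated, rate_censor, theta, n, rng)
    - simulate_label_rate(rate_control, rate_censor, theta, n, rng)
)

# Closed forms for exponential times: P(T* >= theta) = exp(-rate * theta).
true_uplift = math.exp(-rate_treated * theta) - math.exp(-rate_control * theta)
censor_factor = math.exp(-rate_censor * theta)  # P(C >= theta)
predicted_uplift = true_uplift * censor_factor

print(round(observed_uplift, 3), round(predicted_uplift, 3), round(true_uplift, 3))
```

The uplift measured on $Y$ is smaller in magnitude than the true uplift, but it matches the true uplift scaled by $P(C \ge \theta)$ up to sampling noise, and the two have the same sign.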
More on assumptions
  - Assumption 1 is fairly standard
  - Assumption 2 (censoring independent of group assignment) is true in many cases:
      - completely random censoring
      - intention-to-treat analysis with all patients followed until the end of the study
Classification model with survival data
The trick does not work for classification. We have
  $P(T^* \ge \theta \mid x) \ge \eta \iff \dfrac{P(Y = 1 \mid x)}{P(C \ge \theta \mid x)} \ge \eta$
To decide whether the probability of interest is $\ge \eta$, we need an estimate of $P(C \ge \theta \mid x)$
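A small worked example of why thresholding $P(Y = 1 \mid x)$ directly can give the wrong answer. All probabilities here are made-up illustrative values, and the helper name is hypothetical.

```python
def true_prob_from_label_prob(p_label, p_uncensored):
    """Recover P(T* >= theta | x) as P(Y = 1 | x) / P(C >= theta | x)."""
    if p_uncensored <= 0:
        raise ValueError("P(C >= theta | x) must be positive")
    return p_label / p_uncensored


eta = 0.5           # decision threshold on the probability of interest
p_true = 0.6        # hypothetical P(T* >= theta | x)
p_uncensored = 0.7  # hypothetical P(C >= theta | x)

# What a classifier trained on Y would estimate (about 0.42).
p_label = p_true * p_uncensored

naive_decision = p_label >= eta
corrected_decision = true_prob_from_label_prob(p_label, p_uncensored) >= eta
print(naive_decision, corrected_decision)
```

The naive decision says the patient is unlikely to survive past θ, while the corrected one says the opposite. The correction needs an estimate of the censoring distribution, which the uplift formulation avoids because only the sign of a difference matters.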
Colon dataset
  - The colon dataset is available in the survival package in R
  - Two types of treatment and a control:
      - levamisole, a low-toxicity compound
      - levamisole + 5-FU (fluorouracil), a moderately toxic chemotherapy agent
      - no treatment: control group subjected to observation only
  - Two events: recurrence, death
  - We skipped the levamisole + 5-FU treatment (effective for everyone, no uplift)
Experimental methodology
  - Uplift KNN, k = 1
  - Threshold θ set to the median and the 3rd quartile of observed survival times
  - Five-fold cross-validation
  - Simulated survival curve for treatment selection based on the uplift model
  - Compare survival curves at a specific point in time (χ² test)
      - uplift vs. treating all patients
      - uplift vs. treating no patients
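The slides do not spell out how Uplift KNN works internally. One simple reading for k = 1, shown here purely as an assumption, is to take the label of the nearest treated case minus the label of the nearest control case. The Euclidean distance, the function names, and the toy data are all hypothetical.

```python
import math


def nearest_label(x, rows):
    """Label of the training case closest to x (Euclidean distance)."""
    _, label = min(rows, key=lambda row: math.dist(x, row[0]))
    return label


def uplift_1nn(x, treatment_rows, control_rows):
    """Uplift estimate for k = 1: nearest treated label minus nearest control label."""
    return nearest_label(x, treatment_rows) - nearest_label(x, control_rows)


def recommend_treatment(x, treatment_rows, control_rows):
    """Recommend treatment only when the estimated uplift is positive."""
    return uplift_1nn(x, treatment_rows, control_rows) > 0


# Hypothetical training cases: (feature vector, class variable Y).
treatment = [((0.0, 0.0), 1), ((5.0, 5.0), 0)]
control = [((0.5, 0.0), 0), ((5.0, 4.0), 1)]

print(recommend_treatment((0.2, 0.1), treatment, control))
print(recommend_treatment((4.8, 4.6), treatment, control))
```

With k = 1 the estimate can only be -1, 0 or 1, so it acts as a recommendation rule rather than a calibrated probability difference.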
Survival curves (disease recurrence)
[Figure: survival curves for the uplift model, levamisole, and observation. x-axis: survival time [days], 0 to 3000. y-axis: proportion with no recurrence, 0.4 to 1.0.]
Statistical tests (disease recurrence)

                 uplift vs. treatment         uplift vs. control
test type        χ² stat.   p-value           χ² stat.   p-value
recurrence free, θ = 2227 days
  naive          22.514     2.086 · 10^-6     25.450     4.540 · 10^-7
  log-trans.     18.841     1.420 · 10^-5     21.094     4.373 · 10^-6
patient survival, θ = 2324 days
  naive           1.605     2.051 · 10^-1      8.421     3.709 · 10^-3
  log-trans.      1.527     2.166 · 10^-1      7.504     6.156 · 10^-3