Survival analysis: from basic concepts to open research questions

Outline: Basic concepts; Cure models (Introduction, Ongoing research); Dependent censoring (Introduction, Ongoing research); Measurement errors (Introduction, Ongoing research)

⋄ Plugging $\hat S(\cdot)$ into the log-likelihood gives, after some algebra:
$$\log L = \sum_{j=1}^{r} \Big[ d_{(j)} \log h_{(j)} + \big( R_{(j)} - d_{(j)} \big) \log\big( 1 - h_{(j)} \big) \Big]$$
⋄ Using this expression to solve
$$\frac{d \log L}{d h_{(j)}} = 0$$
leads to
$$\hat h_{(j)} = \frac{d_{(j)}}{R_{(j)}}$$
⋄ Plugging this estimate $\hat h_{(j)}$ into $\hat S(t) = \prod_{j : Y_{(j)} \le t} (1 - h_{(j)})$, we obtain:
$$\hat S(t) = \prod_{j : Y_{(j)} \le t} \frac{R_{(j)} - d_{(j)}}{R_{(j)}} = \text{Kaplan-Meier estimator}$$
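The product formula above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation behind the slides: the function name `kaplan_meier` and the toy data are made up for the example, and `R_(j)` is taken to be the number of subjects whose observed time is at least the j-th event time.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(event_time, S_hat)] for right-censored data.

    times  : observed times Y_i
    events : 1 if the event was observed, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    surv, curve = 1.0, []
    for t in sorted(deaths):
        at_risk = sum(1 for y in times if y >= t)   # R_(j)
        d = deaths[t]                               # d_(j)
        surv *= (at_risk - d) / at_risk             # factor (R_(j) - d_(j)) / R_(j)
        curve.append((t, surv))
    return curve


# Toy data (made up): 6 subjects, 2 of them censored.
toy_times = [2, 3, 3, 5, 7, 8]
toy_events = [1, 1, 0, 1, 0, 1]
km_curve = kaplan_meier(toy_times, toy_events)
for t, s in km_curve:
    print(t, round(s, 4))
```

Because the largest observation in the toy data is an event, the estimated curve reaches 0 at the last event time.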

⋄ Step function with jumps at the event times
⋄ If the largest observation, say $Y_n$, is censored:
  • $\hat S(t)$ does not attain 0
  • Impossible to estimate $S(t)$ consistently beyond $Y_n$
  • Various solutions:
    - Set $\hat S(t) = 0$ for $t \ge Y_n$
    - Set $\hat S(t) = \hat S(Y_n)$ for $t \ge Y_n$
    - Let $\hat S(t)$ be undefined for $t \ge Y_n$
⋄ When all data are uncensored, the Kaplan-Meier estimator reduces to the empirical survival function, i.e. one minus the empirical distribution function

Asymptotic normality of the KM estimator
The variance can be consistently estimated by (Greenwood formula)
$$\widehat{\mathrm{Var}}(\hat S(t)) = \hat S^2(t) \sum_{j : Y_{(j)} \le t} \frac{d_{(j)}}{R_{(j)} \big( R_{(j)} - d_{(j)} \big)}$$
Asymptotic normality of $\hat S(t)$:
$$\frac{\hat S(t) - S(t)}{\sqrt{\widehat{\mathrm{Var}}(\hat S(t))}} \xrightarrow{d} N(0, 1)$$

Nelson-Aalen estimator of the cumulative hazard function
Proposed by Nelson (1972) and Aalen (1978):
$$\hat H(t) = \sum_{j : Y_{(j)} \le t} \frac{d_{(j)}}{R_{(j)}} \quad \text{for } t \le Y_{(r)}$$
The estimator is also asymptotically normal
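As a rough check of the Greenwood and Nelson-Aalen formulas, the sketch below accumulates both sums from a list of `(d_(j), R_(j))` pairs. The function name is hypothetical. The three input pairs are the first three rows (n.event, n.risk) of the R `summary()` output for the schizophrenia data in this deck, so the survival and standard-error values can be compared with that output.

```python
import math


def greenwood_and_nelson_aalen(risk_table):
    """risk_table: list of (d_j, R_j) at the ordered event times, with d_j < R_j.

    Returns a list of (S_hat, se_S_hat, H_hat), one entry per event time.
    """
    surv, gw_sum, cum_haz, out = 1.0, 0.0, 0.0, []
    for d, r in risk_table:
        surv *= (r - d) / r              # Kaplan-Meier factor
        gw_sum += d / (r * (r - d))      # term of the Greenwood sum
        cum_haz += d / r                 # Nelson-Aalen increment
        out.append((surv, surv * math.sqrt(gw_sum), cum_haz))
    return out


# (n.event, n.risk) at the first three event times of the schizophrenia data.
rows = greenwood_and_nelson_aalen([(1, 280), (1, 279), (1, 277)])
for s, se, h in rows:
    print(round(s, 3), round(se, 5), round(h, 5))
```

The function requires `d_j < R_j`, because the Greenwood term is undefined when everyone at risk experiences the event (the `NA` row in the R output).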

Point estimate of the mean survival time
⋄ A nonparametric estimator can be obtained using the Kaplan-Meier estimator, since
$$\mu = E(T) = \int_0^\infty t f(t)\,dt = \int_0^\infty S(t)\,dt$$
⇒ We can estimate $\mu$ by replacing $S(t)$ by the KM estimator $\hat S(t)$
⋄ But $\hat S(t)$ is inconsistent in the right tail if the largest observation (say $Y_n$) is censored
  • Proposal 1: assume $Y_n$ experiences the event immediately after the censoring time:
$$\hat\mu_{Y_n} = \int_0^{Y_n} \hat S(t)\,dt$$
  • Proposal 2: restrict integration to a predetermined interval $[0, t_{\max}]$ and consider $\hat S(t) = \hat S(Y_n)$ for $Y_n \le t \le t_{\max}$:
$$\hat\mu_{t_{\max}} = \int_0^{t_{\max}} \hat S(t)\,dt$$
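Both proposals amount to the area under a step function, which can be sketched as follows. The helper name `restricted_mean` and the toy curve are hypothetical. The curve is assumed to be right-continuous, equal to 1 before the first event time and constant after the last one.

```python
def restricted_mean(curve, t_max):
    """Area under a right-continuous survival step function on [0, t_max].

    curve : list of (event_time, S_hat) sorted by time; S_hat is 1 before the
            first event time and keeps its last value afterwards.
    """
    area, prev_t, prev_s = 0.0, 0.0, 1.0
    for t, s in curve:
        if t >= t_max:
            break
        area += prev_s * (t - prev_t)    # rectangle up to the next jump
        prev_t, prev_s = t, s
    return area + prev_s * (t_max - prev_t)


# Hypothetical KM curve whose largest observation (Y_n = 10) is censored.
toy_curve = [(2, 0.8), (5, 0.5), (8, 0.25)]
mu_proposal_1 = restricted_mean(toy_curve, t_max=10)   # integrate up to Y_n
mu_proposal_2 = restricted_mean(toy_curve, t_max=12)   # plateau extended to t_max
print(mu_proposal_1, mu_proposal_2)
```

The two estimates differ only by the area of the plateau between `Y_n` and `t_max`.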

Point estimate of the median survival time
⋄ Advantages of the median over the mean:
  • As the distribution of survival times is often skewed to the right, the mean is often influenced by outliers, whereas the median is not
  • The median can be estimated in a consistent way (if censoring is not too heavy)
⋄ An estimator of the $p$th quantile $x_p$ is given by:
$$\hat x_p = \inf\{ t \mid \hat S(t) \le 1 - p \}$$
⇒ An estimate of the median is given by $\hat x_{p=0.5}$
⋄ The variance of $\hat x_p$ can be estimated by:
$$\widehat{\mathrm{Var}}(\hat x_p) = \frac{\widehat{\mathrm{Var}}(\hat S(x_p))}{\hat f^2(x_p)},$$
where $\hat f$ is an estimator of the density $f$
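Since $\hat S$ is a step function, the infimum in the quantile estimator is attained at an event time. A minimal sketch, with a hypothetical helper name and a made-up curve:

```python
def km_quantile(curve, p):
    """Smallest event time t with S_hat(t) <= 1 - p, or None if never reached.

    curve : list of (event_time, S_hat) sorted by time.
    """
    for t, s in curve:
        if s <= 1 - p:
            return t
    return None   # the curve stays above 1 - p (e.g. heavy censoring)


toy_curve = [(2, 0.8), (5, 0.5), (8, 0.25)]   # hypothetical KM curve
median = km_quantile(toy_curve, 0.5)
q90 = km_quantile(toy_curve, 0.9)
print(median, q90)
```

Returning `None` reflects the caveat above: a quantile can only be estimated if the curve drops to the required level.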

⋄ Estimation of $f$ involves smoothing techniques and the choice of a bandwidth sequence
⇒ We prefer not to use this variance estimator in the construction of a CI
⋄ Thanks to the asymptotic normality of $\hat S(x_p)$:
$$P\left( -z_{\alpha/2} \le \frac{\hat S(x_p) - S(x_p)}{\sqrt{\widehat{\mathrm{Var}}(\hat S(x_p))}} \le z_{\alpha/2} \right) \approx 1 - \alpha,$$
with $S(x_p) = 1 - p$ by definition.
⇒ A $100(1-\alpha)\%$ CI for $x_p$ is given by
$$\left\{ t : -z_{\alpha/2} \le \frac{\hat S(t) - (1 - p)}{\sqrt{\widehat{\mathrm{Var}}(\hat S(t))}} \le z_{\alpha/2} \right\}$$

Example: Schizophrenia patients
⋄ Schizophrenia is one of the major mental illnesses encountered in Ethiopia
  → disorganized and abnormal thinking, behavior and language + emotionally unresponsive
  → higher mortality rates due to natural and unnatural causes
⋄ Project on schizophrenia in Butajira, Ethiopia
  → survey of the entire population (68491 individuals) in the age group 15-49 years
  ⇒ 280 cases of schizophrenia identified and followed for 5 years (1997-2001)

Table: Data on schizophrenia patients

Patid  Time  Censor  Education  Onset  Marital  Gender  Age
    1     1       1          1     37        3       1   44
    2     3       1          3     15        2       2   23
    3     4       1          6     26        1       1   33
    4     5       1         12     25        1       1   31
    5     5       0          5     29        3       1   33
  ...
  278  1787       0          2     16        2       1   18
  279  1792       0          2     23        1       1   25
  280  1794       1          2     28        1       1   35

⋄ In R: survfit
    library(survival)  # provides Surv() and survfit()
    schizo <- read.table("c://...//Schizophrenia.csv", header=TRUE, sep=";")
    KM_schizo_g <- survfit(Surv(Time, Censor) ~ 1, data=schizo,
                           type="kaplan-meier", conf.type="plain")
    # Time goes on the x-axis, the estimated survival probability on the y-axis
    plot(KM_schizo_g, conf.int=TRUE, xlab="Time", ylab="Estimated survival",
         yscale=1)
    mtext("Kaplan-Meier estimate of the survival function for Schizophrenic patients", 3, -3)
    mtext("(confidence interval based on Greenwood formula)", 3, -4)
⋄ In SAS: proc lifetest
    title1 'Kaplan-Meier estimate of the survival function for Schizophrenic patients';
    proc lifetest method=km width=0.5 data=schizo;
      time Time*Censor(0);
    run;

[Figure: Kaplan-Meier estimate of the survival function for schizophrenic patients, with confidence interval based on the Greenwood formula; estimated survival (0 to 1) plotted against time (0 to 1500).]

    > KM_schizo_g
    Call: survfit(formula = Surv(Time, Censor) ~ 1, data = schizo, type = "kaplan-meier", conf.type = "plain")

          n  events  median 0.95LCL 0.95UCL
        280     163     933     766    1099

    > summary(KM_schizo_g)
    Call: survfit(formula = Surv(Time, Censor) ~ 1, data = schizo, type = "kaplan-meier", conf.type = "plain")

     time n.risk n.event survival std.err lower 95% CI upper 95% CI
        1    280       1    0.996 0.00357       0.9894        1.000
        3    279       1    0.993 0.00503       0.9830        1.000
        4    277       1    0.989 0.00616       0.9772        1.000
      ...
     1770     13       1    0.219 0.03998       0.1409        0.298
     1773     12       1    0.201 0.04061       0.1214        0.281
     1784      8       2    0.151 0.04329       0.0659        0.236
     1785      6       2    0.100 0.04092       0.0203        0.181
     1794      1       1    0.000      NA           NA           NA

Proportional hazards models

The semiparametric proportional hazards (PH) model
⋄ Cox, 1972
⋄ Popular regression model in survival analysis
⋄ We will work with semiparametric proportional hazards models, but there also exist parametric variations

Simplest expression of the model
⋄ Case of two treatment groups (Treated vs. Control):
$$h_T(t) = \psi\, h_C(t),$$
with $h_T(t)$ and $h_C(t)$ the hazard functions of the treated and control group
⋄ Proportional hazards model:
  • Ratio $\psi = h_T(t) / h_C(t)$ is constant over time
  • $\psi < 1$ ($\psi > 1$): hazard of the treated group is smaller (larger) than the hazard of the control group at any time
  • Survival curves of the 2 treatment groups can never cross each other

More general expression of the model
⋄ Consider a treatment covariate $x_i$ (0 = control, 1 = treatment) and an exponential relationship between the hazard and the covariate $x_i$:
$$h_i(t) = \exp(\beta x_i)\, h_0(t),$$
with
  • $h_i(t)$: hazard function for subject $i$
  • $h_0(t)$: hazard function of the control group
  • $\exp(\beta) = \psi$: hazard ratio (HR) or relative risk
⋄ Other functional relationships can be used between the hazard and the covariate

More complex model
⋄ Consider a set of covariates $x_i = (x_{i1}, \ldots, x_{ip})^T$ for subject $i$:
$$h_i(t) = h_0(t) \exp(\beta^T x_i),$$
with
  • $\beta$: the $p \times 1$ parameter vector
  • $h_0(t)$: the baseline hazard function (i.e. hazard for a subject with $x_{ij} = 0$, $j = 1, \ldots, p$)
⋄ Proportional hazards (PH) assumption: the ratio of the hazards of two subjects with covariates $x_i$ and $x_j$ is constant over time:
$$\frac{h_i(t)}{h_j(t)} = \frac{\exp(\beta^T x_i)}{\exp(\beta^T x_j)}$$
⋄ Semiparametric PH model: leave the form of $h_0(t)$ completely unspecified and estimate the model in a semiparametric way

Fitting the semiparametric PH model
⋄ Based on likelihood maximization
⋄ As $h_0(t)$ is left unspecified, we maximize a so-called partial likelihood instead of the full likelihood:
$$L(\beta) = \prod_{j=1}^{r} \frac{\exp\big( x_{(j)}^T \beta \big)}{\sum_{k \in R(Y_{(j)})} \exp\big( x_k^T \beta \big)}$$
where
  • $r$: number of observed event times
  • $Y_{(1)}, \ldots, Y_{(r)}$: ordered event times
  • $x_{(1)}, \ldots, x_{(r)}$: corresponding covariate vectors
  • $R(Y_{(j)})$: risk set at time $Y_{(j)}$
⋄ It can be shown that the partial likelihood is actually a profile likelihood, in which the baseline hazard is profiled out.
⋄ This expression is used to estimate $\beta$ through numerical maximization
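To make the numerical maximization concrete, here is a small sketch for a single covariate. It is not the algorithm used by standard software, which typically relies on Newton-Raphson. A plain ternary search is swapped in here because the log partial likelihood is concave in $\beta$. The function names, the search interval $[-5, 5]$ and the toy data are assumptions made for the illustration, and event times are assumed to be untied.

```python
import math


def cox_partial_loglik(beta, times, events, x):
    """Log partial likelihood for a single covariate (no tied event times)."""
    total = 0.0
    for t_j, e_j, x_j in zip(times, events, x):
        if e_j == 1:
            # Sum over the risk set R(Y_(j)): subjects with observed time >= t_j.
            denom = sum(math.exp(beta * x_k)
                        for t_k, x_k in zip(times, x) if t_k >= t_j)
            total += beta * x_j - math.log(denom)
    return total


def fit_beta(times, events, x, lo=-5.0, hi=5.0, iters=200):
    """Ternary search on [lo, hi]; relies on concavity in beta."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if cox_partial_loglik(m1, times, events, x) < cox_partial_loglik(m2, times, events, x):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2.0


# Made-up data: six subjects, one censored, binary covariate.
toy_times = [1, 2, 3, 4, 5, 6]
toy_events = [1, 1, 0, 1, 1, 1]
toy_x = [1, 0, 1, 0, 0, 1]
beta_hat = fit_beta(toy_times, toy_events, toy_x)
print(beta_hat, math.exp(beta_hat))   # estimate and corresponding hazard ratio
```

Censored subjects contribute no factor of their own but remain in the risk sets of earlier event times. This is how the partial likelihood uses the information they carry.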

Inference under the Cox model
⋄ The variance-covariance matrix of $\hat\beta$ can be approximated by the inverse of the information matrix evaluated at $\hat\beta$
  → $\mathrm{Var}(\hat\beta_h)$ can be approximated by $[I(\hat\beta)]^{-1}_{hh}$
⋄ Properties (consistency, asymptotic normality) of $\hat\beta$ are well established (Gill, 1984)
⋄ A $100(1-\alpha)\%$ confidence interval for $\beta_h$ is given by
$$\hat\beta_h \pm z_{\alpha/2} \sqrt{\widehat{\mathrm{Var}}(\hat\beta_h)}$$
⋄ Testing hypotheses of the form
$$H_0 : \beta_1 = \beta_{10} \qquad H_1 : \beta_1 \ne \beta_{10}$$
regarding a subvector $\beta_1$ of $\beta$ can be done using the Wald, score or likelihood-ratio test, exactly as in parametric regression models.

Example: Active antiretroviral treatment cohort study
⋄ CD4 cells protect the body from infections and other types of disease
  → if the count drops below a certain threshold, the patients will die
⋄ As HIV infection progresses, most people experience a gradual decrease in CD4 count
⋄ Highly Active AntiRetroviral Therapy (HAART)
  • AntiRetroviral Therapy (ART) + 3 or more drugs
  • Not a cure for AIDS but greatly improves the health of HIV/AIDS patients
⋄ Data from a study conducted in Ethiopia:
  • 100 individuals older than 18 years and placed under HAART for the last 4 years
  • only use data collected for the first 2 years

Table: Data of HAART Study

Pat ID  Time  Censoring  Gender  Age  Weight  Func. Status  Clin. Status  CD4  ART
     1   699          0       1   42      37             2             4    3    1
     2   455          1       2   30      50             1             3  111    1
     3   705          0       1   32      57             0             3  165    1
     4   694          0       2   50      40             1             3   95    1
     5    86          0       2   35      37             0             4   34    1
   ...
    97   101          0       1   39      37             2             .    .    1
    98   709          0       2   35      66             2             3  103    1
    99   464          0       1   27      37             .             .    .    2
   100   537          1       2   30      76             1             4    1    1

How is survival influenced by gender and age?
⋄ Define agecat = 1 if age < 40 years
                = 2 if age ≥ 40 years
⋄ Define gender = 1 if male
                = 2 if female
⋄ Fit a semiparametric PH model including gender and agecat as covariates:
  • $\hat\beta_{\text{agecat}} = 0.226$ (HR = 1.25)
  • $\hat\beta_{\text{gender}} = 1.120$ (HR = 3.06)
  • Inverse of the observed information matrix:
$$I^{-1}(\hat\beta) = \begin{pmatrix} 0.4645 & 0.1476 \\ 0.1476 & 0.4638 \end{pmatrix}$$
  • 95% CI for $\beta_{\text{agecat}}$: [-1.11, 1.56]
    95% CI for HR of old vs. young: [0.33, 4.77]
  • 95% CI for $\beta_{\text{gender}}$: [-0.21, 2.45]
    95% CI for HR of female vs. male: [0.81, 11.64]
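The intervals above follow from the Wald formula $\hat\beta_h \pm z_{\alpha/2} \sqrt{[I^{-1}(\hat\beta)]_{hh}}$, and exponentiating the endpoints gives the interval for the hazard ratio. The sketch below redoes this arithmetic. The helper name `wald_ci` is hypothetical, and it assumes that the first row and column of the matrix refer to agecat and the second to gender, which is the ordering that reproduces the reported intervals.

```python
import math
from statistics import NormalDist


def wald_ci(beta_hat, variance, level=0.95):
    """Wald interval beta_hat +/- z * sqrt(variance)."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half = z * math.sqrt(variance)
    return beta_hat - half, beta_hat + half


# Estimates and inverse information matrix reported for the HAART data.
inv_info = [[0.4645, 0.1476], [0.1476, 0.4638]]
estimates = {"agecat": (0.226, inv_info[0][0]),   # assumed ordering: agecat first
             "gender": (1.120, inv_info[1][1])}

results = {}
for name, (b, v) in estimates.items():
    lo, hi = wald_ci(b, v)
    results[name] = {"beta_ci": (lo, hi),
                     "hr": math.exp(b),
                     "hr_ci": (math.exp(lo), math.exp(hi))}
    print(name, [round(val, 2) for val in (lo, hi, math.exp(b), math.exp(lo), math.exp(hi))])
```

Both hazard-ratio intervals contain 1, so neither effect is significant at the 5% level in this sample of 100 patients.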

Survival function estimation in the semiparametric model
⋄ Survival function for a subject with covariate $x_i$:
$$S_i(t) = \exp(-H_i(t)) = \exp\big( -H_0(t) \exp(\beta^T x_i) \big) = \big( S_0(t) \big)^{\exp(\beta^T x_i)}$$
with $S_0(t) = \exp(-H_0(t))$ and $H_0(t) = \int_0^t h_0(s)\,ds$
⋄ Estimate the baseline cumulative hazard $H_0(t)$ by
$$\hat H_0(t) = \sum_{j : Y_{(j)} \le t} \frac{d_{(j)}}{\sum_{k \in R(Y_{(j)})} \exp\big( x_k^T \hat\beta \big)}$$
⋄ Define
$$\hat S_i(t) = \big( \hat S_0(t) \big)^{\exp(\hat\beta^T x_i)}, \quad \text{with } \hat S_0(t) = \exp(-\hat H_0(t))$$
⋄ It can be shown that the estimator is asymptotically normal
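A minimal sketch of this baseline cumulative hazard estimator and the resulting survival estimate, for a single covariate. The function names and the toy data are made up. `beta` is passed in as a given value rather than estimated, so that the case `beta = 0` can be checked against the Nelson-Aalen estimator.

```python
import math


def baseline_cumhaz(t, beta, times, events, x):
    """Estimate of H_0(t): sum over event times <= t of d_(j) / sum_{risk set} exp(beta * x_k)."""
    total = 0.0
    for t_j, e_j in zip(times, events):
        if e_j == 1 and t_j <= t:
            denom = sum(math.exp(beta * x_k)
                        for t_k, x_k in zip(times, x) if t_k >= t_j)
            total += 1.0 / denom   # each observed event adds 1, so ties add d_(j)
    return total


def survival_for(t, x_i, beta, times, events, x):
    """S_hat_i(t) = S_hat_0(t) ** exp(beta * x_i)."""
    s0 = math.exp(-baseline_cumhaz(t, beta, times, events, x))
    return s0 ** math.exp(beta * x_i)


# Made-up data: four subjects, one censored, binary covariate.
toy_times = [1, 2, 3, 4]
toy_events = [1, 0, 1, 1]
toy_x = [0, 1, 0, 1]
h0_at_4 = baseline_cumhaz(4, 0.0, toy_times, toy_events, toy_x)
print(h0_at_4, survival_for(3, 1, 0.5, toy_times, toy_events, toy_x))
```

With `beta = 0` every subject has weight 1 in the risk set, so the estimate reduces to the Nelson-Aalen sum of `d_(j) / R_(j)`.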

Example: Survival function estimates for marital status groups in the schizophrenic patients data

[Figure: estimated survival (0 to 1) plotted against time (0 to 2000) for the groups Single, Married and Alone again.]

Consider e.g. survival at 505 days:
  Single group:      0.755   95% CI: [0.690, 0.827]
  Married group:     0.796   95% CI: [0.730, 0.867]
  Alone again group: 0.537   95% CI: [0.453, 0.636]

Checking the proportional hazards assumption
⋄ PH assumption: the hazard ratio between two subjects with different covariates is constant over time
⋄ Diagnostic plots:
  • Consider for simplicity the case of a covariate with $r$ levels
  • Estimate the cumulative hazard function for each level of the covariate by means of the Nelson-Aalen estimator
  ⇒ $\hat H_1(t), \hat H_2(t), \ldots, \hat H_r(t)$ should be constant multiples of each other:

  Plot                                                      PH assumption holds if
  $\log \hat H_1(t), \ldots, \log \hat H_r(t)$ vs. $t$       parallel curves
  $\log \hat H_j(t) - \log \hat H_1(t)$ vs. $t$              constant lines
  $\hat H_j(t)$ vs. $\hat H_1(t)$                            straight lines through origin

Example:

[Figure: four diagnostic panels comparing males and females: cumulative hazard vs. time; log(cumulative hazard) vs. time; log(ratio of cumulative hazards) vs. time; cumulative hazard of females vs. cumulative hazard of males.]

Parametric survival models

Some common parametric distributions
⋄ Exponential distribution: $S_0(t) = \exp(-\lambda t)$
⋄ Weibull distribution: $S_0(t) = \exp(-\lambda t^{\rho})$
⋄ Log-logistic distribution: $S_0(t) = \dfrac{1}{1 + (t\lambda)^{\kappa}}$
⋄ Log-normal distribution: $S_0(t) = 1 - F_N\!\left( \dfrac{\log(t) - \mu}{\sqrt{\gamma}} \right)$

Parametric survival models
The parametric models considered here have two representations:
⋄ Accelerated failure time model (AFT):
$$S_i(t) = S_0\big( \exp(\theta^T x_i)\, t \big),$$
where
  • $\theta = (\theta_1, \ldots, \theta_p)^T$ = vector of regression coefficients
  • $\exp(\theta^T x_i)$ = acceleration factor
  • $S_0$ belongs to a parametric family of distributions
Hence,
$$h_i(t) = \exp(\theta^T x_i)\, h_0\big( \exp(\theta^T x_i)\, t \big)$$

and
$$M_i = \exp(-\theta^T x_i)\, M_0$$
where $M_i$ = median of $S_i$, since
$$S_0(M_0) = \frac{1}{2} = S_i(M_i) = S_0\big( \exp(\theta^T x_i)\, M_i \big)$$
Ex: For one binary variable (say treatment (T) and control (C)), we have $M_T = \exp(-\theta) M_C$:

[Figure: survival functions of the control and treated groups plotted against time (0 to 2), with the medians $M_C$ and $M_T$ marked at survival probability 0.5.]

⋄ Linear model:
$$\log T = \mu + \gamma^T x + \sigma W,$$
where
  • $\mu$ = intercept
  • $\gamma = (\gamma_1, \ldots, \gamma_p)^T$ = vector of regression coefficients
  • $\sigma$ = scale parameter
  • $W$ has a known distribution, that is
    • independent of $x$ (random design)
    • the same for all $x$ (fixed design)
and the mean and variance of $W$ are fixed to identify the model

⋄ These two models are equivalent, if we choose
  • $S_0$ = survival function of $\exp(\mu + \sigma W)$
  • $\theta = -\gamma$
Indeed,
$$S_i(t) = P(T_i > t) = P(\log T_i > \log t) = P(\mu + \sigma W_i > \log t - \gamma^T x_i) = S_0\big( \exp(\log t - \gamma^T x_i) \big) = S_0\big( t \exp(\theta^T x_i) \big)$$
⇒ The two models are equivalent

Special case: the Weibull distribution
⋄ Consider the accelerated failure time model
$$S_i(t) = S_0\big( \exp(\theta^T x_i)\, t \big),$$
where $S_0(t) = \exp(-\lambda t^{\alpha})$ is Weibull
⇒ $S_i(t) = \exp\big( -\lambda \exp(\beta^T x_i)\, t^{\alpha} \big)$ with $\beta = \alpha\theta$
⇒ $f_i(t) = \lambda \alpha t^{\alpha - 1} \exp(\beta^T x_i) \exp\big( -\lambda \exp(\beta^T x_i)\, t^{\alpha} \big)$
⇒ $h_i(t) = \alpha \lambda t^{\alpha - 1} \exp(\beta^T x_i) = h_0(t) \exp(\beta^T x_i)$, with $h_0(t) = \alpha \lambda t^{\alpha - 1}$ the hazard of a Weibull
⇒ We also have a Cox PH model

⋄ The above model is also equivalent to the following linear model:
$$\log T = \mu + \gamma^T x + \sigma W,$$
where $W$ has a standard extreme value distribution, i.e. $S_W(w) = \exp(-e^{w})$. Indeed,
$$P(W > w) = P\big( \exp(\mu + \sigma W) > \exp(\mu + \sigma w) \big) = S_0\big( \exp(\mu + \sigma w) \big) = \exp\big( -\lambda \exp(\alpha\mu + \alpha\sigma w) \big)$$
Since $W$ has a known distribution, we fix $\lambda \exp(\alpha\mu) = 1$ and $\alpha\sigma = 1$ (identifiability constraint), and hence
$$P(W > w) = \exp(-e^{w})$$

⋄ It follows that
    Weibull accelerated failure time model
    = Cox PH model with Weibull baseline hazard
    = Linear model with standard extreme value error distribution
and
  • $\theta = -\gamma = \beta / \alpha$
  • $\alpha = 1 / \sigma$
  • $\lambda = \exp(-\mu / \sigma)$
⋄ Note that the Weibull distribution is the only continuous distribution that can be written as an AFT model and as a PH model
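The parameter mapping can be checked numerically: starting from hypothetical values of $(\mu, \gamma, \sigma)$ for one covariate, the three representations should return the same survival probability. All function names and numbers below are chosen for illustration only.

```python
import math


def weibull_from_linear(mu, gamma, sigma):
    """Map linear-model parameters to (alpha, lam, theta, beta), one covariate."""
    alpha = 1.0 / sigma
    lam = math.exp(-mu / sigma)
    theta = -gamma
    beta = alpha * theta
    return alpha, lam, theta, beta


def surv_aft(t, x, alpha, lam, theta):
    return math.exp(-lam * (math.exp(theta * x) * t) ** alpha)   # S_0(exp(theta x) t)


def surv_ph(t, x, alpha, lam, beta):
    return math.exp(-lam * t ** alpha) ** math.exp(beta * x)     # S_0(t) ^ exp(beta x)


def surv_linear(t, x, mu, gamma, sigma):
    w = (math.log(t) - mu - gamma * x) / sigma
    return math.exp(-math.exp(w))                                # P(W > w), extreme value


mu, gamma, sigma = 1.0, 0.5, 0.8          # hypothetical parameter values
alpha, lam, theta, beta = weibull_from_linear(mu, gamma, sigma)
s_aft = surv_aft(2.0, 1, alpha, lam, theta)
s_ph = surv_ph(2.0, 1, alpha, lam, beta)
s_lin = surv_linear(2.0, 1, mu, gamma, sigma)
print(s_aft, s_ph, s_lin)
```

A positive $\gamma$ (longer log survival time) corresponds to a negative $\beta$ (lower hazard), which is why the signs flip between the linear and PH parametrizations.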

Estimation
⋄ It suffices to estimate the model parameters in one of the equivalent model representations. Consider e.g. the linear model:
$$\log T = \mu + \gamma^T x + \sigma W$$
⋄ The likelihood function for right censored data equals
$$L(\mu, \gamma, \sigma) = \prod_{i=1}^{n} f_i(Y_i)^{\Delta_i}\, S_i(Y_i)^{1 - \Delta_i} = \prod_{i=1}^{n} \left[ \frac{1}{\sigma Y_i} f_W\!\left( \frac{\log Y_i - \mu - \gamma^T x_i}{\sigma} \right) \right]^{\Delta_i} \times \left[ S_W\!\left( \frac{\log Y_i - \mu - \gamma^T x_i}{\sigma} \right) \right]^{1 - \Delta_i}$$
Since $W$ has a known distribution, this likelihood can be maximized w.r.t. its parameters $\mu, \gamma, \sigma$
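As an illustration, the log of this likelihood can be written out for the Weibull case, where $W$ is standard extreme value with $S_W(w) = \exp(-e^{w})$ and hence $f_W(w) = \exp(w - e^{w})$. The sketch handles a single covariate. The function name and data are made up. No optimizer is included, and in practice the function would be passed to a numerical maximizer.

```python
import math


def weibull_loglik(mu, gamma, sigma, y, delta, x):
    """log L(mu, gamma, sigma) for right-censored data, W standard extreme value."""
    total = 0.0
    for y_i, d_i, x_i in zip(y, delta, x):
        w = (math.log(y_i) - mu - gamma * x_i) / sigma
        if d_i == 1:
            # Event: log of (1 / (sigma * y_i)) * f_W(w), with log f_W(w) = w - e^w.
            total += w - math.exp(w) - math.log(sigma * y_i)
        else:
            # Censored: log S_W(w) = -e^w.
            total += -math.exp(w)
    return total


# Made-up data: three subjects, the second one censored, no covariate effect.
toy_y = [1.0, 2.0, 4.0]
toy_delta = [1, 0, 1]
toy_x = [0.0, 0.0, 0.0]
ll = weibull_loglik(0.5, 0.0, 1.0, toy_y, toy_delta, toy_x)
print(ll)
```

With $\sigma = 1$ and $\gamma = 0$ the model is exponential with rate $e^{-\mu}$, and the value agrees with the familiar exponential log-likelihood $d \log\lambda - \lambda \sum_i Y_i$, where $d$ is the number of events.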

⋄ Let
$$(\hat\mu, \hat\gamma, \hat\sigma) = \operatorname{argmax}_{\mu, \gamma, \sigma} L(\mu, \gamma, \sigma)$$
⋄ It can be shown that
  • $(\hat\mu, \hat\gamma, \hat\sigma)$ is asymptotically unbiased and normal
  • The estimators of the accelerated failure time model (or any other equivalent model) and their asymptotic distribution can be obtained from the Delta-method

Part II: Cure models

Introduction to cure models

Introduction
⋄ In classical survival models, we assume that all individuals will experience the event of interest, so
$$\lim_{t \to \infty} S(t) = 0$$
where
$$S(t) = P(T > t)$$
and $T$ is the time until the event of interest occurs.
⋄ This assumption is realistic when studying e.g.
  • Time to death (all causes combined)
  • Time to failure of a machine
  • Time to retirement
  • ...

⋄ However, in many situations, a fraction of the population will never experience the event of interest:
  • Medicine: time until recurrence of a certain disease
  • Economics: time to find a new job after a period of unemployment
  • Demography: time to a second child after a first one
  • Finance: time until a bank goes bankrupt
  • Marketing: time until someone buys a new product
  • Sociology: time until a re-arrest for released prisoners
  • Education: time taken to solve a problem
  • ...

⋄ Two groups of individuals:
  • Cured individuals
  • Susceptible individuals
⋄ The survival function is not proper:
$$\lim_{t \to \infty} S(t) > 0$$
⋄ Cure rate = probability of being cured:
$$1 - p = \lim_{t \to \infty} S(t)$$
⋄ Example: Kaplan-Meier plot of time to distant metastasis for breast cancer patients:

[Figure: survival probability (0 to 1) plotted against time to distant metastasis in days (0 to 5000).]

⇒ Height of the plateau corresponds to $1 - p$

[Figure: example of an exponential model with cure, where the height of the plateau = cure rate = $1 - p$.]

⋄ The binary variable
$$B = I(T < \infty),$$
indicating whether someone is susceptible ($B = 1$) or cured ($B = 0$), is latent
⋄ The observable variables are still $Y$ and $\Delta$ as before, but
  • when $\Delta = 1$, the individual is susceptible
  • when $\Delta = 0$, we don't know whether the individual is susceptible or cured

⋄ Cure models are also called
  • 'split population models' in economics
  • 'limited-failure population life models' in engineering
⋄ How can we know that we need to use a cure model if we cannot distinguish cured observations from censored uncured observations?
  • Informal: 'if we have a long plateau that contains a large number of data points, we can be confident that (almost) all observations in the plateau correspond to cured observations'
  • Context of the study

Is a cure model identified?
Or: how can we know whether a censored observation in the right tail is cured or not cured?
Let
$$S(t) = P(T > t \mid B = 0) P(B = 0) + P(T > t \mid B = 1) P(B = 1) = 1 - p + p S_u(t),$$
where $p = P(B = 1)$ and $S_u(t) = P(T > t \mid B = 1)$ is the (proper) survival function of the susceptibles. The first term equals $1 - p$ because cured individuals never experience the event, so $P(T > t \mid B = 0) = 1$.
Let
  • $F_u = 1 - S_u$
  • $G$ = the censoring distribution
  • $\tau_F$ = the right endpoint of the support of $F$ (for any $F$)
If $\tau_{F_u} \le \tau_G$, then the model is identified!
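The role of the support condition can be illustrated numerically. In the sketch below the latency is assumed to be exponential, so $\tau_{F_u} = \infty$, and follow-up is assumed to stop at a finite time `tau`. On $[0, \tau]$ a different pair $(p', S_u')$ then reproduces exactly the same $S(t)$, so the data cannot tell the two apart. All names and numbers are hypothetical.

```python
import math


def mixture_survival(t, p, rate):
    """S(t) = 1 - p + p * S_u(t), with S_u(t) = exp(-rate * t) assumed exponential."""
    return 1.0 - p + p * math.exp(-rate * t)


def alternative_latency(t, p, rate, p_alt):
    """S_u'(t) chosen so that 1 - p_alt + p_alt * S_u'(t) equals the original S(t)."""
    return (mixture_survival(t, p, rate) - (1.0 - p_alt)) / p_alt


p, rate, p_alt, tau = 0.7, 0.5, 0.9, 2.0   # hypothetical values; follow-up ends at tau
plateau = mixture_survival(1e6, p, rate)   # numerically equal to the cure rate 1 - p
for t in (0.0, 1.0, tau):
    s_orig = mixture_survival(t, p, rate)
    s_alt = 1.0 - p_alt + p_alt * alternative_latency(t, p, rate, p_alt)
    print(t, round(s_orig, 4), round(s_alt, 4))
```

The alternative latency function is a valid survival probability on $[0, \tau]$ but has not reached 0 at $\tau$. This is the situation the condition $\tau_{F_u} \le \tau_G$ rules out.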

Cure regression models
Two main families exist:
⋄ Mixture cure models:
$$S(t \mid x, z) = p(z) S_u(t \mid x) + 1 - p(z),$$
where
  • $X$ and $Z$ are two vectors of covariates
  • $p(z) = P(B = 1 \mid Z = z)$ is the probability of being susceptible (incidence part)
  • $S_u(t \mid x) = P(T > t \mid X = x, B = 1)$ is the (proper) conditional survival function of the susceptibles (latency part)
  → the cure rate is $1 - p(z)$
The model has been proposed by Boag (1949), Berkson and Gage (1952), Farewell (1982)

⋄ Promotion time cure models (also called bounded cumulative hazard models or PH cure models):
$$S(t \mid x) = \exp\{ -\theta(x) F(t) \},$$
where
  • $X$ is the complete vector of covariates
  • $\theta(x)$ captures the effect of the covariates $x$ on the survival function $S(t \mid x)$
  → proportional hazards structure
  → the cure rate is $P(B = 0 \mid X = x) = \exp\{ -\theta(x) \}$
The model has been proposed by Yakovlev et al. (1996)
There also exist models that unify the mixture and the promotion time cure model into one over-arching model

Is it important to account for cure?
Simulate data from a mixture cure model
$$S(t \mid x, z) = p(z) S_u(t \mid x) + 1 - p(z)$$
with
⋄ Incidence: logistic regression model with $Z = (1, Z_1, Z_2)^T$, average cure proportion of 32%
⋄ Latency: exponential model with covariate $X = Z$
⋄ Censoring times follow an exponential distribution, average censoring rate of 34%
⋄ $n = 300$
⋄ For each dataset, we fit
  • a Cox PH model
  • a mixture cure model
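The data-generating mechanism can be sketched as follows. The slides do not give the parameter values or the covariate distributions, so everything numeric below (coefficients, rates, a standard normal $Z_1$, a Bernoulli(0.5) $Z_2$) is an assumption chosen for illustration. It is not tuned to reproduce the 32% cure proportion or the 34% censoring rate.

```python
import math
import random


def simulate_mixture_cure(n, alpha, beta, base_rate, cens_rate, seed=1):
    """Draw (y, delta, z1, z2, cured) from a logistic/exponential mixture cure model."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)                      # assumed covariate distribution
        z2 = float(rng.random() < 0.5)                # assumed covariate distribution
        lin = alpha[0] + alpha[1] * z1 + alpha[2] * z2
        p = 1.0 / (1.0 + math.exp(-lin))              # incidence: P(B = 1 | z)
        susceptible = rng.random() < p
        c = rng.expovariate(cens_rate)                # exponential censoring time
        if susceptible:
            rate = base_rate * math.exp(beta[0] * z1 + beta[1] * z2)
            t = rng.expovariate(rate)                 # exponential latency
        else:
            t = math.inf                              # cured: the event never occurs
        data.append((min(t, c), int(t <= c), z1, z2, not susceptible))
    return data


# Hypothetical parameter values.
sim = simulate_mixture_cure(300, alpha=(0.8, 0.5, -0.5), beta=(0.3, -0.3),
                            base_rate=1.0, cens_rate=0.2)
n_cured = sum(1 for row in sim if row[4])
n_censored = sum(1 for row in sim if row[1] == 0)
print(n_cured / len(sim), n_censored / len(sim))
```

Cured individuals are always censored, so the observed censoring proportion can never fall below the cure proportion. In real data the `cured` flag is of course not observed.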

⇒ Not taking into account the presence of a cure fraction in survival data has important consequences that may lead to wrong conclusions

Examples

Example 1: Breast cancer data
⋄ Time to distant metastasis (in days)
⋄ 286 patients with a lymph-node-negative breast cancer
⋄ Covariates:
  • Age: range = [26-83], median = 52
  • Estrogen receptor status: 0 = ER- (77 pts), 1 = ER+ (209 pts)
  • Size of the tumor: range = [1-4], median = 1
  • Menopausal status: 0 = premenopausal (129 pts), 1 = postmenopausal (157 pts)

→ 179 patients are right-censored, among which 88.3% are censored after the last observed event time
→ strong medical evidence for a fraction of cure in breast cancer relapse

Example 2: Personal loan data
⋄ Data from a U.K. financial institution
⋄ Data used in Stepanova and Thomas (2002), Tong et al. (2012)
⋄ Application information for 7521 loans
⋄ Default observed for 376 out of 7521 observations (5%)

Var number  Description                                      Type
v1          The gender of the customer (1=M, 0=F)            categorical
v2          Amount of the loan                               continuous
v3          Number of years at current address               continuous
v4          Number of years at current employer              continuous
v5          Amount of insurance premium                      continuous
v6          Homephone or not (1=N, 0=Y)                      categorical
v7          Own house or not (1=N, 0=Y)                      categorical
v8          Frequency of payment (1=low/unknown, 0=high)     categorical

Note that
⋄ heavy right censoring
⋄ default will never take place for a large part of the population
  ⇒ $\lim_{t \to \infty} S(t) \ne 0$
  ⇒ we use a mixture cure model
$\Delta_i = 1$ ⇒ the individual is susceptible
$\Delta_i = 0$ ⇒ we do not know whether default will ever take place or not

Loans
  → Default (prob = $p(z)$)
  → No default (prob = $1 - p(z)$)

Mixture cure models
Recall the model:
$$S(t \mid x, z) = p(z) S_u(t \mid x) + 1 - p(z)$$
Incidence:
⋄ models the probability of being susceptible
$$p(z) = P(B = 1 \mid Z = z)$$
⋄ Most often a logistic regression model:
$$p(z) = \frac{\exp(z^T \alpha)}{1 + \exp(z^T \alpha)}$$
Latency:
⋄ models the conditional survival function of the susceptibles
$$S_u(t \mid x) = P(T > t \mid X = x, B = 1)$$
  • parametric model
  • Cox PH model
  • AFT model, ...

Fully parametric model
Ex: Logistic/Weibull model (Farewell, 1982)
⋄ Conditional survival function of the uncured:
$$S_u(t \mid x) = \exp\big( -( \lambda e^{\beta^T x} )\, t^{\rho} \big)$$
with $\lambda > 0$ the scale parameter and $\rho > 0$ the shape parameter.
⋄ Maximum likelihood estimation:
  • Numerical optimization, e.g., Newton-Raphson
  • Variance of the estimators via the inverse of the observed information matrix

Logistic / Cox PH model
⋄ Conditional survival function of the uncured:
$$S_u(t \mid x) = S_u(t)^{\exp(x^T \beta)}$$
with the baseline survival function $S_u(t)$ left unspecified.
⋄ The PH assumption remains valid for the susceptibles but is no longer valid at the level of the population
  ⇒ The partial likelihood approach developed for the Cox PH model cannot be used
⋄ Several approaches have been proposed:
  • Approaches based on the marginal likelihood
  • Approaches based on the EM algorithm

Other mixture cure models
⋄ Logistic / semiparametric AFT models
⋄ Other link functions in the incidence: probit, complementary log-log link function
⋄ Flexible semiparametric models, e.g.
  • Cox model for latency and single-index structure in the incidence: $p(z) = g(\gamma^T z)$ where $g(\cdot)$ is unspecified
  • Logistic regression for incidence and non-parametric model in the latency
⋄ Non-parametric mixture cure models

Promotion time cure models
⋄ Also called bounded cumulative hazard model or PH cure model
⋄ Introduced by Yakovlev et al. (1996) and formally proposed by Tsodikov (1998)
⋄ Idea: since, in the presence of cure, the survival function is improper, the idea is to 'bound' the cumulative hazard function
$$H(t) = \theta F(t)$$
with $F(\cdot)$ a proper distribution function and $\theta > 0$
In this way
$$\lim_{t \to \infty} H(t) = \theta$$

⋄ If θ depends on covariates, the (improper) survival function is then given by

$$S(t \mid x) = \exp\{-\theta(x) F(t)\}$$

where
• x is the complete vector of covariates (with an intercept)
• θ(x) captures the effect of the covariates x on the survival function S(t | x)

⋄ This formulation has a proportional hazards structure
⋄ This model has a specific biological interpretation (leading to the name 'promotion time model')
⋄ Usually, $\theta(x) = \exp(\beta^T x)$, and F is unspecified
⋄ The probability of being cured is $\lim_{t \to \infty} S(t \mid x) = \exp\{-\theta(x)\}$, so the probability of being susceptible is $1 - \exp\{-\theta(x)\}$
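
The following sketch evaluates the promotion time survival function and its plateau for one covariate vector. The exponential choice for F and all numerical values are assumptions made only for illustration; the slides leave F unspecified.

```python
import math


def theta(x, beta):
    """theta(x) = exp(beta^T x); x is assumed to include a leading 1 (intercept)."""
    return math.exp(sum(b * xi for b, xi in zip(beta, x)))


def promotion_time_survival(t, x, beta, F):
    """Improper survival function S(t | x) = exp(-theta(x) * F(t))."""
    return math.exp(-theta(x, beta) * F(t))


def plateau(x, beta):
    """Limit of S(t | x) as t -> infinity, i.e. exp(-theta(x))."""
    return math.exp(-theta(x, beta))


def F_exp(t):
    """Hypothetical proper distribution function (unit exponential)."""
    return 1.0 - math.exp(-t)


beta = (0.2, -0.5)   # illustrative coefficients (intercept, slope)
x = (1.0, 1.0)
curve = [promotion_time_survival(t, x, beta, F_exp) for t in (0.0, 1.0, 5.0, 50.0)]
```

The curve starts at 1 and decreases towards `plateau(x, beta)` rather than towards 0, which is what makes the survival function improper.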

References

⋄ Book :
• Maller and Zhou (1996)
⋄ Review papers :
• Peng and Taylor (2014)
• Amico and Van Keilegom (2018)

The focused information criterion for a mixture cure model

(joint with Gerda Claeskens)

Proportional hazards mixture cure model

We consider the model

$$S(t \mid x, z) = p(z)\, S_u(t \mid x) + 1 - p(z)$$

where

⋄ survival function of the uncured : proportional hazards model, i.e.

$$S_u(t \mid x) = S_u(t)^{\exp(x^T \beta)} = \exp\{-\exp(x^T \beta)\, H_u(t)\}$$

where $S_u(\cdot)$ and $H_u(\cdot)$ are the baseline survival and baseline cumulative hazard function of the susceptibles

⋄ incidence (probability of being uncured) : logistic model, i.e.

$$p(z) = \frac{\exp(z^T \alpha)}{1 + \exp(z^T \alpha)} \quad \text{or} \quad \log\left(\frac{p(z)}{1 - p(z)}\right) = z^T \alpha$$

The data consist of iid vectors $(X_i, Z_i, Y_i, \Delta_i)$, $i = 1, \ldots, n$, with

$$Y_i = \min(T_i, C_i), \quad \Delta_i = I(T_i \le C_i),$$

and $C_i$ is independent of $T_i$ given $(X_i, Z_i)$.

Maximum likelihood estimation :

The likelihood under the PH mixture cure model is given by

$$L_n(\alpha, \beta, H) = \prod_{i=1}^n \Big[\pi(Z_i^T \alpha)\, H\{Y_i\}\, e^{X_i^T \beta}\, e^{-H(Y_i) \exp(X_i^T \beta)}\Big]^{\Delta_i} \times \Big[1 - \pi(Z_i^T \alpha) + \pi(Z_i^T \alpha)\, e^{-H(Y_i) \exp(X_i^T \beta)}\Big]^{1 - \Delta_i},$$

where $\pi(t) = \exp(t) / [1 + \exp(t)]$ and $H\{Y_i\}$ denotes the jump of H at $Y_i$.
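
As a sketch of how this likelihood can be evaluated, the function below computes log L_n for given α, β and a step function H that jumps only at the observed event times. Representing H as a dictionary of jumps, and the tuple layout of the data, are assumptions made for this illustration.

```python
import math


def ph_cure_loglik(alpha, beta, jumps, data):
    """Log-likelihood of the PH mixture cure model for a step function H.

    alpha, beta : coefficient tuples for the logistic and Cox parts
    jumps       : dict {event time: jump H{t} > 0}; must contain every
                  observed event time as a key (assumed representation)
    data        : iterable of (y, delta, x, z), x and z being tuples
    """
    event_times = sorted(jumps)
    ll = 0.0
    for y, delta, x, z in data:
        lin_z = sum(a * zi for a, zi in zip(alpha, z))
        pi = 1.0 / (1.0 + math.exp(-lin_z))
        exb = math.exp(sum(b * xi for b, xi in zip(beta, x)))
        # H(y): sum of all jumps at event times up to and including y
        H_y = sum(jumps[t] for t in event_times if t <= y)
        if delta == 1:
            # pi * H{y} * exp(x'beta) * exp(-H(y) * exp(x'beta))
            ll += math.log(pi) + math.log(jumps[y]) + math.log(exb) - H_y * exb
        else:
            # 1 - pi + pi * exp(-H(y) * exp(x'beta))
            ll += math.log(1.0 - pi + pi * math.exp(-H_y * exb))
    return ll


toy = [(1.0, 1, (0.0,), (1.0,)), (2.0, 0, (0.0,), (1.0,))]
value = ph_cure_loglik((0.0,), (0.0,), {1.0: 0.5}, toy)
```

Maximizing this over α, β and the jumps simultaneously is the hard part in practice; the sketch only shows the objective, not an optimization scheme.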

Define

$$(\hat\alpha, \hat\beta, \hat H_u) = \arg\max_{\alpha, \beta, H} L_n(\alpha, \beta, H).$$

Asymptotic properties of $(\hat\alpha, \hat\beta, \hat H_u)$ have been established by Fang, Li and Sun (2005) and Lu (2007) :

$$n^{1/2}\big(\hat H_u(\cdot) - H_u(\cdot)\big) \Rightarrow \text{Gaussian process}$$

and

$$n^{1/2}\big(\hat\alpha - \alpha, \hat\beta - \beta\big) \stackrel{d}{\to} \text{Multivariate normal}$$

(for the case where the model is correctly specified)

Variable selection in a mixture cure model

The parameters in the model are
⋄ α : for the logistic model on the incidence π(·)
⋄ β, $H_u(\cdot)$ : for the Cox PH model on the survival function $S_u(\cdot \mid \cdot)$

Suppose we are interested in a certain quantity

$$\mu = \mu(\alpha, \beta, H_u(\cdot)),$$

which we call the focus.

Of interest : Variable selection in order to estimate the focus µ as well as possible (in the MSE sense).

Literature on variable selection for mixture cure models :
⋄ Scolas et al (2016) (using Lasso)
⋄ Dirick et al (2015) (using AIC)

Examples :

⋄ Personalized prediction of the (unconditional) survival of a given patient (or for given values of x and z) :

$$S(t \mid x, z) = p(z)\, S_u(t \mid x) + 1 - p(z)$$

⋄ Personalized prediction of the (unconditional) risk :

$$h(t \mid x, z) = \frac{p(z)\, f_u(t \mid x)}{p(z)\, S_u(t \mid x) + 1 - p(z)}$$

⋄ Mean or median survival time for given values of x and z (conditional or unconditional)
⋄ Probability of being cured for given z : 1 − p(z)
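
The first two focus quantities can be sketched numerically as follows. The exponential latency with rate 1.65 and the value p = 0.8 are placeholder assumptions, chosen only so that the functions can be evaluated.

```python
import math


def population_survival(t, p, s_u):
    """S(t | x, z) = p * S_u(t | x) + 1 - p, with p = p(z)."""
    return p * s_u(t) + 1.0 - p


def population_hazard(t, p, s_u, f_u):
    """h(t | x, z) = p * f_u(t | x) / (p * S_u(t | x) + 1 - p)."""
    return p * f_u(t) / population_survival(t, p, s_u)


rate = 1.65   # hypothetical constant hazard of the uncured
p = 0.8       # hypothetical probability of being uncured


def s_u(t):
    return math.exp(-rate * t)


def f_u(t):
    return rate * math.exp(-rate * t)


surv_late = population_survival(100.0, p, s_u)        # close to the plateau 1 - p
haz_early = population_hazard(0.0, p, s_u, f_u)       # equals p * rate at t = 0
haz_late = population_hazard(5.0, p, s_u, f_u)        # decays towards 0
```

Although the uncured have a constant hazard here, the population hazard decreases over time because the risk set becomes dominated by cured subjects, which is one way to see why the PH assumption fails at the population level.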

How to do variable selection ?

Note that

⋄ Incorporating the full vectors x and z will lead to a full model with a large variance but a smaller bias as compared to a narrow model that leaves out all components of x and z, resulting in a large bias but a smaller variance.

⋄ One could construct intermediate model selection scenarios where some of the components of x and z are protected (i.e. forced to be present in all models). The unprotected variables take part in the model selection step. For simplicity, we ignore this division and assume that all components of x and z are unprotected.

Focused Information Criterion (FIC)

General idea : the 'best' model depends on the focus and is selected by minimizing the MSE of the estimator of the focus.

References : Claeskens and Hjort (2008), Cambridge.

Some notation : In each submodel we estimate the focus µ by maximizing the semiparametric likelihood introduced before, and we define

$$\hat\mu_{S_1, S_2} = \mu\big(\hat\alpha_{S_1, S_2}, \hat\beta_{S_1, S_2}, \hat H_{u S_1, S_2}(\cdot)\big),$$

where $S_1$ is the subset of $\{1, \ldots, p\}$ (logistic) and $S_2$ is the subset of $\{1, \ldots, q\}$ (Cox PH) that indicates which components of x and z are present in the considered model.

Define

$$(\hat S_1, \hat S_2) = \arg\min_{S_1, S_2} \mathrm{FIC}(S_1, S_2) = \arg\min_{S_1, S_2} \widehat{\mathrm{MSE}}(\hat\mu_{S_1, S_2})$$

In order to be able to calculate the MSE of each submodel, we need to make an assumption regarding the true model.

We work with local misspecification :

⋄ The true cumulative hazard is

$$H_{u,\mathrm{true}}(t \mid x) = H_{0u}(t) \exp\{x^T(\beta_0 + b / \sqrt{n})\},$$

⋄ The true logistic model is

$$\mathrm{logit}\{p_{\mathrm{true}}(z)\} = z^T(\alpha_0 + a / \sqrt{n}),$$

where $\alpha_0$ and $\beta_0$ are known, and a and b do not depend on the sample size n.

Asymptotic theory

Define

$$\big\langle \hat\alpha - \alpha_0, \hat H_u - H_{0u}, \hat\beta - \beta_0 \big\rangle(g) = \int_0^\tau g_1(t)\, d(\hat H_u - H_{0u})(t) + g_2^T\big(\hat\alpha - \alpha_0, \hat\beta - \beta_0\big),$$

where $g = (g_1(\cdot), g_2)$.

Note that

⋄ If $g = (0, e_k)$, then
$$\big\langle \hat\alpha - \alpha_0, \hat H_u - H_{0u}, \hat\beta - \beta_0 \big\rangle(g) = k\text{-th component of } (\hat\alpha - \alpha_0, \hat\beta - \beta_0)$$

⋄ If $g = (I(\cdot \le t), 0)$, then
$$\big\langle \hat\alpha - \alpha_0, \hat H_u - H_{0u}, \hat\beta - \beta_0 \big\rangle(g) = \hat H_u(t) - H_{0u}(t)$$

Note that

$$U\Big(H_{0u}, \alpha_0 + \frac{a}{\sqrt{n}}, \beta_0 + \frac{b}{\sqrt{n}}\Big) = 0 \quad \text{and} \quad U_n(\hat H_u, \hat\alpha, \hat\beta) = 0,$$

where

$$U_n(\Gamma)(g) = U_n(H_u, \alpha, \beta)(g) = U_{n1}(\Gamma)(g_1) + U_{n2}(\Gamma)(g_2) = \text{score operator}$$

and $U(\Gamma) = E\, U_n(\Gamma)$, where the expected value is with respect to the true model.

For any submodel $(S_1, S_2)$,

$$n^{1/2} \big\langle \hat\alpha_{S_1, S_2} - \alpha_0, \hat H_{u S_1, S_2} - H_{0u}, \hat\beta_{S_1, S_2} - \beta_0 \big\rangle$$

converges weakly to a Gaussian process G with covariance function

$$\mathrm{Cov}\big(G(g), G(\tilde g)\big) = \int_0^\tau \sigma_1\big(\sigma^{-1}_{S_1, S_2}(g), 0\big)(t)\; \sigma^{-1}_{S_1, S_2 (1)}(\tilde g)(t)\, dH_0(t) + \big(\sigma^{-1}_{S_1, S_2 (2)}(\tilde g), 0\big)^T \sigma_2\big(\sigma^{-1}_{S_1, S_2}(g), 0\big),$$

and with mean function

$$E\, G(g) = B_1\big(\sigma^{-1}_{S_1, S_2 (1)}(g)\big) + B_2\big(\sigma^{-1}_{S_1, S_2 (2)}(g), 0\big).$$

Note that

⋄ If $g = (0, e_k)$, we get the asymptotic normality of the k-th component of $n^{1/2}(\hat\alpha_{S_1, S_2} - \alpha_0, \hat\beta_{S_1, S_2} - \beta_0)$

⋄ If $g = (I(\cdot \le t), 0)$, we get the asymptotic normality of $n^{1/2}(\hat H_{u S_1, S_2}(t) - H_{0u}(t))$

Hence,

$$n^{1/2}(\hat\mu_{S_1, S_2} - \mu_0) \stackrel{d}{\to} N\big(\mathrm{Bias}(\mu, S_1, S_2, a, b),\ \mathrm{Var}(\mu, S_1, S_2)\big).$$

Estimation of $\mathrm{Bias}(\mu, S_1, S_2, a, b)$ and $\mathrm{Var}(\mu, S_1, S_2)$ :

⋄ Variance : plug-in estimation of the asymptotic variance
⋄ Bias : based on $\hat a = n^{1/2} \hat\alpha_{\mathrm{Full}}$ and $\hat b = n^{1/2} \hat\beta_{\mathrm{Full}}$

Hence,

$$\mathrm{FIC}(S_1, S_2) = \widehat{\mathrm{MSE}}(\hat\mu_{S_1, S_2}).$$

This result can now be used to select the best model for µ by minimizing $\mathrm{FIC}(S_1, S_2)$ over all possible submodels.
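
The selection step itself is a simple argmin once bias and variance estimates are available for each submodel. The sketch below assumes that these estimates have already been computed by some other routine and uses the standard decomposition MSE = bias² + variance; the submodel labels and numbers are made up for illustration and are not results from the talk.

```python
def select_by_fic(estimates):
    """Pick the submodel with the smallest estimated MSE.

    estimates : dict {(S1, S2): (bias_hat, var_hat)} on the n^{1/2} scale;
                the common factor 1/n does not change the argmin.
    Returns the best submodel label and the dict of FIC values.
    """
    fic = {model: bias ** 2 + var for model, (bias, var) in estimates.items()}
    best = min(fic, key=fic.get)
    return best, fic


# Hypothetical inputs: S1 lists covariates in the logistic part,
# S2 lists covariates in the Cox part.
estimates = {
    (("z",), ("x",)): (0.4, 1.2),            # small bias, moderate variance
    (("x", "z"), ("x", "z")): (0.0, 2.0),    # full model: no bias, large variance
    (("z",), ("z",)): (1.5, 0.5),            # misses x in the Cox part: large bias
}
best_model, fic_values = select_by_fic(estimates)
```

Because the bias term depends on the focus µ, running the same selection for a different focus can return a different submodel.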

Simulations

Only preliminary simulation results ...

Consider the following Cox/logistic cure model :

$$S(t \mid x, z) = p(z)\, S_u(t \mid x) + 1 - p(z)$$

where

⋄ $X, Z \sim \mathrm{Unif}[-1, 1]$
⋄ $p(z) = \dfrac{\exp(\alpha_0 + \alpha_1 z)}{1 + \exp(\alpha_0 + \alpha_1 z)}$, with $\alpha_0 = \alpha_1 = 2$
⋄ $S_u(t \mid x) = [\exp(-1.65\, t)]^{\exp(\beta_1 x)}$, with $\beta_1 = 2$
⋄ $C \sim \mathrm{Exp}(\text{mean} = 1.7)$

Then, % cure = 0.2 and % censoring = 0.4
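
A data set from this design can be generated along the following lines. The sketch assumes that X and Z are drawn independently of each other and that censoring is independent of everything else, which the slide does not state explicitly; the function name and the extra `uncured` flag are illustrative.

```python
import math
import random


def simulate_cure_data(n, seed=0):
    """Draw n observations (y, delta, x, z, uncured) from the Cox/logistic design."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        z = rng.uniform(-1.0, 1.0)
        p = 1.0 / (1.0 + math.exp(-(2.0 + 2.0 * z)))   # alpha0 = alpha1 = 2
        uncured = rng.random() < p
        c = rng.expovariate(1.0 / 1.7)                  # censoring, mean 1.7
        if uncured:
            # S_u(t | x) = exp(-1.65 * exp(2 x) * t): exponential with this rate
            t = rng.expovariate(1.65 * math.exp(2.0 * x))
        else:
            t = math.inf                                # cured: event never occurs
        rows.append((min(t, c), 1 if t <= c else 0, x, z, uncured))
    return rows


rows = simulate_cure_data(20000, seed=1)
cure_fraction = sum(1 for r in rows if not r[4]) / len(rows)
censoring_fraction = sum(1 for r in rows if r[1] == 0) / len(rows)
```

For a large sample, `cure_fraction` and `censoring_fraction` should land near the rounded values 0.2 and 0.4 quoted on the slide.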

Focus parameters : $\mu_j = H_{0u}(t)$ for t = 1st, 2nd or 3rd quartile of the baseline survival function (j = 1, 2, 3)

9 candidate models (1 = covariate included, 0 = excluded) :

Model   Logistic   Cox       Estimated MSE (× 10³)     True MSE (× 10³)
        X   Z      X   Z     µ1      µ2      µ3        µ1      µ2      µ3
1       1   1      1   1     1.62    7.30    29.9      1.56    6.45    36.0
2       1   1      1   0     1.57    6.73    26.1      1.57    6.13    33.6
3       1   1      0   1     22.2    20.2    37.4      25.0    27.5    56.7
4       1   0      1   1     1.72    8.31    36.8      1.78    8.16    50.2
5       1   0      1   0     1.55    6.66    25.9      1.63    6.59    35.9
6       1   0      0   1     15.5    9.50    68.8      17.6    14.2    83.5
7       0   1      1   1     1.50    6.62    26.9      1.42    5.80    34.7
8       0   1      1   0     1.47    6.26    24.3      1.43    5.54    32.4
9       0   1      0   1     12.4    5.46    100.1     14.1    10.9    124.2

The true model is model 8.

Model   Logistic   Cox       FIC model selection prob.
        X   Z      X   Z     µ1      µ2      µ3
1       1   1      1   1     0.08    0.01    0.02
2       1   1      1   0     0.07    0.02    0.10
3       1   1      0   1     0.00    0.05    0.11
4       1   0      1   1     0.01    0.00    0.00
5       1   0      1   0     0.18    0.09    0.20
6       1   0      0   1     0.00    0.19    0.12
7       0   1      1   1     0.20    0.05    0.07
8       0   1      1   0     0.46    0.20    0.33
9       0   1      0   1     0.00    0.39    0.05

Data analysis

Personal loan data :

⋄ Data from a U.K. financial institution
⋄ Data used in Stepanova and Thomas (2002), Tong et al. (2012)
⋄ Application information for 7521 loans
⋄ Default observed for 376 out of 7521 observations (5%)

Var number   Description                                     Type
v1           The gender of the customer (1=M, 0=F)           categorical
v2           Amount of the loan                              continuous
v3           Number of years at current address              continuous
v4           Number of years at current employer             continuous
v5           Amount of insurance premium                     continuous
v6           Homephone or not (1=N, 0=Y)                     categorical
v7           Own house or not (1=N, 0=Y)                     categorical
v8           Frequency of payment (1=low/unknown, 0=high)    categorical

Note that

⋄ heavy right censoring
⋄ default will never take place for a large part of the population

⇒ $\lim_{t \to \infty} S(t) \neq 0$
⇒ we use a mixture cure model

$\Delta_i = 1$ ⇒ the individual is susceptible
$\Delta_i = 0$ ⇒ we do not know whether default will ever take place or not

Loans
• Default : prob = p(z)
• No default : prob = 1 − p(z)

⋄ 7521 observations and 8 variables
⋄ Default observed for 376 out of 7521 observations
⋄ 2 parameter vectors (α and β), empty models excluded : $(2^8 - 1) \times (2^8 - 1) = 65025$ FICs to calculate !
⋄ Focus : probability of cure 1 − p(z) at z = median(Z)

Part                  v1  v2  v3  v4  v5  v6  v7  v8
Cure rate             1   1   1   1   1   0   0   1
Survival of uncured   1   0   1   0   1   1   1   1
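
The size of the search space can be checked by enumerating the candidate models directly. This is only a counting sketch; the variable names v1 to v8 follow the data description, and the helper name is illustrative.

```python
from itertools import combinations


def nonempty_subsets(items):
    """Yield every non-empty subset of items as a tuple."""
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            yield combo


variables = [f"v{i}" for i in range(1, 9)]

# Every non-empty subset may enter the logistic part, and independently
# every non-empty subset may enter the Cox part.
subsets = list(nonempty_subsets(variables))
n_models = len(subsets) ** 2
```

Each of these pairs requires fitting a semiparametric model and evaluating one FIC value, which explains why the exhaustive search is computationally demanding.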

Conclusions

⋄ We considered a proportional hazards mixture cure model, and developed the asymptotic distribution of the estimators of the model components under local misspecification of the model.

⋄ This asymptotic distribution can then be used to select the best variables to estimate a certain quantity (focus) in the model via FIC minimization.

Part III : Dependent censoring

Introduction to dependent censoring

Introduction

⋄ Random right censoring assumes that the survival time (T) and the censoring time (C) are independent

⋄ We observe

$$Y = \min(T, C) \quad \text{and} \quad \Delta = I(T \le C),$$

so we observe either T or C, but not both

⇒ Relation between T and C not identifiable in general (see Tsiatis, 1975)
⇒ Relation between T and C needs to be specified in order to identify the model
⇒ Independence assumption is the most natural assumption, and holds true in many contexts

Independence of T and C is satisfied if

⋄ Administrative censoring : individuals alive at the end of the study are censored
⇒ Censoring is unrelated to survival time
⇒ Independence assumption makes sense

⋄ Censoring happens for other reasons that are completely unrelated to the event of interest
E.g. in medical studies, patients might move, die because of a car accident, etc.

⋄ Many other contexts

Independence of T and C might be doubtful if

⋄ Medical studies : Patients may withdraw from the study
• because their condition is deteriorating or because they are showing side effects which need alternative treatments (positive relation between T and C)
• because their health condition has improved and so they no longer follow the treatment (negative relation between T and C)

⋄ Unemployment studies : Unemployed people with low chances on the job market could decide to go abroad to improve their chances, leading to censoring times that depend on the duration of unemployment
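
A small simulation can illustrate why a positive relation between T and C matters for the Kaplan-Meier estimator. The shared-frailty mechanism used here to create dependence is an assumption chosen purely for illustration, not a model from the talk: T and C are exponential given a frailty W ~ Exp(1), so that the true marginal survival of T is S(t) = 1/(1 + t).

```python
import random


def kaplan_meier(pairs, t):
    """Kaplan-Meier estimate of S(t) from (y, delta) pairs."""
    pts = sorted(pairs)
    at_risk = len(pts)
    surv = 1.0
    i = 0
    while i < len(pts) and pts[i][0] <= t:
        y = pts[i][0]
        events = 0
        leaving = 0
        # Group tied observation times
        while i < len(pts) and pts[i][0] == y:
            events += pts[i][1]
            leaving += 1
            i += 1
        surv *= 1.0 - events / at_risk
        at_risk -= leaving
    return surv


def simulate(n, dependent, seed=0):
    """T | W ~ Exp(rate W); C uses the same W (dependent) or a fresh one."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        w_t = max(rng.expovariate(1.0), 1e-12)   # guard against a zero rate
        w_c = w_t if dependent else max(rng.expovariate(1.0), 1e-12)
        t = rng.expovariate(w_t)
        c = rng.expovariate(w_c)
        pairs.append((min(t, c), 1 if t <= c else 0))
    return pairs


true_s = 1.0 / (1.0 + 1.0)   # S(1) under the assumed frailty model
km_indep = kaplan_meier(simulate(20000, dependent=False, seed=1), 1.0)
km_dep = kaplan_meier(simulate(20000, dependent=True, seed=1), 1.0)
```

In this design the estimate under independent censoring stays close to `true_s`, whereas under the shared frailty it tends to sit clearly above it: subjects with short survival times are also censored early, so those who remain at risk are healthier than the population as a whole.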

Survival analysis : from basic concepts to open research questions. Ecole dt, Villars-sur-Ollon, 2-5 September 2018. Ingrid Van Keilegom, ORSTAT, KU Leuven.
