Survival analysis, Niels Richard Hansen (Univ. Copenhagen)

SLIDE 1

Survival analysis

Niels Richard Hansen (Univ. Copenhagen) Statistics BI/E lecture March 18, 2009 1 / 8

SLIDE 4

Survival analysis

– or reliability analysis, or simple point process models.

The topic has its own development, with a focus on aspects of models and distributions that differ from many other applications of statistics. This is primarily due to the following two issues:

  • Survival distributions are skewed distributions on the positive half-line. It is the shape of the distribution, rather than its location, that is of interest.
  • There is almost always a censoring mechanism, so certain aspects of the data are missing. We need to deal with this in the modeling.

SLIDE 7

Example I

In medicine we want to test whether a new, promising drug can prolong human life.

We set up a controlled, double-blind experiment with 1000 individuals of age 55 given this drug and a control group of 1000 individuals of age 55 given a placebo (disregarding any ethical considerations at this point). The test runs for 10 years, and approximately 10% of the participants in both groups abandon the experiment without dying; they are the censored observations.

Those who survive for 10 years are all censored at that time, but this is less problematic.
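As a minimal sketch of how such a trial turns into censored observations, each participant can be recorded as a pair (T, Δ): the observed time and an indicator of whether death was observed. The function name `observe` and the three participants are hypothetical and only serve as illustration.

```python
from typing import Optional, Tuple


def observe(lifetime: float, dropout: Optional[float],
            study_length: float = 10.0) -> Tuple[float, int]:
    """Return (T, delta): observed time and event indicator.

    delta = 1 means the death was observed, delta = 0 means censored.
    """
    # Censoring happens at dropout or at the end of the study, whichever is first.
    censor_time = study_length if dropout is None else min(dropout, study_length)
    if lifetime <= censor_time:
        return lifetime, 1
    return censor_time, 0


# Three hypothetical participants: (true lifetime, dropout time), in years.
participants = [
    (4.2, None),   # dies after 4.2 years
    (7.5, 3.0),    # abandons the study after 3 years
    (25.0, None),  # still alive when the study ends
]
observations = [observe(life, drop) for life, drop in participants]
print(observations)  # [(4.2, 1), (3.0, 0), (10.0, 0)]
```

The second and third participants are both censored, but for different reasons: dropout in one case and the planned end of the study in the other.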

SLIDE 9

Example II

In engineering we want to estimate the lifetime of an electrical component. We record whenever a component is put to work and whenever a component fails. At any given time, all components in work that have not yet failed are censored.

Estimating the lifetime from only the observed lifetimes of the components that have failed up to this time gives a biased, too pessimistic result.
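The bias can be illustrated with a small simulation. This is a sketch under assumptions that are not in the slides: lifetimes are exponential with a hypothetical true mean of 5, components are put to work at uniformly random times, and the data are inspected at time 10. For exponential lifetimes, dividing the total time in work by the number of failures is the maximum likelihood estimate of the mean, which is used here as a censoring-aware comparison.

```python
import random

random.seed(1)
TRUE_MEAN = 5.0   # hypothetical true mean lifetime
NOW = 10.0        # time at which the data are inspected
N = 20000         # number of simulated components

failed_lifetimes = []  # lifetimes of components that failed before NOW
total_exposure = 0.0   # time in work, summed over all components

for _ in range(N):
    start = random.uniform(0.0, NOW)               # put to work at a random time
    lifetime = random.expovariate(1.0 / TRUE_MEAN)
    time_in_work = NOW - start
    if lifetime <= time_in_work:
        failed_lifetimes.append(lifetime)          # failure observed
        total_exposure += lifetime
    else:
        total_exposure += time_in_work             # censored at NOW

# Naive estimate: average over the failed components only.
naive_mean = sum(failed_lifetimes) / len(failed_lifetimes)
# Censoring-aware estimate (exponential MLE): exposure per observed failure.
censoring_aware_mean = total_exposure / len(failed_lifetimes)

print(f"true mean:            {TRUE_MEAN:.2f}")
print(f"naive mean:           {naive_mean:.2f}")
print(f"censoring-aware mean: {censoring_aware_mean:.2f}")
```

The naive mean only sees lifetimes short enough to end before the inspection time, so it falls well below the true mean, while the estimate that also counts the time in work of the censored components does not.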

SLIDE 13

Example III

A “real” survival application.

Patients are enrolled in a study whenever they are diagnosed with a given (serious, life-threatening) disease. Data on the subjects are collected, possibly at regular intervals.

At a planned calendar time the statistical analysis is done, and patients alive at this time are censored.

Many questions are of interest, e.g. how different covariates affect survival for this particular disease. One issue may be to compare the effects of two or more treatments on survival.
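With staggered enrollment, the survival time is measured from each patient's own diagnosis date, while censoring is decided by a common calendar date. A minimal sketch of that conversion follows; the analysis date, the function name `follow_up` and the example patients are hypothetical.

```python
from datetime import date
from typing import Optional, Tuple

ANALYSIS_DATE = date(2009, 3, 18)  # hypothetical planned analysis date


def follow_up(diagnosed: date, died: Optional[date],
              analysis: date = ANALYSIS_DATE) -> Tuple[int, int]:
    """Return (days since diagnosis, delta) with delta = 1 if death was observed."""
    if died is not None and died <= analysis:
        return (died - diagnosed).days, 1
    # Alive at the analysis date: censored at that date.
    return (analysis - diagnosed).days, 0


print(follow_up(date(2008, 1, 1), date(2008, 7, 1)))  # (182, 1)
print(follow_up(date(2008, 10, 1), None))             # (168, 0)
```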

SLIDE 16

The Kaplan-Meier estimator

Based on the censored survival observations $(T_i, \Delta_i)$, the Kaplan-Meier estimator is

\[
\hat{S}(t) = \prod_{s \le t} \left( 1 - \frac{\Delta N(s)}{Y(s)} \right).
\]

If $\tau_i$ denotes the time of the $i$'th jump,

\[
\hat{S}(t) = \left( 1 - \frac{1}{Y(\tau_1)} \right) \left( 1 - \frac{1}{Y(\tau_2)} \right) \cdots \left( 1 - \frac{1}{Y(\tau_{N(t)})} \right).
\]

This estimator is the survival analysis version of the empirical distribution function.
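The product formula can be sketched in a few lines of Python. The sketch assumes the usual counting-process reading of the notation, which the slides do not spell out: ΔN(s) is the number of observed events at time s, and Y(s) is the number of individuals still at risk at s (observed time at least s). The function name and the toy data are hypothetical.

```python
from typing import List, Tuple


def kaplan_meier(observations: List[Tuple[float, int]]) -> List[Tuple[float, float]]:
    """Return [(s, S_hat(s))] for each distinct observed event time s."""
    event_times = sorted({t for t, delta in observations if delta == 1})
    survival = 1.0
    curve = []
    for s in event_times:
        # Y(s): individuals whose observed time is at least s.
        at_risk = sum(1 for t, _ in observations if t >= s)
        # Delta N(s): observed (uncensored) events exactly at s.
        events = sum(1 for t, delta in observations if t == s and delta == 1)
        survival *= 1.0 - events / at_risk
        curve.append((s, survival))
    return curve


# Toy data (T_i, Delta_i); Delta_i = 0 marks a censored observation.
data = [(2.0, 1), (3.0, 0), (4.0, 1), (5.0, 1), (6.0, 0)]
for s, surv in kaplan_meier(data):
    print(f"S_hat({s}) = {surv:.3f}")
```

For the toy data the factors are 4/5, 2/3 and 1/2. The observation censored at time 3 contributes no factor of its own, but it shrinks the risk set for the later event times.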

SLIDE 19

The Cox proportional hazards model

With covariates $X_i = (X_{i1}, \ldots, X_{im})^T$ the hazard rate for the $i$'th individual is

\[
\alpha_i(t, X_i) = \alpha_0(t) \exp\left( \sum_{j=1}^{m} \beta_j X_{ij} \right) = \alpha_0(t) \exp\left( X_i^T \beta \right)
\]

for an $m$-dimensional vector $\beta$ of parameters. This is the Cox model.

The parameters are estimated by solving the estimating equation

\[
\sum_{i=1}^{n} \left( X_i - E(\beta, T_i) \right) \Delta_i = 0,
\]

where

\[
E(\beta, t) = \frac{\sum_{k=1}^{n} 1(t \le T_k) X_k \exp(X_k^T \beta)}{\sum_{k=1}^{n} 1(t \le T_k) \exp(X_k^T \beta)}
\]

is evaluated at $t = T_i$.

The theoretical properties of this method are well understood.
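For a single covariate (m = 1) the estimating equation can be solved numerically. The following is a minimal sketch that uses Newton-Raphson iterations; the slides do not say which solver is used, so that choice, the function names and the toy data are assumptions made for illustration. The derivative of the left-hand side with respect to β is minus a sum of weighted variances of the covariate over the risk sets, which is what `info` accumulates.

```python
import math
from typing import List, Tuple

Obs = Tuple[float, int, float]  # (T_i, Delta_i, X_i), one covariate


def score_and_information(beta: float, data: List[Obs]) -> Tuple[float, float]:
    """Left-hand side of the estimating equation and minus its derivative."""
    score, info = 0.0, 0.0
    for t_i, delta_i, x_i in data:
        if delta_i == 0:
            continue  # censored observations contribute only through risk sets
        # Risk set at T_i: individuals k with T_i <= T_k, weighted by exp(X_k * beta).
        risk = [(x, math.exp(x * beta)) for t, _, x in data if t >= t_i]
        s0 = sum(w for _, w in risk)
        s1 = sum(x * w for x, w in risk)
        s2 = sum(x * x * w for x, w in risk)
        mean = s1 / s0                  # E(beta, T_i)
        score += x_i - mean
        info += s2 / s0 - mean ** 2     # weighted variance of X in the risk set
    return score, info


def fit_cox(data: List[Obs], iterations: int = 25) -> float:
    """Newton-Raphson iterations for the one-parameter estimating equation."""
    beta = 0.0
    for _ in range(iterations):
        score, info = score_and_information(beta, data)
        beta += score / info
    return beta


# Hypothetical toy data with a binary covariate (e.g. treatment group).
data = [(1.0, 1, 1.0), (2.0, 1, 0.0), (3.0, 1, 1.0), (4.0, 0, 1.0),
        (5.0, 1, 0.0), (6.0, 1, 1.0), (7.0, 0, 0.0), (8.0, 1, 0.0)]
beta_hat = fit_cox(data)
print(f"beta_hat = {beta_hat:.3f}, hazard ratio = {math.exp(beta_hat):.3f}")
```

Note that the baseline hazard $\alpha_0(t)$ never appears in the code: it cancels in the ratio that defines $E(\beta, t)$, so β can be estimated without specifying it.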

SLIDE 21

Some further topics

  • Diagnostics and residuals. To carry out a serious, practical analysis of survival data it is mandatory to take some steps to verify that the model actually fits the data and to check for outliers and/or highly influential observations. Several types of residuals can be introduced (cf. also generalized linear models), but no single type of residual or plot stands out (to me) as the clear winner. See Chapter 4 in Therneau and Grambsch.
  • The functional forms. Who says that the log-linear effects in the Cox model are the right functional form? One alternative is linear effects in Aalen's additive model, see Martinussen and Scheike; another is transformations/expansions of the covariates in the Cox model, see Chapter 5 in Therneau and Grambsch.
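For comparison, the two functional forms can be written side by side. This is a sketch assuming the standard formulation of Aalen's additive model; the time-varying coefficient functions $\beta_0(t), \ldots, \beta_m(t)$ are notation introduced here and do not appear in the slides.

```latex
\[
\text{Cox: } \alpha_i(t) = \alpha_0(t) \exp\Big( \sum_{j=1}^{m} \beta_j X_{ij} \Big),
\qquad
\text{Aalen: } \alpha_i(t) = \beta_0(t) + \sum_{j=1}^{m} \beta_j(t) X_{ij}.
\]
```

In the Cox model the covariates act multiplicatively on the baseline hazard, whereas in the additive model they add to it.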

SLIDE 23

Some further topics

  • Multiple endpoints and competing risks. A relevant practical issue is to model the dependence structure among several events for the same individual. See Chapter 10 in Martinussen and Scheike or Chapter 8 in Therneau and Grambsch.
  • Time-varying effects. The whole book by Martinussen and Scheike essentially deals with the problem of estimating effects that may change over time. One example is the effect of a treatment, which may be substantial shortly after initiation but diminish over time as the treatment continues.
