Who needs the Cox model anyway

Bendix Carstensen
Steno Diabetes Center Copenhagen, Gentofte, Denmark
http://BendixCarstensen.com
SDC Epi and Biostat Network, 11 March 2020
Thursday 12th March, 2020, 10:38

The dogma [1]
◮ do not condition on the future: indisputable
◮ do not count people after they are dead: disputable
◮ stick to this world: expandable

[1] P. K. Andersen and N. Keiding. Interpretability and importance of functionals in competing risks and multistate models. Stat Med, 31:1074–1088, 2012.

(further) dogma for “sticking to this world”
◮ rates are continuous in time (and “smooth”)
◮ rates may depend on more than one time scale
◮ ... which timescales is an empirical question
◮ But first we look at the machinery for modeling simple occurrence rates from follow-up studies (mortality, incidence, ...)

◮ In follow-up studies we estimate rates from:
  ◮ D: events, deaths
  ◮ Y: person-years
  ◮ $\hat{\lambda} = D / Y$: rates
  ◮ ... the empirical counterpart of the intensity (an estimate)
◮ Rates differ between persons.
◮ Rates differ within persons:
  ◮ by age
  ◮ by calendar time
  ◮ by disease duration
  ◮ ...
◮ Multiple timescales: later

Representation of follow-up data
A cohort or follow-up study records events and risk time.
The outcome (response) is thus bivariate: (d, y)
Follow-up data for each individual must therefore have (at least) three pieces of information recorded:

Date of entry     entry    date variable
Date of exit      exit     date variable
Status at exit    event    indicator (mostly 0/1)

From representation to likelihood
◮ Target is estimates of occurrence rates (mortality rates, incidence rates)
◮ ... and how these depend on covariates
◮ If we assume that the mortality, λ, is constant over time, then the log-likelihood from one person is based on (d, y):
  ◮ d: event, 0 or 1 (event)
  ◮ y: risk time (exit − entry)

  ℓ(λ) = d log(λ) − λy

◮ This formula is not derived here; see note on website
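The slide defers the derivation to a note on the website. A brief sketch of the standard argument, assuming a constant rate λ over the whole risk time y (the symbols S, f and L are introduced here for illustration only):

```latex
\begin{align*}
S(y) &= P(\text{no event during risk time } y) = e^{-\lambda y}, \\
f(y) &= \lambda\, e^{-\lambda y} \quad \text{(density of an event at the end of risk time } y\text{)}, \\
L(\lambda) &= f(y)^{d}\, S(y)^{1-d} = \lambda^{d}\, e^{-\lambda y}, \\
\ell(\lambda) &= \log L(\lambda) = d \log(\lambda) - \lambda y .
\end{align*}
```

A person who exits without an event (d = 0) contributes only the survival term, while a person with an event (d = 1) contributes the additional factor λ.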

[Diagram: follow-up from t0 to tx with outcome (d, y), split at t1 and t2 into risk times y1, y2, y3]

Probability                          log-Likelihood
P(d at tx | entry t0)                d log(λ) − λy
= P(surv t0 → t1 | entry t0)         = 0 log(λ) − λy1
× P(surv t1 → t2 | entry t1)         + 0 log(λ) − λy2
× P(d at tx | entry t2)              + d log(λ) − λy3

[Diagram: the same follow-up, ending at tx without an event (d = 0)]

Probability                          log-Likelihood
P(surv t0 → tx | entry t0)           0 log(λ) − λy
= P(surv t0 → t1 | entry t0)         = 0 log(λ) − λy1
× P(surv t1 → t2 | entry t1)         + 0 log(λ) − λy2
× P(surv t2 → tx | entry t2)         + 0 log(λ) − λy3

[Diagram: the same follow-up, ending at tx with an event (d = 1)]

Probability                          log-Likelihood
P(event at tx | entry t0)            1 log(λ) − λy
= P(surv t0 → t1 | entry t0)         = 0 log(λ) − λy1
× P(surv t1 → t2 | entry t1)         + 0 log(λ) − λy2
× P(event at tx | entry t2)          + 1 log(λ) − λy3


[Diagram: follow-up from t0 to tx with outcome (d, y), split at t1 and t2 into risk times y1, y2, y3]

Probability                          log-Likelihood
P(d at tx | entry t0)                d log(λ) − λy
= P(surv t0 → t1 | entry t0)         = 0 log(λ1) − λ1 y1
× P(surv t1 → t2 | entry t1)         + 0 log(λ2) − λ2 y2
× P(d at tx | entry t2)              + d log(λ3) − λ3 y3

This allows different rates (λi) in each interval.
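The additivity shown in these tables can be checked numerically. The following is a minimal sketch with a made-up person (2.5 years of follow-up ending in an event) and made-up rates; the function name rate_loglik is hypothetical.

```python
import math

def rate_loglik(d, y, lam):
    """Log-likelihood contribution d*log(lam) - lam*y from one (d, y) record."""
    return d * math.log(lam) - lam * y

# Hypothetical person: event (d = 1) after 2.5 years, split into three intervals.
lam = 0.2
pieces = [(0, 1.0), (0, 1.0), (1, 0.5)]  # (d, y) per interval; the event sits in the last one

whole = rate_loglik(1, 2.5, lam)                          # unsplit record
split = sum(rate_loglik(d, y, lam) for d, y in pieces)    # sum over split records
print(whole, split)  # the two values agree

# With a separate (hypothetical) rate per interval, each term uses its own rate.
lams = [0.1, 0.2, 0.4]
split_varying = sum(rate_loglik(d, y, l) for (d, y), l in zip(pieces, lams))
print(split_varying)
```

With a common rate the split records reproduce the unsplit log-likelihood; with interval-specific rates the same sum defines the likelihood for a piecewise constant rate.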

Likelihood for time-split data
◮ The setup assumes that rates are constant within each of the intervals
◮ Each record in the data set represents follow-up for one person in one (small) interval, so there are many records for each person
◮ Each record in the data set contributes a term to the likelihood
◮ Each term looks like a contribution from a Poisson variate (albeit with values only 0 or 1), with mean λy
◮ ⇒ The likelihood for one person’s follow-up (rate likelihood) is the same as the likelihood for several independent Poisson variates
◮ Two models, one likelihood.
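To illustrate "two models, one likelihood", the sketch below compares the rate log-likelihood with the Poisson log-likelihood for a few made-up records. The function names and the data are hypothetical. The two differ only by a term that does not involve λ, so they lead to the same estimates.

```python
import math

def rate_loglik(d, y, lam):
    """Rate log-likelihood: d*log(lam) - lam*y."""
    return d * math.log(lam) - lam * y

def poisson_loglik(d, y, lam):
    """Log of the Poisson probability of d events when the mean is lam*y."""
    mu = lam * y
    return d * math.log(mu) - mu - math.lgamma(d + 1)

# Hypothetical split records (d, y) for one person
records = [(0, 1.0), (0, 0.7), (1, 0.3)]

def diff(lam):
    """Poisson minus rate log-likelihood, summed over the records."""
    return sum(poisson_loglik(d, y, lam) - rate_loglik(d, y, lam) for d, y in records)

# The difference is the same for every value of lam: it equals the sum of d*log(y).
print(diff(0.05), diff(0.5), diff(2.0))
```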

Analysis of time-split data
Observations classified by p (person) and i (interval)
◮ d_pi: in the model as response
◮ y_pi: risk time, in the model as offset log(y) ... or as part of the response
◮ Covariates are:
  ◮ timescales (age, period, time in study)
  ◮ other variables for this person (constant in each interval).
◮ Model rates using the covariates in glm: no difference in how time-scales and other covariates are modeled
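A minimal sketch of such a Poisson model with log(y) as offset, written in plain Python instead of calling a glm routine. The helper names solve and fit_poisson and the small data set are hypothetical. For brevity, records with identical covariates are pooled into event counts and person-years, which changes the Poisson log-likelihood only by a constant. Time enters as a two-level factor and x is a binary covariate, both handled as ordinary columns of the design matrix.

```python
import math

def solve(A, b):
    """Solve the small linear system A x = b by Gaussian elimination with pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def fit_poisson(d, y, X, n_iter=25):
    """Newton-Raphson for log(mean) = log(y) + X beta, i.e. log(y) as offset."""
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        mu = [yi * math.exp(sum(b * x for b, x in zip(beta, xi)))
              for yi, xi in zip(y, X)]
        score = [sum((di - mi) * xi[j] for di, mi, xi in zip(d, mu, X))
                 for j in range(p)]
        info = [[sum(mi * xi[j] * xi[k] for mi, xi in zip(mu, X))
                 for k in range(p)] for j in range(p)]
        beta = [b + s for b, s in zip(beta, solve(info, score))]
    return beta

# Hypothetical pooled data. Columns of X: interval 1, interval 2, covariate x.
X = [[1, 0, 0], [1, 0, 1], [0, 1, 0], [0, 1, 1]]
d = [10, 10, 12, 12]           # events
y = [100.0, 50.0, 40.0, 20.0]  # person-years

alpha1, alpha2, beta_x = fit_poisson(d, y, X)
# Rates in the two intervals for x = 0, and the rate ratio for x
print(math.exp(alpha1), math.exp(alpha2), math.exp(beta_x))
```

The data were constructed so that the multiplicative model fits exactly: the fitted rates are 0.1 and 0.3 per person-year in the two intervals, with a rate ratio of 2 for x.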

A look at the Cox model

  λ(t, x) = λ0(t) × exp(x′β)

A model for the rate as a function of t and x.
Covariates:
◮ x
◮ t
◮ ... often the effect of t is ignored (forgotten?)
◮ i.e. left unreported

Cox-likelihood
The (partial) log-likelihood for the regression parameters:

$$\ell(\beta) = \sum_{\text{death times}} \log\left( \frac{e^{\eta_{\text{death}}}}{\sum_{i \in \mathcal{R}_t} e^{\eta_i}} \right)$$

is also a profile likelihood in the model where observation time has been subdivided into small pieces (empirical rates) and each small piece is provided with its own parameter:

$$\log\bigl(\lambda(t, x)\bigr) = \log\bigl(\lambda_0(t)\bigr) + x'\beta = \alpha_t + \eta$$

The Cox-likelihood as profile likelihood
◮ One parameter per death time to describe the effect of time (i.e. the chosen timescale).

$$\log\bigl(\lambda(t, x_i)\bigr) = \log\bigl(\lambda_0(t)\bigr) + \underbrace{\beta_1 x_{1i} + \cdots + \beta_p x_{pi}}_{\eta_i} = \alpha_t + \eta_i$$

◮ Profile likelihood:
  ◮ Derive estimates of α_t as a function of data and βs, assuming constant rate between death/censoring times
  ◮ Insert in likelihood, now only a function of data and βs
  ◮ This turns out to be Cox’s partial likelihood
◮ Cumulative intensity (Λ0(t)) obtained via the Breslow-estimator
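A minimal sketch of the Breslow estimator for untied death times, assuming everybody enters at time 0 and that the linear predictors η_i are already available. At each death time the cumulative baseline intensity increases by 1 divided by the sum of exp(η_i) over the risk set. The function name breslow and the data are hypothetical.

```python
import math

def breslow(times, events, eta):
    """Return [(death time, cumulative baseline intensity)], assuming no tied death times."""
    out, cum = [], 0.0
    for t, d in sorted(zip(times, events)):
        if d == 1:
            # Risk set: everyone whose follow-up has not ended before t
            risk = sum(math.exp(e) for s, e in zip(times, eta) if s >= t)
            cum += 1.0 / risk
            out.append((t, cum))
    return out

# Hypothetical data: exit times, event indicators, linear predictors (all 0 here)
times = [2, 3, 5, 7]
events = [1, 0, 1, 1]
eta = [0.0, 0.0, 0.0, 0.0]

steps = breslow(times, events, eta)
print(steps)
```

With all linear predictors equal to 0, each increment is 1 over the number at risk, so the result coincides with the Nelson-Aalen estimate of the cumulative rate.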

[Figure: Mayo Clinic lung cancer data, 60 year old woman. Survival (0.0 to 1.0) against days since diagnosis (0 to 800).]

The Cox-likelihood: mechanics of computing
◮ The likelihood is computed by summing over risk-sets:

$$\ell(\eta) = \sum_{t} \log\left( \frac{e^{\eta_{\text{death}}}}{\sum_{i \in \mathcal{R}_t} e^{\eta_i}} \right)$$

◮ this is essentially splitting follow-up time at event- (and censoring) times
◮ ... repeatedly in every cycle of the iteration
◮ ... simplified by not keeping track of risk time
◮ ... but only works along one time scale
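The sum over risk sets can be sketched as follows, assuming no tied death times, entry at time 0 for everybody, and a single covariate. The function name cox_partial_loglik and the data are hypothetical. The risk set is rebuilt at every death time, and risk time is never used.

```python
import math

def cox_partial_loglik(times, events, eta):
    """Sum over death times of log(exp(eta_death) / sum of exp(eta_i) over the risk set)."""
    ll = 0.0
    for t, d, e in zip(times, events, eta):
        if d == 1:
            # Risk set at t: everyone whose follow-up has not ended before t
            risk = sum(math.exp(ej) for tj, ej in zip(times, eta) if tj >= t)
            ll += e - math.log(risk)
    return ll

# Hypothetical data: exit times, event indicators and one binary covariate
times = [2, 3, 5, 7]
events = [1, 0, 1, 1]
x = [1, 0, 1, 0]

def loglik(beta):
    """Partial log-likelihood as a function of the regression parameter."""
    return cox_partial_loglik(times, events, [beta * xi for xi in x])

ll0 = loglik(0.0)
ll1 = loglik(0.5)
print(ll0, ll1)
```

An optimiser would call loglik repeatedly for different values of beta, which is the repeated splitting "in every cycle of the iteration" mentioned on the slide.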

$$\log\bigl(\lambda(t, x_i)\bigr) = \log\bigl(\lambda_0(t)\bigr) + \underbrace{\beta_1 x_{1i} + \cdots + \beta_p x_{pi}}_{\eta_i} = \alpha_t + \eta_i$$

◮ Suppose the time scale has been divided into small intervals with at most one death in each:
◮ Empirical rates: (d_it, y_it), where each t has at most one d_it = 1.
◮ Assume w.l.o.g. that the ys in the empirical rates are all 1.
◮ Log-likelihood contributions that contain information on a specific time-scale parameter α_t will be from:
  ◮ the (only) empirical rate (1, 1) with the death at time t.
  ◮ all other empirical rates (0, 1) from those who were at risk at time t.

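Note: There is one contribution from each person at risk to the part of the log-likelihood at t:

$$
\begin{aligned}
\ell_t(\alpha_t, \beta) &= \sum_{i \in \mathcal{R}_t} \bigl\{ d_i \log(\lambda_i(t)) - \lambda_i(t)\, y_i \bigr\} \\
&= \sum_{i \in \mathcal{R}_t} \bigl\{ d_i (\alpha_t + \eta_i) - e^{\alpha_t + \eta_i} \bigr\} \\
&= \alpha_t + \eta_{\text{death}} - e^{\alpha_t} \sum_{i \in \mathcal{R}_t} e^{\eta_i}
\end{aligned}
$$

where η_death is the linear predictor for the person that died at t.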

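The derivative w.r.t. α_t is:

$$D_{\alpha_t}\, \ell_t(\alpha_t, \beta) = 1 - e^{\alpha_t} \sum_{i \in \mathcal{R}_t} e^{\eta_i} = 0 \quad\Leftrightarrow\quad e^{\alpha_t} = \frac{1}{\sum_{i \in \mathcal{R}_t} e^{\eta_i}}$$

If this estimate is fed back into the log-likelihood for α_t, we get the profile likelihood (with α_t “profiled out”):

$$\log\left( \frac{1}{\sum_{i \in \mathcal{R}_t} e^{\eta_i}} \right) + \eta_{\text{death}} - 1 = \log\left( \frac{e^{\eta_{\text{death}}}}{\sum_{i \in \mathcal{R}_t} e^{\eta_i}} \right) - 1$$

which, apart from the constant −1, is the same as the contribution from time t to Cox’s partial log-likelihood.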
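The profiling step can be checked numerically for a single death time. This is a minimal sketch with made-up linear predictors; the function name loglik_t is hypothetical, and all risk times are taken to be 1 as assumed on the slides.

```python
import math

def loglik_t(alpha, eta_risk, eta_death):
    """Log-likelihood part at t: alpha + eta_death - exp(alpha) * sum of exp(eta_i)."""
    return alpha + eta_death - math.exp(alpha) * sum(math.exp(e) for e in eta_risk)

# Hypothetical linear predictors for the persons at risk at time t
eta_risk = [0.5, 0.0, -0.3, 1.2]
eta_death = 0.5  # the person who died at t is the first one

s = sum(math.exp(e) for e in eta_risk)
alpha_hat = -math.log(s)                        # solves the score equation for alpha_t
profile = loglik_t(alpha_hat, eta_risk, eta_death)
cox_term = eta_death - math.log(s)              # contribution to Cox's partial log-likelihood

print(profile, cox_term - 1)                    # the two values agree
# alpha_hat is a maximum: nearby values give a smaller log-likelihood
print(loglik_t(alpha_hat - 0.1, eta_risk, eta_death) < profile,
      loglik_t(alpha_hat + 0.1, eta_risk, eta_death) < profile)
```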

Splitting the dataset a priori
◮ The Poisson approach needs a dataset of empirical rates (d, y) with suitably small values of y.
  ◮ each individual contributes many empirical rates
  ◮ (one per risk-set contribution in Cox-modelling)
◮ From each empirical rate we get:
  ◮ Poisson-response d
  ◮ Risk time y → log(y) as offset
  ◮ time scale covariates: current age, current date, ...
  ◮ other covariates
◮ Contributions not independent, but likelihood is a product
◮ Same likelihood as for independent Poisson variates
◮ Poisson glm with spline/factor effect of time
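A minimal sketch of splitting one person's follow-up along a single time scale. The function name split_follow_up, the break points and the example person are hypothetical; this is not the routine used for the slides. Each output record carries the start of the interval (usable as a time scale covariate), the event indicator d and the risk time y.

```python
def split_follow_up(entry, exit_time, event, breaks):
    """Split follow-up from entry to exit_time at the break points.

    Returns a list of (start, d, y) records; the event indicator is kept
    only on the last record, all earlier records have d = 0.
    """
    cuts = [entry] + [b for b in sorted(breaks) if entry < b < exit_time] + [exit_time]
    records = []
    for start, stop in zip(cuts[:-1], cuts[1:]):
        d = event if stop == exit_time else 0
        records.append((start, d, stop - start))
    return records

# Hypothetical person: enters at 0.5, has an event at 3.25, split at whole years
recs = split_follow_up(0.5, 3.25, 1, [1, 2, 3, 4])
print(recs)  # → [(0.5, 0, 0.5), (1, 0, 1), (2, 0, 1), (3, 1, 0.25)]
```

The risk times add up to the original follow-up time and the events add up to the original event indicator, so no information is lost by splitting.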

History
This is not new: the profile likelihood was pointed out by Holford [2] in 1976, and the practical implementation was demonstrated by Whitehead in 1980 [3], using GLIM.
... so I am telling an old story here.

Example: Mayo Clinic lung cancer
◮ Survival after lung cancer
◮ Covariates:
  ◮ Age at diagnosis
  ◮ Sex
  ◮ Time since diagnosis
◮ Cox model
◮ Split data:
  ◮ Poisson model, time as factor
  ◮ Poisson model, time as spline

[Figure: Mayo Clinic lung cancer, 60 year old woman. Survival (0.0 to 1.0) against days since diagnosis (0 to 800).]
