A FRAILTY MODEL FOR CENSORED FAMILY SURVIVAL DATA, applied to the - - PowerPoint PPT Presentation

a frailty model for censored family survival data
SMART_READER_LITE
LIVE PREVIEW

A FRAILTY MODEL FOR CENSORED FAMILY SURVIVAL DATA, applied to the - - PowerPoint PPT Presentation

Marianne Jonker VU University Amsterdam, The Netherlands. A FRAILTY MODEL FOR CENSORED FAMILY SURVIVAL DATA, applied to the age at onset of mental problems Joint work with: D.I Boomsma Motivation Observations from the Netherlands Twin


slide-1
SLIDE 1

Marianne Jonker

VU University Amsterdam, The Netherlands.

A FRAILTY MODEL FOR CENSORED FAMILY SURVIVAL DATA,

applied to the age at onset of mental problems

Joint work with: D.I Boomsma

slide-2
SLIDE 2

Motivation

Observations from the Netherlands Twin Register: For a study on health and personality, longitudinal data were collected

  • f twins and their sibs at different time points between 1991 and 2002.

At every timepoint it is observed whether the individual ever had contacted social service before.

slide-3
SLIDE 3

Motivation

Observations from the Netherlands Twin Register: For a study on health and personality, longitudinal data were collected

  • f twins and their sibs at different time points between 1991 and 2002.

At every timepoint it is observed whether the individual ever had contacted social service before. Aims:

  • to estimate the degree of heritability and environmental effects of

the age at which individuals contact social service for non-physical problems for the first time,

  • to test if the survival functions for twins and non-twins differ.
slide-4
SLIDE 4

Heritability

Heritability is the proportion of phenotypic variation in a population that is attributable to genetic variation among the individuals in that population. Phenotypic variation may be due to genetic and environmental factors. If a trait has a heritability of 0.5, this means that the phenotypic variation due to genetic variation is 50%. It does not imply that the trait is caused by genetics for 50%. Example: body height. Heritability of “height” in Western society is approximately ..%. Nowadays, environmental effects are very homogenous among these

  • populations. Decades ago, the heriatabilty of height was probably

much lower.

slide-5
SLIDE 5

Heritability

Heritability is the proportion of phenotypic variation in a population that is attributable to genetic variation among the individuals in that population. Phenotypic variation may be due to genetic and environmental factors. If a trait has a heritability of 0.5, this means that the phenotypic variation due to genetic variation is 50%. It does not imply that the trait is caused by genetics for 50%. Example: body height Heritability of “height” in Western society is approximately 80%. Nowadays, environmental effects are very homogenous among these

  • populations. Decades ago, the heriatabilty of height was probably

much lower.

slide-6
SLIDE 6

Body height

Heritability is implied if mono-zygotic twins are more similar to each

  • ther than dizygotic twins for the trait under study.

170 180 190 165 170 175 180 185 190 195 155 160 165 170 175 180 155 160 165 170 175 180 170 175 180 185 190 195 170 175 180 185 190 195 160 165 170 175 180 185 155 165 175 185

slide-7
SLIDE 7

Estimation of Heritability

Suppose we are interested in a trait that has been observed for every individual. We model the trait value as: T = µ + G + C + E. Then for a monozygotic twin: T MZ

1

= µ + G + C + E1 T MZ

2

= µ + G + C + E2 and for a dizygotic twin T DZ

1

= µ + G1 + C + E1 T DZ

2

= µ + G2 + C + E2.

slide-8
SLIDE 8

Estimation of Heritability

Heritability is defined as h2 = Var G Var T . It can be computed that ρMZ = Var G + Var C Var T ρDZ = 0.5Var G + Var C Var T . So, h2 = Var G Var T = 2(ρMZ − ρDZ). The term at the right can be easily estimated by inserting sample correlations.

slide-9
SLIDE 9

Interval Censored Data

The age at which an individual visists social service for the first time is interval censored; we do not observe the exact age, but an interval in which this age falls. An interval may start at zero or end at infinity. Heritability as defined before can not be estimated.

slide-10
SLIDE 10

The model, hazards

Suppose T is the age at which an individual contacted social service for the first time with distribution function F and density f, then the hazard is defined as λ(t) = f(t) 1 − F(t) 1 − F(t) = e−Λ(t), Λ(t) = t λ(s)ds Interpretation If f(t)dt is interpreted as the probability that an individual visits social service for the first time in the interval [t, t + dt), then λ(t)dt ≈ P(t ≤ T < t + dt) P(T ≥ t) = P(t ≤ T < t + dt|T ≥ t)

slide-11
SLIDE 11

The model, hazards

Suppose T is the age at which an individual contacted social service for the first time with distribution function F and density f, then the hazard is defined as λ(t) = f(t) 1 − F(t) 1 − F(t) = e−Λ(t), Λ(t) = t λ(s)ds Interpretation If f(t)dt is interpreted as the probability that an individual visits social service for the first time in the interval [t, t + dt), then λ(t)dt ≈ P(t ≤ T < t + dt) P(T ≥ t) = P(t ≤ T < t + dt|T ≥ t)

slide-12
SLIDE 12

Statistical Model (Frailty Model)

A frailty model is a random effects proportional hazards model. The hazard function is defined as: λi(t|Zi) = Ziλ(t). Family survival data can be modelled with a correlated frailty model where the random effects (“frailties”) account for the dependence between family members. Let: T1, T2, T3: ages at onset of mental help for a twin and a sib Z1, Z2, Z3: corresponding frailty variables

  • T1, T2, T3 are independent given (Z1, Z2, Z3)
  • T1, T2, T3 have hazard functions t → λi(t|Zi) = Ziλ(t), i = 1, 2, 3

Equivalently P(T1 > t1, T2 > t2, T3 > t3|Z1, Z2, Z3) = e−Z1Λ(t1)−Z2Λ(t2)−Z3Λ(t3).

slide-13
SLIDE 13

Statistical Model (Frailty Model)

A frailty model is a random effects proportional hazards model. The hazard function is defined as: λi(t|Zi) = Ziλ(t). Family survival data can be modelled with a correlated frailty model where the random effects (“frailties”) account for the dependence between family members. Let: T1, T2, T3: ages at onset of mental help for a twin and a sib Z1, Z2, Z3: corresponding frailty variables

  • T1, T2, T3 are independent given (Z1, Z2, Z3)
  • T1, T2, T3 have hazard functions t → λi(t|Zi) = Ziλ(t), i = 1, 2, 3

Equivalently P(T1 > t1, T2 > t2, T3 > t3|Z1, Z2, Z3) = e−Z1Λ(t1)−Z2Λ(t2)−Z3Λ(t3).

slide-14
SLIDE 14

Statistical Model (Frailty Variable)

The coordinates in the frailty vector (Z1, Z2, Z3) are decomposed as Zi = Ai + C + Bi + Ei i = 1, 2, 3 with

  • Ai: genetic effects for individual i
  • C: common environmental effects for twin and sib
  • Bi: specific twin environmental effects
  • Ei: non-shared, specific environmental and genetic effects
slide-15
SLIDE 15

Model Assumptions

Zi = Ai + C + Bi + Ei i = 1, 2, 3

  • Ai: cor(A1, A2) = 1 if MZ-twin-pair, otherwise cor(Ai, Aj) = 1/2
  • Bi: cor(B1, B2) = 1, otherwise cor(Bi, Bj) = 0, i = j
  • Ei: E1, E2, E3 are independent; cor(Ei, Ej) = 0, i = j
slide-16
SLIDE 16

Model Assumptions

Zi = Ai + C + Bi + Ei i = 1, 2, 3

  • Ai: cor(A1, A2) = 1 if MZ-twin-pair, otherwise cor(Ai, Aj) = 1/2
  • Bi: cor(B1, B2) = 1, otherwise cor(Bi, Bj) = 0, i = j
  • Ei: E1, E2, E3 are independent; cor(Ei, Ej) = 0, i = j

Furthermore:

  • (A1, A2, A3), C, (B1, B2, B3) and (E1, E2, E3) are independent.
  • Ai, C, Bi, Ei have gamma-distributions with inverse scale

parameter η and shape parameter ν, νc, νb, νe, respectively. So, Zi has a gamma distribution with parameters η and ν + νc + νb + νe Assume: η = ν + νc + νb + νe, then EZ1 = EZ2 = EZ3 = 1.

slide-17
SLIDE 17

Heritability, Environmental and Twin effect

Heritability, and environmental and twin effect are defined as the proportions of variance of the frailty associated with genetic, environmental and twin effects: h2 = Var Ai Var Z = ν η c2 = Var C Var Z = νc η b2 = νb η (see also Yashin et al, 1999 and Jonker et al, 2009). These definitions are as usual, except that the frailties are viewed as the latent

  • phenotypes. Then, the correlations between the frailty variables of two

individuals equal ρMZ = h2 + c2 + b2, ρDZ = 0.5h2 + c2 + b2, ρS,S = 0.5h2 + c2.

slide-18
SLIDE 18

Simultaneous Survival function

Because the vector of frailties has a multivariate gamma distribution, the simultaneous survival functions can be written in terms of the marginal survival function S. For instance, SMZ,S(t1, t2, t3) = P(T1 > t1, T2 > t2, T3 > t3) =

  • S(t1)−σ2 + S(t2)−σ2 + S(t3)−σ2 − 2

−ν/2−νc ×

  • S(t1)−σ2 + S(t2)−σ2 − 1

−ν/2−νb ×S(t3)(ν/2+νb)σ2(S(t1)S(t2)S(t3))νeσ2 (1) with σ2 = 1/η and with S the marginal survival function. S can be taken gender and/or twin - non twin specific.

slide-19
SLIDE 19

Aims

  • to estimate the degree of heritability (h2), environmental effects

(c2), and twin effects (b2),

  • to investigate whether the survival functions differ for twins and

non-twins. Steps:

  • Compute the likelihood (interval censored data)
  • Estimate the unknown parameters in the model (maximum

likelihood estimators)

  • Estimate h2, c2, b2 and test whether they are positive (likelihood

ratio test)

  • Test whether the survival fuctions for twins and non-twins differ

(likelihood ratio test)

slide-20
SLIDE 20

Likelihood and estimators

For the twin and sib timepoints Ui and Vi are observed, with Ui ≤ Ti ≤ Vi, i = 1, 2, 3. The likelihood for a MZ-twin with a sib is proportional to L(ν, νc, νb, η, S)[(U1, V1), (U2, V2), (U3, V3)] ∝ SMZ,S(U1, U2, U3) − SMZ,S(U1, U2, V3) − SMZ,S(U1, V2, U3) −SMZ,S(V1, U2, U3) + SMZ,S(U1, V2, V3) + SMZ,S(V1, U2, V3) +SMZ,S(V1, V2, U3) − SMZ,S(V1, V2, V3).

slide-21
SLIDE 21

Likelihood and estimators

For the twin and sib timepoints Ui and Vi are observed, with Ui ≤ Ti ≤ Vi, i = 1, 2, 3. The likelihood for a MZ-twin with a sib is proportional to L(ν, νc, νb, η, S)[(U1, V1), (U2, V2), (U3, V3)] ∝ SMZ,S(U1, U2, U3) − SMZ,S(U1, U2, V3) − SMZ,S(U1, V2, U3) −SMZ,S(V1, U2, U3) + SMZ,S(U1, V2, V3) + SMZ,S(V1, U2, V3) +SMZ,S(V1, V2, U3) − SMZ,S(V1, V2, V3). SMZ,S can be expressed in terms of the model parameters by (1). Under the assumption that all families are independent, the total likelihood is the product of the marginal likelihoods. The unknown parameters are estimated by their maximum likelihood estimators (S was taken equal to a parametric distribution).

slide-22
SLIDE 22

Hypothesis testing

We tested whether heritability, environmental and twin effects are positive: H0 : h2 = 0 ∨ H1 : h2 > 0, H0 : c2 = 0 ∨ H1 : c2 > 0, H0 : b2 = 0 ∨ H1 : b2 > 0, and whether the survival functions for twins (Stwin) and non-twins Snon−twin differ: H0 : Stwin ≡ Snon−twin ∨ H1 : Stwin ≡ Snon−twin, We used the likelihood ratio test. We assumed that S(t) = 1 − α(1 − exp(−λ(t − s))) for t ≥ s, with α, λ and s gender and twin- non twin specific (a truncated and shifted exponential distribution).

slide-23
SLIDE 23

Data

Data of 4498 families (twins and siblings) 727 families with 1 individual 1988 families with 2 individuals 1232 families with 3 individuals 551 families with 4 or more individuals In total, we have data of 6289 females and 4314 males.

slide-24
SLIDE 24

20 40 60 80 0.0 0.2 0.4 0.6 0.8 1.0

age

NPMLEs of distribution functions (1 − S) for males (dashed curve) and females (continued curve) under the assumption of independence between all individuals.

slide-25
SLIDE 25

20 40 60 80 0.0 0.2 0.4 0.6 0.8 1.0

age

NPMLEs of distribution functions (1 − S) for males (dashed curve) and females (continued curve) under the assumption of independence between all individuals.

20 40 60 80 0.0 0.2 0.4 0.6 0.8 1.0

age

20 40 60 80 0.0 0.2 0.4 0.6 0.8 1.0

age

NPMLEs for females (left) and males (right) under the independence

  • assumption. Dashed curves: non-twins, continued curves: twins
slide-26
SLIDE 26

Results

Estimates: ˆ h2 = 0.392, ˆ c2 = 0.101,ˆ b2 = 0.137. p-values for the four hypotheses are 1.1810−6(h2), 0.0198(c2), 0.000361(b2), 0.000264(S, for males), 0.282(S, for females).

20 40 60 80 0.0 0.2 0.4 0.6 0.8 1.0

age

Parametric maximum likelihood estimates for the distribution function for all women (dashed curve) and male twins and non-twins (continued lines).

slide-27
SLIDE 27

Problems to think over

  • Definition of “Social Service”
  • Fit of the model (especially for male non-twins)
  • Generations-effect
slide-28
SLIDE 28

Ackowledgments

  • Dorret Boomsma (VU University, Amsterdam) for providing the

data.

  • Piet Groeneboom for using his computer program for estimating

the NPMLE for the survival curve based on interval censored survival data.

  • All Dutch twins and their families for participation in the study.

Financially supported by: NDNS+ and NWO

slide-29
SLIDE 29

References

  • MA Jonker, S Bhulai, DI Boomsma, RSL Ligthart, D Posthuma, and

AW van der Vaart. Gamma frailty model for linkage analysis with application to interval censored migraine data Biostatistics 10: 187-200; 2009.

  • MA Jonker and DI Boomsma. A frailty model for (interval) censored

family survival data, applied to the age at onset of non-physical problems, Lifetime data analysis, 16: 299-315; 2010

  • AI Yashin, AZ Begun and IA Iachine. Genetic factors in susceptibility

to death: a comparitive analysis of bivariate survival models. Journal of Epidemiology and Biostatistics 4(1), 53-60; 1999.