Mixed effect model for the spatiotemporal analysis of longitudinal manifold valued data



slide-1
SLIDE 1

Stéphanie Allassonnière

with J.B. Schiratti, J. Chevallier, I. Koval, V. Debavalaere and S. Durrleman

Université Paris Descartes & Ecole Polytechnique

Mixed effect model for the spatiotemporal analysis of longitudinal manifold valued data

slide-2
SLIDE 2
Computational Anatomy

  • Represent and analyse geometrical elements upon which deformations can act
  • Describe the observed objects as geometrical variations of one or several representative elements
  • Quantify this variability inside a population

Deformable template model from Grenander

  • How does the deformation act?
  • What is a representative element?
  • How to quantify the geometrical variability?

slide-3
SLIDE 3

Computational Anatomy

One solution:

  • Quantify the distance between observations using deformations
  • Provide a statistical model to approximate the generation of the observed population from the atlas
  • Propose a statistical learning algorithm
  • Optimise the numerical estimation

slide-4
SLIDE 4

Bayesian Mixed Effect model

  • First model:

– One observation per subject
– Image or shape (viewed as currents)
– Deformations either linearized or diffeomorphic
– Homogeneous or heterogeneous populations (mixture models)

slide-5
SLIDE 5


Bayesian Mixed Effect model

slide-6
SLIDE 6


Bayesian Mixed Effect model

slide-7
SLIDE 7

Bayesian Mixed Effect model

  • First model:

– One observation per subject
– Image or shape (viewed as currents)
– Deformations either linearized or diffeomorphic
– Homogeneous or heterogeneous populations (mixture models)

  • Limitations:

– One observation per subject
– Corresponding acquisition time

slide-8
SLIDE 8

Longitudinal Data Analysis

  • Longitudinal model:

– Several observations per subject
– Image, shape, etc.
– Atlas = representative trajectory and population variability

slide-9
SLIDE 9

Longitudinal Data Analysis

How to learn representative trajectories of data changes from longitudinal data?

  • Temporal marker of progression (e.g. time since drug injection, seeding, birth, etc.)

– Regression (e.g. compare measurements at the same time-point)

  • No temporal marker of progression (e.g. in aging, neurodegenerative diseases, etc.)

– Learning spatiotemporal distribution of trajectories
– Find temporal correspondences
– Compare data at corresponding stages of progression

Linear mixed-effects models [Laird&Ware’82, Diggle et al., Fitzmaurice et al.]

Needs to disentangle differences in:

  • Measurements
  • Dynamics of measurement changes

Data types:

  • Vector-valued data
  • Manifold-valued data (normalized data, positive matrices, shapes, etc.)

[Figure: measurements of subject #1, subject #2 and subject #3 plotted against time (age)]

slide-10
SLIDE 10

Spatiotemporal Statistical Model

[Schiratti et al. IPMI’15, NIPS’15]

  • Statistical model including:

– a representative trajectory of data changes
– spatiotemporal variations in measurement values and in the pace of measurement changes

  • Orthogonality condition ensures identifiability (unique space/time decomposition)
  • Time is not a covariate but a random variable

Representative trajectory:

$$T_0(t) = \mathrm{Exp}_{p_0, t_0}(v_0)(t)$$

Subject-specific trajectory and time reparametrization:

$$T_i(t) = \mathrm{Exp}_{T_0(t)}\big(P^{T_0}_{t_0, t}(v_i)\big)$$

$$\psi_i(t) = t_0 + \alpha_i (t - t_0 - \tau_i)$$

Observations at times $t_{ij}$:

$$y_{ij} = T_i(\psi_i(t_{ij})) + \varepsilon_{ij}$$

Random effects:

  • Acceleration factor: $\alpha_i \sim \log\mathcal{N}(0, \sigma_\alpha^2)$
  • Time-shift: $\tau_i \sim \mathcal{N}(0, \sigma_\tau^2)$
  • Space-shift: $v_i = (A_1 | \dots | A_K)\, s_i$, with $A_k \perp v_0$

Fixed effects: $(p_0, t_0, v_0)$ and $(\sigma_\alpha^2, \sigma_\tau^2, A_1, \dots, A_K)$

slide-11
SLIDE 11

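Spatiotemporal Statistical Model

Model:

  • Submanifold-valued observations: $y_{ij} = T_i(\psi_i(t_{ij})) + \varepsilon_{ij}$
  • Parallel curve: $T_i(t) = \mathrm{Exp}_{T_0(t)}\big(P^{T_0}_{t_0, t}(v_i)\big)$
  • Representative trajectory: $T_0(t) = \mathrm{Exp}_{p_0, t_0}(v_0)(t)$
  • Linear time reparametrization: $\psi_i(t) = t_0 + \alpha_i (t - t_0 - \tau_i)$

Hidden random variables:

  • Acceleration factor: $\alpha_i \sim \log\mathcal{N}(0, \sigma_\alpha^2)$
  • Time shift: $\tau_i \sim \mathcal{N}(0, \sigma_\tau^2)$
  • Space shift: $v_i = (A_1 | \dots | A_K)\, s_i$, with $A_k \perp v_0$

Parameters (mean trajectory parametrization and prior parameters): $(p_0, t_0, v_0)$ and $(\sigma_\alpha^2, \sigma_\tau^2, A_1, \dots, A_K)$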
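The linear time reparametrization and the two time-related random effects can be illustrated numerically. The sketch below is not the authors' code: the function names `time_reparam` and `sample_time_effects` and all numerical values are hypothetical, and only NumPy is assumed.

```python
import numpy as np


def time_reparam(t, t0, alpha_i, tau_i):
    """Subject-specific time warp psi_i(t) = t0 + alpha_i * (t - t0 - tau_i)."""
    return t0 + alpha_i * (np.asarray(t, dtype=float) - t0 - tau_i)


def sample_time_effects(n_subjects, sigma_alpha, sigma_tau, seed=0):
    """Draw alpha_i ~ logN(0, sigma_alpha^2) and tau_i ~ N(0, sigma_tau^2)."""
    rng = np.random.default_rng(seed)
    # A log-normal draw is the exponential of a centred Gaussian draw.
    alpha = np.exp(rng.normal(0.0, sigma_alpha, size=n_subjects))
    tau = rng.normal(0.0, sigma_tau, size=n_subjects)
    return alpha, tau


# Hypothetical values, chosen only for illustration.
t0 = 75.0
alpha, tau = sample_time_effects(n_subjects=5, sigma_alpha=0.3, sigma_tau=4.0)
ages = np.array([70.0, 75.0, 80.0])

# One warped time axis per subject.
warped = [time_reparam(ages, t0, a, s) for a, s in zip(alpha, tau)]
```

With $\alpha_i = 1$ and $\tau_i = 0$ the warp is the identity, and every subject reaches the reference time $t_0$ at age $t_0 + \tau_i$, whatever the acceleration factor.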

slide-12
SLIDE 12

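Spatiotemporal Statistical Model

Interest: parallel transport keeps the structure of the distribution invariant, but updates it in time:

$$\Sigma \;\longrightarrow\; P_{0,t}^{T}\, \Sigma\, P_{0,t}$$

Comparison with previous work:

[Figure: two diagrams showing the representative trajectory through $p_0$ with velocity $v_0$, the space-shift $v_i$, and the resulting points $p_{ij}$ with velocities $v_{ij}$]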

slide-13
SLIDE 13

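Spatiotemporal Statistical Model

  • The straight line model: $M = \mathbb{R}$

Laird & Ware (1982):

$$y_{ij} = (a + a_i)(t_{i,j} - t_0) + b + b_i + \varepsilon_{i,j}$$

where $b + b_i$ is the measurement of the $i$th subject at time $t_0$.

Schiratti et al. (2015):

$$y_{ij} = (a + a_i)(t_{i,j} - t_0 - \tau_i) + b + \varepsilon_{i,j}$$

where $t_0 + \tau_i$ is the time at which the measurement of the $i$th subject reaches $\bar{b}$.

[Figure: for each model, $y$ plotted against time, with $b$ and $t_0$ marked on the axes]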
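The two straight-line parametrizations can be compared on noise-free lines. This is a minimal sketch with hypothetical function names and values. It only illustrates that a time-shifted line coincides with a random-intercept line when $b_i = -(a + a_i)\tau_i$, so the two statistical models differ in which quantity is treated as a random effect rather than in the lines they can produce.

```python
def laird_ware(t, a, b, t0, a_i, b_i):
    """Random slope and random intercept: (a + a_i)(t - t0) + b + b_i."""
    return (a + a_i) * (t - t0) + b + b_i


def time_shift_line(t, a, b, t0, a_i, tau_i):
    """Random slope and random time-shift: (a + a_i)(t - t0 - tau_i) + b."""
    return (a + a_i) * (t - t0 - tau_i) + b


# Hypothetical values, for illustration only.
a, b, t0 = 0.5, 10.0, 70.0
a_i, tau_i = 0.25, 4.0

# The time-shifted line takes the value b at time t0 + tau_i.
value_at_shifted_time = time_shift_line(t0 + tau_i, a, b, t0, a_i, tau_i)

# Without noise, it matches a Laird & Ware line with this intercept offset.
b_i_equivalent = -(a + a_i) * tau_i
```

In the time-shift form the subject-specific intercept is tied to the slope through $\tau_i$, which is what gives $\tau_i$ its reading as a delay or advance in time.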

slide-14
SLIDE 14

Spatiotemporal Statistical Model

  • The logistic curve model:

$$M = \,]0, 1[, \qquad g(p)(u, v) = \frac{uv}{p^2 (1 - p)^2}$$

  • Geodesics are logistic curves:

$$\gamma_0(t) = \left(1 + \frac{1 - p_0}{p_0} \exp\left(-\frac{v_0}{p_0 (1 - p_0)} (t - t_0)\right)\right)^{-1}$$

$$y_{ij} = \gamma_0\big(t_0 + \alpha_i (t_{ij} - t_0 - \tau_i)\big) + \varepsilon_{ij}$$

  • It is not equivalent to a linear model on the logit of the observations (i.e. the Riemannian log at $p_0 = 0.5$), since $p_0$ is estimated
  • If we fix $p_0 = 0.5$ in our model → we end up with our previous linear case (different from Laird&Ware)
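The logistic geodesic can be checked numerically. The sketch below assumes the closed form $\gamma_0(t) = \big(1 + \frac{1 - p_0}{p_0} \exp(-\frac{v_0}{p_0(1 - p_0)}(t - t_0))\big)^{-1}$. The function names and parameter values are hypothetical. It shows that the curve passes through $p_0$ at $t_0$ and becomes a straight line on the logit scale.

```python
import numpy as np


def gamma0(t, p0, t0, v0):
    """Logistic geodesic with gamma0(t0) = p0 and derivative v0 at t0."""
    rate = v0 / (p0 * (1.0 - p0))
    t = np.asarray(t, dtype=float)
    return 1.0 / (1.0 + (1.0 - p0) / p0 * np.exp(-rate * (t - t0)))


def logit(p):
    """Log-odds of a value in ]0, 1[."""
    return np.log(p / (1.0 - p))


# Hypothetical values, for illustration only.
p0, t0, v0 = 0.3, 75.0, 0.04
t = np.linspace(60.0, 90.0, 7)
curve = gamma0(t, p0, t0, v0)

# On the logit scale the geodesic is a straight line through logit(p0).
line = logit(p0) + v0 / (p0 * (1.0 - p0)) * (t - t0)
```

The intercept and slope of that line both depend on $p_0$, which is one way to see why estimating $p_0$ differs from fitting a linear model to logit-transformed data with a fixed reference point.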

slide-15
SLIDE 15

Spatiotemporal Statistical Model

  • The propagation model:

$$M = \,]0, 1[^N, \qquad g(p)(u, v) = \sum_{k=1}^{N} \frac{u_k v_k}{p_k^2 (1 - p_k)^2}$$

  • Geodesics are logistic curves in each coordinate
  • Parametric family of geodesics seen as a model of propagation of an effect:

$$\gamma_\delta(t) = \big(\gamma_0(t), \gamma_0(t - \delta_1), \dots, \gamma_0(t - \delta_{N-1})\big)$$

  • The parallel curve in the direction of the space-shift $v_i$ writes:

$$\left(\gamma_0\Big(t + \frac{v_{i,1}}{v_0}\Big), \gamma_0\Big(t - \delta_1 + \frac{v_{i,2}}{v_0}\Big), \dots, \gamma_0\Big(t - \delta_{N-1} + \frac{v_{i,N}}{v_0}\Big)\right)$$

→ The parallel curve changes the relative timing of the effect onset across coordinates
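The effect of a space-shift on the propagation model can be sketched as follows. It assumes the per-coordinate shift is $v_{i,k}/v_0$ and reuses the logistic closed form for $\gamma_0$. The function names and values are hypothetical, and the orthogonality constraint on $v_i$ is not enforced here.

```python
import numpy as np


def gamma0(t, p0, t0, v0):
    """Logistic geodesic with gamma0(t0) = p0 and derivative v0 at t0."""
    rate = v0 / (p0 * (1.0 - p0))
    t = np.asarray(t, dtype=float)
    return 1.0 / (1.0 + (1.0 - p0) / p0 * np.exp(-rate * (t - t0)))


def propagation_geodesic(t, p0, t0, v0, delta):
    """gamma_delta(t): coordinate k follows gamma0 delayed by delta_{k-1}."""
    delays = np.concatenate(([0.0], np.asarray(delta, dtype=float)))
    return gamma0(t - delays, p0, t0, v0)


def parallel_curve(t, p0, t0, v0, delta, v_i):
    """Parallel curve: coordinate k is advanced in time by v_{i,k} / v0."""
    delays = np.concatenate(([0.0], np.asarray(delta, dtype=float)))
    shifts = np.asarray(v_i, dtype=float) / v0
    return gamma0(t - delays + shifts, p0, t0, v0)


# Hypothetical values, for illustration only (N = 4 coordinates).
p0, t0, v0 = 0.3, 75.0, 0.04
delta = [2.0, 5.0, 9.0]
v_i = [0.0, 0.08, -0.04, 0.0]

base = propagation_geodesic(78.0, p0, t0, v0, delta)
shifted = parallel_curve(78.0, p0, t0, v0, delta, v_i)
```

In this example the second coordinate is advanced by exactly its delay (0.08 / 0.04 = 2), so for this subject it changes at the same time as the first coordinate, while the third coordinate is delayed further.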

slide-16
SLIDE 16

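Parameter Estimation

  • Maximum Likelihood:

$$\max_\theta\; p(y | \theta) = \int p(y, z | \theta)\, dz$$

  • EM:

$$\theta_{k+1} = \mathrm{argmax}_\theta \sum_{i=1}^{N} \int \log\big(p(y_i, z_i | \theta)\big)\, p(z_i | y_i, \theta_k)\, dz_i, \qquad p(y_i, z_i | \theta) = p(y_i | z_i, \theta)\, p(z_i | \theta)$$

  • Distribution from the curved exponential family:

$$\log p(y_i, z_i | \theta) = \phi(\theta)^T S(y_i, z_i) - \log(C(\theta))$$

$$\theta_{k+1} = \mathrm{argmax}_\theta \left\{ \phi(\theta)^T \sum_{i=1}^{N} \int S(y_i, z_i)\, p(z_i | y_i, \theta_k)\, dz_i - N \log(C(\theta)) \right\}$$

with $y = (y_1, \dots, y_N)$, $z = (z_1, \dots, z_N)$, $\theta = (\sigma_z^2, \sigma_\varepsilon^2, A_1, \dots, A_K, p_0, t_0, v_0)$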
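The EM update for a curved exponential family can be illustrated on a toy model that is much simpler than the spatiotemporal one. The sketch assumes hidden variables $z_i \sim \mathcal{N}(\mu, 1)$ and observations $y_i = z_i + \varepsilon_i$ with $\varepsilon_i \sim \mathcal{N}(0, 1)$, so that $\theta = \mu$, $S(y_i, z_i) = z_i$, $\phi(\mu) = \mu$ and $\log C(\mu) = \mu^2/2$. This toy model and all values are assumptions made for illustration.

```python
import numpy as np

# Toy data: hidden z_i ~ N(true_mu, 1), observed y_i = z_i + N(0, 1) noise.
rng = np.random.default_rng(0)
true_mu = 2.0
n = 200
z_true = rng.normal(true_mu, 1.0, size=n)
y = z_true + rng.normal(0.0, 1.0, size=n)

mu = -3.0  # deliberately poor starting point
for _ in range(60):
    # E-step: E[z_i | y_i, mu_k] = (y_i + mu_k) / 2, averaged over subjects.
    expected_stat = np.mean((y + mu) / 2.0)
    # M-step: argmax over mu of { mu * S - mu^2 / 2 } is S itself.
    mu = expected_stat
```

Each iteration halves the distance between `mu` and the sample mean of `y`, which is the maximum likelihood estimate for this toy model.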

slide-17
SLIDE 17

Parameter Estimation: stochastic algorithm

  • SA-EM: replaces integration by one simulation of the hidden variable: sample from $q(z_i | y_i, \theta_k)$,

$$z_{i,k+1} \sim p(z_i | y_i, \theta_k)$$

and a stochastic approximation of the sufficient statistics:

$$S_{k+1} = (1 - \Delta_k) S_k + \Delta_k \left(\frac{1}{N} \sum_{i=1}^{N} S(y_i, z_{i,k+1})\right)$$

Maximization step (unchanged):

$$\theta_{k+1} = \mathrm{argmax}_\theta \left\{ \phi(\theta)^T S_{k+1} - \log(C(\theta)) \right\}$$

  • MCMC-SAEM: replaces sampling by a single Markov Chain step
  • For each subject, sample the random effect w.r.t. a transition kernel of a geometrically ergodic Markov chain targeting the conditional distribution

[Delyon, Lavielle, Moulines ’99] [Allassonnière et al. ’10]
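The simulation and stochastic approximation steps of SA-EM can be sketched on a toy model in which the conditional distribution can be sampled exactly. The toy model is an assumption made for illustration: $z_i \sim \mathcal{N}(\mu, 1)$, $y_i = z_i + \varepsilon_i$ with $\varepsilon_i \sim \mathcal{N}(0, 1)$, so that $p(z_i | y_i, \mu) = \mathcal{N}((y_i + \mu)/2, 1/2)$ and the maximization step reduces to $\mu = S$. The step sizes $\Delta_k = k^{-0.7}$ are also an arbitrary choice.

```python
import numpy as np

# Toy data: hidden z_i ~ N(2, 1), observed y_i = z_i + N(0, 1) noise.
rng = np.random.default_rng(1)
n = 200
y = rng.normal(2.0, 1.0, size=n) + rng.normal(0.0, 1.0, size=n)

mu, stat = -3.0, 0.0
for k in range(1, 2001):
    # Simulation: one exact draw z_i ~ p(z_i | y_i, mu_k) per subject.
    z = rng.normal((y + mu) / 2.0, np.sqrt(0.5))
    # Stochastic approximation of the sufficient statistic, Delta_k = k^(-0.7).
    step = k ** -0.7
    stat = (1.0 - step) * stat + step * np.mean(z)
    # Maximization (unchanged): argmax over mu of { mu * S - mu^2 / 2 } is S.
    mu = stat
```

Because the first step size equals 1, the initial value of `stat` is discarded at the first iteration. The decreasing steps then average out the simulation noise, and `mu` settles near the sample mean of `y`.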
slide-18
SLIDE 18

Parameter Estimation: stochastic algorithm

  • SA-EM: replaces integration by one simulation of the hidden variable: sample from $\tilde{q}_k(z_i, \theta_k)$,

$$z_{i,k+1} \sim p(z_i | y_i, \theta_k)$$

and a stochastic approximation of the sufficient statistics:

$$S_{k+1} = (1 - \Delta_k) S_k + \Delta_k \left(\frac{1}{N} \sum_{i=1}^{N} S(y_i, z_{i,k+1})\right)$$

Maximization step (unchanged):

$$\theta_{k+1} = \mathrm{argmax}_\theta \left\{ \phi(\theta)^T S_{k+1} - \log(C(\theta)) \right\}$$

  • MCMC-SAEM: replaces sampling by a single Markov Chain step
  • For each subject, sample the random effect w.r.t. a transition kernel of a geometrically ergodic Markov chain targeting the conditional distribution
  • As long as $\tilde{q}_k$ “converges towards” $q(z_i | y_i, \theta_k)$ as $k \to \infty$

[Chevallier & Allassonnière, preprint ’19]
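The MCMC-SAEM idea of replacing exact sampling by a single Markov chain step can be sketched on a toy model. The toy model ($z_i \sim \mathcal{N}(\mu, 1)$, $y_i = z_i + \varepsilon_i$, $\varepsilon_i \sim \mathcal{N}(0, 1)$), the random-walk Metropolis kernel, its proposal scale and the step sizes are all assumptions made for illustration. The slides do not specify which kernel is used.

```python
import numpy as np

# Toy data: hidden z_i ~ N(2, 1), observed y_i = z_i + N(0, 1) noise.
rng = np.random.default_rng(2)
n = 200
y = rng.normal(2.0, 1.0, size=n) + rng.normal(0.0, 1.0, size=n)


def log_target(z, y, mu):
    """log p(z_i | y_i, mu) up to a constant, for all subjects at once."""
    return -0.5 * (y - z) ** 2 - 0.5 * (z - mu) ** 2


mu, stat = -3.0, 0.0
z = y.copy()  # current state of each subject's Markov chain
for k in range(1, 3001):
    # One random-walk Metropolis step per subject, targeting p(z_i | y_i, mu_k).
    proposal = z + rng.normal(0.0, 1.0, size=n)
    log_ratio = log_target(proposal, y, mu) - log_target(z, y, mu)
    accept = np.log(rng.uniform(size=n)) < log_ratio
    z = np.where(accept, proposal, z)
    # Stochastic approximation and maximization, as in SA-EM.
    step = k ** -0.7
    stat = (1.0 - step) * stat + step * np.mean(z)
    mu = stat
```

The chain state `z` is carried over from one iteration to the next, so each subject's chain keeps tracking the conditional distribution as `mu` moves. The estimate again settles near the sample mean of `y`.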
slide-19
SLIDE 19
Parameter Estimation: stochastic algorithm

  • Theoretical properties of the sampler, under mild conditions:

– Drift property
– Small set
– Geometric ergodicity uniformly on any compact set of the parameters

  • Theoretical properties of the estimation algorithm:

– a.s. convergence towards the MAP estimator
– Normal asymptotic behaviour: speed $1/\sqrt{\Delta_k}$
– Normal asymptotic behaviour with optimal speed $1/\sqrt{k}$ with averaging sequences
slide-20
SLIDE 20

Model of Alzheimer’s disease progression

The average trajectory of data changes

  • Neuropsychological tests ADAS-Cog from ADNI
  • 248 subjects who converted from MCI to AD
  • 6 time-points per subject on average (min 3, max 11)
  • Data points $y_{ij} \in \,]0, 1[^4$ with propagation logistic model

[Schiratti et al. IPMI’15, NIPS’15]

slide-21
SLIDE 21

Model of Alzheimer’s disease progression

[Schiratti et al. IPMI’15, NIPS’15]

  • Acceleration factor $\alpha_i$: distinguish fast vs. slow progressors
  • Time-shift $\tau_i$: distinguish early vs. late onset individuals

[Figure: trajectories at +1σ and −1σ of each random effect]

slide-22
SLIDE 22

Model of Alzheimer’s disease progression

[Schiratti et al. IPMI’15, NIPS’15]

Variability in the relative timing and ordering of the events

[Figure: trajectories at +1σ and −1σ along decomposition vector $A_1$ and decomposition vector $A_2$]

slide-23
SLIDE 23

Computational comparisons

  • Comparison of: MCMC-SAEM – STAN – MONOLIX
  • Number of iterations:

– MCMC-SAEM: 1 000 000 (6 s / 1 000 iterations)
– STAN: 15 000 (25 min / 1 000 iterations)
– MONOLIX: 20 000 (3.5 min / 1 000 iterations)
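The iteration counts and per-iteration costs above imply a total running time for each tool. The short computation below only multiplies the figures quoted on the slide. The dictionary layout and variable names are illustrative.

```python
# (number of iterations, seconds per 1 000 iterations), as quoted on the slide.
runs = {
    "MCMC-SAEM": (1_000_000, 6),
    "STAN": (15_000, 25 * 60),
    "MONOLIX": (20_000, 3.5 * 60),
}

# Total running time in minutes for each tool.
total_minutes = {
    name: iterations / 1000 * seconds_per_k / 60
    for name, (iterations, seconds_per_k) in runs.items()
}
# MCMC-SAEM: 100 min, STAN: 375 min, MONOLIX: 70 min
```

MCMC-SAEM thus runs 50 to about 67 times more iterations than the other two tools, in a total time of the same order as MONOLIX and well below that of STAN.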
slide-24
SLIDE 24

Computational comparisons

  • Comparison of: MCMC-SAEM – STAN – MONOLIX

[Figure: estimates from MCMC-SAEM, STAN and Monolix compared with the true values]

slide-25
SLIDE 25

Model of Alzheimer’s disease progression

[Schiratti et al. IPMI’15, NIPS’15]

slide-26
SLIDE 26


Comparison AD vs Controls

slide-27
SLIDE 27


Comparison MCI vs Controls

slide-28
SLIDE 28

Model of propagation on a graph

[Koval et al, Frontiers in Neurosciences’17]

ADNI Data → Alzheimer’s Disease cohort

Measures of the cortical thickness for MCI converters

[Figure: each snapshot corresponds to the neuronal loss within 5 years (from 70-75 to 85-90); neuronal loss within 10 years]

slide-29
SLIDE 29

Spatiotemporal Statistical Model

[Schiratti et al. IPMI’15, NIPS’15]

  • Geodesic representative trajectory
  • → What about treated pathologies?

$$T_0(t) = \mathrm{Exp}_{p_0, t_0}(v_0)(t)$$

$$T_i(t) = \mathrm{Exp}_{T_0(t)}\big(P^{T_0}_{t_0, t}(v_i)\big)$$

$$\psi_i(t) = t_0 + \alpha_i (t - t_0 - \tau_i)$$

$$y_{ij} = T_i(\psi_i(t_{ij})) + \varepsilon_{ij}$$

Random effects:

  • Acceleration factor: $\alpha_i \sim \log\mathcal{N}(0, \sigma_\alpha^2)$
  • Time-shift: $\tau_i \sim \mathcal{N}(0, \sigma_\tau^2)$
  • Space-shift: $v_i = (A_1 | \dots | A_K)\, s_i$, with $A_k \perp v_0$

Fixed effects: $(p_0, t_0, v_0)$ and $(\sigma_\alpha^2, \sigma_\tau^2, A_1, \dots, A_K)$

slide-30
SLIDE 30


Spatiotemporal Statistical Model Piecewise geodesic trajectories

slide-31
SLIDE 31

Spatiotemporal Statistical Model Piecewise geodesic trajectories

[Chevallier et al, NIPS’17]

slide-32
SLIDE 32

Spatiotemporal Statistical Model Piecewise geodesic trajectories

[Chevallier et al, NIPS’17]

slide-33
SLIDE 33

Spatiotemporal Statistical Model Piecewise geodesic trajectories

[Chevallier et al, NIPS’17]

slide-34
SLIDE 34

Spatiotemporal Statistical Model Piecewise geodesic trajectories

slide-35
SLIDE 35

Prediction tool

The individual parameters are related to the real age of conversion of the individuals

slide-36
SLIDE 36
Conclusion

  • Generic statistical model to learn spatiotemporal distribution of trajectories on manifolds:

– Calibrated on longitudinal data sets using MCMC-SAEM
– Automatically finds temporal correspondences among similar events that may happen at different ages/times
– Estimates the variability of the data at the corresponding events

  • It allows us to position disease progression within the life and history of the patient
  • Future work:

– Derive instances of the model for more complex manifold-valued data (e.g. textured shapes data), metamorphosis and mixtures of all of these!

slide-37
SLIDE 37

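Thank you!

[Figure: representative trajectory $T_0(t) = \mathrm{Exp}_{p_0, t_0}(v_0)(t)$ through $p_0$ with velocity $v_0$, and a subject-specific parallel curve with space-shift $v_i$, points $p_{ij}$ and velocities $v_{ij}$]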