
SLIDE 1

Stéphanie Allassonnière

with J.B. Schiratti, J. Chevallier, I. Koval, V. Debavalaere and S. Durrleman

Université Paris Descartes & Ecole Polytechnique

Mixed effect model for the spatiotemporal analysis of longitudinal manifold valued data

SLIDE 2

Represent and analyse geometrical elements upon which deformations can act

Describe the observed objects as geometrical variations of one or several representative elements

Quantify this variability inside a population

Deformable template model from Grenander

How does the deformation act?
What is a representative element?
How to quantify the geometrical variability?

Computational Anatomy

SLIDE 3

One solution:

Quantify the distance between observations using deformations
Provide a statistical model to approximate the generation of the observed population from the atlas

Propose a statistical learning algorithm
Optimise the numerical estimation

Computational Anatomy

SLIDE 4

Bayesian Mixed Effect model

First model :

– One observation per subject
– Image or shape (viewed as currents)
– Deformations either linearized or diffeomorphic
– Homogeneous or heterogeneous populations (mixture models)

SLIDE 5

Bayesian Mixed Effect model

SLIDE 6

Bayesian Mixed Effect model

SLIDE 7

Bayesian Mixed Effect model

First model :

– One observation per subject
– Image or shape (viewed as currents)
– Deformations either linearized or diffeomorphic
– Homogeneous or heterogeneous populations (mixture models)

Limitations:

– One observation per subject
– Corresponding acquisition time

SLIDE 8

Longitudinal Data Analysis

Longitudinal model :

– Several observations per subject
– Image, shape, etc.
– Atlas = representative trajectory and population variability

SLIDE 9

Longitudinal Data Analysis

Temporal marker of progression (e.g. time since drug injection, seeding, birth, etc.)

[Figure: measurements of subject #1, subject #2 and subject #3 against time (age)]

Regression (e.g. compare measurements at the same time-point)

No temporal marker of progression (e.g. in aging, neurodegenerative diseases, etc.)

How to learn representative trajectories of data changes from longitudinal data?

Learning spatiotemporal distribution of trajectories

Find temporal correspondences
Compare data at corresponding stages of progression

Linear mixed-effects models [Laird & Ware ’82, Diggle et al., Fitzmaurice et al.]

Needs to disentangle differences in:

Measurements
Dynamics of measurement changes

Vector-valued data, manifold-valued data (normalized data, positive matrices, shapes, etc.)

SLIDE 10

Spatiotemporal Statistical Model

[Schiratti et al. IPMI’15, NIPS’15]

Statistical model including:
a representative trajectory of data changes
spatiotemporal variations in:
measurement values
pace of measurement changes

Orthogonality condition ensures identifiability (unique space/time decomposition)

Time is not a covariate but a random variable

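[Figure: representative trajectory through $p_0$ at time $t_0$ with velocity $v_0$, and a parallel curve shifted by $v_i$; labels $t_0$, $t$, $v_0$, $p_0$, $v_i$, $p_{ij}$, $v_{ij}$]

$T_0(t) = \mathrm{Exp}_{p_0, t_0}(v_0)(t)$

$y_{ij} = T_i(\psi_i(t_{ij})) + \varepsilon_{ij}$

$T_i(t) = \mathrm{Exp}_{T_0(t)}\big(P^{T_0}_{t_0, t}(v_i)\big)$

$\psi_i(t) = t_0 + \alpha_i (t - t_0 - \tau_i)$

Random effects:
Acceleration factor: $\alpha_i \sim \log\mathcal{N}(0, \sigma_\alpha^2)$
Time-shift: $\tau_i \sim \mathcal{N}(0, \sigma_\tau^2)$
Space-shift: $v_i = (A_1 | \dots | A_K)\, s_i$, with $A_k \perp v_0$

Fixed effects: $(p_0, t_0, v_0)$ and $(\sigma_\alpha^2, \sigma_\tau^2, A_1, \dots, A_K)$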
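The time reparametrization and its two random effects can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names, the reference age `t0 = 70.0` and the standard deviations are hypothetical values chosen for the example. It draws the acceleration factor from a log-normal distribution and the time-shift from a Gaussian, then evaluates $\psi_i(t) = t_0 + \alpha_i (t - t_0 - \tau_i)$.

```python
import math
import random


def sample_time_effects(sigma_alpha, sigma_tau, rng):
    """Draw (alpha_i, tau_i): log-normal acceleration factor, Gaussian time-shift."""
    alpha = math.exp(rng.gauss(0.0, sigma_alpha))  # alpha_i ~ logN(0, sigma_alpha^2)
    tau = rng.gauss(0.0, sigma_tau)                # tau_i ~ N(0, sigma_tau^2)
    return alpha, tau


def psi(t, t0, alpha, tau):
    """Affine time reparametrization psi_i(t) = t0 + alpha_i * (t - t0 - tau_i)."""
    return t0 + alpha * (t - t0 - tau)


rng = random.Random(0)
t0 = 70.0  # hypothetical reference age, for illustration only
for i in range(3):
    alpha, tau = sample_time_effects(0.3, 5.0, rng)
    # Age of subject i mapped onto the time axis of the representative trajectory
    print(i, round(alpha, 3), round(tau, 3), round(psi(75.0, t0, alpha, tau), 3))
```

By construction $\psi_i(t_0 + \tau_i) = t_0$, so subject $i$ reaches the reference stage at age $t_0 + \tau_i$, and $\alpha_i > 1$ corresponds to a faster progression than the representative trajectory.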

SLIDE 11

Spatiotemporal Statistical Model

Submanifold-valued observations: $y_{ij} = T_i(\psi_i(t_{ij})) + \varepsilon_{ij}$
Parallel curve: $T_i(t) = \mathrm{Exp}_{T_0(t)}\big(P^{T_0}_{t_0, t}(v_i)\big)$
Representative trajectory: $T_0(t) = \mathrm{Exp}_{p_0, t_0}(v_0)(t)$
Linear time reparametrization: $\psi_i(t) = t_0 + \alpha_i (t - t_0 - \tau_i)$

Hidden random variables:
Acceleration factor: $\alpha_i \sim \log\mathcal{N}(0, \sigma_\alpha^2)$
Time shift: $\tau_i \sim \mathcal{N}(0, \sigma_\tau^2)$
Space shift: $v_i = (A_1 | \dots | A_K)\, s_i$, with $A_k \perp v_0$

Parameters:
Mean trajectory parametrization: $(p_0, t_0, v_0)$
Prior parameters: $(\sigma_\alpha^2, \sigma_\tau^2, A_1, \dots, A_K)$

SLIDE 12

Spatiotemporal Statistical Model

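[Figure: representative trajectory through $p_0$ and a parallel curve; labels $v_i$, $p_{ij}$, $v_{ij}$, $p_0$, $\Sigma$ and $P_{0t}^{T}\, \Sigma\, P_{0t}$]

Interest: parallel transport keeps the structure of the distribution invariant, but updates it in time

Comparison with previous work:

[Figure: two diagrams with labels $v_0$, $p_0$, $v_i$, $p_{ij}$, $v_{ij}$]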

SLIDE 13

Spatiotemporal Statistical Model

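The straight line model: $M = \mathbb{R}$

Laird & Ware (1982):
$y_{ij} = (a + a_i)(t_{ij} - t_0) + b + b_i + \varepsilon_{ij}$
where $b + b_i$ is the measurement of the $i$th subject at time $t_0$

Schiratti et al. (2015):
$y_{ij} = (a + a_i)(t_{ij} - t_0 - \tau_i) + b + \varepsilon_{ij}$
where $t_0 + \tau_i$ is the time at which the measurement of the $i$th subject reaches $b$

[Figure: two plots of $y$ against time, one per model, each marking $t_0$ and $b$]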
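The two straight-line parametrizations can be compared numerically. The sketch below is only an illustration with hypothetical function names and parameter values: the first function uses a random slope and a random intercept, the second a random slope and a random time-shift. For one fixed subject the two lines coincide when $b_i = -(a + a_i)\tau_i$; the models differ in which quantities are treated as independent random effects.

```python
def laird_ware(t, t0, a, b, a_i, b_i):
    """Random slope and random intercept: y = (a + a_i)(t - t0) + b + b_i."""
    return (a + a_i) * (t - t0) + b + b_i


def time_shift_model(t, t0, a, b, a_i, tau_i):
    """Random slope and random time-shift: y = (a + a_i)(t - t0 - tau_i) + b."""
    return (a + a_i) * (t - t0 - tau_i) + b


# Hypothetical values, for illustration only
t0, a, b = 70.0, 0.5, 10.0
a_i, tau_i = 0.2, 3.0

# In the time-shift model the subject reaches the value b at time t0 + tau_i
print(time_shift_model(t0 + tau_i, t0, a, b, a_i, tau_i))

# Intercept offset that makes the two parametrizations agree for this subject
b_i = -(a + a_i) * tau_i
for t in (60.0, 70.0, 82.5):
    print(t, laird_ware(t, t0, a, b, a_i, b_i), time_shift_model(t, t0, a, b, a_i, tau_i))
```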

SLIDE 14

Spatiotemporal Statistical Model

The logistic curve model
Geodesics are logistic curves
It is not equivalent to a linear model on the logit of the observations (i.e. the Riemannian log at $p_0 = 0.5$), since $p_0$ is estimated
If we fix $p_0 = 0.5$ in our model → we end up with our previous linear case (different from Laird & Ware)

$M = ]0, 1[$, $g(p)(u, v) = \dfrac{uv}{p^2 (1 - p)^2}$

$\gamma_0(t) = \left(1 + \dfrac{1 - p_0}{p_0} \exp\left(-\dfrac{v_0}{p_0 (1 - p_0)} (t - t_0)\right)\right)^{-1}$

$y_{ij} = \gamma_0\big(t_0 + \alpha_i (t_{ij} - t_0 - \tau_i)\big) + \varepsilon_{ij}$
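As a quick check of the logistic geodesic, the sketch below evaluates $\gamma_0(t) = \big(1 + \frac{1-p_0}{p_0}\exp(-\frac{v_0}{p_0(1-p_0)}(t-t_0))\big)^{-1}$ and the noise-free individual curve $\gamma_0(t_0 + \alpha_i(t - t_0 - \tau_i))$. The function names and the numerical values are hypothetical and chosen only for illustration.

```python
import math


def gamma0(t, p0, t0, v0):
    """Logistic geodesic on ]0, 1[ with gamma0(t0) = p0 and derivative v0 at t0."""
    rate = v0 / (p0 * (1.0 - p0))
    return 1.0 / (1.0 + (1.0 - p0) / p0 * math.exp(-rate * (t - t0)))


def individual_curve(t, p0, t0, v0, alpha_i, tau_i):
    """Noise-free trajectory of subject i: gamma0(t0 + alpha_i * (t - t0 - tau_i))."""
    return gamma0(t0 + alpha_i * (t - t0 - tau_i), p0, t0, v0)


# Hypothetical parameter values, for illustration only
p0, t0, v0 = 0.3, 70.0, 0.02
h = 1e-4
slope = (gamma0(t0 + h, p0, t0, v0) - gamma0(t0 - h, p0, t0, v0)) / (2.0 * h)
print(gamma0(t0, p0, t0, v0), slope)                     # value and slope at t0
print(individual_curve(t0 + 4.0, p0, t0, v0, 1.5, 4.0))  # subject reaches p0 at t0 + tau_i
```

The curve passes through $p_0$ at $t_0$ with slope $v_0$, which is why $(p_0, t_0, v_0)$ parametrize the representative trajectory.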

SLIDE 15

Spatiotemporal Statistical Model

The propagation model
Geodesics are logistic curves in each coordinate

$M = ]0, 1[^N$, $g(p)(u, v) = \sum_{k=1}^{N} \dfrac{u_k v_k}{p_k^2 (1 - p_k)^2}$

Parametric family of geodesics seen as a model of propagation of an effect:

$\gamma_\delta(t) = \big(\gamma_0(t), \gamma_0(t - \delta_1), \dots, \gamma_0(t - \delta_{N-1})\big)$

The parallel curve in the direction of the space-shift $v_i$ is written:

$\left(\gamma_0\left(t + \dfrac{v_{i,1}}{v_0}\right), \gamma_0\left(t - \delta_1 + \dfrac{v_{i,2}}{v_0}\right), \dots, \gamma_0\left(t - \delta_{N-1} + \dfrac{v_{i,N}}{v_0}\right)\right)$

→ The parallel changes the relative timing of the effect onset across coordinates
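The effect of a space-shift in the propagation model can be illustrated with a short sketch. It assumes the formulas as written on the slide: coordinate $k$ of the representative trajectory is the logistic curve $\gamma_0$ delayed by $\delta_k$ (with no delay for the first coordinate), and the parallel curve adds the time offset $v_{i,k}/v_0$ to coordinate $k$. Function names and numerical values are hypothetical.

```python
import math


def gamma0(t, p0, t0, v0):
    """Logistic geodesic with gamma0(t0) = p0 and derivative v0 at t0."""
    rate = v0 / (p0 * (1.0 - p0))
    return 1.0 / (1.0 + (1.0 - p0) / p0 * math.exp(-rate * (t - t0)))


def gamma_delta(t, delays, p0, t0, v0):
    """Representative trajectory: coordinate k follows gamma0 delayed by delta_k."""
    all_delays = [0.0] + list(delays)  # the first coordinate has no delay
    return [gamma0(t - d, p0, t0, v0) for d in all_delays]


def parallel_curve(t, delays, space_shift, p0, t0, v0):
    """Parallel curve: coordinate k is offset in time by v_{i,k} / v0."""
    all_delays = [0.0] + list(delays)
    return [gamma0(t - d + w / v0, p0, t0, v0)
            for d, w in zip(all_delays, space_shift)]


# Hypothetical values: 4 coordinates, delays in years
delays = [2.0, 5.0, 9.0]
base = gamma_delta(72.0, delays, 0.5, 70.0, 0.05)
# Shifting only the first coordinate advances its onset relative to the others
shifted = parallel_curve(72.0, delays, [0.1, 0.0, 0.0, 0.0], 0.5, 70.0, 0.05)
print(base)
print(shifted)
```

A zero space-shift recovers the representative trajectory, while a nonzero component changes the timing of one coordinate without touching the others.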

SLIDE 16

Parameter Estimation

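Maximum Likelihood:
$\max_\theta\, p(y \mid \theta)$, with $p(y \mid \theta) = \int p(y, z \mid \theta)\, dz$

EM:
$\theta_{k+1} = \arg\max_\theta \sum_{i=1}^{N} \int \log\big(p(y_i, z_i \mid \theta)\big)\, p(z_i \mid y_i, \theta_k)\, dz_i$
where $p(y_i, z_i \mid \theta) = p(y_i \mid z_i, \theta)\, p(z_i \mid \theta)$

Distribution from the curved exponential family:
$\log p(y_i, z_i \mid \theta) = \phi(\theta)^T S(y_i, z_i) - \log(C(\theta))$
$\theta_{k+1} = \arg\max_\theta \Big\{ \phi(\theta)^T \sum_{i=1}^{N} \int S(y_i, z_i)\, p(z_i \mid y_i, \theta_k)\, dz_i - N \log(C(\theta)) \Big\}$

$y = (y_1, \dots, y_N)$, $z = (z_1, \dots, z_N)$, $\theta = (\sigma_z^2, \sigma_\varepsilon^2, A_1, \dots, A_K, p_0, t_0, v_0)$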
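To make the EM recursion concrete, here is a sketch on a deliberately simple toy model that is not the model of these slides: hidden $z_i \sim \mathcal{N}(\mu, 1)$, observed $y_i \mid z_i \sim \mathcal{N}(z_i, 1)$, single parameter $\theta = \mu$. In this toy case the conditional expectation of the sufficient statistic is available in closed form, $E[z_i \mid y_i, \mu_k] = (y_i + \mu_k)/2$, and the M-step sets $\mu_{k+1}$ to its average. The function name and the data are hypothetical.

```python
import random


def em_toy(y, n_iter=50, mu=0.0):
    """Exact EM for z_i ~ N(mu, 1), y_i | z_i ~ N(z_i, 1)."""
    n = len(y)
    for _ in range(n_iter):
        # E-step: averaged conditional expectation of the sufficient statistic z_i
        s = sum((yi + mu) / 2.0 for yi in y) / n
        # M-step: the expected complete log-likelihood is maximized at mu = s
        mu = s
    return mu


rng = random.Random(2)
# Marginally y_i ~ N(mu, 2); the true mean 3.0 is a hypothetical choice
y = [3.0 + rng.gauss(0.0, 2.0 ** 0.5) for _ in range(50)]
print(em_toy(y), sum(y) / len(y))
```

The iteration contracts towards the sample mean of $y$, which is the maximum likelihood estimate of $\mu$ in this toy model. When the E-step integral has no closed form, as in the manifold-valued model, it has to be approximated.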

SLIDE 17

Parameter Estimation: stochastic algorithm

SA-EM: replaces integration by one simulation of the hidden variable: sample from $q(z_i \mid y_i, \theta_k)$, and a stochastic approximation of the sufficient statistics

$z_{i,k+1} \sim p(z_i \mid y_i, \theta_k)$
$S_{k+1} = (1 - \Delta_k)\, S_k + \Delta_k \Big( \dfrac{1}{N} \sum_{i=1}^{N} S(y_i, z_{i,k+1}) \Big)$
Maximization step (unchanged): $\theta_{k+1} = \arg\max_\theta \big\{ \phi(\theta)^T S_{k+1} - \log(C(\theta)) \big\}$

MCMC-SAEM: replaces sampling by a single Markov chain step
For each subject, sample the random effect w.r.t. a transition kernel of a geometrically ergodic Markov chain targeting the conditional distribution

[Delyon, Lavielle, Moulines ’99] [Allassonnière et al. ’10]
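The three steps of MCMC-SAEM (one Markov chain move per subject, stochastic approximation of the sufficient statistic, maximization) can be sketched on a toy model. This is an illustration under stated assumptions, not the authors' implementation: the toy model is $z_i \sim \mathcal{N}(\mu, 1)$, $y_i \mid z_i \sim \mathcal{N}(z_i, 1)$ with $\theta = \mu$, the kernel is a random-walk Metropolis-Hastings step, and the step sizes $\Delta_k$ (equal to 1 during a burn-in, then decreasing as a power of $k$) are a common but hypothetical choice.

```python
import math
import random


def mcmc_saem_toy(y, n_iter=2000, burn_in=200, prop_std=1.0, seed=0):
    """MCMC-SAEM for z_i ~ N(mu, 1), y_i | z_i ~ N(z_i, 1); returns the estimate of mu."""
    rng = random.Random(seed)
    n = len(y)
    z = list(y)      # initial hidden variables
    mu = 0.0         # initial parameter
    s = sum(z) / n   # running sufficient statistic

    def log_target(zi, yi, m):
        # log p(z_i | y_i, mu) up to an additive constant
        return -0.5 * (yi - zi) ** 2 - 0.5 * (zi - m) ** 2

    for k in range(n_iter):
        # Simulation: a single random-walk Metropolis-Hastings step per subject
        for i in range(n):
            prop = z[i] + rng.gauss(0.0, prop_std)
            log_ratio = log_target(prop, y[i], mu) - log_target(z[i], y[i], mu)
            if rng.random() < math.exp(min(0.0, log_ratio)):
                z[i] = prop
        # Stochastic approximation of the sufficient statistic
        delta = 1.0 if k < burn_in else (k - burn_in + 1) ** -0.7
        s = (1.0 - delta) * s + delta * sum(z) / n
        # Maximization: closed form in this toy model
        mu = s
    return mu


rng = random.Random(1)
y = [3.0 + rng.gauss(0.0, math.sqrt(2.0)) for _ in range(100)]
print(mcmc_saem_toy(y), sum(y) / len(y))
```

In this toy model the maximum likelihood estimate is the sample mean of $y$, so the two printed numbers should be close. The hidden variables are never integrated out: only one Markov chain move per subject is made at each iteration.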


SLIDE 18

Parameter Estimation: stochastic algorithm

SA-EM: replaces integration by one simulation of the hidden variable: sample from $\tilde{q}_k(z_i, \theta_k)$, and a stochastic approximation of the sufficient statistics

$z_{i,k+1} \sim p(z_i \mid y_i, \theta_k)$
$S_{k+1} = (1 - \Delta_k)\, S_k + \Delta_k \Big( \dfrac{1}{N} \sum_{i=1}^{N} S(y_i, z_{i,k+1}) \Big)$
Maximization step (unchanged): $\theta_{k+1} = \arg\max_\theta \big\{ \phi(\theta)^T S_{k+1} - \log(C(\theta)) \big\}$

MCMC-SAEM: replaces sampling by a single Markov chain step
For each subject, sample the random effect w.r.t. a transition kernel of a geometrically ergodic Markov chain targeting the conditional distribution

As long as $\tilde{q}_k$ “converges towards” $q(z_i \mid y_i, \theta_k)$ as $k \to \infty$

[Chevallier & Allassonnière, preprint ’19]
SLIDE 19

Parameter Estimation: stochastic algorithm

Theoretical properties of the sampler:

Under mild conditions:
– Drift property
– Small set
– Geometric ergodicity uniformly on any compact set of the parameters

Theoretical properties of the estimation algorithm:

– a.s. convergence towards the MAP estimator
– Normal asymptotic behaviour: speed $1/\sqrt{\Delta_k}$
– Normal asymptotic behaviour with optimal speed $1/\sqrt{k}$ with averaging sequences

SLIDE 20

Model of Alzheimer’s disease progression

The average trajectory of data changes

Neuropsychological tests ADAS-Cog from ADNI
248 subjects who converted from MCI to AD
6 time-points per subject on average (min 3, max 11)

Data points $y_{ij} \in ]0, 1[^4$ with propagation logistic model

[Schiratti et al. IPMI’15, NIPS’15]

SLIDE 21

Model of Alzheimer’s disease progression

[Schiratti et al. IPMI’15, NIPS’15]

Acceleration factor $\alpha_i$: distinguish fast vs. slow progressors

Time-shift: distinguish early vs. late onset individuals

[Figure: curves at $+1\sigma$ and $-1\sigma$]

SLIDE 22

Model of Alzheimer’s disease progression

[Schiratti et al. IPMI’15, NIPS’15]

Variability in the relative timing and ordering of the events

[Figure: effect of decomposition vectors $A_1$ and $A_2$ at $+1\sigma$ and $-1\sigma$]

SLIDE 23

Computational comparisons

Comparison of: MCMC-SAEM – STAN – MONOLIX
Number of iterations:
– MCMC-SAEM: 1 000 000 (6 s / 1 000 iterations)
– STAN: 15 000 (25 min / 1 000 iterations)
– MONOLIX: 20 000 (3.5 min / 1 000 iterations)
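The per-1 000-iteration timings can be converted into total running times with simple arithmetic. The sketch below assumes that the iteration counts above are the totals for a single run of each tool, which the slide does not state explicitly; the dictionary layout and function name are hypothetical.

```python
# (number of iterations, minutes per 1 000 iterations), as listed on the slide
runs = {
    "MCMC-SAEM": (1_000_000, 6.0 / 60.0),  # 6 s per 1 000 iterations
    "STAN": (15_000, 25.0),
    "MONOLIX": (20_000, 3.5),
}


def total_minutes(iterations, minutes_per_1000):
    """Total running time in minutes for a given iteration count."""
    return iterations / 1000.0 * minutes_per_1000


totals = {name: total_minutes(*spec) for name, spec in runs.items()}
for name, minutes in totals.items():
    print(f"{name}: about {minutes:.0f} min")
```

Under that assumption the totals are about 100 min for MCMC-SAEM, 375 min for STAN and 70 min for MONOLIX, so MCMC-SAEM runs far more iterations in a comparable time budget.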

SLIDE 24

Computational comparisons

Comparison of: MCMC-SAEM – STAN – MONOLIX

[Figure: parameter estimates; legend: True values, MCMC-SAEM, STAN, Monolix]

SLIDE 25

Model of Alzheimer’s disease progression

[Schiratti et al. IPMI’15, NIPS’15]

SLIDE 26

Comparison AD vs Controls

SLIDE 27

Comparison MCI vs Controls

SLIDE 28

Each snapshot corresponds to the neuronal loss within 5 years (from 70-75 to 85-90)
Neuronal loss within 10 years

ADNI data → Alzheimer’s Disease cohort
Measures of the cortical thickness for MCI converters

Model of propagation on a graph

[Koval et al, Frontiers in Neurosciences’17]

SLIDE 29

Spatiotemporal Statistical Model

[Schiratti et al. IPMI’15, NIPS’15]

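Geodesic representative trajectory
→ What about treated pathologies?

[Figure: representative trajectory through $p_0$ at time $t_0$ with velocity $v_0$, and a parallel curve shifted by $v_i$; labels $t_0$, $t$, $v_0$, $p_0$, $v_i$, $p_{ij}$, $v_{ij}$]

$T_0(t) = \mathrm{Exp}_{p_0, t_0}(v_0)(t)$

$y_{ij} = T_i(\psi_i(t_{ij})) + \varepsilon_{ij}$

$T_i(t) = \mathrm{Exp}_{T_0(t)}\big(P^{T_0}_{t_0, t}(v_i)\big)$

$\psi_i(t) = t_0 + \alpha_i (t - t_0 - \tau_i)$

Random effects:
Acceleration factor: $\alpha_i \sim \log\mathcal{N}(0, \sigma_\alpha^2)$
Time-shift: $\tau_i \sim \mathcal{N}(0, \sigma_\tau^2)$
Space-shift: $v_i = (A_1 | \dots | A_K)\, s_i$, with $A_k \perp v_0$

Fixed effects: $(p_0, t_0, v_0)$ and $(\sigma_\alpha^2, \sigma_\tau^2, A_1, \dots, A_K)$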

SLIDE 30

Spatiotemporal Statistical Model Piecewise geodesic trajectories

SLIDE 31

Spatiotemporal Statistical Model Piecewise geodesic trajectories

[Chevallier et al, NIPS’17]

SLIDE 32

Spatiotemporal Statistical Model Piecewise geodesic trajectories

[Chevallier et al, NIPS’17]

SLIDE 33

Spatiotemporal Statistical Model Piecewise geodesic trajectories

[Chevallier et al, NIPS’17]

SLIDE 34

Spatiotemporal Statistical Model Piecewise geodesic trajectories

SLIDE 35

The individual parameters are related to the real age of conversion of the individuals

Prediction tool

SLIDE 36

Conclusion

Generic statistical model to learn spatiotemporal distribution of trajectories on manifolds:
– Calibrated on longitudinal data sets using MCMC-SAEM
– Automatically finds temporal correspondences among similar events that may happen at different age/time
– Estimates the variability of the data at the corresponding events

It allows us to position disease progression within the life and history of the patient

Future work:
– Derive instances of the model for more complex manifold-valued data (e.g. textured shapes data), metamorphosis and mixtures of all of these!

SLIDE 37

Thank you!

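[Figure: representative trajectory $T_0(t) = \mathrm{Exp}_{p_0, t_0}(v_0)(t)$ through $p_0$ at time $t_0$ with velocity $v_0$, and a parallel curve shifted by $v_i$; labels $p_{ij}$, $v_{ij}$, $t_0$, $t$]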