Statistical inference in transport-fragmentation models, Marc Hoffmann



SLIDE 1

Statistical inference in transport- fragmentation models Marc Hoffmann Genealogical versus temporal data The size dependent division rate model Estimating the age dependent division rate

Statistical inference in transport-fragmentation models

Marc Hoffmann

Paris-Dauphine University

Van Dantzig Seminar, 6 March 2015

SLIDE 2

Acknowledgements

This talk is based on joint projects (some are still in progress!) with

  • M. Doumic (INRIA)
  • N. Krell (University of Rennes)
  • A. Olivier (Paris-Dauphine University)
  • P. Reynaud-Bouret (CNRS)
  • V. Rivoirard (Paris-Dauphine University)
  • L. Robert (INRA)

SLIDE 3

Context (1/4)

We consider (simple) branching processes with deterministic evolution between jump times. Such models appear as toy models for population growth in cellular biology. We wish to statistically estimate the parameters of the model, in order to ultimately discriminate between different hypotheses related to the mechanisms that trigger cell division.

SLIDE 4

Context (2/4)

We structure the model by state variables for each individual, such as size, age, growth rate, DNA content and so on.

The evolution of the particle system is described by a common mechanism:

1. Each particle grows by “ingesting a common nutrient” = deterministic evolution.
2. After some time, depending on a structure variable, each particle gives rise to k = 2 offspring by cell division = branching event.

Our goal in this talk: estimate the branching rate as a function of age or size (or both).

SLIDE 5

Figure: Evolution of an E. coli population.

SLIDE 16

Context (3/4)

Deterministically, the density of structured state variables evolves according to a so-called fragmentation-transport PDE. Stochastically, the particles evolve according to a piecewise deterministic Markov process that evolves along a branching tree. We study nonparametric inference of the division rate, with the concern of matching the deterministic and stochastic approaches.

SLIDE 17

Context (4/4)

I will follow a “pedestrian route” by reviewing some of the results we progressively obtained by “trial-and-error”. In particular, the results are highly sensitive to the choice of the observation schemes (genealogical versus temporal).

Our control experiments are data sets extracted from the observation of 88 microcolonies of E. coli bacteria cultures (a colony is followed from a single ancestor up to a few hundred descendants).

SLIDE 18

Outline

1. Genealogical versus temporal data
2. The size dependent division rate model
  • Estimation at a (large) fixed time in a proxy model
  • Estimation through genealogical data
3. Estimating the age dependent division rate

SLIDE 19

Genealogical representation

In the talk we focus on structuring variables that are either age or size. The population evolution is associated with an infinite marked binary tree
$$U = \bigcup_{n=0}^{\infty} \{0,1\}^n \quad \text{with } \{0,1\}^0 := \emptyset.$$
To each node $u \in U$, we associate a cell with size at birth given by $\xi_u$ and lifetime $\zeta_u$. To each $u \in U$, we also associate a birth time $b_u$ and a time of death $d_u$, so that $\zeta_u = d_u - b_u$.

SLIDE 20

Observation scheme I: temporal data

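Fix a (large) $T > 0$. Define
$$U_T = \{u \in U,\ b_u \le T\}.$$
We have $U_T = \mathring{U}_T \cup \partial U_T$, with
$$\mathring{U}_T = \{u,\ d_u \le T\} \quad \text{and} \quad \partial U_T = \{u,\ b_u \le T < d_u\}.$$
We observe
$$\{\zeta_u^T \text{ and/or } \xi_u^T,\ u \in U_T\},$$
where $\zeta_u^T = \min\{d_u, T\} - b_u$, and $\xi_u^T = \xi_u$ if $d_u \le T$ and the “size of $u$ at time $T$” otherwise.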

SLIDE 21

Observation scheme II: genealogical data

Notation: $|u| = n$ if $u = (u_1, \ldots, u_n) \in U$, and $uv = (u_1, \ldots, u_n, v_1, \ldots, v_m)$ if $v = (v_1, \ldots, v_m) \in U$.

Sparse tree case. Given $u^{(n)} \in U$ with $|u^{(n)}| = n$, let
$$U_{u^{(n)}} = \{u \in U,\ uw = u^{(n)} \text{ for some } w \in U\}.$$
We observe
$$\{\zeta_u \text{ and/or } \xi_u,\ u \in U_{u^{(n)}}\}.$$

Full tree case. For $n = 2^{k_n}$, define $U_{[n]} = \{u \in U,\ |u| \le k_n\}$. We observe
$$\{\xi_u \text{ and/or } \zeta_u,\ u \in U_{[n]}\}.$$
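
The two genealogical schemes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: cells are encoded as tuples of 0/1 labels (the root being the empty tuple), and the helper names `ancestors` and `full_tree` are hypothetical, not taken from the talk.

```python
from itertools import product


def ancestors(u):
    """Sparse tree: every prefix of the label u, from the root () down to u."""
    return [u[:k] for k in range(len(u) + 1)]


def full_tree(k):
    """Full tree: every label of length at most k."""
    return [label for m in range(k + 1) for label in product((0, 1), repeat=m)]


u_n = (0, 1, 1)            # a hypothetical cell of generation 3
print(ancestors(u_n))      # the lineage observed in the sparse tree case
print(len(full_tree(3)))   # number of cells observed in the full tree case
```

With labels of length at most $k_n$ the full tree contains $2^{k_n+1} - 1$ cells, which is of order $n = 2^{k_n}$.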

SLIDE 22

Temporal data

Figure: Genealogical tree observed up to $T = 7$ for a time-dependent division rate $B(a) = a^2$ (60 cells). In blue: $\mathring{U}_T$. In red: $\partial U_T$.

SLIDE 23

Genealogical data

Figure : The same outcome organised at a genealogical level.

SLIDE 24

Outline

1. Genealogical versus temporal data
2. The size dependent division rate model
  • Estimation at a (large) fixed time in a proxy model
  • Estimation through genealogical data
3. Estimating the age dependent division rate

SLIDE 25

Size dependent division rate (1/2)

Perthame, Transport Equations in Biology, Birkhäuser, 2006.

  • $n(t, x)$: density of cells of size $x$.
  • Parameter of interest: division rate $B(x)$.
  • One cell of size $x$ gives birth to two cells of size $x/2$.
  • The growth of the cell size by nutrient uptake is given by a growth rate $g(x) = \tau x$ in this talk: the size follows the deterministic evolution $\frac{dX(t)}{dt} = g(X(t))$.

SLIDE 26

Size dependent division rate (2/2)

The deterministic model: transport-fragmentation equation
$$\partial_t n(t, x) + \partial_x\big(\tau x\, n(t, x)\big) + B(x)\, n(t, x) = 4 B(2x)\, n(t, 2x),$$
$$n(t, x = 0) = 0,\ t > 0 \quad \text{and} \quad n(0, x) = n^{(0)}(x),\ x \ge 0.$$

Obtained by a mass conservation law:

  • LHS: density evolution + growth by nutrient + division of cells of size $x$.
  • RHS: division of cells of size $2x$.

SLIDE 27

Nonparametric estimation of B: First approach

  • Represent the solution of the transport-fragmentation equation in a stationary regime.
  • Obtain a reconstruction formula for B(x) via this representation, in terms of the steady-state (stationary) density of the model.
  • Postulate a proxy model where one observes exactly a sample drawn from the stationary density.
  • Transfer standard nonparametric estimation techniques to this setting.

SLIDE 28

Solution by stable distribution

Start with the transport-fragmentation equation
$$\partial_t n(t, x) + \partial_x\big(\tau x\, n(t, x)\big) + B(x)\, n(t, x) = 4 B(2x)\, n(t, 2x).$$
Ansatz $n(t, x) = e^{\lambda t} N(x)$:
$$\partial_x\big(\tau x\, N(x)\big) + \big(\lambda + B(x)\big) N(x) = 4 B(2x)\, N(2x),$$
$$N(0) = 0, \quad N(x) > 0 \text{ for } x > 0 \quad \text{and} \quad \int_{[0,\infty)} N(x)\,dx = 1.$$

Perthame et al. (2005) prove $n(t, x) \approx e^{\lambda t} N(x)$ with explicit (fast) rates of convergence (steady-state) under fairly general conditions.

SLIDE 29

A proxy statistical model (1/4)

This yields a strategy for the nonparametric estimation of $B$. At time $T$, the data approximately behave as if drawn from $N(x)\,dx$. Recover $B$ through the representation
$$L(N, \lambda) = \mathcal{L}(BN),$$
with
$$L(f, \lambda)(x) = \partial_x\big(\tau x f(x)\big) + \lambda f(x), \qquad \mathcal{L}(f)(x) = 4 f(2x) - f(x).$$
The operator $L(\cdot, \lambda)$ has ill-posedness degree of order 1.

SLIDE 30

A proxy statistical model (2/4)

We postulate the observation of outcomes of cell sizes $X_1, \ldots, X_n$ that are in a stationary regime and independent:
$$\mathbb{P}(X_1 \in dx_1, \ldots, X_n \in dx_n) := \prod_{i=1}^{n} N(x_i)\,dx_i.$$
We can take advantage of kernel methods in nonparametric estimation. $\tau$ and $\lambda$ are assumed to be known (or a proxy $\lambda_n$ of $\lambda$ is given with sufficient accuracy).

SLIDE 31

A proxy statistical model (3/4)

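Reconstruction method:

1. Construct an estimator $\widehat{L}_n(x)$ of the action $L(N, \lambda)(x) = \partial_x\big(\tau x N(x)\big) + \lambda N(x)$.
2. Build an approximate inverse $\mathcal{L}_k^{-1}$ of the inverse of $\mathcal{L}(f)(x) = 4 f(2x) - f(x)$.
3. Use the representation $L(N, \lambda) = \mathcal{L}(BN)$ and take as final estimator
$$\widehat{B}_n(x) := \frac{\mathcal{L}_{k_n}^{-1}\big(\widehat{L}_n\big)(x)}{\widehat{N}_n(x)},$$
where $\widehat{N}_n(x) = n^{-1} \sum_{i=1}^{n} h_n^{-1} K\big(h_n^{-1}(x - X_i)\big)$ is a kernel estimator of $N(x)$ for an appropriate bandwidth $h_n > 0$.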
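
As an illustration of the kernel estimator of $N(x)$ only (not of the full reconstruction), here is a minimal Python sketch. The Gaussian kernel, the bandwidth value and the sample of sizes are assumptions made for the example; the talk does not specify them.

```python
import math


def gaussian_kernel(z):
    """Standard Gaussian density, used here as the kernel K (an assumption)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def kernel_density(x, sample, h, kernel=gaussian_kernel):
    """n^{-1} * sum_i h^{-1} * K((x - X_i) / h), evaluated at the point x."""
    n = len(sample)
    return sum(kernel((x - xi) / h) for xi in sample) / (n * h)


sizes = [0.8, 1.0, 1.1, 1.3, 1.7]   # made-up cell sizes, for illustration only
print(kernel_density(1.0, sizes, h=0.3))
```

In practice the bandwidth would be chosen as a function of $n$ and of the assumed smoothness, or by a data-driven rule.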

SLIDE 32

A proxy statistical model (4/4)

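In Doumic, H., Rivoirard and Reynaud-Bouret (2011), we construct an approximate inverse $\mathcal{L}_k^{-1}$ such that
$$\big\|\mathcal{L}_k^{-1}(\varphi) - \mathcal{L}^{-1}(\varphi)\big\|_{L^2(D)} \lesssim k^{-1/2}\, \|\varphi\|_{H^1}$$
and reconstruct $L(N, \lambda)(x)$ by kernel methods. We obtain an estimator $\widehat{B}_n$ such that
$$\Big(\mathbb{E}\big[\|\widehat{B}_n - B\|^2_{L^2(D)}\big]\Big)^{1/2} \lesssim n^{-s/(2s+3)}$$
uniformly in $B$ over Sobolev balls (over the compact $D \subset (0, \infty)$). The result is compatible with previous deterministic results by Perthame and collaborators.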

SLIDE 33

Limitations of the deterministic based approach

  • We implicitly assume a stationary regime (the steady-state approximation).
  • We do not take advantage of richer available observation schemes. In particular, if we have access to the finer structure of the tree, can we beat the ill-posedness imposed by our approach?
  • And more: constant growth rate, assuming the two (sibling) offspring are of the same size at birth, etc.

SLIDE 34

The stochastic (cell level) approach (1/3)

  • We start with a single cell of size $x_0$.
  • The cell grows exponentially according to a constant rate $\tau$.
  • The mother cell gives rise to two offspring, at a rate $B(x)$ that depends on its size $x$.
  • The two offspring have initial size $x_1/2$, where $x_1$ is the size of the mother at division.
  • The two offspring start independent growth according to the rate $\tau$ and divide according to the rate $B(x)$.
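
The mechanism above can be simulated along a single lineage. The sketch below is a hedged illustration, assuming the specific rate $B(x) = x^2$ (chosen because the lifetime can then be sampled in closed form); the function names and parameter values are hypothetical.

```python
import math
import random


def division_time(x, tau, rng):
    """Lifetime of a cell born with size x, assuming B(x) = x**2.

    The size at age t is x * exp(tau * t), so the cumulative division hazard is
    x**2 * (exp(2 * tau * t) - 1) / (2 * tau). Setting it equal to a standard
    exponential draw and solving for t gives the lifetime.
    """
    e = rng.expovariate(1.0)
    return math.log(1.0 + 2.0 * tau * e / (x * x)) / (2.0 * tau)


def simulate_lineage(x0, tau, n, seed=0):
    """Follow one daughter at each division; return (size at birth, lifetime) pairs."""
    rng = random.Random(seed)
    cells = []
    size_at_birth = x0
    for _ in range(n):
        lifetime = division_time(size_at_birth, tau, rng)
        cells.append((size_at_birth, lifetime))
        # The mother has size x * exp(tau * lifetime) at division; each daughter gets half.
        size_at_birth = 0.5 * size_at_birth * math.exp(tau * lifetime)
    return cells


for xi, zeta in simulate_lineage(x0=1.0, tau=1.0, n=5):
    print(f"size at birth {xi:.3f}, lifetime {zeta:.3f}")
```

Because both daughters have the same size, following either one gives the same law for the sequence of sizes at birth.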

SLIDE 35

The stochastic (cell level) approach (2/3)

To each node $u \in U$, we associate a cell with size at birth given by $\xi_u$ and lifetime $\zeta_u$. $u^-$ denotes the parent of $u$. Thus
$$2 \xi_u = \xi_{u^-} \exp\big(\tau \zeta_{u^-}\big).$$
$X(t) = \big(X_1(t), X_2(t), \ldots\big)$ is the process of the sizes of the population at time $t$.

SLIDE 36

The stochastic approach (3/3)

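$X(t)$ ↔ finite point measure valued process
$$\sum_{i=1}^{\sharp X(t)} \delta_{X_i(t)}.$$
Identity between point measures:
$$\sum_{i=1}^{\infty} 1_{\{X_i(t) > 0\}}\, \delta_{X_i(t)} = \sum_{u \in U} \delta_{\xi_u e^{\tau(t - b_u)}}\, 1_{\{b_u \le t < b_u + \zeta_u\}}.$$
In particular, observing $(X(t), t \in [0, T])$ is equivalent to observing $\{\xi_u^T, \zeta_u^T,\ u \in U_T\}$.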

SLIDE 37

Matching det. and stoch. approaches (1/3)

We can relate $X(t)$ and $n(t, x)$ via so-called many-to-one formulae. This is a classical technique for fragmentation and branching processes (see e.g. Bansaye et al. 2009, Bertoin 2006, Cloez 2011): pick a cell at random at each division and follow its size $\chi(t)$ through time. For $\xi_\emptyset = x$,
$$\chi(t) = \frac{x\, e^{\tau t}}{2^{N_t}},$$
where $N_t$ is the number of divisions of the tagged fragment up to time $t$.

SLIDE 38

Matching det. and stoch. approaches (2/3)

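Step 1: for every (regular compactly supported) $f$,
$$\mathbb{E}\Big[\sum_{i=1}^{\infty} f\big(X_i(t)\big)\Big] = \mathbb{E}\Big[\sum_{u \in U} f\big(\xi_u^t\big)\Big].$$

Step 2: many-to-one formula,
$$\mathbb{E}\big[f\big(\chi(t)\big)\big] = \mathbb{E}\Big[\sum_{u \in U} \xi_u^t\, \frac{e^{-\tau t}}{x}\, f\big(\xi_u^t\big)\Big].$$

Step 3: finally,
$$\mathbb{E}\Big[f\big(\chi(t)\big)\, \frac{x\, e^{\tau t}}{\chi(t)}\Big] = \mathbb{E}\Big[\sum_{i=1}^{\infty} f\big(X_i(t)\big)\Big].$$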

SLIDE 39

Transport-fragmentation equation

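Set, for (regular compactly supported) $f$,
$$\langle n(t, \cdot), f \rangle := \mathbb{E}\Big[\sum_{i=1}^{\infty} f\big(X_i(t)\big)\Big].$$

We have (in a weak sense)
$$\partial_t n(t, x) + \partial_x\big(\tau x\, n(t, x)\big) + B(x)\, n(t, x) = 4 B(2x)\, n(t, 2x).$$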

Therefore the mean empirical distribution of X(t) satisfies the deterministic transport-fragmentation equation.

SLIDE 40

Statistical estimation of B(x)

Observation scheme: genealogical data from two possible schemes:

  • Sparse tree: we observe, for some $u^{(n)}$ with $|u^{(n)}| = n$, $\{\xi_u,\ uw = u^{(n)} \text{ for some } w \in U\}$.
  • Full tree: we observe, for $n = 2^{k_n}$, $\{\xi_u,\ |u| \le k_n\}$.

Asymptotics: $n \to \infty$.

SLIDE 41

Statistical estimation: identifying B(x)

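We have
$$\mathbb{P}\big(\zeta_u \in [t, t + dt] \,\big|\, \zeta_u \ge t,\ \xi_u = x\big) = B(x e^{\tau t})\,dt,$$
from which we obtain the density of the lifetime $\zeta_{u^-}$ conditional on $\xi_{u^-} = x$:
$$t \mapsto B(x e^{\tau t}) \exp\Big(-\int_0^t B(x e^{\tau s})\,ds\Big).$$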

SLIDE 42

Toward a Markov kernel

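Using $2 \xi_u = \xi_{u^-} \exp\big(\tau \zeta_{u^-}\big)$, we further infer
$$\mathbb{P}\big(\xi_u \in dx' \,\big|\, \xi_{u^-} = x\big) = \frac{B(2x')}{\tau x'}\, 1_{\{x' \ge x/2\}} \exp\Big(-\int_{x/2}^{x'} \frac{B(2s)}{\tau s}\,ds\Big)\,dx'.$$

We thus obtain a simple and explicit representation for the transition kernel $P_B(x, dx') = P_B(x, x')\,dx'$:
$$P_B(x, x') = \frac{B(2x')}{\tau x'}\, 1_{\{x' \ge x/2\}} \exp\Big(-\int_{x/2}^{x'} \frac{B(2s)}{\tau s}\,ds\Big).$$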
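
As a sanity check on the kernel formula, the following Python sketch evaluates $P_B(x, x')$ in the special case $B(x) = x^2$, an assumption made here because the inner integral is then explicit: $\int_{x/2}^{x'} \frac{B(2s)}{\tau s}\,ds = \frac{2}{\tau}\big(x'^2 - x^2/4\big)$. It then verifies numerically that the kernel integrates to one in $x'$. The function names and the quadrature settings are illustrative choices.

```python
import math


def transition_density(x, x_new, tau):
    """P_B(x, x') for the assumed rate B(x) = x**2."""
    if x_new < x / 2.0:
        return 0.0
    exponent = (2.0 / tau) * (x_new ** 2 - x ** 2 / 4.0)
    return (4.0 * x_new / tau) * math.exp(-exponent)


def total_mass(x, tau, upper=10.0, steps=100_000):
    """Midpoint-rule integral of x' -> P_B(x, x') over [x/2, upper]."""
    lower = x / 2.0
    width = (upper - lower) / steps
    return width * sum(
        transition_density(x, lower + (i + 0.5) * width, tau) for i in range(steps)
    )


print(total_mass(x=1.0, tau=1.0))   # should be close to 1 for a probability kernel
```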

SLIDE 43

Assumptions on B

Under appropriate conditions on $B$, the Markov chain on $(0, \infty)$ is geometrically ergodic: there exists a unique invariant probability $\nu_B(dx) = \nu_B(x)\,dx$ on $[0, \infty)$ such that $\nu_B P_B = \nu_B$. (The chain is however not reversible.) More precisely, we have the contraction property
$$\sup_{|g| \le V} \Big| P_B^k g(x) - \int_S g(z)\, \nu_B(z)\,dz \Big| \le R\, V(x)\, \gamma^k$$
for an appropriate Lyapunov function $V$ and some (explicitly computable) $\gamma < 1$.

SLIDE 44

Identifying B(x) through the invariant measure

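Expand the equation $\nu_B P_B = \nu_B$:
$$\begin{aligned}
\nu_B(y) &= \int_0^{\infty} \nu_B(x)\, P_B(x, y)\,dx \\
&= \frac{B(2y)}{\tau y} \int_0^{2y} \nu_B(x) \exp\Big(-\int_{x/2}^{y} \frac{B(2s)}{\tau s}\,ds\Big)\,dx \\
&= \frac{B(2y)}{\tau y} \int_0^{\infty}\!\!\int_0^{\infty} 1_{\{x \le 2y,\ s \ge y\}}\, \nu_B(x)\, P_B(x, s)\,ds\,dx.
\end{aligned}$$

This yields the key representation
$$\nu_B(y) = \frac{B(2y)}{\tau y}\, \mathbb{P}_{\nu_B}\big(\xi_{u^-} \le 2y,\ \xi_u \ge y\big).$$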

SLIDE 45

Key representation

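We conclude
$$B(y) = \frac{\tau y}{2}\, \frac{\nu_B(y/2)}{\mathbb{P}_{\nu_B}\big(\xi_{u^-} \le y,\ \xi_u \ge y/2\big)}.$$
This yields the estimator
$$\widehat{B}_n(y) = \frac{\tau y}{2}\, \frac{n^{-1} \sum_{u \in U_{[n]}} K_{h_n}(\xi_u - y/2)}{n^{-1} \sum_{u \in U_{[n]}} 1_{\{\xi_{u^-} \le y,\ \xi_u \ge y/2\}} \vee \varpi_n},$$
where the kernel $K_{h_n}(y) = h_n^{-1} K\big(h_n^{-1} y\big)$ is specified with an appropriate bandwidth $h_n$ (and a technical threshold $\varpi_n$).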
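
A direct transcription of this estimator might look like the sketch below. It is an illustration under assumptions, not the authors' implementation: the data are given as (parent size at birth, child size at birth) pairs, the same pairs feed both numerator and denominator, the Gaussian kernel is a default choice, and the bandwidth and threshold values in the example are arbitrary.

```python
import math


def gaussian_kernel(z):
    """Standard Gaussian density, used as a default kernel K (an assumption)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def estimate_division_rate(y, pairs, tau, h, threshold, kernel=gaussian_kernel):
    """Plug-in estimate of B(y) from pairs (xi_parent, xi_child) of sizes at birth."""
    n = len(pairs)
    # Numerator: kernel estimate of the invariant density nu_B at the point y/2.
    numerator = sum(kernel((child - y / 2.0) / h) for _, child in pairs) / (n * h)
    # Denominator: empirical frequency of {xi_parent <= y, xi_child >= y/2},
    # bounded below by the threshold to avoid dividing by a value near zero.
    count = sum(1 for parent, child in pairs if parent <= y and child >= y / 2.0)
    denominator = max(count / n, threshold)
    return 0.5 * tau * y * numerator / denominator


pairs = [(1.0, 0.9), (0.9, 1.1), (1.1, 0.8), (0.8, 1.0)]   # made-up data
print(estimate_division_rate(2.0, pairs, tau=1.0, h=0.2, threshold=0.05))
```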

SLIDE 46

Under the previous assumptions (+ the additional condition $\gamma < 1/2$ for the geometric ergodicity decay in the full tree case), we have
$$\Big(\mathbb{E}_\mu\big[\|\widehat{B}_n - B\|^2_{L^2(D)}\big]\Big)^{1/2} \lesssim (\log n)^{1/2}\, n^{-s/(2s+1)}$$
uniformly in $B$ over $s$-smooth Hölder balls intersected with “nice geometrically ergodic classes”. Here, $\mu$ is any initial condition such that $V^2$ is $\mu$-integrable.

SLIDE 47

Remarks and extensions

  • Smoothness adaptation (by means of appropriate concentration inequalities on trees).
  • The rates are minimax (which is of course no surprise).
  • (Possible extension: variability in the growth rate, i.e. a cell-dependent $\tau = \tau_u$ drawn via a Markov kernel $\kappa(\tau_{u^-}, d\tau)$.)
  • (Possible extension: the mother cell divides into offspring of different sizes.)

SLIDE 48

Effect of variability (sparse tree case)

Figure: $n = 2047$, $B(x) = x^2$, the growth rate is uniform on $[0.5, 1.5]$, sparse tree. Horizontal axis: $x$. Curves: distribution of all cell sizes; distribution of size at division; true division rate; estimated division rate with variability; estimated division rate without variability.

SLIDE 49

Effect of variability (dense tree case)

Figure: $n = 2047$, $B(x) = x^2$, the growth rate distribution is uniform on $[0.5, 1.5]$, plain tree. Horizontal axis: $x$. Curves: distribution of all cell sizes; distribution of size at division; true division rate; estimated division rate with variability; estimated division rate without variability.

SLIDE 50

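A slight surprise (1/3)

Revisit the representation formula
$$B(y) = \frac{\tau y}{2}\, \frac{\nu_B(y/2)}{\mathbb{P}_{\nu_B}\big(\xi_{u^-} \le y,\ \xi_u \ge y/2\big)}.$$
We always have $\{\xi_{u^-} \ge y\} \subset \{\xi_u \ge y/2\}$, hence
$$\begin{aligned}
\mathbb{P}_{\nu_B}\big(\xi_{u^-} \le y,\ \xi_u \ge y/2\big) &= \mathbb{P}_{\nu_B}\big(\xi_u \ge y/2\big) - \mathbb{P}_{\nu_B}\big(\xi_{u^-} \ge y\big) \\
&= \int_{y/2}^{\infty} \nu_B(x)\,dx - \int_{y}^{\infty} \nu_B(x)\,dx \\
&= \int_{y/2}^{y} \nu_B(x)\,dx \quad (!).
\end{aligned}$$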

SLIDE 51

A slight surprise (2/3)

Finally (for constant growth rate) we have
$$B(y) = \frac{\tau y}{2}\, \frac{\nu_B(y/2)}{\int_{y/2}^{y} \nu_B(x)\,dx}.$$

We have a “gain”: rate $n^{-s/(2s+1)}$ versus $n^{-s/(2s+3)}$ in the proxy model based on the transport-fragmentation equation. But it only comes from the fact that we estimate the invariant measure “at division”, versus the invariant measure “at fixed time” in the proxy model.
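
To get a feel for the size of this gain, the two rates can be compared numerically. The sketch below assumes the usual parametrisation $n^{-s/(2s + 2\nu + 1)}$ for a problem with degree of ill-posedness $\nu$, which reproduces both exponents quoted here ($\nu = 0$ and $\nu = 1$); the function name and the values of $n$ and $s$ are illustrative.

```python
def rate(n, s, degree):
    """n ** (-s / (2s + 2*degree + 1)) for smoothness s and ill-posedness degree."""
    return n ** (-s / (2.0 * s + 2.0 * degree + 1.0))


n, s = 10_000, 1.0
print(rate(n, s, degree=0))   # genealogical data: n^{-s/(2s+1)}
print(rate(n, s, degree=1))   # proxy model:       n^{-s/(2s+3)}
```

For every $n > 1$ and $s > 0$ the first value is the smaller of the two, that is, the faster rate.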

SLIDE 52

A slight surprise (3/3)

There seems to be more “nonparametric statistical information” in data extracted from $\mathring{U}_T$ rather than $\partial U_T$. However $|\mathring{U}_T| \approx |\partial U_T|$ (supercritical branching processes). Can we make that argument more precise (up to changing the model)?

SLIDE 53

Outline

1. Genealogical versus temporal data
2. The size dependent division rate model
  • Estimation at a (large) fixed time in a proxy model
  • Estimation through genealogical data
3. Estimating the age dependent division rate

SLIDE 54

Age dependent division rate B(a)

$n(t, a)$ is now the solution to
$$\partial_t n(t, a) + \partial_a n(t, a) + B(a)\, n(t, a) = 0,$$
$$n(t, a = 0) = 2 \int_0^{\infty} B(a)\, n(t, a)\,da, \qquad n(t = 0, a) = n^{(0)}(a).$$
This translates into the stochastic model as
$$\mathbb{P}\big(\zeta_u \in [a, a + da] \,\big|\, \zeta_u \ge a\big) = B(a)\,da.$$

Here, the $\zeta_u$ are i.i.d. We have nothing but a renewal process on a tree.

SLIDE 55

Observation scheme

The $\zeta_u$ are i.i.d.: the case of genealogical data is readily embedded into standard density estimation.

Temporal data: we observe, for some (large) $T > 0$,
$$\{\zeta_u^T,\ u \in U_T\},$$
which can be split into two data sets
$$\{\zeta_u,\ u \in \mathring{U}_T\} \quad \text{and} \quad \{T - b_u,\ u \in \partial U_T\}.$$
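
The split into the two data sets can be illustrated by simulating the age model up to time $T$. The sketch below assumes the rate $B(a) = a^2$, for which the survival function is $\exp(-a^3/3)$ and lifetimes can be sampled by inversion; the function names and the seed are arbitrary choices for the example.

```python
import random


def lifetime(rng):
    """Lifetime with hazard B(a) = a**2, i.e. survival function exp(-a**3 / 3)."""
    return (3.0 * rng.expovariate(1.0)) ** (1.0 / 3.0)


def simulate_until(T, seed=0):
    """Return (lifetimes of cells divided before T, ages at T of cells still alive)."""
    rng = random.Random(seed)
    interior, boundary = [], []
    births = [0.0]                      # birth times of cells not yet processed
    while births:
        b = births.pop()
        zeta = lifetime(rng)
        if b + zeta <= T:               # divided before T: the lifetime is fully observed
            interior.append(zeta)
            births.extend([b + zeta, b + zeta])
        else:                           # still alive at T: only the age T - b is observed
            boundary.append(T - b)
    return interior, boundary


interior, boundary = simulate_until(T=7.0)
print(len(interior), len(boundary))
```

In a binary tree the number of cells alive at $T$ always exceeds the number of cells that have divided by exactly one, so the two data sets have essentially the same size.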

SLIDE 56

Estimation of $B(a)$ from $\mathring{U}_T$ (1/4)

  • Analogue of what we did for the size dependent $B(x)$, in the sense that we have (empirical) access to the time at division.
  • Additional difficulty: selection bias (small lifetimes are observed more often than large lifetimes).
  • Strategy: many-to-one formulae (Bansaye et al., 2009, Cloez, 2012).

SLIDE 57

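Estimation of $B(a)$ from $\mathring{U}_T$ (2/4)

Many-to-one formula (Cloez, 2012): we have, for a nice test function $g$,
$$\mathbb{E}\Big[\sum_{u \in \mathring{U}_T} g(\zeta_u)\Big] = \int_0^T e^{\lambda_B s}\, \mathbb{E}\big[g\big(\chi(s)\big) H_B\big(\chi(s)\big)\big]\,ds,$$
where $\chi(t)$ is a tagged branch picked at random on the tree, and $H_B(a)$ is an explicit function. Also $\mathbb{E}\big[|\mathring{U}_T|\big] \sim \kappa_B\, e^{\lambda_B T}$. These are all the ingredients needed for a law of large numbers.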

SLIDE 58

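Estimation of $B(a)$ from $\mathring{U}_T$ (3/4)

Let
$$f_B(a) = B(a) \exp\Big(-\int_0^a B(s)\,ds\Big).$$

We have
$$\frac{1}{|\mathring{U}_T|} \sum_{u \in \mathring{U}_T} g(\zeta_u) \xrightarrow{\ \mathbb{P}\ } 2 \int_0^{\infty} g(a)\, e^{-\lambda_B a} f_B(a)\,da.$$
We even obtain a rate of convergence (in probability) $\exp(\lambda_B T)^{-1/2}$, with some uniformity in $B \in \mathcal{B}$ (in a “neighbourhood” of constant functions $B$).

Proof: rates of convergence in the many-to-one formula for $g(\zeta_u, \zeta_v)$ for $u, v \in \mathring{U}_T$ + geometric ergodicity.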
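
The slides do not spell out how $\lambda_B$ is obtained. A minimal sketch, assuming that $\lambda_B$ is the Malthus parameter of the binary renewal model, i.e. the solution of $2 \int_0^{\infty} e^{-\lambda a} f_B(a)\,da = 1$, computes it by bisection with a midpoint quadrature. The function name, grid and bracketing interval are illustrative choices.

```python
import math


def malthus_parameter(hazard, cumulative_hazard, a_max=40.0, steps=8000):
    """Solve 2 * int_0^inf exp(-lam * a) * f_B(a) da = 1 for lam by bisection,
    where f_B(a) = hazard(a) * exp(-cumulative_hazard(a))."""
    width = a_max / steps
    grid = [(i + 0.5) * width for i in range(steps)]          # midpoint rule
    density = [hazard(a) * math.exp(-cumulative_hazard(a)) for a in grid]

    def excess(lam):
        total = sum(math.exp(-lam * a) * f for a, f in zip(grid, density))
        return 2.0 * total * width - 1.0

    lo, hi = 0.0, 50.0        # excess(lo) > 0 > excess(hi) for the rates used below
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Constant rate B(a) = 1: lifetimes are exponential and the exact solution is lam = 1.
print(malthus_parameter(lambda a: 1.0, lambda a: a))
# B(a) = a**2, with cumulative hazard a**3 / 3.
print(malthus_parameter(lambda a: a * a, lambda a: a ** 3 / 3.0))
```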

SLIDE 59

Estimation of $B(a)$ from $\mathring{U}_T$ (4/4)

We derive kernel estimators that achieve the rate $\exp(\lambda_B T)^{-s/(2s+1)}$ uniformly over $\mathcal{B} \cap \mathcal{H}(s, M)$. The rate is nearly minimax (use likelihood expansions established by Löcherbach in the early 2000s).

SLIDE 60

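What if data are taken from $\partial U_T$ only?

We now have (using Cloez's many-to-one formulae), for a test function g,

$$|\partial U_T|^{-1} \sum_{u \in \partial U_T} g(\zeta_u) \xrightarrow{\ P\ } 2\lambda_B \int_0^\infty g(a)\, e^{-\lambda_B a}\, \frac{f_B(a)}{B(a)}\, da = 2\lambda_B \int_0^\infty g(a)\, e^{-\lambda_B a}\, e^{-\int_0^a B(s)\,ds}\, da.$$

We have a rate of convergence (in probability) $\big(\exp(\lambda_B T)\big)^{-1/2}$ uniformly in $B \in \mathcal{B}$. We retrieve an ill-posed problem of order 1, leading to the convergence rate $\big(\exp(\lambda_B T)\big)^{-s/(2s+3)}$.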
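
The order of ill-posedness can be read off by inverting the limit. The short derivation below is a sketch. The symbol $q$ for the limiting density of the observed $\zeta_u$, $u \in \partial U_T$, is introduced here for illustration and does not appear in the slides.

```latex
% Limiting density of the observations taken from \partial U_T:
q(a) = 2\lambda_B\, e^{-\lambda_B a} \exp\Big(-\int_0^a B(s)\,ds\Big).
% Taking logarithms and differentiating in a:
\frac{q'(a)}{q(a)} = -\lambda_B - B(a)
\quad\Longrightarrow\quad
B(a) = -\lambda_B - \frac{q'(a)}{q(a)}.
% Check with constant B = b, for which \lambda_B = b and q(a) = 2b\,e^{-2ba}:
% q'(a)/q(a) = -2b, hence B(a) = -b + 2b = b.
```

Recovering B thus requires estimating the derivative of a density, which is consistent with the exponent dropping from s/(2s+1) to s/(2s+3).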

SLIDE 61

The age dependent model, simulated data

Figure: Reconstruction of B over D = [0.1, 4] with 95%-level confidence bands constructed over M = 100 Monte-Carlo trees. In bold red line: $x \mapsto B(x)$; in bold blue line: $f_{H_B}$; in blue line: $f_B$. Left: T = 15. Right: T = 23.

SLIDE 62

Conclusion/Overall picture

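| data | Size model | Age model |
|---|---|---|
| proxy model | $n^{-s/(2s+3)}$ + adaptation | irrelevant |
| $\partial U_T$ | ? | $(e^{\lambda_B T})^{-s/(2s+3)}$ |
| genealogical | $n^{-s/(2s+1)}$ + adaptation | irrelevant |
| $\mathring{U}_T$ | ? | $(e^{\lambda_B T})^{-s/(2s+1)}$ |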

SLIDE 63

Thank you for your attention!

Doumic, M., Hoffmann, M., Reynaud-Bouret, P. and Rivoirard, V. (2012) Nonparametric estimation of the division rate of a size-structured population. SIAM Journal on Numerical Analysis, 50, 25pp.

Doumic, M., Hoffmann, M., Krell, N. and Robert, L. (2013) Statistical estimation of a growth-fragmentation model observed on a genealogical tree. Bernoulli, in press.

Robert, L., Hoffmann, M., Krell, N., Aymerich, S., Robert, J. and Doumic, M. (2014) Division control in Escherichia coli is based on a size-sensing rather than a timing mechanism. BMC Biology, 02/2014, 10pp.

Hoffmann, M. and Olivier, A. (2014) Nonparametric estimation of the division rate of an age dependent branching process. arXiv:1412.5936, 32pp.

SLIDE 64

Effect of variability (sparse tree case)

Figure: n = 2047, $B(x) = x^2$, the growth rate is uniform on [0.5, 1.5], sparse tree. Horizontal axis: x. Curves: distribution of all cell sizes; distribution of size at division; true division rate; estimated division rate with variability; estimated division rate without variability.

SLIDE 65

Effect of variability (dense tree case)

Figure: n = 2047, $B(x) = x^2$, the growth rate distribution is uniform on [0.5, 1.5], plain tree. Horizontal axis: x. Curves: distribution of all cell sizes; distribution of size at division; true division rate; estimated division rate with variability; estimated division rate without variability.

SLIDE 66

Exploration on real data (E. Coli, sparse and dense tree case)

Figure: Implementation on real data. Horizontal axis: Size (µm). Curves: B, plain tree; B, sparse tree.

SLIDE 67

Comparison with the inverse problem approach

Figure: "Inverse Problems" method, $B(x) = x^2$, n = 2047, 50 simulations. Curves: true B; reconstructed B from a sample of 2047 cells.

SLIDE 68

Numerical implementation

Figure: Exploration on real data. Sparse tree, n ≈ 3000.