Mathematical programming techniques applied to biology
Fabien Tarissan (1), Leo Liberti (2), Camilo La Rota (3)
(1) ISC-PIF (Paris, France)
(2) École Polytechnique (Paris, France)
(3) IXXI (Lyon, France)
October 31, 2008
Context of work
Pre-simulation tool
◮ organisms ◮ data ◮ reliability ◮ level of detail ◮ . . .
◮ Modelling by means of mathematical programming techniques
◮ Reformulation of the models to make them easier to solve
◮ Reconstruction of gene regulatory networks:
  ◮ with continuous dynamics (Drosophila)
  ◮ with discrete dynamics (Arabidopsis)
◮ $f : \mathbb{R}^n \to \mathbb{R}$ is the objective function
◮ $g : \mathbb{R}^n \to \mathbb{R}^m$ is the set of constraints
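For reference, the two maps above fit the usual general form of a mathematical program. The block below is a sketch under two assumptions that the slide does not state: that $Z$ (the symbol used in the solver table) is the index set of variables restricted to integer values, and that constraints are written in the conventional sense $g(x) \le 0$.

```latex
\begin{align*}
\min_{x \in \mathbb{R}^n} \quad & f(x) \\
\text{s.t.} \quad & g(x) \le 0, \\
& x_i \in \mathbb{Z} \quad \forall i \in Z \subseteq \{1, \dots, n\}.
\end{align*}
```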
Class    f, g        Z       Best solver        Best free solver   Complexity
LP       linear      Z = ∅   CPLEX              CLP                Θ(10^6)
cNLP     convex      Z = ∅   SNOPT/FILTER       IPOPT              Θ(10^4)
MILP     linear      Z ≠ ∅   CPLEX              BCP/SYMPHONY       Θ(10^3)
NLP      nonlinear   Z = ∅   BARON              ?                  Θ(10^2)
cMINLP   convex      Z ≠ ∅   MINLP_BB/FILMINT   BONMIN/FILMINT     Θ(10^3)
MINLP    nonlinear   Z ≠ ∅   BARON              ?                  Θ(10^2)
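The classification in the table can be read as a lookup on two properties: the nature of $f$ and $g$, and whether any variable is integer. The sketch below is only an illustration of that reading. The function name and the string labels are hypothetical, and it assumes that the mixed-integer classes (MILP, cMINLP, MINLP) are exactly those with a non-empty integer set $Z$.

```python
def classify(f_g_kind: str, has_integer_vars: bool) -> str:
    """Return the problem class for a given kind of f, g and integrality.

    f_g_kind is one of "linear", "convex", "nonlinear" (hypothetical labels);
    has_integer_vars is True when the integer index set Z is non-empty.
    """
    table = {
        ("linear", False): "LP",
        ("convex", False): "cNLP",
        ("nonlinear", False): "NLP",
        ("linear", True): "MILP",
        ("convex", True): "cMINLP",
        ("nonlinear", True): "MINLP",
    }
    return table[(f_g_kind, has_integer_vars)]
```

For instance, `classify("convex", True)` returns `"cMINLP"`.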
◮ $g^a_i(t)$ is the concentration of gene $a$ in nucleus $i$ at time $t$
◮ $R_a$ is the production rate for gene $a$
◮ $\Phi$ is the sigmoid regulation function
◮ $\lambda_a$ is the decay rate
◮ $D_a$ is the diffusion coefficient for gene $a$
◮ $W^a_b$ is the weight on the arc $(b, a)$ in the GRN
◮ $m_a$ is the regulatory influence of the maternal gene bcd
◮ $h_a$ is the activation threshold for $\Phi$
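The symbols above can be assembled into one reaction-diffusion equation per gene and nucleus. The sketch below is inferred from the discretized constraint that appears later in the slides. The form of the regulatory input $u^a_i$ (weighted sum, maternal term, threshold) and the symbol $g^{bcd}_i$ are assumptions based on the parameter descriptions, not formulas stated in the source.

```latex
\begin{align*}
\frac{\mathrm{d} g^a_i}{\mathrm{d} t} &= R_a\,\Phi\bigl(u^a_i\bigr) - \lambda_a\, g^a_i
  + D_a\bigl(g^a_{i+1} - 2 g^a_i + g^a_{i-1}\bigr), \\
\Phi(u) &= \tfrac{1}{2}\Bigl(\frac{u}{\sqrt{u^2+1}} + 1\Bigr), \\
u^a_i &= \sum_{b \in N_\gamma} W^a_b\, g^b_i + m_a\, g^{bcd}_i + h_a .
\end{align*}
```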
◮ Network of 6 genes ◮ but missing values for W, R, D, m, λ, h: 66 variables (36 weights W plus 6 values each for R, D, m, λ, h).
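◮ Objective function:
$$\min \sum_{\substack{a \in N_\gamma \\ i \in N_\iota \\ t \in T_{\text{data}}}} \bigl(g^a_i(t) - g^{\text{data}\,a}_i(t)\bigr)^2 + \sum_{\substack{a \in N_\gamma \\ b \in N_\gamma}} \bigl(W^a_b\, v^b_{\max}\bigr)^2 + \sum_{a \in N_\gamma} \bigl((m_a\, v^{bcd}_{\max})^2 + h_a^2\bigr)$$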
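To make the three terms of the objective concrete, here is a minimal evaluation sketch. The data layout (dictionaries keyed by gene, nucleus and time) and the function name are hypothetical choices for illustration, and `v_max` is assumed to hold one scaling value per gene, including the maternal gene bcd.

```python
def objective(g, g_data, W, m, h, v_max, bcd="bcd"):
    """Least-squares fit to the data plus quadratic penalties on W, m and h.

    g, g_data: dicts mapping (a, i, t) -> concentration; only keys of g_data are summed.
    W: dict mapping (a, b) -> weight of the arc (b, a); m, h, v_max: dicts keyed by gene.
    """
    # Squared deviation between model and measured concentrations.
    fit = sum((g[key] - obs) ** 2 for key, obs in g_data.items())
    # Penalty on each regulatory weight, scaled by v_max of the regulator b.
    weight_penalty = sum((w * v_max[b]) ** 2 for (a, b), w in W.items())
    # Penalty on the maternal influence and on the threshold of each gene.
    maternal_penalty = sum((m[a] * v_max[bcd]) ** 2 + h[a] ** 2 for a in m)
    return fit + weight_penalty + maternal_penalty
```

A solver would minimize this quantity over the unknown parameters; the sketch only evaluates it for given values.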
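◮ Some penalty functions as constraints:
$$\forall a \in N_\gamma \quad \begin{cases} R^L \le R_a \le R^U \\ \lambda^L \le \lambda_a \le \lambda^U \\ D^L \le D_a \le D^U \end{cases}$$
◮ PDE as a constraint (discretization):
$$g^a_i(t) - g^a_i(t-1) = \Delta t \left( \frac{R_a}{2}\left(\frac{u^a_i(t)}{\sqrt{u^a_i(t)^2+1}} + 1\right) - \lambda_a\, g^a_i(t) + D_a\bigl(g^a_{i+1}(t) - 2 g^a_i(t) + g^a_{i-1}(t)\bigr) \right)$$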
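The discretized constraint can be checked numerically by computing its residual for one gene across the nuclei. This is a sketch under assumptions: the function names are hypothetical, the regulatory input `u_curr` is taken as given, and only interior nuclei are handled because the slides do not state the boundary treatment.

```python
import math

def phi(u):
    """Sigmoid used in the constraint: (u / sqrt(u^2 + 1) + 1) / 2."""
    return 0.5 * (u / math.sqrt(u * u + 1.0) + 1.0)

def residuals(g_prev, g_curr, u_curr, R, lam, D, dt):
    """Residual of the discretized constraint at every interior nucleus.

    g_prev, g_curr, u_curr are lists indexed by nucleus i for one gene a;
    a zero residual means the constraint holds at that nucleus.
    """
    out = []
    for i in range(1, len(g_curr) - 1):
        # Discrete Laplacian over neighbouring nuclei.
        diffusion = D * (g_curr[i + 1] - 2.0 * g_curr[i] + g_curr[i - 1])
        # Production, decay and diffusion, all evaluated at time t.
        rhs = dt * (R * phi(u_curr[i]) - lam * g_curr[i] + diffusion)
        out.append(g_curr[i] - g_prev[i] - rhs)
    return out
```

Note that the right-hand side uses values at time $t$, not $t-1$, which matches the constraint as written (an implicit scheme).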
◮ Mitosis time ◮ Modelling cell division ◮ Updating diffusion coefficient ◮ . . .
◮ Driven by biological knowledge (e.g. bounds on $W$, $m$, . . .)
◮ Mathematical reformulation of terms:
  ◮ exact reformulation: e.g. replace $\frac{u}{\sqrt{u^2+1}}$ with a new variable $u'$ and add constraints defining $u'$
  ◮ approximate reformulation of $z^2$
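The slides do not say which approximation of $z^2$ was used. As one common possibility, and not necessarily the authors' choice, a convex square can be under-estimated by a finite set of tangent lines, each of which is a linear constraint. The function names below are hypothetical.

```python
def tangent_cuts(z_low, z_up, n_points):
    """Return (slope, intercept) pairs of tangents to z**2 at evenly spaced points.

    Requires n_points >= 2. The tangent at z_k is 2*z_k*z - z_k**2.
    """
    step = (z_up - z_low) / (n_points - 1)
    points = [z_low + k * step for k in range(n_points)]
    return [(2.0 * zk, -zk * zk) for zk in points]

def lower_estimate(z, cuts):
    """Piecewise-linear under-estimate of z**2: the max over all tangent lines."""
    return max(slope * z + intercept for slope, intercept in cuts)
```

The estimate is exact at the tangent points and never exceeds $z^2$ elsewhere; more points give a tighter approximation at the cost of more constraints.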
◮ sigmoid (exact): too many variables. ◮ sigmoid (approx): ok. ◮ convex products (approx): ok but feasibility issues.
◮ run on large data set: too heavy for now (need to split the model). ◮ trying other models ($g^a_i(t) = g^{\text{data}\,a}_i$, . . .)
◮ Gene regulatory network ◮ Some knowledge of the network topology ◮ Unknown weights on the edges
◮ Discretization of time ◮ Qualitative activity of gene $i$: $x_i^{t+1}$ is determined by the activities $x_j^t$ of its regulators (production) and the threshold $\theta_i$ (decay)
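A qualitative update of this kind is often written as a threshold rule on a weighted sum of regulator activities. The sketch below assumes that form; the synchronous update, the `>=` comparison, the edge orientation and all names are assumptions for illustration, since the exact rule is not recoverable from the slide.

```python
def step(state, weights, theta):
    """One synchronous update of a threshold Boolean network.

    state: dict gene -> 0/1 activity at time t.
    weights: dict (j, i) -> weight of the edge from regulator j to gene i.
    theta: dict gene -> activation threshold.
    """
    new_state = {}
    for i in state:
        # Weighted input received by gene i from its regulators at time t.
        total = sum(w * state[j] for (j, target), w in weights.items() if target == i)
        # Gene i is active at t+1 when the input reaches its threshold.
        new_state[i] = 1 if total >= theta[i] else 0
    return new_state
```

With `state = {"A": 1, "B": 0}`, an activating edge A to B of weight 1, an inhibiting edge B to A of weight -1 and thresholds of 1, one step yields `{"A": 0, "B": 1}`.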
◮ Sets and Graph:
◮ Evolution rules ◮ Functions:
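◮ $(G, T, \alpha)$
◮ $S := \{1, \dots, S_{\max}\}$: set of stages.
◮ $U = \{U_s\}_{s \in S}$; $U_s \subseteq V$: nodes of $G_s$ (induced subnetworks of $G$).
◮ $I = \{\iota_{s,u}\}_{s \in S, u \in U_s}$; $\iota_{s,u} : V \to \{0, 1\}$: initial conditions.
◮ $\Phi = \{\varphi_{s,u}\}_{s \in S, u \in U_s}$; $\varphi_{s,u} : V \to \{0, 1\}$: expression data.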
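◮ Objective function: a sum over $s \in S$, $t \in T \setminus \{1\}$ and $u \in U_s$
◮ Fixed point conditions: for each stage $s$, constraints on $\sum_{u \in U_s} |x^t_{s,u} - x^{t-1}_{s,u}|$, with indices $r \ge t$
◮ Evolution rules: two inequalities relate $\sum_{u \in U_s : (u,v) \in A}$ of terms in $x_{s,u}$ to $x^t_{s,v}$, with the terms $-V(1 - x^t_{s,v})$ and $+V x^t_{s,v}$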
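The terms $-V(1 - x^t_{s,v})$ and $+V x^t_{s,v}$ in the evolution rules have the shape of a big-constant linearization of a threshold rule. The sketch below checks one such linearization by enumeration. The two inequalities, the integer weights and threshold, and the choice of constant are assumptions, since the full constraints are not recoverable from the slide; `big_m` plays the role that $V$ appears to play there.

```python
from itertools import product

def feasible_outputs(inputs, weights, theta, big_m):
    """Values x in {0, 1} satisfying both big-constant inequalities for one target gene."""
    total = sum(w * x_u for w, x_u in zip(weights, inputs))
    return [
        x for x in (0, 1)
        if total >= theta * x - big_m * (1 - x)
        and total <= (theta - 1) * (1 - x) + big_m * x
    ]

def linearization_is_exact(weights, theta):
    """Check by enumeration that the constraints force x = [total >= theta].

    Assumes integer weights and an integer threshold.
    """
    big_m = sum(abs(w) for w in weights) + abs(theta)  # large enough constant
    for inputs in product((0, 1), repeat=len(weights)):
        total = sum(w * x_u for w, x_u in zip(weights, inputs))
        if feasible_outputs(inputs, weights, theta, big_m) != [int(total >= theta)]:
            return False
    return True
```

When $x = 1$ the first inequality enforces that the input reaches the threshold, and when $x = 0$ the second enforces that it stays strictly below it; the constant deactivates the other inequality in each case.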
◮ of different biological organisms ◮ with different dynamics
◮ loss of efficiency ◮ might require introducing new elements
◮ automation of the reformulations ◮ study of more complex qualitative models of GRNs ◮ integrating different kinds of knowledge (experimental, . . .)
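◮ Product $x_1 x_2$ with $x_1 \in \{0, 1\}$ and $x_2 \in [x_2^L, x_2^U]$: replace $x_1 x_2$ with $w$ and add
$$x_2^L x_1 \le w \le x_2^U x_1, \qquad x_2 - x_2^U (1 - x_1) \le w \le x_2 - x_2^L (1 - x_1)$$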
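The terms in $x_1$ and $(1 - x_1)$ scaled by the bounds of $x_2$ match the standard linearization of a product between a binary variable and a bounded variable. Assuming that reading, the sketch below computes the interval that the four linear constraints leave for the replacement variable $w$; the function name is hypothetical.

```python
def w_interval(x1, x2, x2_low, x2_up):
    """Interval of w allowed by the four linear constraints for given x1, x2.

    Constraints: x2_low*x1 <= w <= x2_up*x1 and
                 x2 - x2_up*(1 - x1) <= w <= x2 - x2_low*(1 - x1).
    """
    lower = max(x2_low * x1, x2 - x2_up * (1 - x1))
    upper = min(x2_up * x1, x2 - x2_low * (1 - x1))
    return lower, upper
```

For binary `x1` the interval collapses to a single point equal to `x1 * x2`, so the reformulation is exact in that case.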
◮ termination ◮ confluence ◮ optimality?