Analysis of metabolic networks in presence of biological - - PowerPoint PPT Presentation

analysis of metabolic networks in presence of biological
SMART_READER_LITE
LIVE PREVIEW

Analysis of metabolic networks in presence of biological - - PowerPoint PPT Presentation

Analysis of metabolic networks in presence of biological constraints: geometrical, combinatorial and logical approaches Philippe Dague LRI, Univ. Paris-Sud & CNRS, Univ. Paris-Saclay on leave at INRIA-Saclay Lifeware Acknowledgements:


slide-1
SLIDE 1

Analysis of metabolic networks in presence of biological constraints: geometrical, combinatorial and logical approaches

Philippe Dague LRI, Univ. Paris-Sud & CNRS, Univ. Paris-Saclay

  • n leave at INRIA-Saclay Lifeware

Acknowledgements: Sabine Peres, Antoine Deza, Martin Morterol, Eva Dechaux

BIOSS-IA Workshop: Logical Models for Formal Representation of Living Systems Gif-sur-Yvette, June 22, 2017

1

slide-2
SLIDE 2

2

slide-3
SLIDE 3

3

slide-4
SLIDE 4

Context

Input A metabolic network (weighted bipartite graph) given by: m × r stoichiometric real (actually, rational) matrix S. m number of internal metabolites, r number of enzymatic reactions. Sji stoichiometric coefficient of metabolite j in reaction i (< 0 for j substrate, > 0 for j product). m < r and S assumed of full rank m. Irrev: those of reactions which are irreversible (given « compiled » thermodynamic knowledge). Steady-state solutions (feasible flux vectors at steady-state): P = {v ∈ ℝr | Sv = 0 and vi ≥ 0, i ∈ Irrev}. Convex polyhedral cone in dimension dim(Ker(S)) = r – m with Irrev facets. Two ways of dealing with reversible reactions

  • Keep them as they are, allowing them to appear with positive or negative coefficient in solutions

(thus, the cone P is not pointed).

  • Decompose each reversible reaction into two exclusive irreversible reactions. This means adding

r – |Irrev| columns to S, whose columns number becomes d = 2r – |Irrev|. Then all d reactions are irreversible and the cone P is thus pointed, why this is the choice adopted by most of the algorithms and that we will adopt in the following.

4

slide-5
SLIDE 5

5

A B X3 X4 X1 X2

R1 T1 T2 T4 T3

1 1 1 1 2 1 1 1 1 1

A B X3 X4 X1

T1 T2 T3 T4 R1 A

A

1 −1 −1 1 −1 2

B

B

Example

m = 2 r = 5 = S

T1 T2 T3 T4 R1 R1rev 1 0 –1 0 –1 1 0 1 0 –1 2 –2 A B

) (

Irrev = {T1, T2, T3, T4} Irrev = {T1, T2, T3, T4, R1, R1rev}

d = 6

slide-6
SLIDE 6

Context

For v ∈ P, Supp(v) = {i | vi ≠ 0} is the support of v. As P is pointed: minimal (for set inclusion of supports) solutions = undecomposable solutions = extreme rays of P. They form the unique (up to positive scalings) minimal set of generating (for non negative linear combinations) vectors of all solutions in P. They are called Elementary Flux Modes (EFMs).

6

slide-7
SLIDE 7

The Double Description method: theoretical framework behind EFMs enumeration

Almost all known algorithms1 for enumerating EFMs are variants of the Double Description Method (Motzkin-Raiffa-Thompson-Thrall,1953, known as Fourier-Motzkin). A pair (A,R) of real matrices A (m × d) and R (d × n) is said to be a double description pair

  • r simply a DD pair if the following relationship holds:

A x ≥ 0 if and only if x = R λ for some λ ≥ 0

P(A) polyhedral cone represented by A (representation matrix) as: P(A) = {x ∈ ℝd | A x ≥ 0} is simultaneously represented by R (generating matrix) as:
 {x ∈ ℝd | x = R λ for some λ ≥ 0}. Minkowski’s Theorem for Polyhedral Cones:
 For any m × d real matrix A, there exists some d × n real matrix R such that (A,R) is a DD pair, or in other words, the cone P(A) is generated by R. The theorem states that every polyhedral cone admits a generating matrix. The non-triviality comes from the fact that the column size n of R is finite: every cone is finitely generated.

1For enumerating only part of EFMs according to certain criteria or contraints, other approaches are: LP, MILP, SAT/SMT

(our work), constraints, genetic algorithms, …

7

slide-8
SLIDE 8

The Double Description method

Task: how does one construct a matrix R from a given matrix A? A more appropriate formulation of the problem is to require the minimality of R: find a matrix R such that no proper submatrix is generating P(A).
 A minimal set of generators is unique up to positive scaling when we assume that the cone is pointed (⟺ rank(A) = d), i.e., the origin is an extreme point of P(A). Geometrically, the columns of a minimal generating matrix are in 1-to-1 correspondence with the extreme rays of P. Thus the problem is also known as the extreme ray enumeration problem. No efficient (polynomial) algorithm is known for the general problem (which is isomorphic to the vertex enumeration problem for a convex polyhedron).


8

slide-9
SLIDE 9

9

The Double Description method

slide-10
SLIDE 10

The Double Description Method: Complexity? P : an H-polytope represented by m halfspaces h1, . . ., hm in Rd. Pk = ∩k

i=1hi : kth polytope (P = Pm).

Vk = V (Pk) : the vertex set computed at kth step.

(Pk-1, Vk-1) (Pk, Vk) hk newly generated for each adj pair ( , )

8

10

slide-11
SLIDE 11

How Intermediate Sizes Fluctuate with Different Orderings

20 22 24 26 28 30 32 Iteration 250 500 750 1000 1250 1500 Size INTERMEDIATE SIZES FOR CCP6 maxcutoff mincutoff random lexmin

The input is a 15-dimensional polytope with 32 facets. The output is a list

  • f 368 vertices. The lexmin is a sort of shelling ordering.

10 11

slide-12
SLIDE 12

How Intermediate Sizes Fluctuate with Different Orderings

200 400 600 800 1000 5000 10000 15000 20000 25000 30000

Iteration Size Random Maxcutoff Mincutoff

The input is a 10-dimensional cross polytope with 210 facets. The output is a list of 20 vertices. The highest peak is attained by maxcutoff ordering, following by random and mincutoff. Lexmin is the best among all and the peak intermediate size is less than 30. (Too small too see it above.)

11

12

slide-13
SLIDE 13

Application to EFMs computation

Problem expressed in DD framework as A x ≥ 0 with solutions P(A), where m × d matrix A (m = 2m + d) is given by: A = (ST, –ST, Id)T A is of full rank d and P = P(A).

v is an extreme ray of P if and only if: dim(Ker(SSupp(v))) = 1 (and thus |Supp(v)| ≤ m + 1). Extreme rays v and v’ are adjacent if and only if: dim(Ker(SSupp(v) ∪ Supp(v’))) = 2. P is of dimension d – m and a solution v ∈ P verifies d – |Supp(v)| inequalities with

  • equality. In particular an extreme ray v verifies at least d – m –1 inequalities with equality.

d – m –1 is the non-degenerate case. The problem is thus in general degenerate with degenerescence degree equal to m + 1 – sl, where sl is the shortest length of an EFM (the only non-degenerate cases are a linear chain network or a one-metabolite network) . This has prevented up to now to use a combinatorial method, alternative to DD, called lrs (local reverse search), very efficient for non-degenerate problems. But very recent results show that, once highly parallelized, it can compete with DD for degenerate problems.

13

slide-14
SLIDE 14

Application to EFMs computation: complexity

Counting EFMs is #P-complete. Can EFMs be enumerated with polynomial delay is an open problem in case of at least one irreversible reaction (the answer is yes if all reactions are reversible, as EFMs are then the circuits of a linear matroid). Number of EFMs is bounded by 𝚰(md/2). Deciding if it exists an EFM with given support Supp is NP-complete if |Supp|>1. Given k, deciding the existence of an EFM with at most k reactions is NP-complete.

14

slide-15
SLIDE 15

Results

  • Despite very efficient and specially tuned implementations (Metatool, Efmtool,

FluxModeCalculator) allowing the enumeration of millions of EFMs, the DD approach suffers of combinatorial explosion at genome scale.

  • E. Coli: 89 metabolites, 106 reactions, 26,4 106 EFMs.

Human (estimation of reconstructed network): 2766 metabolites, 4832 reactions, 1029 EFMs.

  • Moreover most of the thousands or millions of EFMs that can be computed for

medium scale systems are biologically irrelevant. This is because only stoichiometric constraints and some known irreversibility constraints are taken into account, but a lot of other biological constraints are missing.

  • In order to both being able to deal with systems at genome scale and to

drastically reduce the number of biologically unfeasible EFMs, it is necessary to take into account and to integrate in the enumeration process biological constraints such as thermodynamic, kinetic and regulatory constraints.

15

slide-16
SLIDE 16

Adding biological constraints: general framework

  • Given supplementary biological constraints C(v, x) on flux vectors v, depending in

general on other variables x (e.g., enzymes concentrations, metabolites concentrations) and parameters p (not represented), we are interested in the solutions satisfying the constraints: SolC(x) = {v ∈ ℝd | S v = 0, v ≥ 0, C(v, x)} ⊆ P ≡ Sol∅.

  • Most often, values of most or all of variables x are unknown. Then, we consider the

following weaker form of the solution space, requiring consistency with the constraints: SolC = {v ∈ ℝd | S v = 0, v ≥ 0, ∃x C(v, x)} ⊆ P where ∃x C(v, x) is True iff {x | C(v, x)} ≠ ∅. Thus: SolC = ∪x SolC(x)

  • Questions:
  • What is the mathematical (geometrical) structure of SolC?
  • What are the analogues of EFMs, i.e., those solutions in SolC that are support-

minimal in SolC, that are undecomposable, … ? Do they characterize SolC?

  • How computing these minimal or undecomposable solutions? In particular, how

integrating their computation into the DD method? Or consider alternative methods? Efficiently enough to be able to process genome scale models.

16

slide-17
SLIDE 17

Support-based constraints

Very often, constraint ∃x C(v, x) depends only on the support of v, not on the values of the positive fluxes in each

  • reaction. ∃x C is said support-based if:

∃x C(v, x), v’ ∈ P, Supp(v’) = Supp(v) ⇒ ∃x C(v’, x). In this case, SolC is still a cone and: ∀ v1, …, vn ∈ SolC, StrictConvexHull(v1, …, vn) ⊆ SolC

  • r StrictConvexHull(v1, …, vn) ∩ SolC = ∅

with StrictConvexHull(v1, …, vn) = {λ1v1 + ⋯ + λnvn | λ1, ⋯, λn > 0}. Thus SolC is a union of interiors of faces of P, so a union of convex polyhedral cones with in general missing faces (not topologically closed). Or at least (in presence of capacity constraint) this holds for enough small fluxes. ∃x C is said truncated-support-based if: ∃x C(v, x) ⇒ ∃λ> 0, (v’ ∈ P ∩ [0,1]

λ, Supp(v’) = Supp(v) ⇒ ∃x C(v’, x)).

  • NB. λcan be assumed independent of v as the number of possible supports is finite.

In this case, SolC is a truncated cone (i.e., v ∈ SolC, 0≤λ≤1 ⇒ λv ∈ SolC) and: ∃λ> 0, ∀ v1, …, vn ∈ SolC, StrictConvexHull(v1, …, vn) ∩ [0,1]

λ ⊆ SolC

  • r StrictConvexHull(v1, …, vn) ∩ SolC = ∅.

17

slide-18
SLIDE 18

Support-monotone constraints: geometrical solution space structure

A stronger property of a constraint C is its monotonicity w.r.t. support set inclusion. ∃x C is said support-monotone if: ∃x C(v, x), v’ ∈ P, Supp(v’) ⊆ Supp(v) ⇒ ∃x C(v’, x). Support-monotonicity has two important consequences:

  • 1. v ∈ SolC, v’ ∈ P, Supp(v’) ⊆ Supp(v) ⇒ v’ ∈ SolC.

In particular, (support-)minimal solutions EFMCs in SolC are also undecomposable solutions and are exactly those EFMs in P that satisfy the constraint ∃x C: Min_support(SolC) = Undecomposable(SolC) ≡ EFMCs = EFMCs ≡ EFMs ∩ ∃x C(v, x). Each solution in SolC is a nonnegative combination of EFMCs (but the converse is false in general as SolC is not convex in general). But actually SolC = U1≤i≤lPi, with Pi (maximal) convex polyhedral cone (face of P). Let MaxConvexHullEFMCi (1≤i≤l) the maximal subsets of EFMs of P such that any non negative linear combination of those EFMs satisfies ∃x C. Then MaxConvexHullEFMCi is the set of extreme rays of Pi .

18

slide-19
SLIDE 19

Support-monotone constraints: integration in the DD method

  • 2. In the DD method, at each iteration step k, any extremal ray v of the cone Pk whose

already processed part vk (i.e., its first k components, if constraints vi ≥ 0 are processed in increasing order of i) does not satisfy ∃x C can be definitely discarded. This is because all rays it could contribute to construct in the next steps would have a larger support in this part and thus would also violate C: ¬∃x C(vk, x), v’ ∈ P, Supp(vk) ⊆ Supp(v’k) ⇒ ¬∃x C(v’k, x). Thus filtering of EFMs that satisfy ∃x C can be done during the iteration process of the DD method by call to a solver that checks satisfiability of ∃x C, and not at the end in post- processing. These results are still valid in a neighbor of 0 if ∃x C is truncated-support-monotone, i.e.: ∃x C(v, x) ⇒ ∃λ> 0, (v’ ∈ P ∩ [0,1]λ, Supp(v’) ⊆ Supp(v) ⇒ ∃x C(v’, x)).

  • NB. λcan be assumed independent of v as the number of possible supports is finite.
  • 1. ∃λ>0 SolC ∩ [0,1]λ = U1≤i≤l(Pi ∩ [0,1]λ).
  • 2. for an extremal ray v of Pk, one checks if ∃λ> 0,λvk satisfies ∃x C.

19

slide-20
SLIDE 20

Example 1: thermodynamic constraints

∃x C(v, x) ≡ ∃x ∀i ∈ Supp(v), ∆rGi = ∑1≤j≤m Sji (∆fG’0j + RT ln(xj/x0)) < 0 (which means: vi > 0 ⇒ ∆rGi < 0). where x ∈ ℝ+m is the metabolites concentrations vector and ∆fG’0j the apparent standard Gibbs free energy of formation of metabolite j. When available, experimental lower and upper bounds xjmin and xjmax of concentrations xj’s can be added: ∀j, 1≤j≤m ln(xjmin/c0) ≤ ln(xj/c0) ≤ ln(xjmax/c0). ∃x C is support-monotone (also true for Cx) and C(v, x) is linear in ln(xj/x0). ∃x C satisfiability can be checked by an LP program at each iteration of the DD

  • method. This is implemented in tEFMA based on Efmtool.

MaxConvexHullEFMCi are the largest thermodynamically consistent sets of EFMs, thermodynamically closed by convex hull. They can be computed at the end by using the adjacency relations between EFMCs.

  • NB. Without bounds on concentrations, at most 50% of EFMs can be eliminated as

thermodynamically unfeasible.

20

slide-21
SLIDE 21

Example 1: thermodynamic constraints

As ∑1≤j≤m Sji∆fG’0j = –RT ln(Ki=), where Ki= is the equilibrium constant of reaction i, ∃x C can be rewritten as: ∃x C(v, x) ≡ ∃x ∀i ∈ Supp(v), ∑1≤j≤m Sji ln(xj/x0) < ln(Ki=). By using one form of Farkas duality lemma (Kuhn and Fourier theorem): {x ∈ ℝm | Ax < b} ≠ ∅ ⟺ (∀z ∈ ℝd \{0}, z ≥ 0, zTA = 0 ⇒ zTb > 0), one obtains: SolC = {v ∈ P | ∀z (z ∈ P \{0}, Supp(z) ⊆ Supp(v) ⇒ zTln(K=) > 0)}. Thus SolC = U1≤i≤lPi, with Pi (maximal) faces of P included in the half-space vTln(K=) > 0 and EFMCs = {v ∈ EFMs | vTln(K=) > 0} (true only for EFMCs, not for all v ∈ SolC). MaxConvexHullEFMCi is the set of EFMs of Pi. NB. SolC is connected. Actually, for given x, let Ix = {i | ∑1≤j≤m Sji ln(xj/x0) < ln(Ki=)}, then: SolC(x) = PIx = P ∩ ℝIx and SolC = ∪x PIx thus each Pi is a union of PIx.

21

slide-22
SLIDE 22

Example 1: thermodynamic constraints

To compute EFMCs within the DD framework (without any call to LP), just add the constraint: vTln(K=) > 0, i.e., add a line to the matrix A made up of the coefficients ln(Ki=) (as a new

metabolite Z involved in all reactions ri with stoichiometry ln(Ki=)), process the constraint

vTln(K=) ≥ 0 and at the end remove the vertices on the hyperplane vTln(K=) = 0. Another way is to add the constraints: vTln(K=) ≥ rd+1, rd+1 ≥ 0, by adding a new column to A, as a new irreversible reaction rd+1, and two lines (ln(K1=), …, ln(Kd=), -1) and (0, …, 0, 1) and imposing that rd+1 > 0 in the solution (see Boolean constraints processing).

  • NB. DD gives thus the EFMCs. But the cone it defines implicitly is not SolC but

contains the convex hull of those EFMCs. We have to identify in this convex hull the entire maximal faces Pi of P (from the adjacency list).

22

slide-23
SLIDE 23

Example 2: kinetic constraints

∃E,x C(v,E,x) ≡ ∃E,x v = E o k(x,p) ∧ cTE ≤ W (kinetics + enzymatic resource) where E ∈ ℝ+

d, x ∈ ℝ+ m are the enzymes and metabolites concentrations vectors, ci and W

positive constants (C(v,E,x) is linear in E but non-linear in ln(x/x0). Lower and upper bounds of metabolites concentrations may be added to C if available: ∀j, 1≤j≤m xmin j ≤ xj ≤ xmax j. ∃E,x C is truncated-support-monotone (also true for ∃E Cx but not for ∃x CE and CE,x) Proof: Let ∃E,x C(v,E,x), λ= mini∈Supp(v)(vi), v’ ∈ P ∩ [0,1]λ, Supp(v’) ⊆ Supp(v). Choose E’i = 0 if i ∉ Supp(v’) and E’i = Ei v’i/vi if i ∈ Supp(v’), then C(v’,E’,x).

  • NB. The result is no longer true if lower bounds Emin j
  • n enzymes concentrations are added.

∃E,x C satisfiability can be checked by a non-linear solver at each iteration k of the DD method: any extremal ray v of Pk whose already processed part vk does not satisfy ∃λ,E,x λvk = E o k(x,p) can be definitely discarded. MaxConvexHullEFMCi are the largest kinetically consistent sets of EFMs, kinetically closed by convex hull and are the sets of EFMs of the Pi’s, where SolC ∩ [0,1]λ = U1≤i≤l(Pi ∩ [0,1]λ).

23

slide-24
SLIDE 24

Example 2: kinetic constraints and relationship with thermodynamics

But actually ∃λ> 0,E,x C(λv,E,x) ⟺ ∃x ∀i ∈ Supp(v) ki(x,p) > 0. (Remind that all reactions are assumed to be irreversible; otherwise the condition would be: sign(ki(x,p)) = sign(vi)). Proof: choose Ei = 0 if i ∉ Supp(v) and Ei = λvi / ki(x,p) if i ∈ Supp(v) with λ= W/(c

T(v/k(x,p))).

Thus, v (up to an enough small factor) is kinetically feasible if and only if it exists internal metabolites concentrations in agreement with the sense of the reactions in Supp(v). But this has to deal with thermodynamics. Actually, for Michaelis-Menten kinetics, we have: ki(x,p) = kcat

+ 𝝀 + (1 – e ∆rGi/RT)

with 𝝀

+ = ∏j|Sji<0(xj/Kxj) –Sji / (1+∏j|Sji<0(xj/Kxj) –Sji+∏k|Ski>0(xk/Kxk) Ski)

and ∆rGi/RT

= ∑1≤j≤m Sji ln(xj/x0) – ln(Ki =)

and thus: ∃λ> 0,E,x Ckinetics(λv,E,x) ⟺ ∃x Cthermodynamics(v, x). It means that, in absence of lower bounds on enzymes concentrations, the solution space SolC and the EFMCs for (Michaelis-Menten or analog) kinetics are the same (except truncation) that for thermodynamics and can be computed identically inside the DD framework (idem with bounds on xj). So, kinetics reduces the solution space and the number of EFMCs w.r.t. thermodynamics if and only if lower bounds Ei

min on enzymes concentrations are known for some reactions. But in this case, all above

is not applicable and we have to rest on non-convex geometry and programming.

24

slide-25
SLIDE 25

Example 3: transcriptional regulatory Boolean constraints

C(v) ≡ rc(vi) where rc is any Boolean regulatory constraint in the vis considered as propositional variables by: vi ⇔ vi > 0 and ¬vi ⇔ vi = 0. C is support-based but is not support-monotone. The only CNF constraints rc that are support-monotone are the conjuncts of negative clauses: ¬vi1 ∨ … ∨ ¬vin (the same as: vi1∧ … ∧ vik ⇒ ¬v). These are the only ones that can be filtered during the iteration process of the DD method: extreme rays whose support of the part already processed at step k contains vi1 , …, vin can be discarded. It is what achieves regEfmtool. So most constraints are only dealt with in post processing. By rewriting rc as DNF, it is easy to prove that Solrc is a union of convex polyhedral cones (that can be assumed to be disjoint), each one included in a face of P. But, due to the presence of positive literals, these cones are no more topologically closed (some of their faces have to be removed).

25

slide-26
SLIDE 26

Example 3: transcriptional regulatory Boolean constraints

For rc = p+∧ p-, (p+ conjunct of positive literals and p- conjunct of negative literals) the convex cone Solrc is equal to (Pp-)p+. Pp- = P ∩ {vi = 0}, where p- = ∧¬vi, is thus the face of the cone P defined by the hyperplanes vi = 0 and generated by those EFMs of P whose support does not contain any i. So the problem boils down to characterize Prc for rc = v1∧ … ∧ vn. Prc is just P without its faces vi = 0. So, the convex polyhedral cone Prc is no more topologically closed. Let EFMrc those EFMs of P that satisfy rc, EFMrc the minimal support elements of Prc and UFMrc the undecomposable elements of Prc (any solution in Prc is a nonnegative combination of elements of UFMrc). EFMrc ⊆ EFMrc ⊆ UFMrc, where both inclusions may be strict. Undecomposability does not imply support minimality. Support minimality does not imply to be an EFM.

26

slide-27
SLIDE 27

Example 3: transcriptional regulatory Boolean constraints

Example: 2 internal metabolites {A, B} and 5 irreversible reactions {t1, t2, t3, t4, e}. EFM = {e1, e2, e3}.

27

e2 e1 e3 e1⊕e2 e1⊕e3 e2⊕e3 e1⊕e2⊕e3

t1 t2 t3 t4 e A B

slide-28
SLIDE 28

Example 3: transcriptional regulatory Boolean constraints

rc = t1∧t4. EFMrc = {e1}. EFMrc = EFMrc ∪ {e2⊕e3}. UFMrc = EFMrc ∪ {e1⊕e2, e1⊕e3}.

28

slide-29
SLIDE 29

Example 3: transcriptional regulatory Boolean constraints

Example: e t3 t2 t1 t4

  • 1 -1 0 1 0 A

1 0 1 0 -1 B Dim (Ker(S)) = 5-2 = 3. Take e = (1 0 0 1 1)T, t3 = (0 1 0 1 0)T, t2 = (0 0 1 0 1)T as a basis. Ker(S) = {xe + yt3 + zt2 | x,y,z ∈ ℝ} = {(x y z x+y x+z)T}. P = {(x y z x+y x+z)T | x,y,z ≥ 0}. EFM = {e1=e, e2=t3, e3=t2}. Projection onto ℝ3 = <e,t3,t2>: rc = t1∧t4. Prc = {(x y z x+y x+z)T | x,y,z ≥ 0, x+y>0, x+z>0} = P without edges e2 and e3.

29

S = e1 e2 e3 x y z

slide-30
SLIDE 30

30

A solution: use of SMT

It is easy to express in SAT both the structural description of a metabolic pathway and the Boolean regulatory constraints. Dealing with stoichiometry and steady state just requires coupling with Linear Real Arithmetic (LRA). Enumerating EFMrcs boils down to enumerate minimal models: once one solution is found, it is minimized (by removing one by one a reaction up to unsatisfiability), kept and then blocked altogether with its supersets for searching another solution, up to global unsatisfiability. Identifying EFMrcs among those EFMrcs is done by applying the dimension

  • ne kernel criterion, so by LRA.

Work done during the thesis of Martin Morterol with CVC4 solver: « Méthodes avancées de raisonnement en logique propositionnelle : application aux réseaux métaboliques », December 2016. Results (comparison with Regefmtool): when the problem is sufficiently constrained to have not a large set of solutions, SMT-based approach

  • utperforms DD method for the enumeration of EFMrcs. But it is not efficient

for enumerating EFMs in the absence of constraints or with few constraints.

slide-31
SLIDE 31

Minimal cut sets

Minimal cut sets (MCSs) introduced in 2004 as minimal sets of reactions whose deletion (knock-outs or cuts) will block the operation of a given objective or target reaction, i.e., removal of an MCS implies a zero flux for the target reaction in steady state. Generalizations: for blocking the operation of arbitrary sets of EFMs (2006) or for accounting for side constraints (2011). Three different approaches for MCSs enumeration:

  • calculation of the concerned EFMs and calculation of their minimal hitting sets

(several methods exist: Berge’s algorithm to compute hypergraph transversals (1989); Reiner algorithm (1987) corrected by Greiner et al. (1989); etc.);

  • calculation of a dual network and computation of its EFMs, which correspond to

MCSs in the original network (and EFMs in the original network correspond to MCSs in the dual network); this duality is based on the Farkas lemma;

  • simultaneous generation of both EFMs and MCSs (2008), based on the Joint-

Generation algorithm from Fredman and Khachiyan (1996).

31

slide-32
SLIDE 32

MCSs as EFMs in a dual network in presence of biological constraints

Objective: study how adding biological constraints (thermodynamics, kinetics, regulation) in the primal network translates in the dual network, in order that the EFMs of the dual network satisfying these translated constraints are the MCSs as

  • btained by computing the hitting sets of the EFMs of the primal network satisfying

the original biological constraints. Done and experimented for thermodynamics (Eva Dechaux’s master internship in 2016).

32

slide-33
SLIDE 33

Perspectives

  • Extend SMT-based approach to thermodynamic and kinetic constraints.
  • Investigate lrs method (parameterized and highly parallelized).
  • Look for and take into account the structural specificities of metabolic networks.
  • Compare DD, lrs and SMT-based methods on large metabolic networks.
  • Extend to MCSs enumeration.
  • Extend to optimal flux optimization, metabolite production optimization.
  • Explore knowledge compilation techniques to compile a network w.r.t. a given

query language in order to be able to query efficiently a network directly. More long term:

  • Coupling metabolic and genetic regulatory networks and studying the evolution of

EFMs along time (e.g., during the different phases of a cell life).

  • Apply my other works about qualitative abstractions of hybrid systems, model

checking at abstract level and counter-example guided abstraction refinement?

33