Formal approaches to the synthesis of biological networks

Nicola Paoletti, Department of Computer Science, University of Oxford. Metable Workshop, Computer Laboratory, University of Cambridge, 26 Mar 2015.


slide-1
SLIDE 1

Formal approaches to the synthesis of biological networks

Nicola Paoletti Department of Computer Science, University of Oxford Metable Workshop

Computer Laboratory, University of Cambridge, 26 Mar 2015

slide-2
SLIDE 2

MOTIVATION

Verification of given biological properties on a model

Model checking

  • IN: Model M + Property φ
  • OUT: M ⊨ φ?

  • In Systems Biology, uncertain models are inevitable
  • Limited knowledge
  • Synthetic applications
  • Verification assumes a fully specified model

slide-3
SLIDE 3

MOTIVATION

FROM: Model checking

Verification of given biological properties on a model

  • IN: Model M + Property φ
  • OUT: M ⊨ φ?

TO: Synthesis (CS)

Automatic derivation of a model that meets given biological properties

  • IN: “Uncertain” model M(·) + Property φ
  • OUT: f s.t. M(f) ⊨ φ

slide-4
SLIDE 4

PLAN OF THE TALK

Part 1 Part 2

slide-5
SLIDE 5

PLAN OF THE TALK

Part 1

slide-6
SLIDE 6

SEA URCHIN DEVELOPMENTAL GRN

Part 1

  • Boolean GRN model (on/off + logical connections)
  • 45 genes
  • Time delays (due to chemical kinetics)
  • [0-30] hours post-fertilization (hpf). Step: 1 hpf
  • 4 spatial domains (leading to specific kinds of cells and organs)
  • 2 spatial relations between domains (direct and indirect contiguity) describing the evolution of embryonic geometry

(one of) THE MOST COMPLETE MODELS

slide-7
SLIDE 7

Davidson et al., PNAS 109(41) 16434-6442, 2012

LIMITATIONS

Part 1

slide-8
SLIDE 8

Davidson et al., PNAS 109(41) 16434-6442, 2012

  • Can’t fully explain experimental data (discrepancies on 26/45 genes)

LIMITATIONS

Part 1

slide-9
SLIDE 9

Davidson et al., PNAS 109(41) 16434-6442, 2012

  • Can’t fully explain experimental data (discrepancies on 26/45 genes)
  • Manual modifications to force observations (3/45 genes)

LIMITATIONS

Part 1

slide-10
SLIDE 10

SYNTHESIS OF GRNS

How to obtain a GRN model that fully explains experimental data?

SMT solving

Part 1

slide-11
SLIDE 11

SATISFIABILITY MODULO THEORIES (SMT)

SAT

  • IN: Boolean formula φ
  • OUT: is φ SAT? + Interpretation of variables

Example: φ = (x1 ∨ ¬x2) ∧ x3

SMT (= SAT + theories)

  • IN: FOL formula φ over one or more theories (bit-vectors, (non-)linear integer/real arithmetic, …)
  • OUT: is φ SAT? + Interpretation of (free) variables and functions

Example: φ = ∀x0.(x0 < x1 ⟹ f(x0, x2) ≤ x3)

Part 1
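
To make the SAT half of this slide concrete, here is a minimal sketch that decides the example formula φ = (x1 ∨ ¬x2) ∧ x3 by exhaustive enumeration and returns the satisfying interpretations. It is an illustration only: SAT/SMT solvers do not work by enumerating assignments, and the helper name `all_models` is hypothetical.

```python
from itertools import product

def phi(x1, x2, x3):
    """Example formula from the slide: (x1 or not x2) and x3."""
    return (x1 or not x2) and x3

def all_models(formula, n_vars):
    """Return every assignment (tuple of bools) that satisfies `formula`."""
    return [bits for bits in product([False, True], repeat=n_vars)
            if formula(*bits)]

models = all_models(phi, 3)   # the "interpretation of variables" part of the output
is_sat = len(models) > 0      # the "is phi SAT?" part of the output
```

Each tuple in `models` is one interpretation (x1, x2, x3) under which φ holds. An SMT query additionally returns interpretations for free functions such as f in the second example.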

slide-12
SLIDE 12
FORMAL MODEL (GRN + DELAYS + DOMAINS)

GRN MODEL

  • G: genes
  • D: spatial domains
  • T: discrete bounded time domain
  • SR: spatial relations, r : D × D × T → B
  • F: update functions (aka Vector Equations), f_g : Π × T × D → B (g ∈ G). History and domain dependent

TRANSITION SYSTEM DYNAMICS

  • States Q = B^|G×D| (Boolean expression of each gene in each domain)
  • Finite paths Π
  • Synchronous dynamics
  • Transition relation δ : Π → B

δ(π) ⟺ ⋀_{i∈T, g∈G, d∈D} π[i](g, d) = f_g(π, i, d)

Part 1
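
The transition relation δ can be read as a check that every entry of a path agrees with its update function. The following is a minimal sketch of that check on a toy network. The two genes, the single domain, the initial values and the update functions `f_a` and `f_b` are all made up for illustration and are not taken from the sea urchin model.

```python
GENES = ["a", "b"]
DOMAINS = ["d1"]
TIMES = range(4)  # discrete bounded time domain T = {0, 1, 2, 3}

# Hypothetical initial expression, used by the update functions at time 0.
INITIAL = {("a", "d1"): True, ("b", "d1"): False}

def f_a(path, i, d):
    """a takes the value b had one step earlier (history dependent)."""
    return INITIAL[("a", d)] if i == 0 else path[i - 1][("b", d)]

def f_b(path, i, d):
    """b is the negation of the value a had one step earlier."""
    return INITIAL[("b", d)] if i == 0 else not path[i - 1][("a", d)]

F = {"a": f_a, "b": f_b}

def delta(path):
    """delta(pi): pi[i](g, d) == f_g(pi, i, d) for all i in T, g in G, d in D."""
    return all(path[i][(g, d)] == F[g](path, i, d)
               for i in TIMES for g in GENES for d in DOMAINS)

def unroll():
    """Build the unique path of this toy model by applying F synchronously."""
    path = []
    for i in TIMES:
        path.append({(g, d): F[g](path, i, d) for g in GENES for d in DOMAINS})
    return path

good_path = unroll()
bad_path = [dict(state) for state in good_path]
bad_path[2][("a", "d1")] = not bad_path[2][("a", "d1")]  # break one entry
```

Here `delta(good_path)` holds and `delta(bad_path)` does not, because the flipped entry at time 2 no longer matches `f_a`.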

slide-13
SLIDE 13

FORMAL MODEL (GRN + DELAYS + DOMAINS)

OBSERVATIONS O

Wildtype expression → predicates on paths

(e.g. gene g1 is off at time 3 in domain d1)

¬π[3](g1, d1)

Perturbation experiments → modified vector equations + predicates comparing wildtype and perturbed paths

(e.g. g1 is over-expressed in d1, time interval [5,10] and perturbation p1)

π_p1[5, 10](g1, d1) > π[5, 10](g1, d1)

Part 1

slide-14
SLIDE 14

PROBLEM FORMULATION

Input:

  • GRN N = (G, D, SR, T, F) with partial knowledge of F
  • Observations O

Synthesize functions in F s.t. the dynamics of N admits paths that meet all observations

Model encoded as constraints in the theory of bit-vectors (SMT QF_UFBV)

Part 1

slide-15
SLIDE 15

FORMALIZATION OF VECTOR EQUATION LANGUAGE

E ::= g | ¬E | E ∧ E | E ∨ E | > t | < t | In d̄ | In d̄ E | In r d̄ E | At-n E | After-n E | Perm-n E

  • In d̄ E: evaluation in domain d̄
  • In r d̄ E: evaluation in a domain in sp. relation r with d̄
  • At-n E, After-n E, Perm-n E: temporal operators with a delay of n steps, covering delayed permanent activation and delayed permanent repression

[Figure: timelines illustrating At-2 E and After-1 E]

Part 1

slide-16
SLIDE 16

FUNCTION SYNTHESIS

Basic interactions (BI) are templates for the synthesis of regulatory terms

f = op t r d b g

  • op: temporal operators
  • t: delays
  • r: spatial relations
  • d: domains
  • b g: input genes and their expression

✓ Clear biological interpretation
✓ Incorporates uncertainty

Examples

f = At-[1, 3] ¬g1
f = {After-, Perm-} ? In{d1, d2} ? {g1, g2}

Part 1

slide-17
SLIDE 17

FUNCTION SYNTHESIS

We use Uninterpreted Boolean Functions (UBFs) to synthesize logical combinations of regulatory inputs (BIs or further UBFs).

uf : B^n → B

Part 1
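
One way to picture what synthesizing an uninterpreted Boolean function means is to enumerate every interpretation of uf : B^2 → B as a truth table and keep those that agree with the required input/output behaviour. The sketch below does this by brute force. It is not how the talk's method works (there the solver searches for the interpretation), and the `observations` list is hypothetical.

```python
from itertools import product

# All possible values of the two regulatory inputs (f1, f2).
INPUTS = list(product([False, True], repeat=2))

def all_boolean_functions():
    """Yield every function B^2 -> B, represented as a truth table (dict)."""
    for outputs in product([False, True], repeat=len(INPUTS)):
        yield dict(zip(INPUTS, outputs))

def consistent(table, observations):
    """True if the truth table reproduces every observed input/output pair."""
    return all(table[inputs] == expected for inputs, expected in observations)

# Hypothetical observations: input values (f1, f2) and the required output.
observations = [
    ((True, True), True),
    ((True, False), False),
    ((False, True), False),
]

candidates = [t for t in all_boolean_functions() if consistent(t, observations)]
```

With three of the four truth-table rows fixed, two of the 16 functions remain: they differ only on the unobserved input (False, False), which shows how partial data leaves several models admissible.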

slide-18
SLIDE 18

RESULTS

Software implementation of VE language and SMT-based synthesis methods with Z3 as solving engine

  • User-guided refinement
  • UNSAT core-guided relaxation

Original function:

hfn1 := AT-2 bra AND AT-2 eve

UF of synthesis templates:

f1 := {AT-,AFTER-}? IN ?? bra
f2 := {AT-,AFTER-}? IN ?? eve
hfn1 := uf(f1,f2)

f1 := AT-[0,5] bra
f2 := AT-[0,5] eve
hfn1 := uf(f1,f2)

Synthesized function:

hfn1 := AT-1 bra AND eve

Part 1

slide-19
SLIDE 19

FULLY EXPLAINS EXPERIMENTAL DATA

  • Solved discrepancies
  • No need for manual modifications

slide-20
SLIDE 20
SUMMARY

  • SMT-based method for synthesizing models of GRNs
  • Applied to state-of-the-art model of sea urchin development
  • 45 genes x 4 spatial domains x 2 spatial relations x [0,30] hpf
  • Wildtype expression + 3 perturbation experiments
  • Formal encoding of biological DSL
  • Synthesized model fully explains observations with no major changes
  • Effective (performance depends on size of search space for functions)

f1 := {AT-,AFTER-}[0,6] IN ?? bra
f2 := {AT-,AFTER-}[0,6] IN ?? eve
hfn1 := uf(f1,f2)

352800 possible functions!

Part 1

slide-21
SLIDE 21

PLAN OF THE TALK

Part 2

slide-22
SLIDE 22

STOCHASTIC BIOCHEMICAL REACTION NETWORKS

  • Formalism for biosystems like
  • signalling pathways, gene regulation, epidemic models, …
  • molecular devices, DNA logic gates, DNA walker circuits, …
  • low molecular counts → stochastic dynamics → Continuous Time Markov Chains (CTMC)

  • Uncertain kinetic parameters

Part 2

slide-23
SLIDE 23

STOCHASTIC BIOCHEMICAL REACTION NETWORKS

AIM: Precise parameter synthesis

Synthesising parameters so that a given property is guaranteed to hold, or the probability of satisfying it is maximised/minimised

Part 2

slide-24
SLIDE 24

STOCHASTIC BIOCHEMICAL REACTION NETWORKS

[Figure: reactions and rate functions, parameters, CTMC, property (Continuous Stochastic Logic)]

Parametric CTMC (pCTMC) semantics

  • STATES: vector of populations/species counts
  • (parametric) TRANSITION RATES: kinetic rate functions (e.g. mass action law)
  • PARAMETER SPACE (continuous): intervals of kinetic parameters

Part 2

slide-25
SLIDE 25

SATISFACTION FUNCTION

[Figure: plot of the satisfaction function]

slide-26
SLIDE 26

probability bounds

[Figure: plot of the probability bounds]

slide-27
SLIDE 27

probability bounds

[Figure: plot of the probability bounds]

slide-28
SLIDE 28

SYNTHESIS METHOD (SKETCH)

1) Method to compute SAFE APPROXIMATIONS to min and max probabilities over a fixed parameter region

[Figure: upper and lower bounds as safe approximations]

Part 2

slide-29
SLIDE 29

SYNTHESIS METHOD (SKETCH)

1) Method to compute SAFE APPROXIMATIONS to min and max probabilities over a fixed parameter region
2) Parameter space decomposition → improves accuracy of approximation

[Figure: upper and lower bounds as safe approximations]

Part 2

slide-30
SLIDE 30

SYNTHESIS METHOD (SKETCH)

1) Method to compute SAFE APPROXIMATIONS to min and max probabilities over a fixed parameter region
2) Parameter space decomposition → improves accuracy of approximation
3) Synthesis algorithms iterate steps 1) and 2) until required precision is reached

Threshold (≥r)

  • True if lower bound above threshold r
  • False if upper bound below r
  • Undecided otherwise (to refine)

Max

  • False if upper bound below under-approximation of maximum probability M
  • True otherwise (to refine)

Part 2
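
The threshold variant of steps 1) to 3) can be sketched as a refinement loop over a one-dimensional parameter interval. Everything specific here is an assumption made for illustration: the satisfaction function is a toy polynomial p(k) = 2k(1 − k) rather than a CTMC probability, the safe bounds come from its Lipschitz constant rather than from the talk's approximation method, and the names `toy_bounds` and `threshold_synthesis` are hypothetical.

```python
from collections import deque

def toy_bounds(a, b):
    """Safe (lower, upper) bounds on p(k) = 2k(1-k) for every k in [a, b].

    Uses |p'(k)| <= 2 on [0, 1]: p deviates from p(mid) by at most
    2 * (b - a) / 2 = b - a anywhere in the interval.
    """
    mid = 0.5 * (a + b)
    p_mid = 2.0 * mid * (1.0 - mid)
    slack = b - a
    return max(0.0, p_mid - slack), min(1.0, p_mid + slack)

def threshold_synthesis(bounds, region, r, max_undecided_fraction):
    """Split `region` into true / false / undecided sub-intervals for p >= r."""
    true_regions, false_regions = [], []
    undecided = deque([region])
    budget = max_undecided_fraction * (region[1] - region[0])
    # Refine until the not-yet-classified volume is within the budget.
    while sum(b - a for a, b in undecided) > budget:
        a, b = undecided.popleft()
        lower, upper = bounds(a, b)           # step 1: safe approximations
        if lower >= r:
            true_regions.append((a, b))       # lower bound reaches r
        elif upper < r:
            false_regions.append((a, b))      # upper bound below r
        else:
            mid = 0.5 * (a + b)               # step 2: decompose and retry
            undecided.extend([(a, mid), (mid, b)])
    return true_regions, false_regions, list(undecided)

true_regions, false_regions, undecided = threshold_synthesis(
    toy_bounds, (0.0, 1.0), r=0.3, max_undecided_fraction=0.05)
```

Because the bounds are safe, every interval in `true_regions` lies inside the exact solution set of p(k) ≥ 0.3, and the leftover `undecided` volume is at most 5% of the parameter space, mirroring the stopping criterion used for threshold synthesis.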

slide-31
SLIDE 31

APPLICATIONS: SIR EPIDEMIC MODEL

S + I −ki→ I + I (Susceptible → Infected)
I −kr→ R (Infected → Recovered)

ki, kr: uncertain parameters

Property: φ = (I > 0) U[100,120] (I = 0)

(infection lasts at least 100 time units and ends within 120 time units)

Part 2
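
As a rough illustration of the pCTMC semantics for this model, the sketch below lists the outgoing transitions of a state (S, I, R) under mass-action rates and checks φ on a single simulated run. It assumes kr > 0. The initial state, the parameter values in the usage note and the function names are hypothetical, and stochastic simulation is used here only to illustrate the semantics. It is not the synthesis method of the talk, which computes guaranteed bounds.

```python
import random

def sir_transitions(state, ki, kr):
    """Outgoing transitions (next_state, rate) of the SIR pCTMC, mass action."""
    s, i, r = state
    moves = []
    if s > 0 and i > 0:
        moves.append(((s - 1, i + 1, r), ki * s * i))  # S + I -> I + I
    if i > 0:
        moves.append(((s, i - 1, r + 1), kr * i))      # I -> R
    return moves

def extinction_time(state, ki, kr, rng, horizon):
    """Simulate one run; return the time at which I hits 0, or None if later than horizon."""
    t = 0.0
    while state[1] > 0:
        moves = sir_transitions(state, ki, kr)
        total = sum(rate for _, rate in moves)
        t += rng.expovariate(total)          # exponential sojourn time
        if t > horizon:
            return None
        pick = rng.uniform(0.0, total)       # choose a reaction by its rate
        for nxt, rate in moves:
            pick -= rate
            if pick <= 0.0:
                state = nxt
                break
        else:
            state = moves[-1][0]             # guard against round-off
    return t

def satisfies_phi(tau):
    """(I > 0) U[100,120] (I = 0): the infection dies out at a time in [100, 120]."""
    return tau is not None and 100.0 <= tau <= 120.0
```

For example, `extinction_time((95, 5, 0), 0.002, 0.05, random.Random(1), 120.0)` yields one sample, and averaging `satisfies_phi` over many samples gives a statistical estimate of the satisfaction probability at that single parameter point.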

slide-32
SLIDE 32

APPLICATIONS: SIR EPIDEMIC MODEL

[Figure: threshold synthesis result for the SIR model; axes ki, kr and probability P]

Threshold synthesis:

  • probability of property ≥ 10%
  • volume of undecided region ≤ 10% volume of the parameter space

Part 2

slide-33
SLIDE 33

APPLICATIONS: SIR EPIDEMIC MODEL

Max synthesis:

  • probability tolerance ≤ 1%

[Figure: max synthesis result for the SIR model; axes ki, kr and probability P]

Part 2

slide-34
SLIDE 34

APPLICATIONS: DNA WALKERS

Man-made molecular motor/robot

  • traverses a track of DNA strands (anchorages)
  • stepping and blocking mechanism
  • junctions – circuits evaluating Boolean functions

Parameters:

  • ks, stepping rate (depends on distance between anchorages)
  • c, propensity to step onto a non-adjacent anchorage

Dannenberg, F. et al. Natural Computing (2014)

Part 2

slide-35
SLIDE 35

APPLICATIONS: DNA WALKERS

Property:

P≥0.8[F[200,200] finish-correct] ∧ P≤0.16[F[200,200] finish-incorrect]

Probability ≥ 80% that the walker terminates with the correct result at time 200 min AND Probability ≤ 16% that the walker terminates with a wrong result at time 200 min

Part 2

slide-36
SLIDE 36

APPLICATIONS: MAMMALIAN CELL CYCLE

Part 2

Swat, M. et al. Bioinformatics 20(10), 1506–1511 (2004)

[Figure: pRB (A) and E2F1 (B) circuit]

  • Transition from G1 to S regulated by pRB (A) -- E2F1 (B) circuit
  • Bistability of B:
  • High concentrations → transition to S
  • Low concentrations → prevents committing to S
  • Parameters: degradation rates γA and γB
slide-37
SLIDE 37

APPLICATIONS: MAMMALIAN CELL CYCLE

Bistability property: high probability of reaching either the low (L, B<2) or the high (H, B>4) mode

P>0.4[F^t L] ∧ P>0.4[F^t H]

Swat, M. et al. Bioinformatics 20(10), 1506–1511 (2004)

[Figure: pRB (A) and E2F1 (B) circuit]

Part 2

slide-38
SLIDE 38

SUMMARY

  • First method for precise parameter synthesis for CTMCs and (time-bounded) CSL
  • Arbitrarily precise results
  • Based on the computation of provably correct bounds and iterative decomposition of parameter space
  • Application to SIR epidemic model, DNA walkers and bistability of pRB-E2F1 circuit

Part 2