Analyzing and Synthesizing Genomic Logic Functions, Nicola Paoletti



slide-1
SLIDE 1

Analyzing and Synthesizing Genomic Logic Functions

Nicola Paoletti

Department of Computer Science, University of Oxford, UK

Joint work with Hillel Kugler, Youssef Hamadi, Christoph M. Wintersteiger, Boyan Yordanov
Microsoft Research Cambridge, UK

CAV, 20 Jul 14

slide-2
SLIDE 2
  • Boolean GRN model (on/off + logical connections)
  • 45 genes
  • Time delays (due to chemical kinetics)
  • [0-30] hours post-fertilization (hpf). Step: 1 hpf
  • 4 spatial domains (leading to specific kinds of cells and organs)
  • 2 spatial relations between domains (direct and indirect contiguity) describing the evolution of embryonic geometry

Sea urchin developmental GRN

(one of) THE MOST COMPLETE MODELS

slide-3
SLIDE 3

Limitations

Davidson et al., PNAS 109(41) 16434-16442, 2012

slide-4
SLIDE 4

Limitations

Davidson et al., PNAS 109(41) 16434-16442, 2012

  • Can’t fully explain experimental data (discrepancies on 26/45 genes)

slide-5
SLIDE 5

Limitations

Davidson et al., PNAS 109(41) 16434-16442, 2012

  • Can’t fully explain experimental data (discrepancies on 26/45 genes)
  • Hard-coded terms to force observations (3/45 genes)

$\mathrm{In}\ d \wedge {>}\,t_i \wedge {<}\,t_j$

slide-6
SLIDE 6

Limitations

Davidson et al., PNAS 109(41) 16434-16442, 2012

  • Can’t fully explain experimental data (discrepancies on 26/45 genes)
  • Hard-coded terms to force observations (3/45 genes)
  • Informal modelling language
  • Simulation semantics (ignores non-determinism)

$\mathrm{In}\ d \wedge {>}\,t_i \wedge {<}\,t_j$

slide-7
SLIDE 7

Synthesis of GRNs

How to obtain a GRN model that fully explains experimental data?

???

slide-8
SLIDE 8

Synthesis of GRNs

SMT solving

How to obtain a GRN model that fully explains experimental data?

slide-9
SLIDE 9

Formal model

Transition system dynamics

  • States $Q = \mathbb{B}^{|G \times D|}$ (boolean expression of each gene in each domain)
  • Finite paths $\Pi$
  • Synchronous dynamics
  • Transition relation $\delta : \Pi \to \mathbb{B}$

$\delta(\pi) \iff \bigwedge_{i \in T,\ g \in G,\ d \in D} \pi[i](g, d) = f_g(\pi, i, d)$

GRN model

  • genes $G$
  • spatial domains $D$
  • discrete bounded time domain $T$
  • spatial relations $SR$, with $r : D \times D \times T \to \mathbb{B}$
  • update functions $F$ (aka Vector Equations), history and domain dependent: $f_g : \Pi \times T \times D \to \mathbb{B}$ $(g \in G)$
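The transition relation above can be read as a path check: a finite path is admitted when every entry agrees with the update function of its gene. The following is a minimal Python sketch of that check, not the SMT encoding used in the talk. The two-gene network, the names `is_valid_path` and `toy_fns`, and the single domain `d1` are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[Tuple[str, str], bool]          # (gene, domain) -> expressed?
Path = List[State]                           # one state per time step
UpdateFn = Callable[[Path, int, str], bool]  # f_g(path, i, d)


def is_valid_path(path: Path, fns: Dict[str, UpdateFn], domains: List[str]) -> bool:
    """delta(pi): every entry pi[i](g, d) equals f_g(pi, i, d)."""
    return all(
        path[i][(g, d)] == f(path, i, d)
        for i in range(len(path))
        for g, f in fns.items()
        for d in domains
    )


# Hypothetical toy network (not taken from the sea urchin model):
# a is on from step 1 onwards; b copies a with a delay of one step.
toy_fns: Dict[str, UpdateFn] = {
    "a": lambda p, i, d: i >= 1,
    "b": lambda p, i, d: i >= 1 and p[i - 1][("a", d)],
}
domains = ["d1"]

good: Path = [
    {("a", "d1"): False, ("b", "d1"): False},
    {("a", "d1"): True, ("b", "d1"): False},
    {("a", "d1"): True, ("b", "d1"): True},
]
bad: Path = [dict(state) for state in good]
bad[1][("b", "d1")] = True  # b switches on one step too early

print(is_valid_path(good, toy_fns, domains))  # True
print(is_valid_path(bad, toy_fns, domains))   # False
```

Because each update function receives the whole path, it can look back at earlier steps, which is what makes the functions history dependent.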

slide-10
SLIDE 10

Experimental observations

Pairs $(C, E)$: $C$ perturbed functions, $E$ observed effects

Perturbed dynamics obtained by replacing $C$ in the set of functions

  • Wildtype expression: $(\emptyset, \pi)$, where $\pi$ is a predicate describing the sequence of observed states
  • Knockout of gene $g$: $(\{f_g(\pi, t, d) := 0\}, E)$, where $E$ is a predicate comparing the wildtype and the perturbed paths
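A knockout observation can be illustrated on a toy network: replace the update function of the knocked-out gene with constant 0, then compare the wildtype and perturbed paths with a predicate. This is a sketch under simplifying assumptions (one domain, deterministic simulation, update functions that only read earlier steps). The helper names `simulate`, `knockout` and `effect` are hypothetical.

```python
from typing import Callable, Dict, List

Path = List[Dict[str, bool]]            # single domain: gene -> expressed?
UpdateFn = Callable[[Path, int], bool]  # f_g(path, i); reads only steps < i


def simulate(fns: Dict[str, UpdateFn], steps: int) -> Path:
    """Build the unique path of a deterministic toy network step by step."""
    path: Path = []
    for i in range(steps):
        path.append({g: f(path, i) for g, f in fns.items()})
    return path


def knockout(fns: Dict[str, UpdateFn], gene: str) -> Dict[str, UpdateFn]:
    """C = {f_gene := 0}: replace one update function with constant False."""
    perturbed = dict(fns)
    perturbed[gene] = lambda path, i: False
    return perturbed


# Hypothetical toy network: a is always on, b follows a after one step.
wild_fns: Dict[str, UpdateFn] = {
    "a": lambda p, i: True,
    "b": lambda p, i: i >= 1 and p[i - 1]["a"],
}


def effect(wild: Path, pert: Path) -> bool:
    """E(pi, pi_i): b is expressed in wildtype but never after the knockout."""
    return any(s["b"] for s in wild) and not any(s["b"] for s in pert)


wild = simulate(wild_fns, 4)
pert = simulate(knockout(wild_fns, "a"), 4)
print(effect(wild, pert))  # True
```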

slide-11
SLIDE 11

Problem formulation

Input:

  • GRN $N = (G, D, SR, T, F)$ with partial knowledge of $F$
  • Observations $O$

Synthesize functions in $F$ s.t. the dynamics of $N$ admits paths that meet all observations

(for each $(C_i, E_i) \in O$, there exist paths $\pi$ of $N$ and $\pi_i$ of $N$ perturbed by $C_i$, s.t. $E_i(\pi, \pi_i)$ holds)

Model encoded as constraints in the theory of bit-vectors (SMT QF_UFBV)
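The synthesis problem can be illustrated at toy scale by exhaustive search: try every candidate for an unknown update function and keep those whose paths meet all observations. Note that this swaps the SMT encoding of the talk for plain enumeration. The network, the candidate space (a delay of 1 to 3 steps and an optional negation) and the two observations are hypothetical.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

Path = List[Dict[str, bool]]
UpdateFn = Callable[[Path, int], bool]


def simulate(fns: Dict[str, UpdateFn], steps: int) -> Path:
    """Deterministic simulation; functions read only earlier steps."""
    path: Path = []
    for i in range(steps):
        path.append({g: f(path, i) for g, f in fns.items()})
    return path


def make_fb(delay: int, negate: bool) -> UpdateFn:
    """Candidate for the unknown f_b: a (or not a) 'delay' steps ago."""
    return lambda p, i: i >= delay and (p[i - delay]["a"] != negate)


def synthesize(steps: int = 4) -> List[Tuple[int, bool]]:
    solutions = []
    for delay, negate in product(range(1, 4), (False, True)):
        fns: Dict[str, UpdateFn] = {
            "a": lambda p, i: True,          # known function
            "b": make_fb(delay, negate),     # function to synthesize
        }
        wild = simulate(fns, steps)
        ko = simulate({**fns, "a": lambda p, i: False}, steps)  # knockout of a
        # Observation 1 (wildtype): b is off at step 1 and on at step 2.
        wildtype_ok = (not wild[1]["b"]) and wild[2]["b"]
        # Observation 2 (knockout of a): b is never expressed.
        knockout_ok = not any(s["b"] for s in ko)
        if wildtype_ok and knockout_ok:
            solutions.append((delay, negate))
    return solutions


print(synthesize())  # [(2, False)]
```

Enumeration only works here because the toy search space has six candidates; a solver-based encoding is what makes larger spaces tractable.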

slide-12
SLIDE 12

Vector equation language…before

Davidson et al., PNAS 109(41) 16434-16442, 2012

slide-13
SLIDE 13

Vector equation language…after

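$E ::= g \mid \neg E \mid E \wedge E \mid E \vee E \mid {>}\,t \mid {<}\,t \mid \mathrm{In}\ \bar{d} \mid \mathrm{In}\ \bar{d}\ E \mid \mathrm{In}\ r\ \bar{d}\ E \mid \text{At-}n\ E \mid \text{After-}n\ E \mid \text{Perm-}n\ E$

  • $\mathrm{In}\ \bar{d}\ E$: evaluation in domain $\bar{d}$
  • $\mathrm{In}\ r\ \bar{d}\ E$: evaluation in a domain in spatial relation $r$ with $\bar{d}$
  • $\text{At-}n\ E$: delay of $n$ steps
  • $\text{After-}n\ E$: delayed permanent activation
  • $\text{Perm-}n\ E$: delayed permanent repression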
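A small evaluator helps make part of this grammar concrete. The sketch below covers genes, the boolean connectives, In d E, At-n and After-n. The semantics are assumptions read off the slide annotations: At-n looks exactly n steps back, and After-n holds once E has held at any step up to n steps back. Perm-n, the time comparisons and In r are left out because the slides do not pin them down enough. The tuple representation and the name `ev` are hypothetical.

```python
from typing import Dict, List, Tuple

State = Dict[Tuple[str, str], bool]   # (gene, domain) -> expressed?
Path = List[State]                    # path[t] is the state at step t


def ev(e: tuple, path: Path, t: int, d: str) -> bool:
    """Evaluate expression e at step t (assumed >= 0) in domain d."""
    op = e[0]
    if op == "gene":                  # g
        return path[t][(e[1], d)]
    if op == "not":                   # not E
        return not ev(e[1], path, t, d)
    if op == "and":                   # E and E
        return ev(e[1], path, t, d) and ev(e[2], path, t, d)
    if op == "or":                    # E or E
        return ev(e[1], path, t, d) or ev(e[2], path, t, d)
    if op == "in":                    # In d0 E: evaluate E in domain d0
        return ev(e[2], path, t, e[1])
    if op == "at":                    # At-n E (assumed: E held exactly n steps ago)
        n, sub = e[1], e[2]
        return t - n >= 0 and ev(sub, path, t - n, d)
    if op == "after":                 # After-n E (assumed: E held at some step <= t - n)
        n, sub = e[1], e[2]
        return any(ev(sub, path, s, d) for s in range(t - n + 1))
    raise ValueError(f"unknown operator: {op}")


# Gene g pulses at step 1 in domain d1 and is never on in d2.
g_on = [False, True, False, False]
path: Path = [{("g", "d1"): v, ("g", "d2"): False} for v in g_on]
g = ("gene", "g")

print(ev(("at", 1, g), path, 2, "d1"))                   # True
print(ev(("at", 1, g), path, 3, "d1"))                   # False
print(ev(("after", 1, g), path, 3, "d1"))                # True
print(ev(("in", "d1", ("after", 1, g)), path, 3, "d2"))  # True
```

Under these assumed semantics, the pulse at step 1 is visible to At-1 only at step 2, while After-1 stays true from step 2 onwards.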

slide-14
SLIDE 14

Vector equation language…after

[Figure: timing diagrams illustrating At-2 E, After-1 E and Perm-1 E against E, with delay n]

$E ::= g \mid \neg E \mid E \wedge E \mid E \vee E \mid {>}\,t \mid {<}\,t \mid \mathrm{In}\ \bar{d} \mid \mathrm{In}\ \bar{d}\ E \mid \mathrm{In}\ r\ \bar{d}\ E \mid \text{At-}n\ E \mid \text{After-}n\ E \mid \text{Perm-}n\ E$

slide-15
SLIDE 15

Vector equation language…after

[Figure: timing diagrams illustrating At-2 E, After-1 E and Perm-1 E against E, with delay n]

And it’s a subset of LTL+P

$E ::= g \mid \neg E \mid E \wedge E \mid E \vee E \mid {>}\,t \mid {<}\,t \mid \mathrm{In}\ \bar{d} \mid \mathrm{In}\ \bar{d}\ E \mid \mathrm{In}\ r\ \bar{d}\ E \mid \text{At-}n\ E \mid \text{After-}n\ E \mid \text{Perm-}n\ E$

slide-16
SLIDE 16

Function synthesis – Basic interactions

Basic interactions (BI) are templates for the synthesis of regulatory terms

$f = op\ t\ r\ d\ b\ g$

  • op: temporal operators
  • t: delays
  • r: spatial relations
  • d: domains
  • b g: input genes and their expression

✓ Clear biological interpretation
✓ Incorporates uncertainty

Examples

f = At-[1, 3] ¬g1
f = {After-, Perm-} ? In{d1, d2} ? {g1, g2}

slide-17
SLIDE 17

Function synthesis – Basic interactions

We encode BIs as bit-vectors, and each evaluation of a BI is mapped to a function.

Basic interactions (BI) are templates for the synthesis of regulatory terms

$f = op\ t\ r\ d\ b\ g$

  • op: temporal operators
  • t: delays
  • r: spatial relations
  • d: domains
  • b g: input genes and their expression

✓ Clear biological interpretation
✓ Incorporates uncertainty

Examples

f = At-[1, 3] ¬g1
f = {After-, Perm-} ? In{d1, d2} ? {g1, g2}

Encoding of $f$:

$\bigwedge_{g_0 \in g,\ b_0 \in b,\ d_0 \in d,\ r_0 \in r,\ t_0 \in t,\ op_0 \in op} f = (g_0, b_0, d_0, r_0, t_0, op_0) \implies op_0\ t_0\,(\mathrm{In}\ r_0\ d_0\,(b_0 \iff g_0))(\pi, i, d)$
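One way to picture a basic interaction is as a tuple of choices, where each concrete assignment yields one regulatory term. The sketch below enumerates the instances of a hypothetical template with explicit choice sets. It omits the spatial relation field r and does not model the "?" wildcards, whose exact meaning the slides leave open. The dictionary layout and the name `instances` are assumptions for illustration only.

```python
from itertools import product
from typing import Dict, Iterator, List

# Hypothetical template: every field lists the values the solver may pick.
template: Dict[str, List] = {
    "op": ["After-", "Perm-"],   # temporal operator
    "t": [1, 2, 3],              # delay
    "d": ["d1", "d2"],           # domain
    "b": [True, False],          # True: gene must be on; False: gene must be off
    "g": ["g1", "g2"],           # input gene
}


def instances(tpl: Dict[str, List]) -> Iterator[str]:
    """Yield one concrete regulatory term per assignment of the fields."""
    for op, t, d, b, g in product(tpl["op"], tpl["t"], tpl["d"], tpl["b"], tpl["g"]):
        literal = g if b else "¬" + g
        yield f"{op}{t} In {d} {literal}"


all_terms = list(instances(template))
print(len(all_terms))  # 48
print(all_terms[0])    # After-1 In d1 g1
```

The count is the product of the field sizes (2 x 3 x 2 x 2 x 2), which shows how quickly the search space grows as templates become less constrained.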

slide-18
SLIDE 18

Function synthesis – Uninterpreted functions

We use Uninterpreted Boolean Functions $uf : \mathbb{B}^n \to \mathbb{B}$ to synthesize logical combinations of regulatory inputs. Additional constraints can be added to avoid the negation of arguments:

$\bigwedge_{i=1,\dots,n} uf(b_1, \dots, b_{i-1}, 0, b_{i+1}, \dots, b_n) \implies uf(b_1, \dots, b_{i-1}, 1, b_{i+1}, \dots, b_n)$

Example: $f_g = uf(f_1, f_2)$ with $f_1$ known to be a promoter. Without constraints, $uf$ can be synthesized as $\neg f_1 \wedge f_2$, which makes $f_1$ a repressor. Idea: adding constraints $uf(0, f_2) \implies uf(1, f_2)$

[Figure: $uf$ with inputs $f_1$ and $f_2$]
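The effect of the constraint can be checked by brute force for n = 2: enumerate all 16 Boolean functions of two inputs and keep those where raising f1 from 0 to 1 never turns the output off. This is an illustration by enumeration, not the SMT encoding. The names `all_functions` and `f1_is_activator` are hypothetical.

```python
from itertools import product
from typing import Dict, Iterator, Tuple

Inputs = Tuple[int, int]
inputs = list(product((0, 1), repeat=2))  # (f1, f2): 00, 01, 10, 11


def all_functions() -> Iterator[Dict[Inputs, int]]:
    """Every Boolean function of two inputs, as a truth table."""
    for outputs in product((0, 1), repeat=4):
        yield dict(zip(inputs, outputs))


def f1_is_activator(uf: Dict[Inputs, int]) -> bool:
    """uf(0, f2) => uf(1, f2) for both values of f2."""
    return all(uf[(0, b)] <= uf[(1, b)] for b in (0, 1))


allowed = [uf for uf in all_functions() if f1_is_activator(uf)]
repressor = {(f1, f2): int((not f1) and f2) for f1, f2 in inputs}  # ¬f1 ∧ f2

print(len(allowed))          # 9
print(repressor in allowed)  # False
```

Only the constraint on the first argument is applied here, matching the example where f1 alone is known to be a promoter. Applying it to every argument, as in the general formula, would restrict the candidates further.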

slide-19
SLIDE 19

Results

Software implementation of VE language and SMT-based synthesis methods with Z3 as solving engine

User-guided refinement

Original function:
hfn1 := AT-2 bra AND AT-2 eve

UF of synthesis templates:
f1:= {AT-,AFTER-}? IN ?? bra
f2:= {AT-,AFTER-}? IN ?? eve
hfn1 := uf(f1,f2)

f1:= AT-[0,5] bra
f2:= AT-[0,5] eve
hfn1 := uf(f1,f2)

Synthesized function:
hfn1 := AT-1 bra AND eve

slide-20
SLIDE 20

Results

  • Preserved inputs and their effect (activation/repression)
  • Most delays preserved (± 3 hpf, the resolution of observations)
  • Only a few temporary interactions synthesized as permanent (and vice versa)
  • Added spatial inputs (of the form $\mathrm{In}\ \bar{d}$ and $\mathrm{In}\ r\ \bar{d}$) are supported by literature data

Davidson et al., PNAS 109(41) 16434-16442, 2012

Synthesized model

slide-21
SLIDE 21

FULLY EXPLAINS EXPERIMENTAL DATA

slide-22
SLIDE 22

FULLY EXPLAINS EXPERIMENTAL DATA

  • Solved discrepancies
slide-23
SLIDE 23

FULLY EXPLAINS EXPERIMENTAL DATA

  • Solved discrepancies
  • No need for hard-coded terms
slide-24
SLIDE 24

Related work

  • Parameter synthesis for gene networks (Batt et al. Bioinformatics 23(18), 2007)
  • Synthesis from mutation experiments (Köksal et al. POPL 13)
  • Modular design of biological circuits (Bartocci et al. CMSB 13)

Z34Bio (http://research.microsoft.com/en-us/projects/z3-4biology/)

  • Analysis of DNA computing (Yordanov et al. DNA 2013)
  • Synthesis of minimal GRN for embryonic stem cells (Dunn et al. Science 344(6188), 2014)

slide-25
SLIDE 25

Conclusions

  • SMT-based method for synthesizing models of GRNs
  • Applied to state-of-the-art model of sea urchin development
  • 45 genes x 4 spatial domains x 2 spatial relations x 30 hpf
  • Wildtype expression + 3 perturbation experiments
  • Formal encoding of biological DSL
  • Synthesized model fully explains observations with no major changes
  • Effective (performance depends on size of search space for functions)

f1:= {AT-,AFTER-}[0,6] IN ?? bra
f2:= {AT-,AFTER-}[0,6] IN ?? eve
hfn1 := uf(f1,f2)

352800 possible functions!

slide-26
SLIDE 26

Conclusions

  • SMT-based method for synthesizing models of GRNs
  • Applied to state-of-the-art model of sea urchin development
  • 45 genes x 4 spatial domains x 2 spatial relations x 30 hpf
  • Wildtype expression + 3 perturbation experiments
  • Formal encoding of biological DSL
  • Synthesized model fully explains observations with no major changes
  • Effective (performance depends on size of search space for functions)

Improves understanding of biological systems and produces new hypotheses to validate in the wet lab

f1:= {AT-,AFTER-}[0,6] IN ?? bra
f2:= {AT-,AFTER-}[0,6] IN ?? eve
hfn1 := uf(f1,f2)

352800 possible functions!

slide-27
SLIDE 27

Thank you