Formal Executable Descriptions of Biological Systems

SLIDE 1

Formal Executable Descriptions of Biological Systems

Pierpaolo Degano

Dipartimento di Informatica, Università di Pisa, Italia
Joint work with a lot of nice people :-)

Pisa, 14th June 2007

NETTAB 2007 Pisa – p.1/44

SLIDE 2

From Syntax to Semantics

"To understand function, study structure" (F. Crick) seems to work no longer in modern biology:

STRUCTURE AND FUNCTION

The genome as a 4-letter language: syntax
⇓
what it expresses, and how: semantics

SLIDE 3

Systems Biology (a partial view)

Hypothesis-driven investigation in place of reductionism:
  • build a formal model of a biological system (generation of hypotheses)
  • experiment with it (tuning of hypotheses)
  • repeat until the model is validated and ready to use

Leads to a global view of a system, but often only offers snapshots of its behaviour.

Huge amount of data available: hard to handle, very hard to interpret.

SLIDE 4

Computer Science (similarities)

A computer system is
  • formally modelled (generation of hypotheses)
  • implemented, refined and eventually validated (experimenting on hypotheses)

Experiments require executing the model, to obtain its whole behaviour.

Analysis methods and tools exist ... and computational power keeps growing.

SLIDE 5

Long term goals

Understand the functionality of bio-components:
  • assessment of known facts
  • discovery of new functionalities

Investigate the underlying structure of complex biological systems:
  • how genome, proteome and metabolome interact, giving rise to emergent properties

SLIDE 8

Mathematical description of bio-phenomena

bio-physics: since Schrödinger, lots of differential equations, with deep statistical and stochastic models (monolithic, large, difficult to state, change, adapt and ... to solve for me :-)

bio-informatics:
  • structure: the (human) genome (DNA as a formal language over ACGT) and databases of genes, proteins, metabolic pathways, ...
  • function: Petri nets, process calculi, rewriting systems, ...

SLIDE 9

"cells as computational devices"

SLIDE 15

Bio-systems

Metabolic and gene regulation networks, signalling pathways, etc. are made of millions of components
  • acting independently,
  • interacting with each other,
  • dispersed in solutions

Interaction
  • is essentially binary
  • occurs on selected sites (if any) between close enough, affine, non-separated components
  • is local, but affects the whole system globally

Just as concurrent, distributed, mobile processes

SLIDE 16

Processes

Concurrent, distributed, mobile processes are made of several components
  • acting independently,
  • interacting with each other,
  • distributed geographically

Interaction
  • is mainly binary
  • occurs on selected channels between components
  • is local, but affects the whole system globally

SLIDE 17

Process calculi: primitives

Few basic primitives for
  • sending !a(v) and receiving ?a(v) the value v, if any, on channel a
    (channels mimic interaction points, values the exchanged information)
  • performing non-detailed activities τ
    (abstracting from, e.g., biochemical details)
  • creating/handling channels

composed with few operators ...

SLIDE 18

Process calculi: composition

Among the few operators there are:
  • parallel composition P | Q
    cells as processes, that may interact or proceed independently
  • choice P + Q
    according to a probabilistic distribution (more to come)

SLIDE 19

Process calculi: semantics

How do systems evolve?

Semantics is given through a logically based inference system, defining transitions: how a configuration changes into another.

Communication, i.e. interaction, is the basic computational step.

SLIDE 20

Process calculi: Semantics

Essentially, communication and asynchrony are ruled by:

  • ?a(x).P | !a(v).Q → P[x ↦ v] | Q
    the activity is local

  • IF P → P′ THEN P | Q → P′ | Q
    its effect is global (more to come)
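The two rules can be read operationally. The following is a minimal Python sketch, not the calculus or tool used in the talk: the class names `Nil`, `Send`, `Recv` and the functions `substitute` and `step` are hypothetical, a system is simplified to a flat tuple of parallel processes, and capture-avoiding renaming is deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class Nil:
    """The inert process."""


@dataclass(frozen=True)
class Send:
    """!channel(value).cont"""
    channel: str
    value: str
    cont: "Proc"


@dataclass(frozen=True)
class Recv:
    """?channel(var).cont"""
    channel: str
    var: str
    cont: "Proc"


Proc = Union[Nil, Send, Recv]


def substitute(p: Proc, var: str, value: str) -> Proc:
    """Compute p[var -> value]; assumes no name capture can occur."""
    if isinstance(p, Nil):
        return p

    def rename(name: str) -> str:
        return value if name == var else name

    if isinstance(p, Send):
        return Send(rename(p.channel), rename(p.value),
                    substitute(p.cont, var, value))
    if p.var == var:
        # The inner input rebinds var, so its continuation is left untouched.
        return Recv(rename(p.channel), p.var, p.cont)
    return Recv(rename(p.channel), p.var, substitute(p.cont, var, value))


def step(soup: Tuple[Proc, ...]) -> Optional[Tuple[Proc, ...]]:
    """Fire the first matching receive/send pair; None if nothing can react."""
    for i, receiver in enumerate(soup):
        if not isinstance(receiver, Recv):
            continue
        for j, sender in enumerate(soup):
            if isinstance(sender, Send) and sender.channel == receiver.channel:
                # Communication rule: only the two partners change (local) ...
                new = list(soup)
                new[i] = substitute(receiver.cont, receiver.var, sender.value)
                new[j] = sender.cont
                # ... parallel rule: every other component is carried along.
                return tuple(new)
    return None
```

For instance, `step((Recv("a", "x", Send("x", "w", Nil())), Send("a", "b", Nil())))` lets the receiver learn the name `b` and continue as `Send("b", "w", Nil())`, while the sender becomes `Nil()`.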

SLIDE 21

Quantitative information

... otherwise "stamp collection" (Rutherford)

  • interactions occur at given rates: channels possess rates
  • (often) interactions are reversible (possibly with different rates)
  • the context affects the overall rates: not only temperature, pressure, etc., but also concentration, here the quantities of reactants per unit (typically, Gillespie’s Stochastic Simulation Algorithm)
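How the quantities of reactants enter the overall rates can be stated compactly. The following is the standard formulation of Gillespie's direct method; the symbols c_j, h_j, a_j and a_0 are notation introduced here for illustration and do not appear in the slides.

```latex
% c_j : basal rate of interaction j (the rate attached to its channel or \tau)
% h_j : number of distinct reactant combinations currently available for j,
%       e.g. (#senders) x (#receivers) for a binary interaction between two
%       different species, or the number of copies for a unary \tau step
\[
  a_j = c_j \, h_j , \qquad
  a_0 = \sum_j a_j , \qquad
  \Pr(\text{next interaction is } j) = \frac{a_j}{a_0} , \qquad
  \Delta t \sim \mathrm{Exp}(a_0)
\]
```

So the same channel fires faster when more partners are present, and the waiting time before the next interaction shrinks as the total propensity a_0 grows.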

SLIDE 22

Summing up

  • molecules, metabolites, compounds, cells as processes
  • (biochemical) interactions as communications
  • affinity of interaction as communication capabilities
  • (other features, like membranes, geometry, time, ... often treated ad hoc or still under investigation)

Process calculi specify and execute bio-systems

SLIDE 23

What do we gain?

  • run the model, and obtain virtual experiments: an integral abstract description of system behaviour; unexpected, global properties may emerge
  • formally analyse the executions, collecting e.g. statistical data on behaviour, or causality among interactions, or similarities/differences between systems, ...
  • compositionality: specify new components in isolation (e.g. active principles), put them alongside the others with no other change and see (cf. ODE)

SLIDE 24

A simple example

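Consider the enzyme-catalysed production of a product P from the substrate S:

\[ E + S \underset{K_{ES}^{-1}}{\overset{K_{ES}}{\rightleftharpoons}} ES \qquad\qquad ES \overset{K_P}{\rightharpoonup} E + P \]

The corresponding processes are

\[ \begin{array}{ll}
E = \; !a & \text{where } rate(a) = K_{ES} \\
S = \; ?a.ES & \text{where } rate(\tau_1) = K_{ES}^{-1} \\
ES = \tau_1.(E \mid P) + \tau_{-1}.(E \mid S) & \text{where } rate(\tau_{-1}) = K_P
\end{array} \]

A computation is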

SLIDE 25

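\[ \begin{array}{ll}
E = \; !a & \text{where } rate(a) = K_{ES} \\
S = \; ?a.ES & \text{where } rate(\tau_1) = K_{ES}^{-1} \\
ES = \tau_1.(E \mid P) + \tau_{-1}.(E \mid S) & \text{where } rate(\tau_{-1}) = K_P
\end{array} \]

\[ \begin{array}{ll}
 & n \cdot E \mid m \cdot S \\
\xrightarrow{\; r_0 \;} & (n-1) \cdot E \mid (m-1) \cdot S \mid ES \\
\xrightarrow{\; r'_0 \;} & (n-2) \cdot E \mid (m-2) \cdot S \mid 2 \cdot ES \\
\xrightarrow{\; r_1 \;} & (n-1) \cdot E \mid (m-2) \cdot S \mid ES \mid P \\
\xrightarrow{\; r''_0 \;} & (n-2) \cdot E \mid (m-3) \cdot S \mid 2 \cdot ES \mid P \\
\xrightarrow{\; r \;} & \dots
\end{array} \]

where the actual rates r_0, r'_0, ... are typically computed with Gillespie’s SSA and depend on the rates of channels and on the number of reactants.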
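A computation like the one above can be reproduced with a few lines of code. This is a minimal sketch of Gillespie's direct method applied to the reaction scheme E + S ⇌ ES ⇀ E + P, written at the level of reactions rather than of processes; it is not the interpreter used by the authors. The function name `simulate_enzyme` and the parameters `k_bind`, `k_unbind` and `k_cat` are hypothetical names standing for the forward, backward and product-formation rates of the scheme.

```python
import random


def simulate_enzyme(n_e, n_s, k_bind, k_unbind, k_cat, t_end, seed=0):
    """Return a list of (time, state) pairs for one stochastic run."""
    rng = random.Random(seed)
    state = {"E": n_e, "S": n_s, "ES": 0, "P": 0}
    # Each reaction: (propensity as a function of the state, state change).
    reactions = [
        (lambda s: k_bind * s["E"] * s["S"], {"E": -1, "S": -1, "ES": +1}),
        (lambda s: k_unbind * s["ES"], {"E": +1, "S": +1, "ES": -1}),
        (lambda s: k_cat * s["ES"], {"E": +1, "P": +1, "ES": -1}),
    ]
    t = 0.0
    trace = [(t, dict(state))]
    while True:
        # Actual rates depend on the basal rates and on the reactant counts.
        propensities = [rate(state) for rate, _ in reactions]
        total = sum(propensities)
        if total == 0:
            break  # nothing can react any more
        dt = rng.expovariate(total)  # exponential waiting time
        if t + dt > t_end:
            break
        t += dt
        # Pick the next reaction with probability propensity / total.
        _, change = rng.choices(reactions, weights=propensities)[0]
        for species, delta in change.items():
            state[species] += delta
        trace.append((t, dict(state)))
    return trace
```

Calling `simulate_enzyme(5, 20, 1.0, 0.5, 1.0, t_end=10.0)` yields one possible run; different seeds give different runs, which is why results are usually collected over many computations.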

SLIDE 26

Other approaches

  • Petri nets
  • formal languages (P systems, ...)
  • rewriting systems (κ-calculus, calculus of looping sequences, ...)
  • logically based formalisms (Pathway logic, ...)
  • ...

SLIDE 27

Our own work

A brief report on two ongoing investigations:

  • VIrtual CEll (VICE): artificial ur-cell, from a simplified prokaryote, with a variant of the π-calculus
  • E. coli: the whole metabolic pathways, with knock-outs, with a very fast (subset of the) π-calculus

Towards a holistic model of a whole cell: all interactions among metabolic pathways (properties emerge), the whole movie, not only snapshots

SLIDE 28

Building up VICE: the genome

Problems:
  • not an arbitrary list of genes
  • small enough for the sake of computability

Our choice: the "Minimal Gene Set" from Haemophilus influenzae, Mycoplasma genitalium
  • cf. Glass et al. – gene KO in vitro

SLIDE 29

Building up VICE: hypothesis

Reduction and update of the Minimal Gene Set, based on a functional analysis:
  • selection of basic activities (eating, production of energy, synthesis of basic structural components, reproduction)
  • choice of the 187 genes involved
  • design of the metabolic pathways needed (presently only for survival)

SLIDE 30

VICE: Validation

Check on biological consistency:
  • all the pathways selected have been taken: sufficient
  • no genes are left inactive: necessary

Comparison with real results:
  • confirms the basic modelling choice
  • calls for deeper analysis and more features

SLIDE 31

Activities

Group pathways (and reactions) in the standard biochemical manner:

  • Oxidations: extraction of energy from nutrients: Glycolysis → Pyruvate → ...
  • Lipid metabolism: synthesis of structural components from monomers: fatty acids ...
  • Nucleotide metabolism: building DNA/RNA bases, no de novo synthesis
  • DNA/RNA synthesis: RNA for building proteins, DNA for reproduction – not yet available
  • Protein synthesis: no amino acids
  • Uptake: glycerol, amino acids, nitrogenous bases, fatty acids ...

... plus a few other pathways.

SLIDE 32

Virtual experiments

Through runs of the π-specification of VICE
  • in presence of different quantities of food (VICE in parallel with different numbers of glucose processes: naïve)
  • for different periods of time (computations of different length)

Under the following assumptions on the environment:
  • enough nutrients (water, sugar, phosphates, amino acids, nitrogenous bases ...)
  • no toxic substances
  • no competing organisms (a single VICE)
  • right temperature, pressure, ...

SLIDE 33

Results

Data are collected from 10^3 computations, made of 10^4 transitions, involving 10^6 different processes (∼ 12 hours each)

Throughput:
  • production of energy and metabolites, through oxidation of glucose, shows homeostasis
  • biomass produced as expected

Distribution of metabolites over the Glycolysis pathway:
  • like in real prokaryotes (in their steady state)

The distributions agree with those computed in vitro.

SLIDE 34

Steady state

pyruvate, diacylglycerol, phosphoribosylpyrophosphate

SLIDE 35

Usage of enzymes

   1  mg111      5  mg300      9  compl. pyr. dehydrogenase
   2  mg215      6  mg430     10  mg299
   3  mg023      7  mg407     11  mg357
   4  mg031      8  mg216

SLIDE 36

Something emerges

Add the specification of a regulatory feedback circuit on the enzyme phosphofructokinase (the more ADP, the faster the phosphorylation of fructose-6-phosphate).

Then look at the time course of fructose-6-phosphate and fructose-1,6-bisphosphate.

Change the feeding regimen by supplying the sugar:
  • all at the beginning, a huge quantity: no oscillations
  • at a constant rate: oscillations show up!!

SLIDE 37

Oscillations

SLIDE 38

the real ones ...

SLIDE 39

Other case studies ...

Specify and run the metabolome of Escherichia coli

Because of efficiency problems, a new implementation:
  • a subset of CCS (fast also with name passing)
  • essentially multiplication of stoichiometric matrices
  • more than two orders of magnitude faster than the previous one (10^8 transitions involving 10^7 processes in less than 8 hours, done while sleeping ...)
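To illustrate what "multiplication of stoichiometric matrices" can mean for state updates, here is a minimal sketch under stated assumptions. It is not the authors' implementation, whose details are not given in the slides. It reuses the small enzyme example (species E, S, ES, P) instead of the E. coli metabolome, and the names `STOICHIOMETRY` and `apply_firings` are hypothetical.

```python
# Rows are species, columns are reactions: bind, unbind, produce.
SPECIES = ["E", "S", "ES", "P"]
REACTIONS = ["bind", "unbind", "produce"]
STOICHIOMETRY = [
    [-1, +1, +1],  # E
    [-1, +1, 0],   # S
    [+1, -1, -1],  # ES
    [0, 0, +1],    # P
]


def apply_firings(counts, stoichiometry, firings):
    """Return counts + stoichiometry * firings (a matrix-vector product).

    counts[i]  : current number of copies of species i
    firings[j] : how many times reaction j has fired
    """
    return [
        x + sum(n * v for n, v in zip(row, firings))
        for x, row in zip(counts, stoichiometry)
    ]
```

For example, `apply_firings([5, 20, 0, 0], STOICHIOMETRY, [3, 1, 2])` gives `[5, 18, 0, 2]`: three bindings, one unbinding and two product releases leave all five enzymes free, consume two substrates net and yield two products. Representing a run as a vector of firing counts, rather than rewriting terms one by one, is one plausible reason such an approach is fast.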

SLIDE 40

E. coli

The virtual behaviour “matches” the real one

Knock out some genes:
  • agrees on known KO (ppc, pgi, zwf)
  • a new KO (rpe): no data in the literature

SLIDE 41

Neurons

A first step to studying plasticity and memory
  • pre-synaptic mechanisms of neuro-transmitter release
  • executable model (in SPiM)
  • results agree with other deterministic, non-executable models

More, and news, in a few minutes during Andrea’s talk

SLIDE 42

Conclusions

  • Cells as processes ⇒ "virtual" living matter
  • Formal, mathematical theory ⇒ mechanical analysis tools
      constructive and executable
      compositional, with different abstraction levels
  • Quantities crucial for behavioural descriptions
  • New computational models (e.g. new interaction mechanisms) ⇒ new semantics
  • "Virtual" experiments as computations ⇐ not enough!!

SLIDE 43

To Do

Far from satisfactory languages! New challenges:
  • membranes, compartments and the like
  • geometrical issues
  • more faithful (and efficient) bio-chemistry
  • causality
  • usability (graphical interfaces, fast interpreters, specification generators from databases, ...)
  • new analysis techniques (static vs dynamic) and tools

Towards ...

SLIDE 44

Bio-calculus environment

Towards uniform (families of) environments
  • sharing formal grounds and tools
  • providing the user with mechanisms for describing systems at different levels of abstraction

More fundamental research and more case studies
