Different kinds of robustness in genetic and metabolic networks

Peter Schuster, Institut für Theoretische Chemie und Molekulare Strukturbiologie der Universität Wien. Seminar lecture, Linz, 15.12.2003


slide-1
SLIDE 1
slide-2
SLIDE 2

Different kinds of robustness in genetic and metabolic networks

Peter Schuster Institut für Theoretische Chemie und Molekulare Strukturbiologie der Universität Wien Seminar lecture Linz, 15.12.2003

slide-3
SLIDE 3
slide-4
SLIDE 4

Mathematics in 21st Century's Life Sciences

Genomics and proteomics Large scale data processing, sequence comparison ...

Developmental biology

Gene regulation networks, signal propagation, pattern formation, robustness ...

Cell biology

Regulation of cell cycle, metabolic networks, reaction kinetics, homeostasis, ...

Neurobiology

Neural networks, collective properties, nonlinear dynamics, signalling, ...

Evolutionary biology

Optimization through variation and selection, relation between genotype, phenotype, and function, ...

slide-5
SLIDE 5

Genomics and proteomics Large scale data processing, sequence comparison ...

  • E. coli: length of the genome 4×10^6 nucleotides; number of cell types 1; number of genes 4 000
  • Man: length of the genome 3×10^9 nucleotides; number of cell types 200; number of genes 30 000 - 100 000

slide-6
SLIDE 6

Fully sequenced genomes

  • Organisms: 751 projects (A = Archaea, B = Bacteria, E = Eukarya)

153 complete (16 A, 118 B, 19 E)

(Eukarya examples: mosquito (pest, malaria), sea squirt, mouse, yeast, homo sapiens, arabidopsis, fly, worm, …)

598 ongoing (23 A, 332 B, 243 E)

(Eukarya examples: chimpanzee, turkey, chicken, ape, corn, potato, rice, banana, tomato, cotton, coffee, soybean, pig, rat, cat, sheep, horse, kangaroo, dog, cow, bee, salmon, fugu, frog, …)

  • Other structures with genetic information: 68 phages, 1328 viruses, 35 viroids, 472 organelles (423 mitochondria, 32 plastids, 14 plasmids, 3 nucleomorphs)

Source: NCBI. Source: Integrated Genomics, Inc., August 12th, 2003

slide-7
SLIDE 7

Wolfgang Wieser. Die Erfindung der Individualität oder die zwei Gesichter der Evolution. Spektrum Akademischer Verlag, Heidelberg 1998. A.C.Wilson. The Molecular Basis of Evolution. Scientific American, Oct.1985, 164-173.

slide-8
SLIDE 8

[Diagram: a cell takes up food and releases waste. Metabolism supplies nucleotides, amino acids, lipids, carbohydrates, and small molecules. Replication: DNA → 2 DNA. Transcription: DNA → RNA (mRNA). Translation: mRNA → protein at the ribosome, according to the genetic code.]

The gene is a stretch of DNA which after transcription gives rise to a mRNA

slide-9
SLIDE 9

Gene expression: DNA microarray representing 8613 human genes, used to study transcription in the response of human fibroblasts to serum. The same section of the microarray is shown in three independent hybridizations. Marked spots refer to: (1) protein disulfide isomerase related protein P5, (2) IL-8 precursor, (3) EST AA057170, and (4) vascular endothelial growth factor.

V.R.Iyer et al., Science 283: 83-87, 1999

slide-10
SLIDE 10

[Diagram: genomic DNA → mRNA (with AAA tail). Elimination of introns through splicing.]

The gene is a stretch of DNA which after transcription and processing gives rise to a mRNA

slide-11
SLIDE 11

Sex determination in Drosophila through alternative splicing

The process of protein synthesis and its regulation is now understood, but the notion of the gene as a stretch of DNA has become obscure. The gene is essentially associated with the sequence of unmodified amino acids in a protein, and it is determined by the nucleotide sequence as well as by the dynamics of the process eventually leading to the mRNA that is translated.

slide-12
SLIDE 12

Number of genes in the human genome

The number of genes in the human genome is still only a very rough estimate

slide-13
SLIDE 13

Developmental biology

Gene regulation networks, signal propagation, pattern formation, robustness ...

Three-dimensional structure of the complex between the regulatory protein cro-repressor and the binding site on λ phage B-DNA
slide-14
SLIDE 14

Development of the fruit fly Drosophila melanogaster: genetics, experiment, and imago

slide-15
SLIDE 15

[Diagram panels: Linear chain; Network]

Processing of information in cascades and networks

slide-16
SLIDE 16

Albert-László Barabási, Linked – The New Science of Networks. Perseus Publ., Cambridge, MA, 2002

slide-17
SLIDE 17

[Diagram panels: Distributed network; Small world network]

Albert-László Barabási, Linked – The New Science of Networks. Perseus Publ., Cambridge, MA, 2002

slide-18
SLIDE 18

Albert-László Barabási, Linked – The New Science of Networks Perseus Publ., Cambridge, MA, 2002

slide-19
SLIDE 19
  • Formation of a scale-free network through evolutionary point by point expansion: Step 000
slide-20
SLIDE 20
  • Formation of a scale-free network through evolutionary point by point expansion: Step 001
slide-21
SLIDE 21
  • Formation of a scale-free network through evolutionary point by point expansion: Step 002
slide-22
SLIDE 22
  • Formation of a scale-free network through evolutionary point by point expansion: Step 003
slide-23
SLIDE 23
  • Formation of a scale-free network through evolutionary point by point expansion: Step 004
slide-24
SLIDE 24
  • Formation of a scale-free network through evolutionary point by point expansion: Step 005
slide-25
SLIDE 25
  • Formation of a scale-free network through evolutionary point by point expansion: Step 006
slide-26
SLIDE 26
  • Formation of a scale-free network through evolutionary point by point expansion: Step 007
slide-27
SLIDE 27
  • Formation of a scale-free network through evolutionary point by point expansion: Step 008
slide-28
SLIDE 28
  • Formation of a scale-free network through evolutionary point by point expansion: Step 009
slide-29
SLIDE 29
  • Formation of a scale-free network through evolutionary point by point expansion: Step 010
slide-30
SLIDE 30
  • Formation of a scale-free network through evolutionary point by point expansion: Step 011
slide-31
SLIDE 31
  • Formation of a scale-free network through evolutionary point by point expansion: Step 012
slide-32
SLIDE 32
  • Formation of a scale-free network through evolutionary point by point expansion: Step 024
slide-33
SLIDE 33
[Figure: the evolved network, with every node labelled by its number of links.]

Number of links | Number of nodes
2 | 14
3 | 6
5 | 2
10 | 1
12 | 1
14 | 1

Analysis of nodes and links in a step by step evolved network
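The point by point expansion shown in the preceding slides can be sketched in a few lines. The slides do not state the attachment rule, so this sketch assumes preferential attachment in the style of the Barabási-Albert model: each new node links to `m` existing nodes chosen with probability proportional to their current number of links. The function names, the seed network of two linked nodes, and `m = 2` are assumptions made for illustration.

```python
import random
from collections import Counter


def grow_network(n_nodes, m=2, seed=0):
    """Grow a network node by node with preferential attachment (assumed rule)."""
    rng = random.Random(seed)
    edges = [(0, 1)]   # assumed seed network: two linked nodes
    stubs = [0, 1]     # every node appears once per link end
    for new in range(2, n_nodes):
        targets = set()
        # Drawing from `stubs` picks a node with probability ~ its degree.
        while len(targets) < min(m, new):
            targets.add(rng.choice(stubs))
        for t in sorted(targets):
            edges.append((new, t))
            stubs.extend((new, t))
    return edges


def degree_histogram(edges):
    """Map 'number of links' to 'number of nodes with that many links'."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return Counter(degree.values())


if __name__ == "__main__":
    hist = degree_histogram(grow_network(25))
    for links in sorted(hist):
        print(links, hist[links])
```

Runs of this sketch typically give many nodes with two links and a few highly connected hubs, which is the qualitative pattern of the table above; the exact counts depend on the random seed.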

slide-34
SLIDE 34

Structures in Directed Networks

Albert-László Barabási, Linked – The New Science of Networks. Perseus Publ., Cambridge, MA, 2002

slide-35
SLIDE 35

Cell biology

Regulation of cell cycle, metabolic networks, reaction kinetics, homeostasis, ...

The bacterial cell as an example for the simplest form of autonomous life

The human body: 10^14 cells = 10^13 eukaryotic cells + 9×10^13 bacterial (prokaryotic) cells, and 200 eukaryotic cell types

slide-36
SLIDE 36

Biochemical Pathways

[Chart with grid coordinates A to L and 1 to 10.]

The reaction network of cellular metabolism published by Boehringer-Ingelheim.

slide-37
SLIDE 37

The citric acid or Krebs cycle (enlarged from previous slide).

slide-38
SLIDE 38

The forward problem of chemical reaction kinetics

Parameter set: $k_j(T, p, \mathrm{pH}, I, \ldots; x_1, x_2, \ldots, x_n); \quad j = 1, 2, \ldots, m$

General conditions: $T$, $p$, pH, $I$, ...

Initial conditions: $x_i(0); \quad i = 1, 2, \ldots, n$

Boundary conditions (boundary $s$, normal unit vector $\hat{u}$):

  • Dirichlet: $x_i^s = f(\vec{r}, t); \quad i = 1, 2, \ldots, n$
  • Neumann: $\partial x_i^s / \partial u = \hat{u} \cdot \nabla x_i^s = f(\vec{r}, t); \quad i = 1, 2, \ldots, n$

Kinetic differential equations:

$$\frac{dx_i}{dt} = f_i(x_1, x_2, \ldots, x_n; k_1, k_2, \ldots, k_m); \quad i = 1, 2, \ldots, n$$

Reaction diffusion equations:

$$\frac{\partial x_i}{\partial t} = D_i \nabla^2 x_i + f_i(x_1, x_2, \ldots, x_n; k_1, k_2, \ldots, k_m); \quad i = 1, 2, \ldots, n$$

Solution curves: concentrations $x_i(t)$, $i = 1, 2, \ldots, n$, as functions of time $t$
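As a minimal sketch of the forward problem, the code below takes a parameter set and initial conditions and computes solution curves by numerical integration. The mechanism A → B → C, the rate constants, and all function names are hypothetical choices for illustration; a classical fourth-order Runge-Kutta step stands in for whatever integrator one would use in practice.

```python
import math


def rhs(x, k):
    """Kinetic equations dx_i/dt = f_i(x; k) for the assumed chain A -> B -> C."""
    a, b, c = x
    k1, k2 = k
    return (-k1 * a, k1 * a - k2 * b, k2 * b)


def rk4_step(f, x, k, dt):
    """One classical fourth-order Runge-Kutta step."""
    s1 = f(x, k)
    s2 = f(tuple(xi + 0.5 * dt * si for xi, si in zip(x, s1)), k)
    s3 = f(tuple(xi + 0.5 * dt * si for xi, si in zip(x, s2)), k)
    s4 = f(tuple(xi + dt * si for xi, si in zip(x, s3)), k)
    return tuple(
        xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for xi, a, b, c, d in zip(x, s1, s2, s3, s4)
    )


def integrate(f, x0, k, t_end, n_steps):
    """Return the solution curve as a list of concentration tuples."""
    x = tuple(x0)
    dt = t_end / n_steps
    curve = [x]
    for _ in range(n_steps):
        x = rk4_step(f, x, k, dt)
        curve.append(x)
    return curve


if __name__ == "__main__":
    curve = integrate(rhs, (1.0, 0.0, 0.0), (1.0, 0.5), t_end=5.0, n_steps=500)
    print(curve[-1], math.exp(-5.0))  # [A](5) should be close to exp(-k1 * 5)
```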
slide-39
SLIDE 39

The inverse problem of chemical reaction kinetics

Data from measurements: concentrations $x_i(t_k); \quad i = 1, 2, \ldots, n; \quad k = 1, 2, \ldots, N$, at times $t_k$

General conditions: $T$, $p$, pH, $I$, ...

Initial conditions: $x_i(0); \quad i = 1, 2, \ldots, n$

Boundary conditions (boundary $s$, normal unit vector $\hat{u}$):

  • Dirichlet: $x_i^s = f(\vec{r}, t); \quad i = 1, 2, \ldots, n$
  • Neumann: $\partial x_i^s / \partial u = \hat{u} \cdot \nabla x_i^s = f(\vec{r}, t); \quad i = 1, 2, \ldots, n$

Kinetic differential equations:

$$\frac{dx_i}{dt} = f_i(x_1, x_2, \ldots, x_n; k_1, k_2, \ldots, k_m); \quad i = 1, 2, \ldots, n$$

Reaction diffusion equations:

$$\frac{\partial x_i}{\partial t} = D_i \nabla^2 x_i + f_i(x_1, x_2, \ldots, x_n; k_1, k_2, \ldots, k_m); \quad i = 1, 2, \ldots, n$$

Parameter set (to be determined): $k_j(T, p, \mathrm{pH}, I, \ldots; x_1, x_2, \ldots, x_n); \quad j = 1, 2, \ldots, m$
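The inverse problem runs the other way: measured concentrations are given and the parameters are sought. A minimal sketch, assuming the simplest possible case of a single first-order decay A → B with $x(t) = x(0) \exp(-k t)$: taking logarithms turns the fit into a linear least-squares problem through the origin. The data below are synthetic (generated from an assumed k = 0.7), and the function names are hypothetical.

```python
import math


def simulate(k, times, x0=1.0):
    """Forward model: first-order decay x(t) = x0 * exp(-k t)."""
    return [x0 * math.exp(-k * t) for t in times]


def estimate_rate_constant(times, data, x0=1.0):
    """Least-squares estimate of k from ln(x / x0) = -k t (line through the origin)."""
    numerator = sum(t * math.log(x / x0) for t, x in zip(times, data))
    denominator = sum(t * t for t in times)
    return -numerator / denominator


if __name__ == "__main__":
    times = [0.5, 1.0, 1.5, 2.0]
    data = simulate(0.7, times)  # synthetic stand-in for measurements
    print(estimate_rate_constant(times, data))
```

Real networks with many coupled parameters need nonlinear fitting, and the estimate can be poorly determined by the data; the one-parameter case only shows the direction of the problem.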

slide-40
SLIDE 40

Neurobiology

Neural networks, collective properties, nonlinear dynamics, signalling, ...

A single neuron signaling to a muscle fiber

slide-41
SLIDE 41

The human brain: 10^11 neurons connected by 10^13 to 10^14 synapses

slide-42
SLIDE 42

Evolutionary biology

Optimization through variation and selection, relation between genotype, phenotype, and function, ...

Organism | Generation time | 10 000 generations | 10^6 generations | 10^7 generations
RNA molecules | 10 sec | 27.8 h = 1.16 d | 115.7 d | 3.17 a
RNA molecules | 1 min | 6.94 d | 1.90 a | 19.01 a
Bacteria | 20 min | 138.9 d | 38.03 a | 380 a
Bacteria | 10 h | 11.40 a | 1 140 a | 11 408 a
Higher multicellular organisms | 10 d | 274 a | 27 380 a | 273 800 a
Higher multicellular organisms | 20 a | 200 000 a | 2 × 10^7 a | 2 × 10^8 a

Time scales of evolutionary change
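The entries of such a time-scale table are plain products of generation time and number of generations. The sketch below reproduces a few of them; the helper name, the unit table, and the choice of 365.25 days per year are assumptions.

```python
# Seconds per unit; 'a' (year) is taken as 365.25 days, which is an assumption.
SECONDS = {
    "sec": 1.0,
    "min": 60.0,
    "h": 3600.0,
    "d": 86400.0,
    "a": 365.25 * 86400.0,
}


def elapsed(generation_time, unit, generations, out_unit):
    """Time needed for a number of generations, expressed in out_unit."""
    return generation_time * SECONDS[unit] * generations / SECONDS[out_unit]


if __name__ == "__main__":
    print(round(elapsed(10, "sec", 10_000, "h"), 1))   # RNA molecules, 10 000 generations
    print(round(elapsed(20, "min", 1e6, "a"), 2))      # bacteria, 10^6 generations
    print(round(elapsed(20, "a", 10_000, "a")))        # higher organisms, 10 000 generations
```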

slide-43
SLIDE 43

Definition of RNA structure

[Figure: chemical formula of an RNA strand running from the 5'-end to the 3'-end, with nucleobases N_k = A, U, G, C and Na counterions, next to a secondary structure diagram with sequence positions 10 to 70 marked.]

slide-44
SLIDE 44

The three-dimensional structure of a short double helical stack of B-DNA

James D. Watson, 1928- , and Francis Crick, 1916- , Nobel Prize 1962

1953 – 2003 fifty years double helix

slide-45
SLIDE 45

Sequence (5'-end to 3'-end, positions 10 to 70 marked):

GCGGAUUUAGCUCAGDDGGGAGAGCMCCAGACUGAAYAUCUGGAGMUCCUGUGTPCGAUCCACAGAAUUCGCACCA

[Figure: the sequence and its secondary structure, each drawn from the 5'-end to the 3'-end.]

slide-46
SLIDE 46

[Diagram: a plus strand is the template for synthesis of the complementary minus strand; the complex dissociates, and the minus strand is in turn the template for synthesis of a new plus strand.]

Complementary replication as the simplest copying mechanism of RNA. Complementarity is determined by Watson-Crick base pairs: G≡C and A=U

slide-47
SLIDE 47

Reactions: (A) + I_i → 2 I_i with rate constant f_i; i = 1, 2, ..., n

$$\frac{dx_i}{dt} = f_i x_i - x_i \Phi = x_i (f_i - \Phi); \quad \Phi = \sum_j f_j x_j; \quad \sum_j x_j = 1; \quad i, j = 1, 2, \ldots, n$$

$[I_i] = x_i \geq 0$; $i = 1, 2, \ldots, n$; $[A] = a = \text{constant}$

$f_m = \max\{ f_j;\ j = 1, 2, \ldots, n \}$: $x_m(t) \to 1$ for $t \to \infty$

Reproduction of organisms or replication of molecules as the basis of selection

slide-48
SLIDE 48

Selection equation: $[I_i] = x_i \geq 0$, $f_i > 0$

$$\frac{dx_i}{dt} = x_i (f_i - \phi), \quad i = 1, 2, \ldots, n; \quad \sum_{i=1}^{n} x_i = 1; \quad \phi = \sum_{j=1}^{n} f_j x_j = \bar{f}$$

Mean fitness or dilution flux, $\phi(t)$, is a non-decreasing function of time:

$$\frac{d\phi}{dt} = \sum_{i=1}^{n} f_i \frac{dx_i}{dt} = \overline{f^2} - (\bar{f})^2 = \mathrm{var}\{f\} \geq 0$$

Solutions are obtained by integrating factor transformation:

$$x_i(t) = \frac{x_i(0) \exp(f_i t)}{\sum_{j=1}^{n} x_j(0) \exp(f_j t)}; \quad i = 1, 2, \ldots, n$$
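The closed-form solution of the selection equation, $x_i(t) = x_i(0) \exp(f_i t) / \sum_j x_j(0) \exp(f_j t)$, is easy to evaluate directly. The sketch below does so and checks numerically that the mean fitness $\phi = \sum_j f_j x_j$ never decreases. The three fitness values and initial concentrations are made-up illustration values, and the function names are hypothetical.

```python
import math


def selection_solution(x0, f, t):
    """Relative concentrations x_i(t) from the closed-form solution."""
    weights = [xi * math.exp(fi * t) for xi, fi in zip(x0, f)]
    total = sum(weights)
    return [w / total for w in weights]


def mean_fitness(x, f):
    """Mean fitness (dilution flux) phi = sum_j f_j x_j."""
    return sum(fi * xi for fi, xi in zip(f, x))


if __name__ == "__main__":
    x0 = [0.5, 0.3, 0.2]   # illustrative initial concentrations
    f = [1.0, 1.5, 2.0]    # illustrative rate constants
    for t in (0, 5, 10, 20):
        x = selection_solution(x0, f, t)
        print(t, [round(v, 4) for v in x], round(mean_fitness(x, f), 4))
```

The variant with the largest rate constant takes over, and $\phi$ approaches that largest value from below.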

slide-49
SLIDE 49

s = ( f2-f1) / f1; f2 > f1 ; x1(0) = 1 - 1/N ; x2(0) = 1/N

[Plot: fraction of advantageous variant (0 to 1) against time in generations (0 to 1000), with curves for s = 0.1, s = 0.02, and s = 0.01.]

Selection of advantageous mutants in populations of N = 10 000 individuals
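For two variants the selection equation has the closed-form solution $x_2(t) = x_2(0) e^{f_2 t} / (x_1(0) e^{f_1 t} + x_2(0) e^{f_2 t})$, so curves of this kind can be sketched directly. The code assumes that time is measured in units where $f_1 = 1$ per generation, which the slide does not state; the function names are hypothetical.

```python
import math


def advantageous_fraction(t, s, n_individuals=10_000, f1=1.0):
    """Fraction x2(t) of the advantageous variant, starting from x2(0) = 1/N."""
    f2 = f1 * (1.0 + s)                      # from s = (f2 - f1) / f1
    x1_0 = 1.0 - 1.0 / n_individuals
    x2_0 = 1.0 / n_individuals
    # Numerator and denominator are divided by exp(f1 * t) to avoid overflow.
    growth = x2_0 * math.exp((f2 - f1) * t)
    return growth / (x1_0 + growth)


def half_time(s, n_individuals=10_000, f1=1.0):
    """Time at which the advantageous variant reaches a fraction of 1/2."""
    return math.log(n_individuals - 1) / (s * f1)


if __name__ == "__main__":
    for s in (0.1, 0.02, 0.01):
        print(s, round(half_time(s), 1))
```

Under this assumption the half time scales as 1/s, so a tenfold smaller advantage takes ten times as long to spread.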

slide-50
SLIDE 50

[Diagram: copying of a plus strand template into a minus strand with three kinds of replication errors: point mutation, insertion, and deletion.]

Mutations in nucleic acids represent the mechanism of variation of genotypes.

slide-51
SLIDE 51

Theory of molecular evolution

M.Eigen, Self-organization of matter and the evolution of biological macromolecules. Naturwissenschaften 58 (1971), 465-526

C.J.Thompson, J.L.McBride, On Eigen's theory of the self-organization of matter and the evolution of biological macromolecules. Math.Biosci. 21 (1974), 127-142

B.L.Jones, R.H.Enns, S.S.Rangnekar, On the theory of selection of coupled macromolecular systems. Bull.Math.Biol. 38 (1976), 15-28

M.Eigen, P.Schuster, The hypercycle. A principle of natural self-organization. Part A: Emergence of the hypercycle. Naturwissenschaften 64 (1977), 541-565

M.Eigen, P.Schuster, The hypercycle. A principle of natural self-organization. Part B: The abstract hypercycle. Naturwissenschaften 65 (1978), 7-41

M.Eigen, P.Schuster, The hypercycle. A principle of natural self-organization. Part C: The realistic hypercycle. Naturwissenschaften 65 (1978), 341-369

J.Swetina, P.Schuster, Self-replication with errors - A model for polynucleotide replication. Biophys.Chem. 16 (1982), 329-345

J.S.McCaskill, A localization threshold for macromolecular quasispecies from continuously distributed replication rates. J.Chem.Phys. 80 (1984), 5194-5202

M.Eigen, J.McCaskill, P.Schuster, The molecular quasispecies. Adv.Chem.Phys. 75 (1989), 149-263

C.Reidys, C.Forst, P.Schuster, Replication and mutation on neutral networks. Bull.Math.Biol. 63 (2001), 57-94

slide-52
SLIDE 52

Reactions: (A) + I_j → I_j + I_i with rate constant f_j Q_ji; i = 1, 2, ..., n

$$Q_{ij} = (1-p)^{l - d(i,j)} \, p^{d(i,j)}$$

p ... error rate per digit
d(i,j) ... Hamming distance between I_i and I_j
l ... chain length of the polynucleotide

$$\frac{dx_i}{dt} = \sum_j f_j Q_{ji} x_j - x_i \Phi; \quad \Phi = \sum_j f_j x_j; \quad \sum_j x_j = 1; \quad \sum_i Q_{ij} = 1$$

$[I_i] = x_i \geq 0$; $i = 1, 2, \ldots, n$; $[A] = a = \text{constant}$

Chemical kinetics of replication and mutation as parallel reactions
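The mutation matrix $Q_{ij} = (1-p)^{l-d(i,j)} p^{d(i,j)}$ can be built explicitly for short chains. The sketch below does this for binary sequences, where every error at a position leads to the single other symbol, so the formula is normalized as written (for a larger alphabet the error rate would have to be shared among the alternative symbols). Function names and the example values l = 5, p = 0.05 are assumptions.

```python
from itertools import product


def hamming(a, b):
    """Number of positions in which two equally long sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def mutation_matrix(length, p):
    """Q[i][j] = (1 - p)^(l - d) * p^d for all binary sequences of a given length."""
    seqs = ["".join(s) for s in product("01", repeat=length)]
    q = [
        [(1.0 - p) ** (length - hamming(a, b)) * p ** hamming(a, b) for b in seqs]
        for a in seqs
    ]
    return seqs, q


if __name__ == "__main__":
    seqs, q = mutation_matrix(5, 0.05)
    # Every template produces some sequence, so each column must sum to one.
    print(len(seqs), sum(row[0] for row in q))
```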

slide-53
SLIDE 53

[Diagram: 2D sketch of sequence space with the sequences ....GC UC....CA, ....GC UC....GU, ....GC UC....GA, and ....GC UC....CU as points; connections are labelled with their Hamming distance, d_H = 1 or d_H = 2 (city-block distance in sequence space).]

Single point mutations as moves in sequence space

slide-54
SLIDE 54

[Diagram: the binary sequences of chain length 5, labelled by their decimal equivalents and arranged by mutant class 1 to 5.]

Binary sequences are encoded by their decimal equivalents: C = 0 and G = 1, for example, "0" ≡ 00000 = CCCCC, "14" ≡ 01110 = CGGGC, "29" ≡ 11101 = GGGCG, etc.

Sequence space of binary sequences of chain length n = 5

slide-55
SLIDE 55

[Figure: two aligned sequences, I_1 and I_2, that share the segments CGTCGTTACAATTTA, GTTATGTGCGAATTC, CAAATT, AAAA, and ACAAGAG..... and differ at the four positions between these segments.]

Hamming distance: $d_H(I_1, I_2) = 4$

  • (i) $d_H(I_1, I_1) = 0$
  • (ii) $d_H(I_1, I_2) = d_H(I_2, I_1)$
  • (iii) $d_H(I_1, I_3) \leq d_H(I_1, I_2) + d_H(I_2, I_3)$

The Hamming distance between sequences induces a metric in sequence space

slide-56
SLIDE 56

Mutation-selection equation: $[I_i] = x_i \geq 0$, $f_i > 0$, $Q_{ij} \geq 0$

$$\frac{dx_i}{dt} = \sum_{j=1}^{n} f_j Q_{ji} x_j - x_i \phi, \quad i = 1, 2, \ldots, n; \quad \sum_{i=1}^{n} x_i = 1; \quad \phi = \sum_{j=1}^{n} f_j x_j = \bar{f}$$

Solutions are obtained after integrating factor transformation by means of an eigenvalue problem:

$$x_i(t) = \frac{\sum_{k=0}^{n-1} \ell_{ik} \, c_k(0) \exp(\lambda_k t)}{\sum_{j=1}^{n} \sum_{k=0}^{n-1} \ell_{jk} \, c_k(0) \exp(\lambda_k t)}; \quad i = 1, 2, \ldots, n; \quad c_k(0) = \sum_{i=1}^{n} h_{ki} \, x_i(0)$$

$$W \doteq \{ f_i Q_{ij};\ i, j = 1, 2, \ldots, n \}; \quad L = \{ \ell_{ij};\ i, j = 1, 2, \ldots, n \}; \quad L^{-1} = H = \{ h_{ij};\ i, j = 1, 2, \ldots, n \}$$

$$L^{-1} \cdot W \cdot L = \Lambda = \{ \lambda_k;\ k = 0, 1, \ldots, n-1 \}$$

slide-57
SLIDE 57

[Plot: stationary concentrations against the error rate p = 1 - q (axis from 0.00 to 0.10), with regions labelled "Quasispecies" and "Uniform distribution".]

Quasispecies as a function of the replication accuracy q
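How the stationary distribution depends on the error rate can be sketched numerically. The code below assumes binary sequences of chain length 5, the uniform error model $Q_{ij} = (1-p)^{l-d(i,j)} p^{d(i,j)}$, and a single-peak fitness landscape (one master sequence with f = 10, all others f = 1); none of these values come from the slide. The stationary distribution is the dominant eigenvector of the matrix with elements $Q_{ij} f_j$, found here by plain power iteration.

```python
from itertools import product


def hamming(a, b):
    """Number of positions in which two equally long sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def quasispecies(length, p, f_master=10.0, f_other=1.0, iterations=500):
    """Stationary mutant distribution for an assumed single-peak landscape."""
    seqs = ["".join(s) for s in product("01", repeat=length)]
    n = len(seqs)
    f = [f_master] + [f_other] * (n - 1)   # seqs[0] = "00...0" is the master
    # w[i][j]: rate at which template j produces sequence i (Q is symmetric here).
    w = [
        [
            (1.0 - p) ** (length - hamming(a, b)) * p ** hamming(a, b) * f[j]
            for j, b in enumerate(seqs)
        ]
        for a in seqs
    ]
    x = [1.0 / n] * n
    for _ in range(iterations):            # power iteration, renormalized each step
        y = [sum(w[i][j] * x[j] for j in range(n)) for i in range(n)]
        total = sum(y)
        x = [v / total for v in y]
    return seqs, x


if __name__ == "__main__":
    for p in (0.01, 0.05, 0.2, 0.5):
        _, x = quasispecies(5, p)
        print(p, round(x[0], 4))           # stationary share of the master sequence
```

With accurate replication the master sequence dominates; at p = 0.5 every sequence is produced with equal probability and the distribution is uniform.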

slide-58
SLIDE 58

[Sketch: concentration plotted over sequence space, showing the master sequence and its mutant cloud.]

The molecular quasispecies in sequence space

slide-59
SLIDE 59

[Figure: concentration simplex with corners e_1, e_2, e_3, coordinates x_1, x_2, x_3, and vectors l_0, l_1, l_2.]

The quasispecies on the concentration simplex $S_3 = \{ x_i \geq 0,\ i = 1, 2, 3;\ \sum_{i=1}^{3} x_i = 1 \}$

slide-60
SLIDE 60

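[Figure: replication rate constants f_0, f_1, ..., f_7 assigned to RNA secondary structures.]

Replication rate constant: $f_k = \gamma / [\alpha + \Delta d_S^{(k)}]$, with $\Delta d_S^{(k)} = d_H(S_k, S_\tau)$

Evaluation of RNA secondary structures yields replication rate constants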

slide-61
SLIDE 61

Hamming distance: $d_H(S_1, S_2) = 4$

  • (i) $d_H(S_1, S_1) = 0$
  • (ii) $d_H(S_1, S_2) = d_H(S_2, S_1)$
  • (iii) $d_H(S_1, S_3) \leq d_H(S_1, S_2) + d_H(S_2, S_3)$

The Hamming distance between structures in parentheses notation forms a metric in structure space
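A minimal sketch of this structure distance, assuming structures are written as equally long strings of "(", ")" and "." (parentheses notation). It also shows how such a distance to a target structure could be turned into a replication rate constant, assuming a hyperbolic form f = gamma / (alpha + d); the parameter names `gamma` and `alpha`, their values, and the example structures are illustrative assumptions.

```python
def structure_distance(s1, s2):
    """Hamming distance between two structures in parentheses notation."""
    if len(s1) != len(s2):
        raise ValueError("structures must have the same length")
    return sum(a != b for a, b in zip(s1, s2))


def replication_rate(structure, target, gamma=1.0, alpha=1.0):
    """Assumed rate constant that decreases with the distance to the target."""
    return gamma / (alpha + structure_distance(structure, target))


if __name__ == "__main__":
    target = "((((....))))"     # illustrative hairpin
    candidate = "(((......)))"  # same hairpin with one base pair opened
    print(structure_distance(candidate, target))
    print(replication_rate(candidate, target), replication_rate(target, target))
```

Opening a single base pair changes two symbols, so the Hamming distance between structures counts each lost or gained pair twice.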

slide-62
SLIDE 62

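[Diagram: flow reactor fed from a stock solution, containing the reaction mixture.]

Replication rate constant: $f_k = \gamma / [\alpha + \Delta d_S^{(k)}]$, with $\Delta d_S^{(k)} = d_H(S_k, S_\tau)$

Selection constraint: the number of RNA molecules is controlled by the flow, $N(t) \approx \bar{N} \pm \sqrt{\bar{N}}$

The flow reactor as a device for studies of evolution in vitro and in silico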

slide-63
SLIDE 63

[Figures: a randomly chosen initial structure and phenylalanyl-tRNA as target structure, drawn from the 5'-end to the 3'-end with positions 10 to 70 marked.]

slide-64
SLIDE 64

[Sketch: concentration plotted over sequence space, showing the master sequence, its mutant cloud, and "off-the-cloud" mutations.]

The molecular quasispecies in sequence space

slide-65
SLIDE 65

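[Diagram: cycle of three steps. Mutation: the matrix Q acts on the genotypes I_1, ..., I_n with rate constants f_1, ..., f_n and produces a new genotype I_k. Genotype-phenotype mapping: $S_k = \psi(I_k)$. Evaluation of the phenotype: $f_k = f(S_k)$. The new variant then enters the population as I_{n+1} with rate constant f_{n+1}.]

Evolutionary dynamics including molecular phenotypes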

slide-66
SLIDE 66

In silico optimization in the flow reactor: Trajectory (biologists' view)

[Plot: evolutionary trajectory; average distance from initial structure (axis 10 to 50) against time in arbitrary units (axis 250 to 1250).]

slide-67
SLIDE 67

In silico optimization in the flow reactor: Trajectory (physicists' view)

[Plot: evolutionary trajectory; average structure distance to target d_S (axis 10 to 50) against time in arbitrary units (axis 250 to 1250).]

slide-68
SLIDE 68

[Plot: end of the evolutionary trajectory (time up to 1250); average structure distance to target d_S and number of relay step (36 to 44) against time. Structure shown: relay step 44.]

End conformation of optimization

slide-69
SLIDE 69

[Plot: end of the evolutionary trajectory (time up to 1250); average structure distance to target d_S and number of relay step (36 to 44) against time. Structures shown: relay steps 44 and 43.]

Reconstruction of the last step 43 → 44

slide-70
SLIDE 70

[Plot: end of the evolutionary trajectory (time up to 1250); average structure distance to target d_S and number of relay step (36 to 44) against time. Structures shown: relay steps 44, 43, and 42.]

Reconstruction of last-but-one step 42 → 43 (→ 44)

slide-71
SLIDE 71

[Plot: end of the evolutionary trajectory (time up to 1250); average structure distance to target d_S and number of relay step (36 to 44) against time. Structures shown: relay steps 44, 43, 42, and 41.]

Reconstruction of step 41 → 42 (→ 43 → 44)

slide-72
SLIDE 72

[Plot: end of the evolutionary trajectory (time up to 1250); average structure distance to target d_S and number of relay step (36 to 44) against time. Structures shown: relay steps 44, 43, 42, 41, and 40.]

Reconstruction of step 40 → 41 (→ 42 → 43 → 44)

slide-73
SLIDE 73

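[Plot: end of the evolutionary trajectory (time up to 1250); average structure distance to target d_S and number of relay step (36 to 44) against time. Structures shown: relay steps 44, 43, 42, 41, 40, and 39; the evolutionary process runs from 39 to 44, the reconstruction from 44 back to 39.]

Reconstruction of the relay series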

slide-74
SLIDE 74

[Figure legend: transition inducing point mutations; neutral point mutations.]

Change in RNA sequences during the final five relay steps 39 → 44

slide-75
SLIDE 75

In silico optimization in the flow reactor: Trajectory and relay steps

[Plot: evolutionary trajectory and relay steps; average structure distance to target d_S (axis 10 to 50) against time in arbitrary units (axis 250 to 1250).]

slide-76
SLIDE 76

[Plot: evolutionary trajectory and number of relay step (08 to 14) during a quasi-stationary epoch; average structure distance to target d_S (axis 10 to 20) against time in arbitrary units (axis 250 to 500); bars mark uninterrupted presence. Legend: transition inducing point mutations; neutral point mutations.]

28 neutral point mutations during a long quasi-stationary epoch

Neutral genotype evolution during phenotypic stasis

slide-77
SLIDE 77

In silico optimization in the flow reactor: Main transitions

[Plot: evolutionary trajectory with relay steps and main transitions; average structure distance to target d_S (axis 10 to 50) against time in arbitrary units (axis 250 to 1250).]

slide-78
SLIDE 78

[Structures at relay steps 00, 09, 31, and 44.]

Three important steps in the formation of the tRNA clover leaf from a randomly chosen initial structure corresponding to three main transitions.

slide-79
SLIDE 79

Movies of optimization trajectories over the AUGC and the GC alphabet

slide-80
SLIDE 80

[Histogram: frequency (axis 0.05 to 0.2) against runtime of trajectories (axis 1000 to 5000).]

Statistics of the lengths of trajectories from initial structure to target (AUGC-sequences)

slide-81
SLIDE 81

Alphabet | Runtime | Transitions | Main transitions | No. of runs
AUGC | 385.6 | 22.5 | 12.6 | 1017
GUC | 448.9 | 30.5 | 16.5 | 611
GC | 2188.3 | 40.0 | 20.6 | 107

Statistics of trajectories and relay series (mean values of log-normal distributions)

slide-82
SLIDE 82

Inverse folding of RNA secondary structures

Minimum free energy criterion

The idea of the inverse folding algorithm is to search for sequences that form a given RNA secondary structure under the minimum free energy criterion.

slide-83
SLIDE 83

Structure

slide-84
SLIDE 84

[Figure: a structure with a compatible sequence written along it from the 5'-end to the 3'-end.]

slide-85
SLIDE 85

[Figure: a structure with compatible sequences written along it from the 5'-end to the 3'-end.]

slide-86
SLIDE 86

[Figure: a structure with compatible sequences written along it from the 5'-end to the 3'-end.]

Base pairs: AU, UA, GC, CG, GU, UG. Single nucleotides: A, U, G, C
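Compatibility can be checked mechanically: a sequence is compatible with a structure if every pair of positions that the structure joins carries one of the six allowed base pairs. The sketch below assumes that the structure is given in parentheses notation; the function names and example sequences are hypothetical.

```python
ALLOWED_PAIRS = {"AU", "UA", "GC", "CG", "GU", "UG"}


def base_pairs(structure):
    """List the (i, j) index pairs of a structure in parentheses notation."""
    stack, pairs = [], []
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs.append((stack.pop(), i))
        elif symbol != ".":
            raise ValueError("unexpected symbol: " + symbol)
    if stack:
        raise ValueError("unbalanced structure")
    return pairs


def is_compatible(sequence, structure):
    """True if all paired positions carry an allowed base pair."""
    if len(sequence) != len(structure):
        return False
    return all(
        sequence[i] + sequence[j] in ALLOWED_PAIRS for i, j in base_pairs(structure)
    )


if __name__ == "__main__":
    print(is_compatible("GGGAAAUCC", "(((...)))"))  # pairs GC, GC, GU
    print(is_compatible("GGGAAAACC", "(((...)))"))  # innermost pair would be GA
```

Compatibility is only a necessary condition: a compatible sequence can form the structure, but the structure need not be its minimum free energy structure.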

slide-87
SLIDE 87

[Figure: a structure with an incompatible sequence written along it from the 5'-end to the 3'-end.]

slide-88
SLIDE 88

[Diagram: search paths in sequence space for the target structure S_k, leading from initial trial sequences through intermediate compatible sequences either to a target sequence or to the stop sequence of an unsuccessful trial.]

Approach to the target structure S_k in the inverse folding algorithm

slide-89
SLIDE 89

Minimum free energy criterion

Inverse folding of RNA secondary structures

[Diagram: 1st, 2nd, 3rd, 4th, and 5th trial.]

The inverse folding algorithm searches for sequences that form a given RNA secondary structure under the minimum free energy criterion.

slide-90
SLIDE 90

Theory of genotype – phenotype mapping

P.Schuster, W.Fontana, P.F.Stadler, I.L.Hofacker, From sequences to shapes and back: A case study in RNA secondary structures. Proc.Roy.Soc.London B 255 (1994), 279-284

W.Grüner, R.Giegerich, D.Strothmann, C.Reidys, I.L.Hofacker, P.Schuster, Analysis of RNA sequence structure maps by exhaustive enumeration. I. Neutral networks. Mh.Chem. 127 (1996), 355-374

W.Grüner, R.Giegerich, D.Strothmann, C.Reidys, I.L.Hofacker, P.Schuster, Analysis of RNA sequence structure maps by exhaustive enumeration. II. Structure of neutral networks and shape space covering. Mh.Chem. 127 (1996), 375-389

C.M.Reidys, P.F.Stadler, P.Schuster, Generic properties of combinatory maps. Bull.Math.Biol. 59 (1997), 339-397

I.L.Hofacker, P.Schuster, P.F.Stadler, Combinatorics of RNA secondary structures. Discr.Appl.Math. 89 (1998), 177-207

C.M.Reidys, P.F.Stadler, Combinatory landscapes. SIAM Review 44 (2002), 3-54

slide-91
SLIDE 91

[Diagram: sequence space → structure space → real numbers.]

$S_k = \psi(I_\cdot)$; $f_k = f(S_k)$

Mapping from sequence space into structure space and into function

slide-92
SLIDE 92

[Diagram: sequence space → structure space → real numbers.]

$S_k = \psi(I_\cdot)$; $f_k = f(S_k)$

slide-93
SLIDE 93

[Diagram: sequence space → structure space → real numbers.]

$S_k = \psi(I_\cdot)$; $f_k = f(S_k)$

The pre-image of the structure Sk in sequence space is the neutral network Gk

slide-94
SLIDE 94

Neutral networks are sets of sequences forming the same structure. G_k is the pre-image of the structure S_k in sequence space:

$$G_k = \psi^{-1}(S_k) \doteq \{ I_j \mid \psi(I_j) = S_k \}$$

The set is converted into a graph by connecting all sequences of Hamming distance one.

Neutral networks of small RNA molecules can be computed by exhaustive folding of complete sequence spaces, i.e. all RNA sequences of a given chain length. This number, $N = 4^n$, becomes very large with increasing length, and is prohibitive for numerical computations.

Neutral networks can be modelled by random graphs in sequence space. In this approach, nodes are inserted randomly into sequence space until the size of the pre-image, i.e. the number of neutral sequences, matches the neutral network to be studied.

slide-95
SLIDE 95

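Degree of neutrality of a sequence (example): $\lambda_j = 12/27 = 0.444$

Mean degree of neutrality of a network: $\bar{\lambda}_k = \frac{\sum_{j \in G_k} \lambda_j(k)}{|G_k|}$, with $G_k = \psi^{-1}(S_k) \doteq \{ I_j \mid \psi(I_j) = S_k \}$

Connectivity threshold: $\lambda_{cr} = 1 - \kappa^{-1/(\kappa - 1)}$, with alphabet size $\kappa$ (AUGC: $\kappa = 4$)

  • $\bar{\lambda}_k > \lambda_{cr}$: network $G_k$ is connected
  • $\bar{\lambda}_k < \lambda_{cr}$: network $G_k$ is not connected

κ | λ_cr | Alphabets
2 | 0.5 | GC, AU
3 | 0.423 | GUC, AUG
4 | 0.370 | AUGC

Mean degree of neutrality and connectivity of neutral networks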
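The threshold values for the different alphabet sizes follow from $\lambda_{cr} = 1 - \kappa^{-1/(\kappa-1)}$. A short sketch that evaluates the formula and applies the connectivity criterion; the function names are hypothetical.

```python
def connectivity_threshold(kappa):
    """Critical mean degree of neutrality for an alphabet of size kappa."""
    return 1.0 - kappa ** (-1.0 / (kappa - 1))


def predicted_connected(mean_neutrality, kappa):
    """Random-graph prediction: connected if the mean neutrality exceeds the threshold."""
    return mean_neutrality > connectivity_threshold(kappa)


if __name__ == "__main__":
    for kappa in (2, 3, 4):
        print(kappa, round(connectivity_threshold(kappa), 3))
    print(predicted_connected(12 / 27, 4))
```

A larger alphabet lowers the threshold, so a given degree of neutrality is more likely to yield a connected network over AUGC than over a two-letter alphabet.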

slide-96
SLIDE 96

A connected neutral network

slide-97
SLIDE 97

Giant Component

A multi-component neutral network

slide-98
SLIDE 98

[Figures: cloverleaf secondary structures, drawn from the 5'-end to the 3'-end with positions 10 to 70 marked.]

Degree of neutrality, by alphabet (one row per structure):

AU | AUG | AUGC | UGC | GC
- | - | 0.275 ± 0.064 | 0.263 ± 0.071 | 0.052 ± 0.033
- | 0.217 ± 0.051 | 0.279 ± 0.063 | 0.257 ± 0.070 | 0.057 ± 0.034
0.073 ± 0.032 | 0.201 ± 0.056 | 0.313 ± 0.058 | 0.250 ± 0.064 | 0.068 ± 0.034

Degree of neutrality of cloverleaf RNA secondary structures over different alphabets
slide-99
SLIDE 99

Reference for postulation and in silico verification of neutral networks

slide-100
SLIDE 100

[Diagram: structure S_k with its neutral network G_k and its compatible set C_k, where $G_k \subset C_k$.]

The compatible set C_k of a structure S_k consists of all sequences which form S_k as their minimum free energy structure (the neutral network G_k) or as one of their suboptimal structures.

slide-101
SLIDE 101

[Diagram: structure S_0 and structure S_1.]

The intersection of two compatible sets is always non-empty: $C_0 \cap C_1 \neq \emptyset$

slide-102
SLIDE 102

Reference for the definition of the intersection and the proof of the intersection theorem

slide-103
SLIDE 103

[Figure: one sequence drawn on two structures, its minimum free energy conformation S_0 and a suboptimal conformation S_1.]

A sequence at the intersection of two neutral networks is compatible with both structures

slide-104
SLIDE 104

[Figure: barrier tree over local minima numbered 1 to 49, with barrier energies marked. Basin '0' contains the minimum free energy structure S_0; basin '1' contains a long living metastable structure S_1.]

Barrier tree for two long living structures

slide-105
SLIDE 105

Kinetics of RNA refolding between a long living metastable conformation and the minimum free energy structure

slide-106
SLIDE 106

Acknowledgement of support

  • Fonds zur Förderung der wissenschaftlichen Forschung (FWF): Projects No. 09942, 10578, 11065, 13093, 13887, and 14898
  • Jubiläumsfonds der Österreichischen Nationalbank: Project No. Nat-7813
  • European Commission: Project No. EU-980189
  • Siemens AG, Austria
  • The Santa Fe Institute and the Universität Wien

The software for producing RNA movies was developed by Robert Giegerich and coworkers at the Universität Bielefeld

slide-107
SLIDE 107

Coworkers

Universität Wien

  • Walter Fontana, Santa Fe Institute, NM
  • Christian Reidys, Christian Forst, Los Alamos National Laboratory, NM
  • Peter Stadler, Bärbel Stadler, Universität Leipzig, GE
  • Ivo L.Hofacker, Christoph Flamm, Universität Wien, AT
  • Andreas Wernitznig, Michael Kospach, Universität Wien, AT
  • Ulrike Langhammer, Ulrike Mückstein, Stefanie Widder
  • Jan Cupal, Kurt Grünberger, Andreas Svrček-Seiler, Stefan Wuchty
  • Ulrike Göbel, Institut für Molekulare Biotechnologie, Jena, GE
  • Walter Grüner, Stefan Kopp, Jaqueline Weber

slide-108
SLIDE 108

Web-Page for further information: http://www.tbi.univie.ac.at/~pks

slide-109
SLIDE 109