Different kinds of robustness in genetic and metabolic networks
Peter Schuster Institut für Theoretische Chemie und Molekulare Strukturbiologie der Universität Wien Seminar lecture Linz, 15.12.2003
Mathematics in 21st Century's Life Sciences
Genomics and proteomics Large scale data processing, sequence comparison ...
Developmental biology
Gene regulation networks, signal propagation, pattern formation, robustness ...
Cell biology
Regulation of cell cycle, metabolic networks, reaction kinetics, homeostasis, ...
Neurobiology
Neural networks, collective properties, nonlinear dynamics, signalling, ...
Evolutionary biology
Optimization through variation and selection, relation between genotype, phenotype, and function, ...
Genomics and proteomics Large scale data processing, sequence comparison ...
E. coli: Length of the genome 4×10^6 nucleotides; number of cell types 1; number of genes 4 000
Man: Length of the genome 3×10^9 nucleotides; number of cell types 200; number of genes 30 000 - 100 000
Fully sequenced genomes
- Organisms: 751 projects
153 complete (16 A, 118 B, 19 E)
(Eukarya examples: mosquito (pest, malaria), sea squirt, mouse, yeast, Homo sapiens, Arabidopsis, fly, worm, …)
598 ongoing (23 A, 332 B, 243 E)
(Eukarya examples: chimpanzee, turkey, chicken, ape, corn, potato, rice, banana, tomato, cotton, coffee, soybean, pig, rat, cat, sheep, horse, kangaroo, dog, cow, bee, salmon, fugu, frog, …)
- Other structures with genetic information
68 phages, 1328 viruses, 35 viroids, 472 organelles (423 mitochondria, 32 plastids, 14 plasmids, 3 nucleomorphs)
Source: NCBI. Source: Integrated Genomics, Inc., August 12th, 2003
Wolfgang Wieser. Die Erfindung der Individualität oder die zwei Gesichter der Evolution. Spektrum Akademischer Verlag, Heidelberg 1998. A.C.Wilson. The Molecular Basis of Evolution. Scientific American, Oct.1985, 164-173.
[Figure: cellular metabolism converts food into waste and supplies nucleotides, amino acids, lipids, carbohydrates, and small molecules.]
Replication: DNA → 2 DNA
Transcription: DNA → RNA (mRNA)
Translation: mRNA → protein (at the ribosome, via the genetic code)
The gene is a stretch of DNA which after transcription gives rise to an mRNA
Gene expression: DNA microarray representing 8613 human genes used to study transcription in the response of human fibroblasts to serum. V.R.Iyer et al., Science 283: 83-87, 1999
The same section of the microarray is shown in three independent hybridizations. Marked spots refer to: (1) protein disulfide isomerase related protein P5, (2) IL-8 precursor, (3) EST AA057170, and (4) vascular endothelial growth factor.
[Figure: genomic DNA is transcribed and processed into mRNA; introns are eliminated through splicing.]
The gene is a stretch of DNA which after transcription and processing gives rise to an mRNA
Sex determination in Drosophila through alternative splicing
The process of protein synthesis and its regulation is now understood, but the notion of the gene as a stretch of DNA has become obscure. The gene is essentially associated with the sequence of unmodified amino acids in a protein, and it is determined by the nucleotide sequence as well as by the dynamics of the process eventually leading to the mRNA that is translated.
Number of genes in the human genome
The number of genes in the human genome is still only a very rough estimate
Developmental biology
Gene regulation networks, signal propagation, pattern formation, robustness ...
Three-dimensional structure of the complex between the regulatory protein cro-repressor and the binding site on λ-phage B-DNA
Development of the fruit fly Drosophila melanogaster: Genetics, experiment, and imago
Linear chain Network
Processing of information in cascades and networks
Albert-László Barabási, Linked – The New Science of Networks. Perseus Publ., Cambridge, MA, 2002
Distributed network; small-world network
Formation of a scale-free network through evolutionary point-by-point expansion: Steps 000 to 012 and Step 024
[Figure: evolved network with 25 nodes, each labelled by its number of links.]
Links per node | Number of nodes
2 | 14
3 | 6
5 | 2
10 | 1
12 | 1
14 | 1
Analysis of nodes and links in a step by step evolved network
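The step-by-step growth shown on these slides can be imitated with a small simulation. The sketch below assumes preferential attachment (each new node links to existing nodes with probability proportional to their current number of links), two links per new node, and a two-node seed; none of these details is stated on the slides, and the names `grow_network` and `degree_histogram` are illustrative only.

```python
import random
from collections import Counter

def grow_network(steps, links_per_node=2, seed=0):
    """Grow a network node by node with preferential attachment (assumed rule)."""
    rng = random.Random(seed)
    edges = [(0, 1)]        # assumed seed: two connected nodes
    endpoints = [0, 1]      # each edge contributes both of its endpoints
    for new in range(2, steps + 2):
        targets = set()
        # Drawing from the endpoint list picks nodes proportionally to degree.
        while len(targets) < links_per_node:
            targets.add(rng.choice(endpoints))
        for t in sorted(targets):
            edges.append((new, t))
            endpoints.extend((new, t))
    return edges

def degree_histogram(edges):
    """Map 'number of links' to 'number of nodes with that many links'."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return Counter(degree.values())

edges = grow_network(steps=23)          # 2 seed nodes + 23 added nodes = 25 nodes
hist = degree_histogram(edges)
for links in sorted(hist):
    print(links, hist[links])
```

With this rule a few early nodes typically collect many links while most nodes keep only the links they were born with, which is the qualitative pattern of the table above.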
Structures in Directed Networks
Albert-László Barabási, Linked – The New Science of Networks. Perseus Publ., Cambridge, MA, 2002
Cell biology
Regulation of cell cycle, metabolic networks, reaction kinetics, homeostasis, ...
The bacterial cell as an example for the simplest form of autonomous life
The human body: 10^14 cells = 10^13 eukaryotic cells + ≈ 9×10^13 bacterial (prokaryotic) cells, and 200 eukaryotic cell types
Biochemical Pathways
[Figure: wall chart with grid coordinates A to L and 1 to 10.]
The reaction network of cellular metabolism published by Boehringer-Ingelheim.
The citric acid or Krebs cycle (enlarged from previous slide).
Parameter set: $k_j(T, p, \mathrm{pH}, I, \ldots;\ x_1, x_2, \ldots, x_n);\ j = 1, 2, \ldots, m$
General conditions: $T$, $p$, pH, $I$, ...
Initial conditions: $x_i(0);\ i = 1, 2, \ldots, n$
Boundary conditions: boundary $s$, normal unit vector $\hat{u}$
Dirichlet: $x_i^s = f(\vec{r}, t);\ i = 1, 2, \ldots, n$
Neumann: $\frac{\partial x_i^s}{\partial u} = \hat{u} \cdot \nabla x_i^s = f(\vec{r}, t);\ i = 1, 2, \ldots, n$
Kinetic differential equations: $\frac{dx_i}{dt} = f_i(x_1, x_2, \ldots, x_n;\ k_1, k_2, \ldots, k_m);\ i = 1, 2, \ldots, n$
Reaction diffusion equations: $\frac{\partial x_i}{\partial t} = D_i \nabla^2 x_i + f_i(x_1, x_2, \ldots, x_n;\ k_1, k_2, \ldots, k_m);\ i = 1, 2, \ldots, n$
Solution curves: concentrations $x_i(t);\ i = 1, 2, \ldots, n$, as functions of time $t$
The forward problem of chemical reaction kinetics
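A minimal sketch of the forward problem: given rate parameters and initial concentrations, integrate the kinetic differential equations to obtain solution curves. The reaction chain A → B → C, the rate constants, and the fixed-step Runge-Kutta integrator are assumptions chosen for illustration; they do not come from the slides.

```python
import math

def rhs(x, k):
    """Kinetic equations dx_i/dt = f_i(x; k) for the assumed chain A -> B -> C."""
    a, b, c = x
    k1, k2 = k
    return (-k1 * a, k1 * a - k2 * b, k2 * b)

def rk4_step(x, k, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(u, v, s):
        return tuple(ui + s * vi for ui, vi in zip(u, v))
    f1 = rhs(x, k)
    f2 = rhs(shifted(x, f1, dt / 2), k)
    f3 = rhs(shifted(x, f2, dt / 2), k)
    f4 = rhs(shifted(x, f3, dt), k)
    return tuple(xi + dt / 6 * (p + 2 * q + 2 * r + s)
                 for xi, p, q, r, s in zip(x, f1, f2, f3, f4))

def solve(x0, k, t_end, dt=0.01):
    """Return the solution curve as a list of (t, x) pairs."""
    x = tuple(x0)
    curve = [(0.0, x)]
    for i in range(round(t_end / dt)):
        x = rk4_step(x, k, dt)
        curve.append(((i + 1) * dt, x))
    return curve

curve = solve(x0=(1.0, 0.0, 0.0), k=(1.0, 0.5), t_end=5.0)
t_final, (a, b, c) = curve[-1]
a_exact = math.exp(-1.0 * t_final)      # analytic solution for species A
print(t_final, a, a_exact, a + b + c)
```

The comparison with the analytic decay of A and the conserved total concentration give two quick consistency checks on the integration.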
The inverse problem of chemical reaction kinetics
Data from measurements: concentrations $x_i(t_k);\ i = 1, 2, \ldots, n;\ k = 1, 2, \ldots, N$, at times $t_k$
General conditions: $T$, $p$, pH, $I$, ...
Initial conditions: $x_i(0);\ i = 1, 2, \ldots, n$
Boundary conditions: boundary $s$, normal unit vector $\hat{u}$
Dirichlet: $x_i^s = f(\vec{r}, t);\ i = 1, 2, \ldots, n$
Neumann: $\frac{\partial x_i^s}{\partial u} = \hat{u} \cdot \nabla x_i^s = f(\vec{r}, t);\ i = 1, 2, \ldots, n$
Kinetic differential equations: $\frac{dx_i}{dt} = f_i(x_1, x_2, \ldots, x_n;\ k_1, k_2, \ldots, k_m);\ i = 1, 2, \ldots, n$
Reaction diffusion equations: $\frac{\partial x_i}{\partial t} = D_i \nabla^2 x_i + f_i(x_1, x_2, \ldots, x_n;\ k_1, k_2, \ldots, k_m);\ i = 1, 2, \ldots, n$
Parameter set: $k_j(T, p, \mathrm{pH}, I, \ldots;\ x_1, x_2, \ldots, x_n);\ j = 1, 2, \ldots, m$
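The inverse problem runs the other way: measured concentrations are given and the parameters are sought. As a minimal sketch, assume a single first-order decay, dx/dt = -k x, so that ln x is linear in t and k follows from an ordinary least-squares line fit. The reaction, the synthetic data, and the function name `fit_first_order` are hypothetical; realistic networks need nonlinear fitting.

```python
import math

def fit_first_order(times, conc):
    """Least-squares fit of ln x = ln x0 - k t; returns (k, x0)."""
    n = len(times)
    y = [math.log(c) for c in conc]
    t_mean = sum(times) / n
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (yi - y_mean) for t, yi in zip(times, y))
    slope = sxy / sxx
    return -slope, math.exp(y_mean - slope * t_mean)

# Synthetic "measurements" (hypothetical, generated with k = 0.7 and x0 = 2.0)
times = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0]
data = [2.0 * math.exp(-0.7 * t) for t in times]

k_est, x0_est = fit_first_order(times, data)
print(k_est, x0_est)
```

Because the synthetic data are noise-free, the fit recovers the generating parameters; with real measurements the residuals would indicate how well the assumed mechanism describes the data.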
Neurobiology
Neural networks, collective properties, nonlinear dynamics, signalling, ...
A single neuron signaling to a muscle fiber
The human brain: 10^11 neurons connected by 10^13 to 10^14 synapses
Evolutionary biology
Optimization through variation and selection, relation between genotype, phenotype, and function, ...
System | Generation time | 10 000 generations | 10^6 generations | 10^7 generations
RNA molecules | 10 sec | 27.8 h = 1.16 d | 115.7 d | 3.17 a
RNA molecules | 1 min | 6.94 d | 1.90 a | 19.01 a
Bacteria | 20 min | 138.9 d | 38.03 a | 380 a
Bacteria | 10 h | 11.40 a | 1 140 a | 11 408 a
Higher multicellular organisms | 10 d | 274 a | 27 380 a | 273 800 a
Higher multicellular organisms | 20 a | 200 000 a | 2 × 10^7 a | 2 × 10^8 a
Time scales of evolutionary change
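The entries of such a table are plain products of generation time and number of generations, converted to a convenient unit. A small sketch, assuming a year of 365.25 days (the slides do not state the convention used); `elapsed` is an illustrative helper name.

```python
# Unit lengths in seconds; the 365.25-day year is an assumption.
SECONDS = {"sec": 1, "min": 60, "h": 3600, "d": 86400, "a": 365.25 * 86400}

def elapsed(generation_time, unit, generations, out_unit):
    """Total time for a number of generations, expressed in out_unit."""
    return generation_time * SECONDS[unit] * generations / SECONDS[out_unit]

print(round(elapsed(10, "sec", 10_000, "h"), 1))    # RNA molecules, 10 000 generations
print(round(elapsed(20, "min", 10**6, "a"), 2))     # bacteria, 10^6 generations
print(round(elapsed(10, "d", 10**7, "a")))          # higher organisms, 10^7 generations
```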
[Figure: chemical formula of an RNA chain running from the 5'-end to the 3'-end; ribose-phosphate backbone with Na+ counterions and nucleobases N1 to N4, where Nk = A, U, G, C. Cloverleaf secondary structure of a tRNA, positions 10 to 70 marked.]
Definition of RNA structure
The three-dimensional structure of a short double helical stack of B-DNA
James D. Watson, 1928- , and Francis Crick, 1916- , Nobel Prize 1962
1953 – 2003 fifty years double helix
Sequence: 5'-End GCGGAUUUAGCUCAGDDGGGAGAGCMCCAGACUGAAYAUCUGGAGMUCCUGUGTPCGAUCCACAGAAUUCGCACCA 3'-End
[Figure: sequence and secondary structure, positions 10 to 70 marked.]
[Figure: a plus strand is copied into a minus strand and back; steps: synthesis, complex dissociation, synthesis.]
Complementary replication as the simplest copying mechanism of RNA
Complementarity is determined by Watson-Crick base pairs: G≡C and A=U
$(A) + I_i \xrightarrow{f_i} 2 I_i;\ i = 1, 2, \ldots, n$
$dx_i/dt = f_i x_i - x_i \Phi = x_i (f_i - \Phi)$
$\Phi = \sum_j f_j x_j;\ \sum_j x_j = 1;\ i, j = 1, 2, \ldots, n$
$[I_i] = x_i \geq 0;\ i = 1, 2, \ldots, n;\ [A] = a = \text{constant}$
$f_m = \max\{f_j;\ j = 1, 2, \ldots, n\}$
$x_m(t) \to 1$ for $t \to \infty$
Reproduction of organisms or replication of molecules as the basis of selection
Selection equation: $[I_i] = x_i \geq 0$, $f_i > 0$
$\frac{dx_i}{dt} = x_i (f_i - \phi),\ i = 1, 2, \ldots, n;\ \sum_{i=1}^{n} x_i = 1;\ \phi = \sum_{j=1}^{n} f_j x_j = \bar{f}$
Mean fitness or dilution flux, $\phi(t)$, is a non-decreasing function of time:
$\frac{d\phi}{dt} = \sum_{i=1}^{n} f_i \frac{dx_i}{dt} = \overline{f^2} - (\bar{f})^2 = \mathrm{var}\{f\} \geq 0$
Solutions are obtained by integrating factor transformation:
$x_i(t) = \frac{x_i(0) \exp(f_i t)}{\sum_{j=1}^{n} x_j(0) \exp(f_j t)};\ i = 1, 2, \ldots, n$
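The closed-form solution of the selection equation, x_i(t) = x_i(0) exp(f_i t) / Σ_j x_j(0) exp(f_j t), is easy to evaluate numerically. The sketch below uses three hypothetical fitness values and initial fractions to illustrate that the mean fitness φ(t) never decreases and that the fittest variant takes over.

```python
import math

def selection_solution(x0, f, t):
    """Closed-form solution of dx_i/dt = x_i (f_i - phi)."""
    weights = [xi * math.exp(fi * t) for xi, fi in zip(x0, f)]
    total = sum(weights)
    return [w / total for w in weights]

def mean_fitness(x, f):
    """phi = sum_j f_j x_j."""
    return sum(xi * fi for xi, fi in zip(x, f))

f = [1.0, 1.5, 2.0]           # hypothetical replication rate constants
x0 = [0.98, 0.01, 0.01]       # hypothetical initial fractions

phi = [mean_fitness(selection_solution(x0, f, t), f) for t in range(31)]
x_end = selection_solution(x0, f, 30)
print(phi[0], phi[-1], x_end)
```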
s = (f2 - f1) / f1; f2 > f1; x1(0) = 1 - 1/N; x2(0) = 1/N
[Plot: fraction of advantageous variant (0 to 1) versus time in generations (0 to 1000) for s = 0.1, s = 0.02, and s = 0.01.]
Selection of advantageous mutants in populations of N = 10 000 individuals
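For two variants, the solution of the selection equation reduces to x2(t) = x2(0) / (x2(0) + x1(0) exp(-s f1 t)), so the time at which the advantageous variant reaches one half is ln(N - 1) / (s f1). The sketch below evaluates this for N = 10 000; measuring time in units where f1 = 1 per generation is an assumption, since the slide does not state how generations map to rate constants.

```python
import math

def advantageous_fraction(t, s, N, f1=1.0):
    """Fraction x2(t) of the variant with f2 = f1 (1 + s), starting from 1/N."""
    x2_0 = 1.0 / N
    x1_0 = 1.0 - x2_0
    return x2_0 / (x2_0 + x1_0 * math.exp(-s * f1 * t))

def half_time(s, N, f1=1.0):
    """Time at which the advantageous variant reaches a fraction of 1/2."""
    return math.log(N - 1) / (s * f1)

N = 10_000
for s in (0.1, 0.02, 0.01):
    t_half = half_time(s, N)
    print(s, round(t_half), advantageous_fraction(t_half, s, N))
```

Under this assumption the half times fall near 92, 461 and 921 generations, in line with the range of the time axis in the plot.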
[Figure: replication of a plus strand via a minus strand with copying errors: point mutation, insertion, deletion.]
Mutations in nucleic acids represent the mechanism of variation of genotypes.
Theory of molecular evolution
M.Eigen, Self-organization of matter and the evolution of biological macromolecules. Naturwissenschaften 58 (1971), 465-526
C.J.Thompson, J.L.McBride, On Eigen's theory of the self-organization of matter and the evolution of biological macromolecules. Math.Biosci. 21 (1974), 127-142
B.L.Jones, R.H.Enns, S.S.Rangnekar, On the theory of selection of coupled macromolecular systems. Bull.Math.Biol. 38 (1976), 15-28
M.Eigen, P.Schuster, The hypercycle. A principle of natural self-organization. Part A: Emergence of the hypercycle. Naturwissenschaften 64 (1977), 541-565
M.Eigen, P.Schuster, The hypercycle. A principle of natural self-organization. Part B: The abstract hypercycle. Naturwissenschaften 65 (1978), 7-41
M.Eigen, P.Schuster, The hypercycle. A principle of natural self-organization. Part C: The realistic hypercycle. Naturwissenschaften 65 (1978), 341-369
J.Swetina, P.Schuster, Self-replication with errors - A model for polynucleotide replication. Biophys.Chem. 16 (1982), 329-345
J.S.McCaskill, A localization threshold for macromolecular quasispecies from continuously distributed replication rates. J.Chem.Phys. 80 (1984), 5194-5202
M.Eigen, J.McCaskill, P.Schuster, The molecular quasispecies. Adv.Chem.Phys. 75 (1989), 149-263
C.Reidys, C.Forst, P.Schuster, Replication and mutation on neutral networks. Bull.Math.Biol. 63 (2001), 57-94
$(A) + I_j \xrightarrow{f_j Q_{ji}} I_j + I_i;\ i = 1, 2, \ldots, n$
$dx_i/dt = \sum_j f_j Q_{ji} x_j - x_i \Phi$
$\Phi = \sum_j f_j x_j;\ \sum_j x_j = 1;\ \sum_i Q_{ij} = 1$
$[I_i] = x_i \geq 0;\ i = 1, 2, \ldots, n;\ [A] = a = \text{constant}$
$Q_{ij} = (1 - p)^{l - d(i,j)}\, p^{d(i,j)}$
p .......... Error rate per digit
d(i,j) .... Hamming distance between Ii and Ij
l ........... Chain length of the polynucleotide
Chemical kinetics of replication and mutation as parallel reactions
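The mutation probabilities can be tabulated directly from the error rate per digit p, the chain length l, and the Hamming distance d(i,j), using Q_ij = (1 - p)^(l - d(i,j)) p^(d(i,j)). The sketch below does this for binary sequences, an assumption made so that each digit has exactly one possible error and every row of Q sums to one; the chain length and error rate are hypothetical.

```python
from itertools import product

def hamming(a, b):
    """Number of positions in which two equally long sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def mutation_matrix(length, p, alphabet="CG"):
    """Q[i][j] = (1 - p)^(l - d) * p^d with d the Hamming distance."""
    seqs = ["".join(s) for s in product(alphabet, repeat=length)]
    q = [[(1 - p) ** (length - hamming(a, b)) * p ** hamming(a, b)
          for b in seqs] for a in seqs]
    return seqs, q

seqs, Q = mutation_matrix(length=3, p=0.05)
row_sums = [sum(row) for row in Q]
print(seqs[0], Q[0][0], row_sums[0])
```

The diagonal element is the probability of an error-free copy, (1 - p)^l, and the off-diagonal elements fall off with the number of point mutations separating template and copy.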
[Figure: 2D sketch of sequence space with city-block distance; sequences one point mutation apart have Hamming distance dH = 1, sequences two point mutations apart have dH = 2.]
Single point mutations as moves in sequence space
[Figure: the 32 binary sequences, labelled 0 to 31, arranged in mutant classes 1 to 5 relative to sequence 0.]
Binary sequences are encoded by their decimal equivalents: C = 0 and G = 1, for example, "0" ≡ 00000 = CCCCC, "14" ≡ 01110 = CGGGC, "29" ≡ 11101 = GGGCG, etc.
Sequence space of binary sequences of chain length n = 5
[Figure: two aligned sequences I1 and I2 that differ in four positions.]
Hamming distance: $d_H(I_1, I_2) = 4$
(i) $d_H(I_1, I_1) = 0$
(ii) $d_H(I_1, I_2) = d_H(I_2, I_1)$
(iii) $d_H(I_1, I_3) \leq d_H(I_1, I_2) + d_H(I_2, I_3)$
The Hamming distance between sequences induces a metric in sequence space
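The three metric properties can be checked by brute force on a small sequence space. The sketch below enumerates the binary sequences of chain length 5 over the letters C and G, groups them into mutant classes by their Hamming distance from CCCCC, and tests identity, symmetry, and the triangle inequality for all triples. It is an illustration only, not part of the original slides.

```python
from itertools import product

def hamming(a, b):
    """Hamming distance between two sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))

space = ["".join(s) for s in product("CG", repeat=5)]

# Mutant classes: sequences grouped by distance from the reference CCCCC.
classes = {}
for s in space:
    classes.setdefault(hamming("CCCCC", s), []).append(s)
class_sizes = {d: len(members) for d, members in sorted(classes.items())}

identity = all(hamming(a, a) == 0 for a in space)
symmetry = all(hamming(a, b) == hamming(b, a) for a in space for b in space)
triangle = all(hamming(a, c) <= hamming(a, b) + hamming(b, c)
               for a in space for b in space for c in space)
print(class_sizes, identity, symmetry, triangle)
```

The class sizes are the binomial coefficients of 5, which is why the middle mutant classes of the sequence space are the most populated.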
Mutation-selection equation: $[I_i] = x_i \geq 0$, $f_i > 0$, $Q_{ij} \geq 0$
$\frac{dx_i}{dt} = \sum_{j=1}^{n} f_j Q_{ji} x_j - x_i \phi,\ i = 1, 2, \ldots, n;\ \sum_{i=1}^{n} x_i = 1;\ \phi = \sum_{j=1}^{n} f_j x_j = \bar{f}$
Solutions are obtained after integrating factor transformation by means of an eigenvalue problem:
$x_i(t) = \frac{\sum_{k=0}^{n-1} \ell_{ik}\, c_k(0) \exp(\lambda_k t)}{\sum_{j=1}^{n} \sum_{k=0}^{n-1} \ell_{jk}\, c_k(0) \exp(\lambda_k t)};\ i = 1, 2, \ldots, n;\ c_k(0) = \sum_{i=1}^{n} h_{ki}\, x_i(0)$
$W = \{W_{ij} = f_j Q_{ji};\ i, j = 1, 2, \ldots, n\};\ L = \{\ell_{ij};\ i, j = 1, 2, \ldots, n\};\ L^{-1} = H = \{h_{ij};\ i, j = 1, 2, \ldots, n\}$
$L^{-1} \cdot W \cdot L = \Lambda = \{\lambda_k;\ k = 0, 1, \ldots, n-1\}$
[Plot: stationary concentrations versus error rate p = 1 - q (0.00 to 0.10); the quasispecies gives way to the uniform distribution at high error rates.]
Quasispecies as a function of the replication accuracy q
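The dependence of the stationary mutant distribution on the error rate can be explored numerically. The sketch below makes several assumptions that are not taken from the slides: binary sequences of chain length 5, a single-peak fitness landscape (one master sequence with f = 10, all others with f = 1), and plain power iteration in place of a full eigenvalue decomposition. The stationary distribution is the normalized dominant eigenvector of the matrix with elements Q_ij f_j.

```python
from itertools import product

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def stationary_distribution(p, length=5, f_master=10.0, f_other=1.0,
                            iterations=300):
    """Dominant eigenvector of W = Q * diag(f), found by power iteration."""
    seqs = ["".join(s) for s in product("01", repeat=length)]
    n = len(seqs)
    f = [f_master] + [f_other] * (n - 1)    # seqs[0] is the assumed master
    q = [[(1 - p) ** (length - hamming(a, b)) * p ** hamming(a, b)
          for b in seqs] for a in seqs]
    x = [1.0 / n] * n
    for _ in range(iterations):
        y = [sum(q[i][j] * f[j] * x[j] for j in range(n)) for i in range(n)]
        total = sum(y)
        x = [v / total for v in y]
    return x

master_fraction = {p: stationary_distribution(p)[0]
                   for p in (0.0, 0.05, 0.1, 0.5)}
for p, x_m in master_fraction.items():
    print(p, round(x_m, 4))
```

Without errors the master sequence is selected exclusively; with growing error rate its stationary fraction shrinks, and at p = 0.5 every binary sequence is equally likely, which is the uniform distribution.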
[Figure: concentration plotted over sequence space; the master sequence is surrounded by a mutant cloud.]
The molecular quasispecies in sequence space
[Figure: simplex with corners e1, e2, e3, coordinates x1, x2, x3, and eigenvectors l0, l1, l2.]
The quasispecies on the concentration simplex $S_3 = \{ x_i \geq 0,\ i = 1, 2, 3;\ \sum_{i=1}^{3} x_i = 1 \}$
[Figure: replication rate constants f0 to f7 assigned to structures.]
Replication rate constant: $f_k = \gamma / [\alpha + \Delta d_S^{(k)}]$
$\Delta d_S^{(k)} = d_H(S_k, S_\tau)$
Evaluation of RNA secondary structures yields replication rate constants
Hamming distance: $d_H(S_1, S_2) = 4$
(i) $d_H(S_1, S_1) = 0$
(ii) $d_H(S_1, S_2) = d_H(S_2, S_1)$
(iii) $d_H(S_1, S_3) \leq d_H(S_1, S_2) + d_H(S_2, S_3)$
The Hamming distance between structures in parentheses notation forms a metric in structure space
[Figure: flow reactor; stock solution flows into the reaction mixture.]
Replication rate constant: $f_k = \gamma / [\alpha + \Delta d_S^{(k)}]$
$\Delta d_S^{(k)} = d_H(S_k, S_\tau)$
Selection constraint: the number of RNA molecules is controlled by the flow, $N(t) \approx \bar{N} \pm \sqrt{\bar{N}}$
The flow reactor as a device for studies of evolution in vitro and in silico
[Figure: two secondary structures drawn from the 5'-end to the 3'-end, positions 10 to 70 marked.]
Randomly chosen initial structure; phenylalanyl-tRNA as target structure
[Figure: concentration plotted over sequence space; master sequence, mutant cloud, and "off-the-cloud" mutations.]
The molecular quasispecies in sequence space
[Diagram: mutation (matrix Q) acts on the genotypes I1, I2, ..., In with replication rate constants f1, f2, ..., fn and adds a new genotype In+1 with fn+1.]
Mutation: $Q$
Genotype-phenotype mapping: $S_k = \psi(I_k)$
Evaluation of the phenotype: $f_k = f(S_k)$
Evolutionary dynamics including molecular phenotypes
In silico optimization in the flow reactor: Trajectory (biologists' view)
[Plot: average distance from initial structure (10 to 50) versus time (250 to 1250, arbitrary units); evolutionary trajectory.]
In silico optimization in the flow reactor: Trajectory (physicists' view)
[Plot: average structure distance to target dS (10 to 50) versus time (250 to 1250, arbitrary units); evolutionary trajectory.]
[Plot: final part of the evolutionary trajectory, average structure distance to target dS versus time (up to 1250), with relay steps 36 to 44 marked by number.]
End conformation of optimization: 44
Reconstruction of the last step: 43 → 44
Reconstruction of last-but-one step: 42 → 43 (→ 44)
Reconstruction of step 41 → 42 (→ 43 → 44)
Reconstruction of step 40 → 41 (→ 42 → 43 → 44)
Reconstruction of the relay series: the evolutionary process runs from 39 to 44, the reconstruction from 44 back to 39
[Figure legend: transition inducing point mutations; neutral point mutations.]
Change in RNA sequences during the final five relay steps 39 → 44
In silico optimization in the flow reactor: Trajectory and relay steps
[Plot: average structure distance to target dS (10 to 50) versus time (250 to 1250, arbitrary units); evolutionary trajectory with relay steps.]
[Plot: average structure distance to target dS (10 to 20) versus time (250 to 500, arbitrary units); evolutionary trajectory with relay steps 08, 10, 12, and 14 marked by number and their intervals of uninterrupted presence; transition inducing and neutral point mutations are distinguished.]
28 neutral point mutations during a long quasi-stationary epoch
Neutral genotype evolution during phenotypic stasis
In silico optimization in the flow reactor: Main transitions
[Plot: average structure distance to target dS (10 to 50) versus time (250 to 1250, arbitrary units); evolutionary trajectory with relay steps and main transitions.]
[Figure: structures at relay steps 00, 09, 31, and 44.]
Three important steps in the formation of the tRNA clover leaf from a randomly chosen initial structure corresponding to three main transitions.
Movies of optimization trajectories over the AUGC and the GC alphabet
[Histogram: frequency (0.05 to 0.2) versus runtime of trajectories (1000 to 5000).]
Statistics of the lengths of trajectories from initial structure to target (AUGC-sequences)
Alphabet | Runtime | Transitions | Main transitions | No. of runs
AUGC | 385.6 | 22.5 | 12.6 | 1017
GUC | 448.9 | 30.5 | 16.5 | 611
GC | 2188.3 | 40.0 | 20.6 | 107
Statistics of trajectories and relay series (mean values of log-normal distributions)
Inverse folding of RNA secondary structures
Minimum free energy criterion
The idea of the inverse folding algorithm is to search for sequences that form a given RNA secondary structure under the minimum free energy criterion.
[Figures: a secondary structure drawn with a sequence from the 5'-end to the 3'-end; three examples of compatible sequences and one example of an incompatible sequence.]
A sequence is compatible with a structure if every base pair of the structure can be formed by the nucleotides at the paired positions.
Base pairs: AU, UA, GC, CG, GU, UG
Single nucleotides: A, U, G, C
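Compatibility of a sequence with a structure can be tested mechanically once the structure is written in parentheses notation: every pair of matching parentheses must be occupied by one of the six allowed base pairs. The sketch below is a minimal check; the example hairpin and sequences are hypothetical and the function names are illustrative.

```python
ALLOWED_PAIRS = {"AU", "UA", "GC", "CG", "GU", "UG"}

def pair_table(structure):
    """List of (i, j) index pairs for matching parentheses."""
    stack, pairs = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs.append((stack.pop(), i))
        elif ch != ".":
            raise ValueError("unexpected symbol in structure")
    if stack:
        raise ValueError("unbalanced structure")
    return pairs

def is_compatible(sequence, structure):
    """True if every base pair of the structure can be formed by the sequence."""
    if len(sequence) != len(structure):
        return False
    return all(sequence[i] + sequence[j] in ALLOWED_PAIRS
               for i, j in pair_table(structure))

hairpin = "((((....))))"                       # hypothetical example structure
print(is_compatible("GGGGAAAACCCC", hairpin))  # only GC pairs
print(is_compatible("GGGGAAAAUCCC", hairpin))  # contains one GU pair
print(is_compatible("GGGGAAAACCCA", hairpin))  # G opposite A cannot pair
```

Compatibility is necessary but not sufficient: a compatible sequence may still fold into a different minimum free energy structure, which is why inverse folding needs a search.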
[Figure: search paths in sequence space; legend: target structure Sk, initial trial sequences, intermediate compatible sequences, target sequence, stop sequence of an unsuccessful trial.]
Approach to the target structure Sk in the inverse folding algorithm
[Figure: 1st to 5th trial of the inverse folding search.]
Theory of genotype – phenotype mapping
P.Schuster, W.Fontana, P.F.Stadler, I.L.Hofacker, From sequences to shapes and back: A case study in RNA secondary structures. Proc.Roy.Soc.London B 255 (1994), 279-284
W.Grüner, R.Giegerich, D.Strothmann, C.Reidys, I.L.Hofacker, P.Schuster, Analysis of RNA sequence structure maps by exhaustive enumeration. I. Neutral networks. Mh.Chem. 127 (1996), 355-374
W.Grüner, R.Giegerich, D.Strothmann, C.Reidys, I.L.Hofacker, P.Schuster, Analysis of RNA sequence structure maps by exhaustive enumeration. II. Structure of neutral networks and shape space covering. Mh.Chem. 127 (1996), 375-389
C.M.Reidys, P.F.Stadler, P.Schuster, Generic properties of combinatory maps. Bull.Math.Biol. 59 (1997), 339-397
I.L.Hofacker, P.Schuster, P.F.Stadler, Combinatorics of RNA secondary structures. Discr.Appl.Math. 89 (1998), 177-207
C.M.Reidys, P.F.Stadler, Combinatory landscapes. SIAM Review 44 (2002), 3-54
Mapping from sequence space into structure space and into function
$S_k = \psi(I_\cdot)$: sequence space → structure space
$f_k = f(S_k)$: structure space → real numbers
The pre-image of the structure Sk in sequence space is the neutral network Gk
Neutral networks are sets of sequences forming the same structure. Gk is the pre-image of the structure Sk in sequence space:
$G_k = \psi^{-1}(S_k) = \{ I_j \mid \psi(I_j) = S_k \}$
The set is converted into a graph by connecting all sequences of Hamming distance one.
Neutral networks of small RNA molecules can be computed by exhaustive folding of complete sequence spaces, i.e. all RNA sequences of a given chain length. The number of sequences, N = 4^n, grows exponentially with the chain length n and soon becomes prohibitive for numerical computations.
Neutral networks can be modelled by random graphs in sequence space. In this approach, nodes are inserted randomly into sequence space until the size of the pre-image, i.e. the number of neutral sequences, matches the neutral network to be studied.
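The construction of a neutral network as a graph can be sketched on a toy example. Real applications need an RNA folding routine; here `toy_fold` is a hypothetical stand-in (the two chain ends pair if they are complementary) that only serves to make the enumeration runnable. The sketch collects the pre-image of a structure by exhaustive enumeration, connects sequences of Hamming distance one, and counts connected components.

```python
from itertools import product

def toy_fold(seq):
    """Hypothetical stand-in for a folding routine: the ends pair if they
    are complementary (G with C); otherwise the chain stays open."""
    closed = {seq[0], seq[-1]} == {"G", "C"}
    return "(" + "." * (len(seq) - 2) + ")" if closed else "." * len(seq)

def neutral_network(structure, length, alphabet="GC"):
    """Pre-image of a structure: all sequences that fold into it."""
    return {"".join(s) for s in product(alphabet, repeat=length)
            if toy_fold("".join(s)) == structure}

def components(network, alphabet="GC"):
    """Connected components of the graph joining Hamming-distance-one pairs."""
    unseen, result = set(network), []
    while unseen:
        start = unseen.pop()
        comp, frontier = {start}, [start]
        while frontier:
            s = frontier.pop()
            for i in range(len(s)):
                for a in alphabet:
                    t = s[:i] + a + s[i + 1:]   # single point mutation
                    if t in unseen:
                        unseen.remove(t)
                        comp.add(t)
                        frontier.append(t)
        result.append(comp)
    return result

G_k = neutral_network("(...)", length=5)
parts = components(G_k)
print(len(G_k), sorted(len(c) for c in parts))
```

In this toy map the network splits into two components, because switching the pair from G...C to C...G requires two simultaneous point mutations at the chain ends.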
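$G_k = \psi^{-1}(S_k) = \{ I_j \mid \psi(I_j) = S_k \}$
Degree of neutrality of a sequence, for example: $\lambda_j = 12/27 = 0.444$
Mean degree of neutrality of the network: $\bar{\lambda}_k = \frac{\sum_{j \in G_k} \lambda_j^{(k)}}{|G_k|}$
Connectivity threshold: $\lambda_{cr} = 1 - \kappa^{-1/(\kappa - 1)}$
$\bar{\lambda}_k > \lambda_{cr}$: network $G_k$ is connected
$\bar{\lambda}_k < \lambda_{cr}$: network $G_k$ is not connected
Alphabet size $\kappa$: AUGC, $\kappa = 4$

κ | λcr | Alphabets
2 | 0.5 | GC, AU
3 | 0.423 | GUC, AUG
4 | 0.370 | AUGC

Mean degree of neutrality and connectivity of neutral networks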
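The tabulated threshold values follow from the formula λcr = 1 - κ^(-1/(κ-1)), where κ is the alphabet size. A short check, with `connectivity_threshold` as an illustrative function name:

```python
def connectivity_threshold(kappa):
    """lambda_cr = 1 - kappa ** (-1 / (kappa - 1)) for alphabet size kappa."""
    return 1.0 - kappa ** (-1.0 / (kappa - 1))

thresholds = {kappa: round(connectivity_threshold(kappa), 3)
              for kappa in (2, 3, 4)}
print(thresholds)
```

A neutral network whose mean degree of neutrality lies above the threshold for its alphabet is expected to be connected; below it, the network breaks up into components.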
A connected neutral network
Giant Component
A multi-component neutral network
Alphabet | Degree of neutrality (three cloverleaf structures)
AU | - | - | 0.073 ± 0.032
AUG | - | 0.217 ± 0.051 | 0.201 ± 0.056
AUGC | 0.275 ± 0.064 | 0.279 ± 0.063 | 0.313 ± 0.058
UGC | 0.263 ± 0.071 | 0.257 ± 0.070 | 0.250 ± 0.064
GC | 0.052 ± 0.033 | 0.057 ± 0.034 | 0.068 ± 0.034
Degree of neutrality of cloverleaf RNA secondary structures over different alphabets
Reference for postulation and in silico verification of neutral networks
[Figure: structure Sk with its neutral network Gk and its compatible set Ck; Gk ⊂ Ck.]
The compatible set Ck of a structure Sk consists of all sequences which form Sk as their minimum free energy structure (the neutral network Gk) or as one of their suboptimal structures.
[Figure: structure S0 and structure S1.]
The intersection of two compatible sets is always non-empty: C0 ∩ C1 ≠ ∅
Reference for the definition of the intersection and the proof of the intersection theorem
[Figure: one sequence, 5'-end to 3'-end, drawn in its minimum free energy conformation S0 and in a suboptimal conformation S1.]
A sequence at the intersection of two neutral networks is compatible with both structures
[Figure: barrier tree with local minima numbered 1 to 49 and energy barriers; basin '0' contains the minimum free energy structure S0, basin '1' the long living metastable structure S1.]
Barrier tree for two long living structures
Kinetics of RNA refolding between a long living metastable conformation and the minimum free energy structure
Acknowledgement of support
Fonds zur Förderung der wissenschaftlichen Forschung (FWF): Projects No. 09942, 10578, 11065, 13093, 13887, and 14898
Jubiläumsfonds der Österreichischen Nationalbank: Project No. Nat-7813
European Commission: Project No. EU-980189
Siemens AG, Austria
The Santa Fe Institute and the Universität Wien
The software for producing RNA movies was developed by Robert Giegerich and coworkers at the Universität Bielefeld
Coworkers
Walter Fontana, Santa Fe Institute, NM
Christian Reidys, Christian Forst, Los Alamos National Laboratory, NM
Peter Stadler, Bärbel Stadler, Universität Leipzig, GE
Ivo L.Hofacker, Christoph Flamm, Universität Wien, AT
Andreas Wernitznig, Michael Kospach, Universität Wien, AT
Ulrike Langhammer, Ulrike Mückstein, Stefanie Widder
Jan Cupal, Kurt Grünberger, Andreas Svrček-Seiler, Stefan Wuchty
Ulrike Göbel, Institut für Molekulare Biotechnologie, Jena, GE
Walter Grüner, Stefan Kopp, Jaqueline Weber
Web-Page for further information: http://www.tbi.univie.ac.at/~pks