Evolution of RNA Molecules
From Neutral Networks of Structures to Complex Interaction Patterns
Peter Schuster
Institut für Theoretische Chemie der Universität Wien, Austria and the Santa Fe Institute, NM
Collectives formation and specialization in biological and social systems Santa Fe, 20.– 22.04.2005
Web-Page for further information: http://www.tbi.univie.ac.at/~pks
1. Folding and inverse folding of RNA
2. Neutral networks
3. Darwinian evolution of RNA
4. Learning by the Darwinian mechanism
5. Folding kinetics and metastable structures
6. Intersections and conformational switches
1. Folding and inverse folding of RNA
RNA sequence → RNA structure of minimal free energy
RNA folding: Structural biology, spectroscopy of biomolecules, understanding molecular function
Empirical parameters
Biophysical chemistry: thermodynamics and kinetics
One sequence – one structure problem
[Figure: RNA sequence drawn from the 5'-end to the 3'-end]
[Figure: conformations S0(h) to S9(h) arranged by free energy G; the minimum of free energy and the suboptimal conformations are marked]
The minimum free energy structures on a discrete space of conformations
RNA sequence → RNA structure of minimal free energy
RNA folding: Structural biology, spectroscopy of biomolecules, understanding molecular function
Inverse folding algorithm: Iterative determination of a sequence for the given secondary structure
Sequence, structure, and design
Inverse folding of RNA: Biotechnology, design of biomolecules with predefined structures and functions
2. Neutral networks
Inverse folding of RNA secondary structures
[Figure: 1st to 5th trial sequences, each tested against the minimum free energy criterion]
The inverse folding algorithm searches for sequences that form a given RNA secondary structure under the minimum free energy criterion.
[Figure: hypercube of dimension n = 5 with decimal coding of binary sequences, arranged by mutant class 1 to 5]
Binary sequences are encoded by their decimal equivalents: C = 0 and G = 1, for example, "0" ≡ 00000 = CCCCC, "14" ≡ 01110 = CGGGC, "29" ≡ 11101 = GGGCG, etc.
Sequence space of binary sequences of chain length n = 5
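The decimal coding and the mutant classes of the n = 5 hypercube can be reproduced in a few lines. The following is a minimal sketch; the function names `decode`, `encode`, and `mutant_classes` are illustrative and do not come from the talk.

```python
TO_BASE = {"0": "C", "1": "G"}   # C = 0, G = 1, as on the slide
TO_BIT = {"C": "0", "G": "1"}

def decode(number, n=5):
    """Return the binary string and the C/G sequence for a decimal code."""
    bits = format(number, f"0{n}b")
    return bits, "".join(TO_BASE[b] for b in bits)

def encode(sequence):
    """Return the decimal code of a C/G sequence."""
    return int("".join(TO_BIT[base] for base in sequence), 2)

def mutant_classes(n=5, reference=0):
    """Group all 2**n sequences by Hamming distance from the reference."""
    classes = {k: [] for k in range(n + 1)}
    for number in range(2 ** n):
        # XOR marks the differing digits; counting them gives the distance.
        distance = bin(number ^ reference).count("1")
        classes[distance].append(number)
    return classes

classes = mutant_classes()
print(decode(14))                      # ('01110', 'CGGGC')
print(decode(29))                      # ('11101', 'GGGCG')
print({k: len(v) for k, v in classes.items()})
```

Mutant class k contains all sequences at Hamming distance k from the reference, so the class sizes are the binomial coefficients of n = 5.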
[Figure: two aligned sequences I1 and I2 that differ at four positions]
Hamming distance $d_H(I_1, I_2) = 4$
(i) $d_H(I_1, I_1) = 0$
(ii) $d_H(I_1, I_2) = d_H(I_2, I_1)$
(iii) $d_H(I_1, I_3) \le d_H(I_1, I_2) + d_H(I_2, I_3)$
The Hamming distance between sequences induces a metric in sequence space
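The three metric properties can be checked by brute force on a small sequence space. This is a sketch only: the helper names `hamming` and `is_metric` are illustrative, and the check runs over all 64 trinucleotides rather than over any sequence space used in the talk.

```python
from itertools import product

def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))

def is_metric(points, dist):
    """Check properties (i) to (iii) exhaustively on a finite set."""
    if any(dist(a, a) != 0 for a in points):
        return False                                   # (i)
    if any(dist(a, b) != dist(b, a)
           for a, b in product(points, repeat=2)):
        return False                                   # (ii) symmetry
    return all(dist(a, c) <= dist(a, b) + dist(b, c)   # (iii) triangle
               for a, b, c in product(points, repeat=3))

trinucleotides = ["".join(s) for s in product("ACGU", repeat=3)]
print(is_metric(trinucleotides, hamming))
# Two 15-mers that appear later in the talk as I1 and I2:
print(hamming("ACUGAUCGUAGUCAC", "AUUGAGCAUAUUCAC"))
```

The same function applies unchanged to structures written in parentheses notation, since they are strings of equal length over a three-letter alphabet.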
Mapping from sequence space into structure space and into function
Hamming distance $d_H(S_1, S_2) = 4$
(i) $d_H(S_1, S_1) = 0$
(ii) $d_H(S_1, S_2) = d_H(S_2, S_1)$
(iii) $d_H(S_1, S_3) \le d_H(S_1, S_2) + d_H(S_2, S_3)$
The Hamming distance between structures in parentheses notation forms a metric in structure space
The pre-image of the structure Sk in sequence space is the neutral network Gk
Properties of RNA sequence to secondary structure mapping
1. More sequences than structures
2. Few common versus many rare structures
3. Shape space covering of common structures
4. Neutral networks of common structures are connected
3. Darwinian evolution of RNA
[Figure: plus and minus strands with their 5' and 3' ends; the steps are labeled template synthesis and complex dissociation]
Copying of single-strand RNA-molecules: Plus-Minus-Replication
Variation of the RNA sequence through copying errors
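[Figure: replication of $I_j$ with (A) as building material yields $I_1, I_2, \ldots, I_i, \ldots, I_j, \ldots, I_n$ with rates $f_j Q_{j1}, f_j Q_{j2}, \ldots, f_j Q_{ji}, \ldots, f_j Q_{jj}, \ldots, f_j Q_{jn}$]
$Q_{ij} = (1-p)^{\,l-d(i,j)} \, p^{\,d(i,j)}$
p .......... Error rate per digit
d(i,j) .... Hamming distance between $I_i$ and $I_j$
l ........... Chain length of the polynucleotide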
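A short numerical sketch of the mutation matrix, assuming binary sequences (so that each digit has exactly one alternative and p is the probability of flipping it). The function name `mutation_matrix` is illustrative. For a four-letter alphabet the formula as written would need an extra factor distributing p over the three alternatives, which is not covered here.

```python
from itertools import product

def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def mutation_matrix(sequences, p):
    """Q[i][j] = (1 - p)**(l - d(i,j)) * p**d(i,j) for chain length l."""
    l = len(sequences[0])
    matrix = []
    for a in sequences:
        row = []
        for b in sequences:
            d = hamming(a, b)
            row.append((1 - p) ** (l - d) * p ** d)
        matrix.append(row)
    return matrix

# All 32 binary sequences of chain length 5 (assumed example size).
sequences = ["".join(bits) for bits in product("01", repeat=5)]
Q = mutation_matrix(sequences, p=0.001)

# Every row sums to 1: each copy yields exactly one of the 32 sequences.
print(max(abs(sum(row) - 1.0) for row in Q))
```

Because the Hamming distance is symmetric, Q is symmetric too, so rows and columns both sum to one.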
$\frac{dx_i}{dt} = \sum_j f_j Q_{ji} x_j - x_i \Phi$ ; $\Phi = \sum_j f_j x_j$ ; $\sum_j x_j = 1$ ; $\sum_i Q_{ij} = 1$
$[I_i] = x_i \ge 0$ ; $i = 1, 2, \ldots, n$ ; $[A] = a = \text{constant}$
Chemical kinetics of replication and mutation as parallel reactions
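The replication-mutation kinetics can be integrated numerically to see a master sequence and its mutant cloud emerge. The sketch below uses a plain Euler scheme on binary sequences of length 4; the fitness values (10 for the master, 1 for all others), the error rate p = 0.01, the step size, and all function names are assumptions chosen for illustration, not values from the talk.

```python
from itertools import product

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def mutation_matrix(sequences, p):
    """Q[j][i]: probability that copying sequence j yields sequence i."""
    l = len(sequences[0])
    return [[(1 - p) ** (l - hamming(a, b)) * p ** hamming(a, b)
             for b in sequences] for a in sequences]

def integrate(f, Q, x0, dt=0.01, steps=2000):
    """Euler integration of dx_i/dt = sum_j f_j Q_ji x_j - x_i * Phi."""
    x = list(x0)
    n = len(x)
    for _ in range(steps):
        phi = sum(f[j] * x[j] for j in range(n))      # mean replication rate
        dx = [sum(f[j] * Q[j][i] * x[j] for j in range(n)) - x[i] * phi
              for i in range(n)]
        x = [x[i] + dt * dx[i] for i in range(n)]
        total = sum(x)                                 # guard against drift
        x = [xi / total for xi in x]
    return x

sequences = ["".join(bits) for bits in product("01", repeat=4)]
fitness = [10.0] + [1.0] * (len(sequences) - 1)        # sequence 0000 is the master
Q = mutation_matrix(sequences, p=0.01)
x = integrate(fitness, Q, [1.0 / len(sequences)] * len(sequences))

for seq, conc in sorted(zip(sequences, x), key=lambda t: -t[1])[:5]:
    print(seq, round(conc, 4))
```

The stationary distribution is dominated by the master sequence, followed by its one-error mutants; this distribution is what the talk calls the molecular quasispecies.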
Replication rate constant: $f_k = \gamma / [\alpha + \Delta d_S^{(k)}]$, with $\Delta d_S^{(k)} = d_H(S_k, S_\tau)$
Selection constraint: Population size, N = # RNA molecules, is controlled by the flow: $N(t) \approx \bar{N} \pm \sqrt{\bar{N}}$
Mutation rate: p = 0.001 per site and replication
The flow reactor as a device for studies of evolution in vitro and in silico
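A sketch of the rate law, assuming it has the form f_k = γ / (α + Δd_S) where Δd_S is the Hamming distance between a structure and the target in parentheses notation. The constants `gamma` and `alpha` are placeholders set to 1 here, since the talk does not give their values, and the function names are illustrative.

```python
def structure_distance(structure, target):
    """Hamming distance between two structures in parentheses notation."""
    if len(structure) != len(target):
        raise ValueError("structures must have equal length")
    return sum(a != b for a, b in zip(structure, target))

def replication_rate(structure, target, gamma=1.0, alpha=1.0):
    """Assumed rate law: f_k = gamma / (alpha + distance to target)."""
    return gamma / (alpha + structure_distance(structure, target))

target = "..((((....))))."        # small hairpin, 15 nucleotides
open_chain = "." * 15

print(structure_distance(open_chain, target))
print(replication_rate(open_chain, target))
print(replication_rate(target, target))
```

With this form the rate is largest when the structure equals the target and decreases monotonically as the structure distance grows, which is what drives the population toward the target in the flow reactor.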
[Figure: randomly chosen initial structure and phenylalanyl-tRNA as target structure, with 5'-end, 3'-end, and nucleotide positions 10 to 70 marked]
[Figure: concentration over sequence space, showing the master sequence and its mutant cloud]
The molecular quasispecies in sequence space
In silico optimization in the flow reactor: Evolutionary trajectory
[Figure: average structure distance to target dS (10 to 50) versus time (250 to 1250, arbitrary units)]
[Figure: enlarged section of the evolutionary trajectory, average structure distance to target dS versus time (arbitrary units), with the uninterrupted presence of structures and the number of the relay step marked]
28 neutral point mutations during a long quasi-stationary epoch
[Figure legend: transition-inducing point mutations; neutral point mutations]
Neutral genotype evolution during phenotypic stasis
4. Learning by the Darwinian mechanism
Element in example 1: The RNA molecule
[Figure: concentration over sequence space, showing the master sequence and its mutant cloud]
The molecular quasispecies in sequence space
Evolutionary trajectory
Spreading of the population through diffusion on a neutral network
Drift of the population center in sequence space
Spread of population in sequence space during a quasistationary epoch: snapshots at t = 150, 170, 200, 350, 500, 650, 820, 825, 830, 835, 840, 845, 850, 855
Element in example 2: The ant worker
Foraging behavior of ant colonies
[Figure sequence: ant colony and food source in four stages: random foraging, food source detected, pheromone trail laid down, pheromone controlled trail]
                            Evolution of RNA             Foraging ants
Element                     RNA nucleotide               Individual worker ant
Genotype                    RNA sequence                 Worker ant collective
Phenotype                   RNA structure                Foraging path
Learning entity             Population of molecules      Ant colony
Relation between elements   Mutation                     Reorientation of path segment
Search process              Optimization of structure    Optimization of path
Search space                Sequence space               Three-dimensional space
Random step                 Mutation                     Segment of ant walk
Self-enhancing process      Replication                  Secretion of pheromone
Measure of activity         Mean replication rate        Mean pheromone concentration
Goal of the search          Target structure             Richest food source
Temporary memory            Sequence distribution        Pheromone trail
Learning at population or colony level by trial and error
Two examples: (i) RNA model and (ii) ant colony
5. Folding kinetics and metastable structures
RNA secondary structures derived from a single sequence
Kinetic Folding of RNA Secondary Structures
Christoph Flamm, Walter Fontana, Ivo L. Hofacker, Peter Schuster. RNA folding kinetics at elementary step resolution. RNA 6:325-338, 2000
Christoph Flamm, Ivo L. Hofacker, Sebastian Maurer-Stroh, Peter F. Stadler, Martin Zehl. Design of multistable RNA molecules. RNA 7:325-338, 2001
Christoph Flamm, Ivo L. Hofacker, Peter F. Stadler, Michael T. Wolfinger. Barrier trees of degenerate landscapes. Z.Phys.Chem. 216:155-173, 2002
Michael T. Wolfinger, W. Andreas Svrcek-Seiler, Christoph Flamm, Ivo L. Hofacker, Peter F. Stadler. Efficient computation of RNA folding dynamics. J.Phys.A: Math.Gen. 37:4731-4741, 2004
Mean folding curves for three small RNA molecules with different folding behavior
I1 = ACUGAUCGUAGUCAC
I2 = AUUGAGCAUAUUCAC
I3 = CGGGCUAUUUAGCUG
S0 = ••((((••••))))•
[Figure: free energy G of the suboptimal conformations S1(h), S2(h), S5(h), S6(h), S7(h), S9(h) around the local minimum Sh]
Search for local minima in conformation space
[Figure: free energy G along a "reaction coordinate" between two local minima Sk and Sℓ separated by the saddle point Tℓk; in the corresponding "barrier tree" the branches of Sk and Sℓ join at the free energy of Tℓk]
Definition of a "barrier tree"
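The construction of a barrier tree can be sketched as a flooding procedure: conformations are visited in order of increasing free energy, and two basins are joined at the first conformation that touches both, which is the saddle point between them. The toy landscape below (seven conformations on a line with made-up energies) and the function name `barrier_tree` are assumptions for illustration; real RNA landscapes are much larger, and conformations of equal energy need extra care that this sketch omits.

```python
def barrier_tree(energy, neighbors):
    """Return local minima and merges (minimum, absorbed_by, saddle_energy)."""
    parent = {}                      # union-find; the root is the basin's deepest minimum

    def find(state):
        while parent[state] != state:
            parent[state] = parent[parent[state]]   # path halving
            state = parent[state]
        return state

    minima, merges = [], []
    for state in sorted(energy, key=energy.get):    # flood from the bottom up
        basins = {find(nb) for nb in neighbors[state] if nb in parent}
        parent[state] = state
        if not basins:
            minima.append(state)                    # no lower neighbor: local minimum
            continue
        ordered = sorted(basins, key=energy.get)
        deepest = ordered[0]
        parent[state] = deepest
        for other in ordered[1:]:                   # state is a saddle joining basins
            merges.append((other, deepest, energy[state]))
            parent[other] = deepest
    return minima, merges

# Toy landscape: conformations 0..6 on a line, energies in arbitrary units.
energy = {0: -3.0, 1: -1.0, 2: -2.0, 3: 0.5, 4: -5.0, 5: -0.5, 6: -1.5}
neighbors = {i: [j for j in (i - 1, i + 1) if j in energy] for i in energy}

minima, merges = barrier_tree(energy, neighbors)
for minimum, absorbed_by, saddle in merges:
    print(f"minimum {minimum} joins basin of {absorbed_by} at E = {saddle}, "
          f"barrier = {saddle - energy[minimum]}")
```

Each merge corresponds to one internal node of the tree: its height is the saddle energy, and the barrier seen from the shallower minimum is the saddle energy minus that minimum's own energy.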
I1 = ACUGAUCGUAGUCAC
[Figure: conformations S0, S1, S2, S3 and the open chain O]
Example of an inefficiently folding small RNA molecule with n = 15
I2 = AUUGAGCAUAUUCAC
[Figure: conformations S0, S1, S2, S3, S4 and the open chain O]
Example of an easily folding small RNA molecule with n = 15
I3 = CGGGCUAUUUAGCUG
[Figure: conformations S0, S1, S2, S3 and the open chain O]
Example of an easily folding and especially stable small RNA molecule with n = 15
O = open chain
A nucleic acid molecule folding in two dominant conformations
Folding dynamics of the sequence GGCCCCUUUGGGGGCCAGACCCCUAAAAAGGGUC
6. Intersections and conformational switches
[Figure: structure Sk with its neutral network Gk inside its compatible set Ck, Gk ⊂ Ck]
The compatible set Ck of a structure Sk consists of all sequences that form Sk either as their minimum free energy structure (these sequences make up the neutral network Gk) or as one of their suboptimal structures.
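Membership in a compatible set can be tested without any energy calculation: every base pair required by the structure must be one of the allowed pairs. The sketch below assumes the usual pairing rules for RNA secondary structures (AU, GC, and GU); the function names are illustrative. Compatibility alone does not tell whether the structure is the minimum free energy structure of the sequence or merely a suboptimal one.

```python
ALLOWED_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"),
                 ("C", "G"), ("G", "U"), ("U", "G")}

def base_pairs(structure):
    """List the (i, j) index pairs of a structure in parentheses notation."""
    stack, pairs = [], []
    for index, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(index)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs.append((stack.pop(), index))
    if stack:
        raise ValueError("unbalanced structure")
    return pairs

def is_compatible(sequence, structure):
    """True if every base pair of the structure can be formed by the sequence."""
    if len(sequence) != len(structure):
        return False
    return all((sequence[i], sequence[j]) in ALLOWED_PAIRS
               for i, j in base_pairs(structure))

S0 = "..((((....))))."
for name, seq in [("I1", "ACUGAUCGUAGUCAC"),
                  ("I2", "AUUGAGCAUAUUCAC"),
                  ("I3", "CGGGCUAUUUAGCUG")]:
    print(name, is_compatible(seq, S0))
```

All three example sequences from the folding kinetics part of the talk pass this test for S0, while a homopolymer such as fifteen A's does not.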
[Figure: structure S0 and structure S1]
The intersection of two compatible sets is always non-empty: $C_0 \cap C_1 \neq \emptyset$
The barrier tree connecting S1 and S0
A ribozyme switch
E.A. Schultes, D.P. Bartel, Science 289 (2000), 448-452
Two ribozymes of chain length n = 88 nucleotides: An artificial ligase (A) and a natural cleavage ribozyme of hepatitis-δ-virus (B)
The sequence at the intersection: An RNA molecule that is 88 nucleotides long and can form both structures
Two neutral walks through sequence space with conservation of structure and catalytic activity
J. H. A. Nagel, C. Flamm, I. L. Hofacker, K. Franke, M. H. de Smit, P. Schuster, and C. W. A. Pleij. Structural parameters affecting the kinetic competition of RNA hairpin formation, in press 2005.
J. H. A. Nagel, J. Møller-Jensen, C. Flamm, K. J. Öistämö, J. Besnard, I. L. Hofacker, A. P. Gultyaev, M. H. de Smit, P. Schuster, K. Gerdes and C. W. A. Pleij. The refolding mechanism of the metastable structure in the 5'-end of the hok mRNA of plasmid R1, submitted 2005.
J.H.A. Nagel, C. Flamm, I.L. Hofacker, K. Franke, M.H. de Smit, P. Schuster, and C.W.A. Pleij. Structural parameters affecting the kinetic competition of RNA hairpin formation, in press 2004.
JN2C
[Figure: two alternative conformations of JN2C drawn from 5' to 3', with free energies of -19.5 kcal·mol⁻¹ and -21.9 kcal·mol⁻¹; segments labeled A, B, C; positions 3, 15, 24, 36 marked]
JN1LH
[Figure: alternative conformations of JN1LH drawn from 5' to 3', with free energies of -28.6, -31.8, -28.2, and -28.6 kcal·mol⁻¹; segments labeled 1D, 2D, R; positions 3, 13, 23, 33, 44 marked]
[Figure: barrier tree with local minima numbered 1 to 50; vertical axis: free energy from -50.0 to -26.0 kcal / mole; barrier heights annotated on the branches]
J1LH barrier tree
Conclusions
I. The Darwinian mechanism of optimization through variation and selection operates equally well on simple and complex reproducing elements because only the number of fertile offspring counts.
II. Darwinian learning through trial and error takes place on the level of populations. It does not require sophisticated elements and occurs even with self-replicating molecules.
III. Even simple molecules have the capacity for a rich repertoire of properties and interactions. For example, they can have multiple structures and functions.
Acknowledgement of support
Fonds zur Förderung der wissenschaftlichen Forschung (FWF): Projects No. 09942, 10578, 11065, 13093, 13887, and 14898
Jubiläumsfonds der Österreichischen Nationalbank: Project No. Nat-7813
European Commission: Project No. EU-980189
Austrian Genome Research Program – GEN-AU
Siemens AG, Austria
Universität Wien and the Santa Fe Institute
Coworkers
Walter Fontana, Harvard Medical School, MA
Christian Forst, Christian Reidys, Los Alamos National Laboratory, NM
Peter Stadler, Bärbel Stadler, Universität Leipzig, GE
Jord Nagel, Kees Pleij, Universiteit Leiden, NL
Ivo L. Hofacker, Christoph Flamm, Andreas Svrček-Seiler, Universität Wien, AT
Stefan Bernhart, Ulrike Langhammer, Ulrike Mückstein, Universität Wien, AT
Ulrike Göbel, Walter Grüner, Stefan Kopp, Jaqueline Weber, Institut für Molekulare Biotechnologie, Jena, GE
Andreas Wernitznig, Michael Kospach, Kurt Grünberger, Stefan Wuchty
Universität Wien