What we can Learn in Evolution from RNA Molecules
Peter Schuster
Institut für Theoretische Chemie und Molekulare Strukturbiologie, Universität Wien, Austria, and The Santa Fe Institute, Santa Fe, New Mexico, USA
Lab Inauguration Meeting Köln, 03.12.2004
Web-Page for further information: http://www.tbi.univie.ac.at/~pks
1. RNA and properties and function
2. RNA structures
3. Neutral networks and intersections
4. RNA evolution in silico
5. Intersection molecules and RNA switches
6. Neutrality in evolution and design

1. RNA and properties and function
Functions of RNA molecules
RNA as scaffold for supramolecular complexes: the ribosome
RNA as transmitter of genetic information: DNA → transcription → messenger-RNA (...AGAGCGCCAGACUGAAGAUCUGGAGGUCCUGUGUUC...) → translation → protein
RNA as working copy of genetic information
RNA is modified by epigenetic control: RNA editing, alternative splicing of messenger RNA
RNA is the catalytic subunit in supramolecular complexes
RNA as regulator of gene expression: gene silencing by small interfering RNAs
Allosteric control of transcribed RNA: riboswitches controlling transcription and translation through metabolites
The RNA world as a precursor of the current DNA + RNA + protein biology
RNA as catalyst: ribozyme
RNA as adapter molecule: genetic code (GAC..., CUG..., leu)
RNA as carrier of genetic information: RNA viruses and retroviruses
RNA evolution in vitro
Evolutionary biotechnology: RNA aptamers, artificial ribozymes, allosteric ribozymes
Examples of ‘natural selection’ with RNA molecules
An example of ‘artificial selection’ with RNA molecules or ‘breeding’ of biomolecules
2. RNA structures
[Figure: an RNA chain drawn from the 5'-end to the 3'-end with nucleotides N1 to N4 ($N_k$ = A, U, G, C), the ribose-phosphate backbone and Na counterions; sequence and cloverleaf secondary structure of a tRNA, positions numbered 10 to 70]
Definition of RNA structure
Definition and physical relevance of RNA secondary structures
RNA secondary structures are listings of Watson-Crick and GU wobble base pairs that are free of knots and pseudoknots.
"Secondary structures are folding intermediates in the formation of full three-dimensional structures." D. Thirumalai, N. Lee, S.A. Woodson, and D.K. Klimov, Annu. Rev. Phys. Chem. 52:751-762 (2001).
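The definition above can be made concrete with a small sketch. It assumes the structure is written in parentheses (dot-bracket) notation, where "(" and ")" mark the two partners of a base pair and "." marks an unpaired position; the function name `parse_dot_bracket` is hypothetical. Properly nested brackets cannot express knots or pseudoknots, so a string that parses is a valid secondary structure in the sense used here.

```python
def parse_dot_bracket(structure):
    """Return the sorted list of base pairs (i, j), 0-based, of a dot-bracket string."""
    stack, pairs = [], []
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)               # remember the opening partner
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.append((stack.pop(), pos))  # close the innermost open pair
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


pairs = parse_dot_bracket("((..))")
print(pairs)
```

The stack guarantees that pairs are nested, which is exactly the "free of knots and pseudoknots" condition.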
The Vienna RNA-Package: A library of routines for folding, inverse folding, sequence and structure alignment, cofolding, kinetic folding, …
RNA sequence → RNA structure of minimal free energy
RNA folding: structural biology, spectroscopy of biomolecules, understanding molecular function
Empirical parameters. Biophysical chemistry: thermodynamics and kinetics
Inverse folding of RNA: biotechnology, design of biomolecules with predefined structures and functions
Sequence, structure, and design
[Figure: an example sequence over G, U, A, C folded into its secondary structure, 5'-end to 3'-end]
[Figure: free energy of the conformations S0(h) to S9(h) of one sequence, marking the minimum of free energy and the suboptimal conformations]
The minimum free energy structures on a discrete space of conformations
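How strongly the minimum free energy structure dominates over the suboptimal conformations depends on the energy gaps. A minimal sketch, assuming thermodynamic equilibrium and standard Boltzmann weighting (an illustration, not something taken from the slides); the energies in the example are hypothetical values in kcal/mol.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def boltzmann_populations(energies, temperature_k=310.15):
    """Equilibrium fractions of conformations with the given free energies (kcal/mol)."""
    rt = R_KCAL * temperature_k
    e_min = min(energies)  # shift by the minimum for numerical stability
    weights = [math.exp(-(e - e_min) / rt) for e in energies]
    partition_sum = sum(weights)
    return [w / partition_sum for w in weights]


# Hypothetical free energies of S0, S1, S2 in kcal/mol.
populations = boltzmann_populations([-10.0, -9.0, -8.5])
print(populations)
```

The conformation with the lowest free energy always receives the largest fraction; a gap of a few kcal/mol already makes it dominant at 37 °C.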
UUUAGCCAGCGCGAGUCGUGCGGACGGGGUUAUCUCUGUCGGGCUAGGGCGC GUGAGCGCGGGGCACAGUUUCUCAAGGAUGUAAGUUUUUGCCGUUUAUCUGG UUAGCGAGAGAGGAGGCUUCUAGACCCAGCUCUCUGGGUCGUUGCUGAUGCG CAUUGGUGCUAAUGAUAUUAGGGCUGUAUUCCUGUAUAGCGAUCAGUGUCCG GUAGGCCCUCUUGACAUAAGAUUUUUCCAAUGGUGGGAGAUGGCCAUUGCAG
Criterion of Minimum Free Energy
Sequence Space Shape Space
Reference for postulation and in silico verification of neutral networks
3. Neutral networks and intersections
$S_k = \psi(I_\cdot)$, $f_k = f(S_k)$
Mapping from sequence space into structure space and into function: sequence space → structure space → real numbers
The pre-image of the structure Sk in sequence space is the neutral network Gk
[Figure: two aligned sequences $I_1$ and $I_2$ that share the segments CGTCGTTACAATTTA, GTTATGTGCGAATTC, CAAATT, AAAA, ACAAGAG..... and differ at the four positions in between (bases G, A, G, T and A, C, A, C)]
Hamming distance: $d_H(I_1, I_2) = 4$
(i) $d_H(I_1, I_1) = 0$
(ii) $d_H(I_1, I_2) = d_H(I_2, I_1)$
(iii) $d_H(I_1, I_3) \le d_H(I_1, I_2) + d_H(I_2, I_3)$
The Hamming distance between genotypes induces a metric in sequence space
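A minimal sketch of the Hamming distance for sequences of equal length, with a check of the three metric properties on example sequences. The function name and the example sequences are chosen here for illustration.

```python
def hamming_distance(a, b):
    """Number of positions at which two sequences of equal length differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires sequences of equal length")
    return sum(x != y for x, y in zip(a, b))


i1, i2, i3 = "GUCAAUCAG", "GUCAAUCAC", "AUCAAUCAC"

print(hamming_distance(i1, i1))  # identity: distance to itself
print(hamming_distance(i1, i2), hamming_distance(i2, i1))  # symmetry
print(hamming_distance(i1, i3), hamming_distance(i1, i2) + hamming_distance(i2, i3))  # triangle inequality
```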
Neutral networks are sets of sequences forming the same object in a phenotype space. The neutral network Gk is, for example, the pre-image of the structure Sk in sequence space:
$G_k = \psi^{-1}(S_k) \doteq \{ I_j \mid \psi(I_j) = S_k \}$
The set is converted into a graph by connecting all sequences of Hamming distance one.
Neutral networks of small biomolecules can be computed by exhaustive folding of complete sequence spaces, i.e. all RNA sequences of a given chain length. The number of sequences, $N = 4^n$, becomes very large with increasing chain length n, which makes exhaustive computation prohibitive for longer chains.
Neutral networks can be modelled by random graphs in sequence space. In this approach, nodes are inserted randomly into sequence space until the size of the pre-image, i.e. the number of neutral sequences, matches the neutral network to be studied.
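The step from a set of neutral sequences to a graph can be sketched as follows. The example set `toy_network` is hypothetical and was not obtained by folding; in a real study the set would be the pre-image of a structure under a folding routine. Sequences are assumed to have equal length.

```python
from itertools import combinations


def hamming_distance(a, b):
    """Number of differing positions (sequences assumed to have equal length)."""
    return sum(x != y for x, y in zip(a, b))


def neutral_graph(sequences):
    """Adjacency sets: an edge joins two sequences of Hamming distance one."""
    graph = {s: set() for s in sequences}
    for a, b in combinations(sequences, 2):
        if hamming_distance(a, b) == 1:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def connected_components(graph):
    """Split the graph into its connected components by depth-first search."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        components.append(component)
    return components


# Hypothetical set of sequences assumed to fold into the same structure.
toy_network = ["GGCC", "GGCU", "GACU", "UUAA"]
components = connected_components(neutral_graph(toy_network))
print(sorted(len(c) for c in components))
```

The pairwise comparison is quadratic in the number of sequences, which is acceptable for small pre-images; for large ones, enumerating the one-error neighbors of each sequence scales better.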
One-error neighbors of the reference sequence GUCAAUCAG:
AUCAAUCAG UUCAAUCAG CUCAAUCAG
GCCAAUCAG GGCAAUCAG GACAAUCAG
GUGAAUCAG GUUAAUCAG GUAAAUCAG
GUCCAUCAG GUCGAUCAG GUCUAUCAG
GUCACUCAG GUCAGUCAG GUCAUUCAG
GUCAACCAG GUCAAGCAG GUCAAACAG
GUCAAUGAG GUCAAUUAG GUCAAUAAG
GUCAAUCCG GUCAAUCGG GUCAAUCUG
GUCAAUCAC GUCAAUCAU GUCAAUCAA
One-error neighborhood
The surrounding of GUCAAUCAG in sequence space
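The one-error neighborhood can be enumerated directly. A short sketch, with a hypothetical function name, that produces all sequences at Hamming distance one from GUCAAUCAG over the AUGC alphabet; each of the n = 9 positions can be replaced by 3 other bases.

```python
def one_error_neighbors(sequence, alphabet="AUGC"):
    """All sequences that differ from `sequence` at exactly one position."""
    neighbors = []
    for pos, base in enumerate(sequence):
        for other in alphabet:
            if other != base:
                neighbors.append(sequence[:pos] + other + sequence[pos + 1:])
    return neighbors


neighbors = one_error_neighbors("GUCAAUCAG")
print(len(neighbors))  # (alphabet size - 1) * chain length
```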
Degree of neutrality of neutral networks and the connectivity threshold
Example: n = 9; 3n = 27 one-error neighbors, of which 12 are neutral: $\lambda_j = 12/27 = 0.444$
Mean degree of neutrality of the network: $\bar{\lambda}_k = \frac{\sum_{j \in G_k} \lambda_j^{(k)}}{|G_k|}$, with $G_k = \psi^{-1}(S_k) \doteq \{ I_j \mid \psi(I_j) = S_k \}$
Connectivity threshold: $\lambda_{cr} = 1 - \kappa^{-1/(\kappa - 1)}$
$\bar{\lambda}_k > \lambda_{cr}$: network $G_k$ is connected
$\bar{\lambda}_k < \lambda_{cr}$: network $G_k$ is not connected

Alphabet size κ    λcr      Alphabets
2                  0.5      AU, GC, DU
3                  0.423    AUG, UGC
4                  0.370    AUGC
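The threshold values for alphabet sizes 2, 3, and 4 follow from the formula $\lambda_{cr} = 1 - \kappa^{-1/(\kappa-1)}$. A short sketch that evaluates it and compares a degree of neutrality against it; the function names are hypothetical, and the value 12/27 from the example is used only as an illustrative stand-in for a network mean.

```python
def connectivity_threshold(kappa):
    """Critical degree of neutrality for an alphabet of size kappa."""
    return 1.0 - kappa ** (-1.0 / (kappa - 1))


def predicted_connected(mean_neutrality, kappa):
    """True if the mean degree of neutrality lies above the threshold."""
    return mean_neutrality > connectivity_threshold(kappa)


for kappa in (2, 3, 4):
    print(kappa, round(connectivity_threshold(kappa), 3))

print(predicted_connected(12 / 27, kappa=4))
```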
Giant Component
A multi-component neutral network formed by a rare structure
A connected neutral network formed by a common structure
[Figure: the neutral network Gk of the structure Sk lies inside the compatible set Ck: $G_k \subset C_k$]
The compatible set Ck of a structure Sk consists of all sequences that form Sk either as their minimum free energy structure (these sequences make up the neutral network Gk) or as one of their suboptimal structures.
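Membership in the compatible set can be tested without any energy calculation: a sequence is compatible with a structure if every base pair of the structure can be formed by an allowed pair. A minimal sketch, assuming dot-bracket notation and the six pairs AU, UA, GC, CG, GU, UG; the function name is hypothetical.

```python
ALLOWED_PAIRS = {"AU", "UA", "GC", "CG", "GU", "UG"}


def is_compatible(sequence, structure):
    """True if every base pair in the dot-bracket structure is an allowed pair."""
    if len(sequence) != len(structure):
        return False
    stack = []
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            opener = stack.pop()
            if sequence[opener] + sequence[pos] not in ALLOWED_PAIRS:
                return False
    if stack:
        raise ValueError("unbalanced structure")
    return True


print(is_compatible("GGGAAAUCC", "(((...)))"))  # G-U, G-C, G-C pairs
print(is_compatible("GGGAAAACC", "(((...)))"))  # contains a G-A mismatch
```

Compatibility is necessary but not sufficient for a sequence to lie on the neutral network: the structure must also be the minimum free energy structure of that sequence.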
[Figure: free energy of the suboptimal conformations S1(h), S2(h), S5(h), S6(h), S7(h), S9(h) around a local minimum Sh]
Search for local minima in conformation space
[Figure: barrier tree of the suboptimal structures S0 to S10 of one RNA sequence, with kinetic folding between the local minima]
Suboptimal secondary structures of an RNA sequence
[Figure: the same barrier tree with a stable structure, a metastable structure, and the remaining suboptimal structures marked]
An RNA molecule with two (meta)stable conformations
[Figure: a secondary structure shown with several compatible sequences, 5'-end to 3'-end]
Single nucleotides: A, U, G, C. Single bases are varied independently.
Base pairs: AU, UA, GC, CG, GU, UG. Base pairs are varied in strict correlation.
Structure S0 and structure S1
The intersection of two compatible sets is always non-empty: $C_0 \cap C_1 \neq \emptyset$
Reference for the definition of the intersection and the proof of the intersection theorem
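One way to see why the intersection is never empty is constructive. The base pairs of two structures together form a graph in which every position has at most two partners, so the graph consists of paths and cycles whose edges alternate between the two structures; such cycles have even length, and the graph can be colored with two bases that pair with each other. The sketch below is an illustration under these assumptions, not necessarily the argument of the cited reference: it colors paired positions with G and C and fills unpaired positions with A. The function names are hypothetical.

```python
def pair_table(structure):
    """Map each paired position of a dot-bracket string to its partner."""
    table, stack = {}, []
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            opener = stack.pop()
            table[opener] = pos
            table[pos] = opener
    return table


def intersection_sequence(structure_a, structure_b):
    """Build one sequence that is compatible with both structures."""
    if len(structure_a) != len(structure_b):
        raise ValueError("structures must have equal length")
    partners = [pair_table(structure_a), pair_table(structure_b)]
    sequence = [None] * len(structure_a)
    for start in range(len(sequence)):
        if sequence[start] is not None:
            continue
        if start not in partners[0] and start not in partners[1]:
            sequence[start] = "A"  # unpaired in both structures
            continue
        sequence[start] = "G"
        stack = [start]
        while stack:  # two-color the component that contains `start`
            pos = stack.pop()
            other = "C" if sequence[pos] == "G" else "G"
            for table in partners:
                mate = table.get(pos)
                if mate is not None and sequence[mate] is None:
                    sequence[mate] = other
                    stack.append(mate)
    return "".join(sequence)


s0 = "(((...)))..."
s1 = ".(((...))).."
print(intersection_sequence(s0, s1))
```

The constructed sequence uses only GC pairs; sequences in the intersection that also use AU or GU pairs can be obtained by the same coloring with other pairing bases.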
[Figure: barrier tree with kinetic folding between S0 and S1 among the suboptimal structures S0 to S10 (labels: lim t, finite folding time)]
A typical energy landscape of a sequence with two (meta)stable conformations
4. RNA evolution in silico
Computer simulation of RNA optimization
Walter Fontana and Peter Schuster, Biophysical Chemistry 26:123-147, 1987
Walter Fontana, Wolfgang Schnabl, and Peter Schuster, Phys.Rev.A 40:3301-3321, 1989
Evolution in silico
W. Fontana, P. Schuster, Science 280 (1998), 1451-1455
[Figure: flow reactor with stock solution and reaction mixture]
Replication rate constant: $f_k = \gamma / [\alpha + \Delta d_S^{(k)}]$, with $\Delta d_S^{(k)} = d_H(S_k, S_\tau)$
Selection constraint: population size, N = number of RNA molecules, is controlled by the flow: $N(t) \approx \bar{N} \pm \sqrt{\bar{N}}$
Mutation rate: p = 0.001 per site and replication
The flow reactor as a device for studies of evolution in vitro and in silico
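The effect of the mutation rate p = 0.001 per site and replication can be estimated with a short calculation. The sketch assumes independent errors at each site and a chain length of 76 nucleotides; the chain length is an assumption made here for illustration and is not stated on the slide.

```python
def error_free_probability(p, chain_length):
    """Probability that a copy contains no mutation, assuming independent sites."""
    return (1.0 - p) ** chain_length


def expected_mutations(p, chain_length):
    """Mean number of point mutations per replication event."""
    return p * chain_length


p, n = 0.001, 76  # n = 76 is an assumed chain length
print(error_free_probability(p, n))
print(expected_mutations(p, n))
```

Under these assumptions most copies are exact, and a mutant appears roughly once in every dozen or so replications.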
[Figure: RNA secondary structures evaluated to replication rate constants f0 to f7]
Evaluation of RNA secondary structures yields replication rate constants
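The evaluation step can be sketched as a function from a structure to a rate constant. The sketch assumes structures in parentheses notation, the Hamming distance between the two strings as structure distance, and a rate of the form constant / (constant + distance to target). The parameter names `gamma` and `alpha` and their default values are hypothetical, as are the example structures.

```python
def structure_distance(structure_a, structure_b):
    """Hamming distance between two structures in parentheses notation."""
    if len(structure_a) != len(structure_b):
        raise ValueError("structures must have equal length")
    return sum(a != b for a, b in zip(structure_a, structure_b))


def replication_rate(structure, target, gamma=1.0, alpha=1.0):
    """Rate constant that grows as the structure approaches the target.

    gamma and alpha are illustrative constants, not values from the slides.
    """
    return gamma / (alpha + structure_distance(structure, target))


target = "((((....))))"
candidate = "(((......)))"
print(structure_distance(candidate, target))
print(replication_rate(candidate, target))
print(replication_rate(target, target))
```

With these choices the target structure itself has the highest rate, gamma / alpha, and every additional mismatch lowers the rate.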
Hamming distance between two structures: $d_H(S_1, S_2) = 4$
(i) $d_H(S_1, S_1) = 0$
(ii) $d_H(S_1, S_2) = d_H(S_2, S_1)$
(iii) $d_H(S_1, S_3) \le d_H(S_1, S_2) + d_H(S_2, S_3)$
The Hamming distance between structures in parentheses notation forms a metric in structure space
[Figure: a randomly chosen initial structure and phenylalanyl-tRNA as target structure, 5'-end to 3'-end, positions numbered 10 to 70]
[Figure: concentration over sequence space, showing the master sequence, the mutant cloud, and "off-the-cloud" mutations]
The molecular quasispecies in sequence space
[Figure: cycle of mutation (matrix Q), genotype-phenotype mapping $S = \psi(I)$, and evaluation of the phenotype $f = f(S)$, for genotypes I1 to In with rate constants f1 to fn; a new mutant extends the population to In+1 with rate constant fn+1]
Evolutionary dynamics including molecular phenotypes
In silico optimization in the flow reactor: evolutionary trajectory
[Figure: average structure distance to target (0 to 50) plotted against time (arbitrary units, 0 to 1250)]
[Figure: evolutionary trajectory and relay steps; the number of the relay step (36 to 44) is plotted against time]
Final conformation of optimization: 44
Reconstruction of the last step 43 → 44
Reconstruction of the relay series: 44, 43, 42, 41, 40, 39 (the reconstruction runs against the direction of the evolutionary process)
Change in RNA sequences during the final five relay steps 39 → 44: transition inducing point mutations and neutral point mutations
[Figure: early part of the evolutionary trajectory (time 0 to 500), with the number of the relay step (08 to 14) and the uninterrupted presence of each relay structure]
28 neutral point mutations during a long quasi-stationary epoch
Neutral genotype evolution during phenotypic stasis
Movies of optimization trajectories over the AUGC and the GC alphabet
5. Intersection molecules and RNA switches
A ribozyme switch
E.A. Schultes, D.P. Bartel, Science 289 (2000), 448-452
Two ribozymes of chain length n = 88 nucleotides: an artificial ligase (A) and a natural cleavage ribozyme of hepatitis-δ-virus (B)
The sequence at the intersection: an RNA molecule that is 88 nucleotides long and can form both structures
Two neutral walks through sequence space with conservation of structure and catalytic activity
J.H.A. Nagel, C. Flamm, I.L. Hofacker, K. Franke, M.H. de Smit, P. Schuster, and C.W.A. Pleij. Structural parameters affecting the kinetic competition of RNA hairpin formation, in press 2004.
J.H.A. Nagel, J. Møller-Jensen, C. Flamm, K.J. Öistämö, J. Besnard, I.L. Hofacker, A.P. Gultyaev, M.H. de Smit, P. Schuster, K. Gerdes, and C.W.A. Pleij. The refolding mechanism of the metastable structure in the 5'-end of the hok mRNA of plasmid R1, submitted 2004.
[Figure: JN2C, an RNA molecule drawn in two conformations with free energies of -19.5 kcal·mol⁻¹ and -21.9 kcal·mol⁻¹]
[Figure: JN1LH, an RNA molecule drawn in several conformations with free energies of -28.6, -31.8, -28.2, and -28.6 kcal·mol⁻¹]
[Figure: barrier tree with local minima numbered 1 to 50 and barrier heights; free energy axis from -26.0 to -50.0 kcal/mole]
J1LH barrier tree
6. Neutrality in evolution and design
"... Variations neither useful nor injurious would not be affected by natural selection, and would be left either a fluctuating element, as perhaps we see in certain polymorphic species, or would ultimately become fixed, owing to the nature of the organism and the nature of the conditions. ..."
Charles Darwin, Origin of species (1859)
Motoo Kimura's population genetics of neutral evolution. Evolutionary rate at the molecular level. Nature 217: 624-626, 1968. The Neutral Theory of Molecular Evolution. Cambridge University Press, Cambridge, UK, 1983.
Mount Fuji
Example of a smooth landscape on Earth
Dolomites Bryce Canyon
Examples of rugged landscapes on Earth
[Figure: fitness over genotype space, with start and end of the walk]
Evolutionary optimization in absence of neutral paths in sequence space
[Figure: fitness over genotype space, with start and end of the walk, adaptive periods, and random drift periods]
Evolutionary optimization including neutral paths in sequence space
Grand Canyon
Example of a landscape on Earth with ‘neutral’ ridges and plateaus
Conformational and mutational landscapes of biomolecules as well as fitness landscapes of evolutionary biology are rugged.
[Figure: fitness over genotype space, with start and end of the walk]
Adaptive or non-descending walks on rugged landscapes commonly end at one of the low-lying local maxima.
[Figure: fitness over genotype space, with start and end of the walk]
Selective neutrality in the form of neutral networks plays an active role in evolutionary optimization and enables populations to reach high local maxima or even the global optimum.
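The difference between a strictly adaptive walk and a walk that may also take neutral steps can be shown on a toy example. The sketch below is a deliberately simplified one-dimensional landscape with hypothetical fitness values, not a model of RNA sequence space; the walker moves to an unvisited neighbor and stops when no acceptable neighbor is left.

```python
def walk(fitness, start, allow_neutral):
    """Walk on a one-dimensional landscape and return the final position.

    Strict mode accepts only fitter neighbors; neutral mode also accepts
    neighbors of equal fitness. Visited positions are never revisited.
    """
    position, visited = start, {start}
    while True:
        candidates = [
            p for p in (position - 1, position + 1)
            if 0 <= p < len(fitness) and p not in visited
        ]
        if allow_neutral:
            acceptable = [p for p in candidates if fitness[p] >= fitness[position]]
        else:
            acceptable = [p for p in candidates if fitness[p] > fitness[position]]
        if not acceptable:
            return position
        position = max(acceptable, key=lambda p: fitness[p])
        visited.add(position)


landscape = [1, 2, 2, 2, 3, 1]  # hypothetical fitness values with a neutral plateau
adaptive_end = walk(landscape, 0, allow_neutral=False)
neutral_end = walk(landscape, 0, allow_neutral=True)
print(adaptive_end, landscape[adaptive_end])
print(neutral_end, landscape[neutral_end])
```

In this example the strict walk stops at the edge of the plateau, while the walk with neutral steps crosses the plateau and reaches the higher peak behind it.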
Evolutionary design of RNA and DNA:
Evolutionary design of aptamers
Evolutionary design of allosteric RNA molecules
Evolutionary design of ribozymes
Evolutionary design of deoxyribozymes
Examples of evolutionary design of RNA or DNA molecules
Design of RNA and DNA molecules:
Evolutionary design of aptamers
Evolutionary design of allosteric RNA molecules
Evolutionary design of ribozymes
Evolutionary design of deoxyribozymes
Design of small interfering RNA molecules
Design of riboswitches
Primer design for PCR in situ
Engineering of ribosomal protein synthesis
Examples of evolutionary and rational design of RNA and DNA molecules
Acknowledgement of support
Fonds zur Förderung der wissenschaftlichen Forschung (FWF), Projects No. 09942, 10578, 11065, 13093, 13887, and 14898
Jubiläumsfonds der Österreichischen Nationalbank, Project No. Nat-7813
European Commission: Project No. EU-980189
Austrian Genome Research Program – GEN-AU
Siemens AG, Austria
Universität Wien and the Santa Fe Institute
Coworkers
Walter Fontana, Harvard Medical School, MA
Christian Reidys, Christian Forst, Los Alamos National Laboratory, NM
Peter Stadler, Bärbel Stadler, Universität Leipzig, GE
Jord Nagel, Kees Pleij, Universiteit Leiden, NL
Ivo L. Hofacker, Christoph Flamm, Universität Wien, AT
Andreas Wernitznig, Michael Kospach, Universität Wien, AT
Ulrike Langhammer, Ulrike Mückstein, Stefanie Widder
Jan Cupal, Kurt Grünberger, Andreas Svrček-Seiler, Stefan Wuchty
Stefan Bernhart, Lukas Endler
Ulrike Göbel, Institut für Molekulare Biotechnologie, Jena, GE
Walter Grüner, Stefan Kopp, Jaqueline Weber, Thomas Wiehe