RNA evolution in vitro and in silico
Peter Schuster
Institut für Theoretische Chemie, Universität Wien, Austria, and The Santa Fe Institute, Santa Fe, New Mexico, USA
Institute of Structural Molecular Biology, UCL London, 11.11.2009
Web-Page for further information: http://www.tbi.univie.ac.at/~pks
RNA
RNA as scaffold for supramolecular complexes
ribosome ? ? ? ? ?
RNA – The magic molecule
The RNA world as a precursor of the current DNA + protein biology
RNA as catalyst Ribozyme
RNA as carrier of genetic information
RNA viruses and retroviruses
RNA evolution in vitro
1. RNA replication in vitro and in vivo
2. Evolution of RNA molecules
3. RNA sequences and structures
4. Evolutionary optimization of RNA structure
1. RNA replication in vitro and in vivo
Evolution of RNA molecules based on Qβ phage
D.R.Mills, R.L.Peterson, S.Spiegelman, An extracellular Darwinian experiment with a self-duplicating nucleic acid molecule. Proc.Natl.Acad.Sci.USA 58 (1967), 217-224
S.Spiegelman, An approach to the experimental analysis of precellular evolution. Quart.Rev.Biophys. 4 (1971), 213-253
C.Weissmann, The making of a phage. FEBS Letters 40 (1974), S10-S18
C.K.Biebricher, Darwinian selection of self-replicating RNA molecules. Evolutionary Biology 16 (1983), 1-52
G.Bauer, H.Otten, J.S.McCaskill, Travelling waves of in vitro evolving RNA. Proc.Natl.Acad.Sci.USA 86 (1989), 7937-7941
C.K.Biebricher, W.C.Gardiner, Molecular evolution of RNA in vitro. Biophysical Chemistry 66 (1997), 179-192
G.Strunk, T.Ederhof, Machines for automated evolution experiments in vitro based on the serial transfer concept. Biophysical Chemistry 66 (1997), 193-202
F.Öhlenschlager, M.Eigen, 30 years later – A new approach to Sol Spiegelman's and Leslie Orgel's in vitro evolutionary studies. Orig.Life Evol.Biosph. 27 (1997), 437-457
RNA sample
Stock solution: Qβ RNA replicase, ATP, CTP, GTP and UTP, buffer
[Figure: test tubes 1, 2, 3, 4, 5, 6, ..., 69, 70 along a time axis]
Application of serial transfer to RNA evolution in vitro
Decrease in mean fitness due to quasispecies formation
The increase in RNA production rate during a serial transfer experiment
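The selection pressure in a serial transfer experiment can be illustrated with a minimal sketch. It assumes unlimited exponential growth between transfers, no mutation, and a deterministic transfer that preserves the composition of the sample; the two rate constants and the growth time are hypothetical values chosen only for illustration.

```python
import math

def serial_transfer(fractions, rates, growth_time, transfers):
    """Return the variant fractions before the first and after each transfer."""
    x = list(fractions)
    history = [list(x)]
    for _ in range(transfers):
        # Each variant grows exponentially in fresh stock solution.
        grown = [xi * math.exp(f * growth_time) for xi, f in zip(x, rates)]
        total = sum(grown)
        # Transferring a small aliquot resets the amount but keeps the composition.
        x = [g / total for g in grown]
        history.append(x)
    return history

# A slow majority variant and a fast minority variant (assumed rate constants).
history = serial_transfer([0.99, 0.01], rates=[1.0, 1.5], growth_time=2.0, transfers=10)
for step, (slow, fast) in enumerate(history):
    print(step, round(slow, 4), round(fast, 4))
```

Under these assumptions the faster replicating variant takes over the population within a few transfers, which is the qualitative behaviour behind the increase in RNA production rate.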
Stock solution: activated monomers, ATP, CTP, GTP, UTP (TTP); a replicase, an enzyme that performs complementary replication; buffer solution
The flowreactor is a device for studies of evolution in vitro and in silico.
James D. Watson, 1928-, and Francis H.C. Crick, 1916-2004 Nobel prize 1962
1953 – 2003: fifty years double helix
The three-dimensional structure of a short double helical stack of B-DNA
Complementary replication is the simplest copying mechanism of RNA.
Complementarity is determined by Watson-Crick base pairs: G≡C and A=U
RNA replication by Qβ-replicase
C. Weissmann, The making of a phage. FEBS Letters 40 (1974), S10-S18
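$$ \frac{dx_1}{dt} = f_2\,x_2 \quad\text{and}\quad \frac{dx_2}{dt} = f_1\,x_1 $$
$$ \xi_1 = \sqrt{f_1}\,x_1,\quad \xi_2 = \sqrt{f_2}\,x_2,\quad \zeta = \xi_1 + \xi_2,\quad \eta = \xi_1 - \xi_2,\quad f = \sqrt{f_1 f_2} $$
$$ \eta(t) = \eta(0)\,e^{-ft}, \qquad \zeta(t) = \zeta(0)\,e^{ft} $$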
Complementary replication as the simplest molecular mechanism of reproduction
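The kinetics of complementary replication, dx1/dt = f2 x2 and dx2/dt = f1 x1, can be checked numerically. The sketch below evaluates the closed-form solution in the transformed variables (a growing mode ζ and a decaying mode η, with f = sqrt(f1 f2)) and compares it with a plain Euler integration. The rate constants and initial concentrations are assumed values, not taken from the experiments cited here.

```python
import math

def complementary_replication(x1_0, x2_0, f1, f2, t):
    """Closed-form plus and minus strand concentrations at time t."""
    f = math.sqrt(f1 * f2)
    xi1_0 = math.sqrt(f1) * x1_0
    xi2_0 = math.sqrt(f2) * x2_0
    zeta = (xi1_0 + xi2_0) * math.exp(f * t)     # growing mode
    eta = (xi1_0 - xi2_0) * math.exp(-f * t)     # decaying mode
    xi1 = 0.5 * (zeta + eta)
    xi2 = 0.5 * (zeta - eta)
    return xi1 / math.sqrt(f1), xi2 / math.sqrt(f2)

def euler(x1, x2, f1, f2, t, steps=200_000):
    """Direct numerical integration of dx1/dt = f2*x2, dx2/dt = f1*x1."""
    dt = t / steps
    for _ in range(steps):
        x1, x2 = x1 + dt * f2 * x2, x2 + dt * f1 * x1
    return x1, x2

# Start from plus strands only (assumed initial condition).
exact = complementary_replication(1.0, 0.0, f1=2.0, f2=0.5, t=3.0)
approx = euler(1.0, 0.0, f1=2.0, f2=0.5, t=3.0)
print(exact, approx)
```

Once the decaying mode has died out, the plus-minus ensemble grows exponentially with the single rate constant f and the ratio x2/x1 approaches sqrt(f1/f2).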
Kinetics of RNA replication
C.K. Biebricher, M. Eigen, W.C. Gardiner, Jr. Biochemistry 22:2544-2559, 1983
Christof K. Biebricher, 1941-2009
[Figure labels: metastable, stable]
C.K. Biebricher, R. Luce. 1992. In vitro recombination and terminal elongation of RNA by Qβ replicase. The EMBO Journal 11:5129-5135.
SV11 plus strand: Gfold = -68.5 kcal/mole; Gfold = -98.4 kcal/mole; Gfold = -277.4 kcal/mole; Gbind = -72.1 kcal/mole
SV11 minus strand: Gfold = -71.1 kcal/mole; Gfold = -101.9 kcal/mole; Gfold = -277.4 kcal/mole; Gbind = -72.1 kcal/mole
J. Demez. European and Mediterranean Plant Protection Organization archive, France
R.W. Hammond, R.A. Owens. Molecular Plant Pathology Laboratory, US Department of Agriculture
Plant damage by viroids
Nucleotide sequence and secondary structure of the potato spindle tuber viroid RNA
H.J. Gross, H. Domdey, C. Lossow, P. Jank, M. Raba, H. Alberty, and H.L. Sänger. Nature 273:203-208 (1978)
Vienna RNA Package 1.8.2 Biochemically supported structure
An example of two ribozymes growing exponentially by cross-catalysis.
T.A. Lincoln, G.F. Joyce. 2009. Self-sustained replication of an RNA enzyme. Science 323:1229-1232
2. Evolution of RNA molecules
Chemical kinetics of molecular evolution
Replication and mutation are parallel chemical reactions.
Manfred Eigen 1927 -
$$ \frac{dx_j}{dt} = \sum_{i=1}^{n} Q_{ji}\, f_i\, x_i \;-\; x_j\,\Phi\,; \qquad j = 1, 2, \dots, n $$
Mutation and (correct) replication as parallel chemical reactions
M. Eigen. 1971. Naturwissenschaften 58:465
M. Eigen & P. Schuster. 1977. Naturwissenschaften 64:541, 65:7 and 65:341
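The stationary solution of the replication-mutation equation is the dominant eigenvector of the matrix W with entries W_ji = Q_ji f_i. A minimal sketch, under assumptions that are not part of the slides: binary sequences of length n, a uniform error rate p per site, a single-peak landscape (one master with fitness `f_master`, all others `f_other`), and the usual normalization where Φ is the mean fitness so that the x_j sum to 1. Power iteration stands in for a proper eigenvalue solver.

```python
def quasispecies(n, p, f_master=10.0, f_other=1.0, iterations=200):
    """Stationary distribution of dx_j/dt = sum_i Q_ji f_i x_i - x_j * Phi."""
    size = 2 ** n
    f = [f_master] + [f_other] * (size - 1)      # sequence 0 is the master

    def q(i, j):
        # Probability of copying sequence i into sequence j: Hamming distance d.
        d = bin(i ^ j).count("1")
        return (p ** d) * ((1.0 - p) ** (n - d))

    # W[j][i] = Q_ji * f_i
    W = [[q(i, j) * f[i] for i in range(size)] for j in range(size)]
    x = [1.0 / size] * size
    for _ in range(iterations):
        y = [sum(W[j][i] * x[i] for i in range(size)) for j in range(size)]
        total = sum(y)                           # plays the role of Phi
        x = [v / total for v in y]
    return x

for p in (0.01, 0.5):
    x = quasispecies(5, p)
    print(p, round(x[0], 4))
```

At a small error rate the master sequence dominates the stationary distribution; at p = 0.5 every binary sequence is produced with equal probability and the distribution is uniform.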
Quasispecies
Driving virus populations through threshold
The error threshold in replication
Chain length and error threshold
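$$ Q \cdot \sigma = (1-p)^n \cdot \sigma \ge 1 \;\;\Rightarrow\;\; n \cdot \ln(1-p) \ge -\ln \sigma $$
$$ n \text{ constant}: \; p_{\max} \approx \frac{\ln \sigma}{n} \qquad\qquad p \text{ constant}: \; n_{\max} \approx \frac{\ln \sigma}{p} $$
$Q = (1-p)^n$ … replication accuracy
$p$ … error rate
$n$ … chain length
$\sigma = f_m \big/ \sum_{j \ne m} f_j$ … superiority of master sequence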
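The threshold condition (1 - p)^n · σ ≥ 1 can be evaluated directly. The sketch below compares the exact solution with the logarithmic approximations ln σ / n and ln σ / p; the parameter values are examples only.

```python
import math

def max_error_rate(sigma, n):
    """Exact p_max from (1 - p)^n * sigma = 1, and the approximation ln(sigma)/n."""
    exact = 1.0 - sigma ** (-1.0 / n)
    approx = math.log(sigma) / n
    return exact, approx

def max_chain_length(sigma, p):
    """Largest n with (1 - p)^n * sigma >= 1, and the approximation ln(sigma)/p."""
    exact = math.floor(math.log(sigma) / -math.log(1.0 - p))
    approx = math.log(sigma) / p
    return exact, approx

print(max_error_rate(sigma=10.0, n=50))
print(max_chain_length(sigma=10.0, p=0.001))
```

For n = 50 and σ = 10 the approximation places the threshold at an error rate of roughly 0.046; the exact value is slightly smaller because ln(1 - p) was replaced by -p.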
A fitness landscape showing an error threshold
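[Figure: stationary concentrations plotted against the error rate p = 1-q (axis from 0.00 to 0.10); the quasispecies regime is followed by the uniform distribution beyond the threshold]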
Stationary population or quasispecies as a function of the mutation or error rate p
Error threshold on a single peak fitness landscape with n = 50 and σ = 10
Fitness landscapes showing error thresholds
Error threshold: Individual sequences n = 10, σ = 2 and d = 0, 1.0, 1.85
3. RNA sequences and structures
The notion of RNA (secondary) structure
1. Minimum free energy structure
2. Many sequences one structure
3. Suboptimal structures
4. Kinetic structures
Extension of the notion of structure
Number of sequences: N = 4^n; number of structures: N_S < 3^n
Criterion: minimum free energy (mfe)
Rules: _ ( _ ) _ ∈ {AU, CG, GC, GU, UA, UG}
A symbolic notation of RNA secondary structure that is equivalent to the conventional graphs
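The symbolic (dot-bracket) notation can be decoded with a simple stack: every "(" opens a base pair that is closed by the matching ")", and "." marks an unpaired nucleotide. The sketch below recovers the base pair list and checks it against the six allowed pairs; the function names are illustrative and do not come from any particular package.

```python
ALLOWED_PAIRS = {"AU", "CG", "GC", "GU", "UA", "UG"}

def base_pairs(structure):
    """Return the sorted list of (i, j) index pairs encoded by a dot-bracket string."""
    stack, pairs = [], []
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % pos)
            pairs.append((stack.pop(), pos))
        elif symbol != ".":
            raise ValueError("unexpected symbol %r" % symbol)
    if stack:
        raise ValueError("unbalanced '('")
    return sorted(pairs)

def is_compatible(sequence, structure):
    """True if every bracket pair joins two bases from the allowed set."""
    if len(sequence) != len(structure):
        return False
    return all(sequence[i] + sequence[j] in ALLOWED_PAIRS
               for i, j in base_pairs(structure))

print(base_pairs("(((...)))"))
print(is_compatible("GGGAAACCC", "(((...)))"))
print(is_compatible("AAAAAAAAA", "(((...)))"))
```

Compatibility only says that a sequence can form a structure; whether it is the minimum free energy structure requires an energy model and a folding algorithm.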
2. Many sequences one structure
The inverse folding algorithm searches for sequences that form a given RNA secondary structure under the minimum free energy criterion.
A mapping ψ and its inversion
$$ \psi(I_j) = S_k\,, \qquad G_k = \psi^{-1}(S_k) = \{\, I_j \mid \psi(I_j) = S_k \,\} $$
Space of genotypes: $\mathcal{I} = \{I_1, I_2, I_3, I_4, \dots, I_N\}$; Hamming metric
Space of phenotypes: $\mathcal{S} = \{S_1, S_2, S_3, S_4, \dots, S_M\}$; metric (not required)
N ≫ M
many genotypes, one phenotype
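The inversion of the map can be illustrated by enumerating a small sequence space and grouping sequences by phenotype. The folding map used here, `toy_fold`, is a hypothetical stand-in and not a real folding algorithm: it closes a single base pair when the two end nucleotides can pair and returns the open chain otherwise.

```python
from collections import defaultdict
from itertools import product

ALLOWED_PAIRS = {"AU", "CG", "GC", "GU", "UA", "UG"}

def toy_fold(sequence):
    """Hypothetical stand-in for the map: pair the two ends if they can pair."""
    if sequence[0] + sequence[-1] in ALLOWED_PAIRS:
        return "(" + "." * (len(sequence) - 2) + ")"
    return "." * len(sequence)

def hamming(a, b):
    """Hamming distance, the metric on the space of genotypes."""
    return sum(x != y for x, y in zip(a, b))

def preimages(n, fold=toy_fold):
    """Group all 4^n sequences by phenotype, i.e. compute every preimage set."""
    sets = defaultdict(list)
    for letters in product("ACGU", repeat=n):
        sequence = "".join(letters)
        sets[fold(sequence)].append(sequence)
    return dict(sets)

neutral_sets = preimages(5)
for structure, sequences in neutral_sets.items():
    print(structure, len(sequences))
```

Even in this toy case 1024 genotypes map onto two phenotypes. With a real folding algorithm in place of `toy_fold` the same grouping yields the neutral sets discussed here, although exhaustive enumeration is feasible only for short chains.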
An example of ‘artificial selection’ with RNA molecules or ‘breeding’ of biomolecules
tobramycin RNA aptamer, n = 27
Formation of secondary structure of the tobramycin binding RNA aptamer with KD = 9 nM
L. Jiang, A. K. Suri, R. Fiala, D. J. Patel, Saccharide-RNA recognition in an aminoglycoside antibiotic-RNA aptamer complex. Chemistry & Biology 4:35-50 (1997)
The three-dimensional structure of the tobramycin aptamer complex
L. Jiang, A. K. Suri, R. Fiala, D. J. Patel, Chemistry & Biology 4:35-50 (1997)
RNA 9:1456-1463, 2003
Evidence for neutral networks and shape space covering
Evidence for neutral networks and intersection of aptamer functions
3. Suboptimal structures
GGCUAUCGUACGUUUACCCAAAAGUCUACGUUGGACCCAGGCAUUGGACG
(((((.((((..(((......)))..)))).))).))............. -7.30
..........((((((.((....((((.....))))...))...)))))) -6.70
..........((((((.((....(((((...)))))...))...)))))) -6.60
..(((.((((..(((......)))..)))).)))..((((...))))... -6.10
(((((.((((..(((......)))..)))).))).))..(........). -6.00
(((((.((((..((........))..)))).))).))............. -6.00
.(((.((..((((..((......))..))))..))....)))........ -6.00

GGCUAUCGUACGUUUACACAAAAGUCUACGUUGGACCCAGGCAUUGGACG
(((((.((((..(((......)))..)))).))).))............. -7.30
.(((.((..((((..((......))..))))..))....)))........ -6.50
.(((.....((((..((......))..))))((....)))))........ -6.30
..(((.((((..(((......)))..)))).)))..((((...))))... -6.10
(((((.((((..(((......)))..)))).))).))..(........). -6.00
(((((.((((..((........))..)))).))).))............. -6.00
.(((...((((((..((......))..))))...))...)))........ -6.00

GGCUAUCGUACGUUUACCCAAAAGUCUACGUUGGACCCAGGCAAUGGACG
(((((.((((..(((......)))..)))).))).))............. -7.30
..(((.((((..(((......)))..)))).)))..(((.....)))... -7.20
..........((((((.((....((((.....))))...))...)))))) -6.70
..........((((((.((....(((((...)))))...))...)))))) -6.60
(((((.((((..(((......)))..)))).))).))((.....)).... -6.50
(.(((.((((..(((......)))..)))).))).)(((.....)))... -6.30
.((((.((((..(((......)))..)))).))).)(((.....)))... -6.30
.....(((.((((..((......))..)))))))..(((.....)))... -6.30
(.(((.((((..(((......)))..)))).)))..(((.....))).). -6.10
.....((..((((..((......))..))))..)).(((.....)))... -6.10
......(((.((((...((....((((.....))))...)).)))).))) -6.10
(((((.((((..(((......)))..)))).))).))..(........). -6.00
(((((.((((..((........))..)))).))).))............. -6.00
.(((.((..((((..((......))..))))..))....)))........ -6.00
......(((.((((...((....(((((...)))))...)).)))).))) -6.00
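How much the suboptimal structures matter at equilibrium can be estimated from Boltzmann factors. The sketch below assumes that the energies listed for the first sequence are in kcal/mol, uses a temperature of 37 °C, and normalizes over the listed structures only. A full partition function would include all structures, so the numbers are upper bounds on the true populations.

```python
import math

R = 1.98720425864083e-3    # gas constant in kcal/(mol*K)
T = 310.15                 # 37 degrees Celsius in kelvin (assumed)

def boltzmann_weights(energies, temperature=T):
    """Relative populations exp(-E/RT), normalized over the listed structures only."""
    e_min = min(energies)
    factors = [math.exp(-(e - e_min) / (R * temperature)) for e in energies]
    total = sum(factors)
    return [w / total for w in factors]

# Energies of the seven structures listed for the first sequence.
energies_seq1 = [-7.30, -6.70, -6.60, -6.10, -6.00, -6.00, -6.00]
weights_seq1 = boltzmann_weights(energies_seq1)
for energy, weight in zip(energies_seq1, weights_seq1):
    print(energy, round(weight, 3))
```

Within this restricted set the minimum free energy structure accounts for somewhat less than half of the population, which is why suboptimal structures are treated here as part of the extended notion of structure.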
4. Kinetic structures
The Folding Algorithm
A sequence I specifies an energy ordered set of compatible structures S(I):
S(I) = {S0 , S1 , … , Sm , O}
A trajectory Tk(I) is a time ordered series of structures in S(I). A folding trajectory is defined by starting with the open chain O and ending with the global minimum free energy structure S0 or a metastable structure Sk which represents a local energy minimum:
T0(I) = {O , S (1) , … , S (t-1) , S (t) , S (t+1) , … , S0}
Tk(I) = {O , S (1) , … , S (t-1) , S (t) , S (t+1) , … , Sk}
Master equation
$$ \frac{dP_k}{dt} = \sum_{i=0}^{m+1} \bigl( P_{ik}(t) - P_{ki}(t) \bigr) = \sum_{i=0}^{m+1} k_{ik}\,P_i \;-\; P_k \sum_{i=0}^{m+1} k_{ki}\,; \qquad k = 0, 1, \dots, m+1 $$
Transition probabilities Pij(t) = Prob{Si→Sj} are defined by
$$ P_{ij}(t) = P_i(t)\,k_{ij} = P_i(t)\,\exp(-\Delta G_{ij}/2RT)\,/\,\Sigma_i $$
$$ P_{ji}(t) = P_j(t)\,k_{ji} = P_j(t)\,\exp(-\Delta G_{ji}/2RT)\,/\,\Sigma_j $$
$$ \Sigma_k = \sum_{i=0,\; i \ne k}^{m+1} \exp(-\Delta G_{ki}/2RT) $$
The symmetric rule for transition rate parameters is due to Kawasaki (K. Kawasaki, Diffusion constants near the critical point for time dependent Ising models. Phys.Rev. 145:224-230, 1966).
Formulation of kinetic RNA folding as a stochastic process
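The master equation with the symmetric rate rule can be integrated for a toy state space. The sketch below makes several assumptions that go beyond the slides: only three states (S0, S1 and the open chain O) with illustrative free energies, transitions allowed between every pair of states, ΔG_ij taken as G_j - G_i, and a temperature of 37 °C. Explicit Euler steps stand in for a proper ODE solver or a stochastic simulation.

```python
import math

RT = 1.98720425864083e-3 * 310.15    # kcal/mol at 37 degrees Celsius (assumed)

def rate_matrix(G):
    """k[i][j] = exp(-(G[j] - G[i]) / 2RT) / Sigma_i for i != j (all-to-all moves assumed)."""
    m = len(G)
    k = []
    for i in range(m):
        raw = [math.exp(-(G[j] - G[i]) / (2.0 * RT)) if j != i else 0.0
               for j in range(m)]
        sigma_i = sum(raw)                       # normalization Sigma_i
        k.append([r / sigma_i for r in raw])
    return k

def integrate(P, k, t_end, dt=1e-3):
    """Explicit Euler steps of dP_s/dt = sum_i k_is P_i - P_s sum_i k_si."""
    m = len(P)
    for _ in range(int(round(t_end / dt))):
        gain = [sum(k[i][s] * P[i] for i in range(m)) for s in range(m)]
        loss = [P[s] * sum(k[s]) for s in range(m)]
        P = [P[s] + dt * (gain[s] - loss[s]) for s in range(m)]
    return P

G = [-7.3, -6.0, 0.0]                # S0, S1 and the open chain O (illustrative values)
P_end = integrate([0.0, 0.0, 1.0], rate_matrix(G), t_end=10.0)
print([round(p, 4) for p in P_end])
```

Starting from the open chain, the probability flows into the two folded states while the total probability is conserved. In a realistic calculation the state space is the set of compatible structures and transitions are restricted to neighbouring structures.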
[Figure: free energy G along a "reaction coordinate", with local minima Sk and Sl connected through the saddle point Tlk, and the corresponding "barrier tree"]
Definition of a 'barrier tree'
[Figure: secondary structures of the RNA switch JN1LH with segments labelled 1D, 2D and R; free energies -28.6, -31.8, -28.2 and -28.6 kcal·mol^-1]
An experimental RNA switch
[Figure: barrier tree with leaves numbered 1 to 50; vertical axis: free energy [kcal / mole], from -26.0 to -50.0]
J1LH barrier tree
A ribozyme switch
E.A. Schultes, D.P. Bartel, Science 289 (2000), 448-452
Two ribozymes of chain length n = 88 nucleotides: an artificial ligase (A) and a natural cleavage ribozyme of hepatitis-δ-virus (B)
The sequence at the intersection: an RNA molecule that is 88 nucleotides long and can form both structures
Two neutral walks through sequence space with conservation of structure and catalytic activity
The thiamine-pyrophosphate riboswitch
S. Thore, M. Leibundgut, N. Ban. Science 312:1208-1211, 2006.
M. Mandal, B. Boese, J.E. Barrick, W.C. Winkler, R.R. Breaker. Cell 113:577-586 (2003)
4. Evolutionary optimization of RNA structure
Computer simulation using Gillespie's algorithm
Replication rate constant: $f_k = \gamma / [\alpha + \Delta d_S^{(k)}]$, with $\Delta d_S^{(k)} = d_H(S_k, S_\tau)$
Selection constraint: population size, N = # RNA molecules, is controlled by the flow: $N(t) \approx \bar{N} \pm \sqrt{\bar{N}}$
Mutation rate: p = 0.001 / site × replication
The flowreactor as a device for studies of evolution in vitro and in silico
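The ingredients of such a simulation can be sketched as building blocks. The replication rate decreases with the Hamming distance between a molecule's structure and the target structure, copies carry point mutations at rate p per site, and Gillespie's algorithm picks the waiting time and the next reaction. The constants written here as `gamma` and `alpha`, their default values, and all function names are assumptions for illustration. Folding each sequence into its structure and the dilution flux of the flow reactor are left out.

```python
import math
import random

def structure_distance(s1, s2):
    """Hamming distance between two dot-bracket strings of equal length."""
    if len(s1) != len(s2):
        raise ValueError("structures must have equal length")
    return sum(a != b for a, b in zip(s1, s2))

def replication_rate(structure, target, gamma=1.0, alpha=0.01):
    """f_k = gamma / (alpha + d_H(S_k, S_target)); gamma and alpha are assumed values."""
    return gamma / (alpha + structure_distance(structure, target))

def mutate(sequence, p=0.001, rng=random):
    """Copy a sequence with independent point mutations at rate p per site."""
    copy = []
    for base in sequence:
        if rng.random() < p:
            copy.append(rng.choice([b for b in "ACGU" if b != base]))
        else:
            copy.append(base)
    return "".join(copy)

def gillespie_step(rates, rng=random):
    """Return (waiting time, index of the reaction that fires) for the given rates."""
    total = sum(rates)
    waiting_time = -math.log(1.0 - rng.random()) / total
    threshold = rng.random() * total
    running = 0.0
    for index, rate in enumerate(rates):
        running += rate
        if threshold < running:
            return waiting_time, index
    return waiting_time, len(rates) - 1

target = "((..))"
rates = [replication_rate(s, target) for s in ("((..))", "(....)", "......")]
print(rates, gillespie_step(rates))
```

Molecules whose structure is closer to the target replicate faster, so the Gillespie step selects them more often; repeating the step with mutation and a compensating outflow gives the kind of trajectory described in this section.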
Evolution in silico
W. Fontana, P. Schuster, Science 280 (1998), 1451-1455
Phenylalanyl-tRNA as target structure
Structure of randomly chosen initial sequence
In silico optimization in the flow reactor: Evolutionary Trajectory
Randomly chosen initial structure
Phenylalanyl-tRNA as target structure
28 neutral point mutations during a long quasi-stationary epoch
Transition inducing point mutations change the molecular structure
Neutral point mutations leave the molecular structure unchanged
Neutral genotype evolution during phenotypic stasis
A sketch of optimization on neutral networks
Coworkers
Peter Stadler, Bärbel M. Stadler, Universität Leipzig, GE
Paul E. Phillipson, University of Colorado at Boulder, CO
Heinz Engl, Philipp Kügler, James Lu, Stefan Müller, RICAM Linz, AT
Jord Nagel, Kees Pleij, Universiteit Leiden, NL
Walter Fontana, Harvard Medical School, MA
Martin Nowak, Harvard University, MA
Christian Reidys, Nankai University, Tien Tsin, China
Christian Forst, Los Alamos National Laboratory, NM
Thomas Wiehe, Ulrike Göbel, Walter Grüner, Stefan Kopp, Jaqueline Weber, Institut für Molekulare Biotechnologie, Jena, GE
Ivo L. Hofacker, Christoph Flamm, Andreas Svrček-Seiler, Universität Wien, AT
Kurt Grünberger, Michael Kospach, Andreas Wernitznig, Stefanie Widder, Stefan Wuchty, Jan Cupal, Stefan Bernhart, Lukas Endler, Ulrike Langhammer, Rainer Machne, Ulrike Mückstein, Erich Bornberg-Bauer, Universität Wien, AT
Universität Wien
Acknowledgement of support
Fonds zur Förderung der wissenschaftlichen Forschung (FWF): Projects No. 09942, 10578, 11065, 13093, 13887, and 14898
Wiener Wissenschafts-, Forschungs- und Technologiefonds (WWTF): Project No. Mat05
Jubiläumsfonds der Österreichischen Nationalbank: Project No. Nat-7813
European Commission: Contracts No. 98-0189, 12835 (NEST)
Austrian Genome Research Program – GEN-AU: Bioinformatics Network (BIN)
Österreichische Akademie der Wissenschaften
Siemens AG, Austria
Universität Wien and the Santa Fe Institute