Modeling Evolution of Molecules
New Variations of an Old Theme
Peter Schuster
Institut für Theoretische Chemie, Universität Wien, Austria and The Santa Fe Institute, Santa Fe, New Mexico, USA
Minisymposium on Evolutionary Dynamics, Utrecht, 05.03.2008
Web-Page for further information: http://www.tbi.univie.ac.at/~pks
1. Replication and mutation
2. Quasispecies and error thresholds
3. Fitness landscapes and randomization
4. Lethal mutations
5. Ruggedness of natural landscapes
6. Simulation of stochastic phenomena

1. Replication and mutation
The three-dimensional structure of a short double helical stack of B-DNA
James D. Watson, 1928- , and Francis Crick, 1916-2004, Nobel Prize 1962
G≡C and A=T
'Replication fork' in DNA replication. The mechanism of DNA replication is 'semi-conservative'.
Complementary replication is the simplest copying mechanism of RNA.
Complementarity is determined by Watson-Crick base pairs: G≡C and A=U
Chemical kinetics of molecular evolution
M. Eigen, P. Schuster, 'The Hypercycle', Springer-Verlag, Berlin 1979
Stock solution: activated monomers ATP, CTP, GTP, UTP (TTP); a replicase, an enzyme that performs complementary replication; buffer solution.
Flow rate: $r = \tau_R^{-1}$
The population size N, the number of polynucleotide molecules, is controlled by the flow r:
$$N(t) \approx \bar{N} \pm \sqrt{\bar{N}}$$
The flow reactor is a device for studies of evolution in vitro and in silico.
Complementary replication as the simplest molecular mechanism of reproduction
Equation for complementary replication: $[I_i] = x_i \ge 0$, $f_i > 0$; $i = 1, 2$. Solutions are obtained by integrating factor transformation.
$$\frac{dx_1}{dt} = f_2 x_2 - x_1 \phi, \qquad \frac{dx_2}{dt} = f_1 x_1 - x_2 \phi, \qquad \phi = f_1 x_1 + f_2 x_2$$
$$x_{1,2}(t) = \frac{\sqrt{f_{2,1}}\,\bigl(\gamma_1(0)\exp(ft) \pm \gamma_2(0)\exp(-ft)\bigr)}{(\sqrt{f_1} + \sqrt{f_2})\,\gamma_1(0)\exp(ft) - (\sqrt{f_1} - \sqrt{f_2})\,\gamma_2(0)\exp(-ft)}$$
$$\gamma_1(t) = \sqrt{f_1}\,x_1(t) + \sqrt{f_2}\,x_2(t), \qquad \gamma_2(t) = \sqrt{f_1}\,x_1(t) - \sqrt{f_2}\,x_2(t), \qquad f = \sqrt{f_1 f_2}$$
$$x_1(t) \to \frac{\sqrt{f_2}}{\sqrt{f_1} + \sqrt{f_2}} \quad \text{and} \quad x_2(t) \to \frac{\sqrt{f_1}}{\sqrt{f_1} + \sqrt{f_2}} \quad \text{as} \quad \exp(-ft) \to 0$$
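The integrating factor transformation named on this slide can be spelled out for the system $dx_1/dt = f_2 x_2 - x_1\phi$, $dx_2/dt = f_1 x_1 - x_2\phi$. The following is a sketch of one standard route, not necessarily the one used in the talk; the auxiliary variables $z_i$, $\eta$ and $\zeta$ are names introduced here for illustration only.

```latex
\begin{align*}
x_i(t) &= z_i(t)\,\exp\!\Bigl(-\int_0^t \phi(\tau)\,d\tau\Bigr)
  &&\Rightarrow\quad \dot z_1 = f_2 z_2,\quad \dot z_2 = f_1 z_1,\\
\eta &= \sqrt{f_1}\,z_1 + \sqrt{f_2}\,z_2
  &&\Rightarrow\quad \dot\eta = \sqrt{f_1 f_2}\;\eta,\\
\zeta &= \sqrt{f_1}\,z_1 - \sqrt{f_2}\,z_2
  &&\Rightarrow\quad \dot\zeta = -\sqrt{f_1 f_2}\;\zeta,\\
\frac{x_1}{x_2} &= \frac{z_1}{z_2}
  = \sqrt{\frac{f_2}{f_1}}\;\frac{\eta + \zeta}{\eta - \zeta}
  &&\longrightarrow\quad \sqrt{\frac{f_2}{f_1}} \qquad (t \to \infty).
\end{align*}
```

The decaying mode $\zeta$ vanishes relative to the growing mode $\eta$, so the plus and minus strands approach the fixed ratio $\sqrt{f_2} : \sqrt{f_1}$ regardless of the initial mixture.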
Reproduction of organisms or replication of molecules as the basis of selection
$$\frac{d\phi}{dt} = \sum_{i=1}^{n} f_i \frac{dx_i}{dt} = \overline{f^2} - \bigl(\bar{f}\bigr)^2 = \operatorname{var}\{f\} \ge 0$$
Selection equation: $[I_i] = x_i \ge 0$, $f_i > 0$. Mean fitness or dilution flux, $\phi(t)$, is a non-decreasing function of time. Solutions are obtained by integrating factor transformation.
$$\frac{dx_i}{dt} = x_i\,(f_i - \phi), \quad i = 1, 2, \ldots, n; \qquad \sum_{i=1}^{n} x_i = 1; \qquad \phi = \sum_{j=1}^{n} f_j x_j = \bar{f}$$
$$x_i(t) = \frac{x_i(0)\,\exp(f_i t)}{\sum_{j=1}^{n} x_j(0)\,\exp(f_j t)}, \quad i = 1, 2, \ldots, n$$
Selection between three species with f1 = 1, f2 = 2, and f3 = 3
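The three-species case can be reproduced numerically from the closed-form solution of the selection equation, x_i(t) = x_i(0) exp(f_i t) / Σ_j x_j(0) exp(f_j t). The sketch below is illustrative only: the equal initial concentrations and the sampled time points are assumptions, since the slide gives only the fitness values.

```python
import math

def selection_solution(x0, f, t):
    """Closed-form solution of dx_i/dt = x_i (f_i - phi) at time t."""
    weights = [xi * math.exp(fi * t) for xi, fi in zip(x0, f)]
    total = sum(weights)
    return [w / total for w in weights]

def mean_fitness(x, f):
    """Mean fitness phi = sum_j f_j x_j."""
    return sum(xi * fi for xi, fi in zip(x, f))

f = [1.0, 2.0, 3.0]            # fitness values from the slide
x0 = [1 / 3, 1 / 3, 1 / 3]     # assumed initial concentrations

for t in (0.0, 1.0, 2.0, 5.0, 10.0):
    x = selection_solution(x0, f, t)
    print(t, [round(v, 4) for v in x], round(mean_fitness(x, f), 4))
```

The printed mean fitness rises monotonically towards the largest fitness value, which is the statement dφ/dt = var{f} ≥ 0 in numerical form.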
Variation of genotypes through mutation and recombination
Origin of the replication-mutation equation from the flow reactor (figure labels: extinction, active)

2. Quasispecies and error thresholds
Chemical kinetics of replication and mutation as parallel reactions
The replication-mutation equation
Mutation-selection equation: $[I_i] = x_i \ge 0$, $f_i > 0$, $Q_{ij} \ge 0$. Solutions are obtained after integrating factor transformation by means of an eigenvalue problem.
$$\frac{dx_i}{dt} = \sum_{j=1}^{n} f_j Q_{ji}\, x_j - x_i \phi, \quad i = 1, 2, \ldots, n; \qquad \sum_{i=1}^{n} x_i = 1; \qquad \phi = \sum_{j=1}^{n} f_j x_j = \bar{f}$$
$$x_i(t) = \frac{\sum_{k=0}^{n-1} \ell_{ik}\, c_k(0)\, \exp(\lambda_k t)}{\sum_{j=1}^{n} \sum_{k=0}^{n-1} \ell_{jk}\, c_k(0)\, \exp(\lambda_k t)}, \quad i = 1, 2, \ldots, n; \qquad c_k(0) = \sum_{i=1}^{n} h_{ki}\, x_i(0)$$
$$W \doteq \{f_i Q_{ij};\ i,j = 1, 2, \ldots, n\}; \qquad L = \{\ell_{ij};\ i,j = 1, 2, \ldots, n\}; \qquad L^{-1} = H = \{h_{ij};\ i,j = 1, 2, \ldots, n\}$$
$$L^{-1} \cdot W \cdot L = \Lambda = \{\lambda_k;\ k = 0, 1, \ldots, n-1\}$$
Perron-Frobenius theorem applied to the value matrix W
W is primitive:
(i) $\lambda_0$ is real and strictly positive
(ii) $\lambda_0 > |\lambda_k|$ for all $k \ne 0$
(iii) $\lambda_0$ is associated with strictly positive eigenvectors
(iv) $\lambda_0$ is a simple root of the characteristic equation of W
(v-vi) etc.
W is irreducible: (i), (iii), (iv), etc. as above
(ii) $\lambda_0 \ge |\lambda_k|$ for all $k \ne 0$
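Because the largest eigenvalue of a primitive value matrix W is simple and strictly dominant, its eigenvector (the stationary mutant distribution) can be approximated by plain power iteration. The sketch below is a minimal illustration under stated assumptions, none of which come from the slides: binary sequences of length 5, a uniform per-site error rate p, and a single-peak landscape in which the master sequence has fitness 10 and every other sequence has fitness 1. Under the uniform error model the mutation matrix is symmetric, so the index order of Q does not matter here.

```python
def hamming(a, b):
    """Hamming distance between two binary sequences encoded as integers."""
    return bin(a ^ b).count("1")

def value_matrix(n, p, fitness):
    """W[i][j] = Q_ij * f_j with Q_ij = p^d (1-p)^(n-d), d = Hamming distance."""
    size = 2 ** n
    return [[(p ** hamming(i, j)) * ((1 - p) ** (n - hamming(i, j))) * fitness[j]
             for j in range(size)] for i in range(size)]

def stationary_distribution(W, iterations=200):
    """Power iteration: repeatedly apply W and renormalise to unit sum."""
    size = len(W)
    x = [1.0 / size] * size
    for _ in range(iterations):
        y = [sum(W[i][j] * x[j] for j in range(size)) for i in range(size)]
        total = sum(y)
        x = [v / total for v in y]
    return x

n = 5                                    # assumed chain length
fitness = [10.0] + [1.0] * (2 ** n - 1)  # assumed single-peak landscape
for p in (0.01, 0.05, 0.2):
    x = stationary_distribution(value_matrix(n, p, fitness))
    print(p, round(x[0], 4))             # stationary frequency of the master
```

Convergence slows when the two largest eigenvalues come close to each other, so the fixed iteration count is only adequate for small examples like this one.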
Formation of a quasispecies in sequence space
Uniform distribution in sequence space
Quasispecies as a function of the replication accuracy q (figure: axis is the error rate p = 1-q with ticks at 0.00, 0.05, 0.10; regions labeled quasispecies and uniform distribution)
Chain length and error threshold
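$$Q \cdot \sigma = (1-p)^n \cdot \sigma \ge 1 \;\Rightarrow\; n \cdot \ln(1-p) \ge -\ln \sigma$$
$$p\ \text{constant}: \; n_{\max} \approx \frac{\ln \sigma}{p}; \qquad n\ \text{constant}: \; p_{\max} \approx \frac{\ln \sigma}{n}$$
$Q = (1-p)^n$ ... replication accuracy; $p$ ... error rate; $n$ ... chain length;
$$\sigma = \frac{f_m\,(1 - x_m)}{\sum_{j \ne m} f_j x_j} \;\ldots\; \text{superiority of master sequence}$$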
Quasispecies
Driving virus populations through threshold
The error threshold in replication
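The threshold relations can be turned into a small calculator. The sketch below evaluates the approximations n_max ≈ ln σ / p and p_max ≈ ln σ / n and, for comparison, the error rate that solves (1-p)^n σ = 1 exactly; the function names and the example values σ = 10, n = 20 and p = 0.001 are chosen here for illustration.

```python
import math

def max_chain_length(sigma, p):
    """Approximate maximum chain length at a fixed error rate p."""
    return math.log(sigma) / p

def max_error_rate(sigma, n):
    """Approximate maximum error rate at a fixed chain length n."""
    return math.log(sigma) / n

def exact_max_error_rate(sigma, n):
    """Error rate solving (1 - p)^n * sigma = 1 without the small-p approximation."""
    return 1.0 - sigma ** (-1.0 / n)

sigma, n, p = 10.0, 20, 0.001   # illustrative values
print("n_max (approx):", max_chain_length(sigma, p))
print("p_max (approx):", max_error_rate(sigma, n))
print("p_max (exact): ", exact_max_error_rate(sigma, n))
```

The approximation uses ln(1-p) ≈ -p, so it slightly overestimates the tolerable error rate; the gap shrinks as n grows.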

3. Fitness landscapes and randomization
Every point in sequence space is equivalent
Sequence space of binary sequences with chain length n = 5
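The statement that every point in sequence space is equivalent can be checked by enumeration: seen from any reference sequence, the number of sequences at Hamming distance d (the error class d) is the same. The sketch below builds the binary sequence space for n = 5 and groups it into error classes; the helper names are introduced here for illustration.

```python
from itertools import product

def sequence_space(n, alphabet="01"):
    """All sequences of length n over the given alphabet."""
    return ["".join(s) for s in product(alphabet, repeat=n)]

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def error_classes(reference, sequences):
    """Group sequences by their Hamming distance from the reference."""
    classes = {}
    for s in sequences:
        classes.setdefault(hamming(reference, s), []).append(s)
    return classes

space = sequence_space(5)
for reference in ("00000", "10110"):
    classes = error_classes(reference, space)
    print(reference, [len(classes[d]) for d in sorted(classes)])
```

Both references yield the same class sizes, the binomial coefficients of n = 5.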
Fitness landscapes not showing error thresholds
Error thresholds and gradual transitions: n = 20 and σ = 10
Anne Kupczok, Peter Dittrich, Determinants of simulated RNA evolution. J.Theor.Biol. 238:726-735, 2006
Three sources of ruggedness:
1. Variation in fitness values
2. Deviations from uniform error rates
3. Neutrality

Three sources of ruggedness: 1. Variation in fitness values
Fitness landscapes showing error thresholds
Error threshold: Error classes and individual sequences, n = 10 and σ = 2
Error threshold: Individual sequences, n = 10, σ = 2 and d = 0, 1.0, 1.85
Error threshold: Error classes and individual sequences, n = 10 and σ = 1.1
Error threshold: Individual sequences, n = 10, σ = 1.1, d = 1.95, 1.975, 2.00 and seed = 877
Error threshold: Individual sequences, n = 10, σ = 1.1, d = 1.975, and seed = 877, 637, 491
Three sources of ruggedness: 2. Deviations from uniform error rates
Local replication accuracy $p_k$: $p_k = p + 4\,\delta\,p\,(1-p)\,(X_{rnd} - 0.5)$, $k = 1, 2, \ldots, 2^n$
Error threshold: Classes n = 10, σ = 1.1, δ = 0, 0.3, 0.5, and seed = 877
Error threshold: Classes n = 10, σ = 1.1, δ = 0, 0.5, and seed = 299, 877
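The scattering of local error rates around the mean value p can be sketched as follows. This is an illustration under assumptions: `delta` stands for the scatter parameter that multiplies the random term in the formula above, X_rnd is taken to be uniform on [0, 1), and one value is drawn per sequence. Python's generator is used here, so a seed such as 877 will not reproduce the random numbers behind the figures.

```python
import random

def local_error_rates(p, delta, n, seed=0):
    """Draw p_k = p + 4*delta*p*(1-p)*(X - 0.5) for k = 1..2^n, X uniform on [0, 1)."""
    rng = random.Random(seed)
    return [p + 4.0 * delta * p * (1.0 - p) * (rng.random() - 0.5)
            for _ in range(2 ** n)]

p, n = 0.05, 10                      # illustrative mean error rate and chain length
for delta in (0.0, 0.3, 0.5):
    rates = local_error_rates(p, delta, n, seed=877)
    print(delta, round(min(rates), 5), round(max(rates), 5))
```

With this form every p_k lies within p ± 2·delta·p·(1-p), so delta = 0 recovers the uniform error rate and delta ≤ 0.5 keeps all values between 0 and 1.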
Three sources of ruggedness: 3. Neutrality
Error threshold: Individual sequences, n = 10, σ = 1.1, d = 1.0

4. Lethal mutations

5. Ruggedness of natural landscapes
[Figure: RNA backbone drawn from the 5'-end to the 3'-end with Na counterions, carrying the nucleobases N1 to N4, where N_k ∈ {A, U, G, C}; next to it, an RNA sequence (5'-end to 3'-end) folded into its secondary structure]
Definition of RNA structure
Number of sequences: $N = 4^n$; number of structures: $N_S < 3^n$
Criterion: minimum free energy (mfe)
Rules: _ ( _ ) _ ∈ {AU, CG, GC, GU, UA, UG}
A symbolic notation of RNA secondary structure that is equivalent to the conventional graphs
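The symbolic notation can be handled with a simple stack: every opening parenthesis is matched to the next unmatched closing one, and each matched pair of positions must carry one of the six allowed base pairs. The sketch below is a minimal illustration; it uses dots for unpaired positions, checks only the pairing rule, and ignores further constraints such as minimum loop size or free energy. The function names are chosen here.

```python
ALLOWED_PAIRS = {"AU", "CG", "GC", "GU", "UA", "UG"}

def parse_pairs(structure):
    """Return the sorted list of (i, j) index pairs encoded by a dot-bracket string."""
    stack, pairs = [], []
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            pairs.append((stack.pop(), i))
        elif symbol != ".":
            raise ValueError("unexpected symbol %r" % symbol)
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return sorted(pairs)

def is_compatible(sequence, structure):
    """True if every bracketed pair of positions holds an allowed base pair."""
    if len(sequence) != len(structure):
        return False
    return all(sequence[i] + sequence[j] in ALLOWED_PAIRS
               for i, j in parse_pairs(structure))

print(parse_pairs("((...))"))
print(is_compatible("GCAAAGC", "((...))"))
print(is_compatible("AAAAAAA", "((...))"))
```

Compatibility is a necessary condition only: a compatible sequence need not have the given structure as its minimum free energy structure.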
AUCAAUCAG GUCAAUCAC GUCAAUCAU GUCAAUCAA GUCAAUCCG GUCAAUCGG GUCAAUCUG GUCAAUGAG GUCAAUUAG
GUCAAUAAG GUCAACCAG GUCAAGCAG GUCAAACAG GUCACUCAG GUCAGUCAG GUCAUUCAG GUCCAUCAG GUCGAUCAG
GUCUAUCAG GUGAAUCAG GUUAAUCAG GUAAAUCAG GCCAAUCAG GGCAAUCAG GACAAUCAG UUCAAUCAG CUCAAUCAG
GUCAAUCAG
One-error neighborhood
The surrounding of GUCAAUCAG in sequence space
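The one-error neighborhood of a sequence is generated by replacing each position in turn with each of the other three nucleotides, which gives 3n neighbors for chain length n. A minimal sketch (the function name is chosen here):

```python
def one_error_neighbors(sequence, alphabet="AUGC"):
    """All sequences that differ from the input at exactly one position."""
    return [sequence[:i] + base + sequence[i + 1:]
            for i, original in enumerate(sequence)
            for base in alphabet
            if base != original]

neighbors = one_error_neighbors("GUCAAUCAG")
print(len(neighbors))
print(neighbors[:3])
```

For GUCAAUCAG with n = 9 this yields the 27 sequences listed on the slide.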
One error neighborhood – Surrounding of an RNA molecule in sequence and shape space
Reference sequence: GGCUAUCGUACGUUUACCCAAAAGUCUACGUUGGACCCAGGCAUUGGACG
[Figure sequence: the reference sequence with its secondary structure, followed step by step by single-point mutants and their structures]
Sequences shown (the reference and thirteen one-error mutants):
GGCUAUCGUAUGUUUACCCAAAAGUCUACGUUGGACCCAGGCAUUGGACG
GGCUAUCGUACGUUUACCCAAAAGUCUACGUUGGACCCAGGCAUUAGACG
GGCUAUCGUACGUUUACUCAAAAGUCUACGUUGGACCCAGGCAUUGGACG
GGCUAUCGUACGCUUACCCAAAAGUCUACGUUGGACCCAGGCAUUGGACG
GGCCAUCGUACGUUUACCCAAAAGUCUACGUUGGACCCAGGCAUUGGACG
GGCUAUCGUACGUUUACCCAAAAGUCUACGUUGGACCCAGGCAUUGGACG
GGCUAUCGUACGUGUACCCAAAAGUCUACGUUGGACCCAGGCAUUGGACG
GGCUAACGUACGUUUACCCAAAAGUCUACGUUGGACCCAGGCAUUGGACG
GGCUAUCGUACGUUUACCCAAAAGUCUACGUUGGACCCUGGCAUUGGACG
GGCUAUCGUACGUUUACCCAAAAGUCUACGUUGGACCCAGGCACUGGACG
GGCUAUCGUACGUUUACCCAAAAGUCUACGUUGGUCCCAGGCAUUGGACG
GGCUAGCGUACGUUUACCCAAAAGUCUACGUUGGACCCAGGCAUUGGACG
GGCUAUCGUACGUUUACCCGAAAGUCUACGUUGGACCCAGGCAUUGGACG
GGCUAUCGUACGUUUACCCAAAAGCCUACGUUGGACCCAGGCAUUGGACG
Quantity                    Number   Mean Value   Variance    Std.Dev.
Total Hamming Distance:     150000   11.647973    23.140715   4.810480
Nonzero Hamming Distance:    99875   16.949991    30.757651   5.545958
Degree of Neutrality:        50125    0.334167     0.006961   0.083434
Number of Structures:         1000   52.31        85.30       9.24

Rank  Structure                                            Number   Frequency
 1    (((((.((((..(((......)))..)))).))).)).............   50125    0.334167
 2    ..(((.((((..(((......)))..)))).)))................    2856    0.019040
 3    ((((((((((..(((......)))..)))))))).)).............    2799    0.018660
 4    (((((.((((..((((....))))..)))).))).)).............    2417    0.016113
 5    (((((.((((.((((......)))).)))).))).)).............    2265    0.015100
 6    (((((.(((((.(((......))).))))).))).)).............    2233    0.014887
 7    (((((..(((..(((......)))..)))..))).)).............    1442    0.009613
 8    (((((.((((..((........))..)))).))).)).............    1081    0.007207
 9    ((((..((((..(((......)))..))))..)).)).............    1025    0.006833
10    (((((.((((..(((......)))..)))).)))))..............    1003    0.006687
11    .((((.((((..(((......)))..)))).))))...............     963    0.006420
12    (((((.(((...(((......)))...))).))).)).............     860    0.005733
13    (((((.((((..(((......)))..)))).)).))).............     800    0.005333
14    (((((.((((...((......))...)))).))).)).............     548    0.003653
15    (((((.((((................)))).))).)).............     362    0.002413
16    ((.((.((((..(((......)))..)))).))..)).............     337    0.002247
17    (.(((.((((..(((......)))..)))).))).)..............     241    0.001607
18    (((((.(((((((((......))))))))).))).)).............     231    0.001540
19    ((((..((((..(((......)))..))))...)))).............     225    0.001500
20    ((....((((..(((......)))..)))).....)).............     202    0.001347
Shadow – Surrounding of an RNA structure in shape space – AUGC alphabet

6. Simulation of stochastic phenomena
Phenylalanyl-tRNA as target structure. Structure of randomly chosen initial sequence.
Evolution in silico: W. Fontana, P. Schuster, Science 280 (1998), 1451-1455
Evolution of RNA molecules as a Markov process and its analysis by means of the relay series
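Replication rate constant (fitness): $f_k = \gamma \,/\, [\alpha + \Delta d_S^{(k)}]$
$\Delta d_S^{(k)} = d_H(S_k, S_\tau)$, the Hamming distance between the structure $S_k$ and the target structure $S_\tau$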
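The fitness rule can be sketched in a few lines, reading the formula as f_k = γ / (α + Δd), where Δd is the Hamming distance between the dot-bracket string of structure S_k and that of the target. Everything numerical here is an assumption made for illustration: the names `gamma` and `alpha` for the two constants, their default values, and the short example structures. A positive `alpha` keeps the fitness finite when the target is reached.

```python
def structure_distance(structure, target):
    """Hamming distance between two dot-bracket strings of equal length."""
    if len(structure) != len(target):
        raise ValueError("structures must have the same length")
    return sum(a != b for a, b in zip(structure, target))

def fitness(structure, target, gamma=1.0, alpha=0.01):
    """Replication rate constant gamma / (alpha + distance to target)."""
    return gamma / (alpha + structure_distance(structure, target))

target = "((((...))))"                      # hypothetical target structure
for candidate in ("...........", "(((.....)))", "((((...))))"):
    print(candidate, structure_distance(candidate, target),
          round(fitness(candidate, target), 4))
```

Fitness grows as the structure distance shrinks and reaches its maximum gamma / alpha at the target, which is what drives the optimization runs described on the following slides.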
Selection pressure: the population size, N = number of RNA molecules, is determined by the flux: $N(t) \approx \bar{N} \pm \sqrt{\bar{N}}$
Mutation rate: p = 0.001 per nucleotide per replication
The flow reactor as a device for studying the evolution of molecules in vitro and in silico.
In silico optimization in the flow reactor: Evolutionary Trajectory
28 neutral point mutations during a long quasi-stationary epoch
Transition-inducing point mutations change the molecular structure
Neutral point mutations leave the molecular structure unchanged
Neutral genotype evolution during phenotypic stasis
Randomly chosen initial structure Phenylalanyl-tRNA as target structure
Evolutionary trajectory
Spreading of the population on neutral networks
Drift of the population center in sequence space
Spreading and evolution of a population on a neutral network: snapshots at t = 150, 170, 200, 350, 500, 650, 820, 825, 830, 835, 840, 845, 850, 855
Anne Kupczok, Peter Dittrich, Determinants of simulated RNA evolution. J.Theor.Biol. 238:726-735, 2006
A sketch of optimization on neutral networks
Figure labels: initial state, target, extinction
Replication, mutation and dilution
Application of molecular evolution to problems in biotechnology
Acknowledgement of support
Fonds zur Förderung der wissenschaftlichen Forschung (FWF): Projects No. 09942, 10578, 11065, 13093, 13887, and 14898
Wiener Wissenschafts-, Forschungs- und Technologiefonds (WWTF): Project No. Mat05
Jubiläumsfonds der Österreichischen Nationalbank: Project No. Nat-7813
European Commission: Contracts No. 98-0189, 12835 (NEST)
Austrian Genome Research Program – GEN-AU
Siemens AG, Austria
Universität Wien and the Santa Fe Institute
Coworkers
Walter Fontana, Harvard Medical School, MA
Christian Forst, Christian Reidys, Los Alamos National Laboratory, NM
Peter Stadler, Bärbel Stadler, Universität Leipzig, GE
Jord Nagel, Kees Pleij, Universiteit Leiden, NL
Christoph Flamm, Ivo L. Hofacker, Andreas Svrček-Seiler, Universität Wien, AT
Kurt Grünberger, Michael Kospach, Andreas Wernitznig, Stefanie Widder, Michael Wolfinger, Stefan Wuchty, Universität Wien, AT
Stefan Bernhart, Jan Cupal, Lukas Endler, Ulrike Langhammer, Rainer Machne, Ulrike Mückstein, Hakim Tafer, Universität Wien, AT
Ulrike Göbel, Walter Grüner, Stefan Kopp, Jaqueline Weber, Institut für Molekulare Biotechnologie, Jena, GE