The advent of information and combinatorial complexity: Understanding Darwinian evolution at the molecular level Peter Schuster
Institut für Theoretische Chemie, Universität Wien, Austria and The Santa Fe Institute, Santa Fe, New Mexico, USA
ESF-COST Conference on Systems Chemistry Acquafredda di Maratea, 03.– 08.10.2008
Web-Page for further information: http://www.tbi.univie.ac.at/~pks
What is information?
- Information is (only) what is understood.
- Information is (only) what creates information.
Carl Friedrich von Weizsäcker, 1912-2007, German physicist and philosopher.
Information in biology
- Understanding of information is interpreted as decoding,
- maintenance of information requires reproduction, and
- creation of information occurs through adaptation to the environment by means of a Darwinian mechanism of variation and selection.
1. Requirements for information processing
2. The chemistry of Darwinian evolution
3. RNA sequences and structures
4. Consequences of neutrality
5. Evolutionary optimization of RNA structure
1. Requirements for information processing
Classification of purine-pyrimidine base pairs
Classification of purine-purine base pairs
Classification of pyrimidine-pyrimidine base pairs
General classification of base pairs
N.B. Leontis and E. Westhof, RNA 7:499-512 (2001)
James D. Watson, 1928-, and Francis H.C. Crick, 1916-2004 Nobel prize 1962
1953 – 2003: fifty years of the double helix
The three-dimensional structure of a short double-helical stack of B-DNA
[Figure: hydrogen bonding patterns of base pairs built from C, G, "A" and U and from related nucleobases. Color code: donor-acceptor versus acceptor-donor.]
Nucleobase labels in the figure: 2,6-diamino purine; 2-keto, 6-amino purine; 2,6-diketo purine; 2-amino, 6-keto purine; 5-keto, 7-amino, 1,6,8-triaza indolicine; 5-amino, 7-keto, 1,6,8-triaza indolicine; 2-keto, 4-amino pyrimidine; 2-amino, 4-keto pyrimidine; 2,4-diketo pyrimidine; 2,6-diamino pyrimidine; 2-amino, 6-keto pyrazine; 2-keto, 6-amino pyrazine
Hydrogen bonding patterns for Watson-Crick base pairs
S.A. Benner et al., Reading the palimpsest: Contemporary biochemical data and the RNA world. In: R.F.Gesteland and J.F.Atkins, eds. The RNA World, pp.27-70. CSHL Press, 1993
Canonical Watson-Crick base pairs: cytosine – guanine and uracil – adenine
W.Saenger, Principles of Nucleic Acid Structure, Springer, Berlin 1984
$4^n$ different sequences for chain length $n$; for $n = 100$: $4^{100} \approx 1.6 \times 10^{60}$ sequences
Combinatorial complexity in biopolymer sequences
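The count of possible sequences can be checked with exact integer arithmetic. The short sketch below is only an illustration; the function name `count_sequences` is chosen here and does not come from the slides.

```python
def count_sequences(n: int, alphabet_size: int = 4) -> int:
    """Number of distinct sequences of chain length n over the given alphabet."""
    return alphabet_size ** n


if __name__ == "__main__":
    total = count_sequences(100)
    # Python integers are exact, so the full 61-digit number is available.
    print(total)
    # Scientific notation for comparison with the figure quoted on the slide.
    print(f"{total:.1e}")
```

The same function with `alphabet_size=2` gives the size of a two-letter (GC only) sequence space.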
Information processing requires digitalization in the sense of „yes-or-no“ decisions. Nature solves the problem through complementarity of nucleobases:
- Biological information storage in nucleic acids is extremely specific because it applies the straightforward stereochemistry of the double helix.
- Biological information processing overcomes thermodynamic restrictions without violating the rules of thermodynamics.
- Digitalization of biological information is the key to easily accessible combinatorial complexity and provides the basis for the inexhaustible reservoir of genotypes and shapes in nature.
2. The chemistry of Darwinian evolution
Three necessary conditions for Darwinian evolution are: 1. Multiplication, 2. Variation, and 3. Selection. Variation through mutation and recombination operates on the genotype whereas the phenotype is the target of selection. One important property of the Darwinian scenario is that variations in the form of mutations or recombination events occur uncorrelated with their effects on the selection process. All conditions can be fulfilled not only by cellular organisms but also by nucleic acid molecules in suitable cell-free experimental assays.
DNA structure and DNA replication
‚Replication fork‘ in DNA replication
The mechanism of DNA replication is ‚semi-conservative‘
Complementary replication is the simplest copying mechanism of RNA.
Complementarity is determined by Watson-Crick base pairs: G≡C and A=U
Kinetics of RNA replication
C.K. Biebricher, M. Eigen, W.C. Gardiner, Jr. Biochemistry 22:2544-2559, 1983
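$$\frac{dx_1}{dt} = f_2 x_2 \quad \text{and} \quad \frac{dx_2}{dt} = f_1 x_1$$
$$x_1 = \sqrt{f_2}\,\xi_1, \quad x_2 = \sqrt{f_1}\,\xi_2, \quad \zeta = \xi_1 + \xi_2, \quad \eta = \xi_1 - \xi_2, \quad f = \sqrt{f_1 f_2}$$
$$\eta(t) = \eta(0)\, e^{-ft}, \qquad \zeta(t) = \zeta(0)\, e^{ft}$$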
Complementary replication as the simplest molecular mechanism of reproduction
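A numerical check can make the plus-minus kinetics concrete. The sketch below assumes the usual form of complementary replication, dx1/dt = f2·x2 and dx2/dt = f1·x1, and the scaled variables ξ1 = x1/√f2, ξ2 = x2/√f1. It integrates the two equations with a hand-written Runge-Kutta step and compares the sum ζ = ξ1 + ξ2 and the difference η = ξ1 − ξ2 with e^{ft} and e^{−ft}, where f = √(f1·f2). The rate constants and initial values are arbitrary illustration values.

```python
import math


def integrate_plus_minus(f1, f2, x1, x2, t_end, steps):
    """Integrate dx1/dt = f2*x2, dx2/dt = f1*x1 with classical RK4."""
    h = t_end / steps

    def deriv(a, b):
        return f2 * b, f1 * a

    for _ in range(steps):
        k1 = deriv(x1, x2)
        k2 = deriv(x1 + 0.5 * h * k1[0], x2 + 0.5 * h * k1[1])
        k3 = deriv(x1 + 0.5 * h * k2[0], x2 + 0.5 * h * k2[1])
        k4 = deriv(x1 + h * k3[0], x2 + h * k3[1])
        x1 += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        x2 += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return x1, x2


def zeta_eta(f1, f2, x1, x2):
    """Sum and difference of the scaled concentrations xi1 and xi2."""
    xi1, xi2 = x1 / math.sqrt(f2), x2 / math.sqrt(f1)
    return xi1 + xi2, xi1 - xi2


if __name__ == "__main__":
    f1, f2, t_end = 1.0, 2.0, 5.0           # hypothetical rate constants
    f = math.sqrt(f1 * f2)
    zeta0, eta0 = zeta_eta(f1, f2, 1.0, 0.2)
    x1, x2 = integrate_plus_minus(f1, f2, 1.0, 0.2, t_end, 10000)
    zeta, eta = zeta_eta(f1, f2, x1, x2)
    print("zeta numeric / analytic:", zeta, zeta0 * math.exp(f * t_end))
    print("eta  numeric / analytic:", eta, eta0 * math.exp(-f * t_end))
```

The growing mode ζ dominates at long times, which is why a plus-minus pair behaves like a single replicator with the effective rate constant f.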
A point mutation is caused by an incorrect incorporation of a nucleobase into the growing chain during replication. Replication and mutation are parallel chemical reactions.
Evolution of RNA molecules based on Qβ phage
D.R. Mills, R.L. Peterson, S. Spiegelman, An extracellular Darwinian experiment with a self-duplicating nucleic acid molecule. Proc.Natl.Acad.Sci.USA 58 (1967), 217-224
S. Spiegelman, An approach to the experimental analysis of precellular evolution. Quart.Rev.Biophys. 4 (1971), 213-253
C.K. Biebricher, Darwinian selection of self-replicating RNA molecules. Evolutionary Biology 16 (1983), 1-52
G. Bauer, H. Otten, J.S. McCaskill, Travelling waves of in vitro evolving RNA. Proc.Natl.Acad.Sci.USA 86 (1989), 7937-7941
C.K. Biebricher, W.C. Gardiner, Molecular evolution of RNA in vitro. Biophysical Chemistry 66 (1997), 179-192
G. Strunk, T. Ederhof, Machines for automated evolution experiments in vitro based on the serial transfer concept. Biophysical Chemistry 66 (1997), 193-202
F. Oehlenschläger, M. Eigen, 30 years later – A new approach to Sol Spiegelman's and Leslie Orgel's in vitro evolutionary studies. Orig.Life Evol.Biosph. 27 (1997), 437-457
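[Figure: serial transfer experiment. An RNA sample is added to a stock solution containing Qβ RNA-replicase, ATP, CTP, GTP and UTP, and buffer; along the time axis, material is transferred through tubes 1, 2, 3, 4, 5, 6, ... 69, 70.]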
Application of serial transfer to RNA evolution in the test tube
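The selection effect of serial transfer can be illustrated with a deliberately simple model. It assumes unlimited exponential growth between transfers and a dilution step that preserves the relative composition. The rate constants, the growth time and the function name `serial_transfer` are illustration values chosen here, not parameters of the experiments cited above.

```python
import math


def serial_transfer(rates, fractions, growth_time, transfers):
    """Relative composition of a replicating mixture after each transfer.

    rates       -- replication rate constants of the RNA species
    fractions   -- initial relative concentrations (should sum to 1)
    growth_time -- incubation time between two transfers
    transfers   -- number of transfers to simulate
    """
    history = [list(fractions)]
    for _ in range(transfers):
        # Exponential growth of every species in fresh stock solution.
        grown = [x * math.exp(f * growth_time)
                 for x, f in zip(history[-1], rates)]
        # Dilution into the next tube keeps proportions and resets the total.
        total = sum(grown)
        history.append([g / total for g in grown])
    return history


if __name__ == "__main__":
    # Two hypothetical species; the second replicates 10 % faster.
    history = serial_transfer([1.0, 1.1], [0.5, 0.5], 1.0, 70)
    for step in (0, 10, 35, 70):
        print(step, [round(x, 4) for x in history[step]])
```

Even a modest advantage in replication rate leads to near-complete takeover after a few dozen transfers, because the advantage compounds exponentially at every step.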
Chemical kinetics of molecular evolution
- M. Eigen, P. Schuster, `The Hypercycle´, Springer-Verlag, Berlin 1979
Chemical kinetics of replication and mutation as parallel reactions
Quasispecies
Driving virus populations through threshold
The error threshold in replication
Molecular evolution of viruses
A fitness landscape showing an error threshold
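Quasispecies as a function of the mutation rate p
[Plot residue. Recoverable labels: single-peak fitness landscape, in which the master sequence carries the fitness $f_0 = 10$ and all other sequences $I_1, \ldots, I_N$ share one common fitness value; $\sigma$, the superiority of the master sequence; $N = \kappa^n$.]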
Fitness landscapes showing error thresholds
Error threshold: Individual sequences, n = 10, σ = 2 and d = 0, 1.0, 1.85
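A rough feeling for where the error threshold lies can be obtained from the standard zeroth-order approximation of quasispecies theory, which neglects back mutation to the master sequence. This is a textbook approximation used here for illustration, not the exact computation behind the plots. Under this assumption the stationary master fraction is (σQ − 1)/(σ − 1) with copy fidelity Q = (1 − p)^n, and it vanishes at p = 1 − σ^(−1/n). The chain length 10 and superiority 2 are assumed example values.

```python
import math


def copy_fidelity(p, n):
    """Probability of copying a chain of length n without any error."""
    return (1.0 - p) ** n


def master_fraction(p, n, sigma):
    """Stationary master fraction, neglecting back mutation (approximation)."""
    q = copy_fidelity(p, n)
    return max(0.0, (sigma * q - 1.0) / (sigma - 1.0))


def error_threshold(n, sigma):
    """Mutation rate per site at which the approximate master fraction vanishes."""
    return 1.0 - sigma ** (-1.0 / n)


if __name__ == "__main__":
    n, sigma = 10, 2.0                      # assumed example values
    p_max = error_threshold(n, sigma)
    print("threshold:", p_max, "  ln(sigma)/n:", math.log(sigma) / n)
    for p in (0.0, 0.02, 0.04, 0.06, p_max):
        print(p, master_fraction(p, n, sigma))
```

For small mutation rates the threshold is close to ln(σ)/n, which shows directly why longer chains tolerate fewer errors per site.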
Evolutionary design of RNA molecules
A.D. Ellington, J.W. Szostak, In vitro selection of RNA molecules that bind specific ligands. Nature 346 (1990), 818-822
C. Tuerk, L. Gold, SELEX - Systematic evolution of ligands by exponential enrichment: RNA ligands to bacteriophage T4 DNA polymerase. Science 249 (1990), 505-510
D.P. Bartel, J.W. Szostak, Isolation of new ribozymes from a large pool of random sequences. Science 261 (1993), 1411-1418
R.D. Jenison, S.C. Gill, A. Pardi, B. Poliski, High-resolution molecular discrimination by RNA. Science 263 (1994), 1425-1429
Y. Wang, R.R. Rando, Specific binding of aminoglycoside antibiotics to RNA. Chemistry & Biology 2 (1995), 281-290
L. Jiang, A.K. Suri, R. Fiala, D.J. Patel, Saccharide-RNA recognition in an aminoglycoside antibiotic-RNA aptamer complex. Chemistry & Biology 4 (1997), 35-50
An example of ‘artificial selection’ with RNA molecules or ‘breeding’ of biomolecules
tobramycin RNA aptamer, n = 27
Formation of secondary structure of the tobramycin binding RNA aptamer with KD = 9 nM
L. Jiang, A.K. Suri, R. Fiala, D.J. Patel, Saccharide-RNA recognition in an aminoglycoside antibiotic-RNA aptamer complex. Chemistry & Biology 4 (1997), 35-50
Application of molecular evolution to problems in biotechnology
Artificial evolution in biotechnology and pharmacology
G.F. Joyce. 2004. Directed evolution of nucleic acid enzymes. Annu.Rev.Biochem. 73:791-836.
C. Jäckel, P. Kast, and D. Hilvert. 2008. Protein design by directed evolution. Annu.Rev.Biophys. 37:153-173.
S.J. Wrenn and P.B. Harbury. 2007. Chemical evolution as a tool for molecular discovery. Annu.Rev.Biochem. 76:331-349.
Results from kinetic theory of molecular evolution and evolution experiments:
- Evolutionary optimization does not require cells and occurs as well in cell-free molecular systems.
- Replicating ensembles of molecules form stationary populations called quasispecies, which represent the genetic reservoir of asexually reproducing species.
- For stable inheritance of genetic information mutation rates must not exceed a precisely defined and computable error threshold.
- The error threshold can be exploited for the development of novel antiviral strategies.
- In vitro evolution allows for production of molecules for predefined purposes and gave rise to a branch of biotechnology.
3. RNA sequences and structures
[Diagram: RNA folding and the determination of RNA function. Molecular recognition: binding to the ground state (aptamers). Catalysis: binding to the transition state (ribozymes).]
The paradigm of structural biology
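[Figure: chemical formula of an RNA chain from the 5'-end to the 3'-end: ribose-phosphate backbone with Na+ counterions carrying the nucleobases N1, N2, N3 and N4, where N_k = A, U, G, C.]
[Figure: an RNA sequence drawn from the 5'-end to the 3'-end; sequence fragments as labeled in the figure: GCGGAU AUUCGC UUA AGUUGGGA G CUGAAGA AGGUC UUCGAUC A ACCA GCUC GAGC CCAGA UCUGG CUGUG CACAG]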
Definition of RNA structure
$N = 4^n$, $N_S < 3^n$
Criterion: minimum free energy (mfe)
Rules: _ ( _ ) _ with base pairs from {AU, CG, GC, GU, UA, UG}
A symbolic notation of RNA secondary structure that is equivalent to the conventional graphs
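The parenthesis notation can be turned back into a list of base pairs with a simple stack. The sketch below does this and checks every pair against the six allowed pairs named in the rules. The function names are chosen here for illustration; the example sequence and structure are the 50-nucleotide molecule that appears later in this talk.

```python
ALLOWED_PAIRS = {"AU", "CG", "GC", "GU", "UA", "UG"}


def parse_dot_bracket(structure):
    """Return the sorted list of base pairs (i, j), 0-based, of a structure."""
    stack, pairs = [], []
    for j, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            if not stack:
                raise ValueError("closing bracket without partner")
            pairs.append((stack.pop(), j))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r}")
    if stack:
        raise ValueError("opening bracket without partner")
    return sorted(pairs)


def is_compatible(sequence, structure):
    """True if every pair of the structure is an allowed base pair."""
    if len(sequence) != len(structure):
        return False
    return all(sequence[i] + sequence[j] in ALLOWED_PAIRS
               for i, j in parse_dot_bracket(structure))


if __name__ == "__main__":
    seq = "GGCUAUCGUACGUUUACCCAAAAGUCUACGUUGGACCCAGGCAUUGGACG"
    mfe = "(((((.((((..(((......)))..)))).))).))............."
    print(parse_dot_bracket(mfe))
    print(is_compatible(seq, mfe))
```

Because matching brackets never cross, one pass with a stack is sufficient, which is also the reason the notation is equivalent to a planar graph drawing.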
What is neutrality?
Selective neutrality = several genotypes having the same fitness.
Structural neutrality = several genotypes forming molecules with the same structure.
Reference for postulation and in silico verification of neutral networks
many genotypes
one phenotype
AUCAAUCAG GUCAAUCAC GUCAAUCAU GUCAAUCAA GUCAAUCCG GUCAAUCGG GUCAAUCUG GUCAAUGAG GUCAAUUAG GUCAAUAAG GUCAACCAG GUCAAGCAG GUCAAACAG GUCACUCAG GUCAGUCAG GUCAUUCAG GUCCAUCAG GUCGAUCAG GUCUAUCAG GUGAAUCAG GUUAAUCAG GUAAAUCAG GCCAAUCAG GGCAAUCAG GACAAUCAG UUCAAUCAG CUCAAUCAG
GUCAAUCAG
One-error neighborhood
The surrounding of GUCAAUCAG in sequence space
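The one-error neighborhood of a sequence is easy to enumerate: each of the n positions can be replaced by any of the three other letters, which gives 3n neighbors. A minimal sketch, with a function name chosen here for illustration:

```python
def one_error_neighbors(sequence, alphabet="AUGC"):
    """All sequences at Hamming distance 1 from the given sequence."""
    neighbors = []
    for i, old in enumerate(sequence):
        for new in alphabet:
            if new != old:
                neighbors.append(sequence[:i] + new + sequence[i + 1:])
    return neighbors


if __name__ == "__main__":
    center = "GUCAAUCAG"
    neighbors = one_error_neighbors(center)
    print(len(neighbors))          # 3 * 9 single-point mutants
    print(neighbors[:3])
```

For a two-letter alphabet such as GC, the call `one_error_neighbors(seq, "GC")` yields only n neighbors, which matters when the degree of neutrality is compared between alphabets.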
One error neighborhood – Surrounding of an RNA molecule of chain length n=50 in sequence and shape space
GGCUAUCGUAUGUUUACCCAAAAGUCUACGUUGGACCCAGGCAUUGGACG GGCUAUCGUACGUUUACCCAAAAGUCUACGUUGGACCCAGGCAUUAGACG GGCUAUCGUACGUUUACUCAAAAGUCUACGUUGGACCCAGGCAUUGGACG GGCUAUCGUACGCUUACCCAAAAGUCUACGUUGGACCCAGGCAUUGGACG GGCCAUCGUACGUUUACCCAAAAGUCUACGUUGGACCCAGGCAUUGGACG GGCUAUCGUACGUUUACCCAAAAGUCUACGUUGGACCCAGGCAUUGGACG GGCUAUCGUACGUGUACCCAAAAGUCUACGUUGGACCCAGGCAUUGGACG GGCUAACGUACGUUUACCCAAAAGUCUACGUUGGACCCAGGCAUUGGACG GGCUAUCGUACGUUUACCCAAAAGUCUACGUUGGACCCUGGCAUUGGACG GGCUAUCGUACGUUUACCCAAAAGUCUACGUUGGACCCAGGCACUGGACG GGCUAUCGUACGUUUACCCAAAAGUCUACGUUGGUCCCAGGCAUUGGACG GGCUAGCGUACGUUUACCCAAAAGUCUACGUUGGACCCAGGCAUUGGACG GGCUAUCGUACGUUUACCCGAAAGUCUACGUUGGACCCAGGCAUUGGACG GGCUAUCGUACGUUUACCCAAAAGCCUACGUUGGACCCAGGCAUUGGACG
GGCUAUCGUACGUUUACCCAAAAGUCUACGUUGGACCCAGGCAUUGGACG
Quantity: Number, Mean Value, Variance, Std.Dev.
Total Hamming Distance: 150000, 11.647973, 23.140715, 4.810480
Nonzero Hamming Distance: 99875, 16.949991, 30.757651, 5.545958
Degree of Neutrality: 50125, 0.334167, 0.006961, 0.083434
Number of Structures: 1000, 52.31, 85.30, 9.24

Rank, structure, count, frequency:
1 (((((.((((..(((......)))..)))).))).))............. 50125 0.334167
2 ..(((.((((..(((......)))..)))).)))................ 2856 0.019040
3 ((((((((((..(((......)))..)))))))).))............. 2799 0.018660
4 (((((.((((..((((....))))..)))).))).))............. 2417 0.016113
5 (((((.((((.((((......)))).)))).))).))............. 2265 0.015100
6 (((((.(((((.(((......))).))))).))).))............. 2233 0.014887
7 (((((..(((..(((......)))..)))..))).))............. 1442 0.009613
8 (((((.((((..((........))..)))).))).))............. 1081 0.007207
9 ((((..((((..(((......)))..))))..)).))............. 1025 0.006833
10 (((((.((((..(((......)))..)))).))))).............. 1003 0.006687
11 .((((.((((..(((......)))..)))).))))............... 963 0.006420
12 (((((.(((...(((......)))...))).))).))............. 860 0.005733
13 (((((.((((..(((......)))..)))).)).)))............. 800 0.005333
14 (((((.((((...((......))...)))).))).))............. 548 0.003653
15 (((((.((((................)))).))).))............. 362 0.002413
16 ((.((.((((..(((......)))..)))).))..))............. 337 0.002247
17 (.(((.((((..(((......)))..)))).))).).............. 241 0.001607
18 (((((.(((((((((......))))))))).))).))............. 231 0.001540
19 ((((..((((..(((......)))..))))...))))............. 225 0.001500
20 ((....((((..(((......)))..)))).....))............. 202 0.001347

Reference sequence: GGCUAUCGUACGUUUACCCAAAAGUCUACGUUGGACCCAGGCAUUGGACG
Shadow – Surrounding of an RNA structure in shape space: AUGC alphabet, chain length n=50
4. Consequences of neutrality
Charles Darwin. The Origin of Species. Sixth edition. John Murray. London: 1872
Motoo Kimura's population genetics of neutral evolution. Evolutionary rate at the molecular level. Nature 217: 624-626, 1968. The Neutral Theory of Molecular Evolution. Cambridge University Press. Cambridge, UK, 1983.
The average time of replacement of a dominant genotype in a population is the reciprocal of the mutation rate and is therefore independent of population size.
Is the Kimura scenario correct for frequent mutations?
$d_H = 1$: $\lim_{p \to 0} x_1(p) = x_2(p) = 0.5$
$d_H = 2$: $\lim_{p \to 0} x_1(p) = a$, $\lim_{p \to 0} x_2(p) = 1 - a$
$d_H \geq 3$: random fixation in the sense of Motoo Kimura
Pairs of genotypes in neutral replication networks
for comparison: = 0, = 1.1, d = 0
Neutral network: Individual sequences n = 10, = 1.1, d = 1.0
Consensus sequence of a quasispecies of two strongly coupled sequences of Hamming distance $d_H(X_i, X_j) = 1$.
Neutral network: Individual sequences n = 10, = 1.1, d = 1.0
Consensus sequence of a quasispecies of two strongly coupled sequences of Hamming distance $d_H(X_i, X_j) = 2$.
N = 7
Computation of sequences in the core of a neutral network
N = 7 Neutral networks with increasing : = 0.10, s = 229
N = 24 Neutral networks with increasing : = 0.15, s = 229
N = 70 Neutral networks with increasing : = 0.20, s = 229
Extension of the notion of structure
mfe-weight: 0.7196
Sequence: GGCCCCUUUGGGGGCCAGACCCCUAAAGGGGUC
Structure, free energy (kcal/mol):
((((((((((((((.....)))))))))))))) -26.30
((((((....)))))).((((((....)))))) -25.30
.(((((((((((((.....))))))))))))). -24.80
(((((((((((((.......))))))))))))) -24.50
((((((....)))))).(((((......))))) -23.40
(((((......))))).((((((....)))))) -23.30
..((((((((((((.....)))))))))))).. -23.10
(((((((((((((......)))).))))))))) -23.00
.((((((((((((.......)))))))))))). -23.00
(((((((.((((((.....)))))).))))))) -22.80
((((((((.(((((.....))))).)))))))) -22.70
((((((....))))))..(((((....))))). -22.70
((((((.(((((((.....))))))).)))))) -22.20
(((((((((.((((.....)))).))))))))) -22.10
(.((((((((((((.....)))))))))))).) -21.90
.(((((((((((((.....)))))))))))).) -21.90
((((((....))))))...((((....)))).. -21.60
(((((((..(((((.....)))))..))))))) -21.50
.((((((((((((......)))).)))))))). -21.50
(((((......))))).(((((......))))) -21.40
.((((((.((((((.....)))))).)))))). -21.30
..(((((((((((.......))))))))))).. -21.30
Suboptimal structures and partition function of a small RNA molecule: n = 33
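How a list of suboptimal free energies translates into Boltzmann weights can be sketched in a few lines. The code below takes only the 22 energies listed for the n = 33 molecule, so the resulting weight of the minimum free energy structure is an upper estimate: the full partition function includes all remaining structures as well. The temperature of 37 °C and the energy unit kcal/mol are assumptions made here.

```python
import math

GAS_CONSTANT = 1.98720e-3   # kcal / (mol K)


def boltzmann_weights(energies, temperature_kelvin=310.15):
    """Normalized Boltzmann weights for a list of free energies in kcal/mol."""
    rt = GAS_CONSTANT * temperature_kelvin
    e_min = min(energies)
    # Subtracting the minimum keeps the exponentials in a safe numeric range.
    factors = [math.exp(-(e - e_min) / rt) for e in energies]
    z = sum(factors)
    return [x / z for x in factors]


# Free energies of the listed suboptimal structures (truncated ensemble).
ENERGIES = [-26.30, -25.30, -24.80, -24.50, -23.40, -23.30, -23.10, -23.00,
            -23.00, -22.80, -22.70, -22.70, -22.20, -22.10, -21.90, -21.90,
            -21.60, -21.50, -21.50, -21.40, -21.30, -21.30]

if __name__ == "__main__":
    weights = boltzmann_weights(ENERGIES)
    print("weight of the mfe structure in the truncated ensemble:", weights[0])
```

The truncated estimate comes out slightly above the mfe-weight quoted on the slide, as expected when part of the ensemble is left out.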
GGCUAUCGUACGUUUACCCAAAAGUCUACGUUGGACCCAGGCAUUGGACG
(((((.((((..(((......)))..)))).))).))............. -7.30
..........((((((.((....((((.....))))...))...)))))) -6.70
..........((((((.((....(((((...)))))...))...)))))) -6.60
..(((.((((..(((......)))..)))).)))..((((...))))... -6.10
(((((.((((..(((......)))..)))).))).))..(........). -6.00
(((((.((((..((........))..)))).))).))............. -6.00
.(((.((..((((..((......))..))))..))....)))........ -6.00

GGCUAUCGUACGUUUACACAAAAGUCUACGUUGGACCCAGGCAUUGGACG
(((((.((((..(((......)))..)))).))).))............. -7.30
.(((.((..((((..((......))..))))..))....)))........ -6.50
.(((.....((((..((......))..))))((....)))))........ -6.30
..(((.((((..(((......)))..)))).)))..((((...))))... -6.10
(((((.((((..(((......)))..)))).))).))..(........). -6.00
(((((.((((..((........))..)))).))).))............. -6.00
.(((...((((((..((......))..))))...))...)))........ -6.00

GGCUAUCGUACGUUUACCCAAAAGUCUACGUUGGACCCAGGCAAUGGACG
(((((.((((..(((......)))..)))).))).))............. -7.30
..(((.((((..(((......)))..)))).)))..(((.....)))... -7.20
..........((((((.((....((((.....))))...))...)))))) -6.70
..........((((((.((....(((((...)))))...))...)))))) -6.60
(((((.((((..(((......)))..)))).))).))((.....)).... -6.50
(.(((.((((..(((......)))..)))).))).)(((.....)))... -6.30
.((((.((((..(((......)))..)))).))).)(((.....)))... -6.30
.....(((.((((..((......))..)))))))..(((.....)))... -6.30
(.(((.((((..(((......)))..)))).)))..(((.....))).). -6.10
.....((..((((..((......))..))))..)).(((.....)))... -6.10
......(((.((((...((....((((.....))))...)).)))).))) -6.10
(((((.((((..(((......)))..)))).))).))..(........). -6.00
(((((.((((..((........))..)))).))).))............. -6.00
.(((.((..((((..((......))..))))..))....)))........ -6.00
......(((.((((...((....(((((...)))))...)).)))).))) -6.00
[Figure: the RNA molecule JN1LH and its alternative hairpin conformations, drawn from the 5'-end to the 3'-end with nucleotide positions 3, 13, 23, 33 and 44 marked; structure diagrams labeled 1D, 2D and R. Free energies shown in the figure: −28.6, −31.8, −28.2 and −28.6 kcal·mol⁻¹.]
J.H.A. Nagel, C. Flamm, I.L. Hofacker, K. Franke, M.H. de Smit, P. Schuster, and C.W.A. Pleij. Structural parameters affecting the kinetic competition of RNA hairpin formation. Nucleic Acids Res. 34:3568-3576, 2006.
An RNA switch
A ribozyme switch
E.A. Schultes, D.P. Bartel, Science 289 (2000), 448-452
Two ribozymes of chain length n = 88 nucleotides: an artificial ligase (A) and a natural cleavage ribozyme of hepatitis-δ-virus (B)
The sequence at the intersection: an RNA molecule that is 88 nucleotides long and can form both structures
Two neutral walks through sequence space with conservation of structure and catalytic activity
RNA 9:1456-1463, 2003
Evidence for neutral networks and shape space covering
Evidence for neutral networks and intersection of aptamer functions
Neutrality in molecular structures and its role in evolution:
- Neutrality is an essential feature in biopolymer structures at the resolution that is relevant for function.
- Neutrality manifests itself in the search for minimum free energy structures.
- Diversity in function despite neutrality in structures results from differences in suboptimal conformations and folding kinetics.
- Neutrality is indispensable for optimization and adaptation.
5. Evolutionary optimization of RNA structure
Evolution of RNA molecules as a Markov process and its analysis by means of the relay series
Evolution in silico
W. Fontana, P. Schuster, Science 280 (1998), 1451-1455
Phenylalanyl-tRNA as target structure
Structure of randomly chosen initial sequence
Replication rate constant (fitness): $f_k = \gamma / [\alpha + \Delta d_S^{(k)}]$ with $\Delta d_S^{(k)} = d_H(S_k, S_\tau)$
Selection pressure: the population size, N = number of RNA molecules, is determined by the flux: $N(t) \approx \bar{N} \pm \sqrt{\bar{N}}$
Mutation rate: p = 0.001 per nucleotide and replication
The flow reactor as a device for studying the evolution of molecules in vitro and in silico.
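The fitness assignment used in such a simulation can be sketched as a replication rate that decreases with the structure distance to the target. The code below assumes the Hamming distance between two dot-bracket strings as the distance measure and a rate of the form gamma / (alpha + distance). The names `gamma` and `alpha`, their default values and the toy structures are assumptions for illustration and are not taken from the original simulation.

```python
def hamming_distance(a, b):
    """Number of positions at which two strings of equal length differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def replication_rate(structure, target, gamma=1.0, alpha=1.0):
    """Replication rate constant that is highest when the target is reached.

    gamma and alpha are hypothetical scaling constants; alpha > 0 keeps the
    rate finite at distance zero.
    """
    return gamma / (alpha + hamming_distance(structure, target))


if __name__ == "__main__":
    target = "((((....))))"       # toy target structure
    current = "(((......)))"      # toy structure of a molecule in the reactor
    print(hamming_distance(current, target))
    print(replication_rate(current, target))
    print(replication_rate(target, target))
```

Any monotonically decreasing function of the distance would create the same ranking of structures; the hyperbolic form merely fixes how strongly small improvements near the target are rewarded.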
In silico optimization in the flow reactor: Evolutionary Trajectory
28 neutral point mutations during a long quasi-stationary epoch
Transition-inducing point mutations change the molecular structure
Neutral point mutations leave the molecular structure unchanged
Neutral genotype evolution during phenotypic stasis
Randomly chosen initial structure
Phenylalanyl-tRNA as target structure
Evolutionary trajectory
Spreading of the population on neutral networks
Drift of the population center in sequence space
Spreading and evolution of a population on a neutral network: t = 150, 170, 200, 350, 500, 650, 820, 825, 830, 835, 840, 845, 850, 855
A sketch of optimization on neutral networks
Is the degree of neutrality in GC space much lower than in AUGC space?
Statistics of RNA structure optimization: P. Schuster, Rep.Prog.Phys. 69:1419-1477, 2006
AUGC alphabet
Quantity: Number, Mean Value, Variance, Std.Dev.
Total Hamming Distance: 150000, 11.647973, 23.140715, 4.810480
Nonzero Hamming Distance: 99875, 16.949991, 30.757651, 5.545958
Degree of Neutrality: 50125, 0.334167, 0.006961, 0.083434
Number of Structures: 1000, 52.31, 85.30, 9.24

Rank, structure, count, frequency:
1 (((((.((((..(((......)))..)))).))).))............. 50125 0.334167
2 ..(((.((((..(((......)))..)))).)))................ 2856 0.019040
3 ((((((((((..(((......)))..)))))))).))............. 2799 0.018660
4 (((((.((((..((((....))))..)))).))).))............. 2417 0.016113
5 (((((.((((.((((......)))).)))).))).))............. 2265 0.015100
6 (((((.(((((.(((......))).))))).))).))............. 2233 0.014887
7 (((((..(((..(((......)))..)))..))).))............. 1442 0.009613
8 (((((.((((..((........))..)))).))).))............. 1081 0.007207
9 ((((..((((..(((......)))..))))..)).))............. 1025 0.006833
10 (((((.((((..(((......)))..)))).))))).............. 1003 0.006687
11 .((((.((((..(((......)))..)))).))))............... 963 0.006420
12 (((((.(((...(((......)))...))).))).))............. 860 0.005733
13 (((((.((((..(((......)))..)))).)).)))............. 800 0.005333
14 (((((.((((...((......))...)))).))).))............. 548 0.003653
15 (((((.((((................)))).))).))............. 362 0.002413
16 ((.((.((((..(((......)))..)))).))..))............. 337 0.002247
17 (.(((.((((..(((......)))..)))).))).).............. 241 0.001607
18 (((((.(((((((((......))))))))).))).))............. 231 0.001540
19 ((((..((((..(((......)))..))))...))))............. 225 0.001500
20 ((....((((..(((......)))..)))).....))............. 202 0.001347

GC alphabet
Quantity: Number, Mean Value, Variance, Std.Dev.
Total Hamming Distance: 50000, 13.673580, 10.795762, 3.285691
Nonzero Hamming Distance: 45738, 14.872054, 10.821236, 3.289565
Degree of Neutrality: 4262, 0.085240, 0.001824, 0.042708
Number of Structures: 1000, 36.24, 6.27, 2.50

Rank, structure, count, frequency:
1 (((((.((((..(((......)))..)))).))).))............. 4262 0.085240
2 ((((((((((..(((......)))..)))))))).))............. 1940 0.038800
3 (((((.(((((.(((......))).))))).))).))............. 1791 0.035820
4 (((((.((((.((((......)))).)))).))).))............. 1752 0.035040
5 (((((.((((..((((....))))..)))).))).))............. 1423 0.028460
6 (.(((.((((..(((......)))..)))).))).).............. 665 0.013300
7 (((((.((((..((........))..)))).))).))............. 308 0.006160
8 (((((.((((..(((......)))..)))).))))).............. 280 0.005600
9 (((((.((((..(((......)))..)))).))).))...(((....))) 278 0.005560
10 (((((.(((...(((......)))...))).))).))............. 209 0.004180
11 (((((.((((..(((......)))..)))).))).)).(((......))) 193 0.003860
12 (((((.((((..(((......)))..)))).))).))..(((.....))) 180 0.003600
13 (((((.((((..((((.....)))).)))).))).))............. 180 0.003600
14 ..(((.((((..(((......)))..)))).)))................ 176 0.003520
15 (((((.((((.((((.....))))..)))).))).))............. 175 0.003500
16 ((((( (((( ((( ))) ))))))))) 167 0.003340

AUGC reference sequence: GGCUAUCGUACGUUUACCCAAAAGUCUACGUUGGACCCAGGCAUUGGACG
GC reference sequence: CCCCGGGCCGGGGGCGCGCGGGCCGGCGGCGCGGCGGGGGGGGGGCGGCC
Shadow – Surrounding of an RNA structure in shape space – AUGC and GC alphabet
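The degree of neutrality quoted in the statistics can be recomputed from the raw counts. The sketch below assumes that the entry "Number of Structures: 1000" means 1000 sampled sequences of chain length 50, and that every sequence contributes (κ − 1)·n one-error neighbors, where κ is the alphabet size. Under these assumptions the totals of 150000 and 50000 neighbors follow directly.

```python
def neighborhood_size(n, alphabet_size):
    """Number of one-error neighbors of a sequence of chain length n."""
    return (alphabet_size - 1) * n


def degree_of_neutrality(neutral_neighbors, samples, n, alphabet_size):
    """Fraction of one-error neighbors that fold into the reference structure."""
    return neutral_neighbors / (samples * neighborhood_size(n, alphabet_size))


if __name__ == "__main__":
    # Counts taken from the two tables; 1000 samples is an assumption.
    augc = degree_of_neutrality(50125, 1000, 50, alphabet_size=4)
    gc = degree_of_neutrality(4262, 1000, 50, alphabet_size=2)
    print("AUGC:", round(augc, 6))
    print("GC:  ", round(gc, 6))
    print("ratio AUGC / GC:", round(augc / gc, 2))
```

The ratio of the two values gives a direct answer to the question posed on the slide: in this sample the four-letter alphabet is several times more neutral than the two-letter alphabet.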
Neutrality in evolution
Charles Darwin: „ ... neutrality might exist ...“
Motoo Kimura: „ ... neutrality is unavoidable and represents the main reason for changes in genotypes and leads to molecular phylogeny ...“
Current view: „ ... neutrality is essential for successful optimization on rugged landscapes ...“
Proposed view: „ ... neutrality provides the genetic reservoir for functions in the rare and frequent mutation scenario ...“
Outlook
Does understanding of life require more chemistry?
Thinking in terms of processes rather than structures!
The difficulty of defining the notion of „gene“. Helen Pearson, Nature 441: 399-401, 2006
ENCODE Project Consortium. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature 447:799-816, 2007
ENCODE stands for ENCyclopedia Of DNA Elements.
Acknowledgement of support
- Fonds zur Förderung der wissenschaftlichen Forschung (FWF): Projects No. 09942, 10578, 11065, 13093, 13887, and 14898
- Wiener Wissenschafts-, Forschungs- und Technologiefonds (WWTF): Project No. Mat05
- Jubiläumsfonds der Österreichischen Nationalbank: Project No. Nat-7813
- European Commission: Contracts No. 98-0189, 12835 (NEST)
- Austrian Genome Research Program – GEN-AU: Bioinformatics Network (BIN)
- Österreichische Akademie der Wissenschaften
- Siemens AG, Austria
- Universität Wien and the Santa Fe Institute
Coworkers
- Peter Stadler, Bärbel M. Stadler, Universität Leipzig, GE
- Paul E. Phillipson, University of Colorado at Boulder, CO
- Heinz Engl, Philipp Kügler, James Lu, Stefan Müller, RICAM Linz, AT
- Jord Nagel, Kees Pleij, Universiteit Leiden, NL
- Walter Fontana, Harvard Medical School, MA
- Christian Reidys, Christian Forst, Los Alamos National Laboratory, NM
- Ulrike Göbel, Walter Grüner, Stefan Kopp, Jaqueline Weber, Institut für Molekulare Biotechnologie, Jena, GE
- Ivo L. Hofacker, Christoph Flamm, Andreas Svrček-Seiler, Universität Wien, AT
- Kurt Grünberger, Michael Kospach, Andreas Wernitznig, Stefanie Widder, Stefan Wuchty, Universität Wien, AT
- Jan Cupal, Stefan Bernhart, Lukas Endler, Ulrike Langhammer, Rainer Machne, Ulrike Mückstein, Hakim Tafer, Thomas Taylor, Universität Wien, AT