Molecular Insight into the Evolution of Phenotypes
Peter Schuster
Institut für Theoretische Chemie und Molekulare Strukturbiologie der Universität Wien
Computer-Aided Analysis of Evolutionary Optimization Processes in Complex Systems
Blankensee, 25.05.2002
Darwinian principle
Reproduction efficiency expressed by fitness of phenotypes.
Variation of genotypes through imperfect copying and recombination.
Selection of phenotypes based on differences in fitness.
Additional requirements
Large reservoirs of genotypes and sufficiently rich repertoires of phenotypes.
Proper mapping of genotypes into phenotypes.
The genotypes or genomes of individuals and species, the latter being reproductively related ensembles of individuals, are DNA or RNA sequences. They change from generation to generation through mutation and recombination. Genotypes unfold into phenotypes or organisms, which are the targets of the evolutionary selection process. Point mutations are single nucleotide exchanges. The Hamming distance between two sequences is the minimal number of single nucleotide exchanges that converts one sequence into the other.
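The definition of the Hamming distance can be made concrete with a minimal Python sketch. The function names and the example sequences are illustrative assumptions, not part of the original slides.

```python
def hamming_distance(seq_a: str, seq_b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("Hamming distance requires sequences of equal length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def point_mutants(seq: str, alphabet: str = "AUGC") -> list[str]:
    """All sequences reachable from seq by one single nucleotide exchange."""
    return [
        seq[:i] + base + seq[i + 1:]
        for i in range(len(seq))
        for base in alphabet
        if base != seq[i]
    ]


# Hypothetical example sequences, chosen only for illustration.
print(hamming_distance("GGCA", "GGCU"))
print(len(point_mutants("GGCA")))
```

Every sequence of length n over a four-letter alphabet has 3n point mutants, each at Hamming distance 1.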
[Figure: RNA sequence written from the 5'-end to the 3'-end; A = adenylate, U = uridylate, C = cytidylate, G = guanylate]
Genotype: The sequence of an RNA molecule consisting of monomers chosen from four classes.
Phenotype: Three-dimensional structure of phenylalanyl transfer-RNA
Hydrogen bonds
Hydrogen bonding between nucleotide bases is the principle of template action of RNA and DNA.
[Figure: plus strand and minus strand with 5'- and 3'-ends marked; steps shown: synthesis, complex dissociation, synthesis]
Complementary replication as the simplest copying mechanism of RNA
(A) + I_j → 2 I_j with rate constant k_j, j = 1, 2, ..., n; [A] = a = constant
$\frac{dx_j}{dt} = k_j x_j - x_j \Phi$, with $\Phi = \sum_i k_i x_i$ and $\sum_i x_i = 1$
$k_m = \max\{k_j;\ j = 1, 2, \ldots, n\}$: $x_m(t) \to 1$ for $t \to \infty$
$s = (k_{m+1} - k_m)/k_m$
Selection of the "fittest" or fastest replicating species
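The selection dynamics can be illustrated numerically. The sketch below assumes the kinetic equation dx_j/dt = x_j (k_j - Φ) with Φ = Σ_i k_i x_i and integrates it with a plain Euler scheme; the rate constants, step size and function name are hypothetical choices made only for this example.

```python
def select(rate_constants, x0, dt=0.01, steps=10000):
    """Euler integration of dx_j/dt = x_j * (k_j - phi), phi = sum_i k_i x_i."""
    x = list(x0)
    for _ in range(steps):
        phi = sum(k * xj for k, xj in zip(rate_constants, x))
        x = [xj + dt * xj * (k - phi) for k, xj in zip(rate_constants, x)]
        # Renormalize to keep sum(x) = 1 despite discretization error.
        total = sum(x)
        x = [xj / total for xj in x]
    return x


# Three hypothetical species; the third replicates fastest.
fractions = select([1.0, 1.1, 1.2], [1 / 3, 1 / 3, 1 / 3])
print([round(f, 4) for f in fractions])
```

With these assumed values the fastest replicator approaches a fraction of 1, as stated for x_m(t).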
[Plot: fraction of advantageous variant (0 to 1) versus time in generations (0 to 1000) for s = 0.1, s = 0.02 and s = 0.01]
Selection of advantageous mutants in populations of N = 10 000 individuals
[Figure: plus and minus strands with 5'- and 3'-ends marked; erroneous copies illustrating point mutation, insertion and deletion]
Mutations represent the mechanism of variation in nucleic acids.
M + I_j → I_i + I_j with rate constant k_j Q_{ij}, i = 1, 2, ..., n, and $\sum_i Q_{ij} = 1$
$Q_{ij} = (1-p)^{\,n-d(i,j)}\, p^{\,d(i,j)}$
p: error rate per digit
d(i,j): Hamming distance between I_i and I_j
$\frac{dx_j}{dt} = \sum_i k_i Q_{ji} x_i - x_j \Phi$, with $\Phi = \sum_i k_i x_i$ and $\sum_i x_i = 1$
Chemical kinetics of replication and mutation as parallel reactions
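A small numerical sketch may help to see how replication and mutation together produce a stationary mutant distribution. It assumes binary sequences, so that the uniform error model Q_ij = (1-p)^(n-d) p^d sums to 1 over all products, and it finds the stationary distribution by power iteration on the matrix with entries Q_ji k_i. Chain length, error rate, rate constants and function names are hypothetical.

```python
from itertools import product


def mutation_matrix(n, p):
    """Q[i][j] = (1-p)**(n-d) * p**d for all binary sequences of length n."""
    seqs = ["".join(bits) for bits in product("01", repeat=n)]

    def dist(a, b):
        return sum(u != v for u, v in zip(a, b))

    q = [[(1 - p) ** (n - dist(a, b)) * p ** dist(a, b) for b in seqs]
         for a in seqs]
    return seqs, q


def quasispecies(k, q, iterations=500):
    """Stationary relative concentrations by power iteration."""
    m = len(k)
    x = [1.0 / m] * m
    for _ in range(iterations):
        x = [sum(q[j][i] * k[i] * x[i] for i in range(m)) for j in range(m)]
        total = sum(x)
        x = [v / total for v in x]
    return x


# Hypothetical single-peak landscape: master "0000" replicates ten times faster.
seqs, q = mutation_matrix(4, 0.01)
k = [10.0] + [1.0] * (len(seqs) - 1)
x = quasispecies(k, q)
print(seqs[0], round(x[0], 3))
```

At this small assumed error rate the master sequence dominates and the remaining sequences form the mutant cloud around it.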
[Figure: concentration plotted over sequence space; a master sequence surrounded by its mutant cloud]
The molecular quasispecies in sequence space
Theory of molecular evolution
M.Eigen, Self-organization of matter and the evolution of biological macromolecules. Naturwissenschaften 58 (1971), 465-526
M.Eigen, P.Schuster, The hypercycle. A principle of natural self-organization. Part A: Emergence of the hypercycle. Naturwissenschaften 64 (1977), 541-565
M.Eigen, P.Schuster, The hypercycle. A principle of natural self-organization. Part B: The abstract hypercycle. Naturwissenschaften 65 (1978), 7-41
M.Eigen, P.Schuster, The hypercycle. A principle of natural self-organization. Part C: The realistic hypercycle. Naturwissenschaften 65 (1978), 341-369
M.Eigen, J.McCaskill, P.Schuster, The molecular quasispecies. Adv.Chem.Phys. 75 (1989), 149-263
C.Reidys, C.Forst, P.Schuster, Replication and mutation on neutral networks. Bull.Math.Biol. 63 (2001), 57-94
[Figure: RNA sequence of 27 nucleotides written from the 5'-end to the 3'-end; A = adenylate, U = uridylate, C = cytidylate, G = guanylate]
Combinatorial diversity of sequences: $N = 4^{27} = 1.801 \times 10^{16}$ possible different sequences
Combinatorial diversity of heteropolymers illustrated by means of an RNA aptamer that binds to the antibiotic tobramycin
$S_k = \psi(I_.)$, $f_k = f(S_k)$
Sequence space → Phenotype space → Non-negative numbers
Mapping from sequence space into phenotype space and into fitness values
The RNA model considers RNA sequences as genotypes and simplified RNA structures, called secondary structures, as phenotypes. The mapping from genotypes into phenotypes is many-to-one. Hence, it is redundant and not invertible. Genotypes, i.e. RNA sequences, which are mapped onto the same phenotype, i.e. the same RNA secondary structure, form neutral networks. Neutral networks are represented by graphs in sequence space.
RNA Secondary Structures and their Properties
RNA secondary structures are listings of Watson-Crick and GU wobble base pairs, which are free of knots and pseudoknots. Secondary structures are folding intermediates in the formation of full three-dimensional structures.
D.Thirumalai, N.Lee, S.A.Woodson, and D.K.Klimov. Annu.Rev.Phys.Chem. 52:751-762 (2001)
[Figure: phenylalanyl-tRNA shown in three representations with 5'- and 3'-ends marked: sequence, secondary structure, symbolic notation]
Definition and formation of the secondary structure of phenylalanyl-tRNA
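A secondary structure in symbolic notation can be converted back into its list of base pairs. The sketch below assumes the dot-bracket convention, in which "(" and ")" mark paired positions and "." an unpaired one; the function name and the example string are hypothetical.

```python
def base_pairs(dot_bracket: str) -> list[tuple[int, int]]:
    """Return the base pairs (i, j), i < j, encoded by a dot-bracket string."""
    open_positions = []
    pairs = []
    for pos, symbol in enumerate(dot_bracket):
        if symbol == "(":
            open_positions.append(pos)
        elif symbol == ")":
            if not open_positions:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.append((open_positions.pop(), pos))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {pos}")
    if open_positions:
        raise ValueError("unmatched '(' in structure")
    return sorted(pairs)


# Hypothetical hairpin with two base pairs and a loop of three nucleotides.
print(base_pairs("((...))"))
```

Because matching brackets are always properly nested, a structure written this way is free of pseudoknots by construction.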
RNA Minimum Free Energy Structures
Efficient algorithms based on dynamic programming are available for the computation of secondary structures for given sequences. Inverse folding algorithms compute sequences for given secondary structures.
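The dynamic programming idea can be sketched with a deliberately simplified stand-in: maximizing the number of base pairs (a Nussinov-type recursion) instead of minimizing free energy. This is not the energy model of the cited algorithms or of the Vienna RNA Package; the function name, the minimum loop size and the example sequence are assumptions for illustration.

```python
def max_base_pairs(seq: str, min_loop: int = 3) -> int:
    """Maximum number of nested Watson-Crick or GU pairs in seq.

    min_loop is the minimum number of unpaired bases enclosed by a pair.
    """
    allowed = {("A", "U"), ("U", "A"), ("G", "C"),
               ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] holds the optimum for the subsequence seq[i..j].
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case: position j stays unpaired
            for k in range(i, j - min_loop):  # case: j pairs with k
                if (seq[k], seq[j]) in allowed:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


# Hypothetical hairpin-forming sequence.
print(max_base_pairs("GGGAAAUCCC"))
```

The recursion fills a triangular table in cubic time; energy-based folding uses the same decomposition with loop-dependent energy contributions in place of the pair count.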
M.Zuker and P.Stiegler. Nucleic Acids Res. 9:133-148 (1981)
Vienna RNA Package: http://www.tbi.univie.ac.at (includes inverse folding, suboptimal structures, kinetic folding, etc.)
I.L.Hofacker, W.Fontana, P.F.Stadler, L.S.Bonhoeffer, M.Tacker, and P.Schuster. Mh.Chem. 125:167-188 (1994)
UUUAGCCAGCGCGAGUCGUGCGGACGGGGUUAUCUCUGUCGGGCUAGGGCGC GUGAGCGCGGGGCACAGUUUCUCAAGGAUGUAAGUUUUUGCCGUUUAUCUGG UUAGCGAGAGAGGAGGCUUCUAGACCCAGCUCUCUGGGUCGUUGCUGAUGCG CAUUGGUGCUAAUGAUAUUAGGGCUGUAUUCCUGUAUAGCGAUCAGUGUCCG GUAGGCCCUCUUGACAUAAGAUUUUUCCAAUGGUGGGAGAUGGCCAUUGCAG
Criterion of Minimum Free Energy
Sequence Space Shape Space
[Figure: neighbouring sequences in sequence space that differ in single positions, connected by edges labelled with Hamming distances d_H = 1 and d_H = 2]
Point mutations as moves in sequence space
[Figure: two aligned sequences S_1 and S_2 that differ at four positions]
Hamming distance $d_H(S_1, S_2) = 4$
(i) $d_H(S_1, S_1) = 0$
(ii) $d_H(S_1, S_2) = d_H(S_2, S_1)$
(iii) $d_H(S_1, S_3) \le d_H(S_1, S_2) + d_H(S_2, S_3)$
The Hamming distance induces a metric in sequence space
[Figure: sequence space of binary sequences drawn as a graph with nodes 0 to 31, arranged by mutant class 1 to 5 around the reference sequence 0]
Binary sequences are encoded by their decimal equivalents, with C = 0 and G = 1; for example, "0" ≡ 00000 = CCCCC, "14" ≡ 01110 = CGGGC, "29" ≡ 11101 = GGGCG, etc.
Sequence space of binary sequences of chain length n = 5
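The decimal encoding and the grouping into mutant classes can be reproduced in a few lines of Python. The sketch assumes that the mutant class of a sequence is its Hamming distance from the reference sequence "0" (CCCCC); the function names are illustrative.

```python
from collections import Counter


def decode(index: int, n: int = 5) -> str:
    """Translate a decimal label into a C/G sequence (C = 0, G = 1)."""
    bits = format(index, f"0{n}b")
    return bits.replace("0", "C").replace("1", "G")


def mutant_class(index: int) -> int:
    """Hamming distance from the reference sequence 0, i.e. the number of G."""
    return bin(index).count("1")


print(decode(14), decode(29))
# Number of sequences in each mutant class for n = 5.
print(sorted(Counter(mutant_class(i) for i in range(2 ** 5)).items()))
```

The class sizes are the binomial coefficients, which is why the graph is widest in the middle classes.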
Neutral networks of small RNA molecules can be computed by exhaustive folding of complete sequence spaces, i.e. all RNA sequences of a given chain length n. The number of sequences, $N = 4^n$, becomes very large with increasing chain length and makes exhaustive numerical computation prohibitive. Neutral networks can instead be modelled by random graphs in sequence space. In this approach, nodes are inserted randomly into sequence space until the size of the pre-image, i.e. the number of neutral sequences, matches the neutral network to be studied.
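The random graph construction can be sketched as follows. For brevity the example assumes binary sequences encoded as integers, so that neighbours differ in exactly one bit; the chain length, network size, seeds and function names are hypothetical.

```python
import random


def random_neutral_network(n, size, rng):
    """Insert `size` randomly chosen nodes into binary sequence space of length n."""
    return set(rng.sample(range(2 ** n), size))


def components(nodes, n):
    """Connected components, where neighbours are at Hamming distance 1."""
    unseen = set(nodes)
    found = []
    while unseen:
        start = unseen.pop()
        component = {start}
        frontier = [start]
        while frontier:
            node = frontier.pop()
            for bit in range(n):
                neighbour = node ^ (1 << bit)  # flip one position
                if neighbour in unseen:
                    unseen.remove(neighbour)
                    component.add(neighbour)
                    frontier.append(neighbour)
        found.append(component)
    return sorted(found, key=len, reverse=True)


network = random_neutral_network(8, 100, random.Random(1))
print([len(c) for c in components(network, 8)])
```

The list of component sizes shows whether the sampled network is connected or splits into a largest component plus smaller fragments.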
Random graph approach to neutral networks: sketch of sequence space (animation, steps 00 to 100)
$G_k = \psi^{-1}(S_k) = \{\, I_j \mid \psi(I_j) = S_k \,\}$
$\lambda_j = 12/27$, $\bar{\lambda}_k = \frac{\sum_{j \in G_k} \lambda_j(k)}{|G_k|}$
Connectivity threshold: $\lambda_{cr} = 1 - \kappa^{-1/(\kappa-1)}$
Alphabet size κ: AUGC gives κ = 4
$\bar{\lambda}_k > \lambda_{cr}$: network $G_k$ is connected
$\bar{\lambda}_k < \lambda_{cr}$: network $G_k$ is not connected
κ    λ_cr
2    0.5
3    0.4226
4    0.3700
Mean degree of neutrality and connectivity of neutral networks
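The tabulated thresholds can be checked directly. The sketch assumes the threshold formula λ_cr = 1 - κ^(-1/(κ-1)), which reproduces the three listed values; the function name is illustrative.

```python
def connectivity_threshold(kappa: int) -> float:
    """Critical mean degree of neutrality for an alphabet of size kappa."""
    return 1.0 - kappa ** (-1.0 / (kappa - 1))


for kappa in (2, 3, 4):
    print(kappa, round(connectivity_threshold(kappa), 4))
```

The threshold decreases as the alphabet grows, so networks over the four-letter alphabet AUGC become connected at a lower degree of neutrality than networks over a two-letter alphabet.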
Giant Component
A multi-component neutral network
A connected neutral network
Optimization of RNA molecules in silico
W.Fontana, P.Schuster, A computer model of evolutionary optimization. Biophysical Chemistry 26 (1987), 123-147
W.Fontana, W.Schnabl, P.Schuster, Physical aspects of evolutionary optimization and adaptation. Phys.Rev.A 40 (1989), 3301-3321
M.A.Huynen, W.Fontana, P.F.Stadler, Smoothness within ruggedness. The role of neutrality in adaptation. Proc.Natl.Acad.Sci.USA 93 (1996), 397-401
W.Fontana, P.Schuster, Continuity in evolution. On the nature of transitions. Science 280 (1998), 1451-1455
W.Fontana, P.Schuster, Shaping space. The possible and the attainable in RNA genotype-phenotype mapping. J.Theor.Biol. 194 (1998), 491-515
[Figure: concentration plotted over sequence space; master sequence, mutant cloud and "off-the-cloud" mutations]
The molecular quasispecies in sequence space
[Figure: cycle of mutation (matrix Q), genotype-phenotype mapping S = ψ(I) and evaluation of the phenotype f = f(S); genotypes I_1, I_2, ..., I_n with fitness values f_1, f_2, ..., f_n, extended by a new mutant I_{n+1} with fitness f_{n+1}]
Evolutionary dynamics including molecular phenotypes
[Figure: flow reactor with stock solution inflow and reaction mixture]
Fitness function: $f_k = \gamma / [\alpha + \Delta d_S^{(k)}]$, with $\Delta d_S^{(k)} = d_S(I_k, I_\tau)$
The flow reactor as a device for studies of evolution in vitro and in silico
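A fitness function of this kind rewards structures that are close to a target structure. The sketch below assumes the form f = gamma / (alpha + d), where d is the distance between a structure and the target; the constants alpha and gamma, the use of a Hamming distance between dot-bracket strings as structure distance, and the example structures are all hypothetical choices for illustration.

```python
def structure_distance(structure: str, target: str) -> int:
    """Assumed distance: Hamming distance between equal-length dot-bracket strings."""
    if len(structure) != len(target):
        raise ValueError("structures must have equal length")
    return sum(a != b for a, b in zip(structure, target))


def fitness(structure: str, target: str,
            alpha: float = 1.0, gamma: float = 1.0) -> float:
    """Fitness decreases as the distance to the target structure grows."""
    return gamma / (alpha + structure_distance(structure, target))


target = "((...))"  # hypothetical target structure
for candidate in ("((...))", "(.....)", "......."):
    print(candidate, round(fitness(candidate, target), 3))
```

The offset alpha keeps the fitness finite when the target is reached, where the function takes its maximum gamma / alpha.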
In silico optimization in the flow reactor: Trajectory
[Plot: average structure distance to target d_S (10 to 50) versus time in arbitrary units (250 to 1250); curve: evolutionary trajectory]
[Plot, shown on each of the following slides: average structure distance to target d_S versus time, with the evolutionary trajectory and the relay steps (number of relay step 36 to 44)]
End conformation of optimization: 44
Reconstruction of the last step 43 → 44
Reconstruction of last-but-one step 42 → 43 (→ 44)
Reconstruction of step 41 → 42 (→ 43 → 44)
Reconstruction of step 40 → 41 (→ 42 → 43 → 44)
Reconstruction of the relay series: evolutionary process and reconstruction, 39 to 44
[Legend: transition-inducing point mutations; neutral point mutations]
Change in RNA sequences during the final five relay steps 39 → 44
In silico optimization in the flow reactor: Trajectory and relay steps
[Plot: average structure distance to target d_S (10 to 50) versus time in arbitrary units (250 to 1250); curves: evolutionary trajectory, relay steps]
In silico optimization in the flow reactor: Uninterrupted presence
[Plot: same axes; curves: evolutionary trajectory, uninterrupted presence, relay steps]
[Plot: average structure distance to target d_S (10 to 20) versus time in arbitrary units (250 to 500); curves: evolutionary trajectory, uninterrupted presence, number of relay step (08 to 14); legend: transition-inducing point mutations, neutral point mutations]
Neutral genotype evolution during phenotypic stasis
[Plot: average structure distance to target d_S (10 to 30) versus time in arbitrary units (750 to 1250); curves: evolutionary trajectory, uninterrupted presence, number of relay step (20 to 35); structures of relay steps 18, 19, 20, 21, 26, 28, 29 and 31 shown]
A random sequence of minor or continuous transitions in the relay series
[Figure: elongation of stacks; shortening of stacks; opening of constrained stacks (multi-loop)]
Minor or continuous transitions: occur frequently on single point mutations
[Plot, shown on both of the following slides: average structure distance to target d_S versus time, with the evolutionary trajectory and the relay steps (number of relay step 36 to 44)]
Major transition leading to clover leaf: 36 → 37 → 38
Reconstruction of a major transition 36 → 37 (→ 38)
Final reconstruction 36 → 44: evolutionary process and reconstruction
[Figure: shift; roll-over; flip; double flip; closing of constrained stacks (multi-loop)]
Major or discontinuous transitions: Structural innovations, occur rarely on single point mutations
In silico optimization in the flow reactor: Major transitions
[Plot: average structure distance to target d_S (10 to 50) versus time in arbitrary units (250 to 1250); curves: evolutionary trajectory, relay steps, major transitions]
In silico optimization in the flow reactor
[Plot: same axes; curves: evolutionary trajectory, uninterrupted presence, relay steps, major transitions]
Variation in genotype space during optimization of phenotypes
Main results of computer simulations of molecular evolution
- No trajectory was reproducible in detail. Sequences of target structures were different. Nevertheless solutions of comparable or the same quality are almost always achieved.
- Transitions between molecular phenotypes represented by RNA structures can be classified with respect to the induced structural changes. Highly probable minor transitions are opposed by major transitions with low probability of occurrence.
- Major transitions represent important innovations in the course of evolution.
- The number of minor transitions decreases with increasing population size.
- The number of major transitions or evolutionary innovations is approximately constant for given start and stop structures.
- Not all structures are accessible through evolution in the flow reactor. An example is the tRNA clover leaf for GC-only sequences.
"... Variations neither useful nor injurious would not be affected by natural selection, and would be left either a fluctuating element, as perhaps we see in certain polymorphic species, or would ultimately become fixed, owing to the nature of the organism and the nature of the conditions. ..."
Charles Darwin, Origin of species (1859)
[Figure: fitness plotted over genotype space; a walk from "start of walk" to "end of walk" alternating between random drift periods and adaptive periods]
Evolution in genotype space sketched as a non-descending walk in a fitness landscape
Coworkers
Walter Fontana, Santa Fe Institute, NM
Christian Reidys, Christian Forst, Los Alamos National Laboratory, NM
Peter F. Stadler, Universität Wien, AT
Ivo L. Hofacker, Christoph Flamm, Bärbel Stadler, Andreas Wernitznig, Universität Wien, AT
Michael Kospach, Ulrike Mückstein, Stefanie Widder, Stefan Wuchty, Jan Cupal, Kurt Grünberger, Andreas Svrček-Seiler
Ulrike Göbel, Institut für Molekulare Biotechnologie, Jena, GE
Walter Grüner, Stefan Kopp, Jaqueline Weber
Evolution of RNA molecules based on Qβ phage
D.R.Mills, R.L.Peterson, S.Spiegelman, An extracellular Darwinian experiment with a self-duplicating nucleic acid molecule. Proc.Natl.Acad.Sci.USA 58 (1967), 217-224
S.Spiegelman, An approach to the experimental analysis of precellular evolution. Quart.Rev.Biophys. 4 (1971), 213-253
C.K.Biebricher, Darwinian selection of self-replicating RNA molecules. Evolutionary Biology 16 (1983), 1-52
C.K.Biebricher, W.C.Gardiner, Molecular evolution of RNA in vitro. Biophysical Chemistry 66 (1997), 179-192
[Figure: an RNA sample is transferred serially through tubes 1, 2, 3, 4, 5, 6, ..., 69, 70 over time]
Stock solution: Qβ RNA replicase, ATP, CTP, GTP and UTP, buffer
The serial transfer technique applied to RNA evolution in vitro
The increase in RNA production rate during a serial transfer experiment
Evolutionary design of RNA molecules
A.D.Ellington, J.W.Szostak, In vitro selection of RNA molecules that bind specific ligands. Nature 346 (1990), 818-822
C.Tuerk, L.Gold, SELEX - Systematic evolution of ligands by exponential enrichment: RNA ligands to bacteriophage T4 DNA polymerase. Science 249 (1990), 505-510
D.P.Bartel, J.W.Szostak, Isolation of new ribozymes from a large pool of random sequences. Science 261 (1993), 1411-1418
R.D.Jenison, S.C.Gill, A.Pardi, B.Polisky, High-resolution molecular discrimination by RNA. Science 263 (1994), 1425-1429
[Figure: selection cycle: genetic diversity → selection → amplification → diversification; decision "Desired properties?": yes ends the cycle, no starts another round]
Selection cycle used in applied molecular evolution to design molecules with predefined properties
[Figure: chromatographic column; retention of binders followed by elution of binders]
The SELEX technique for the evolutionary design of aptamers
[Figure: the aptamer sequence written from the 5'-end to the 3'-end, shown as an open chain and folded into its secondary structure]
Formation of secondary structure of the tobramycin binding RNA aptamer
L.Jiang, A.K.Suri, R.Fiala, D.J.Patel, Chemistry & Biology 4:35-50 (1997)
The three-dimensional structure of the tobramycin aptamer complex
L.Jiang, A.K.Suri, R.Fiala, D.J.Patel, Chemistry & Biology 4:35-50 (1997)
[Figure: secondary structure of the hammerhead ribozyme with the cleavage site marked; 5'- and 3'-ends shown with terminal OH and ppp groups]
The "hammerhead" ribozyme
The smallest known catalytically active RNA molecule
A ribozyme switch
E.A.Schultes, D.P.Bartel, One sequence, two ribozymes: Implications for the emergence of new ribozyme folds. Science 289 (2000), 448-452
Two ribozymes of chain length n = 88 nucleotides: an artificial ligase (A) and a natural cleavage ribozyme of hepatitis-δ-virus (B)