Biomolecular Evolution from a Physicist's Point of View (Peter Schuster)



slide-1
SLIDE 1
slide-2
SLIDE 2

Biomolecular Evolution from a Physicist's Point of View

Peter Schuster Institut für Theoretische Chemie und Molekulare Strukturbiologie der Universität Wien Physics Colloquium Boulder, 29.10.2003

slide-3
SLIDE 3

Web-Page for further information: http://www.tbi.univie.ac.at/~pks

slide-4
SLIDE 4

Organism                         Generation time   10 000 generations   10^6 generations   10^7 generations
RNA molecules                    10 sec            27.8 h = 1.16 d      115.7 d            3.17 a
                                 1 min             6.94 d               1.90 a             19.01 a
Bacteria                         20 min            138.9 d              38.03 a            380 a
                                 10 h              11.40 a              1 140 a            11 408 a
Higher multicellular organisms   10 d              274 a                27 380 a           273 800 a
                                 20 a              200 000 a            2 × 10^7 a         2 × 10^8 a

Time scales of evolutionary change
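
The entries of the time-scale table are plain products of generation time and number of generations. The following minimal sketch reproduces that arithmetic; the function name `elapsed_time` and the choice of a 365.25-day year for the unit "a" are assumptions made for illustration, not part of the slides.

```python
# Seconds per unit. "a" (annum) is assumed to be a Julian year of 365.25 days.
SECONDS_PER_UNIT = {
    "sec": 1.0,
    "min": 60.0,
    "h": 3600.0,
    "d": 86400.0,
    "a": 365.25 * 86400.0,
}


def elapsed_time(generation_time, unit, generations, out_unit):
    """Time needed for `generations` generations, expressed in `out_unit`."""
    seconds = generation_time * SECONDS_PER_UNIT[unit] * generations
    return seconds / SECONDS_PER_UNIT[out_unit]


# RNA molecules with a 10 sec generation time, 10 000 generations, in hours.
rna_hours = elapsed_time(10, "sec", 10**4, "h")

# Bacteria with a 20 min generation time, 10^6 generations, in years.
bacteria_years = elapsed_time(20, "min", 10**6, "a")

print(f"{rna_hours:.1f} h, {bacteria_years:.2f} a")
```
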

slide-5
SLIDE 5

1. Controlled experiments on evolution and RNA replication
2. Evolution in silico and optimization of RNA structures
3. Sequence-structure maps, neutral networks, and intersections
4. Design of RNA molecules with predefined properties

slide-6
SLIDE 6

1. Controlled experiments on evolution and RNA replication

slide-7
SLIDE 7

Bacterial Evolution

  • S. F. Elena, V. S. Cooper, R. E. Lenski. Punctuated evolution caused by selection of rare beneficial mutants. Science 272 (1996), 1802-1804
  • D. Papadopoulos, D. Schneider, J. Meier-Eiss, W. Arber, R. E. Lenski, M. Blot. Genomic evolution during a 10,000-generation experiment with bacteria. Proc.Natl.Acad.Sci.USA 96 (1999), 3807-3812

slide-8
SLIDE 8

Serial transfer of Escherichia coli cultures in Petri dishes

[Figure: a lawn of E. coli on nutrient agar, transferred to a fresh dish every 24 h.]

  • 1 day: 6.67 generations
  • 1 month: 200 generations
  • 1 year: 2400 generations

slide-9
SLIDE 9

Epochal evolution of bacteria in serial transfer experiments under constant conditions

[Figure; time scale: 1 year]

  • S. F. Elena, V. S. Cooper, R. E. Lenski. Punctuated evolution caused by selection of rare beneficial mutants. Science 272 (1996), 1802-1804

slide-10
SLIDE 10

[Figure: Hamming distance to ancestor (axis ticks up to 25) versus time in generations (axis ticks up to 8000).]

Variation of genotypes in a bacterial serial transfer experiment

  • D. Papadopoulos, D. Schneider, J. Meier-Eiss, W. Arber, R. E. Lenski, M. Blot. Genomic evolution during a 10,000-generation experiment with bacteria. Proc.Natl.Acad.Sci.USA 96 (1999), 3807-3812

slide-11
SLIDE 11

Evolution of RNA molecules based on Qβ phage

  • D.R.Mills, R.L.Peterson, S.Spiegelman, An extracellular Darwinian experiment with a self-duplicating nucleic acid molecule. Proc.Natl.Acad.Sci.USA 58 (1967), 217-224
  • S.Spiegelman, An approach to the experimental analysis of precellular evolution. Quart.Rev.Biophys. 4 (1971), 213-253
  • C.K.Biebricher, Darwinian selection of self-replicating RNA molecules. Evolutionary Biology 16 (1983), 1-52
  • G.Bauer, H.Otten, J.S.McCaskill, Travelling waves of in vitro evolving RNA. Proc.Natl.Acad.Sci.USA 86 (1989), 7937-7941
  • C.K.Biebricher, W.C.Gardiner, Molecular evolution of RNA in vitro. Biophysical Chemistry 66 (1997), 179-192
  • G.Strunk, T.Ederhof, Machines for automated evolution experiments in vitro based on the serial transfer concept. Biophysical Chemistry 66 (1997), 193-202

slide-12
SLIDE 12

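[Figure: an RNA sample is added to stock solution (Qβ RNA replicase, ATP, CTP, GTP and UTP, buffer) and transferred to fresh stock solution at successive times; transfers are numbered 1, 2, 3, 4, 5, 6, ..., 69, 70.]

The serial transfer technique applied to RNA evolution in vitro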

slide-13
SLIDE 13

Reproduction of the original figure of the serial transfer experiment with Qβ RNA

D.R.Mills, R.L.Peterson, S.Spiegelman, An extracellular Darwinian experiment with a self-duplicating nucleic acid molecule. Proc.Natl.Acad.Sci.USA 58 (1967), 217-224

slide-14
SLIDE 14

Decrease in mean fitness due to quasispecies formation

The increase in RNA production rate during a serial transfer experiment

slide-15
SLIDE 15

No new principle will declare itself from below a heap of facts.

Sir Peter Medawar, 1985

slide-16
SLIDE 16

Questions that cannot be answered by current experimental techniques:

(i) How does the distribution of genotypes change with time?
(ii) Which intermediates are passed during an optimization experiment?
(iii) Why does optimization occur in steps?
(iv) What happens at the edges of the quasi-stationary epochs?
(v) How much do individual trajectories leading from the same initial state to the same target differ?
(vi) Is there a proper statistics for evolutionary optimization?

slide-17
SLIDE 17

[Diagram: genotype, phenotype, and fitness. For organisms: genome, organism, reproductive success. For the evolution of molecules: DNA/RNA sequence, molecular structure and function, replication rate. Further labels: molecular genetics, population biology.]

slide-18
SLIDE 18

[Figure: RNA backbone of ribose-phosphate units with Na counterions, running from the 5'-end to the 3'-end and carrying the nucleobases N1 to N4, where Nk = A, U, G, or C. An example sequence is shown folded, with positions 10 to 70 marked.]

Definition of RNA structure

slide-19
SLIDE 19

The three-dimensional structure of a short double helical stack of B-DNA

James D. Watson, 1928- , and Francis Crick, 1916- , Nobel Prize 1962

1953 – 2003 fifty years double helix

slide-20
SLIDE 20

[Figure: a plus strand serves as template for synthesis of a minus strand; the plus-minus complex dissociates, and the minus strand in turn serves as template for synthesis of a new plus strand.]

Complementary replication as the simplest copying mechanism of RNA. Complementarity is determined by Watson-Crick base pairs: G≡C and A=U
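
Complementary replication can be illustrated in a few lines: copying a plus strand yields the minus strand, and copying the minus strand restores the plus strand. This is a minimal sketch; the function name and the example strand are hypothetical and not taken from the slides.

```python
# Watson-Crick complements for RNA.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def complementary_strand(strand):
    """Return the complementary strand, written 5' to 3'.

    The template is read in reverse because the two strands of a
    double helix run antiparallel.
    """
    return "".join(COMPLEMENT[base] for base in reversed(strand))


plus = "GGGCCGA"                       # hypothetical plus strand, 5' to 3'
minus = complementary_strand(plus)     # first round of synthesis
plus_again = complementary_strand(minus)  # second round restores the plus strand

print(plus, minus, plus_again)
```

Two rounds of copying are needed to reproduce the original sequence, which is why plus and minus strands appear together in the replication scheme.
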

slide-21
SLIDE 21

[Figure: the same molecule shown in three representations, each running from the 5'-end to the 3'-end with positions 10 to 70 marked: sequence, secondary structure, and symbolic notation.]

Sequence: GCGGAUUUAGCUCAGDDGGGAGAGCMCCAGACUGAAYAUCUGGAGMUCCUGUGTPCGAUCCACAGAAUUCGCACCA

  • A symbolic notation of RNA secondary structure that is equivalent to the conventional graphs
slide-22
SLIDE 22

Definition and physical relevance of RNA secondary structures

RNA secondary structures are listings of Watson-Crick and GU wobble base pairs, which are free of knots and pseudoknots.

"Secondary structures are folding intermediates in the formation of full three-dimensional structures." D.Thirumalai, N.Lee, S.A.Woodson, and D.K.Klimov. Annu.Rev.Phys.Chem. 52:751-762 (2001)
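
A listing of base pairs can be recovered from a parentheses string. The sketch below assumes the dot-bracket convention, in which matching parentheses mark paired positions and dots mark unpaired ones; properly nested parentheses cannot express pseudoknots, so the pseudoknot-free condition holds by construction. The function name is hypothetical.

```python
def parse_dot_bracket(structure):
    """Return the base pairs of a dot-bracket string as sorted (i, j) tuples.

    Positions are 0-based. A ValueError is raised for unbalanced
    parentheses or unknown symbols.
    """
    open_positions = []
    pairs = []
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            open_positions.append(pos)
        elif symbol == ")":
            if not open_positions:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.append((open_positions.pop(), pos))
        elif symbol != ".":
            raise ValueError(f"unknown symbol {symbol!r} at position {pos}")
    if open_positions:
        raise ValueError(f"unmatched '(' at position {open_positions[-1]}")
    return sorted(pairs)


# A hairpin: two stacked base pairs closing a loop of two unpaired positions.
print(parse_dot_bracket("((..))."))
```
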

slide-23
SLIDE 23

Sequence, structure, and function

[Diagram: RNA sequence and RNA structure, linked in both directions. Input: empirical parameters. Basis: biophysical chemistry (thermodynamics and kinetics).]

  • RNA folding (sequence to structure): structural biology, spectroscopy of biomolecules, understanding molecular function
  • Inverse folding of RNA (structure to sequence): biotechnology, design of biomolecules with predefined structures and functions

slide-24
SLIDE 24

How to compute RNA secondary structures

Efficient algorithms based on dynamic programming are available for computation of minimum free energy and many suboptimal secondary structures for given sequences.

M.Zuker and P.Stiegler. Nucleic Acids Res. 9:133-148 (1981) M.Zuker, Science 244: 48-52 (1989)
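
To give a feel for the dynamic programming idea, the following sketch maximizes the number of base pairs in a sequence (a Nussinov-style recursion). It is a deliberately simplified stand-in: it does not use the free energy model of the algorithms cited above, and the function name and the minimum hairpin loop size of 3 are assumptions made for illustration.

```python
# Watson-Crick pairs plus the GU wobble pair.
ALLOWED_PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in `seq`.

    best[i][j] holds the optimum for the subsequence seq[i..j].
    A pair (k, j) requires at least `min_loop` unpaired positions between
    k and j, which mimics the minimum size of a hairpin loop.
    """
    n = len(seq)
    if n == 0:
        return 0
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            value = best[i][j - 1]  # case 1: position j stays unpaired
            for k in range(i, j - min_loop):  # case 2: j pairs with k
                if (seq[k], seq[j]) in ALLOWED_PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    value = max(value, left + 1 + best[k + 1][j - 1])
            best[i][j] = value
    return best[0][n - 1]


print(max_base_pairs("GGGAAAUCC"))
```

The table has O(n^2) entries and each entry takes O(n) work, so the sketch runs in O(n^3) time, the same scaling that makes minimum free energy folding feasible for long sequences.
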

Equilibrium partition function and base pairing probabilities in Boltzmann ensembles of suboptimal structures.

J.S.McCaskill. Biopolymers 29:1105-1119 (1990)

The Vienna RNA Package provides in addition: inverse folding (computing sequences for given secondary structures), computation of melting profiles from partition functions, all suboptimal structures within a given energy interval, barrier trees of suboptimal structures, kinetic folding of RNA sequences, RNA-hybridization and RNA/DNA-hybridization through cofolding of sequences, alignment, etc.

I.L.Hofacker, W. Fontana, P.F.Stadler, L.S.Bonhoeffer, M.Tacker, and P. Schuster. Mh.Chem. 125:167-188 (1994)
S.Wuchty, W.Fontana, I.L.Hofacker, and P.Schuster. Biopolymers 49:145-165 (1999)
C.Flamm, W.Fontana, I.L.Hofacker, and P.Schuster. RNA 6:325-338 (2000)

Vienna RNA Package: http://www.tbi.univie.ac.at

slide-25
SLIDE 25

[Figure: example secondary structures with their elements labelled: stack, hairpin loop, internal loop, bulge, multiloop, joint, free end.]

Elements of RNA secondary structures as used in free energy calculations

slide-26
SLIDE 26

$$\Delta G_0^{300} = \sum_{\text{stacks of base pairs}} g_{ij,kl} + \sum_{\text{hairpin loops}} h(n_l) + \sum_{\text{bulges}} b(n_b) + \sum_{\text{internal loops}} i(n_i) + \dots$$

Free energy of stacking < 0

[Figure: an example sequence, 5'-end to 3'-end, folded into its secondary structure.]

Folding of RNA sequences into secondary structures of minimal free energy, $\Delta G_0^{300}$

slide-27
SLIDE 27

[Figure: hydrogen-bonding patterns of the base pairs G=U, U=G, A=U, (U=A), G≡C, and (C≡G).]

  • Three base pairing alphabets built from natural nucleotides A, U, G, and C
slide-28
SLIDE 28

Catalytic activity in the AUG alphabet

Nature 402 (1999), 323-325

slide-29
SLIDE 29

Catalytic activity in the DU alphabet

Nature 420 (2002), 841-844

slide-30
SLIDE 30

[Figure: cloverleaf secondary structures, each drawn from the 5'-end to the 3'-end with positions 10 to 70 marked.]

Probability of successful trials in inverse folding, by alphabet (mean ± uncertainty; "-" marks no entry):

Structure   AU              AUG             AUGC            UGC             GC
1           -               -               0.794 ± 0.007   0.548 ± 0.011   0.067 ± 0.007
2           -               0.003 ± 0.001   0.884 ± 0.008   0.628 ± 0.012   0.086 ± 0.008
3           0.051 ± 0.006   0.374 ± 0.016   0.982 ± 0.004   0.818 ± 0.012   0.127 ± 0.006

  • Accessibility of cloverleaf RNA secondary structures through inverse folding
slide-31
SLIDE 31

2. Evolution in silico and optimization of RNA structures

slide-32
SLIDE 32

Optimization of RNA molecules in silico

  • W.Fontana, P.Schuster, A computer model of evolutionary optimization. Biophysical Chemistry 26 (1987), 123-147
  • W.Fontana, W.Schnabl, P.Schuster, Physical aspects of evolutionary optimization and adaptation. Phys.Rev.A 40 (1989), 3301-3321
  • M.A.Huynen, W.Fontana, P.F.Stadler, Smoothness within ruggedness. The role of neutrality in adaptation. Proc.Natl.Acad.Sci.USA 93 (1996), 397-401
  • W.Fontana, P.Schuster, Continuity in evolution. On the nature of transitions. Science 280 (1998), 1451-1455
  • W.Fontana, P.Schuster, Shaping space. The possible and the attainable in RNA genotype-phenotype mapping. J.Theor.Biol. 194 (1998), 491-515
  • B.M.R. Stadler, P.F. Stadler, G.P. Wagner, W. Fontana, The topology of the possible: Formal spaces underlying patterns of evolutionary change. J.Theor.Biol. 213 (2001), 241-274

slide-33
SLIDE 33

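[Figure: flow reactor with inflow of stock solution into the reaction mixture.]

Replication rate constant: $f_k = \gamma / [\alpha + \Delta d_S^{(k)}]$, with $\Delta d_S^{(k)} = d_H(S_k, S_\tau)$

Selection constraint: the number of RNA molecules is controlled by the flow, $N(t) \approx \bar{N} \pm \sqrt{\bar{N}}$

The flow reactor as a device for studies of evolution in vitro and in silico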
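
The selection rule of the flow reactor can be sketched as follows, reading the replication rate constant as a constant divided by an offset plus the Hamming distance between a structure and the target. The parameter names `gamma` and `alpha`, their default values, and the example structures are hypothetical choices for illustration only.

```python
import math


def hamming_distance(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def replication_rate(structure, target, gamma=1.0, alpha=0.1):
    """Rate constant that grows as the structure approaches the target.

    `gamma` and `alpha` are illustrative values, not taken from the slides.
    """
    return gamma / (alpha + hamming_distance(structure, target))


def population_band(mean_size):
    """Range mean ± sqrt(mean) within which the flow keeps the population."""
    spread = math.sqrt(mean_size)
    return mean_size - spread, mean_size + spread


target = "((..))"
for candidate in ("......", "(....)", "((..))"):
    print(candidate, round(replication_rate(candidate, target), 3))
print(population_band(10000))
```

Because the distance appears in the denominator, every structural step toward the target raises the rate constant, and `alpha` keeps the rate finite once the target is reached.
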

slide-34
SLIDE 34

[Figure: two secondary structures, each drawn from the 5'-end to the 3'-end with positions 10 to 70 marked.]

Randomly chosen initial structure; phenylalanyl-tRNA as target structure

slide-35
SLIDE 35

[Figure: while a plus strand is copied into a minus strand, three kinds of copying errors can occur: point mutation, insertion, and deletion.]

Mutations in nucleic acids represent the mechanism for variation of genotypes.
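
The three mutation classes can be written as simple string operations. This is a minimal sketch; the function names and the example sequence are hypothetical.

```python
def point_mutation(seq, pos, base):
    """Replace the nucleotide at `pos` by `base`; the length is unchanged."""
    return seq[:pos] + base + seq[pos + 1:]


def insertion(seq, pos, fragment):
    """Insert `fragment` before position `pos`; the sequence gets longer."""
    return seq[:pos] + fragment + seq[pos:]


def deletion(seq, start, length):
    """Remove `length` nucleotides beginning at `start`."""
    return seq[:start] + seq[start + length:]


parent = "GAAUCCCGAA"  # hypothetical parent sequence

print(point_mutation(parent, 3, "A"))
print(insertion(parent, 8, "UCCCG"))  # duplicates the preceding UCCCG stretch
print(deletion(parent, 3, 5))         # removes the UCCCG stretch
```

Only point mutations preserve chain length, which is why sequence spaces of constant length are explored by point mutations alone.
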

slide-36
SLIDE 36

[Figure: concentration plotted over sequence space, showing the master sequence, its mutant cloud, and "off-the-cloud" mutations.]

The molecular quasispecies in sequence space

slide-37
SLIDE 37

[Diagram: mutation (matrix $Q$) acts on the genotypes $I_1, I_2, \dots, I_n$ and produces a new genotype $I_{n+1}$; the genotype-phenotype mapping $S_j = \psi(I_j)$ assigns a structure to each genotype; evaluation of the phenotype, $f_j = f(S_j)$, assigns the fitness values $f_1, f_2, \dots, f_n, f_{n+1}$.]

Evolutionary dynamics including molecular phenotypes

slide-38
SLIDE 38

In silico optimization in the flow reactor: Trajectory (biologists' view)

[Figure: evolutionary trajectory. Vertical axis: average distance from initial structure, 50 - dS (ticks 10 to 50). Horizontal axis: time in arbitrary units (ticks 250 to 1250).]

slide-39
SLIDE 39

In silico optimization in the flow reactor: Trajectory (physicists' view)

[Figure: evolutionary trajectory. Vertical axis: average structure distance to target dS (ticks 10 to 50). Horizontal axis: time in arbitrary units (ticks 250 to 1250).]

slide-40
SLIDE 40

[Figure: final part of the evolutionary trajectory (average structure distance to target dS versus time, up to 1250) with relay steps 36 to 44 marked; the structure of relay step 44 is shown.]

End conformation of optimization

slide-41
SLIDE 41

[Figure: final part of the evolutionary trajectory with relay steps 36 to 44 marked; structures 44 and 43 are shown.]

Reconstruction of the last step 43 → 44

slide-42
SLIDE 42

[Figure: final part of the evolutionary trajectory with relay steps 36 to 44 marked; structures 44, 43, and 42 are shown.]

Reconstruction of last-but-one step 42 → 43 (→ 44)

slide-43
SLIDE 43

[Figure: final part of the evolutionary trajectory with relay steps 36 to 44 marked; structures 44 to 41 are shown.]

Reconstruction of step 41 → 42 (→ 43 → 44)

slide-44
SLIDE 44

[Figure: final part of the evolutionary trajectory with relay steps 36 to 44 marked; structures 44 to 40 are shown.]

Reconstruction of step 40 → 41 (→ 42 → 43 → 44)

slide-45
SLIDE 45

[Figure: final part of the evolutionary trajectory with relay steps 36 to 44 marked; structures 44 to 39 are shown. The evolutionary process runs from 39 to 44, the reconstruction from 44 back to 39.]

Reconstruction of the relay series

slide-46
SLIDE 46

[Figure legend: transition inducing point mutations; neutral point mutations.]

Change in RNA sequences during the final five relay steps 39 → 44

slide-47
SLIDE 47

In silico optimization in the flow reactor: Trajectory and relay steps

[Figure: evolutionary trajectory with relay steps marked. Vertical axis: average structure distance to target dS (ticks 10 to 50). Horizontal axis: time in arbitrary units (ticks 250 to 1250).]

slide-48
SLIDE 48

[Figure: section of the evolutionary trajectory (average structure distance to target dS versus time in arbitrary units) covering relay steps 08 to 14, with the uninterrupted presence of each relay step indicated. Legend: transition inducing point mutations; neutral point mutations.]

28 neutral point mutations during a long quasi-stationary epoch

Neutral genotype evolution during phenotypic stasis

slide-49
SLIDE 49

Variation in genotype space during optimization of phenotypes

Mean Hamming distance within the population and drift velocity of the population center in sequence space.

slide-50
SLIDE 50

In silico optimization in the flow reactor: Main transitions

[Figure: evolutionary trajectory with relay steps and main transitions marked. Vertical axis: average structure distance to target dS (ticks 10 to 50). Horizontal axis: time in arbitrary units (ticks 250 to 1250).]

slide-51
SLIDE 51

[Figure: structures at relay steps 00, 09, 31, and 44.]

Three important steps in the formation of the tRNA clover leaf from a randomly chosen initial structure, corresponding to three main transitions.

slide-52
SLIDE 52

[Figure: types of main transitions: shift, roll-over, flip, double flip, closing of constrained stacks, and multiloop.]

Main or discontinuous transitions: structural innovations that occur rarely on single point mutations

slide-53
SLIDE 53

Movies of optimization trajectories over the AUGC and the GC alphabet

slide-54
SLIDE 54

[Figure: histogram of frequency (ticks 0.05 to 0.2) versus runtime of trajectories (ticks 1000 to 5000).]

Statistics of the lengths of trajectories from initial structure to target (AUGC-sequences)

slide-55
SLIDE 55

[Figure: histograms of frequency (ticks 0.05 to 0.3) versus number of transitions (ticks 20 to 100), for all transitions and for main transitions.]

Statistics of the numbers of transitions from initial structure to target (AUGC-sequences)

slide-56
SLIDE 56

Alphabet   Runtime   Transitions   Main transitions   No. of runs
AUGC       385.6     22.5          12.6               1017
GUC        448.9     30.5          16.5               611
GC         2188.3    40.0          20.6               107

Statistics of trajectories and relay series (mean values of log-normal distributions)

slide-57
SLIDE 57

3. Sequence-structure maps, neutral networks, and intersections

slide-58
SLIDE 58

Inverse folding of RNA secondary structures

Minimum free energy criterion

The inverse folding algorithm searches for sequences that form a given RNA secondary structure under the minimum free energy criterion.

slide-59
SLIDE 59

Structure

slide-60
SLIDE 60

[Figure: a structure, drawn from the 5'-end to the 3'-end, with a compatible sequence written onto it.]

slide-61
SLIDE 61

[Figure: the same structure, drawn from the 5'-end to the 3'-end, with several compatible sequences written onto it.]

slide-62
SLIDE 62

[Figure: the same structure, drawn from the 5'-end to the 3'-end, with several compatible sequences written onto it.]

Base pairs: AU, UA, GC, CG, GU, UG
Single nucleotides: A, U, G, C

slide-63
SLIDE 63

[Figure: the same structure, drawn from the 5'-end to the 3'-end, with an incompatible sequence written onto it.]
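
Compatibility can be checked mechanically: a sequence is compatible with a structure if every pair of positions that the structure joins carries one of the six allowed base pairs. The sketch below assumes the structure is written in dot-bracket notation; the function name and the example sequences are hypothetical.

```python
# Base pairs admitted in secondary structures: Watson-Crick and GU wobble.
ALLOWED_PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def is_compatible(sequence, structure):
    """True if `sequence` can form every base pair required by `structure`.

    Unpaired positions (dots) accept any nucleotide. Compatibility does not
    imply that the structure is the minimum free energy structure.
    """
    if len(sequence) != len(structure):
        return False
    open_positions = []
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            open_positions.append(pos)
        elif symbol == ")":
            if not open_positions:
                raise ValueError(f"unmatched ')' at position {pos}")
            partner = open_positions.pop()
            if (sequence[partner], sequence[pos]) not in ALLOWED_PAIRS:
                return False
    if open_positions:
        raise ValueError("unmatched '(' in structure")
    return True


structure = "(((...)))"
print(is_compatible("GGGAAAUCC", structure))  # G-C, G-C, G-U: all allowed
print(is_compatible("GGGAAAACC", structure))  # G-A is not an allowed pair
```
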

slide-64
SLIDE 64

[Figure: paths through sequence space toward the target structure Sk, showing initial trial sequences, intermediate compatible sequences, the target sequence, and the stop sequence of an unsuccessful trial.]

Approach to the target structure Sk in the inverse folding algorithm

slide-65
SLIDE 65

Inverse folding of RNA secondary structures

Minimum free energy criterion

[Figure: 1st, 2nd, 3rd, 4th, and 5th trial.]

The inverse folding algorithm searches for sequences that form a given RNA secondary structure under the minimum free energy criterion.

slide-66
SLIDE 66

[Figure: two aligned sequences, $I_1$ and $I_2$, that differ at four positions.]

Hamming distance: $d_H(I_1, I_2) = 4$

(i) $d_H(I_1, I_1) = 0$
(ii) $d_H(I_1, I_2) = d_H(I_2, I_1)$
(iii) $d_H(I_1, I_3) \le d_H(I_1, I_2) + d_H(I_2, I_3)$

The Hamming distance between sequences induces a metric in sequence space
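
The three metric properties can be verified by brute force on a small sequence space. The sketch below checks them for all sequences of length 3 over the AUGC alphabet; the length 3 and the function name are arbitrary choices for illustration.

```python
from itertools import product


def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


# The complete sequence space of chain length 3: 4^3 = 64 sequences.
space = ["".join(letters) for letters in product("AUGC", repeat=3)]

identity = all(hamming_distance(a, a) == 0 for a in space)
symmetry = all(
    hamming_distance(a, b) == hamming_distance(b, a)
    for a in space for b in space
)
triangle = all(
    hamming_distance(a, c) <= hamming_distance(a, b) + hamming_distance(b, c)
    for a in space for b in space for c in space
)

print(identity, symmetry, triangle)
```
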

slide-67
SLIDE 67

Hamming distance: $d_H(S_1, S_2) = 4$

(i) $d_H(S_1, S_1) = 0$
(ii) $d_H(S_1, S_2) = d_H(S_2, S_1)$
(iii) $d_H(S_1, S_3) \le d_H(S_1, S_2) + d_H(S_2, S_3)$

The Hamming distance between structures in parentheses notation forms a metric in structure space

slide-68
SLIDE 68

RNA sequences as well as RNA secondary structures can be visualized as objects in metric spaces. At constant chain length the sequence space is a (generalized) hypercube.

The mapping from RNA sequences into RNA secondary structures is many-to-one. Hence, it is redundant and not invertible.

RNA sequences, which are mapped onto the same RNA secondary structure, are neutral with respect to structure. The pre-images of structures in sequence space are neutral networks. They can be represented by graphs where the edges connect sequences of Hamming distance dH = 1.

slide-69
SLIDE 69
slide-70
SLIDE 70

$S_k = \psi(I_{\cdot})$, $f_k = f(S_k)$

[Diagram: sequence space → structure space → real numbers]

Mapping from sequence space into structure space and into function

slide-71
SLIDE 71

$S_k = \psi(I_{\cdot})$, $f_k = f(S_k)$

[Diagram: sequence space → structure space → real numbers]

slide-72
SLIDE 72

$S_k = \psi(I_{\cdot})$, $f_k = f(S_k)$

[Diagram: sequence space → structure space → real numbers]

The pre-image of the structure Sk in sequence space is the neutral network Gk

slide-73
SLIDE 73

Neutral networks are sets of sequences forming the same structure. Gk is the pre-image of the structure Sk in sequence space:

$$G_k = \psi^{-1}(S_k) = \{ I_j \mid \psi(I_j) = S_k \}$$

The set is converted into a graph by connecting all sequences of Hamming distance one.

Neutral networks of small RNA molecules can be computed by exhaustive folding of complete sequence spaces, i.e. all RNA sequences of a given chain length. This number, $N = 4^n$, becomes very large with increasing length, and is prohibitive for numerical computations.

Neutral networks can be modelled by random graphs in sequence space. In this approach, nodes are inserted randomly into sequence space until the size of the pre-image, i.e. the number of neutral sequences, matches the neutral network to be studied.
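
Turning a set of neutral sequences into a graph and asking whether it is connected can be sketched as follows. The code links sequences of Hamming distance one and collects connected components by breadth-first search; the function names and the four-sequence example set are hypothetical and do not come from a real folding computation.

```python
from collections import deque


def neighbors(seq, alphabet="AUGC"):
    """Yield all sequences at Hamming distance one from `seq`."""
    for pos, old in enumerate(seq):
        for base in alphabet:
            if base != old:
                yield seq[:pos] + base + seq[pos + 1:]


def components(network):
    """Connected components of a set of sequences, largest first."""
    network = set(network)
    seen = set()
    result = []
    for start in sorted(network):
        if start in seen:
            continue
        component = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for candidate in neighbors(current):
                if candidate in network and candidate not in seen:
                    seen.add(candidate)
                    component.add(candidate)
                    queue.append(candidate)
        result.append(component)
    return sorted(result, key=len, reverse=True)


# Hypothetical pre-image of one structure in the space of length-4 sequences.
example_network = {"GGCC", "GGCU", "GACU", "AUAU"}
parts = components(example_network)
print([sorted(part) for part in parts])

# Size of the complete sequence space for chain length 10.
print(4 ** 10)
```

Each sequence of length n has (κ - 1)·n neighbors for an alphabet of size κ, so the neighbor scan stays cheap even though the full sequence space grows as 4^n.
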

slide-74
SLIDE 74

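$$G_k = \psi^{-1}(S_k) = \{ I_j \mid \psi(I_j) = S_k \}$$

Local degree of neutrality (example): $\lambda_j = 12/27 = 0.444$

Mean degree of neutrality: $\bar{\lambda}_k = \frac{\sum_{j \in G_k} \lambda_j^{(k)}}{|G_k|}$

Connectivity threshold: $\lambda_{cr} = 1 - \kappa^{-1/(\kappa - 1)}$

  • $\bar{\lambda}_k > \lambda_{cr}$: network $G_k$ is connected
  • $\bar{\lambda}_k < \lambda_{cr}$: network $G_k$ is not connected

Alphabet size $\kappa$ (AUGC: $\kappa = 4$):

κ   λcr     Alphabets
2   0.5     GC, AU
3   0.423   GUC, AUG
4   0.370   AUGC

Mean degree of neutrality and connectivity of neutral networks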
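
The tabulated threshold values are consistent with the formula λcr = 1 - κ^(-1/(κ-1)) and can be reproduced directly. This is a minimal sketch; the function names are hypothetical.

```python
def connectivity_threshold(kappa):
    """Critical mean degree of neutrality for an alphabet of size `kappa`."""
    return 1.0 - kappa ** (-1.0 / (kappa - 1))


def is_connected(mean_neutrality, kappa):
    """Random-graph prediction: connected if the mean exceeds the threshold."""
    return mean_neutrality > connectivity_threshold(kappa)


for kappa in (2, 3, 4):
    print(kappa, round(connectivity_threshold(kappa), 3))

# The example value 12/27 = 0.444 lies above the threshold for kappa = 4
# but below the threshold for kappa = 2.
print(is_connected(12 / 27, 4), is_connected(12 / 27, 2))
```

The threshold falls as the alphabet grows, so a given degree of neutrality is more likely to yield a connected network over AUGC than over a two-letter alphabet.
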

slide-75
SLIDE 75

A connected neutral network

slide-76
SLIDE 76

Giant Component

A multi-component neutral network

slide-77
SLIDE 77

[Figure: cloverleaf secondary structures, each drawn from the 5'-end to the 3'-end with positions 10 to 70 marked.]

Degree of neutrality, by alphabet (mean ± uncertainty; "-" marks no entry):

Structure   AU              AUG             AUGC            UGC             GC
1           -               -               0.275 ± 0.064   0.263 ± 0.071   0.052 ± 0.033
2           -               0.217 ± 0.051   0.279 ± 0.063   0.257 ± 0.070   0.057 ± 0.034
3           0.073 ± 0.032   0.201 ± 0.056   0.313 ± 0.058   0.250 ± 0.064   0.068 ± 0.034

  • Degree of neutrality of cloverleaf RNA secondary structures over different alphabets
slide-78
SLIDE 78

Stable tRNA clover leaf structures built from binary, GC-only, sequences exist. The corresponding sequences are found through inverse folding. Optimization by mutation and selection in the flow reactor turned out to be a hard problem.

[Figure: tRNA clover leaf, drawn from the 5'-end to the 3'-end with positions 10 to 70 marked.]

The neutral network of the tRNA clover leaf in GC sequence space is not connected, whereas the corresponding neutral network in AUGC sequence space is close to the connectivity threshold, $\lambda_{cr}$. Here, both inverse folding and optimization in the flow reactor are much more effective than with GC sequences.

The hardness of the structure optimization problem depends on the connectivity of neutral networks.

slide-79
SLIDE 79

Reference for postulation and in silico verification of neutral networks

slide-80
SLIDE 80

[Diagram: structure $S_k$, its neutral network $G_k$, and its compatible set $C_k$, with $G_k \subset C_k$.]

The compatible set Ck of a structure Sk consists of all sequences which form Sk as their minimum free energy structure (the neutral network Gk) or as one of their suboptimal structures.

slide-81
SLIDE 81

[Diagram: structures $S_0$ and $S_1$ with their compatible sets $C_0$ and $C_1$.]

The intersection of two compatible sets is always non-empty: $C_0 \cap C_1 \neq \emptyset$

slide-82
SLIDE 82

Reference for the definition of the intersection and the proof of the intersection theorem

slide-83
SLIDE 83

[Figure: one sequence drawn in two conformations: the minimum free energy conformation S0 and the suboptimal conformation S1.]

A sequence at the intersection of two neutral networks is compatible with both structures

slide-84
SLIDE 84

[Figure: barrier tree with numbered local minima and the numeric labels 3.30, 5.10, 5.90, and 7.40. Basin '0' contains the minimum free energy structure S0; basin '1' contains the long living metastable structure S1.]

Barrier tree for two long living structures

slide-85
SLIDE 85

Kinetics of RNA refolding between a long living metastable conformation and the minimum free energy structure

slide-86
SLIDE 86
slide-87
SLIDE 87

4. Design of RNA molecules with predefined properties

slide-88
SLIDE 88

A ribozyme switch

E.A.Schultes, D.P.Bartel, Science 289 (2000), 448-452

slide-89
SLIDE 89

Two ribozymes of chain length n = 88 nucleotides: an artificial ligase (A) and a natural cleavage ribozyme of hepatitis-δ-virus (B)
slide-90
SLIDE 90

The sequence at the intersection: an RNA molecule that is 88 nucleotides long and can form both structures

slide-91
SLIDE 91

Two neutral walks through sequence space with conservation of structure and catalytic activity

slide-92
SLIDE 92

Evolutionary design of RNA molecules

  • A.D.Ellington, J.W.Szostak, In vitro selection of RNA molecules that bind specific ligands. Nature 346 (1990), 818-822
  • C.Tuerk, L.Gold, SELEX - Systematic evolution of ligands by exponential enrichment: RNA ligands to bacteriophage T4 DNA polymerase. Science 249 (1990), 505-510
  • D.P.Bartel, J.W.Szostak, Isolation of new ribozymes from a large pool of random sequences. Science 261 (1993), 1411-1418
  • R.D.Jenison, S.C.Gill, A.Pardi, B.Polisky, High-resolution molecular discrimination by RNA. Science 263 (1994), 1425-1429
  • Y. Wang, R.R.Rando, Specific binding of aminoglycoside antibiotics to RNA. Chemistry & Biology 2 (1995), 281-290
  • L. Jiang, A. K. Suri, R. Fiala, D. J. Patel, Saccharide-RNA recognition in an aminoglycoside antibiotic-RNA aptamer complex. Chemistry & Biology 4 (1997), 35-50

slide-93
SLIDE 93

Aptamer binding to aminoglycoside antibiotics: Structure of ligands

  • Y. Wang, R.R.Rando, Specific binding of aminoglycoside antibiotics to RNA. Chemistry & Biology 2 (1995), 281-290

slide-94
SLIDE 94

[Figure: the RNA aptamer, drawn from the 5'-end to the 3'-end, before and after formation of its secondary structure, together with the ligand tobramycin.]

Formation of secondary structure of the tobramycin binding RNA aptamer

  • L. Jiang, A. K. Suri, R. Fiala, D. J. Patel, Saccharide-RNA recognition in an aminoglycoside antibiotic-RNA aptamer complex. Chemistry & Biology 4:35-50 (1997)

slide-95
SLIDE 95

The three-dimensional structure of the tobramycin aptamer complex

  • L. Jiang, A. K. Suri, R. Fiala, D. J. Patel, Chemistry & Biology 4:35-50 (1997)

slide-96
SLIDE 96

Questions that cannot be answered by current experimental techniques:

(i) How does the distribution of genotypes change with time?
(ii) Which intermediates are passed during an optimization experiment?
(iii) Why does optimization occur in steps?
(iv) What happens at the edges of the quasi-stationary epochs?
(v) How much do individual trajectories differ?
(vi) Which is the proper statistics for evolutionary optimization?

slide-103
SLIDE 103

Acknowledgement of support

Fonds zur Förderung der wissenschaftlichen Forschung (FWF), Projects No. 09942, 10578, 11065, 13093, 13887, and 14898
Jubiläumsfonds der Österreichischen Nationalbank, Project No. Nat-7813
European Commission: Project No. EU-980189
Siemens AG, Austria
The Santa Fe Institute and the Universität Wien

The software for producing RNA movies was developed by Robert Giegerich and coworkers at the Universität Bielefeld

slide-104
SLIDE 104

Coworkers

Walter Fontana, Santa Fe Institute, NM
Christian Reidys, Christian Forst, Los Alamos National Laboratory, NM
Peter Stadler, Bärbel Stadler, Universität Leipzig, GE
Ivo L.Hofacker, Christoph Flamm, Universität Wien, AT
Andreas Wernitznig, Michael Kospach, Universität Wien, AT
Ulrike Langhammer, Ulrike Mückstein, Stefanie Widder
Jan Cupal, Kurt Grünberger, Andreas Svrček-Seiler, Stefan Wuchty
Andreas De Stefani
Ulrike Göbel, Institut für Molekulare Biotechnologie, Jena, GE
Walter Grüner, Stefan Kopp, Jaqueline Weber

slide-105
SLIDE 105

Web-Page for further information: http://www.tbi.univie.ac.at/~pks

slide-106
SLIDE 106