RNA From Mathematical Models to Real Molecules 3. Optimization and - - PowerPoint PPT Presentation



slide-1
SLIDE 1
slide-2
SLIDE 2

RNA – From Mathematical Models to Real Molecules

3. Optimization and Evolution of RNA Molecules

Peter Schuster
Institut für Theoretische Chemie und Molekulare Strukturbiologie der Universität Wien

CIMPA – Genoma School, Valdivia, 12.–16.01.2004

slide-3
SLIDE 3

Web-Page for further information: http://www.tbi.univie.ac.at/~pks

slide-4
SLIDE 4

                                Generation time   10 000 generations   10^6 generations   10^7 generations
RNA molecules                   10 sec            27.8 h = 1.16 d      115.7 d            3.17 a
                                1 min             6.94 d               1.90 a             19.01 a
Bacteria                        20 min            138.9 d              38.03 a            380 a
                                10 h              11.40 a              1 140 a            11 408 a
Higher multicellular organisms  10 d              274 a                27 380 a           273 800 a
                                20 a              200 000 a            2 × 10^7 a         2 × 10^8 a

Time scales of evolutionary change
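The entries of the time-scale table are products of generation time and number of generations, converted to hours, days or years. A minimal sketch of that arithmetic follows; the function name `elapsed` and the year length of 365.25 days are assumptions made for illustration, not taken from the slides.

```python
# Elapsed time = generation time x number of generations, in convenient units.
SECONDS_PER_HOUR = 3600.0
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY  # assumed year length


def elapsed(generation_time_s, generations):
    """Return the elapsed time as a tuple (hours, days, years)."""
    total = generation_time_s * generations
    return (total / SECONDS_PER_HOUR,
            total / SECONDS_PER_DAY,
            total / SECONDS_PER_YEAR)


# RNA molecules with a generation time of 10 s, 10 000 generations
hours, days, years = elapsed(10.0, 10_000)
print(f"{hours:.1f} h = {days:.2f} d")

# Bacteria with a generation time of 20 min, 10^6 generations
print(f"{elapsed(20 * 60.0, 1e6)[2]:.2f} a")
```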

slide-5
SLIDE 5

[Figure: complementary replication cycle. Labels: Plus Strand, Minus Strand, Synthesis, Complex Dissociation, Synthesis; strand ends marked 5' and 3'. Photograph: James Watson and Francis Crick, 1953]

Complementary replication as the simplest copying mechanism of RNA. Complementarity is determined by Watson-Crick base pairs: G≡C and A=U

slide-6
SLIDE 6

(A) + I1 → I2 + I1   (rate constant f1)
(A) + I2 → I1 + I2   (rate constant f2)

\[ \frac{dx_1}{dt} = f_2 x_2 - x_1 \Phi, \qquad \frac{dx_2}{dt} = f_1 x_1 - x_2 \Phi \]

\[ \Phi = \sum_i f_i x_i, \qquad \sum_i x_i = 1, \qquad i = 1, 2 \]

Complementary replication as the simplest molecular mechanism of reproduction

slide-7
SLIDE 7

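Equation for complementary replication: [I_i] = x_i ≥ 0, f_i > 0; i = 1, 2

\[ \frac{dx_1}{dt} = f_2 x_2 - x_1 \phi, \qquad \frac{dx_2}{dt} = f_1 x_1 - x_2 \phi, \qquad \phi = f_1 x_1 + f_2 x_2 \]

Solutions are obtained by integrating factor transformation:

\[ x_1(t) = \frac{\sqrt{f_2}\,\bigl(\gamma_1(0)\exp(ft) + \gamma_2(0)\exp(-ft)\bigr)}{(\sqrt{f_1}+\sqrt{f_2})\,\gamma_1(0)\exp(ft) - (\sqrt{f_1}-\sqrt{f_2})\,\gamma_2(0)\exp(-ft)} \]

\[ x_2(t) = \frac{\sqrt{f_1}\,\bigl(\gamma_1(0)\exp(ft) - \gamma_2(0)\exp(-ft)\bigr)}{(\sqrt{f_1}+\sqrt{f_2})\,\gamma_1(0)\exp(ft) - (\sqrt{f_1}-\sqrt{f_2})\,\gamma_2(0)\exp(-ft)} \]

\[ \gamma_1(0) = \sqrt{f_1}\,x_1(0) + \sqrt{f_2}\,x_2(0), \qquad \gamma_2(0) = \sqrt{f_1}\,x_1(0) - \sqrt{f_2}\,x_2(0), \qquad f = \sqrt{f_1 f_2} \]

\[ x_1(t) \to \frac{\sqrt{f_2}}{\sqrt{f_1}+\sqrt{f_2}} \quad\text{and}\quad x_2(t) \to \frac{\sqrt{f_1}}{\sqrt{f_1}+\sqrt{f_2}} \quad\text{as}\quad \exp(-ft) \to 0 \]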
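The closed-form solution for complementary replication can be checked against a direct numerical integration of dx1/dt = f2 x2 − x1 φ and dx2/dt = f1 x1 − x2 φ. The sketch below does this with a hand-written Runge-Kutta step; the function names and the parameter values (f1 = 1, f2 = 4, x1(0) = 0.9) are illustrative assumptions, not values from the slides.

```python
import math


def closed_form(x1_0, f1, f2, t):
    """Closed-form x1(t), x2(t) for plus-minus replication with x1 + x2 = 1."""
    x2_0 = 1.0 - x1_0
    r1, r2 = math.sqrt(f1), math.sqrt(f2)
    f = r1 * r2
    g1 = r1 * x1_0 + r2 * x2_0  # gamma_1(0), amplitude of the exp(+f t) mode
    g2 = r1 * x1_0 - r2 * x2_0  # gamma_2(0), amplitude of the exp(-f t) mode
    # Numerator and denominator are multiplied by exp(-f t) to avoid overflow.
    decay = math.exp(-2.0 * f * t)
    denom = (r1 + r2) * g1 - (r1 - r2) * g2 * decay
    return r2 * (g1 + g2 * decay) / denom, r1 * (g1 - g2 * decay) / denom


def rk4(x1_0, f1, f2, t_end, steps=2000):
    """Integrate the two rate equations with the classical Runge-Kutta scheme."""
    def rhs(x1, x2):
        phi = f1 * x1 + f2 * x2
        return f2 * x2 - x1 * phi, f1 * x1 - x2 * phi

    x1, x2 = x1_0, 1.0 - x1_0
    h = t_end / steps
    for _ in range(steps):
        k1 = rhs(x1, x2)
        k2 = rhs(x1 + 0.5 * h * k1[0], x2 + 0.5 * h * k1[1])
        k3 = rhs(x1 + 0.5 * h * k2[0], x2 + 0.5 * h * k2[1])
        k4 = rhs(x1 + h * k3[0], x2 + h * k3[1])
        x1 += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        x2 += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return x1, x2


print(closed_form(0.9, 1.0, 4.0, 1.5))
print(rk4(0.9, 1.0, 4.0, 1.5))
```

At long times both strands approach the stationary ratio x1 : x2 = √f2 : √f1, independent of the initial condition.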

slide-8
SLIDE 8

[Figure: direct replication of a double strand into two double strands. Labels: Plus Strand, Minus Strand; strand ends marked 5' and 3']

Direct replication of DNA is a highly complex copying mechanism involving more than ten different protein molecules. Complementarity is determined by Watson-Crick base pairs: G≡C and A=T

slide-9
SLIDE 9

(A) + I_i → I_i + I_i   (rate constant f_i),   i = 1, 2, ..., n

[A] = a = constant;   [I_i] = x_i ≥ 0

\[ \frac{dx_i}{dt} = f_i x_i - x_i \Phi = x_i\,(f_i - \Phi), \qquad \Phi = \sum_j f_j x_j, \qquad \sum_j x_j = 1, \qquad i, j = 1, 2, \ldots, n \]

\[ f_m = \max\{ f_j ;\ j = 1, 2, \ldots, n \}: \qquad x_m(t) \to 1 \ \text{ for } \ t \to \infty \]

Reproduction of organisms or replication of molecules as the basis of selection

slide-10
SLIDE 10

Selection equation: [I_i] = x_i ≥ 0, f_i > 0

\[ \frac{dx_i}{dt} = x_i\,(f_i - \phi), \quad i = 1, 2, \ldots, n; \qquad \sum_{i=1}^{n} x_i = 1; \qquad \phi = \sum_{j=1}^{n} f_j x_j = \bar{f} \]

Mean fitness or dilution flux, φ(t), is a non-decreasing function of time:

\[ \frac{d\phi}{dt} = \sum_{i=1}^{n} f_i \frac{dx_i}{dt} = \overline{f^2} - \bigl(\bar{f}\bigr)^2 = \operatorname{var}\{f\} \ge 0 \]

Solutions are obtained by integrating factor transformation:

\[ x_i(t) = \frac{x_i(0)\,\exp(f_i t)}{\sum_{j=1}^{n} x_j(0)\,\exp(f_j t)}, \qquad i = 1, 2, \ldots, n \]
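The closed-form solution of the selection equation and the monotonic increase of the mean fitness φ(t) can be illustrated numerically. The following is a minimal sketch; the three fitness values and initial fractions are made-up example numbers, and the function names are hypothetical.

```python
import math


def selection_solution(x0, f, t):
    """x_i(t) = x_i(0) exp(f_i t) / sum_j x_j(0) exp(f_j t)."""
    # Subtracting the largest rate constant avoids overflow; it cancels in the ratio.
    f_max = max(f)
    weights = [x * math.exp((fi - f_max) * t) for x, fi in zip(x0, f)]
    total = sum(weights)
    return [w / total for w in weights]


def mean_fitness(x, f):
    """phi = sum_j f_j x_j."""
    return sum(xi * fi for xi, fi in zip(x, f))


x0 = [0.5, 0.3, 0.2]  # example initial fractions (assumed)
f = [1.0, 1.5, 2.0]   # example replication rate constants (assumed)

phis = [mean_fitness(selection_solution(x0, f, t), f) for t in range(30)]
final = selection_solution(x0, f, 29)
print(phis[0], phis[-1])
print(final)
```

In this example the variant with the largest rate constant takes over the population, and φ(t) rises from its initial value toward that largest rate constant.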

slide-11
SLIDE 11

s = ( f2-f1) / f1; f2 > f1 ; x1(0) = 1 - 1/N ; x2(0) = 1/N

[Plot: fraction of advantageous variant (0 to 1) vs. time in generations (0 to 1000); curves for s = 0.1, s = 0.02 and s = 0.01]

Selection of advantageous mutants in populations of N = 10 000 individuals
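For two variants, the selection equation gives a ratio x2/x1 that grows exponentially with rate f2 − f1 = s·f1. The sketch below uses this to estimate how long the advantageous variant needs to reach a fraction of one half. It assumes continuous time measured in units where f1 = 1 per generation, which is an assumption of this example and not stated on the slide; the function names are hypothetical.

```python
import math


def mutant_fraction(s, n_individuals, t, f1=1.0):
    """Fraction x2(t) of a variant with f2 = f1 * (1 + s), starting at x2(0) = 1/N."""
    x2_0 = 1.0 / n_individuals
    ratio0 = x2_0 / (1.0 - x2_0)           # x2 / x1 at t = 0
    ratio = ratio0 * math.exp(s * f1 * t)  # x2 / x1 grows with rate f2 - f1 = s * f1
    return ratio / (1.0 + ratio)


def half_time(s, n_individuals, f1=1.0):
    """Time at which the advantageous variant reaches a fraction of 1/2."""
    return math.log(n_individuals - 1) / (s * f1)


for s in (0.1, 0.02, 0.01):
    print(s, round(half_time(s, 10_000), 1))
```

Under these assumptions the takeover time scales as ln(N − 1)/s, so a tenfold smaller selective advantage delays fixation tenfold.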

slide-12
SLIDE 12

Changes in RNA sequences originate from replication errors called mutations. Mutations occur independently of their consequences in the selection process and are, therefore, commonly characterized as random elements of evolution.

slide-13
SLIDE 13

[Figure: error-prone copying of a plus strand into a minus strand, strand ends marked 5' and 3'. Three classes of mutation are illustrated on the example sequence GAAUCCCGAA: point mutation, insertion, deletion]

The origins of changes in RNA sequences are replication errors called mutations.

slide-14
SLIDE 14

Theory of molecular evolution

M. Eigen, Self-organization of matter and the evolution of biological macromolecules. Naturwissenschaften 58 (1971), 465-526

C.J. Thompson, J.L. McBride, On Eigen's theory of the self-organization of matter and the evolution of biological macromolecules. Math. Biosci. 21 (1974), 127-142

B.L. Jones, R.H. Enns, S.S. Rangnekar, On the theory of selection of coupled macromolecular systems. Bull. Math. Biol. 38 (1976), 15-28

M. Eigen, P. Schuster, The hypercycle. A principle of natural self-organization. Part A: Emergence of the hypercycle. Naturwissenschaften 64 (1977), 541-565

M. Eigen, P. Schuster, The hypercycle. A principle of natural self-organization. Part B: The abstract hypercycle. Naturwissenschaften 65 (1978), 7-41

M. Eigen, P. Schuster, The hypercycle. A principle of natural self-organization. Part C: The realistic hypercycle. Naturwissenschaften 65 (1978), 341-369

J. Swetina, P. Schuster, Self-replication with errors - A model for polynucleotide replication. Biophys. Chem. 16 (1982), 329-345

J.S. McCaskill, A localization threshold for macromolecular quasispecies from continuously distributed replication rates. J. Chem. Phys. 80 (1984), 5194-5202

M. Eigen, J. McCaskill, P. Schuster, The molecular quasispecies. Adv. Chem. Phys. 75 (1989), 149-263

C. Reidys, C. Forst, P. Schuster, Replication and mutation on neutral networks. Bull. Math. Biol. 63 (2001), 57-94

slide-15
SLIDE 15

Chemical kinetics of molecular evolution

M. Eigen, P. Schuster, 'The Hypercycle', Springer-Verlag, Berlin 1979

slide-16
SLIDE 16

(A) + I_j → I_i + I_j   (rate constant f_j Q_ji),   i = 1, 2, ..., n

\[ Q_{ij} = (1-p)^{\,l - d(i,j)}\; p^{\,d(i,j)} \]

p .......... Error rate per digit
d(i,j) ..... Hamming distance between I_i and I_j
l .......... Chain length of the polynucleotide

\[ \frac{dx_i}{dt} = \sum_j f_j Q_{ji} x_j - x_i \Phi, \qquad \Phi = \sum_j f_j x_j, \qquad \sum_j x_j = 1, \qquad \sum_i Q_{ij} = 1 \]

[A] = a = constant;   [I_i] = x_i ≥ 0;   i = 1, 2, ..., n

Chemical kinetics of replication and mutation as parallel reactions
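The mutation probabilities Q_ij = (1 − p)^(l − d(i,j)) p^(d(i,j)) can be tabulated directly for short chains. The sketch below does so for binary sequences, an assumption made here because with exactly two symbols per position this formula sums to one over all possible copies; the function names are hypothetical.

```python
from itertools import product


def hamming(a, b):
    """Number of positions in which two equally long sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def mutation_matrix(length, p):
    """Return all binary sequences of the given length and the matrix Q.

    Q[i][j] = (1 - p)^(l - d) * p^d, with d the Hamming distance between
    sequence i and sequence j and p the error rate per digit.
    """
    seqs = ["".join(bits) for bits in product("01", repeat=length)]
    q = [[(1 - p) ** (length - hamming(a, b)) * p ** hamming(a, b) for b in seqs]
         for a in seqs]
    return seqs, q


seqs, q = mutation_matrix(3, 0.05)
print(seqs[0], q[0][0])                  # probability of an error-free copy
print(max(abs(sum(row) - 1.0) for row in q))  # deviation of row sums from 1
```

Because Q depends only on the Hamming distance, the matrix is symmetric, so row sums and column sums both equal one.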

slide-17
SLIDE 17

[Figure: 2D sketch of sequence space with city-block distance. Four sequences of the form ....GC··UC.... that differ in two adjacent positions (CA, GU, GA, CU) are connected by edges labeled d_H = 1, d_H = 1 and d_H = 2]

Single point mutations as moves in sequence space

slide-18
SLIDE 18

Mutation-selection equation: [I_i] = x_i ≥ 0, f_i > 0, Q_ij ≥ 0

\[ \frac{dx_i}{dt} = \sum_{j=1}^{n} f_j Q_{ji} x_j - x_i \phi, \quad i = 1, 2, \ldots, n; \qquad \sum_{i=1}^{n} x_i = 1; \qquad \phi = \sum_{j=1}^{n} f_j x_j = \bar{f} \]

Solutions are obtained after integrating factor transformation by means of an eigenvalue problem:

\[ x_i(t) = \frac{\sum_{k=0}^{n-1} \ell_{ik}\, c_k(0)\, \exp(\lambda_k t)}{\sum_{j=1}^{n} \sum_{k=0}^{n-1} \ell_{jk}\, c_k(0)\, \exp(\lambda_k t)}, \quad i = 1, 2, \ldots, n; \qquad c_k(0) = \sum_{i=1}^{n} h_{ki}\, x_i(0) \]

\[ W \doteq \{ W_{ij} = f_j Q_{ji};\ i, j = 1, 2, \ldots, n \}; \qquad L = \{ \ell_{ij};\ i, j = 1, 2, \ldots, n \}; \qquad L^{-1} = H = \{ h_{ij};\ i, j = 1, 2, \ldots, n \} \]

\[ L^{-1} \cdot W \cdot L = \Lambda = \{ \lambda_k;\ k = 0, 1, \ldots, n-1 \} \]
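At long times the solution is dominated by the eigenvector of W belonging to the largest eigenvalue, which is the stationary mutant distribution. The sketch below approximates it by power iteration instead of a full diagonalization. It assumes binary sequences, a single-peak fitness landscape (one master sequence with a higher rate constant) and example parameter values; none of these choices come from the slides.

```python
from itertools import product


def quasispecies(length, p, f_master=10.0, f_other=1.0, iterations=500):
    """Stationary distribution of the mutation-selection equation by power iteration."""
    seqs = ["".join(bits) for bits in product("01", repeat=length)]
    n = len(seqs)
    # Single-peak landscape (assumed): the all-zero sequence is the master.
    f = [f_master if s == "0" * length else f_other for s in seqs]

    def q(a, b):
        d = sum(x != y for x, y in zip(a, b))
        return (1 - p) ** (length - d) * p ** d

    # W[i][j] = f_j * Q_ji: production of I_i by error-prone copying of I_j.
    w = [[f[j] * q(seqs[j], seqs[i]) for j in range(n)] for i in range(n)]

    x = [1.0 / n] * n
    for _ in range(iterations):
        y = [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in w]
        total = sum(y)
        x = [v / total for v in y]  # renormalize so that sum_i x_i = 1

    phi = sum(fi * xi for fi, xi in zip(f, x))  # mean fitness = largest eigenvalue
    return seqs, w, x, phi


seqs, w, x, phi = quasispecies(4, 0.01)
print(seqs[0], x[0], phi)
```

Calling `quasispecies` with increasing error rate `p` shows the master sequence losing weight to its mutant cloud, which is the dependence on replication accuracy discussed for the quasispecies.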

slide-19
SLIDE 19

[Figure: concentration plotted over sequence space; labels: Master sequence, Mutant cloud]

The molecular quasispecies in sequence space

slide-20
SLIDE 20

Quasispecies as a function of the replication accuracy q

slide-21
SLIDE 21

In evolution, variation occurs on genotypes but selection operates on the phenotype. Mappings from genotypes into phenotypes are highly complex objects. The only computationally accessible case is the evolution of RNA molecules. The mapping from RNA sequences into secondary structures and function, sequence ⇒ structure ⇒ function, is used as a model for the complex relations between genotypes and phenotypes. Fertile progeny, measured in terms of fitness in population biology, is determined quantitatively by replication rate constants of RNA molecules.

Population biology   Molecular genetics     Evolution of RNA molecules
Genotype             Genome                 RNA sequence
Phenotype            Organism               RNA structure and function
Fitness              Reproductive success   Replication rate constant

The RNA model

slide-22
SLIDE 22

Optimized element: RNA structure

slide-23
SLIDE 23

[Figure: two structures S1 and S2 in parentheses notation with Hamming distance d_H(S1, S2) = 4]

(i)   d_H(S1, S1) = 0
(ii)  d_H(S1, S2) = d_H(S2, S1)
(iii) d_H(S1, S3) ≤ d_H(S1, S2) + d_H(S2, S3)

The Hamming distance between structures in parentheses notation forms a metric in structure space
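The Hamming distance between two structures written in parentheses (dot-bracket) notation is the number of positions at which the two strings differ. The sketch below computes it and checks the three metric properties on a few toy structures; the example strings and the function name are invented for illustration.

```python
from itertools import product


def structure_distance(s1, s2):
    """Hamming distance between two equally long dot-bracket strings."""
    if len(s1) != len(s2):
        raise ValueError("structures must have the same length")
    return sum(a != b for a, b in zip(s1, s2))


# Toy structures of length 12 (assumed examples, not from the slides)
structures = ["((((....))))", "(((......)))", "..((....)).."]

print(structure_distance(structures[0], structures[1]))

# Check identity, symmetry and the triangle inequality on all triples.
for a, b, c in product(structures, repeat=3):
    assert structure_distance(a, a) == 0
    assert structure_distance(a, b) == structure_distance(b, a)
    assert structure_distance(a, c) <= structure_distance(a, b) + structure_distance(b, c)
```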

slide-24
SLIDE 24

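[Figure: RNA secondary structures labeled with their replication rate constants f0, f1, ..., f7]

Replication rate constant: \( f_k = \gamma / [\alpha + \Delta d_S^{(k)}] \), with \( \Delta d_S^{(k)} = d_H(S_k, S_\tau) \)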

Evaluation of RNA secondary structures yields replication rate constants
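A rate constant that depends only on the distance to a target structure can be sketched as follows. The example assumes the hyperbolic form f_k = γ / (α + Δd_S), where Δd_S is the Hamming distance between the dot-bracket strings of structure k and the target; the constants `alpha` and `gamma`, their default values, and the toy structures are assumptions made for illustration.

```python
def replication_rate(structure, target, alpha=0.1, gamma=1.0):
    """Rate constant f_k = gamma / (alpha + d_H(S_k, S_target)).

    alpha > 0 keeps the rate finite when the target structure is reached;
    both constants are illustrative values, not taken from the slides.
    """
    if len(structure) != len(target):
        raise ValueError("structures must have the same length")
    distance = sum(a != b for a, b in zip(structure, target))
    return gamma / (alpha + distance)


target = "((((....))))"
candidates = ["............", "..((....))..", "(((......)))", target]
for s in candidates:
    print(s, replication_rate(s, target))
```

With this choice the rate constant increases monotonically as a structure approaches the target, so selection for faster replication drives the population toward the target structure.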

slide-25
SLIDE 25

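[Figure: flow reactor; labels: Stock Solution, Reaction Mixture]

Replication rate constant: \( f_k = \gamma / [\alpha + \Delta d_S^{(k)}] \), with \( \Delta d_S^{(k)} = d_H(S_k, S_\tau) \)

Selection constraint: the number of RNA molecules is controlled by the flow, \( N(t) \approx \bar{N} \pm \sqrt{\bar{N}} \)

The flow reactor as a device for studies of evolution in vitro and in silico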

slide-26
SLIDE 26

[Figure: two secondary structures with 5'-end and 3'-end marked and positions numbered from 10 to 70]

Randomly chosen initial structure; phenylalanyl-tRNA as target structure

slide-27
SLIDE 27

[Figure: concentration plotted over sequence space; labels: Master sequence, Mutant cloud, "Off-the-cloud" mutations]

The molecular quasispecies in sequence space

slide-28
SLIDE 28

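[Diagram: a population of genotypes I1, I2, ..., In with rate constants f1, f2, ..., fn passes through three stages. Mutation (matrix Q) produces a new genotype I; the genotype-phenotype mapping assigns it a structure S; evaluation of the phenotype yields its rate constant f = ƒ(S). The enlarged population I1, ..., In+1 with f1, ..., fn+1 re-enters the cycle]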

Evolutionary dynamics including molecular phenotypes

slide-29
SLIDE 29

In silico optimization in the flow reactor: Trajectory (biologists' view)

[Plot: evolutionary trajectory; x-axis: time (arbitrary units), 0 to 1250; y-axis: average distance from initial structure, 50 − Δd_S, 0 to 50]

slide-30
SLIDE 30

In silico optimization in the flow reactor: Trajectory (physicists' view)

[Plot: evolutionary trajectory; x-axis: time (arbitrary units), 0 to 1250; y-axis: average structure distance to target Δd_S, 0 to 50]

slide-31
SLIDE 31

[Plot: final part of the evolutionary trajectory; x-axis: time (up to 1250); y-axis: average structure distance to target Δd_S; relay steps 36 to 44 marked by number. Structure shown: 44]

End conformation of optimization

slide-32
SLIDE 32

[Plot: final part of the evolutionary trajectory; x-axis: time (up to 1250); y-axis: average structure distance to target Δd_S; relay steps 36 to 44 marked by number. Structures shown: 44, 43]

Reconstruction of the last step 43 → 44

slide-33
SLIDE 33

[Plot: final part of the evolutionary trajectory; x-axis: time (up to 1250); y-axis: average structure distance to target Δd_S; relay steps 36 to 44 marked by number. Structures shown: 44, 43, 42]

Reconstruction of last-but-one step 42 → 43 (→ 44)

slide-34
SLIDE 34

[Plot: final part of the evolutionary trajectory; x-axis: time (up to 1250); y-axis: average structure distance to target Δd_S; relay steps 36 to 44 marked by number. Structures shown: 44, 43, 42, 41]

Reconstruction of step 41 → 42 (→ 43 → 44)

slide-35
SLIDE 35

[Plot: final part of the evolutionary trajectory; x-axis: time (up to 1250); y-axis: average structure distance to target Δd_S; relay steps 36 to 44 marked by number. Structures shown: 44, 43, 42, 41, 40]

Reconstruction of step 40 → 41 (→ 42 → 43 → 44)

slide-36
SLIDE 36

[Plot: final part of the evolutionary trajectory; x-axis: time (up to 1250); y-axis: average structure distance to target Δd_S; relay steps 36 to 44 marked by number. Structures shown: 44, 43, 42, 41, 40, 39, with arrows labeled "Evolutionary process" and "Reconstruction"]

Reconstruction of the relay series

slide-37
SLIDE 37

[Figure: aligned sequences of the relay series; legend: transition inducing point mutations, neutral point mutations]

Change in RNA sequences during the final five relay steps 39 → 44

slide-38
SLIDE 38

In silico optimization in the flow reactor: Trajectory and relay steps

[Plot: evolutionary trajectory with relay steps marked; x-axis: time (arbitrary units), 0 to 1250; y-axis: average structure distance to target Δd_S, 0 to 50]

slide-39
SLIDE 39

In silico optimization in the flow reactor: Main transitions

[Plot: evolutionary trajectory with relay steps and main transitions marked; x-axis: time (arbitrary units), 0 to 1250; y-axis: average structure distance to target Δd_S, 0 to 50]

slide-40
SLIDE 40

[Figure: structures at relay steps 00, 09, 31 and 44]

Three important steps in the formation of the tRNA clover leaf from a randomly chosen initial structure corresponding to three main transitions.

slide-41
SLIDE 41

[Log-log plot: frequency of occurrence (10^-6 to 10^-1) vs. rank (10^0 to 10^5). Frequent neighbors correspond to minor transitions, rare neighbors to main transitions. Inset: tRNA structure with 5'-end and 3'-end marked and positions numbered from 10 to 70]

Probability of occurrence of different structures in the mutational neighborhood of tRNAphe

slide-42
SLIDE 42

Definition of an ε-neighborhood of structure S_k

Υ(S_k) ... set of all structures occurring in the Hamming distance one neighborhood of the neutral network G_k of S_k

γ_jk ... number of contacts between the two neutral networks G_j and G_k; γ_jk = γ_kj

Probability of occurrence:

\[ \rho(S_j; S_k) = \frac{\gamma_{jk}}{(\kappa - 1)\, n\, |G_k|}, \qquad \rho(S_j; S_k) \neq \rho(S_k; S_j) \]

ε-neighborhood of S_k:

\[ \Psi_\varepsilon(S_k) = \{\, S_j \in \Upsilon(S_k) \mid \rho(S_j; S_k) > \varepsilon \,\} \]
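The definition can be turned into a small filter: compute the probability of occurrence for every neighboring structure and keep those above the threshold ε. The sketch below assumes the normalization ρ = γ_jk / ((κ − 1) n |G_k|), reads κ as the alphabet size and n as the chain length (an interpretation, since the slide does not define them), and uses invented contact counts.

```python
def occurrence_probabilities(contacts, network_size, chain_length, alphabet_size=4):
    """rho(S_j; S_k) = gamma_jk / ((kappa - 1) * n * |G_k|) for every neighbor S_j.

    contacts maps a structure label j to the contact count gamma_jk;
    network_size is |G_k|. All numbers used below are made up.
    """
    total_neighbors = (alphabet_size - 1) * chain_length * network_size
    return {label: count / total_neighbors for label, count in contacts.items()}


def epsilon_neighborhood(contacts, network_size, chain_length, epsilon, alphabet_size=4):
    """Set of neighboring structures whose probability of occurrence exceeds epsilon."""
    rho = occurrence_probabilities(contacts, network_size, chain_length, alphabet_size)
    return {label for label, value in rho.items() if value > epsilon}


contacts = {"A": 300, "B": 30, "C": 3}  # hypothetical contact counts gamma_jk
print(occurrence_probabilities(contacts, network_size=100, chain_length=10))
print(epsilon_neighborhood(contacts, network_size=100, chain_length=10, epsilon=0.005))
```

Because the normalization uses the size of G_k only, the same contact count gives different probabilities in the two directions whenever the two neutral networks differ in size.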

slide-43
SLIDE 43

[Movies: AUGC alphabet, GC alphabet]

Movies of optimization trajectories over the AUGC and the GC alphabet

slide-44
SLIDE 44

[Histogram: frequency (0 to 0.2) vs. runtime of trajectories (0 to 5000)]

Statistics of the lengths of trajectories from initial structure to target (AUGC-sequences)

slide-45
SLIDE 45

[Histogram: frequency (0 to 0.3) vs. number of transitions (0 to 100); two distributions: all transitions, main transitions]

Statistics of the numbers of transitions from initial structure to target (AUGC-sequences)

slide-46
SLIDE 46

Alphabet   Runtime   Transitions   Main transitions   No. of runs
AUGC         385.6          22.5               12.6          1017
GUC          448.9          30.5               16.5           611
GC          2188.3          40.0               20.6           107

Statistics of trajectories and relay series (mean values of log-normal distributions)

slide-47
SLIDE 47

[Plot: section of the evolutionary trajectory; x-axis: time (arbitrary units), 250 to 500; y-axis: average structure distance to target Δd_S, 10 to 20; relay steps 08, 10, 12 and 14 marked by number, with their uninterrupted presence indicated; legend: transition inducing point mutations, neutral point mutations]

28 neutral point mutations during a long quasi-stationary epoch

Neutral genotype evolution during phenotypic stasis

slide-48
SLIDE 48

Variation in genotype space during optimization of phenotypes

Mean Hamming distance within the population and drift velocity of the population center in sequence space.

slide-49
SLIDE 49

Spread of population in sequence space during a quasistationary epoch: t = 150

slide-50
SLIDE 50

Spread of population in sequence space during a quasistationary epoch: t = 170

slide-51
SLIDE 51

Spread of population in sequence space during a quasistationary epoch: t = 200

slide-52
SLIDE 52

Spread of population in sequence space during a quasistationary epoch: t = 350

slide-53
SLIDE 53

Spread of population in sequence space during a quasistationary epoch: t = 500

slide-54
SLIDE 54

Spread of population in sequence space during a quasistationary epoch: t = 650

slide-55
SLIDE 55

Spread of population in sequence space during a quasistationary epoch: t = 820

slide-56
SLIDE 56

Spread of population in sequence space during a quasistationary epoch: t = 825

slide-57
SLIDE 57

Spread of population in sequence space during a quasistationary epoch: t = 830

slide-58
SLIDE 58

Spread of population in sequence space during a quasistationary epoch: t = 835

slide-59
SLIDE 59

Spread of population in sequence space during a quasistationary epoch: t = 840

slide-60
SLIDE 60

Spread of population in sequence space during a quasistationary epoch: t = 845

slide-61
SLIDE 61

Spread of population in sequence space during a quasistationary epoch: t = 850

slide-62
SLIDE 62

Spread of population in sequence space during a quasistationary epoch: t = 855

slide-63
SLIDE 63

Massif Central Mount Fuji

Examples of smooth landscapes on Earth

slide-64
SLIDE 64

Dolomites

Examples of rugged landscapes on Earth

Bryce Canyon

slide-65
SLIDE 65

[Figure: fitness plotted over genotype space; labels: Start of Walk, End of Walk]

Evolutionary optimization in absence of neutral paths in sequence space

slide-66
SLIDE 66

[Figure: fitness plotted over genotype space; labels: Start of Walk, End of Walk, Random Drift Periods, Adaptive Periods]

Evolutionary optimization including neutral paths in sequence space

slide-67
SLIDE 67

Grand Canyon

Example of a landscape on Earth with ‘neutral’ ridges and plateaus

slide-68
SLIDE 68

Neutral ridges and plateaus

slide-69
SLIDE 69

Acknowledgement of support

Fonds zur Förderung der wissenschaftlichen Forschung (FWF): Projects No. 09942, 10578, 11065, 13093, 13887, and 14898

Jubiläumsfonds der Österreichischen Nationalbank: Project No. Nat-7813

European Commission: Project No. EU-980189

Siemens AG, Austria

The Santa Fe Institute and the Universität Wien

The software for producing RNA movies was developed by Robert Giegerich and coworkers at the Universität Bielefeld

Universität Wien

slide-70
SLIDE 70

Coworkers

Universität Wien

Walter Fontana, Santa Fe Institute, NM

Christian Reidys, Christian Forst, Los Alamos National Laboratory, NM

Peter Stadler, Bärbel Stadler, Universität Leipzig, GE

Ivo L. Hofacker, Christoph Flamm, Universität Wien, AT

Andreas Wernitznig, Michael Kospach, Universität Wien, AT
Ulrike Langhammer, Ulrike Mückstein, Stefanie Widder
Jan Cupal, Kurt Grünberger, Andreas Svrček-Seiler, Stefan Wuchty

Ulrike Göbel, Institut für Molekulare Biotechnologie, Jena, GE
Walter Grüner, Stefan Kopp, Jaqueline Weber

slide-71
SLIDE 71

Web-Page for further information: http://www.tbi.univie.ac.at/~pks

slide-72
SLIDE 72