RNA: A molecule for many uses seen with the eyes of a physicist



slide-1
SLIDE 1
slide-2
SLIDE 2

RNA: A molecule for many uses seen with the eyes of a physicist

Peter Schuster

Institut für Theoretische Chemie, Universität Wien, Austria and The Santa Fe Institute, Santa Fe, New Mexico, USA

CAS-MPG Partner Institute for Computational Biology Shanghai, 26.10.2007

slide-3
SLIDE 3

Recent review article: Peter Schuster, Prediction of RNA secondary structures: From theory to models and real molecules. Rep. Prog. Phys. 69:1419-1477, 2006.

Web-Page for further information: http://www.tbi.univie.ac.at/~pks

slide-4
SLIDE 4

Functions of RNA molecules

  • RNA as scaffold for supramolecular complexes: the ribosome
  • RNA is modified by epigenetic control: RNA editing, alternative splicing of messenger RNA
  • The RNA world as a precursor of the current DNA + protein biology
  • RNA as catalyst: ribozymes
  • RNA as carrier of genetic information: RNA viruses and retroviruses, RNA evolution in vitro

slide-5
SLIDE 5

Definition of RNA structure

[Figure: RNA backbone of ribose and phosphate units running from the 5'-end to the 3'-end, with Na+ counter-ions and nucleobases N1, N2, N3, N4, where Nk = A, U, G, C; below it, an example sequence drawn as a secondary structure.]

slide-6
SLIDE 6

A symbolic notation of RNA secondary structure that is equivalent to the conventional graphs

slide-7
SLIDE 7

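A symbolic notation of RNA secondary structure that is equivalent to the conventional graphs

  • Number of sequences of chain length n: $N = 4^n$
  • Number of structures: $N_S < 3^n$
  • Criterion: minimum free energy (mfe)
  • Rules: every base pair ( ) must be one of {AU, CG, GC, GU, UA, UG}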
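The parentheses notation can be read mechanically: every "(" opens a base pair that the matching ")" closes, and a dot is an unpaired position. The following is a minimal Python sketch of that reading, not code from the talk. The function names and the 0-based positions are assumptions made for illustration; the example sequence and structure are the 20-nucleotide example that appears on a later slide.

```python
# Base pairs admitted by the rules on the slide (Watson-Crick pairs plus GU wobble).
ALLOWED_PAIRS = {"AU", "CG", "GC", "GU", "UA", "UG"}


def parse_dot_bracket(structure):
    """Return the sorted list of base pairs (i, j), 0-based, of a dot-bracket string."""
    stack, pairs = [], []
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)                 # remember the opening position
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.append((stack.pop(), pos))  # close the innermost open pair
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


def pairs_are_allowed(sequence, structure):
    """True if every base pair of the structure is one of the six allowed pairs."""
    if len(sequence) != len(structure):
        return False
    return all(sequence[i] + sequence[j] in ALLOWED_PAIRS
               for i, j in parse_dot_bracket(structure))


sequence = "CUGCGGCUUUGGCUCUAGCC"
structure = "....((((........))))"
print(parse_dot_bracket(structure))
print(pairs_are_allowed(sequence, structure))
```

A stack is enough here because secondary structures in this notation contain no crossing pairs, so the most recently opened pair is always the first one to close.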

slide-8
SLIDE 8

Conventional definition of RNA secondary structures

slide-9
SLIDE 9

1. Sequence space and shape space
2. Neutral networks
3. Evolutionary optimization of structure
4. Suboptimal structures and kinetic folding
5. Comparison of kinetic folding and evolution
6. How to model evolution of kinetic folding?

slide-10
SLIDE 10
  • 1. Sequence space and shape space
2. Neutral networks
3. Evolutionary optimization of structure
4. Suboptimal structures and kinetic folding
5. Comparison of kinetic folding and evolution
6. How to model evolution of kinetic folding?

slide-11
SLIDE 11

Sequence space

slide-12
SLIDE 12

[Figure: two aligned sequences I1 and I2, CGTCGTTACAATTTA GTTATGTGCGAATTC CAAATT AAAA ACAAGAG....., that differ only at four marked positions; the differing letters shown are G, A, G, T, A, C, A, C.]

Hamming distance: $d_H(I_1,I_2) = 4$

  (i) $d_H(I_1,I_1) = 0$
  (ii) $d_H(I_1,I_2) = d_H(I_2,I_1)$
  (iii) $d_H(I_1,I_3) \le d_H(I_1,I_2) + d_H(I_2,I_3)$

The Hamming distance between sequences induces a metric in sequence space
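The three properties (i) to (iii) can be checked by brute force on a small sequence space. The sketch below is illustrative Python, not part of the talk: `hamming_distance` and `is_metric_on` are hypothetical names, and the two four-letter strings are only the differing letters from the slide, used here as a toy pair.

```python
from itertools import product


def hamming_distance(a, b):
    """Number of positions at which two equally long sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def is_metric_on(points):
    """Check identity, symmetry and the triangle inequality on a finite set."""
    for a in points:
        if hamming_distance(a, a) != 0:
            return False
        for b in points:
            if hamming_distance(a, b) != hamming_distance(b, a):
                return False
            for c in points:
                if hamming_distance(a, c) > hamming_distance(a, b) + hamming_distance(b, c):
                    return False
    return True


# Toy pair: four positions, all of them different.
print(hamming_distance("GAGT", "ACAC"))

# All 32 binary sequences of chain length n = 5.
binary_space = ["".join(bits) for bits in product("01", repeat=5)]
print(is_metric_on(binary_space))
```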

slide-13
SLIDE 13

Every point in sequence space is equivalent

Sequence space of binary sequences with chain length n = 5

slide-14
SLIDE 14

Sequence space and structure space

slide-15
SLIDE 15

Hamming distance: $d_H(S_1,S_2) = 4$

  (i) $d_H(S_1,S_1) = 0$
  (ii) $d_H(S_1,S_2) = d_H(S_2,S_1)$
  (iii) $d_H(S_1,S_3) \le d_H(S_1,S_2) + d_H(S_2,S_3)$

The Hamming distance between structures in parentheses notation forms a metric in structure space

slide-16
SLIDE 16

Two measures of distance in shape space: Hamming distance between structures, dH(Si,Sj) and base pair distance, dP(Si,Sj)
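Both distances are easy to compute from the parentheses notation. The sketch below assumes that the base pair distance counts the base pairs present in exactly one of the two structures (the symmetric difference of the pair sets); the function names are illustrative. The two structures are taken from the list of suboptimal structures shown on a later slide and differ by one base pair.

```python
def base_pairs(structure):
    """Set of base pairs (i, j), 0-based, encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            pairs.add((stack.pop(), pos))
    return pairs


def hamming_distance(s1, s2):
    """d_H: number of positions where the two dot-bracket strings differ."""
    if len(s1) != len(s2):
        raise ValueError("structures must have equal length")
    return sum(a != b for a, b in zip(s1, s2))


def base_pair_distance(s1, s2):
    """d_P: number of base pairs that occur in only one of the two structures."""
    return len(base_pairs(s1) ^ base_pairs(s2))


s_a = "....((((........))))"
s_b = "....(((..........)))"
print(hamming_distance(s_a, s_b), base_pair_distance(s_a, s_b))
```

Opening or closing a single base pair changes two symbols of the string but only one pair, so the two measures weight the same move differently.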

slide-17
SLIDE 17

Structures are not equivalent in structure space

Sketch of structure space

slide-18
SLIDE 18


slide-19
SLIDE 19
slide-20
SLIDE 20
slide-21
SLIDE 21
slide-22
SLIDE 22
slide-23
SLIDE 23
slide-24
SLIDE 24
slide-25
SLIDE 25

Sequence, structure, and design

RNA sequence → RNA structure of minimal free energy

RNA folding: structural biology, spectroscopy of biomolecules, understanding molecular function

Empirical parameters

Biophysical chemistry: thermodynamics and kinetics

slide-26
SLIDE 26

[Figure: free energy ΔG of the conformations of one sequence, drawn from the 5'-end to the 3'-end: the structure S0(h) at the minimum of free energy and the suboptimal conformations S1(h) to S9(h).]

The minimum free energy structures on a discrete space of conformations

slide-27
SLIDE 27

1. Sequence space and shape space
  • 2. Neutral networks
3. Evolutionary optimization of structure
4. Suboptimal structures and kinetic folding
5. Comparison of kinetic folding and evolution
6. How to model evolution of kinetic folding?

slide-28
SLIDE 28

Sequence, structure, and design

RNA sequence → RNA structure of minimal free energy

RNA folding: structural biology, spectroscopy of biomolecules, understanding molecular function

Inverse folding algorithm: iterative determination of a sequence for the given secondary structure

Inverse folding of RNA: biotechnology, design of biomolecules with predefined structures and functions

slide-29
SLIDE 29

Minimum free energy criterion; inverse folding

Sequences shown for the 1st to 5th trial:

  • UUUAGCCAGCGCGAGUCGUGCGGACGGGGUUAUCUCUGUCGGGCUAGGGCGC
  • GUGAGCGCGGGGCACAGUUUCUCAAGGAUGUAAGUUUUUGCCGUUUAUCUGG
  • UUAGCGAGAGAGGAGGCUUCUAGACCCAGCUCUCUGGGUCGUUGCUGAUGCG
  • CAUUGGUGCUAAUGAUAUUAGGGCUGUAUUCCUGUAUAGCGAUCAGUGUCCG
  • GUAGGCCCUCUUGACAUAAGAUUUUUCCAAUGGUGGGAGAUGGCCAUUGCAG

The inverse folding algorithm searches for sequences that form a given RNA secondary structure under the minimum free energy criterion.

slide-30
SLIDE 30

A mapping and its inversion

$S_k = \psi(I_j)$

$G_k = \psi^{-1}(S_k) = \{\, I_j \mid \psi(I_j) = S_k \,\}$

Space of genotypes: $\mathcal{I} = \{I_1, I_2, I_3, I_4, \dots, I_N\}$; Hamming metric

Space of phenotypes: $\mathcal{S} = \{S_1, S_2, S_3, S_4, \dots, S_M\}$; metric (not required)

$N \gg M$

slide-31
SLIDE 31

Degree of neutrality of neutral networks and the connectivity threshold

slide-32
SLIDE 32

A multi-component neutral network formed by a rare structure: $\lambda < \lambda_{cr}$

slide-33
SLIDE 33

A connected neutral network formed by a common structure: $\lambda > \lambda_{cr}$

slide-34
SLIDE 34

RNA 9:1456-1463, 2003

Evidence for neutral networks and shape space covering

slide-35
SLIDE 35

Evidence for neutral networks and intersection of aptamer functions

slide-36
SLIDE 36

1. Sequence space and shape space
2. Neutral networks
  • 3. Evolutionary optimization of structure
4. Suboptimal structures and kinetic folding
5. Comparison of kinetic folding and evolution
6. How to model evolution of kinetic folding?

slide-37
SLIDE 37

Evolution in silico

W. Fontana, P. Schuster, Science 280 (1998), 1451-1455

slide-38
SLIDE 38

Replication rate constant: $f_k = \gamma / [\alpha + \Delta d_S^{(k)}]$, with $\Delta d_S^{(k)} = d_H(S_k, S_\tau)$

Selection constraint: population size, N = # RNA molecules, is controlled by the flow: $N(t) \approx \bar{N} \pm \sqrt{\bar{N}}$

Mutation rate: p = 0.001 per site and replication

The flow reactor as a device for studies of evolution in vitro and in silico
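The replication rate constant decreases with the Hamming distance between the structure formed by a sequence and the target structure, and every replication copies each site with a small error probability. The sketch below illustrates both ingredients in Python. The constants in front of the bracket are not legible in this transcript, so the names `gamma` and `alpha` and their default values are assumptions, as are the function names and the uniform choice among the three other bases on mutation.

```python
import random


def structure_distance(structure, target):
    """Hamming distance between two dot-bracket strings of equal length."""
    if len(structure) != len(target):
        raise ValueError("structures must have equal length")
    return sum(a != b for a, b in zip(structure, target))


def replication_rate(structure, target, gamma=1.0, alpha=1.0):
    """Assumed form f_k = gamma / (alpha + d_H(S_k, S_target)); larger when closer."""
    return gamma / (alpha + structure_distance(structure, target))


def mutate(sequence, p=0.001, rng=random, alphabet="AUGC"):
    """Copy a sequence, replacing each site with probability p by another base."""
    copy = []
    for base in sequence:
        if rng.random() < p:
            copy.append(rng.choice([b for b in alphabet if b != base]))
        else:
            copy.append(base)
    return "".join(copy)


target = "((((....))))"
print(replication_rate(target, target))          # distance 0
print(replication_rate("(((......)))", target))  # distance 2, lower rate
print(mutate("GGGGAAAACCCC", p=0.5, rng=random.Random(7)))
```

In a full simulation the structure of each copy would come from a folding routine, which is deliberately left out here.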

slide-39
SLIDE 39

Randomly chosen initial structure

Phenylalanyl-tRNA as target structure

slide-40
SLIDE 40

In silico optimization in the flow reactor: Evolutionary Trajectory

slide-41
SLIDE 41

28 neutral point mutations during a long quasi-stationary epoch

Transition inducing point mutations change the molecular structure

Neutral point mutations leave the molecular structure unchanged

Neutral genotype evolution during phenotypic stasis

slide-42
SLIDE 42

A sketch of optimization on neutral networks

slide-43
SLIDE 43

1. Sequence space and shape space
2. Neutral networks
3. Evolutionary optimization of structure
  • 4. Suboptimal structures and kinetic folding
5. Comparison of kinetic folding and evolution
6. How to model evolution of kinetic folding?

slide-44
SLIDE 44

RNA secondary structures derived from a single sequence

slide-45
SLIDE 45

The Folding Algorithm

A sequence I specifies an energy ordered set of compatible structures S(I):

S(I) = {S0 , S1 , … , Sm , O}

A trajectory Tk(I) is a time ordered series of structures in S(I). A folding trajectory is defined by starting with the open chain O and ending with the global minimum free energy structure S0 or a metastable structure Sk which represents a local energy minimum:

$T_0(I) = \{O, S(1), \dots, S(t-1), S(t), S(t+1), \dots, S_0\}$

$T_k(I) = \{O, S(1), \dots, S(t-1), S(t), S(t+1), \dots, S_k\}$

Master equation

$$\frac{dP_k}{dt} = \sum_{i=0}^{m+1} \bigl( P_{ik}(t) - P_{ki}(t) \bigr) = \sum_{i=0}^{m+1} k_{ik} P_i - P_k \sum_{i=0}^{m+1} k_{ki}, \qquad k = 0, 1, \dots, m+1$$

Transition probabilities $P_{ij}(t) = \mathrm{Prob}\{S_i \to S_j\}$ are defined by

$P_{ij}(t) = P_i(t)\, k_{ij} = P_i(t)\, \exp(-\Delta G_{ij}/2RT) \,/\, \Sigma_i$

$P_{ji}(t) = P_j(t)\, k_{ji} = P_j(t)\, \exp(-\Delta G_{ji}/2RT) \,/\, \Sigma_j$

with $\Sigma_k = \sum_{i=1,\, i \ne k}^{m+2} \exp(-\Delta G_{ki}/2RT)$

The symmetric rule for transition rate parameters is due to Kawasaki (K. Kawasaki, Diffusion constants near the critical point for time-dependent Ising models. Phys. Rev. 145:224-230, 1966).

Formulation of kinetic RNA folding as a stochastic process
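A small numerical sketch can make the master equation concrete. The Python code below integrates it with explicit Euler steps for a toy system of three conformations: the two lowest structures of the 20-nucleotide example shown on a later slide and the open chain. Several things are assumptions made here and not taken from the talk: all three states are treated as mutual neighbours, the temperature is 37 °C, and the rates use the symmetric rule exp(-ΔG/2RT) without the normalizing sums, so that detailed balance holds and the stationary distribution is the Boltzmann distribution.

```python
import math

RT = 0.0019872 * 310.15  # kcal/mol at 37 degrees C (assumed temperature)


def kawasaki_rates(energies):
    """k[i][j] = exp(-(G_j - G_i) / 2RT) for i != j (unnormalized symmetric rule)."""
    n = len(energies)
    return [[0.0 if i == j else math.exp(-(energies[j] - energies[i]) / (2 * RT))
             for j in range(n)] for i in range(n)]


def integrate_master_equation(energies, p0, t_end=20.0, dt=0.001):
    """Explicit Euler integration of dP_k/dt = sum_i (P_i k_ik - P_k k_ki)."""
    k = kawasaki_rates(energies)
    n = len(energies)
    p = list(p0)
    for _ in range(int(round(t_end / dt))):
        dp = [sum(p[i] * k[i][j] - p[j] * k[j][i] for i in range(n))
              for j in range(n)]
        p = [pj + dt * d for pj, d in zip(p, dp)]
    return p


def boltzmann(energies):
    """Equilibrium probabilities proportional to exp(-G / RT)."""
    weights = [math.exp(-g / RT) for g in energies]
    z = sum(weights)
    return [w / z for w in weights]


energies = [-4.30, -3.50, 0.00]   # S0, S1 and the open chain O, in kcal/mol
start = [0.0, 0.0, 1.0]           # folding starts from the open chain
print(integrate_master_equation(energies, start))
print(boltzmann(energies))
```

For realistic conformation spaces the rate matrix is sparse, because only structures connected by one move are neighbours, and the equation is usually solved through the eigenvalues of that matrix instead of by Euler steps.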

slide-46
SLIDE 46

Base pair formation and base pair cleavage moves for nucleation and elongation of stacks

Corresponds to base pair distance: dP(S1,S2)

slide-47
SLIDE 47

Base pair shift move of class 1: shift inside internal loops or bulges

Base pair closure, opening and shift correspond to Hamming distance: dH(S1,S2)

slide-48
SLIDE 48

Two measures of distance in shape space: Hamming distance between structures, dH(Si,Sj) and base pair distance, dP(Si,Sj)

slide-49
SLIDE 49

[Figure: free energy ΔG of the local minimum Sh and the neighbouring suboptimal conformations S1(h), S2(h), S5(h), S6(h), S7(h) and S9(h).]

Search for local minima in conformation space

slide-50
SLIDE 50

[Figure: free energy ΔG along a "reaction coordinate" between two local minima, one of them Sk, separated by a saddle point T; next to it, the corresponding "barrier tree".]

Definition of a 'barrier tree'

slide-51
SLIDE 51

Sequence: CUGCGGCUUUGGCUCUAGCC

Structure               Free energy [kcal / mole]
....((((........))))    -4.30
(((.(((....))).)))..    -3.50
(((..((....))..)))..    -3.10
..........(((....)))    -2.80
..(((((....)))...)).    -2.20
....(((..........)))    -2.20
((..(((....)))..))..    -2.00
..((.((....))....)).    -1.60
....(((....)))......    -1.60
.....(((........))).    -1.50
.((.(((....))).))...    -1.40
....((((..(...).))))    -1.40
.((..((....))..))...    -1.00
(((.(((....)).))))..    -0.90
(((.((......)).)))..    -0.90
....((((..(....)))))    -0.80
.....((....)).......    -0.80
..(.(((....)))).....    -0.60
....(((....)).).....    -0.60
(((..(......)..)))..    -0.50
..(((((....)).)..)).    -0.50
..(.(((....))).)....    -0.40
..((.......)).......    -0.30
..........((......))    -0.30
...........((....)).    -0.30
(((.(((....)))).))..    -0.20
....(((.(.......))))    -0.20
....(((..((....)))))    -0.20
..(..((....))..)....     0.00
....................     0.00
.(..(((....)))..)...     0.10

M.T. Wolfinger, W.A. Svrcek-Seiler, C. Flamm, I.L. Hofacker, P.F. Stadler. 2004. J.Phys.A: Math.Gen. 37:4731-4741.

slide-52
SLIDE 52


slide-53
SLIDE 53

Arrhenius kinetics

M.T. Wolfinger, W.A. Svrcek-Seiler, C. Flamm, I.L. Hofacker, P.F. Stadler. 2004. J.Phys.A: Math.Gen. 37:4731-4741.

slide-54
SLIDE 54

Arrhenius kinetics

Exact solution of the kinetic equation

M.T. Wolfinger, W.A. Svrcek-Seiler, C. Flamm, I.L. Hofacker, P.F. Stadler. 2004. J.Phys.A: Math.Gen. 37:4731-4741.

slide-55
SLIDE 55

RNA secondary structures derived from a single sequence

slide-56
SLIDE 56

[Figure: structure Sk with its neutral network Gk drawn inside the compatible set Ck: $G_k \subset C_k$]

The compatible set Ck of a structure Sk consists of all sequences that form Sk either as their minimum free energy structure (the neutral network Gk) or as one of their suboptimal structures.
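Compatibility needs no energy model: a sequence can form a structure if every base pair of the structure is matched by one of the six allowed pairs. The size of the compatible set then follows directly, since each of the p base pairs can be realized in 6 ways and each of the u unpaired positions in 4 ways, giving 4^u · 6^p sequences. The Python sketch below checks this count by enumeration for a tiny hairpin; the function names are illustrative and not from the talk.

```python
from itertools import product

ALLOWED_PAIRS = {"AU", "CG", "GC", "GU", "UA", "UG"}


def base_pairs(structure):
    """List of base pairs (i, j), 0-based, encoded by a dot-bracket string."""
    stack, pairs = [], []
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            pairs.append((stack.pop(), pos))
    return pairs


def is_compatible(sequence, structure):
    """True if the sequence can form every base pair of the structure."""
    return (len(sequence) == len(structure) and
            all(sequence[i] + sequence[j] in ALLOWED_PAIRS
                for i, j in base_pairs(structure)))


def compatible_set_size(structure):
    """Closed form: 4 choices per unpaired position, 6 per base pair."""
    p = len(base_pairs(structure))
    u = len(structure) - 2 * p
    return 4 ** u * 6 ** p


def enumerate_compatible(structure):
    """Brute-force listing of the compatible set (only feasible for short chains)."""
    candidates = ("".join(s) for s in product("ACGU", repeat=len(structure)))
    return [s for s in candidates if is_compatible(s, structure)]


hairpin = "(...)"
print(compatible_set_size(hairpin), len(enumerate_compatible(hairpin)))
print(compatible_set_size("....((((........))))"))
```

The neutral network Gk is the subset of these sequences for which a folding routine returns Sk as the minimum free energy structure; determining it requires an energy model and is not attempted here.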

slide-57
SLIDE 57

[Figure: structure S0 and structure S1 with their compatible sets C0 and C1]

The intersection of two compatible sets is always non-empty: $C_0 \cap C_1 \neq \emptyset$
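One way to see why the intersection cannot be empty is constructive. Each position takes part in at most one base pair per structure, so the graph formed by the base pairs of both structures consists only of paths and cycles of even length and can be coloured with two colours. Assigning G to one colour and C to the other makes every pair of both structures a GC or CG pair. The Python sketch below follows this argument; it is an illustration with hypothetical function names, not the proof referred to in the talk, and the choice of A for positions unpaired in both structures is arbitrary.

```python
from collections import deque

ALLOWED_PAIRS = {"AU", "CG", "GC", "GU", "UA", "UG"}


def base_pairs(structure):
    """List of base pairs (i, j), 0-based, encoded by a dot-bracket string."""
    stack, pairs = [], []
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            pairs.append((stack.pop(), pos))
    return pairs


def is_compatible(sequence, structure):
    return (len(sequence) == len(structure) and
            all(sequence[i] + sequence[j] in ALLOWED_PAIRS
                for i, j in base_pairs(structure)))


def common_compatible_sequence(s0, s1):
    """Build one sequence compatible with both structures by 2-colouring."""
    if len(s0) != len(s1):
        raise ValueError("structures must have equal length")
    n = len(s0)
    neighbours = [set() for _ in range(n)]
    for i, j in base_pairs(s0) + base_pairs(s1):
        neighbours[i].add(j)
        neighbours[j].add(i)
    colour = [None] * n
    for start in range(n):
        if colour[start] is not None or not neighbours[start]:
            continue
        colour[start] = 0
        queue = deque([start])
        while queue:                      # breadth-first 2-colouring
            v = queue.popleft()
            for w in neighbours[v]:
                if colour[w] is None:
                    colour[w] = 1 - colour[v]
                    queue.append(w)
                elif colour[w] == colour[v]:
                    raise ValueError("odd cycle, not expected for two structures")
    # colour 0 -> G, colour 1 -> C, positions unpaired in both structures -> A
    return "".join("A" if c is None else "GC"[c] for c in colour)


s0 = "....((((........))))"
s1 = "(((.(((....))).))).."
seq = common_compatible_sequence(s0, s1)
print(seq, is_compatible(seq, s0), is_compatible(seq, s1))
```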

slide-58
SLIDE 58

Reference for the definition of the intersection and the proof of the intersection theorem

slide-59
SLIDE 59

[Figure: the RNA molecule JN1LH drawn in its alternative conformations from the 5'-end to the 3'-end, with positions 3, 13, 23, 33 and 44 marked and free energies of -28.6, -31.8, -28.2 and -28.6 kcal·mol^-1; panels labelled 1D, 2D and R.]

J.H.A. Nagel, C. Flamm, I.L. Hofacker, K. Franke, M.H. de Smit, P. Schuster, and C.W.A. Pleij. Structural parameters affecting the kinetic competition of RNA hairpin formation, Nucleic Acids Res., in press 2005.

An RNA switch

slide-60
SLIDE 60

[Figure: barrier tree with leaves numbered 1 to 50 and barrier heights on the branches; vertical axis: free energy [kcal / mole], from -26.0 to -50.0.]

J1LH barrier tree

slide-61
SLIDE 61

A ribozyme switch

E.A. Schultes, D.P. Bartel, Science 289 (2000), 448-452

slide-62
SLIDE 62

Two ribozymes of chain length n = 88 nucleotides: an artificial ligase (A) and a natural cleavage ribozyme of hepatitis-δ-virus (B)

slide-63
SLIDE 63

The sequence at the intersection: an RNA molecule that is 88 nucleotides long and can form both structures

slide-64
SLIDE 64

Two neutral walks through sequence space with conservation of structure and catalytic activity

slide-65
SLIDE 65

1. Sequence space and shape space
2. Neutral networks
3. Evolutionary optimization of structure
4. Suboptimal structures and kinetic folding
  • 5. Comparison of kinetic folding and evolution
6. How to model evolution of kinetic folding?

slide-66
SLIDE 66

Kinetic Folding

  • Compatible structures: set of structures compatible with a given sequence
  • Stability restriction: conformation space
  • Folding trajectory in conformation space: time ordered series of structures
  • Folding process: average of trajectories on the ensemble level
  • Criterion: minimizing free energy

Evolutionary optimization

  • Compatible sequences: set of sequences compatible with a given structure
  • mfe restriction: neutral network
  • Genealogy on a neutral network: time ordered series of sequences
  • Optimization process: average over genealogies on the population level
  • Criterion: maximizing fitness

slide-67
SLIDE 67

1. Sequence space and shape space
2. Neutral networks
3. Evolutionary optimization of structure
4. Suboptimal structures and kinetic folding
5. Comparison of kinetic folding and evolution
  • 6. How to model evolution of kinetic folding?
slide-68
SLIDE 68

Construction of a landscape for the evolution of the free energy spectrum

slide-69
SLIDE 69

Construction of a landscape for the evolution of barrier trees

slide-70
SLIDE 70

Prediction of RNA kinetic folding of secondary structures based on Arrhenius kinetics

slide-71
SLIDE 71

Prediction of RNA kinetic folding of secondary structures based on Arrhenius kinetics

slide-72
SLIDE 72

Prediction of RNA kinetic folding of secondary structures based on Arrhenius kinetics

slide-73
SLIDE 73

Prediction of RNA kinetic folding of secondary structures based on Arrhenius kinetics

slide-74
SLIDE 74

Prediction of RNA kinetic folding of secondary structures based on Arrhenius kinetics

slide-75
SLIDE 75

Design of RNA molecules with predefined folding kinetics

slide-76
SLIDE 76

Acknowledgement of support

  • Fonds zur Förderung der wissenschaftlichen Forschung (FWF): Projects No. 09942, 10578, 11065, 13093, 13887, and 14898
  • Wiener Wissenschafts-, Forschungs- und Technologiefonds (WWTF): Project No. Mat05
  • Jubiläumsfonds der Österreichischen Nationalbank: Project No. Nat-7813
  • European Commission: Contracts No. 98-0189, 12835 (NEST)
  • Austrian Genome Research Program – GEN-AU: Bioinformatics Network (BIN)
  • Österreichische Akademie der Wissenschaften
  • Siemens AG, Austria
  • Universität Wien and the Santa Fe Institute


slide-77
SLIDE 77

Coworkers

  • Peter Stadler, Bärbel M. Stadler, Universität Leipzig, GE
  • Camille Stephan-Otto Attolini, Athanasius Bompfünewerer
  • Jord Nagel, Kees Pleij, Universiteit Leiden, NL
  • Walter Fontana, Harvard Medical School, MA
  • Christian Reidys, Christian Forst, Los Alamos National Laboratory, NM
  • Ulrike Göbel, Walter Grüner, Stefan Kopp, Jaqueline Weber, Institut für Molekulare Biotechnologie, Jena, GE
  • Ivo L. Hofacker, Christoph Flamm, Andreas Svrček-Seiler, Universität Wien, AT
  • Kurt Grünberger, Michael Kospach, Andreas Wernitznig, Stefanie Widder, Michael Wolfinger, Stefan Wuchty, Universität Wien, AT
  • Jan Cupal, Stefan Bernhart, Lukas Endler, Ulrike Langhammer, Rainer Machne, Ulrike Mückstein, Hakim Tafer, Thomas Taylor, Universität Wien, AT


slide-78
SLIDE 78

Web-Page for further information: http://www.tbi.univie.ac.at/~pks

slide-79
SLIDE 79