Prediction and Analysis of RNA Secondary Structures Peter Schuster - - PowerPoint PPT Presentation




slide-1
SLIDE 1
slide-2
SLIDE 2

Prediction and Analysis of RNA Secondary Structures

Peter Schuster Institut für Theoretische Chemie und Molekulare Strukturbiologie der Universität Wien RNA Secondary Structures in Dijon Dijon, 24.– 26.06.2002

slide-3
SLIDE 3

Three-dimensional structure of phenylalanyl-transfer-RNA

slide-4
SLIDE 4

RNA Secondary Structures and their Properties

RNA secondary structures are listings of Watson-Crick and GU wobble base pairs that are free of knots and pseudoknots. Secondary structures are folding intermediates in the formation of full three-dimensional structures.

D.Thirumalai, N.Lee, S.A.Woodson, and D.K.Klimov. Annu.Rev.Phys.Chem. 52:751-762 (2001)

slide-5
SLIDE 5

[Figure: three panels showing the sequence, the secondary structure, and the symbolic notation, each with the 5'-end and 3'-end labeled and positions 10 to 70 marked]

Definition and formation of the secondary structure of phenylalanyl-tRNA

slide-6
SLIDE 6

[Figure: circle diagram with the 5'-end, the 3'-end, and positions 10 to 70 marked]

Circle representation of tRNAphe

slide-7
SLIDE 7

[Figure: tree diagram with the 5'-end, the 3'-end, and the virtual root labeled]

Tree representation of tRNAphe

slide-8
SLIDE 8

[Figure: mountain plot running from the 5'-end to the 3'-end, with positions 10 to 76 marked]

Mountain representation of tRNAphe
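The mountain representation can be derived directly from a symbolic notation. The following is a minimal sketch that assumes the symbolic notation is a dot-bracket string (parentheses for paired positions, dots for unpaired ones); the function names and the short example structure are illustrative and are not taken from the slides.

```python
def pair_table(structure):
    """Return a list mapping each position to its pairing partner (0-based), or -1 if unpaired."""
    table = [-1] * len(structure)
    stack = []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure: unmatched ')'")
            j = stack.pop()
            table[i], table[j] = j, i
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r}")
    if stack:
        raise ValueError("unbalanced structure: unmatched '('")
    return table


def mountain_heights(structure):
    """Height after each position: opening brackets so far minus closing brackets so far."""
    heights = []
    h = 0
    for ch in structure:
        if ch == "(":
            h += 1
        elif ch == ")":
            h -= 1
        heights.append(h)
    return heights


# Hypothetical hairpin with two base pairs and a two-nucleotide loop.
print(pair_table("((..))"))
print(mountain_heights("((..))"))
```

Plotting the heights against the position gives the mountain plot: stacks appear as slopes, loops as plateaus.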

slide-9
SLIDE 9

Mountain representation used in structure prediction of medium size RNA molecules

slide-10
SLIDE 10

Mountain representation used in structure prediction of large RNA molecules

slide-11
SLIDE 11

[Figure: barrier tree with leaves S0 and S1; the numeric labels give structure indices and energy values. The free energy axis distinguishes three notions of structure:]

Minimum free energy structure: S0 (T = 0 K)
Suboptimal structures: S0, S1, ..., S10 (T > 0 K)
Kinetic structures (T > 0 K, t finite)

Different notions of RNA structure

slide-12
SLIDE 12

RNA Minimum Free Energy Structures

Efficient algorithms based on dynamic programming are available for the computation of secondary structures for given sequences. Inverse folding algorithms compute sequences for given secondary structures.

M. Zuker and P. Stiegler. Nucleic Acids Res. 9:133-148 (1981)
Vienna RNA Package: http://www.tbi.univie.ac.at (includes inverse folding, suboptimal structures, kinetic folding, etc.)
I.L. Hofacker, W. Fontana, P.F. Stadler, L.S. Bonhoeffer, M. Tacker, and P. Schuster. Mh.Chem. 125:167-188 (1994)
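To illustrate the dynamic programming idea, here is a minimal sketch that maximizes the number of base pairs (a Nussinov-style recursion). It is a deliberately simplified stand-in: the algorithms cited above minimize a free energy built from loop and stacking contributions, which this sketch does not attempt. The function name, the `min_loop` parameter, and the example sequence are assumptions made for illustration.

```python
# Watson-Crick and GU wobble pairs.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_pairs_fold(seq, min_loop=3):
    """Return (number of pairs, dot-bracket string) maximizing the base pair count."""
    n = len(seq)
    if n == 0:
        return 0, ""
    # best[i][j] = maximum number of pairs on the subsequence i..j (inclusive).
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: position j stays unpaired
            for k in range(i, j - min_loop):  # case 2: j pairs with some k
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score

    # Traceback: recover one optimal structure from the filled table.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue
        if best[i][j] == best[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in PAIRS:
                left = best[i][k - 1] if k > i else 0
                if best[i][j] == left + 1 + best[k + 1][j - 1]:
                    structure[k], structure[j] = "(", ")"
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return best[0][n - 1], "".join(structure)


count, structure = max_pairs_fold("GGGAAAUCCC")  # hypothetical 10-mer
print(count, structure)
```

The table is filled in order of increasing subsequence length, so every subproblem is solved before it is needed; the triple loop gives cubic running time in the chain length.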

slide-13
SLIDE 13

UUUAGCCAGCGCGAGUCGUGCGGACGGGGUUAUCUCUGUCGGGCUAGGGCGC
GUGAGCGCGGGGCACAGUUUCUCAAGGAUGUAAGUUUUUGCCGUUUAUCUGG
UUAGCGAGAGAGGAGGCUUCUAGACCCAGCUCUCUGGGUCGUUGCUGAUGCG
CAUUGGUGCUAAUGAUAUUAGGGCUGUAUUCCUGUAUAGCGAUCAGUGUCCG
GUAGGCCCUCUUGACAUAAGAUUUUUCCAAUGGUGGGAGAUGGCCAUUGCAG

[Figure: the five sequences, labeled 1st to 5th trial, related to a structure by the minimum free energy criterion in one direction and by inverse folding in the other]

The inverse folding algorithm searches for sequences that form a given RNA secondary structure under the minimum free energy criterion.

slide-14
SLIDE 14

UUUAGCCAGCGCGAGUCGUGCGGACGGGGUUAUCUCUGUCGGGCUAGGGCGC
GUGAGCGCGGGGCACAGUUUCUCAAGGAUGUAAGUUUUUGCCGUUUAUCUGG
UUAGCGAGAGAGGAGGCUUCUAGACCCAGCUCUCUGGGUCGUUGCUGAUGCG
CAUUGGUGCUAAUGAUAUUAGGGCUGUAUUCCUGUAUAGCGAUCAGUGUCCG
GUAGGCCCUCUUGACAUAAGAUUUUUCCAAUGGUGGGAGAUGGCCAUUGCAG

[Figure: the criterion of minimum free energy maps sequence space onto shape space]

slide-15
SLIDE 15

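[Figure: four sequence fragments with the flanking segments ....GC and UC.... and the variable dinucleotides CA, GU, GA, and CU, connected by edges labeled with the Hamming distances $d_H = 1$, $d_H = 1$, and $d_H = 2$]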

Point mutations as moves in sequence space

slide-16
SLIDE 16

[Figure: two aligned sequences that differ at four positions; the differing letters shown are G, A, G, T and A, C, A, C]

Hamming distance: $d_H(S_1, S_2) = 4$

(i) $d_H(S_1, S_1) = 0$
(ii) $d_H(S_1, S_2) = d_H(S_2, S_1)$
(iii) $d_H(S_1, S_3) \le d_H(S_1, S_2) + d_H(S_2, S_3)$

The Hamming distance induces a metric in sequence space
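As a small illustration, the Hamming distance and its metric properties can be checked numerically. This is a minimal sketch; the three short sequences are hypothetical examples chosen for the check, not data from the slides.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Hypothetical sequences differing only in the two central positions.
s1, s2, s3 = "GCCAUC", "GCGAUC", "GCGUUC"

print(hamming(s1, s1))  # identity: zero distance to itself
print(hamming(s1, s2), hamming(s2, s1))  # symmetry
print(hamming(s1, s3), hamming(s1, s2) + hamming(s2, s3))  # triangle inequality
```

Each point mutation changes the distance by exactly one, which is why single mutations correspond to moves between neighboring points of sequence space.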

slide-17
SLIDE 17

[Figure: graph of the binary sequences of length 5, with nodes labeled by their decimal equivalents 1 to 31 and arranged by mutant class 1 to 5]

Binary sequences are encoded by their decimal equivalents: C = 0 and G = 1, for example, "0" ≡ 00000 = CCCCC, "14" ≡ 01110 = CGGGC, "29" ≡ 11101 = GGGCG, etc.

Sequence space of binary sequences of chain length n=5
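The decimal encoding can be reproduced in a few lines. This is a minimal sketch; the function names are illustrative, and reading the mutant class as the Hamming distance from the all-C reference sequence "0" is an assumption about the figure.

```python
TO_BITS = str.maketrans("CG", "01")
TO_BASES = str.maketrans("01", "CG")


def encode(seq):
    """Decimal equivalent of a binary (C/G) sequence, with C = 0 and G = 1."""
    return int(seq.translate(TO_BITS), 2)


def decode(value, n=5):
    """Binary (C/G) sequence of chain length n for a decimal label."""
    return format(value, f"0{n}b").translate(TO_BASES)


def mutant_class(value):
    """Number of G positions, i.e. the Hamming distance from CCCCC (assumed reference)."""
    return bin(value).count("1")


print(encode("CGGGC"), encode("GGGCG"))
print(decode(0), decode(14), decode(29))
print(mutant_class(14), mutant_class(29))
```

Two labels are neighbors in sequence space exactly when their binary representations differ in a single bit.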

slide-18
SLIDE 18

$S_k = \psi(I_{\cdot})$

$f_k = f(S_k)$

[Figure: sequence space, phenotype space, and the non-negative numbers]

Mapping from sequence space into phenotype space and into fitness values

slide-19
SLIDE 19

$S_k = \psi(I_{\cdot})$

$f_k = f(S_k)$

[Figure: sequence space, phenotype space, and the non-negative numbers]

slide-20
SLIDE 20

$S_k = \psi(I_{\cdot})$

$f_k = f(S_k)$

[Figure: sequence space, phenotype space, and the non-negative numbers]

slide-21
SLIDE 21

Neutral networks of small RNA molecules can be computed by exhaustive folding of complete sequence spaces, i.e. all RNA sequences of a given chain length n. The number of sequences, $N = 4^n$, grows exponentially with chain length and quickly becomes prohibitive for numerical computation. Neutral networks can instead be modelled by random graphs in sequence space. In this approach, nodes are inserted randomly into sequence space until the size of the pre-image, i.e. the number of neutral sequences, matches that of the neutral network to be studied.
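The random graph construction can be sketched as follows. This is a toy version under stated assumptions: nodes are drawn uniformly at random, two sequences are connected when they differ by a single point mutation, and the function names, the seed, and the example sizes are illustrative choices rather than the procedure used for the slides.

```python
import random


def random_neutral_network(n, size, alphabet="AUGC", seed=0):
    """Insert random sequences of length n until the network holds `size` distinct nodes."""
    if size > len(alphabet) ** n:
        raise ValueError("requested more nodes than sequence space contains")
    rng = random.Random(seed)
    nodes = set()
    while len(nodes) < size:
        nodes.add("".join(rng.choice(alphabet) for _ in range(n)))
    return nodes


def components(nodes, alphabet="AUGC"):
    """Connected components under single point mutations, largest first."""
    unseen = set(nodes)
    found = []
    while unseen:
        start = unseen.pop()
        component = {start}
        frontier = [start]
        while frontier:
            s = frontier.pop()
            for pos in range(len(s)):
                for base in alphabet:
                    if base == s[pos]:
                        continue
                    t = s[:pos] + base + s[pos + 1:]
                    if t in unseen:  # neighbor belongs to the network and is new
                        unseen.remove(t)
                        component.add(t)
                        frontier.append(t)
        found.append(component)
    return sorted(found, key=len, reverse=True)


network = random_neutral_network(n=4, size=100)  # 100 of the 4**4 = 256 sequences
parts = components(network)
print(len(network), [len(c) for c in parts])
```

Repeating the experiment with different network sizes shows how the largest component grows as a larger fraction of sequence space is occupied.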

slide-22
SLIDE 22

Random graph approach to neutral networks Sketch of sequence space Step 00

slide-23
SLIDE 23

Random graph approach to neutral networks Sketch of sequence space Step 01

slide-24
SLIDE 24

Random graph approach to neutral networks Sketch of sequence space Step 02

slide-25
SLIDE 25

Random graph approach to neutral networks Sketch of sequence space Step 03

slide-26
SLIDE 26

Random graph approach to neutral networks Sketch of sequence space Step 04

slide-27
SLIDE 27

Random graph approach to neutral networks Sketch of sequence space Step 05

slide-28
SLIDE 28

Random graph approach to neutral networks Sketch of sequence space Step 10

slide-29
SLIDE 29

Random graph approach to neutral networks Sketch of sequence space Step 15

slide-30
SLIDE 30

Random graph approach to neutral networks Sketch of sequence space Step 25

slide-31
SLIDE 31

Random graph approach to neutral networks Sketch of sequence space Step 50

slide-32
SLIDE 32

Random graph approach to neutral networks Sketch of sequence space Step 75

slide-33
SLIDE 33

Random graph approach to neutral networks Sketch of sequence space Step 100

slide-34
SLIDE 34

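Degree of neutrality of a single sequence (value shown in the figure): $\lambda_j = 12/27$

Mean degree of neutrality of the neutral network $G_k$: $\bar{\lambda}_k = \frac{\sum_{j \in G_k} \lambda_j^{(k)}}{|G_k|}$

Neutral network as pre-image of the structure $S_k$: $G_k = \psi^{-1}(S_k) = \{\, I_j \mid \psi(I_j) = S_k \,\}$

Connectivity threshold: $\lambda_{cr} = 1 - \kappa^{-1/(\kappa-1)}$

$\bar{\lambda}_k > \lambda_{cr}$: network $G_k$ is connected
$\bar{\lambda}_k < \lambda_{cr}$: network $G_k$ is not connected

Alphabet size $\kappa$: AUGC, $\kappa = 4$

κ    λcr
2    0.5
3    0.4226
4    0.3700

Mean degree of neutrality and connectivity of neutral networks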
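The tabulated thresholds can be reproduced numerically. This is a minimal sketch that reads the connectivity threshold as $\lambda_{cr} = 1 - \kappa^{-1/(\kappa-1)}$, a reading that matches the three values listed for alphabet sizes 2, 3, and 4; the function names are illustrative.

```python
def connectivity_threshold(kappa):
    """Critical mean degree of neutrality for an alphabet of size kappa."""
    return 1.0 - kappa ** (-1.0 / (kappa - 1))


def is_connected(mean_neutrality, kappa=4):
    """A network is predicted to be connected when its mean degree of neutrality exceeds the threshold."""
    return mean_neutrality > connectivity_threshold(kappa)


for kappa in (2, 3, 4):
    print(kappa, round(connectivity_threshold(kappa), 4))

# A sequence with 12 neutral neighbors out of 27 has a degree of neutrality of 12/27.
print(is_connected(12 / 27, kappa=4))
```

For the natural four-letter alphabet the threshold is about 0.37, so a network whose sequences have on average more than 37% neutral neighbors is expected to be connected in the random graph model.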

slide-35
SLIDE 35

Giant Component

A multi-component neutral network

slide-36
SLIDE 36

A connected neutral network

slide-37
SLIDE 37

Suboptimal RNA Secondary Structures

Michael Zuker. On finding all suboptimal foldings of an RNA molecule. Science 244 (1989), 48-52
Stefan Wuchty, Walter Fontana, Ivo L. Hofacker, Peter Schuster. Complete suboptimal folding of RNA and the stability of secondary structures. Biopolymers 49 (1999), 145-165

slide-38
SLIDE 38

Sequence: AAAGGGCACAGGGUGAUUUCAAUAAUUUUA

[Figure: minimum free energy structure with the 5' and 3' ends labeled]

Total number of structures including all suboptimal conformations, stable and unstable (with $\Delta G^0 > 0$): #conformations = 1 416 661

Example of a small RNA molecule: n=30

slide-39
SLIDE 39

Density of states of suboptimal structures of the RNA molecule with the sequence: AAAGGGCACAGGGUGAUUUCAAUAAUUUUA

slide-40
SLIDE 40

Partition Function of RNA Secondary Structures

John S. McCaskill. The equilibrium partition function and base pair binding probabilities for RNA secondary structure. Biopolymers 29 (1990), 1105-1119
Ivo L. Hofacker, Walter Fontana, Peter F. Stadler, L. Sebastian Bonhoeffer, Manfred Tacker, Peter Schuster. Fast folding and comparison of RNA secondary structures. Monatshefte für Chemie 125 (1994), 167-188

slide-41
SLIDE 41

[Figure: secondary structure with the 5' and 3' ends labeled]

Example of a small RNA molecule with two low-lying suboptimal conformations which contribute substantially to the partition function

UUGGAGUACACAACCUGUACACUCUUUC

Example of a small RNA molecule: n=28

slide-42
SLIDE 42

[Figure: dot plot with the sequence UUGGAGUACACAACCUGUACACUCUUUC along the axes and the 5' and 3' ends labeled]

Minimum free energy configuration: $\Delta G = -5.39$ kcal/mole
First suboptimal configuration: $\Delta E_{0 \to 1} = 0.50$ kcal/mole
Second suboptimal configuration: $\Delta E_{0 \to 2} = 0.55$ kcal/mole

"Dot plot" of the minimum free energy structure (lower triangle) and the partition function (upper triangle) of a small RNA molecule (n=28) with low-energy suboptimal configurations
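To see why suboptimal configurations that lie only about 0.5 kcal/mole above the minimum matter, one can compare their Boltzmann weights. This is a minimal sketch under two assumptions not stated on the slide: a temperature of 37 °C (310.15 K), and a partition function restricted to just these three configurations, which overstates each population because all other structures are ignored.

```python
import math

R_KCAL = 1.98720e-3  # gas constant in kcal/(mol K)


def boltzmann_populations(relative_energies, temperature=310.15):
    """Equilibrium populations for states given by energies relative to the ground state (kcal/mole)."""
    rt = R_KCAL * temperature
    weights = [math.exp(-energy / rt) for energy in relative_energies]
    partition_sum = sum(weights)
    return [w / partition_sum for w in weights]


# Ground state plus the two suboptimal configurations at 0.50 and 0.55 kcal/mole.
populations = boltzmann_populations([0.0, 0.50, 0.55])
for label, p in zip(("mfe", "first suboptimal", "second suboptimal"), populations):
    print(f"{label}: {p:.3f}")
```

With energy gaps smaller than RT (about 0.62 kcal/mole at this temperature), the two suboptimal configurations together carry a weight comparable to that of the minimum free energy configuration.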

slide-43
SLIDE 43

[Figure: three panels showing the sequence, the secondary structure, and the symbolic notation, each with the 5'-end and 3'-end labeled and positions 10 to 70 marked]

Phenylalanyl-tRNA as an example for the computation of the partition function

slide-44
SLIDE 44

tRNAphe without modified bases

First suboptimal configuration: $\Delta E_{0 \to 1} = 0.43$ kcal/mole

[Figure: dot plot with the 5' and 3' ends labeled]

slide-45
SLIDE 45

tRNAphe with modified bases

First suboptimal configuration: $\Delta E_{0 \to 1} = 0.94$ kcal/mole

[Figure: dot plot with the sequence GCGGAUUUAGCUCAGDDGGGAGAGCMCCAGACUGAAYAUCUGGAGMUCCUGUGTPCGAUCCACAGAAUUCGCACCA along the axes and the 5' and 3' ends labeled]

slide-46
SLIDE 46

Kinetic Folding of RNA Secondary Structures

Christoph Flamm, Walter Fontana, Ivo L. Hofacker, Peter Schuster. RNA folding kinetics at elementary step resolution. RNA 6:325-338, 2000
Christoph Flamm, Ivo L. Hofacker, Sebastian Maurer-Stroh, Peter F. Stadler, Martin Zehl. Design of multistable RNA molecules. RNA 7:254-265, 2001

slide-47
SLIDE 47

The Folding Algorithm

A sequence I specifies an energy-ordered set of compatible structures S(I):

$S(I) = \{S_0, S_1, \ldots, S_m, O\}$

A trajectory $T_k(I)$ is a time-ordered series of structures in S(I). A folding trajectory starts with the open chain O and ends with the global minimum free energy structure $S_0$ or with a metastable structure $S_k$ that represents a local energy minimum:

$T_0(I) = \{O, S(1), \ldots, S(t-1), S(t), S(t+1), \ldots, S_0\}$
$T_k(I) = \{O, S(1), \ldots, S(t-1), S(t), S(t+1), \ldots, S_k\}$

Transition probabilities $P_{ij}(t) = \mathrm{Prob}\{S_i \to S_j\}$ are defined by

$P_{ij}(t) = P_i(t)\, k_{ij} = P_i(t)\, \exp(-\Delta G_{ij}/2RT) \,/\, \Sigma_i$
$P_{ji}(t) = P_j(t)\, k_{ji} = P_j(t)\, \exp(-\Delta G_{ji}/2RT) \,/\, \Sigma_j$

with $\Sigma_k = \sum_{i=1,\, i \neq k}^{m+2} \exp(-\Delta G_{ki}/2RT)$

The symmetric rule for transition rate parameters is due to Kawasaki (K. Kawasaki, Diffusion constants near the critical point for time dependent Ising models. Phys.Rev. 145:224-230, 1966).

Formulation of kinetic RNA folding as a stochastic process
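The stochastic process can be sketched on a toy state space. This is a minimal sketch, assuming a temperature of 37 °C and a hypothetical set of four states with made-up free energies; every state may jump to every other state, following the sum over all other states in the normalization, whereas a realistic simulation would restrict moves to structures that differ by one elementary step.

```python
import math
import random

RT = 1.98720e-3 * 310.15  # kcal/mole at an assumed 37 degrees Celsius


def transition_probabilities(energies, k, rt=RT):
    """Probabilities of leaving state k, with weights exp(-dG_ki / 2RT) normalized by their sum."""
    weights = {
        i: math.exp(-(g - energies[k]) / (2.0 * rt))
        for i, g in energies.items()
        if i != k
    }
    sigma_k = sum(weights.values())
    return {i: w / sigma_k for i, w in weights.items()}


def folding_trajectory(energies, start="O", target="S0", seed=0, max_steps=10000):
    """Random walk from the open chain until the target structure is reached."""
    rng = random.Random(seed)
    state = start
    trajectory = [state]
    for _ in range(max_steps):
        if state == target:
            break
        probs = transition_probabilities(energies, state)
        states = list(probs)
        state = rng.choices(states, weights=[probs[s] for s in states])[0]
        trajectory.append(state)
    return trajectory


# Hypothetical free energies in kcal/mole; O is the open chain.
energies = {"O": 0.0, "S2": -1.0, "S1": -2.0, "S0": -3.0}
print(transition_probabilities(energies, "O"))
print(folding_trajectory(energies))
```

Averaging many such trajectories, started with different seeds, yields the fraction of molecules in each structure as a function of the number of steps.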

slide-48
SLIDE 48

[Figure: two examples each of base pair formation and base pair cleavage]

Base pair formation and base pair cleavage moves for nucleation and elongation of stacks

slide-49
SLIDE 49

Base pair shift

Base pair shift move of class 1: Shift inside internal loops or bulges

slide-50
SLIDE 50

Base pair shift

Base pair shift move of class 2: Shift involving free ends

slide-51
SLIDE 51

Examples of rearrangements through consecutive shift moves

slide-52
SLIDE 52

Mean folding curves for three small RNA molecules with different folding behavior

slide-53
SLIDE 53

[Figure: free energy G of the local minimum $S_h$ and of the surrounding suboptimal conformations $S_1^{(h)}$, $S_2^{(h)}$, $S_5^{(h)}$, $S_6^{(h)}$, $S_7^{(h)}$, and $S_9^{(h)}$]

Search for local minima in conformation space
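A search for local minima can be sketched as a steepest-descent walk on a graph of conformations. This is a minimal sketch: the toy energies and the neighbor lists are hypothetical, and in a real application the neighbors would be the structures reachable by a single elementary move.

```python
def gradient_walk(start, energy, neighbors):
    """Follow the steepest descent in free energy until no neighbor is lower."""
    current = start
    while True:
        lower = [s for s in neighbors[current] if energy[s] < energy[current]]
        if not lower:
            return current  # local minimum reached
        current = min(lower, key=lambda s: energy[s])


# Hypothetical conformations with free energies in kcal/mole.
energy = {"a": -1.0, "b": -2.5, "c": -4.0, "d": -0.5, "e": -3.0}
neighbors = {
    "a": ["b", "d"],
    "b": ["a", "c"],
    "c": ["b"],
    "d": ["a", "e"],
    "e": ["d"],
}

print(gradient_walk("a", energy, neighbors))
print(gradient_walk("d", energy, neighbors))
```

Conformations whose walks end in the same structure belong to the basin of that local minimum.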

slide-54
SLIDE 54

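[Figure: free energy $\Delta G^0$ plotted along a "reaction coordinate", next to the corresponding "barrier tree" on the same free energy axis; the local minima $S_k$ and $S_l$ are connected through the saddle point $T_{lk}$]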

slide-55
SLIDE 55

I1 = ACUGAUCGUAGUCAC

[Figure labels: S0, S1, S2, S3, O]

Example of an inefficiently folding small RNA molecule with n = 15

slide-56
SLIDE 56

I2 = AUUGAGCAUAUUCAC

[Figure labels: S0, S1, S4, S2, S3, O]

Example of an easily folding small RNA molecule with n = 15

slide-57
SLIDE 57

I3 = CGGGCUAUUUAGCUG

[Figure labels: S0, S1, S2, S3, O]

Example of an easily folding and especially stable small RNA molecule with n = 15

slide-58
SLIDE 58

Folding dynamics of the sequence GGCCCCUUUGGGGGCCAGACCCCUAAAAAGGGUC

slide-59
SLIDE 59

[Figure: the minimum free energy conformation S0 and the suboptimal conformation S1 drawn for the same sequence, with the 3'-end labeled]

One sequence is compatible with two structures
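Compatibility can be checked mechanically: a sequence is compatible with a structure when every base pair of the structure joins two nucleotides that can form a Watson-Crick or GU wobble pair. The following minimal sketch assumes dot-bracket notation for the structures; the short sequence and the structures are hypothetical and are not the molecule shown on the slide.

```python
ALLOWED = {"AU", "UA", "GC", "CG", "GU", "UG"}


def is_compatible(seq, structure):
    """True if every base pair in the dot-bracket structure is an allowed pair in seq."""
    if len(seq) != len(structure):
        return False
    stack = []
    for j, ch in enumerate(structure):
        if ch == "(":
            stack.append(j)
        elif ch == ")":
            if not stack:
                return False  # closing bracket without a partner
            i = stack.pop()
            if seq[i] + seq[j] not in ALLOWED:
                return False
    return not stack  # leftover opening brackets mean an unbalanced structure


seq = "GGGAAAUCCC"  # hypothetical 10-mer
print(is_compatible(seq, "(((...)))."))  # pairs G-C, G-C, G-U
print(is_compatible(seq, "(((....)))"))  # pairs G-C, G-C, G-C
print(is_compatible(seq, ".(((...)))"))  # would require an A-C pair
```

Compatibility is only a necessary condition: whether a compatible structure is actually formed depends on its free energy and on the folding kinetics.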

slide-60
SLIDE 60

[Figure: barrier tree with leaves S0 and S1; the numeric labels give structure indices and energy values]

Barrier tree of a sequence with two conformations

slide-61
SLIDE 61

[Figure: folding curves labeled "modified" and "unmodified"]

Folding dynamics of tRNAphe with and without modified nucleotides

slide-62
SLIDE 62

Barrier tree of tRNAphe without modified nucleotides

slide-63
SLIDE 63

Coworkers

Walter Fontana, Santa Fe Institute, NM
Christian Reidys, Christian Forst, Los Alamos National Laboratory, NM
Peter Stadler, Universität Leipzig, GE
Ivo L. Hofacker, Christoph Flamm, Universität Wien, AT
Bärbel Stadler, Andreas Wernitznig, Universität Wien, AT
Michael Kospach, Ulrike Langhammer, Ulrike Mückstein, Stefanie Widder
Jan Cupal, Kurt Grünberger, Andreas Svrček-Seiler, Stefan Wuchty
Ulrike Göbel, Institut für Molekulare Biotechnologie, Jena, GE
Walter Grüner, Stefan Kopp, Jaqueline Weber