From Sequences to Structures and Back: The Vienna RNA Package



slide-1
SLIDE 1
slide-2
SLIDE 2

From Sequences to Structures and Back

The Vienna RNA Package Peter Schuster

Institut für Theoretische Chemie, Universität Wien, Austria and The Santa Fe Institute, Santa Fe, New Mexico, USA

Siemens PSE Life Science Symposium Brno, 14.03.2006

slide-3
SLIDE 3

Web-Page for further information: http://www.tbi.univie.ac.at/~pks

slide-4
SLIDE 4

Figure: RNA backbone from the 5'-end to the 3'-end carrying the nucleobases N1, N2, N3, N4, where Nk = A, U, G, C, with Na+ counterions, together with an example sequence (5'-end to 3'-end).

Definition of RNA structure

slide-5
SLIDE 5

A symbolic notation of RNA secondary structure that is equivalent to the conventional graphs

slide-6
SLIDE 6

Number of sequences: N = 4^n
Number of structures: NS < 3^n
Criterion: minimum free energy (mfe)
Rules: _ ( _ ) _ ∈ {AU, CG, GC, GU, UA, UG}

A symbolic notation of RNA secondary structure that is equivalent to the conventional graphs
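The bracket notation and the pairing rules can be checked mechanically. The following is a minimal Python sketch, not code from the Vienna RNA Package; the names `pair_table` and `is_compatible` are chosen here for illustration. It matches parentheses to base pairs and tests every pair against the six allowed combinations. The example sequence and structure are taken from the energy-sorted list shown later in this deck.

```python
ALLOWED_PAIRS = {"AU", "CG", "GC", "GU", "UA", "UG"}

def pair_table(structure):
    """Entry i is the partner of position i, or -1 if position i is unpaired."""
    table = [-1] * len(structure)
    stack = []
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            j = stack.pop()
            table[i], table[j] = j, i
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return table

def is_compatible(sequence, structure):
    """True if every bracket pair joins two bases allowed by the pairing rules."""
    if len(sequence) != len(structure):
        return False
    table = pair_table(structure)
    return all(
        sequence[i] + sequence[j] in ALLOWED_PAIRS
        for i, j in enumerate(table)
        if j > i  # visit each pair once, from its opening bracket
    )

sequence = "CUGCGGCUUUGGCUCUAGCC"
print(4 ** len(sequence))                               # number of sequences of this length
print(is_compatible(sequence, "....((((........))))"))
```

A sequence that passes this check is compatible with the structure; whether the structure is also its minimum free energy structure is a separate question that needs an energy model.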

slide-7
SLIDE 7

Conventional definition of RNA secondary structures

slide-8
SLIDE 8

Restrictions on physically acceptable mfe-structures: λ ≥ 3 and σ ≥ 2

slide-9
SLIDE 9

Vienna RNA Package

RNAfold, RNAdistance, RNAinverse, RNAduplex, RNAsubopt, RNAeval, RNAheat, RNAcofold, RNApdist, RNAalifold, RNAplot

http://www.tbi.univie.ac.at/RNA/

slide-10
SLIDE 10
slide-11
SLIDE 11

RNA sequence → RNA structure of minimal free energy

RNA folding: structural biology, spectroscopy of biomolecules, understanding molecular function

Empirical parameters

Biophysical chemistry: thermodynamics and kinetics

Sequence, structure, and design

slide-12
SLIDE 12

Figure: a sequence of G, U, A, and C nucleotides (5'-end to 3'-end) and its conformations on a free energy axis G: the minimum of free energy S0 and the suboptimal conformations S1 to S9.

The minimum free energy structures on a discrete space of conformations

slide-13
SLIDE 13

Figure: secondary structures annotated with their elements: stack, hairpin loop, internal loop, bulge, multiloop, joint, free end.

Elements of RNA secondary structures as used in free energy calculations

\[
\Delta G_0^{300} = \sum_{\text{stacks of base pairs}} g_{ij,kl} + \sum_{\text{hairpin loops}} h(n_l) + \sum_{\text{bulges}} b(n_b) + \sum_{\text{internal loops}} i(n_i) + \cdots
\]
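The additive energy rule can be illustrated by decomposing a bracket string into its elements and summing one contribution per element. This is a sketch under stated assumptions: the helper names (`loops`, `free_energy`) are invented here, and the numbers in `PLACEHOLDER` are arbitrary values for demonstration only, not the empirical parameters used by the Vienna RNA Package.

```python
from collections import Counter

def pair_table(structure):
    """Entry i is the partner of position i, or -1 if position i is unpaired."""
    table, stack = [-1] * len(structure), []
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            j = stack.pop()
            table[i], table[j] = j, i
    return table

def loops(structure):
    """Yield (kind, number of unpaired bases) for every loop closed by a base pair."""
    table = pair_table(structure)
    for i, j in enumerate(table):
        if j <= i:                      # unpaired, or the closing half of a pair
            continue
        inner, unpaired, k = [], 0, i + 1
        while k < j:                    # walk the loop closed by (i, j)
            if table[k] > k:            # a directly enclosed base pair: jump over it
                inner.append((k, table[k]))
                k = table[k] + 1
            else:
                unpaired += 1
                k += 1
        if not inner:
            yield "hairpin loop", unpaired
        elif len(inner) > 1:
            yield "multiloop", unpaired
        else:
            k, l = inner[0]
            left, right = k - i - 1, j - l - 1
            if left == 0 and right == 0:
                yield "stack", 0
            elif left == 0 or right == 0:
                yield "bulge", unpaired
            else:
                yield "internal loop", unpaired

def free_energy(structure, contribution):
    """Sum one contribution per structural element (additive energy rule)."""
    return sum(contribution[kind](size) for kind, size in loops(structure))

# Arbitrary placeholder values, NOT real thermodynamic parameters.
PLACEHOLDER = {
    "stack": lambda n: -1.0,
    "hairpin loop": lambda n: 4.0,
    "bulge": lambda n: 3.0,
    "internal loop": lambda n: 2.0,
    "multiloop": lambda n: 3.0,
}

example = "(((.(((....))).))).."
print(Counter(kind for kind, _ in loops(example)))
print(free_energy(example, PLACEHOLDER))
```

Real parameter sets depend on the identity of the stacked base pairs (the indices ij,kl in the formula) and on loop sizes; the sketch only keeps the loop size as an argument.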

slide-14
SLIDE 14

RNA sequence → RNA structure of minimal free energy

RNA folding: structural biology, spectroscopy of biomolecules, understanding molecular function

Inverse folding algorithm: iterative determination of a sequence for the given secondary structure

Sequence, structure, and design

Inverse folding of RNA: Biotechnology, design of biomolecules with predefined structures and functions

slide-15
SLIDE 15

Inverse folding algorithm

I0 → I1 → I2 → I3 → I4 → ... → Ik → Ik+1 → ... → It
S0 → S1 → S2 → S3 → S4 → ... → Sk → Sk+1 → ... → St

Ik+1 = Mk(Ik) and ΔdS(Sk,Sk+1) = dS(Sk+1,St) - dS(Sk,St) < 0

M ... base or base pair mutation operator
dS(Si,Sj) ... distance between the two structures Si and Sj
'Unsuccessful trial' ... termination after n steps
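The iteration can be sketched as an adaptive walk: mutate a base or a base pair, fold the mutant, and keep it only if its structure is closer to the target. The code below is a toy illustration and not the RNAinverse implementation. Two substitutions are assumptions made here: `fold` maximises the number of base pairs (a Nussinov-style stand-in for minimum free energy folding), and `distance` is the Hamming distance between bracket strings. All function names are hypothetical.

```python
import random

PAIRS = {"AU", "UA", "GC", "CG", "GU", "UG"}

def fold(seq, min_loop=3):
    """Stand-in for mfe folding: maximise the number of base pairs."""
    n = len(seq)
    best = [[0] * n for _ in range(n)]

    def options(i, j):
        # Scores reached by pairing j with some k, leaving >= min_loop bases between.
        for k in range(i, j - min_loop):
            if seq[k] + seq[j] in PAIRS:
                left = best[i][k - 1] if k > i else 0
                yield left + 1 + best[k + 1][j - 1], k

    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best[i][j] = max([best[i][j - 1]] + [score for score, _ in options(i, j)])
    structure, todo = ["."] * n, [(0, n - 1)]
    while todo:                                   # traceback of one optimal solution
        i, j = todo.pop()
        if j - i <= min_loop:
            continue
        if best[i][j] == best[i][j - 1]:          # position j stays unpaired
            todo.append((i, j - 1))
            continue
        k = next(k for score, k in options(i, j) if score == best[i][j])
        structure[k], structure[j] = "(", ")"
        todo += [(i, k - 1), (k + 1, j - 1)]
    return "".join(structure)

def pair_table(structure):
    table, stack = [-1] * len(structure), []
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            j = stack.pop()
            table[i], table[j] = j, i
    return table

def distance(a, b):
    """Hamming distance between two bracket strings of equal length."""
    return sum(x != y for x, y in zip(a, b))

def inverse_fold(target, max_steps=2000, seed=0):
    """Adaptive walk: accept a mutation only if the distance to the target decreases."""
    rng = random.Random(seed)
    table = pair_table(target)
    seq = [rng.choice("ACGU") for _ in target]
    dist = distance(fold("".join(seq)), target)
    for _ in range(max_steps):                    # 'unsuccessful trial' after max_steps
        if dist == 0:
            break
        trial = seq[:]
        i = rng.randrange(len(trial))
        if table[i] == -1:                        # base mutation
            trial[i] = rng.choice("ACGU")
        else:                                     # base pair mutation
            trial[i], trial[table[i]] = rng.choice(sorted(PAIRS))
        new_dist = distance(fold("".join(trial)), target)
        if new_dist < dist:
            seq, dist = trial, new_dist
    return "".join(seq), dist

print(inverse_fold("((((....))))"))
```

A returned distance of 0 means the walk reached a sequence whose stand-in fold equals the target; a positive distance corresponds to an unsuccessful trial, which in practice is restarted from a new initial sequence.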

slide-16
SLIDE 16

Target structure Sk

Figure labels: initial trial sequences, intermediate compatible sequences, target sequence, stop sequence of an unsuccessful trial

Approach to the target structure Sk in the inverse folding algorithm

slide-17
SLIDE 17

Minimum free energy criterion

Inverse folding of RNA secondary structures

1st 2nd 3rd trial 4th 5th

The inverse folding algorithm searches for sequences that form a given RNA secondary structure under the minimum free energy criterion.

slide-18
SLIDE 18
slide-19
SLIDE 19
slide-20
SLIDE 20

\[
p_{ij}(T) = \sum_k \gamma_k(T)\, a_{ij}(S_k) \quad\text{with}\quad \gamma_k(T) = g_k\, e^{-\varepsilon_k/kT} / Q(T), \qquad Q(T) = \sum_k g_k\, e^{-\varepsilon_k/kT}
\]
\[
s_i = -\sum_j p_{ij} \ln p_{ij} \quad\text{with}\quad p_{ii} = 1 - \sum_{j \neq i} p_{ij}
\]

p_ij: base pair probability; s_i: base pairing entropy

Base pair probability derived from the partition function Q(T)
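For a small, explicitly listed ensemble the two quantities can be computed directly from Boltzmann weights. The sketch below makes several assumptions that are not in the slides: each structure has degeneracy g_k = 1, a_ij(S_k) is taken to be 1 if bases i and j are paired in S_k and 0 otherwise, the temperature is 37 °C, and the ensemble is truncated to three structures from the energy-sorted list shown later in this deck. The real partition function algorithm sums over all structures without enumerating them.

```python
import math

RT = 0.0019872 * 310.15   # kcal/mol at 37 degrees C (assumed temperature)

# Truncated ensemble: (structure, free energy in kcal/mol) for CUGCGGCUUUGGCUCUAGCC.
ENSEMBLE = [
    ("....((((........))))", -4.30),
    ("(((.(((....))).)))..", -3.50),
    ("....................", 0.00),
]

def pairs(structure):
    """List of base pairs (i, j) with i < j in a bracket string."""
    result, stack = [], []
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            result.append((stack.pop(), i))
    return result

def base_pair_probabilities(ensemble):
    """Matrix p with p[i][j] for i != j and p[i][i] = probability of being unpaired."""
    weights = [math.exp(-energy / RT) for _, energy in ensemble]
    q = sum(weights)                                   # partition function Q(T)
    n = len(ensemble[0][0])
    p = [[0.0] * n for _ in range(n)]
    for (structure, _), weight in zip(ensemble, weights):
        for i, j in pairs(structure):
            p[i][j] += weight / q
            p[j][i] += weight / q
    for i in range(n):
        p[i][i] = 1.0 - sum(p[i][j] for j in range(n) if j != i)
    return p

def pairing_entropy(p):
    """s_i = -sum_j p_ij ln p_ij, skipping zero entries."""
    return [-sum(x * math.log(x) for x in row if x > 0.0) for row in p]

probabilities = base_pair_probabilities(ENSEMBLE)
entropies = pairing_entropy(probabilities)
for position, entropy in enumerate(entropies):
    print(position, round(entropy, 3))
```

Positions with entropy near zero have one dominant pairing state across the ensemble, which is why the entropy serves as a per-nucleotide reliability measure.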

slide-21
SLIDE 21

3' 5'

Example of a small RNA molecule with two low-lying suboptimal conformations which contribute substantially to the partition function

UUGGAGUACACAACCUGUACACUCUUUC

Example of a small RNA molecule: n=28

slide-22
SLIDE 22

Figure: the sequence UUGGAGUACACAACCUGUACACUCUUUC (5'-end to 3'-end) in three configurations:

minimum free energy configuration: G = -5.39 kcal/mole
first suboptimal configuration: ΔE0→1 = 0.50 kcal/mole
second suboptimal configuration: ΔE0→2 = 0.55 kcal/mole

"Dot plot" of the minimum free energy structure (lower triangle) and the partition function (upper triangle) of a small RNA molecule (n=28) with low energy suboptimal configurations

slide-23
SLIDE 23

Phenylalanyl-tRNA as an example for the computation of the partition function

slide-24
SLIDE 24

tRNAphe without modified bases

first suboptimal configuration: ΔE0→1 = 0.43 kcal/mole

slide-25
SLIDE 25

tRNAphe with modified bases

Sequence (5'-end to 3'-end): GCGGAUUUAGCUCAGDDGGGAGAGCMCCAGACUGAAYAUCUGGAGMUCCUGUGTPCGAUCCACAGAAUUCGCACCA

first suboptimal configuration: ΔE0→1 = 0.94 kcal/mole

slide-26
SLIDE 26

\[
p_{ij}(T) = \sum_k \gamma_k(T)\, a_{ij}(S_k) \quad\text{with}\quad \gamma_k(T) = g_k\, e^{-\varepsilon_k/kT} / Q(T), \qquad Q(T) = \sum_k g_k\, e^{-\varepsilon_k/kT}
\]
\[
s_i = -\sum_j p_{ij} \ln p_{ij} \quad\text{with}\quad p_{ii} = 1 - \sum_{j \neq i} p_{ij}
\]

p_ij: base pair probability; s_i: base pairing entropy

Reliability measures for structure prediction

slide-27
SLIDE 27

Base pairing entropy and base pair probability in a model RNA molecule

slide-28
SLIDE 28

Figure panels: base pair probability and base pairing entropy along the nucleotides, without modification and with modification

Reliability of structure prediction in tRNAphe

slide-29
SLIDE 29

Figure panels: native structure, base pair probability, base pairing entropy

Reliability of structure prediction in 5S ribosomal RNA

slide-30
SLIDE 30
slide-31
SLIDE 31
slide-32
SLIDE 32
slide-33
SLIDE 33

The Folding Algorithm

A sequence I specifies an energy ordered set of compatible structures S(I):

S(I) = {S0 , S1 , … , Sm , O}

A trajectory Tk(I) is a time ordered series of structures in S(I). A folding trajectory is defined by starting with the open chain O and ending with the global minimum free energy structure S0 or a metastable structure Sk which represents a local energy minimum:

T0(I) = {O , S (1) , … , S (t-1) , S (t) , S (t+1) , … , S0}
Tk(I) = {O , S (1) , … , S (t-1) , S (t) , S (t+1) , … , Sk}

Master equation

\[
\frac{dP_k}{dt} = \sum_{i=0}^{m+1} \bigl( P_{ik}(t) - P_{ki}(t) \bigr) = \sum_{i=0}^{m+1} k_{ik} P_i - P_k \sum_{i=0}^{m+1} k_{ki}, \qquad k = 0, 1, \ldots, m+1
\]

Transition probabilities Pij(t) = Prob{Si→Sj} are defined by

\[
P_{ij}(t) = P_i(t)\, k_{ij} = P_i(t) \exp(-\Delta G_{ij}/2RT) / \Sigma_i, \qquad
P_{ji}(t) = P_j(t)\, k_{ji} = P_j(t) \exp(-\Delta G_{ji}/2RT) / \Sigma_j
\]
\[
\Sigma_k = \sum_{i=1,\, i \neq k}^{m+2} \exp(-\Delta G_{ki}/2RT)
\]

Both index ranges, 0 to m+1 in the master equation and 1 to m+2 in Σk, enumerate the m+2 states of S(I).

The symmetric rule for transition rate parameters is due to Kawasaki (K. Kawasaki, Diffusion constants near the critical point for time-dependent Ising models. Phys. Rev. 145:224-230, 1966).

Formulation of kinetic RNA folding as a stochastic process
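The master equation can be integrated numerically for a handful of states. The following sketch assumes that every state can reach every other state directly, that ΔG_ij = G_j - G_i, and that the temperature is 37 °C; it uses plain Euler steps rather than an exact solution. The three energies are taken from the energy-sorted list shown later in this deck, with the open chain as the last state, and the function names are hypothetical.

```python
import math

RT = 0.0019872 * 310.15          # kcal/mol at 37 degrees C (assumed temperature)
ENERGIES = [-4.30, -3.50, 0.00]  # kcal/mol; the last state is the open chain O

def rate_matrix(energies):
    """k[i][j] = exp(-(G_j - G_i) / 2RT) / Sigma_i, the symmetric Kawasaki-type rule."""
    n = len(energies)
    k = [[0.0] * n for _ in range(n)]
    for i in range(n):
        raw = [
            math.exp(-(energies[j] - energies[i]) / (2.0 * RT)) if j != i else 0.0
            for j in range(n)
        ]
        sigma_i = sum(raw)       # normalisation over all states other than i
        for j in range(n):
            k[i][j] = raw[j] / sigma_i
    return k

def integrate(energies, p0, t_end, dt=0.01):
    """Euler integration of dP_k/dt = sum_i (P_i k_ik - P_k k_ki)."""
    k = rate_matrix(energies)
    p = list(p0)
    n = len(p)
    for _ in range(int(round(t_end / dt))):
        flow = [
            sum(p[i] * k[i][j] - p[j] * k[j][i] for i in range(n) if i != j)
            for j in range(n)
        ]
        p = [p[j] + dt * flow[j] for j in range(n)]
    return p

# Folding trajectory in the population picture: start with everything in the open chain.
final = integrate(ENERGIES, [0.0, 0.0, 1.0], t_end=50.0)
print([round(x, 4) for x in final])
```

In a realistic setting the states are connected only through the elementary moves of the move set, so most entries of the rate matrix are zero; the fully connected matrix here is a simplification.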

slide-34
SLIDE 34

Base pair formation and base pair cleavage moves for nucleation and elongation of stacks. This move set corresponds to the base pair distance dP(S1,S2).
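The formation and cleavage moves can be enumerated for a given sequence and structure. This is a minimal sketch with hypothetical names; it assumes the six pairing rules from the beginning of the talk, a minimum hairpin loop of three unpaired bases, and no pseudoknots. Every structure it returns differs from the input by exactly one base pair.

```python
ALLOWED_PAIRS = {"AU", "CG", "GC", "GU", "UA", "UG"}
SEQUENCE = "CUGCGGCUUUGGCUCUAGCC"

def pair_table(structure):
    """Entry i is the partner of position i, or -1 if position i is unpaired."""
    table, stack = [-1] * len(structure), []
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            j = stack.pop()
            table[i], table[j] = j, i
    return table

def neighbours(sequence, structure, min_loop=3):
    """All structures reachable by forming or cleaving exactly one base pair."""
    table = pair_table(structure)
    n = len(structure)
    result = []
    for i in range(n):
        partner = table[i]
        if partner > i:                              # cleavage of the pair (i, partner)
            s = list(structure)
            s[i] = s[partner] = "."
            result.append("".join(s))
        elif partner == -1:                          # formation of a new pair (i, j)
            for j in range(i + min_loop + 1, n):
                if table[j] != -1 or sequence[i] + sequence[j] not in ALLOWED_PAIRS:
                    continue
                # No crossing: every pair starting or ending between i and j stays inside.
                if all(table[k] == -1 or i < table[k] < j for k in range(i + 1, j)):
                    s = list(structure)
                    s[i], s[j] = "(", ")"
                    result.append("".join(s))
    return result

open_chain = "." * len(SEQUENCE)
print(len(neighbours(SEQUENCE, open_chain)))         # nucleation moves from the open chain
for s in neighbours(SEQUENCE, "....((((........))))"):
    print(s)
```

Starting from the open chain, each neighbour is a nucleation event; repeated application to a structure with an existing stack yields its elongation or shortening.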

slide-35
SLIDE 35

Base pair shift move of class 1: shift inside internal loops or bulges. The move set of base pair closure, opening, and shift corresponds to the Hamming distance dH(S1,S2).

slide-36
SLIDE 36

Figure: free energy G of the suboptimal conformations S1, S2, S5, S6, S7, and S9 around a local minimum Sh.

Search for local minima in conformation space

slide-37
SLIDE 37

Figure: free energy G along a "reaction coordinate" with two local minima Sk and Sl separated by the saddle point Tlk, and the same minima drawn as a "barrier tree".

Definition of a 'barrier tree'

slide-38
SLIDE 38

CUGCGGCUUUGGCUCUAGCC
....((((........)))) -4.30
(((.(((....))).))).. -3.50
(((..((....))..))).. -3.10
..........(((....))) -2.80
..(((((....)))...)). -2.20
....(((..........))) -2.20
((..(((....)))..)).. -2.00
..((.((....))....)). -1.60
....(((....)))...... -1.60
.....(((........))). -1.50
.((.(((....))).))... -1.40
....((((..(...).)))) -1.40
.((..((....))..))... -1.00
(((.(((....)).)))).. -0.90
(((.((......)).))).. -0.90
....((((..(....))))) -0.80
.....((....))....... -0.80
..(.(((....))))..... -0.60
....(((....)).)..... -0.60
(((..(......)..))).. -0.50
..(((((....)).)..)). -0.50
..(.(((....))).).... -0.40
..((.......))....... -0.30
..........((......)) -0.30
...........((....)). -0.30
(((.(((....)))).)).. -0.20
....(((.(.......)))) -0.20
....(((..((....))))) -0.20
..(..((....))..).... 0.00
.................... 0.00
.(..(((....)))..)... 0.10

M.T. Wolfinger, W.A. Svrcek-Seiler, C. Flamm, I.L. Hofacker, P.F. Stadler. 2004. J.Phys.A: Math.Gen. 37:4731-4741.

slide-39
SLIDE 39


slide-40
SLIDE 40

Arrhenius kinetics

M.T. Wolfinger, W.A. Svrcek-Seiler, C. Flamm, I.L. Hofacker, P.F. Stadler. 2004. J.Phys.A: Math.Gen. 37:4731-4741.

slide-41
SLIDE 41

Arrhenius kinetics

Exact solution of the master equation

M.T. Wolfinger, W.A. Svrcek-Seiler, C. Flamm, I.L. Hofacker, P.F. Stadler. 2004. J.Phys.A: Math.Gen. 37:4731-4741.

slide-42
SLIDE 42

Figure: the RNA molecule JN1LH (5'-end to 3'-end) drawn in alternative hairpin conformations; the free energies given for the conformations shown are -28.6, -31.8, -28.2, and -28.6 kcal·mol^-1.

J.H.A. Nagel, C. Flamm, I.L. Hofacker, K. Franke, M.H. de Smit, P. Schuster, and C.W.A. Pleij. Structural parameters affecting the kinetic competition of RNA hairpin formation, Nucleic Acids Res., in press 2006.

An RNA switch

slide-43
SLIDE 43

Figure: barrier tree with local minima numbered 1 to 50; axis: free energy [kcal / mole], from -26.0 to -50.0; numeric labels in the tree range from 1.44 to 6.8.

J1LH barrier tree

slide-44
SLIDE 44

A ribozyme switch

E.A. Schultes, D.P. Bartel, Science 289 (2000), 448-452

slide-45
SLIDE 45

Two ribozymes of chain length n = 88 nucleotides: an artificial ligase (A) and a natural cleavage ribozyme of hepatitis-δ-virus (B)

slide-46
SLIDE 46

The sequence at the intersection: an RNA molecule that is 88 nucleotides long and can form both structures

slide-47
SLIDE 47

Two neutral walks through sequence space with conservation of structure and catalytic activity

slide-48
SLIDE 48

Acknowledgement of support

Fonds zur Förderung der wissenschaftlichen Forschung (FWF): Projects No. 09942, 10578, 11065, 13093, 13887, and 14898
Wiener Wissenschafts-, Forschungs- und Technologiefonds (WWTF): Project No. Mat05
Jubiläumsfonds der Österreichischen Nationalbank: Project No. Nat-7813
European Commission: Contracts No. 98-0189, 12835 (NEST)
Austrian Genome Research Program – GEN-AU: Bioinformatics Network (BIN)
Österreichische Akademie der Wissenschaften
Siemens AG, Austria
Universität Wien and the Santa Fe Institute

Universität Wien

slide-49
SLIDE 49

Coworkers

Peter Stadler, Bärbel M. Stadler, Universität Leipzig, GE
Paul E. Phillipson, University of Colorado at Boulder, CO
Heinz Engl, Philipp Kügler, James Lu, Stefan Müller, RICAM Linz, AT
Jord Nagel, Kees Pleij, Universiteit Leiden, NL
Walter Fontana, Harvard Medical School, MA
Christian Reidys, Christian Forst, Los Alamos National Laboratory, NM
Ulrike Göbel, Walter Grüner, Stefan Kopp, Jaqueline Weber, Institut für Molekulare Biotechnologie, Jena, GE
Ivo L. Hofacker, Christoph Flamm, Andreas Svrček-Seiler, Universität Wien, AT
Kurt Grünberger, Michael Kospach, Andreas Wernitznig, Stefanie Widder, Stefan Wuchty, Andreas De Stefani, Universität Wien, AT
Jan Cupal, Stefan Bernhart, Lukas Endler, Ulrike Langhammer, Rainer Machne, Ulrike Mückstein, Hakim Tafer, Thomas Taylor, Universität Wien, AT

Universität Wien

slide-50
SLIDE 50

Web-Page for further information: http://www.tbi.univie.ac.at/~pks

slide-51
SLIDE 51