Sequences, structures, shapes, and conformations
Peter Schuster
Institut für Theoretische Chemie, Universität Wien, Austria and The Santa Fe Institute, Santa Fe, New Mexico, USA
RNA 2006 Benasque, 17.–27.07.2006
Web-Page for further information: http://www.tbi.univie.ac.at/~pks
tRNAphe: sequence and molecular structure
tRNAphe: secondary structure is a shape
A symbolic notation of RNA secondary structure that is equivalent to the conventional graphs
Criterion: minimum free energy (mfe)
Rules: base pairs ( ) ∈ {AU, CG, GC, GU, UA, UG}
Number of sequences: $N = 4^n$; number of structures: $N_S < 3^n$
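The parentheses notation can be read back into a list of base pairs with a stack. The following is a minimal sketch, assuming the common dot-bracket convention in which "." marks an unpaired nucleotide and matching "(" and ")" mark a base pair; the function name `parse_pairs` is illustrative and not taken from the slides.

```python
def parse_pairs(structure):
    """Return the sorted base pairs (i, j), i < j, of a dot-bracket string."""
    stack, pairs = [], []
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)              # remember the opening position
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.append((stack.pop(), pos))   # closes the most recent '('
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


hairpin = parse_pairs("(((...)))")   # zero-based positions of the three pairs
```

Because every ")" closes the most recent "(", the pairs returned are nested and never cross, which is what makes the string equivalent to a secondary structure graph.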
Sequence space
The Hamming distance between sequences induces a metric in sequence space
[Figure: two aligned sequences, I1 and I2, that differ at four positions]
Hamming distance $d_H(I_1, I_2) = 4$
(i) $d_H(I_1, I_1) = 0$
(ii) $d_H(I_1, I_2) = d_H(I_2, I_1)$
(iii) $d_H(I_1, I_3) \le d_H(I_1, I_2) + d_H(I_2, I_3)$
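The three metric properties can be checked by brute force on a small sequence space. This is a sketch only: the RNA alphabet "ACGU" and the chain length 2 are assumptions chosen to keep the exhaustive check small.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Exhaustive check of properties (i)-(iii) on all 16 sequences of length 2.
space = ["".join(s) for s in product("ACGU", repeat=2)]
identity = all(hamming(a, a) == 0 for a in space)
symmetry = all(hamming(a, b) == hamming(b, a) for a in space for b in space)
triangle = all(hamming(a, c) <= hamming(a, b) + hamming(b, c)
               for a in space for b in space for c in space)
```

An exhaustive check on a small space is of course not a proof, but it makes the three conditions concrete.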
Every point in sequence space is equivalent
Sequence space of binary sequences with chain length n = 5
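For binary sequences of chain length n = 5 the sequence space is small enough to enumerate. The sketch below builds all 32 sequences and counts, for each reference sequence, how many sequences lie at each Hamming distance; the helper names are illustrative.

```python
from itertools import product

n = 5
space = ["".join(bits) for bits in product("01", repeat=n)]


def neighbors(seq):
    """All sequences at Hamming distance 1 (single point mutations)."""
    flip = {"0": "1", "1": "0"}
    return [seq[:i] + flip[seq[i]] + seq[i + 1:] for i in range(len(seq))]


def distance_classes(ref):
    """Count the sequences at Hamming distance 0, 1, ..., n from ref."""
    counts = [0] * (n + 1)
    for seq in space:
        counts[sum(a != b for a, b in zip(ref, seq))] += 1
    return counts


profiles = {tuple(distance_classes(ref)) for ref in space}
```

Every reference sequence yields the same distance profile, which is one way to read the statement that every point in sequence space is equivalent.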
Sequence space and structure space
The Hamming distance between structures in parentheses notation forms a metric in structure space
Hamming distance $d_H(S_1, S_2) = 4$
(i) $d_H(S_1, S_1) = 0$
(ii) $d_H(S_1, S_2) = d_H(S_2, S_1)$
(iii) $d_H(S_1, S_3) \le d_H(S_1, S_2) + d_H(S_2, S_3)$
Two measures of distance in shape space: Hamming distance between structures, dH(Si,Sj) and base pair distance, dP(Si,Sj)
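The two distances can be compared on small examples. In this sketch the base pair distance is taken to be the number of base pairs that occur in exactly one of the two structures; that is the usual definition, but it is an assumption as far as these slides are concerned, and the example structures are made up.

```python
def pair_set(structure):
    """Set of base pairs (i, j) encoded by a balanced dot-bracket string."""
    stack, pairs = [], set()
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            pairs.add((stack.pop(), pos))
    return pairs


def hamming_distance(s1, s2):
    """Number of positions at which the two dot-bracket strings differ."""
    return sum(a != b for a, b in zip(s1, s2))


def base_pair_distance(s1, s2):
    """Number of base pairs present in exactly one of the two structures."""
    return len(pair_set(s1) ^ pair_set(s2))


closure = ("((....))", "(((..)))")   # one additional base pair
shift = ("(....).", "(.....)")       # one end of a base pair moves by one
```

For the closure example the Hamming distance is 2 and the base pair distance is 1; for the shift example both distances are 2. A shift therefore costs as much as a single closure in the Hamming distance but as much as two elementary steps in the base pair distance.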
Sketch of structure space
Structures are not equivalent in structure space
Compatible structures Suboptimal conformations
Reference for the definition of the intersection and the proof of the intersection theorem
The compatible set Ck of a structure Sk consists of all sequences that form Sk either as their minimum free energy structure (the neutral network Gk) or as one of their suboptimal structures.
[Figure: structure $S_k$ with its neutral network $G_k$ contained in the compatible set $C_k$, $G_k \subset C_k$]
The intersection of two compatible sets is always non-empty: $C_0 \cap C_1 \neq \emptyset$
[Figure: structure $S_0$ and structure $S_1$]
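One constructive way to see why the intersection cannot be empty: each nucleotide has at most one pairing partner per structure, so the union of the two pairing patterns consists of paths and even cycles and can be two-colored. Assigning G to one color and U to the other makes every pair a GU or UG pair. The sketch below follows this idea; it is an illustration under that assumption, not a transcription of the cited proof, and compatibility is checked against the pairing rules only, not against energies.

```python
ALLOWED = {"AU", "CG", "GC", "GU", "UA", "UG"}


def pairs_of(structure):
    """Base pairs (i, j) of a balanced dot-bracket string."""
    stack, pairs = [], []
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            pairs.append((stack.pop(), pos))
    return pairs


def is_compatible(sequence, structure):
    """True if every base pair of the structure is an allowed pair."""
    return all(sequence[i] + sequence[j] in ALLOWED
               for i, j in pairs_of(structure))


def sequence_in_intersection(s0, s1):
    """Build a G/U sequence that can form both structures."""
    n = len(s0)
    adjacent = [[] for _ in range(n)]
    for i, j in pairs_of(s0) + pairs_of(s1):
        adjacent[i].append(j)
        adjacent[j].append(i)
    base = [None] * n
    for start in range(n):
        if base[start] is not None:
            continue
        base[start] = "G"
        todo = [start]
        while todo:                      # two-color one connected component
            i = todo.pop()
            for j in adjacent[i]:
                if base[j] is None:
                    base[j] = "U" if base[i] == "G" else "G"
                    todo.append(j)
    return "".join(base)


s0 = "(((....)))......"
s1 = "......(((....)))"
x = sequence_in_intersection(s0, s1)
```

The resulting sequence uses only G and U, so it is one member of the intersection rather than a typical one.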
Kinetic folding of RNA as a Markov process
Kinetic folding of RNA secondary structures
Christoph Flamm, Walter Fontana, Ivo L. Hofacker, Peter Schuster. RNA folding kinetics at elementary step resolution. RNA 6:325-338, 2000
Christoph Flamm, Ivo L. Hofacker, Sebastian Maurer-Stroh, Peter F. Stadler, Martin Zehl. Design of multistable RNA molecules. RNA 7:254-265, 2001
Michael T. Wolfinger, W. Andreas Svrcek-Seiler, Christoph Flamm, Ivo L. Hofacker, Peter F. Stadler. Efficient computation of RNA folding dynamics. J.Phys.A: Math.Gen. 37:4731-4741, 2004
Base pair formation and base pair cleavage moves for nucleation and elongation of stacks. These moves correspond to the base pair distance dP(S1,S2).
Base pair shift move of class 1: shift inside internal loops or bulges.
Base pair shift move of class 2: shift involves free ends.
Base pair closure, opening, and shift together correspond to the Hamming distance dH(S1,S2).
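The formation and cleavage moves define which structures are neighbors of a given structure. The sketch below enumerates them for a sequence and a dot-bracket structure; shift moves are left out, and the minimum hairpin loop size of three unpaired nucleotides is an assumption that the slides do not state.

```python
ALLOWED = {"AU", "CG", "GC", "GU", "UA", "UG"}


def neighbors(sequence, structure, min_loop=3):
    """Structures reachable by cleaving or forming a single base pair."""
    result = []
    n = len(structure)
    # Cleavage: open each existing base pair in turn.
    stack = []
    for j, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            i = stack.pop()
            result.append(structure[:i] + "." + structure[i + 1:j] + "."
                          + structure[j + 1:])
    # Formation: pair two unpaired positions that lie in the same loop.
    for i in range(n):
        if structure[i] != ".":
            continue
        depth = 0
        for j in range(i + 1, n):
            symbol = structure[j]
            if symbol == "(":
                depth += 1
            elif symbol == ")":
                depth -= 1
                if depth < 0:
                    break                # left the loop that contains i
            elif (depth == 0 and j - i > min_loop
                  and sequence[i] + sequence[j] in ALLOWED):
                result.append(structure[:i] + "(" + structure[i + 1:j] + ")"
                              + structure[j + 1:])
    return result


from_open_chain = neighbors("GAAAC", ".....")
```

Requiring `depth == 0` at the closing position keeps the new pair inside a single loop, so no crossing pairs are generated.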
The kinetic folding algorithm
A sequence X specifies an energy-ordered set of compatible structures S(X):
S(X) = {S0 , S1 , … , Sm-1 , O}
A trajectory Tk(X) is a time-ordered series of structures in S(X). A folding trajectory is defined by starting with the open chain O and ending with the global minimum free energy structure, S0, or a metastable structure Sk, which represents a local energy minimum:
T0(X) = {O , S(1) , … , S(t-1) , S(t) , S(t+1) , … , S0}
Tk(X) = {O , S(1) , … , S(t-1) , S(t) , S(t+1) , … , Sk}
A description of the folding process is obtained through sampling a large number of trajectories. When no stopping structure, S0 or Sk, is defined, the long time distribution of conformations is the Boltzmann ensemble.
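Sampling of trajectories can be sketched with a continuous-time Monte Carlo (Gillespie-type) simulation. Everything concrete below is an assumption made for illustration: the five-state toy landscape, its energies in kcal/mol, the neighbor relation, and the Metropolis rule for the rate constants. It is not the published kinetic folding program.

```python
import math
import random

RT = 1.987e-3 * 310.15   # kcal/mol at 37 °C

# Hypothetical toy landscape: Sk - A - O - B - S0
ENERGY = {"O": 0.0, "A": -1.0, "Sk": -3.0, "B": -1.5, "S0": -5.0}
NEIGHBORS = {"O": ["A", "B"], "A": ["O", "Sk"], "Sk": ["A"],
             "B": ["O", "S0"], "S0": ["B"]}


def rate(src, dst):
    """Metropolis rate: downhill moves at rate 1, uphill moves suppressed."""
    delta_g = ENERGY[dst] - ENERGY[src]
    return min(1.0, math.exp(-delta_g / RT))


def fold(rng, start="O", stop=("S0",), max_steps=100_000):
    """Sample one trajectory from start until a stopping structure is hit."""
    state, time, path = start, 0.0, [start]
    for _ in range(max_steps):
        if state in stop:
            break
        rates = [rate(state, nb) for nb in NEIGHBORS[state]]
        time += rng.expovariate(sum(rates))          # waiting time
        state = rng.choices(NEIGHBORS[state], weights=rates)[0]
        path.append(state)
    return path, time


rng = random.Random(1)
trajectories = [fold(rng) for _ in range(200)]
mean_time = sum(t for _, t in trajectories) / len(trajectories)
```

Passing `stop=("S0", "Sk")` would end a trajectory in whichever of the two minima is reached first, which is how the fraction of molecules trapped in the metastable structure could be estimated in this toy setting.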
Formulation of kinetic RNA folding as a stochastic process
Folding dynamics of the sequence GGCCCCUUUGGGGGCCAGACCCCUAAAAAGGGUC
Stochastic variables: $N_j(t)$, the number of molecules with conformation $S_j$ at time $t$; $j = 1, 2, \ldots, m$
Probabilities: $P_n^{(j)}(t) = \mathrm{Prob}\{N_j(t) = n\}$ with $n = 0, 1, \ldots, N$
Expectation values: $\langle N_j(t) \rangle = \sum_{n=0}^{N} n \, P_n^{(j)}(t)$
Transition probabilities: defined per time interval $dt$ for single steps $S_j \to S_l$ ($\Delta n = \pm 1$), with rate constants built from factors of the form $\exp(-\Delta G / RT)$
[Equations: master equation for $dP_n^{(j)}/dt$ and the resulting kinetic equations for $dp_j/dt$, $j = 1, 2, \ldots, m$]
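For orientation, a generic master equation for the populations $p_j$ of the conformations $S_j$ can be written as follows. The index convention ($k_{jl}$ is the rate constant for $S_j \to S_l$) and the Metropolis form of the rate constants are assumptions, not read off the slide.

```latex
\frac{dp_j}{dt} = \sum_{l \neq j} \bigl( k_{lj}\, p_l - k_{jl}\, p_j \bigr),
\qquad j = 1, 2, \ldots, m,
\qquad
k_{jl} = \min\bigl\{ 1,\; e^{-(G_l - G_j)/RT} \bigr\}
```

With rate constants of this kind, detailed balance $k_{jl}\, e^{-G_j/RT} = k_{lj}\, e^{-G_l/RT}$ holds, so the stationary solution is the Boltzmann distribution $p_j \propto e^{-G_j/RT}$.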
[Figure: free energy G along a "reaction coordinate", with local minima $S_k$ and $S_\ell$ separated by the saddle point $T_{\ell k}$, and the corresponding "barrier tree"]
Definition of a "barrier tree"
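In the usual reading of a barrier tree, the leaves are local minima and two leaves are joined at the energy of the lowest saddle point that connects them. That saddle energy can be found as the path between the two minima whose highest point is as low as possible. The sketch below does this with a Dijkstra-style search on a hypothetical six-state landscape; the energies (kcal/mol) and the neighbor relation are made up for illustration.

```python
import heapq

# Hypothetical landscape with two routes between the minima Sk and S0.
ENERGY = {"O": 0.0, "A": -1.0, "Sk": -3.0, "B": -1.5, "S0": -5.0, "C": -0.5}
NEIGHBORS = {"O": ["A", "B"], "A": ["O", "Sk"], "Sk": ["A", "C"],
             "B": ["O", "S0"], "S0": ["B", "C"], "C": ["Sk", "S0"]}


def saddle_energy(energy, neighbors, a, b):
    """Lowest achievable maximum energy on any path from a to b."""
    best = {a: energy[a]}
    heap = [(energy[a], a)]
    while heap:
        height, state = heapq.heappop(heap)
        if state == b:
            return height
        if height > best[state]:
            continue                      # outdated heap entry
        for nb in neighbors[state]:
            h = max(height, energy[nb])   # highest point seen along the path
            if h < best.get(nb, float("inf")):
                best[nb] = h
                heapq.heappush(heap, (h, nb))
    return None                           # a and b are not connected


saddle = saddle_energy(ENERGY, NEIGHBORS, "Sk", "S0")
barrier_from_sk = saddle - ENERGY["Sk"]
barrier_from_s0 = saddle - ENERGY["S0"]
```

In this toy landscape the route through C is lower than the route through the open chain O, so the two minima merge at the energy of C.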
[Figure: JN1LH. Secondary structure drawings with segments labeled 1D, 2D, and R, nucleotide positions 3 to 44 and the 5' and 3' ends marked, and free energies of −28.6, −31.8, −28.2, and −28.6 kcal·mol⁻¹]
J.H.A. Nagel, C. Flamm, I.L. Hofacker, K. Franke, M.H. de Smit, P. Schuster, and C.W.A. Pleij. Structural parameters affecting the kinetic competition of RNA hairpin formation, Nucleic Acids Res. 34:3568-3576, 2006.
An RNA switch
[Figure: J1LH barrier tree. Vertical axis: free energy [kcal/mol], from −50.0 to −26.0; leaves numbered 1 to 50, with barrier heights annotated on the branches]
A ribozyme switch
E.A. Schultes, D.P. Bartel, Science 289 (2000), 448-452
Two ribozymes of chain length n = 88 nucleotides: an artificial ligase (A) and a natural cleavage ribozyme of hepatitis-δ-virus (B)
The sequence at the intersection: an RNA molecule that is 88 nucleotides long and can form both structures
Two neutral walks through sequence space with conservation of structure and catalytic activity
Acknowledgement of support
Fonds zur Förderung der wissenschaftlichen Forschung (FWF): Projects No. 09942, 10578, 11065, 13093, 13887, and 14898
Wiener Wissenschafts-, Forschungs- und Technologiefonds (WWTF): Project No. Mat05
Jubiläumsfonds der Österreichischen Nationalbank: Project No. Nat-7813
European Commission: Contracts No. 98-0189, 12835 (NEST)
Austrian Genome Research Program – GEN-AU
Siemens AG, Austria
Universität Wien and the Santa Fe Institute
Coworkers
Walter Fontana, Harvard Medical School, MA
Christian Forst, Christian Reidys, Los Alamos National Laboratory, NM
Peter Stadler, Bärbel Stadler, Universität Leipzig, GE
Jord Nagel, Kees Pleij, Universiteit Leiden, NL
Christoph Flamm, Ivo L. Hofacker, Andreas Svrček-Seiler, Universität Wien, AT
Kurt Grünberger, Michael Kospach, Ulrike Mückstein, Stefan Washietl, Andreas Wernitznig, Stefanie Widder, Michael Wolfinger, Stefan Wuchty, Universität Wien, AT
Stefan Bernhart, Jan Cupal, Lukas Endler, Ulrike Langhammer, Rainer Machne, Hakim Tafer, Universität Wien, AT
Ulrike Göbel, Walter Grüner, Stefan Kopp, Jaqueline Weber, Institut für Molekulare Biotechnologie, Jena, GE