
A Method for Aligning RNA Secondary Structures Jason T. L. Wang - PowerPoint PPT Presentation



  1. A Method for Aligning RNA Secondary Structures. Jason T. L. Wang, New Jersey Institute of Technology. J. Liu, J. T. L. Wang, J. Hu and B. Tian, BMC Bioinformatics, 2005

  2. Outline
     • Introduction
     • Structural alignment of RNA (preliminaries, RSmatch algorithm, software)
     • Experiments (RNA motif detection)
     • Multiple structural alignment (RMulti)
     • Combining RSmatch with RNAView
     • Conclusion and future work

  3. Molecule building blocks
     • Protein building blocks:
       – 20 types of amino acid
     • RNA building blocks:
       – Purine: Adenine, Guanine
       – Pyrimidine: Cytosine, Uracil

  4. RNA structure elements
     • An RNA sequence folds to form secondary/tertiary structure
     • The majority of base connections involve two bases
       – Watson-Crick: AU or CG
       – Non-canonical: UG or AG
     • Basic structure elements of RNA [figure]

  5. Definition of structural components
     • Given an RNA sequence:
       – 5’ → 3’: r_1 r_2 r_3 … r_n
     • Two types of structural components [1]:
       – Single bases (blue)
       – Bonded base pairs (red)
     [Figure: example secondary structure with single bases in blue and base pairs in red]
     [1] Zuker, M. (1989) Science

  6. Secondary structure constraint (1)
     • No common base can be shared by any two pairs [2].
       – Bad: “G” is shared by two pairs: A-G and G-C
     [Figure: (a) GOOD, (b) BAD (prohibited)]
     [2] Hofacker, I.L. (2003) NAR

  7. Secondary structure constraint (2)
     • A hairpin element must have at least 3 bases on the loop part [3].
       – Bad: only two bases (A and U) present in the loop
     [Figure: (a) GOOD, (b) BAD (prohibited)]
     [3] Zuker, M. (1991) NAR

  8. Secondary structure constraint (3)
     • Pseudoknots are not included [4]
     [Figure: (a) BAD (prohibited), (b) GOOD (nested structure), (c) GOOD (branching)]
     [4] Mathews, D.H. (1999) JMB
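The three constraints on slides 6 to 8 can be checked mechanically once a structure is given as a list of base pairs. The following is a minimal sketch, not part of the presentation: the function name `check_constraints`, the 0-based `(i, j)` pair encoding and the message strings are all assumptions made for illustration.

```python
def check_constraints(pairs, min_loop=3):
    """Return a list of violation messages for a list of (i, j) base pairs.

    Hypothetical helper: pairs are 0-based positions with i < j.
    """
    problems = []
    pairs = sorted(pairs)
    seen = set()
    for i, j in pairs:
        # Constraint (1): no base may take part in two pairs.
        if i in seen or j in seen:
            problems.append(f"shared base in pair ({i}, {j})")
        seen.update((i, j))
        # Constraint (2): a pair enclosing fewer than min_loop positions
        # cannot close a hairpin with at least min_loop loop bases.
        if j - i - 1 < min_loop:
            problems.append(f"loop too short under pair ({i}, {j})")
    # Constraint (3): two pairs that cross each other form a pseudoknot.
    for index, (i, j) in enumerate(pairs):
        for k, l in pairs[index + 1:]:
            if i < k < j < l:
                problems.append(f"pseudoknot: ({i}, {j}) crosses ({k}, {l})")
    return problems


print(check_constraints([(2, 8)]))            # a single well-formed hairpin
print(check_constraints([(0, 10), (5, 15)]))  # two crossing pairs
```

An empty list means the pair list passes all three checks.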

  9. RNA secondary structure representation schemes
     a. Bond annotation [5]
     b. Arc representation [6]
     c. Tree representation [7]
     d. Nested parenthesis representation [8]
     [5] Shapiro, B. (1990) CABIOS
     [6] Zhang, K. (1999) CPM
     [7] Ma, B. (2002) TCS
     [8] Hofacker, I.L. (2002) JMB
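Two of these schemes convert into each other directly: a nested parenthesis string can be turned into the list of arcs (base pairs) with a stack. This is a minimal sketch under the assumption that `(`, `)` and `.` are the only symbols; the function name `parse_dot_bracket` is hypothetical.

```python
def parse_dot_bracket(structure):
    """Return the sorted 0-based (i, j) pairs of a nested parenthesis string."""
    stack, pairs = [], []
    for position, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(position)          # remember the 5' partner
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {position}")
            pairs.append((stack.pop(), position))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


# Structure string of the first RNA used in the example alignment slide.
arcs = parse_dot_bracket("..((...(((......)))((.(.....))).))..")
print(arcs)
```

Because the input is nested by construction, the resulting arcs never cross.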

  10. Outline
      • Introduction
      • Structural alignment of RNA (preliminaries, RSmatch algorithm, software)
      • Experiments (RNA motif detection)
      • Multiple structural alignment (RMulti)
      • Combining RSmatch with RNAView
      • Conclusion and future work

  11. Extended circle model
      • Circle model [9]:
        – circle 0: G, C, A, G, A, A
        – circle 1: A, A, U, G
        – circle 7: C, C, G, C, G
        – circle 8: G, U, A, U, U, U, C
      • Sequential order between components: G > C > A-U > U > C-G > A-G
      [Figure: example structure partitioned into circles 0 to 8]
      [9] Liu, J. (2005) BMC Bioinformatics

  12. Hierarchical organization
      • Circles are organized in a tree-like hierarchy
      [Figure: the example structure with circles 0 to 8 and the corresponding tree of circles]

  13. Hierarchical relationship between two structural components
      (1) The same circle: e.g. each pair from G, C, G, A-U, G-C, G, A-U
      (2) Descendant/ancestor circles: e.g. pair (G, A-U)
      (3) Cousin circles: e.g. pairs (U, C), (A-U, G-C) and (U, G-C)
      [Figure: panels (1) to (3) highlight the example components on the structure]
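The three relationships can be computed from a nested parenthesis string. The sketch below rests on assumptions that the slides do not spell out: a circle is taken to be the loop closed by the nearest enclosing base pair (with id -1 for the exterior), each component is identified by its 5’ position, and a base pair is counted as a member of the circle that encloses it. The names `circle_of` and `relationship` are hypothetical.

```python
def circle_of(structure):
    """Map each component's 5' position to the id of the circle holding it.

    Assumption: a circle id is the 5' position of the pair closing the loop,
    or -1 for the exterior. The input must be a well-formed nested string.
    """
    stack, circle = [], {}
    for position, symbol in enumerate(structure):
        if symbol == ")":
            stack.pop()                      # leave the loop closed by this pair
        else:
            circle[position] = stack[-1] if stack else -1
            if symbol == "(":
                stack.append(position)       # this pair closes a new inner loop
    return circle


def relationship(structure, x, y):
    """Classify two components given by their 5' positions x and y."""
    circle = circle_of(structure)

    def ancestors(circle_id):
        chain = []
        while circle_id != -1:
            circle_id = circle[circle_id]    # circle holding the closing pair
            chain.append(circle_id)
        return chain

    cx, cy = circle[x], circle[y]
    if cx == cy:
        return "same circle"
    if cx in ancestors(cy) or cy in ancestors(cx):
        return "ancestor/descendant"
    return "cousin"


toy = "..(.(...).(...).)."                   # toy structure, not from the slides
print(relationship(toy, 0, 2))               # exterior base vs. outer pair
print(relationship(toy, 0, 5))               # exterior base vs. base in a hairpin
print(relationship(toy, 5, 11))              # bases in two sibling hairpins
```

Identifying circles by the 5’ position of their closing pair keeps the tree implicit: the parent of a circle is simply the circle that holds its closing pair.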

  14. Partial structure induced by a structural component
      [Figure: a parent structure and the child structure induced by one of its components]

  15. Structural alignment rules (1)
      • A1 precedes A2 iff B1 precedes B2, where A1 and A2 are structural components of one RNA aligned with the structural components B1 and B2 of the other, respectively.
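The precedence rule can be tested on any proposed set of matched components. A minimal sketch, assuming each match is a tuple of two positions (one per RNA) and that the function name `preserves_order` is hypothetical:

```python
def preserves_order(matches):
    """Check that matched positions keep the same 5' to 3' order in both RNAs.

    matches: list of (position_in_rna1, position_in_rna2) tuples.
    """
    ordered = sorted(matches)
    # After sorting by the first RNA, positions in both RNAs must strictly increase.
    return all(
        a1 < a2 and b1 < b2
        for (a1, b1), (a2, b2) in zip(ordered, ordered[1:])
    )


print(preserves_order([(0, 2), (1, 3), (4, 5)]))  # order kept
print(preserves_order([(0, 3), (1, 2)]))          # order swapped in the second RNA
```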

  16. Structural alignment rules (2), for components A1, A2 of RNA 1 and B1, B2 of RNA 2
      (a) Same loop relationship preserved: A1 is in the same loop as A2 iff B1 is in the same loop as B2
      (b) Ancestor/descendant relationship preserved: A1 is ancestor of A2 iff B1 is ancestor of B2
      (c) Cousin relationship preserved: A1 is cousin of A2 iff B1 is cousin of B2

  17. Example alignment
      • All structural alignment rules must be satisfied for a valid alignment
      • In addition, a single base cannot be aligned with a base pair
      First RNA:
        ..((...(((......)))((.(.....))).))..
        GUACGCAGUAAGUCGAUACGCCGUAUUUCGCGGUAA
      Second RNA:
        ..((..((......))(((.......))).))..
        GUUCGAUUUCUCUAAAGAGUAGCUUUCUCGGAAA
      Alignment result:
        ..((...(((......)))((.(..  ...))).))..
        GUACGCAGUAAGUCGAUACGCCGUA--UUUCGCGGUAA
        || ||   |   || | | |  |||  |||| ||| ||
        GUUCGA-UU-UCUCUA-AAGA-GUAGCUUUCUCGGAAA
        ..((.. (( ...... ))(( (.......))).))..
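The row of “|” marks in the alignment result can be regenerated from the two gapped sequence rows. The sketch below assumes a mark is drawn wherever both rows carry the same base in the same column, which is consistent with the groups of marks on the slide; the function name `match_line` is hypothetical.

```python
def match_line(row_a, row_b, gap="-"):
    """Return a string with '|' under every column holding identical bases."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    return "".join(
        "|" if a == b and a != gap else " "
        for a, b in zip(row_a, row_b)
    )


row_1 = "GUACGCAGUAAGUCGAUACGCCGUA--UUUCGCGGUAA"
row_2 = "GUUCGA-UU-UCUCUA-AAGA-GUAGCUUUCUCGGAAA"
print(row_1)
print(match_line(row_1, row_2))
print(row_2)
```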

  18. Dynamic programming algorithm: overview
      [Figure: the first structure, the second structure and the DP scoring table; the highlighted cell holds the best alignment between the partial structures of U and A-U]

  19. Case 1 [figure only]

  20. Case 2 [figure only]

  21. Case 3 [figure only]

  22. Case 4.1 [figure only]

  23. Case 4.2 [figure only]

  24. Example of matching score function
      • Score function of matching two equal-length structural components, i.e.
        \[ g(C_a, C_b) = \begin{cases} 1, & \text{if both } C_a \text{ and } C_b \text{ are single bases and } C_a = C_b \\ 2, & \text{if both } C_a \text{ and } C_b \text{ are base pairs and } C_a = C_b \\ 0, & \text{otherwise} \end{cases} \]
      • Gap penalty equals 0
      • Extending g to the whole set of matched component pairs, our goal is to maximize f(R_1, R_2):
        \[ f(R_1, R_2) = \sum_i g(C_{a_i}, C_{b_i}) \]
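The example score function translates directly into code. In this minimal sketch the encoding is an assumption: a single base is a one-character string and a base pair is a tuple of two such strings. The names `g` and `f` follow the slide; everything else is illustrative.

```python
def g(component_a, component_b):
    """Score one matched pair of components as on the slide (1, 2 or 0)."""
    both_single = isinstance(component_a, str) and isinstance(component_b, str)
    both_pairs = isinstance(component_a, tuple) and isinstance(component_b, tuple)
    if both_single and component_a == component_b:
        return 1
    if both_pairs and component_a == component_b:
        return 2
    return 0


def f(matched_components):
    """Sum g over all matched component pairs; gaps contribute 0."""
    return sum(g(a, b) for a, b in matched_components)


matches = [
    ("A", "A"),                 # identical single bases
    (("G", "C"), ("G", "C")),   # identical base pairs
    ("U", "C"),                 # different single bases
]
print(f(matches))
```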

  25. Cell type 1: single base vs. single base
      Structures being aligned:
        AUACAUGUUC      ..(.....).
        UCAUACAGGUUA    ....(.....).
      Candidate alignments for the cell marked “?”, in the order shown on the slide:
      (C)
          ..(.....) .
        --AUACAUGUU-C
        UCAUACAGGUUA-
        ....(.....).
      (B)
          ..(.....).
        --AUACAUGUUC
        UCAUACAGGUUA
        ....(.....).
      (A)
          ..(.....).
        --AUACAUGUUC-
        UCAUACAGGUU-A
        ....(.....) .
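The three candidates for a single base vs. single base cell (align the two bases with each other, or align either one with a gap) have the same shape as a classic sequence DP. The sketch below is a deliberate simplification and not the RSmatch recurrence: it ignores base pairs entirely, scores 1 for identical bases and 0 for gaps as in the example score function, and the function name `single_base_score` is hypothetical.

```python
def single_base_score(seq_a, seq_b):
    """Best score for aligning two sequences of unpaired bases, gap penalty 0."""
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    table = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        for j in range(1, cols):
            match = 1 if seq_a[i - 1] == seq_b[j - 1] else 0
            table[i][j] = max(
                table[i - 1][j - 1] + match,  # align the two bases with each other
                table[i - 1][j],              # align the base of seq_a with a gap
                table[i][j - 1],              # align the base of seq_b with a gap
            )
    return table[-1][-1]


print(single_base_score("AUACAUGUUC", "UCAUACAGGUUA"))
```

Under these assumptions the result equals the length of the longest common subsequence of the two sequences.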

  26. Cell type 2: base pair vs. single base
      [Figure: the cell marked “?” is filled from two candidates, labeled “first score” and “second score”]

  27. Cell type 2: base pair vs. single base (first score)
      Structures being aligned:
        UCAUACAGGUUA    ....(.....).
        ACAUGUU         (.....)
      Candidate alignments, in the order shown on the slide:
      First candidate:
        ( ..... )
        A-----CAUGU--U
        -UCAUACAGGUUA
        ....(.....).
      Second candidate:
            (.....)
        ----ACAUGUU-
        UCAUACAGGUUA
        ....(.....).

  28. Cell type 2: base pair vs. single base (second score)
      Structures being aligned:
        UCAUACAGGUUA    ....(.....).
        AUACAUGUU       ..(.....)
      Candidate alignments, in the order shown on the slide:
      (A)
        .. (.....)
        AU----ACAUGUU-
        --UCAUACAGGUUA
        ....(.....).
      (B)
        .. (.....)
        --AU--------ACAUGUU
        UCAUACAGGUUA-------
        ....(.....).
      (C)
          ..(.....)
        --AUACAUGUU-
        UCAUACAGGUUA
        ....(.....).
