
Prediction of RNA-RNA Interaction


  1. Prediction of RNA-RNA Interaction
     S.Will, 18.417, Fall 2011
     Can Alkan, Emre Karakoc, Joseph H. Nadeau, S. Cenk Sahinalp, Kaizhong Zhang. RNA-RNA interaction prediction and antisense RNA target search. JCB 2006.
     • define the RNA interaction prediction problem (RIP), with and without pseudoknots (PKs)
     • prove NP-completeness even without PKs, for the base-pair energy model and for more complex models (reduction from “longest common subsequence of multiple binary strings”, mLCS)

  2. Relation between PK-Prediction and RIP
     • RNAcofold: concatenate RNAs A and B, predict PK-free structure
     • specific restrictions on the structure of the interaction complex
     • Can we apply pseudoknot prediction to the concatenation? Difference to the Alkan algorithm?
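The concatenation idea on this slide can be illustrated with a small sketch. It is not RNAcofold: it assumes a Nussinov-style objective (maximize the number of nested base pairs) instead of an energy model, and the function name `max_pairs_cofold` and the `min_loop` parameter are made up for this illustration. The only special handling of the concatenation is that the minimum hairpin-loop length is enforced for pairs within one strand but not for pairs that span the cut point between A and B.

```python
from functools import lru_cache

# Watson-Crick and G-U wobble pairs.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_pairs_cofold(a, b, min_loop=3):
    """Maximum number of nested (pseudoknot-free) base pairs in the concatenation a+b.

    Illustrative assumption: base-pair counting instead of free energies.
    """
    seq = a + b
    cut = len(a)  # positions < cut belong to strand A, the rest to strand B
    n = len(seq)

    def can_pair(i, j):
        if (seq[i], seq[j]) not in PAIRS:
            return False
        same_strand = (i < cut) == (j < cut)
        # A hairpin needs at least min_loop unpaired bases; a pair that
        # joins the two strands closes no hairpin, so it is unrestricted.
        return (not same_strand) or (j - i > min_loop)

    @lru_cache(maxsize=None)
    def best(i, j):
        if i >= j:
            return 0
        score = best(i + 1, j)  # case 1: position i stays unpaired
        for k in range(i + 1, j + 1):  # case 2: position i pairs with k
            if can_pair(i, k):
                score = max(score, 1 + best(i + 1, k - 1) + best(k + 1, j))
        return score

    return best(0, n - 1) if n else 0


print(max_pairs_cofold("GGGG", "CCCC"))
```

Because the recursion only ever builds nested pairs, interaction patterns that would look like a pseudoknot in the concatenated sequence (for example kissing hairpins) are out of reach, which is the restriction the slide alludes to.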

  3. Semiautomatic RNA 3D Structure Modeling
     Bruce A. Shapiro, Yaroslava G. Yingling, Wojciech Kasprzak and Eckart Bindewald. Bridging the gap in RNA structure prediction. Current Opinion in Structural Biology, 2007.

  4. An automated pipeline: MC-Fold/MC-Sym
     Marc Parisien & Francois Major. The MC-Fold and MC-Sym pipeline infers RNA structure from sequence data. Nature 2008.

  5. Potential obstacles
     • Reliability of secondary structure prediction → prediction from alignments, covariance
     • Pseudoknots → pseudoknot prediction → covariance analysis of large multiple alignments
     • Non-canonical base pairs → experimental loop energies? learn from 3D-structures!
     • 3D-motifs (due to non-canonical base pairs) → learn from 3D-structures, isostericity

  6. Non-canonical Base Pairs, 3D-Motifs and Isostericity
     Aurélie Lescoute, Neocles B. Leontis, Christian Massire and Eric Westhof. Recurrent structural RNA motifs, Isostericity Matrices and sequence alignments. NAR 2005.

  7. Non-Canonical Base Pairs
     Leontis, N.B. and Westhof, E. Geometric nomenclature and classification of RNA base pairs. RNA 2001.

  8. Back to MC-Fold/MC-Sym
     • NCMs: Nucleotide Cyclic Motifs from PDB (531 structures)
     • MC-Fold predicts secondary structure including non-canonical base pairs by merging NCMs
     • Probability-based scoring:
       Pr[ structure | seq ] = Pr[ NCMs | seq ] × Pr[ junctions | NCMs ] × Pr[ hinges | junctions ] × Pr[ pairs | hinges ]
     • predict sub-optimals

  9. Prediction Performance of MC-Fold
     Predicted base pairs (%)   RNAsubopt          CONTRAfold           MC-Fold
                                (Thermodynamics)   (Machine learning)   (NCM)
     False positives            6.7                7.5                  17.9
     False negatives            25.2               26.9                 10.1
     True positives             74.8               73.1                 89.9
     Canonicals                 88.4               86.3                 94.7
     Non-canonicals             N/A                1.4                  62.1
     Matthews                   82.8               81.4                 86.6
     Matthews = sqrt( TP/(TP + FN) × TP/(TP + FP) )
     1968 base pairs (1665 Watson-Crick) in 264 hairpins from 182 different PDB structures
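The Matthews row on this slide can be cross-checked from the other rows. The sketch below assumes the formula is the geometric mean of sensitivity TP/(TP + FN) and positive predictive value TP/(TP + FP); the square root is a reconstruction, since the formula is garbled in the transcript. It also assumes the column order RNAsubopt, CONTRAfold, MC-Fold as read from the flattened table. The helper name `matthews_approx` is made up for this example.

```python
import math

# Percentages taken from the table: (true positives, false negatives, false positives).
# Assumption: columns are in the order RNAsubopt, CONTRAfold, MC-Fold.
RESULTS = {
    "RNAsubopt": (74.8, 25.2, 6.7),
    "CONTRAfold": (73.1, 26.9, 7.5),
    "MC-Fold": (89.9, 10.1, 17.9),
}


def matthews_approx(tp, fn, fp):
    """Geometric mean of sensitivity and positive predictive value."""
    sensitivity = tp / (tp + fn)
    ppv = tp / (tp + fp)
    return math.sqrt(sensitivity * ppv)


for method, (tp, fn, fp) in RESULTS.items():
    print(f"{method}: {100 * matthews_approx(tp, fn, fp):.1f}")
```

With this reading, the recomputed values agree with the Matthews row to within about 0.1 percentage points; small differences are expected because the inputs are themselves rounded percentages.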

  10. MC-Sym
      • libraries of 3D-fragments for each NCM
      • solve combinatorial puzzle, satisfy steric/RMSD constraints
      • Las Vegas algorithm (no exhaustive enumeration, could fail to produce a solution)
      • run-time in pipeline: 24h

  11. Example Predictions of MC-Fold/MC-Sym
      [Figure: example predicted structures, panels a to e. Parisien & Major, Nature 2008]

  12. Rfam / Infernal
      • Infernal ("Inference of RNA alignments"): scan genomic data for RNA family members
      • important tool for Rfam
      • Rfam 10.1 (June 2011, 1973 families) http://rfam.sanger.ac.uk/
      • in Rfam: 'hand-curated' seed alignments ⇒ full alignments
      • use Stochastic Context Free Grammars to model RNA families
      • model of a family: Covariance Model (CM)
      input multiple alignment (columns 1 to 28):
        [structure]  .::<<<____>->>:<<-<.___.>>>.
        human        .AAGACUUCGGAUCUGGCG.ACA.CCC.
        mouse        aUACACUUCGGAUG-CACC.AAA.GUGa
        orc          .AGGUCUUC-GCACGGGCAgCCAcUUC.
      [Figure: example structure]
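The `[structure]` line of the input alignment on this slide encodes the consensus base pairs. As a minimal sketch, assuming that `<` and `>` mark the two partners of a nested pair and that every other symbol marks an unpaired or insert column, the pairs can be recovered with a stack. The function name `parse_pairs` is made up for this example, and columns are numbered from 1 as on the slide.

```python
def parse_pairs(structure):
    """Return sorted (left, right) column pairs, 1-based, from a bracket annotation.

    Assumption: only '<' and '>' denote paired columns; all other
    characters are treated as unpaired.
    """
    stack = []
    pairs = []
    for col, symbol in enumerate(structure, start=1):
        if symbol == "<":
            stack.append(col)  # remember the opening column
        elif symbol == ">":
            if not stack:
                raise ValueError(f"unmatched '>' at column {col}")
            pairs.append((stack.pop(), col))  # innermost open column closes first
    if stack:
        raise ValueError(f"unmatched '<' at column {stack[-1]}")
    return sorted(pairs)


annotation = ".::<<<____>->>:<<-<.___.>>>."
for left, right in parse_pairs(annotation):
    print(left, right)
```

For the annotation above this yields two stems, one pairing columns 4 to 6 with columns 11 to 14 and one pairing columns 16 to 19 with columns 25 to 27.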

  13. Infernal: Construct grammatical description
      [Figure: consensus structure with columns 2 to 27; base pairs 4-14, 5-13, 6-11, 16-27, 17-26, 19-25]
      guide tree (node type and node index; alignment columns emitted by the node in parentheses):
        ROOT 1 → MATL 2 (2) → MATL 3 (3) → BIF 4
        left branch:  BEGL 5 → MATP 6 (4, 14) → MATP 7 (5, 13) → MATR 8 (12) → MATP 9 (6, 11) → MATL 10 (7) → MATL 11 (8) → MATL 12 (9) → MATL 13 (10) → END 14
        right branch: BEGR 15 → MATL 16 (15) → MATP 17 (16, 27) → MATP 18 (17, 26) → MATL 19 (18) → MATP 20 (19, 25) → MATL 21 (21) → MATL 22 (22) → MATL 23 (23) → END 24

  14. Infernal
      • Construct CM from guide tree
      • Expand nodes of guide tree: add match, insertion, and deletion states
      • learn transition and output probabilities from alignment
      • CM comparable to profile HMM for protein families (Pfam)
      states per guide-tree node (state type and state index); within a node, the match and delete states form the "split set" and IL/IR are the inserts:
        ROOT 1:  S 1, IL 2, IR 3
        MATL 2:  ML 4, D 5, IL 6
        MATL 3:  ML 7, D 8, IL 9
        BIF 4:   B 10
        BEGL 5:  S 11
        MATP 6:  MP 12, ML 13, MR 14, D 15, IL 16, IR 17
        MATP 7:  MP 18, ML 19, MR 20, D 21, IL 22, IR 23
        MATR 8:  MR 24, D 25, IR 26
        MATP 9:  MP 27, ML 28, MR 29, D 30, IL 31, IR 32
        MATL 10: ML 33, D 34, IL 35
        MATL 11: ML 36, D 37, IL 38
        MATL 12: ML 39, D 40, IL 41
        MATL 13: ML 42, D 43, IL 44
        END 14:  E 45
        BEGR 15: S 46, IL 47
        MATL 16: ML 48, D 49, IL 50
        MATP 17: MP 51, ML 52, MR 53, D 54, IL 55, IR 56
        MATP 18: MP 57, ML 58, MR 59, D 60, IL 61, IR 62
        MATL 19: ML 63, D 64, IL 65
        MATP 20: MP 66, ML 67, MR 68, D 69, IL 70, IR 71
        MATL 21: ML 72, D 73, IL 74
        MATL 22: ML 75, D 76, IL 77
        MATL 23: ML 78, D 79, IL 80
        END 24:  E 81
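The node expansion on this slide can be sketched as a table lookup. The mapping from node type to state types below is read off the state labels on the slide (for example, a MATP node expands to MP, ML, MR, D, IL, IR). The names `STATES_BY_NODE`, `GUIDE_TREE` and `expand` are made up for this illustration, and the sketch covers only state numbering, not transitions or probabilities.

```python
# State types created for each guide-tree node type, as shown on the slide.
STATES_BY_NODE = {
    "ROOT": ["S", "IL", "IR"],
    "MATP": ["MP", "ML", "MR", "D", "IL", "IR"],
    "MATL": ["ML", "D", "IL"],
    "MATR": ["MR", "D", "IR"],
    "BIF": ["B"],
    "BEGL": ["S"],
    "BEGR": ["S", "IL"],
    "END": ["E"],
}

# Guide tree of the example family, in node-index order (nodes 1 to 24).
GUIDE_TREE = [
    "ROOT", "MATL", "MATL", "BIF",
    "BEGL", "MATP", "MATP", "MATR", "MATP",
    "MATL", "MATL", "MATL", "MATL", "END",
    "BEGR", "MATL", "MATP", "MATP", "MATL",
    "MATP", "MATL", "MATL", "MATL", "END",
]


def expand(guide_tree):
    """Return (state_index, state_type, node_type, node_index) tuples, all 1-based."""
    states = []
    for node_index, node_type in enumerate(guide_tree, start=1):
        for state_type in STATES_BY_NODE[node_type]:
            states.append((len(states) + 1, state_type, node_type, node_index))
    return states


states = expand(GUIDE_TREE)
print(len(states), "states for", len(GUIDE_TREE), "nodes")
```

Numbering the states in node order reproduces the indices on the slide, for instance MP 12 in MATP 6 and E 81 in END 24.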
