COMP598: ADVANCED COMPUTATIONAL BIOLOGY RESEARCH & METHODS
RNA-RNA interaction prediction
Jerome Waldispuhl, School of Computer Science, McGill
Adapted from slides by Ivo Hofacker (University of Vienna)

Motivation
• Experimental and bioinformatic methods find novel ncRNAs en masse
• These methods give no hint as to the function of the novel ncRNAs
• Functional characterization of ncRNAs is difficult and slow
• Most ncRNAs function through interaction with other RNAs
• Identifying interaction partners is the easiest approach to learn about possible functions
• Most obvious in the case of miRNA target prediction

Well-known Examples of RNA-RNA Interaction
• microRNAs regulate mRNA translation
• snoRNAs guide methylation and pseudouridylation of rRNA
• Some well-studied bacterial examples:
  • RyhB is transcribed under low Fe, binds several mRNAs of Fe-binding proteins (sdh, sodB) and leads to mRNA degradation
  • GadY interacts with the 3’ UTR of GadX and inhibits its degradation
  • DsrA is expressed at low temperatures and stimulates the translation of RpoS, a transcriptional regulator (sigma factor)
  • OxyS is expressed under oxidative stress and inhibits translation of its targets RpoS and fhlA
• T-box motifs bind uncharged tRNAs to control transcription of aminoacyl-tRNA synthetases

Interaction of OxyS and fhlA
Binding of OxyS to fhlA mRNA makes the ribosome binding site (start codon) inaccessible.

Transcriptional control by T-box Motifs
The concentration of uncharged tRNAs controls transcription of the corresponding aminoacyl-tRNA synthetase.

Challenges
• Few well-studied examples
• Energetics of many interaction motifs are unknown
• Length of the interacting region is often quite small
• Binding is a concentration-dependent process
• Folding kinetics rather than thermodynamics may play a role
• A single small RNA may have many targets
• RNA chaperones such as Hfq may be required for binding
• ncRNAs often act within RNPs; what is the influence of the protein?

Overview of Prediction Strategies
• Co-folding by concatenation of two sequences, e.g. RNAcofold, pairfold, DINAMelt, NUPACK
• Co-folding with pseudoknot-like structures: IRIS
• Using only inter-molecular interaction, i.e. assuming that both molecules are unstructured by themselves: RNAhybrid, RNAduplex, RNAplex
• Combining interaction search with accessibility calculations: RNAup, RNAplfold + RNAplex, OligoWalk

Simple Co-folding of two RNAs
• Poor man’s approach to cofolding:
  • Concatenate two RNAs using a short linker
  • Use conventional folding programs such as mfold
• Proper way:
  • Use a modified folding algorithm that keeps track of the break between the strands
  • Any loop containing the break point is treated specially
  • Implemented in the RNAcofold program of the Vienna RNA package
  • Limited to structures that are pseudo-knot free for the concatenated sequences
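The difference between the two approaches can be illustrated with a toy model. The sketch below is a base-pair maximization (Nussinov-style) recursion, not the energy model used by mfold or RNAcofold. The function name `max_pairs`, the `cut` parameter and the minimum hairpin size of 3 are assumptions made for illustration. The only special treatment of the break point here is that a pair spanning it is exempt from the minimum loop size, because the loop it closes is not a real hairpin.

```python
# Toy cofolding by base-pair maximization (illustrative only).
PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}


def max_pairs(seq, cut=None, min_loop=3):
    """Maximum number of nested base pairs in seq.

    cut is the index of the first nucleotide of the second strand in the
    concatenated sequence (None means a single strand).
    """
    n = len(seq)
    if n == 0:
        return 0
    best = [[0] * n for _ in range(n)]
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case: position j stays unpaired
            for k in range(i, j):   # case: position j pairs with k
                if (seq[k], seq[j]) not in PAIRS:
                    continue
                spans_cut = cut is not None and k < cut <= j
                # Hairpin size limit applies only within a single strand.
                if j - k - 1 < min_loop and not spans_cut:
                    continue
                left = best[i][k - 1] if k > i else 0
                inner = best[k + 1][j - 1] if k + 1 <= j - 1 else 0
                score = max(score, left + 1 + inner)
            best[i][j] = score
    return best[0][n - 1]


# Two strands GGG and CCC, tracked break point vs. linker concatenation.
print(max_pairs("GGGCCC", cut=3))          # break point tracked
print(max_pairs("GGG" + "AAA" + "CCC"))    # poor man's linker approach
print(max_pairs("GGGCCC"))                 # naive concatenation, no linker
```

In this toy case the linker and the tracked break point give the same pair count, while plain concatenation without a linker loses pairs to the hairpin size limit. A real energy model would additionally have to score the linker loop, which is what the modified algorithm avoids.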

Pair Probabilities from RNAcofold

Concentration Dependence of RNA-RNA interactions
Binding processes are always concentration dependent.
For two RNAs we have three reactions in equilibrium:
A + B ⇋ AB
A + A ⇋ AA
B + B ⇋ BB
Compute the concentrations of all five species (two monomers and three dimers).
[Figure: concentration [nmol] of mRNA-siRNA dimer, mRNA dimer, mRNA monomer and siRNA monomer as a function of total siRNA concentration b [nmol]]
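A minimal sketch of how the five equilibrium concentrations could be computed from mass action and mass conservation. The function name `dimer_equilibrium` and the association constants `k_ab`, `k_aa`, `k_bb` are hypothetical inputs chosen for illustration; in practice such constants would be derived from computed binding free energies. This is not the procedure of any particular program.

```python
import math


def dimer_equilibrium(a_total, b_total, k_ab, k_aa, k_bb):
    """Equilibrium concentrations of A, B, AB, AA and BB.

    Mass action:       [AB] = k_ab*[A]*[B], [AA] = k_aa*[A]^2, [BB] = k_bb*[B]^2
    Mass conservation: a_total = [A] + [AB] + 2*[AA]
                       b_total = [B] + [AB] + 2*[BB]
    """
    def free_b(a):
        # Positive root of 2*k_bb*b^2 + (1 + k_ab*a)*b - b_total = 0,
        # written in a form that also works when k_bb == 0.
        p = 1.0 + k_ab * a
        return 2.0 * b_total / (p + math.sqrt(p * p + 8.0 * k_bb * b_total))

    def excess_a(a):
        # Monotonically increasing in a; zero at the equilibrium value of [A].
        return a + k_ab * a * free_b(a) + 2.0 * k_aa * a * a - a_total

    lo, hi = 0.0, a_total
    for _ in range(200):  # bisection on [0, a_total]
        mid = 0.5 * (lo + hi)
        if excess_a(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    a = 0.5 * (lo + hi)
    b = free_b(a)
    return {"A": a, "B": b, "AB": k_ab * a * b, "AA": k_aa * a * a, "BB": k_bb * b * b}


# Hypothetical constants; concentrations and constants must use matching units.
for b_total in (1.0, 5.0, 10.0):
    print(b_total, dimer_equilibrium(5.0, b_total, k_ab=2.0, k_aa=0.1, k_bb=0.05))
```

Scanning `b_total` as in the loop above gives the kind of titration curve the slide plots: the heterodimer grows with the total concentration of the second RNA while the monomer and homodimer of the first RNA are depleted.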

UNAFold: prediction of RNA/DNA hybridization (Dimitrov & Zuker, 2004)
Motivation: Let A and B be two polynucleotide sequences in solution. UNAFold aims to predict the concentrations of single-stranded folded and unfolded A and B, and of the hybridized species AA, BB and AB.
[Figure: allowed configurations]
Principles:
• Simple modification of McCaskill’s algorithm.
• Stacking energies computed from experimental measurements.
Results: reproduces experimental observations.

Sfold: Accessibility prediction through Boltzmann sampling (Ding & Lawrence, 2001)
Principle: sample secondary structures using a stochastic backtracking procedure:
• Estimate the accessibility (probability of not being base paired) of each nucleotide in the sample set.
• Identify the hybridization regions.
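The two steps can be sketched as follows, assuming the sampled structures are already available in dot-bracket notation. The sampler itself (stochastic backtracking) is not implemented here, the three structures are made up, and the function names, the 0.5 threshold and the minimum run length of 4 are illustrative assumptions rather than Sfold's actual parameters.

```python
def unpaired_frequencies(structures):
    """Fraction of sampled structures in which each position is unpaired."""
    if not structures:
        raise ValueError("need at least one sampled structure")
    n = len(structures[0])
    counts = [0] * n
    for s in structures:
        if len(s) != n:
            raise ValueError("all structures must have the same length")
        for pos, ch in enumerate(s):
            if ch == ".":
                counts[pos] += 1
    return [c / len(structures) for c in counts]


def accessible_regions(freqs, threshold=0.5, min_length=4):
    """Runs of positions (0-based, inclusive) with unpaired frequency >= threshold."""
    regions = []
    start = None
    # The sentinel value closes a run that extends to the last position.
    for pos, f in enumerate(list(freqs) + [float("-inf")]):
        if f >= threshold and start is None:
            start = pos
        elif f < threshold and start is not None:
            if pos - start >= min_length:
                regions.append((start, pos - 1))
            start = None
    return regions


# Made-up sample of three structures for a 12-nt sequence.
sample = ["((....))....", "((....))....", "..((....)).."]
freqs = unpaired_frequencies(sample)
print(freqs)
print(accessible_regions(freqs))
```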

Structures (not) Predicted by RNAcofold
[Figure: two example complexes, labeled knot-free and pseudo-knotted]

Predicting more complex Structures
Without restrictions on the allowed structure motifs, RNA-RNA interaction prediction is NP-complete.
• The most general algorithms (Alkan 2006, Pervouchine 2004) allow structures where
  • Intra-molecular pairs form pseudo-knot free structures
  • Inter-molecular pairs are not allowed to cross
• Run time is too slow for most purposes ($O(n^3 \cdot m^3)$)

Fast Interaction Search
Methods for fast interaction search
• Search for sequence complementarity by BLAST
• Better: interaction search using thermodynamics
  • Simplified folding algorithm without intra-molecular pairs
  • Runs in $O(n \cdot m)$ time
  • Used in RNAhybrid (miRNA target prediction), RNAduplex, RNAplex
What’s the effect of neglecting intra-molecular structure?
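A minimal sketch of a duplex search without intra-molecular pairs. It is a local dynamic program over the two sequences with a made-up scoring scheme (`PAIR_ENERGY`, `loop_penalty`) and a bounded loop size (`max_loop`). These names and values are assumptions for illustration and are not the nearest-neighbor energy model used by RNAhybrid, RNAduplex or RNAplex. With the loop size bounded by a constant, the run time is proportional to n · m.

```python
import math

# Made-up pair scores in arbitrary energy units (more negative = more stable).
PAIR_ENERGY = {
    ("G", "C"): -3.0, ("C", "G"): -3.0,
    ("A", "U"): -2.0, ("U", "A"): -2.0,
    ("G", "U"): -1.0, ("U", "G"): -1.0,
}


def best_duplex_energy(query, target, max_loop=2, loop_penalty=1.0):
    """Lowest toy energy of a local duplex between query and target.

    Both sequences are given 5'->3'; the target is reversed so that
    antiparallel pairing runs along the diagonals of the DP matrix.
    """
    rev = target[::-1]
    n, m = len(query), len(rev)
    # energy[i][j]: best duplex whose last pair is (query[i], rev[j]).
    energy = [[math.inf] * m for _ in range(n)]
    best = 0.0  # the empty duplex
    for i in range(n):
        for j in range(m):
            pair = PAIR_ENERGY.get((query[i], rev[j]))
            if pair is None:
                continue
            extend = 0.0  # option: this pair starts a new duplex
            for a in range(max_loop + 1):      # unpaired bases in query
                for b in range(max_loop + 1):  # unpaired bases in target
                    pi, pj = i - 1 - a, j - 1 - b
                    if pi >= 0 and pj >= 0:
                        extend = min(extend, energy[pi][pj] + loop_penalty * (a + b))
            energy[i][j] = pair + extend
            best = min(best, energy[i][j])
    return best


print(best_duplex_energy("GGGG", "CCCC"))   # perfect 4-pair helix
print(best_duplex_energy("GGAGG", "CCCC"))  # helix with a one-base bulge
```

Because neither sequence is allowed to fold on itself, every site is treated as freely available, which is exactly the simplification questioned at the end of the slide.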

Frequency of ncRNA - mRNA Interactions
[Figure: density of interaction free energies, divided into regions I to IV; x axis: free energy of interaction [kcal/mol], y axis: density]
ncRNA-mRNA interaction energies (from RNAduplex). Red: ncRNA candidates from RNAz; grey: shuffled sequences.
Enrichments relative to randomly chosen conserved regions: I: 2.3, II: 1.9, III: 1.4, IV: 1.1

Combining Interaction and Accessibility
Two ingredients for efficient hybridization
• Complementarity
• Accessibility
How to quantify these?
• Complementarity → interaction energy
• Accessibility → probability to be unpaired
[Figure: RNA secondary structure drawings]

RNA Hybridization as a two Step Process
[Figure: free energy diagram of the two steps, labeled $\Delta G_{\mathrm{open}}$, $\Delta G_{\mathrm{duplex}}$ and $\Delta\Delta G$]

Example: ompN and RybB
[Figure: secondary structure drawings with the labels MFE -38.2 kcal/mol, cost of opening 23.6 kcal/mol, and -24 kcal/mol]
Duplex alignment shown on the slide (energy -40.30):
GCCAC-----TGCTTTTCTTTGATGTCCCCATTTT-GTGGA-------GC-CCATCAACCCCGCCATTTCGGTT---CAAG-GTTGGTGGGTTTTTT
AGGTCAAACAACGGC-AGAAACAATATT--TAAAGTCGCCGCACACGACGCGGTCGTCGGT-CGTCTCGGCCCTACTGTTCACGGTTATGAAAAGAAACC-3’

Example: ompN and RybB
[Figure: secondary structure drawings of the two RNAs]
$\Delta G_{\mathrm{open}} = 1.6 + 3.9$ kcal/mol, $\Delta\Delta G = -16$ kcal/mol

The RNAup Approach
[Figure: two strands, one numbered 1 (5’) to n (3’) and the other 1 (5’) to m (3’), with the interacting regions [i..j] and [i*..j*]]
• Compute the probability that a site at $[i..j]$ is unpaired (equivalent to the energy $\Delta G_{\mathrm{open}}$ needed to force it open).
• Consider all possible ways of binding to the region $[i..j]$ to compute the interaction energy $\Delta G_{\mathrm{interact}}$
• Total binding energy is the sum of these contributions: $\Delta\Delta G = \Delta G_{\mathrm{open}} + \Delta G_{\mathrm{interact}}$
• Currently, interactions are restricted to a single region
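A small sketch of how the sum of the two contributions ranks candidate sites. The function name `rank_binding_sites`, the dictionary layout and all numbers are hypothetical and chosen only to show the trade-off; they are not RNAup output.

```python
def rank_binding_sites(sites):
    """Sort candidate sites by total binding energy ddG = dG_open + dG_interact.

    Each site is a dict with keys 'region' (i, j), 'dg_open' and
    'dg_interact', all energies in kcal/mol. Lowest total comes first.
    """
    scored = [(s["dg_open"] + s["dg_interact"], s["region"]) for s in sites]
    return sorted(scored)


# Hypothetical candidates: the first has the stronger interaction energy
# but sits in a structured region that is expensive to open.
candidates = [
    {"region": (10, 25), "dg_open": 12.0, "dg_interact": -20.0},
    {"region": (40, 52), "dg_open": 4.0, "dg_interact": -15.0},
]
for total, region in rank_binding_sites(candidates):
    print(region, total)
```

In this made-up case the more accessible site wins even though its interaction energy is weaker, which is the effect a pure complementarity search cannot capture.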

Computing Accessibility
$\Delta G_{\mathrm{open}}$ is equivalent to the probability that the region $[i..j]$ is unpaired in equilibrium:
$\Delta G_{\mathrm{open}} = -RT \ln P_u[i,j]$
• Constrained folding: $\Delta G_{\mathrm{open}} = \Delta G_{\mathrm{constr}} - \Delta G_{\mathrm{free}}$
• Boltzmann sampling, works for short regions only
• Direct computation by modified folding algorithm
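The conversion between unpaired probability and opening energy can be sketched as below. The function names and the default temperature of 37 °C are assumptions for illustration. The second function reflects one plausible reading of why sampling is limited to short regions: with N sampled structures the smallest nonzero frequency is 1/N, so opening energies above RT ln N cannot be resolved, and longer regions tend to have smaller unpaired probabilities.

```python
import math

GAS_CONSTANT = 0.0019872  # kcal/(mol*K)


def opening_energy(p_unpaired, temperature_c=37.0):
    """dG_open = -RT ln(P_u) in kcal/mol for a region with unpaired probability P_u."""
    if not 0.0 < p_unpaired <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    rt = GAS_CONSTANT * (temperature_c + 273.15)
    return -rt * math.log(p_unpaired)


def max_resolvable_opening_energy(n_samples, temperature_c=37.0):
    """Largest dG_open that a sample of n_samples structures can resolve (RT ln N)."""
    rt = GAS_CONSTANT * (temperature_c + 273.15)
    return rt * math.log(n_samples)


print(opening_energy(0.5))                  # region unpaired half of the time
print(opening_energy(1e-6))                 # rarely unpaired region
print(max_resolvable_opening_energy(1000))  # limit for 1000 sampled structures
```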
