ComiR: A New Efficient Tool for Predicting Multiple miRNA Targets


slide-1
SLIDE 1

ComiR: A New Efficient Tool for Predicting Multiple miRNA Targets

Claudia Coronnello, PhD

  • Dept. of Computational and Systems Biology, UPMC

Mentor: Panayiotis (Takis) Benos, AP

slide-2
SLIDE 2

Agenda

  • Introduction to miRNAs and to existing miRNA target prediction tools

  • Why develop ComiR?
  • ComiR: the algorithm
  • ComiR: training and validation
  • Next steps
  • Conclusion
slide-3
SLIDE 3

miRNA

  • Mature miRNA: 20-24 nucleotides
  • Function: translation regulation
  • RISC assembly: Argonaute protein, mature miRNA and mRNA target
  • Binding sites are mostly located in 3’ UTR regions
  • One miRNA regulates many genes and vice versa

slide-4
SLIDE 4

miRNA target prediction tools

Tool: features

  • PITA: considers the difference between the free energy gained from the formation of the miRNA-target duplex and the energetic cost of unpairing the target to make it accessible to the miRNA.
  • miRanda: sequence binding, thermodynamics-based miRNA-mRNA duplex prediction and comparative sequence analysis.
  • TargetScan: thermodynamics-based miRNA-mRNA duplex prediction and comparative sequence analysis; focus on the seed region.
  • mirSVR: regression method that predicts the likelihood of target mRNA down-regulation from sequence and structure features of predicted microRNA/mRNA target sites.

All the existing tools focus on single-miRNA target prediction.

slide-5
SLIDE 5

Why develop ComiR?

  • Endogenous miRNAs typically number on the order of tens.
  • Biological question: which genes are regulated by a set of miRNAs?
  • With a naïve application of existing tools, the set of predicted targets is not informative (too big or too small).
  • The sets of targets predicted by different tools often agree poorly.
  • ComiR is a target prediction tool designed to predict the targets of a set of miRNAs.
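The "too big or too small" problem can be illustrated with a toy example. The sketch below uses hypothetical miRNA names and gene sets (none of them come from the talk) and shows the two naïve ways of combining single-miRNA predictions: the union keeps growing as miRNAs are added, while the intersection shrinks toward the empty set.

```python
# Hypothetical single-miRNA predictions: miRNA -> set of predicted target genes.
predicted_targets = {
    "miR-a": {"gene1", "gene2", "gene3", "gene4"},
    "miR-b": {"gene3", "gene4", "gene5", "gene6"},
    "miR-c": {"gene4", "gene7", "gene8"},
}


def naive_union(predictions):
    """Genes predicted as a target of at least one miRNA in the set."""
    return set().union(*predictions.values())


def naive_intersection(predictions):
    """Genes predicted as a target of every miRNA in the set."""
    return set.intersection(*predictions.values())


union = naive_union(predicted_targets)
common = naive_intersection(predicted_targets)
print(len(union), "genes in the union;", len(common), "in the intersection")
```

With tens of miRNAs instead of three, the union would likely approach the whole list of candidate genes and the intersection would likely be empty, which is the motivation for a combined score.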

slide-6
SLIDE 6

ComiR: the algorithm

Input

  • List of miRNAs
  • Expression of miRNAs

Example:

miRNA    hsa-miR-1    hsa-miR-2    hsa-miR-3    …
m        0.5          0.01         0.25         …

slide-7
SLIDE 7

ComiR: the algorithm

Pipeline: Input (list of miRNAs, expression of miRNAs) -> Single-miRNA target prediction (miRanda, mirSVR, TargetScan, PITA)

Target predictions are computed separately for each single miRNA i of the input list, using pre-computed values.

Scores from the single-miRNA predictions: E_ijk, DG_ijk, N_ik, FC_ijk

  • i: miRNAs
  • k: genes
  • j: multiple miRNA-gene binding sites

slide-8
SLIDE 8

ComiR: the algorithm

Pipeline: Input (list of miRNAs, expression of miRNAs) -> Single-miRNA target prediction (miRanda, mirSVR, TargetScan, PITA) -> Score combination (FD or w-sum, depending on the tool)

FD: Fermi-Dirac model (1)

$$S_k = \sum_{i \in \mathrm{miRs}} \sum_{j=1}^{N_{ik}} \frac{1}{1 + e^{(S_{ijk} - RT\,m_i)/RT}}$$

w-sum: weighted sum

TargetScan:

$$S_k = \sum_{i \in \mathrm{miRs}} m_i N_{ik}$$

mirSVR:

$$S_k = \sum_{i \in \mathrm{miRs}} \sum_{j=1}^{N_{ik}} m_i\, FC_{ijk}$$

(1) Zhao, Y., D. Granas, and G.D. Stormo, Inferring binding energies from selected binding sites. PLoS Comput Biol, 2009. 5(12): p. e1000590.
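The score-combination step on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the ComiR implementation: the function names, the data layout (dictionaries keyed by miRNA, one gene k at a time), the value of `RT` and the toy scores are all assumptions, and the Fermi-Dirac exponent is read as (S_ijk − RT·m_i)/RT, as the slide text suggests.

```python
import math

# Expression levels m_i, taken from the input example on the earlier slide.
expression = {"hsa-miR-1": 0.5, "hsa-miR-2": 0.01, "hsa-miR-3": 0.25}

RT = 0.6  # assumed constant; the slide does not give a value


def fermi_dirac_score(site_scores, expression, rt=RT):
    """FD combination for one gene k.

    site_scores maps each miRNA i to the list of per-site scores S_ijk.
    """
    total = 0.0
    for mirna, scores in site_scores.items():
        m_i = expression[mirna]
        for s_ijk in scores:
            total += 1.0 / (1.0 + math.exp((s_ijk - rt * m_i) / rt))
    return total


def targetscan_score(site_counts, expression):
    """Weighted sum for one gene k: sum over miRNAs of m_i * N_ik."""
    return sum(expression[mirna] * n_ik for mirna, n_ik in site_counts.items())


def mirsvr_score(site_fc, expression):
    """Weighted sum for one gene k: sum over miRNAs and sites of m_i * FC_ijk."""
    return sum(expression[mirna] * fc
               for mirna, fcs in site_fc.items()
               for fc in fcs)


# Toy values for a single hypothetical gene.
print(fermi_dirac_score({"hsa-miR-1": [-1.2, 0.4], "hsa-miR-3": [-0.3]}, expression))
print(targetscan_score({"hsa-miR-1": 2, "hsa-miR-2": 1}, expression))
print(mirsvr_score({"hsa-miR-1": [-0.2, -0.4]}, expression))
```

Each function returns one combined score S_k per tool, so a gene ends up described by a small vector of scores rather than by one prediction per miRNA.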
slide-9
SLIDE 9

ComiR: the algorithm

Pipeline: Input (list of miRNAs, expression of miRNAs) -> Single-miRNA target prediction (miRanda, mirSVR, TargetScan, PITA) -> Score combination (FD, w-sum) -> Tools integration (SVM)

  • Support Vector Machine classifier
  • Training
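To make the integration step concrete, here is a minimal sketch of a linear SVM trained on four combined scores per gene. It is a stand-in, not the classifier used by ComiR: the slides do not state the kernel or the training procedure, so the linear model, the batch subgradient descent on the hinge loss, the hyperparameters and the toy data are all assumptions.

```python
def train_linear_svm(X, y, lr=0.05, lam=0.01, epochs=3000):
    """Fit weights w and bias b by batch subgradient descent on the
    L2-regularised hinge loss. Labels in y must be +1 or -1."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regulariser
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # this point violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def decision_value(w, b, x):
    """Signed score of a gene; larger values mean 'more target-like'."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


# Hypothetical training data: one row per gene, one column per tool score.
X_train = [
    [0.9, 0.8, 0.9, 0.7], [0.8, 0.9, 0.7, 0.9], [0.7, 0.8, 0.8, 0.8],  # targets
    [0.1, 0.2, 0.1, 0.3], [0.2, 0.1, 0.3, 0.1], [0.3, 0.2, 0.2, 0.2],  # non-targets
]
y_train = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(X_train, y_train)
for gene_scores in ([0.85, 0.85, 0.85, 0.85], [0.15, 0.15, 0.15, 0.15]):
    print(gene_scores, decision_value(w, b, gene_scores))
```

Sorting genes by `decision_value` would give a ranking from most to least target-like; in practice a library SVM with a validated kernel choice would be the more likely tool.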
slide-10
SLIDE 10

ComiR: the algorithm

Pipeline: Input (list of miRNAs, expression of miRNAs) -> Single-miRNA target prediction (miRanda, mirSVR, TargetScan, PITA) -> Score combination (FD, w-sum) -> Tools integration (SVM) -> Output (gene ranking)

slide-11
SLIDE 11

ComiR: the algorithm

ComiR pipeline: Input (list of miRNAs, expression of miRNAs) -> Single-miRNA target prediction (miRanda, mirSVR, TargetScan, PITA) -> Score combination (FD, w-sum) -> Tools integration (SVM) -> Output (gene ranking)

slide-12
SLIDE 12

ComiR: training

  • Input
    – Set of miRNAs
  • Known output
    – Targets -> positive set
    – Non-targets -> negative set

slide-13
SLIDE 13

ComiR: training set

AGO1 IP experiment in D. melanogaster S2 cells:

  • 28 miRNAs (known expression level)
  • 142 mRNAs over-expressed in IP and up-regulated after AGO1 depletion (POSITIVE SET)
  • 142 mRNAs not over-expressed in IP and not up-regulated after AGO1 depletion (NEGATIVE SET)

Hong, X., et al., Immunopurification of Ago1 miRNPs selects for a distinct class of microRNA targets. Proceedings of the National Academy of Sciences of the United States of America, 2009. 106(35)

Figure: mRNA counts (142, 287, 949, 4907) cross-classified by AGO1 IP (enriched / not-enriched) and AGO1 depletion (down / up).

slide-14
SLIDE 14

ComiR: validation

  • Test set
    – Set of miRNAs
    – Targets -> positive set
    – Non-targets -> negative set
  • Calculate sensitivity (SN) and specificity (SP)

$$SN = \frac{TP}{TP + FN} \qquad SP = \frac{TN}{TN + FP}$$

TP: number of true positives
FN: number of false negatives
TN: number of true negatives
FP: number of false positives
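As a worked example of the two formulas, the sketch below counts TP, FN, TN and FP from a list of known labels and a list of predictions, then computes SN and SP. The labels and predictions are made-up toy values, and the function names are illustrative only.

```python
def confusion_counts(labels, predictions):
    """Return (TP, FN, TN, FP) for binary labels/predictions coded as 1 and 0."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    return tp, fn, tn, fp


def sensitivity(tp, fn):
    """SN = TP / (TP + FN): fraction of true targets that are recovered."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """SP = TN / (TN + FP): fraction of non-targets that are rejected."""
    return tn / (tn + fp)


# Toy test set: 4 targets (1) and 4 non-targets (0).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
predictions = [1, 1, 1, 0, 0, 0, 1, 0]

tp, fn, tn, fp = confusion_counts(labels, predictions)
print("SN =", sensitivity(tp, fn), "SP =", specificity(tn, fp))
```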

slide-15
SLIDE 15

ComiR: validation

  • Tool performance is compared by ROC curve analysis.
  • Calculate SN and SP at different threshold levels
  • Calculate the area under the resulting curve (AUC)
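The ROC procedure can be sketched as follows: sweep a threshold over the prediction scores, record the false positive rate (1 − SP) and the true positive rate (SN) at each threshold, and integrate the resulting curve with the trapezoidal rule. The scores and labels below are toy values, and the function names are illustrative assumptions.

```python
def roc_points(scores, labels):
    """Return (1 - SP, SN) points obtained by lowering the threshold
    through every distinct score. Labels are 1 (target) or 0 (non-target)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy example: two targets and two non-targets, one of them mis-ranked.
scores = [0.9, 0.4, 0.6, 0.2]
labels = [1, 1, 0, 0]
print(auc(roc_points(scores, labels)))
```

An AUC of 1.0 means every target is scored above every non-target, while 0.5 corresponds to a random ranking.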

slide-16
SLIDE 16

ComiR: validation

Self-test

142 top expressed

ROC – self training set

Algorithm     AUC
ComiR         0.864
PITA          0.713
MIRANDA       0.694
mirSVR        0.801
Targetscan    0.792

ROC – extended set

Algorithm     AUC
ComiR         0.724
PITA          0.647
MIRANDA       0.616
mirSVR        0.663
Targetscan    0.644

slide-17
SLIDE 17

ComiR: training and validation

Training set Test sets

Normalization

slide-18
SLIDE 18

ComiR: validation

H. sapiens HEK293 cells, PAR-CLIP protocol, AGO1 IP

  • 27 top expressed miRNAs
  • Positive test set: 2083 genes with CCR matching the top 27 miRs in 3’ UTR sequence
  • Negative test set: 2083 genes without CCR matching the top 27 miRs, with the highest average expression.

slide-19
SLIDE 19

ComiR: validation

ROC – H. sapiens test set

Algorithm     AUC
ComiR         0.774
PITA          0.649
MIRANDA       0.626
mirSVR        0.65
Targetscan    0.601

slide-20
SLIDE 20

Conclusion

  • ComiR outperforms existing tools in predicting the targets of a set of miRNAs, as validated by IP or PAR-CLIP experiments.
  • ComiR outperforms existing tools in predicting single miRNA targets, tested on experimentally validated miRNA-mRNA pairs.

slide-21
SLIDE 21

Next steps

  • Develop a web interface to run custom ComiR target predictions on any set of miRNAs
  • Improve ComiR predictions by including information on the location of the predicted target site in the mRNA sequence.

slide-22
SLIDE 22

Acknowledgment

  • I. Rakova, PhD
  • M. Butterworth, PhD
  • G. Stormo, PhD

Benos lab

  • N. Kaminski, MD
  • L. Huleihel
  • K. Pandit, PhD
slide-23
SLIDE 23

Questions