SLIDE 16 Intron-prediction pipeline
A  Predict introns in individual insect genomes using intronscan
   1,398,939 predicted introns
B  Retain orthologous intronscan predictions
   498,231 predictions with orthologs (D.ere, D.mel, D.moj + 12 insects)
C  Evaluate characteristic intron evolution
   Distributions of training samples for: donor score variation, acceptor score variation, intron length variation, conservation scores, splice site substitution scores
Train an SVM with these 5 discriminative features
Apply to 342,785 predictions that overlap no protein-coding gene
369 conserved introns predicted
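The SVM step can be sketched roughly as follows. This is an illustration only, not the pipeline's code: the slide does not name the SVM software or kernel, so a plain linear soft-margin SVM trained by batch subgradient descent stands in here, and the five feature columns are synthetic toy values. The names `decision`, `train_linear_svm`, and `make_toy_data` are hypothetical.

```python
import random


def decision(w, b, x):
    """Signed distance-like score of sample x under the linear model (w, b)."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Minimize lam/2 * |w|^2 + mean hinge loss by batch subgradient descent.

    X: list of feature vectors, y: labels in {+1, -1}.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * decision(w, b, xi) < 1:  # sample violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def make_toy_data(n_per_class=50, seed=0):
    """Synthetic stand-in for positive/negative training samples.

    The five columns only mimic "5 discriminative features";
    the values are made up, not taken from the slide.
    """
    rng = random.Random(seed)
    X, y = [], []
    for label in (+1, -1):
        for _ in range(n_per_class):
            X.append([rng.gauss(label, 0.5) for _ in range(5)])
            y.append(label)
    return X, y


X, y = make_toy_data()
w, b = train_linear_svm(X, y)
predictions = [1 if decision(w, b, x) > 0 else -1 for x in X]
accuracy = sum(p == t for p, t in zip(predictions, y)) / len(y)
print(f"training accuracy on toy data: {accuracy:.2f}")
```

In a real setting the trained model would be applied to held-out predictions rather than scored on its own training data, as the independent test set on the slide suggests.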
[Figure: species labels D.ere, D.mel, D.moj + 12 insects with a + strand intron and a − strand intron; panel C plots the feature distributions (variation, conservation, and substitution scores) for negative and positive training samples]
ROC curve of independent test set: AUC = 0.983 (x-axis: False Positive Rate, y-axis: True Positive Rate)
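As a reminder of what the AUC value measures: it equals the fraction of (positive, negative) sample pairs in which the positive sample receives the higher classifier score. A minimal sketch, assuming labels in {+1, -1} and made-up scores; `roc_auc` is a hypothetical helper, not the evaluation code behind the slide.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise ranking (ties count as 1/2)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == -1]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 3 positives, 3 negatives; 8 of the 9 pairs are ranked correctly.
labels = [1, 1, 1, -1, -1, -1]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)
print(auc)
```

The pairwise loop is quadratic in the number of samples; for large test sets a rank-based formulation gives the same value more cheaply.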
Peter F. Stadler (Leipzig) Modern RNA World Jena, Aug 2010 16 / 35