The Brave New World of Non-Coding RNAs, Peter F. Stadler



slide-1
SLIDE 1

The Brave New World of Non-Coding RNAs

Peter F. Stadler

Bioinformatics Group, Dept. of Computer Science & Interdisciplinary Center for Bioinformatics, University of Leipzig
Max-Planck-Institute for Mathematics in the Sciences
RNomics Group, Fraunhofer Institute for Cell Therapy and Immunology
Institute for Theoretical Chemistry, Univ. of Vienna (external faculty)
The Santa Fe Institute (external faculty)

Jena, Aug 2010 Peter F. Stadler (Leipzig) Modern RNA World Jena, Aug 2010 1 / 35

slide-2
SLIDE 2

The Central Dogma

DNA → (transcription) → RNA → (translation) → protein

Only 3% of the non-repetitive part of the genome codes for proteins.
Is all the rest junk DNA?
Are all the repeats just genomic parasites?

slide-3
SLIDE 3

Pervasive Transcription

More than 90% of the non-repetitive genome shows evidence for transcription in at least one direction

The ENCODE Consortium, Nature 447: 799-816 (2007).

slide-4
SLIDE 4

Transcriptome Complexity

[Figure: genome-browser view of chr7 (26.90-27.05 Mb) covering HOXA1 to HOXA13 and EVX1, including the antisense transcript hoxa11-as. Tracks: RNAz predictions (RNAz_set1_50), EvoFold, sno/miRNA, conservation, RepeatMasker, Affymetrix transcription, and GENCODE annotation (putative, mRNA, alternative splicing).]

Hox A cluster.

slide-5
SLIDE 5

Transcriptome Complexity

Science 316: 1484-1488 (2007)

slide-6
SLIDE 6

H. pylori doesn’t read textbooks

Mapping of transcription start sites in Helicobacter pylori
Secondary start sites and pervasive antisense transcription

[Figure: transcription start sites mapped across the cag island (genome positions 564,000-584,000), including cagA and other cag genes.]

Transcription start sites by class:

  • primary: 810
  • secondary: 119
  • internal: 440
  • antisense: 969
  • orphan: 38

Nature 464: 250-255 (2010)

slide-7
SLIDE 7

A New Paradigm of Molecular Biology!

There is no junk!
Most of the human genome is transcribed, and there are good reasons to believe that most of the transcripts have a function.
Most “genes” do not code for proteins.
We have to re-think, and maybe even abandon, the very notion of a gene.
Are these ncRNAs really functional?

slide-8
SLIDE 8

Evidence for ncRNA function

A small number of well-studied transcripts have functions identifiable by genetic methods (e.g. deletion/complementation).

Statistical arguments:

  • Differential regulation
  • Conservation at sequence level
  • Conservation of RNA structure
  • Conservation of splicing patterns
  • Association with (disease) phenotypes
  • Specific processing

slide-9
SLIDE 9

CHD QTL Locus

The majority of QTLs for complex multi-genic diseases fall into non-coding regions

McPherson et al., Science (2007)

Coronary heart disease (CHD) is associated with a 58 kb region on chr. 9p21.
This non-coding locus produces the ANRIL transcript(s).
ANRIL expression is associated with atherosclerosis risk.

Holdt et al., Arterioscler. Thromb. Vasc. Biol. 30: 620-627 (2010)

slide-10
SLIDE 10

Computational RNA Gene Finding

Many (but by no means all) known functional RNAs are structured, i.e. certain base pairing patterns must be conserved.
This implies that substitutions are not random: they must be consistent with base pairs (GC→GU) or even compensate for them (GC→AU).
Empirical observation: known ncRNAs are (a little bit) more stable than genomic background with the same base composition.
IDEA: use this to build a gene finder.
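The difference between consistent and compensatory substitutions can be illustrated with a minimal sketch. The function name `classify_pair_change` and the example pairs are assumptions made for illustration; this is not code from any actual gene finder.

```python
# Canonical RNA base pairs, including the G-U wobble pair.
CANONICAL = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}


def classify_pair_change(ref_pair, obs_pair):
    """Classify how an observed base pair differs from a reference base pair.

    "unchanged"    : same two nucleotides
    "consistent"   : one side mutated, the pair can still form (GC -> GU)
    "compensatory" : both sides mutated, the pair can still form (GC -> AU)
    "disrupting"   : the observed nucleotides cannot pair
    """
    if obs_pair not in CANONICAL:
        return "disrupting"
    changed = sum(1 for r, o in zip(ref_pair, obs_pair) if r != o)
    return ("unchanged", "consistent", "compensatory")[changed]


# Hypothetical observations at one paired position, relative to a G-C pair.
for observed in [("G", "C"), ("G", "U"), ("A", "U"), ("G", "A")]:
    print(observed, classify_pair_change(("G", "C"), observed))
```

A structure-aware gene finder would count such events over all paired columns of an alignment; many consistent and compensatory changes with few disrupting ones point to selection on the structure.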


slide-11
SLIDE 11

RNAz: a gene finder for structured RNA

[Figure: scatter plot of z-score against structure conservation index for 5S rRNA, tRNA, signal recognition particle RNA, RNAseP, U2 spliceosomal RNA and U5 spliceosomal RNA.]

Separation of native ncRNAs from random controls in two dimensions

Proc. Natl. Acad. Sci. USA 102: 2454-2459 (2005)
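The two axes of this plot can be written down in a few lines. The sketch below assumes that folding energies have already been computed elsewhere; all numbers are hypothetical. It uses the usual definitions: the z-score compares a native minimum free energy (MFE) with shuffled controls, and the structure conservation index (SCI) divides the consensus folding energy of an alignment by the mean single-sequence MFE.

```python
from statistics import mean, stdev


def mfe_z_score(native_mfe, shuffled_mfes):
    """z-score of a native MFE against shuffled controls (negative = more stable)."""
    return (native_mfe - mean(shuffled_mfes)) / stdev(shuffled_mfes)


def structure_conservation_index(consensus_mfe, single_mfes):
    """Consensus folding energy divided by the mean of the individual MFEs."""
    return consensus_mfe / mean(single_mfes)


# Hypothetical energies in kcal/mol, invented for illustration only.
native = -32.0
shuffled = [-24.0, -26.0, -22.0, -25.0, -23.0]
z = mfe_z_score(native, shuffled)
sci = structure_conservation_index(-28.0, [-32.0, -30.0, -34.0])
print(f"z = {z:.2f}, SCI = {sci:.2f}")
```

A strongly negative z-score together with an SCI close to 1 corresponds to the region of the plot occupied by native ncRNAs.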


slide-12
SLIDE 12

Structured RNAs in the Human Genome

[Figure, panels a-d: genome-browser views on Chr. 13, Chr. 13 and Chr. 11. Tracks: most conserved noncoding regions (present in at least human/mouse/rat/dog), RNAz structural RNAs (P>0.5), RNAz structural RNAs (P>0.9), RefSeq genes. One zoom shows the miRNA cluster mir-17, mir-19a, mir-19b-1, mir-18, mir-20, mir-92-1; another shows H/ACA and C/D-box snoRNAs (ACA25, ACA32, ACA1, ACA8, ACA18, ACA40, mgh28S-2412, mgh28S-2410).]

Alignment of one miRNA hairpin with its consensus secondary structure:

(((((..((((((..((((((((.((.(((((...(((........)))...))))).)).))))))))...))))))....)))))
GTCAGAATAATGTCAAAGTGCTTACAGTGCAGGTAGTGATATGT-GCATCTACTGCAGTGAAGGCACTTGTAGCATTA-TG-GTGAC  Human
GTCAGAATAATGTCAAAGTGCTTACAGTGCAGGTAGTGATGTGT-GCATCTACTGCAGTGAGGGCACTTGTAGCATTA-TG-CTGAC  Mouse
GTCAGGATAATGTCAAAGTGCTTACAGTGCAGGTAGTGGTGTGT-GCATCTACTGCAGTGAAGGCACTTGTGGCATTG-TG-CTGAC  Rat
GTCAGAGTAATGTCAAAGTGCTTACAGTGCAGGTAGTGATATATAGAACCTACTGCAGTGAAGGCACTTGTAGCATTA-TG-TTGAC  Chicken
GTCAATGTATTGTCAAAGTGCTTACAGTGCAGGTAGTATTATGGAATATCTACTGCAGTGGAGGCACTTCTAGCAATA-CACTTGAC  Zebrafish
GTCTGTGTATTGCCAAAGTGCTTACAGTGCAGGTAGTTCTATGTGACACCTACTGCAATGGAGGCACTTACAGCAGTACTC-TTGAC  Fugu
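The miRNA hairpin alignment on this slide comes with a consensus structure in dot-bracket notation, which is straightforward to parse. The sketch below copies the structure string and the six sequences from the slide and reports, for each consensus base pair, which nucleotide combinations occur across species. The function names are illustrative assumptions and the code is not part of RNAz.

```python
STRUCTURE = "(((((..((((((..((((((((.((.(((((...(((........)))...))))).)).))))))))...))))))....)))))"
ALIGNMENT = {
    "Human":     "GTCAGAATAATGTCAAAGTGCTTACAGTGCAGGTAGTGATATGT-GCATCTACTGCAGTGAAGGCACTTGTAGCATTA-TG-GTGAC",
    "Mouse":     "GTCAGAATAATGTCAAAGTGCTTACAGTGCAGGTAGTGATGTGT-GCATCTACTGCAGTGAGGGCACTTGTAGCATTA-TG-CTGAC",
    "Rat":       "GTCAGGATAATGTCAAAGTGCTTACAGTGCAGGTAGTGGTGTGT-GCATCTACTGCAGTGAAGGCACTTGTGGCATTG-TG-CTGAC",
    "Chicken":   "GTCAGAGTAATGTCAAAGTGCTTACAGTGCAGGTAGTGATATATAGAACCTACTGCAGTGAAGGCACTTGTAGCATTA-TG-TTGAC",
    "Zebrafish": "GTCAATGTATTGTCAAAGTGCTTACAGTGCAGGTAGTATTATGGAATATCTACTGCAGTGGAGGCACTTCTAGCAATA-CACTTGAC",
    "Fugu":      "GTCTGTGTATTGCCAAAGTGCTTACAGTGCAGGTAGTTCTATGTGACACCTACTGCAATGGAGGCACTTACAGCAGTACTC-TTGAC",
}
# DNA alphabet as on the slide; G-T stands in for the G-U wobble pair.
PAIRING = {"GC", "CG", "AT", "TA", "GT", "TG"}


def parse_dot_bracket(structure):
    """Return the sorted list of (i, j) index pairs encoded by a dot-bracket string."""
    stack, pairs = [], []
    for j, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            pairs.append((stack.pop(), j))
    if stack:
        raise ValueError("unbalanced structure")
    return sorted(pairs)


def pair_support(structure, alignment):
    """For each consensus pair, list the distinct column combinations seen."""
    support = {}
    for i, j in parse_dot_bracket(structure):
        combos = {seq[i] + seq[j] for seq in alignment.values()}
        support[(i, j)] = {
            "pairing": sorted(c for c in combos if c in PAIRING),
            "other": sorted(c for c in combos if c not in PAIRING),
        }
    return support


# Print the pairs that are realised by more than one pairing combination.
for (i, j), seen in pair_support(STRUCTURE, ALIGNMENT).items():
    if len(seen["pairing"]) > 1:
        print(i, j, seen["pairing"], seen["other"])
```

Pairs supported by several different pairing combinations are the consistent and compensatory substitutions that structure-based gene finders look for.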


slide-13
SLIDE 13

Structured RNAs in the Human Genome

Mammalian genomes contain ∼10^5 structured RNA motifs.
Statistics of the highest-confidence fraction (∼36000):

[Figure: chart of genomic locations of the predictions. Categories: known gene; < 10 kb from nearest gene; > 10 kb from nearest gene; intron of coding region; 5'-UTR (exon or intron); 3'-UTR (exon or intron). Counts shown: 15380, 16860, 3745, 2866, 2830, 11205.]

Nature Biotech. 23: 1383-1390 (2005)

slide-14
SLIDE 14

Finding mRNA-like ncRNAs

long = contains at least one intron
Predict non-coding transcripts by predicting conserved short introns.

Why introns?

  • Intron evolution is slow and essentially independent of the evolution of the mature sequence
  • Splice sites are often conserved
  • Disruption of correct splicing usually destroys function
  ! Non-coding transcripts do not have randomly placed large in/dels.

Why short introns?

  • Most Drosophila introns are short.
  • They can be accurately predicted (94% with both splice sites correct)

Intron prediction (Lim & Burge 1999): machine learning using patterns of donor, acceptor, intron length, branch point, intron composition
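As a rough illustration of the first step of such a search, the sketch below enumerates candidate short introns as GT...AG intervals within a length window. This is a deliberately naive stand-in, not the Lim & Burge model: it ignores donor and acceptor scores, branch point and composition, and the length bounds are assumptions chosen for the example.

```python
def candidate_short_introns(seq, min_len=40, max_len=100):
    """Yield half-open (start, end) intervals that start with GT and end with AG.

    min_len and max_len are assumed bounds on intron length, not values
    taken from the slides.
    """
    seq = seq.upper()
    donors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GT"]
    for start in donors:
        last_end = min(start + max_len, len(seq))
        for end in range(start + min_len, last_end + 1):
            if seq[end - 2:end] == "AG":
                yield start, end


# Toy sequence: two exonic bases, a 44 nt GT...AG intron, two exonic bases.
toy = "CC" + "GT" + "A" * 40 + "AG" + "CC"
print(list(candidate_short_introns(toy)))
```

A real predictor would score each candidate with the sequence features listed above and keep only high-scoring intervals.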


slide-15
SLIDE 15

mlncRNAs – splice sites


slide-16
SLIDE 16

Intron-prediction pipeline

[Figure: flow chart of the pipeline in three stages, with example alignments of + strand and − strand introns (species labels D.ere, D.mel, D.moj, + 12 insects).]

A. Predict introns in individual insect genomes using intronscan: 1,398,939 predicted introns for D. melanogaster.

B. Retain orthologous intronscan predictions: 498,231 predictions with orthologs.

C. Evaluate characteristic intron evolution from the distributions of positive and negative training samples. Features: donor score variation, acceptor score variation, intron length variation, splice site conservation scores, substitution scores.

Train an SVM with these 5 discriminative features and apply it to 342,785 predictions that overlap no protein-coding gene: 369 conserved introns predicted.

ROC curve (true positive rate against false positive rate) on an independent test set: AUC = 0.983
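The AUC quoted for the independent test set has a simple interpretation: it is the probability that a randomly chosen real intron receives a higher classifier score than a randomly chosen false prediction. A minimal sketch with invented scores and labels (not the data behind the reported value):

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical SVM probabilities for three real introns (1) and three false predictions (0).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
print(round(roc_auc(labels, scores), 3))
```

The pairwise form is quadratic in the number of examples, which is fine for a sketch; rank-based formulas give the same value more efficiently.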


slide-17
SLIDE 17

Evaluation of Introns: Learn Species-Specific Patterns

[Figure, panel A: density distributions of two features for positive and negative training samples: the sum of substitution scores, and the average conservation score (PhastCons) for the regions 8...20 and −20...−8. Panel B: two example multi-species Drosophila alignments with per-position substitution and conservation scores, one classified as a real intron (SVM probability 0.999) and one classified as a false prediction (SVM probability 0.001). Values annotated in the figure: average = 0.002, average = 0.92, sum = 31.6, sum = −3.7.]

slide-18
SLIDE 18

Validation with un-annotated ESTs

[Figure, panels A-E: genome-browser views of five D. melanogaster loci on chr3R, chr2R, chr3L, chr3L and chrX (windows of 300-600 bp) with predicted introns. Tracks: FlyBase protein-coding genes (CG14614, dally), FlyBase noncoding genes (pncr009:3L-RA), D. melanogaster mRNAs from GenBank (AY113603), spliced D. melanogaster ESTs (e.g. CA805633, CA807669, EY198607, BE979091, AI944913), and conservation.]

slide-19
SLIDE 19

Novel conserved ncRNAs in Drosophila

[Figure: PCR validation gels for seven candidates. Lanes: embryo (E), larva (L), pupa (P), male, female and genomic DNA (gen.), with a size marker (M, 100 and 200 bp). Results per lane are given for D. sim, D. mel, D. ere and D. pse as "+", "n.t." or "n.o.".]

Candidates shown:

  • mlncRNA36C10 chr2L:17486220-17486296 +
  • mlncRNA68E3 chr3L:11950619-11950679 -
  • mlncRNA69E2 chr3L:12771073-12771124 -
  • mlncRNA66A2 chr3L:7469008-7469065 +
  • mlncRNA102B1 chr4:285056-285111 +
  • mlncRNA42E5-1 chr2R:2907741-2907800 +
  • mlncRNA42E5-2 chr2R:2906739-2906797 +

11 out of 17 predictions verified by PCR and sequencing.

Expression of transcripts and existence of introns also verified in 3 other fly species.

slide-20
SLIDE 20

Extension to Mammals


slide-21
SLIDE 21

Generation of small RNAs

Many types of small RNAs are produced from longer precursors. In many cases, these precursors are mRNA-like pol-II transcripts.

1. Primary microRNA precursors: miRNAs are processed out of either exons or introns.
   [Figure: primary precursor of the mir-17 cluster (hg18, chr13 miRNA cluster).]

2. snoRNA host genes: vertebrate snoRNAs are produced from introns.
   [Figure: UCSC browser view of the GAS5 locus on chr1 (172099500-172103500) with SNORD74 to SNORD81, SNORD44 and SNORD47, plus conservation and RepeatMasker tracks.]

3. Some piRNA precursors.

4. Affymetrix high-density arrays showed that at least 1% of the human genome produces small RNAs.

Science 316: 1484-1488 (2007) (joint work with Affymetrix)

slide-22
SLIDE 22

microRNAs and their relatives

[Figure: biogenesis pathways of microRNAs and related small RNAs (miRNA, tasiRNA, natsiRNA, rasiRNA, piRNA; 21 nt and 24 nt classes) in nucleus and cytoplasm. Precursors shown: pri-miRNA, pre-miRNA, TAS1&2, TAS3, Tas precursor, sense and anti-sense pi-master transcripts. Factors shown include Drosha/Pasha, Exportin-5, HST, Dicer, DCL1/HYL1, DCL2, DCL4, DRB4, HEN1, RDR2, RDR6, SGS3, AGO1, AGO3, AGO4, AGO7, AUB, PIWI, NRPD1a, NRPD2a, DRD1, DRM1 and DRM2.]

slide-23
SLIDE 23

MicroRNAs: Innovation

Most protein-coding genes are evolutionarily old. For instance, few if any new transcription factors were invented throughout vertebrate evolution.
Small RNAs, in particular microRNAs, however, are readily created de novo.
Is there a link between ncRNA innovation and novelty at the organismal level?

slide-24
SLIDE 24

MicroRNAs: Innovation

Expansion of the Metazoan microRNA Repertoire

[Figure: phylogenetic tree of metazoan species, from Trichoplax adhaerens, Nematostella vectensis and Hydra magnipapillata through nematodes, arthropods, molluscs and annelids to teleosts, Gallus gallus, Mus musculus, Pan troglodytes and Homo sapiens, annotated with the number of microRNA families gained on each branch. Labelled clades: Eumetazoa, Cnidaria, Bilateria, Protostomia, Deuterostomia, Nematoda, Arthropoda, Mollusca, Gastropoda, Annelida, Platyhelminthes, Echinodermata, Urochordata, Vertebrata, Gnathostoma, Teleostomi, Teleostei, Mammalia, Eutheria, Rodentia, Primates.]

BMC Genomics 7: 15 (2006), updated

slide-25
SLIDE 25

MicroRNA Offset RNAs

[Figure: chr3:49032550-49032700 at the hsa-mir-425 locus. Tracks: annotated ncRNAs, read density, blockbuster blocks, Vertebrate Multiz Alignment & PhastCons Conservation (28 species).]

Distribution of short reads at the hsa-mir-425 locus. There are three clearly distinct blocks of reads: the two more abundant ones correspond to miR and miR*, the third one to the 5' moRNA.
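Grouping mapped reads into blocks can be sketched with a simple interval-merging pass. This is not the blockbuster algorithm named in the figure, only a naive illustration of the idea; the read coordinates are hypothetical and relative to the start of the locus.

```python
def group_reads_into_blocks(reads, max_gap=0):
    """Merge half-open (start, end) read intervals into blocks.

    Reads that overlap or lie within max_gap of the current block are merged
    into it. Returns a list of (block_start, block_end, n_reads).
    """
    blocks = []
    for start, end in sorted(reads):
        if blocks and start <= blocks[-1][1] + max_gap:
            block_start, block_end, count = blocks[-1]
            blocks[-1] = (block_start, max(block_end, end), count + 1)
        else:
            blocks.append((start, end, 1))
    return blocks


# Hypothetical reads forming three blocks (a moRNA-like, a miR-like and a miR*-like one).
reads = [
    (0, 22), (1, 22),
    (25, 47), (25, 46), (26, 47), (25, 47), (24, 46),
    (62, 84), (62, 83), (63, 84), (62, 84),
]
for block in group_reads_into_blocks(reads):
    print(block)
```

The read count per block is what separates the abundant miR and miR* blocks from the weaker moRNA block.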

slide-26
SLIDE 26

Offset RNAs are associated with old microRNAs

[Figure: density and cumulative distribution (cdf) of the evolutionary origin of microRNA families across phylogenetic levels (Metazoa, Eumetazoa, Bilateria, Deuterostomia, Chordata, Olfactores, Vertebrata, Gnathostomata, Teleostomi, Tetrapoda, Amniota, Mammalia, Theria, Eutheria, Epitheria, Euarchontoglires, Primates, Haplorrhini, Hominoidea) for three sets: microRNAs with moRNAs (N=39), expressed microRNA families (N=199), all microRNA families (N=277).]

Bioinformatics 25: 2298-2301 (2009)

slide-27
SLIDE 27

Promoter and Termini Associated RNAs

Examples of novel families of small RNAs

High-density tiling array screen of human small RNAs.
More than 1% of the human genome is transcribed into small RNAs.

Science 316: 1484-1488 (2007) (joint work with Affymetrix)

slide-28
SLIDE 28

Regulated specific cleavage of tRNAs

in Aspergillus fumigatus

Nucleic Acids Res. 36, 2677-2689 (2008)

Recently also discovered in mammals by several groups.

slide-29
SLIDE 29

Specific pattern of small RNAs

[Figure: small RNA read patterns for U6, 5S, 7SL, U2, U1, Y3 and Y1.]

slide-30
SLIDE 30

Block patterns are source specific

A first attempt: a random forest classifier to distinguish the block patterns of microRNAs, the two snoRNA classes, and tRNAs.

Confusion matrix (10-fold cross-validation); columns give the class assigned by the classifier:

class     miRNA   H/ACA   C/D   tRNA   other
miRNA       249       2     6      8      21
H/ACA         6       8     5      2       4
C/D          20       3    82     13      22
tRNA      7 12 310 41   (only four values survive for this row)
other        25       4    16     56     312
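Per-class recall can be read directly off the rows of the confusion matrix above. The sketch assumes that rows are true classes and columns are assigned classes, and it leaves out the tRNA row because only four of its five values are recoverable.

```python
CLASSES = ["miRNA", "H/ACA", "C/D", "tRNA", "other"]

# Rows copied from the slide; assumed layout: row = true class, column = assigned class.
ROWS = {
    "miRNA": [249, 2, 6, 8, 21],
    "H/ACA": [6, 8, 5, 2, 4],
    "C/D": [20, 3, 82, 13, 22],
    "other": [25, 4, 16, 56, 312],
}


def recall(true_class):
    """Fraction of examples of true_class that were assigned to true_class."""
    row = ROWS[true_class]
    return row[CLASSES.index(true_class)] / sum(row)


for name in ROWS:
    print(f"{name:6s} recall = {recall(name):.3f}")
```

Under this reading, miRNAs and the "other" class are recovered well, while H/ACA snoRNAs are the hardest class, which fits the small number of H/ACA examples.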


slide-31
SLIDE 31

New RNA Classes from Structural Clustering

Comparison of RNA secondary structures:

  • Structure-enhanced alignments (e.g. stral)
  • Tree-alignment or tree-editing (e.g. RNAforester, MARNA)
  • RNAshapes
  • Variants of the Sankoff algorithm

[Figure: pairwise comparison of sequences A-E by their base pairing probability dot plots (dot.ps), combined into a clustering tree.]

LocARNA: a Sankoff-based local structure alignment tool.
Trick: use thermodynamically most plausible base-pairs only.
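The "trick" amounts to sparsifying the set of base pairs before alignment. A minimal sketch, assuming base pair probabilities have already been computed by a folding program; the cutoff value and the probabilities below are illustrative assumptions, not LocARNA defaults.

```python
def plausible_pairs(pair_probs, min_prob=0.05):
    """Keep only base pairs whose probability reaches the cutoff."""
    return {pair: p for pair, p in pair_probs.items() if p >= min_prob}


# Hypothetical pair probabilities for two short sequences, keyed by (i, j).
probs_a = {(0, 20): 0.95, (1, 19): 0.90, (2, 18): 0.60, (3, 12): 0.01, (5, 9): 0.002}
probs_b = {(0, 22): 0.97, (1, 21): 0.85, (4, 15): 0.03, (6, 11): 0.004}

kept_a = plausible_pairs(probs_a)
kept_b = plausible_pairs(probs_b)

# A Sankoff-style recursion has to consider combinations of pairs from both
# sequences, so pruning each side shrinks the search space multiplicatively.
print(len(probs_a) * len(probs_b), "->", len(kept_a) * len(kept_b))
```

The saving grows with sequence length, because most of the possible pairs of a sequence have negligible probability.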

Joint work with Rolf Backofen’s group


slide-32
SLIDE 32

Clustering Ciona intestinalis RNAz Predictions

[Figure: dendrogram with leaves labelled by tRNA type (Arg/Asn, Arg, Thr, Ile, Phe, Cys, Lys, Gln, Gly, Val, Glu, Pro, Met, Ala, Ser, Leu, Tyr) and a distance scale from 0.0 to 0.4.]

tRNA subtree from a clustering of 3332 ncRNA candidates

slide-33
SLIDE 33

Putative Novel RNA Classes

ci_558117 *** ci_555438 *** ci_554296 ci_557698 ci_555929 *** ci_554730 *** ci_555491 *** ci_554599 *** ci_556562 *** ci_555236 *** ci_554528 *** ci_555486 ci_557864 ci_556204 *** ci_556966 *** ci_557168 ci_556973 *** ci_556971 *** ci_556968 *** ci_556955 *** ci_554931 *** ci_557471 ci_557305 *** ci_555637 ci_556275 *** ci_556105 *** ci_555312 *** ci_556276 *** ci_555555 *** ci_554842 *** ci_554683 *** ci_554678 ci_554324 *** ci_554354 *** ci_557087 *** ci_555122 ci_555447 *** ci_556560 *** ci_555756 *** ci_554903 *** ci_555970-5S ci_555994 *** ci_557058 ci_555492 *** ci_554321 *** ci_556663 *** ci_556021 *** ci_555550 ci_556949 ci_555833 *** ci_555828 *** ci_555456 ci_557837-sc19 ci_555813 *** ci_554098 *** ci_554384 *** ci_555508 ci_554681

A G G G _ C C _ A A U A A A A A G U U U CG A A G C U G C _ _ G A G G _ U U G C A A C C A A A _ C A C C G _ _ U _ C A AC _ U A U A U C A G G A A U _ G U U G A _ U A A U A _ _ U C _A_ A _ _ _ A C A_ U C G C U G C U G C _ CAA U G _ A A C A U C G A U C C G A C G C A G G U U C G C A U G C G _ _ _ U A U U G A A A C U A U A A C A C alidot.ps A G G G _ C C _ A A U A A A A A G U U U C G A A G C U G C _ _ G A G G _ U U G C A A C C A A A _ C A C C G _ _ U _ C A A C _ U A U A U C A G G A A U _ G U U G A _ U A A U A _ _ U C _ A _ A _ _ _ A C A _ U C G C U G C U G C _ C A A U G _ A A C A U C G A U C C G A A G G G _ C C _ A A U A A A A A G U U U C G A A G C U G C _ _ G A G G _ U U G C A A C C A A A _ C A C C G _ _ U _ C A A C _ U A U A U C A G G A A U _ G U U G A _ U A A U A _ _ U C _ A _ A _ _ _ A C A _ U C G C U G C U G C _ C A A U G _ A A C A U C G A U C C G A A G G G _ C C _ A A U A A A A A G U U U C G A A G C U G C _ _ G A G G _ U U G C A A C C A A A _ C A C C G _ _ U _ C A A C _ U A U A U C A G G A A U _ G U U G A _ U A A U A _ _ U C _ A _ A _ _ _ A C A _ U C G C U G C U G C _ C A A U G _ A A C A U C G A U C C G A A G G G _ C C _ A A U A A A A A G U U U C G A A G C U G C _ _ G A G G _ U U G C A A C C A A A _ C A C C G _ _ U _ C A A C _ U A U A U C A G G A A U _ G U U G A _ U A A U A _ _ U C _ A _ A _ _ _ A C A _ U C G C U G C U G C _ C A A U G _ A A C A U C G A U C C G A

cluster152 N=6 MPI=26.40 SCI=0.42

alidot.ps _ _ U _ G _ G _ G _ A _ G A U _ G _ _ A _ G _ A U G A U G U A U G _ A U U _ U G G C _ C A U A U C A G U _ U U A _ U C _ U G U _ A U A A A A _ A A G A U G A A _ C U G U A G _ U U G C A _ A _ A A U U C C A _ A A U G C G U A _ _ _ U _ G U A C C A U A _ _ U _ G _ G _ G _ A _ G A U _ G _ _ A _ G _ A U G A U G U A U G _ A U U _ U G G C _ C A U A U C A G U _ U U A _ U C _ U G U _ A U A A A A _ A A G A U G A A _ C U G U A G _ U U G C A _ A _ A A U U C C A _ A A U G C G U A _ _ _ U _ G U A C C A U A _ _ U _ G _ G _ G _ A _ G A U _ G _ _ A _ G _ A U G A U G U A U G _ A U U _ U G G C _ C A U A U C A G U _ U U A _ U C _ U G U _ A U A A A A _ A A G A U G A A _ C U G U A G _ U U G C A _ A _ A A U U C C A _ A A U G C G U A _ _ _ U _ G U A C C A U A _ _ U _ G _ G _ G _ A _ G A U _ G _ _ A _ G _ A U G A U G U A U G _ A U U _ U G G C _ C A U A U C A G U _ U U A _ U C _ U G U _ A U A A A A _ A A G A U G A A _ C U G U A G _ U U G C A _ A _ A A U U C C A _ A A U G C G U A _ _ _ U _ G U A C C A U A _ _ U _ G _ G _ G _ A _ G A U _ G _ _ A _ G _ A U G A U G U A UG _ A U U _ U G G C _ C A U A UC A G U _ U U A _ U C _ U G U _ A U A A A A_ A A G A U G A A _ C U G U A G_ U U G C A_ A _ A A U U C C A _ A A U G C G U A _ _ _ U _ G U A C C A U A U A G U G A U A A U A A U A U A A U A _

cluster107 N=12 MPI=21.10 SCI=0.29

alidot.ps G C U A U U C U U _ C A _ _ A U _ U U U U A C A _ U A G _ _ A U G _ G U U U U A U G _ G A _ C U G G C U A U U U A U A G A U A A _ A A G _ C U G _ G C _ U A U G _ A U G A A _ G U C A _ C G A A A _ _ U A A U G _ C _ _ G U C _ _ A _ C A _ _ _ U U G A G C U A U U C U U _ C A _ _ A U _ U U U U A C A _ U A G _ _ A U G _ G U U U U A U G _ G A _ C U G G C U A U U U A U A G A U A A _ A A G _ C U G _ G C _ U A U G _ A U G A A _ G U C A _ C G A A A _ _ U A A U G _ C _ _ G U C _ _ A _ C A _ _ _ U U G A G C U A U U C U U _ C A _ _ A U _ U U U U A C A _ U A G _ _ A U G _ G U U U U A U G _ G A _ C U G G C U A U U U A U A G A U A A _ A A G _ C U G _ G C _ U A U G _ A U G A A _ G U C A _ C G A A A _ _ U A A U G _ C _ _ G U C _ _ A _ C A _ _ _ U U G A G C U A U U C U U _ C A _ _ A U _ U U U U A C A _ U A G _ _ A U G _ G U U U U A U G _ G A _ C U G G C U A U U U A U A G A U A A _ A A G _ C U G _ G C _ U A U G _ A U G A A _ G U C A _ C G A A A _ _ U A A U G _ C _ _ G U C _ _ A _ C A _ _ _ U U G A G C U A U U C U U _ C A __ A U _ U U U U A C A _ U A G _ _ A U G_ G U U U U A U G _ G A _ C U GG C U A U U U AU A G A U A A _ A A G _C U G _ G C_ U A U G _ A U G A A _ G U C A _ C G A A A _ _ U A A U G _ C _ _ G U C _ _ A _ C A _ _ _ U U G A G U U U A U A U U A A C A A G U C A A G G U U U A U G U U A U G C G G A G G A C A U

cluster127 N=13 MPI=21.34 SCI=0.18

A G U _ _ A U G _ U G _ U A U C U A U G A A U AU A U U C A U U G A A C C U C A U U A C U U AG C U _ _ A G C C A UC _ G C U A G A U G U G A _ G A A G G A U C C A U G G G U A C U A A U C U A A A A A A A U A A A U A _ A A U A U A U A C A U U A G U C U U A G C G U alidot.ps A G U _ _ A U G _ U G _ U A U C U A U G A A U A U A U U C A U U G A A C C U C A U U A C U U A G C U _ _ A G C C A U C _ G C U A G A U G U G A _ G A A G G A U C C A U G G G U A C U A A U C U A A A A A A A U A A A U A _ A A G U _ _ A U G _ U G _ U A U C U A U G A A U A U A U U C A U U G A A C C U C A U U A C U U A G C U _ _ A G C C A U C _ G C U A G A U G U G A _ G A A G G A U C C A U G G G U A C U A A U C U A A A A A A A U A A A U A _ A A G U _ _ A U G _ U G _ U A U C U A U G A A U A U A U U C A U U G A A C C U C A U U A C U U A G C U _ _ A G C C A U C _ G C U A G A U G U G A _ G A A G G A U C C A U G G G U A C U A A U C U A A A A A A A U A A A U A _ A A G U _ _ A U G _ U G _ U A U C U A U G A A U A U A U U C A U U G A A C C U C A U U A C U U A G C U _ _ A G C C A U C _ G C U A G A U G U G A _ G A A G G A U C C A U G G G U A C U A A U C U A A A A A A A U A A A U A _ A

cluster144 N=4 MPI=28.11 SCI=0.87

alidot.ps C U A A A U U _ U U G U U U U A U U _ _ U U _ A G U U U U C C C U G A A A A U U G _ U G A U U C A U U U A A U G G C C C U C A C U C A A U U G A U U G U C U C A U C _ _ A C A A U _ C G G G A _ A U G A _ _ U U _ G G U U G U A A A G U A A A A G G U C U U G G A C U A A A U U _ U U G U U U U A U U _ _ U U _ A G U U U U C C C U G A A A A U U G _ U G A U U C A U U U A A U G G C C C U C A C U C A A U U G A U U G U C U C A U C _ _ A C A A U _ C G G G A _ A U G A _ _ U U _ G G U U G U A A A G U A A A A G G U C U U G G A C U A A A U U _ U U G U U U U A U U _ _ U U _ A G U U U U C C C U G A A A A U U G _ U G A U U C A U U U A A U G G C C C U C A C U C A A U U G A U U G U C U C A U C _ _ A C A A U _ C G G G A _ A U G A _ _ U U _ G G U U G U A A A G U A A A A G G U C U U G G A C U A A A U U _ U U G U U U U A U U _ _ U U _ A G U U U U C C C U G A A A A U U G _ U G A U U C A U U U A A U G G C C C U C A C U C A A U U G A U U G U C U C A U C _ _ A C A A U _ C G G G A _ A U G A _ _ U U _ G G U U G U A A A G U A A A A G G U C U U G G A C U A A A U U _ U U G U U U U A U U _ _U U _ A G U U U U C C C U G A A AA U U G _ U G A U U C A U U U A A U G G C C CU C A C U C A A U U G AU U G U C U C A U C _ _ A C A A U _ C G G G A _ A U G A _ _ U U _ G G U U G U A A A G U A A A A G GU C U U G G A G U A U U G A U U A G U G U A U A C G C G C G U A U A U C G U A G C

cluster115 N=9 MPI=42.30 SCI=0.71

alidot.ps U G U A A G G _ A U G G G _ _ G U U _ C C A G U G _ U U U U G G C U A A C G G _ A A U U A C _ A U G U G _ U _ U G U A A U A C _ A U G A A A _ _ _ U U C A G _ U A G U _ C A G _ _ A U A U U G _ U U A C C _ C U U _ U A C U _ U G U A C U U G U A A G G _ A U G G G _ _ G U U _ C C A G U G _ U U U U G G C U A A C G G _ A A U U A C _ A U G U G _ U _ U G U A A U A C _ A U G A A A _ _ _ U U C A G _ U A G U _ C A G _ _ A U A U U G _ U U A C C _ C U U _ U A C U _ U G U A C U U G U A A G G _ A U G G G _ _ G U U _ C C A G U G _ U U U U G G C U A A C G G _ A A U U A C _ A U G U G _ U _ U G U A A U A C _ A U G A A A _ _ _ U U C A G _ U A G U _ C A G _ _ A U A U U G _ U U A C C _ C U U _ U A C U _ U G U A C U U G U A A G G _ A U G G G _ _ G U U _ C C A G U G _ U U U U G G C U A A C G G _ A A U U A C _ A U G U G _ U _ U G U A A U A C _ A U G A A A _ _ _ U U C A G _ U A G U _ C A G _ _ A U A U U G _ U U A C C _ C U U _ U A C U _ U G U A C U U G U A A G G _ A U G G G _ _G U U _ CC A G U G _ U UU U G G C U A A C G G _ A A U U A C _ A U G U G _ U _ U G U A A U A C _ AU G A A A _ _ _ U U C A G _ U AG U _ C A G _ _ A U A U U G _ U U A C C _ C U U _ U A C U _ U G U A C U A U U G U A C G U U U G C G C G U G A U U G G U A U C G A U U A _ A G C C G U A C G A U

cluster134 N=8 MPI=22.71 SCI=0.39

[Figure: alidot.ps dot plot]

cluster139 N=6 MPI=25.90 SCI=0.34

Ciona intestinalis: microRNA subtree

[Figure: tree, scale bar 0.1. Leaves: cluster152, cluster144, cluster139, cluster134, cluster127, cluster115, cluster107. Labels: mir-7 candidate, mir-126 candidate, let-7, mir-124-b, mir-124-a]


slide-34
SLIDE 34

Putative Novel RNA Classes

[Figure: tree. Leaves: ci_558069, ci_557306, ci_555831, ci_555830, ci_555401, ci_557531, ci_556678, ci_557415, ci_554789, ci_555710, ci_555454. Clusters: cluster1378, cluster1381, cluster1383]

[Figure: alidot.ps dot plot]

cluster1378 N=5 MPI=34.15 SCI=0.74

[Figure: alidot.ps dot plot]

cluster1381 N=4 MPI=25.36 SCI=0.79

[Figure: alidot.ps dot plot]

cluster1383 N=2 MPI=31.09 SCI=0.89

[Figure: alidot.ps dot plot]

cluster1382 N=9 MPI=26.41 SCI=0.45

[Figure: alidot.ps dot plot]

cluster1384 N=11 MPI=24.96 SCI=0.45
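The cluster captions report N, MPI and SCI for each alignment. Assuming MPI stands for mean pairwise identity (the usual reading in RNAz-style screens; the slides do not spell it out), the following is a minimal sketch of how such a value could be computed from aligned rows. The function names, the gap-handling convention (columns where both rows have a gap are skipped) and the toy alignment are illustrative assumptions, not the pipeline used for these slides.

```python
from itertools import combinations

# Gap symbols; "_" is the gap character seen in the slide residue (assumption).
GAPS = {"-", "_", "."}


def pairwise_identity(a, b):
    """Percent identity of two aligned rows, skipping columns where both are gaps."""
    if len(a) != len(b):
        raise ValueError("aligned rows must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in GAPS and y in GAPS:
            continue  # double-gap columns carry no information
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def mean_pairwise_identity(rows):
    """Average of pairwise_identity over all unordered pairs of rows."""
    pairs = list(combinations(rows, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    toy_alignment = ["GGAUCC", "GGAUCC", "GGCU_C"]  # hypothetical data
    print(f"MPI = {mean_pairwise_identity(toy_alignment):.2f}")
```

Other tools may normalise by a different length (for example the shorter sequence), so values computed this way need not match the captions exactly.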

[Figure: tree, scale bar 0.1. Clusters: cluster1382, cluster1384]

PLoS Comp. Biol. 3: e65 (2007)

slide-35
SLIDE 35

Many, many thanks . . .

Leipzig: Sonja J. Prohaska, Dominic Rose, Jana Hertel, Manja Marz, Claudia & Roman Stocsits, Sven Findeiß, . . .
FH RNomics group: Jörg Hackermüller, Antje Kretzschmar, Kristin Reiche, Kathy Schutt, Kerstin Ullmann, Tine Schulz
Vienna: Stefan Washietl, Ivo L. Hofacker, Christoph Flamm, Andrea Tanzer, Stefan Bernhart, Hakim Tafer, Susanne Rauscher, Caroline Thurner, Christina Witwer, . . .
Halle: Günter Reuter’s Lab
Freiburg: Rolf Backofen, Sebastian Will
Tübingen: Kay Nieselt’s Group
Freiburg: Rolf Backofen’s Lab
Würzburg: Jörg Vogel, Cynthia Sharma
Copenhagen: Jan Gorodkin, Stefan Seemann, Peter Menzel
Affymetrix: Tom Gingeras, Phil Kapranov, et al.
CAS Beijing: Wei Deng and all the others in Runsheng Chen’s Lab
PICB Shanghai: Axel Mosig and Phil Khaitovich and their students (PICB/SIBS)
ASU Tempe: Julian L. Chen and his lab
Stanford: Michael Hiller
ENCODE: Ewan Birney and 102.5 coauthors
