JOBIM 3 July 2012 Chondrichthyans Teleostomi Scyliorhinus canicula

  1. JOBIM 3 July 2012

  2. Chondrichthyans Teleostomi

  3. Genome sequencing
     - Scyliorhinus canicula (dogfish): ongoing project with Génoscope (started); 3.5 Gbases, Illumina paired-end sequencing, 32x; draft assembly: 3,449,662 contigs, N50: 1,292 bp
     - Callorhinchus milii (elephant shark): draft assembly; 910 Mbases, Sanger + 454, 1.4x; 633,833 contigs, N50: 1,466 bp
     - Leucoraja erinacea (little skate): draft assembly; 3.42 Gbases, Illumina paired-end, 26x; 2,962,365 contigs, N50: 665 bp
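
The N50 values quoted for the three draft assemblies follow the usual definition: the contig length at which contigs of that length or longer cover at least half of the assembled bases. A minimal sketch of that calculation is given here; the function name `n50` and the toy lengths are illustrative assumptions, not taken from the slides.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    if not contig_lengths:
        raise ValueError("need at least one contig")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig down until half of the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths, only to exercise the function.
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = n50(toy_lengths)
print(toy_n50)
```

With several million contigs and an N50 near 1 kb, as reported for S. canicula and L. erinacea, most of a typical pre-miRNA hairpin still fits inside a single contig, which is what the later prediction step relies on.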

  4. Transcriptome project (Peptisan project), sequencing done by Génoscope
     - Libraries for mRNA: two normalised libraries (non-directional / directional); Illumina paired-end sequencing (~412 M, ~316 M); poster on the transcriptome assembly (Pierre Pericard)
     - Two small RNA libraries: adult and embryo; Illumina paired-end sequencing, 51 nt long; to identify miRNA (de novo identification)

  5. Small non-coding RNA
     - Post-transcriptional regulators of mRNA transcripts
     - Discovery of lin-4 in C. elegans in 1993
     - Pre-miRNA structure (hairpin carrying the miRNA and miRNA* strands):

             GAGUAAA UA        UA          GA  U
        5' CCUUG       G  GCAGCACA  AUGGUUUGUG  UU U
           |||||       |  ||||||||  ||||||||||  ||  G
        3' GGAAC       C  CGUCGUGU  UACCGGACGU  AA A
             AUAAAAA UC        UA          GG  A

     - miRNA conservation, miR-143 (columns: miRNA*, loop, miRNA):

        Zebrafish  .....GAUCUACAGUCGUCUGGCCCGCGGUGCAGUGCUGCAUCUCUGGUCAACUGGGAGUC UGAGAUGAAGCACUGUAGCUC GGGAGGACAACACUGUCAGCUC.....
        Medaka     UGGUUCUGGUCCAUCUCUGCUGCCCAUGGUGCAGUGCUGCAUCUCUGGUCAGUUGAUAGUC UGAGAUGAAGCACUGUAGCUC GGGACGGAGGGCAGGAGUCUCAGUCUG
        Xenopus    ............UGUCUCCCAGCCCAAGGUGCAGUGCUGCAUCUCUGGUCAGUUGUGAGUC UGAGAUGAAGCACUGUAGCUC GGGAAGGGGGAAU..............
        Human      .GCGCAGCGCCCUGUCUCCCAGCCUGA GGUGCAGUGCUGCAUCUCUGGU CAGUUGGGAGUC UGAGAUGAAGCACUGUAGCUC AGGAAGAGAGAAGUUGUUCUGCAGC..
        Mouse      ......................CCUGA GGUGCAGUGCUGCAUCUCUGG UCAGUUGGGAGUC UGAGAUGAAGCACUGUAGCUC AGG........................
        Rat        .GCGGAGCGCC.UGUCUCCCAGCCUGA GGUGCAGUGCUGCAUCUCUGG UCAGUUGGGAGUC UGAGAUGAAGCACUGUAGCUC AGGAAGGGAGAAGAUGUUCUGCAGC..
        Cow        ......GCGUCCUGUCUCCCAGCCUGAGGUGCAGUGCUGCAUCUCUGGUCAGUUGGGAGUC UGAGAUGAAGCACUGUAGCUC GGGAAGGGAGAAGUUGUUCUGCAGC..
        Pig        .............GUCCCCCAGCCGGA GGUGCAGUGCUGCAUCUCUGG UCAGCUGGGAGUC UGAGAUGAAGCACUGUAGCUC GGGAAGGGAGA................
        Opossum    ......................CCCGAGGUGCAGUGCUGCAUCUCUGGUCAGUUGUGAGUC UGAGAUGAAGCACUGUAGCUC GGG........................
        Lizard     ...........AUGUCUCCCAGCCCAA GGUGCAGUGCUGCAUCUCUGG UCAGUUGUGAGUC UGAGAUGAAGCACUGUAGCUC GGGAAGGGAGGAAC.............
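
The stem of the pre-miRNA hairpin on this slide can be checked base by base. The sketch below takes the five paired blocks of the 5' and 3' strands as they appear in the diagram and classifies each position as a Watson-Crick pair, a G-U wobble pair, or a mismatch. The block lists are copied from the slide; the helper name `classify_pairs` and the choice to count G-U as a valid pair are assumptions made for illustration.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}

# Paired blocks of the hairpin, in the left-to-right order drawn on the slide.
FIVE_PRIME = ["CCUUG", "G", "GCAGCACA", "AUGGUUUGUG", "UU"]
THREE_PRIME = ["GGAAC", "C", "CGUCGUGU", "UACCGGACGU", "AA"]


def classify_pairs(top, bottom):
    """Count pair types between two equally long strands drawn face to face."""
    if len(top) != len(bottom):
        raise ValueError("paired blocks must have the same length")
    counts = {"watson_crick": 0, "wobble": 0, "mismatch": 0}
    for pair in zip(top, bottom):
        if pair in WATSON_CRICK:
            counts["watson_crick"] += 1
        elif pair in WOBBLE:
            counts["wobble"] += 1
        else:
            counts["mismatch"] += 1
    return counts


# Sum the counts over all five blocks of the stem.
totals = {"watson_crick": 0, "wobble": 0, "mismatch": 0}
for top_block, bottom_block in zip(FIVE_PRIME, THREE_PRIME):
    for kind, n in classify_pairs(top_block, bottom_block).items():
        totals[kind] += n
print(totals)
```

The unpaired bulges and the terminal loop are deliberately left out: only the blocks joined by `|` in the diagram are compared.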

  6. Illumina paired-end sequencing: Adult, Embryo
     - Data cleaning: PRINSEQ, Flash, cutadapt
     - Filters: sequences < 17 nt or > 27 nt; no adaptors; Rfam (rRNA, tRNA, ncRNA)
     - High-quality sequences: 17-27 nt
     - miRDeep2, with the S. canicula draft genome and miRBase 18.0
     - Putative miRNA: mature, star, pre-miRNA
     - Conservation: C. milii genome, R. erinacea genome
     - Validation: MIReNA, CIDmiRNA, Triplet-SVM, randfold, PHDcleav, miRNA SVM, miRNAPred, MFE

  7. Illumina paired-end sequencing: Adult, Embryo
     - Cleaning
       - Data cleaning: PRINSEQ, Flash, cutadapt
       - Filters: sequences < 17 nt or > 27 nt; no adaptors; Rfam (rRNA, tRNA, ncRNA)
       - High-quality sequences: 17-27 nt
     - Prediction
       - miRDeep2, with the S. canicula draft genome and miRBase 18.0
       - Putative miRNA: mature, star, pre-miRNA
     - Validation
       - Conservation: C. milii genome, R. erinacea genome
       - MIReNA, CIDmiRNA, Triplet-SVM, randfold, PHDcleav, miRNA SVM, miRNAPred, MFE

  8. Cleaning
     - Illumina paired-end sequencing: Adult, Embryo
     - Data cleaning: PRINSEQ, Flash, cutadapt
     - Filters: sequences < 17 nt or > 27 nt; no adaptors; Rfam (rRNA, tRNA, ncRNA)
     - High-quality sequences: 17-27 nt
     - Example read pairs (insert followed by adaptor sequence):

        @PHOSPHORE_0144:8:1101:1512:2663#GGCUAC/1  UUCCCAAGACUGUGAAACCCUU UGGAAUUCUCGGGUGCCAAGGAACUCCAG
        @PHOSPHORE_0144:8:1101:1512:2663#GGCUAC/2  AAGGGUUUCACAGUCUUGGGAA GAUCGUCGGACUGUAGAACUCUGAACGUG

        @PHOSPHORE_0144:8:1101:1699:2666#GGCUAC/1  AGGGCCCGGAUAGCUCAGUCGGUAG UGGAAUUCUCGGGUGCCAAGGAACUC
        @PHOSPHORE_0144:8:1101:1699:2666#GGCUAC/2  CUACCGACUGAGCUAUCCGGGCCCU GAUCGUCGGACUGUAGAACUCUGAAC

        @PHOSPHORE_0144:8:1101:1503:2691#GGCUAC/1  GAAUACCAGGUGCAGUAGGCUU UGGAAUUCUCGGGUGCCAAGGAACUCCAG
        @PHOSPHORE_0144:8:1101:1503:2691#GGCUAC/2  AAGCCUACUGCCCCUGGUAUUC GAUCGUCGGACUGUAGAACUCUGAACGUG

     - Each /1 read aligned with the reverse complement of its /2 read:

                                      UUCCCAAGACUGUGAAACCCUU UGGAAUUCUCGGGUGCCAAGGAACUCCAG
        CACGUUCAGAGUUCUACAGUCCGACGAUC UUCCCAAGACUGUGAAACCCUU

                                   AGGGCCCGGAUAGCUCAGUCGGUAG UGGAAUUCUCGGGUGCCAAGGAACUC
        GUUCAGAGUUCUACAGUCCGACGAUC AGGGCCCGGAUAGCUCAGUCGGUAG

                                      GAAUACCAGGUGCAGUAGGCUU UGGAAUUCUCGGGUGCCAAGGAACUCCAG
        CACGUUCAGAGUUCUACAGUCCGACGAUC GAAUACCAGGGGCAGUAGGCUU

     - Tools:
       - PRINSEQ (Schmieder and Edwards 2011, Bioinformatics)
       - Cutadapt (Martin 2011, EMBnet.journal)
       - Flash (Magoč and Salzberg 2011, Bioinformatics)
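
The cleaning logic on this slide (cut the adaptor, compare read /1 with the reverse complement of read /2, keep inserts of 17 to 27 nt) can be sketched in a few lines. This is a deliberately simplified stand-in for what cutadapt, Flash and PRINSEQ do: it uses exact string matching only, whereas the real tools tolerate sequencing errors. The adaptor prefixes are copied from the reads shown on the slide; the function names are hypothetical.

```python
# Adaptor prefixes as they appear after the insert in the example reads.
ADAPTER_R1 = "UGGAAUUCUCGGGUGCCAAGG"
ADAPTER_R2 = "GAUCGUCGGACUGUAGAACUCUGAAC"
MIN_LEN, MAX_LEN = 17, 27  # length window stated on the slide

COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Reverse complement of a sequence written in the RNA alphabet."""
    return rna.translate(COMPLEMENT)[::-1]


def trim_adapter(read, adapter):
    """Return the part of the read before the adaptor, or None if it is absent."""
    index = read.find(adapter)
    return None if index == -1 else read[:index]


def clean_pair(read1, read2):
    """Return the insert if both mates agree exactly and pass the length filter."""
    insert1 = trim_adapter(read1, ADAPTER_R1)
    insert2 = trim_adapter(read2, ADAPTER_R2)
    if insert1 is None or insert2 is None:
        return None  # "no adaptors" reads are filtered out
    if reverse_complement(insert2) != insert1:
        return None  # assumption: exact agreement required in this sketch
    if not MIN_LEN <= len(insert1) <= MAX_LEN:
        return None
    return insert1


# First and third read pairs from the slide.
pair1 = ("UUCCCAAGACUGUGAAACCCUU" "UGGAAUUCUCGGGUGCCAAGGAACUCCAG",
         "AAGGGUUUCACAGUCUUGGGAA" "GAUCGUCGGACUGUAGAACUCUGAACGUG")
pair3 = ("GAAUACCAGGUGCAGUAGGCUU" "UGGAAUUCUCGGGUGCCAAGGAACUCCAG",
         "AAGCCUACUGCCCCUGGUAUUC" "GAUCGUCGGACUGUAGAACUCUGAACGUG")

print(clean_pair(*pair1))
print(clean_pair(*pair3))
```

The third pair is rejected here because the two mates disagree at one position (U in read /1, G in the reverse complement of read /2). On the slide that pair is still shown merged, which suggests the actual merging step resolves such mismatches rather than discarding the pair.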

  9. Reads per library:

                         Embryo         Adult           All
        Initial reads    89,766,100     81,179,402     170,945,502
        Cleaned reads    82,325,424     65,651,400     147,976,824

     [Plot; axis label: Frequency]

  10. Reads per library:

                          Embryo         Adult           All
         Initial reads    89,766,100     81,179,402     170,945,502
         Cleaned reads    82,325,424     65,651,400     147,976,824

      [Plot; axis label: Frequency; labelled: miR-143-3p]
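
The read counts above translate directly into the fraction of reads that survive cleaning. The short sketch below computes that fraction per library and checks that the "All" column is the sum of the two libraries; the variable names are illustrative.

```python
initial = {"Embryo": 89_766_100, "Adult": 81_179_402, "All": 170_945_502}
cleaned = {"Embryo": 82_325_424, "Adult": 65_651_400, "All": 147_976_824}

# Fraction of reads kept after cleaning, per column of the table.
retained = {name: cleaned[name] / initial[name] for name in initial}

for name, fraction in retained.items():
    print(f"{name}: {fraction:.1%} of reads kept")

# Consistency check: the "All" column should equal Embryo + Adult.
totals_consistent = (
    initial["Embryo"] + initial["Adult"] == initial["All"]
    and cleaned["Embryo"] + cleaned["Adult"] == cleaned["All"]
)
```

Roughly 92% of embryo reads and 81% of adult reads pass the cleaning step, so the adult library loses noticeably more reads to the filters.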

  11. Illumina paired-end sequencing: Adult, Embryo
      - Data cleaning: PRINSEQ, Flash, cutadapt
      - Filters: sequences < 17 nt or > 27 nt; no adaptors; Rfam (rRNA, tRNA, ncRNA)
      - High-quality sequences: 17-27 nt
      - Prediction: miRDeep2, with the S. canicula draft genome and miRBase 18.0
      - Putative miRNA: mature, star, pre-miRNA
      - miRDeep2: Friedländer et al. 2008, Nature Biotechnology

  12. Pre-miRNA
      - Structural information
      - miRNA and miRNA* information:
        - both miRNA and miRNA*
        - overexpression of the miRNA vs miRNA*
        - overhang (around 2 nt)
      - Sequence conservation

  13. Modification to miRDeep2
      - Variability of miRDeep2 related to randfold
      - Putative new miRNA:
        - 2445 new miRNA with score >= 0
        - 1103 new miRNA with score >= 5, with 10% expected false positives
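
The two score cut-offs on this slide amount to a simple threshold filter over the candidate list, and the 10% figure gives a rough expected number of spurious candidates at the stricter cut-off. The sketch below assumes that "10% expected false positives" means one candidate in ten at score >= 5 is expected to be false; the candidate names and scores are hypothetical and only exercise the filter.

```python
def filter_by_score(candidates, min_score):
    """Keep the names of candidates whose score is at least min_score."""
    return [name for name, score in candidates if score >= min_score]


def expected_false_positives(n_candidates, false_positive_rate):
    """Expected number of false positives among n_candidates."""
    return n_candidates * false_positive_rate


# Hypothetical (name, score) pairs, not taken from the slides.
toy_candidates = [("cand_a", 7.2), ("cand_b", 0.4), ("cand_c", -1.5), ("cand_d", 5.0)]
lenient = filter_by_score(toy_candidates, 0)
strict = filter_by_score(toy_candidates, 5)

# Applied to the count reported on the slide.
n_false = expected_false_positives(1103, 0.10)
print(lenient, strict, round(n_false))
```

Under that reading, about 110 of the 1103 high-scoring candidates would be expected to be false positives, leaving on the order of a thousand plausible new miRNA.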

  14. Conserved miRNA
      - 170 miRNA identified similar to other species
      - 15 rejected after manual inspection (2 with score > 5)
      - 155 good known miRNA (21 with score < 5)
      - Alignment:

         contig_452580_14256
         NNNNNNNNNNNNNNNNNNNNNNNNNNNNNAACAUUCAACGCUGUCGGUGAGUNNNNNNNNNNNNNNNNNACCAUCGACCGUUGAUUGUACC
         NNNNNNNNNNNNNNNNNNNNGUUUCAGGGAACAUUCAACGCUGUCGGUGAGUUUGAUGCUAUUGGAGAAACCAUCGACCGUUGAUUGUACCUUGUAGC
         GAAUUCUGCUUCGAAUGGUUGCUUCAGUGAACAUUCAACGCUGUCGGUGAGUUUGGAAUUAAAGUAGAAACCAUCGACCGUUGAUUGUACCCUGCGGCAACCACCGUCCU
         NNNNNNNNNNNNNNNNNNNNNNNNNNNNNAACAUUCAACGCUGUCGGUGAGUNNNNNNNNNNNNNNNNNACCAUCGACCGUUGAUUGUACC
         oan-mir-181a (Ornithorhynch)

      - Hairpin structures:

         NNNUNNNNNANNNUNNNNNNCUNNNNNNNANNNNGANGNU
         GUUNCAGGGNACANUCAACGNNGUCGGUGNGUUUNNUNCNA
         |||N|||||N|||N||||||NN|||||||N||||NN|N|
         CGANGUUCCNUGUNAGUUGCNNCAGCUACNCAAANNANGNU
         NNNUNNNNNANNNUNNNNNN--NNNNNNN-NNNNG-NGNU

         GCUU  AA        U   U A   U      CU       A    GGAAU
             CG  UGGUUGCU CAG G ACA UCAACG  GUCGGUG GUUU     U
             ||  |||||||| ||| | ||| ||||||  ||||||| ||||      A
             GC  ACCAACGG GUC C UGU AGUUGC  CAGCUAC CAAA     A
         UCCU  -C        C   C A   U      --       -    GAUGA

  15. Comparison of conserved miRNA with other species
      - C. milii (elephant shark) and L. erinacea (little skate): 131 identified in C. milii, 152 identified in L. erinacea, 154 altogether
      - Previously identified chondrichthyan miRNA (Heimberg et al. 2011): 104 S. canicula miRNA mapped on C. milii scaffolds; all 104 miRNAs identified in S. canicula
      - Alignment of mir-301 (columns: miRNA*, loop, miRNA):

         sca-mir-301               UGUCGGAG GCUCUGACGAUAUUGCACUACU GUACUCACAGU-UAAG CAGUGCAAUAGUAUUGUCAAAGC GUCAGGCACC
         cmi-mir-301               UGUCGGAG GCUCUGACGAUAUUGCACUACU GUCCUCACCGU-UAAG CAGUGCAAUAGUAUUGUCAAAGC GUCAGGCAAC
         ler-mir-301               UGUCGGGC GCUCUGACGAUAUUGCACUACU GUCCGCACAGCUAAAG CAGUGCAAUAGUAUUGUCAAAGC GUCAGGCACC
         hsa-mir-301a         ACUGCUAACGAAU GCUCUGACUUUAUUGCACUACU GUACUUUACAG-CUAG CAGUGCAAUAGUAUUGUCAAAGC AUCUGAAAGCAGG
         mmu-mir-301a         CCUGCUAACGGCU GCUCUGACUUUAUUGCACUACU GUACUUUACAG-CGAG CAGUGCAAUAGUAUUGUCAAAGC AUCCGCGAGCAGG
         pma-mir-301a  CUUGCAAGCCCCUGCUGGAG GCUCUGACACCAUUGCACUACU GUACGCAAUGG-UGAG CAGUGCAAUUGUAUUGUCAAAGC UUCCGUCGGUGAGCCCA

      - Hairpin structure:

             G  G   C         ---      A  GU  U
         UGUC GA GCU UGACGAUAU   UGCACU CU  AC C
         |||| || ||| |||||||||   |||||| ||  ||  A
         ACGG CU CGA ACUGUUAUG   ACGUGA GA  UG C
             A  G   A         AUA      C  AU  A
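
The degree of conservation in the mir-301 alignment can be quantified by counting mismatches between the mature miRNA of S. canicula and that of each other species. The sketch below uses the mature (miRNA column) sequences copied from the slide and a plain Hamming distance; the dictionary layout and function name are illustrative choices.

```python
# Mature miRNA column of the mir-301 alignment, as shown on the slide.
MATURE = {
    "sca-mir-301": "CAGUGCAAUAGUAUUGUCAAAGC",
    "cmi-mir-301": "CAGUGCAAUAGUAUUGUCAAAGC",
    "ler-mir-301": "CAGUGCAAUAGUAUUGUCAAAGC",
    "hsa-mir-301a": "CAGUGCAAUAGUAUUGUCAAAGC",
    "mmu-mir-301a": "CAGUGCAAUAGUAUUGUCAAAGC",
    "pma-mir-301a": "CAGUGCAAUUGUAUUGUCAAAGC",
}


def hamming(a, b):
    """Number of positions at which two equally long sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


reference = MATURE["sca-mir-301"]
distances = {name: hamming(reference, seq) for name, seq in MATURE.items()}
for name, d in distances.items():
    print(f"{name}: {d} mismatch(es) vs sca-mir-301")
```

In this alignment only the pma sequence differs from the S. canicula mature miRNA, and by a single nucleotide, while the loop and flanking regions vary much more between species.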

  16. miRBase miRNA not in data set
      - blastn of all miRBase miRNA against the genome assembly
      - 24 potential new conserved miRNA
      - 2 identified by miRDeep2 but not identified as conserved
      - First example (read counts: 23444 and 522851):

         AAAG-UUCUGUCAUACACUCAGGCU   23444
         UCAGUGCAUCACAGAACUUUGA      522851

         contig_3412856_61753  CUCGAGCU AAAG-UUCUGUCAUACACUCAGGCU GCAGAUACACA-AGG UCAGUGCAUCACAGAACUUUGA UUCGGG
         rno-mir-148b          UUGAGGU GAAG-UUCUGUUAUACACUCAGG CUGUGGCU-CUGA-AAG UCAGUGCAUCACAGAACUUUGU CUCG
         cmi                   CCCAAGCU GAAG-UUCUGUCAUACACUCAGGCU GUAGCUAAUGG-AAG UCAGUGCAUCACAGAACUUUGA CUCGAGAU
         ler                   CUCAAGCC AAAGGUUCUGUCAUACACUUUGGCU CUGUCGCUGGG-AAG UCAGUGCAUGACAGAACUUUG

               C           C  A    CA    GCAGA
         CUCGAG UAAAGUUCUGU AU CACU  GGCU     U
         |||||| ||||||||||| || ||||  ||||
         GGGCUU GUUUCAAGACA UA GUGA  CUGG     A
               A           C  C    --    AACAC

      - Second example (read counts: 1425623 and 19236):

         UGAGAACUGAAUUCCAUGGGC    1425623
         UCCAUAGUAGACAGUUCUCCAG   19236

         contig_2512524_51750  UUCCCAGCUA UGAGAACUGAAUUCCAUGGGC UGGUUGCACACUUUAUUUC-UCAG UCCAUAGUAGACAGUUCUCCAG CUUGGCUGCU
         gga-mir-146c-1        UUCCCAGCUC UGAGAACUGAAUUCCAUGGAC UGGUUUCAAUUCCAUGCGU-UCAG UCCAUGGUAUUCAGUUCUCUAG CUUGGCUGC
         cmi                   CCAGCUG UGAGAACUGAAUUCCAUGGGC UGGUCACGCAGUUUUCUUCCUCAG UCCAUAGUAGUCAGUUCUUCCG UUUGGCUGCU
         ler                   UUCCUGGCUC UGAGAACUGAAUUCCAUGGGC UGGUUGUUCACAUUAUUUC-UCAG UCCAUAGUAG-CAGUUCUCCGG CUUGGCUGCU

         ---UUCCCA   AU        AAUUCC         UUGCACA
                  GCU  GAGAACUG      AUGGGCUGG       C
                  |||  ||||||||      |||||||||
                  CGA  CUCUUGAC      UACCUGACU       U
         UCGUCGGUU   C-        AGAUGA         CUUUAUU
