
Transcriptomics 101. Nicole Cloonan, Winter School, 5th July 2011


  1. Transcriptomics 101. Nicole Cloonan, Winter School, 5th July 2011

  2. Transcriptional Complexity
     - Labels: Mutations; Allelic Expression; RNA Editing; PASR; miRNA; TASR; tiRNA
     - [Figure: gene-model diagram with multiple transcription start sites, translation start sites, and polyadenylation sites]
     - Legend: genomic DNA; microRNAs; spliced intron; protein coding regions; non-coding regions; TSS = transcription start site; ATG = translation start site; pA = polyadenylation signal; AAA = polyadenylation

  3. Presentation Outline
     - Introduction: Transcriptional complexity; Surveying transcriptional complexity with microarrays; RNA-seq
     - RNAseq: RNAseq Mapping; Novel exon-junction discovery
     - RNAseq post-mapping analysis: Uniqueome; How to measure a transcript; Transcript assembly
     - miRNAseq: isomiRs; Information content; Expression Thresholding
     - Conclusions: Things to consider; Take home messages

  4. Tag sequencing
     - Methods: SAGE, CAGE, MPSS, PET
     - [Figure: gene-model diagram with TSS, ATG, pA and AAA markers]

  5. Microarrays
     - Platforms: microarray, exon arrays, exon-junction arrays
     - [Figure: gene-model diagram with TSS, ATG, pA and AAA markers]

  6. RNAseq
     - [Figure: gene-model diagram with TSS, ATG, pA and AAA markers]
     - Cloonan et al. Nat Methods 2008; 5:613-619

  7. Presentation Outline (repeat of slide 3)

  8. RNAseq Mapping
     - The fastest alignment methods are ungapped… but what about junctions?
     - [Figure: gene-model diagram with TSS, ATG, pA and AAA markers]
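
To make the junction problem concrete, here is a minimal Python sketch. The sequences and the helper name `ungapped_hit` are made up for illustration and are not from the talk. A read that spans an exon-exon junction has no contiguous match in the genome, so an exact ungapped search misses it even though each half matches on its own.

```python
# Toy genome: exon 1, an intron, then exon 2 (all sequences are made up).
EXON1 = "ACGTTGCATCGA"
INTRON = "GTAAGTCCCCCCTTTTTTCAG"
EXON2 = "TTAGGCATCCGA"
GENOME = EXON1 + INTRON + EXON2

# The spliced transcript joins the two exons directly.
TRANSCRIPT = EXON1 + EXON2


def ungapped_hit(read: str, reference: str) -> int:
    """Return the 0-based position of an exact, ungapped match, or -1."""
    return reference.find(read)


# A read taken from inside exon 1 maps without gaps.
exonic_read = TRANSCRIPT[2:10]
# A read straddling the exon 1 / exon 2 boundary (4 nt on each side).
junction_read = TRANSCRIPT[len(EXON1) - 4 : len(EXON1) + 4]

print("exonic read:", ungapped_hit(exonic_read, GENOME))
print("junction read:", ungapped_hit(junction_read, GENOME))
print(
    "junction halves:",
    ungapped_hit(junction_read[:4], GENOME),
    ungapped_hit(junction_read[4:], GENOME),
)
```

The junction read returns -1 against the genome because the intron separates its two halves. The junction-discovery strategies on the following slides address exactly this case.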

  9. Novel exon-junction discovery (systematic)
     - Pros: computationally easy
     - Cons: does not find all novel splicing
     - [Figure: gene-model diagram with TSS, ATG, pA and AAA markers]
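
One way to read "systematic" is to enumerate every ordered pair of known exons within a gene and build a library of junction sequences that an ungapped aligner can map against. The sketch below assumes that reading; the exon sequences, the `flank` parameter, and the function name `junction_library` are hypothetical.

```python
from itertools import combinations


def junction_library(exons: list[str], flank: int) -> dict[tuple[int, int], str]:
    """Build junction sequences for every ordered exon pair (i < j).

    Each entry joins the last `flank` bases of exon i to the first `flank`
    bases of exon j, so a read must cross the junction to match it.
    """
    library = {}
    for i, j in combinations(range(len(exons)), 2):
        library[(i, j)] = exons[i][-flank:] + exons[j][:flank]
    return library


# Hypothetical exons of one gene, listed in genomic order.
exons = ["ACGTTGCATCGA", "TTAGGCATCCGA", "GGCTAACGTTAG"]
lib = junction_library(exons, flank=4)

for (i, j), seq in lib.items():
    print(f"exon{i + 1}-exon{j + 1}: {seq}")
```

The library includes the exon-skipping junction (exon 1 to exon 3), but it can only contain junctions between annotated exon boundaries, which is consistent with the listed con that not all novel splicing is found.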

  10. Novel exon-junction discovery (Paired End)
     - Pros: very sensitive
     - Cons: reasonable coverage required; accuracy dependent on insert size distribution; sequencing twice as expensive
     - [Figure: gene-model diagram with TSS, ATG, pA and AAA markers]

  11. Novel exon-junction discovery (de novo)
     - Workflow: non-matching tags; remove adaptor sequence; align reads to each other; create consensus read; BLAT consensus against genome
     - [Figure: overlapping reads, each with highlighted mismatched bases, tiled into one consensus read]
     - Consensus read: ACGATATTACACGTACAGTCAAGTCGTTCGGAACCT
     - Pros: de novo
     - Cons: requires high coverage
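
The "create consensus read" step can be sketched as a per-column majority vote. The reads below are transcribed from the slide with the highlighted bases rejoined; the start offsets are an assumption inferred from the consensus, and in a real pipeline they would come from overlapping the reads against each other. The function name `consensus` is illustrative only.

```python
from collections import Counter

# (assumed offset in the consensus, read sequence as transcribed from the slide)
READS = [
    (0, "ACGATATGACACGTACAGTCAAATCGT"),
    (0, "ACGATATTACACGTACATTCAAGTCGT"),
    (0, "ACGATATTACACGCACAGTCAAGTCGT"),
    (1, "CGATATTACACGTCCAGTCAAGTCGTT"),
    (3, "ATATTTCACGTACAGTCAAGTCGTTCG"),
    (3, "ATATTAAACGTACAGTCAAGTCGTTCG"),
    (5, "ATTGCACGTACAGTCAAGTCGTTCGGA"),
    (5, "ATTACACGTACAGTCACGTCGTTCGGA"),
    (9, "CACGTACAGTCAAGTCGTTCGGAACCT"),
    (9, "CACGTACCTTCAAGTCGTTCGGAACCT"),
]


def consensus(reads):
    """Majority-vote consensus over reads placed at known offsets."""
    length = max(offset + len(seq) for offset, seq in reads)
    columns = [Counter() for _ in range(length)]
    for offset, seq in reads:
        for i, base in enumerate(seq):
            columns[offset + i][base] += 1
    # Pick the most frequent base in each column.
    return "".join(col.most_common(1)[0][0] for col in columns)


result = consensus(READS)
print(result)
```

Each mismatch is outvoted only because several other reads cover the same column, which is one way to see why the approach requires high coverage.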

  12. Novel exon-junction discovery (TopHat)
     - http://tophat.cbcb.umd.edu
     - Pros: very sensitive
     - Cons: relies on reference
     - [Figure: gene-model diagram with TSS, ATG, pA and AAA markers]

  13. Look at your data!
     - [Figure: genome browser view, Gene Symbol GRB7]
     - Tracks and annotations: exon-exon junction usage; alternative splicing; single nucleotide resolution coverage plot; known gene structure; novel exons or novel transcripts (exons and introns)

  14. Presentation Outline (repeat of slide 3)

  15. Different aligners give different results
     - The patterns are largely the same, so don't panic… unless you're doing RNAseq
     - Koehler et al. Bioinformatics 2011; 27(2):272-274

  16. Uniqueome affects quantitation of RNAseq
     - Correction for unique content improves correlation to microarrays
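
One plausible form of such a correction is to normalise read counts by the number of uniquely mappable bases in a transcript rather than by its full length. The sketch below assumes that form; the RPKM-style formula and every number in it are illustrative and are not taken from the talk or the Uniqueome publication.

```python
def rpkm(read_count: int, length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase per million mapped reads."""
    return read_count / (length_bp / 1_000) / (total_mapped_reads / 1_000_000)


# Hypothetical gene: 2,000 bp long, but only 500 bp can receive a unique read.
reads, full_len, unique_len, library_size = 100, 2_000, 500, 10_000_000

naive = rpkm(reads, full_len, library_size)        # normalised by full length
corrected = rpkm(reads, unique_len, library_size)  # normalised by unique content

print(naive, corrected)
```

In this toy case the length-normalised value understates expression fourfold, because three quarters of the transcript could never have attracted a uniquely mapped read.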

  17. RNAseq
     - [Figure: gene-model diagram with TSS, ATG, pA and AAA markers]
     - Cloonan et al. Nat Methods 2008; 5:613-619

  18. How to detect a transcript?
     - [Figure: four alternative transcripts, labelled A to D, from one locus]
     - Legend: AAA = polyadenylation; spliced intron; protein coding regions; non-coding regions

  19. How to detect a transcript?
     - Accuracy relies on the quality of the gene models used.
     - Different gene models will give different results from the same data.
     - ~80% 92.6% of known transcripts have diagnostic features (covers 99.8% of loci)
     - 217,127 diagnostic features covering 160,156 individual transcripts from 65,254 loci
     - [Figure: four alternative transcripts, labelled A to D, from one locus]
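
A diagnostic feature can be thought of as an exon or exon-exon junction that occurs in exactly one transcript of a gene model set. The sketch below assumes that definition; the transcripts, feature IDs, and the function name `diagnostic_features` are hypothetical.

```python
from collections import defaultdict

# Hypothetical gene models: each transcript is a set of feature IDs
# (exons "E*" and exon-exon junctions "J*").
TRANSCRIPTS = {
    "A": {"E1", "E2", "E3", "J1-2", "J2-3"},
    "B": {"E1", "E3", "J1-3"},
    "C": {"E1", "E2", "J1-2"},
    "D": {"E1", "E2", "E3", "E4", "J1-2", "J2-3", "J3-4"},
}


def diagnostic_features(transcripts):
    """Map each transcript to the features found in no other transcript."""
    owners = defaultdict(set)
    for name, features in transcripts.items():
        for feature in features:
            owners[feature].add(name)
    return {
        name: sorted(f for f in features if owners[f] == {name})
        for name, features in transcripts.items()
    }


diag = diagnostic_features(TRANSCRIPTS)
for name, features in diag.items():
    print(name, features)
```

In this toy set, transcripts A and C have no diagnostic features because everything they contain is shared with D. Changing the gene models changes which features are diagnostic, which is why results depend on the annotation used.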

  20. Reference assisted transcript assembly
     - Tools: Scripture, Cufflinks
     - Guttman et al., Nat Biotech 2010; 28(5):503-10

  21. Reference free alignment - de novo assembly
     - Tools: Trinity, Oases, ABySS
     - [Figure: assembled transcripts for Gene Symbol: MGAT5 and Gene Symbol: RAN]
     - Cloonan et al., Unpublished

  22. Presentation Outline (repeat of slide 3)

  23. miRNAs
     - Biogenesis: pri-miRNA, Drosha processing, pre-miRNA, Dicer processing, miRNA duplex, asymmetrical unwinding, RNA-Induced Silencing Complex (RISC)
     - RISC-mRNA interactions: translational inhibition, mRNA sequestration, mRNA degradation
     - Most interactions thought to occur in the 3' UTR

  24. MicroRNAs are small and closely related
     - [Figure: histogram of miRNA lengths; x-axis "Length of miRNAs (nt)", 15 to 25; y-axis "Proportion of miRNAs (%)", 0 to 60]
     - Alignment of closely related miRNAs:
       - miR-17-5p: CAAAGUGCUUACAGUGCAGGUAGU
       - miR-20: UAAAGUGCUUAUAGUGCAGGUAG-
       - miR-106a: AAAAGUGCUUACAGUGCAGGUAGC
       - miR-106b: UAAAGUGCUGACAGUGCAGAU---
       - miR-93: -AAAGUGCUGUUCGUGCAGGUAG-
       - miR-18: UAAGGUGCAUCUAGUGCAGAUA--
       - consensus: AAaGUGCu aGUGCAG Ua
     - Second alignment:
       - miR-19a: UGUGCAAAUCUAUGCAAAACUGA-
       - miR-19b-1: UGUGCAAAUCCAUGCAAAACUGA-
       - miR-19b-2: UGUGCAAAUCCAUGCAAAACUGA-
       - consensus: UGUGCAAAUCcAUGCAAAACUGA

  25. Information content in short tags
     - Map to a subset of the genome instead
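
A rough way to see why short tags carry little information against a whole genome is to estimate how often a random tag of a given length would match by chance. The sketch below assumes a uniform random sequence model, which real genomes do not follow (they are repetitive, so this likely underestimates chance hits). The reference sizes and the function name `expected_chance_hits` are illustrative assumptions.

```python
def expected_chance_hits(tag_length: int, reference_size: int) -> float:
    """Expected exact matches of a random tag on both strands of a reference."""
    return 2 * reference_size / 4 ** tag_length


GENOME_SIZE = 3_000_000_000  # roughly human-sized; an assumption
SUBSET_SIZE = 200_000        # hypothetical reduced reference

for k in (15, 18, 22):
    print(
        k,
        expected_chance_hits(k, GENOME_SIZE),
        expected_chance_hits(k, SUBSET_SIZE),
    )
```

Under this model a 15 nt tag is expected to match a genome-sized reference several times by chance, while the same tag rarely matches a reference thousands of times smaller. That is the intuition behind mapping to a subset.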

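  26. Not allowing mismatches does not solve the problem
     - miR A: tagcgggatctctcgaGagctcgcgat
     - miR B: tagcgggatctctcgaCagctcgcgat
     - Tag tctctcgaGagct: 0 MM against miR A, 1 MM against miR B
     - Tag tctctcgaCagct: 1 MM against miR A, 0 MM against miR B
     - (MM = mismatches; the single differing base is shown in upper case)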
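
The example on this slide can be checked with a few lines of Python. The sequences are those shown on the slide; the function name `mismatches` and the interpretation of the second tag as a read from miR A carrying one sequencing error are assumptions made for illustration.

```python
def mismatches(tag: str, reference: str) -> int:
    """Fewest mismatches when sliding `tag` along `reference` without gaps."""
    return min(
        sum(a != b for a, b in zip(tag, reference[i : i + len(tag)]))
        for i in range(len(reference) - len(tag) + 1)
    )


MIR_A = "tagcgggatctctcgagagctcgcgat"
MIR_B = "tagcgggatctctcgacagctcgcgat"

tag_true = "tctctcgagagct"   # error-free read from miR A
tag_error = "tctctcgacagct"  # same read with one sequencing error (assumed)

for tag in (tag_true, tag_error):
    print(tag, "vs A:", mismatches(tag, MIR_A), "vs B:", mismatches(tag, MIR_B))
```

With a zero-mismatch filter, the erroneous read is kept and assigned to miR B, so forbidding mismatches does not prevent cross-mapping between closely related miRNAs.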

  27. IsomiRs are common and functional
     - [Figure: pre-miRNA hairpin, 5' and 3' ends marked]
     - Cloonan et al. Genome Biol 2011; 12(12):R126

  28. Expression Thresholding
     - Cloonan et al. Genome Biol 2011; 12(12):R126

  29. Presentation Outline (repeat of slide 3)

  30. Things to consider
     - Check your data!
       - Visualization strategies
         - IGV (brilliant for individual read resolution)
         - UCSC (brilliant for genomic context of expression)
         - Heatmaps, etc. (brilliant for quantification)
       - Check your mapping statistics
         - % mapped? % mapped at what length? Redundancy, etc.
       - Make sure the controls are doing what they should be
     - Remember the limitations and parameters of your alignment strategy, and be careful with interpretation!
       - E.g. variable alignment strategies that trim starts and ends of tags will overestimate the relative complexity of your library
       - E.g. discarding all tags that map to multiple regions will limit your ability to detect closely related gene families, or sequence motifs in repetitive/low complexity areas
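
The mapping statistics mentioned above (% mapped, % mapped at each length, redundancy) can be sketched as follows. The record layout, the toy reads, and the function name `mapping_stats` are hypothetical; a real pipeline would derive these values from the aligner's output.

```python
from collections import Counter

# Hypothetical records: (read sequence, mapped length, or None if unmapped).
RECORDS = [
    ("ACGTACGTACGTACGTACGTACGTA", 25),
    ("ACGTACGTACGTACGTACGTACGTA", 25),    # duplicate read
    ("TTGACCATGGCATTAGCCATGACCA", 20),    # mapped only after trimming to 20 nt
    ("GGGGGGGGGGGGGGGGGGGGGGGGG", None),  # unmapped
]


def mapping_stats(records):
    """Summarise % mapped, % mapped per length, and read redundancy."""
    total = len(records)
    mapped_lengths = [length for _, length in records if length is not None]
    per_length = Counter(mapped_lengths)
    distinct = len({seq for seq, _ in records})
    return {
        "pct_mapped": 100 * len(mapped_lengths) / total,
        "pct_mapped_by_length": {k: 100 * v / total for k, v in per_length.items()},
        "redundancy": total / distinct,  # mean copies per distinct sequence
    }


stats = mapping_stats(RECORDS)
print(stats)
```

Redundancy here is computed on the full read sequences. Computing it on trimmed sequences would give a different answer, which relates to the warning above about trimming and library complexity.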

  31. Conclusions
     - RNAseq and miRNAseq both require special attention to mapping strategies
     - Choose an alignment strategy that will answer your biological question first and foremost, and then consider available resources
     - If your strategy won't work, it's better to know BEFORE sequencing rather than afterwards
     - Check your mapped data: better to find errors before extensive analysis and validation
     - Be careful in your interpretation of the data
