Intro to NGS. Theoretical and Practical HiC Workshop: Wet-lab and Bioinformatics


  1. Intro to NGS. Theoretical and Practical HiC Workshop: Wet-lab and Bioinformatics. November 4th, 2019. Selene L. Fernández-Valverde, regRNAlab.github.io, @SelFdz

  2. Learning objectives
     In this class we will learn:
     • How high-throughput (NGS) sequencing technologies arose
     • How NGS technologies transformed our capacity to acquire large amounts of genomic information
     • Which common NGS techniques are available on the market

  3. The sequencing revolution
     [Figure 1: Sequencing Cost and Data Output Since 2000. The dramatic rise of data output and concurrent falling cost of sequencing since 2000. The Y-axes on both sides of the graph are logarithmic. Axes: cost per gigabase (USD) and output per week (gigabases), 2000 to 2014. Instruments marked: ABI 3730xl, Genome Analyzer, Genome Analyzer IIx, HiSeq 2500, HiSeq X Ten.]

  5. High-throughput sequencing techniques
     • Pyrosequencing
     • Sequencing by synthesis
     • Sequencing by ligation
     • Ion semiconductor
     • Nanopore sequencing
     • Single Molecule Real Time Sequencing (SMRT)

  6. Pyrosequencing - 1

  7. Pyrosequencing - 2
     Chemiluminescent enzymatic reaction

  8. Pyrosequencing
     Advantages
     • Reasonable cost
     • Long sequences (500 nt)
     Disadvantages
     • Few sequences produced
     • High number of errors in regions with the same nucleotide (homopolymers)
     • With the rise of other technologies, and given its high error rate, it was ultimately discontinued
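
The homopolymer problem mentioned above can be made concrete with a small sketch that finds runs of identical bases in a read. The function name `homopolymer_runs`, the `min_len` threshold and the example sequence are illustrative assumptions, not part of the original slides.

```python
from itertools import groupby


def homopolymer_runs(seq, min_len=3):
    """Return (base, start, length) for each run of identical bases
    that is at least min_len long. min_len=3 is an arbitrary choice."""
    runs = []
    pos = 0
    for base, group in groupby(seq.upper()):
        length = sum(1 for _ in group)
        if length >= min_len:
            runs.append((base, pos, length))
        pos += length
    return runs


# Hypothetical read with two homopolymer stretches.
example = "ACGTTTTTGCAAAAC"
print(homopolymer_runs(example))  # [('T', 3, 5), ('A', 10, 4)]
```

Regions flagged this way are where a platform that measures signal strength per nucleotide flow is most likely to miscount bases.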

  9. Illumina - sequencing by synthesis - 1
     • The process starts by joining DNA adapters to the DNA or RNA fragments that we want to sequence.

  10. Illumina - sequencing by synthesis - 2
     • The templates are immobilized on a flow cell.
     • In the case of RNA-Seq, complementarity with the adapter is used to synthesize a new cDNA strand in order to preserve information about the directionality of the transcript.
     (Figure labels: adapter, DNA fragment, dense lawn of primers.)

  11. Illumina - sequencing by synthesis - 3
     • A strand of DNA complementary to the DNA template is synthesized on the flow cell surface.

  12. Illumina - sequencing by synthesis - 4
     • A strand of DNA complementary to the DNA template is synthesized on the flow cell surface.
     (Figure labels: attached terminus, free terminus.)

  13. Illumina - sequencing by synthesis - 5
     • The templates are separated using high temperature.

  14. Illumina - sequencing by synthesis - 6
     • This process is repeated hundreds of times to generate a "colony", or cluster, of identical copies of the template.

  15. Illumina - sequencing by synthesis - 7
     • Primers, polymerase and fluorescent nucleotides (reversible terminators) are added in order (first A, then T, etc.). When a nucleotide is incorporated, a laser pulse coupled with imaging is used to identify which base was incorporated at each position.

  16. Illumina - sequencing by synthesis - 8
     • This process is continued for all bases.

  17. Illumina - sequencing by synthesis - 9
     • The images are analyzed spatially to reveal each sequence, e.g. GCTGA...
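
A toy sketch of this last step, assuming each cluster yields four intensity values per cycle (in A, C, G, T order) and that the brightest channel marks the incorporated base. The function name `call_bases` and the intensity values are made up for illustration; real base callers also correct for effects such as overlapping signals, which this sketch ignores.

```python
BASES = "ACGT"


def call_bases(intensities):
    """Call one base per cycle by picking the channel with the highest signal.

    intensities: list of cycles, each a list of four values in A, C, G, T order
    (an assumed layout, chosen only for this example).
    """
    read = []
    for cycle in intensities:
        strongest = max(range(4), key=lambda i: cycle[i])
        read.append(BASES[strongest])
    return "".join(read)


# Hypothetical signals for one cluster over five cycles.
cluster = [
    [0.1, 0.0, 0.9, 0.1],  # G
    [0.0, 0.8, 0.1, 0.1],  # C
    [0.1, 0.1, 0.0, 0.7],  # T
    [0.0, 0.1, 0.9, 0.0],  # G
    [0.8, 0.1, 0.1, 0.0],  # A
]
print(call_bases(cluster))  # GCTGA
```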

  18. Sequencing by Synthesis
     Advantages
     • Undoubtedly the market leader, which means a strong scientific support network
     • Produces large amounts of sequences (up to 20 billion for NovaSeq)
     • Low error rate compared with other technologies
     Disadvantages
     • The sequences are short (150 to 300 bp)
     • The cost is high
     • Relatively slow sequencing (13 to 44 hr for NovaSeq)
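
To put these numbers in perspective, here is a rough back-of-the-envelope calculation of total yield and mean coverage. The read count and read length come from the slide; the genome size of 3.1 billion bases (roughly a human genome) and the function names are assumptions for illustration, and the result is an upper bound that ignores duplicates, low-quality reads and unmapped reads.

```python
def total_bases(n_reads, read_length):
    """Total number of sequenced bases for a run."""
    return n_reads * read_length


def mean_coverage(n_reads, read_length, genome_size):
    """Mean depth of coverage: total sequenced bases divided by genome size."""
    return total_bases(n_reads, read_length) / genome_size


N_READS = 20_000_000_000     # "up to 20 billion" sequences (from the slide)
READ_LENGTH = 150            # shortest read length quoted on the slide, in bp
GENOME_SIZE = 3_100_000_000  # assumed: approximate human genome size in bases

print(total_bases(N_READS, READ_LENGTH))  # 3000000000000 bases (3,000 gigabases)
print(round(mean_coverage(N_READS, READ_LENGTH, GENOME_SIZE)))  # 968
```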

  19. Nanopore sequencing

  20. Nanopore sequencing
     Kate Rubins

  21. Nanopore whale watching
     • Nanopore is capable of generating very long reads, or "whales".
     • The longest read detected to date has a length of 2,272,580 bases.

  22. Nanopore sequencing
     Advantages
     • Real-time sequencing
     • You can stop sequencing when you have enough data
     • Very portable: useful for work in difficult areas
     • Simple preparation
     • Low cost: $80 USD per sample
     Disadvantages
     • High number of errors, although accuracy has increased drastically in the last year
     • Pore failure leads to sequence loss

  23. Sources of error
     • There are two main sources of error:
       • Human error: mixing up samples (in the laboratory or when the files are received), errors in the protocol
       • Technical error: errors inherent to the platform (e.g. mononucleotide repeats in pyrosequencing)
     • All platforms have some level of error that must be taken into account when designing the experiment.

  24. Errors in sample preparation
     • User error (e.g. mislabeling a sample)
     • DNA/RNA degradation caused by preservation methods
     • Contamination with external sequences
     • Low amount of starting DNA

  25. Errors in library preparation
     • User error (e.g. contaminating one sample with another or with previous reactions, errors in the protocol)
     • PCR amplification errors
     • Primer bias (binding bias, methylation bias, primer dimers)
     • Capture bias (Poly-A, Ribozero)
     • Machine errors (misconfiguration, reaction interruption)
     • Chimeras
     • Index and adapter errors (adapter contamination, lack of index diversity, incompatible barcodes, overloading)
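
One way to screen for incompatible barcodes before pooling libraries is to check that every pair of index sequences differs in enough positions. The sketch below uses the Hamming distance; the barcode sequences, the function names and the threshold of 3 mismatches are hypothetical choices for illustration, not a recommendation from the slides or from any vendor.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length barcodes differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def close_pairs(barcodes, min_distance=3):
    """Return (barcode_a, barcode_b, distance) for pairs closer than min_distance."""
    pairs = []
    for a, b in combinations(barcodes, 2):
        d = hamming(a, b)
        if d < min_distance:
            pairs.append((a, b, d))
    return pairs


# Hypothetical 6-nt indexes; the last one differs from the first by a single base.
indexes = ["ATCACG", "CGATGT", "TTAGGC", "ATCACC"]
print(close_pairs(indexes))  # [('ATCACG', 'ATCACC', 1)]
```

A pair separated by a single mismatch means one sequencing error in the index read is enough to assign a read to the wrong sample.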

  26. Sequencing and imaging errors
     • User error (e.g. flow cell overloading)
     • Phasing delays (e.g. incomplete extension, addition of multiple nucleotides)
     • Dead fluorophores, damaged nucleotides and overlapping signals
     • Sequence context (e.g. high GC content, homologous and low-complexity sequences, homopolymers)
     • Machine errors (e.g. laser, hard disk, software)
     • Strand biases
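
Sequence context such as GC content is easy to inspect directly. The following is a minimal sketch that computes GC content for a whole sequence and for consecutive non-overlapping windows; the function names, the window size and the example sequence are assumptions made for this illustration.

```python
def gc_content(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq, window=10):
    """GC content of consecutive non-overlapping windows; a trailing partial window is skipped."""
    return [
        gc_content(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, window)
    ]


# Hypothetical sequence: a GC-rich stretch followed by an AT-rich stretch.
example = "GCGCGCGCGCATATATATAT"
print(gc_content(example))   # 0.5
print(windowed_gc(example))  # [1.0, 0.0]
```

The overall value hides the two extreme windows, which is why local context matters more than the average when looking for error-prone regions.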

  27. The challenge - differentiating biological signals from noise/errors
     • Negative and positive controls: what do I expect?
     • Technical and biological replicates: help determine the noise rate
     • Know the common types of error for a given platform
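
As a small illustration of how replicates help estimate noise, the sketch below compares the coefficient of variation (standard deviation divided by the mean) of a measurement across technical and biological replicates. The counts are invented, and the coefficient of variation is only one of several possible noise measures; it is used here as an assumption, not as the method proposed in the slides.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (0.0 if the mean is zero)."""
    m = mean(values)
    if m == 0:
        return 0.0
    return stdev(values) / m


# Invented read counts for one gene.
technical_replicates = [100, 102, 98]   # same library sequenced three times
biological_replicates = [100, 150, 60]  # three independent samples

print(coefficient_of_variation(technical_replicates))   # about 0.02
print(coefficient_of_variation(biological_replicates))  # noticeably larger
```

In this made-up case the spread between technical replicates gives a floor for measurement noise, so differences between conditions smaller than that floor should not be read as biological signal.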

  28. Now what?
