Sequencing technology and assembly
Sanger sequencing
- Sanger sequencing with radioactivity
- High-throughput Sanger sequencing with fluorescence
Roche/454 sequencing
- Yield: 500,000,000 bp
- Cost: $5,000
- Time: ~1 min per bp
- Read length: 450 bp to >1 kb
Pyrosequencing
Illumina sequencing
- Yield: 8,000,000,000 to 80,000,000,000 bp
- Time: ~1 hour per bp
- Read length: ~150 bp
- Cost:
  - Sample Extraction, $14.00/sample
  - Automated Sample Library, $90.00/sample
  - MiSeq (2x250), 1 lane, 8-10 Gb/lane, $1,700.00/sample
  - MiSeq (2x300), 1 lane, 10-12 Gb/lane, $2,100.00/sample
  - HiSeq2500 (2x150), 1 lane, ~40 Gb/lane, $2,500.00/lane
  - HiSeq2500 (2x250), 1 lane, ~65 Gb/lane, $3,500.00/lane
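The Illumina lane prices listed above are easier to compare as a cost per gigabase. A minimal sketch of that arithmetic follows. It assumes each listed price buys one lane with the stated yield, even though the MiSeq entries are quoted per sample. The names `lanes` and `cost_per_gb` are illustrative and are not from the slides.

```python
# Lane options from the price list: (name, low yield in Gb, high yield in Gb, price in USD).
# Assumption: each price is treated as the cost of one lane.
lanes = [
    ("MiSeq 2x250", 8, 10, 1700.00),
    ("MiSeq 2x300", 10, 12, 2100.00),
    ("HiSeq2500 2x150", 40, 40, 2500.00),
    ("HiSeq2500 2x250", 65, 65, 3500.00),
]


def cost_per_gb(low_gb, high_gb, price):
    """Return (best case, worst case) cost in USD per Gb for one lane."""
    return price / high_gb, price / low_gb


for name, low_gb, high_gb, price in lanes:
    best, worst = cost_per_gb(low_gb, high_gb, price)
    print(f"{name}: ${best:.2f} to ${worst:.2f} per Gb")
```

Extraction and library preparation costs are charged per sample and are not included in these figures.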
Ion Torrent
- Yield: 50,000,000 bp
- Time: 2 hours (<1 min per bp)
- Read length: 500 bp
- Cost: $500
PacBio
- Long reads (5-10kb)
- High error rate, offset by reading at 150x coverage
- Library prep: $600
- Sequencing: $300
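Why high coverage offsets a high per-read error rate can be illustrated with a simple majority-vote model. This is only a sketch under strong assumptions: errors are independent between reads, and the 15% per-read error rate in the usage lines is an illustrative figure, not a number from the slides. The function name `majority_error_prob` is hypothetical.

```python
from math import comb


def majority_error_prob(per_read_error, coverage):
    """Probability that more than half of the reads covering a position are wrong.

    Assumes each read is wrong independently with probability per_read_error.
    With even coverage, an exact tie is not counted as a wrong consensus.
    """
    threshold = coverage // 2 + 1  # smallest number of wrong reads that forms a majority
    return sum(
        comb(coverage, k) * per_read_error**k * (1 - per_read_error) ** (coverage - k)
        for k in range(threshold, coverage + 1)
    )


# Illustrative per-read error rate of 15% (an assumption) at increasing coverage.
for coverage in (1, 3, 15, 150):
    print(coverage, majority_error_prob(0.15, coverage))
```

Under these assumptions the chance of a wrong consensus falls rapidly as coverage grows, which is the reasoning behind accepting noisy long reads at deep coverage.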
Minion
- Quick sample prep
- Long reads (~50kb)
- High error
- $150 per run
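Base calling
- Need to be sure which base you have identified
- Depends on the technology
- Each machine includes software
- Phred is a historical base-calling package developed at the University of Washington
- A Phred score Q encodes the probability P that the base call is wrong: Q = -10 x log10(P), so a higher score means a more reliable base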
Quality values
- Phred 10: 1 in 10^1 chance that the base is wrong
- Phred 20: 1 in 10^2 chance that the base is wrong
- Phred 30: 1 in 10^3 chance that the base is wrong
- Phred 40: 1 in 10^4 chance that the base is wrong
- Phred 99: the base is correct!
- FASTQ quality characters are the Phred score + 33, converted to ASCII text
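The relationship between Phred scores, error probabilities and FASTQ quality characters can be sketched in a few lines. The function names below are illustrative, and the offset of 33 is the one stated above.

```python
import math


def phred_to_error_prob(q):
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Phred score for an error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)


def encode_fastq_quality(scores, offset=33):
    """Convert a list of integer Phred scores to a FASTQ quality string."""
    return "".join(chr(q + offset) for q in scores)


def decode_fastq_quality(quality_string, offset=33):
    """Convert a FASTQ quality string back to a list of Phred scores."""
    return [ord(c) - offset for c in quality_string]


print(phred_to_error_prob(30))                   # error probability for Phred 30
print(encode_fastq_quality([0, 10, 20, 30, 40]))
print(decode_fastq_quality("!+5?I"))
```

For example, Phred 40 plus 33 gives 73, which is the ASCII code for `I`, so runs of `I` in a FASTQ quality line mark bases with a 1 in 10,000 chance of being wrong.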
Homopolymeric errors
Homopolymeric runs:
- Signal is not linear
- Not clear if 5 or 6 bases
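Because the length of a homopolymeric run is what these platforms tend to miscount, it can help to flag such runs in a sequence. The following is a minimal sketch; the function name `homopolymer_runs` and the default `min_length` of 3 are arbitrary choices for illustration.

```python
from itertools import groupby


def homopolymer_runs(seq, min_length=3):
    """Return (start, base, length) for each run of identical bases of at least min_length."""
    runs = []
    pos = 0
    for base, group in groupby(seq):
        length = len(list(group))
        if length >= min_length:
            runs.append((pos, base, length))
        pos += length  # advance to the 0-based start of the next run
    return runs


print(homopolymer_runs("ACGTTTTTAGGGC"))
```

Start positions are 0-based. Runs reported here are the places where a called length of, say, 5 bases might really be 6.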
Errors
- Different technologies have different error rates:
- Pyrosequencing/Ion Torrent: homopolymeric tracts
- Illumina: substitution errors
- PacBio: machines cannot keep up with biology
- Minion: noise coming through the membrane