Ultra high throughput DNA sequencing technologies Keith Harshman - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Ultra high throughput DNA sequencing technologies

Keith Harshman DNA Array Facility Center for Integrative Genomics University of Lausanne

slide-2
SLIDE 2

Outline:
1. What UHTS is replacing: Sanger sequencing/CE
2. Current UHTS “next generation” technologies:
   a. Illumina Genome Analyzer II (aka “Solexa”)
   b. Applied Biosystems’ SOLiD
   c. 454
3. Some next next generation technologies
4. Some next next next generation technologies

slide-3
SLIDE 3
slide-4
SLIDE 4

Human Genome Re-sequencing using the Sanger Method

5.3x coverage
$2,000,000-$4,000,000
~15,000,000 plasmid preps
27,000,000 AB 3730 reads
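As a rough cross-check of these figures, the usual coverage relation C = N x L / G (coverage = reads x read length / genome size) can be inverted to see what usable read length the slide implies. The genome size of 3 x 10^9 bases is an assumption of this sketch, not a number from the slide.

```python
def implied_read_length(coverage: float, genome_size: float, n_reads: float) -> float:
    """Mean usable bases per read implied by C = N * L / G."""
    return coverage * genome_size / n_reads

# Assumed haploid human genome size in bases (not stated on the slide).
GENOME_SIZE = 3.0e9

# Slide values: 5.3x coverage from 27,000,000 AB 3730 reads.
implied = implied_read_length(5.3, GENOME_SIZE, 27_000_000)
print(f"Implied usable read length: {implied:.0f} bases")
```

Under that assumption the result is a little under 600 usable bases per read, which is plausible for capillary reads after quality trimming.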

slide-5
SLIDE 5
slide-6
SLIDE 6

Enter UHTS (following a brief performance by MPSS)

slide-7
SLIDE 7

Ultra high throughput/output:

3730: ~1 x 10^6 bases/day (12 x 96 sample run/day; 900bp reads)
Genome Analyzer II: ~2 x 10^9 bases/run = ~670 x 10^6 bases/day (35bp reads)
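The per-day figures on this slide can be reproduced with a few lines of arithmetic. This is a sketch only: the 3-day run length for the Genome Analyzer II is an assumption inferred from the ~670 x 10^6 bases/day figure, and the function names are made up for illustration.

```python
def bases_per_day_3730(runs_per_day: int = 12, samples_per_run: int = 96,
                       read_length: int = 900) -> int:
    """Capillary output: one read per sample, every run."""
    return runs_per_day * samples_per_run * read_length

def bases_per_day_ga2(bases_per_run: float = 2e9, run_days: float = 3.0) -> float:
    """Genome Analyzer II output averaged over an assumed 3-day run."""
    return bases_per_run / run_days

sanger = bases_per_day_3730()
ga2 = bases_per_day_ga2()
fold = ga2 / sanger

print(f"3730: {sanger:,} bases/day")
print(f"GA II: {ga2:,.0f} bases/day")
print(f"Fold increase: {fold:.0f}x")
```

With these inputs the 3730 comes out at about 1 x 10^6 bases/day and the Genome Analyzer II at roughly 650 times that.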

slide-8
SLIDE 8

25 x 35bp reads ≠ 1 x 900bp read

slide-9
SLIDE 9

Illumina Genome Analyzer II

slide-10
SLIDE 10

Sequencing Process

1. Library prep (~6 hrs)
  • Fragment DNA
  • Repair ends / Add A overhang
  • Ligate adapters
  • Select ligated DNA

2. Automated Cluster Generation (~5 hrs), 1-8 samples
  • Hybridize to flow cell
  • Extend hybridized oligos
  • Perform bridge amplification

3. Sequencing (~48 to 72 hrs), 1-8 samples
  • Perform sequencing
  • Generate base calls
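Adding up the three stages gives a feel for the end-to-end turnaround. The sketch below assumes the stages run back to back with no waiting time between them, which the slide does not state.

```python
# (stage name, minimum hours, maximum hours), taken from the slide
STEPS = [
    ("Library prep", 6, 6),
    ("Automated cluster generation", 5, 5),
    ("Sequencing", 48, 72),
]

min_hours = sum(low for _, low, _ in STEPS)
max_hours = sum(high for _, _, high in STEPS)

print(f"Total: {min_hours}-{max_hours} hrs "
      f"(~{min_hours / 24:.1f}-{max_hours / 24:.1f} days)")
```

Sequencing itself dominates: library prep and cluster generation together account for only 11 of the 59 to 83 hours.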

slide-11
SLIDE 11

Genomic DNA Library Prep

DNA fragments
  • Blunting by fill-in and exonuclease
  • Phosphorylation
  • Addition of A-overhang
  • Ligation to adapters

slide-12
SLIDE 12

Cluster generation: Cluster Station

  • Aspirates DNA samples into flow cell
  • Automates the formation of amplified clonal clusters from single DNA molecules

DNA libraries Flow cell (clamped into place)

slide-13
SLIDE 13

Flow cell

  • Clonal clusters are generated in a contained environment (need no clean rooms)
  • Sequencing also performed in the flow cell on the generated clusters

Key to the simplified workflow

8 channels

Surface of flow cell coated with a lawn of oligo pairs

slide-14
SLIDE 14

Cluster generation: Hybridize fragment & extend

> 50 M single molecules hybridize to the lawn of primers.
Bound molecules are then extended by polymerases.

Adapter sequence 3’ extension

slide-15
SLIDE 15

Cluster generation: Denature double-stranded DNA

Double-stranded molecule is denatured. Original template is washed away. Newly synthesized strand is covalently attached to the flow cell surface.

discard

Newly synthesized strand Original template

slide-16
SLIDE 16

Cluster generation: Covalently bound spatially separated single molecules

Single molecules bound to flow cell in a random pattern

slide-17
SLIDE 17

Cluster generation: Bridge amplification

Single strand flips over to hybridize to adjacent primers to form a bridge. Hybridized primer is extended by polymerases.

slide-18
SLIDE 18

Cluster generation: Bridge amplification

Double-stranded bridge is formed.

slide-19
SLIDE 19

Cluster generation: Bridge amplification

Double-stranded bridge is denatured. Result: Two copies of covalently bound single-stranded templates.

slide-20
SLIDE 20

Cluster generation: Bridge amplification

Single-strands flip over to hybridize to adjacent primers to form bridges. Hybridized primer is extended by polymerase.

slide-21
SLIDE 21

Cluster generation: Bridge amplification

Bridge amplification cycle repeated till multiple bridges are formed
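The bridge/extend/denature cycle described on the preceding slides doubles the number of bound strands each time, at least in the ideal case. A minimal sketch of that growth follows; the cycle counts and the `efficiency` parameter are illustrative assumptions, since the slides do not say how many cycles the Cluster Station runs or how efficient each one is.

```python
def strands_per_cluster(cycles: int, efficiency: float = 1.0) -> float:
    """Expected bound strands after `cycles` rounds, starting from one molecule.

    efficiency = 1.0 means every strand forms a bridge and is copied each
    cycle (perfect doubling); lower values model incomplete bridging.
    """
    return (1.0 + efficiency) ** cycles

# Hypothetical cycle counts, chosen only to show the shape of the curve.
for n in (1, 10, 20):
    print(f"{n:>2} cycles: {strands_per_cluster(n):,.0f} strands (ideal), "
          f"{strands_per_cluster(n, 0.7):,.0f} at 70% efficiency")
```

In practice growth is also capped by the local supply of surface primers, which this sketch ignores.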

slide-22
SLIDE 22

Cluster generation

dsDNA bridges denatured. Reverse strands cleaved and washed away…

slide-23
SLIDE 23

Cluster generation

… leaving a cluster with forward strands only.

slide-24
SLIDE 24

Cluster generation

Free 3’ ends are blocked to prevent unwanted DNA priming.

slide-25
SLIDE 25

Sequencing primer is hybridized to adapter sequence.

Sequencing

Sequencing primer

slide-26
SLIDE 26

Genome Analyzer II Sequencing

Hybridize sequencing primer

Add 4 Fl-NTPs + Polymerase
Incorporated Fl-NTP is imaged
Terminator and fluorescent dye are cleaved from the Fl-NTP

X 36 - 50 cycles
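The cycle on this slide (add four labelled terminator nucleotides, image the one that incorporates, cleave terminator and dye, repeat) can be sketched as a toy loop. This is an idealized illustration, not the instrument's software: it assumes every strand in a cluster extends by exactly one base per cycle, and the template string is invented.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def sequence_by_synthesis(template: str, cycles: int) -> str:
    """Return the read produced by `cycles` rounds of one-base extension.

    `template` is written in the order the polymerase encounters its bases.
    """
    read = []
    for cycle in range(min(cycles, len(template))):
        # Add all four Fl-NTPs: only the complement of the next template
        # base incorporates, and its terminator blocks further extension.
        incorporated = COMPLEMENT[template[cycle]]
        # Image: the dye on the incorporated nucleotide identifies the base.
        read.append(incorporated)
        # Cleave terminator and dye so the next cycle can extend by one more
        # base (nothing to update in this toy model).
    return "".join(read)

# Hypothetical 8-base template; a real run would use 36-50 cycles.
read = sequence_by_synthesis("ACGTTGCA", cycles=36)
print(read)
```

The read length is set by the number of cycles, which is why the slide quotes 36 to 50.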

slide-27
SLIDE 27

Flow cell imaging

laser

Fluidics port Fluidics port Flow cell Prism

slide-28
SLIDE 28

Genome Analyzer II imaging set up

[Figure labels: objective lens, oil, flow cell, prism, laser, camera, tile]

50 tiles/column X 2 columns/channel X 8 channels/flow cell
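Multiplying out the tile layout gives the number of images taken per cycle and per dye channel. The clusters-per-tile estimate below uses the 50 million clusters per flow cell quoted elsewhere in this deck and assumes clusters are spread evenly across tiles, which is a simplification.

```python
TILES_PER_COLUMN = 50
COLUMNS_PER_CHANNEL = 2
CHANNELS_PER_FLOW_CELL = 8
CLUSTERS_PER_FLOW_CELL = 50_000_000  # figure quoted in the deck

tiles_per_flow_cell = TILES_PER_COLUMN * COLUMNS_PER_CHANNEL * CHANNELS_PER_FLOW_CELL
clusters_per_tile = CLUSTERS_PER_FLOW_CELL / tiles_per_flow_cell

print(f"Tiles per flow cell: {tiles_per_flow_cell}")
print(f"Average clusters per tile: {clusters_per_tile:,.0f}")
```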

slide-29
SLIDE 29

50 MILLION CLUSTERS PER FLOW CELL

20 MICRONS 100 MICRONS

Genome Analyzer II Sequencing

slide-30
SLIDE 30

Base Calling

[Figure: images of the same clusters over cycles 1-9, with example reads T T T T T T T G T … and T G C T A C G A T …]

The identity of each base of a cluster is read off from sequential images
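In its simplest form, reading a base off each image means picking the brightest of the four dye channels at a cluster's position in every cycle. The sketch below shows only that idea; the intensity values are invented, and real base callers also correct for effects such as dye cross-talk and clusters drifting out of phase, none of which is modelled here.

```python
BASES = "ACGT"

def call_bases(intensities):
    """Call one base per cycle by taking the brightest channel.

    `intensities` is a list of cycles; each cycle is a 4-tuple of
    channel intensities in A, C, G, T order for a single cluster.
    """
    calls = []
    for cycle in intensities:
        brightest = max(range(4), key=lambda i: cycle[i])
        calls.append(BASES[brightest])
    return "".join(calls)

# Invented intensities for one cluster over three cycles.
example = [
    (0.1, 0.0, 0.2, 0.9),  # T channel brightest
    (0.1, 0.1, 0.8, 0.1),  # G channel brightest
    (0.0, 0.7, 0.2, 0.1),  # C channel brightest
]
called = call_bases(example)
print(called)
```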

slide-31
SLIDE 31

What comes out today:

– 36bp standard read length; enabled for 50-75bp
– >50 million reads per 8-channel (lane) flowcell; >6.25 million reads per channel
– >1.5GB per standard run; >3GB per paired-end run
– 2 day standard and 4 day paired-end run
– Raw read accuracy of >99.5% (36bp)
– Consensus accuracy of >99.999% (20x depth of coverage)
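The read counts and run yields above are mutually consistent if "GB" is read as gigabases, which this sketch assumes. It multiplies reads by read length and treats a paired-end run as two reads per cluster.

```python
READS_PER_FLOWCELL = 50_000_000  # ">50 million reads" lower bound
READ_LENGTH = 36                 # standard read length in bases
CHANNELS = 8

reads_per_channel = READS_PER_FLOWCELL / CHANNELS
single_read_yield = READS_PER_FLOWCELL * READ_LENGTH
paired_end_yield = 2 * single_read_yield  # two reads per cluster

print(f"Reads per channel: {reads_per_channel:,.0f}")
print(f"Standard run: {single_read_yield / 1e9:.1f} x 10^9 bases")
print(f"Paired-end run: {paired_end_yield / 1e9:.1f} x 10^9 bases")
```

The results (1.8 and 3.6 x 10^9 bases) sit just above the quoted >1.5 and >3 figures, as expected for lower bounds.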

slide-32
SLIDE 32

What comes out at the end of 2008 (Ha!):

– 36bp → 75bp standard read length
– 50 million → >130 million reads per flowcell; >6.25 million → >16 million reads per channel
– >1.5GB → 10GB per standard run; >3GB → 20GB per paired-end run
– 2 → 3.5 day standard and 4 → 7 day paired-end run
– Raw read accuracy of >99.5% (36bp)
– Consensus accuracy of >99.999% (20x depth of coverage)

Plus improvements in data quality

slide-33
SLIDE 33

What goes in:

DNA Fragments + Adapters + Sequencing Library

DNA fragment sources and applications

  • Genomic DNA
      Genome and directed re-sequencing: SNP/mutation; genome structure re-arrangements; breakpoints; CNVs; methylation pattern
      Genome sequencing: de novo genome sequencing
  • ChIP products: transcription factor binding sites; protein complex positioning; methylation patterns
  • cDNA: mRNA transcript structure and differential expression; small RNA discovery & differential expression
  • ???: ????

slide-34
SLIDE 34

454 and SOLiD sequencing template preparation

slide-35
SLIDE 35

Library preparation by Emulsion PCR

[Figure labels: DNA to be sequenced; single-stranded PCR template; emulsion PCR; clonal sequencing template; sequencing chambers; single DNA molecules + capture beads + PCR mix]

Fan et al., Nature Reviews Genetics 2006

SOLiD: 90bp template fragment size; 1µm beads; 10,000-20,000 template copies/bead
454: 300-500bp template fragment size; 30µm beads; “millions” of template copies/bead

slide-66
SLIDE 66

454/Roche

slide-67
SLIDE 67

Sequencing-by-Synthesis – pyrosequencing (454)

slide-68
SLIDE 68

Sequencing Technologies

ABI 3730xl: ~1 x 10^6 bases per day (at 15 runs/day)
800 bases per read and 1250 reads per day
Cost to sequence a human genome (2007): $4,000,000

454/Roche: ~100 x 10^6 bases per day (at 1 run/day)
250 bases per read and 400,000 reads per run
Cost to sequence a human genome (2007): $1,000,000

Illumina GA II/SOLiD: ~1.5–3.0 x 10^9 bases per run (1 run/3 days)
35 bases per read and 40-100 x 10^6 reads per run
Cost to sequence a human genome (2008): $100,000 (GA2); $60,000 (SOLiD)
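The output figures in this comparison follow from reads x read length, and the cost figures can be turned into fold reductions relative to the 3730xl. The sketch below uses only numbers from the slide; the dictionary layout and variable names are made up for illustration.

```python
# reads and read length per unit of output, as quoted on the slide
abi_bases_per_day = 800 * 1250        # bases/read x reads/day
roche_bases_per_run = 250 * 400_000   # bases/read x reads/run (1 run/day)
short_read_low = 35 * 40_000_000      # bases/run at 40 x 10^6 reads
short_read_high = 35 * 100_000_000    # bases/run at 100 x 10^6 reads

# quoted cost in dollars to sequence a human genome
COST = {"ABI 3730xl": 4_000_000, "454/Roche": 1_000_000,
        "GA II": 100_000, "SOLiD": 60_000}
fold_cheaper = {name: COST["ABI 3730xl"] / cost for name, cost in COST.items()}

print(f"ABI 3730xl: {abi_bases_per_day:,} bases/day")
print(f"454/Roche: {roche_bases_per_run:,} bases/run")
print(f"GA II/SOLiD: {short_read_low:,} to {short_read_high:,} bases/run")
for name, fold in fold_cheaper.items():
    print(f"{name}: {fold:.0f}x cheaper than ABI 3730xl")
```

The short-read range computed this way (1.4 to 3.5 x 10^9 bases per run) brackets the ~1.5–3.0 x 10^9 quoted on the slide.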

slide-69
SLIDE 69

The Next Next Generation Technologies

  • Complete Genomics (http://www.completegenomics.com): Sequencing of DNA Nano-balls (DNBs) using combinatorial Probe-Anchor Ligation (cPAL)
  • Pacific Biosciences (http://www.pacificbiosciences.com): Single Molecule Real Time DNA sequencing based on zero mode waveguides

slide-70
SLIDE 70

Complete Genomics – Library Generation

Library construction Template Amplification

slide-71
SLIDE 71

Complete Genomics – Sequencing

“Complete Genomics says that by next spring it will be conducting complete genome scans for $5,000.” (BioITWorld.com, 6 January 2009)

Sequencing surface Sequencing chemistry

slide-72
SLIDE 72

Pacific Biosciences – Sequencing vessel and method

slide-73
SLIDE 73

Pacific Biosciences – Sequencing

slide-74
SLIDE 74

Nanopore Sequencing: the Next Next Next Generation Sequencing Technology (?)

slide-75
SLIDE 75