SLIDE 1 HIV-1 drug resistance testing using Oxford Nanopore’s MinION – a long-range low error sequencing approach
Alexander Thielen
AREVIR 2017
SLIDE 2
Oxford Nanopore’s technology
SLIDE 3
– long range sequencing
Oxford Nanopore’s technology
SLIDE 4
– long range sequencing – very fast
Oxford Nanopore’s technology
SLIDE 5
Oxford Nanopore’s technology
SLIDE 6
– long range sequencing – very fast – portability / global access (search for “Massimo Delledonne: MinIONs and Nanofrogs”)
Oxford Nanopore’s technology
SLIDE 7
Oxford Nanopore’s MinION
SLIDE 9
– long range sequencing – very fast – portability / global access (search for “Massimo Delledonne: MinIONs and Nanofrogs”) – relatively cheap, no capital cost – but: high error rate
Oxford Nanopore’s technology
SLIDE 10
Oxford Nanopore’s error rate
SLIDE 11
Error rate reduction using a Rolling Circle Amplification approach
SLIDE 12 RCA Approach: 1. Amplification
[Diagram: cDNA spanning PR, RT, RNaseH and INT (3.1 kb); primers M13for and M13rev with MID1 and MID2; PCR product]
SLIDE 13 Self-ligation using T4 ligase, digestion of non-circularized fragments
RCA Approach: 2. Circularization
SLIDE 14 random priming
RCA Approach: 3. Rolling Circle Amplification
isothermal amplification by phi29 polymerase
SLIDE 15 RCA Approach: 3. Rolling Circle Amplification
Final products: 80-100 kb [Diagram: copy1, copy2, ..., copyn]
Amplification products are sheared to ~20 kb
SLIDE 16
RCA Approach: 4. Library Preparation & Sequencing
SLIDE 17 RCA Approach: 5. Analysis
– Individual copies are detected in the sequenced reads and aligned against each other
– A consensus sequence is generated
[Diagram: an 80-100 kb read containing copy1, copy2, ..., copyn; the aligned copies with errors marked x, and the consensus]
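The consensus step can be illustrated with a column-wise majority vote. This is a minimal sketch, not the pipeline used in the talk: it assumes the copies have already been detected and aligned to equal length (with `-` for gaps), and the function name `consensus` and the example sequences are hypothetical.

```python
from collections import Counter


def consensus(aligned_copies):
    """Column-wise majority vote over pre-aligned copies ('-' marks a gap)."""
    if not aligned_copies:
        raise ValueError("need at least one copy")
    length = len(aligned_copies[0])
    if any(len(copy) != length for copy in aligned_copies):
        raise ValueError("copies must be aligned to the same length")
    out = []
    for column in zip(*aligned_copies):
        # Most frequent symbol in this alignment column.
        base, _ = Counter(column).most_common(1)[0]
        if base != "-":  # a majority gap means the position is dropped
            out.append(base)
    return "".join(out)


# Hypothetical toy copies, each with at most one error.
copies = ["ACGTTGCA", "ACGATGCA", "ACGTTG-A", "ACGTTGCA", "TCGTTGCA"]
print(consensus(copies))
```

With an odd number of copies and isolated errors, each column has a clear majority; ties with an even number of copies would need an explicit rule.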
SLIDE 18 – except for homopolymers, errors should occur randomly – consensus sequences should therefore reduce error rates – accuracy simulations:
RCA accuracy estimation
Amplicon count   Estimated RCA Accuracy             R7 sim.
1                75.00%  80.00%  85.00%  90.00%     77.91%
3                84.38%  89.60%  93.93%  97.20%     84.41%
5                89.65%  94.21%  97.34%  99.14%     89.38%
7                92.94%  96.67%  98.79%  99.73%     91.49%
9                95.11%  98.04%  99.44%  99.91%     93.03%
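The estimated values in the table are consistent with a simple binomial majority-vote model: a consensus position is correct when more than half of n independent copies are correct, each with single-copy accuracy p. Whether the estimates were actually produced this way is an assumption; the sketch below only shows that the model reproduces the tabulated numbers. The function name `majority_accuracy` is hypothetical.

```python
from math import comb


def majority_accuracy(p, n):
    """Probability that more than half of n independent calls are correct,
    when each call is correct with probability p (n must be odd)."""
    if n % 2 == 0:
        raise ValueError("use an odd number of copies to avoid ties")
    needed = n // 2 + 1
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(needed, n + 1))


# Rows: amplicon count; columns: single-copy accuracy of 75%, 80%, 85%, 90%.
for n in (1, 3, 5, 7, 9):
    row = [f"{100 * majority_accuracy(p, n):.2f}%" for p in (0.75, 0.80, 0.85, 0.90)]
    print(n, *row)
```

The model treats errors as independent across copies, which matches the slide's premise that errors occur randomly outside homopolymers.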
SLIDE 19 – 16 samples were prepared with the RCA approach (RCA-1D)
– for comparison, the same amplified 3.1 kb products were also sequenced with
- Illumina MiSeq, with Nextera XT tagmentation (MiSeq)
- MinION with 1D Ligation Kit without fragmentation & RCA (1D-only)
– MiSeq results were used as the gold standard
Experiment
SLIDE 20 – MinION output:
1D-only: 240,000 reads mapped onto HIV-1, 1300 bp (range: 31-3098 bp)
RCA-1D: 198,000 reads mapped onto HIV-1, 1698 bp (range: 173-82,556 bp)
Results
[Figure: read length histograms (frequency vs. log10 read length)]
SLIDE 21 – Error rates (when only one copy of the amplified region is considered):
1D-only: 14.7% in total, 6.4% substitution errors, 5.6% deletions, and 1.1% insertions
RCA-1D: 14.7% in total, 6.0% substitution errors, 6.3% deletions, and 0.8% insertions
– RCA has no effect on raw error rates
Results
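The split into substitutions, deletions and insertions can be sketched as a count over a pairwise alignment. This is an illustration under assumptions: the input is a pair of gapped, equal-length alignment strings, rates are expressed per reference base (the slides do not state the denominator), and the function name and example sequences are hypothetical.

```python
def error_rates(ref_aln, read_aln):
    """Classify differences between a gapped reference and a gapped read.

    Both strings come from one pairwise alignment and use '-' for gaps.
    Rates are per reference base (an assumption, see above).
    """
    if len(ref_aln) != len(read_aln):
        raise ValueError("alignment strings must have equal length")
    subs = dels = ins = ref_len = 0
    for ref_base, read_base in zip(ref_aln, read_aln):
        if ref_base == "-" and read_base == "-":
            continue  # empty column, nothing to count
        if ref_base == "-":
            ins += 1  # base present only in the read
            continue
        ref_len += 1
        if read_base == "-":
            dels += 1  # reference base missing from the read
        elif read_base != ref_base:
            subs += 1
    if ref_len == 0:
        raise ValueError("reference contains no bases")
    return {
        "substitution": subs / ref_len,
        "deletion": dels / ref_len,
        "insertion": ins / ref_len,
        "total": (subs + dels + ins) / ref_len,
    }


# Hypothetical alignment with one substitution, one insertion, one deletion.
rates = error_rates("ACGT-ACGTA", "ACCTTAC-TA")
print({name: f"{100 * value:.1f}%" for name, value in rates.items()})
```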
SLIDE 22 – median RCA consensus error rates:
Results
region copies   total    substitution   deletion   insertion
1               14.72%   6.03%          6.29%      0.77%
2               11.25%   5.63%          3.18%      1.00%
3               11.00%   3.49%          6.02%      0.18%
4               7.39%    1.64%          4.76%      0.00%
5               6.12%    1.22%          3.97%      0.00%
6               4.44%    0.94%          2.79%      0.00%
7               2.76%    0.00%          1.56%      0.00%
8               1.78%    0.00%          0.00%      0.00%
9               1.56%    0.00%          0.00%      0.00%
10              1.47%    0.00%          0.00%      0.00%
11              1.2%     0.00%          0.00%      0.00%
SLIDE 23 Results
[Figure: error rates (total, substitution, deletion, insertions) for 1 to 12 region copies; y-axis 0% to 16%]
SLIDE 24 RCA accuracy estimation
Amplicon count   Estimated RCA Accuracy             R7 sim.   R9 exp.
1                75.00%  80.00%  85.00%  90.00%     77.91%    85.28%
3                84.38%  89.60%  93.93%  97.20%     84.41%    89.00%
5                89.65%  94.21%  97.34%  99.14%     89.38%    93.88%
7                92.94%  96.67%  98.79%  99.73%     91.49%    97.24%
9                95.11%  98.04%  99.44%  99.91%     93.03%    98.44%
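The R9 exp. column equals 100% minus the median total consensus error reported on slide 22 for the same number of region copies. A short check of that arithmetic, with the dictionary name chosen here for illustration:

```python
# Median total consensus error rates (%) from slide 22, keyed by region copies.
median_total_error = {1: 14.72, 3: 11.00, 5: 6.12, 7: 2.76, 9: 1.56}

# Experimental accuracy is the complement of the total error rate.
r9_exp = {copies: round(100 - error, 2) for copies, error in median_total_error.items()}

for copies, accuracy in r9_exp.items():
    print(f"{copies} copies: {accuracy:.2f}%")
```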
SLIDE 25 – Basecalling errors at different sensitivity cut-offs: – Basecalling errors almost entirely within or adjacent to homopolymers
Results
[Figure: basecalling errors at sensitivity cut-offs of 5%, 10%, 15% and 20%; x-axis 2 to 12, y-axis 0% to 35%]
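Checking whether an error position lies within or adjacent to a homopolymer can be sketched as follows. The minimum run length of 3 and the one-base flank are assumptions made for illustration (the slides do not define either), and both function names are hypothetical.

```python
import re


def homopolymer_runs(seq, min_len=3):
    """Return (start, end) half-open intervals of runs of one repeated base."""
    pattern = r"(.)\1{%d,}" % (min_len - 1)
    return [(m.start(), m.end()) for m in re.finditer(pattern, seq)]


def near_homopolymer(pos, runs, flank=1):
    """True if pos is inside a run or within `flank` bases of one."""
    return any(start - flank <= pos < end + flank for start, end in runs)


# Hypothetical reference fragment with runs AAAA (3-6) and GGG (9-11).
reference = "ACGAAAATCGGGTC"
runs = homopolymer_runs(reference)
for position in (1, 4, 7, 13):
    print(position, near_homopolymer(position, runs))
```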
SLIDE 26
16-76608:
M184V, K219R, E138K: ~100%
V108I: 77%
V179I: 30%
H221Y: 24%
V106I: 4.4%
Outlook: Long-range covariation analysis
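Because each long read spans the whole region, covariation can in principle be assessed by counting which mutations occur together on the same read. A minimal sketch, assuming each read has already been reduced to the set of mutations it carries; the reads below are invented toy data, not results from the talk, and `pair_cooccurrence` is a hypothetical name.

```python
from itertools import combinations


def pair_cooccurrence(reads):
    """For each mutation pair, return (observed joint frequency,
    frequency expected if the two mutations were independent)."""
    total = len(reads)
    if total == 0:
        raise ValueError("need at least one read")
    mutations = sorted(set().union(*reads))
    freq = {m: sum(m in read for read in reads) / total for m in mutations}
    result = {}
    for a, b in combinations(mutations, 2):
        observed = sum(a in read and b in read for read in reads) / total
        result[(a, b)] = (observed, freq[a] * freq[b])
    return result


# Invented toy reads: each set lists the mutations seen on one read.
toy_reads = [
    {"M184V", "V108I"},
    {"M184V", "V108I", "V179I"},
    {"M184V"},
    {"M184V", "V108I"},
]
for pair, (observed, expected) in pair_cooccurrence(toy_reads).items():
    print(pair, f"observed={observed:.2f}", f"expected={expected:.4f}")
```

An observed joint frequency well above the expected value would suggest that the two mutations are linked on the same genomes; a real analysis would also need to account for the residual error rate.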
SLIDE 27 Acknowledgments
Kirsten Becker Nina Engel Anja Förster Anna Memmer Bettina Spielberger Martin Däumer Bernhard Thiele