RIPTIDE Enabled Next Generation Genotyping
CONFIDENTIAL
Acknowledgements
Next Generation Genotyping Workflow (Same for all RIPTIDE Applications)
Sample Extraction (RNA/DNA) -> Library Preparation (RNA/DNA) -> Sequencing -> De-multiplexing
Result: Variant Call File for each genome
RIPTIDE Workflow
RIPTIDE Workflow: 960 Individually Barcoded Samples
RIPTIDE Size Selection (Included in Kit)
SPRI bead-based size selection, flexible for multiple insert size ranges
a) Pre-size-selected library produces an equimolar amount of product from 200 bp to 2 kb+.
b) Example of SPRI bead size-selected material with 500 bp mean insert length. Options for larger/smaller inserts are included in the protocol.
RIPTIDE Data Analysis: Read Structure
Read 1 (150 nt): Sample Barcode (8 nt, bases identifying the sample) + Random Sequence (12 nt) + Template Bases (130 nt)
Read 2 (150 nt): Random Sequence (8 nt) + Template Bases (142 nt)
Plate Barcode (Index): 6 nt, bases identifying the plate
Adapters: P5 Adapter, P7 Adapter
Fragment regions: Derived from primer A, Insert, Derived from primer B
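The read structure above can be expressed as a small parser. This is a minimal sketch, assuming the segment lengths on the slide (8 nt sample barcode and 12 nt random sequence at the start of Read 1, 8 nt random sequence at the start of Read 2). The position of the random bases within Read 2, and all function and field names, are assumptions for illustration and are not part of fgbio or any RIPTIDE tooling.

```python
from typing import NamedTuple

SAMPLE_BARCODE_LEN = 8   # Read 1: bases identifying the sample
R1_RANDOM_LEN = 12       # Read 1: random sequence following the barcode
R2_RANDOM_LEN = 8        # Read 2: random sequence (assumed to lead the read)


class ParsedPair(NamedTuple):
    sample_barcode: str
    r1_template: str
    r2_template: str


def parse_read_pair(read1: str, read2: str) -> ParsedPair:
    """Split a read pair into its sample barcode and template bases."""
    r1_prefix = SAMPLE_BARCODE_LEN + R1_RANDOM_LEN
    if len(read1) <= r1_prefix or len(read2) <= R2_RANDOM_LEN:
        raise ValueError("read too short to contain barcode and random bases")
    barcode = read1[:SAMPLE_BARCODE_LEN]
    r1_template = read1[r1_prefix:]       # skip barcode + random sequence
    r2_template = read2[R2_RANDOM_LEN:]   # skip random sequence
    return ParsedPair(barcode, r1_template, r2_template)


# Synthetic 150 nt reads, for illustration only.
read1 = "ACGTACGT" + "N" * 12 + "A" * 130
read2 = "N" * 8 + "C" * 142
parsed = parse_read_pair(read1, read2)
print(parsed.sample_barcode, len(parsed.r1_template), len(parsed.r2_template))
```

With 150 nt reads this leaves 130 template bases in Read 1 and 142 in Read 2, matching the lengths on the slide.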
RIPTIDE: One Technology – Many Applications
- Whole Genome Sequencing
- Microbiome
- Genotyping by Sequencing
- Targeted Sequencing
- RNA Sequencing
- Phasing / Haplotyping

Sample | # of Barcodes | Mean Coverage (unique) | # hets called | % hets phased | N50 Haplotype block
NA128781 | 1544 | 153x | 2,612,866 | 99.9963% | 16,791,622
Jurkat | 6144 | 21x | 2,229,468 | 99.2557% | 2,261,063
NA04510 | 6144 | 34x | 2,302,366 | 99.6859% | 3,732,649
NA05289 | 6144 | 28x | 2,165,774 | 99.5555% | 1,149,861
NA11410 | 6144 | 30x | 2,653,959 | 99.6975% | 4,753,710
NA11629 | 6144 | 29x | 2,278,703 | 99.6456% | 1,582,026
NA13707 | 6144 | 11x | 1,646,406 | 96.7802% | 430,602
NA14622 | 6144 | 27x | 2,318,691 | 99.8724% | 7,255,334
RIPTIDE Generates Uniform Coverage and is Reproducible
Heat map of wheat genomes: Chinese Spring (Reads/Mbp)
Riptide Enabled Next Generation Genotyping (NGG): Simple Description
Homozygous (inbred) parental lines
- Shallow sequencing of progeny determines recombination “boundaries”
- Known (parental) genotypes are “assigned”
Deep Sequencing of parental lines
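The idea above can be sketched in a few lines. This is a minimal illustration, assuming each informative progeny read has already been labeled with the parent ("A" or "B") whose allele it carries. The fixed-window majority vote, the window size, and every function name here are assumptions for illustration, not the method used by RIPTIDE or any specific pipeline.

```python
from collections import Counter, defaultdict


def call_windows(observations, window_size):
    """Majority-vote the parent of origin in fixed-size windows.

    observations: iterable of (position, parent) pairs, where parent is
    "A" or "B" depending on which parental allele a progeny read carried.
    Returns {window_index: parent} for windows with a clear majority.
    """
    votes = defaultdict(Counter)
    for position, parent in observations:
        votes[position // window_size][parent] += 1
    calls = {}
    for window, counts in votes.items():
        ranked = counts.most_common(2)
        # Call the window only if one parent strictly outvotes the other.
        if len(ranked) == 1 or ranked[0][1] > ranked[1][1]:
            calls[window] = ranked[0][0]
    return calls


def find_boundaries(calls):
    """Return pairs of consecutive called windows whose parent differs."""
    windows = sorted(calls)
    return [(a, b) for a, b in zip(windows, windows[1:]) if calls[a] != calls[b]]


def impute(position, calls, parental_genotypes, window_size):
    """Assign the known parental genotype at a site, or None if uncalled."""
    parent = calls.get(position // window_size)
    return None if parent is None else parental_genotypes[parent][position]


# Hypothetical low-coverage observations and deep-sequenced parental genotypes.
observations = [(100, "A"), (450, "A"), (900, "A"),
                (1200, "B"), (1700, "B"), (1950, "A"), (1990, "B")]
parental_genotypes = {"A": {500: "G/G", 1500: "T/T"},
                      "B": {500: "C/C", 1500: "A/A"}}

calls = call_windows(observations, window_size=1000)
boundaries = find_boundaries(calls)
genotype_500 = impute(500, calls, parental_genotypes, window_size=1000)
genotype_1500 = impute(1500, calls, parental_genotypes, window_size=1000)
```

In this toy example the progeny carries parent A in the first window and parent B in the second, so the recombination boundary falls between them and each site inherits the genotype of the parent called for its window.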
Empirical Evidence Shows High Concordance to Arrays: Maize
- Cross: B73 and LH82 lines
- 0.01x coverage shows > 99% concordance to AFFX 600K array
- 2.4 Gbp genome @ 0.01x coverage (0.024 Gbp) = < $6 per sample*
(*NovaSeq S4 flow cell, 2 x 150, 100K samples per flow cell)
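The sequencing arithmetic behind these figures can be checked with a short calculation. This is a rough sketch using only the numbers on the slide. It assumes every sequenced base is usable (no duplicates, adapter, or barcode overhead), and it does not attempt to reproduce the per-sample cost, since no flow cell price is given.

```python
GENOME_SIZE_BP = 2.4e9           # maize genome size from the slide
COVERAGE = 0.01                  # target depth per sample
READ_LENGTH = 150                # 2 x 150 paired-end reads
SAMPLES_PER_FLOW_CELL = 100_000  # from the slide footnote

bases_per_sample = GENOME_SIZE_BP * COVERAGE
read_pairs_per_sample = bases_per_sample / (2 * READ_LENGTH)
bases_per_flow_cell = bases_per_sample * SAMPLES_PER_FLOW_CELL

print(f"{bases_per_sample / 1e9:.3f} Gbp per sample")
print(f"{read_pairs_per_sample:,.0f} read pairs per sample")
print(f"{bases_per_flow_cell / 1e12:.1f} Tbp per flow cell")
```

Under these assumptions each sample needs about 0.024 Gbp, or roughly 80,000 read pairs, and 100K samples together need about 2.4 Tbp.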
What about a more complicated scenario?
NGG for Accurate Human Genotyping Requires a Bit More Evidence (Reads)
How much sequencing is needed to accurately determine the correct haplotype and assign (impute) all the genotypes in a given genomic region?
Human Data Generation
- Samples obtained from Coriell Institute
- 4 µl of DNA input into RIPTIDE (no sample QC performed)
- Multiplexed libraries sent to Macrogen for sequencing
- Demux using fgbio…
- …then Gencove for VCF generation for 38M bi-allelic variants*
- RTG Tools vcfeval used for comparison
- ILMN GSA array variants in NIST high confidence regions used as "truth" set
*Gencove now calls > 80M variants for human
960 Human Genomes Across a Full NovaSeq S4 Flow Cell
Number of samples per flow cell depends on goals of study.
384 Human Genomes: (96 Per Lane on NovaSeq S4 Flow Cell)
[Bar chart, y-axis 50.0% to 100.0%]
- SNV Precision: 99.7%
- SNV Sensitivity: 98.2%
- SNV Accuracy: 98.9%
- MAF <1% Precision: 98.7%
- MAF 1-5% Precision: 99.0%
- MAF >5% Precision: 99.6%
- SV Precision: 92.5%
- SV Sensitivity: 87.5%
- SV Accuracy: 90.0%
- Replicate Precision: 99.9%
Summary Stats (mean) for JPT and YRI samples (n=288) on autosome variant calls
Wellderly (96, right), YRI (96) & JPT (96 x 2)
Wellderly: NGG vs Illumina 40X WGS Gold Standard
[Three plots against Read Count (x-axis, 0 to 1.4E+08; y-axis, 0.9 to 1):]
- Accuracy: f-measure
- Sensitivity: TP / (TP + FN)
- Precision: TP / (TP + FP)
Precision, Sensitivity, and Accuracy for 288 Individual Samples
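The three metrics can be computed from true positive (TP), false positive (FP), and false negative (FN) counts. This is a minimal sketch. Precision and sensitivity follow the formulas on the slide, and the f-measure is assumed here to be the standard F1 score (the harmonic mean of precision and sensitivity), which the slide does not spell out. The counts are hypothetical.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of called variants that agree with the truth set."""
    return tp / (tp + fp)


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of truth-set variants that were called."""
    return tp / (tp + fn)


def f_measure(p: float, s: float) -> float:
    """Harmonic mean of precision and sensitivity (assumed F1 definition)."""
    return 2 * p * s / (p + s)


# Hypothetical counts for one sample, for illustration only.
tp, fp, fn = 980, 5, 20
p = precision(tp, fp)
s = sensitivity(tp, fn)
f = f_measure(p, s)
print(f"precision={p:.4f} sensitivity={s:.4f} f-measure={f:.4f}")
```

The functions do not guard against zero denominators, so a sample with no calls or no truth-set variants would need separate handling.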
Precision (Concordance) by MAF: GSA
[Plot: Precision (y-axis, 98.0% to 100.0%) vs. Minor Allele Frequency (x-axis, 0% to 100%)]
Precision: TP / (TP + FP) by Minor Allele Frequency
Comparison of Human Genotyping Products
Potential for Single Phase GWAS and Causal Variant Discovery
- Merge FASTQ files from individual low-pass samples (case/control)
- Treat cases and controls as high-coverage individual samples
- Perform genotype calling
- Call novel variants
- Identify causal candidates
Reference: https://doi.org/10.1038/s41576-018-0016-z
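The first step, merging per-sample FASTQ files by case/control status, could look like the sketch below. It is an illustration only: the function names, the sample-to-group mapping, and the output naming are hypothetical, and the files in the demo are synthetic.

```python
import shutil
import tempfile
from pathlib import Path


def merge_fastq(inputs, output):
    """Concatenate FASTQ files byte-for-byte into one file.

    Works for plain-text FASTQ, and for gzip files as well because
    concatenated gzip members form a valid gzip stream. All inputs must
    be of the same kind (all plain or all gzip).
    """
    output = Path(output)
    with output.open("wb") as merged:
        for path in inputs:
            with Path(path).open("rb") as source:
                shutil.copyfileobj(source, merged)
    return output


def merge_by_group(sample_files, groups, out_dir, suffix=".fastq"):
    """Merge per-sample FASTQs into one file per group (e.g. case/control).

    sample_files: {sample_id: fastq_path}; groups: {sample_id: group_name}.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    by_group = {}
    for sample_id in sorted(sample_files):  # fixed order for reproducibility
        by_group.setdefault(groups[sample_id], []).append(sample_files[sample_id])
    return {group: merge_fastq(paths, out_dir / f"{group}{suffix}")
            for group, paths in by_group.items()}


# Demo with three synthetic single-record FASTQ files.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    files = {}
    for sample_id in ("s1", "s2", "s3"):
        path = tmp / f"{sample_id}.fastq"
        path.write_bytes(f"@{sample_id}_read1\nACGT\n+\nIIII\n".encode())
        files[sample_id] = path
    groups = {"s1": "case", "s2": "control", "s3": "case"}
    merged = merge_by_group(files, groups, tmp / "merged")
    case_bytes = merged["case"].read_bytes()
    control_bytes = merged["control"].read_bytes()
```

For paired-end data, Read 1 and Read 2 files would need to be merged separately in the same sample order so that mates stay in sync.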
Reference: University of Liege
“Reagent costs to sequence a mammalian genome at 1-fold depth are now <20€ thus making this a cost-effective proposition. As a matter of fact, the method is being deployed in the field in several countries.”
SNP-based quantitative deconvolution of biological mixtures: application to the detection of cows with subclinical mastitis by whole genome sequencing of tank milk
− Wouter Coppieters, Latifa Karim, Michel Georges
RIPTIDE + Gencove Promotion: Purchase a Kit, Get Your Analysis for Free!
Reference: https://docs.gencove.com/main/#data-analysis-configurations
Back to technology
RIPTIDE Tunability and Customization
RIPTIDE Kit Includes Hi/Low GC Primers for Tunability
- Easily tunable to GC content
- Species representation is maintained
RIPTIDE Targeted Spike-ins: In Development
TTR Gene: 17 targeted primer annealing sites spaced 300-600bp across 9.564kb
IGV browser images
CRISPR-mediated Depletion Protocol: (A JumpCode Technology)
Total cellular RNA sample -> NGS library prep -> NGS library (rRNA sequences in purple)
Post-library ribodepletion:
- CRISPR Cas9 digestion of library (or multiplexed libraries)
- Size selection to remove short fragments
- PCR amplification
CRISPR-mediated Depletion Protocol: (A JumpCode Technology)
- Cas9/gRNA RNP formation (10 mins at room temp)
- Cas9/gRNA digestion of library (1 hr at 37°C)
- Size Selection (0.6X Ampure Beads)
- PCR
- Size Selection (0.6X Ampure Beads)
JumpCode’s CRISPR repeat depletion applied to wheat
Reduction of genome size: 53% (78% goal)
- Genome size (Gbp): 15.5 predepletion, 7.3 post depletion
Enrichment increase of CDS coverage: 14x
- % reads coding: 0.31% predepletion, 4.25% post depletion
- Theoretical max = 8.49%
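The headline figures for the wheat experiment follow from the plotted values. The short check below is a sketch that uses only the numbers from the slide. The variable names are illustrative, and the comparison against the theoretical maximum is an added calculation, not a claim made in the deck.

```python
genome_pre_gbp = 15.5     # genome size before depletion
genome_post_gbp = 7.3     # genome size after depletion
coding_pre = 0.0031       # fraction of reads coding, predepletion
coding_post = 0.0425      # fraction of reads coding, post depletion
theoretical_max = 0.0849  # theoretical maximum fraction of reads coding

size_reduction = 1 - genome_post_gbp / genome_pre_gbp
enrichment = coding_post / coding_pre
fraction_of_max = coding_post / theoretical_max

print(f"genome size reduction: {size_reduction:.0%}")
print(f"CDS enrichment: {enrichment:.1f}x")
print(f"share of theoretical max reached: {fraction_of_max:.0%}")
```

The reduction comes out at about 53% and the enrichment at about 13.7x, which rounds to the 14x shown on the slide.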
Conclusions
- High-quality data
- Simple, cost-effective
- Tunable to the data you want
- Just the beginning…