Insights from the first RT-qPCR based human transcriptome profiling
Slide 1 of 38
Insights from the first RT-qPCR based human transcriptome profiling based on wet lab validated assays
Jan Hellemans, PhD
CEO Biogazelle
qPCR & NGS 2013
Freising, Germany
March 19, 2013
Biogazelle
Slide 2 of 38
Biogazelle
Slide 3 of 38
The Biogazelle team and collaborators
Biogazelle
- Barbara D’haene
- Pieter Mestdagh
- Gaëlle Van Severen
- Nele Nijs
- Anthony Van Driessche
- Manuel Luypaert
- Shana Robbrecht
- Ariane Deganck
- Jo Vandesompele
Ghent University
- Steve Lefever
VIB Nucleomics Core
Bio-Rad
Slide 4 of 38
qPCR: reference technology for nucleic acid quantification
- sensitivity and specificity
- wide dynamic range
- speed
- relatively low cost
- conceptual and practical simplicity
easy to perform ≠ easy to do it right
many steps involved, all need to be right
Introduction
Slide 5 of 38
Prepare – cycle – report
P (prepare): experiment design, samples, assays
C (cycle)
R (report): relative quantification, quality control, statistical analysis
Slide 7 of 38
Assays & MIQE
design
- amplicon length
- primer positions (exonic or intron-spanning)
- transcript coverage
in-silico
- specificity prediction (retropseudogenes and other homologues)
- secondary structure analysis
wet lab
- specificity assessment (gel, melt, sequence)
- Cq of NTC (for SYBR assays)
- amplification efficiency determination (slope, E, SE(E), r²)
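A minimal sketch of how the wet-lab efficiency metrics listed above (slope, E, SE(E), r²) could be derived from a dilution series, assuming an ordinary least-squares fit of Cq against log10 of the input copy number. The function name `standard_curve` and the example Cq values are illustrative assumptions, not data from the talk. E is expressed here as the per-cycle amplification base, so E = 2 corresponds to 100% efficiency.

```python
import math
from typing import Sequence, Tuple


def standard_curve(copies: Sequence[float], cq: Sequence[float]) -> Tuple[float, float, float, float]:
    """Return (slope, E, SE(E), r2) from Cq regressed on log10(input copies)."""
    if len(copies) != len(cq) or len(copies) < 3:
        raise ValueError("need at least three paired (copies, Cq) points")
    x = [math.log10(c) for c in copies]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(cq) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, cq))
    syy = sum((yi - mean_y) ** 2 for yi in cq)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, cq))
    r2 = 1.0 - ss_res / syy
    se_slope = math.sqrt(ss_res / (n - 2) / sxx)
    efficiency = 10 ** (-1.0 / slope)
    # First-order error propagation: dE/dslope = E * ln(10) / slope**2.
    se_efficiency = efficiency * math.log(10) * se_slope / slope ** 2
    return slope, efficiency, se_efficiency, r2


# Hypothetical, noise-free 7-point series from 2E7 down to 2E1 copies.
copies = [2 * 10 ** k for k in range(7, 0, -1)]
cq = [38.0 - math.log2(c) for c in copies]
slope, efficiency, se_efficiency, r2 = standard_curve(copies, cq)
print(f"slope={slope:.3f} E={efficiency:.3f} ({(efficiency - 1) * 100:.0f}%) r2={r2:.4f}")
```

Real dilution series contain noise, so SE(E) and r² would be non-trivial there; the ideal data above only demonstrate that the fit recovers E = 2.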
Slide 8 of 38
Dealing with MIQE
DIY experts in qPCR
spend a lot of effort in doing it right
DIY novel to qPCR
adhering to the MIQE guidelines is a challenge
users of commercial assays
if they sell it, it must be good
Slide 9 of 38
Dealing with MIQE
DIY experts in qPCR
spend a lot of effort in doing it right
→ save time
DIY novel to qPCR
adhering to the MIQE guidelines is a challenge
→ focus on biological question rather than technical qPCR challenges
users of commercial assays
if they sell it, it must be good
→ have proof that it is good
Slide 10 of 38
The perfect assay
Properties of the perfect assay
- specific for the gene of interest (no off-target amplification)
- detection of all transcript variants
- detection not affected by polymorphisms (no allelic bias or drop out)
- amplification efficiency ~100%
- no gDNA co-amplification
- no primer dimer formation
Slide 11 of 38
The perfect assay
Some genes cannot have a perfect assay
- no unique sequences (homology with other genes – pseudogenes)
- not a single part of the gene occurs in all transcripts
- regions are excluded because of repeats, secondary structures, SNPs, homology, ...
Make the best possible compromise and report any potential issues
Design → in-silico quality control → lab validation
Slide 12 of 38
Assay designs
primerXL (UGent)
- database of genomic information
- tools for target region selection
Slide 13 of 38
Gene sequence fragmentation for target region selection
1 gene, 3 transcripts, 6 fragments (coverage frequency 1 to 3)
[Figure: gene with transcripts 1 to 3; fragment coverage frequencies from left to right: 2, 3, 2, 3, 2, 1]
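The fragmentation idea can be sketched as follows: cut the gene at every exon boundary and count how many transcripts contain each resulting fragment. This is only an illustration of the concept, not the primerXL implementation. The function name `fragment_gene` and the exon coordinates are hypothetical, chosen so that three transcripts yield six fragments with the coverage pattern shown on the slide (2, 3, 2, 3, 2, 1).

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # half-open interval [start, end)


def fragment_gene(transcripts: Dict[str, List[Interval]]) -> List[Tuple[Interval, int]]:
    """Split a gene at every exon boundary and count the transcripts covering each fragment."""
    # Every exon start or end is a potential breakpoint.
    breakpoints = sorted({pos for exons in transcripts.values() for iv in exons for pos in iv})
    fragments = []
    for start, end in zip(breakpoints, breakpoints[1:]):
        # A transcript covers the fragment if one of its exons contains it entirely.
        coverage = sum(
            any(s <= start and end <= e for s, e in exons)
            for exons in transcripts.values()
        )
        if coverage > 0:  # skip gaps that no transcript covers
            fragments.append(((start, end), coverage))
    return fragments


# Hypothetical exon coordinates for one gene with three transcripts.
transcripts = {
    "transcript 1": [(0, 400), (500, 600)],
    "transcript 2": [(0, 200), (300, 500)],
    "transcript 3": [(100, 500)],
}
fragments = fragment_gene(transcripts)
coverages = [cov for _, cov in fragments]
for (start, end), cov in fragments:
    print(f"{start:>4}-{end:<4} covered by {cov} transcript(s)")
```

Fragments with coverage equal to the number of transcripts are the natural first choice for an assay that should detect all transcript variants.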
Slide 14 of 38
Assay designs
primerXL (UGent)
- database of genomic information
- tools for target region selection
- primer3 based primer design
- analysis of secondary structures and SNPs in primer binding regions
- specificity prediction (BiSearch)
- relaxation cascade
Slide 15 of 38
BiSearch specificity prediction
BiSearch loose
only the gene of interest
BiSearch strict
Slide 16 of 38
BiSearch specificity prediction
BiSearch loose
only the gene of interest (FFAR2)
BiSearch strict
reads | seq | gene_list | official_symbol | location
2843 | CATGGCAGTCACCATCTTCTGCTACTGGCGTTTTGTGTGGATCATGCTCTCCCAGCCCCTTGTGGGGGCCCAGAGGCGGCGCCGAGCCGTGGGGCTGGCTGTGGTGACGCTGCTCAATTTCCTGGTGTGCTTCGGACCTTACAGATCGGAA | ENSG00000126262 | FFAR2 | 19:35940617-35942667
1897 | GTAAGGTCCGAAGCACACCAGGAAATTGAGCAGCGTCACCACAGCCAGCCCCACGGCTCGGCGCCGCCTCTGGGCCCCCACAAGGGGCTGGGAGAGCATGATCCACACAAAACGCCAGTAGCAGAAGATGGTGACTGCCATGAGATCGGAA | ENSG00000126262 | FFAR2 | 19:35940617-35942667
1535 | GTAAGGTCCGAAGCACACCGAGAGCTGGGAGCAGGAGCTACACAGTCTGCTGGCCTCACTGCACACCCTGCTGGGGGCCCTGTACGAGGGAGCAGAGACTGCTCCTGTGCAGAATGAAGGCCCTGGGGTGGAGATGCTGCTGTCCTCAGAA | ENSG00000141456 | AC091153.1 | 17:4574680-4607632
1097 | CATGGCAGTCACCATCTTCTGAGGACAGCAGCATCTCCACCCCAGGGCCTTCATTCTGCACAGGAGCAGTCTCTGCTCCCTCGTACAGGGCCCCCAGCAGGGTGTGCAGTGAGGCCAGCAGACTGTGTAGCTCCTGCTCCCAGCTCTCGG | ENSG00000141456 | AC091153.1 | 17:4574680-4607632
1091 | CATGGCAGTCACCATCTTCTGAGGACAGCAGCATCTCCACCCCAGGGCCTTCATTCTGCACAGGAGCAGTCTCTGCTCCCTCGTACAGGGCCCCCAGCAGGGTGTGCAGTGAGGCCAGCAGACTGTGTAGCTCCTGCTCCCAGCTCTCGGT | ENSG00000141456 | AC091153.1 | 17:4574680-4607632
Slide 17 of 38
Gene homology prevents perfect designs
1532 / 2043 (75%) of genes without perfect design have homologous genes that differ by less than 12.5% (2 variations per 16 bases)
[Figure: distances (ClustalW) between all genes without perfect design; x-axis: gene index, y-axis: distance (0% to 50%)]
Slide 18 of 38
Wet lab validation
PCR composition
- total volume: 5 µl
- instrument: CFX-384 (with automation)
- mastermix: SsoAdvanced SYBR
- primer conc: 250 nM each
PCR program
- default cycling protocol for SsoAdvanced SYBR (Ta = 60°C)
Samples
- cDNA: 25 ng (total RNA equivalents – Agilent Universal human reference RNA)
- gDNA: 2.5 ng (Roche)
- NTC: water + carrier (5 ng/µl yeast transfer RNA)
- synthetic template (pooled 60-mers in concentration range: 2E7 – 2E1 copies)
Slide 19 of 38
Some numbers
- lab validation of 50 133 assays (human and mouse)
- 829 056 reactions
- 2 159 PCR plates (384-well)
- equivalent to 8 636 PCR plates (96-well)
172m
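The plate counts on this slide follow directly from the reaction total, which the short check below reproduces. It assumes every well of every 384-well plate was used, which is consistent with the reported numbers; the variable names are illustrative.

```python
WELLS_384 = 384
WELLS_96 = 96

reactions = 829_056  # total reactions reported on the slide
assays = 50_133      # validated assays reported on the slide

plates_384 = reactions // WELLS_384
plates_96 = plates_384 * WELLS_384 // WELLS_96
reactions_per_assay = reactions / assays

print(plates_384)                     # number of full 384-well plates
print(plates_96)                      # same reactions expressed as 96-well plates
print(round(reactions_per_assay, 1))  # average reactions spent per assay
```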
Slide 20 of 38
Two generations of external oligonucleotide standards
Vermeulen et al., Nucleic Acids Research, 2009
- 55-mer
- standard desalted
- 3’ blocked to prevent elongation
- 5-point dilution series: 150 000 → 15 molecules
New approach: easier + cheaper + as good
- 60-mer
- first (5’) and last (3’) 30 nucleotides of amplicon sequence
- standard desalted
- no 3’ blocking
- 7-point dilution series: 20 000 000 → 20 molecules
[Figure: oligonucleotide layouts; labels: stuffer, FP, RCRP, 30 nt 5’, 30 nt 3’]
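A minimal sketch of the second-generation standard described above: the 60-mer is taken as the first 30 and last 30 nucleotides of the amplicon, and both dilution series are treated as 10-fold steps, which matches the stated end points. The function names and the example amplicon are hypothetical placeholders, not sequences from the study.

```python
from typing import List


def synthetic_template(amplicon: str, flank: int = 30) -> str:
    """Join the first and last `flank` nucleotides of the amplicon into one oligo."""
    if len(amplicon) < 2 * flank:
        raise ValueError("amplicon is shorter than the two flanks combined")
    return amplicon[:flank] + amplicon[-flank:]


def dilution_series(top_copies: int, points: int, factor: int = 10) -> List[int]:
    """Copy numbers for a serial dilution, starting from the most concentrated point."""
    return [top_copies // factor ** i for i in range(points)]


# Hypothetical 80-nt amplicon used only to show the construction.
amplicon = "ACGT" * 20
template = synthetic_template(amplicon)

first_generation = dilution_series(150_000, 5)      # 55-mer standards
second_generation = dilution_series(20_000_000, 7)  # 60-mer standards

print(len(template), template)
print(first_generation)
print(second_generation)
```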
Slide 21 of 38
Synthetic templates are equivalent to natural templates
metric | ds template | ss oligo
r² < 0.99 | 1 | 1
median E | 2.00 | 2.01
average E | 2.00 | 2.01
count E outside [1.90-2.10] | 1 | 3
paired t-test p-value | 0.14
[Figure: comparison between short ss synthetic template and full length ds template, > 300 assays; axis ticks 20 to 20 000 000 and 10 to 35]
Slide 22 of 38
Efficiency evaluation
amplification efficiency
- 6 orders of magnitude
- 20 – 20M copies
- linear over entire range
- LOD (LOQ) ≤ 20 molecules
- E in 90-110% range
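For orientation, the acceptance window can be translated into standard-curve slopes. The relations below are the standard textbook definitions rather than anything specific to this study; the symbols $m$ (slope), $b$ (intercept) and $N_0$ (input copies) are introduced here for illustration.

```latex
\begin{align*}
  C_q &= b + m \,\log_{10} N_0, \qquad E = 10^{-1/m}, \qquad E_{\%} = (E - 1)\times 100 \\
  E = 2.0 \;(100\%) &\;\Rightarrow\; m = -\frac{1}{\log_{10} 2} \approx -3.32 \\
  E = 1.9 \;(90\%)  &\;\Rightarrow\; m = -\frac{1}{\log_{10} 1.9} \approx -3.59 \\
  E = 2.1 \;(110\%) &\;\Rightarrow\; m = -\frac{1}{\log_{10} 2.1} \approx -3.10 \\
  \Delta C_q \text{ over 6 orders of magnitude at } E = 2 &= 6 \times 3.32 \approx 19.9 \text{ cycles}
\end{align*}
```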
Slide 23 of 38
Efficiency distribution (n = 50 133)
89%
Slide 24 of 38
Efficiency distribution (n = 50 133)
89% redesign redesign
Slide 25 of 38
NGS as preferred method for specificity assessment
amplicon sizing (+ melt analysis for SYBR assays)
- limited sensitivity for detecting low-level non-specific coamplification
- failure to observe non-specific amplification of sequences with similar size and/or Tm
  e.g. expressed pseudogenes or homologous genes
Next level of specificity assessment
- in-silico specificity predictions by BiSearch
- massively parallel sequencing of pooled PCR products
- average coverage > 1000-fold → lab specificity > 99.9%
- 50 – 200 times more sensitive than size analysis and Sanger sequencing
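A sketch of how on-target specificity might be computed from sequencing read counts, and why about 1000-fold coverage is needed to resolve 99.9% specificity: a single off-target read among 1000 already represents 0.1%. The read counts are the five rows visible in the FFAR2 table earlier in the deck; since the full table is not shown, the resulting fraction is illustrative only. The function names are assumptions.

```python
from collections import Counter
from typing import Iterable, Tuple


def on_target_fraction(reads: Iterable[Tuple[str, int]], target: str) -> float:
    """Fraction of sequenced reads that map to the intended gene."""
    totals: Counter = Counter()
    for gene, count in reads:
        totals[gene] += count
    total = sum(totals.values())
    if total == 0:
        raise ValueError("no reads")
    return totals[target] / total


def detection_limit(total_reads: int) -> float:
    """Smallest off-target fraction that shows up as at least one read."""
    return 1 / total_reads


# (official_symbol, reads) pairs copied from the visible rows of the FFAR2 table.
ffar2_reads = [
    ("FFAR2", 2843),
    ("FFAR2", 1897),
    ("AC091153.1", 1535),
    ("AC091153.1", 1097),
    ("AC091153.1", 1091),
]
fraction = on_target_fraction(ffar2_reads, "FFAR2")
print(f"on-target: {fraction:.1%}")
print(f"detection limit at 1000 reads: {detection_limit(1000):.1%}")
```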
Slide 26 of 38
Most assays are 100% on-target
Slide 27 of 38
2/3 of non-specific assays may go unnoticed without NGS
[Figure: assays with off-target reads; distribution of % on-target (0% to 100%) in bins from 0 < x < 0.1 to 0.9 < x < 1]
Slide 28 of 38
MIQE compliant PrimePCR assay validation data sheet
Slide 29 of 38
MAQC
Micro-array quality control study
- A = universal RNA
- B = brain RNA
- C = ¾ A + ¼ B
- D = ¼ A + ¾ B
~ 100,000 PCRs
- reproducibility
- titration response
- accuracy
- dynamic range
[Figure: sample order 1 = A, 2 = C, 3 = D, 4 = B]
Slide 30 of 38
Reproducibility (n = 3 678)
Expression A = expression B → expression C = expression D
Slide 31 of 38
Titration response (n = 18 437)
Expression A > Expression B → A > C > D > B
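The titration logic can be illustrated with the MAQC mixing ratios (C = ¾ A + ¼ B, D = ¼ A + ¾ B). The sketch below works on linear expression values such as copy numbers, not on Cq values; the function names and the example numbers are hypothetical.

```python
from typing import Dict


def mix(a: float, b: float, fraction_a: float) -> float:
    """Expected expression in a mixture containing `fraction_a` of sample A."""
    return fraction_a * a + (1 - fraction_a) * b


def maqc_samples(a: float, b: float) -> Dict[str, float]:
    """Expected expression in the four MAQC samples given pure A and pure B."""
    return {"A": a, "C": mix(a, b, 0.75), "D": mix(a, b, 0.25), "B": b}


def titration_ok(s: Dict[str, float]) -> bool:
    """True if the samples are strictly ordered A > C > D > B or A < C < D < B."""
    return s["A"] > s["C"] > s["D"] > s["B"] or s["A"] < s["C"] < s["D"] < s["B"]


# Hypothetical gene: 1000 copies in A, 200 copies in B.
samples = maqc_samples(1000.0, 200.0)
print(samples, titration_ok(samples))

# Equal expression in A and B gives equal expression in C and D (reproducibility case).
equal = maqc_samples(500.0, 500.0)
print(equal, titration_ok(equal))
```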
Slide 32 of 38
Accuracy (n = 785)
No expression in A or B → C/D = 3 or 1/3 → dCq = 1.58
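The expected value follows from the mixing ratios, assuming 100% amplification efficiency ($E = 2$); the slide's 1.58 is $\log_2 3$ to two decimals.

```latex
\begin{align*}
  B = 0:&\quad \frac{C}{D} = \frac{\tfrac{3}{4}A}{\tfrac{1}{4}A} = 3
  \qquad\qquad
  A = 0:\quad \frac{C}{D} = \frac{\tfrac{1}{4}B}{\tfrac{3}{4}B} = \frac{1}{3} \\
  |\Delta C_q| &= \log_{E} 3 = \log_{2} 3 \approx 1.585
\end{align*}
```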
Slide 33 of 38
Dynamic range
[Figure: gene count (500 to 4000) versus copies per cell, human and mouse]
> 10 000 000 fold
Slide 34 of 38
qPCR detects more brain genes than micro-arrays (n = 12 853)
[Venn diagram: array versus qPCR; region counts 150, 1 972 and 7 334]
Slide 35 of 38
SEQC
Comparison of our qPCR data to SEQC results
correlation for 10 784 common genes
Slide 36 of 38
qPCR vs sequencing depth
qPCR
- sum of detected copies of measured protein coding genes: ~500 million
NGS
- high throughput gene expression profiling: 125 indexed samples (two concurrent HiSeq flow cells at about 25 million reads per sample)
- high resolution transcriptome analysis: more than 50 samples at 100 million reads per sample
Slide 37 of 38
Human vs mouse
[Figure: histogram of gene count (500 to 3500) by dCq (human-mouse), from -15 to 15]
Slide 38 of 38
Conclusions
Assay design and in-silico validation
- qPCR assays for protein coding genes in human and mouse
- Transcript coverage
- SNPs and secondary structures
- Specificity prediction
Lab validation
- E in 90-110% range
- Stringent specificity analysis by NGS
qPCR based transcriptome profiling
- High sensitivity and dynamic range
- MAQC screening allows for cross-platform comparison