Different approaches on normalisation of gene expression RT-qPCR data
Jan Hellemans, PhD, Ghent University
co-founder and CEO, Biogazelle
Lo studio dell'espressione genica in real-time PCR
September 5, 2008, Siena, Italy
acknowledgement
qPCR workflow
Experiment design -> Sample prep -> Assay design -> qPCR reactions -> Cq values -> Data processing -> Statistical analysis & interpretations
qPCR workflow – data processing
absolute vs relative quantification
how many copies or molecules (molarity)
standard (dilution series)
quantification relative to the standard
the accuracy of external standard quantification is entirely dependent
log linear relationship between input and Ct + reproducibility
precise and reproducible answer, but not necessarily an accurate answer
exception: digital PCR
Vandenbroucke et al., Nucleic Acids Research, 2001
relative quantification – Cq values to relative quantities
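$RQ_{1/3} = 2^{4} = 16$
$RQ_{2/3} = 2^{2} = 4$
$RQ_{3/3} = 2^{0} = 1$
[figure: amplification curves of samples 1, 2 and 3; x-axis PCR cycle, y-axis fluorescence; Cq1, Cq2 and Cq3 read at the threshold]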
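A minimal Python sketch of this conversion, assuming ΔCq is taken as the Cq of the reference sample minus the Cq of each sample. The Cq values (20, 22, 24) are invented so that the differences against sample 3 are 4, 2 and 0; the function name is hypothetical.

```python
def relative_quantities(cq_values, reference_cq, efficiency=2.0):
    """Convert Cq values to relative quantities: RQ = E ** (reference Cq - sample Cq).

    efficiency=2.0 corresponds to perfect doubling per cycle (100% efficiency).
    """
    return [efficiency ** (reference_cq - cq) for cq in cq_values]


# Illustrative Cq values only (not taken from the slides).
cq = [20.0, 22.0, 24.0]
rq = relative_quantities(cq, reference_cq=cq[2])
print(rq)  # [16.0, 4.0, 1.0]
```

Passing a measured efficiency (for example 1.95) instead of the default gives the efficiency-adjusted form of the same calculation.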
relative quantification – variation differences in RQ due to
relative quantification – variation
Cq -> RQ -> NRQ -> CNRQ
RQ -> NRQ: normalization
NRQ -> CNRQ: inter-run calibration
technical variation
relative quantification – Cq values to relative quantities
Cq -> RQ: $RQ = 2^{\Delta Cq}$ (100% efficiency) or $RQ = E^{\Delta Cq}$ (measured efficiency E)
relative quantification – amplification efficiencies
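increase number of dilution points (n)
increase range of dilution
$slope = \dfrac{\sum_{a=1}^{n} (Q_a - \bar{Q})(Cq_a - \overline{Cq})}{\sum_{a=1}^{n} (Q_a - \bar{Q})^2}$
$E = 10^{-1/slope}$
$SE(slope) = \dfrac{s_e}{s_x \sqrt{n-1}}$
$SE(E) = \dfrac{E \cdot \ln 10 \cdot SE(slope)}{slope^2}$
$s_e = \sqrt{\dfrac{\sum_{a=1}^{n} (Cq_{measured,a} - Cq_{predicted,a})^2}{n-2}}$
$s_x = \sqrt{\dfrac{\sum_{a=1}^{n} (Q_a - \bar{Q})^2}{n-1}}$
(Q: log10 of the input quantity of dilution point a)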
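The amplification efficiency and its standard error can be estimated from a dilution series by ordinary linear regression of Cq on log10 input quantity. The sketch below assumes E = 10^(-1/slope) and propagates the slope error to SE(E) = E · ln(10) · SE(slope) / slope². The five-point dilution series is invented for illustration and the function name is hypothetical.

```python
import math


def efficiency_from_dilution(log10_quantities, cq_values):
    """Return (slope, E, SE(slope), SE(E)) for a standard curve of Cq vs log10 quantity."""
    n = len(log10_quantities)
    if n < 3 or n != len(cq_values):
        raise ValueError("need at least 3 paired dilution points")

    q_mean = sum(log10_quantities) / n
    cq_mean = sum(cq_values) / n
    sxx = sum((q - q_mean) ** 2 for q in log10_quantities)
    sxy = sum((q - q_mean) * (c - cq_mean)
              for q, c in zip(log10_quantities, cq_values))

    slope = sxy / sxx
    intercept = cq_mean - slope * q_mean
    efficiency = 10 ** (-1.0 / slope)

    # Residual standard deviation (s_e) and standard deviation of x (s_x).
    residual_ss = sum((c - (intercept + slope * q)) ** 2
                      for q, c in zip(log10_quantities, cq_values))
    s_e = math.sqrt(residual_ss / (n - 2))
    s_x = math.sqrt(sxx / (n - 1))

    se_slope = s_e / (s_x * math.sqrt(n - 1))
    se_efficiency = efficiency * math.log(10) * se_slope / slope ** 2
    return slope, efficiency, se_slope, se_efficiency


# Hypothetical 10-fold dilution series (log10 quantities 4..0) with made-up Cq values.
log_q = [4.0, 3.0, 2.0, 1.0, 0.0]
cq = [15.1, 18.4, 21.8, 25.0, 28.5]
slope, eff, se_slope, se_eff = efficiency_from_dilution(log_q, cq)
print(f"slope={slope:.3f} E={eff:.3f} SE(E)={se_eff:.4f}")
```

Because SE(slope) shrinks with both √(n−1) and s_x, adding dilution points or widening the dilution range both reduce SE(E).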
relative quantification – amplification efficiencies
SE(E) over successive builds of the slide:
increase number of points: SE(E) = 0.032, 0.013, 0.008, 0.005, 0.003
increase range of points: SE(E) = 0.032, 0.018, 0.008, 0.004, 0.002
relative quantification – normalization
sample: size and type
RNA extraction: quality and quantity
RNA degradation
cDNA synthesis
normalization
relative quantification – normalization
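100% PCR efficiency, 1 reference gene: $NRQ = 2^{\Delta\Delta Ct}$
adjusted PCR efficiency, 1 reference gene: $NRQ = \dfrac{E_{goi}^{\Delta Ct,goi}}{E_{ref}^{\Delta Ct,ref}}$
adjusted PCR efficiency, multiple reference genes: $NRQ = \dfrac{E_{goi}^{\Delta Ct,goi}}{\sqrt[n]{\prod_{i=1}^{n} E_{ref_i}^{\Delta Ct,ref_i}}}$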
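The most general of the three normalization models (gene-specific efficiencies, several reference genes) can be sketched as follows. This is an illustration only: ΔCt is assumed to be the calibrator Ct minus the sample Ct, each gene is passed as a hypothetical (efficiency, ΔCt) pair, and the numbers are invented.

```python
import math


def normalized_relative_quantity(goi, references):
    """NRQ = E_goi ** dCt_goi / geometric mean of (E_ref_i ** dCt_ref_i).

    goi        -- (efficiency, delta_ct) of the gene of interest
    references -- list of (efficiency, delta_ct), one per reference gene
    """
    e_goi, dct_goi = goi
    rq_goi = e_goi ** dct_goi

    # Geometric mean computed in log space: exp(mean(dCt_i * ln(E_i))).
    log_sum = sum(dct * math.log(e) for e, dct in references)
    normalization_factor = math.exp(log_sum / len(references))
    return rq_goi / normalization_factor


# Invented example: gene of interest is 8-fold up, both references 2-fold up.
nrq = normalized_relative_quantity((2.0, 3.0), [(2.0, 1.0), (2.0, 1.0)])
print(round(nrq, 6))
```

With a single reference gene and both efficiencies set to 2, the same function reduces to 2 raised to the difference of the two ΔCt values.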
relative quantification – inter-run calibration
use sample maximization
use the same instrument, reagents and consumables
inter-run calibration
relative quantification – inter-run calibration
no increase in variation due to absence of inter-run variation
suitable for retrospective studies and controlled experiments
introduces (under-estimated) inter-run variation
applicable for prospective studies or large studies in which the number
inter-run variation can be controlled and corrected for using inter-run calibrators (IRC)
possible on two levels:
the more inter-run calibrators, the better
simple vs. complex inter-run calibration
specialised software is needed
inter-run calibrator: identical sample measured for the same gene in different runs
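One way to apply inter-run calibrators, sketched here under the assumption that each run's calibration factor is the geometric mean of the normalized relative quantities (NRQ) of its calibrators, and that every NRQ in the run is divided by that factor to give a CNRQ. Sample names and values are invented and the function names are hypothetical.

```python
import math


def geometric_mean(values):
    """Geometric mean computed in log space."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def calibrate_run(nrq_by_sample, calibrator_names):
    """Divide every NRQ in one run by the run's calibration factor."""
    factor = geometric_mean([nrq_by_sample[name] for name in calibrator_names])
    return {sample: nrq / factor for sample, nrq in nrq_by_sample.items()}


# Two runs of the same gene; IRC1 and IRC2 are measured in both runs.
run1 = {"IRC1": 1.0, "IRC2": 4.0, "S1": 6.0}
run2 = {"IRC1": 2.0, "IRC2": 8.0, "S2": 12.0}  # everything reads 2x higher

cal1 = calibrate_run(run1, ["IRC1", "IRC2"])
cal2 = calibrate_run(run2, ["IRC1", "IRC2"])
print(cal1["S1"], cal2["S2"])
```

After calibration the shared calibrators have the same value in both runs, so S1 and S2 become directly comparable.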
relative quantification - qBase
relative quantification – qBase & qBasePlus
Hellemans et al., Genome Biology, 2007 http://www.qbaseplus.com
relative quantification - qBasePlus
normalization
gene-specific (biological) variation (true fold changes)
non-specific (technical) variation
variation
normalization
normalization – reference genes
most popular
captures most variation
reference genes (might) vary in expression
until recently, non-validated reference genes were used (assuming stable expression)
as the most appropriate and universally applicable method
which genes?
how to do the calculations?
normalization – multiple reference genes
gene concept:
quantified errors related to the use of a single reference gene (> 3 fold in 25% of the cases; > 6 fold in 10% of the cases)
developed a robust algorithm for assessment of expression stability of candidate reference genes
proposed the geometric mean of at least 3 reference genes for accurate and reliable normalisation
Vandesompele et al., Genome Biology, 2002
normalization – multiple reference genes
functional and abundance classes) on 85 samples from 13 different human tissues
[figure: expression of ACTB, HMBS, HPRT1, TBP and UBC in samples A to G]
15-fold difference between A and B if normalized by only one gene (ACTB or HMBS)
normalization – geNorm
average pairwise variation V of a gene with all other genes
            gene A   gene B
sample 1    a1       b1       log2(a1/b1)
sample 2    a2       b2       log2(a2/b2)
sample 3    a3       b3       log2(a3/b3)
…           …        …        …
sample n    an       bn       log2(an/bn)
standard deviation = V
http://medgen.ugent.be/genorm
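The pairwise variation can be sketched in Python as follows: for every pair of genes, take the standard deviation of the log2 expression ratios across samples (V), then average a gene's V values against all other genes to get its stability value. This is a simplified illustration with invented quantities, not the geNorm program itself; the sample standard deviation is an assumption.

```python
import math
from statistics import mean, stdev


def stability_values(expression):
    """expression: dict gene -> list of relative quantities (same sample order).

    Returns dict gene -> average pairwise variation with all other genes.
    """
    genes = list(expression)
    result = {}
    for gene in genes:
        pairwise = []
        for other in genes:
            if other == gene:
                continue
            log_ratios = [math.log2(a / b)
                          for a, b in zip(expression[gene], expression[other])]
            pairwise.append(stdev(log_ratios))  # V for this gene pair
        result[gene] = mean(pairwise)
    return result


# Invented data: A and B keep a constant ratio, C does not.
data = {"A": [1.0, 2.0, 4.0], "B": [2.0, 4.0, 8.0], "C": [1.0, 4.0, 2.0]}
m = stability_values(data)
print(m)
```

Genes whose ratio to the others stays constant across samples get the lowest values, so here C ranks as the least stable candidate.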
normalization – geNorm
ranking of candidate reference genes according to their stability
determination of how many genes are required for reliable normalization
[figure: geNorm plot, axis 0.2 to 0.9; genes HMBS, B2M, RPL13A, SDHA, TBP, ACTB, UBC, YWHAZ, GAPD, HPRT1]
normalization – geNorm
genes
geometric mean = $(a \times b \times c)^{1/3}$
arithmetic mean = $\dfrac{a + b + c}{3}$
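A short Python comparison of the two means, using invented reference gene quantities. It illustrates why a geometric mean is less affected by a single outlying or highly abundant gene; function names are hypothetical.

```python
import math


def geometric_mean(values):
    """n-th root of the product of n values."""
    return math.prod(values) ** (1.0 / len(values))


def arithmetic_mean(values):
    return sum(values) / len(values)


balanced = [1.0, 2.0, 4.0]     # invented quantities of three reference genes
with_outlier = [1.0, 1.0, 100.0]

for quantities in (balanced, with_outlier):
    print(quantities,
          "geometric:", round(geometric_mean(quantities), 2),
          "arithmetic:", round(arithmetic_mean(quantities), 2))
```

For the second set the arithmetic mean is dominated by the single large value, while the geometric mean stays closer to the bulk of the genes.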
normalization – geNorm
[figure: normalization factor (NF) alongside ACTB, HMBS, HPRT1, TBP and UBC]
normalization – geNorm
[figure: 3 good references, 3 mediocre references, 3 bad references; NF4 vs. NF1; values 0.003, 0.006, 0.021, 0.023, 0.056]
statistically more significant results
normalization – geNorm
log rank statistics
Hoebeeck et al., Int J Cancer, 2006
accurate assessment of small expression differences
normalization – geNorm
Hellemans et al., Nature Genetics, 2004
patient / control
3 independent experiments
95% confidence intervals
present mathematical (linear mixed-effects) models to further analyze candidate reference genes
$\log y_{ij} = \mu + T_i + G_j + \varepsilon_{ij}$
Global Pattern Recognition (Akilesh et al., Genome Research, 2003)
BestKeeper (Pfaffl et al., Biotechnology Letters, 2004)
Equivalence test (Haller et al., Analytical Biochemistry, 2004)
ANOVA test (Brunner et al., BMC Plant Biology, 2004)
Normfinder (Andersen et al., Cancer Research, 2004)
Szabo et al., Genome Biology, 2004
Abruzzo et al., Biotechniques, 2005
normalization – geNorm
Reference gene validation software for improved normalization book chapter in “Real-time PCR: an essential guide”, Horizon Bioscience, 2nd edition (2009)
normalization – effect of sample quality
(Perez-Novo et al., Biotechniques, 2005)
normalization – quality control
selection of the best set of reference genes
quality control of reference gene stability
tissue type     gene     CV (%)   M       mean CV%   mean M
neuroblastoma   UBC      31.84    0.740   30.89      0.703
                SDHA     27.40    0.660
                HPRT1    37.11    0.736
                GAPDH    27.21    0.675
fibroblast      YWHAZ    18.19    0.408   14.81      0.365
                HPRT1     8.84    0.308
                GAPDH    17.40    0.378
leukocyte       B2M      15.76    0.400   15.81      0.394
                UBC      15.79    0.389
                YWHAZ    15.89    0.393
bone marrow     YWHAZ    17.77    0.383   15.47      0.372
                UBC      13.60    0.356
                RPL13A   15.03    0.376
normal pool     TBP      47.51    1.099   43.73      0.925
                HPRT1    46.99    0.988
                HMBS     31.16    0.849
                SDHA     49.50    0.869
                GAPDH    43.50    0.819
Hellemans et al., Genome Biology, 2007
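The per-tissue summary columns of the quality-control table are plain averages of the per-gene values. The sketch below recomputes them from the CV (%) and M values on the slide; the data structure and function name are hypothetical.

```python
from statistics import mean

# (CV %, M) per reference gene, copied from the quality-control table.
qc = {
    "neuroblastoma": {"UBC": (31.84, 0.740), "SDHA": (27.40, 0.660),
                      "HPRT1": (37.11, 0.736), "GAPDH": (27.21, 0.675)},
    "fibroblast": {"YWHAZ": (18.19, 0.408), "HPRT1": (8.84, 0.308),
                   "GAPDH": (17.40, 0.378)},
    "leukocyte": {"B2M": (15.76, 0.400), "UBC": (15.79, 0.389),
                  "YWHAZ": (15.89, 0.393)},
    "bone marrow": {"YWHAZ": (17.77, 0.383), "UBC": (13.60, 0.356),
                    "RPL13A": (15.03, 0.376)},
    "normal pool": {"TBP": (47.51, 1.099), "HPRT1": (46.99, 0.988),
                    "HMBS": (31.16, 0.849), "SDHA": (49.50, 0.869),
                    "GAPDH": (43.50, 0.819)},
}


def summarize(table):
    """Return dict tissue -> (mean CV %, mean M) over its reference genes."""
    summary = {}
    for tissue, genes in table.items():
        cvs = [cv for cv, _ in genes.values()]
        ms = [m for _, m in genes.values()]
        summary[tissue] = (mean(cvs), mean(ms))
    return summary


summary = summarize(qc)
for tissue, (mean_cv, mean_m) in summary.items():
    print(f"{tissue}: mean CV = {mean_cv:.2f}%, mean M = {mean_m:.3f}")
```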
reference gene validation requires (extensive) experimental work
sometimes not possible (lack of sample material, funding, time or devotion)
EAR normalization (Expressed Alu Repeat)
“using a repetitive sequence in the human transcriptome as a measure for the mRNA fraction”
normalization – EARs
the differential expression of a small number of genes won't influence the
normalization – EARs
by far the most abundant repeats in the human genome
1 million copies (10% of the genome), 31 subfamilies (well conserved)
short interspersed elements (SINE) replicating via retrotransposition
~280 bp long, followed by a variable poly-A tail
no known biological function
implicated in human disease (unequal recombination)
normalization – EARs
UCSC genome browser table function
human genes -> 'expressed Alu repeats'
MySQL
PHP script 'Alu FASTA generator'
wEMBOSS
clustalW alignment
normalization – EARs
ADAMTS4 (1q23.3) ADCY6 (12q13.12) AluSq AluSg
normalization – EARs
64, 16, 4 and 1 ng QPCR Reference Total RNA (Stratagene)
Pearson's correlation 0.943 (p=0.0014)
normalization – EARs
[figure: AluSx vs. NF3 in CHP-134, CLB-GA, IMR-32, QPCR Ref Total RNA, SK-N-AS, SK-N-FI and SK-N-SH; axis 0.5 to 4]
normalization – EARs
[figure: MYCN normalised by AluSq vs. MYCN normalised by NF3 in CHP-134, CLB-GA, IMR-32, NGP, QPCR Ref Total RNA, SK-N-AS, SK-N-FI and SK-N-SH; logarithmic axis 1 to 100000]
normalization – EARs
gene expression analysis (cDNA) (EAR normalization)
gene copy number quantification (DNA)
normalization – EARs
RDML: exchanging & publishing of qPCR data & results
RDML: http://www.RDML.org
Conclusions
avoid minimize correct for
RTprimerDB (assay design & database - http://medgen.ugent.be/rtprimerdb)
geNorm (reference gene validation - http://medgen.ugent.be/genorm)
qBasePlus (relative quantification & quality controls - http://www.qbaseplus.com)