Digital PCR for copy number analysis Jo Vandesompele, PhD - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Digital PCR for copy number analysis

Jo Vandesompele, PhD Biogazelle CSO, UGent professor EMBL Advanced Course Digital PCR, Heidelberg, Germany October 22, 2015

slide-2
SLIDE 2

Acknowledgements (A-Z)

Lieven Clement, Els Goetghebeur, Bart Jacobs, Peter Pipelers, Olivier Thas, Matthijs Vynck Steve Lefever, Björn Menten, Katrien Vanderheyden, Kimberly Verniers, Nurten Yigit Ariane De Ganck, Nele Nijs Xavier Alba, Jen Berman, Frank Bizouarn, Viresh Pattel, Svilen Tzonev

slide-3
SLIDE 3

Agenda

  • introduction
  • experiment design
  • power analysis
  • sensitivity vs. inhibition vs. availability of input
  • CNV use cases
  • advanced data-analysis
  • droplet classification
  • combining replicates & multigene normalization
  • tips & tricks
slide-4
SLIDE 4

Full text papers available on Biogazelle website

http://www.biogazelle.com > Knowledge center > publications

slide-5
SLIDE 5

Biogazelle blog on dPCR vs. qPCR

http://www.biogazelle.com/knowledge-center/blog

slide-6
SLIDE 6

Digital PCR is emerging as gold standard method for CNV

  • Biogazelle is a reference lab for Bio-Rad’s QX100/200 droplet digital PCR technology
  • Scalable precision and relative sensitivity (needle in the haystack) (“more is better”)

  • High accuracy (without calibration)
  • Excels in quantification of small differences and rare events
slide-7
SLIDE 7

Application domains

  • in principle any nucleic acid quantification study (cost/throughput)
  • focus on those areas where dPCR excels
  • small differences
  • CNV analysis (high copy number range, transgene stability testing, cell-free DNA (NIPT, oncogene amplification))

  • gene expression (microRNA, splice variants)
  • rare events
  • pathogens (e.g. viral load in body fluid such as urine)
  • mutant cancer cells (tissue, circulating cells or cell-free DNA)
  • circulating RNA biomarker (cell-free RNA)
slide-8
SLIDE 8

dMIQE guidelines for digital PCR

  • Clinical Chemistry, 2013
  • co-authored by Biogazelle founders
slide-9
SLIDE 9

dMIQE guidelines have 3 goals

1. Design, perform, and report dPCR experiments that have greater scientific integrity
2. Facilitate replication of published experiments adhering to the guidelines
3. Provide critical information that allows reviewers and editors to assess the technical quality of manuscripts

slide-10
SLIDE 10

Power analysis is a crucial aspect of experiment design

  • Ensure proper setup to find a true difference with statistical significance
  • Often ignored
  • Limitations of dPCR power analysis in literature
  • no or few details on the methods
  • no incorporation of replicate variability (instead, reactions are (naively) pooled over replicates)
  • not taking into account all variables (e.g. replicates, fraction of negative droplets, …)
  • use of meta-analysis methods (instead of ad hoc statistical method)

slide-11
SLIDE 11

Digital PCR power analysis is a function of

  • true difference you want to see
  • number of partitions
  • fraction of negative partitions
  • number of replicates
  • alpha value (type I error, false positive rate, 5%)
  • 97% power to detect a 10% difference in copy number using 3 replicate reactions of 14,000 partitions each, with 30% negative partitions

  • 53% for a 5% difference
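
How these inputs interact can be illustrated with a rough normal approximation. This is a minimal sketch and not the method behind the interactive tool by Vynck et al.: it assumes Poisson-distributed copies per partition, a two-sided z-test on the log ratio of two concentrations, and a hypothetical between-replicate coefficient of variation (`replicate_cv`) that is not given on the slides. Its output will generally differ from the percentages quoted above.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def dpcr_power(difference, partitions, negative_fraction, replicates,
               replicate_cv=0.0, alpha=0.05):
    """Approximate power to detect a relative difference in concentration.

    replicate_cv is a hypothetical between-replicate CV (assumption);
    0.0 corresponds to naively pooling partitions over replicates.
    """
    # Mean copies per partition implied by the fraction of negative partitions
    lam_a = -log(negative_fraction)
    lam_b = lam_a * (1.0 + difference)

    def var_log_mean(lam):
        # Delta-method variance of log(lambda-hat) for one replicate,
        # plus replicate-to-replicate variance, averaged over replicates
        p_neg = exp(-lam)
        sampling = (1.0 - p_neg) / (partitions * p_neg * lam ** 2)
        return (sampling + replicate_cv ** 2) / replicates

    se = sqrt(var_log_mean(lam_a) + var_log_mean(lam_b))
    shift = log(1.0 + difference) / se
    std = NormalDist()
    z_crit = std.inv_cdf(1.0 - alpha / 2.0)
    # Two-sided rejection probability under the alternative
    return std.cdf(shift - z_crit) + std.cdf(-shift - z_crit)


power_10 = dpcr_power(0.10, 14000, 0.30, 3, replicate_cv=0.03)
power_05 = dpcr_power(0.05, 14000, 0.30, 3, replicate_cv=0.03)
print(f"10% difference: {power_10:.2f}, 5% difference: {power_05:.2f}")
```

Setting `replicate_cv=0.0` reproduces the naive pooling criticized on the previous slide and gives a more optimistic power estimate.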
slide-12
SLIDE 12

Interactive tool to determine power in digital PCR experiments

  • power for a given condition
  • power ~ number of replicates ~ fraction of negative partitions ~ number of partitions ~ copy number difference
  • optimal negative fraction (for max power) ~ copy number difference

  • Vynck et al., in preparation

http://vandesompelelab.ugent.be/power/

slide-13
SLIDE 13

Power in function of fraction of negative partitions

http://vandesompelelab.ugent.be/power/

  • difference of 10%
  • 14,000 partitions
  • 3 replicates
slide-14
SLIDE 14

Power in function of number of replicates

http://vandesompelelab.ugent.be/power/

  • difference of 10%
  • 14,000 partitions
  • 95% negatives
slide-15
SLIDE 15

Power in function of number of partitions

http://vandesompelelab.ugent.be/power/

  • difference of 15%
  • 1 replicate
  • 30% negatives
slide-16
SLIDE 16

What is determining the sensitivity of dPCR?

  • Both qPCR and dPCR can detect 1 molecule (precision is higher for dPCR at low concentrations)
  • Input amount of nucleic acids
  • more cDNA to detect a low abundant transcript (e.g. long non-coding RNA)
  • more circulating cell-free DNA to detect a low-frequency mutation

intended sensitivity    ng of DNA needed
10.000%                 0.229
1.000%                  2.286
0.100%                  22.857
0.010%                  228.571
0.001%                  2285.714

assuming at least 5 positive droplets are needed for confident calling, a perfectly discriminating assay between wild type and mutant, 14,000 recovered droplets from 20,000 formed
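
The table values can be reproduced from the stated assumptions. A minimal sketch: the 3.2 pg per genome copy is not stated on this slide but inferred from the dilution series on the next slide (320 ng for 100 000 copies), and the function and parameter names are illustrative.

```python
# Assumption: 3.2 pg of DNA per target copy (320 ng / 100 000 copies)
PG_PER_COPY = 3.2


def ng_dna_needed(sensitivity, min_positive=5, recovered=14000, formed=20000):
    """ng of DNA to load so that min_positive mutant copies end up in recovered droplets."""
    copies_analysed = min_positive / sensitivity       # total copies that must be read
    copies_loaded = copies_analysed * formed / recovered  # correct for lost droplets
    return copies_loaded * PG_PER_COPY / 1000.0        # pg -> ng


for s in (0.10, 0.01, 0.001, 0.0001, 0.00001):
    print(f"{s:.3%}  {ng_dna_needed(s):.3f} ng")
```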

slide-17
SLIDE 17

Large dynamic range, high precision and accuracy

  • Correlation between expected and measured concentrations
  • on a gDNA dilution series (ranging from 100 000 copies/reaction to 5 copies/reaction) (320 ng – 16 pg DNA)

[Figure: log10 (measured concentration, copies/ddPCR reaction) versus log10 (expected concentration, copies/ddPCR reaction); y = 0.9781x + 0.0695, R² = 0.99877]

slide-18
SLIDE 18

Unpurified digested genomic DNA inhibits ddPCR if > 30 v/v%

[Figure: log10 (measured concentration, copies/reaction) versus log10 (gDNA concentration, v/v%); y = 1.143x + 3.224, R² = 0.990; data point labels: 5, 7.5, 10, 15, 20, 25, 30]

slide-19
SLIDE 19

cDNA inhibits ddPCR if > 25 v/v%

  • Influence of cDNA input amounts (ranging from 5 to 45 v/v%) on measured concentration

[Figure: log10 (measured concentration, copies/reaction) versus log10 (cDNA concentration, v/v%); y = 0.921x + 3.306, R² = 0.999; data point labels: 5, 10, 15, 20, 25]

slide-20
SLIDE 20

Case 1 – genetic characterization of cell banks

  • Therapeutic protein production in biopharmaceutical industry
  • Transgene copy number has influence on expression level
  • Need for a cell line that is genetically stable throughout the biopharmaceutical manufacturing process
  • Genetic characterization of Master Cell Bank (MCB) and Working Cell Bank (WCB)
  • Traditionally by Southern blot analysis - laborious and time consuming
  • > qPCR method for transgene copy number determination

slide-21
SLIDE 21

Case 1 – struggling with qPCR

  • Transgene copy number analysis
  • Limited accuracy at higher copy numbers
  • Compensated by including more PCR replicates and calibrators (D’haene et al., Methods, 2010)
  • Pilot study: synthetic CN series (1-10 copies) measured with 16 qPCR replicates
  • Resampling to investigate impact of increased number of replicates & calibrator samples
  • Conclusion
  • 8 qPCR replicates and 3 calibrator samples are required for CN analysis at increased copy numbers
  • Still relatively large deviation from expected copy number in proof of concept study

slide-22
SLIDE 22

Case 1 – proof of concept 1

  • Copy numbers from duplex assay – gene 1 (performed in triplicate)
  • Observed normalized copy numbers tightly agree with expected integer copies

[Figure: copy number per sample]

sample:       S1  S2  S3  S4  S5  S6  S7  S8
expected CN:  0   0   1   2   3   4   5   5

slide-23
SLIDE 23

Case 1 – proof of concept 2

  • Copy numbers from duplex assay – gene 2 (performed in triplicate)
  • deviation from expected integer copies for samples 3 and 4

[Figure: copy number per sample]

sample:       S1  S2  S3  S4  S5  S6  S7
expected CN:  1   1   4   4   3   0   1

slide-24
SLIDE 24

Case 1 – getting integer copy numbers with ddPCR

  • Copy numbers from duplex assay – gene 2 (XbaI restriction digest)
  • Restriction digest is required to properly count linked loci (here: tandem repeats)

[Figure: copy number per sample after restriction digest]

sample:       S1  S2  S3  S4  S5  S6  S7
expected CN:  1   1   4   4   3   0   1

slide-25
SLIDE 25

Case 1 - ddPCR versus qPCR

  • ddPCR has higher accuracy than qPCR
  • 3.1 x lower standard deviation on log2 copy numbers
  • 2.3 x smaller fold changes between max and min copy number
  • Fewer reactions required for ddPCR than for qPCR
  • ddPCR requires no external standard or calibrator sample with known copy number

[Figure: ddPCR versus qPCR copy numbers]

slide-26
SLIDE 26

Case 1 – ddPCR based genetic characterization of cell banks

  • Copy number
  • 24 samples – WCB
  • Duplex assay – gene 1
  • Expected CN: 5
  • Deviation from expected CN
  • Average: 0.11
  • Standard deviation: 0.078

[Figure: copy number and deviation from expected copy number for samples 01_WCB to 24_WCB]

slide-27
SLIDE 27

Case 1 – ddPCR based genetic characterization of cell banks

  • ddPCR is very well suited for transgene copy number determination
  • Genetic characterization of cell banks for therapeutic protein production
  • Transgene copy number analysis in genetically modified (GM) crop research
  • Transgenic animal models
  • Remark: qPCR is the standard approach in biopharmaceutical industry – will take some time to adopt ddPCR

slide-28
SLIDE 28

Case 2 – clinical genetics application

  • Detection of chromosomal aneuploidies
  • Proof of concept on post-natal samples
  • Future: non-invasive prenatal testing (NIPT)
  • Challenge to achieve accuracy and precision required to quantify fetal copy numbers in prenatal samples based on low level fetal cfDNA in maternal blood (median amount of 10%)

slide-29
SLIDE 29

Case 2 – assay design and validation

  • Design of assays for a number of loci on chromosomes for which copy number variations are most often found
  • Chromosome 21 (e.g. trisomy 21 or Down syndrome)
  • Chromosome 13 (e.g. trisomy 13 or Patau syndrome)
  • Chromosome 18 (e.g. trisomy 18 or Edwards syndrome)
  • Chromosome X & Y (e.g. Turner syndrome)
  • Empirical validation using qPCR
  • Standard curve (dilution series) → efficiency QC
  • Gel electrophoresis → specificity QC
slide-30
SLIDE 30

Case 2 – assay design and validation

  • Design of assays for a number of loci on chromosomes for which copy number variations are most often found
  • Chromosome 21 (e.g. trisomy 21 or Down syndrome)
  • Chromosome 13 (e.g. trisomy 13 or Patau syndrome)
  • Chromosome 18 (e.g. trisomy 18 or Edwards syndrome)
  • Chromosome X & Y (e.g. Turner syndrome)
  • ddPCR
  • Chromosome specific assays (hydrolysis probe - FAM)
  • Reference assay (RPP30 – VIC)
  • Gradient PCR → standard protocol is suitable
  • gDNA dilution series
  • CNV duplex – 3 replicates
slide-31
SLIDE 31

Case 2 – copy numbers of control samples

[Figure: copy numbers (axis 0.5 to 2.5) per assay (A-13q, B-13q, A-18p, A-18q, B-18q, A-21q, B-21q, A-Xp, A-Xq, B-Xq, A-Yp, B-Yp) in four panels: Control 1, Control 2, Control 3, Control 4; panel labels in order: female, male, female, male]

slide-32
SLIDE 32

Case 2 – copy numbers of cases

[Figure: copy numbers (axis 0.5 to 3.5) per assay (A-13q, B-13q, A-18p, A-18q, B-18q, A-21q, B-21q, A-Xp, A-Xq, B-Xq, A-Yp, B-Yp, C-21q) in three panels: Case 5, Case 9, Case 18; panel labels in order: female Turner, trisomy 21 male, trisomy 18 male]

slide-33
SLIDE 33

Case 2 – proof of concept on post-natal samples

  • ddPCR is great for copy number analysis in majority of samples
  • Non-integer copy numbers may be observed in difficult samples
  • Accuracy and precision need improvements to allow for NIPT
  • ultrashort amplicons
  • improved cell-free DNA isolation method (300-1000 alleles from 2 ml of plasma)

  • multigene normalization (also for gene expression!)
slide-34
SLIDE 34

Case 2 – optimization experiment design

  • Standard CNV protocol – duplex normalization
  • Triplicate ddPCR reactions
  • 14 duplex reactions
  • Each reaction contains one locus of interest (FAM) to be normalized with reference locus (VIC)
  • Normalization against reference locus copy number in the same reaction
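
As a minimal sketch of duplex normalization, the copy number can be computed from the negative droplet counts of the two channels. It assumes Poisson-distributed copies per droplet and a reference locus present at 2 copies per diploid genome; the function names are illustrative and not taken from any instrument software.

```python
import math


def copies_per_droplet(negative_droplets, total_droplets):
    """Poisson estimate of mean copies per droplet from the negative fraction."""
    return -math.log(negative_droplets / total_droplets)


def duplex_copy_number(target_negative, reference_negative, total_droplets,
                       reference_copies=2):
    """Target copy number normalized against the reference locus in the same reaction."""
    target = copies_per_droplet(target_negative, total_droplets)
    reference = copies_per_droplet(reference_negative, total_droplets)
    return reference_copies * target / reference


# 14,000 droplets: 3,500 negative for the target (FAM), 7,000 negative for the reference (VIC)
print(duplex_copy_number(3500, 7000, 14000))
```

Because both assays are read in the same droplets, the droplet volume and the number of recovered droplets cancel out of the ratio.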

slide-35
SLIDE 35

Case 2 – optimization experiment design

  • Improved CNV protocol – multigene normalization
  • Triplicate ddPCR reactions
  • 7 duplex reactions
  • Each reaction contains a FAM labeled assay and a HEX labeled assay (→ HEX as alternative to VIC (Zen / Iowa Black double quencher probes from IDT))
  • No a priori selection of reference gene locus
  • Normalization against all other autosomal chromosomes with normal diploid copy number
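
A minimal sketch of multigene normalization, assuming the normalization factor is the geometric mean of the reference loci (the averaging used by geNorm for reference genes) and that the chosen reference loci are diploid. The concentrations below are made-up illustration values, not data from the study.

```python
import math


def multigene_copy_number(target_cpd, reference_cpds, ploidy=2):
    """Copy number of a target locus normalized against several reference loci.

    target_cpd and reference_cpds are copies per droplet; the references are
    assumed to have a normal diploid copy number.
    """
    log_mean = sum(math.log(c) for c in reference_cpds) / len(reference_cpds)
    geometric_mean = math.exp(log_mean)
    return ploidy * target_cpd / geometric_mean


# Hypothetical values: chromosome 21 locus versus loci on chromosomes 13 and 18
print(multigene_copy_number(0.75, [0.50, 0.50]))
```

A locus with an aberrant copy number should be left out of the reference set, since it would otherwise bias the normalization factor.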

slide-36
SLIDE 36

geNorm - multigene normalization

  • geNorm – cited more than 8000 times

Vandesompele et al., Genome Biology, 2002

slide-37
SLIDE 37

Case 2 – multigene normalization

  • Average deviation from integer copy numbers between different normalization strategies

[Figure: deviation from integer CN (axis 0.000 to 0.080) for Case 5, Case 6, Case 9, Case 16, Case 19, Case 20, Control 1, Control 2, Control 3 and Control 4; multigene normalization versus RPP30 normalization]

           multigene normalization    RPP30 normalization
Average    0.015                      0.037
SD         0.008                      0.025

slide-38
SLIDE 38

Case 2 – optimization experiment design

  • Results show that normalization using other autosomes improves accuracy of copy numbers
  • Normalization based on absolute autosomal counts reduces running cost by 50%

slide-39
SLIDE 39

Advanced digital PCR data-analysis

  • Vynck et al., submitted
  • GLMM framework (R and Shiny web app)
  • handles replicate wells
  • multiple reference gene normalization
  • automatic selection and application of stable reference genes

slide-40
SLIDE 40

Results from oncogene detection in cell-free DNA from plasma

20 samples, 3 replicates each, ~ 14,000 droplets, negative fraction 80-90%, 95% CI

[Figure: results per sample with 95% CI; axis 0 to 5]

slide-41
SLIDE 41
Comparison of plasma cfDNA and fresh frozen tumor DNA

  • in 8/10, there was a perfect agreement on oncogene amplification status
  • in 2/10, there is no agreement
  • fresh frozen is only marginally elevated (tumor heterogeneity)
  • tumor DNA: 2.068 (95% CI 2.017-2.121) > elevated
  • cfDNA: 2.009 (95% CI 1.933-2.089) > normal
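
The two calls above are consistent with a simple rule: a sample is called elevated when its 95% confidence interval lies entirely above the normal copy number of 2. A minimal sketch of that rule follows; the rule itself is inferred from the two examples, not stated on the slide.

```python
def amplification_status(ci_low, ci_high, normal_copies=2.0):
    """Call amplification status from a confidence interval on the copy number."""
    if ci_low > normal_copies:
        return "elevated"
    if ci_high < normal_copies:
        return "reduced"
    return "normal"  # interval contains the normal copy number


print(amplification_status(2.017, 2.121))  # tumor DNA
print(amplification_status(1.933, 2.089))  # cfDNA
```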

slide-42
SLIDE 42

More narrow CI with proper statistical processing of replicates

[Figure: two panels, meta-analysis and GLMM; axis on a doubling scale from 0.25% to 1024%]

slide-43
SLIDE 43

More narrow CI with GLMM statistical processing of replicates

[Figure: oncogene:reference gene ratio (axis 1, 2, 4; 3:4 copies marked) for (tumor) cfDNA without evidence of oncogene amplification and cfDNA with signs of oncogene amplification (p<0.05); two panels: meta-analysis and GLMM]

slide-44
SLIDE 44
Partition misclassification has largest impact on accuracy and precision

  • Jacobs et al., BMC Bioinformatics, 2014

slide-45
SLIDE 45

Interactive tool to inspect sources of variance on absolute quantification

http://users.ugent.be/~bkjacobs/dPCR_VarComp/index.html

slide-46
SLIDE 46
Development of a framework for objective partition classification

  • Stochastic clustering approach that matches the intuition
  • Using the raw data from the QX100
  • Multistep approach
  • cluster center location (expectation maximization)
  • remove the rotation
  • univariate projection on each channel
  • robustly fit a normal null distribution on the negative peak
  • calculate the posterior probability to be negative with respect to the channel for each droplet
  • combine both channels
slide-47
SLIDE 47

Find cluster centers and remove rotation

slide-48
SLIDE 48

Fit the null distribution and calculate posterior probability of the negatives

[Figure: two panels, no rain and rain]

  • red = fitted distribution of the negatives
  • black = entire distribution
  • probability negative droplet = red/black in the projected point of the droplet
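
The red/black ratio can be sketched for a single channel as follows. This is a simplified illustration and not the authors' implementation: the negative peak is located with a crude midpoint split instead of expectation maximization, the null is fitted with median and MAD, the entire distribution is a Gaussian kernel density estimate with an arbitrarily chosen bandwidth, and the amplitudes are synthetic.

```python
from statistics import NormalDist, median


def fit_negative_null(amplitudes):
    """Robustly fit a normal null on the low-amplitude (negative) peak."""
    # Assumption: a midpoint split stands in for EM-based cluster location
    midpoint = (min(amplitudes) + max(amplitudes)) / 2
    low = [a for a in amplitudes if a < midpoint]
    center = median(low)
    mad = median([abs(a - center) for a in low])
    sigma = 1.4826 * mad  # MAD scaled to a normal standard deviation
    weight = len(low) / len(amplitudes)  # estimated fraction of negatives
    return NormalDist(center, sigma), weight


def posterior_negative(x, amplitudes, null, weight):
    """Posterior probability of being negative: fitted null density / entire density."""
    kernel = NormalDist(0.0, 0.2 * null.stdev)  # bandwidth is an arbitrary choice
    entire = sum(kernel.pdf(x - a) for a in amplitudes) / len(amplitudes)
    if entire == 0.0:
        return 0.0  # no droplets near x, so the ratio is undefined
    return min(1.0, weight * null.pdf(x) / entire)


# Synthetic amplitudes: 400 negatives around 1000, 200 positives around 5000
negatives = [NormalDist(1000, 50).inv_cdf((i + 0.5) / 400) for i in range(400)]
positives = [NormalDist(5000, 200).inv_cdf((i + 0.5) / 200) for i in range(200)]
droplets = negatives + positives

null, weight = fit_negative_null(droplets)
for amplitude in (1000, 5000):
    print(amplitude, posterior_negative(amplitude, droplets, null, weight))
```

In a two-channel analysis the same calculation would run on each channel after rotation removal, with the per-channel probabilities combined afterwards.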
slide-49
SLIDE 49

Combine channels and label clusters based on max probability

slide-50
SLIDE 50

Gene copy number quantification on digested high quality DNA
slide-51
SLIDE 51

Inhibition due to cDNA carryover

slide-52
SLIDE 52

Oncogene amplification in cfDNA

slide-53
SLIDE 53

Single channel data for low concentration target

slide-54
SLIDE 54

Single channel data for low concentration target

slide-55
SLIDE 55
Work in progress

  • better dealing with outlier droplets with lower than negative amplitude (deviating droplet volumes?)
  • use combined estimated distribution of no template reactions instead of theoretical normal distribution

slide-56
SLIDE 56

General conclusions (1)

  • ddPCR is a great tool for copy number analysis
  • no need for reference sample with known copy number
  • better accuracy and precision compared to qPCR
  • Points of attention
  • restriction digest is required to quantify linked loci (e.g. tandem repeats)
  • Remaining challenges
  • non-integer copy numbers for difficult samples
  • further improve accuracy and precision to meet NIPT requirements (for instance smaller amplicon size)

slide-57
SLIDE 57

General conclusions (2)

  • Power analysis is important (and easy)
  • interactive tool
  • Mathematical framework for combining replicates, selecting reference genes, and multigene normalization
  • latent variable, complementary log-log link, GLMM
  • Vynck et al., submitted
  • Statistical framework for automated (objective) droplet classification

  • Jacobs et al., work in progress
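
Why a complementary log-log link suits dPCR can be seen from the Poisson model for partitions. The short derivation below is a sketch under the usual assumption that copies are Poisson distributed over partitions with mean λ per partition; the symbols are introduced here for illustration.

```latex
% p: probability that a partition is positive
% \lambda: mean number of target copies per partition
p = 1 - e^{-\lambda}
\quad\Longrightarrow\quad
\log\bigl(-\log(1 - p)\bigr) = \log \lambda

% For a target and a reference assay measured in the same sample,
% a difference on the link scale is a log concentration ratio:
\log \lambda_{\text{target}} - \log \lambda_{\text{reference}}
  = \log \frac{\lambda_{\text{target}}}{\lambda_{\text{reference}}}
```

On this scale, effects of sample, assay and replicate are additive in log concentration, which is what allows replicate wells and multiple reference genes to enter one model.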
slide-58
SLIDE 58

Tips & tricks

slide-59
SLIDE 59

Template input (1)

  • ~1 copy per droplet (CPD) (highest precision is at 1.59)
  • range of 1-100 000 copies / 20 µl ddPCR reaction
  • 0.00005 - 5 CPD

[Figure: 95% confidence interval fraction (axis ticks at 0.05, 0.1 and 0.15) versus fraction positive droplets and copies per droplet; 1 well (20,000 droplets) versus 3 wells merged]

fraction positive droplets:  0.1   0.2   0.3   0.4   0.5   0.6   0.7   0.8   0.9
copies per droplet:          0.11  0.22  0.36  0.51  0.69  0.92  1.20  1.61  2.30
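
The relation between the two axes, and the precision optimum near 1.59 copies per droplet, follow from Poisson statistics. A minimal sketch, assuming Poisson-distributed copies per droplet and a delta-method approximation for the variance; the function names are illustrative.

```python
import math


def copies_per_droplet(fraction_positive):
    """Poisson correction: mean copies per droplet from the fraction of positive droplets."""
    return -math.log(1.0 - fraction_positive)


def relative_variance(cpd, n_droplets=20000):
    """Approximate squared relative error of the concentration estimate (delta method)."""
    p_negative = math.exp(-cpd)
    return (1.0 - p_negative) / (n_droplets * p_negative * cpd ** 2)


# Reproduce the copies-per-droplet axis of the figure
table = {f / 10: round(copies_per_droplet(f / 10), 2) for f in range(1, 10)}
print(table)

# Grid search for the copies per droplet with the smallest relative variance
grid = [i / 1000 for i in range(10, 5001)]
best_cpd = min(grid, key=relative_variance)
print(best_cpd)
```

The location of the optimum does not depend on the number of droplets; more droplets only narrow the confidence interval.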

slide-60
SLIDE 60

Template input (2)

  • maximum 25 v/v% unpurified digested gDNA or undiluted cDNA to prevent inhibition (test using your own reagents)
  • DNA digest is required for gene copy number analysis, especially for linked loci (not required for FFPE and cell-free DNA)

  • integrity of DNA/RNA is as important for dPCR as for qPCR
  • Vermeulen et al., Nucleic Acids Research, 2011
slide-61
SLIDE 61

ddPCR assay design guidelines

  • in house primerXL design pipeline
  • primer3 based
  • avoid SNPs (Lefever et al., Clinical Chemistry, 2013)
  • avoid secondary structures (UNAFold)
  • assess specificity (BiSearch / Bowtie)
  • target: FAM-IBFQ, reference HEX-IBFQ
  • amplicon length <70 nt if possible
  • primer Tm: 61-63 °C
  • probe Tm: 64-68 °C (65 opt)
  • probe length: 14-25 nt (18 opt)
  • HaeIII-compatible amplicons
slide-62
SLIDE 62

Separation of + and - droplets depends on amplicon & probe length

  • amplicons >100 bp, positive intensities drop
  • rise in negatives as probe length increases (> 25 nt)

slide-63
SLIDE 63

Gradient PCR allows selection of optimal annealing temperature

  • gradient from 55-65 °C
  • optimal Ta, specificity check
slide-64
SLIDE 64

Duplex test validation

  • same quantification result as in singleplex
  • orthogonal droplet clusters in 2D plot
  • orthogonality of duplex assay can be improved by
  • Tm matching between target and reference assay
  • Droplet PCR Supermix (#186-3023) (adding more resources)