Insights from the first RT-qPCR based human transcriptome profiling



Slide 1 of 38

Insights from the first RT-qPCR based human transcriptome profiling based on wet lab validated assays

Jan Hellemans, PhD
CEO, Biogazelle
qPCR & NGS 2013 – Freising, Germany
March 19, 2013

Slide 2 of 38

Biogazelle

Slide 3 of 38

The Biogazelle team and collaborators

Biogazelle

  • Barbara D’haene
  • Pieter Mestdagh
  • Gaëlle Van Severen
  • Nele Nijs
  • Anthony Van Driessche
  • Manuel Luypaert
  • Shana Robbrecht
  • Ariane Deganck
  • Jo Vandesompele

Ghent University

  • Steve Lefever

VIB Nucleomics Core

Bio-Rad

Slide 4 of 38

qPCR: reference technology for nucleic acid quantification

  • sensitivity and specificity
  • wide dynamic range
  • speed
  • relatively low cost
  • conceptual and practical simplicity

easy to perform ≠ easy to do it right

many steps involved; all need to be right

Introduction

Slide 5 of 38

Prepare – cycle – report

  • P (prepare): experiment design, samples, assays
  • C (cycle)
  • R (report): relative quantification, quality control, statistical analysis

Slide 7 of 38

Assays & MIQE

design

  • amplicon length
  • primer positions (exonic or intron-spanning)
  • transcript coverage

in-silico

  • specificity prediction (retropseudogenes and other homologues)
  • secondary structure analysis

wet lab

  • specificity assessment (gel, melt, sequence)
  • Cq of NTC (for SYBR assays)
  • amplification efficiency determination (slope, E, SE(E), r²)
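The efficiency parameters named above (slope, E, SE(E), r²) are commonly derived from a standard curve of Cq against log10 input copies. The sketch below is one minimal way to compute them, not the pipeline used for these assays. The function name `standard_curve` and the Cq values are hypothetical, and SE(E) uses first-order error propagation from the standard error of the slope.

```python
import math

def standard_curve(copies, cq):
    """Fit Cq = intercept + slope * log10(copies) by ordinary least squares."""
    x = [math.log10(c) for c in copies]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(cq) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in cq)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, cq))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, cq))
    r2 = 1 - sse / syy
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    efficiency = 10 ** (-1 / slope)  # E = 2 corresponds to perfect doubling
    # First-order error propagation: dE/dslope = E * ln(10) / slope**2
    se_efficiency = efficiency * math.log(10) * se_slope / slope ** 2
    return {"slope": slope, "E": efficiency, "SE(E)": se_efficiency, "r2": r2}

# Hypothetical, slightly noisy 7-point dilution series (made-up Cq values)
copies = [2 * 10 ** k for k in range(7, 0, -1)]
cq = [12.1, 15.4, 18.8, 22.0, 25.4, 28.7, 32.1]
fit = standard_curve(copies, cq)
print(fit)
```

An E close to 2 with r² close to 1 is what a well-behaved assay would be expected to show under these assumptions.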

Slide 8 of 38

Dealing with MIQE

DIY experts in qPCR

spend a lot of effort in doing it right

DIY novel to qPCR

adhering to the MIQE guidelines is a challenge

users of commercial assays

if they sell it, it must be good

Slide 9 of 38

Dealing with MIQE

DIY experts in qPCR

spend a lot of effort in doing it right

→ save time

DIY novel to qPCR

adhering to the MIQE guidelines is a challenge

→ focus on biological question rather than technical qPCR challenges

users of commercial assays

if they sell it, it must be good

→ have proof that it is good

Slide 10 of 38

The perfect assay

Properties of the perfect assay

  • specific for the gene of interest (no off-target amplification)
  • detection of all transcript variants
  • detection not affected by polymorphisms (no allelic bias or drop out)
  • amplification efficiency ~100%
  • no gDNA co-amplification
  • no primer dimer formation

Slide 11 of 38

The perfect assay

Some genes cannot have a perfect assay

  • no unique sequences (homology with other genes – pseudogenes)
  • not a single part of the gene occurs in all transcripts
  • regions are excluded because of repeats, secondary structures, SNPs, homology, ...

Make the best possible compromise and report any potential issues

Design → in-silico quality control → lab validation

Slide 12 of 38

Assay designs

primerXL (UGent)

  • database of genomic information
  • tools for target region selection

Slide 13 of 38

Gene sequence fragmentation for target region selection

1 gene, 3 transcripts, 6 fragments (coverage frequency 1 to 3)

[Figure: gene with transcript 1, transcript 2 and transcript 3; fragment coverage frequencies 2, 3, 2, 3, 2, 1]
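The fragmentation idea can be sketched as follows: cut the gene at every exon boundary of every transcript, then count how many transcripts cover each resulting fragment. This is an illustrative sketch, not primerXL's implementation. The function name `fragment_gene` and the exon coordinates are hypothetical, chosen so that the result reproduces the coverage pattern on the slide (2, 3, 2, 3, 2, 1).

```python
def fragment_gene(transcripts):
    """Split a gene into fragments of constant transcript coverage.

    transcripts maps a transcript name to a list of half-open (start, end)
    exon intervals. Returns (start, end, coverage) for each covered fragment.
    """
    # Every exon start or end is a potential fragment boundary
    boundaries = sorted(
        {pos for exons in transcripts.values() for exon in exons for pos in exon}
    )
    fragments = []
    for start, end in zip(boundaries, boundaries[1:]):
        # A transcript covers the fragment if one of its exons contains it
        coverage = sum(
            any(s <= start and end <= e for s, e in exons)
            for exons in transcripts.values()
        )
        if coverage:
            fragments.append((start, end, coverage))
    return fragments

# Hypothetical coordinates for 1 gene with 3 transcripts
transcripts = {
    "transcript 1": [(0, 300), (400, 600), (700, 800)],
    "transcript 2": [(0, 200), (400, 600)],
    "transcript 3": [(100, 300), (400, 500)],
}
for start, end, coverage in fragment_gene(transcripts):
    print(start, end, coverage)
```

Fragments with the highest coverage frequency are the natural first candidates for target region selection, since an assay placed there detects the most transcript variants.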

Slide 14 of 38

Assay designs

primerXL (UGent)

  • database of genomic information
  • tools for target region selection
  • primer3 based primer design
  • analysis of secondary structures and SNPs in primer binding regions
  • specificity prediction (BiSearch)
  • relaxation cascade

Slide 15 of 38

BiSearch specificity prediction

BiSearch loose

1222222222222222

only the gene of interest

BiSearch strict

1233333333333

Slide 16 of 38

BiSearch specificity prediction

BiSearch loose

1222222222222222

only the gene of interest (FFAR2)

BiSearch strict

1233333333333

Columns: reads, seq, gene_list, official_symbol, location

reads: 2843
seq: CATGGCAGTCACCATCTTCTGCTACTGGCGTTTTGTGTGGATCATGCTCTCCCAGCCCCTTGTGGGGGCCCAGAGG
     CGGCGCCGAGCCGTGGGGCTGGCTGTGGTGACGCTGCTCAATTTCCTGGTGTGCTTCGGACCTTACAGATCGGAA
gene_list: ENSG00000126262
official_symbol: FFAR2
location: 19:35940617-35942667

reads: 1897
seq: GTAAGGTCCGAAGCACACCAGGAAATTGAGCAGCGTCACCACAGCCAGCCCCACGGCTCGGCGCCGCCTCTGGGCC
     CCCACAAGGGGCTGGGAGAGCATGATCCACACAAAACGCCAGTAGCAGAAGATGGTGACTGCCATGAGATCGGAA
gene_list: ENSG00000126262
official_symbol: FFAR2
location: 19:35940617-35942667

reads: 1535
seq: GTAAGGTCCGAAGCACACCGAGAGCTGGGAGCAGGAGCTACACAGTCTGCTGGCCTCACTGCACACCCTGCTGGGG
     GCCCTGTACGAGGGAGCAGAGACTGCTCCTGTGCAGAATGAAGGCCCTGGGGTGGAGATGCTGCTGTCCTCAGAA
gene_list: ENSG00000141456
official_symbol: AC091153.1
location: 17:4574680-4607632

reads: 1097
seq: CATGGCAGTCACCATCTTCTGAGGACAGCAGCATCTCCACCCCAGGGCCTTCATTCTGCACAGGAGCAGTCTCTGC
     TCCCTCGTACAGGGCCCCCAGCAGGGTGTGCAGTGAGGCCAGCAGACTGTGTAGCTCCTGCTCCCAGCTCTCGG
gene_list: ENSG00000141456
official_symbol: AC091153.1
location: 17:4574680-4607632

reads: 1091
seq: CATGGCAGTCACCATCTTCTGAGGACAGCAGCATCTCCACCCCAGGGCCTTCATTCTGCACAGGAGCAGTCTCTGC
     TCCCTCGTACAGGGCCCCCAGCAGGGTGTGCAGTGAGGCCAGCAGACTGTGTAGCTCCTGCTCCCAGCTCTCGGT
gene_list: ENSG00000141456
official_symbol: AC091153.1
location: 17:4574680-4607632
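Taking the five read clusters listed on this slide at face value, the on-target fraction for the FFAR2 assay can be tallied per gene. This is only a sketch: the slide may show a truncated excerpt of the full read table, and the function name `on_target_fraction` is hypothetical.

```python
from collections import Counter

# (reads, official_symbol) pairs copied from the rows on the slide
rows = [
    (2843, "FFAR2"),
    (1897, "FFAR2"),
    (1535, "AC091153.1"),
    (1097, "AC091153.1"),
    (1091, "AC091153.1"),
]

def on_target_fraction(rows, target):
    """Return the fraction of reads assigned to the target gene and per-gene totals."""
    per_gene = Counter()
    for reads, symbol in rows:
        per_gene[symbol] += reads
    total = sum(per_gene.values())
    return per_gene[target] / total, dict(per_gene)

fraction, per_gene = on_target_fraction(rows, "FFAR2")
print(per_gene)
print(f"on-target: {fraction:.1%}")
```

On these rows alone, 4740 of 8463 reads map to FFAR2, so a substantial share of the amplification product would be off-target.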

Slide 17 of 38

Gene homology prevents perfect designs

1532 / 2043 (75%) of genes without perfect design have homologous genes that differ less than 12.5% (2 variations per 16 bases)

[Figure: distances (clustalW) between all genes without perfect design; one axis runs from 0% to 50%, the other from 1 to 2016]
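The 12.5% threshold corresponds to 2 differing positions in a 16-base window. The sketch below shows that arithmetic on two already-aligned sequences. It is a simplification: the slide's distances come from clustalW, whereas this hypothetical `fraction_different` helper only counts mismatches between equal-length strings, and both sequences are made up.

```python
def fraction_different(a, b):
    """Fraction of positions that differ between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

# Hypothetical 16-base windows differing at 2 positions
gene_window = "ACGTACGTACGTACGT"
homologue_window = "ACGTACGAACGTACCT"
print(fraction_different(gene_window, homologue_window))

# Share of genes without a perfect design that have a close homologue
print(round(1532 / 2043, 2))
```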

Slide 18 of 38

Wet lab validation

PCR composition

  • total volume: 5 µl
  • instrument: CFX-384 (with automation)
  • mastermix: SsoAdvanced SYBR
  • primer conc: 250 nM each

PCR program

  • default cycling protocol for SsoAdvanced SYBR (Ta=60°C)

Samples

  • cDNA: 25 ng (total RNA equivalents – Agilent Universal human reference RNA)
  • gDNA: 2.5 ng (Roche)
  • NTC: water + carrier (5 ng/µl yeast transfer RNA)
  • synthetic template (pooled 60-mers in concentration range: 2E7 – 2E1 copies)

Slide 19 of 38

Some numbers

  • lab validation of 50 133 assays (human and mouse)
  • 829 056 reactions
  • 2 159 PCR plates (384-well)
  • equivalent to 8 636 PCR plates (96-well)

172m
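The plate and reaction counts above are consistent with each other, as a quick check shows. The variable names are illustrative only.

```python
WELLS_PER_384_PLATE = 384
WELLS_PER_96_PLATE = 96

plates_384 = 2159
# Every well of every 384-well plate counted as one reaction
reactions = plates_384 * WELLS_PER_384_PLATE
# Number of 96-well plates needed for the same number of reactions
plates_96 = reactions // WELLS_PER_96_PLATE

print(reactions, plates_96)
```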

Slide 20 of 38

Two generations of external oligonucleotide standards

Vermeulen et al., Nucleic Acids Research, 2009

  • 55-mer
  • standard desalted
  • 3’ blocked to prevent elongation
  • 5-point dilution series: 150 000 molecules > 15 molecules

New approach: easier + cheaper + as good

  • 60-mer: first (5’) and last (3’) 30 nucleotides of amplicon sequence
  • standard desalted
  • no 3’ blocking
  • 7-point dilution series: 20 000 000 > 20 molecules

[Figure: oligonucleotide layout; labels: stuffer, FP, 30 nt 3’, 30 nt 5’, RCRP]
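The 60-mer of the new approach joins the first and last 30 nucleotides of the amplicon. A minimal sketch of that construction follows, assuming the amplicon is given 5’ to 3’ on one strand. The function name `synthetic_template` and the amplicon sequence are hypothetical.

```python
def synthetic_template(amplicon, flank=30):
    """Join the first and last `flank` nucleotides of an amplicon sequence."""
    amplicon = amplicon.upper()
    if len(amplicon) < 2 * flank:
        raise ValueError("amplicon is shorter than the two flanks combined")
    return amplicon[:flank] + amplicon[-flank:]

# Hypothetical 80-nt amplicon (not a real assay)
amplicon = "ACGT" * 20
template = synthetic_template(amplicon)
print(len(template), template)
```

Because both primer binding sites lie within the retained flanks, the shortened template can still be amplified by the same primer pair.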

Slide 21 of 38

Synthetic templates are equivalent to natural templates

                              ds template   ss oligo
r² < 0.99                     1             1
median E                      2.00          2.01
average E                     2.00          2.01
count E outside [1.90-2.10]   1             3
paired t-test p-value         0.14

[Figure: comparison between short ss synthetic template and full length ds template; axis ticks from 20 to 20000000 and from 10 to 35]

> 300 assays

Slide 22 of 38

Efficiency evaluation

amplification efficiency

  • 6 orders of magnitude: 20 – 20M copies
  • linear over entire range
  • LOD (LOQ) ≤ 20 molecules
  • E in 90-110% range
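The criteria above can be put in numbers. The sketch below generates a 7-point, 10-fold dilution series from 20M down to 20 copies and converts the 90-110% efficiency window into standard curve slopes, using the common convention that 100% efficiency means E = 2 (one doubling per cycle). Function names are hypothetical.

```python
import math

def dilution_series(top=20_000_000, points=7, factor=10):
    """Copy numbers of a serial dilution, highest concentration first."""
    return [top // factor ** i for i in range(points)]

def slope_for_efficiency(percent):
    """Standard curve slope (Cq per log10 copies) for a given efficiency in percent."""
    amplification_base = 1 + percent / 100  # 100% -> 2.0
    return -1 / math.log10(amplification_base)

series = dilution_series()
orders_of_magnitude = math.log10(series[0] / series[-1])
print(series, orders_of_magnitude)

for percent in (90, 100, 110):
    print(percent, round(slope_for_efficiency(percent), 3))
```

Under this convention the 90-110% window corresponds to slopes between roughly -3.59 and -3.10, with -3.32 for perfect doubling.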

Slide 23 of 38

Efficiency distribution (n = 50 133)

89%

Slide 24 of 38

Efficiency distribution (n = 50 133)

[Figure: efficiency distribution; labels: 89%, redesign, redesign]

Slide 25 of 38

NGS as preferred method for specificity assessment

amplicon sizing (+ melt analysis for SYBR assays)

  • limited sensitivity for detecting low-level non-specific co-amplification
  • failure to observe non-specific amplification of sequences with similar size and/or Tm, e.g. expressed pseudogenes or homologous genes

Next level of specificity assessment

  • in-silico specificity predictions by BiSearch
  • massively parallel sequencing of pooled PCR products
  • average coverage > 1000-fold → lab specificity > 99.9%
  • 50 – 200 times more sensitive than size analysis and Sanger sequencing

Slide 26 of 38

Most assays are 100% on-target

Slide 27 of 38

2/3 of non-specific assays may go unnoticed without NGS

[Figure: assays with off-target reads, binned by on-target fraction x from 0 < x < 0.1 to 0.9 < x < 1; one axis runs from 0% to 60%, a second scale shows % on-target from 0% to 100%]

Slide 28 of 38

MIQE compliant PrimePCR assay validation data sheet

Slide 29 of 38

MAQC

Micro-array quality control study

  • A = universal RNA
  • B = brain RNA
  • C = ¾ A + ¼ B
  • D = ¼ A + ¾ B

~ 100,000 PCRs

  • reproducibility
  • titration response
  • accuracy
  • dynamic range

[Figure: samples numbered 1 to 4 in the order A, C, D, B]
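The mixing ratios fix what samples C and D should show for any gene once its levels in A and B are known. A minimal sketch with made-up expression values follows (the function name `maqc_mixes` is hypothetical, and linear, unbiased mixing is assumed).

```python
def maqc_mixes(a, b):
    """Expected expression in samples C and D given expression in A and B."""
    c = 0.75 * a + 0.25 * b
    d = 0.25 * a + 0.75 * b
    return c, d

# Equal expression in A and B: C and D are expected to match as well
print(maqc_mixes(8.0, 8.0))

# Higher expression in A than in B: expected order is A > C > D > B
a, b = 100.0, 20.0
c, d = maqc_mixes(a, b)
print(a, c, d, b)
```

These two cases correspond to the reproducibility and titration response checks that can be run on the four samples.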

Slide 30 of 38

Reproducibility (n = 3 678)

Expression A = expression B → expression C = expression D

Slide 31 of 38

Titration response (n = 18 437)

Expression A > Expression B → A > C > D > B

Slide 32 of 38

Accuracy (n = 785)

No expression in A or B (gene expressed in only one of the two) → C/D = 3 or 1/3 → dCq = log2(3) ≈ 1.58
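The 1.58 cycle difference follows from the 3:1 ratio between C and D when a gene is expressed in only one of A or B, assuming perfect doubling per cycle (E = 2). The helper name `expected_dcq` is hypothetical.

```python
import math

def expected_dcq(ratio, efficiency=2.0):
    """Cq difference expected for a given expression ratio at a given amplification base."""
    return math.log(ratio) / math.log(efficiency)

# Gene present only in A: C = 0.75 * A and D = 0.25 * A, so C/D = 3
ratio_c_to_d = 0.75 / 0.25
print(round(expected_dcq(ratio_c_to_d), 2))

# Gene present only in B gives the inverse ratio and the same magnitude
print(round(abs(expected_dcq(1 / 3)), 2))
```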

Slide 33 of 38

Dynamic range

[Figure: gene count (500 to 4000) by copies per cell, human and mouse]

> 10 000 000 fold

Slide 34 of 38

qPCR detects more brain genes than micro-arrays (n = 12 853)

[Figure: genes detected by array versus qPCR; values shown: 150, 1 972, 7 334]

Slide 35 of 38

SEQC

Comparison of our qPCR data to SEQC results

correlation for 10 784 common genes

Slide 36 of 38

qPCR vs sequencing depth

qPCR

  • sum of detected copies of measured protein coding genes: ~500 million

NGS

  • high throughput gene expression profiling: 125 indexed samples (two concurrent HiSeq flow cells at about 25 million reads per sample)
  • high resolution transcriptome analysis: more than 50 samples at 100 million reads per sample

Slide 37 of 38

Human vs mouse

[Figure: gene count (500 to 3500) by dCq (human-mouse), from -15 to 15]

Slide 38 of 38

Conclusions

Assay design and in-silico validation

  • qPCR assays for protein coding genes in human and mouse
  • Transcript coverage
  • SNPs and secondary structures
  • Specificity prediction

Lab validation

  • E in 90-110% range
  • Stringent specificity analysis by NGS

qPCR based transcriptome profiling

  • High sensitivity and dynamic range
  • MAQC screening allows for cross-platform comparison