Comparison of RNA sequencing with 19,319 lab validated RT-qPCR assays

Comparison of RNA sequencing with 19,319 lab validated RT-qPCR assays Jan Hellemans, PhD London, UK October 20-21, 2014

Acknowledgements: Biogazelle team & collaborators • Biogazelle • Ghent University: Steve Lefever • SEQC consortium: Christopher Mason, David Kreil, Leming Shi • Bio-Rad

Introduction • qPCR: reference technology for nucleic acid quantification: sensitivity and specificity; wide dynamic range; speed; relatively low cost; conceptual and practical simplicity • easy to perform ≠ easy to do it right: many steps involved; all need to be right

Assays & MIQE • design: amplicon length; primer positions (exonic or intron-spanning); transcript coverage • in silico verification: specificity prediction (retropseudogenes and other homologues); secondary structure analysis • empirical (wet lab) validation: specificity assessment (gel, melt, amplicon sequencing); Cq of NTC (for SYBR assays); amplification efficiency determination (slope, E, SE(E), r²)

The perfect assay: properties • specific for the gene of interest (no off-target amplification) • detection of all transcript variants • detection not affected by polymorphisms (no allelic bias or drop-out) • amplification efficiency ~100% • no gDNA co-amplification • no primer dimer formation

The perfect assay

The perfect assay ... or the best possible • For some genes, there is no perfect assay: no unique sequence (homology with other genes, pseudogenes); no common sequence among all transcripts; regions are excluded because of repeats, secondary structures, SNPs, homology, ... • Make the best possible compromise and report potential issues • Design → in silico quality control → lab validation

Assay design using primerXL • database of genomic information (transcripts, SNPs, ...) • tools for target region selection (maximize transcript coverage) • primer3 design engine • analysis of secondary structures and SNPs in primer annealing regions • specificity prediction (BiSearch) • relaxation cascade (from perfect to best possible)

BiSearch specificity prediction [figure: panels "BiSearch loose" and "BiSearch strict"] • only the gene of interest (FFAR2)

reads | seq | gene_list | official_symbol | location
2843 | CATGGCAGTCACCATCTTCTGCTACTGGCGTTTTGTGTGGATCATGCTCTCCCAGCCCCTTGTGGGGGCCCAGAGGCGGCGCCGAGCCGTGGGGCTGGCTGTGGTGACGCTGCTCAATTTCCTGGTGTGCTTCGGACCTTACAGATCGGAA | ENSG00000126262 | FFAR2 | 19:35940617-35942667
1897 | GTAAGGTCCGAAGCACACCAGGAAATTGAGCAGCGTCACCACAGCCAGCCCCACGGCTCGGCGCCGCCTCTGGGCCCCCACAAGGGGCTGGGAGAGCATGATCCACACAAAACGCCAGTAGCAGAAGATGGTGACTGCCATGAGATCGGAA | ENSG00000126262 | FFAR2 | 19:35940617-35942667
1535 | GTAAGGTCCGAAGCACACCGAGAGCTGGGAGCAGGAGCTACACAGTCTGCTGGCCTCACTGCACACCCTGCTGGGGGCCCTGTACGAGGGAGCAGAGACTGCTCCTGTGCAGAATGAAGGCCCTGGGGTGGAGATGCTGCTGTCCTCAGAA | ENSG00000141456 | AC091153.1 | 17:4574680-4607632
1097 | CATGGCAGTCACCATCTTCTGAGGACAGCAGCATCTCCACCCCAGGGCCTTCATTCTGCACAGGAGCAGTCTCTGCTCCCTCGTACAGGGCCCCCAGCAGGGTGTGCAGTGAGGCCAGCAGACTGTGTAGCTCCTGCTCCCAGCTCTCGG | ENSG00000141456 | AC091153.1 | 17:4574680-4607632
1091 | CATGGCAGTCACCATCTTCTGAGGACAGCAGCATCTCCACCCCAGGGCCTTCATTCTGCACAGGAGCAGTCTCTGCTCCCTCGTACAGGGCCCCCAGCAGGGTGTGCAGTGAGGCCAGCAGACTGTGTAGCTCCTGCTCCCAGCTCTCGGT | ENSG00000141456 | AC091153.1 | 17:4574680-4607632

Wet lab validation setup • PCR composition: total volume 5 µl; instrument CFX384 (with automation); mastermix SsoAdvanced SYBR; primer concentration 250 nM each • PCR program: default cycling protocol for SsoAdvanced SYBR (Ta = 60°C) • Samples: cDNA, 25 ng (total RNA equivalents; Agilent Universal human reference RNA = MAQC A); gDNA, 2.5 ng (Roche); NTC, water + carrier (5 ng/µl yeast transfer RNA); synthetic template (pooled 60-mers, concentration range 20 million to 20 copies)

Wet lab validation: some numbers [figure label: 305 m] • lab validation of 103 053 assays (human, mouse and rat coding genes) • 1 456 142 reactions • 3 822 PCR plates (384-well) • equivalent to 15 288 PCR plates (96-well)

Amplification efficiency: synthetic templates • initial publication: Vermeulen et al., Nucleic Acids Research, 2009 • Biogazelle approach (easy & cost effective): 60-mer (30 nt 5', 30 nt 3'); no modifications, standard desalted; 7-point dilution series, 20 000 000 > 20 molecules • equivalent to full-length double-stranded template:

measure | ds template | ss oligo
r² < 0.99 | 1 | 1
median E | 2.00 | 2.01
average E | 2.00 | 2.01
count E outside [1.90-2.10] | 1 | 3
paired t-test p-value | 0.14

• limitation: behavior of the first cycles amplifying from cDNA is not evaluated
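The efficiency figures on this slide (slope, E, SE(E), r²) are the usual outputs of a standard curve fitted to a dilution series. The sketch below is a minimal illustration, not the pipeline used for the slides: it fits Cq against log10(copies) by ordinary least squares and converts the slope with the standard relation E = 10^(-1/slope), so that E = 2.00 means perfect doubling per cycle. The Cq values are simulated for a perfectly efficient assay, and the intercept of 38 cycles is an arbitrary assumption.

```python
import math

def standard_curve(copies, cq):
    """Fit Cq = intercept + slope * log10(copies) by ordinary least squares."""
    x = [math.log10(c) for c in copies]
    n = len(x)
    mx, my = sum(x) / n, sum(cq) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in cq)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, cq))
    slope = sxy / sxx
    intercept = my - slope * mx
    r2 = sxy ** 2 / (sxx * syy)
    # Standard error of the slope from the residuals (n - 2 degrees of freedom).
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, cq))
    se_slope = math.sqrt(ss_res / (n - 2) / sxx)
    # Amplification base: E = 2 corresponds to 100% efficiency.
    efficiency = 10 ** (-1 / slope)
    # First-order error propagation: dE/dslope = E * ln(10) / slope**2.
    se_efficiency = efficiency * math.log(10) / slope ** 2 * se_slope
    return {"slope": slope, "E": efficiency, "SE(E)": se_efficiency, "r2": r2}

# 7-point, 10-fold dilution series from 20 000 000 down to 20 molecules.
copies = [2e7 / 10 ** i for i in range(7)]
# Simulated Cq values for perfect doubling; the intercept (38) is hypothetical.
cq = [38 - math.log2(c) for c in copies]

fit = standard_curve(copies, cq)
```

With real data, `fit["E"]` can be compared against the [1.90-2.10] window quoted in the table above.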

Amplification efficiency distribution (n = 50 133) [figure: efficiency histogram with 89% marked and the assays on either side labelled "redesign"]

Specificity: NGS for increased sensitivity • amplicon sizing (+ melt analysis for SYBR assays): limited sensitivity for detecting low-level non-specific co-amplification; failure to observe non-specific amplification of sequences with similar size and/or Tm, e.g. expressed pseudogenes or homologous genes • next level of specificity assessment: in silico specificity predictions by BiSearch; massively parallel sequencing of pooled PCR products; average coverage > 1000-fold → lab specificity > 99.9%; 50 to 200 times more sensitive than size analysis and Sanger sequencing
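The link between coverage and measurable specificity can be made concrete. The sketch below is an illustration under assumptions, not the analysis behind the slides: it sums amplicon reads per gene and reports the on-target fraction, and it treats a single read out of the total as the smallest off-target share that can be seen, which is why 1000-fold coverage resolves specificity down to 99.9%. The read counts are borrowed from the FFAR2 example table earlier in the deck purely as input data; the function names are hypothetical.

```python
from collections import defaultdict

def on_target_fraction(read_table, target_gene):
    """Fraction of sequenced amplicon reads that map to the intended gene."""
    per_gene = defaultdict(int)
    for reads, gene in read_table:
        per_gene[gene] += reads
    total = sum(per_gene.values())
    if total == 0:
        raise ValueError("no reads in table")
    return per_gene[target_gene] / total

def smallest_detectable_fraction(coverage):
    """One read out of `coverage` reads is the finest off-target share visible."""
    return 1 / coverage

# (reads, official_symbol) pairs taken from the FFAR2 example table.
amplicon_reads = [
    (2843, "FFAR2"),
    (1897, "FFAR2"),
    (1535, "AC091153.1"),
    (1097, "AC091153.1"),
    (1091, "AC091153.1"),
]

fraction = on_target_fraction(amplicon_reads, "FFAR2")
resolution = 1 - smallest_detectable_fraction(1000)  # 0.999 at 1000-fold coverage
```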

Specificity: most assays are 100% on-target

Specificity: 2/3 of non-specific assays may go unnoticed without NGS [figure: chart of % on-target, with assays binned in steps of 0.1 from 0 < x < 0.1 up to 0.9 < x < 1 and 100%]

Specificity: the power of in silico verification

category | assays | share
perfect | 60 293 | 86%
acceptable (<10% non-specific) | 5 866 | 8%
predicted non-specificity (no specific design found) | 1 204 | 2%
failing specificity QC criteria | 2 467 | 4%

MIQE compliant PrimePCR assay validation data sheet

Dynamic range: > 10 000 000 fold [figure: gene count (0 to 5000) against copies per cell, in two-fold bins from 0.001 to 16 777.216, for human, mouse and rat]
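The "> 10 000 000 fold" figure follows directly from the bin edges on the axis. As a quick check, assuming the bins are exact two-fold steps starting at 0.001 copies per cell (which the tick values suggest), the sketch below regenerates the edges and the fold range they span.

```python
# Two-fold bin edges in copies per cell, starting at the lowest tick (0.001).
LOWEST_BIN = 0.001
N_BINS = 25

bin_edges = [LOWEST_BIN * 2 ** k for k in range(N_BINS)]

# Fold difference between the highest and lowest bin: 2**24.
fold_range = 2 ** (N_BINS - 1)

top_bin = round(bin_edges[-1], 3)  # matches the highest tick on the axis
```

Twenty-four doublings give a fold range of 16 777 216, consistent with the slide's claim of more than 10 000 000 fold.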

SEQC • multisite, cross-platform analysis of RNAseq • FDA sponsored and guided (MAQC-III) • Nature Biotechnology, Sept 2014: Focus on RNA sequencing quality control (SEQC), 2 Biogazelle co-authors • MAQC samples: reference RNA with built-in controls (known truths) • > 100 billion reads • compared against qPCR (PrimePCR)

RNAseq vs PrimePCR: differential expression

platform | 454 | ILMN | PGM | PRO
value shown on slide (unlabelled) | 0.83 | 0.89 | 0.86 | 0.89
genes | 13,190 | 16,264 | 14,981 | 16,242

qPCR (PrimePCR) vs RNAseq (Illumina): r² = 75% for genes detected by both platforms
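An r² "for genes detected by both platforms" implies that genes missing on either side are dropped before the correlation is computed. The sketch below shows one way to do that, assuming each platform's results are held in a dictionary of log-scale expression values with None for undetected genes. The gene names and values are invented for illustration and are not taken from the SEQC data.

```python
def r_squared_shared(qpcr, rnaseq):
    """Squared Pearson correlation over genes with a value on both platforms."""
    shared = sorted(
        g for g in qpcr
        if g in rnaseq and qpcr[g] is not None and rnaseq[g] is not None
    )
    if len(shared) < 2:
        raise ValueError("need at least two shared genes")
    x = [qpcr[g] for g in shared]
    y = [rnaseq[g] for g in shared]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy ** 2 / (sxx * syy), len(shared)

# Hypothetical log2 expression values; None marks "not detected".
qpcr_log2 = {"GENE_A": 1.0, "GENE_B": 2.0, "GENE_C": 3.0, "GENE_D": 4.0, "GENE_E": None}
rnaseq_log2 = {"GENE_A": 2.1, "GENE_B": 3.9, "GENE_C": 6.2, "GENE_D": 7.8, "GENE_F": 5.0}

r2, n_shared = r_squared_shared(qpcr_log2, rnaseq_log2)
```

GENE_E and GENE_F are excluded because each is detected on only one platform, so the correlation is computed over the four shared genes.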

qPCR (PrimePCR) vs RNAseq (Illumina)

Saturation analysis: ABRF-NGS dataset

preparation | sample | libraries | reads | GENCODE12 mapping | PrimePCR mapping
ribo-depleted | MAQC A | 22 | 5 304 M | 1 955 M (37%) | 1 692 M (32%)
ribo-depleted | MAQC B | 17 | 3 370 M | 1 447 M (43%) | 1 193 M (35%)
poly-A-enriched | MAQC A | 4 | 427 M | 291 M (68%) | 278 M (65%)
poly-A-enriched | MAQC B | 4 | 446 M | 323 M (72%) | 297 M (67%)

Saturation analysis: ribo-depletion RNAseq, % of GENCODE12 [figure, shown on two consecutive slides: detection curves for MAQC A and MAQC B, with quantification curves for both samples added on the second slide; y-axis 0% to 100%, x-axis ticks from 4 096 000 000 down to 125 000 in two-fold steps]
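A saturation curve of this kind can be produced by subsampling the mapped reads at successively halved depths and counting how many genes are still seen. The sketch below is a toy version under stated assumptions: reads are simulated from a skewed abundance distribution, a gene counts as "detected" with at least 1 read, and as "quantified" with at least 10 reads. Both thresholds, the gene count, and the read depths are hypothetical and are not the criteria used for the slides.

```python
import random

def saturation_curve(reads, n_genes, depths, min_quant_reads=10):
    """Return (depth, detected fraction, quantified fraction) per depth.

    `reads` holds one gene index per mapped read, in random order, so a
    prefix of length `depth` is a random subsample of that size.
    """
    results = []
    for depth in sorted(depths):
        counts = [0] * n_genes
        for gene in reads[:depth]:
            counts[gene] += 1
        detected = sum(c >= 1 for c in counts)
        quantified = sum(c >= min_quant_reads for c in counts)
        results.append((depth, detected / n_genes, quantified / n_genes))
    return results

rng = random.Random(0)  # fixed seed so the toy example is reproducible
n_genes = 200
# Skewed abundances: a few highly expressed genes and a long tail of rare ones.
weights = [1 / (rank + 1) for rank in range(n_genes)]
reads = rng.choices(range(n_genes), weights=weights, k=64_000)

# Two-fold steps in depth, from 500 up to 64 000 reads.
depths = [64_000 // 2 ** i for i in range(8)]
curve = saturation_curve(reads, n_genes, depths)
```

Because each shallower subsample is a prefix of the deeper one, both fractions can only stay level or rise as depth increases, and the quantified fraction never exceeds the detected fraction.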
