Frontiers of Metrology in Biology 26 th CGPM 2018 Marc Salit Joint - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Frontiers of Metrology in Biology

26th CGPM 2018 Marc Salit Joint Initiative for Metrology in Biology NIST, Stanford University and SLAC

slide-2
SLIDE 2


slide-3
SLIDE 3
slide-4
SLIDE 4

There are 10^13 cells in a human (human cells). There are 10^14 microbial cells in a human. There are ~10^10 carbon atoms in a human cell.

This is a 192 x 128 grid at 18 droplets per cm. It contains 7212 droplet transfers of 2.5 nl/droplet. Each droplet contained about 1000 cells. Cells were grown for ~24h.
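The figures quoted for the droplet grid can be cross-checked with a little arithmetic. The sketch below only uses the numbers stated on the slide; the variable names are made up for illustration, and "about 1000 cells" means the cell count is approximate.

```python
# Numbers quoted on the slide.
GRID_COLS, GRID_ROWS = 192, 128
DROPLETS_PER_CM = 18
TRANSFERS = 7212
NL_PER_DROPLET = 2.5
CELLS_PER_DROPLET = 1000  # "about 1000", so derived counts are approximate

grid_positions = GRID_COLS * GRID_ROWS               # addressable spots in the grid
width_cm = GRID_COLS / DROPLETS_PER_CM               # physical width of the grid
height_cm = GRID_ROWS / DROPLETS_PER_CM              # physical height of the grid
total_volume_ul = TRANSFERS * NL_PER_DROPLET / 1000  # nanolitres -> microlitres
cells_dispensed = TRANSFERS * CELLS_PER_DROPLET      # before ~24 h of growth
fraction_filled = TRANSFERS / grid_positions         # share of spots that got a droplet

print(f"{grid_positions} positions, {width_cm:.1f} x {height_cm:.1f} cm")
print(f"{total_volume_ul:.2f} uL dispensed, ~{cells_dispensed:.1e} cells")
print(f"{fraction_filled:.0%} of grid positions used")
```

The whole image therefore corresponds to roughly 18 microlitres of liquid and on the order of seven million cells at the start of growth.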

slide-5
SLIDE 5

[Figure: NIST diagram of the relationships among SI units. The seven SI base units (kelvin K, candela cd, ampere A, mole mol, second s, meter m, kilogram kg) connect to the SI derived units with special names and symbols (for example newton, N = kg·m/s2; joule, J = N·m; watt, W = J/s; volt, V = W/A; katal, kat = mol/s) and to derived units without special names (m2, m3, m/s, m/s2). Solid lines indicate multiplication, broken lines indicate division. Celsius temperature is related to thermodynamic temperature by t/°C = T/K – 273.15.]

https://physics.nist.gov/cuu/Units/SIdiagram.html
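The structure of the diagram (derived units built from base units by multiplication and division) can be sketched in a few lines. This is only an illustration: each unit is represented as a dictionary of base-unit exponents, and the helper names `unit`, `mul` and `div` are invented here, not taken from any library.

```python
BASE_UNITS = ("kg", "m", "s", "A", "K", "mol", "cd")

def unit(**exponents):
    """A unit as a dict of base-unit exponents; unspecified exponents are 0."""
    return {b: exponents.get(b, 0) for b in BASE_UNITS}

def mul(a, b):
    """Multiplication (a solid line in the diagram): exponents add."""
    return {k: a[k] + b[k] for k in BASE_UNITS}

def div(a, b):
    """Division (a broken line in the diagram): exponents subtract."""
    return {k: a[k] - b[k] for k in BASE_UNITS}

kg, m, s, A = unit(kg=1), unit(m=1), unit(s=1), unit(A=1)

newton = div(mul(kg, m), mul(s, s))   # kg·m/s2
joule = mul(newton, m)                # N·m
watt = div(joule, s)                  # J/s
pascal = div(newton, mul(m, m))       # N/m2
coulomb = mul(A, s)                   # A·s
volt = div(watt, A)                   # W/A
farad = div(coulomb, volt)            # C/V
ohm = div(volt, A)                    # V/A
weber = mul(volt, s)                  # V·s
tesla = div(weber, mul(m, m))         # Wb/m2
henry = div(weber, A)                 # Wb/A

def celsius_from_kelvin(t_kelvin):
    """t/°C = T/K - 273.15, as printed on the diagram."""
    return t_kelvin - 273.15

print("volt  =", {k: v for k, v in volt.items() if v})
print("farad =", {k: v for k, v in farad.items() if v})
```

Every unit on the map reduces to exponents of the seven base units, which is the point of the slide that follows: no comparable map exists yet for biological measurands.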

slide-6
SLIDE 6

The Subway (in Paris, The Metro) Diagram “Wait… there’s no stop for Biology?!?”

[Figure: the SI units diagram, repeated from the previous slide.]

slide-7
SLIDE 7

JIMB focusing on Operational Mastery of Living Matter

JIMB is focused on Operational Mastery of living matter at the cellular level.

  • Organizing principle: “Measure, Model, Make”
  • Through Genomics and Synthetic Biology
  • measure everything inside the cell…

Not focusing on metrology of biomaterials properties, medical diagnostics, biotherapeutics, regenerative medicine, diagnostic imaging…

slide-8
SLIDE 8

What’s different about metrology in biology?

Characterizing living matter requires measuring massively multiplexed measurands of heterogeneous systems with complex dynamics and interactions.

  • A living cell is a dance of interacting chemical systems governed by biophysics.
  • The cell is the atom of biology.

DNA: Genome
RNA: Transcriptome
Protein: Proteome

slide-9
SLIDE 9

There was a revolution in measuring biology in 2006.

Accurate Multiplex Polony Sequencing of an Evolved Bacterial Genome

Jay Shendure, Gregory J. Porreca, Nikos B. Reppas, Xiaoxia Lin, John P. McCutcheon, Abraham M. Rosenbaum, Michael D. Wang, Kun Zhang, Robi D. Mitra, George M. Church

We describe a DNA sequencing technology in which a commonly available, inexpensive epifluorescence microscope is converted to rapid nonelectrophoretic DNA sequencing automation. We apply this technology to resequence an evolved strain of Escherichia coli at less than one error per million consensus bases. A cell-free, mate-paired library provided single DNA molecules that were amplified in parallel to 1-micrometer beads by emulsion polymerase chain reaction. Millions of beads were immobilized in a polyacrylamide gel and subjected to automated cycles of sequencing by ligation and four-color imaging. Cost per base was roughly one-ninth as much as that of conventional sequencing. Our protocols were implemented with off-the-shelf instrumentation and reagents.

The ubiquity and longevity of Sanger sequencing (1) are remarkable. Analogous to semiconductors, measures of cost and production have followed exponential trends (2). High-throughput centers generate data at a speed of 20 raw bases per instrument-second and a cost of $1.00 per raw kilobase. Nonetheless, optimizations of electrophoretic methods may be reaching their limits. Meeting the challenge of the $1000 human genome requires a paradigm shift in our underlying approach to the DNA polymer (3). Cyclic array methods, an attractive class of alternative technologies, are “multiplex” in that they leverage a single reagent volume to enzymatically manipulate thousands to mil-

Shendure, J., Porreca, G. J., Reppas, N. B., Lin, X., McCutcheon, J. P., Rosenbaum, A. M., … Church, G. M. (2005). Accurate multiplex polony sequencing of an evolved bacterial genome. Science, 309(5741), 1728–1732. https://doi.org/10.1126/science.1117389
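To see why the abstract calls for a paradigm shift, the quoted throughput and cost figures can be scaled up to a human genome. This is a back-of-the-envelope sketch: the genome size of about 3.1 billion bases and the single-pass (1x) assumption are mine, not from the slide, and real projects need many-fold coverage.

```python
# Figures quoted in the abstract.
SANGER_COST_PER_KB = 1.00        # dollars per raw kilobase
SANGER_BASES_PER_SECOND = 20     # raw bases per instrument-second
POLONY_COST_FRACTION = 1 / 9     # "roughly one-ninth" of conventional cost
TARGET_COST = 1000.0             # the "$1000 human genome"

# Assumptions for illustration only (not from the slide).
GENOME_BASES = 3.1e9             # approximate haploid human genome size
SECONDS_PER_YEAR = 365.25 * 24 * 3600

sanger_cost = GENOME_BASES / 1000 * SANGER_COST_PER_KB
polony_cost = sanger_cost * POLONY_COST_FRACTION
instrument_years = GENOME_BASES / SANGER_BASES_PER_SECOND / SECONDS_PER_YEAR
fold_still_needed = polony_cost / TARGET_COST

print(f"Single pass at Sanger rates: ${sanger_cost:,.0f}, "
      f"{instrument_years:.1f} instrument-years")
print(f"At one-ninth the cost: ${polony_cost:,.0f}, "
      f"still ~{fold_still_needed:.0f}x above ${TARGET_COST:,.0f}")
```

Even a ninefold cost reduction leaves a gap of more than two orders of magnitude for a single pass, which is consistent with the abstract's framing.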

slide-10
SLIDE 10

You can scan the landscape to frame a roadmap.

Genome, Regulation, Transcriptome, Reg., Proteome, Reg., Metabolome, Organelle, Cell, Tissue, Organism, System …

  • Lots of methods, reasonably characterized
  • We’re pretty good at sequencing, but sampling presents challenges.
  • Pretty good at RNA-Seq, reasonably characterized, sampling challenges.
  • A couple of methods, still emerging
  • Variety of methods, technically challenging
  • Variety of methods, still emerging
  • Enzyme activity measures, not ‘omics?

It’s more granular than this – there’s work to do to roadmap our measurement capabilities.

slide-11
SLIDE 11

Community is reaching out for Standards.

  • Protocols
  • Data Representation
  • Data Exchange
  • Requirements/Specifications
  • Calibration Materials
  • Validation/Benchmark Materials
  • Validation/Benchmark Data

from https://www.encodeproject.org based on an image from Darryl Leja (NHGRI), Ian Dunham (EBI), Michael Pazin (NHGRI)

slide-12
SLIDE 12

This 2012 Nature Comment triggered recognition of a “Reproducibility Crisis” in biomedical science…

slide-13
SLIDE 13

Taking a cue from Chemical Metrology… Reference Materials can work in Biology

  • Both RMs depicted were created in consortium partnerships
  • Both are widely adopted
  • Both address needs in Genome-Scale Measurements

[Figure labels: Transcriptome Spike-ins, ERCC Controls; Human Genomes, GIAB Genomes; Quantitative; Qualitative]

slide-14
SLIDE 14

Use the ERCC Reference Material plasmid library to make controls

SRM 2374 Plasmid DNA Library → in vitro transcription → RNA transcripts → Pooling → Pools with known abundance ratios

slide-15
SLIDE 15

Design of Ambion ERCC Spike-In Ratio Mixtures

23 Controls per Subpool

Design abundance spans 2^20 range within each Subpool
slide-16
SLIDE 16

erccdashboard gives standard measures of technical performance

  • Technology-independent ratio performance measures
  • Shows differences in performance across
      • Experiments
      • Laboratories
      • Measurement processes

Munro, S. A. et al. Nat. Commun. 5:5125 doi: 10.1038/ncomms6125 (2014).

slide-17
SLIDE 17

  • Evaluate Dynamic Range Performance
  • Evaluate Ratio Performance – “MA Plot”
  • Evaluate Diagnostic Performance – “ROC Curve”
  • Establish Lower Limit of Detection for Differential Expression Detection – “LODR”
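Two of the measures named above have simple textbook definitions, sketched here. This is not the erccdashboard implementation (that is an R package with its own models); the function names and toy numbers are made up. M and A are the usual MA-plot coordinates, and the ROC area is computed with the rank-based (Mann-Whitney) formulation.

```python
import math

def ma_values(signal_1, signal_2):
    """MA-plot coordinates for one control measured in two samples.

    M is the log2 ratio, A is the mean log2 signal.
    """
    m = math.log2(signal_1 / signal_2)
    a = 0.5 * math.log2(signal_1 * signal_2)
    return m, a

def roc_auc(changed_scores, unchanged_scores):
    """Area under the ROC curve.

    Equals the probability that a truly changed control scores higher than
    an unchanged (1:1) control, counting ties as one half.
    """
    wins = 0.0
    for c in changed_scores:
        for u in unchanged_scores:
            if c > u:
                wins += 1.0
            elif c == u:
                wins += 0.5
    return wins / (len(changed_scores) * len(unchanged_scores))

# Toy example: a control designed at 4:1 but measured at 8 vs 2.5 units.
m, a = ma_values(8.0, 2.5)
ratio_bias = m - math.log2(4.0)   # deviation from the design log2 ratio
print(f"M = {m:.2f}, A = {a:.2f}, bias vs design = {ratio_bias:+.2f}")

# Toy scores, e.g. -log10(p) from a differential expression test.
print("AUC =", roc_auc([5.1, 3.2, 0.9], [0.4, 1.0, 0.2]))
```

Spike-in controls make these measures possible because the true ratio of each control, and therefore which controls are truly "changed", is known in advance.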

slide-18
SLIDE 18

Good Lab

slide-19
SLIDE 19

Bad Lab

slide-20
SLIDE 20

Genome in a Bottle Consortium is making and disseminating human genome reference materials.

Genome sequencing measurement process: Sample → gDNA isolation → Library Prep → Sequencing → Alignment/Mapping → Variant Calling → Confidence Estimates → Downstream Analysis

  • create shared reference samples
  • validation materials to evaluate, demonstrate, refine, optimize technologies
  • red light/yellow light…
  • developed benchmarking dashboard with stakeholders @ GA4GH
  • meeting needs for technology developers, regulators, clinical research teams

slide-21
SLIDE 21

GIAB “Open Science” Virtuous Cycle

[Figure: cycle] Users analyze GIAB Samples → Benchmark vs. GIAB data → Critical feedback to GIAB → Integrate new methods → New benchmark data

  • Method development, optimization, and demonstration
  • Part of assay validation
  • GIAB/NIST expands to more difficult regions

Reference data

  • phased variant calls across 7 human genomes
  • ~ 4M small variants
  • ~20,000 larger “structural variants”

All data available immediately without embargo

  • consistent with transparency and metrology

slide-22
SLIDE 22

Evolving with Technologies: Single-molecule nanopore sequencer

slide-23
SLIDE 23

We’re Accumulating Nanopore Coverage

N50 and Coverage over time

5 longest mapped reads
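For readers unfamiliar with the two quantities plotted, here is a minimal sketch of how N50 and mean coverage are commonly computed from a list of read lengths. The function names and example numbers are illustrative, and the genome size passed in is an assumption supplied by the caller.

```python
def n50(read_lengths):
    """Length L such that reads of length >= L contain at least half of all bases."""
    if not read_lengths:
        raise ValueError("need at least one read")
    total = sum(read_lengths)
    running = 0
    for length in sorted(read_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length

def mean_coverage(read_lengths, genome_size):
    """Average number of times each genome position is covered by a read."""
    return sum(read_lengths) / genome_size

reads = [2, 2, 2, 3, 3, 4, 8, 8]           # toy read lengths
print("N50 =", n50(reads))                 # the two longest reads hold half the bases
print("coverage =", mean_coverage(reads, genome_size=16))
```

N50 rewards long reads much more strongly than the mean read length does, which is why it is the usual summary for long-read platforms such as nanopore sequencers.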

slide-24
SLIDE 24

Important characteristics of benchmark calls

What does “reference standard” mean?

Accurate

  • high-confidence variants, genotypes, haplotypes, and regions
  • compared to the benchmark, the majority of differences (FPs/FNs) are errors in the method

Representative examples

  • different types of variants in different genome contexts

Comprehensive characterization

  • many examples of different variant types/genome contexts
  • eventually, diploid assembly benchmarking
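Comparing a method's calls to a benchmark comes down to counting true positives, false positives (FPs) and false negatives (FNs) inside the benchmark's high-confidence regions. The sketch below is a deliberately simplified illustration: it matches variants by exact (chromosome, position, ref, alt) identity and ignores genotype and alternative representations of the same variant, which real benchmarking tools must reconcile. The half-open, 0-based region convention and all names and data are assumptions.

```python
def in_regions(variant, regions):
    """True if the variant's position falls in any (chrom, start, end) region."""
    chrom, pos = variant[0], variant[1]
    return any(c == chrom and start <= pos < end for c, start, end in regions)

def benchmark(calls, truth, regions):
    """Count TP/FP/FN within the benchmark regions and derive precision and recall."""
    calls_in = {v for v in calls if in_regions(v, regions)}
    truth_in = {v for v in truth if in_regions(v, regions)}
    tp = len(calls_in & truth_in)
    fp = len(calls_in - truth_in)   # called, but absent from the benchmark
    fn = len(truth_in - calls_in)   # in the benchmark, but missed by the method
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"tp": tp, "fp": fp, "fn": fn, "precision": precision, "recall": recall}

# Toy data: one high-confidence region, variants as (chrom, pos, ref, alt).
regions = [("chr1", 100, 200)]
truth = {("chr1", 110, "A", "G"), ("chr1", 150, "C", "T"), ("chr1", 500, "G", "A")}
calls = {("chr1", 110, "A", "G"), ("chr1", 160, "T", "C"), ("chr1", 900, "A", "T")}

result = benchmark(calls, truth, regions)
print(result)
```

Restricting the comparison to the benchmark regions is what lets most remaining FPs and FNs be read as errors in the method rather than gaps in the benchmark.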

slide-25
SLIDE 25


Establishing shared references, analytical approaches, performance metrics, and best practices for nominal properties is essential biometrology

slide-26
SLIDE 26

Single-cells – the atoms of biology!

Single-cell genomics is being widely adopted.

  • “Quantum” shift from measuring bulk populations of heterogeneous cells
  • Innovation in “Wet” lab
      • tissue disaggregation
      • including spatial location
      • single-cell processing
  • Innovation in “Dry” lab
      • data management
      • meaningful analysis

Ananda L. Roy et al. Sci Adv 2018;4:eaat8573

slide-27
SLIDE 27

Building a human cell atlas with single-cell RNA-Seq

Figure 5. Retrospective samples from GTEx can be successfully profiled using single-nucleus RNA-Seq. (A) Bulk gene expression profiles from all GTEx tissues. Hippocampus and frontal cortex sample clusters, from which samples in (B) are obtained, are circled. (B) Single nucleus RNA-Seq (by DroNc-Seq) of hippocampus and frontal cortex samples from the GTEx collection. tSNE plots are colored by k-NN graph clustering and labeled post hoc by cell type. (C) Each cluster is supported by multiple individuals (from relevant tissue).

From Human Cell Atlas Whitepaper, accessed 11/14/2018

slide-28
SLIDE 28

What could we do with iPS Reference Material Sets?

[Figure: a PGP individual provides a genome and an iPS cell line; differentiation yields Cell Type I, Cell Type II, Cell Type III, … Cell Type N. Each has a Reference Material Set (Epigenome, Transcriptome, Translatome, Proteome), feeding Predictive Systems Models, Iteratively Refined with Data.]

Develop reference sets from a single individual that represent a “body map” of functional ‘omes

  • use for model development and validation
  • use as substrate for technology development
  • benchmark sets for biology
slide-29
SLIDE 29

Essential genes of unknown function
Unknown functions that are essential

“what is naturally alive I do not understand” (D. Endy)
“what I cannot create I do not understand” (R. Feynman)

Organism Construction Coordination (enabling operational mastery of living matter)

  1. Information
  2. Operation
  3. Measure & Model

[Figure labels: Essential gene sets; Abstracted functional modules; APIs to (2) and (3); Cell-free & PURE; Expression architectures, from gene to operon to genome; Validated DNA via -omics; Molecular ensembles via Cryo-EM; Fluid physics ensemble dynamics]

jimb.stanford.edu

slide-30
SLIDE 30

Frontiers of Metrology in Biology? What if…

  • NMIs establish biometrology
      • coupled to emerging needs
  • We figure out how to establish metrics and comparability for results from complex algorithms
      • bioinformatics is part of the measurement process
      • this isn’t new per se, but the degree of complexity is significant
  • We develop more metrology of “nominal properties”
      • is traceability a useful concept?
      • measurement uncertainty?
      • are there analogues to yield compatibility/comparability?
  • We consider metrology of “Completeness” of Knowledgebases…

slide-31
SLIDE 31

The Joint Initiative for Metrology in Biology was built to work in this space.

  • Collaborative home for measurement science and standards for ‘omics and synthetic biology
  • NIST, Stanford University, and private sector
  • operated by SLAC
  • Watch for series of workshops to scope measurement science, measurement tool, and standards development

slide-32
SLIDE 32
slide-33
SLIDE 33

Tons of help from…

NIST Genome-Scale Measurements Group, MD & CA, and JIMB

Justin Zook, Jenny McDaniel, Lindsay Harris, David Catoe, Sarah Munro, Scott Pine, Noah Spies, Sasha Levy, Darach Miller, Arend Sidow, Drew Endy, and The Genome in a Bottle Consortium and The External RNA Controls Consortium