Statistics in high-content biology

SLIDE 1

Statistics in high-content biology

Rebecca Walls

Advanced Science & Technology Laboratory

SLIDE 2

Rebecca Walls, Non-Clinical Statistics Conference 2008, Leuven

Outline

  • Introduction and aim of high-content biology
  • Predicting liver toxicity in vivo
  • Distinguishing distinct modes of compound action

SLIDE 3

Current issues facing the pharmaceutical industry

  • All pharmaceutical companies face high attrition of compounds through the discovery and development process
  • Two key issues that face project progression are
      • Safety and toxicity
      • Efficacy in disease process
  • Need to know more about the mechanism of action and toxicity of our compounds at an earlier stage in the discovery process
  • More information enables front-loading of risk, early go/no-go decisions and improvements in toxicological attrition

SLIDE 4

High-content biological assays

  • Attempt to use in vitro cell models to mimic the complexity of an in vivo situation
  • Advanced imaging techniques used to generate large, complex datasets describing the response of a population of cells to a compound
  • Aim is to build predictive models or ‘fingerprints’ from the multiparametric assay data for well-characterised compounds that elicit known responses
  • Fingerprints applied to new drugs to predict biological mechanism of action and toxicity

SLIDE 5

Cell culture

[Figure: well showing the cells and the media layer]

  • Cells are extracted from some source tissue, e.g. rat hepatocytes, tumour-derived cell lines
  • Cells are plated into multi-well plates, typically hundreds or thousands of cells per well
  • Each well is like a test tube where we can test a single prototype drug
  • Cells grown in the well can be labelled and imaged

SLIDE 6

HCB cellular profiling

  • Nucleus: DNA content, size, shape, cell division, fragmentation, micronuclei
  • ER/Golgi: protein trafficking, secretion
  • Mitochondria: viability, mass, activity, cellular distribution, pre-apoptotic indicators
  • Cytoskeleton: tubulin, actin, fibre content, length, mitotic arrest
  • Apoptosis: membrane markers, blebbing, necrosis
  • Cell morphology: count, area, form, roundness, length/breadth, perimeter

General imaging indicators

SLIDE 7

Statistical challenges

  • Information captured for each feature is a dynamic response to the compound over an 8-point dose range
  • Datasets possess three-dimensional cube-like structure
  • Traditional multivariate approaches are difficult to apply to this type of data directly

[Figure: data cube with axes FEATURES, COMPOUNDS and DOSES]
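
To make the cube structure concrete, the following minimal sketch builds a simulated compounds × features × doses array and unfolds it into a two-dimensional matrix of the kind standard multivariate methods expect. The array sizes, the simulated values and the helper `column_index` are illustrative assumptions, not part of the talk. Unfolding multiplies the number of columns by the number of doses, which is one reason the data are awkward to analyse directly.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
n_compounds, n_features, n_doses = 5, 4, 8

rng = np.random.default_rng(0)
# cube[i, j, k] = value of feature j for compound i at dose k (simulated).
cube = rng.normal(size=(n_compounds, n_features, n_doses))

# Unfold: one row per compound, one column per (feature, dose) pair.
unfolded = cube.reshape(n_compounds, n_features * n_doses)


def column_index(feature, dose):
    """Column of the unfolded matrix holding a given feature at a given dose."""
    return feature * n_doses + dose


print(unfolded.shape)
```

The case studies that follow take a different route: each dose-response profile is first summarised by a few numbers per feature.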

SLIDE 8

Case study 1: Predicting liver toxicity in vivo

  • Drug-induced liver toxicity is one of the most common causes of drug non-approval
  • Early in vitro identification of compounds with hepatotoxic risk would allow their de-selection early in the drug development process

[Figure: liver toxicity endpoints in the animal and in the lab: cell death (necrosis), fatty liver (steatosis), phospholipidosis, cholestasis]

SLIDE 9

Predicting steatosis - data

  • Primary rat hepatocytes treated with a set of 60 compounds at a range of doses, consisting of known steatotics and non-steatotics
  • Bespoke algorithms designed to quantify differences in localisation and morphology of lipid droplets in the cells
  • Generates 32 different continuous measurements per cell
  • Averaged over cell population to give well-level measurements for each compound and dose combination
  • Use partial least squares modelling (stepwise) with the steatotic annotation as a binary response

[Figure: dose-response plots for compounds HT0053, HT1042 and HT1102; x-axis: dose]
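
As an illustration of partial least squares with a binary response, here is a minimal single-response PLS fit using the NIPALS algorithm in NumPy. It is a sketch under assumptions: the data are simulated, the function names `pls1_fit` and `pls1_predict` are hypothetical, the number of components is arbitrary, features are centred but not scaled, and the stepwise element mentioned on the slide is not reproduced.

```python
import numpy as np


def pls1_fit(X, y, n_components=2):
    """Fit a single-response PLS model with the NIPALS algorithm."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)        # weight vector
        t = Xk @ w                    # scores for this component
        tt = t @ t
        p = Xk.T @ t / tt             # X loadings
        q_a = yk @ t / tt             # y loading
        Xk = Xk - np.outer(t, p)      # deflate X
        yk = yk - q_a * t             # deflate y
        W.append(w)
        P.append(p)
        q.append(q_a)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)   # regression coefficients
    return x_mean, y_mean, coef


def pls1_predict(model, X_new):
    """Predicted score for new rows; larger means more 'steatotic-like'."""
    x_mean, y_mean, coef = model
    return y_mean + (np.asarray(X_new, dtype=float) - x_mean) @ coef


# Simulated stand-in data: 60 compounds, 32 summary variables.
rng = np.random.default_rng(0)
y = np.zeros(60)
y[:20] = 1.0                          # assumed: 20 annotated steatotics
X = rng.normal(size=(60, 32))
X[:20, :3] += 1.5                     # steatotics shifted in three variables

model = pls1_fit(X, y, n_components=2)
scores = pls1_predict(model, X)
print(scores[:3])
```

The predicted scores are continuous, so they can be thresholded to classify compounds or used directly to rank them.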

SLIDE 10

Polynomial model

  • Fit a cubic polynomial to the dose-response data for each feature
  • t-statistics for each term in the cubic form a new set of variables
  • Only a small number of variables required to generate greatest predictivity
  • After cross-validation, polynomial model is approximately 10% better than range model

[Figure: proportion of edge fat (y) against x for a non-steatotic and a steatotic compound; plot of sensitivity against specificity for models with 50 variables and 1 variable]
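
The polynomial summary can be sketched as follows: fit a cubic by ordinary least squares to one feature's dose-response profile and return the t-statistic of each coefficient. This is a minimal sketch under assumptions: the function name `cubic_t_statistics` is hypothetical, the dose scale is taken to be a coded scale from -2 to 2, the response values are simulated, and the intercept's t-statistic is returned alongside the three polynomial terms, which may differ from what was done in the talk.

```python
import numpy as np


def cubic_t_statistics(dose, response):
    """Fit response = b0 + b1*x + b2*x^2 + b3*x^3 by least squares and
    return the t-statistic of each coefficient (intercept first)."""
    x = np.asarray(dose, dtype=float)
    y = np.asarray(response, dtype=float)
    X = np.vander(x, N=4, increasing=True)      # columns: 1, x, x^2, x^3
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]                   # 8 doses - 4 terms = 4
    sigma2 = resid @ resid / dof                # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)       # coefficient covariance
    return beta / np.sqrt(np.diag(cov))


# Simulated profile for one feature of one compound (assumed values).
x = np.linspace(-2.0, 2.0, 8)                   # assumed coded dose scale
rng = np.random.default_rng(0)
y = 0.05 + 0.02 * x + rng.normal(scale=0.002, size=x.size)

t_stats = cubic_t_statistics(x, y)
print(t_stats)
```

Applying the function to every feature of every compound turns each 8-point profile into a handful of variables, which removes the dose axis from the data cube.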

SLIDE 11

Advantages of model

  • Based on predictive scores, compounds can be ranked in order of steatotic effect
  • Bootstrapping, incorporating random x-resampling, used to generate 95% confidence intervals for the predicted score
  • High confidence, high steatotic effect compounds can be de-selected

[Figure: steatotic effect by compound; x-axis: compounds, y-axis: steatotic effect]
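
A bootstrap interval for a predicted score can be sketched as below. The sketch reads "random x-resampling" as resampling whole compounds (rows) with replacement, which is an interpretation rather than something the slide states. The least-squares model `ols_fit_predict` is only a stand-in for the PLS model, the function names are hypothetical, and the data are simulated.

```python
import numpy as np


def ols_fit_predict(X_train, y_train, X_new):
    """Stand-in model: least squares with an intercept."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_new)), X_new]) @ coef


def bootstrap_score_intervals(X, y, X_new, fit_predict,
                              n_boot=500, level=0.95, seed=0):
    """Percentile bootstrap intervals for the predicted score of each row
    of X_new, resampling training compounds with replacement."""
    rng = np.random.default_rng(seed)
    n = len(y)
    preds = np.empty((n_boot, len(X_new)))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)        # resampled compound indices
        preds[b] = fit_predict(X[idx], y[idx], X_new)
    tail = 100 * (1 - level) / 2
    lower, upper = np.percentile(preds, [tail, 100 - tail], axis=0)
    return lower, upper


# Simulated data: 60 compounds, 3 summary variables, binary annotation.
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 3))
y = (X[:, 0] + 0.5 * rng.normal(size=60) > 0).astype(float)

point = ols_fit_predict(X, y, X)
lower, upper = bootstrap_score_intervals(X, y, X, ols_fit_predict)
ranking = np.argsort(-point)                    # highest predicted score first
print(ranking[:5], lower[ranking[:5]], upper[ranking[:5]])
```

Compounds whose whole interval sits at the high end of the score range are the natural candidates for de-selection.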

SLIDE 12

Case study 2: Identifying distinct modes of compound action

  • Morphology high-content assay developed specifically to examine microtubules and actin filaments as oncology targets
  • Describes how drugs influence the entire complex cellular phenotype (i.e. multiple targets)
  • 102 compounds screened through the morphology assay
  • Primary aims are
      • Identify which compounds are active in the assay, i.e. which are ‘hits’?
      • Differentiate compound hits that have distinct morphological effects
      • Cluster hits together that have similar effects
  • 138 features for each compound, tested over 8 doses
  • 310 control wells

SLIDE 13

Principal components analysis

  • PCA used in an attempt to reduce the dimension of the dataset, yielding 6 principal components which explain close to 80% of variation
  • Mahalanobis distance is a powerful means of determining how similar an unknown sample is to a known one
  • Differs from Euclidean distance in that it takes into account the covariance between variables
  • The Mahalanobis distance from a group of values with mean μ = (μ1, μ2, …, μp)^T and covariance matrix Σ for a multivariate vector x = (x1, x2, …, xp)^T is defined as

$$D_M(x) = \sqrt{(x - \mu)^T \Sigma^{-1} (x - \mu)}$$
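
Both steps on this slide can be sketched in a few lines of NumPy. The sketch rests on assumptions: the function names are hypothetical, the data are simulated, features are centred but not scaled before PCA, and the number of components is chosen as the smallest that reaches a stated share of variance. The Mahalanobis example uses two correlated variables to show how it differs from Euclidean distance.

```python
import numpy as np


def pca_scores(X, min_explained=0.8):
    """Project centred data onto the fewest principal components whose
    cumulative share of variance reaches min_explained."""
    Xc = X - X.mean(axis=0)
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = np.cumsum(s**2) / np.sum(s**2)
    k = int(np.searchsorted(explained, min_explained) + 1)
    return Xc @ Vt[:k].T, explained[:k]


def mahalanobis(x, mu, sigma):
    """Mahalanobis distance of x from a group with mean mu, covariance sigma."""
    diff = np.asarray(x, dtype=float) - np.asarray(mu, dtype=float)
    return float(np.sqrt(diff @ np.linalg.solve(sigma, diff)))


# Simulated data with one dominant direction of variation.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5)) * np.array([10.0, 1.0, 1.0, 1.0, 1.0])
scores, explained = pca_scores(X)

# Two points at the same Euclidean distance from the mean.
mu = np.array([0.0, 0.0])
sigma = np.array([[1.0, 0.9],
                  [0.9, 1.0]])
along = np.array([1.0, 1.0])      # follows the correlation
against = np.array([1.0, -1.0])   # runs against the correlation

print(np.linalg.norm(along), np.linalg.norm(against))
print(mahalanobis(along, mu, sigma), mahalanobis(against, mu, sigma))
```

The point that runs against the correlation is much further away in Mahalanobis terms, even though both points are equally far from the mean in Euclidean terms.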

SLIDE 14

Using the Mahalanobis distance

[Figure: density of squared Mahalanobis distances for non-hits and hits]

  • Working on the PCA scores on the 6 principal components, the covariance matrix of the control cloud was calculated
  • For each compound at every dose, the squared Mahalanobis distance to the centre of mass was calculated and compared to a chi-squared distribution with 6 degrees of freedom at some pre-chosen significance level, α
  • An adjustment was made to control the false discovery rate
  • A compound with a significant result at at least one of the doses along its range was deemed to be an ‘active hit’
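
The hit-calling procedure can be sketched as follows. The slide does not name the false discovery rate adjustment, so the Benjamini-Hochberg step-up procedure is used here as an assumption. The function names, the choice α = 0.05 and all data are illustrative. The chi-squared tail probability uses the exact closed form available for an even number of degrees of freedom, which avoids any dependency beyond NumPy.

```python
import math

import numpy as np


def chi2_sf_even_df(x, df=6):
    """Upper-tail probability of a chi-squared variable with even df."""
    half = x / 2.0
    terms = sum(half**j / math.factorial(j) for j in range(df // 2))
    return math.exp(-half) * terms


def benjamini_hochberg(pvalues, alpha=0.05):
    """Boolean rejections under the Benjamini-Hochberg procedure."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)
    passed = p[order] <= alpha * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.nonzero(passed)[0].max()     # largest rank meeting its threshold
        reject[order[:k + 1]] = True
    return reject


def call_hits(scores, controls, alpha=0.05):
    """scores: (n_compounds, n_doses, 6) PCA scores of treated wells.
    controls: (n_controls, 6) PCA scores of control wells."""
    centre = controls.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(controls, rowvar=False))
    diff = scores - centre
    d2 = np.einsum("cdi,ij,cdj->cd", diff, cov_inv, diff)   # squared distances
    pvals = np.vectorize(chi2_sf_even_df)(d2)
    significant = benjamini_hochberg(pvals.ravel(), alpha).reshape(pvals.shape)
    return d2, significant.any(axis=1)      # hit if any dose is significant


# Simulated example: 310 control wells, 2 compounds at 8 doses.
rng = np.random.default_rng(1)
controls = rng.normal(size=(310, 6))
scores = np.zeros((2, 8, 6))
scores[1, -1] = 10.0                        # compound 1 responds at the top dose

d2, hits = call_hits(scores, controls)
print(hits)
```

In this toy example only the second compound is flagged, because it is the only one with a dose far from the control cloud.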

SLIDE 15

Distinguishing distinct phenotypes

[Figure: cell images for Buffer and Compounds A to G]

  • Homogeneous nuclei and cell shape
  • Stabilised cell-cell junctions, results in ‘clumpy’ cells
  • No single cells
  • Aneuploidy, big nuclei
  • Increased cell size

SLIDE 16

Acknowledgements

  • Discovery Statistics
      • Chris Harbron
  • Advanced Science and Technology Laboratory
      • Ed Ainscow
      • Neil Carragher
      • Andy Hargreaves
      • Mike Sullivan
      • Helen Garside
      • James Pilling
      • Lisa Rice
      • Tom Houslay
      • Peter Caie
      • Alex Ingleston-Orme