Cross-organism prediction of drug hepatotoxicity by sparse group factor analysis


slide-1
SLIDE 1

Cross-organism prediction of drug hepatotoxicity by sparse group factor analysis

Tommi Suvitaival Juuso A. Parkkinen Seppo Virtanen Samuel Kaski

July 19-20, 2013 – CAMDA

slide-2
SLIDE 2

Starting point

High-dimensional gene-expression data from 3 types of organisms

Sparse pathological data of rat in vivo

[Figure: the 3 gene-expression views (1: Human in vitro, 2: Rat in vitro, 3: Rat in vivo), each with treatments as rows.]

[Figure: matrix of treatments by finding types, each cell marked "found" or "not found". Finding types: Necrosis; Increased mitosis; Cellular infiltration; Change, eosinophilic; Microgranuloma; Hypertrophy; Single cell necrosis; Swelling; Vacuolization, cytoplasmic; Deposit, glycogen; DEAD; Degeneration, granular, eosinophilic; Edema; Proliferation; Change, basophilic; Anisonucleosis; Cellular infiltration, mononuclear cell; Proliferation, Kupffer cell; Nodule, hepatodiaphragmatic; Degeneration, acidophilic, eosinophilic; Atypia, nuclear; Deposit, lipid; Change, acidophilic; Vacuolization, nuclear; Degeneration, hydropic; Hematopoiesis, extramedullary; Mineralization; Fibrosis; Ground glass appearance.]

slide-3
SLIDE 3

Starting point


  1. Can we replace the animal study with an in vitro assay?
  2. Can we predict liver injury in humans using toxicogenomics data from animals?

[Figure: component activity across the views Rat in vivo, Rat in vitro and Human in vitro.]

slide-4
SLIDE 4

Group factor analysis (GFA)

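[Figure: GFA decomposition, observed data ≈ latent variables × factor loadings. The observed data and the factor loadings are split into views 1, 2 and 3; treatments index the rows, components index the latent dimensions. Loadings are either real numbers or zero, so a component is active in A) all views, B) a subset of views, or C) a single view.]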
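The A/B/C distinction in the figure can be read directly off the loading blocks. The following is a minimal sketch, assuming each view's loadings are stored as a list of K rows (one per component); the function name, the view names and the toy numbers are hypothetical and not taken from the talk.

```python
def component_activity(loadings_by_view, tol=0.0):
    """Label each component by the views in which it has nonzero loadings.

    loadings_by_view maps a view name to its K x D_m loading block,
    stored as a list of K rows (assumed layout, one row per component).
    """
    views = list(loadings_by_view)
    n_components = len(loadings_by_view[views[0]])
    labels = []
    for k in range(n_components):
        # A component is active in a view if any loading in its row is nonzero.
        active = [v for v in views
                  if any(abs(w) > tol for w in loadings_by_view[v][k])]
        if len(active) == len(views):
            kind = "A: all views"
        elif len(active) > 1:
            kind = "B: subset of views"
        elif len(active) == 1:
            kind = "C: single view"
        else:
            kind = "inactive"
        labels.append((kind, active))
    return labels


# Toy loadings for 3 views and 3 components (illustrative values only).
toy_loadings = {
    "human in vitro": [[0.5, -1.0], [0.0, 0.0], [0.0, 0.0]],
    "rat in vitro":   [[1.2, 0.3],  [0.7, 0.0], [0.0, 0.0]],
    "rat in vivo":    [[-0.4, 0.9], [0.2, 0.1], [1.5, 0.0]],
}
activity = component_activity(toy_loadings)
```

In this toy case the first component is shared by all views, the second only by the two rat views, and the third is specific to rat in vivo.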

slide-5
SLIDE 5

Making generalizations across organisms

[Figure: component activity across the views Rat in vivo, Rat in vitro and Human in vitro.]

Shared components

◮ associations between views
◮ cross-view prediction
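Cross-view prediction through shared components can be sketched as follows. This is a simplified illustration, not the authors' inference procedure: it assumes point estimates of the loadings and of the noise precision are already available, uses the standard Gaussian conditional mean of the latent variables given one view, and then maps the latent variables to the target view. All function and variable names are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def predict_view(x_source, w_source, w_target, tau_source):
    """Predict the target view of one sample from its source view.

    w_source (K x D_s) and w_target (K x D_t) are loading blocks stored
    as lists of K rows; tau_source is the source-view noise precision.
    """
    k = len(w_source)
    # Posterior precision of z given the source view: I + tau * W_s W_s^T.
    precision = [[(1.0 if a == b else 0.0)
                  + tau_source * sum(p * q for p, q in zip(w_source[a], w_source[b]))
                  for b in range(k)] for a in range(k)]
    # Right-hand side: tau * W_s x^T.
    rhs = [tau_source * sum(w * x for w, x in zip(w_source[a], x_source))
           for a in range(k)]
    z = solve(precision, rhs)  # posterior mean of the latent variables
    d_target = len(w_target[0])
    return [sum(z[a] * w_target[a][d] for a in range(k)) for d in range(d_target)]


# Toy check: a noise-free source sample generated from z0 = [0.5, -1.0].
w_s = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
w_t = [[2.0, 0.0], [0.0, 3.0]]
x_s = [0.5, -1.0, -0.5]
prediction = predict_view(x_s, w_s, w_t, tau_source=1e6)
```

Only components with nonzero loadings in both the source and the target view carry information from one to the other, which is why the shared components matter for this task.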

slide-6
SLIDE 6

GFA with sparsity (1)

[Figure: GFA and GFA with sparsity, each drawn as observed data ≈ latent variables × factor loadings, with views 1 to 3, treatments and components. Legend: real numbers vs. zero; components active in A) all views, B) a subset of views, C) a single view.]

slide-7
SLIDE 7

GFA with and without sparsity

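[Figure: side-by-side decompositions (observed data ≈ latent variables × factor loadings) for GFA with and without sparsity.]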

slide-8
SLIDE 8

GFA with sparsity (2)

[Figure: GFA and GFA with sparsity, each drawn as observed data ≈ latent variables × factor loadings, with views 1 to 3, treatments and components. Legend: real numbers vs. zero; components active in A) all views, B) a subset of views, C) a single view.]

$$[X^{(1)}; X^{(2)}; X^{(3)}] \approx Z \, [W^{(1)}; W^{(2)}; W^{(3)}]$$

slide-9
SLIDE 9

Sparsity – why

Sparsity in the model is encouraged due to

  1. High dimensionality of the gene expression microarray data sets
  2. Strong sparsity of the pathology data
     ⇒ Sparsity in terms of variables
  3. Treatments heterogeneous by their effects
     ⇒ Sparsity in terms of samples

slide-10
SLIDE 10

Sparsity – how

  1. Sparsity in terms of variables ⇒ Spike-and-slab prior for factor loadings matrix W
  2. Sparsity in terms of samples ⇒ Spike-and-slab prior for latent variables Z

[Figure: density of the spike-and-slab prior; axes: value (x), probability density (y).]
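To make the prior concrete, here is a minimal sketch of drawing from a spike-and-slab distribution: with some inclusion probability a value comes from a zero-mean Gaussian "slab", otherwise it is exactly zero (the "spike"). The function name and the parameter values are hypothetical and chosen only for illustration.

```python
import random


def sample_spike_and_slab(n, inclusion_prob, precision, rng):
    """Draw n values: N(0, 1/precision) with probability inclusion_prob, else exactly 0."""
    sd = precision ** -0.5  # standard deviation of the slab
    return [rng.gauss(0.0, sd) if rng.random() < inclusion_prob else 0.0
            for _ in range(n)]


rng = random.Random(0)
draws = sample_spike_and_slab(10000, inclusion_prob=0.2, precision=4.0, rng=rng)
# Most draws are exactly zero; the rest are spread around zero.
fraction_zero = sum(1 for v in draws if v == 0.0) / len(draws)
```

Applied to W, the exact zeros switch off individual variables; applied to Z, they switch off individual samples for a component.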

slide-11
SLIDE 11

GFA – model

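[Figure: GFA decomposition, observed data ≈ latent variables × factor loadings, with views 1 to 3, treatments and components. Legend: real numbers vs. zero; components active in A) all views, B) a subset of views, C) a single view.]

$$[X^{(1)}; X^{(2)}; X^{(3)}] \approx Z \, [W^{(1)}; W^{(2)}; W^{(3)}]$$

$$x^{(m)}_{i\cdot} \sim \mathcal{N}\left(z_{i\cdot} W^{(m)},\ \tau_m^{-1} I\right)$$

$$z_{i\cdot} \sim \mathcal{N}(0, I)$$

$$w^{(m)}_{k} \sim \mathcal{N}\left(0,\ \frac{1}{\alpha^{(m)}_{k}} I\right)$$

$$\alpha^{(m)}_{k} \sim \mathrm{Gamma}\left(a^{(\alpha)}, b^{(\alpha)}\right)$$

$$\tau_m \sim \mathrm{Gamma}\left(a^{(\tau)}, b^{(\tau)}\right)$$

[Figure: plate diagram of the model with nodes a(τ), b(τ), a(α), b(α), τ, x(m), W(m), α(m) and z_i·, and plates over m = 1...M and i = 1...N.]

i: samples, m: views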
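The generative process can be sketched by sampling from it directly. This is an illustration under stated assumptions, not the authors' code: Gamma(a, b) is read as shape a and rate b (the slide does not say which parameterization is used), the loadings of view m are stored as K rows, and all names and hyperparameter values are hypothetical.

```python
import random


def sample_gfa(n_samples, view_dims, n_components,
               a_alpha, b_alpha, a_tau, b_tau, rng):
    """Sample latent variables and one data matrix per view from the GFA model."""
    # z_i ~ N(0, I), shared by all views.
    z = [[rng.gauss(0.0, 1.0) for _ in range(n_components)]
         for _ in range(n_samples)]
    views = []
    for d_m in view_dims:
        # random.gammavariate takes shape and SCALE, so scale = 1 / rate (assumption).
        alpha = [rng.gammavariate(a_alpha, 1.0 / b_alpha) for _ in range(n_components)]
        tau = rng.gammavariate(a_tau, 1.0 / b_tau)
        # Row k of W^(m) ~ N(0, I / alpha_k^(m)): a large alpha shrinks the
        # component towards zero in this view.
        w = [[rng.gauss(0.0, alpha[k] ** -0.5) for _ in range(d_m)]
             for k in range(n_components)]
        # x_i^(m) ~ N(z_i W^(m), I / tau_m).
        x = [[sum(z[i][k] * w[k][d] for k in range(n_components))
              + rng.gauss(0.0, tau ** -0.5)
              for d in range(d_m)]
             for i in range(n_samples)]
        views.append({"alpha": alpha, "tau": tau, "W": w, "X": x})
    return z, views


z_toy, views_toy = sample_gfa(n_samples=5, view_dims=[4, 3, 2], n_components=2,
                              a_alpha=1.0, b_alpha=1.0, a_tau=1.0, b_tau=1.0,
                              rng=random.Random(0))
```

The same latent matrix Z generates every view, which is what ties the views together.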

slide-12
SLIDE 12

GFA with sparsity – model

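[Figure: GFA and GFA with sparsity, each drawn as observed data ≈ latent variables × factor loadings, with views 1 to 3, treatments and components. Legend: real numbers vs. zero; components active in A) all views, B) a subset of views, C) a single view.]

$$[X^{(1)}; X^{(2)}; X^{(3)}] \approx Z \, [W^{(1)}; W^{(2)}; W^{(3)}]$$

GFA:

$$x^{(m)}_{i\cdot} \sim \mathcal{N}\left(z_{i\cdot} W^{(m)},\ \tau_m^{-1} I\right)$$

$$z_{i\cdot} \sim \mathcal{N}(0, I)$$

$$w^{(m)}_{k} \sim \mathcal{N}\left(0,\ \frac{1}{\alpha^{(m)}_{k}} I\right)$$

GFA with sparsity:

$$x^{(m)}_{i\cdot} \sim \mathcal{N}\left(z_{i\cdot} W^{(m)},\ \left(\Lambda^{(m)}\right)^{-1}\right)$$

$$z_{ik} \sim H^{(z)}_{k}\, \mathcal{N}\left(0,\ \frac{1}{\alpha^{(z)}_{ik}}\right) + \left(1 - H^{(z)}_{k}\right) \delta_0$$

$$W^{(m)}_{dk} \sim H^{(m)}_{dk}\, \mathcal{N}\left(0,\ \frac{1}{\alpha^{(m)}_{dk}}\right) + \left(1 - H^{(m)}_{dk}\right) \delta_0$$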
slide-13
SLIDE 13

Data representation – gene expression

◮ Treatments that occur in all 3 types of organism:
  ◮ 119 compounds
  ◮ dosage levels middle & high
  ◮ time points 8/9 h & 24 h
◮ Average differential expression over the replicates of each treatment

⇒ Treatment = sample for the model
⇒ Matching treatments between the 3 transcriptomic views $X_{\text{human in vitro}}$, $X_{\text{rat in vitro}}$ and $X_{\text{rat in vivo}}$

[Figure: the 3 gene-expression views (1: Human in vitro, 2: Rat in vitro, 3: Rat in vivo), each with treatments as rows.]
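The two preprocessing steps above, averaging over replicates and matching treatments across views, can be sketched as follows. The record layout, the function names and the toy values are hypothetical; the time-point label is assumed to be harmonized already (for example "8/9 h") so that treatments are comparable between views.

```python
from collections import defaultdict


def average_over_replicates(records):
    """Average expression vectors over the replicates of each treatment.

    records: iterable of (compound, dose, time, expression_vector).
    Returns a dict mapping treatment = (compound, dose, time) to the mean vector.
    """
    groups = defaultdict(list)
    for compound, dose, time, values in records:
        groups[(compound, dose, time)].append(values)
    return {treatment: [sum(col) / len(reps) for col in zip(*reps)]
            for treatment, reps in groups.items()}


def match_treatments(*views):
    """Keep only the treatments present in every view, in a common row order."""
    shared = sorted(set.intersection(*(set(v) for v in views)))
    return shared, [[v[t] for t in shared] for v in views]


# Toy records (illustrative values only).
rat_in_vivo = average_over_replicates([
    ("cmpdA", "high", "24 h", [1.0, 3.0]),
    ("cmpdA", "high", "24 h", [3.0, 5.0]),
    ("cmpdB", "middle", "8/9 h", [0.0, 2.0]),
])
rat_in_vitro = average_over_replicates([
    ("cmpdA", "high", "24 h", [0.5, 0.5]),
])
shared, matrices = match_treatments(rat_in_vivo, rat_in_vitro)
```

After matching, row i of every view refers to the same treatment, which is what lets one latent vector per treatment explain all views.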

slide-14
SLIDE 14

Data representation – histopathology of the liver

Grade-weighted count of each pathological finding type over the replicates of a treatment

⇒ Pathology view $Y_{\text{rat in vivo}}$ with matching treatments to the 3 transcriptomic views

[Figure: matrix of treatments by finding types, each cell marked "found" or "not found", over the same 29 finding types from Necrosis to Ground glass appearance.]
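A grade-weighted count can be sketched as below. The slide does not state the grade scale or the weights, so the mapping from grade to weight here is purely an assumption for illustration, as are the function name and the toy findings.

```python
# Hypothetical grade weights; the talk does not specify the actual values.
GRADE_WEIGHTS = {"minimal": 1, "slight": 2, "moderate": 3, "severe": 4}


def grade_weighted_counts(findings, treatments, finding_types):
    """Sum grade weights per (treatment, finding type) over all replicates.

    findings: list of (treatment, finding_type, grade), one entry per
    finding observed in one replicate animal.
    Returns a treatments x finding_types matrix as a list of rows.
    """
    index = {t: i for i, t in enumerate(treatments)}
    column = {f: j for j, f in enumerate(finding_types)}
    # Start from all zeros: most treatments show most findings in no replicate.
    y = [[0] * len(finding_types) for _ in treatments]
    for treatment, finding_type, grade in findings:
        y[index[treatment]][column[finding_type]] += GRADE_WEIGHTS[grade]
    return y


toy_findings = [
    ("cmpdA high 24 h", "Necrosis", "slight"),
    ("cmpdA high 24 h", "Necrosis", "severe"),
    ("cmpdA high 24 h", "Hypertrophy", "minimal"),
]
y_toy = grade_weighted_counts(
    toy_findings,
    treatments=["cmpdA high 24 h", "cmpdB middle 24 h"],
    finding_types=["Necrosis", "Hypertrophy", "Fibrosis"],
)
```

The resulting matrix is mostly zeros, which matches the sparse pathology data described at the start of the talk.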

slide-15
SLIDE 15

Results

Our tasks:

  1. Predict liver damage of rats in vivo based on cell-level transcriptomic responses in the 3 types of model organisms
  2. Test how well the transcriptomic cell-level responses generalize to known effects of the compounds on humans

slide-16
SLIDE 16

Analysis: model organisms’ generalizability to organ level

Training: Learn associations between the views

◮ 3 transcriptomic views $X_{\text{human in vitro}}$, $X_{\text{rat in vitro}}$ and $X_{\text{rat in vivo}}$
◮ Pathology view $Y_{\text{rat in vivo}}$

Testing: Predict the pathological findings $Y_{\text{rat in vivo}}$

◮ Given one of the transcriptomic views

slide-17
SLIDE 17

Analysis: model organisms’ generalizability to organ level


[Figure: Relative performance of the gene expression views at predicting pathological findings. Axis: proportion of test samples predicted more accurately than by other views (0.0 to 1.0). Finding types shown: Hypertrophy; Swelling; Nodule, hepatodiaphragmatic; Microgranuloma; Necrosis; Single cell necrosis; Cellular infiltration; Vacuolization, cytoplasmic. Views compared: Human in vitro, Rat in vitro, Rat in vivo.]
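The measure on the axis can be computed as sketched below for one finding type. This is one plausible reading of the axis label, assuming absolute prediction errors per test sample and counting a sample for a view only when that view is strictly best; the tie rule, the function name and the toy errors are assumptions.

```python
def proportion_best(errors_by_view):
    """Proportion of test samples that each view predicts more accurately than all other views.

    errors_by_view maps a view name to a list of absolute prediction
    errors, aligned so that position i is the same test sample in every view.
    """
    views = list(errors_by_view)
    n_samples = len(errors_by_view[views[0]])
    wins = dict.fromkeys(views, 0)
    for i in range(n_samples):
        errors = {v: errors_by_view[v][i] for v in views}
        best = min(errors.values())
        winners = [v for v in views if errors[v] == best]
        if len(winners) == 1:  # assumption: a tie counts for no view
            wins[winners[0]] += 1
    return {v: wins[v] / n_samples for v in views}


toy_errors = {
    "human in vitro": [0.5, 0.1, 0.3, 0.2],
    "rat in vitro":   [0.4, 0.2, 0.3, 0.3],
    "rat in vivo":    [0.1, 0.3, 0.3, 0.1],
}
proportions = proportion_best(toy_errors)
```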

slide-18
SLIDE 18

Sparsity in the target view

◮ $W^\top W$ reveals the similarity of component activities between the variables
◮ Thanks to sparsity, projections to many variables are 0
◮ The model automatically decides which variables to explain by
  A. coherent components
  B. noise parameter

[Figure: heatmap of $W^\top W$ for the pathology view, with the 29 finding types on both axes (from Change, basophilic to Single cell necrosis) and a color scale from −5 to 5.]
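A minimal sketch of the similarity matrix, assuming the loadings of the target view are stored as K rows (one per component) over D variables, so that $W^\top W$ is D × D. The function name and the toy loadings are hypothetical.

```python
def variable_similarity(w):
    """Return W^T W for a K x D loading matrix stored as a list of K rows."""
    n_variables = len(w[0])
    return [[sum(row[a] * row[b] for row in w) for b in range(n_variables)]
            for a in range(n_variables)]


# Two components, three variables; the third variable has only zero loadings.
w_toy = [[1.0, 2.0, 0.0],
         [0.0, 1.0, 0.0]]
similarity = variable_similarity(w_toy)
```

In this toy case the third row and column are entirely zero: no component projects to that variable, so its variation is left to the noise parameter.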

slide-19
SLIDE 19

Prediction: drug hepatotoxicity based on gene expression

◮ Given $X_{\text{rat in vivo}}$, predict $Y_{\text{rat in vivo}}$
◮ Same prediction task using ℓ1-regularized linear regression

[Figure: Performance of rat in vivo gene expression view at predicting pathological findings. Axis: root mean squared error (ticks at 2, 4, 6). Methods compared: GFA, L1. Finding types shown: Hypertrophy; Swelling; Nodule, hepatodiaphragmatic; Microgranuloma; Necrosis; Single cell necrosis; Cellular infiltration; Vacuolization, cytoplasmic.]
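For orientation, the baseline and the error measure can be sketched as follows. This is a generic ℓ1-regularized least-squares fit by coordinate descent for a single finding type, without an intercept; it is not necessarily the implementation or the regularization setting used in the talk, and the names and toy data are hypothetical.

```python
import math


def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold; return 0.0 inside [-threshold, threshold]."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(x, y, lam, n_iter=100):
    """Minimize (1 / 2n) * ||y - x w||^2 + lam * ||w||_1 over w."""
    n, d = len(x), len(x[0])
    w = [0.0] * d
    col_sq = [sum(x[i][j] ** 2 for i in range(n)) / n for j in range(d)]
    for _ in range(n_iter):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue  # an all-zero feature keeps weight 0
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(x[i][j] * (y[i] - sum(x[i][c] * w[c] for c in range(d) if c != j))
                      for i in range(n)) / n
            w[j] = soft_threshold(rho, lam) / col_sq[j]
    return w


def rmse(y_true, y_pred):
    """Root mean squared error between two equally long sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


# Toy data: only the first feature is related to the target.
x_toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y_toy_target = [2.0, 0.0, 2.0, 0.0]
w_hat = lasso_coordinate_descent(x_toy, y_toy_target, lam=0.1)
y_hat = [sum(xi * wi for xi, wi in zip(row, w_hat)) for row in x_toy]
error = rmse(y_toy_target, y_hat)
```

The penalty sets the weight of the unrelated feature to exactly zero and shrinks the other one slightly, which is the kind of variable-level sparsity the comparison is about.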

slide-20
SLIDE 20

Translation over model organisms to humans

◮ How do the transcriptional changes in model organisms generalize to system-level effects in humans?
◮ Can the model learn structure relevant to the properties of the compounds in an unsupervised way?

slide-21
SLIDE 21

Translation over model organisms to humans (1)

We quantify the success of translation by the retrieval of similar compounds

◮ Ground-truth:
  A. Anatomical Therapeutic Chemical (ATC) Classification System's labels (level 4)
  B. Drug-induced liver injury (DILI) labels
◮ Model: GFA with sparsity for the transcriptomic views of the model organisms, $X_{\text{human in vitro}}$, $X_{\text{rat in vitro}}$ and $X_{\text{rat in vivo}}$
◮ Measure: Average precision in the retrieval of similar compounds in the latent space
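The retrieval measure can be sketched as follows. Several details are assumptions rather than facts from the talk: neighbours are ranked by Euclidean distance between latent vectors, two compounds count as similar when they share a label, and average precision is normalized by the number of relevant compounds found among the k nearest neighbours. Names and toy data are hypothetical.

```python
import math


def average_precision_at_k(query, latent, labels, k):
    """Average precision of the k nearest neighbours of one query compound."""
    others = [c for c in latent if c != query]
    others.sort(key=lambda c: math.dist(latent[query], latent[c]))
    hits, precisions = 0, []
    for rank, compound in enumerate(others[:k], start=1):
        if labels[compound] == labels[query]:
            hits += 1
            precisions.append(hits / rank)  # precision at this relevant rank
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(latent, labels, k):
    """Mean of the average precision over all compounds used as queries."""
    return sum(average_precision_at_k(q, latent, labels, k) for q in latent) / len(latent)


# Toy latent positions and labels (illustrative only).
toy_latent = {"a": [0.0, 0.0], "b": [0.1, 0.0], "c": [1.0, 0.0], "d": [2.0, 0.0]}
toy_labels = {"a": "X", "b": "X", "c": "Y", "d": "X"}
ap_a = average_precision_at_k("a", toy_latent, toy_labels, k=3)
map_toy = mean_average_precision(toy_latent, toy_labels, k=3)
```

Here k plays the role of the neighbourhood size; a random baseline can be obtained by shuffling the labels and recomputing the same quantity.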

slide-22
SLIDE 22

Translation over model organisms to humans (2)


[Figure: two panels, ATC and DILI. x-axis: size of the neighborhood for retrieval, i.e. the number of nearest samples considered for average precision (5 to 20). y-axis: mean average precision. Methods compared: GFA, Random.]

slide-23
SLIDE 23

Conclusions

◮ GFA reveals associations between the views
◮ Associations indicate what generalizes between the views
◮ Sparsity helps in this decision
◮ Latent representation allows us to explore structure in the data in an unsupervised way

[Figure: GFA and GFA with sparsity, each drawn as observed data ≈ latent variables × factor loadings, with views 1 to 3, treatments and components. Legend: real numbers vs. zero; components active in A) all views, B) a subset of views, C) a single view.]

slide-24
SLIDE 24

Discussion

We can

◮ analyse the similarity of model organisms
◮ learn what generalizes from the model organisms to humans

[Figure: GFA and GFA with sparsity, each drawn as observed data ≈ latent variables × factor loadings, with views 1 to 3, treatments and components. Legend: real numbers vs. zero; components active in A) all views, B) a subset of views, C) a single view.]

Funding:

The Academy of Finland
◮ Finnish Centre of Excellence in Computational Inference Research COIN, 251170
◮ Computational Modeling of the Biological Effects of Chemicals, 140057

Finnish Doctoral Programme in Computational Sciences FICS

Helsinki Doctoral Programme in Computer Science