
SLIDE 1

VenomSeq

A platform for drug discovery from animal venoms using differential gene expression

Joseph D. Romano, Hai Li, Ronald Realubit, Charles Karan, and Nicholas Tatonetti
Columbia University

Presented 03/14/2018, AMIA 2018 Informatics Summit

SLIDE 2

Learning objectives

1. Understand the connection between animal venoms and novel drug discovery
2. Conceptualize the VenomSeq methodology and how it is used to generate differential expression profiles of human cell lines on exposure to dilute animal venoms
3. Understand approaches to comparing novel differential gene expression profiles to expression profiles from public sources, and how to perform data quality assessment on the analysis results

SLIDE 3

Toxinology

SLIDE 4

VenomSeq

[Workflow diagram: IMR-32 cells + 25 venoms; PLATE-Seq; VIPER; DESeq2; expression profiles. LINCS data; LINCS profiles. The comparison between the two sets of profiles is marked "?"]

SLIDE 5

VenomSeq in context

[Workflow diagram: VenomSeq (IMR-32 cells + 25 venoms; PLATE-Seq; VIPER; DESeq2; expression profiles), LINCS data and LINCS profiles, with the comparison marked "?"; shown alongside VenomKB and its sources (Tox-Prot, NCBI, PubMed, GO)]

SLIDE 6

PLATE-Seq

Technology developed in the Califano and Sims labs; reduces the cost of RNA-Seq by approximately tenfold
Main idea: barcode individual samples, pool them, and sequence them together

(from Bush 2017)
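
The barcode, pool, and sequence-together idea can be illustrated with a toy sketch. This is not the PLATE-Seq protocol itself: the 4-base barcode prefix, the well names, and the reads are all made up for illustration, and real barcode design and read layout are not described on this slide.

```python
from collections import defaultdict

BARCODE_LEN = 4  # hypothetical barcode length, chosen only for this toy example


def pool_samples(reads_by_barcode):
    """Prefix every read with its sample barcode and merge into one pool."""
    return [barcode + read
            for barcode, reads in reads_by_barcode.items()
            for read in reads]


def demultiplex(pooled_reads, barcode_to_sample):
    """Split pooled reads back into per-sample lists using the barcode prefix."""
    reads_by_sample = defaultdict(list)
    for read in pooled_reads:
        barcode, insert = read[:BARCODE_LEN], read[BARCODE_LEN:]
        sample = barcode_to_sample.get(barcode)
        if sample is not None:  # reads with an unknown barcode are dropped
            reads_by_sample[sample].append(insert)
    return dict(reads_by_sample)


# Made-up wells, barcodes, and reads.
barcode_to_sample = {"ACGT": "well_A1", "TTAG": "well_A2"}
pooled = pool_samples({"ACGT": ["GGGCCC", "AAATTT"], "TTAG": ["CCCGGG"]})
per_sample = demultiplex(pooled, barcode_to_sample)
```

Because every read carries its sample label, many wells can share one sequencing run and still be separated afterwards, which is where the cost saving comes from.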

SLIDE 7

VIPER: Virtual inference of protein activity

PLATE-Seq operates at relatively low depth (0.5-2 M reads/sample)
VIPER (with the aREA-3T algorithm) uses network inference to recover accuracy and improve statistical power

(from Alvarez 2016)

[Figure legend: expression signatures (yellow), VIPER (cyan)]

SLIDE 8

Obtaining venoms and cells

[Figure labels: Snakes; Spiders; Scorpions; Other arthropods; Fish / molluscs; Other reptiles; Fish]

SLIDE 9

Determining venom dosages

SLIDE 10

Determining venom dosages

Create 9 serial dilutions of venoms (1000 µg/ml to ~3.9 µg/ml)
Perform viability assay (5x replicates)
Fit Hill slope equation to observed means:

$$y = \mathrm{Bottom} + \frac{\mathrm{Top} - \mathrm{Bottom}}{1 + 10^{(\log \mathrm{EC}_{50} - x)}}$$

Final dosage concentration: IC20
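
A minimal sketch of the dosing arithmetic, under stated assumptions: the two-fold dilution factor is inferred from the endpoints (1000 / 2^8 ≈ 3.9 µg/ml), x is taken to be log10 concentration, IC20 is read as the point 20% of the way from Bottom to Top, and the EC50 value in the example is hypothetical rather than a fitted result from the study.

```python
import math


def serial_dilutions(top_conc=1000.0, n_points=9, factor=2.0):
    """Concentrations (µg/ml) for an n-point serial dilution (factor is assumed)."""
    return [top_conc / factor ** i for i in range(n_points)]


def hill(x, bottom, top, log_ec50):
    """Response at log10 concentration x, using the equation on the slide."""
    return bottom + (top - bottom) / (1.0 + 10.0 ** (log_ec50 - x))


def log_conc_at_fraction(fraction, log_ec50):
    """Invert the curve: log10 concentration at a given fraction of (Top - Bottom)."""
    return log_ec50 - math.log10(1.0 / fraction - 1.0)


concentrations = serial_dilutions()

# Hypothetical fitted value: EC50 = 100 µg/ml, response scaled from 0 to 100.
log_ec50 = math.log10(100.0)
ic20 = 10.0 ** log_conc_at_fraction(0.20, log_ec50)
```

With a slope of 1, as in the equation above, the 20% point always sits at one quarter of the EC50; a fitted slope parameter would change that ratio.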

SLIDE 11

Sample preparation

1. Grow IMR-32 cells to 80% confluence
2. Seed 96-well plates; incubate 2 days
3. Reconstitute lyophilized venoms in ddH2O at 10X IC20
4. Add dissolved venoms to wells to 1X final concentration

SLIDE 12

Recap: Experimental conditions

Venoms: 25 species
Cell line: IMR-32 (human neuroblastoma)
Dosage: IC20 for each venom
Time points: 6/24/36 hours post-treatment
Replicates: 3 per time point per venom
Controls: 12 water controls, 9 untreated
Solvent: Water
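
As a quick check on the scale of the experiment, the conditions above can be enumerated. This sketch assumes a fully crossed design (every venom at every time point) and that the control counts are overall totals; the slide does not state either, and the venom labels are placeholders.

```python
from itertools import product

N_VENOMS = 25
TIME_POINTS_H = (6, 24, 36)
REPLICATES = 3
WATER_CONTROLS = 12
UNTREATED_CONTROLS = 9

# One tuple per treated well: (venom label, hours post-treatment, replicate).
treated = [
    (f"venom_{venom:02d}", hours, replicate)
    for venom, hours, replicate in product(
        range(1, N_VENOMS + 1), TIME_POINTS_H, range(1, REPLICATES + 1)
    )
]

total_samples = len(treated) + WATER_CONTROLS + UNTREATED_CONTROLS
```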

SLIDE 13

Analyzing VenomSeq data

[Workflow diagram: IMR-32 cells + 25 venoms; PLATE-Seq; VIPER; DESeq2; expression profiles. LINCS data; LINCS profiles. The comparison between the two sets of profiles is marked "?"]

How do we do this?

SLIDE 14

Data-driven discovery from gene expression data

Public compendia contain millions of expression profiles/signatures
Instead of posing a single hypothesis, can we let the data suggest new hypotheses?

Auxiliary areas of focus:
Cross-platform data normalization
Multiple hypothesis testing correction
Dataset annotation techniques
Subsequent preclinical validation

(Sirota 2011) (Chen 2016)
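
Multiple hypothesis testing correction matters here because every gene is tested at once. One common choice is the Benjamini-Hochberg procedure; the sketch below is a plain-Python illustration with made-up p-values, and is not taken from the VenomSeq analysis code.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is capped
    # by the adjusted values of all larger p-values (enforces monotonicity).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values for four hypothetical genes.
padj = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```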

SLIDE 15

Quality Control

Results

SLIDE 16

VenomSeq expression profiles/signatures

Raw count matrix (all samples)

Gene        N. nivea 6h   P. fasciata 6h   R. marina 6h   P. volitans 6h   …
CHTOP       148           96               133            60               …
SNAPIN      72            17               17             13               …
ILF2        806           447              496            417              …
NPR1        0             …
INTS3       12            12               10             12               …
SLC27A3     18            14               12             1                …
LOC343052   0             …
GATAD2B     295           117              446            220              …
DENND4B     80            28               37             38               …
CRTC2       43            17               35             18               …
SLC39A1     328           241              373            236              …
…           …             …                …              …

Significant differentially expressed genes (Naja nivea)

Gene      baseMean   log2FC   lfcSE   t-statistic   p-value    p-adj
S100A10   30.32      2.04     0.52    3.90          9.66E-05   0.034
S100A11   199.65     1.94     0.48    4.03          5.57E-05   0.027
PFDN2     1508.04    0.57     0.14    4.14          3.40E-05   0.024
SELL      14.77      2.54     0.54    4.73          2.23E-06   0.004
KIAA      2.88       2.98     0.73    4.06          4.82E-05   0.027
SYT2      9.11       2.59     0.60    4.32          1.59E-05   0.016
…         …          …        …       …             …          …

SLIDE 17

Manual review of Apis mellifera expression signature

HUGO symbol   Gene ID   Log2FC   Description
ZNF609        23060     1.260    Involved in myoblast proliferation
UBE2G2        7327      0.720    Conjugates ubiquitin to mark proteins for degradation in the endoplasmic reticulum
PSMA4         5685      0.528    Encodes a core subunit of the proteasome complex
CKAP2         26586     0.655    Stabilizes microtubules and regulates p53-mediated cell division


SLIDE 18

Matching VenomSeq to CMap data

Approach: Using Connectivity Map data, find known perturbagens that produce the most similar differential expression changes
Technique: Use cosine distance to compare signatures consisting of the top significant over- and under-expressed genes (e.g., Duan 2016)
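
A minimal sketch of the matching step, assuming each signature is a mapping from gene symbol to log2 fold change. Selecting top genes by fold change alone (rather than by significance), the gene names, and the perturbagen names are simplifications made for illustration, not the pipeline used in the study.

```python
import math


def top_genes_signature(log2fc, n_top):
    """Keep the n_top most under-expressed and n_top most over-expressed genes."""
    ranked = sorted(log2fc.items(), key=lambda item: item[1])
    return dict(ranked[:n_top] + ranked[-n_top:])


def cosine_distance(sig_a, sig_b):
    """1 - cosine similarity; genes missing from a signature count as 0."""
    genes = set(sig_a) | set(sig_b)
    dot = sum(sig_a.get(g, 0.0) * sig_b.get(g, 0.0) for g in genes)
    norm_a = math.sqrt(sum(v * v for v in sig_a.values()))
    norm_b = math.sqrt(sum(v * v for v in sig_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for an all-zero signature")
    return 1.0 - dot / (norm_a * norm_b)


def most_similar(query, references):
    """Name of the reference signature with the smallest cosine distance."""
    return min(references, key=lambda name: cosine_distance(query, references[name]))


# Made-up signatures: one venom query and two reference perturbagens.
venom = {"GENE_A": 2.0, "GENE_B": -1.5}
references = {
    "perturbagen_1": {"GENE_A": 1.0, "GENE_B": -0.75},  # same direction as the venom
    "perturbagen_2": {"GENE_A": -2.0, "GENE_B": 1.5},   # opposite direction
}
best_match = most_similar(venom, references)
```

Cosine distance ignores overall magnitude, so a perturbagen that shifts the same genes in the same directions scores as close even when its effect sizes are smaller.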

Venom                     Most similar perturbagen
Naja nivea                cyclosporine
Montivipera xanthina      vorinostat
Crotalus scutulatus       narciclasine
Atractaspis sp.           benzoxiquine
Argiope lobata            linifanib
Leiurus quinquestriatus   hycanthone
Bombina variegata         salermide
Conus imperialis          prednisolone acetate
Octopus macropus          tetraethylthiuram disulfide


SLIDE 19

Same thing, with VIPER-inferred protein activity

Venom                     Most similar perturbagen      Most similar perturbagen    Pharmacologic activity /
                          (raw expression)              (VIPER-inferred activity)   mechanism of action
Naja nivea                Cyclosporine                  Ki 8751                     EGFR inhibitor
Montivipera xanthina      Vorinostat                    Geldanamycin                Hsp90 inhibitor
Crotalus scutulatus       Narciclasine                  Apitolisib                  MTOR inhibitor
Atractaspis sp.           Benzoxiquine                  NCGC00182361-01             Cytoprotection in Huntington model
Argiope lobata            Linifanib                     AT7867                      AKT inhibitor
Leiurus quinquestriatus   Hycanthone                    Homoharringtonine           Protein translation inhibitor
Bombina variegata         Salermide                     Pevonedistat                NAE inhibitor
Conus imperialis          Prednisolone acetate          Niridazole                  Antiparasitic
Octopus macropus          Tetraethylthiuram disulfide   PLX-4720                    BRAF inhibitor

(red text: anticancer activity)


SLIDE 21

The problem with venom expression profiles

Venoms often consist of hundreds of enzymes. How do we isolate the important parts of the signal?
We still only have 25 venom expression profiles! Far too few for standalone ML/DL.

Deep Learning approach:
Use a combination of stacked autoencoders and variational autoencoders to learn clinically actionable features
Transfer learning: train models on known drug/disease expression signatures and predict on VenomSeq signatures

http://venomkb.org/S6266294

SLIDE 22

Other challenges

Eventually, we will have to run VenomSeq on every human cell line
IMR-32 isn’t well represented in public compendia
Only a limited number of cell lines have regulon networks for VIPER
It will be important to predict toxic/harmful effects of venom components on the human body
Regularization, modeling assumptions, and evaluation for ML/DL comparisons will be crucial for the success of these analyses

SLIDE 23

Future work

Carry out VenomSeq analysis via data-driven comparisons to public expression profiles
Apply traditional methods as well as deep learning methods that are robust to noise
Perform in vivo preclinical validation
Construct structured representation of VenomSeq data and findings
Merge VenomSeq data and findings into VenomKB
Using ontological reasoning, generate novel hypotheses from VenomSeq results

[Figure labels: VenomSeq encoded profiles; drug/anti-disease encoded profiles; encoded dimensions; hypothesis generation; drug/disease associations; select promising association; preclinical validation]

SLIDE 24

Future work

Example from api.clue.io

SLIDE 27

Bibliography

Romano, Joseph D., and Nicholas P. Tatonetti. "VenomKB, a new knowledge base for facilitating the validation of putative venom therapies." Scientific data 2 (2015): 150065.

Romano, Joseph D., and Nicholas P. Tatonetti. "Using a Novel Ontology to Inform the Discovery of Therapeutic Peptides from Animal Venoms." AMIA Summits on Translational Science Proceedings 2016 (2016): 209.

Bush, Erin C., Forest Ray, Mariano J. Alvarez, Ronald Realubit, Hai Li, Charles Karan, Andrea Califano, and Peter A. Sims. "PLATE-Seq for genome-wide regulatory network analysis of high-throughput screens." Nature communications 8, no. 1 (2017): 105.

Chen, B., and A. J. Butte. "Leveraging big data to transform target selection and drug discovery." Clinical Pharmacology & Therapeutics 99, no. 3 (2016): 285-297.

Sirota, Marina, Joel T. Dudley, Jeewon Kim, Annie P. Chiang, Alex A. Morgan, Alejandro Sweet-Cordero, Julien Sage, and Atul J. Butte. "Discovery and preclinical validation of drug indications using compendia of public gene expression data." Science translational medicine 3, no. 96 (2011): 96ra77.

Duan, Qiaonan, St Patrick Reid, Neil R. Clark, Zichen Wang, Nicolas F. Fernandez, Andrew D. Rouillard, Ben Readhead et al. "L1000CDS2: LINCS L1000 characteristic direction signatures search engine." NPJ systems biology and applications 2 (2016): 16015.

Subramanian, Aravind, Rajiv Narayan, Steven M. Corsello, David D. Peck, Ted E. Natoli, Xiaodong Lu, Joshua Gould et al. "A next generation connectivity map: L1000 platform and the first 1,000,000 profiles." Cell 171, no. 6 (2017): 1437-1452.

SLIDE 28

Acknowledgements

Dissertation Advisors
Chunhua Weng, PhD, FACMI
Adler Perotte, MD, MA
Nicholas Tatonetti, PhD*
Tatonetti Lab
Rami Vanguri, PhD
Theresa Kolek, PhD
Yun Hao, MA
Phyllis Thangaraj
Alexandre Yahi
Fernanda Polubriaginof, MD
Nick Giangreco
Jenna Kefeli
Jing Ai
Katie LaRow
Deidre Gregory
Collaborators
Charles Karan, PhD*
Ronald Realubit*
Hai Li, PhD*
Peter Sims, PhD
Victor Nwankwo, MD

SLIDE 29

Thank you!

For any questions or comments: jdr2160@columbia.edu
