VenomSeq: A platform for drug discovery from animal venoms using differential gene expression. Joseph D. Romano, Hai Li, Ronald Realubit, Charles Karan, and Nicholas Tatonetti, Columbia University. Presented 03/14/2018, AMIA 2018.


SLIDE 1

VenomSeq

A platform for drug discovery from animal venoms using differential gene expression

Joseph D. Romano, Hai Li, Ronald Realubit, Charles Karan, and Nicholas Tatonetti
Columbia University

presented 03/14/2018
AMIA 2018 Informatics Summit

SLIDE 2

Learning objectives

  • 1. Understand the connection between animal venoms and novel drug discovery
  • 2. Conceptualize the VenomSeq methodology and how it is used to generate differential expression profiles of human cell lines on exposure to dilute animal venoms
  • 3. Understand approaches to comparing novel differential gene expression profiles to expression profiles from public sources, and how to perform data quality assessment on the analysis results

SLIDE 3

Toxinology

SLIDE 4

VenomSeq

[Figure: VenomSeq workflow diagram. Labels: IMR-32 cells + 25 venoms; PLATE-Seq; VIPER; DESeq2; expression profiles; LINCS data; LINCS profiles; "?"]

SLIDE 5

VenomSeq in context

[Figure: VenomSeq workflow diagram with VenomKB. Labels: IMR-32 cells + 25 venoms; PLATE-Seq; VIPER; DESeq2; expression profiles; LINCS data; LINCS profiles; "?"; VenomKB, with sources Tox-Prot, NCBI, PubMed, and GO]

SLIDE 6

PLATE-Seq

  • Technology developed in the Califano and Sims labs; reduces cost of RNA-Seq by approximately tenfold
  • Main idea: Barcode individual samples, pool, sequence together

(from Bush 2017)
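
The barcode, pool, and sequence-together idea can be illustrated with a toy demultiplexing step. This is a minimal sketch and not the PLATE-Seq pipeline from Bush 2017: the 4-base barcode length, the well names, the example reads, and the `demultiplex` helper are all hypothetical, and a real protocol would also have to tolerate sequencing errors in the barcode.

```python
from collections import defaultdict

# Hypothetical well barcodes (assumption: 4 bases, exact match only).
BARCODES = {"ACGT": "well_A1", "TTAG": "well_A2", "GGCA": "well_A3"}
BARCODE_LEN = 4


def demultiplex(pooled_reads):
    """Assign each pooled read to its source well using its leading barcode."""
    by_well = defaultdict(list)
    for read in pooled_reads:
        barcode, insert = read[:BARCODE_LEN], read[BARCODE_LEN:]
        # Reads whose barcode is not recognized are kept apart, not discarded.
        well = BARCODES.get(barcode, "unassigned")
        by_well[well].append(insert)
    return dict(by_well)


# Made-up reads from a single pooled sequencing run.
pooled = ["ACGTTTTGACC", "TTAGCCATGGA", "ACGTGGGTTCA", "NNNNACGTACG"]
reads_by_well = demultiplex(pooled)
for well, inserts in sorted(reads_by_well.items()):
    print(well, len(inserts))
```

Because every sample carries its own barcode, the samples can share one library preparation and one sequencing run and still be separated afterwards.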

SLIDE 7

VIPER: Virtual inference of protein activity

  • PLATE-Seq operates at relatively low depth (0.5-2 M reads/sample)
  • VIPER (with the aREA-3T algorithm) uses network inference to recover accuracy and improve statistical power

(from Alvarez 2016)

[Figure legend: expression signatures (yellow), VIPER (cyan)]

SLIDE 8

Obtaining venoms and cells

[Figure labels: Snakes; Spiders; Scorpions; Other arthropods; Fish / molluscs; Other reptiles; Fish]

SLIDE 9

Determining venom dosages

SLIDE 10

Determining venom dosages

  • Create 9 serial dilutions of venoms (1000 µg/ml to ~3.9 µg/ml)
  • Perform viability assay (5x replicates)
  • Fit Hill slope equation to observed means:

$$y = \mathrm{Bottom} + (\mathrm{Top} - \mathrm{Bottom})\left(1 + 10^{(\log \mathrm{EC}_{50} - x)}\right)^{-1}$$

  • Final dosage concentration: IC20
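
The dosing arithmetic on this slide can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' fitting code. The two-fold dilution factor is inferred from the stated range (1000 / 2^8 ≈ 3.9). The explicit `hill_slope` parameter is an addition; with `hill_slope = 1` the function matches one reading of the slide's equation. IC20 is assumed to mean the concentration at which the response has dropped 20% of the way from Top to Bottom. The parameter values in the usage lines are made up, and the curve fitting itself is not shown.

```python
import math


def serial_dilutions(start=1000.0, n=9, factor=2.0):
    """Concentrations (µg/ml) of n serial dilutions, beginning at `start`."""
    return [start / factor ** i for i in range(n)]


def hill(x, bottom, top, log_ec50, hill_slope=1.0):
    """Four-parameter Hill curve; x is log10(concentration)."""
    return bottom + (top - bottom) / (1.0 + 10 ** ((log_ec50 - x) * hill_slope))


def inhibitory_concentration(percent, log_ec50, hill_slope):
    """Concentration at which `percent` % of the Top-to-Bottom range is lost.

    Assumed definition of ICp; it describes inhibition when hill_slope < 0,
    i.e. when the response falls as the dose rises.
    """
    remaining = 1.0 - percent / 100.0  # fraction of (Top - Bottom) still present
    log_conc = log_ec50 - math.log10(1.0 / remaining - 1.0) / hill_slope
    return 10 ** log_conc


doses = serial_dilutions()
# Made-up fitted parameters: EC50 = 100 µg/ml, slope = -1.
ic20 = inhibitory_concentration(20, log_ec50=2.0, hill_slope=-1.0)
print(round(doses[-1], 2), round(ic20, 2))
```

With a slope of -1 the IC20 is EC50 / 4, which makes it easy to check by hand.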

SLIDE 11

Sample preparation

  • 1. Grow IMR-32 cells to 80% confluence
  • 2. Seed 96-well plates; incubate 2 days
  • 3. Reconstitute lyophilized venoms in ddH2O at 10X IC20
  • 4. Add dissolved venoms to wells to 1X final concentration

SLIDE 12

Recap: Experimental conditions

Venoms        25 species
Cell line     IMR-32 (Human neuroblastoma)
Dosage        IC20 for each venom
Time points   6/24/36 hours post-treatment
Replicates    3 per time point per venom
Controls      12 water controls, 9 untreated
Solvent       Water
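
As a quick check of the scale of the experiment, the conditions above can be enumerated. This is a sketch with two assumptions that the slide does not state: the control counts are totals rather than per-time-point numbers, and each sample occupies one well of a 96-well plate. The venom names are placeholders.

```python
import math
from itertools import product

venoms = [f"venom_{i:02d}" for i in range(1, 26)]  # 25 species, placeholder names
time_points_h = [6, 24, 36]
replicates = [1, 2, 3]

# One treated sample per (venom, time point, replicate) combination.
treated = list(product(venoms, time_points_h, replicates))

n_controls = 12 + 9  # water + untreated (assumed to be totals)
n_samples = len(treated) + n_controls
plates_needed = math.ceil(n_samples / 96)  # assumes one sample per well

print(len(treated), n_samples, plates_needed)  # prints: 225 246 3
```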

SLIDE 13

Analyzing VenomSeq data

[Figure: VenomSeq workflow diagram. Labels: IMR-32 cells + 25 venoms; PLATE-Seq; VIPER; DESeq2; expression profiles; LINCS data; LINCS profiles; "?"]

How do we do this?

SLIDE 14

Data-driven discovery from gene expression data

  • Public compendia contain millions of expression profiles/signatures
  • Instead of posing a single hypothesis, can we let the data suggest new hypotheses?
  • Auxiliary areas of focus:
      • Cross-platform data normalization
      • Multiple hypothesis testing correction
      • Dataset annotation techniques
      • Subsequent preclinical validation

(Sirota 2011) (Chen 2016)

SLIDE 15

Quality Control

Results

SLIDE 16

VenomSeq expression profiles/signatures

Raw count matrix (all samples)

Gene        N. nivea 6h   P. fasciata 6h   R. marina 6h   P. volitans 6h   …
CHTOP       148           96               133            60               …
SNAPIN      72            17               17             13               …
ILF2        806           447              496            417              …
NPR1        0 …
INTS3       12            12               10             12               …
SLC27A3     18            14               12             1                …
LOC343052   0 …
GATAD2B     295           117              446            220              …
DENND4B     80            28               37             38               …
CRTC2       43            17               35             18               …
SLC39A1     328           241              373            236              …
…           …             …                …              …

Significant differentially expressed genes (Naja nivea)

Gene      baseMean   log2FC   lfcSE   t-statistic   p-value    p-adj
S100A10   30.32      2.04     0.52    3.90          9.66E-05   0.034
S100A11   199.65     1.94     0.48    4.03          5.57E-05   0.027
PFDN2     1508.04    0.57     0.14    4.14          3.40E-05   0.024
SELL      14.77      2.54     0.54    4.73          2.23E-06   0.004
KIAA      2.88       2.98     0.73    4.06          4.82E-05   0.027
SYT2      9.11       2.59     0.60    4.32          1.59E-05   0.016
…         …          …        …       …             …          …

Results
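
The p-adj column reflects multiple hypothesis testing correction; DESeq2 adjusts p-values with the Benjamini-Hochberg procedure by default. The sketch below shows that procedure in plain Python on made-up p-values. It cannot reproduce the p-adj values on the slide, because the adjustment depends on the p-values of every gene tested and only a few rows are shown. The function name is hypothetical.

```python
def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices, ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values for four hypothetical genes.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
print([round(p, 3) for p in adj])
```

A gene is then called significant when its adjusted value falls below the chosen false discovery rate threshold.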

SLIDE 17

Manual review of Apis mellifera expression signature

HUGO symbol   Gene ID   Log2FC   Description
ZNF609        23060     -1.260   Involved in myoblast proliferation
UBE2G2        7327      -0.720   Conjugates ubiquitin to mark proteins for degradation in the endoplasmic reticulum
PSMA4         5685      -0.528   Encodes a core subunit of the proteasome complex
CKAP2         26586      0.655   Stabilizes microtubules and regulates p53-mediated cell division

Results

SLIDE 18

Matching VenomSeq to CMap data

  • Approach: Using Connectivity Map data, find known perturbagens that produce the most similar differential expression changes
  • Technique: Use cosine distance to compare signatures consisting of the top significant over- and under-expressed genes (e.g., (Duan 2016))

Venom                     Most similar perturbagen
Naja nivea                cyclosporine
Montivipera xanthina      vorinostat
Crotalus scutulatus       narciclasine
Atractaspis sp.           benzoxiquine
Argiope lobata            linifanib
Leiurus quinquestriatus   hycanthone
Bombina variegata         salermide
Conus imperialis          prednisolone acetate
Octopus macropus          tetraethylthiuram disulfide
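
The matching step can be sketched as a nearest-neighbor search under cosine distance. This is a minimal illustration, not the analysis code behind the table: signatures are assumed to be dictionaries mapping a gene to +1 (over-expressed) or -1 (under-expressed), genes missing from a signature count as 0, and all gene and perturbagen names are placeholders.

```python
import math


def cosine_distance(sig_a, sig_b):
    """1 - cosine similarity of two signatures over the union of their genes."""
    genes = sorted(set(sig_a) | set(sig_b))
    a = [sig_a.get(g, 0.0) for g in genes]
    b = [sig_b.get(g, 0.0) for g in genes]
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("a signature needs at least one non-zero gene")
    return 1.0 - dot / (norm_a * norm_b)


def most_similar(query, reference):
    """Name of the reference signature closest to the query."""
    return min(reference, key=lambda name: cosine_distance(query, reference[name]))


# Placeholder signatures: +1 = over-expressed, -1 = under-expressed.
venom_sig = {"GENE_A": 1, "GENE_B": 1, "GENE_C": -1}
reference = {
    "drug_X": {"GENE_A": 1, "GENE_B": 1, "GENE_C": -1},
    "drug_Y": {"GENE_A": -1, "GENE_C": 1, "GENE_D": 1},
}
best = most_similar(venom_sig, reference)
print(best, round(cosine_distance(venom_sig, reference[best]), 3))
```

A distance near 0 means the perturbagen shifts the same genes in the same direction; a distance near 2 means it reverses them.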

Results

SLIDE 19

Same thing, with VIPER-inferred protein activity

Venom                     Most similar perturbagen      Most similar perturbagen    Pharmacologic activity /
                          (raw expression)              (VIPER-inferred activity)   Mechanism of action
Naja nivea                Cyclosporine                  Ki 8751                     EGFR inhibitor
Montivipera xanthina      Vorinostat                    Geldanamycin                Hsp90 inhibitor
Crotalus scutulatus       Narciclasine                  Apitolisib                  MTOR inhibitor
Atractaspis sp.           Benzoxiquine                  NCGC00182361-01             Cytoprotection in Huntington model
Argiope lobata            Linifanib                     AT7867                      AKT inhibitor
Leiurus quinquestriatus   Hycanthone                    Homoharringtonine           Protein translation inhibitor
Bombina variegata         Salermide                     Pevonedistat                NAE inhibitor
Conus imperialis          Prednisolone acetate          Niridazole                  Antiparasitic
Octopus macropus          Tetraethylthiuram disulfide   PLX-4720                    BRAF inhibitor

Results

SLIDE 20

Same thing, with VIPER-inferred protein activity

Venom                     Most similar perturbagen      Most similar perturbagen    Pharmacologic activity /
                          (raw expression)              (VIPER-inferred activity)   Mechanism of action
Naja nivea                Cyclosporine                  Ki 8751                     EGFR inhibitor
Montivipera xanthina      Vorinostat                    Geldanamycin                Hsp90 inhibitor
Crotalus scutulatus       Narciclasine                  Apitolisib                  MTOR inhibitor
Atractaspis sp.           Benzoxiquine                  NCGC00182361-01             Cytoprotection in Huntington model
Argiope lobata            Linifanib                     AT7867                      AKT inhibitor
Leiurus quinquestriatus   Hycanthone                    Homoharringtonine           Protein translation inhibitor
Bombina variegata         Salermide                     Pevonedistat                NAE inhibitor
Conus imperialis          Prednisolone acetate          Niridazole                  Antiparasitic
Octopus macropus          Tetraethylthiuram disulfide   PLX-4720                    BRAF inhibitor

(red text: anticancer activity)

Results

SLIDE 21

The problem with venom expression profiles

  • Venoms often consist of hundreds of enzymes. How do we isolate the important parts of the signal?
  • We still only have 25 venom expression profiles! Far too few for standalone ML/DL.
  • Deep Learning approach:
      • Use a combination of stacked autoencoders and variational autoencoders to learn clinically actionable features
      • Transfer learning: train models on known drug/disease expression signatures and predict on VenomSeq signatures

http://venomkb.org/S6266294

SLIDE 22

Other challenges

  • Eventually, we will have to run VenomSeq on every human cell line
  • IMR-32 isn’t well represented in public compendia
  • Only a limited number of cell lines have regulon networks for VIPER
  • It will be important to predict toxic/harmful effects of venom components on the human body
  • Regularization, modeling assumptions, and evaluation for ML/DL comparisons will be crucial for the success of these analyses

SLIDE 23

Future work

  • Carry out VenomSeq analysis via data-driven comparisons to public expression profiles
  • Apply traditional methods as well as deep learning methods that are robust to noise
  • Perform in vivo preclinical validation
  • Construct structured representation of VenomSeq data and findings
  • Merge VenomSeq data and findings into VenomKB
  • Using ontological reasoning, generate novel hypotheses from VenomSeq results

[Figure labels: VenomSeq encoded profiles; drug/anti-disease encoded profiles; encoded dimensions; hypothesis generation; drug/disease associations; select promising association; preclinical validation]

SLIDE 24

Future work

  • Carry out VenomSeq analysis via data-driven comparisons to public expression profiles
  • Apply traditional methods as well as deep learning methods that are robust to noise
  • Perform in vivo preclinical validation
  • Construct structured representation of VenomSeq data and findings
  • Merge VenomSeq data and findings into VenomKB
  • Using ontological reasoning, generate novel hypotheses from VenomSeq results

Example from api.clue.io

SLIDE 25

Future work

  • Carry out VenomSeq analysis via data-driven comparisons to public expression profiles
  • Apply traditional methods as well as deep learning methods that are robust to noise
  • Perform in vivo preclinical validation
  • Construct structured representation of VenomSeq data and findings
  • Merge VenomSeq data and findings into VenomKB
  • Using ontological reasoning, generate novel hypotheses from VenomSeq results

[Figure: VenomSeq workflow diagram with VenomKB. Labels: IMR-32 cells + 25 venoms; PLATE-Seq; VIPER; DESeq2; expression profiles; LINCS data; LINCS profiles; "?"; VenomKB, with sources Tox-Prot, NCBI, PubMed, and GO]

SLIDE 26

Future work

  • Carry out VenomSeq analysis via data-driven comparisons to public expression profiles
  • Apply traditional methods as well as deep learning methods that are robust to noise
  • Perform in vivo preclinical validation
  • Construct structured representation of VenomSeq data and findings
  • Merge VenomSeq data and findings into VenomKB
  • Using ontological reasoning, generate novel hypotheses from VenomSeq results

SLIDE 27

Bibliography

  • Romano, Joseph D., and Nicholas P. Tatonetti. "VenomKB, a new knowledge base for facilitating the validation of putative venom therapies." Scientific Data 2 (2015): 150065.
  • Romano, Joseph D., and Nicholas P. Tatonetti. "Using a Novel Ontology to Inform the Discovery of Therapeutic Peptides from Animal Venoms." AMIA Summits on Translational Science Proceedings 2016 (2016): 209.
  • Bush, Erin C., Forest Ray, Mariano J. Alvarez, Ronald Realubit, Hai Li, Charles Karan, Andrea Califano, and Peter A. Sims. "PLATE-Seq for genome-wide regulatory network analysis of high-throughput screens." Nature Communications 8, no. 1 (2017): 105.
  • Chen, B., and A. J. Butte. "Leveraging big data to transform target selection and drug discovery." Clinical Pharmacology & Therapeutics 99, no. 3 (2016): 285-297.
  • Sirota, Marina, Joel T. Dudley, Jeewon Kim, Annie P. Chiang, Alex A. Morgan, Alejandro Sweet-Cordero, Julien Sage, and Atul J. Butte. "Discovery and preclinical validation of drug indications using compendia of public gene expression data." Science Translational Medicine 3, no. 96 (2011): 96ra77.
  • Duan, Qiaonan, St Patrick Reid, Neil R. Clark, Zichen Wang, Nicolas F. Fernandez, Andrew D. Rouillard, Ben Readhead et al. "L1000CDS2: LINCS L1000 characteristic direction signatures search engine." NPJ Systems Biology and Applications 2 (2016): 16015.
  • Subramanian, Aravind, Rajiv Narayan, Steven M. Corsello, David D. Peck, Ted E. Natoli, Xiaodong Lu, Joshua Gould et al. "A next generation connectivity map: L1000 platform and the first 1,000,000 profiles." Cell 171, no. 6 (2017): 1437-1452.

SLIDE 28

Acknowledgements

  • Dissertation Advisors
      • Chunhua Weng, PhD, FACMI
      • Adler Perotte, MD, MA
      • Nicholas Tatonetti, PhD*
  • Tatonetti Lab
      • Rami Vanguri, PhD
      • Theresa Kolek, PhD
      • Yun Hao, MA
      • Phyllis Thangaraj
      • Alexandre Yahi
      • Fernanda Polubriaginof, MD
      • Nick Giangreco
      • Jenna Kefeli
      • Jing Ai
      • Katie LaRow
      • Deidre Gregory
  • Collaborators
      • Charles Karan, PhD*
      • Ronald Realubit*
      • Hai Li, PhD*
      • Peter Sims, PhD
      • Victor Nwankwo, MD

SLIDE 29

Thank you!

For any questions or comments: jdr2160@columbia.edu