SLIDE 1

Drug-Target Interaction Prediction for Drug Repurposing with Probabilistic Similarity Logic

SHOBEIR FAKHRAEI*, LOUIQA RASCHID, LISE GETOOR

University of Maryland, College Park, MD, USA

SLIDE 2

BioKDD 2013 | Chicago | Drug-Target Interaction Prediction …

Outline

  • Drug Repurposing
  • Drug-Target Interaction Network
  • Probabilistic Similarity Logic (PSL)
  • Drug-Target Interaction Prediction with PSL
  • Experimental Results
SLIDE 3

New Drug Development

Illustration Credit: XVIVO Scientific Animation

  • Time-consuming: a new drug takes a decade to reach the market.
  • Costly: development costs reach 2 billion US dollars.
SLIDE 4

Valley of death: Most novel drug candidates never get approved!

SLIDE 5

Drugs

  • Drugs: organic small molecules that bind to bio-molecular targets to activate or inhibit their functions.
  • Drugs often affect multiple targets.
  • Poly-pharmacology is an area of growing interest.

SLIDE 6

Drug Repurposing

  • Drugs affecting multiple targets:
    • Adverse side effects
    • Unexpected therapeutic effects
  • Drug repurposing/repositioning: finding new uses for approved drugs.
    • No need for the tests required for a new therapeutic compound (the drug is already approved).

SLIDE 7

Sildenafil was originally developed for pulmonary arterial hypertension

SLIDE 8

Need for Systematic Search

  • Most new treatments are discovered by chance during clinical trials.
  • There is a need for a better, systematic approach.
  • Experimental identification of drug-target associations is labor-intensive and costly.
  • A better solution?
SLIDE 9

Using computational predictions to focus biological search

SLIDE 10

Drug-Target Interaction Network

[Figure: drug-target interaction network; legend: Interaction, Drug, Target]

SLIDE 11

Drug-Target Interaction Network + Similarities

[Figure: drug-target interaction network with added drug-drug similarity and target-target similarity edges]

SLIDE 12

Multiple Similarities

  • Chemical-based
  • Sequence-based
  • Ligand-based
  • PPI-network-based
  • Side-effect-based
  • Gene Ontology-based
  • ...

SLIDE 13

D-T Interaction Network + Multiple Similarities

[Figure: interaction network with multiple similarities; "?" marks an unknown interaction]

SLIDE 14

Drug-Target Interaction Prediction

  • Data:
    • Drug-target interaction network
    • Set of drug-drug similarities
    • Set of target-target similarities
  • Task:
    • Link prediction (new drug-target interactions)
SLIDE 15

Challenges

  • Data is not originally flat:
    • Classifiers need a set of features and instances.
    • Instances: all interactions in the network (pairwise), or only the interactions of one drug or target.
    • Features: require feature engineering.
  • Not independent and identically distributed (IID): interactions depend on each other (a drug tends to interact with similar targets).
  • Multi-relational:
    • Drug-target interactions
    • Different drug-drug similarities
    • Different target-target similarities

[Figure: flat table of Features, Instances, and Labels; the instances are not independent]
SLIDE 16

Probabilistic Similarity Logic

SLIDE 17

Probabilistic Similarity Logic (PSL)

  • Declarative language based on logic to express collective probabilistic inference problems.
    • Logical foundation
    • Probabilistic foundation
    • Weight learning
SLIDE 18

Logic Foundation

SLIDE 19

General Rules

  • Predicates define relations between variables, e.g. Interacts(D, T).
  • Grounding: instantiation of predicates with data, e.g. Interacts(acetaminophen, cox2).
  • Groundings have soft-truth values in [0, 1].

General form of a rule, where P, Q, R are predicates and A, B, C are variables:

P(A, B) ∧ Q(B, C) → R(A, C)

e.g., Interacts(D, T2) ∧ SimilarTarget(T1, T2) → Interacts(D, T1)

SLIDE 20

Lukasiewicz t-norm and co-norm

  • $P \wedge Q = \max(0,\ P + Q - 1)$
  • $P \vee Q = \min(1,\ P + Q)$
  • $\neg P = 1 - P$

[Figure: plots of P ∧ Q and P ∨ Q over axes P and Q]

P(A, B) ∧ Q(B, C) → R(A, C)
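The three Lukasiewicz operators on this slide can be written directly as functions on soft-truth values in [0, 1]. This is a minimal sketch; the function names `luk_and`, `luk_or`, and `luk_not` are illustrative and are not taken from the PSL software.

```python
def luk_and(p: float, q: float) -> float:
    """Lukasiewicz t-norm (soft conjunction): max(0, p + q - 1)."""
    return max(0.0, p + q - 1.0)


def luk_or(p: float, q: float) -> float:
    """Lukasiewicz co-norm (soft disjunction): min(1, p + q)."""
    return min(1.0, p + q)


def luk_not(p: float) -> float:
    """Soft negation: 1 - p."""
    return 1.0 - p


if __name__ == "__main__":
    # Soft-truth values 0.7 and 0.8, as used in the rule examples of the deck.
    print(luk_and(0.7, 0.8), luk_or(0.7, 0.8), luk_not(0.7))
```

On Boolean inputs (0 or 1) these functions reduce to the ordinary AND, OR, and NOT.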

SLIDE 21

Satisfaction

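  • Interpretation (I): an assignment of soft-truth values to a set of groundings.
  • Rule satisfaction: $r_{body} \rightarrow r_{head}$ is satisfied when $I(r_{body}) \le I(r_{head})$.

Example: for P(A, B) ∧ Q(B, C) → R(A, C) with P(A, B) = 0.7 and Q(B, C) = 0.8, the body has value max(0, 0.7 + 0.8 − 1) = 0.5, so the rule is satisfied when R(A, C) ≥ 0.5.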

SLIDE 22

Distance to Satisfaction

$d_r(I) = \max\{\, I(r_{body}) - I(r_{head}),\ 0 \,\}$

Example for P(A, B) ∧ Q(B, C) → R(A, C) with P(A, B) = 0.7 and Q(B, C) = 0.8, so the body has value max(0, 0.7 + 0.8 − 1) = 0.5:

  • R(A, C) = 0.7: $d_r(I) = 0.0$
  • R(A, C) = 0.2: $d_r(I) = 0.3$
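The two worked cases on this slide can be reproduced in a few lines. This is a sketch with illustrative names (`luk_and`, `distance_to_satisfaction`); the body value comes from the Lukasiewicz conjunction, and the distance is the amount by which the body exceeds the head.

```python
def luk_and(p: float, q: float) -> float:
    """Lukasiewicz t-norm, used to evaluate the rule body."""
    return max(0.0, p + q - 1.0)


def distance_to_satisfaction(body: float, head: float) -> float:
    """d_r(I) = max(I(r_body) - I(r_head), 0)."""
    return max(body - head, 0.0)


# Body of P(A, B) AND Q(B, C) with soft-truth values 0.7 and 0.8.
body = luk_and(0.7, 0.8)

# Head value 0.7: the rule is satisfied, so the distance is zero.
d_satisfied = distance_to_satisfaction(body, 0.7)

# Head value 0.2: the rule is violated by the gap between body and head.
d_violated = distance_to_satisfaction(body, 0.2)

if __name__ == "__main__":
    print(d_satisfied, d_violated)
```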

SLIDE 23

Rule Weights

w : P(A, B) ∧ Q(B, C) → R(A, C)

  • Rules can have weights, which correspond to the importance of each rule.
    • Weights can come from domain knowledge.
    • Weights can be learned from data.
SLIDE 24

Review

  • PSL program + dataset → set of ground rules
  • Some groundings (predicates) have known truth values and some have unknown truth values.
  • Every interpretation of the unknown groundings (predicates) → a different weighted distance to satisfaction
  • How to decide which interpretation is best?
SLIDE 25

Probabilistic Foundation

SLIDE 26

Probabilistic Model

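$$f(I) = \frac{1}{Z} \exp\left[ -\sum_{r \in R} w_r \, \big(d_r(I)\big)^{p} \right]$$

  • $f(I)$: probability density over interpretation I
  • $Z$: normalization constant
  • $R$: set of ground rules
  • $w_r$: rule's weight
  • $p \in \{1, 2\}$: distance exponent
  • $d_r(I) = \max\{\, I(r_{body}) - I(r_{head}),\ 0 \,\}$: rule's distance to satisfaction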
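To make the density concrete, the sketch below evaluates its unnormalized part, exp(−Σ w_r · d_r(I)^p), for a list of ground rules. The function name, the tuple layout, and the example weights (2.0 and 1.0) are assumptions for illustration only; the normalization constant Z is not computed.

```python
import math
from typing import Iterable, Tuple

# One ground rule is represented as (weight, body_value, head_value).
GroundRule = Tuple[float, float, float]


def unnormalized_density(ground_rules: Iterable[GroundRule], p: int = 1) -> float:
    """Return exp(-sum_r w_r * d_r(I)^p), i.e. the density without 1/Z."""
    if p not in (1, 2):
        raise ValueError("the distance exponent p must be 1 or 2")
    energy = 0.0
    for weight, body, head in ground_rules:
        distance = max(body - head, 0.0)  # distance to satisfaction
        energy += weight * distance ** p
    return math.exp(-energy)


# Hypothetical weights; body and head values reuse the slide examples.
rules = [
    (2.0, 0.5, 0.7),  # satisfied rule: contributes nothing to the energy
    (1.0, 0.5, 0.2),  # violated rule: distance 0.3
]

if __name__ == "__main__":
    print(unnormalized_density(rules, p=1))
    print(unnormalized_density(rules, p=2))
```

Interpretations that violate heavily weighted rules get a larger energy and therefore a lower density.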

SLIDE 27

Inferring Most Probable Explanations

  • Given a set of observed groundings, infer the values of the unknown groundings.
    • e.g., given a set of drug-target interactions plus a set of D-D and T-T similarities, infer the values of other interactions.
  • Convex optimization: perform inference using the alternating direction method of multipliers (ADMM) [Bach et al., NIPS 2012]
    • Fast, scalable, and straightforward
    • Optimizes sub-problems (ground rules) independently.

SLIDE 28

Weight Learning

SLIDE 29

Weight Learning

  • Learn the weights from training data
  • Various methods:
    • Approximate maximum likelihood [Broecheler et al., UAI 10]
    • Maximum pseudo-likelihood
    • Large-margin estimation

w : P(A, B) ∧ Q(B, C) → R(A, C)

SLIDE 30

PSL Summary

  • Design probabilistic models using a declarative language
  • Syntax based on first-order logic
  • Inference of the most probable explanation is a fast convex optimization (ADMM)
  • Learning algorithms for training rule weights from labeled data.

SLIDE 31

Drug-Target Interaction Prediction with PSL

SLIDE 32

Predicates

  • Interacts(D, T)
  • SimilarTarget_β(T1, T2)
    • e.g. β can be sequence-based, PPI-network-based, Gene Ontology-based.
  • SimilarDrug_α(D1, D2)
    • e.g. α can be chemical-based, ligand-based, expression-based, side-effect-based, annotation-based.

SLIDE 33

Drug-Target Interaction Prediction Rules

SLIDE 34

Triad-based rules (Targets)

[Figure: a drug and two similar targets (1 and 2); "?" marks the interaction to predict]

Interacts(D, T2) ∧ SimilarTarget_β(T1, T2) → Interacts(D, T1)

  • Drugs tend to interact with similar targets (a friend of a friend is a friend).
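Grounding the target triad rule means instantiating D, T1, and T2 with concrete drugs and targets. The sketch below enumerates those groundings for a single similarity measure. The function name, the dictionary layout, and the toy identifiers (`d1`, `t1`, ...) are hypothetical. Pairs with zero similarity are skipped because the Lukasiewicz body is then 0 and the rule is trivially satisfied.

```python
from itertools import permutations
from typing import Dict, Iterator, List, Tuple


def ground_target_triads(
    drugs: List[str],
    targets: List[str],
    similar_target: Dict[Tuple[str, str], float],
) -> Iterator[Tuple[str, str, str]]:
    """Yield (d, t1, t2) for each grounding of
    Interacts(d, t2) AND SimilarTarget(t1, t2) -> Interacts(d, t1)."""
    for d in drugs:
        for t1, t2 in permutations(targets, 2):
            # A zero similarity makes the rule body 0, so the grounding
            # can never be violated and does not need to be created.
            if similar_target.get((t1, t2), 0.0) > 0.0:
                yield (d, t1, t2)


# Toy data (hypothetical): only t1 and t2 are similar to each other.
drugs = ["d1", "d2"]
targets = ["t1", "t2", "t3"]
similar_target = {("t1", "t2"): 0.9, ("t2", "t1"): 0.9}

groundings = list(ground_target_triads(drugs, targets, similar_target))

if __name__ == "__main__":
    for g in groundings:
        print(g)
```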

SLIDE 35

Triad-based rules (Drugs)

SimilarDrug_α(D1, D2) ∧ Interacts(D2, T) → Interacts(D1, T)

  • Targets tend to interact with similar drugs (a friend of a friend is a friend).

[Figure: a target and two similar drugs (1 and 2); "?" marks the interaction to predict]

SLIDE 36

Tetrad-based Rules (Similar Edges)

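[Figure: two similar drugs (1 and 2) and two similar targets (1 and 2); "?" marks the interaction to predict]

  • Similar edges are likely to form in a graph.

SimilarDrug_α(D1, D2) ∧ SimilarTarget_β(T1, T2) ∧ Interacts(D2, T2) → Interacts(D1, T1)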

SLIDE 37

Negative Prior

  • The negative prior indicates that the “Interacts” predicate is most likely false,
    • i.e., most drugs and targets do not interact.

¬Interacts(D, T)

[Figure: network with most candidate interactions crossed out]
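The interplay between a triad rule and the negative prior can be seen on a single unknown interaction x = Interacts(d, t1). The sketch below finds the most probable value of x by plain grid search, which is a stand-in chosen for brevity and is not the ADMM procedure used by PSL. The weights (3.0 and 1.0), the similarity value 0.8, and the treatment of the prior's distance to satisfaction as 1 − (1 − x) = x are assumptions for illustration.

```python
def luk_and(p: float, q: float) -> float:
    """Lukasiewicz t-norm."""
    return max(0.0, p + q - 1.0)


def energy(x: float, w_rule: float, w_prior: float, body: float, p: int = 2) -> float:
    """Weighted distances to satisfaction for one unknown x = Interacts(d, t1)."""
    d_rule = max(body - x, 0.0)  # triad rule: body -> Interacts(d, t1)
    d_prior = x                  # negative prior: 1 - I(NOT Interacts) = x (assumed)
    return w_rule * d_rule ** p + w_prior * d_prior ** p


def mpe_by_grid(w_rule: float, w_prior: float, body: float,
                p: int = 2, steps: int = 100) -> float:
    """Return the grid point in [0, 1] with the lowest energy.

    Minimizing the energy maximizes exp(-energy), the unnormalized density.
    """
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda x: energy(x, w_rule, w_prior, body, p))


# Observed Interacts(d, t2) = 1.0 and an assumed SimilarTarget(t1, t2) = 0.8.
body = luk_and(1.0, 0.8)
x_star = mpe_by_grid(w_rule=3.0, w_prior=1.0, body=body, p=2)

if __name__ == "__main__":
    print(x_star)
```

With p = 2 the inferred value lies between the prior's preference (0) and the rule's preference (the body value), at a point determined by the two weights.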

SLIDE 38

Size of the problem

  • The total number of ground triad-based rules can be:

$O\big(D \times T \times (D \times \alpha + T \times \beta)\big)$

    • $D \times T$: all possible interactions
    • $D \times \alpha$: triads based on drug similarities for an interaction
    • $T \times \beta$: triads based on target similarities for an interaction
    • $D$: number of drugs
    • $T$: number of targets
    • $\alpha$: number of drug similarities
    • $\beta$: number of target similarities

  • e.g., in our experiments it was 180M.
  • For tetrad-based rules the situation is even worse!
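As a quick check of the 180M figure, the bound can be evaluated with the dataset sizes reported on the Dataset slide (315 drugs, 250 targets, 5 drug-drug similarities, 3 target-target similarities). The function name is illustrative.

```python
def triad_ground_rule_count(n_drugs: int, n_targets: int,
                            n_drug_sims: int, n_target_sims: int) -> int:
    """Evaluate D * T * (D * alpha + T * beta)."""
    possible_interactions = n_drugs * n_targets
    triads_per_interaction = n_drugs * n_drug_sims + n_targets * n_target_sims
    return possible_interactions * triads_per_interaction


count = triad_ground_rule_count(n_drugs=315, n_targets=250,
                                n_drug_sims=5, n_target_sims=3)

if __name__ == "__main__":
    print(f"{count:,}")
```

This gives 78,750 × 2,325 = 183,093,750 ground rules, which is in line with the roughly 180M stated on the slide.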
SLIDE 39

Blocking

SLIDE 40

Blocking

  • Prevent some of the rules from being grounded.
    • Ignore some of the less significant similarities between drugs and between targets.

SLIDE 41

Same Threshold for All Similarities

  • A fixed threshold either ignores most of the values in one similarity or includes most of the values from the other.

SLIDE 42

A Threshold for Each Similarity

  • The same problem arises for individual drugs or targets!

SLIDE 43

K-Nearest Neighbors-based

  • Preserve the k highest values in each similarity for each drug and each target, and set the others to zero.

[Figure: each drug and each target keeps only its k most similar neighbors]
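The k-nearest-neighbor blocking step can be sketched as follows for one similarity measure. The function name `knn_block`, the nested-dictionary layout, and the toy scores are hypothetical; ties are broken by dictionary order here, which the slides do not specify.

```python
from typing import Dict


def knn_block(similarity: Dict[str, Dict[str, float]], k: int) -> Dict[str, Dict[str, float]]:
    """For each entity, keep its k highest similarity values and zero the rest."""
    blocked = {}
    for entity, row in similarity.items():
        # Neighbors sorted by decreasing similarity; keep the first k.
        keep = set(sorted(row, key=row.get, reverse=True)[:k])
        blocked[entity] = {
            neighbor: (score if neighbor in keep else 0.0)
            for neighbor, score in row.items()
        }
    return blocked


# Toy drug-drug similarity (hypothetical values).
drug_similarity = {
    "d1": {"d2": 0.9, "d3": 0.1, "d4": 0.5},
    "d2": {"d1": 0.9, "d3": 0.4, "d4": 0.2},
}

blocked = knn_block(drug_similarity, k=2)

if __name__ == "__main__":
    print(blocked)
```

Because the cut is made per drug and per target rather than by one global threshold, every entity retains exactly k non-zero neighbors regardless of how its similarity values are distributed.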

SLIDE 44

PSL Advantages

SLIDE 45

PSL Advantages

  • Collective inference (no IID assumption): results in global information propagation through the network.
  • Class imbalance: PSL can handle the huge class imbalance in link prediction problems.
  • PSL captures the original structure.
  • Inference is based on interpretable rules.

SLIDE 46

Experimental Evaluation

SLIDE 47

Dataset

  • 315 drugs
  • 250 targets
  • Interactions [Knox et al. 2011]:
    • 1,306 observed interactions
    • 78,750 possible interactions
  • Similarities [Perlman et al. 2011]:
    • 3 target-target similarities
    • 5 drug-drug similarities
SLIDE 48

Drug-Drug Similarities [Perlman et al. 2011]

  • Chemical-based: Jaccard similarity of the SMILES fingerprints.
  • Ligand-based: Jaccard similarity between protein receptor families extracted via matched ligands with drugs' SMILES.
  • Expression-based: Spearman correlation of gene expression responses to drugs using the Connectivity Map.
  • Side-effect-based: Jaccard similarity between drugs' side effects from SIDER.
  • Annotation-based: semantic similarity of drugs based on the World Health Organization ATC classification system.

SLIDE 49

Target-Target Similarities [Perlman et al. 2011]

  • Sequence-based: Smith-Waterman sequence alignment scores.
  • Protein-protein interaction network-based: distance in the protein-protein interaction network using all-pairs shortest paths.
  • Gene Ontology-based: semantic similarity between Gene Ontology annotations.
SLIDE 50

Triad Rules

Rule (similarity)           AUROC
Drug-Drug Similarity
  Annotation-based          0.685 ± 0.026
  Chemical-based            0.714 ± 0.030
  Ligand-based              0.751 ± 0.030
  Expression-based          0.584 ± 0.025
  Side-effect-based         0.614 ± 0.030
Target-Target Similarity
  PPI-network-based         0.816 ± 0.026
  GO-based                  0.608 ± 0.029
  Sequence-based            0.842 ± 0.019
All rules (similarities)    0.931 ± 0.018

SLIDE 51

Triad Rules: Comparison with reported results

Method                   AUROC            Condition
PSL                      0.931 ± 0.018    Without sampling (10-fold C.V.)
Perlman et al. 2011      0.935            With sampling (reported results)
Yamanishi et al. 2008    0.884
Bleakley et al. 2009     0.814

SLIDE 52

Triad Rules: Blocking and Weight Learning

AUROC
Condition            K=5              K=15             K=30
All weights fixed    0.926 ± 0.016    0.929 ± 0.020    0.923 ± 0.021
+ Weight learning    0.930 ± 0.016    0.931 ± 0.018    0.924 ± 0.21

Time to complete (10 folds)
Condition            K=5        K=15    K=30
All weights fixed    12 mins    3 h     9 h
+ Weight learning    1 h        10 h    28 h

SLIDE 53

Triad Rules: Precision of Top 100 Predictions

[Figure: precision (y-axis, 0.1 to 0.6) versus top N predictions (x-axis, 10 to 100), with weight learning and without weight learning]

SLIDE 54

Triad and Tetrad based rules

Method                        AUROC (k = 5)
Triad-based rules             0.930 ± 0.016
Tetrad-based rules            0.796 ± 0.025
Triad-based & tetrad-based    0.913 ± 0.017

SLIDE 55

Conclusion

  • Identified challenges of network-based drug-target interaction prediction.
  • Described the PSL framework to address them:
    • Captures the original network structure
    • Is a declarative language for implementing different rules
    • Performs collective inference (no IID assumption)
    • Learns weights from training data
  • Matched the performance of the state of the art with simple triad-based rules.
  • The proposed method can easily be applied to other tasks with similar structures.

SLIDE 56

Thank you

Drug-Target Interaction Prediction for Drug Repurposing with Probabilistic Similarity Logic

Shobeir Fakhraei*, Louiqa Raschid, Lise Getoor

University of Maryland, College Park, MD, USA

http://psl.cs.umd.edu

SLIDE 57

References

  • L. Perlman, A. Gottlieb, N. Atias, E. Ruppin, and R. Sharan. “Combining Drug and Gene Similarity Measures for Drug-Target Elucidation.” Journal of Computational Biology, Feb. 2011.
  • C. Knox, V. Law, T. Jewison, P. Liu, S. Ly, A. Frolkis, A. Pon, K. Banco, C. Mak, V. Neveu, Y. Djoumbou, R. Eisner, A. C. Guo, and D. S. Wishart. “DrugBank 3.0: a comprehensive resource for 'omics' research on drugs.” Nucleic Acids Research, Jan. 2011.
  • Y. Yamanishi, M. Araki, A. Gutteridge, W. Honda, and M. Kanehisa. “Prediction of drug target interaction networks from the integration of chemical and genomic spaces.” Bioinformatics, Jul. 2008.
  • K. Bleakley and Y. Yamanishi. “Supervised prediction of drug target interactions using bipartite local models.” Bioinformatics, Sep. 2009.
  • S. H. Bach, B. Huang, B. London, and L. Getoor. “Hinge-loss Markov Random Fields: Convex Inference for Structured Prediction.” Uncertainty in Artificial Intelligence (UAI), 2013.
  • S. H. Bach, M. Broecheler, L. Getoor, and D. P. O’Leary. “Scaling MPE Inference for Constrained Continuous Markov Random Fields with Consensus Optimization.” Advances in Neural Information Processing Systems (NIPS), 2012.