Molecular Structure Information Masaki Asada, Makoto Miwa, Yutaka - PowerPoint PPT Presentation

Enhancing Drug-Drug Interaction Extraction from Texts by Molecular Structure Information Masaki Asada, Makoto Miwa, Yutaka Sasaki Toyota Technological Institute, Japan 1

Introduction
• Our target problem is the extraction of drug-drug interactions (DDIs) from biomedical texts
Example (Mechanism): "Grepafloxacin inhibits the metabolism of Theophylline" 2

Introduction
• Our target problem is the extraction of drug-drug interactions (DDIs) from biomedical texts
• We investigate the use of external drug database (DrugBank) information in extracting DDIs from texts
• We especially focus on molecular structure information
Example (Mechanism): "Grepafloxacin inhibits the metabolism of Theophylline"
Figure: DrugBank database 3

Method Overview
• We obtain representations of drug pairs from text using convolutional neural networks (CNNs) and from molecular structures using graph convolutional networks (GCNs)
• We concatenate the text-based and molecule-based vectors
Figure: the sentence "Grepafloxacin inhibits the metabolism of Theophylline" is encoded by a CNN; the molecular structures of Grepafloxacin and Theophylline from the DrugBank database are encoded by GCNs; the vectors are concatenated to predict DDI types 4

Method: DDI extraction from texts using molecular structures
• Text-based DDI representation
• Molecular structure-based DDI representation
Figure: a sentence from the text corpus ("Grepafloxacin inhibits the metabolism of Theophylline") is encoded with word + position embeddings and a CNN into a textual vector; Grepafloxacin and Theophylline from the DrugBank database are encoded by GCNs into a molecular vector; the vectors are concatenated to predict DDI types 5

Method: Text-based DDI Representation
• Our model for representing textual DDIs is based on the CNN model by Zeng et al. (2014)
• We use word and position embeddings as the input to the convolution layer
• We convert the output of the convolution layer into a fixed-size textual vector
Figure: a sentence from the text corpus ("Grepafloxacin inhibits the metabolism of Theophylline") is encoded with word + position embeddings and a CNN into a textual vector 7
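
The text-based encoder on this slide can be sketched roughly as follows. This is a minimal pure-Python illustration, not the authors' implementation: the layer sizes, the random (untrained) parameters, the clipping of relative positions, and the use of max pooling to obtain the fixed-size vector are all assumptions made for the example; the slide only states that word and position embeddings feed a convolution layer whose output becomes a fixed-size vector.

```python
import random

rng = random.Random(0)
WORD_DIM, POS_DIM, N_FILTERS, WINDOW = 6, 2, 5, 3  # illustrative sizes (assumed)
MAX_DIST = 10  # relative positions are clipped to [-MAX_DIST, MAX_DIST] (assumed)

def rand_vec(n):
    return [rng.uniform(-0.5, 0.5) for _ in range(n)]

word_emb = {}  # word -> vector; random here, pre-trained in a real system
pos_emb = {d: rand_vec(POS_DIM) for d in range(-MAX_DIST, MAX_DIST + 1)}
IN_DIM = WORD_DIM + 2 * POS_DIM
# Each convolution filter spans WINDOW consecutive token vectors.
filters = [rand_vec(WINDOW * IN_DIM) for _ in range(N_FILTERS)]

def clip(d):
    return max(-MAX_DIST, min(MAX_DIST, d))

def embed(tokens, drug1_idx, drug2_idx):
    """Word embedding plus one position embedding per target drug mention."""
    rows = []
    for i, tok in enumerate(tokens):
        w = word_emb.setdefault(tok.lower(), rand_vec(WORD_DIM))
        rows.append(w + pos_emb[clip(i - drug1_idx)] + pos_emb[clip(i - drug2_idx)])
    return rows

def textual_vector(tokens, drug1_idx, drug2_idx):
    """Convolve over token windows, then max-pool each filter over positions."""
    rows = embed(tokens, drug1_idx, drug2_idx)
    pad = [0.0] * IN_DIM
    rows = [pad] * (WINDOW // 2) + rows + [pad] * (WINDOW // 2)
    pooled = []
    for f in filters:
        scores = []
        for start in range(len(rows) - WINDOW + 1):
            window = [x for row in rows[start:start + WINDOW] for x in row]
            scores.append(sum(a * b for a, b in zip(f, window)))
        pooled.append(max(scores))
    return pooled

tokens = "Grepafloxacin inhibits the metabolism of Theophylline".split()
vec = textual_vector(tokens, drug1_idx=0, drug2_idx=5)
```

Whatever the sentence length, `vec` has `N_FILTERS` entries, which is what makes it usable as a fixed-size input to later layers.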

Method: DDI extraction from texts using molecular structures
• Text-based DDI representation
• Molecular structure-based DDI representation
Figure: the input sentence from the text corpus ("Grepafloxacin inhibits the metabolism of Theophylline") is mapped to word vectors and encoded by a CNN into a textual vector; Grepafloxacin and Theophylline from the DrugBank database are encoded by GCNs into a molecular vector; the vectors are concatenated to predict the DDI 8

Method: Molecular Structure-based DDI Representation
• We represent drug pairs in molecular graph structures using GCNs
• We pre-train the GCNs on DrugBank, using pairs mentioned as interacting (positive) and pairs not mentioned as interacting (pseudo negative)
Figure: GCNs encode Grepafloxacin and Theophylline into a molecular vector, from which the model predicts "interact" or "not mentioned" 9

Method: Molecular Structure-based DDI Representation
Graph Convolutional Network (GCN) [Li et al. 2016]
We use GCNs to convert a drug molecule graph into a fixed-size vector by aggregating node vectors
Figure: graph structure → GCN → molecular vector; node $v$ and node $w$ are connected by edge $e_{vw}$
$$m_v^{t+1} = \sum_{w \in N(v)} A_{e_{vw}} h_w^t$$
$$h_v^{t+1} = \mathrm{GRU}([h_v^t ; m_v^{t+1}])$$
• $h_v^t$: node vector
• $N(v)$: neighbors of $v$
• GRU: gated recurrent unit
• $[\dots ; \dots]$: concatenation
• $A$: learned weight 10

Method: Molecular Structure-based DDI Representation
Graph Convolutional Network (GCN) [Li et al. 2016]
The molecular vector $g$ is obtained from the node vectors after the final step $T$ and the initial node vectors:
$$g = \sum_v \sigma\big(i([h_v^T ; h_v^0])\big) \odot j([h_v^T ; h_v^0])$$
• $i$, $j$: linear layers
• $\odot$: element-wise product 11
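
A rough pure-Python sketch of this kind of graph network is given below. It follows the three steps named on the slide: a message m for each node v is the sum over neighbors w of an edge-type-specific weight matrix A applied to the neighbor's vector h; a GRU updates the node vector from the concatenation [h; m]; and a gated sum over nodes produces the molecular vector g from the final and initial node vectors. Everything concrete here is an assumption for illustration: the vector size, the number of steps, the random untrained parameters, the standard GRU gate equations, the choice of the previous node vector as the GRU state, and the logistic sigmoid for the readout gate.

```python
import math
import random

rng = random.Random(0)
D = 4  # node vector size (illustrative)
BOND_TYPES = ["single", "double", "triple", "aromatic"]

def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(M, x):
    return [sum(a * b for a, b in zip(row, x)) for row in M]

def add(x, y):
    return [a + b for a, b in zip(x, y)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

A = {b: rand_matrix(D, D) for b in BOND_TYPES}  # one weight matrix per edge type
W = {k: rand_matrix(D, 2 * D) for k in "zrn"}   # GRU input weights, input is [h; m]
U = {k: rand_matrix(D, D) for k in "zrn"}       # GRU state weights
I_LAYER = rand_matrix(D, 2 * D)                 # readout linear layer i
J_LAYER = rand_matrix(D, 2 * D)                 # readout linear layer j

def gru(x, h):
    """Standard GRU cell: update gate z, reset gate r, candidate n."""
    z = [sigmoid(a) for a in add(matvec(W["z"], x), matvec(U["z"], h))]
    r = [sigmoid(a) for a in add(matvec(W["r"], x), matvec(U["r"], h))]
    rh = [ri * hi for ri, hi in zip(r, h)]
    n = [math.tanh(a) for a in add(matvec(W["n"], x), matvec(U["n"], rh))]
    return [(1 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]

def molecule_vector(h0, edges, steps=2):
    """h0: {node: initial vector}; edges: undirected (v, w, bond_type) triples."""
    neighbors = {v: [] for v in h0}
    for v, w, bond in edges:
        neighbors[v].append((w, bond))
        neighbors[w].append((v, bond))
    h = dict(h0)
    for _ in range(steps):
        new_h = {}
        for v in h:
            m = [0.0] * D  # message: sum of A_e h_w over the neighbors of v
            for w, bond in neighbors[v]:
                m = add(m, matvec(A[bond], h[w]))
            new_h[v] = gru(h[v] + m, h[v])  # list "+" concatenates [h; m]
        h = new_h
    g = [0.0] * D
    for v in h:  # gated readout over all nodes
        cat = h[v] + h0[v]
        gate = [sigmoid(a) for a in matvec(I_LAYER, cat)]
        value = matvec(J_LAYER, cat)
        g = add(g, [a * b for a, b in zip(gate, value)])
    return g

# Toy three-atom graph (O=C-N skeleton) with random initial atom vectors.
atom_emb = {s: [rng.uniform(-0.5, 0.5) for _ in range(D)] for s in "CON"}
h0 = {0: atom_emb["C"], 1: atom_emb["O"], 2: atom_emb["N"]}
edges = [(0, 1, "double"), (0, 2, "single")]
g = molecule_vector(h0, edges)
```

Because both the messages and the readout are sums, the resulting vector does not depend on how the atoms are numbered, which is the property that lets one network handle molecules of any size.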

Method: DDI Extraction from Texts Using Molecular Structures
Figure: the sentence "Grepafloxacin inhibits the metabolism of Theophylline" is encoded with word + position embeddings and a CNN into a textual vector 12

Method: DDI Extraction from Texts Using Molecular Structures
• Link mentions in the text corpus to drug database entries by relaxed string matching
Figure: the mentions Grepafloxacin and Theophylline are linked to DrugBank entries by relaxed string matching 13

Method: DDI Extraction from Texts Using Molecular Structures
• Link mentions in the text corpus to drug database entries by relaxed string matching
• Obtain molecular vectors via GCNs with fixed parameters
Figure: GCNs encode the linked DrugBank entries for Grepafloxacin and Theophylline into a molecular vector 14

Method: DDI Extraction from Texts Using Molecular Structures
• Link mentions in the text corpus to drug database entries by relaxed string matching
• Obtain molecular vectors via GCNs with fixed parameters
• Predict DDIs from concatenated textual and molecular vectors
Figure: the textual vector from the CNN and the molecular vector from the GCNs are concatenated to predict DDI types 15
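
The final prediction step might look like the sketch below. The slides say only that the textual and molecular vectors are concatenated to predict DDI types; the single linear layer followed by a softmax, the vector sizes, and the random weights are assumptions made for illustration. The label names are the four interaction types plus the no-interaction case.

```python
import math
import random

rng = random.Random(0)
LABELS = ["Mechanism", "Effect", "Advice", "Interaction", "No interaction"]

def softmax(scores):
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict(textual_vec, mol_vec_1, mol_vec_2, weights, bias):
    """Concatenate the three vectors, apply a linear layer, then softmax."""
    features = textual_vec + mol_vec_1 + mol_vec_2  # list "+" concatenates
    scores = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, bias)]
    return softmax(scores)

# Toy inputs: a 3-dimensional textual vector and two 2-dimensional molecular vectors.
textual_vec = [0.2, -0.1, 0.4]
mol_vec_1, mol_vec_2 = [0.3, 0.0], [-0.2, 0.5]
n_features = len(textual_vec) + len(mol_vec_1) + len(mol_vec_2)
weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_features)] for _ in LABELS]
bias = [0.0] * len(LABELS)

probs = predict(textual_vec, mol_vec_1, mol_vec_2, weights, bias)
best_label = LABELS[probs.index(max(probs))]
```

Since the GCN parameters are fixed at this stage, only the text encoder and the layers after the concatenation would be trained on the text corpus.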

Task Settings
SemEval2013 shared task 9.2
• The data set is composed of documents annotated with drug mentions and their 4 types of interactions (Mechanism, Effect, Advice and Interaction) or no interaction
Table: Statistics of the DDI SemEval2013 shared task 16

Data for Pre-training GCNs
• We extracted 255,229 interacting (positive) pairs from DrugBank and generated the same number of pseudo negative pairs by randomly pairing DrugBank drugs
• We deleted drug pairs mentioned in the test set of the text corpus 17
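
The construction of the pre-training data can be sketched as below. This is an illustration under assumptions, not the authors' script: it treats pairs as unordered, rejects a randomly drawn pair if it is already a known positive, and uses placeholder drug names rather than real DrugBank entries.

```python
import random

def make_pseudo_negatives(drugs, positive_pairs, seed=0):
    """Randomly pair drugs until there are as many negatives as positives."""
    rng = random.Random(seed)
    drugs = list(drugs)
    positives = {frozenset(p) for p in positive_pairs}
    available = len(drugs) * (len(drugs) - 1) // 2 - len(positives)
    if available < len(positives):
        raise ValueError("not enough non-interacting pairs to sample from")
    negatives = set()
    while len(negatives) < len(positives):
        pair = frozenset(rng.sample(drugs, 2))
        if pair not in positives:  # keep only pairs not listed as interacting
            negatives.add(pair)
    return negatives

def drop_test_pairs(pairs, test_pairs):
    """Remove any pair that also occurs in the test set of the text corpus."""
    test = {frozenset(p) for p in test_pairs}
    return {frozenset(p) for p in pairs} - test

# Placeholder names, not real DrugBank data.
drugs = ["drug_a", "drug_b", "drug_c", "drug_d"]
positives = [("drug_a", "drug_b"), ("drug_c", "drug_d")]
negatives = make_pseudo_negatives(drugs, positives)
train_positives = drop_test_pairs(positives, test_pairs=[("drug_b", "drug_a")])
```

Storing each pair as a `frozenset` makes (a, b) and (b, a) compare equal, so a pair cannot slip into both the positive and the negative set under a different ordering.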

Molecular Structure Features
• To obtain the graph of a drug molecule, we took as input the SMILES string encoding of the molecule from DrugBank and then converted it into the 2D graph structure using RDKit
• For the initial atom (node) vectors, we used randomly embedded vectors for atoms, i.e., C, O, N, …
• We also used 4 bond (edge) types: single, double, triple, and aromatic 18
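
A conversion along these lines could be written with RDKit roughly as follows. This is a hedged sketch rather than the authors' code: the function name `smiles_to_graph`, the output format, and the guarded import (which only lets the module load where RDKit is absent) are choices made for this example. The RDKit calls used are `Chem.MolFromSmiles`, `GetAtoms`, `GetSymbol`, `GetBonds`, `GetBeginAtomIdx`, `GetEndAtomIdx` and `GetBondType`.

```python
try:
    from rdkit import Chem  # third-party dependency named on the slide
except ImportError:  # allows the sketch to load where RDKit is not installed
    Chem = None

def smiles_to_graph(smiles):
    """Return (atom symbols, bonds) where bonds are (begin, end, type) triples."""
    if Chem is None:
        raise RuntimeError("RDKit is required for this sketch")
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    bond_names = {
        Chem.rdchem.BondType.SINGLE: "single",
        Chem.rdchem.BondType.DOUBLE: "double",
        Chem.rdchem.BondType.TRIPLE: "triple",
        Chem.rdchem.BondType.AROMATIC: "aromatic",
    }
    atoms = [atom.GetSymbol() for atom in mol.GetAtoms()]
    bonds = []
    for bond in mol.GetBonds():
        kind = bond_names.get(bond.GetBondType())
        if kind is None:
            raise ValueError(f"unsupported bond type: {bond.GetBondType()}")
        bonds.append((bond.GetBeginAtomIdx(), bond.GetEndAtomIdx(), kind))
    return atoms, bonds

if Chem is not None:
    atoms, bonds = smiles_to_graph("CCO")  # ethanol, used only as a small example
```

Hydrogens are implicit in the parsed molecule by default, so the graph contains heavy atoms only; each atom symbol would then index into the randomly initialized atom embedding table.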

Differences of Labels in Text and Database Tasks
• Interacting drug pairs in the database may not appear as positive instances in the text task
• The text task defines 4 detailed types, while the database task has one positive type
Example (Mechanism): "Grepafloxacin inhibits the metabolism of Theophylline"
Example (No relation): "While the effect of Grepafloxacin on the metabolism of C.P.A substrates is not evaluated, in vitro data suggested similar effects of Grepafloxacin in Theophylline metabolism" 19

Training Settings
• Mini-batch training using the Adam optimizer with L2 regularization
• Word embeddings trained by the word2vec tool on the 2014 MEDLINE/PubMed baseline distribution
– Skip-gram
– Vocabulary size: 215k 20

Training Settings: Hyper-parameters
Table: Hyper-parameters for text-based model
Table: Hyper-parameters for molecule-based model 21

Evaluation on Relaxed String Matching
• How many of the drug mentions in texts are linked to DrugBank entries by relaxed string matching?
– We lowercased the mentions and the names in the entries and chose the entries with the most overlaps
– As a result, 92.15% and 93.09% of the drug mentions in the train and test SemEval2013 data sets matched DrugBank entries 22
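
One way to read this matching procedure is sketched below. The slide does not say what unit the overlap is counted in, so counting shared whitespace-separated tokens is an assumption, as are the function names and the placeholder entry identifiers (they are not real DrugBank IDs).

```python
def relaxed_match(mention, entries):
    """Return the entry id whose lowercased name shares the most tokens, or None."""
    mention_tokens = set(mention.lower().split())
    best_id, best_overlap = None, 0
    for entry_id, names in entries.items():
        for name in names:
            overlap = len(mention_tokens & set(name.lower().split()))
            if overlap > best_overlap:
                best_id, best_overlap = entry_id, overlap
    return best_id

def coverage(mentions, entries):
    """Fraction of mentions that are linked to some entry."""
    linked = sum(1 for m in mentions if relaxed_match(m, entries) is not None)
    return linked / len(mentions) if mentions else 0.0

# Placeholder database: entry ids are hypothetical.
entries = {"entry_1": ["Grepafloxacin"], "entry_2": ["Theophylline"]}
link = relaxed_match("THEOPHYLLINE tablets", entries)
rate = coverage(["Grepafloxacin", "aspirin"], entries)
```

Mentions with no overlapping token stay unlinked, which is how a coverage figure below 100% arises; such drug pairs would have no molecular vector available from the database.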

Evaluation on DDI Extraction from Texts (SemEval2013 Shared Task)
• Using molecular structures increases the micro F-score by 2.39 pp over the text-only model
Figure: bar chart of micro F-score (%), axis from 68 to 73, comparing Text-Only, Text + Molecular Structure, Zheng et al. 2017 and Lim et al. 2018 23
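
For reference, a micro-averaged F-score over the interaction types can be computed as in the sketch below. Excluding the no-interaction class from the counts is an assumption here (it is the usual convention for this kind of relation extraction task, but the slide does not state it), and the gold and predicted labels are toy values, not results from the presentation.

```python
def micro_f_score(gold, predicted, negative_label="No interaction"):
    """Pool true/false positives and false negatives over all positive types."""
    tp = fp = fn = 0
    for g, p in zip(gold, predicted):
        if p != negative_label and p == g:
            tp += 1
        else:
            if p != negative_label:
                fp += 1  # predicted an interaction type that is wrong
            if g != negative_label:
                fn += 1  # missed (or mislabeled) a gold interaction
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Toy labels for illustration only.
gold = ["Mechanism", "Effect", "No interaction", "Advice"]
pred = ["Mechanism", "No interaction", "Effect", "Advice"]
score = micro_f_score(gold, pred)
```

A difference reported in pp (percentage points) is the plain difference between two such scores expressed in percent.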

Analysis
Can molecular structures alone represent DDIs in texts?
- Low F-score (23.90%)
- This might be because the drug pairs that interact can appear in the textual context that does not describe their interactions
Figure: the model diagram with the CNN textual branch and the GCN molecular branch (Grepafloxacin and Theophylline from the DrugBank database); the output is "interact" or "not interact" 24

Conclusions
• We proposed a novel neural method for DDI extraction using both textual and molecular information
• The molecular information has improved DDI extraction performance
• As future work, we will investigate the use of other information in DrugBank 25
