Drug Discovery using Grid Technologies Yuichiro Inagaki - - PowerPoint PPT Presentation




slide-1
SLIDE 1

Drug Discovery using Grid Technologies

Yuichiro Inagaki Biotechnology division Fuji Research Institute co.

slide-2
SLIDE 2

Outline

Needs for grid technologies in drug discovery

g-Drug Discovery system

Test calculation results

slide-3
SLIDE 3

Needs for grid technology in drug discovery

Increase in the number of both drug candidate compounds and targets

10^7 molecules × 10^3 conformations
Screening throughout a family: kinases, GPCRs, …

Various types of calculation

Druglikeness screening
ADME/Tox screening
Conformational search
Pharmacophore screening
Docking
Molecular orbital methods

More CPU power
Seamless connection

slide-4
SLIDE 4

“g-Drug Discovery”

Funded by the Japan Science and Technology Agency (JST)

Components

DB system, Conflex-G, Xsi-G, REMD, FMO

[Figure: g-Drug Discovery platform on a Data Grid. Labels: 2D-Drug Libraries; 3D-Drug Libraries; Candidates for Docking; Hitlist; Filtering / Data mining; Overlap scoring / Clustering; Kohonen mapping; Drug-likeness; Exhaustive conformational analysis; Pharmacophore mapping / Docking calculations; Pharmacophore calculations; Scoring based on ∆G; DrugML; Conflex-G; Xsi-G; FMO(-MD); REMD; AbinitMP-G; Grid environment]

slide-5
SLIDE 5

DrugML: an XML Schema for drug design

  • Use tags of CML as much as possible
  • Conformers
  • Complex
  • Descriptors

[Figure: data structure of DrugML. Elements: drugml, universeList, universe, molecule, conformationList, conformation, atomArray (any tag of cml:atomArray), bondArray (any tag of cml:bondArray), descriptor2D, Descriptor1D, Descriptor3D, DescriptorWHIM; any tag of cml:molecule]
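To make the element hierarchy on this slide concrete, here is a minimal sketch that builds a DrugML-like document with Python's standard library. The element names come from the slide; the exact nesting, the "id" attributes, and the omission of XML namespaces are assumptions made for illustration, not the published schema. The atom attributes (elementType, x3, y3, z3) follow CML conventions.

```python
import xml.etree.ElementTree as ET

def build_drugml(name, conformations):
    """Build a DrugML-like tree for one molecule.

    conformations: list of conformers, each a list of (element, x, y, z).
    The nesting used here is an assumed reading of the slide.
    """
    root = ET.Element("drugml")
    universe_list = ET.SubElement(root, "universeList")
    universe = ET.SubElement(universe_list, "universe")
    molecule = ET.SubElement(universe, "molecule", {"id": name})  # "id" is assumed
    conf_list = ET.SubElement(molecule, "conformationList")
    for i, atoms in enumerate(conformations, start=1):
        conf = ET.SubElement(conf_list, "conformation", {"id": f"c{i}"})
        atom_array = ET.SubElement(conf, "atomArray")
        for j, (element, x, y, z) in enumerate(atoms, start=1):
            # elementType / x3 / y3 / z3 are CML atom attributes
            ET.SubElement(atom_array, "atom", {
                "id": f"a{j}",
                "elementType": element,
                "x3": str(x), "y3": str(y), "z3": str(z),
            })
    return root

# Two toy conformers of a two-atom fragment (coordinates are made up).
doc = build_drugml("demo", [
    [("C", 0.0, 0.0, 0.0), ("O", 1.2, 0.0, 0.0)],
    [("C", 0.0, 0.0, 0.0), ("O", 0.0, 1.2, 0.0)],
])
xml_text = ET.tostring(doc, encoding="unicode")
```

Keeping every conformer under one conformationList means a single molecule record can carry the whole conformational ensemble, which is what the later screening and docking steps consume.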

slide-6
SLIDE 6

Structure of DB system

[Figure: structure of the DB system. Labels: Xsi; Browser; ID Server; XML-RPC Interface (ID Operation, General DB Operations); DrugML or CML; Servlet; XML:DB; Xindice; HTTP (XML-RPC); HTTP. Layers: Application, Schema, General Interface, DB Connection, Database]
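The slide shows clients talking to the database over HTTP with XML-RPC. As a rough illustration of what such a call looks like on the wire, the sketch below serializes a request with Python's standard xmlrpc.client module and parses it back. The method name "db.search" and its arguments are hypothetical; the slide does not list the actual remote methods.

```python
import xmlrpc.client

# Hypothetical remote method and arguments (not taken from the slides).
request_xml = xmlrpc.client.dumps(("viracept", "drugml"), methodname="db.search")

# A server would parse the payload back into parameters and a method name.
params, method = xmlrpc.client.loads(request_xml)
```

Because the payload is plain XML over HTTP, the same interface can be reached from a servlet, a browser-side tool, or a screening program such as Xsi.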

slide-7
SLIDE 7

OmniRPC: a Grid RPC system for parallel programming

  • Supports typical master-worker grid applications such as docking simulation.
  • Users can use the same program for both clusters and grids.
  • Supports a local environment with "rsh", a grid environment with Globus, and remote hosts with "ssh".
  • OmniRPC inherits its API from Ninf. Because the API is designed to be thread-safe, the programmer can use OpenMP for easy-to-use parallel programming.
  • For a cluster on a private network, an agent process running on the server host functions as a proxy to relay communications between the client and the remote executables.
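The master-worker pattern mentioned in the first bullet can be sketched as follows. This is not the OmniRPC API: it uses Python's concurrent.futures with local threads as a stand-in for remote workers, and the dock function and its scoring formula are placeholders invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor

def dock(conformation_id):
    """Stand-in for one docking run; returns (id, score).

    The score is a deterministic placeholder, not a real docking energy.
    """
    score = (conformation_id * 37) % 101 / 10.0
    return conformation_id, score

def run_master(conformation_ids, n_workers=4):
    """Master: hand independent docking jobs to workers, then rank the results."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(dock, conformation_ids))
    # Lower placeholder score first, as one would rank by energy.
    return sorted(results, key=lambda r: r[1])

ranked = run_master(range(10))
```

Docking jobs are independent of each other, so the master only needs to distribute inputs and collect scores; that independence is what makes the workload a good fit for a grid.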

slide-8
SLIDE 8

Xsi 2.0

Combines Ligand Based Drug Design and Structure Based Drug Design

Monte Carlo, minimization, and docking with the MMFF94s force field

2D & 3D descriptors
Statistics, clustering, similarity
Machine learning by support vector machine

slide-9
SLIDE 9

LigandAlignment

  • Optimizes similarity between pharmacophore map and ligand
  • Pharmacophore map can be defined by physico-chemical properties and voids
  • VDW, hydrophobicity, HD, HA, aromaticity, electrostatic…
  • 0.6 sec per alignment (viracept)
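
A back-of-the-envelope estimate shows why the alignment time matters at library scale. The sketch below combines the 0.6 seconds per alignment from this slide with the 10^7 molecules × 10^3 conformations quoted on slide 3 and the 72 CPUs of the test environment. It assumes one alignment per conformation and perfect parallel scaling, both of which are simplifications.

```python
SECONDS_PER_ALIGNMENT = 0.6        # from this slide (viracept)
N_MOLECULES = 10**7                # from slide 3
N_CONFORMATIONS = 10**3            # from slide 3
SECONDS_PER_YEAR = 365 * 24 * 3600

def screening_years(n_cpus):
    """Wall-clock years for the full library, assuming ideal scaling."""
    total_seconds = N_MOLECULES * N_CONFORMATIONS * SECONDS_PER_ALIGNMENT
    return total_seconds / n_cpus / SECONDS_PER_YEAR

single_cpu_years = screening_years(1)   # roughly 190 years
grid_years = screening_years(72)        # roughly 2.6 years
```

Under these assumptions a single CPU would need about 190 years, which is the motivation for spreading the work over a grid.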
slide-10
SLIDE 10

Map of binding site

HIV protease and inhibitor (DMP323)

slide-11
SLIDE 11

Alignment onto binding site

Binding Site Map Alignment of JG-365

slide-12
SLIDE 12

Pharmacophore screening by LigandAlignment

Hit rate (top 10% of ranked DB) ~ 50%

[Figure: score versus rank for the screened database; legend: random, hit. Known inhibitors by rank: Rank 1 JG-365, Rank 2 viracept, Rank 3 L735524, Rank 4 SB203386, Rank 5 DMP323]

97 random compounds + 5 known HIV protease inhibitors
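
The hit rate on this slide can be reproduced with a few lines of arithmetic. The sketch assumes, as the figure suggests, that the five known inhibitors occupy ranks 1 to 5 of the 102-compound database, and that "top 10%" means the first 10 compounds (102 × 0.10 rounded down). The enrichment factor is a standard virtual-screening metric added here for context; it is not quoted on the slide.

```python
def hit_rate(ranked_is_hit, fraction):
    """Fraction of known actives among the top `fraction` of a ranked list."""
    top_n = int(len(ranked_is_hit) * fraction)
    return sum(ranked_is_hit[:top_n]) / top_n

# 5 known HIV protease inhibitors at ranks 1-5, then 97 random compounds.
ranked = [True] * 5 + [False] * 97

rate = hit_rate(ranked, 0.10)            # 5 hits in the top 10 compounds
baseline = sum(ranked) / len(ranked)     # hit rate of a random pick
enrichment = rate / baseline
```

With all five inhibitors in the top ten, the hit rate is 5/10, which matches the roughly 50% stated on the slide.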

slide-13
SLIDE 13

Docking flow

[Figure: docking flow for the ligand (viracept) and the receptor]

Flow: Monte Carlo (ligand) and finding the binding site (receptor); calculate WHIMs for each; calculate similarities between ligand and pocket; sort ligands by similarity; alignment using WHIMs; docking on the workers, distributed by the master.
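
The "calculate similarities" and "sort ligands" steps can be sketched as below. The slides do not say which similarity measure Xsi uses, so the Euclidean-distance-based score here is an assumption, and the descriptor vectors are made-up numbers rather than real WHIM values.

```python
import math

def similarity(a, b):
    """Similarity in (0, 1] derived from Euclidean distance (assumed measure)."""
    return 1.0 / (1.0 + math.dist(a, b))

def rank_conformations(pocket_descriptor, conformer_descriptors):
    """Return conformer names sorted from most to least similar to the pocket."""
    scored = [
        (similarity(pocket_descriptor, descriptor), name)
        for name, descriptor in conformer_descriptors.items()
    ]
    scored.sort(reverse=True)
    return [name for _, name in scored]

# Toy descriptor vectors (hypothetical values).
pocket = (1.0, 2.0, 3.0)
conformers = {"a": (1.0, 2.0, 3.0), "b": (4.0, 6.0, 3.0), "c": (1.0, 2.0, 4.0)}
order = rank_conformations(pocket, conformers)
```

Ranking by a cheap descriptor similarity first means the expensive docking step can start with the conformations most likely to fit the pocket.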

slide-14
SLIDE 14

Calculation Environment

Cluster   Location                        CPU                  Number of nodes   RTT (ms)
fslin     Fuji RIC (Tokyo)                Dual Xeon 2.4 GHz    5                 0.19
dennis    Tsukuba University (Tsukuba)    Dual Xeon 2.4 GHz    10                27.2
                                          Dual Xeon 3.0 GHz    5
alice     Tsukuba University (Tsukuba)    Dual Athlon 1800+    16                27.2

Total : 1 master + 71 workers

slide-15
SLIDE 15

Speedup of calculations

[Figure: speedup (y-axis, 0 to 80) for each configuration, with the number of CPUs in parentheses: fslin (1), fslin (10), dennis (30), alice (32), fslin+dennis (40), fslin+alice (42), dennis+alice (62), all (72)]
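
For readers unfamiliar with the metric, speedup on n CPUs is conventionally the single-CPU time divided by the n-CPU time, and parallel efficiency is speedup divided by n. The sketch below applies those definitions; the timings are hypothetical placeholders chosen for easy arithmetic, not the measurements behind the chart.

```python
def speedup_and_efficiency(t_one, timings):
    """Map CPU count -> (speedup, efficiency) relative to a one-CPU time."""
    return {n: (t_one / t, t_one / t / n) for n, t in timings.items()}

# Hypothetical wall-clock times in seconds (NOT measured values).
baseline_seconds = 7200.0
timings = {10: 800.0, 72: 125.0}

table = speedup_and_efficiency(baseline_seconds, timings)
```

Efficiency below 1.0 reflects overheads such as network round-trip time between sites and heterogeneous CPU speeds.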

slide-16
SLIDE 16

Docking results

X-ray (yellow) Comp.(white) RMSD:1.77Å
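
RMSD here is the root-mean-square deviation between matching atoms of the computed and X-ray poses. A minimal sketch of the calculation follows; it assumes both coordinate lists are in the same atom order and the same reference frame, with no superposition step, and the sample coordinates are made up.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered coordinate sets."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

# Toy example: every atom is displaced by 1 Å along z.
xray = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
computed = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
deviation = rmsd(xray, computed)
```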

slide-17
SLIDE 17

Summary

Hit rate of more than 50% can be achieved

Protein family screening: LigandAlignment on grid necessary

slide-18
SLIDE 18

Acknowledgements

Hiroshi Chuman (Tokushima Univ.)
Umpei Nagashima (AIST)
Hitoshi Goto (Toyohashi Univ. of Technology)
Mitsuhisa Sato (Tsukuba Univ.)

slide-19
SLIDE 19

Back up Slides

slide-20
SLIDE 20

LigandAlignment

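(r) (c1) (c2)
Optimization of the pharmacophore map by ligand alignment.
(r) Map of MS (atomic mass) for the reference molecule (benzene)
(c1) Map of MS (atomic mass) for the candidate molecule (toluene) before optimization
(c2) Map of MS (atomic mass) for the candidate molecule (toluene) after optimization
The toluene molecule is optimized so that the placement of its benzene ring approaches that of the benzene ring in the benzene molecule. The figure shows isosurfaces of the maps. The number of grid points is 32*32*32.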

slide-21
SLIDE 21

References

[1] Mitsuhisa Sato, Taisuke Boku, Daisuke Takahashi, OmniRPC: a Grid RPC system for Parallel Programming in Cluster and Grid Environment, 3rd International Symposium on Cluster Computing and the Grid (CCGrid2003), May 12 - 15, 2003, Tokyo, Japan.
[2] Mitsuhisa Sato, Motonari Hirano, Yoshio Tanaka, Satoshi Sekiguchi, OmniRPC: A Grid RPC Facility for Cluster and Global Computing in OpenMP, WOMPAT 2001, 130-136.
[3] http://www.omni.hpcc.jp/OmniRPC/index.html.en
[4] http://csb.stanford.edu/koehl/ProShape/
[5] Todeschini, R. and Gramatica, P., New 3D Molecular Descriptors: The WHIM theory and QSAR Applications, In 3D QSAR in Drug Design Volume 2, Eds., Kubinyi, H., Folkers, G. and Martin, Y.C., 355-380, KLUWER/ESCOM, Dordrecht, 1998.
[6] R. Todeschini, M. Lasagni, and E. Marengo, “New Molecular Descriptors for 2D and 3D Structures. Theory”, J. Chemometrics, 8, pp. 263-272, (1994).
[7] http://www.fuji-ric.co.jp/st/xsi/index.html
[8] http://www.xml-cml.org/
[9] http://xml.apache.org/
[10] http://www.w3.org/XML/Schema

Fig. 1. Platform for drug discovery

1. Introduction
Grid technologies can connect many computer resources, such as CPUs and storage, over networks to construct a huge virtual computing environment. Our project “g-Drug Discovery” aims to develop a platform for drug design using grid technologies, on which various analyses and calculations are conducted, such as the molecular mechanics method, the replica exchange method, docking with proteins, the molecular orbital method, and 3-dimensional quantitative structure-activity relationships. In this poster we present the following:
  • DrugML: the markup language for drug discovery
  • Database system for drug discovery: our database system, which stores 3-dimensional structures of molecules
  • Docking calculations using grid technology: ligand-receptor docking simulation using “Xsi” and “OmniRPC”

2. DrugML (Drug Markup Language)
DrugML (Drug Markup Language) is a markup language for drug discovery whose specification was newly defined by our project. It is defined by XML Schema [10], so its files can be validated strictly with an existing XML parser (such as Xerces [9]). DrugML imports tags from CML (Chemical Markup Language) [8] as much as possible. The tag “universe” represents a snapshot of two or more molecules. The tag “conformation” represents the 3-dimensional structure of a molecule or universe. The tags “descriptor*” represent various descriptors of a molecule, such as WHIM (Weighted Holistic Invariant Molecular) descriptors.

Fig. 2. Data structure of DrugML. “Any tag of cml:molecule” means that any element under “molecule” of CML may exist in that place.

3. DB system for drug discovery

Fig. 3. Abstract structure of DB system
Fig. 4. Framework of DB system
Fig. 5. Implementation of DB system

Grid RPC system: OmniRPC is a grid RPC system that enables seamless parallel programming in cluster and grid environments ([1],[2],[3], Fig. 6).

4. Docking calculations using grid technologies
We performed docking calculations between viracept (Fig. 7) and HIV protease using Xsi and OmniRPC. We assumed only the conformation of HIV protease; neither the 3-dimensional structure of viracept nor the location of the binding site on HIV protease was assumed. Fig. 8 shows the flow chart of the calculations. We generated 10000 initial conformations of viracept by the Monte Carlo method and found the binding site of the receptor using ProShape [4]. Each conformation is aligned using WHIM descriptors [5][6]. Our experimental environment is described in Table 1, and the results of the calculations are shown in Fig. 9 and Fig. 10.

Fig. 10. (Right figure) Comparison of the computed viracept (white line) with the X-ray structure (yellow line). The RMSD of the two conformations is 1.77 Å. (Left figure) Complex of the X-ray HIV protease (stick line) and the calculated viracept (line and ball).

Table 1. Experimental environment. The master node is the top node of fslin.

Suite for virtual screening: Xsi (ku-su-shi, [7]) is a suite for virtual screening based on Molecular Mechanics (MM), which we have developed. It has the following features:
  • Exhaustive search, Monte Carlo simulation, and docking simulation based on Molecular Mechanics
  • Various descriptors, similarity, clustering, superimposition
  • Configurable by script

Drug discovery using grid technologies and DrugML

Michiaki Hamada1, Yuichiro Inagaki1, Hitoshi Goto2, Umpei Nagashima3 , Shigenori Tanaka4, and Hiroshi Chuman5

1Fuji Research Institute Corporation, 2Toyohashi University of Technology, 3National Institute of Advanced Industrial Science and Technology, 4Toshiba Research and Development Center, and 5Tokushima University

CINF 47

Fig. 9. (Left figure) Speedup of calculations. This graph was obtained with 10 initial conformations (that is, each executes 240 docking simulations). The values in parentheses on the X-axis are the numbers of CPUs. The speedup is relative to the time measured on one node of fslin. (Right figure) The time the calculations took. The actual number of docking calculations is three times the number of initial conformations.

Fig. 8. Flow chart of calculations

Fig. 6. OmniRPC

We have developed the database system shown in Fig. 1, which adopts DrugML as the structure of the stored data. Fig. 3 shows the abstract structure of the system, and Fig. 4 shows the framework, which lets us swap the kind of database easily. We assume a native XML database, but native XML databases have no standard query language comparable to SQL and no standard connection method comparable to ODBC (Open DataBase Connectivity) for relational databases (RDBs). The framework absorbs these differences between XML databases. We have implemented it in C++ on the native XML database Xindice [9] (Fig. 5).
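
The idea of absorbing differences between XML databases behind a common interface can be illustrated with a small sketch. The authors implemented their framework in C++; the Python below is only an illustration of the layering, and while the class names echo labels in Fig. 5, the methods, signatures, and the in-memory backend are assumptions.

```python
from abc import ABC, abstractmethod

class QueryGenerator(ABC):
    """Schema-independent interface; one subclass per query language."""

    @abstractmethod
    def search_query(self, element, attribute, value):
        """Return a query string selecting elements by attribute value."""

class XPathGenerator(QueryGenerator):
    """Generates XPath, as a database such as Xindice would accept."""

    def search_query(self, element, attribute, value):
        return f"//{element}[@{attribute}='{value}']"

class DBManager:
    """Common entry point; callers never see the query language in use."""

    def __init__(self, generator, backend):
        self._generator = generator
        self._backend = backend  # callable that executes a query string

    def search(self, element, attribute, value):
        query = self._generator.search_query(element, attribute, value)
        return self._backend(query)

# A fake backend that just echoes the query it was asked to run.
manager = DBManager(XPathGenerator(), backend=lambda query: [query])
result = manager.search("molecule", "id", "viracept")
```

Swapping to a database with a different query language would then mean adding another QueryGenerator subclass, leaving application code unchanged.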

Fig. 7. Viracept, a kind of HIV protease inhibitor

[Fig. 9 (right) axes: number of initial conformations (2000 to 12000) and time in seconds (20000 to 200000)]

[Fig. 4 layers: Application (application dependent layer); Schema (schema dependent layer); General DB Interface (common method to DB); DB Connection (DB connection layer); Database (actual database)]

[Fig. 5 classes: HighLevel DBManager, DBManager, DataBase (search, update, register), DataBaseImpl, QueryGenerator, UpdataQuery Generator, SearchQuery Generator, XUpdata Generator, Xpath Generator, XQuery Generator]

slide-22
SLIDE 22

Flow of Screening

3D structure generation by Conflex-G
Screening by pharmacophore (Xsi)
Docking (Xsi-G)