EGI-InSPIRE Cheminformatics platform for drug discovery application



SLIDE 1

www.egi.eu EGI-InSPIRE RI-261323

EGI-InSPIRE


Cheminformatics platform for drug discovery application

Hsi-Kai Wang, Academia Sinica Grid Computing
EGI User Forum, 13 April 2011

SLIDE 2

  • Introduction to drug discovery
  • Computing requirements of high-throughput virtual screening
  • Cheminformatics case study

SLIDE 3

  • Computational chemistry / molecular modeling: useful across the pipeline, but with very different techniques
  • Aim for success, but if not: fail early, fail cheap

Ref: Makus R. and Ralph W., Nature Rev. Drug Discov. (2003), 2, 123-131

Drug discovery development

SLIDE 4

Strategy in drug discovery

Matrix columns: Ligand unknown | Ligand known

  • Receptor (3D structure) unknown: Combichem, HTS, Virtual Screening, Pharmacophore, Similarity, QSAR
  • Receptor (3D structure) known: Receptor-based searching, De novo design, Structure-based drug design, Receptor-ligand interaction, Docking

SLIDE 5

  • What is a grid?
  • Many definitions exist in the literature
  • Foster and Kesselman, 1998: “A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational facilities.”
  • A grid can provide
  • Large-scale, on-demand resources
  • Computing resources (computing grids)
  • Storage resources (data grids)

Drug discovery on Grid (1/2)

SLIDE 6

[Figure labels: chemical compounds; receptor structures; molecular docking (…takes years); data challenge on Grid (…can be done in weeks); hits sorting and refining; in vitro screening of best ~50 hits]

Problem

– Millions of compounds and drug molecules are presently available for screening
– But developing an efficient laboratory assay for such work is time-consuming and very expensive

Solution

– Grids offer high-speed computing and large-scale data management capability
– Possible variant targets can be studied quickly with present modelling applications
– This will help medicinal chemists respond to major, sudden threats

Drug discovery on Grid (2/2)

SLIDE 7

GVSS, GAP Virtual Screening Service

SLIDE 8

GAP Service Architecture


SLIDE 9

  • A lightweight framework for parallel scientific applications in the master-worker model
  • The framework takes care of all synchronization, communication, and workflow management details on behalf of the application

[Figure labels: User Application Interface; GRID environments]

DIANE, DIstributed ANalysis Environment

SLIDE 10

  • Each horizontal line segment = one task = one docking
  • Unhealthy workers are removed from the worker list
  • Failed tasks are rescheduled to healthy workers

[Figure annotations: the “bad” worker removed; good load balance]

The profile of a DIANE job
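
The behaviour described on this slide (failed tasks rescheduled, unhealthy workers dropped) can be sketched as a small master loop. This is a sequential toy model under stated assumptions, not DIANE's actual API: `run_master`, `run_task`, `flaky_dock`, and the worker names are hypothetical, and a real master would dispatch tasks to remote workers in parallel.

```python
from collections import deque


def run_master(tasks, workers, run_task, max_failures=1):
    """Hand tasks to workers in turn; retry failed tasks and drop bad workers.

    run_task(worker, task) is a hypothetical callable that returns a result
    or raises an exception when the worker fails.
    """
    pending = deque(tasks)
    healthy = list(workers)
    failures = {w: 0 for w in workers}
    results = {}
    turn = 0
    while pending:
        if not healthy:
            raise RuntimeError("no healthy workers left")
        worker = healthy[turn % len(healthy)]
        turn += 1
        task = pending.popleft()
        try:
            results[task] = run_task(worker, task)
        except Exception:
            pending.append(task)            # reschedule the failed task
            failures[worker] += 1
            if failures[worker] >= max_failures:
                healthy.remove(worker)      # remove the unhealthy worker
    return results, healthy


def flaky_dock(worker, task):
    """Hypothetical docking task: one worker always fails."""
    if worker == "w-bad":
        raise RuntimeError("worker lost")
    return f"{task} docked on {worker}"


results, healthy = run_master(
    tasks=[f"ligand-{i}" for i in range(6)],
    workers=["w1", "w-bad", "w2"],
    run_task=flaky_dock,
)
```

In this run the task that landed on the failing worker is completed later by a healthy one, which mirrors the rescheduling shown in the job profile.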

SLIDE 11

  • 280 DIANE worker agents were submitted as LCG jobs
  • 200 jobs (~71%) were healthy
    – ~16% failures related to middleware errors
    – ~12% failures related to application errors

[Figure annotations: DIANE utilizes ~95% of the healthy resources; stable throughput]

Efficiency and throughput of DIANE
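
As a quick consistency check of the figures on this slide, the percentages can be recomputed from the job counts. The variable names are illustrative, and reading "~95% of the healthy resources" as a fraction of the 200 healthy workers is an assumption.

```python
submitted = 280          # DIANE worker agents submitted as LCG jobs
healthy = 200            # agents reported healthy

healthy_fraction = healthy / submitted          # about 0.71
failed_fraction = 1 - healthy_fraction          # about 0.29
reported_failures = 0.16 + 0.12                 # middleware + application errors

# Assumption: ~95% utilization applies to the healthy workers only.
effective_workers = 0.95 * healthy
```

The two reported failure classes (about 28% together) account for essentially all of the roughly 29% of agents that were not healthy.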

SLIDE 12

GVSS application: dengue virus

Ref: Hsin-Yen C. et al, J Grid Computing (2010), 8, 529-541

SLIDE 13

Ref: Clark G.G., "Dengue: An emerging arboviral disease", 2006

Worldwide dengue distribution

[Map legend: areas infested with Aedes aegypti; areas with Ae. aegypti and dengue epidemics]

SLIDE 14

Ref: http://en.wikipedia.org/wiki/Aedes
Ref: Kuhn, R.J. et al., Cell 108, 717−725 (2002)

Dengue virus

SLIDE 15

Ref: PDB: 2vbc (2008) J.Virol. 82: 173

[Figure labels: catalytic triad residues H51, D75, S135]

Dengue NS3 protease

SLIDE 16

Dengue Fever Data Challenge / resources & 1st result

Total number of completed docking jobs     300,000
Estimated needed computing power           4,167 CPU*days
Duration of the experiment                 60 days
Cumulative computing results               42.5 GB
Total computing resources in EUAsia VO     268 cores
Number of used Computing Elements          6
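
The figures above imply a mean cost per docking job and a lower bound on wall-clock time, which can be worked out directly. This is a back-of-the-envelope sketch: the variable names are illustrative, and it assumes the 4,167 CPU*days estimate covers exactly the 300,000 completed jobs.

```python
jobs = 300_000          # completed docking jobs
cpu_days = 4_167        # estimated computing power, in CPU*days
cores = 268             # total cores in the EUAsia VO
duration_days = 60      # duration of the experiment

# Mean CPU time per docking job, in minutes.
minutes_per_job = cpu_days * 24 * 60 / jobs

# Wall-clock time if all cores were busy for the whole run (ideal case).
ideal_days = cpu_days / cores

# Average number of cores kept busy over the actual 60-day run.
average_busy_cores = cpu_days / duration_days
```

Under these assumptions each docking costs about 20 CPU minutes, the ideal run would take roughly 15.5 days on all 268 cores, and the 60-day duration corresponds to about 69 cores busy on average.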

SLIDE 17

  • Accumulated computing resources in EUAsia VO: 268 CPU cores (100 – ASGC (TW), 2 – TH, 4 – VN, 18 – MIMOS (MY), 80 – UPM (MY), 64 – CESNET (CZ))
  • lcg-infosites --vo euasia ce
  • Registered VQS accounts:
  • 6 users (TW)
  • 17 users (PH, 15 in AdMU, 2 in ASTI)
  • 2 users (TH, 1 in NECTEC, 1 in HAII)
  • 1 user (MY, UPM)
  • 1 user (ID, ITB)
  • 2 users (VN, IAMI)
  • 1 user (FR, HealthGrid)

Joint Computing Resources & Users
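
The per-site core counts and per-country user counts listed on this slide can be tallied to confirm the stated total of 268 cores. A minimal sketch; the dictionary names are illustrative and the values are copied from the slide.

```python
cores_by_site = {
    "ASGC (TW)": 100,
    "TH": 2,
    "VN": 4,
    "MIMOS (MY)": 18,
    "UPM (MY)": 80,
    "CESNET (CZ)": 64,
}

users_by_country = {
    "TW": 6,
    "PH": 17,
    "TH": 2,
    "MY": 1,
    "ID": 1,
    "VN": 2,
    "FR": 1,
}

total_cores = sum(cores_by_site.values())
total_users = sum(users_by_country.values())
```

The site counts add up to the 268 cores quoted on the slide, and the registered accounts add up to 30 users across seven countries.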

SLIDE 18

Integration of SG (service grid) & DG (desktop grid) by EDGeS

SLIDE 19

Scenario 1 – DG to SG via bridge


SLIDE 20

Scenario 2 – SG to DG via bridge


SLIDE 21

Scenario 3 – SG/DG resources but not through EDGeS bridges

[Figure labels: Job Manager; Task Manager]

SLIDE 22

Web UI Service Architecture


SLIDE 23

Prototype Web UI Screenshot


SLIDE 24

Simulation of drug discovery workflow

[Workflow figure labels: Ligand; Protein; Preparing ligand & protein; Docking; Scoring; Generating conformation; Analyzing & ranking data]

SLIDE 25

Protein Database

Ref: PDB, http://www.rcsb.org/pdb/home/home.do
Ref: PDBbind, http://sw16.im.med.umich.edu/databases/pdbbind/index.jsp

SLIDE 26

  • Genetic algorithm
    – A search heuristic that mimics the process of natural evolution. It generates solutions to optimization problems using techniques inspired by natural evolution, such as inheritance, mutation, selection, and crossover.
    – AutoDock, GOLD…

  • Molecular dynamics
    – Used to find poses with force fields. Conformation generation usually includes simulated annealing to locate the global optimum in a large search space.
    – AMBER, CHARMM…

  • Shape complementarity
    – A description of the molecules, including solvent-accessible surface area, geometric constraints, H-bonds, and hydrophobic/hydrophilic interactions between all atoms in the complex.
    – DOCK, FRED…

General class of docking algorithm
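
To make the genetic-algorithm class concrete, here is a minimal sketch of selection, crossover, mutation, and inheritance applied to a three-parameter "pose". It is a toy under stated assumptions and not the algorithm used by AutoDock or GOLD: `toy_score`, `genetic_search`, and all parameter values are hypothetical, and the score is a simple quadratic rather than a docking energy.

```python
import random


def toy_score(pose):
    """Hypothetical stand-in for a docking score; lower is better."""
    target = (1.0, -2.0, 0.5)
    return sum((p - t) ** 2 for p, t in zip(pose, target))


def genetic_search(score, generations=60, pop_size=30, mutation_sigma=0.3, seed=0):
    rng = random.Random(seed)
    population = [tuple(rng.uniform(-5, 5) for _ in range(3)) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=score)                 # selection: rank by score
        history.append(score(population[0]))       # best score this generation
        parents = population[: pop_size // 2]      # the fitter half survives
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Crossover: each gene is inherited from one of the two parents.
            child = tuple(rng.choice(pair) for pair in zip(a, b))
            # Mutation: small Gaussian perturbation of every gene.
            child = tuple(g + rng.gauss(0, mutation_sigma) for g in child)
            children.append(child)
        population = parents + children
    best = min(population, key=score)
    return best, history


best_pose, history = genetic_search(toy_score)
```

Because the fitter half of each generation is carried over unchanged, the best score recorded in `history` never gets worse from one generation to the next.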

SLIDE 27

  • Force field
    – Affinities are estimated from intermolecular van der Waals, electrostatic, and other interactions between all atoms of the two molecules in the complex.
    – AMBER…

  • Empirical
    – Count the number of interactions and assign a score based on the number of occurrences, for example H-bond, ionic, and hydrophobic/hydrophilic interactions.
    – LUDI, X-Score…

  • Knowledge-based
    – Observe known protein/ligand structures, and favor interactions and geometries that are seen often.
    – DrugScore, PMF…

General class of scoring function
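
As an illustration of the force-field class, the sketch below sums a Lennard-Jones 12-6 term and a Coulomb term over all ligand-protein atom pairs. It is a simplified example under stated assumptions, not the AMBER force field: the function names are hypothetical, a single `epsilon` and `sigma` are used for every atom pair, and no cutoff or distance-dependent dielectric is applied.

```python
import math

# Coulomb constant in kcal*Angstrom/(mol*e^2), approximate value commonly
# used in molecular mechanics.
COULOMB_CONSTANT = 332.06


def pair_energy(r, epsilon, sigma, q1, q2):
    """Lennard-Jones 12-6 plus Coulomb energy for one atom pair at distance r."""
    lj = 4 * epsilon * ((sigma / r) ** 12 - (sigma / r) ** 6)
    coulomb = COULOMB_CONSTANT * q1 * q2 / r
    return lj + coulomb


def interaction_energy(ligand_atoms, protein_atoms, epsilon=0.15, sigma=3.5):
    """Sum pair energies over all ligand-protein atom pairs.

    Each atom is ((x, y, z), partial_charge). epsilon and sigma are
    illustrative values shared by all pairs (an assumption).
    """
    total = 0.0
    for pos_l, q_l in ligand_atoms:
        for pos_p, q_p in protein_atoms:
            r = math.dist(pos_l, pos_p)
            total += pair_energy(r, epsilon, sigma, q_l, q_p)
    return total


# Hypothetical two-atom example: opposite partial charges 4 Angstrom apart.
charged = interaction_energy([((0.0, 0.0, 0.0), 0.5)], [((4.0, 0.0, 0.0), -0.5)])
neutral = interaction_energy([((0.0, 0.0, 0.0), 0.0)], [((4.0, 0.0, 0.0), 0.0)])
```

In this example the oppositely charged pair scores lower (more favorable) than the neutral pair, since the Coulomb term adds an attractive contribution.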

SLIDE 28

Ref: AutoDock, http://autodock.scripps.edu/
Ref: X-SCORE, http://sw16.im.med.umich.edu/software/xtool/

Tools of docking and scoring

SLIDE 29

Simulated Condition

  • Ligand and Protein
    • PDBbind database v2010 (3429 complexes)
  • Docking
    • software: AutoDock
    • computing time: 30 ~ 50 min per docking
  • ReScoring
    • software: X-Score
    • computing time: 1 ~ 2 min per scoring


SLIDE 30

Free energy in AutoDock, X-Score

SLIDE 31

Free energy R² in ligand molecular weight
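
The slide does not define R². Assuming it denotes the squared Pearson correlation between computed and experimental binding free energies, it could be calculated as sketched below. The function name and the sample values are hypothetical and are not data from the study.

```python
def r_squared(predicted, experimental):
    """Squared Pearson correlation between two equally long sequences."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_e = sum(experimental) / n
    cov = sum((p - mean_p) * (e - mean_e) for p, e in zip(predicted, experimental))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    return cov ** 2 / (var_p * var_e)


# Hypothetical free energies in kcal/mol (illustrative only).
predicted = [-7.0, -8.0, -9.0]
experimental = [-6.5, -8.5, -10.5]
value = r_squared(predicted, experimental)
```

Grouping complexes by ligand molecular weight or by enzyme type, as the slide titles suggest, would amount to calling such a function once per group.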

SLIDE 32

Free energy R² in protein enzyme type

SLIDE 33

RMSD in AutoDock, X-Score
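
For reference, the root-mean-square deviation between a docked pose and a reference pose with N matched atoms is RMSD = sqrt((1/N) Σ |a_i − b_i|²). A minimal sketch follows, assuming both poses are already in the same receptor frame and atoms are listed in the same order; the function name and coordinates are illustrative.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally ordered lists of (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


# Hypothetical three-atom ligand: the docked pose is the reference pose
# shifted by 1 Angstrom along x.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
docked = [(x + 1.0, y, z) for x, y, z in reference]
shift_rmsd = rmsd(reference, docked)
```

No superposition step is applied here, because a rigid shift of the ligand relative to the receptor is exactly what a docking RMSD is meant to penalize.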


SLIDE 34

SLIDE 35

SLIDE 36

Future work

  • Finish implementing the Web-based Virtual Screening Service with the EDGeS infrastructure.
  • Complete the 691 proteins x 691 ligands docking tasks and the data analysis.
  • Classify the other proteins by enzyme code.
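
The size of the planned 691 x 691 cross-docking run can be estimated from numbers given earlier in the deck. This is a rough sketch under stated assumptions: it reuses the 30 ~ 50 min per docking quoted for AutoDock and the 268 cores of the EUAsia VO, and it ignores failures, queueing, and rescoring time.

```python
proteins = 691
ligands = 691
tasks = proteins * ligands                    # one docking per protein-ligand pair

minutes_low, minutes_high = 30, 50            # per-docking time range from slide 29
minutes_per_day = 24 * 60

cpu_days_low = tasks * minutes_low / minutes_per_day
cpu_days_high = tasks * minutes_high / minutes_per_day

cores = 268                                   # EUAsia VO cores (assumed all available)
wall_days_low = cpu_days_low / cores
wall_days_high = cpu_days_high / cores
```

Under these assumptions the run comprises 477,481 dockings, needs roughly 9,900 to 16,600 CPU*days, and would take about 37 to 62 days with every core busy.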


SLIDE 37

Thank you for your attention