
High-Throughput Virtual Molecular Docking: Hadoop Implementation of AutoDock4 on a Private Cloud



  1. High-Throughput Virtual Molecular Docking: Hadoop Implementation of AutoDock4 on a Private Cloud. The Second International Emerging Computational Methods for the Life Sciences Workshop (ECMLS), ACM International Symposium on High Performance Distributed Computing (HPDC), June 8, 2011, San Jose, CA. Sally R. Ellingson, Graduate Research Assistant: Center for Molecular Biophysics, UT/ORNL; Department of Genome Science and Technology, UT; Scalable Computing and Leading Edge Innovative Technologies (IGERT). Dr. Jerome Baudry, PhD Advisor: Center for Molecular Biophysics, UT/ORNL; Department of BCMB, UT.

  2. Ultimate Goal: Reduce the time and cost of discovering novel drugs

  3. Outline. 1. Virtual Molecular Docking: a) Novel Drug Discovery; b) Virtual high-throughput screening (VHTS). 2. Cloud Computing: a) Advantages for VHTS; b) Kandinsky; c) Hadoop (MapReduce). 3. AutoDockCloud: a) Current Implementation; b) Future Implementations.

  4. Virtual Molecular Docking. Given a receptor (protein) and a ligand (small molecule), predict: 1. Bound conformations (a search algorithm explores conformational space). 2. Binding affinity (a force field evaluates the energetics).

  5. Virtual Docking Engine http://autodock.scripps.edu/wiki/AutoDock4

  6. Novel Drug Discovery. [Figure labels: human HDAC4 HA3 crystal structure; ZINC03962325]

  7. Virtual High-Throughput Screening (VHTS)

  8. VHTS with AutoDock4

  9. Potential advantages of cloud computing for VHTS • Affordable access to compute resources (especially for small labs and classrooms). • Easy-to-use interface, accessible through the web, for non-computer experts; the software is maintained by experts. • Resources that scale with the size of the screening.

  10. Kandinsky: Private Cloud Platform at ORNL. Kandinsky, the Systems Biology Knowledgebase computer, is sponsored by the Office of Biological and Environmental Research in the DOE Office of Science. 68 nodes × 16 cores/node = 1088 cores; 20 Gbps InfiniBand interconnect. Designed to support Hadoop applications and to gain an understanding of the MapReduce paradigm. • 57 nodes for MapReduce tasks • 1 tasktracker per node • 10 map and 6 reduce tasks per node (16 tasks per node) • 570 map tasks and 342 reduce tasks can run simultaneously on Kandinsky

  11. Hadoop • Scalable • Economical • Efficient • Reliable http://hadoop.apache.org/common/docs/current/api/overview-summary.html

  12. MapReduce: the programming paradigm used by Hadoop. [Figure source: people.apache.org]
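The paradigm named on this slide can be illustrated with a minimal in-memory sketch. It is not Hadoop code: the helper `map_reduce`, the toy mapper `count_symbols`, and the input strings are all assumptions made up for illustration. It only shows the three phases (map, group by key, reduce) that Hadoop distributes across nodes.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Run mapper over records, group emitted values by key, reduce each group."""
    groups = defaultdict(list)
    for record in records:
        # Map phase: each record yields zero or more (key, value) pairs.
        for key, value in mapper(record):
            # Shuffle phase: collect all values that share a key.
            groups[key].append(value)
    # Reduce phase: one reducer call per key, keys visited in sorted order.
    return {key: reducer(key, groups[key]) for key in sorted(groups)}


def count_symbols(line):
    """Toy mapper (hypothetical): emit (symbol, 1) for each whitespace-separated token."""
    return [(symbol, 1) for symbol in line.split()]


def total(key, values):
    """Toy reducer (hypothetical): sum the counts collected for one key."""
    return sum(values)


counts = map_reduce(["C C N", "O C"], count_symbols, total)
print(counts)
```

In a real Hadoop job the records would come from HDFS and the map and reduce calls would run in separate task slots; only the data flow is the same.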

  13. Current AutoDockCloud Implementation. input = file names needed for each docking; map(input) { copy input to local working directory; run AutoDock4 locally; copy result file to HDFS; } * Pre-docking set-up and post-docking analysis are currently done manually. * No reduce function is currently used.
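The map-only workflow on this slide could be sketched as follows. This is not the authors' code. It assumes each input line lists HDFS paths with the docking parameter file (`.dpf`) first, that the `hadoop` and `autodock4` executables are on the path, and that AutoDock4 is invoked in its usual `autodock4 -p <dpf> -l <dlg>` form. The function names `docking_commands` and `map_line` are hypothetical.

```python
import subprocess
from pathlib import PurePosixPath


def docking_commands(hdfs_inputs, hdfs_out_dir):
    """Build the commands for one map task, in execution order.

    Assumption: hdfs_inputs[0] is the AutoDock4 parameter file (.dpf).
    """
    names = [PurePosixPath(path).name for path in hdfs_inputs]
    dpf_name = names[0]
    dlg_name = PurePosixPath(dpf_name).with_suffix(".dlg").name
    # Step 1: copy every input file from HDFS to the local working directory.
    cmds = [["hadoop", "fs", "-get", path, name]
            for path, name in zip(hdfs_inputs, names)]
    # Step 2: run AutoDock4 locally; the docking log (.dlg) is the result file.
    cmds.append(["autodock4", "-p", dpf_name, "-l", dlg_name])
    # Step 3: copy the result file back to HDFS.
    cmds.append(["hadoop", "fs", "-put", dlg_name, hdfs_out_dir])
    return cmds


def map_line(line, hdfs_out_dir, run=subprocess.check_call, workdir="."):
    """Process one input line; `run` is injectable so the flow can be dry-run."""
    hdfs_inputs = line.split()
    if not hdfs_inputs:
        return []
    cmds = docking_commands(hdfs_inputs, hdfs_out_dir)
    for cmd in cmds:
        run(cmd, cwd=workdir)
    return cmds
```

A dry run such as `map_line("/in/lig1.dpf /in/lig1.pdbqt", "/out", run=lambda cmd, cwd: print(cmd))` prints the command sequence without touching Hadoop or AutoDock4. Because nothing is emitted for a reducer, this matches the slide's note that no reduce function is used.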

  14. Current AutoDockCloud Implementation. ER agonist screening from DUD as benchmark: 450-fold speed-up with 570 available map slots on Kandinsky, the private cloud at ORNL.
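One way to read the benchmark figure is as a parallel efficiency, the speed-up divided by the number of workers. The short sketch below computes it under two assumptions not stated on the slide: that the speed-up is measured against a serial run, and that all 570 map slots were in use. The function name `parallel_efficiency` is hypothetical.

```python
def parallel_efficiency(speedup, workers):
    """Return speed-up per worker; 1.0 would mean perfect linear scaling."""
    if workers <= 0:
        raise ValueError("workers must be positive")
    return speedup / workers


# Values from the slide: speed-up of 450 with 570 available map slots.
efficiency = parallel_efficiency(450, 570)
print(f"{efficiency:.1%}")  # prints 78.9%
```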

  15. Current AutoDockCloud Implementation. [Figure: docking enrichment plot for ER agonist using AutoDockCloud and DUD; axes: percent of known ligands found versus percent of ranked database.]
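An enrichment plot of this kind can be computed from a ranked screening result. The sketch below assumes that lower docking scores rank first (AutoDock reports binding energies, where more negative is better) and uses made-up scores purely for illustration; the function name `enrichment_curve` is hypothetical and the data are not from the presentation.

```python
def enrichment_curve(scores, is_known_ligand):
    """Return (percent of ranked database, percent of known ligands found) points."""
    if len(scores) != len(is_known_ligand):
        raise ValueError("scores and labels must have the same length")
    total_known = sum(bool(flag) for flag in is_known_ligand)
    if total_known == 0:
        raise ValueError("at least one known ligand is required")
    # Rank compounds from best (lowest) to worst (highest) score.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    curve = []
    found = 0
    for rank, index in enumerate(order, start=1):
        found += bool(is_known_ligand[index])
        curve.append((100.0 * rank / len(scores), 100.0 * found / total_known))
    return curve


# Illustrative values only: four compounds, two of them known ligands.
curve = enrichment_curve([-9.0, -5.0, -8.0, -3.0], [True, False, False, True])
print(curve)
```

A curve that rises well above the diagonal early on indicates that known ligands are concentrated near the top of the ranking.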

  16. Future AutoDockCloud Implementation. input = ligand file from chemical compound database; map(input) { create pdbqt (AutoDock input file) from input; run AutoDock4 locally; find best scoring ligand structure; save structure to HDFS; return <score, ligand>; } reduce(<score, ligand>) { sort; return ranked_database; } * Pre-docking and post-docking will be automated and distributed. * Lower total I/O requirements.
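The planned map and reduce steps can be sketched in plain Python. This is an assumption-laden illustration, not the planned implementation: the docking call is replaced by an injectable function, file preparation and HDFS writes are omitted, and the names `map_ligand`, `reduce_ranked`, and `fake_dock` as well as all scores are hypothetical.

```python
def map_ligand(ligand_id, dock):
    """Dock one ligand and keep only its best-scoring (lowest-energy) pose.

    `dock` stands in for pdbqt preparation plus the AutoDock4 run and must
    return a list of (score, pose) pairs.
    """
    poses = dock(ligand_id)
    best_score, best_pose = min(poses, key=lambda pose: pose[0])
    return best_score, ligand_id, best_pose


def reduce_ranked(map_outputs):
    """Sort all map outputs by score to produce the ranked database."""
    return sorted(map_outputs, key=lambda item: item[0])


def fake_dock(ligand_id):
    """Stand-in docking engine with made-up scores, for illustration only."""
    table = {
        "ligA": [(-7.2, "poseA1"), (-8.1, "poseA2")],
        "ligB": [(-9.4, "poseB1")],
        "ligC": [(-5.0, "poseC1"), (-4.2, "poseC2")],
    }
    return table[ligand_id]


ranked = reduce_ranked(map_ligand(name, fake_dock) for name in ["ligA", "ligB", "ligC"])
print(ranked)
```

Returning only one small `<score, ligand>` record per ligand, rather than every output file, is what the slide's note about lower total I/O refers to.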

  17. Future Plans • Incorporate additional docking engines, starting with AutoDock Vina: less I/O; more efficient and accurate algorithm; no charge information needed • Deploy on a commercial cloud (EC2) • Develop a web interface

  18. Outline (recap). 1. Virtual Molecular Docking: a) Novel Drug Discovery; b) Virtual high-throughput screening (VHTS). 2. Cloud Computing: a) Advantages for VHTS; b) Kandinsky; c) Hadoop (MapReduce). 3. AutoDockCloud: a) Current Implementation; b) Future Implementations.

  19. Acknowledgements • Dr. Jerome Baudry (advisor) • Center for Molecular Biophysics, UT/ORNL • Genome Science and Technology, UT • Scalable Computing and Leading Edge Innovative Technologies (IGERT) • Avinash Kewalramani, ORNL • ECMLS and HPDC organizers and participants Questions/Comments
