Deep Learning for Molecular Docking

David Koes (@david_koes), University of Pittsburgh, Computational and Systems Biology. GPU Technology Conference, San Jose, CA, March 26, 2018.


  1. Deep Learning for Molecular Docking. David Koes (@david_koes). GPU Technology Conference, San Jose, CA, March 26, 2018.

  2. The Biopharmaceutical Research and Development Process. Stages: Basic Research, Drug Discovery, Pre-Clinical, Clinical Trials (Phase I, Phase II, Phase III), FDA Review, Post-Approval Research & Monitoring (Phase IV). Milestones: IND submitted, NDA/BLA submitted, FDA approval. Number of volunteers: tens, hundreds, thousands. Potential new medicines narrow to 1 FDA-approved medicine; $2.6 billion. Quote: "If you stop failing so often you massively reduce the cost of drug development." (Sir Andrew Witty, CEO, GlaxoSmithKline). Source: Pharmaceutical Research and Manufacturers of America (http://phrma.org).

  5. (1) Does the compound do what you want it to? (2) Does the compound not do what you don’t want it to? (3) Is what you want it to do the right thing?

  6. Protein Structures: sequence → structure → function

  8. Structure Based Drug Design. Unlike ligand-based approaches, generalizes to new targets. Requires a molecular target with known structure and binding site.

  11. Structure Based Drug Design: Virtual Screening, Lead Optimization; Pose Prediction, Binding Discrimination, Affinity Prediction.

  13. Protein-Ligand Scoring: AutoDock Vina (figure labels: d, r1, r2). O. Trott, A. J. Olson, AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization and multithreading, Journal of Computational Chemistry 31 (2010) 455-461.

  14. Can we do better? Accurate pose prediction, binding discrimination, and affinity prediction without sacrificing performance? Key Idea: Leverage “big data”: 231,655,275 bioactivities in PubChem; 125,526 structures in the PDB; 16,179 annotated complexes in PDBbind.

  16. Deep Learning: Convolutional Neural Networks (https://devblogs.nvidia.com)

  18. CNNs for Protein-Ligand Scoring: CNN for Pose Prediction, Binding Discrimination, Affinity Prediction

  19. Protein-Ligand Representation: (R,G,B) pixel → (Carbon, Nitrogen, Oxygen,…) voxel. The only parameters for this representation are the choice of grid resolution, atom density, and atom types.
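The pixel-to-voxel analogy can be made concrete with a small sketch. Everything specific here is an assumption for illustration and is not taken from the slides: the 0.5 Å resolution, the 24 Å box, the Gaussian density with a single radius, the three-type list, and the `voxelize` name are all hypothetical.

```python
import numpy as np

# Hypothetical settings (not from the slides): 0.5 A voxels, a 24 A box
# (48 voxels per side), and one Gaussian radius shared by all atom types.
RESOLUTION = 0.5
BOX_SIZE = 24.0
ATOM_TYPES = ["C", "N", "O"]   # one grid channel per atom type
ATOM_RADIUS = 1.5              # assumed radius in Angstroms


def voxelize(coords, types, center):
    """Return a (channels, n, n, n) grid of summed atom densities."""
    n = int(round(BOX_SIZE / RESOLUTION))
    # Voxel-center coordinates along one axis, relative to the box center.
    axis = (np.arange(n) + 0.5) * RESOLUTION - BOX_SIZE / 2.0
    gx, gy, gz = np.meshgrid(axis, axis, axis, indexing="ij")
    grid = np.zeros((len(ATOM_TYPES), n, n, n), dtype=np.float32)
    center = np.asarray(center, dtype=float)
    for xyz, atom_type in zip(np.asarray(coords, dtype=float), types):
        if atom_type not in ATOM_TYPES:
            continue  # atom types without a channel are ignored
        dx, dy, dz = xyz - center
        d2 = (gx - dx) ** 2 + (gy - dy) ** 2 + (gz - dz) ** 2
        # Gaussian density that falls to exp(-2) at one atomic radius.
        grid[ATOM_TYPES.index(atom_type)] += np.exp(-2.0 * d2 / ATOM_RADIUS ** 2)
    return grid


grid = voxelize([[0.0, 0.0, 0.0], [3.0, 0.0, 0.0]], ["C", "O"], [0.0, 0.0, 0.0])
print(grid.shape)
```

The three knobs named on the slide map onto `RESOLUTION`, the density formula, and `ATOM_TYPES`. A real representation would use many more channels than the three shown.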

  21. Training Data: 4056 protein-ligand complexes; diverse targets; wide range of affinities; generate poses with AutoDock Vina; include minimized crystal pose. Pose Prediction: 8,688 low RMSD poses. Affinity Prediction: assign known affinity; regression problem.

  22. Data Augmentation: ≠

  24. Model: 48x48x48x35 → 2x2 Max Pooling → 24x24x24x35 → 3x3x3 Convolution, Rectified Linear Unit → 24x24x24x32 → 2x2 Max Pooling → 12x12x12x32 → 3x3x3 Convolution, Rectified Linear Unit → 12x12x12x64 → 2x2 Max Pooling → 6x6x6x64 → 3x3x3 Convolution, Rectified Linear Unit → 6x6x6x128 → two outputs: Fully Connected, Softmax+Logistic Loss (Pose Score); Fully Connected, Pseudo-Huber Loss (Affinity).
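The layer sizes on the model slide can be checked with a short shape trace, and the affinity loss can be written out. This is a sketch under assumptions: the convolutions are taken to be padded so that they preserve spatial size (which the listed shapes imply), the pooling is taken to halve each spatial dimension, and `delta=1.0` is a hypothetical choice. The function names are illustrative only.

```python
import math


def trace_shapes(side=48, channels=35, conv_channels=(32, 64, 128)):
    """List the tensor shape after each pooling and convolution layer."""
    shapes = [(side, side, side, channels)]
    for out_channels in conv_channels:
        side //= 2                    # max pooling halves each spatial dimension
        shapes.append((side, side, side, channels))
        channels = out_channels       # padded 3x3x3 convolution keeps the spatial size
        shapes.append((side, side, side, channels))
    return shapes


def pseudo_huber(residual, delta=1.0):
    """Pseudo-Huber loss: quadratic near zero, close to linear for large errors."""
    return delta ** 2 * (math.sqrt(1.0 + (residual / delta) ** 2) - 1.0)


for shape in trace_shapes():
    print("x".join(str(d) for d in shape))
print(pseudo_huber(0.5), pseudo_huber(5.0))
```

The final 6x6x6x128 volume flattens to 27,648 features feeding the fully connected layers. The Pseudo-Huber loss penalizes large affinity errors less harshly than squared error would.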

  25. Results. Clustered Cross-Validation: RMSE = 1.69, R = 0.57, AUC = 0.90. Trained on PDBbind refined; tested on CSAR.
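For reference, the three reported metrics can be computed as below. This is a minimal sketch with made-up inputs. It assumes RMSE and Pearson R are used for the affinity regression and AUC for pose classification, which the slide does not spell out.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])


def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((wins + 0.5 * ties) / (len(pos) * len(neg)))


# Hypothetical numbers, purely for illustration.
print(rmse([6.0, 7.5, 4.2], [5.5, 7.0, 5.0]))
print(pearson_r([6.0, 7.5, 4.2], [5.5, 7.0, 5.0]))
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```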

  28. Visualization: masking, gradients, layer-wise relevance. Example: 1UGX, Score: 0.62.
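Of the three visualization approaches listed, masking is the simplest to sketch. The version below assumes masking means deleting one atom at a time and rescoring, which the slide does not state. `toy_score` is a hypothetical stand-in for a trained CNN, not the talk's model.

```python
def masking_contributions(atoms, score):
    """Score change from deleting each atom: full score minus masked score."""
    full = score(atoms)
    return [full - score(atoms[:i] + atoms[i + 1:]) for i in range(len(atoms))]


def toy_score(atoms):
    """Hypothetical scorer: +0.3 per oxygen, +0.1 per carbon, 0 otherwise."""
    weights = {"O": 0.3, "C": 0.1}
    return sum(weights.get(atom, 0.0) for atom in atoms)


# A positive value means the atom raised the score; zero means it had no effect.
print(masking_contributions(["C", "O", "N"], toy_score))
```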

  29. Visualizing Empty Space

  30. Beyond Scoring: Deep Dreams (https://research.googleblog.com/2015/06/inceptionism-going-deeper-into-neural.html)

  34. Beyond Scoring: 2Q89 (figure labels: Less Oxygen Here, More Oxygen Here)

  36. Optimizing Low RMSD Poses (figure legend: better, worse)

  37. Iterative Refinement (figure legend: better, worse)

  39. Docking (vina/smina/gnina). Stages: Sampling, Refinement, Rescoring. Sampling: N (50) independent Monte Carlo chains (MCMC, MCMC, …), scored with grid-accelerated Vina; best identified pose retained. The best poses pass to refinement and rescoring (diagram labels: Vina, CNN), giving pose and affinity.
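The sampling stage (independent Monte Carlo chains, each keeping its best pose) can be sketched as follows. This is a simplified illustration, not the vina/smina/gnina implementation. The pose here is just a 3D translation, `toy_score` is a hypothetical stand-in for the grid-accelerated Vina score, and the step size, temperature, and step count are assumed values. The real search also applies local optimization after each move, which is left out here.

```python
import math
import random


def toy_score(pose):
    """Stand-in docking score (lower is better); NOT the Vina scoring function."""
    x, y, z = pose
    return (x - 1.0) ** 2 + (y + 2.0) ** 2 + (z - 0.5) ** 2


def run_chain(score, rng, steps=200, step_size=0.5, temperature=1.0):
    """One Monte Carlo chain; returns the best (score, pose) it visited."""
    pose = [rng.uniform(-5.0, 5.0) for _ in range(3)]
    current = score(pose)
    best_score, best_pose = current, list(pose)
    for _ in range(steps):
        candidate = [c + rng.gauss(0.0, step_size) for c in pose]
        cand_score = score(candidate)
        # Metropolis criterion: always accept improvements, sometimes accept worse poses.
        accept = cand_score <= current or rng.random() < math.exp(
            (current - cand_score) / temperature
        )
        if accept:
            pose, current = candidate, cand_score
            if current < best_score:
                best_score, best_pose = current, list(pose)
    return best_score, best_pose


def dock(score, n_chains=50, seed=0):
    """Run independent chains and return their best poses, best first."""
    rng = random.Random(seed)
    results = [run_chain(score, rng) for _ in range(n_chains)]
    results.sort(key=lambda item: item[0])
    return results


top_score, top_pose = dock(toy_score)[0]
print(top_score, top_pose)
```

The sorted list returned by `dock` plays the role of the "best poses" that would then be refined and rescored.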

  40. Full CNN Docking

  41. GPU Performance (bar chart: Average Time (ms), axis 0 to 500; components: Molecular Grid, CNN Forward, CNN Backward, Atom Gradients; hardware: Xeon 4110 2.1GHz, i9-7920X 2.9GHz, GTX 1070 Ti, V100)

  42. Prospective Evaluation: D3R

  43. Grand Challenge 3: Spearman Correlation

      | target | cnn_docked_affinity | cnn_rescore_affinity | cnn_docked_scoring | cnn_rescore_scoring | vina |
      |---|---|---|---|---|---|
      | cat | 0.0701 | 0.154 | -0.0351 | 0.178 | 0.179 |
      | p38a | -0.0784 | -0.116 | -0.329 | -0.305 | -0.0631 |
      | vegfr2 | 0.366 | 0.484 | 0.434 | 0.448 | 0.414 |
      | jak2 | 0.428 | 0.338 | 0.39 | 0.27 | 0.106 |
      | jak2_sub3 | 0.68 | 0.369 | -0.372 | 0.159 | -0.633 |
      | tie2 | 0.648 | 0.835 | 0.136 | -0.078 | 0.561 |
      | abl1 | 0.634 | 0.745 | 0.005 | 0.182 | 0.713 |
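Spearman correlation, the metric reported for Grand Challenge 3, is the Pearson correlation of ranks. A minimal sketch follows. The helper names and the example values are hypothetical, and tied values are given the average of their ranks.

```python
import numpy as np


def rank(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    for v in np.unique(values):
        tied = values == v
        ranks[tied] = ranks[tied].mean()
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return float(np.corrcoef(rank(x), rank(y))[0, 1])


# Hypothetical predicted scores and measured affinities for five compounds.
predicted = [7.1, 5.2, 6.3, 4.8, 6.9]
measured = [8.0, 5.0, 5.5, 6.1, 7.2]
print(spearman(predicted, measured))
```

Because only the ordering matters, a method can score well here without predicting accurate absolute affinities. That suits a ranking task such as compound prioritization.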

  44. Grand Challenge 3: The Good
