Deep Learning for Molecular Docking
David Koes
GPU Technology Conference San Jose, CA March 26, 2018
@david_koes
University of Pittsburgh Computational and Systems Biology 2
THE BIOPHARMACEUTICAL RESEARCH AND DEVELOPMENT PROCESS
BASIC RESEARCH → DRUG DISCOVERY → PRE-CLINICAL → CLINICAL TRIALS → FDA REVIEW → POST-APPROVAL RESEARCH & MONITORING
CLINICAL TRIALS: PHASE I, PHASE II, PHASE III (NUMBER OF VOLUNTEERS: TENS, HUNDREDS, THOUSANDS)
POST-APPROVAL: PHASE IV
MILESTONES: IND SUBMITTED, NDA/BLA SUBMITTED, FDA APPROVAL
POTENTIAL NEW MEDICINES → 1 FDA-APPROVED MEDICINE
$2.6 BILLION
Source: Pharmaceutical Research and Manufacturers of America (http://phrma.org)
If you stop failing so often you massively reduce the cost of drug development.
Sir Andrew Witty, CEO, GlaxoSmithKline
sequence → structure → function
Unlike ligand-based approaches, generalizes to new targets
Requires molecular target with known structure and binding site
Virtual Screening, Lead Optimization
Pose Prediction, Binding Discrimination, Affinity Prediction
AutoDock Vina
[Figure: atom pair labeled r1, r2, d]
… function, efficient optimization and multithreading, Journal of Computational Chemistry 31 (2010) 455-461
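As a rough illustration of the kind of pairwise term an empirical scoring function like AutoDock Vina evaluates, the sketch below computes the steric part of the score from a surface distance. It assumes that the figure labels r1 and r2 are the two atomic radii and d is the distance between the atom surfaces; the function names are hypothetical, the weights are the values published for Vina, and the hydrophobic, hydrogen-bond, and rotatable-bond terms are left out.

```python
import math

# Weights of the three steric terms as published for AutoDock Vina.
W_GAUSS1 = -0.0356
W_GAUSS2 = -0.00516
W_REPULSION = 0.840


def surface_distance(center_distance, r1, r2):
    """Distance between atom surfaces: center distance minus both radii."""
    return center_distance - r1 - r2


def steric_energy(d):
    """Steric part of the pairwise score for a surface distance d (Angstroms).

    Simplified sketch: only the two Gaussian attraction terms and the
    quadratic repulsion term are included.
    """
    gauss1 = math.exp(-((d / 0.5) ** 2))
    gauss2 = math.exp(-(((d - 3.0) / 2.0) ** 2))
    repulsion = d * d if d < 0 else 0.0
    return W_GAUSS1 * gauss1 + W_GAUSS2 * gauss2 + W_REPULSION * repulsion


if __name__ == "__main__":
    d = surface_distance(4.0, 1.9, 1.7)
    print(f"surface distance = {d:.2f}, steric energy = {steric_energy(d):.4f}")
```

Atoms whose surfaces just touch (d near 0) receive a small favorable score, while overlapping atoms (d below 0) are penalized quadratically.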
Accurate pose prediction, binding discrimination, and affinity prediction without sacrificing performance?
Key Idea: Leverage “big data”
https://devblogs.nvidia.com
Convolutional Neural Networks
Pose Prediction, Binding Discrimination, Affinity Prediction
(R,G,B) pixel → (Carbon, Nitrogen, Oxygen, …) voxel
The only parameters for this representation are the choice of grid resolution, atom density, and atom types.
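The pixel-to-voxel analogy can be made concrete with a small sketch that turns a list of atoms into one density grid per atom type. Everything here is an assumption for illustration: the function name `voxelize`, the tuple layout of an atom, the default grid size, and the truncated Gaussian used as the atom density are not taken from the slides.

```python
import math


def voxelize(atoms, atom_types, center, dimension=4.0, resolution=1.0, radius=1.5):
    """Build one 3D density grid (channel) per atom type.

    atoms      -- list of (atom_type, x, y, z) tuples, coordinates in Angstroms
    atom_types -- channel names, e.g. ["C", "N", "O"]
    center     -- (x, y, z) of the grid center
    dimension  -- edge length of the cubic grid in Angstroms
    resolution -- spacing between grid points in Angstroms
    radius     -- assumed atomic radius controlling the density width
    """
    n = int(round(dimension / resolution)) + 1
    origin = [c - dimension / 2.0 for c in center]
    grid = {
        t: [[[0.0] * n for _ in range(n)] for _ in range(n)] for t in atom_types
    }
    cutoff = 1.5 * radius  # assumed truncation distance for the density
    for atom_type, x, y, z in atoms:
        if atom_type not in grid:
            continue  # atom types without a channel are ignored
        channel = grid[atom_type]
        for i in range(n):
            gx = origin[0] + i * resolution
            for j in range(n):
                gy = origin[1] + j * resolution
                for k in range(n):
                    gz = origin[2] + k * resolution
                    dist2 = (gx - x) ** 2 + (gy - y) ** 2 + (gz - z) ** 2
                    if dist2 <= cutoff * cutoff:
                        # Gaussian density centered on the atom
                        channel[i][j][k] += math.exp(-2.0 * dist2 / radius**2)
    return grid


if __name__ == "__main__":
    atoms = [("C", 0.0, 0.0, 0.0), ("O", 1.0, 0.0, 0.0)]
    grid = voxelize(atoms, ["C", "N", "O"], center=(0.0, 0.0, 0.0))
    print(grid["C"][2][2][2], grid["O"][3][2][2])
```

Each atom type plays the role of a color channel, so a convolutional network can consume the grid much as it would an image with more than three channels.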
Pose Prediction: 4056 protein-ligand complexes
Affinity Prediction
Input: 48x48x48x35
2x2 Max Pooling → 24x24x24x35
3x3x3 Convolution + Rectified Linear Unit → 24x24x24x32
2x2 Max Pooling → 12x12x12x32
3x3x3 Convolution + Rectified Linear Unit → 12x12x12x64
2x2 Max Pooling → 6x6x6x64
3x3x3 Convolution + Rectified Linear Unit → 6x6x6x128
Fully Connected → Pose Score (Softmax+Logistic Loss)
Fully Connected → Affinity (Pseudo-Huber Loss)
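The layer sizes listed for this network can be checked with a short shape-tracing sketch. It assumes that each pooling step halves every spatial dimension and that each 3x3x3 convolution is padded so the spatial size is unchanged, which is what the listed sizes imply. The helper names are hypothetical, and the pseudo-Huber function uses the standard textbook definition with an assumed `delta` of 1.

```python
import math


def max_pool(shape, window=2):
    """Halve each spatial dimension; channels are unchanged."""
    x, y, z, channels = shape
    return (x // window, y // window, z // window, channels)


def conv3x3x3(shape, out_channels):
    """Padded 3x3x3 convolution: spatial size kept, channel count replaced."""
    x, y, z, _ = shape
    return (x, y, z, out_channels)


def trace_shapes(input_shape=(48, 48, 48, 35), filters=(32, 64, 128)):
    """Return the tensor shape after every pooling and convolution step."""
    shapes = [input_shape]
    shape = input_shape
    for out_channels in filters:
        shape = max_pool(shape)
        shapes.append(shape)
        shape = conv3x3x3(shape, out_channels)
        shapes.append(shape)
    return shapes


def pseudo_huber(error, delta=1.0):
    """Pseudo-Huber loss: quadratic near zero, linear for large errors."""
    return delta**2 * (math.sqrt(1.0 + (error / delta) ** 2) - 1.0)


if __name__ == "__main__":
    for shape in trace_shapes():
        print("x".join(str(s) for s in shape))
    x, y, z, c = trace_shapes()[-1]
    print("features fed to each fully connected layer:", x * y * z * c)
```

The two fully connected layers share the same 6x6x6x128 feature volume, so a single forward pass can yield both a pose score and an affinity estimate.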
Trained on PDBbind refined; tested on CSAR
Clustered Cross-Validation
RMSE = 1.69, R = 0.57, AUC = 0.90
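For readers unfamiliar with the three reported metrics, the sketch below shows how each is conventionally computed: RMSE and Pearson R on predicted versus experimental affinities, and AUC on pose scores against good-pose/bad-pose labels. The function names and the toy numbers are illustrative assumptions and are unrelated to the reported results.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equally long sequences."""
    total = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(total / len(actual))


def pearson_r(xs, ys):
    """Pearson correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive example
    outscores a randomly chosen negative one (ties count as half).
    """
    positives = [s for s, label in zip(scores, labels) if label]
    negatives = [s for s, label in zip(scores, labels) if not label]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
    print(pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
    print(auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```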
masking, gradients, layer-wise relevance
1UGX Score: 0.62
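Of the three visualization approaches named here, masking is the simplest to sketch: remove one atom at a time, rescore the complex, and attribute the change in score to the removed atom. In the sketch below, `score_fn` is a hypothetical stand-in for the trained network, and the toy scoring function and atom weights are invented for illustration.

```python
def masking_contributions(atoms, score_fn):
    """Per-atom contribution estimated by masking.

    For each atom, the score is recomputed with that atom removed; the
    contribution is the full score minus the masked score, so atoms that
    help the score get positive values.
    """
    full_score = score_fn(atoms)
    contributions = []
    for i in range(len(atoms)):
        masked = atoms[:i] + atoms[i + 1:]
        contributions.append(full_score - score_fn(masked))
    return contributions


if __name__ == "__main__":
    # Toy stand-in for a CNN: each atom carries a fixed additive weight.
    def toy_score(atoms):
        return sum(weight for _, weight in atoms)

    ligand = [("C", 0.1), ("O", 0.5), ("N", -0.2)]
    for (name, _), value in zip(ligand, masking_contributions(ligand, toy_score)):
        print(name, round(value, 3))
```

Masking needs one extra forward pass per atom, which is one reason to also consider gradient-based and layer-wise relevance approaches that work from a single pass.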
https://research.googleblog.com/2015/06/inceptionism-going-deeper-into-neural.html
Deep Dreams
2Q89: More Oxygen Here, Less Oxygen Here
Sampling → Refinement → Rescoring
Sampling: N (50) independent Monte Carlo chains (MCMC)
Scored with grid-accelerated Vina
Best identified pose retained
Refinement: best poses
Tools: vina/smina/gnina
Rescoring: Vina, CNN
CNN outputs: pose, affinity
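The sampling stage described above can be sketched as N independent Metropolis Monte Carlo chains, each keeping its best pose, followed by a rescoring pass that ranks the retained poses. This is a minimal sketch under stated assumptions: a pose is reduced to a list of three numbers, `score_fn` and `rescoring_fn` are hypothetical stand-ins for the Vina and CNN scoring functions with lower values meaning better poses, and the local optimization that real docking programs run after each random move is omitted.

```python
import math
import random


def monte_carlo_chain(score_fn, start, steps, step_size, temperature, rng):
    """Run one Metropolis chain and return (best_pose, best_score)."""
    current = list(start)
    current_score = score_fn(current)
    best, best_score = list(current), current_score
    for _ in range(steps):
        # Random perturbation of the current pose
        candidate = [c + rng.uniform(-step_size, step_size) for c in current]
        candidate_score = score_fn(candidate)
        delta = candidate_score - current_score
        # Metropolis criterion: always accept improvements, sometimes accept worse
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_score = candidate, candidate_score
            if current_score < best_score:
                best, best_score = list(current), current_score
    return best, best_score


def dock(score_fn, rescoring_fn, n_chains=50, steps=200, seed=0):
    """Sample with independent chains, then rank retained poses by rescoring."""
    rng = random.Random(seed)
    retained = []
    for _ in range(n_chains):
        start = [rng.uniform(-5.0, 5.0) for _ in range(3)]
        best, _ = monte_carlo_chain(score_fn, start, steps, 0.5, 1.0, rng)
        retained.append(best)
    return sorted(retained, key=rescoring_fn)


if __name__ == "__main__":
    # Toy scoring function with a single minimum at the origin.
    def toy_score(pose):
        return sum(v * v for v in pose)

    ranked = dock(toy_score, toy_score, n_chains=5)
    print("top pose:", [round(v, 2) for v in ranked[0]])
```

Because the chains are independent, they can run in parallel, and the expensive rescoring function only needs to be evaluated on the handful of poses that survive sampling.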
[Figure: Average Time (ms) for Molecular Grid, CNN Forward, CNN Backward, and Atom Gradients on Xeon 4110 2.1GHz, i9-7920X 2.9GHz, GTX 1070 Ti, and V100]
Spearman Correlation
Columns: cnn_docked_affinity, cnn_rescore_affinity, cnn_docked_scoring, cnn_rescore_scoring, vina
cat 0.0701 0.154
0.178 0.179 p38a
vegfr2 0.366 0.484 0.434 0.448 0.414
jak2 0.428 0.338 0.39 0.27 0.106
jak2_sub3 0.68 0.369
0.159
tie2 0.648 0.835 0.136
0.561
abl1 0.634 0.745 0.005 0.182 0.713
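Spearman correlation measures how well predicted scores rank compounds in the same order as their measured affinities, regardless of scale. The sketch below computes it as the Pearson correlation of the ranks, with tied values sharing their average rank. The function names and example values are illustrative assumptions and are not drawn from the table.

```python
import math


def ranks(values):
    """1-based ranks; tied values receive the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(rx)
    mean_x = sum(rx) / n
    mean_y = sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    predicted = [6.1, 7.3, 5.2, 8.0]
    measured = [5.9, 7.9, 5.0, 7.5]
    print(round(spearman(predicted, measured), 3))
```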
http://people.eecs.berkeley.edu/~pathak/context_encoder/
Matt Ragoza, Josh Hochuli, Jocelyn Sunseri
R01GM108340
Group Members: Jocelyn Sunseri, Jonathan King, Paul Francoeur, Matt Ragoza, Josh Hochuli, Lily Turner, Pulkit Mittal, Alec Helbling, Gibran Biswas, Sharanya Bandla, Faiha Khan
Department of Computational and Systems Biology
Lily Turner
github.com/gnina http://bits.csb.pitt.edu
@david_koes