GPU-Accelerated Convolutional Neural Networks For Protein-Ligand Scoring
David Koes
GPU Technology Conference May 8, 2017
@david_koes
University of Pittsburgh Computational and Systems Biology
THE BIOPHARMACEUTICAL RESEARCH AND DEVELOPMENT PROCESS
[Figure: drug development pipeline]
Stages: BASIC RESEARCH, DRUG DISCOVERY, PRE-CLINICAL, CLINICAL TRIALS, FDA REVIEW, POST-APPROVAL RESEARCH & MONITORING
Clinical phases: PHASE I, PHASE II, PHASE III, PHASE IV
Milestones: IND SUBMITTED, NDA/BLA SUBMITTED, FDA APPROVAL
NUMBER OF VOLUNTEERS: TENS, HUNDREDS, THOUSANDS
POTENTIAL NEW MEDICINES → 1 FDA-APPROVED MEDICINE
$2.6 BILLION
Source: Pharmaceutical Research and Manufacturers of America (http://phrma.org)
If you stop failing so often you massively reduce the cost of drug development.
— Sir Andrew Witty CEO, GlaxoSmithKline
sequence → structure → function
Unlike ligand-based approaches, generalizes to new targets
Requires molecular target with known structure and binding site
Virtual Screening
Lead Optimization
Pose Prediction
Binding Discrimination
Affinity Prediction
[Figure labels: r1, r2, d]
function, efficient optimization and multithreading, Journal of Computational Chemistry 31 (2010) 455-461
AutoDock Vina
Accurate pose prediction, binding discrimination, and affinity prediction without sacrificing performance?
Key Idea: Leverage “big data”
https://devblogs.nvidia.com
Convolutional Neural Networks
[Figure: Convolution → Feature Maps → Convolution → Feature Maps → Fully Connected (Traditional NN) → Dog: 0.99, Cat: 0.02]
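The convolution and feature-map stages named on this slide can be illustrated with a minimal sketch in plain Python. It is not taken from the talk: the function names, the 5x5 input, and the edge-detecting kernel are all illustrative assumptions. As in most deep learning frameworks, the "convolution" here is a sliding dot product without kernel flipping.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(fmap):
    """Zero out negative activations."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool2x2(fmap):
    """Keep the largest value in each non-overlapping 2x2 window."""
    return [
        [max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
         for j in range(0, len(fmap[0]) - 1, 2)]
        for i in range(0, len(fmap) - 1, 2)
    ]


# Hypothetical 5x5 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hypothetical kernel that responds to left-to-right intensity increases.
kernel = [[-1, 1],
          [-1, 1]]

feature_map = relu(conv2d_valid(image, kernel))
pooled = max_pool2x2(feature_map)
print(feature_map)
print(pooled)
```

The feature map is nonzero only where the kernel straddles the edge, and pooling halves the resolution while keeping that response. A trained network learns its kernels instead of having them written by hand.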
Pose Prediction
Binding Discrimination
Affinity Prediction
[Figure: a grid of image pixels labeled R and G beside a grid of atoms labeled C and O]
(R,G,B) pixel → (Carbon, Nitrogen, Oxygen,…) voxel
The only parameters for this representation are the choice of grid resolution, atom density, and atom types.
Gaussian
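A minimal sketch of the pixel-to-voxel idea with a Gaussian atom density, assuming one grid channel per atom type. The slides do not give the exact density formula, so the `exp(-2 d^2 / r^2)` form, the 1.5x radius cutoff, the radii, and every name below are assumptions made for illustration.

```python
import math


def gaussian_density(dist, radius, cutoff_multiple=1.5):
    """Assumed density: Gaussian in distance, zero beyond cutoff_multiple * radius."""
    if dist >= cutoff_multiple * radius:
        return 0.0
    return math.exp(-2.0 * dist * dist / (radius * radius))


def voxelize(atoms, channel_of, center, dimension, resolution, cutoff_multiple=1.5):
    """Return grid[channel][i][j][k] holding summed atom densities.

    atoms: iterable of (type_name, x, y, z, radius)
    channel_of: maps an atom type name to its channel index
    """
    n = int(round(dimension / resolution)) + 1      # grid points per side
    origin = [c - dimension / 2.0 for c in center]  # corner of the grid
    num_channels = max(channel_of.values()) + 1
    grid = [[[[0.0] * n for _ in range(n)] for _ in range(n)]
            for _ in range(num_channels)]
    for atom_type, x, y, z, radius in atoms:
        if atom_type not in channel_of:
            continue  # untyped atoms are ignored in this sketch
        channel = grid[channel_of[atom_type]]
        cutoff = cutoff_multiple * radius
        # Only visit grid indices that can lie within the cutoff of this atom.
        spans = []
        for coord, start in zip((x, y, z), origin):
            lo = max(0, math.ceil((coord - cutoff - start) / resolution))
            hi = min(n - 1, math.floor((coord + cutoff - start) / resolution))
            spans.append(range(lo, hi + 1))
        for i in spans[0]:
            for j in spans[1]:
                for k in spans[2]:
                    dx = origin[0] + i * resolution - x
                    dy = origin[1] + j * resolution - y
                    dz = origin[2] + k * resolution - z
                    dist = math.sqrt(dx * dx + dy * dy + dz * dz)
                    channel[i][j][k] += gaussian_density(dist, radius, cutoff_multiple)
    return grid


# Two hypothetical atoms on a small 5x5x5 grid (1 Angstrom spacing).
atoms = [("Carbon", 0.0, 0.0, 0.0, 1.9), ("Oxygen", 1.0, 0.0, 0.0, 1.7)]
channel_of = {"Carbon": 0, "Oxygen": 1}
grid = voxelize(atoms, channel_of, center=(0.0, 0.0, 0.0), dimension=4.0, resolution=1.0)
print(grid[0][2][2][2], grid[1][3][2][2])
```

Each atom contributes most at its own center and decays with distance, so the grid stays smooth as a pose moves. That is one reason to prefer a density over a binary occupancy flag.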
Ligand
AliphaticCarbonXSHydrophobe
AliphaticCarbonXSNonHydrophobe
AromaticCarbonXSHydrophobe
AromaticCarbonXSNonHydrophobe
Bromine
Chlorine
Fluorine
Iodine
Nitrogen
NitrogenXSAcceptor
NitrogenXSDonor
NitrogenXSDonorAcceptor
Oxygen
OxygenXSAcceptor
OxygenXSDonorAcceptor
Phosphorus
Sulfur
SulfurAcceptor

Receptor
AliphaticCarbonXSHydrophobe
AliphaticCarbonXSNonHydrophobe
AromaticCarbonXSHydrophobe
AromaticCarbonXSNonHydrophobe
Calcium
Iron
Magnesium
Nitrogen
NitrogenXSAcceptor
NitrogenXSDonor
NitrogenXSDonorAcceptor
OxygenXSAcceptor
OxygenXSDonorAcceptor
Phosphorus
Sulfur
Zinc
Pose Prediction 337 protein-ligand complexes
12,484 protein-ligand complexes
CSAR: >90% similar targets kept in same fold
PDBbind: >80% similar targets kept in same fold
AUC
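Keeping similar targets in the same fold can be sketched as clustering followed by fold assignment. The slides state only the thresholds (>90% for CSAR, >80% for PDBbind), so the single-linkage clustering, the greedy balancing, the similarity values, and all names below are illustrative assumptions, not the procedure used in the talk.

```python
def cluster_targets(targets, similarity, threshold):
    """Group targets so any pair with similarity > threshold shares a cluster."""
    parent = {t: t for t in targets}

    def find(t):
        # Follow parent links to the cluster root, compressing the path.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for i, a in enumerate(targets):
        for b in targets[i + 1:]:
            if similarity(a, b) > threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for t in targets:
        clusters.setdefault(find(t), []).append(t)
    return list(clusters.values())


def assign_folds(clusters, num_folds):
    """Place whole clusters into folds, largest first, always into the smallest fold."""
    folds = [[] for _ in range(num_folds)]
    for cluster in sorted(clusters, key=len, reverse=True):
        min(folds, key=len).extend(cluster)
    return folds


# Hypothetical pairwise similarities; unlisted pairs default to 0.1.
pair_similarity = {
    frozenset(("A", "B")): 0.95,
    frozenset(("B", "C")): 0.92,
    frozenset(("D", "E")): 0.50,
}


def similarity(a, b):
    return pair_similarity.get(frozenset((a, b)), 0.1)


targets = ["A", "B", "C", "D", "E"]
clusters = cluster_targets(targets, similarity, threshold=0.9)
folds = assign_folds(clusters, num_folds=3)
print(folds)
```

A and C land in the same fold even though they are not directly similar, because both are similar to B. This prevents a model from being tested on a near-copy of a training target.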
Custom MolGridDataLayer
Parallelize over atoms to obtain a mask of atoms that overlap each grid region
Use exclusive scan to obtain a list of atom indices from the mask
Parallelize over grid points, using reduced atom list to avoid O(Natoms) check
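The three gridding steps listed on this slide can be sketched serially in Python. This is not the MolGridDataLayer code: every loop marked "parallel" stands in for one GPU thread per iteration, and the sphere-box overlap test, the density formula, and all names are assumptions for illustration.

```python
import math


def overlaps_region(atom, region_min, region_max):
    """True if the atom's sphere touches the axis-aligned grid region."""
    x, y, z, radius = atom
    dist2 = 0.0
    for coord, lo, hi in zip((x, y, z), region_min, region_max):
        nearest = min(max(coord, lo), hi)  # closest point of the box on this axis
        dist2 += (coord - nearest) ** 2
    return dist2 <= radius * radius


def exclusive_scan(values):
    """out[i] = sum of values[:i]; on a GPU this is a parallel prefix sum."""
    out, running = [], 0
    for v in values:
        out.append(running)
        running += v
    return out


def compact(mask):
    """Turn a 0/1 mask into the list of indices whose flag is set."""
    offsets = exclusive_scan(mask)
    count = (offsets[-1] + mask[-1]) if mask else 0
    indices = [0] * count
    for atom, flag in enumerate(mask):  # parallel: each atom writes to its own slot
        if flag:
            indices[offsets[atom]] = atom
    return indices


def density_at(point, atoms, atom_indices):
    """Sum an assumed Gaussian density over the reduced atom list only."""
    total = 0.0
    for idx in atom_indices:
        x, y, z, radius = atoms[idx]
        d2 = (point[0] - x) ** 2 + (point[1] - y) ** 2 + (point[2] - z) ** 2
        total += math.exp(-2.0 * d2 / (radius * radius))
    return total


# Hypothetical atoms as (x, y, z, radius) and one grid region.
atoms = [(1.0, 1.0, 1.0, 1.5), (10.0, 10.0, 10.0, 1.5),
         (2.5, 1.0, 1.0, 1.0), (-5.0, 0.0, 0.0, 1.0)]
region_min, region_max = (0.0, 0.0, 0.0), (2.0, 2.0, 2.0)

# Step 1 (parallel over atoms): mask of atoms overlapping the region.
mask = [1 if overlaps_region(a, region_min, region_max) else 0 for a in atoms]
# Step 2: exclusive scan gives each surviving atom its slot in the compact list.
offsets = exclusive_scan(mask)
nearby = compact(mask)
# Step 3 (parallel over grid points): each point checks only the nearby atoms.
value = density_at((1.0, 1.0, 1.0), atoms, nearby)
print(mask, offsets, nearby, value)
```

The scan is what makes compaction parallel-friendly: each atom learns its output position without coordinating with other threads, and every grid point in the region then loops over the short list instead of all atoms.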
Atom Types
Atom Density Type
Radius Multiple
Resolution
Pooling
Depth
Width
Fully Connected Layers
max
[Figure: network architecture graph]
data: 48^3
unit1_pool → unit1_conv1: 32 x 24^3
unit2_pool → unit2_conv1: 64 x 12^3
unit3_pool → unit3_conv1: 128 x 6^3
output: 2
label, loss
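The layer sizes in this architecture follow from simple shape arithmetic, sketched below. The sketch assumes each pooling layer halves the grid side (2x2x2 windows, stride 2), each convolution is padded to preserve the grid side, and pooling precedes convolution within a unit. The input channel count of 34 is also an assumption, taken as the 18 ligand plus 16 receptor types listed on the atom-type slide. None of these details are confirmed by the slide beyond the sizes it shows.

```python
def max_pool(shape, size=2):
    """Pooling keeps the channel count and divides the grid side by `size`."""
    channels, side = shape
    return (channels, side // size)


def conv_same(shape, out_channels):
    """A padded convolution changes the channel count and keeps the grid side."""
    _, side = shape
    return (out_channels, side)


def trace(in_channels=34, side=48, widths=(32, 64, 128)):
    """Return (layer name, (channels, side)) for each layer, in order."""
    shape = (in_channels, side)
    rows = [("data", shape)]
    for unit, width in enumerate(widths, start=1):
        shape = max_pool(shape)
        rows.append((f"unit{unit}_pool", shape))
        shape = conv_same(shape, width)
        rows.append((f"unit{unit}_conv1", shape))
    return rows


shapes = trace()
for name, (channels, side) in shapes:
    print(f"{name:12s} {channels} x {side}^3 = {channels * side ** 3} values")
```

Under these assumptions the trace reproduces the 32 x 24^3, 64 x 12^3, and 128 x 6^3 sizes shown on the slide, leaving 128 * 6^3 = 27648 values feeding the final two-way output.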
inter-target ranking
intra-target ranking
Partially Aligned Poses: 3COY, 2QMJ, 3OZT
2Q89
More Oxygen Here
Less Oxygen Here
Iterative Training
Virtual Screening
Lead Optimization
Matt Ragoza, Josh Hochuli, Elisa Idrobo, Jocelyn Sunseri
R01GM108340
Group Members: Jocelyn Sunseri, Matt Ragoza, Josh Hochuli, Roosha Mandal, Alec Helbling, Lily Turner, Aaron Zheng, Sara Amato, Gibran Biswas
Department of Computational and Systems Biology
Binding Determination
Affinity Prediction
Relevance Propagation