Olexandr Isayev, Ph.D.
University of North Carolina at Chapel Hill
olexandr@unc.edu
http://olexandrisayev.com
De Novo molecular design with Deep Reinforcement Learning
About me
Ph.D. in Chemistry (computational), minor in CS/ML
Worked in a federal research lab on HPC and GPU computing to solve chemical problems
Now faculty at the University of North Carolina at Chapel Hill
We use ML and AI to solve challenging problems in chemistry
Twitter: @olexandr
internal rate of return (IRR) Source: Endpoints News https://endpts.com
Molecular representations
Generative models
Method of biasing generated compounds
Molecules Patent pending Predictive Deep Network Generative Deep Network Tm; LogP; pIC50; etc
General Approach Application to Molecular design
arXiv:1711.10907
[Figure: virtual screening workflow. Labels: chemical database (~10^6 to 10^9 molecules); chemical structures; chemical descriptors; predictive QSAR models ("QSAR magic"); property/activity; hits (confirmed actives); inactives (confirmed inactives).]
*Popova, Mariya, Olexandr Isayev, and Alexander Tropsha. "Deep reinforcement learning for de-novo drug design." arXiv preprint arXiv:1711.10907 (2017).
Did the training converge?
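[Figure: the generative model is trained one character at a time on the SMILES string <START>c1ccc(O)cc1<END>; each predicted next character is compared with the true one and contributes a loss term.]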
Softmax loss
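One way to read the softmax loss on this slide is as a per-character cross-entropy for next-character prediction over a SMILES string. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: `tokenize`, `sequence_loss`, and `uniform_logits` are hypothetical names, the tokenizer splits by single characters (so a two-letter atom such as Cl would be split), and the "network" is a stand-in that assigns equal scores to every token.

```python
import math

START, END = "<START>", "<END>"

def tokenize(smiles):
    """Character-level tokens with start/end markers (single-character atoms only)."""
    return [START] + list(smiles) + [END]

def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sequence_loss(tokens, vocab, logits_fn):
    """Sum of -log p(next token | prefix) over the whole sequence."""
    loss = 0.0
    for t in range(len(tokens) - 1):
        probs = softmax(logits_fn(tokens[: t + 1]))
        loss += -math.log(probs[vocab.index(tokens[t + 1])])
    return loss

tokens = tokenize("c1ccc(O)cc1")
vocab = sorted(set(tokens))

def uniform_logits(prefix):
    # Hypothetical stand-in for an untrained network: every token equally likely.
    return [0.0] * len(vocab)

loss = sequence_loss(tokens, vocab, uniform_logits)
print(f"{len(tokens) - 1} predictions, loss = {loss:.3f}")
```

With uniform scores the loss equals the number of predictions times the log of the vocabulary size, which gives a baseline that a trained model should beat.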
1.5M molecules from ChEMBL
c1ccc(O)cc1
FC(F)COc1ccc2c(Nc3ccc(Cl)c(Cl)c3)ncnc2c1
Generative model Predictive model
Fc1ccc2c(Nc3ccc(F)c(F)c3)ncnc2c1
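The loop in which a generative model proposes a molecule and a predictive model scores it can be sketched as a policy-gradient (REINFORCE) update. What follows is a toy sketch under loud assumptions, not the method from arXiv:1711.10907: the "generative model" is a context-free softmax over four hypothetical tokens rather than a recurrent network, and the "predictive model" is a made-up reward (the fraction of F tokens) rather than a trained property predictor.

```python
import math
import random

VOCAB = ["C", "F", "O", "<END>"]  # hypothetical toy vocabulary
MAX_LEN = 10

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_sequence(logits, rng):
    """Toy 'generative model': draw tokens independently until <END> or MAX_LEN."""
    probs = softmax(logits)
    actions = []
    for _ in range(MAX_LEN):
        a = rng.choices(range(len(VOCAB)), weights=probs)[0]
        actions.append(a)
        if VOCAB[a] == "<END>":
            break
    return actions

def reward(actions):
    """Toy 'predictive model' (hypothetical): fraction of F among non-<END> tokens."""
    atoms = [VOCAB[a] for a in actions if VOCAB[a] != "<END>"]
    return atoms.count("F") / len(atoms) if atoms else 0.0

def reinforce_step(logits, actions, r, lr):
    """REINFORCE update: logits += lr * r * sum_t grad log p(a_t)."""
    probs = softmax(logits)
    new_logits = list(logits)
    for a in actions:
        for i in range(len(VOCAB)):
            # Gradient of log softmax w.r.t. logit i is onehot(a)[i] - probs[i].
            grad = (1.0 if i == a else 0.0) - probs[i]
            new_logits[i] += lr * r * grad
    return new_logits

rng = random.Random(0)
logits = [0.0] * len(VOCAB)
p_before = softmax(logits)[VOCAB.index("F")]
for _ in range(3000):
    actions = sample_sequence(logits, rng)
    logits = reinforce_step(logits, actions, reward(actions), lr=0.1)
p_after = softmax(logits)[VOCAB.index("F")]
print(f"p(F) before: {p_before:.2f}, after: {p_after:.2f}")
```

The design point this illustrates is that the reward only needs to be computable for a finished sequence; it does not need to be differentiable, which is why a black-box property predictor can steer the generator.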
[Figure: baseline vs. optimized distributions of partition coefficient (logP) and JAK2 inhibition (pIC50).]
CAS 236-084-2 (buffer reagent) ZINC37859566 New molecule SIMILAR SCAFFOLDS NEW CHEMOTYPE
Train data distribution Maximized property distribution Minimized property distribution arXiv:1711.10907
Distribution of Tanimoto similarity to the nearest neighbor in training dataset for compounds predicted to be active for EGFR by consensus of QSAR models:
Similarity = 0.57; Similarity = 0.69; Similarity = 0.86
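For reference, the Tanimoto similarity between two binary fingerprints is the number of shared on-bits divided by the number of bits set in either one. The sketch below assumes fingerprints are given as sets of on-bit indices; the bit sets are made-up illustrations, not real molecular fingerprints, and returning 0.0 for two empty fingerprints is a convention chosen here.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        return 0.0  # assumed convention for two empty fingerprints
    return len(a & b) / len(union)

def nearest_neighbor_similarity(query, training):
    """Highest Tanimoto similarity between the query and any training fingerprint."""
    return max(tanimoto(query, fp) for fp in training)

# Hypothetical fingerprints for illustration only.
query = {1, 2, 3, 4}
training = [{1, 2}, {1, 2, 3, 5}, {7}]
print(nearest_neighbor_similarity(query, training))
```

A low nearest-neighbor similarity for a generated compound suggests it is not a near copy of anything in the training set.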
*Ertl, Peter, and Ansgar Schuffenhauer. "Estimation of synthetic accessibility score of drug-like molecules based on molecular complexity and fragment contributions." Journal of cheminformatics 1.1 (2009): 8.
*Keiser MJ, Roth BL, Armbruster BN, Ernsberger P, Irwin JJ, Shoichet BK. Relating protein pharmacology by ligand chemistry. Nat Biotech 25 (2), 197-206 (2007).
ZINC19982368 pIC50 = 8.64 ZINC66347860 pIC50 = 3.31 pIC50 = 10.37 pIC50 = 0.63 ZINC2876515 pIC50 = 8.39
ZINC3549031 pIC50 = 3.76 ZINC469992 pIC50 = 8.23
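To put the pIC50 values above on a concentration scale: pIC50 is the negative base-10 logarithm of the IC50 expressed in mol/L, so a pIC50 of 9 corresponds to 1 nM. The helper names below are hypothetical and the snippet is only a unit-conversion sketch.

```python
import math

def pic50_from_ic50_molar(ic50_molar):
    """pIC50 = -log10(IC50), with IC50 in mol/L."""
    return -math.log10(ic50_molar)

def ic50_nanomolar_from_pic50(pic50):
    """Invert the definition and convert mol/L to nmol/L (factor 1e9)."""
    return 10.0 ** (9.0 - pic50)

for value in (8.64, 3.31):
    print(f"pIC50 = {value}: IC50 = {ic50_nanomolar_from_pic50(value):.3g} nM")
```

On this scale a pIC50 of 8.64 is about 2.3 nM, while each unit lower means a tenfold weaker inhibitor.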