De Novo Molecular Design with Deep Reinforcement Learning — Olexandr Isayev, Ph.D., University of North Carolina (PowerPoint presentation)


  1. De Novo molecular design with Deep Reinforcement Learning Olexandr Isayev, Ph.D. University of North Carolina at Chapel Hill @olexandr olexandr@unc.edu http://olexandrisayev.com

  2. About me. Ph.D. in Chemistry (computational), minor in CS/ML. Worked in a federal research lab on HPC & GPU computing to solve chemical problems. Now faculty at the University of North Carolina at Chapel Hill, where we use ML & AI to solve challenging problems in chemistry. http://olexandrisayev.com | Twitter: @olexandr | olexandr@unc.edu

  3. A public-private partnership that supports the discovery of new medicines through open access research www.thesgc.org

  4. The Long and Winding Road to Drug Discovery. Data science approaches are useful across the pipeline, but very different techniques are needed at each stage. Aim for success, but if not: fail early, fail cheap.

  5. Internal rate of return (IRR). Source: Endpoints News, https://endpts.com

  6. Drowning in Data …but starving for Knowledge

  7. The growing appreciation of molecular modeling and informatics

  8. “Behold the rise of the machines”

  9. Summary of recent AI-based studies on chemical library design. Molecular representations: fingerprints, SMILES, graphs. Generative models: autoencoders, generative adversarial networks (GANs), recurrent neural networks (RNNs), convolutional neural networks (CNNs). Methods of biasing generated compounds: none, latent-space optimization, fine-tuning on a small subset of molecules with the desired property, reinforcement learning.

  10. De Novo molecular design with Deep Reinforcement Learning: general approach and application to molecular design. A generative deep network proposes molecules; a predictive deep network estimates their properties (Tm, logP, pIC50, etc.). Patent pending; arXiv:1711.10907.

  11. Drug discovery pipeline: chemical structures → chemical descriptors → predictive QSAR models → property/activity predictions ("QSAR magic"). A chemical database of ~10^6–10^9 molecules is virtually screened into hits (confirmed actives) and inactives (confirmed inactives).
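The virtual-screening step described on this slide can be sketched in a few lines. This is a toy illustration, not the authors' pipeline: `toy_score` is a hypothetical stand-in for a real QSAR model, and the "database" is four molecules rather than 10^6–10^9.

```python
# Toy virtual screening: a stand-in "QSAR" model scores every molecule in a
# database and splits it into hits (high score) and inactives (low score).

def toy_score(smiles: str) -> float:
    """Hypothetical stand-in score: rewards aromatic rings, penalizes length."""
    return smiles.count("c1") * 1.0 - 0.01 * len(smiles)

def virtual_screen(database, threshold=0.5):
    """Return (hits, inactives) split by the model score."""
    scored = [(toy_score(s), s) for s in database]
    hits = [s for score, s in scored if score >= threshold]
    inactives = [s for score, s in scored if score < threshold]
    return hits, inactives

database = ["c1ccccc1O", "CCO", "c1ccc2c(c1)cccn2", "CCCCCC"]
hits, inactives = virtual_screen(database)
```

In a real campaign the scoring function would be a trained QSAR model and the hits would go on to experimental confirmation.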

  12. Design of the ReLeaSE* method. Challenges: generate chemically feasible SMILES; develop a SMILES-based QSAR model; employ the predictive ML model to bias library generation. *Popova, Mariya, Olexandr Isayev, and Alexander Tropsha. "Deep reinforcement learning for de-novo drug design." arXiv preprint arXiv:1711.10907 (2017).

  13. The Language of SMILES

  14. Generative model. Trained on 1.5M molecules from ChEMBL. Each SMILES string (e.g. c1ccc(O)cc1) is wrapped in <START> and <END> tokens; at each step a softmax layer predicts the next character and incurs a loss, and training continues until the loss converges.

  15.–26. Reinforcement learning for chemical design (animated sequence). The generative model emits a candidate SMILES string, e.g. FC(F)COc1ccc2c(Nc3ccc(Cl)c(Cl)c3)ncnc2c1; the predictive model scores it (here: INACTIVE, low reward); the generative model is updated and proposes a new candidate, e.g. Fc1ccc2c(Nc3ccc(F)c(F)c3)ncnc2c1, which the predictive model now scores as ACTIVE (high reward); the generate–predict–update loop then repeats. M. Popova, O. Isayev, A. Tropsha. "Deep reinforcement learning for de-novo drug design." arXiv preprint arXiv:1711.10907 (2017).
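The generate–score–update loop on these slides can be illustrated with a minimal REINFORCE-style policy-gradient update. This is a deliberately tiny sketch, not the ReLeaSE implementation: the "generative model" is a categorical distribution over four hypothetical fragments, and the "predictive model" is a stand-in reward that marks one fragment ACTIVE.

```python
import math
import random

random.seed(0)
fragments = ["Cl", "F", "O", "N"]
logits = {f: 0.0 for f in fragments}  # the toy "generative model" parameters

def probs():
    """Softmax over fragment logits."""
    z = sum(math.exp(v) for v in logits.values())
    return {f: math.exp(v) / z for f, v in logits.items()}

def reward(fragment):
    """Stand-in 'predictive model': pretend F-substitution is ACTIVE."""
    return 1.0 if fragment == "F" else 0.0

lr = 0.1
for step in range(500):
    p = probs()
    sample = random.choices(fragments, weights=[p[f] for f in fragments])[0]
    r = reward(sample)
    # REINFORCE: move log-probability of the sampled action up in
    # proportion to the reward (gradient of log softmax is 1{f==sample} - p).
    for f in fragments:
        grad = (1.0 if f == sample else 0.0) - p[f]
        logits[f] += lr * r * grad

# After training, the policy should strongly prefer the "ACTIVE" fragment.
```

The real system replaces the categorical distribution with a Stack-RNN that emits SMILES token by token, and the binary reward with scores from the trained predictive network, but the update rule follows the same policy-gradient logic.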

  27. Technical details • Models were trained on Nvidia Titan X and Titan V GPUs • Training the generative model on ChEMBL took ~25 days • Training predictive models took ~2 hours • Biasing the generative model with reinforcement learning for one property took ~1 day • The generative model produces thousands of compounds per minute

  28. Results: biasing target properties in the designed libraries. (Charts: optimized vs. baseline distributions of the partition coefficient, logP, and of JAK2 inhibition, pIC50.)
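Biasing a library toward a property range, as in these logP/pIC50 results, requires turning a predicted property value into a reward. The exact reward function used in the paper is not shown on the slide; the function below is an assumption on my part, illustrating one common shape: full reward inside a target window (e.g. drug-like logP between 1 and 4) with a linear decay outside it.

```python
# Hypothetical reward shaping for property biasing (not the paper's function):
# reward is 1.0 inside [low, high] and decays linearly to 0 outside the window.

def range_reward(value, low=1.0, high=4.0, slope=0.25):
    """Map a predicted property value to a reward in [0, 1]."""
    if low <= value <= high:
        return 1.0
    dist = (low - value) if value < low else (value - high)
    return max(0.0, 1.0 - slope * dist)
```

With such a reward, the policy-gradient update pushes the generator's output distribution toward the target window, producing the shifted "optimized" distributions shown on the slide.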

  29. JAK2 (kinase) inhibition. (Charts: training-data distribution, maximized-property distribution, minimized-property distribution.) Annotated examples: a new chemotype; ZINC37859566 (CAS 236-084-2, a buffer reagent); a new molecule; similar scaffolds. arXiv:1711.10907

  30. Results: analysis of similarity. Distribution of Tanimoto similarity (0.5–1.0) to the nearest neighbor in the training dataset, for compounds predicted to be active against EGFR by a consensus of QSAR models. Example pairs shown at similarity 0.57, 0.69, and 0.86.
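The novelty analysis above rests on Tanimoto similarity between molecular fingerprints. A minimal sketch follows; fingerprints are represented here as plain Python sets of "on" bit indices, whereas a real workflow would compute them with a cheminformatics toolkit such as RDKit.

```python
# Tanimoto similarity on bit-vector fingerprints, and the nearest-neighbor
# similarity used to assess novelty of generated compounds.

def tanimoto(fp_a: set, fp_b: set) -> float:
    """|A ∩ B| / |A ∪ B| for sets of on-bits."""
    if not fp_a and not fp_b:
        return 1.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)

def nearest_neighbor_similarity(query: set, training_fps) -> float:
    """Similarity of a generated compound to its closest training compound."""
    return max(tanimoto(query, fp) for fp in training_fps)
```

A generated compound with nearest-neighbor similarity well below 1.0 (the slide's 0.5–0.9 range) is structurally novel relative to the training set, while a value of 1.0 would indicate the model reproduced a training molecule.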

  31. Results: Synthetic accessibility score* of the designed libraries *Ertl, Peter, and Ansgar Schuffenhauer. "Estimation of synthetic accessibility score of drug-like molecules based on molecular complexity and fragment contributions." Journal of cheminformatics 1.1 (2009): 8.

  32. Target predictions for generated compounds using SEA* *Keiser MJ, Roth BL, Armbruster BN, Ernsberger P, Irwin JJ, Shoichet BK. Relating protein pharmacology by ligand chemistry. Nat Biotech 25 (2), 197-206 (2007).

  34. Model visualization for JAK2 (projection using t-SNE), with points colored by pIC50 on a scale of 1–10. Labeled example compounds (ZINC469992, ZINC19982368, ZINC2876515, ZINC66347860, ZINC3549031) span pIC50 values from 0.63 to 10.37.

  35. Examples of Stack-RNN cells with interpretable gate activations

  36. Summary • AI methods coupled with the SMILES representation enable the generation of biased libraries • The system naturally embeds reinforcement learning to produce novel structures with desired properties • The system can be tuned to bias libraries toward specific property ranges • The next phase is experimental validation of hits by the UNC SGC team
