
Neural Networks Designing New Drugs (Mariya Popova, Olexandr Isayev, Alexandr Tropsha)



  1. Neural Networks Designing New Drugs (Mariya Popova, Olexandr Isayev, Alexandr Tropsha)

  2. Drug Discovery Timeline

  3. Conventional Virtual Screening Pipeline
     • Chemical structures → chemical descriptors → predictive property/activity models
     • A chemical database of ~10^6 – 10^9 molecules goes through virtual screening and is split into actives and inactives

  4. Why Do We Need Generative Models?
     • The biggest database of molecules contains ~10^9 compounds
     • Estimates of the size of chemical space go up to 10^60
     • Searching for new drug candidates only in existing databases introduces observation bias

  5. Generative Models Overview
     Reference: Sanchez-Lengeling, Benjamin, and Alán Aspuru-Guzik. "Inverse molecular design using machine learning: Generative models for matter engineering." Science 361.6400 (2018): 360-365.

  6. Our Approach
     • Generative model $G$ for SMILES
     • Predictive model $P$ for the desired property
     • $G$ and $P$ are combined with RL in one pipeline to bias the properties of the generated molecules
     Reference: Popova et al. "Deep reinforcement learning for de novo drug design." Science Advances 4.7 (2018): eaap7885.

  7. SMILES-based Generative Model
     • SMILES (simplified molecular-input line-entry system) is a sequence of characters that encodes the molecular graph
     • One sequence = one molecule
     • SMILES strings are written over a fixed alphabet of characters (see the tokenization sketch below)
     • Use a language model to produce novel SMILES strings: $p(s_t \mid s_1 \ldots s_{t-1}; \theta) = f(s_1 \ldots s_{t-1} \mid \theta)$
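A character-level language model over SMILES needs the alphabet mapped to integer indices first. Below is a minimal tokenization sketch in Python; the character set and the <START>/<END> tokens are illustrative assumptions, not the exact vocabulary used in the slides.

```python
# Minimal SMILES tokenization sketch (illustrative alphabet, not the full ChEMBL vocabulary).
SMILES_ALPHABET = list("CNOSPFIclnos()[]=#@+-123456789%/\\") + ["Cl", "Br", "<START>", "<END>"]
char2idx = {c: i for i, c in enumerate(SMILES_ALPHABET)}
idx2char = {i: c for c, i in char2idx.items()}

def encode(smiles):
    """Map a SMILES string to a list of token indices, trying two-character tokens first."""
    tokens, i = ["<START>"], 0
    while i < len(smiles):
        if smiles[i:i + 2] in char2idx:        # handle two-letter atoms such as Cl, Br
            tokens.append(smiles[i:i + 2]); i += 2
        else:
            tokens.append(smiles[i]); i += 1
    tokens.append("<END>")
    return [char2idx[t] for t in tokens]

print(encode("CCO"))  # ethanol
```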

  8. Generative Model: Training Mode
     • Trained on 1.5 million drug-like compounds from ChEMBL in a supervised manner, with a softmax (cross-entropy) loss over the next character
     • (diagram: embeddings of the input characters feed stack-augmented GRU cells, which output probability vectors over the next character at each step)
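The training mode on slide 8 is teacher-forced next-character prediction with a softmax (cross-entropy) loss. The PyTorch sketch below uses a plain GRU in place of the stack-augmented GRU on the slide; layer sizes, and the `SMILES_ALPHABET`/`encode` helpers from the previous sketch, are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SmilesGenerator(nn.Module):
    """Character-level SMILES language model (plain GRU; the slides use a stack-augmented GRU)."""
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.gru = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens, hidden=None):
        out, hidden = self.gru(self.embed(tokens), hidden)
        return self.head(out), hidden          # logits over the next character at each position

model = SmilesGenerator(vocab_size=len(SMILES_ALPHABET))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def supervised_step(batch):
    """One training step; batch is a LongTensor [B, L] of encoded SMILES (padding/masking omitted for brevity)."""
    logits, _ = model(batch[:, :-1])           # predict character t from characters < t
    loss = loss_fn(logits.reshape(-1, logits.size(-1)), batch[:, 1:].reshape(-1))
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return loss.item()
```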

  9. Generative Model: Inference Mode
     • The model takes its own prediction as the next input character: $p(s_t \mid s_1 \ldots s_{t-1}; \theta) = f(s_1 \ldots s_{t-1} \mid \theta)$
     • (diagram: embedding vectors → probabilities of the next character → sampling, repeated as the SMILES string grows)
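In inference mode the model feeds each sampled character back in as the next input. A minimal sampling loop under the same assumptions as the training sketch:

```python
import torch

@torch.no_grad()
def sample_smiles(model, max_len=120, temperature=1.0):
    """Autoregressively sample one SMILES string; stops at <END> or max_len."""
    token = torch.tensor([[char2idx["<START>"]]])
    hidden, chars = None, []
    for _ in range(max_len):
        logits, hidden = model(token, hidden)
        probs = torch.softmax(logits[:, -1] / temperature, dim=-1)
        token = torch.multinomial(probs, num_samples=1)   # sample the next character
        char = idx2char[token.item()]
        if char == "<END>":
            break
        chars.append(char)
    return "".join(chars)

print(sample_smiles(model))
```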

  10. RL Formulation for SMILES Generation
     • Action: generate the next symbol $s_t$ with probability $p(s_t \mid s_1 \ldots s_{t-1}; \theta) = f(s_1 \ldots s_{t-1} \mid \theta)$
     • Set of actions: the SMILES alphabet $A$
     • State: the generated prefix $s_1 s_2 \ldots s_{t-1}$
     • Set of states: all possible strings over the alphabet $A$ with lengths from 0 to $T$, i.e. $\mathbb{A} = \{A^t,\; t = 0 \ldots T\}$
     • Environment: the set of states $\mathbb{A}$, the set of actions $A$, and the transition probabilities $p(s_t = a \mid s_1 \ldots s_{t-1}; \theta)$, $a \in A$
     • Reward function: $R(s_T)$, assigned to the terminal state (the completed SMILES string)
     • Objective: maximize the expected reward $\mathbb{E}[R(s_T) \mid \theta] = \sum_{s_T \in \mathbb{A}} p(s_T \mid \theta)\, R(s_T) \to \max_\theta$
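The expected-reward objective above is typically optimized with a REINFORCE-style policy gradient; the standard log-derivative trick gives the estimator below. This is the textbook identity, stated here for completeness rather than copied from the slides.

```latex
\nabla_\theta \, \mathbb{E}\!\left[ R(s_T) \mid \theta \right]
  = \sum_{s_T \in \mathbb{A}} R(s_T)\, \nabla_\theta\, p(s_T \mid \theta)
  = \mathbb{E}_{s_T \sim p(\cdot \mid \theta)}\!\left[ R(s_T)\, \nabla_\theta \log p(s_T \mid \theta) \right],
\quad
\log p(s_T \mid \theta) = \sum_{t=1}^{T} \log p(s_t \mid s_1 \ldots s_{t-1}; \theta).
```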

  11. RL Pipeline for Molecule Generation
     • The generative model is the policy network
     • The predictive model acts as a simulator of the real world
     • The reward is assigned based on the property prediction and the researcher's objective (a training-loop sketch follows below)
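Putting the two models together, each RL iteration samples a batch of SMILES from the generator (the policy), scores them with the predictive model (the simulator), and applies a REINFORCE update. A sketch under the same assumptions as the earlier snippets; `reward_fn` is a placeholder for the experiment-specific reward built on top of the property predictor.

```python
import torch

def rl_step(model, optimizer, reward_fn, batch_size=16, max_len=120):
    """One REINFORCE update: sample SMILES, score them, reinforce high-reward sequences."""
    optimizer.zero_grad()
    for _ in range(batch_size):
        # Sample a SMILES string while accumulating differentiable log-probabilities.
        token = torch.tensor([[char2idx["<START>"]]])
        hidden, log_prob, chars = None, 0.0, []
        for _ in range(max_len):
            logits, hidden = model(token, hidden)
            probs = torch.softmax(logits[:, -1], dim=-1)
            token = torch.multinomial(probs, num_samples=1)
            log_prob = log_prob + torch.log(probs[0, token.item()])
            char = idx2char[token.item()]
            if char == "<END>":
                break
            chars.append(char)
        reward = reward_fn("".join(chars))          # e.g. based on the predictive model's output
        loss = -reward * log_prob / batch_size      # REINFORCE: maximize E[R * log p]
        loss.backward()                             # accumulate gradients over the batch
    optimizer.step()
```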

  12. Results: Optimizing Lipophilicity
     • Lipophilicity is possibly the most important physicochemical property of a potential drug
     • It plays a role in solubility, absorption, membrane penetration, etc.
     • Log P is a quantitative measure of lipophilicity: the logarithm of the ratio of concentrations of a compound in a mixture of two immiscible phases at equilibrium
     • Log P is a component of Lipinski's Rule of 5, a rule of thumb for predicting drug-likeness (an RDKit illustration follows below)
     • According to Lipinski's rule, log P must be between 0 and 5 for drug-like molecules
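For context, log P and the other Rule-of-5 quantities can be computed directly from a SMILES string with RDKit; this is an illustrative check, not the predictive model used in the slides (which is an RNN trained on measured log P values).

```python
from rdkit import Chem
from rdkit.Chem import Descriptors, Lipinski

def lipinski_profile(smiles):
    """Compute log P and the Lipinski Rule-of-5 quantities for one molecule."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None                                 # invalid SMILES
    return {
        "logP": Descriptors.MolLogP(mol),           # Crippen estimate of octanol/water log P
        "MW": Descriptors.MolWt(mol),
        "HBD": Lipinski.NumHDonors(mol),
        "HBA": Lipinski.NumHAcceptors(mol),
    }

print(lipinski_profile("CC(=O)Oc1ccccc1C(=O)O"))    # aspirin
```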

  13. Predictive Model for log P
     • SMILES-based RNN
     • Dataset of 14k compounds with measured log P
     • 5-fold cross-validation
     • RMSE = 0.57, $R^2 = 0.90$
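The evaluation protocol on slide 13 (5-fold cross-validation reporting RMSE and $R^2$) can be expressed with scikit-learn as below; a generic regressor factory stands in for the SMILES-based RNN, which is not shown here.

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.metrics import mean_squared_error, r2_score

def cross_validate(model_factory, X, y, n_splits=5):
    """5-fold CV reporting mean RMSE and R^2; model_factory builds a fresh regressor per fold."""
    rmses, r2s = [], []
    for train_idx, test_idx in KFold(n_splits=n_splits, shuffle=True, random_state=0).split(X):
        model = model_factory()
        model.fit(X[train_idx], y[train_idx])
        pred = model.predict(X[test_idx])
        rmses.append(mean_squared_error(y[test_idx], pred) ** 0.5)
        r2s.append(r2_score(y[test_idx], pred))
    return float(np.mean(rmses)), float(np.mean(r2s))
```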

  14. Results: Optimizing Lipophilicity
     • Reward function: $R(s) = \begin{cases} 11 & \text{if } \log P(s) \in [0.5, 4.5] \\ 0 & \text{otherwise} \end{cases}$
     • (figure: reward value as a function of the log P value)
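The piecewise reward on slide 14 translates directly into code; `predict_logp` is a placeholder for the predictive RNN from slide 13.

```python
def logp_reward(smiles, predict_logp, low=0.5, high=4.5):
    """Reward of 11 when the predicted log P falls in the drug-like range, 0 otherwise (slide 14)."""
    logp = predict_logp(smiles)
    return 11.0 if low <= logp <= high else 0.0
```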

  15. Results: Optimizing Lipophilicity
     • (figure: values of the reward function during training; reward value vs. training iteration)

  16. Results: Optimizing Lipophilicity
     • (figure: distributions of unbiased and optimized predicted log P values)
     • Statistics are calculated from 10,000 randomly generated SMILES
     • 100% of optimized SMILES were predicted to have log P within the drug-like region

  17. Limitations
     • The approach worked well for a relatively simple physical property
     • What if a molecule with a high reward is a rare event? It could take very long until the model receives a high (or any non-zero) reward

  18. Tricks
     • Flexible reward: first give a high reward for worse molecules, then gradually increase the threshold
     • Fine-tuning on a dataset of "good" molecules in a supervised manner
       – Fine-tune on generated molecules with high rewards
       – Fine-tune on experimental ground-truth data
       – High exploitation, low exploration
     • Experience replay for policy-gradient optimization (see the buffer sketch below)
       – Remember generated molecules with high rewards and replay them
       – Replay experimental ground-truth data
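A minimal sketch of the experience-replay trick from this slide: keep the highest-reward SMILES seen so far (optionally seeded with experimental ground-truth actives) and periodically replay them. The buffer size is an illustrative choice.

```python
import heapq
import random

class ReplayBuffer:
    """Keep the top-k highest-reward SMILES and sample from them for replay updates."""
    def __init__(self, capacity=1000):
        self.capacity = capacity
        self.heap = []                                   # min-heap of (reward, smiles)

    def add(self, smiles, reward):
        if len(self.heap) < self.capacity:
            heapq.heappush(self.heap, (reward, smiles))
        elif reward > self.heap[0][0]:                   # better than the current worst entry
            heapq.heapreplace(self.heap, (reward, smiles))

    def sample(self, n):
        return [s for _, s in random.sample(self.heap, min(n, len(self.heap)))]
```

Replayed molecules are then used just like the ChEMBL pretraining data: a few supervised cross-entropy steps on buffer samples between policy-gradient updates.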

  19. More Results: EGFR
     • Epidermal growth factor receptor (EGFR) is associated with cancer and inflammatory diseases
     • ~10k experimental activity measurements are available for molecules against EGFR

  20. More Results: EGFR
     • Built a binary classification (active/inactive) predictive model for EGFR (F1 score 0.9)
     • Took the generative network pretrained on ChEMBL
     • Generated 10k random molecules and predicted the probability of class "active" for each

  21. More Results: EGFR
     • (figure: predicted probability of class "active" for generated molecules)

  22. More Results: EGFR
     • Flexible reward: $R(s) = \begin{cases} 10 & \text{if } P(s) > \text{threshold} \\ 0 & \text{otherwise} \end{cases}$
     • Initial threshold = 0.05
     • After every update, generate 10k compounds
     • If 15% of them are predicted to have the property above the threshold, increase the threshold by 0.05 (see the schedule sketch below)
     • Fine-tuning on generated molecules with high rewards
     • Experience replay on experimental measurements and on generated molecules with high rewards
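The flexible-reward schedule on slide 22 as a sketch; `predict_active_prob` stands for the EGFR classifier and `generate_batch` for a batched version of the sampling helper above, both assumptions.

```python
def flexible_reward(prob, threshold):
    """Reward of 10 when the predicted probability of 'active' exceeds the current threshold."""
    return 10.0 if prob > threshold else 0.0

def update_threshold(threshold, generated_probs, step=0.05, fraction=0.15):
    """Raise the threshold once at least 15% of freshly generated molecules already clear it."""
    if sum(p > threshold for p in generated_probs) >= fraction * len(generated_probs):
        threshold += step
    return threshold

# Usage sketch: after every RL update, score 10k generated SMILES and adjust the threshold.
# probs = [predict_active_prob(s) for s in generate_batch(model, n=10_000)]
# threshold = update_threshold(threshold, probs)
```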

  23. More Results: EGFR
     • (figure: unbiased vs. maximized distributions of the predicted probability of class "active")

  24. More Results: EGFR
     • Experimental validation: selected several commercially available compounds and validated our results experimentally
     • Found 4 active compounds

  25. Future Work
     • Develop graph-based generative models:
       – SMILES-based models generate a certain fraction of invalid molecules
     • Develop lead-optimization methods:
       – Start from a given scaffold/structure
       – Impossible to do with SMILES
     • Develop models for predicting synthesis routes:
       – To be able to perform custom synthesis

  26. Code Links
     • RL for de novo drug design: https://github.com/isayev/ReLeaSE

  27. Acknowledgements
     • University of North Carolina at Chapel Hill: Olexandr Isayev, Alexandr Tropsha
