Kexin Huang, Tianfan Fu, Wenhao Gao, Yue Zhao, Marinka Zitnik

  1. Kexin Huang (Harvard, kexinhuang@hsph.harvard.edu), Tianfan Fu (Georgia Tech, tfu42@gatech.edu), Wenhao Gao (MIT, whgao@mit.edu), Yue Zhao (CMU, zhaoy@cmu.edu), Marinka Zitnik (Harvard, marinka@hms.harvard.edu)

  2. Retrieving, curating, and processing ML-ready datasets is time-consuming and requires extensive domain expertise. Datasets are scattered across biological repositories, and there is no centralized repository covering the variety of therapeutics tasks. Many tasks are under-explored in the AI/ML community because of the lack of data access. https://github.com/mims-harvard/TDC

  3. Machine Learning Datasets for Therapeutics
     Open-Source ML Datasets for Therapeutics:
       • Wide range of tasks: target discovery, activity screening, efficacy, safety, manufacturing
       • Wide range of products: small molecules, antibodies, vaccines, miRNA
     Numerous Data Functions:
       • Extensive data functions and model evaluators
       • Data processing and splits, molecule generation oracles, and much more
     3 Lines of Code:
       • Minimum package dependency, lightweight loaders

  4. Our Vision for TDC: Domain scientists identify meaningful therapeutics tasks; ML scientists design powerful ML models. Together: advancing algorithms for key therapeutics problems.

  5. Modular Structure of TDC. TDC “Central Dogma”: Single-instance → Y, Multi-instance → Y, Generation.

  6. Diverse Coverage of Tasks

  7. Tasks shown: GDA, Tox, DTI, DrugRes, Reaction, HTS, DrugSyn, MolGen, QM, Peptide-MHC, ADME, PairMolGen, AntibodyAff, Paratope, RetroSyn, MTI, Epitope, PPI, Catalyst, Develop, Yields, DDI.

  8. 3 Lines of Code: The core TDC library has minimal package dependencies, so installation is hassle-free. Data loaders are simplified so that ML-ready datasets are accessible in only 3 lines of code.

  9. Highlight: Data sources

  10. Highlight: Drug Response Prediction (DrugRes) and Drug Synergy Prediction (DrugSyn). [Figure: each task outputs a high or low response; DrugSyn takes a combination of drugs as input.]

  11. Highlight: 10 Biologics Datasets. Tasks: Paratope, Develop, Epitope, Peptide-MHC, MTI, AntibodyAff.

  12. Data Functions to Support your Research

  13. Molecule Generation Oracles. Workflow: molecule generation → generated molecules → oracle → score → optimize, in 3 lines of code. Oracle sources: GuacaMol, MOSES, literature.
      References:
        • GuacaMol: Benchmarking Models for de Novo Molecular Design, J. Chem. Inf. Model., 2019
        • MOSES: A Benchmarking Platform for Molecular Generation Models, Frontiers in Pharmacology, 2020

  14. You Are Invited to Join TDC! TDC is an Open-Source, Community Effort. Contribute:
        • Tasks: Clinical Trials, CRISPR, Phenotypic Screening, Protein Contact, Crystal Structure, …
        • Datasets: HTS, ADME, Drug Response, Drug Synergy, Reactions, Antibody affinity, …
        • Data Functions: Data Wrangling, Data Visualization, Realistic Splits, Molecule Generation Oracles, …
      Fill in this form: rb.gy/ytbyfl

  15. zitniklab.hms.harvard.edu/TDC

  16. Website: zitniklab.hms.harvard.edu/TDC
      GitHub: github.com/mims-harvard/TDC
      groups.io/g/tdc
        • Kexin Huang, Harvard, @KexinHuang5, kexinhuang@hsph.harvard.edu
        • Tianfan Fu, Georgia Tech, @TianfanFu, tfu42@gatech.edu
        • Wenhao Gao, MIT, @WenhaoGao1, whgao@mit.edu
        • Yue Zhao, CMU, @yzhao062, zhaoy@cmu.edu
        • Marinka Zitnik, Harvard, @marinkazitnik, marinka@hms.harvard.edu
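
Slide 8 says a dataset can be loaded in 3 lines of code, but the transcript does not preserve the code itself. The sketch below shows what those lines might look like, based on the loader pattern recalled from the TDC README (`tdc.single_pred.ADME`, `get_split`). The wrapper function `load_adme_split` and the default dataset name `Caco2_Wang` are illustrative choices, not taken from the slides, and the library's names may have changed since. Running it requires `pip install PyTDC` and network access.

```python
def load_adme_split(name="Caco2_Wang"):
    """Return the default train/valid/test split of a TDC ADME dataset.

    Hypothetical helper: the function name and default dataset are
    assumptions for illustration, not part of the original slides.
    """
    # Imported lazily so that defining this function does not require PyTDC.
    from tdc.single_pred import ADME  # line 1: pick a task loader
    data = ADME(name=name)            # line 2: download and load the dataset
    return data.get_split()           # line 3: split into train/valid/test
```

Calling `load_adme_split()` should return a dictionary of data frames keyed by split name, assuming the loader behaves as described in the TDC README.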

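The oracle workflow on slide 13 (generated molecules are scored by an oracle, and the scores drive optimization) can be sketched as a small ranking step. This is a minimal illustration under stated assumptions: `rank_by_oracle` and `toy_oracle` are hypothetical names, and `toy_oracle` is a stand-in that only counts characters, not a real chemistry score.

```python
from typing import Callable, List, Tuple


def rank_by_oracle(
    smiles: List[str],
    oracle: Callable[[str], float],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Score each SMILES string with the oracle and keep the top_k best."""
    scored = [(s, float(oracle(s))) for s in smiles]
    # Higher score is treated as better; sort in descending order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def toy_oracle(s: str) -> float:
    """Hypothetical stand-in oracle: fraction of characters that are 'C' or 'c'.

    This has no chemical meaning; it only makes the example runnable
    without any chemistry library.
    """
    return sum(ch in "Cc" for ch in s) / len(s) if s else 0.0


candidates = ["CCO", "c1ccccc1", "O=O"]
best = rank_by_oracle(candidates, toy_oracle, top_k=2)
```

Any callable that maps a SMILES string to a number can replace `toy_oracle`. With PyTDC installed, an oracle object such as `Oracle(name="QED")` from `tdc` could be passed instead, assuming the `Oracle` interface matches what the TDC documentation describes.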