
An Improved Matrix Completion Algorithm For Categorical Variables

An Improved Matrix Completion Algorithm For Categorical Variables: Application to Active Learning of Drug Responses. Huangqingbo Sun, Robert F. Murphy. Workshop on Real World Experiment Design and Active Learning at ICML 2020.


  1. An Improved Matrix Completion Algorithm For Categorical Variables: Application to Active Learning of Drug Responses • Huangqingbo Sun, Robert F. Murphy • Workshop on Real World Experiment Design and Active Learning at ICML 2020

  2. Drug Discovery Funnel • Failures in clinical trials (and even after FDA approval) are typically due to side effects that were not tested for earlier (e.g., Vioxx). • It is better to test early both for the desired effect and for the absence of undesired effects, but there are too many combinations (10^4 targets × 10^7 compounds). • Source: PhRMA

  3. Active Learning - Multiple Phenotypes • Solution is active learning of a predictive model of all compound effects on all targets • But there are also many possible effects that compounds could have on a given target – thus effects are categorical variables • Assume that there are some similarities in effects among compounds and targets • Predictive model: completion (imputation) of a very sparse (only a few observed entries) categorical matrix • For active learning, uncertainty sampling is adopted, with 3 query strategies.
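The uncertainty sampling step on this slide can be illustrated with a minimal sketch. It assumes the completion model yields a probability distribution over effect categories for each unobserved (compound, target) entry. The two scores shown, entropy and least confidence, are standard uncertainty measures and are not necessarily identical to the three query strategies used in the talk. All names and numbers below are hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a categorical distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def least_confidence(probs):
    """One minus the probability of the most likely category."""
    return 1.0 - max(probs)

def select_batch(predictions, score_fn, batch_size):
    """Return the keys of the `batch_size` most uncertain entries.

    `predictions` maps (compound, target) -> list of category probabilities
    for entries that have not been measured yet.
    """
    ranked = sorted(predictions, key=lambda k: score_fn(predictions[k]), reverse=True)
    return ranked[:batch_size]

# Hypothetical predicted distributions over 3 effect categories.
predictions = {
    ("drug_A", "target_1"): [0.90, 0.05, 0.05],
    ("drug_A", "target_2"): [0.40, 0.35, 0.25],
    ("drug_B", "target_1"): [0.34, 0.33, 0.33],
}

# Pick the two entries whose predicted effect is least certain.
batch = select_batch(predictions, entropy, batch_size=2)
```

Passing `least_confidence` instead of `entropy` changes only the ranking criterion; the selection loop stays the same.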

  4. Experiment on Synthetic Data • How fast is active learning compared to random selection? • Performance was measured as the difference between active and random selection in the number of batches needed to reach 100% (right) or 90% (left) accuracy.
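The performance measure on this slide can be sketched as follows. The accuracy curves are made-up numbers, the function names are hypothetical, and the sign convention (positive means active learning needed fewer batches) is an assumption rather than something the slide states.

```python
def batches_to_reach(accuracy_per_batch, threshold):
    """Return the 1-based batch index at which accuracy first reaches
    `threshold`, or None if it never does."""
    for i, acc in enumerate(accuracy_per_batch, start=1):
        if acc >= threshold:
            return i
    return None

def batch_savings(active_curve, random_curve, threshold):
    """Batches needed by random selection minus batches needed by active
    selection (assumed convention: positive means active is faster)."""
    a = batches_to_reach(active_curve, threshold)
    r = batches_to_reach(random_curve, threshold)
    if a is None or r is None:
        return None
    return r - a

# Made-up accuracy after each batch, for illustration only.
active_curve = [0.50, 0.75, 0.92, 1.00]
random_curve = [0.40, 0.60, 0.80, 0.91, 0.97, 1.00]

savings_at_90 = batch_savings(active_curve, random_curve, 0.90)
savings_at_100 = batch_savings(active_curve, random_curve, 1.00)
```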

  5. Experiment Using Microscope Images for Many Drugs and Targets • Learn the effect of 92 drugs on 94 GFP-tagged proteins without doing experiments for all drugs and proteins, with the help of active learning. • [Figure: accuracy (0% to 100%) versus round for six models: Naik et al. Active Model, Our Active Model - Hybrid Query, Our Active Model - Least Score, Our Active Model - Entropy, Our Random Model, and Naik et al. Random Model. Image source: Naik et al.]

  6. Conclusions • Improved clustering-based, “lazy learning” matrix completion algorithm for categorical matrices. • Results in improved active learning performance over previous methods.
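To make the idea of "lazy" completion of a categorical matrix concrete, here is a small sketch that imputes a missing entry at query time by a similarity-weighted vote over other rows. This is a generic neighbor-voting scheme chosen for illustration. It is not the clustering-based algorithm from the talk, whose details the slides do not give. The matrix contents and function names are hypothetical, and `None` marks an unobserved entry.

```python
from collections import Counter

def row_similarity(row_a, row_b):
    """Fraction of co-observed columns where the two rows agree (0.0 if none)."""
    shared = [(a, b) for a, b in zip(row_a, row_b) if a is not None and b is not None]
    if not shared:
        return 0.0
    return sum(a == b for a, b in shared) / len(shared)

def predict_entry(matrix, i, j):
    """Similarity-weighted vote for entry (i, j), using rows observed in column j.

    Returns a dict mapping category -> normalized vote weight,
    or an empty dict when no similar row provides evidence.
    """
    votes = Counter()
    for k, row in enumerate(matrix):
        if k == i or row[j] is None:
            continue
        weight = row_similarity(matrix[i], row)
        if weight > 0:
            votes[row[j]] += weight
    total = sum(votes.values())
    return {cat: w / total for cat, w in votes.items()} if total else {}

# Hypothetical compounds (rows) x targets (columns) with categorical effects.
matrix = [
    ["up",   "none", None],
    ["up",   "none", "down"],
    ["down", "up",   "none"],
    ["up",   None,   "down"],
]

# Rows 1 and 3 agree with row 0 wherever both are observed, so they vote.
dist = predict_entry(matrix, 0, 2)
```

The normalized votes can be read as a rough predicted distribution, which is the kind of input an uncertainty-based query strategy needs.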
