Designing an ML-Minded Product and a Product-Minded ML System



  1. Designing an ML-Minded Product and a Product-Minded ML System. ACM Webinar, January 23, 2019. Grace Huang

  2. Personalized Homefeed

  3. Personalized Homefeed
      • Personalization: scoring and ranking
      • Picking the best of the best among candidates

  4. A ranking model
      • Supervised learning with labels:
        ◦ 1 = some positive engagement
        ◦ 0 = no engagement or negative actions
      • Learns to predict a positive engagement (ranking) score
      • Pins are then sorted by engagement score = f(pin, user, …)
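A minimal sketch of that setup, assuming scikit-learn as a stand-in model and made-up features; the talk does not specify the model family or feature set:

```python
# Sketch of the ranking setup above: binary labels, a probabilistic
# classifier, and candidates sorted by predicted engagement score.
# Features and the choice of scikit-learn are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical training data: one feature row per (pin, user) pair.
# Label 1 = some positive engagement; 0 = no engagement or negative actions.
X_train = rng.random((1000, 8))
y_train = rng.integers(0, 2, size=1000)

model = LogisticRegression().fit(X_train, y_train)

def engagement_score(pin_user_features: np.ndarray) -> np.ndarray:
    # f(pin, user, ...): predicted probability of a positive engagement.
    return model.predict_proba(pin_user_features)[:, 1]

# Rank a user's candidate pins by descending engagement score.
candidates = rng.random((50, 8))
ranking = np.argsort(-engagement_score(candidates))
```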


  5. Data collection -> Feature engineering -> Model training -> Prediction (store and serve)

  6. Components of a production ML system
      [Diagram: Data -> Training -> Serving -> Evaluation -> Launch. A data pipeline produces data for offline training/testing and data to make predictions; the production model serves on-line predictions; offline evaluations and launch experiments feed launch decisions.]

  7. We will focus on data, evaluation and shipping

  8. Considerations for a data pipeline

  9. Engagement score = f(pin, user, …)
      • User profile
      • Pin information
      • User's past actions: engagement signals
      • Derived user profiles from past actions
      • Derived pin information
      [Screenshot: an example pin, "The perfect path to cold brew" by Caffeinated Inc., saved to Omar Seyal's "Cravings" board]

  10. Considerations for a data pipeline
      • Logging (and changes)
      • Aggregations (ETLs)
      • ETL management libraries
      • Data validation (see the sketch after this list)
      • Monitoring and alerts for the pipeline
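As a concrete illustration of the data-validation bullet, a sketch of the kind of checks a daily ETL output might run; the column names, thresholds, and overall shape are assumptions, not anything prescribed in the talk:

```python
# Illustrative data-validation checks for a daily ETL output.
import pandas as pd

def validate_daily_features(df: pd.DataFrame, prev_row_count: int) -> list:
    problems = []
    # Row count should not crater relative to the previous run.
    if prev_row_count and len(df) < 0.5 * prev_row_count:
        problems.append(f"row count dropped: {len(df)} vs {prev_row_count}")
    # Coverage: key columns should rarely be null.
    for col in ("user_id", "pin_id", "engagement_signal"):  # hypothetical columns
        null_rate = df[col].isna().mean()
        if null_rate > 0.01:
            problems.append(f"{col} null rate {null_rate:.1%} exceeds 1%")
    return problems  # a non-empty list would block the run and alert the on-call
```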

  11. Training data should be carefully managed
      • Sampling scheme
      • Version control
      • Monitoring feature distribution changes (see the sketch after this list)
      • Feature extraction and transformations
      • Feature value validation
      • Shared feature store or individual pipelines
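One common way to monitor feature distribution changes is the population stability index (PSI); this is an illustrative implementation with rule-of-thumb thresholds, not anything from the talk:

```python
# Population stability index between a reference window (e.g. the last
# training snapshot) and current data, for one continuous feature.
import numpy as np

def psi(reference: np.ndarray, current: np.ndarray, bins: int = 10) -> float:
    # Bin edges from reference quantiles; extend to cover all values.
    edges = np.quantile(reference, np.linspace(0, 1, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf
    ref_frac = np.histogram(reference, edges)[0] / len(reference)
    cur_frac = np.histogram(current, edges)[0] / len(current)
    ref_frac = np.clip(ref_frac, 1e-6, None)  # avoid log(0)
    cur_frac = np.clip(cur_frac, 1e-6, None)
    return float(np.sum((cur_frac - ref_frac) * np.log(cur_frac / ref_frac)))

# Conventional rule of thumb: PSI > 0.2 suggests a shift worth alerting on.
```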

  12. Training and serving data discrepancy (skew)?
      • Training data sampled differently from serving data?
      • A lag before certain features are populated (e.g. they take a long time to compute)?
      • Logging change?
      • ETL breaks?
      • Seasonality
      • Market differences
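A rough sketch of how such skew might be surfaced: compare feature coverage and summary statistics between the training set and features logged at serving time. Column names and the 10% tolerance are illustrative assumptions:

```python
# Training/serving skew check over a shared set of feature columns.
import pandas as pd

def skew_report(train_df: pd.DataFrame, serve_df: pd.DataFrame, features):
    for f in features:
        # Coverage skew: a feature present in training but missing at
        # serving time (e.g. computed with a lag) shows up here.
        train_cov = train_df[f].notna().mean()
        serve_cov = serve_df[f].notna().mean()
        if abs(train_cov - serve_cov) > 0.10:
            print(f"{f}: coverage train={train_cov:.2f} serve={serve_cov:.2f}")
        # Value skew: same feature, different distribution.
        t_mean, s_mean = train_df[f].mean(), serve_df[f].mean()
        if t_mean and abs(s_mean - t_mean) / abs(t_mean) > 0.10:
            print(f"{f}: mean train={t_mean:.3f} serve={s_mean:.3f}")
```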

  13. How to evaluate a candidate model
      • Your favorite offline performance measures
      • Human evaluation
      • Custom tools (e.g. side-by-side comparisons, simulated debuggers for sanity checks, funnels, etc.)
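For illustration, a sketch pairing one standard offline measure (AUC, as an example of a "favorite offline measure") with a simple side-by-side view of what two candidate models would surface; both helpers are hypothetical:

```python
# Offline evaluation plus a side-by-side sanity check of two candidates.
import numpy as np
from sklearn.metrics import roc_auc_score

def offline_eval(model, X_test, y_test) -> float:
    scores = model.predict_proba(X_test)[:, 1]
    return roc_auc_score(y_test, scores)

def side_by_side(model_a, model_b, candidates, k=10):
    # Indices of the top-k pins each model would surface, for human review.
    top_a = np.argsort(-model_a.predict_proba(candidates)[:, 1])[:k]
    top_b = np.argsort(-model_b.predict_proba(candidates)[:, 1])[:k]
    return top_a, top_b
```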

  14. How to evaluate a candidate model
      • Goal metrics
      • Leading indicators
      • Debug metrics
      • Guardrail metrics
      • Custom tools
      • Metrics vs. loss function

  15. Shipping criteria should include…
      • Metrics
      • Infrastructure cost
      • Maintenance overhead (regularization!)
      • Product vision
      • Cannibalization
      • Speed vs. iteration

  16. Once shipped, continue to monitor
      • Continuous monitoring:
        ◦ Goal metrics on dashboards
        ◦ Alerts for data and prediction distribution drifts (see the sketch after this list)
      • Runbook, tools and delegation for investigations
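A sketch of one such alert: flag when today's served prediction scores drift from a trailing baseline. The z-score form and threshold are illustrative choices, not the talk's actual monitoring stack:

```python
# Alert on prediction-distribution drift versus a trailing baseline.
import numpy as np

def prediction_drift_alert(baseline_scores: np.ndarray,
                           today_scores: np.ndarray,
                           z_threshold: float = 4.0) -> bool:
    base_mean, base_std = baseline_scores.mean(), baseline_scores.std()
    if base_std == 0:
        return False
    # Standard error of today's mean under the baseline distribution.
    z = abs(today_scores.mean() - base_mean) / (base_std / np.sqrt(len(today_scores)))
    return z > z_threshold  # True -> fire an alert, consult the runbook
```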

  17. Automation is key

  18. Lessons learned
      #1 Beware of Data and System Bias
      #2 Testing & Monitoring… (Do it!)
      #3 Good Infrastructure Speeds Up Iteration
      #4 Measurement and Understanding are Crucial
      #5 Build a Sustainable Ecosystem
      #6 Design an ML-Minded Product, and a Product-Minded ML System

  19. #1 Beware of Data and System Bias

  20. Engagement data complements pin information

  21. Engagement data is a double-edged sword!

  22. Remove bias and effects of the existing system as much as possible (so the rich don't get richer)

  23. #2 Testing & Monitoring …..(Do it!)

  24. [Chart: some important metric trending down over several weeks. Not good!]

  25. GBDT Migration to Neural Network
      [Same chart: the important metric declining over the weeks following the migration]

  26. Offline data distribution != online data distribution
      • Data coverage drop or corruption -> silent failures
      [Chart: the declining metric, annotated with the data change and the neural network migration]

  27. #3 Good Infrastructure Speeds Up Iteration

  28. • Can multiple engineers work on the system simultaneously?
      • Are there automated training/deploy pipelines? Can they ship multiple experiments at once?
      • Are there effective offline analysis tools to help reduce the number of live experiments needed?

  29. #4 Measurement and Understanding are Crucial

  30. Offline performance != Online performance
      • Final bar is running on live traffic
      • Run experiments to learn

  31. Invest in tooling and experiments to understand the black box
      • Ablation experiments (see the sketch after this list)
      • Are sub-populations of users disproportionately impacted?
      • Analyses and tools to help us understand long-term, ecosystem effects
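A minimal sketch of an offline ablation: zero out one feature group at a time and measure the change in an offline metric. The feature-group boundaries, metric, and array layout are hypothetical:

```python
# Ablation: knock out feature groups and compare the offline metric.
import numpy as np
from sklearn.metrics import roc_auc_score

def ablation(model, X_test: np.ndarray, y_test, feature_groups: dict):
    # feature_groups: {"group name": [column indices]} -- hypothetical layout.
    base_auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    for name, cols in feature_groups.items():
        X_abl = X_test.copy()
        X_abl[:, cols] = 0.0  # knock out this group
        auc = roc_auc_score(y_test, model.predict_proba(X_abl)[:, 1])
        print(f"{name}: AUC {base_auc:.3f} -> {auc:.3f} (delta {auc - base_auc:+.3f})")
```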

  32. It’s easy to get what you wish for, but not what you want… (Goodhart’s Law)

  33. #5 Build a Sustainable Ecosystem

  34. Do we handle cold starts elegantly? Are we taking care of fresh content with fewer impressions?
      [Chart: ranking score vs. content age. Older content receives higher ranking scores; fresher content receives lower ones.]
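One illustrative way to counteract that pattern is to blend a decaying freshness bonus into the engagement score; the functional form, boost, and half-life below are assumptions for the sketch, not the talk's mechanism:

```python
# Freshness-aware adjustment to the engagement score.
import math

def adjusted_score(engagement_score: float, age_hours: float,
                   boost: float = 0.1, half_life_hours: float = 48.0) -> float:
    # Exponentially decaying bonus: fresh, under-impressed pins get a lift
    # so they can collect the engagement data the ranker needs.
    freshness_bonus = boost * math.exp(-age_hours * math.log(2) / half_life_hours)
    return engagement_score + freshness_bonus
```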

  35. Do we handle cold starts elegantly? Are we taking care of content with missing features (or features whose generation is delayed)?
      [Screenshot: streaky, offensive content that slipped through]

  36. Build a system with tight negative feedback, and make use of (explicit) negative signals as much as possible
      • Model / objective function: change the label, prediction target, or model architecture so that negative events are tied to the objective function we optimize (see the sketch after this list)
      • Features: add more features that help in predicting negative events
      • But keep spam/racy filtering separate from incorporating negative signals into ML models
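As one illustrative way to tie explicit negative signals to the objective (the talk does not prescribe a formulation): keep the binary label but up-weight examples where the user took an explicit negative action, so the model pays a higher price for ranking them highly. The weight value is an assumption:

```python
# Up-weight explicit negatives (hide, report) in the training objective.
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_with_negative_signals(X, positive, negative_action):
    # positive: 1 if any positive engagement; negative_action: 1 if the
    # user explicitly hid or reported the pin. Both are numpy arrays.
    y = positive.astype(int)
    weights = np.ones(len(y))
    weights[(y == 0) & (negative_action == 1)] = 5.0  # explicit negatives count more
    return LogisticRegression().fit(X, y, sample_weight=weights)
```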

  37. #6 Design an ML-Minded Product, and a Product-Minded ML System

  38. Do you really need ML?

  39. For complex problems like diversity and freshness, ML components need to work in concert. Beware of bottlenecks!

  40. Important to have a way to build policy and product vision into the ML system

  41. Independent surfaces for exploitation vs. exploration
      [Screenshot: two feed surfaces, one labeled Exploration and one labeled Exploitation]
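A toy sketch of splitting feed slots between exploitation and exploration, epsilon-greedy style; the 10% exploration rate and the helper names are hypothetical:

```python
# Epsilon-greedy allocation of feed slots between ranked (exploit) pins
# and a fresh exploration pool. Assumes len(ranked_pins) >= n.
import random

def assemble_feed(ranked_pins: list, fresh_pool: list, n: int,
                  epsilon: float = 0.1) -> list:
    feed, exploit_iter = [], iter(ranked_pins)
    for _ in range(n):
        if fresh_pool and random.random() < epsilon:
            # Exploration slot: surface a fresh, under-impressed pin.
            feed.append(fresh_pool.pop(random.randrange(len(fresh_pool))))
        else:
            # Exploitation slot: next pin by engagement score.
            feed.append(next(exploit_iter))
    return feed
```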

  42. Build a system for the users of tomorrow (or the users you really care about)
      [Diagram: global engagement vs. local engagement]

  43. Thank you
