Designing an ML-Minded Product and a Product-Minded ML System
ACM Webinar, January 23, 2019
Grace Huang
Personalized Homefeed
Personalized Homefeed Personalization: Scoring and ranking Picking the best of the best among candidates
A ranking model
• Supervised learning with labels:
◦ 1 = some positive engagements
◦ 0 = no engagement or negative actions
• Learns to predict a positive engagement (ranking) score
• Pins are then sorted by engagement score = f(pin, user, …)
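As a rough sketch of the scoring-and-sorting step above (the features, weights, and pin IDs here are made up for illustration, not Pinterest's actual model), scoring candidates with a learned function and sorting by score might look like:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def engagement_score(pin_features, user_features, weights):
    """Score = f(pin, user): predicted probability of a positive engagement."""
    x = np.concatenate([pin_features, user_features])
    return sigmoid(weights @ x)

def rank_pins(candidate_pins, user_features, weights):
    """Sort candidate pins by descending engagement score."""
    scored = [(pin_id, engagement_score(feats, user_features, weights))
              for pin_id, feats in candidate_pins]
    return sorted(scored, key=lambda p: p[1], reverse=True)

# Toy example: two candidate pins, one user, hypothetical learned weights.
pins = [("pin_a", np.array([1.0, 0.0])), ("pin_b", np.array([0.0, 1.0]))]
user = np.array([0.5, 0.5])
w = np.array([2.0, -1.0, 0.3, 0.3])  # pin-feature weights, then user-feature weights
ranking = rank_pins(pins, user, w)
```

In production the weights would come from supervised training on the 0/1 engagement labels described above; here they are hard-coded only to make the sketch runnable.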
Data collection → Feature engineering → Model training → Prediction (store and serve)
Components of a production ML system
Data → Training → Serving → Evaluation → Launch
[Diagram: the data pipeline produces training/test data, data to make predictions, and data for offline evaluations; training produces the production model; serving makes predictions; evaluation spans online experiments and offline evaluations; successful models launch]
We will focus on data, evaluation and shipping
Considerations for a data pipeline
Engagement score = f(pin, user, …)
- User profile
- User's past actions: engagement signals
- Derived user profiles from past actions
- Pin and derived pin information
[Screenshot: example pin "The perfect path to cold brew" from Caffeinated Inc., saved to a user's "Cravings" board]
Considerations for a data pipeline
- Logging (and changes)
- Aggregations (ETLs)
- ETL management libraries
- Data validation
- Monitoring and alerts for the pipeline
Training data should be carefully managed
- Sampling scheme
- Version control
- Monitoring feature distribution changes
- Feature extraction and transformations
- Feature value validation
- Shared feature store or individual pipelines
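One common way to monitor feature distribution changes (my illustration of the bullet above, not necessarily the tooling Pinterest uses) is the population stability index (PSI), which compares a current feature distribution against a training-time reference:

```python
import numpy as np

def population_stability_index(expected, actual, bins=10):
    """PSI between a reference (training) and a current feature distribution.
    Rule of thumb (an assumption, not from the talk): PSI > 0.2 signals a
    major shift worth investigating. Values of `actual` outside the
    reference's range fall out of the histogram; fine for a sketch."""
    edges = np.histogram_bin_edges(expected, bins=bins)
    e_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    a_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    # Floor the bin fractions to avoid log(0).
    e_pct = np.clip(e_pct, 1e-6, None)
    a_pct = np.clip(a_pct, 1e-6, None)
    return float(np.sum((a_pct - e_pct) * np.log(a_pct / e_pct)))

rng = np.random.default_rng(0)
train_feature = rng.normal(0.0, 1.0, 10_000)
same_dist = rng.normal(0.0, 1.0, 10_000)   # healthy: same distribution
shifted = rng.normal(1.0, 1.0, 10_000)     # drifted: mean moved by 1 sigma
psi_ok = population_stability_index(train_feature, same_dist)
psi_bad = population_stability_index(train_feature, shifted)
```

Running this check per feature on a schedule, and alerting when PSI crosses a threshold, is one concrete way to implement "monitoring feature distribution changes."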
Training and serving data discrepancy (skew)?
- Is training data sampled differently from serving data?
- Is there a lag before certain features are populated (e.g. features that take a long time to compute)?
- Logging changes?
- ETL breaks?
- Seasonality
- Market differences
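A cheap guard against the feature-lag flavor of skew listed above is a coverage check: compare, per feature, how often it is populated in training data versus in logged serving data. This is a sketch (the feature names and the 5% threshold are made up):

```python
def feature_coverage(rows, feature):
    """Fraction of rows where a feature is present and non-null."""
    if not rows:
        return 0.0
    return sum(1 for r in rows if r.get(feature) is not None) / len(rows)

def skew_report(train_rows, serving_rows, features, max_gap=0.05):
    """Flag features whose coverage differs materially between training
    and serving -- e.g. a feature computed by a slow batch job that is
    present at training time but missing at serving time."""
    flagged = {}
    for f in features:
        t = feature_coverage(train_rows, f)
        s = feature_coverage(serving_rows, f)
        if abs(t - s) > max_gap:
            flagged[f] = {"train": t, "serving": s}
    return flagged

# Toy example: "fresh_score" lags at serving time; "topic" is always present.
train = [{"fresh_score": 0.7, "topic": "food"} for _ in range(100)]
serve = [{"fresh_score": (0.7 if i < 40 else None), "topic": "food"}
         for i in range(100)]
report = skew_report(train, serve, ["fresh_score", "topic"])
```

A report like this can feed the pipeline's monitoring and alerting, so skew is caught before it silently degrades the model.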
How to evaluate a candidate model
- Your favorite offline performance measures
- Human evaluation
- Custom tools (e.g. side-by-side comparisons, simulated debuggers for sanity checks, funnels, etc.)
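A typical "favorite offline performance measure" for a binary engagement model is ROC AUC. As a self-contained sketch (a from-scratch version; in practice you would likely use a library implementation):

```python
def auc(labels, scores):
    """ROC AUC via the rank-sum (Mann-Whitney) formulation.
    Sketch only: ties in scores are not handled specially."""
    ranked = sorted(zip(scores, labels))
    rank_sum, pos = 0.0, 0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            rank_sum += rank  # sum of ranks of the positive examples
            pos += 1
    neg = len(labels) - pos
    return (rank_sum - pos * (pos + 1) / 2) / (pos * neg)

# Perfect separation scores 1.0; a partial ordering scores in between.
perfect = auc([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9])
partial = auc([0, 1, 0, 1], [0.1, 0.2, 0.3, 0.4])
```

AUC rewards ranking positives above negatives, which matches what a ranking model is asked to do, though (as the next slide notes) offline metrics alone never settle a launch decision.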
How to evaluate a candidate model
- Goal metrics
- Leading indicators
- Debug metrics
- Guardrail metrics
- Custom tools
- Metrics vs. loss function
Shipping criteria should include…
- Metrics
- Infrastructure cost
- Maintenance overhead (regularization!)
- Product vision
- Cannibalization
- Speed vs. iteration
Once shipped, continue to monitor
- Continuous monitoring:
  - Goal metrics on dashboards
  - Alerts for data and prediction distribution drifts
- Runbook, tools and delegation for investigations
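A minimal version of the "alerts for prediction distribution drifts" bullet is a check of live prediction scores against a baseline snapshot (the threshold and numbers here are illustrative assumptions):

```python
import statistics

def prediction_drift_alert(baseline_scores, live_scores, max_mean_shift=0.05):
    """Fire an alert when the mean predicted score drifts away from a
    baseline snapshot -- a cheap first-line proxy for distribution drift.
    A real system would also compare full distributions, not just means."""
    shift = abs(statistics.mean(live_scores) - statistics.mean(baseline_scores))
    return shift > max_mean_shift

baseline = [0.30, 0.32, 0.28, 0.31]   # snapshot taken at launch
healthy = [0.29, 0.33, 0.30, 0.30]    # live scores, no drift
drifted = [0.10, 0.12, 0.09, 0.11]    # live scores after an upstream break
```

When such an alert fires, the runbook mentioned above tells the on-call engineer where to start investigating (recent logging changes, ETL breaks, feature coverage drops).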
Automation is key
Lessons learned
#1 Beware of Data and System Bias
#2 Testing & Monitoring… (Do it!)
#3 Good Infrastructure Speeds Up Iteration
#4 Measurement and Understanding are Crucial
#5 Build a Sustainable Ecosystem
#6 Design an ML-Minded Product, and a Product-Minded ML System
#1 Beware of Data and System Bias
Engagement data complements pin information
Engagement data is a double-edged sword!
Remove bias and effects of the existing system as much as possible (so the rich don't get richer)
#2 Testing & Monitoring …..(Do it!)
GBDT migration to neural network
[Chart: an important metric degraded for weeks after the migration — not good!]
What went wrong:
- Offline data distribution != online data distribution
- Data coverage drops or corruption -> silent failures
- Data changes
#3 Good Infrastructure Speeds Up Iteration
Can multiple engineers work on the system simultaneously?
• Are there automated training/deploy pipelines? Can they ship multiple experiments at once?
• Are there effective offline analysis tools to help reduce the number of live experiments needed?
#4 Measurement and Understanding are Crucial
Offline performance != Online performance
• The final bar is performance on live traffic
• Run experiments to learn
Invest in tooling and experiments to understand the black box
• Ablation experiments
• Are sub-populations of users disproportionately impacted?
• Analyses and tools to help us understand long-term, ecosystem effects
It's easy to get what you wish for, but not what you want… (Goodhart's Law)
#5 Build a Sustainable Ecosystem
Do we handle cold starts elegantly?
Are we taking care of fresh content with fewer impressions?
[Chart: fresher content receives lower ranking scores; older content receives higher scores]
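One way to keep fresh, low-impression content in the running (an illustrative sketch in the spirit of upper-confidence-bound exploration, not a description of Pinterest's system) is to add a bonus that shrinks as impressions accumulate:

```python
import math

def adjusted_score(base_score, impressions, bonus_weight=0.1):
    """Add an uncertainty bonus that decays with impressions, so fresh
    pins with few impressions still get a chance to be shown and to
    collect the engagement data the ranker needs."""
    return base_score + bonus_weight / math.sqrt(impressions + 1)

# A fresh pin with a lower base score can outrank a seasoned one.
fresh = adjusted_score(0.50, impressions=0)
seasoned = adjusted_score(0.55, impressions=9999)
```

The `bonus_weight` knob trades off how aggressively the system explores fresh content against short-term engagement.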
Do we handle cold starts elegantly?
Are we taking care of content with missing features (or features whose generation is delayed)?
Streaky, offensive content!
Build a system with tight negative feedback, and make use of (explicit) negative signals as much as possible
• Model / objective function: change the label / prediction target / model architecture so that negative events are tied to the objective function we optimize
• Features: add more features that help in predicting negative events
But separate spam/racy filtering from negative-signal incorporation in ML models
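One simple way to tie negative events to the objective function, as the first bullet suggests (a sketch of the general technique, with made-up numbers and weights), is a per-example weighted log loss where explicit negative actions are up-weighted:

```python
import math

def weighted_log_loss(labels, preds, weights):
    """Binary log loss where each example carries a weight. Explicit
    negative actions (e.g. a 'hide') can be up-weighted so the model is
    penalized more for recommending content users actively reject."""
    total = 0.0
    for y, p, w in zip(labels, preds, weights):
        p = min(max(p, 1e-9), 1.0 - 1e-9)  # clamp to avoid log(0)
        total += -w * (y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / sum(weights)

# The third example is a badly mispredicted explicit negative (label 0,
# predicted 0.9). Up-weighting it raises the average loss sharply.
labels = [1, 0, 0]
preds = [0.8, 0.3, 0.9]
uniform = weighted_log_loss(labels, preds, [1.0, 1.0, 1.0])
upweighted = weighted_log_loss(labels, preds, [1.0, 1.0, 5.0])
```

Changing the label definition itself (e.g. treating a "hide" as a hard 0, or as a separate prediction head) is the heavier-weight alternative the slide also mentions.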
#6 Design an ML-Minded Product, and a Product-Minded ML System
Do you really need ML?
For complex problems like diversity and freshness, ML components need to work in concert. Beware of bottlenecks!
Important to have a way to build policy and product vision into the ML system
Independent surfaces for exploitation vs. exploration
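The simplest traffic split between the two surfaces (an epsilon-greedy-style sketch; the 5% fraction is an assumption for illustration) just routes a small, fixed share of requests to the exploration surface:

```python
import random

def choose_surface(rng, epsilon=0.05):
    """Route a small fraction of traffic to an exploration surface;
    the rest goes to the fully engagement-optimized (exploitation)
    surface. Exploration traffic generates the unbiased engagement
    data that keeps the main ranker from only reinforcing itself."""
    return "exploration" if rng.random() < epsilon else "exploitation"

rng = random.Random(42)  # seeded for a reproducible sketch
counts = {"exploration": 0, "exploitation": 0}
for _ in range(10_000):
    counts[choose_surface(rng)] += 1
```

Keeping the surfaces independent, as the slide suggests, means exploration results can be evaluated and tuned without destabilizing the main exploitation experience.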
Build a system for the users of tomorrow (or users you really care about)
[Chart: global engagement vs. local engagement]
Thank you