Painless machine learning in production
H. Chase Stevens, Principal Data Science Engineer (PowerPoint presentation transcript)



SLIDE 1

Painless machine learning in production

  • H. Chase Stevens

Principal Data Science Engineer, Boston, MA chase@chasestevens.com @hchasestevens

Europython 2020

SLIDES 2–8

“Painless machine learning in production” (title quote, repeated across a sequence of build slides)

SLIDE 9
  • H. Chase Stevens

Principal Data Science Engineer, Boston, MA chase@chasestevens.com @hchasestevens

Lessons from industry regarding pain reduction and data scientist empowerment in the productionization of machine learning models

SLIDE 10

SLIDE 11
  • Motivation
  • Developer experience
  • Our stack
  • Lessons learned

Contents

SLIDES 12–16

Motivation

  • I. Ops is intrinsic to ML
  • II. MLOps is unsustainable
  • III. Data scientists want to do data science

SLIDES 17–21

  • I. Ops is intrinsic to ML

Orchestration

Sanders, H., & Saxe, J. (2017). Garbage in, garbage out: how purportedly great ML models can be screwed up by bad data.

SLIDES 22–24

  • II. MLOps is unsustainable (in 1970)

"You couldn't even delete a mistake"
"I had to wait hours for my programs to turn around"
"One of our finals was to design, code, punch, debug a solution - we got 4 days to do it which means finding typos, logic errors, and design errors and eliminating them all with only 4 re-runs"
"I submitted my program to the punch card crew, and got it back several days later with a rather strong note"
"Only a select few programmers were allowed in the computer lab."

SLIDES 25–30

  • II. MLOps is unsustainable (in 2000)

Code → QA → Release (?)

SLIDES 31–41

  • II. MLOps is unsustainable (today)

“Here’s the model”
“This data isn’t available yet”
“Try this instead”
“Wrong version of numpy”
“That should be corrected”
“This null value isn’t handled”
“Try again?”
“The graphs aren’t displaying”
“OK, delete that part”
“This takes too long in prod”
“... Ready to try version two?”

SLIDE 42

Developer experience

$ cookiecutter git@github.com:teikametrics/sagemaker-framework.git
github_username [my-github-username]: hchasestevens
project_name [my-sagemaker-model]: europython-example-model
project_slug [europython_example_model]:
model_name [europython-example-model]:
description [An ML model living on the SageMaker platform.]: An example model for Europython 2020.
Select model_validation_metric:
1 - sklearn.metrics.mean_squared_error
2 - sklearn.metrics.r2_score
3 - sklearn.metrics.accuracy_score
4 - sklearn.metrics.log_loss
5 - sklearn.metrics.f1_score
6 - sagemaker_framework.utils.metrics.mean_absolute_percentage_error
Choose from 1, 2, 3, 4, 5, 6 (1, 2, 3, 4, 5, 6) [1]: 1
Select promotion_criterion:
1 - sagemaker_framework.utils.promotion.maximize
2 - sagemaker_framework.utils.promotion.minimize
3 - sagemaker_framework.utils.promotion.maximize_with_tol
4 - sagemaker_framework.utils.promotion.minimize_with_tol
5 - sagemaker_framework.utils.promotion.manual
6 - sagemaker_framework.utils.promotion.always_promote
Choose from 1, 2, 3, 4, 5, 6 (1, 2, 3, 4, 5, 6) [1]: 6
preprocessing_cpus [1]:
preprocessing_memory_in_gb [4]: 8
test_proportion [0.2]: 0.1
training_cpus [1]:
training_memory_in_gb [4]:
training_volume_size_in_gb [2]:
max_training_runtime_in_minutes [30]: 60
min_serving_instances [1]:
max_serving_instances [10]: 1
serving_cpus [1]:
serving_memory_in_gb [4]: 4
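The promotion_criterion choices above (maximize, minimize, the _with_tol variants, manual, always_promote) suggest that each criterion is a rule for deciding whether a freshly trained model should replace the currently serving one, based on the validation metric. A hypothetical sketch of what a tolerance-based criterion like maximize_with_tol might look like; the signature and semantics here are assumptions, not the framework's actual code:

```python
def maximize_with_tol(tol: float = 0.02):
    """Hypothetical promotion criterion: promote the candidate model if its
    validation metric is at most `tol` (proportionally) worse than the
    current champion's. Not the framework's real implementation."""
    def criterion(champion_metric: float, candidate_metric: float) -> bool:
        # Promote on clear wins, and tolerate small regressions so that
        # retraining on fresher data isn't blocked by metric noise.
        return candidate_metric >= champion_metric * (1 - tol)
    return criterion

promote = maximize_with_tol(tol=0.02)
promote(0.80, 0.81)  # clear improvement: promote
promote(0.80, 0.79)  # within the 2% tolerance: promote
promote(0.80, 0.70)  # large regression: keep the champion
```

Under this reading, always_promote (the option selected in the transcript) would be the degenerate criterion that accepts every candidate unconditionally.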

SLIDE 43

Developer experience

$ tree -a europython-example-model/
europython-example-model/
├── .bellybutton.yml
├── bin
│   ├── build-docker-image
│   └── deploy.sh
├── .circleci
│   └── config.yml
├── docker-compose.yml
├── Dockerfile
├── europython_example_model
│   ├── config.py
│   ├── __init__.py
│   └── model.py
├── .github
│   ├── CODEOWNERS
│   └── PULL_REQUEST_TEMPLATE.md
├── .gitignore
├── README.md
├── requirements.txt
├── sagemaker-config.yml
├── setup.py
└── tests
    ├── test_config.py
    ├── test_model.py
    └── test-model.txt

5 directories, 19 files
SLIDE 45

Developer experience

def preprocess_data(seed=None) -> PreprocessingResult:
    """Preprocess data for training."""
    fetch_adgroup_performances_query = """
        SELECT
            ad_group_id,
            SUM(lkr.conversions_7d_attr) AS conversions,
            SUM(lkr.sales_7d_attr) AS sales
        FROM main.transforms.latest_keyword_reports lkr
        WHERE lkr.conversions_7d_attr > 0
            AND lkr.sales_7d_attr > 0
            AND lkr.keyword_report_local_date >= current_date() - 30
        GROUP BY ad_group_id
    """
    return PreprocessingResult(
        training={
            'performances.msgpack': adgroup_performances[
                ~adgroup_performances.test
            ].apply(pd.to_numeric).to_msgpack(),
        }.items(),
        validation=(),
        testing=test_cases,
    )

SLIDE 46

Developer experience

def train_model(training_path: Path, validation_path: Path) -> Artifacts:
    training_dfs = load_zipped_data(
        training_path,
        fnames=MSGPACK_FNAMES,
        deserializer=pd.read_msgpack,
    )
    all_adgroup_prices = training_dfs['prices.msgpack']
    performances = training_dfs['performances.msgpack']
    results = {
        marketplace_id: train_marketplace_model(
            marketplace_id=marketplace_id,
            market_adgroup_prices=market_df,
            performances=performances,
        )._asdict()
        for marketplace_id, market_df in all_adgroup_prices.groupby('marketplace_id')
    }
    return Artifacts({MODEL_FNAME: json.dumps(results).encode('utf-8')})

SLIDE 47

Developer experience

def load_model(path: Path) -> Model:
    with (path / MODEL_FNAME).open('r', encoding='utf-8') as f:
        parameters = {k: Parameters(**v) for k, v in json.load(f).items()}

    def model(configuration, instances) -> List[Optional[float]]:
        return [
            estimate_sales_per_conversion(...)
            for price, conversions, sales in instances
        ]

    return model
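Taken together, the three entry points (preprocess_data, train_model, load_model) define the contract a model repository exposes to the framework: produce data, turn data into serialized artifacts, turn artifacts back into a serving callable. A minimal self-contained sketch of that lifecycle, with plain dicts standing in for the framework's PreprocessingResult and Artifacts types:

```python
from typing import Callable, Dict, List

def preprocess_data() -> Dict[str, List[float]]:
    # Stand-in for PreprocessingResult: fetch and prepare training data.
    return {"train": [1.0, 2.0, 3.0, 4.0]}

def train_model(data: Dict[str, List[float]]) -> Dict[str, float]:
    # Stand-in for Artifacts: fit a trivial "model" (here, just the mean).
    xs = data["train"]
    return {"mean": sum(xs) / len(xs)}

def load_model(artifacts: Dict[str, float]) -> Callable[[List[float]], List[float]]:
    # Rebuild a serving callable from the artifacts, mirroring the real
    # load_model, which returns a model(configuration, instances) closure.
    mean = artifacts["mean"]
    def model(instances: List[float]) -> List[float]:
        return [x - mean for x in instances]
    return model

# The framework presumably chains the stages roughly like this:
model = load_model(train_model(preprocess_data()))
print(model([2.5, 5.0]))  # [0.0, 2.5]
```

Keeping the stages as pure functions over serialized inputs and outputs is what lets the framework run each of them in separate SageMaker jobs.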

SLIDE 48

Developer experience

request_schema: !jsonschema {
  type: 'object',
  properties: {
    configuration: {
      type: 'object',
      properties: {marketplaceId: {type: 'string'}},
    },
    instances: {
      type: 'array',
      items: {
        type: 'array',
        items: [
          {type: 'number', description: "Price", exclusiveMinimum: 0},
          {type: 'number', description: "Conversions", exclusiveMinimum: 0},
          {type: 'number', description: "Sales", exclusiveMinimum: 0},
        ],
      },
    },
    requesterId: {type: 'string'},
  },
  required: ['instances', 'configuration', 'requesterId'],
}
response_schema: !jsonschema {
  type: 'array',
  items: {type: 'number'},
  description: "Estimated sales per conversion, in order corresponding to request order",
}
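At request time, a schema like the one above can be enforced with the jsonschema package. A sketch using a trimmed-down copy of the request schema; the !jsonschema YAML tag is presumably resolved by the framework into a plain dict like this one, and the surrounding server code is assumed:

```python
import jsonschema

# Trimmed copy of the request schema above, as a plain Python dict.
request_schema = {
    "type": "object",
    "properties": {
        "configuration": {
            "type": "object",
            "properties": {"marketplaceId": {"type": "string"}},
        },
        "instances": {
            "type": "array",
            "items": {
                "type": "array",
                "items": [
                    {"type": "number", "description": "Price", "exclusiveMinimum": 0},
                    {"type": "number", "description": "Conversions", "exclusiveMinimum": 0},
                    {"type": "number", "description": "Sales", "exclusiveMinimum": 0},
                ],
            },
        },
        "requesterId": {"type": "string"},
    },
    "required": ["instances", "configuration", "requesterId"],
}

request = {
    "configuration": {"marketplaceId": "US"},
    "instances": [[9.99, 3, 120.0]],
    "requesterId": "europython-demo",
}

# Draft 7 supports both numeric exclusiveMinimum and array-form items.
jsonschema.Draft7Validator(request_schema).validate(request)  # raises on bad input
```

Rejecting malformed requests at the boundary is what makes the later "This null value isn't handled" conversation unnecessary.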

SLIDE 49

Developer experience

  • Test suite
  • Linting (pylint, mypy, bellybutton)
  • Dockerization
  • CI/CD
  • Airflow DAG generation
  • Training orchestration
  • Automated model evaluation and promotion
  • Gradual rollout
SLIDE 50

Developer experience

  • Test suite
  • Linting (pylint, mypy, bellybutton)
  • Dockerization
  • CI/CD
  • Airflow DAG generation
  • Training orchestration
  • Automated model evaluation and promotion
  • Gradual rollout
  • Automated rollback
  • Monitoring
  • Alerting
  • Diagnostics
  • Autoscaling
  • Schema validation
  • Data capture
  • Healthchecks
  • Cost monitoring
SLIDE 51

  • III. Data scientists want to do data science
SLIDE 52

Developer experience

SLIDE 53

Our stack

SLIDE 54

Our stack

AWS SageMaker: model training, hosting; provenance info
Airflow (Astronomer.io): model lifecycle orchestration
Docker: model packaging
Cookiecutter: model repo templating
Jsonschema: schema definition; property-based testing
Flask, gunicorn: model server
DBT: scalable data processing (in-warehouse)
Slack: notifications, diagnostics
Pylint, mypy, bellybutton: linting
Pytest, hypothesis, hypothesis-jsonschema: test suite

SLIDE 55

Our stack

SLIDES 56–57

Lessons learned

SLIDES 58–59

Lessons learned

Gonzalez, G. (2016). Worst practices should be hard. http://www.haskellforall.com/2016/04/worst-practices-should-be-hard.html

[Bar charts] "Best Practices" (whatever that means): X = 7, Y = 5 arbitrary productivity units. "Worst Practices" (whatever that means): X = 9, Y = 3 arbitrary productivity units.

SLIDES 60–63

Lessons learned

SLIDE 64

Lessons learned

instances: !instance
  instance_count: 1
  cpu: <0.25 vCPUs>
  memory: <0.5 GB>
  volume_size: <2 GB>

SLIDE 65

Lessons learned

Airflow:

  • Hosting our own stack
  • Deployment interruptions
  • Not all contributions created equal
SLIDE 66

Questions?

  • H. Chase Stevens

Principal Data Science Engineer, Boston, MA
chase@chasestevens.com @hchasestevens
https://www.teikametrics.com/company.html#careers

Europython 2020