  1. DSC 102 Systems for Scalable Analytics
     Arun Kumar
     Topic 7: ML Deployment
     Not included for Final Exam
     Slide Content ACKs: Alkis Polyzotis, Manasi Vartak

  2. The Lifecycle of ML-based Analytics
     [Lifecycle diagram: Data acquisition → Data preparation → Feature Engineering → Model Selection → Training & Inference → Serving → Monitoring]

  3. Deployment Stage of Data Science
     ❖ Data science does not exist in a vacuum. It must interplay with the data-generating process and the prediction application.
     ❖ Deploy Stage: Integrate the trained prediction function(s) with the production environment, e.g., offline inference in a data system, online inference on a Web platform / IoT / etc.
     ❖ Typically, the data scientist must work with “DevOps” engineers or “MLOps” engineers to achieve this.

  4. ML in Academia vs. Production
     What your classes on statistics, ML, AI, etc. cover! ☺
     https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf

  5. Deployment Stage of Data Science
     ❖ The deployment stage typically involves 5 main activities in sync with the other stages:
     1. Packaging and Orchestration
     2. Prediction Serving
     3. Data Validation
     4. Prediction Monitoring
     5. Versioning

  6. 1. Packaging and Orchestration
     ❖ Basic Goal: Bundle up the software to deploy, along with its dependencies, into a lightweight standalone executable that can run almost seamlessly across different OSs and hardware environments.
     ❖ Most common approach today: containerization.
     ❖ Not specific to ML deployment but highly general.
     ❖ An older-generation approach, virtual machines, included the OS too and was bulky and slow.
     ❖ Docker and Kubernetes are the most popular options today.
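To make the containerization target concrete, below is a minimal sketch (not from the slides) of the kind of standalone prediction service one would typically package into a container image; the model file "model.pkl", the /predict route, and the port are hypothetical placeholders, and the model is assumed to expose a scikit-learn-style predict() method.

# predict_service.py: a minimal prediction service one might containerize.
# Hypothetical sketch: the model file, route, and port are placeholders.
import pickle

from flask import Flask, jsonify, request

app = Flask(__name__)

# Load the trained prediction function once at container startup.
with open("model.pkl", "rb") as f:
    model = pickle.load(f)

@app.route("/predict", methods=["POST"])
def predict():
    # Expect a JSON body like {"features": [[...], [...]]}.
    features = request.get_json()["features"]
    preds = model.predict(features)  # assumes a scikit-learn-style model
    return jsonify({"predictions": preds.tolist()})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)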

  7. 1. Packaging and Orchestration
     [Figure: Kubernetes vs. Docker] https://medium.com/edureka/kubernetes-vs-docker-45231abeeaf1

  8. 1. Packaging and Orchestration
     ❖ Often, one might need to deploy end-to-end pipelines composed of effectively independent containerized software modules.
     ❖ Workflow orchestration tools help handle complex pipelines.
     ❖ Can specify time constraints, operation constraints, etc.
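As an illustration of what a workflow orchestration tool looks like in practice, here is a minimal sketch using Apache Airflow (one popular tool, not named on the slide); the DAG name and task functions are hypothetical placeholders standing in for containerized pipeline steps.

# A minimal Apache Airflow DAG sketch for an end-to-end ML pipeline.
# The DAG name and task bodies are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def validate_data():  # placeholder for a containerized validation step
    pass

def run_inference():  # placeholder for a containerized scoring step
    pass

with DAG(
    dag_id="nightly_batch_scoring",
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",  # a time constraint on the whole pipeline
    catchup=False,
) as dag:
    validate = PythonOperator(task_id="validate_data", python_callable=validate_data)
    score = PythonOperator(task_id="run_inference", python_callable=run_inference)
    validate >> score  # an operation constraint: validate before scoring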

  9. 1. Packaging and Orchestration
     ❖ Cloud providers are also starting to make it easier to package and deploy prediction software, e.g., Model Endpoint in AWS SageMaker.
     ❖ Data scientists must look out for their organization’s tools and services.
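For concreteness, a deployed SageMaker model endpoint can be invoked from Python via boto3's sagemaker-runtime client, as in the sketch below; the endpoint name and payload format are hypothetical placeholders.

# Invoking a deployed SageMaker model endpoint via boto3.
# The endpoint name and JSON payload format are hypothetical.
import json

import boto3

runtime = boto3.client("sagemaker-runtime")

response = runtime.invoke_endpoint(
    EndpointName="churn-model-endpoint",  # hypothetical endpoint name
    ContentType="application/json",
    Body=json.dumps({"features": [5.1, 3.5, 1.4, 0.2]}),
)
predictions = json.loads(response["Body"].read())
print(predictions)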

  10. 2. Prediction Serving
     ❖ Basic Goal: Make ML inference fast and potentially co-optimize it with the serving environment/infrastructure.
     ❖ These are typically automated tools, so data scientists only need to know what systems are available and how to use them.
     ❖ 3 main kinds of systems:
     ❖ Program optimization of the prediction function to improve hardware utilization, e.g., ONNX Runtime or Apache TVM
     ❖ Batch optimization of many concurrent prediction requests to better balance latency and throughput and improve hardware utilization, e.g., AWS SageMaker
     ❖ New hardware optimized for inference, e.g., TPUs
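As a concrete example of the first kind of system, here is a minimal sketch of running inference through ONNX Runtime, which applies graph-level program optimizations to an exported model; the model file, its input shape, and the batch contents are hypothetical placeholders.

# Running inference on an exported model with ONNX Runtime.
# "model.onnx" and its 4-feature input shape are hypothetical.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("model.onnx")

# Feed a batch of requests at once: batching amortizes per-call overhead
# and improves hardware utilization at some cost in per-request latency.
input_name = session.get_inputs()[0].name
batch = np.random.rand(32, 4).astype(np.float32)  # 32 concurrent requests
outputs = session.run(None, {input_name: batch})
print(outputs[0].shape)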

  11. 3. Data Validation
     ❖ Basic Goal: Ensure the data fed into the prediction function conforms to its expectations on, say, schema/syntax/shape, integrity constraints (e.g., value ranges or domains), etc.
     ❖ Needs to be in lock step with the data sourcing stage: acquiring, re-organizing, cleaning, and feature extraction.
     ❖ Industry is starting to build platforms to make this process more rigorous and reusable, e.g., TensorFlow Extended.
     ❖ Data scientists must learn their organization’s data validation practices and tools/APIs.
     ❖ Also covered in Alkis’s guest lecture; further reading: https://mlsys.org/Conferences/2019/doc/2019/167.pdf
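To make the kinds of checks concrete, below is a hand-rolled sketch in pandas of schema and integrity-constraint validation; platforms like TensorFlow Extended automate and generalize this, and the column names, dtypes, and value ranges here are hypothetical.

# A hand-rolled sketch of the schema and integrity checks that data
# validation platforms automate. All columns and ranges are hypothetical.
import pandas as pd

EXPECTED_SCHEMA = {"age": "int64", "income": "float64", "state": "object"}
VALUE_RANGES = {"age": (0, 120), "income": (0.0, float("inf"))}

def validate(df: pd.DataFrame) -> list[str]:
    errors = []
    # Schema/shape checks: required columns with expected dtypes.
    for col, dtype in EXPECTED_SCHEMA.items():
        if col not in df.columns:
            errors.append(f"missing column: {col}")
        elif str(df[col].dtype) != dtype:
            errors.append(f"{col}: expected {dtype}, got {df[col].dtype}")
    # Integrity constraints: value ranges.
    for col, (lo, hi) in VALUE_RANGES.items():
        if col in df.columns and not df[col].between(lo, hi).all():
            errors.append(f"{col}: values outside [{lo}, {hi}]")
    return errors

df = pd.DataFrame({"age": [34, 150], "income": [52000.0, -1.0], "state": ["CA", "NY"]})
print(validate(df))  # flags the out-of-range age and income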

  12. 4. Prediction Monitoring
     ❖ Basic Goal: Ensure the prediction functions are working as the data scientist intended; “silent failures” can happen due to concept drift, i.e., the data distribution has deviated significantly from when the prediction function was built!
     ❖ Example: A sudden world event changes Web user behavior drastically, e.g., WHO declares a pandemic! ☺
     ❖ Needs to be in lock step with the model building stage.
     ❖ Industry today uses ad hoc statistical approaches.
     ❖ Data scientists must look out for their organization’s monitoring practices, since it affects the lifecycle loop frequency.
     ❖ Also covered in Alkis’s guest lecture; further reading: https://mlsys.org/Conferences/2019/doc/2019/167.pdf
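As one example of such an ad hoc statistical approach, the sketch below compares a feature's serving-time distribution against its training-time distribution with a two-sample Kolmogorov-Smirnov test; the significance threshold and the synthetic data are hypothetical choices for illustration.

# An ad hoc statistical drift check: compare a feature's serving-time
# distribution to its training-time distribution with a two-sample
# Kolmogorov-Smirnov test. Threshold and data are hypothetical.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
train_feature = rng.normal(loc=0.0, scale=1.0, size=10_000)
serving_feature = rng.normal(loc=0.5, scale=1.0, size=2_000)  # drifted!

stat, p_value = ks_2samp(train_feature, serving_feature)
if p_value < 0.01:  # hypothetical significance threshold
    print(f"possible drift detected (KS stat={stat:.3f}, p={p_value:.2e})")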

  13. 5. Versioning
     ❖ Basic Goal: Just like regular code, prediction software must be versioned and tracked, for teams to ensure consistency across time and employees, for auditing’s sake, for the ability to “roll back” to a safer state, etc.
     ❖ But unlike regular code, prediction software has 3 more dependencies beyond just code: datasets (train/val/test), configuration (e.g., hyper-parameters), and environment (hardware/software, since that can affect accuracy too).
     ❖ Research and industry are barely starting to figure this out.
     ❖ Data scientists must look out for versioning best practices/tools.
     ❖ Covered in Manasi’s guest lecture; further reading: https://blog.verta.ai/blog/how-to-move-fast-in-ai-without-breaking-things
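To illustrate tracking all four dependencies, here is a minimal sketch (not from the slides) that records code, dataset, configuration, and environment versions in a JSON manifest; the file names and hyper-parameter values are hypothetical placeholders, and hashing is just one simple way to fingerprint artifacts.

# A sketch of recording all four dependencies of prediction software
# (code, datasets, configuration, environment) as a version manifest.
# File names and hyper-parameters are hypothetical placeholders.
import hashlib
import json
import platform
import sys

def sha256_of(path: str) -> str:
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

manifest = {
    "code": sha256_of("train.py"),                      # code version
    "datasets": {split: sha256_of(f"{split}.csv")       # data versions
                 for split in ("train", "val", "test")},
    "config": {"learning_rate": 0.01, "max_depth": 6},  # hyper-parameters
    "environment": {                                    # hw/sw environment
        "python": sys.version,
        "platform": platform.platform(),
    },
}

with open("model_manifest.json", "w") as f:
    json.dump(manifest, f, indent=2)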
