Clipper: A Low-Latency Online Prediction Serving System
Dan Crankshaw
crankshaw@cs.berkeley.edu
http://clipper.ai
https://github.com/ucbrise/clipper
December 8, 2017
[Diagram: the machine learning lifecycle. Big Training Data feeds Training, which produces a Model; the Model backs Serving, where an Application sends a Query and receives a Decision]

Prediction-serving for interactive applications
Timescale: ~10s of milliseconds
Prediction-Serving Challenges
- Support low-latency, high-throughput serving workloads
- Large and growing ecosystem of ML frameworks (e.g., VW, Caffe)
Prediction-Serving Today
- Highly specialized systems for specific problems
- Offline scoring with existing frameworks and systems
Clipper aims to unify these approaches
New class of systems: prediction-serving systems
Clipper Decouples Applications and Models
[Diagram: Applications issue Predict calls to Clipper over a REST/RPC interface; Clipper dispatches RPCs to model containers (MC), each wrapping a framework such as Caffe]
Common Interface → Simplifies Deployment:
- Evaluate models using original code & systems
- Models run in separate processes as Docker containers
- Resource isolation: cutting-edge ML frameworks can be buggy
- Scale-out and deployment on Kubernetes
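A minimal sketch of the container abstraction, assuming a toy model and a plain HTTP endpoint in place of Clipper's internal RPC protocol; packaging this process in its own Docker image gives the isolation described above:

    # Illustrative stand-in for a model container: a model process exposing
    # a generic predict() interface over HTTP. The port, payload format, and
    # predict logic here are assumptions, not Clipper's actual protocol.
    import json
    from http.server import BaseHTTPRequestHandler, HTTPServer

    def predict(inputs):
        # Stand-in for framework-specific inference (e.g., a Caffe model).
        return [sum(x) for x in inputs]

    class PredictHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            body = self.rfile.read(int(self.headers["Content-Length"]))
            outputs = predict(json.loads(body)["inputs"])
            payload = json.dumps({"outputs": outputs}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(payload)

    if __name__ == "__main__":
        # Run inside its own Docker container for resource isolation.
        HTTPServer(("0.0.0.0", 7000), PredictHandler).serve_forever()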
Clipper Architecture
[Diagram: Applications call Predict on Clipper, which applies caching and latency-aware batching before issuing RPCs to model containers (MC), e.g. a Caffe container]
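Latency-aware batching amortizes RPC and framework overhead while keeping each batch within the application's latency objective. A minimal sketch of the idea, assuming a fixed SLO and a profiled per-batch latency estimate:

    # Illustrative latency-aware batching: dequeue as many queries as fit
    # within the latency budget, given an estimate of per-batch latency.
    # The SLO value and the latency model are assumptions for illustration.
    import queue

    SLO_SECONDS = 0.1                      # hypothetical 100 ms objective
    request_queue = queue.Queue()

    def estimated_latency(batch_size):
        # Stand-in for a profiled latency model of the container.
        return 0.005 + 0.002 * batch_size

    def max_batch_size():
        # Largest batch whose estimated latency stays within the SLO.
        size = 1
        while estimated_latency(size + 1) <= SLO_SECONDS:
            size += 1
        return size

    def batching_loop(send_batch):
        while True:
            batch = [request_queue.get()]  # block until a query arrives
            limit = max_batch_size()
            while len(batch) < limit and not request_queue.empty():
                batch.append(request_queue.get_nowait())
            send_batch(batch)              # one RPC to the model container

Clipper itself adapts the batch size online (an additive-increase, multiplicative-decrease scheme) rather than relying on a fixed latency model like the one above.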
Status of the project
- First released in May 2017 with a focus on usability
- Currently working toward the 0.3 release and actively working with early users
  - Focused on performance improvements and better monitoring and stability
- Supports native deployments on Kubernetes and a local Docker mode
- Goal: community-owned platform for model deployment and serving
  - Post issues and questions on GitHub and subscribe to our mailing list: clipper-dev@googlegroups.com
https://github.com/ucbrise/clipper
Getting Started with Clipper is Easy
- Docker images are available on Docker Hub
- The Clipper admin tool is distributed as a pip package: pip install clipper_admin
- Get up and running without cloning or compiling!
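A minimal getting-started sketch following the project's quickstart; the application name, input type, and SLO below are placeholder values, and API details may differ across releases:

    # Start a local Clipper cluster in Docker and register an application.
    from clipper_admin import ClipperConnection, DockerContainerManager

    clipper_conn = ClipperConnection(DockerContainerManager())
    clipper_conn.start_clipper()
    clipper_conn.register_application(
        name="hello-world",        # placeholder application name
        input_type="doubles",
        default_output="-1.0",     # returned if no model responds in time
        slo_micros=100000)         # 100 ms latency objective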
Clipper Connects Training and Serving

[Diagram: a Spark training cluster (Driver Program with SparkContext coordinating Worker Nodes, each running an Executor with Tasks) connected through a Clipper model container (MC) to a serving stack of web server, database, and cache]
Must extract model plus pre- and post-processing logic
Clipper provides a library of model deployers
- The deployer automatically and intelligently saves all prediction code
- Captures both framework-specific models and arbitrary serializable code
- Replicates the required subset of the training environment and loads the prediction code in a Clipper model container
Clipper provides a (growing) library of model deployers
- Python
  - Combine framework-specific models with external featurization, post-processing, and business logic
  - Currently supports scikit-learn, PySpark, and TensorFlow
  - PyTorch, Caffe2, and XGBoost coming soon
- Scala and Java with Spark: both MLlib and Pipelines APIs
- Arbitrary R functions
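For example, the Python deployer can capture an arbitrary prediction closure. A sketch based on the project's quickstart, reusing clipper_conn from the earlier sketch (the function name and values are placeholders, and the API may vary by release):

    from clipper_admin.deployers import python as python_deployer

    def feature_sum(xs):
        # Arbitrary Python prediction code: framework model calls,
        # featurization, and post-processing can all live in here.
        return [str(sum(x)) for x in xs]

    # Serializes the closure and its environment into a model container.
    python_deployer.deploy_python_closure(
        clipper_conn, name="sum-model", version=1,
        input_type="doubles", func=feature_sum)
    clipper_conn.link_model_to_app(
        app_name="hello-world", model_name="sum-model")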
Supporting Modular Multi-Model Pipelines
- Ensembles can improve accuracy
- Faster inference with prediction cascades: query a fast model first; if it is confident, return; otherwise fall back to a slow but accurate model (see the sketch after this list)
- Faster development through model reuse (e.g., a pre-trained DNN feeding a task-specific model)
- Model specialization [Diagram: an object detector routes to a face detector when an object is detected; otherwise it returns directly]
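A minimal sketch of a prediction cascade, assuming hypothetical fast_model and slow_model functions that each return a (label, confidence) pair:

    # Illustrative prediction cascade: answer with the fast model when it
    # is confident, fall back to the slow, accurate model otherwise.
    # The 0.9 threshold is an assumption for illustration.
    def cascade_predict(x, fast_model, slow_model, threshold=0.9):
        label, confidence = fast_model(x)
        if confidence >= threshold:
            return label           # cheap path: most queries stop here
        return slow_model(x)[0]    # expensive path: only uncertain queries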
How to efficiently support serving arbitrary model pipelines?
Challenges of Serving Model Pipelines
- Complex tradeoff space of latency, throughput, and monetary cost
  - Many serving workloads are interactive and highly latency-sensitive
  - Performance and cost depend on the model, workload, and physical resources available
- Model composition leads to a combinatorial explosion in the size of the tradeoff space
- Developers must make decisions about how to configure individual models while reasoning about end-to-end pipeline performance
Solution: Workload-Aware Optimizer
- Exploit the structure and properties of inference computation:
  - Immutable state
  - Query-level parallelism
  - Compute-intensive
- Pipeline definition: intermingle arbitrary application code and Clipper-hosted model evaluation for maximum flexibility
- Optimizer input: the pipeline, a sample workload, and performance or cost constraints
- Optimizer output: an optimal pipeline configuration that meets the constraints
- Deployed models use Clipper as the physical execution engine for serving
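Since the optimizer is ongoing work with no released API, the following is a purely hypothetical sketch of the kind of pipeline definition it would consume; every name in it is invented for illustration:

    # Hypothetical pipeline: arbitrary application code interleaved with
    # Clipper-hosted model calls (object_detector, face_detector).
    def pipeline(query, object_detector, face_detector):
        objects = object_detector(query)   # Clipper-hosted model call
        if "face" in objects:              # arbitrary application logic
            return face_detector(query)    # second Clipper-hosted call
        return objects

    # Optimizer input: this pipeline, a sample workload, and a constraint
    # such as "p99 latency <= 150 ms"; output: a per-model configuration
    # (replication, batch size, hardware) that meets the constraint.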
Conclusion
- Challenges of serving increasingly complex models trained in a variety of frameworks while meeting strict performance demands
- Clipper adopts a container-based architecture and employs prediction caching and latency-aware batching
- Clipper's model deployer library makes it easy to deploy both framework-specific models and arbitrary processing code
- Ongoing efforts on a workload-aware optimizer to optimize the deployment of complex, multi-model pipelines