Putting Deep Learning Models in Production
Sahil Dua (@sahildua2305)
Let’s imagine!
But ...
whoami
➔ Software Developer @ Booking.com
➔ Previously: Deep Learning Infrastructure
➔ Open Source Contributor (Git, Pandas, Kinto, go-github, etc.)
➔ Tech Speaker
Agenda
➔ Deep Learning at Booking.com
➔ Life-cycle of a model
➔ Training Models
➔ Serving Predictions
Deep Learning at Booking.com
Scale highlights:
1.4 million+ active properties in 220+ countries
1,500,000+ room nights booked every 24 hours
Deep Learning
➔ Image understanding
➔ Translations
➔ Ads bidding
➔ ...
Image Tagging
Sea view: 6.38
Balcony/Terrace: 4.82
Photo of the whole room: 4.21
Bed: 3.47
Decorative details: 3.15
Seating area: 2.70
Image Tagging
Using the image tag information in the right context: Swimming pool, Breakfast Buffet, etc.
Lifecycle of a model
[Cycle diagram: Data Analysis → Train → Deploy → back to Data Analysis]
Training a Model - on laptop
Machine Learning workload
➔ Computationally intensive workload
➔ Often not highly parallelizable algorithms
➔ 10 to 100 GB of data
Why Kubernetes (k8s)?
➔ Isolation
➔ Elasticity
➔ Flexibility
Why k8s – GPUs?
➔ GPU support in alpha since Kubernetes 1.3
➔ 20x-50x speed-up

resources:
  limits:
    alpha.kubernetes.io/nvidia-gpu: 1
Training with k8s
➔ Base images with ML frameworks
  ◆ TensorFlow, Torch, Vowpal Wabbit, etc.
➔ Training code is installed at start time
➔ Data access: Hadoop (or Persistent Volumes)
Startup
[Diagram: the code (start.sh, train.py, evaluate.py) is installed into the training pod at startup]
Startup
[Diagram: training data is made available to the training pod via a Persistent Volume (PV)]
Streaming logs back
[Diagram: the training pod streams logs back while training runs]
Exports the model
[Diagram: when training completes, the pod exports the trained model, e.g. to Hadoop (see the sketch below)]
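To make this concrete, here is a minimal sketch of what a train.py along these lines could look like. Everything specific in it is an assumption for illustration: the /data mount (the PV), the /output export path, and the tiny Keras model stand in for whatever the real job uses.

# train.py - illustrative sketch of a training job running inside a pod.
# Assumptions: pre-processed data is mounted at /data via a Persistent
# Volume, and the model is exported to /output for upload to Hadoop.
import numpy as np
import tensorflow as tf

DATA_DIR = "/data"      # PV mount point (assumption)
EXPORT_DIR = "/output"  # pickup location for the exported model (assumption)

# Load training data from the mounted volume.
features = np.load(DATA_DIR + "/features.npy")
labels = np.load(DATA_DIR + "/labels.npy")

# A deliberately tiny model; the real architecture depends on the task.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu",
                          input_shape=(features.shape[1],)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])

# Training progress goes to stdout, which is streamed back as pod logs.
model.fit(features, labels, epochs=10, batch_size=256)

# Export the trained model so the serving side can pick it up.
model.save(EXPORT_DIR + "/model.h5")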
Serving predictions
Serving Predictions
[Diagram: a client sends input features to the model and receives a prediction]
Serving Predictions
[Diagram: many models in production (Model 1 … Model X), each receiving input features from clients and returning predictions]
Serving Predictions
➔ Stateless app with common code
➔ Containerized
➔ No model in the image
➔ REST API for predictions (see the sketch below)
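As an illustration of "stateless app, no model in the image, REST API", a minimal prediction service could look like the sketch below. Flask, the /predict route, and the model path are assumptions, not details of the actual service.

# Illustrative sketch of a stateless prediction app (Flask and all paths
# are assumptions). The model is NOT baked into the container image; it is
# fetched at startup (e.g. from Hadoop) and loaded from a local path.
import numpy as np
import tensorflow as tf
from flask import Flask, jsonify, request

app = Flask(__name__)
model = tf.keras.models.load_model("/models/current/model.h5")

@app.route("/predict", methods=["POST"])
def predict():
    # Input features arrive as JSON: {"instances": [[f1, f2, ...], ...]}
    instances = np.array(request.get_json()["instances"])
    predictions = model.predict(instances).tolist()
    return jsonify({"predictions": predictions})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)

A client then POSTs {"instances": [[...], ...]} to /predict and receives {"predictions": [...]} back.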
Serving Predictions
[Diagram: the serving app loads the model; clients send input features and receive predictions]
Serving Predictions
➔ Get the trained model from Hadoop
➔ Load the model in memory
➔ Warm it up
➔ Expose the HTTP API
➔ Respond to the probes (see the sketch below)
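The startup sequence above could be sketched roughly as follows; the hdfs dfs -get command, file paths, feature count, and /healthz route are all illustrative assumptions.

# Illustrative sketch of the serving app's startup sequence.
import subprocess
import numpy as np
import tensorflow as tf
from flask import Flask

app = Flask(__name__)
ready = False

# 1. Get the trained model from Hadoop onto local disk.
subprocess.run(
    ["hdfs", "dfs", "-get", "/models/model-x/latest/model.h5", "/tmp/model.h5"],
    check=True,
)

# 2. Load the model in memory.
model = tf.keras.models.load_model("/tmp/model.h5")

# 3. Warm it up: the first prediction pays one-off initialization costs,
#    so run a dummy one before taking real traffic (feature count assumed).
model.predict(np.zeros((1, 10)))
ready = True  # 4. the HTTP API (e.g. a /predict route) can now serve

# 5. Respond to the Kubernetes liveness/readiness probes.
@app.route("/healthz")
def healthz():
    return ("ok", 200) if ready else ("warming up", 503)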
Serving Predictions
[Diagram: multiple clients send input features to the deployed prediction service and receive predictions]
Deploying a new model
➔ Create a new Deployment
➔ Create a new HTTP route
➔ Wait for the liveness/readiness probes (see the sketch below)
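For illustration, creating the Deployment could be scripted with the official Kubernetes Python client roughly as below. The names, image, namespace, replica count, and probe path are assumptions, and creating the HTTP route is omitted.

# Sketch: creating a Deployment for a new model version.
from kubernetes import client, config

config.load_incluster_config()  # or load_kube_config() outside the cluster

labels = {"app": "prediction-model-x", "version": "v2"}  # assumptions
container = client.V1Container(
    name="prediction-service",
    image="registry.example.com/prediction-service:latest",  # assumption
    ports=[client.V1ContainerPort(container_port=8080)],
    readiness_probe=client.V1Probe(
        http_get=client.V1HTTPGetAction(path="/healthz", port=8080),
        initial_delay_seconds=10,
    ),
)
deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="prediction-model-x-v2", labels=labels),
    spec=client.V1DeploymentSpec(
        replicas=2,
        selector=client.V1LabelSelector(match_labels=labels),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels=labels),
            spec=client.V1PodSpec(containers=[container]),
        ),
    ),
)
client.AppsV1Api().create_namespaced_deployment(
    namespace="default", body=deployment
)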
Performance
PredictionTime = RequestOverhead + N × ComputationTime

where N is the number of instances to predict on.
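A worked example with assumed numbers: if RequestOverhead = 5 ms and ComputationTime = 2 ms, predicting 100 instances in one request takes 5 + 100 × 2 = 205 ms, while 100 single-instance requests cost 100 × (5 + 2) = 700 ms in total. The overhead term is why single-instance latency and batched throughput pull in different directions, as the next two slides show.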
Optimizing for Latency
➔ Do not predict if you can precompute
➔ Reduce request overhead
➔ Predict for one instance
➔ Quantization (float 32 => fixed 8)
➔ TensorFlow-specific: freeze the network & optimize for inference (sketch below)
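For the TensorFlow-specific step, a sketch using the TF 1.x tooling of that era is below; the checkpoint path and the input/prediction node names are assumptions.

# Sketch: freeze a TF 1.x graph and optimize it for inference.
import tensorflow as tf
from tensorflow.python.tools import optimize_for_inference_lib

with tf.Session() as sess:
    saver = tf.train.import_meta_graph("model.ckpt.meta")  # assumption
    saver.restore(sess, "model.ckpt")

    # Freeze: fold variables into constants so the graph is self-contained.
    frozen = tf.graph_util.convert_variables_to_constants(
        sess, sess.graph_def, output_node_names=["predictions"]
    )

# Optimize for inference: strip training-only ops and fold what it can.
optimized = optimize_for_inference_lib.optimize_for_inference(
    frozen,
    input_node_names=["input"],          # assumption
    output_node_names=["predictions"],   # assumption
    placeholder_type_enum=tf.float32.as_datatype_enum,
)

with tf.gfile.GFile("optimized_model.pb", "wb") as f:
    f.write(optimized.SerializeToString())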
Optimizing for Throughput
➔ Do not predict if you can precompute
➔ Batch requests (sketch below)
➔ Parallelize requests
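One way to batch requests on the server side is micro-batching: queue incoming requests and score them together, so the request overhead is paid once per batch instead of once per instance. A sketch, where the batch size, wait window, and model object are assumptions:

# Illustrative sketch of server-side micro-batching.
import queue
import threading
import time
import numpy as np

request_queue = queue.Queue()
MAX_BATCH = 32     # assumption: tune for the model and hardware
MAX_WAIT_S = 0.01  # assumption: how long to wait to fill a batch

def batching_worker(model):
    while True:
        # Block for the first request, then drain more until the batch
        # is full or the wait window closes.
        batch = [request_queue.get()]
        deadline = time.time() + MAX_WAIT_S
        while len(batch) < MAX_BATCH:
            timeout = deadline - time.time()
            if timeout <= 0:
                break
            try:
                batch.append(request_queue.get(timeout=timeout))
            except queue.Empty:
                break
        features = np.array([f for f, _ in batch])
        predictions = model.predict(features)  # one model call per batch
        for (_, reply), pred in zip(batch, predictions):
            reply.put(pred)  # hand each caller its own prediction

def predict_one(features):
    # Called per incoming request: enqueue and wait for the batched result.
    reply = queue.Queue(maxsize=1)
    request_queue.put((np.asarray(features), reply))
    return reply.get()

# The worker runs in the background, e.g.:
# threading.Thread(target=batching_worker, args=(model,), daemon=True).start()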
Summary
➔ Training models in pods
➔ Serving models
➔ Optimizing serving for latency/throughput
Next steps
➔ Tooling to control hundreds of deployments
➔ Autoscale the prediction service
➔ Hyperparameter tuning for training
Want to get in touch?
LinkedIn / Twitter / GitHub: @sahildua2305
Website: www.sahildua.com