CS 744: PYWREN Shivaram Venkataraman Fall 2019 ADMINISTRIVIA - - PowerPoint PPT Presentation



slide-1
SLIDE 1

CS 744: PYWREN

Shivaram Venkataraman Fall 2019

slide-2
SLIDE 2

ADMINISTRIVIA

Happy Thanksgiving!?

slide-3
SLIDE 3

NEW HARDWARE MODELS

slide-4
SLIDE 4

  • Serverless Computing
  • Compute Accelerators
  • Infiniband Networks
  • Non-Volatile Memory

slide-5
SLIDE 5

SERVERLESS COMPUTING

slide-6
SLIDE 6

MOTIVATION: USABILITY

  • What instance type?
  • What base image?
  • How many to spin up?
  • What price? Spot?

slide-7
SLIDE 7
slide-8
SLIDE 8

ABSTRACTION LEVEL ?

Stack 1:

  • Application: Logistic Regression
  • Compute Framework: Spark
  • Hardware: Amazon EC2, CloudLab, Private Cluster, …

Stack 2 (no hardware layer shown):

  • Application
  • Compute Framework

slide-9
SLIDE 9

STATELESS DATA PROCESSING

slide-10
SLIDE 10

“Serverless” computing

  • 900 seconds, single-core (previously 300 seconds)
  • 512 MB in /tmp
  • 3 GB RAM
  • Python, Java, Node.js

slide-11
SLIDE 11

PYWREN API

slide-12
SLIDE 12

PYWREN: how it works

your laptop the cloud

future = runner.map(fn, data)
future.result()

slide-13
SLIDE 13

how it works

your laptop:

future = runner.map(fn, data)

  • Serialize func and data
  • Put on S3
  • Invoke Lambda

the cloud:

  • Pull job from S3
  • Download Anaconda runtime
  • Run the code with Python
  • Pickle the result
  • Store it in S3

your laptop:

future.result()

  • Poll S3
  • Unpickle and return result
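The flow on this slide can be sketched locally. This is a toy illustration, not PyWren's implementation: `FAKE_S3`, `toy_map`, `worker`, and `Future` are hypothetical names, a dictionary stands in for S3, a direct function call stands in for the Lambda invocation, and `marshal` of the function's bytecode stands in for a real function serializer (which would also handle closures and module dependencies).

```python
import builtins
import marshal
import pickle
import types

# Hypothetical in-memory stand-in for an S3 bucket: key -> bytes.
FAKE_S3 = {}


def serialize_function(fn):
    # Ship the function's bytecode. Assumption: fn is a simple function
    # with no closure and no references to module-level globals.
    return marshal.dumps(fn.__code__)


def deserialize_function(blob):
    code = marshal.loads(blob)
    return types.FunctionType(code, {"__builtins__": builtins})


def worker(job_key):
    """Cloud side: pull job, run code, pickle result, store it."""
    func_blob, data_blob = pickle.loads(FAKE_S3[job_key])
    fn = deserialize_function(func_blob)
    result = fn(pickle.loads(data_blob))
    FAKE_S3[job_key + "/result"] = pickle.dumps(result)


class Future:
    """Laptop side: handle for one pending result."""

    def __init__(self, job_key):
        self.job_key = job_key

    def done(self):
        return self.job_key + "/result" in FAKE_S3

    def result(self):
        # A real client would poll storage in a loop; here the worker
        # has already run synchronously, so one check is enough.
        if not self.done():
            raise RuntimeError("result not available yet")
        return pickle.loads(FAKE_S3[self.job_key + "/result"])


def toy_map(fn, data):
    """Laptop side: serialize func and data, store them, invoke workers."""
    func_blob = serialize_function(fn)
    futures = []
    for i, item in enumerate(data):
        job_key = f"job/{i}"
        FAKE_S3[job_key] = pickle.dumps((func_blob, pickle.dumps(item)))
        worker(job_key)  # stand-in for an asynchronous Lambda invocation
        futures.append(Future(job_key))
    return futures


def square(x):
    return x * x


futures = toy_map(square, [1, 2, 3])
results = [f.result() for f in futures]
```

Note that the laptop and the worker never talk to each other directly; all code, inputs, and outputs pass through the storage layer.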

slide-14
SLIDE 14

STATELESS FUNCTIONS: WHY NOW ?

What are the trade-offs ?

slide-15
SLIDE 15

MAP and REDUCE ?

Input Data Output Data
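One way to picture map and reduce on stateless functions is to route the shuffle through external storage. The sketch below is an assumption-laden toy, not the design from the lecture: `STORE` is a dictionary standing in for object storage, and `map_task`, `reduce_task`, and `word_count` are hypothetical names. Each task reads its input from the store or its arguments and writes its output back, keeping no state between calls.

```python
import pickle
import zlib
from collections import defaultdict

# Hypothetical stand-in for external object storage: key -> bytes.
STORE = {}


def partition(word, num_reducers):
    # Deterministic hash so every mapper sends a word to the same reducer.
    return zlib.crc32(word.encode()) % num_reducers


def map_task(map_id, text, num_reducers):
    """Count words in one chunk and write one partial file per reducer."""
    buckets = [defaultdict(int) for _ in range(num_reducers)]
    for word in text.split():
        buckets[partition(word, num_reducers)][word] += 1
    for r, bucket in enumerate(buckets):
        STORE[f"shuffle/map{map_id}/reduce{r}"] = pickle.dumps(dict(bucket))


def reduce_task(reduce_id, num_mappers):
    """Merge the partial counts that every mapper wrote for this reducer."""
    totals = defaultdict(int)
    for m in range(num_mappers):
        partial = pickle.loads(STORE[f"shuffle/map{m}/reduce{reduce_id}"])
        for word, count in partial.items():
            totals[word] += count
    STORE[f"output/reduce{reduce_id}"] = pickle.dumps(dict(totals))


def word_count(chunks, num_reducers=2):
    """Driver: run all map tasks, then all reduce tasks, then collect."""
    for m, chunk in enumerate(chunks):
        map_task(m, chunk, num_reducers)
    for r in range(num_reducers):
        reduce_task(r, len(chunks))
    merged = {}
    for r in range(num_reducers):
        merged.update(pickle.loads(STORE[f"output/reduce{r}"]))
    return merged


counts = word_count(["a b a", "b c"])
```

With M mappers and R reducers this writes M × R intermediate objects, which is one reason the throughput and request cost of the storage layer matter for shuffle-heavy jobs.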

slide-16
SLIDE 16

PARAMETER SERVERS

  • Use lambdas to run “workers”
  • Parameter server as a service?
  • Workers issue get and update requests to the Parameter Server
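The get/update pattern can be sketched as follows. This is a minimal single-process illustration under stated assumptions: `ParameterServer`, `worker`, and `train` are hypothetical names, the model is a single weight `w` fitted to `y = w * x` by gradient descent, and workers run one after another rather than as concurrent lambdas.

```python
class ParameterServer:
    """Holds the shared model state; workers themselves stay stateless."""

    def __init__(self, w=0.0):
        self.w = w

    def get(self):
        return self.w

    def update(self, grad, lr):
        self.w -= lr * grad


def worker(server, shard, lr):
    """One stateless invocation: get parameters, compute, send an update."""
    w = server.get()
    # Gradient of 0.5 * sum((w * x - y) ** 2) over this worker's shard.
    grad = sum((w * x - y) * x for x, y in shard)
    server.update(grad, lr)


def train(shards, rounds=50, lr=0.02):
    server = ParameterServer()
    for _ in range(rounds):
        for shard in shards:
            worker(server, shard, lr)
    return server.get()


# Data generated from y = 3x, split across two workers.
shards = [[(1.0, 3.0), (2.0, 6.0)], [(3.0, 9.0), (4.0, 12.0)]]
w_final = train(shards)
```

Because each worker call starts from `server.get()` and ends with `server.update(...)`, any invocation can be retried or run on a fresh container; the cost is a round trip to the server per step.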

slide-17
SLIDE 17

WHEN Should we use SERVERLESS ?

Yes!
Maybe not?

slide-18
SLIDE 18

SUMMARY

Motivation: Usability of big data analytics
Approach: Language-integrated cloud computing
Features:

  • Break down computation into stateless functions
  • Schedule on serverless containers
  • Use external storage for state management

Open questions on scheduling and overheads

slide-19
SLIDE 19

DISCUSSION

https://forms.gle/Y9AFUpvVBA7LpKqh7

slide-20
SLIDE 20
slide-21
SLIDE 21
slide-22
SLIDE 22

Consider you are a cloud provider (e.g., AWS) implementing support for serverless. What could be some of the new challenges in scheduling these workloads? How would you go about addressing them?

slide-23
SLIDE 23
slide-24
SLIDE 24

OPEN QUESTIONS

  • Scalable scheduling: low latency with a large number of functions?
  • Debugging: how to correlate events across functions?
  • Launch overheads: fraction of time spent in setup (OpenLambda)
  • Resource limits: 15-minute execution limit on AWS Lambda (as of Oct 2018)
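The interaction between launch overheads and the execution limit can be made concrete with a little arithmetic. The sketch below assumes a job that divides evenly into independent pieces and a fixed setup cost per invocation; `plan_invocations` and the numbers used (one hour of compute, 20 seconds of setup) are hypothetical and not measurements from the lecture.

```python
import math


def plan_invocations(total_compute_s, time_limit_s, setup_s):
    """Return (number of invocations, fraction of total time spent in setup).

    Assumes each invocation pays setup_s before doing useful work and must
    finish within time_limit_s.
    """
    usable_s = time_limit_s - setup_s
    if usable_s <= 0:
        raise ValueError("setup time consumes the entire time limit")
    n = math.ceil(total_compute_s / usable_s)
    total_setup_s = n * setup_s
    overhead_fraction = total_setup_s / (total_compute_s + total_setup_s)
    return n, overhead_fraction


# One hour of compute under a 900-second limit with 20 seconds of setup.
n, overhead = plan_invocations(3600, 900, 20)
```

Here each invocation has 880 usable seconds, so the job needs 5 invocations and spends 100 of 3700 total seconds (about 2.7%) in setup. Shorter functions or longer setup push that fraction up quickly.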