CS 744: PYWREN, Shivaram Venkataraman, Fall 2020



slide-1
SLIDE 1

CS 744: PYWREN

Shivaram Venkataraman Fall 2020

Hello!

slide-2
SLIDE 2

ADMINISTRIVIA

Project checkins due Nov 20th (Friday)
In-class project presentations Dec 8th and Dec 10th: 5 min talks about your project

Project grade breakdown
  • Intro: 5%
  • Mid-semester checkin: 5%
  • Presentation: 10%
  • Final Report: 10%

Tonight: deadline for submitting regrade requests for Midterm I

Canvas soon!
slide-3
SLIDE 3

NEW HARDWARE MODELS

  • Big Data systems: scheduling, storage
  • Big Data analysis / computation engines
  • New hardware
  • Implications, society
slide-4
SLIDE 4

  • Serverless Computing
  • Compute Accelerators
  • Infiniband Networks
  • Non-Volatile Memory

slide-5
SLIDE 5

SERVERLESS COMPUTING

No servers??
slide-6
SLIDE 6

MOTIVATION: USABILITY

What instance type? What base image? How many to spin up? What price? Spot?

Data Scientist: analysis

  • Same questions on Azure, Google, etc.
  • Makes it difficult to use the cloud
slide-7
SLIDE 7


slide-8
SLIDE 8

ABSTRACTION LEVEL ?

Application: Logistic Regression
Compute Framework: Spark
Hardware: Amazon EC2, CloudLab, Private Cluster, …

Application
Compute Framework

  • Snowflake: SQL query
  • Spark: RDD on a subset of machines
  • Server / VM?
slide-9
SLIDE 9

STATELESS DATA PROCESSING

  • Intermediate state in Spark / MR is kept on local disk
  • Local storage is ephemeral, so intermediate state needs to be remote (S3, Redis)!
slide-10
SLIDE 10

“Serverless” computing

Provided by the cloud provider
  • Submit a function (lambda) to be executed

Limits
  • Time bound: 300 → 900 seconds, single-core
  • Storage: 512 MB in /tmp
  • Memory: 3 GB RAM
  • Python, Java, node.js

Cloud database

slide-11
SLIDE 11

PYWREN API

  • Run with: python test.py
  • Language integrated!!
  • Automatically captures dependencies and ships them to the cloud ⇒ can use libraries [cloudpickle ~ 2010]
  • map function similar to PySpark
  • result() blocks, similar to get in the Ray API
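The map-then-block pattern can be tried locally. The sketch below is not PyWren itself: it assumes a standard-library thread pool as a stand-in for the serverless executor, and `add_one` is a hypothetical example function. It only illustrates how a map produces one future per input and how `result()` blocks until that task is done.

```python
from concurrent.futures import ThreadPoolExecutor


def add_one(x):
    # Stateless function: the output depends only on the input.
    return x + 1


# Assumption: a local thread pool stands in for the serverless executor.
with ThreadPoolExecutor(max_workers=4) as pool:
    # One future per input element, like a map over the data.
    futures = [pool.submit(add_one, x) for x in range(10)]
    # result() blocks until the corresponding task has finished.
    results = [f.result() for f in futures]

print(results)
```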
slide-12
SLIDE 12

PYWREN: how it works

your laptop | the cloud

future = runner.map(fn, data)
future.result()

  • Distributed key-value store with get / put (Amazon S3)
  • Invoke: Lambda containers do a get to fetch the function and data
  • Results are fetched back to your laptop
slide-13
SLIDE 13

how it works

your laptop

future = runner.map(fn, data)
  • Serialize func and data
  • Put on S3
  • Invoke Lambda

the cloud
  • Pull job from S3
  • Download Anaconda runtime
  • Python to run code
  • Pickle result
  • Stick in S3

your laptop

future.result()
  • Poll S3
  • Unpickle and return result
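The flow above can be traced end to end in a single process. This is a minimal sketch under loud assumptions: `fake_s3` is an in-memory dict standing in for S3, `lambda_handler` is called directly instead of being invoked through AWS Lambda, and the function is shipped by value with `marshal` as a rough stand-in for cloudpickle (standard `pickle` only stores a reference to a function). All names are hypothetical.

```python
import builtins
import marshal
import pickle
import time
import types

fake_s3 = {}  # assumption: in-memory dict standing in for S3


def lambda_handler(job_key):
    """Cloud side: pull the job, run the code, pickle the result, store it."""
    code_bytes, arg = pickle.loads(fake_s3[job_key])
    # Rebuild the function with fresh globals: only builtins are available,
    # so the shipped function must be self-contained.
    func = types.FunctionType(marshal.loads(code_bytes), {"__builtins__": builtins})
    fake_s3[job_key + "/result"] = pickle.dumps(func(arg))


class Future:
    def __init__(self, job_key):
        self.job_key = job_key

    def result(self):
        key = self.job_key + "/result"
        while key not in fake_s3:  # poll "S3" until the result appears
            time.sleep(0.01)
        return pickle.loads(fake_s3[key])  # unpickle and return result


def run_map(func, data):
    """Laptop side: serialize func and data, put on "S3", invoke the handler."""
    futures = []
    for i, item in enumerate(data):
        job_key = "job-%d" % i
        fake_s3[job_key] = pickle.dumps((marshal.dumps(func.__code__), item))
        lambda_handler(job_key)  # synchronous here; a real invoke is asynchronous
        futures.append(Future(job_key))
    return futures


def square(x):
    return x * x


futures = run_map(square, [1, 2, 3])
results = [f.result() for f in futures]
print(results)
```

Because the handler only talks to the store, the function itself stays stateless: everything it needs arrives through a get, and everything it produces leaves through a put.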

slide-14
SLIDE 14

STATELESS FUNCTIONS: WHY NOW ?

What are the trade-offs ?

  • Need more network I/O: all the data is read over the network!
  • But network BW is pretty good: comparable to local SSD BW!
  • Bottleneck could be S3?
slide-15
SLIDE 15

MAP and REDUCE ?

Input Data → Shuffle → Output Data

  • Sort benchmark: same as MapReduce paper
  • Shuffle phase in MR is now being done using Redis
  • Mappers bucket keys; each bucket holds key-value pairs
  • Shuffle writes many small files: store them in memory, since small files are not good for a blob store like S3
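A shuffle through a key-value store can be sketched in a few lines. The assumptions here are a plain dict standing in for Redis, integer keys, and a hand-picked range boundary (`BOUNDARIES`); none of these come from the slides. Each mapper writes one small object per reducer, which is why the number of objects grows as mappers × reducers.

```python
import bisect
from collections import defaultdict

kv_store = {}     # assumption: dict standing in for Redis
BOUNDARIES = [3]  # assumption: keys < 3 go to reducer 0, the rest to reducer 1
NUM_REDUCERS = len(BOUNDARIES) + 1


def map_task(map_id, records):
    # Bucket each (key, value) pair by the reducer that owns its key range.
    buckets = defaultdict(list)
    for key, value in records:
        buckets[bisect.bisect_right(BOUNDARIES, key)].append((key, value))
    # One small object per (mapper, reducer) pair.
    for reduce_id, pairs in buckets.items():
        kv_store[(map_id, reduce_id)] = pairs


def reduce_task(reduce_id, num_mappers):
    # Fetch this reducer's bucket from every mapper, then sort locally.
    pairs = []
    for map_id in range(num_mappers):
        pairs.extend(kv_store.get((map_id, reduce_id), []))
    return sorted(pairs)


inputs = [[(5, "e"), (2, "b")], [(4, "d"), (1, "a"), (3, "c")]]
for map_id, records in enumerate(inputs):
    map_task(map_id, records)

outputs = [reduce_task(r, len(inputs)) for r in range(NUM_REDUCERS)]
print(outputs)
```

With range partitioning, concatenating the reducer outputs in order yields a globally sorted result.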
slide-16
SLIDE 16

PARAMETER SERVERS

  • Use lambdas to run “workers”
  • Parameter server as a service? Workers get and update parameters
  • Workers read stored input and compute the ML model
  • Sparse models (e.g. ad click prediction)
  • Parameter server runs on Redis or VMs etc.

How do you profile or measure function requirements?
  • Run function locally, use profiler?

Fault tolerance
  • Checkpoint (before time limit) and resume [Recent work!]
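Checkpoint-and-resume under a time limit can be illustrated with a toy job. This is a sketch, not any system's actual mechanism: `checkpoint_store` is a dict standing in for remote storage, the job is a running sum, and `FakeClock` is a hypothetical deterministic clock so the example behaves the same on every run.

```python
checkpoint_store = {}  # assumption: dict standing in for remote storage


class FakeClock:
    """Deterministic clock: each call advances time by one unit."""

    def __init__(self):
        self.now = 0

    def __call__(self):
        self.now += 1
        return self.now


def run_invocation(job_id, numbers, time_limit, clock):
    """One invocation: resume, work until the limit, checkpoint, or finish."""
    start = clock()
    state = checkpoint_store.get(job_id, {"index": 0, "total": 0})
    index, total = state["index"], state["total"]
    while index < len(numbers):
        if clock() - start >= time_limit:
            # Out of time: save progress remotely and ask to be re-invoked.
            checkpoint_store[job_id] = {"index": index, "total": total}
            return None
        total += numbers[index]
        index += 1
    checkpoint_store.pop(job_id, None)  # finished: drop the checkpoint
    return total


clock = FakeClock()
invocations = 0
result = None
while result is None:
    result = run_invocation("job-1", list(range(10)), time_limit=4, clock=clock)
    invocations += 1

print(result, invocations)
```

A real function would check the remaining time reported by the platform and leave a safety margin for writing the checkpoint.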
slide-17
SLIDE 17

WHEN Should we use SERVERLESS ?

Yes!
  • Use when we need elasticity
  • Use when you don't need fine-grained communication across workers

Maybe not?
  • Do not use serverless when you need local state (actors)
  • Iterative workloads might need state from the previous iteration
  • Not all lambdas might be active at the same time!
slide-18
SLIDE 18

SUMMARY

Motivation: Usability of big data analytics

Approach: Language-integrated cloud computing

Features

  • Breakdown computation into stateless functions
  • Schedule on serverless containers
  • Use external storage for state management

Open questions on scheduling, overheads

slide-19
SLIDE 19

DISCUSSION

https://forms.gle/PAMDKmwHepmPWDrBA

slide-20
SLIDE 20

  • Scale compute and storage independently
  • Increasing workers by K → 5x improvement (?)
  • Hard to know how to choose the number of partitions
  • Compute is very short compared to I/O
  • More shards reduce the time to read / write to Redis
slide-21
SLIDE 21

Consider you are a cloud provider (e.g., AWS) implementing support for serverless. What could be some of the new challenges in scheduling these workloads? How would you go about addressing them?

  • Mapping lambda functions to machines: how do we do this?
  • Locality? Does one lambda talk to some Redis shard? Can we infer it?
  • When to schedule a new container / when do we reuse?
  • Need to find the optimal configuration? Use ML?
  • Resource requirements are fixed! 900 s, 1 core, up to 3 GB
slide-22
SLIDE 22

OPEN QUESTIONS

  • Scalable scheduling: Low latency with large number of functions ?
  • Debugging: Correlate events across functions ?
  • Launch overheads: Fraction of time spent in setup (OpenLambda)
  • Resource limits: 15 minute AWS Lambda (Oct 2018)


slide-23
SLIDE 23

  • Cold start: app side vs. scheduler side
  • Containers will be warm for 5 mins ⇒ reused if you ran one within 5 mins
  • Azure policy paper
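The keep-warm note can be turned into a small simulation. This is a sketch of a fixed keep-alive policy only: the 5-minute window is taken from the note, while `ContainerPool` and its bookkeeping are hypothetical and do not describe how any provider actually schedules containers.

```python
KEEP_WARM_S = 5 * 60  # fixed keep-alive window (5 mins, from the note)


class ContainerPool:
    """Tracks when each function last ran to decide warm vs. cold start."""

    def __init__(self):
        self.last_used = {}  # function name -> time of last invocation (seconds)

    def invoke(self, func_name, now):
        last = self.last_used.get(func_name)
        # Warm only if the same function ran within the keep-alive window.
        warm = last is not None and now - last <= KEEP_WARM_S
        self.last_used[func_name] = now
        return "warm" if warm else "cold"


pool = ContainerPool()
# Invocations of the same function at t = 0 s, 60 s, 400 s, 500 s.
starts = [pool.invoke("f", t) for t in (0, 60, 400, 500)]
print(starts)
```

A longer window trades idle memory on the provider side for fewer cold starts on the application side.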