CS 744: CLIPPER
Shivaram Venkataraman Fall 2020
Good morning!
ADMINISTRIVIA
Course Project Proposals
- Due on Friday!
- See Piazza for template
- Submission instructions soon
Midterm details
- Covers material up to the ML section
MACHINE LEARNING: INFERENCE
GOALS
- Low latency: 99th / 99.9th percentile latency
- How many users, how many requests need to be made
- ML specific: handle as many models / frameworks as possible
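To make the tail-latency goal concrete, here is a minimal sketch of a nearest-rank percentile over a list of request latencies. The function name `percentile` and the sample latencies are hypothetical, chosen only for illustration.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of samples <= it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # round() strips floating-point noise before taking the ceiling.
    rank = max(1, math.ceil(round(p * len(ordered) / 100.0, 9)))
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds: mostly fast, a few slow outliers.
latencies_ms = [10] * 990 + [50] * 9 + [400]

p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
p999 = percentile(latencies_ms, 99.9)
```

In this made-up sample the median and the 99th percentile look identical, and only the 99.9th percentile exposes the slow requests, which is why a serving system tracks the tail rather than the average.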
ARCHITECTURE
Requests: HTTP
[Figure: Clipper architecture diagram]
MODEL CONTAINERS
- Run using Docker containers
- Can be replicated across machines
- API: the interface is implemented per framework (e.g. a TF shim that instantiates the TF model)
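The per-framework interface idea can be sketched as a small abstract class that each shim implements. This is an illustrative sketch, not Clipper's actual API: `ModelContainer`, `predict_batch`, `LinearModelShim`, and `serve` are hypothetical names, and the toy linear model stands in for a real framework model.

```python
from abc import ABC, abstractmethod
from typing import List, Sequence


class ModelContainer(ABC):
    """Hypothetical uniform interface that every framework shim implements."""

    @abstractmethod
    def predict_batch(self, inputs: Sequence[Sequence[float]]) -> List[float]:
        """Return one prediction per input row."""


class LinearModelShim(ModelContainer):
    """Toy stand-in for a framework-specific shim (e.g. one wrapping a TF model)."""

    def __init__(self, weights: Sequence[float]):
        self.weights = list(weights)

    def predict_batch(self, inputs):
        return [sum(w * x for w, x in zip(self.weights, row)) for row in inputs]


def serve(container: ModelContainer, batch):
    # The serving layer relies only on the shared interface, not on the framework.
    return container.predict_batch(batch)


predictions = serve(LinearModelShim([1.0, 2.0]), [[1.0, 1.0], [0.0, 3.0]])
```

Because the serving layer only sees `predict_batch`, adding a new framework means writing one more shim rather than changing the layers above it.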
MODEL ABSTRACTION LAYER
Caching
- Predict(datapoint) → prediction, e.g. predict movies for user-id-1
- Cache predictions so that repeated predict and feedback requests for the same datapoint can reuse them
- Predictions that miss the cache go to the model
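A prediction cache of this kind can be sketched as a bounded map keyed by (model, datapoint). The class name `PredictionCache`, the `predict_fn` callback, and the LRU eviction policy are assumptions made for brevity; they are not taken from the slides.

```python
from collections import OrderedDict


class PredictionCache:
    """Hypothetical bounded cache keyed by (model_id, datapoint)."""

    def __init__(self, predict_fn, capacity=1024):
        self._predict_fn = predict_fn      # called only on a cache miss
        self._capacity = capacity
        self._entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def predict(self, model_id, datapoint):
        key = (model_id, datapoint)        # datapoint must be hashable
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._entries[key]
        self.misses += 1
        value = self._predict_fn(model_id, datapoint)
        self._entries[key] = value
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # evict least recently used (assumed policy)
        return value


calls = []

def slow_model(model_id, datapoint):
    # Stand-in for an expensive model evaluation.
    calls.append((model_id, datapoint))
    return len(datapoint)

cache = PredictionCache(slow_model, capacity=2)
first = cache.predict("recommender", "user-id-1")
second = cache.predict("recommender", "user-id-1")  # served from the cache
```

The second call never reaches the model, which is the benefit when the same datapoint is queried again or when feedback arrives for a recent prediction.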
BATCHING, QUEUING
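Goals, Insight
- Batch requests for improved throughput
- Batching amortizes the cost of doing an RPC and exploits hardware parallelism (GPU, CPU)
Approach
- Find the batch size that maximizes throughput while staying within a fixed latency SLO
- The best batch size could vary for each model and for the hardware it runs on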
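As a worked example of the trade-off, the sketch below assumes batch latency grows linearly with batch size (a fixed per-batch overhead plus a per-item cost). The linear model and all the numbers are assumptions for illustration; real models need to be measured.

```python
def batch_latency_ms(batch_size, fixed_overhead_ms, per_item_ms):
    """Assumed linear latency model: per-batch overhead plus per-item cost."""
    return fixed_overhead_ms + per_item_ms * batch_size


def best_batch_size(slo_ms, fixed_overhead_ms, per_item_ms, max_batch=1024):
    """Largest batch size whose modeled latency still meets the SLO (0 if none)."""
    best = 0
    for b in range(1, max_batch + 1):
        if batch_latency_ms(b, fixed_overhead_ms, per_item_ms) <= slo_ms:
            best = b
        else:
            break
    return best


def throughput_qps(batch_size, fixed_overhead_ms, per_item_ms):
    """Queries per second when batches are processed back to back."""
    return 1000.0 * batch_size / batch_latency_ms(batch_size, fixed_overhead_ms, per_item_ms)


# Hypothetical numbers: 5 ms overhead per batch, 0.5 ms per item, 20 ms SLO.
b = best_batch_size(slo_ms=20, fixed_overhead_ms=5, per_item_ms=0.5)
single = throughput_qps(1, 5, 0.5)
batched = throughput_qps(b, 5, 0.5)
```

Under these assumed numbers the fixed overhead dominates at batch size 1, so spreading it over a full batch raises throughput several times over while still meeting the SLO.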
ADAPTIVE BATCHING
[Plot: time vs. batch size]
AIMD: Additive Increase, Multiplicative Decrease. Why?
- Increase the batch size carefully (additively) while latency stays within the SLO
- Decrease the batch size multiplicatively once latency exceeds the SLO
Delayed: wait until a batch exists. Why?
- Collect examples up to a certain time, then dispatch
- How long should we wait? How few examples should we add?
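The AIMD rule above can be sketched as a small controller that is fed the observed latency of each batch. The class name `AIMDBatchSizer`, the step of 1, and the 0.9 backoff factor are illustrative parameter choices, not values stated on the slide.

```python
class AIMDBatchSizer:
    """Additive-increase / multiplicative-decrease controller for batch size."""

    def __init__(self, slo_ms, additive_step=1, backoff=0.9, initial=1):
        self.slo_ms = slo_ms
        self.additive_step = additive_step  # assumed step size
        self.backoff = backoff              # assumed multiplicative factor (< 1)
        self.batch_size = initial

    def update(self, observed_latency_ms):
        """Adjust the batch size after observing one batch's latency."""
        if observed_latency_ms <= self.slo_ms:
            # Within the SLO: probe carefully for a larger batch.
            self.batch_size += self.additive_step
        else:
            # SLO violated: back off quickly, never going below 1.
            self.batch_size = max(1, int(self.batch_size * self.backoff))
        return self.batch_size


# Simulate against an assumed linear latency model: 5 ms + 0.5 ms per item.
sizer = AIMDBatchSizer(slo_ms=20)
history = []
for _ in range(100):
    latency = 5 + 0.5 * sizer.batch_size
    history.append(sizer.update(latency))
```

In this simulation the batch size climbs one item at a time, overshoots the SLO slightly, backs off, and then keeps oscillating just below the largest batch that fits the SLO.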
MODEL SELECTION
- Improve accuracy
- Ensembles
SINGLE MODEL SELECTION
Multi-Arm Bandit formulation
- Picking the optimal action (which model to query)
- Clipper keeps a weight for each model (model 1, model 2, ...) and updates the weights based on feedback
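One way to realize the bandit formulation is an Exp3-style selector, sketched below. The Clipper paper uses Exp3 for single-model selection, but the specifics here (the exploration floor `gamma`, the learning rate `eta`, the weight renormalization, and all names) are assumptions for illustration rather than Clipper's implementation.

```python
import math
import random


class Exp3Selector:
    """Exp3-style bandit: one weight per model, updated from observed loss."""

    def __init__(self, num_models, eta=0.1, gamma=0.1, seed=None):
        self.weights = [1.0] * num_models
        self.eta = eta        # assumed learning rate
        self.gamma = gamma    # assumed exploration floor
        self._rng = random.Random(seed)

    def probabilities(self):
        total = sum(self.weights)
        k = len(self.weights)
        # Mix the weight-proportional distribution with uniform exploration.
        return [(1 - self.gamma) * w / total + self.gamma / k for w in self.weights]

    def select(self):
        """Sample the index of the model to query."""
        probs = self.probabilities()
        return self._rng.choices(range(len(probs)), weights=probs, k=1)[0]

    def update(self, model_index, loss):
        """Feedback step: loss in [0, 1] for the model that was queried."""
        p = self.probabilities()[model_index]
        # Importance-weighted loss keeps the estimate unbiased.
        self.weights[model_index] *= math.exp(-self.eta * loss / p)
        largest = max(self.weights)
        self.weights = [w / largest for w in self.weights]  # avoid underflow


selector = Exp3Selector(num_models=2, seed=0)
for _ in range(50):
    selector.update(0, 1.0)  # model 0 is always wrong
    selector.update(1, 0.0)  # model 1 is always right
probs = selector.probabilities()
```

After this feedback the selector sends most queries to model 1 but still explores model 0 occasionally, so it can recover if the models' relative accuracy changes.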
MULTI MODELS
Ensemble
- Ensembles predict, e.g. movies
- Linear combination of model outputs: ŷ = α·y1 + β·y2
- Exp4 → update α & β
Robust Prediction
- Binary classifier: each model outputs a score (e.g. C1 = 0.25, C2)
- Combine & threshold (> 0.5) → cat / dog
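The combine-and-threshold step for a binary classifier can be sketched as a weighted average of per-model scores. The function names, the normalization by the sum of weights, and the mapping of scores above the threshold to "cat" are assumptions for illustration.

```python
def combine(scores, weights):
    """Weighted linear combination of per-model scores, normalized by total weight."""
    if len(scores) != len(weights):
        raise ValueError("need exactly one weight per score")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * s for w, s in zip(weights, scores)) / total


def classify(scores, weights, threshold=0.5, labels=("dog", "cat")):
    """Return labels[1] if the combined score exceeds the threshold, else labels[0]."""
    return labels[1] if combine(scores, weights) > threshold else labels[0]


# Two hypothetical classifiers disagree: C1 says 0.25, C2 says 0.9.
equal_vote = classify([0.25, 0.9], [0.5, 0.5])
trust_c1 = classify([0.25, 0.9], [0.9, 0.1])
```

With equal weights the confident second model wins the vote, while shifting weight toward the first model flips the decision, which is why updating the weights from feedback matters.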
STRAGGLER MITIGATION
Why do stragglers occur?
- We wait for N model containers to reply; some of them might be slow
- More replicas? Co-locating?
Approach
- Approximate result based on whatever has finished
- Better approximate than late!
- ML specific
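The "use whatever has finished" approach can be sketched with a deadline on a set of concurrent model calls. This is a minimal sketch: the function names, the thread pool, the averaging of results, and the toy models are all assumptions for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def predict_with_deadline(models, x, deadline_s):
    """Query all models, then average whatever finished before the deadline.

    Returns (approximate_prediction, number_of_models_used).
    """
    executor = ThreadPoolExecutor(max_workers=len(models))
    futures = [executor.submit(model, x) for model in models]
    done, _not_done = wait(futures, timeout=deadline_s)
    executor.shutdown(wait=False)  # do not block on the stragglers
    results = [f.result() for f in done if f.exception() is None]
    if not results:
        return None, 0
    return sum(results) / len(results), len(results)


# Hypothetical models: two fast ones and one straggler.
def fast_a(x):
    return 1.0

def fast_b(x):
    return 3.0

def straggler(x):
    time.sleep(1.0)
    return 100.0

prediction, used = predict_with_deadline([fast_a, fast_b, straggler], x=None, deadline_s=0.3)
```

The caller gets an answer at the deadline built from two of the three models instead of waiting for the slow one, trading a little accuracy for bounded latency.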
SUMMARY
DISCUSSION
https://forms.gle/FCVhPURqz7HSbDtg6
Consider a scenario where you run a model serving service that hosts a number of different applications. The traffic for some applications is sporadic (e.g. only a few hours where they are used). What are some advantages / disadvantages of using Clipper for such a service?
Advantages
Disadvantages
- Resources might be contended
- Adaptive batching, delayed batching: need to tune?
- Multiple replicas: elasticity?
- Containerization isolates applications
- Ensembles: reasonably accurate; latency inflation is very low