Inference, Deployment, and Compression
CS4787 Lecture 22 — Spring 2020
all these metrics!
where some metrics get better and others get worse
What tools do we have in our toolbox?
since it’s easier to put a low-power CPU on a mobile device
return an answer for any individual example
activations around for later processing.
network may sometimes predict differently
your neural network.
the cost of some compute & can affect start-up latency.
signals until the accuracy decreases
first place would have been as good or better.
model to improve accuracy
[Figure: diagram with stages labeled "Lossy Compression" and "Retrain Weights", ending in the final compressed model]
smaller network to match its output
Neural Network.”
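One way to make "train a smaller network to match its output" concrete is a matching loss between the two networks' predicted distributions. The sketch below is only an illustration under stated assumptions: the names `softmax` and `distillation_loss`, the temperature parameter, and the choice of cross-entropy on softened outputs are assumptions for this example, not details given in the lecture.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Row-wise softmax of logits divided by a temperature (assumed knob)."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Mean cross-entropy between the large network's softened outputs
    (the targets) and the smaller network's softened outputs.
    Hypothetical helper: one common way to score how well the small
    network matches the large one."""
    p_teacher = softmax(teacher_logits, temperature)
    log_p_student = np.log(softmax(student_logits, temperature))
    return float(-(p_teacher * log_p_student).sum(axis=-1).mean())

# Usage: the loss is smallest when the small network reproduces the
# large network's outputs exactly.
teacher = np.array([[2.0, 0.5, -1.0]])
matching_student = teacher.copy()
mismatched_student = np.array([[-1.0, 0.5, 2.0]])
print(distillation_loss(matching_student, teacher))
print(distillation_loss(mismatched_student, teacher))
```

In a real training loop this loss would be minimized over the smaller network's weights with a gradient-based optimizer. The plain NumPy version here only evaluates the objective.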
models into a single better prediction
inference time
inference on
place for our application.
from one frame to the next.
really general.
support for fast DNN inference, so this will become less necessary.
not interesting.