PareCO: Pareto-aware Channel Optimization for Slimmable Neural Networks



SLIDE 1

PareCO: Pareto-aware Channel Optimization for Slimmable Neural Networks

Ting-Wu (Rudy) Chin, Ari S. Morcos, Diana Marculescu

SLIDE 2

Slimmable Neural Networks

[Figure: plot with axes #FLOPs and Error]

One set of weights, multiple networks on the trade-off front!
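
As a rough illustration of "one set of weights, multiple networks", the sketch below applies a single shared linear layer at two different widths by using only the first few output rows of the same weight matrix. The function name `slim_linear`, the toy weights, and the choice of a linear layer are assumptions made for illustration; the slides do not specify an implementation.

```python
# Hypothetical sketch of weight sharing across widths (not the authors' code).
def slim_linear(weight, bias, x, out_channels):
    """Apply a linear layer using only the first `out_channels` output rows
    of the shared weight matrix, and only the first len(x) input columns."""
    in_channels = len(x)
    return [
        sum(weight[o][i] * x[i] for i in range(in_channels)) + bias[o]
        for o in range(out_channels)
    ]

# Shared parameters for a layer with up to 4 outputs and 3 inputs (toy values).
WEIGHT = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 1.0],
]
BIAS = [0.0, 0.0, 0.0, 0.5]

x = [1.0, 2.0, 3.0]
narrow = slim_linear(WEIGHT, BIAS, x, out_channels=2)  # cheaper sub-network
wide = slim_linear(WEIGHT, BIAS, x, out_channels=4)    # full-width network
```

Both calls read from the same `WEIGHT` and `BIAS`, so the narrow output is a prefix of the wide output; only the amount of computation changes.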

SLIDE 3

Why Slimmable Neural Networks?

  • Reduce model maintenance cost
  • Runtime optimization

SLIDE 4

The Gap

SLIDE 5

How can we optimize slimmable neural networks with flexible widths?

[Figure: two Error vs. #FLOPs plots showing the trade-off induced by a slimmable network, annotated with α* and with α, θ]
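
To make the notion of a trade-off front concrete, here is a minimal sketch that filters a set of (FLOPs, error) points down to the non-dominated ones. The helper `pareto_front` and the candidate numbers are invented for illustration and are not results from the presentation.

```python
# Hypothetical sketch: extract the Pareto front from (flops, error) points.
def pareto_front(points):
    """Return the points not dominated by any other point.

    Each point is (flops, error); lower is better for both objectives.
    A point p is dominated if another point q is no worse in both."""
    front = []
    for p in points:
        dominated = any(
            q[0] <= p[0] and q[1] <= p[1] and q != p
            for q in points
        )
        if not dominated:
            front.append(p)
    return sorted(front)

# Made-up sub-network measurements: (FLOPs in millions, top-1 error).
candidates = [(100, 0.40), (150, 0.35), (150, 0.38), (200, 0.36), (300, 0.28)]
front = pareto_front(candidates)
```

Here `(150, 0.38)` and `(200, 0.36)` are dropped because `(150, 0.35)` is at least as good on both axes.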

SLIDE 6

The objective of our problem

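\[
\min_{\theta} \; \mathbb{E}_{x,y} \, \mathbb{E}_{\lambda} \, L_{\mathrm{CE}}(\theta; x, y, \alpha^*) \quad \text{s.t.} \quad \alpha^* = \arg\min_{\alpha} T_{\lambda}(\alpha; \theta, x, y)
\]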

Augmented Tchebyshev Scalarization
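
The slide only names the scalarization, so the sketch below uses one common textbook form of the augmented Tchebyshev scalarization, max_i λ_i |f_i - z_i| + β Σ_i λ_i |f_i - z_i|, which may differ from the exact form the authors use. It shows how a preference vector λ selects a width configuration α* from a small candidate set. The reference point, the value of `beta`, the candidate names, and all numbers are assumptions for illustration.

```python
# Hypothetical sketch of the inner problem: pick alpha* for a given lambda.
def augmented_tchebyshev(objectives, weights, reference, beta=0.05):
    """One common form: max_i w_i*|f_i - z_i| + beta * sum_i w_i*|f_i - z_i|."""
    terms = [w * abs(f - z) for f, w, z in zip(objectives, weights, reference)]
    return max(terms) + beta * sum(terms)

def select_config(candidates, weights, reference):
    """Return the candidate name that minimizes the scalarized objective."""
    return min(
        candidates,
        key=lambda name: augmented_tchebyshev(candidates[name], weights, reference),
    )

# Made-up (normalized FLOPs, error) pairs for three width configurations.
CANDIDATES = {
    "narrow": (0.2, 0.40),
    "medium": (0.5, 0.30),
    "wide": (1.0, 0.25),
}
REFERENCE = (0.0, 0.0)  # assumed ideal point

flops_focused = select_config(CANDIDATES, (0.9, 0.1), REFERENCE)
error_focused = select_config(CANDIDATES, (0.1, 0.9), REFERENCE)
```

Varying λ moves the selected configuration along the trade-off: a FLOPs-heavy weighting picks the narrow configuration, while an error-heavy weighting picks the wide one. In the stated objective, the outer minimization over θ would then train the shared weights at the selected α*; that training loop is omitted here.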

SLIDE 7

ImageNet: Compared to conventional slimmable neural networks

[Figure panels: MobileNetV2, MobileNetV3]

SLIDE 8

Takeaways

  • Optimizing the layer-wise channel counts for the sub-networks in slimmable neural networks allows for a better trade-off between prediction error and FLOPs

  • This work provides a principled formulation and a practical algorithm for optimizing the layer-wise channel counts for slimmable neural networks
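
To illustrate why layer-wise channel counts matter for the FLOPs side of the trade-off, the sketch below counts multiply-accumulates (used here as a proxy for FLOPs) for a small stack of stride-1 convolutions. The helper `conv_stack_flops`, the layer sizes, and the input resolution are assumptions for illustration; they are not taken from the presentation.

```python
# Hypothetical sketch: cost of a conv stack under different channel counts.
def conv_stack_flops(channels, in_channels=3, kernel=3, spatial=32):
    """Multiply-accumulate count for a stack of stride-1 k x k convolutions
    (same spatial size throughout) with the given output channel counts."""
    total = 0
    prev = in_channels
    for c in channels:
        total += kernel * kernel * prev * c * spatial * spatial
        prev = c
    return total

# Two configurations with the same total number of channels (96).
uniform = [32, 32, 32]    # one width multiplier applied to every layer
layerwise = [48, 16, 32]  # channel counts chosen per layer

uniform_flops = conv_stack_flops(uniform)
layerwise_flops = conv_stack_flops(layerwise)
```

The two configurations use the same number of channels in total but differ in cost, because each layer's cost depends on the product of its input and output channel counts. Whether a given configuration also lowers error must be measured by training and evaluation.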