Same, Same But Different: Recovering Neural Network Quantization Error Through Weight Factorization (PowerPoint PPT Presentation)


SLIDE 1

Same, Same But Different

Recovering Neural Network Quantization Error Through Weight Factorization

Eldad Meller, ICML 2019

SLIDE 2

Neural Network Quantization

  • Quantization of neural networks is needed for efficient inference
  • Quantization adds noise to the network and degrades its performance (see the sketch below)
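To make the added noise concrete, here is a minimal sketch (not from the slides) of symmetric per-tensor int8 quantization of a weight vector; `quantize_int8` is a hypothetical helper name:

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: w -> round(w / scale) * scale."""
    scale = np.abs(w).max() / 127.0           # step size set by the dynamic range
    q = np.clip(np.round(w / scale), -127, 127)
    return q * scale                          # dequantized ("fake-quantized") values

rng = np.random.default_rng(0)
w = rng.normal(size=1000)
noise = quantize_int8(w) - w                  # the noise quantization adds
print(f"max |noise| = {np.abs(noise).max():.4f}")
```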

SLIDE 3

Quantization Dynamic Range

  • The most common quantization setting is layer-wise quantization, where all the channels in a layer are quantized using the same dynamic range
  • Equalizing the dynamic ranges of the channels in a layer by amplifying channels with a small dynamic range reduces the overall quantization noise (a worked example follows below)
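To see why a shared layer-wise range penalizes small channels, here is a hypothetical two-channel example (the channel ranges and helper names are illustrative assumptions, not the paper's numbers):

```python
import numpy as np

def quantize(w, scale):
    return np.clip(np.round(w / scale), -127, 127) * scale

rng = np.random.default_rng(0)
big   = rng.uniform(-10.0, 10.0, size=1000)   # channel with a large dynamic range
small = rng.uniform(-0.1, 0.1, size=1000)     # channel with a small dynamic range

layer_scale = 10.0 / 127.0                    # one shared scale for the whole layer
for name, ch in [("big", big), ("small", small)]:
    rms = np.sqrt(((quantize(ch, layer_scale) - ch) ** 2).mean())
    print(f"{name:5s} channel: rms noise = {rms:.4f} ({rms / ch.std():.1%} of std)")
```

The channel with the small dynamic range suffers a far larger relative error, which is exactly the waste that equalization targets.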
SLIDE 4

A simple trick to amplify channels

  • For any homogeneous activation function (e.g. ReLU), any channel in the network can be scaled by any positive scalar if the weights in the consecutive layer are inversely scaled to match
  • The network's output remains unchanged (a numerical check follows below)
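A numerical check of this identity, assuming ReLU as the homogeneous activation (layer shapes and variable names are illustrative):

```python
import numpy as np

def relu(x):                                   # positively homogeneous: relu(c*x) = c*relu(x) for c > 0
    return np.maximum(x, 0.0)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 8)), rng.normal(size=4)   # layer 1: 8 inputs -> 4 channels
W2 = rng.normal(size=(3, 4))                           # layer 2 consumes those 4 channels
x = rng.normal(size=8)

c = np.array([5.0, 0.2, 3.0, 1.0])             # arbitrary positive per-channel scales

y_ref    = W2 @ relu(W1 @ x + b1)
y_scaled = (W2 / c) @ relu((c[:, None] * W1) @ x + c * b1)  # amplify, then inversely scale

print(np.allclose(y_ref, y_scaled))            # True: the output is unchanged
```

Because ReLU(c·z) = c·ReLU(z) for c > 0, the per-channel factors cancel exactly in the next layer.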

SLIDE 5

Network Equalization

SLIDE 6

Network Equalization
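The two slides above present network equalization (the figures were not captured here). As an illustration only, here is a minimal sketch of one plausible equalization rule, an assumption of ours rather than necessarily the paper's exact factorization: amplify each output channel of a layer until its weight range matches the layer maximum, and fold the inverse factors into the next layer.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def equalize_pair(W1, b1, W2):
    """Rescale (W1, b1, W2) so every output channel of W1 spans the layer's
    full weight range; under ReLU the network's function is unchanged."""
    ch_range = np.abs(W1).max(axis=1)          # per-output-channel dynamic range
    c = ch_range.max() / ch_range              # amplify the small channels
    return c[:, None] * W1, c * b1, W2 / c

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 8)), rng.normal(size=4)
W2 = rng.normal(size=(3, 4))
W1e, b1e, W2e = equalize_pair(W1, b1, W2)

x = rng.normal(size=8)
print(np.allclose(W2 @ relu(W1 @ x + b1), W2e @ relu(W1e @ x + b1e)))  # True
print(np.abs(W1e).max(axis=1))                 # all channels now share one range
```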

SLIDE 7

Quantization Degradation on ImageNet [%]

SLIDE 8

Quantization Degradation on ImageNet [%]

SLIDE 9

Summary

  • Equalization is an easy-to-use post-training quantization method to recover the quantization error in neural networks
  • It can be applied to any network
  • A novel approach to quantization: searching for the best equivalent representation of the network
  • The method can be combined with other quantization methods, e.g. quantization-aware training and smart clipping