SLIDE 1

Incremental Network Quantization: Towards Lossless CNNs With Low-Precision Weights

Aojun Zhou, Anbang Yao, Yiwen Guo, Lin Xu, Yurong Chen Presented by Zhuangwei Zhuang South China University of Technology June 6, 2017

SLIDE 2

Outline

  • Background
  • Motivation
  • Proposed Methods

 Variable-length encoding
 Incremental quantization strategy

  • Experimental Results
  • Conclusions

SLIDE 3

Background

SLIDE 4

Background

Huge networks lead to heavy memory and computation costs.

 ResNet-152 has a model size of 230 MB and needs about 11.3 billion FLOPs to process a single 224×224 image
 It is difficult to implement deep CNNs on hardware with limited computation and power (e.g., FPGA and ARM platforms)

SLIDE 5

Motivation

SLIDE 6

Motivation

CNN quantization is still an open question due to two critical issues:

 Non-negligible accuracy loss for existing CNN quantization methods
 Increased number of training iterations needed to ensure convergence

  • Network quantization

Converting floating-point (full-precision) weights into low-precision fixed-point values such as {+1, 0, -1} or {±2^{o_1}, …, ±2^{o_2}, 0}

SLIDE 7

Proposed Methods

SLIDE 8

Proposed Methods

  • Figure. Overview of INQ: pre-trained model → weight partition → group-wise quantization → retraining

  • Figure. Quantization strategy of INQ: the accumulated portion of quantized weights grows step by step (50%, 75%, 100%, …)

SLIDE 9

Variable-Length Encoding

Suppose a pre-trained full-precision CNN model can be represented by {W_m : 1 ≤ m ≤ M}, where

W_m: weight set of the m-th layer

M: number of layers

Goal of INQ: convert the 32-bit floating-point W_m into a low-precision Ŵ_m, where each entry of Ŵ_m is chosen from

P_m = {±2^{o_1}, ⋯, ±2^{o_2}, 0},

where o_1 and o_2 are two integer numbers with o_2 ≤ o_1.

SLIDE 10

Variable-Length Encoding

Ŵ_m is computed by:

Ŵ_m(j, k) = γ · sgn(W_m(j, k)),  if (β + γ)/2 ≤ abs(W_m(j, k)) < 3γ/2
Ŵ_m(j, k) = 0,                   otherwise,

where β and γ are two adjacent elements in the sorted P_m = {±2^{o_1}, ⋯, ±2^{o_2}, 0}, and 0 ≤ β < γ.
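
To make the rule concrete, here is a minimal sketch of the per-weight quantization (an illustration, not the authors' code). It assumes P_m is passed as its sorted non-negative magnitudes [0, 2^{o_2}, …, 2^{o_1}]; β and γ are scanned as adjacent pairs and the weight is snapped to ±γ when its magnitude falls in [(β + γ)/2, 3γ/2).

```python
import numpy as np

def quantize_weight(w, magnitudes):
    """Quantize one weight w following the slide's rule.

    magnitudes: sorted non-negative levels of P_m, e.g. [0, 2**o2, ..., 2**o1];
    the sign is recovered with sgn(w). Illustrative helper only.
    """
    a = abs(w)
    for beta, gamma in zip(magnitudes[:-1], magnitudes[1:]):
        if (beta + gamma) / 2 <= a < 3 * gamma / 2:
            return gamma * np.sign(w)   # |w| falls into the band belonging to gamma
    return 0.0                          # |w| below the smallest band is quantized to 0

# Example: with magnitudes [0, 0.25, 0.5, 1.0], a weight of 0.6 maps to 0.5
print(quantize_weight(0.6, [0.0, 0.25, 0.5, 1.0]))   # -> 0.5
```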

SLIDE 11

Variable-Length Encoding

P_m = {±2^{o_1}, ⋯, ±2^{o_2}, 0}

  • Define bit-width c: 1 bit represents 0, and the remaining c − 1 bits represent the values ±2^{o}
  • t is calculated by t = max(abs(W_m))
  • o_1 and o_2 are computed by o_1 = floor(log2(4t/3)) and o_2 = o_1 + 1 − 2^{c−1}/2
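
As an illustration (a minimal sketch under these formulas, not the authors' code; the helper name build_pm and its signature are assumptions):

```python
import math
import numpy as np

def build_pm(W, c):
    """Compute t, o1, o2 and the candidate set P_m for one layer."""
    t = float(np.max(np.abs(W)))                  # t = max(abs(W_m))
    o1 = math.floor(math.log2(4 * t / 3))         # o1 = floor(log2(4t/3))
    o2 = int(o1 + 1 - 2 ** (c - 1) / 2)           # o2 = o1 + 1 - 2^{c-1}/2
    pm = [0.0] + [s * 2.0 ** o for o in range(o2, o1 + 1) for s in (1, -1)]
    return t, o1, o2, pm

# Example: a layer whose largest weight magnitude is about 0.7, quantized to c = 5 bits
t, o1, o2, pm = build_pm(np.random.uniform(-0.7, 0.7, size=(64, 64)), c=5)
# typically o1 = -1 (since 4*0.7/3 ≈ 0.93), o2 = o1 - 7,
# and pm holds 0 plus 16 signed powers of two
```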

SLIDE 12

Incremental Quantization Strategy

  • Quantization strategy:

 Weight partition: divide the weights in each layer into two disjoint groups
 Group-wise quantization: quantize the weights in the first group
 Retraining: retrain the whole network, updating only the weights in the second group

  • Figure. Result illustrations

SLIDE 13

Incremental Quantization Strategy

For the m-th layer, weight partition can be defined as

A_m^(1) ∪ A_m^(2) = {W_m(j, k)}, and A_m^(1) ∩ A_m^(2) = ∅

A_m^(1): first weight group, which needs to be quantized

A_m^(2): second weight group, which needs to be retrained

  • Define the binary matrix T_m

T_m(j, k) = 0 if W_m(j, k) ∈ A_m^(1);  1 if W_m(j, k) ∈ A_m^(2)

  • Update W_m (γ denotes the learning rate and E the network loss)

W_m(j, k) ← W_m(j, k) − γ · (∂E/∂W_m(j, k)) · T_m(j, k)
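
A minimal sketch of this masked update (illustrative only; the helper name masked_update is an assumption):

```python
import numpy as np

def masked_update(W, T, grad, lr):
    """One retraining step from the slide: W <- W - lr * dE/dW * T.

    T is the binary matrix T_m: 0 for already-quantized weights (group A_m^(1)),
    which therefore stay fixed, and 1 for weights still being retrained (A_m^(2)).
    lr corresponds to the learning rate gamma; grad is dE/dW for the mini-batch.
    """
    return W - lr * grad * T
```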

SLIDE 14

Incremental Quantization Strategy

  • Algorithm. Pseudo Code of INQ
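
The algorithm image is not reproduced in the transcript; the following is a hedged sketch of the loop it describes, combining the three steps above. The helpers partition_layer, quantize_to_pm, and retrain, as well as the portion schedule, are assumptions standing in for the pruning-inspired partition, the power-of-two quantization of the earlier slides, and mask-aware SGD retraining.

```python
import numpy as np

def inq(layers, partition_layer, quantize_to_pm, retrain,
        accumulated_portions=(0.5, 0.75, 1.0)):
    """Sketch of the INQ procedure (not the authors' pseudo-code verbatim).

    layers: list of full-precision weight arrays {W_m}
    accumulated_portions: share of each layer's weights quantized after each outer step
    """
    T = [np.ones_like(W) for W in layers]             # T_m = 1: still re-trainable
    for portion in accumulated_portions:
        for m, W in enumerate(layers):
            # Weight partition: choose which weights join group A_m^(1) at this step
            to_quantize = partition_layer(W, T[m], portion)
            # Group-wise quantization: snap that group to P_m
            W[to_quantize] = quantize_to_pm(W[to_quantize], W)
            T[m][to_quantize] = 0                     # freeze quantized weights
        # Retraining: SGD updates only weights with T_m = 1
        retrain(layers, T)
    return layers, T
```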

SLIDE 15

Experimental Results

SLIDE 16

Results on ImageNet

  • Table. Converting full-precision models to 5-bit versions

SLIDE 17

Analysis of Weight Partition Strategies

  • Table. Comparison of different weight partition strategies on ResNet-18

  • Random partition: all weights have an equal probability of falling into either of the two groups

  • Pruning-inspired partition: weights with larger absolute values are more likely to be quantized first (see the sketch below)
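
A minimal sketch of the two partition strategies (assumed implementations, not the authors' code):

```python
import numpy as np

def pruning_inspired_partition(W, fraction):
    """The `fraction` of weights with the largest |w| form the quantize-first
    group A^(1); the rest form the retraining group A^(2).
    Returns a boolean mask that is True for A^(1)."""
    k = int(round(fraction * W.size))
    if k == 0:
        return np.zeros_like(W, dtype=bool)
    thresh = np.sort(np.abs(W), axis=None)[-k]        # k-th largest magnitude
    return np.abs(W) >= thresh

def random_partition(W, fraction, seed=0):
    """Each weight joins A^(1) with probability `fraction`, regardless of magnitude."""
    rng = np.random.default_rng(seed)
    return rng.random(W.shape) < fraction
```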

SLIDE 18

Trade-Off Between Bit-Width and Accuracy

  • Table. Exploration of bit-width on ResNet-18
  • Table. Comparison of the proposed ternary model and the baselines on ResNet-18
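
Plugging the bit-width into the formula from the variable-length encoding slide: c = 2 gives o_2 = o_1 + 1 − 2^{1}/2 = o_1, i.e. the ternary set {±2^{o_1}, 0}, while c = 5 gives o_2 = o_1 − 7, i.e. 16 signed powers of two plus zero.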

SLIDE 19

Low-Bit Deep Compression

  • Table. Comparison of INQ+DNS and the deep compression method on AlexNet. Conv: convolutional layer, FC: fully connected layer, P: pruning, Q: quantization, H: Huffman coding

SLIDE 20

Conclusions

SLIDE 21

Conclusions

  • Contributions

 Present INQ, which converts any pre-trained full-precision CNN model into a lossless low-precision version
 The quantized models with 5/4/3/2-bit weights achieve accuracy comparable to their full-precision baselines

  • Future work

 Extend the incremental idea from low-precision weights to low-precision activations and low-precision gradients
 Implement the proposed low-precision models on hardware platforms

SLIDE 22

Q & A

SLIDE 23

References

[1] Aojun Zhou, Anbang Yao, Yiwen Guo, Lin Xu, and Yurong Chen. Incremental network quantization: Towards lossless CNNs with low-precision weights. In ICLR, 2017.
[2] Yiwen Guo, Anbang Yao, and Yurong Chen. Dynamic network surgery for efficient DNNs. In NIPS, 2016.
[3] Song Han, Jeff Pool, John Tran, and William J. Dally. Learning both weights and connections for efficient neural networks. In NIPS, 2015.
[4] Song Han, Huizi Mao, and William J. Dally. Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding. In ICLR, 2016.
[5] Fengfu Li and Bin Liu. Ternary weight networks. arXiv preprint arXiv:1605.04711v1, 2016.
[6] Mohammad Rastegari, Vicente Ordonez, Joseph Redmon, and Ali Farhadi. XNOR-Net: ImageNet classification using binary convolutional neural networks. arXiv preprint arXiv:1603.05279v4, 2016.
