CMSC5743 L06: Binary/Ternary Network

  1. CMSC5743 L06: Binary/Ternary Network. Bei Yu (latest update: November 2, 2020). Fall 2020.

  2. These slides contain/adapt materials developed by
     - Ritchie Zhao et al. (2017). “Accelerating binarized convolutional neural networks with software-programmable FPGAs”. In: Proc. FPGA, pp. 15–24
     - Mohammad Rastegari et al. (2016). “XNOR-NET: Imagenet classification using binary convolutional neural networks”. In: Proc. ECCV, pp. 525–542

  3. Binary / Ternary Net: Motivation
     [Figure: histogram of weight values (x-axis: Weight Value, −0.05 to 0.05; y-axis: Count, 0 to 6400), shown next to the quantized levels −1, 0, 1.]

  4. Binarized Neural Networks (BNN)
     Key differences from a CNN:
     1. Inputs are binarized (−1 or +1)
     2. Weights are binarized (−1 or +1)
     3. Results are binarized after batch normalization
     [Figure: in a CNN, a real-valued input map convolved with real-valued weights gives a real-valued output map. In a BNN, a binary input map convolved with binary weights gives an integer output map; batch normalization followed by binarization turns it into a binary output map.]
     Batch normalization: $y_{ij} = \frac{x_{ij} - \mu}{\sqrt{\sigma^2 + \epsilon}}\,\gamma + \beta$
     Binarization: the output is $+1$ if $y_{ij} \ge 0$, and $-1$ otherwise
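
The three steps on this slide can be sketched in NumPy. This is a minimal illustration, not the implementation from Zhao et al.: it uses a dense (fully connected) layer instead of a convolution to stay short, and the names `binarize` and `bnn_dense` as well as the batch-norm statistics are made up for the example.

```python
import numpy as np

def binarize(x):
    """Map real values to {-1, +1}; zero maps to +1, matching the 'y >= 0' rule."""
    return np.where(x >= 0, 1, -1).astype(np.int32)

def bnn_dense(x_bin, w_bin, mu, var, gamma, beta, eps=1e-5):
    """Binary inputs times binary weights -> integer map -> batch norm -> binarize."""
    acc = x_bin @ w_bin                                  # integer accumulations
    y = (acc - mu) / np.sqrt(var + eps) * gamma + beta   # batch normalization
    return acc, binarize(y)

# Hypothetical inputs and weights, binarized before use.
x_bin = binarize(np.array([0.3, -1.2, 0.8, -0.1]))
w_bin = binarize(np.array([[0.5, -0.2],
                           [-0.7, 0.9],
                           [0.1, 0.4],
                           [-0.3, -0.6]]))

# Hypothetical batch-norm statistics (assumed, not taken from the slides).
acc, out = bnn_dense(x_bin, w_bin, mu=1.0, var=4.0, gamma=1.0, beta=0.0)
```

Here `acc` holds the integer output map and `out` the binary map that the next layer would consume.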

  5. BNN CIFAR-10 Architecture [2]
     [Figure: feature map dimensions 32x32, 16x16, 8x8, 4x4; number of feature maps 3, 128, 128, 256, 256, 512, 512, 1024, 1024, 10.]
     - 6 conv layers, 3 dense layers, 3 max pooling layers
     - All conv filters are 3x3
     - First conv layer takes in floating-point input
     - 13.4 Mbits total model size (after hardware optimizations)
     [2] M. Courbariaux et al. Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1. arXiv:1602.02830, Feb 2016.

  6. Advantages of BNN
     1. Floating point ops replaced with binary logic ops
        | b1 | b2 | b1 × b2 | b1 (encoded) | b2 (encoded) | b1 XOR b2 |
        |----|----|---------|--------------|--------------|-----------|
        | +1 | +1 | +1      | 0            | 0            | 0         |
        | +1 | −1 | −1      | 0            | 1            | 1         |
        | −1 | +1 | −1      | 1            | 0            | 1         |
        | −1 | −1 | +1      | 1            | 1            | 0         |
        - Encode {+1, −1} as {0, 1} → multiplies become XORs
        - Conv/dense layers do dot products → XOR and popcount
        - Operations can map to LUT fabric as opposed to DSPs
     2. Binarized weights may reduce total model size
        - Fewer bits per weight may be offset by having more weights
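
The XOR-and-popcount claim can be checked on a small example. The sketch below uses the slide's encoding (+1 as bit 0, −1 as bit 1); the helper names `encode` and `binary_dot` are illustrative. A mismatching bit pair contributes −1 to the dot product and a matching pair contributes +1, so for vectors of length n the dot product equals n minus twice the popcount of the XOR.

```python
def encode(values):
    """Pack a sequence of +1/-1 values into an int: +1 -> bit 0, -1 -> bit 1."""
    word = 0
    for i, v in enumerate(values):
        if v == -1:
            word |= 1 << i
    return word

def binary_dot(a_word, b_word, n):
    """Dot product of two +1/-1 vectors of length n from their packed encodings."""
    mismatches = bin(a_word ^ b_word).count("1")   # popcount of the XOR
    return n - 2 * mismatches

a = [+1, -1, -1, +1, -1, +1, +1, -1]
b = [-1, -1, +1, +1, -1, +1, +1, +1]

reference = sum(x * y for x, y in zip(a, b))        # ordinary multiply-accumulate
fast = binary_dot(encode(a), encode(b), len(a))     # XOR + popcount
assert fast == reference
```

On hardware the per-bit loop in `encode` disappears, since activations and weights are already stored as packed bits.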

  7. BNN vs CNN Parameter Efficiency
     | Architecture          | Depth | Param Bits (Float) | Param Bits (Fixed-Point) | Error Rate (%) |
     |-----------------------|-------|--------------------|--------------------------|----------------|
     | ResNet [3] (CIFAR-10) | 164   | 51.9M              | 13.0M*                   | 11.26          |
     | BNN [2]               | 9     | -                  | 13.4M                    | 11.40          |
     * Assuming each float param can be quantized to 8-bit fixed-point
     Comparison:
     - Conservative assumption: ResNet can use 8-bit weights
     - BNN is based on VGG (less advanced architecture)
     - BNN seems to hold promise!
     [2] M. Courbariaux et al. Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1. arXiv:1602.02830, Feb 2016.
     [3] K. He, X. Zhang, S. Ren, and J. Sun. Identity Mappings in Deep Residual Networks. ECCV 2016.
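
The fixed-point figure for ResNet follows from the footnote if each float parameter is 32 bits wide. The slide does not state the float width, so 32 bits is an assumption here; it is consistent with the roughly 4x ratio between the two ResNet columns.

```python
FLOAT_BITS = 32   # assumption: single-precision float parameters
FIXED_BITS = 8    # from the slide footnote: 8-bit fixed-point

resnet_float_bits = 51.9e6                            # "Param Bits (Float)" for ResNet
resnet_params = resnet_float_bits / FLOAT_BITS        # about 1.62M parameters
resnet_fixed_bits = resnet_params * FIXED_BITS        # about 13.0M bits

bnn_bits = 13.4e6                                     # BNN: one bit per weight
ratio = bnn_bits / resnet_fixed_bits                  # BNN size relative to 8-bit ResNet
```

Under this assumption the BNN is slightly larger than the 8-bit ResNet, which is the sense in which fewer bits per weight can be offset by having more weights.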

  8. Overview
     - Minimize the Quantization Error
     - Reduce the Gradient Error

  16. Training Binary Weight Networks
      Naive solution:
      1. Train a network with real value parameters
      2. Binarize the weight filters

  17. [Chart: AlexNet Top-1 accuracy (%) on ILSVRC2012. Full Precision: 56.7; Naive: 0.2.]

  18. [Figure: binarization maps the real-valued weight filters $W^R$ to binary weight filters $W^B$.]

  20. Binary Weight Network
      Train for binary weights:
      1. Randomly initialize $W$
      2. For iter = 1 to N
      3.   Load a random input image $X$
      4.   $W^B = \mathrm{sign}(W)$
      5.   $\alpha = \frac{\|W\|_{\ell 1}}{n}$
      6.   Forward pass with $\alpha$, $W^B$
      7.   Compute loss function $C$
      8.   $\frac{\partial C}{\partial W}$ = Backward pass with $\alpha$, $W^B$
      9.   Update $W$ ($W = W - \frac{\partial C}{\partial W}$)
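
The nine steps above can be sketched on a toy linear model. Everything beyond the step structure is an assumption made for illustration: the squared-error loss, the learning rate `lr` (the slide's update omits a step size), the synthetic data, and the backward pass, which simply applies the gradient with respect to the effective weights $\alpha W^B$ to the real-valued $W$ (a straight-through simplification, not necessarily the exact gradient used by Rastegari et al.).

```python
import numpy as np

def binarize_weights(w):
    """Steps 4-5: W^B = sign(W) and alpha = ||W||_1 / n."""
    w_b = np.where(w >= 0, 1.0, -1.0)
    alpha = np.abs(w).sum() / w.size
    return alpha, w_b

def train_binary_weights(xs, ys, n_iters=200, lr=0.05, seed=0):
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.1, size=xs.shape[1])   # 1. randomly initialize W
    losses = []
    for _ in range(n_iters):                      # 2. for iter = 1 to N
        i = rng.integers(len(xs))                 # 3. load a random input
        x, y = xs[i], ys[i]
        alpha, w_b = binarize_weights(w)          # 4-5. W^B and alpha
        pred = alpha * (w_b @ x)                  # 6. forward pass with alpha, W^B
        losses.append(0.5 * (pred - y) ** 2)      # 7. loss C (squared error, assumed)
        grad = (pred - y) * x                     # 8. backward pass (straight-through)
        w = w - lr * grad                         # 9. update the real-valued W
    return w, losses

# Synthetic regression data whose true weights are already of the form alpha * sign.
data_rng = np.random.default_rng(1)
xs = data_rng.normal(size=(64, 4))
ys = xs @ np.array([0.5, -0.5, 0.5, -0.5])

w, losses = train_binary_weights(xs, ys)
alpha, w_b = binarize_weights(w)   # what would be deployed at inference time
```

Note that the real-valued `w` is kept and updated throughout training; only the forward and backward passes see the binarized weights.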

  27. [Chart: AlexNet Top-1 accuracy (%) on ILSVRC2012. Full Precision: 56.7; Naive: 0.2; Binary Weight: 56.8.]

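  31. XNOR-Net approximation steps
      (1) Binarizing weights: each real-valued filter is replaced by a binary filter.
      (2) Binarizing input
          - Inefficient: binarize each sliding window of $X$ with $\mathrm{sign}(X)$ separately, which repeats computation in the overlapping areas.
          - Efficient: average $|X|$ over the $c$ channels, $A = \frac{1}{c}\sum_i |X_{:,:,i}|$, apply an average filter to $A$ to get the scales for all windows at once, and binarize the whole input once with $\mathrm{sign}(X)$.
      (3) Convolution with XNOR-Bitcount: the real-valued convolution is approximated by a binary convolution between $\mathrm{sign}(X)$ and the binary weights.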
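
The efficient variant can be sketched for a single filter in NumPy. This is an illustration under stated assumptions rather than the authors' code: the function names are invented, the convolution is a naive "valid" cross-correlation, the weight scale is taken as the mean absolute weight, and the binary convolution is done with ordinary arithmetic where real hardware would use XNOR and bitcount.

```python
import numpy as np

def sign_pm1(t):
    """Elementwise sign with values in {-1, +1} (zero maps to +1)."""
    return np.where(t >= 0, 1.0, -1.0)

def correlate_valid(x, w):
    """Naive 'valid' cross-correlation of x (c, H, W) with one filter w (c, h, wd)."""
    _, H, W = x.shape
    _, h, wd = w.shape
    out = np.empty((H - h + 1, W - wd + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[:, i:i + h, j:j + wd] * w)
    return out

def xnor_conv(x, w):
    """Approximate correlate_valid(x, w) using binarized inputs and weights."""
    _, h, wd = w.shape
    alpha = np.abs(w).mean()                        # scale for the binary filter
    a = np.abs(x).mean(axis=0, keepdims=True)       # A: mean of |X| over channels
    k = np.full((1, h, wd), 1.0 / (h * wd))         # average filter
    scales = correlate_valid(a, k)                  # one input scale per window
    binary = correlate_valid(sign_pm1(x), sign_pm1(w))  # stands in for XNOR-bitcount
    return binary * scales * alpha

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 6, 6))     # hypothetical input with c = 3 channels
w = rng.normal(size=(3, 3, 3))     # hypothetical 3x3 filter
approx = xnor_conv(x, w)
exact = correlate_valid(x, w)
```

The scale map is computed once with a single averaging pass over `a`, which is what removes the redundant per-window work of the inefficient variant. When the input and filter are already a constant times a ±1 tensor, `approx` and `exact` coincide; for general real-valued data they differ.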

  33. [Chart: AlexNet Top-1 accuracy (%) on ILSVRC2012, with bars at 56.7, 56.8, 30.5, and 0.2 (bar labels missing).]

  34. Network Structure in XNOR-Networks
      A typical block in CNN: Conv → BNorm → Activ → Pool. [Figure: the sign(x) activation, with outputs +1 and −1, feeding Max-Pooling.]
      ✗ Information Loss ✓ Multiple Maximums

  35. Network Structure in XNOR-Networks
      Layers: BNorm, Conv, Activ, Pool
      ✗ Information Loss ✓ Multiple Maximums

  36. Network Structure in XNOR-Networks
      Layers: BNorm, Conv, Pool, Activ
      ✓ Information Loss ✓ Multiple Maximums
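
A small enumeration shows why max-pooling directly on sign outputs is a concern. The reading of the slide labels here is an interpretation: "Information Loss" is taken to mean that pooled binary maps are almost always +1, and "Multiple Maximums" that several positions in a window tie for the maximum. The sketch enumerates every possible 2x2 window of ±1 values.

```python
from itertools import product

# All 16 possible 2x2 pooling windows whose entries are -1 or +1.
windows = list(product([-1, +1], repeat=4))

# Windows whose max-pool output is +1: every window except the all -1 one.
plus_one = sum(1 for w in windows if max(w) == 1)

# Windows in which more than one position attains the maximum.
ties = sum(1 for w in windows if w.count(max(w)) > 1)

print(f"{plus_one} of {len(windows)} windows pool to +1")
print(f"{ties} of {len(windows)} windows have tied maxima")
```

If the four inputs were equally likely to be +1 or −1, almost every pooled value would be +1, so the pooled map would carry little information about its input.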
