Chengyue Gong*1, Zixuan Jiang*2, Dilin Wang1, Yibo Lin2, Qiang Liu1, and David Z. Pan2
1CS Department, 2ECE Department
The University of Texas at Austin
Mixed Precision Neural Architecture Search for Energy Efficient Deep Learning
[Figure: error rate (%), number of layers, and speed (ms) for AlexNet, Inception, VGG, and ResNet models. Series: layers, speed (ms), top-1 error, top-5 error.]
hardware-aware automated quantization with mixed precision,” CVPR, 2019.
[Figure: top-1 accuracy, top-5 accuracy, and energy (mJ) for fix8, fix6, fix4, and haq mixed quantization.]
Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., & Chen, L. C. MobileNetV2: Inverted residuals and linear
Neural architecture !, #, $ Quantization %&, %'
[1] H. Sharma, J. Park, N. Suda, L. Lai, B. Chau, V. Chandra, and H. Esmaeilzadeh, “Bit fusion: Bit-level dynamically composable architecture for accelerating deep neural network,” in Proc. ISCA, June 2018, pp. 764–775.
[2] https://github.com/hsharma35/bitfusion
Model        Top-5 Error (%)   Model Size (MB)   Energy (mJ)   Latency (ms)
HAQ-small    12.7              1.7               12.9          32.1
Ours-small   11.6              1.44              8.91          21.2
HAQ-base     10.1              2.12              16.3          40.2
Ours-base    9.94              2.06              10.9          24.7
[Figure: error (%) versus energy (mJ).]
Model         Precision   Top-1 Error (%)   Top-5 Error (%)   Model Size (MB)   Energy (mJ)   Latency (ms)
VGG-16        FXP 8       29.1              7.4               138               753           838
Resnet-50     FXP 8       24.7              5.3               25.5              557           591
MobileNetV2   FXP 8       28.19             9.75              3.4               29            73.9
FBNet-B       FXP 8       26.84             8.97              4.5               34.7          83.9
FBNet-B       FXP 3       36.29             15.4              1.68              13.5          27.9
HAQ-small     Mixed       33.01             12.7              1.7               12.9          32.1
Ours-small    Mixed       31.62             11.6              1.44              8.91          21.2
HAQ-base      Mixed       29.1              10.1              2.12              16.3          40.2
Ours-base     Mixed       28.23             9.94              2.06              10.9          24.7
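To make the HAQ versus Ours comparison in these results easier to read, the relative savings can be computed directly from the reported numbers. The following is a minimal sketch: the values are copied from the results above, while the dictionary layout and the helper names `relative_reduction` and `compare` are illustrative assumptions, not part of the original work.

```python
# Reported results for the mixed-precision models (copied from the results above).
# Keys and helper names are hypothetical, chosen only for this illustration.
results = {
    "HAQ-small":  {"top5_err": 12.7, "size_mb": 1.7,  "energy_mj": 12.9, "latency_ms": 32.1},
    "Ours-small": {"top5_err": 11.6, "size_mb": 1.44, "energy_mj": 8.91, "latency_ms": 21.2},
    "HAQ-base":   {"top5_err": 10.1, "size_mb": 2.12, "energy_mj": 16.3, "latency_ms": 40.2},
    "Ours-base":  {"top5_err": 9.94, "size_mb": 2.06, "energy_mj": 10.9, "latency_ms": 24.7},
}


def relative_reduction(baseline: float, ours: float) -> float:
    """Percentage by which `ours` is lower than `baseline`."""
    return 100.0 * (baseline - ours) / baseline


def compare(baseline_name: str, ours_name: str) -> dict:
    """Per-metric percentage reduction of one model relative to a baseline."""
    base, ours = results[baseline_name], results[ours_name]
    return {
        metric: round(relative_reduction(base[metric], ours[metric]), 1)
        for metric in base
    }


if __name__ == "__main__":
    for base_name, ours_name in [("HAQ-small", "Ours-small"), ("HAQ-base", "Ours-base")]:
        print(f"{ours_name} vs {base_name}: {compare(base_name, ours_name)}")
```

All four metrics are lower-is-better, so a positive percentage means the second model improves on the baseline for that metric.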