Ligeng Zhu May 4th
Neural Architecture
The Blooming of CNNs
Bypass Connection
Direct gradient flow between any two layers makes …
Model (CIFAR-10)   Params   Error (%)
Res-32             0.46M    7.51
Res-44             0.66M    7.17
Res-56             0.85M    6.97
Res-110            1.7M     6.43
Res-1202           19.4M    7.93
# ResNet (pre-activation)
def ResidualBlock(x):
    # BN_ReLU_Conv: BatchNorm -> ReLU -> Conv
    x1 = BN_ReLU_Conv(x)
    x2 = BN_ReLU_Conv(x1)
    return x + x2  # element-wise sum with the bypass

for i in range(N):
    model.add(ResidualBlock)

# DenseNet (BC structure)
def DenseBlock(x):
    x1 = BN_ReLU_Conv(x)
    x2 = BN_ReLU_Conv(x1)
    return Concat([x, x2])  # channel-wise concatenation with the bypass

for i in range(N):
    model.add(DenseBlock)
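One practical consequence of the two blocks above is how the feature width evolves: addition keeps the channel count fixed, while concatenation grows it with every block. The following is a minimal sketch that tracks only channel counts, not tensors. The function names, the starting width `c0`, and the assumption that each dense block's convolution emits `k` new channels (a growth rate) are illustrative choices, not taken from the slides.

```python
def resnet_channels(c0, num_blocks):
    """Channel count after each residual block: x + F(x) keeps the width."""
    return [c0 for _ in range(num_blocks)]


def densenet_channels(c0, num_blocks, k):
    """Channel count after each dense block: Concat([x, F(x)]) adds k channels.

    Assumes (hypothetically) that F always produces exactly k channels.
    """
    widths = []
    c = c0
    for _ in range(num_blocks):
        c += k  # concatenation appends the k new feature maps
        widths.append(c)
    return widths


print(resnet_channels(16, 4))        # [16, 16, 16, 16]
print(densenet_channels(16, 4, 12))  # [28, 40, 52, 64]
```

The linear growth in width is one plausible reason the deepest dense configurations run out of memory before their residual counterparts do.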
Huang, G., Liu, Z., Van Der Maaten, L., & Weinberger, K. Q. (2017). Densely connected convolutional networks. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 4700-4708).
Model          Params
Dense-40-12    1.0M
Dense-100-12   7.0M
Dense-100-24   27.2M
Dense-200-12   OOM (out of memory)
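\[
\begin{aligned}
x_{\ell+1} &= F_\ell(x_\ell) + x_\ell \\
           &= F_\ell(x_\ell) + F_{\ell-1}(x_{\ell-1}) + x_{\ell-1} \\
           &= F_\ell(x_\ell) + F_{\ell-1}(x_{\ell-1}) + \dots + F_1(x_1) \\
           &= y_{\ell-1} + y_{\ell-2} + \dots + y_1
\end{aligned}
\]

\[
\begin{aligned}
x_{\ell+1} &= F_\ell(x_\ell) \oplus x_\ell \\
           &= F_\ell(x_\ell) \oplus F_{\ell-1}(x_{\ell-1}) \oplus x_{\ell-1} \\
           &= F_\ell(x_\ell) \oplus F_{\ell-1}(x_{\ell-1}) \oplus \dots \oplus F_1(x_1) \\
           &= y_{\ell-1} \oplus y_{\ell-2} \oplus \dots \oplus y_1
\end{aligned}
\]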
ResNet DenseNet Mixed Link Dual Path
Zhu, L., Deng, R., Maire, M., Deng, Z., Mori, G., & Tan, P. (2018). Sparsely aggregated convolutional
[Figure: (a) Dense Aggregation, with an equivalent exploded view of (a); (b) Sparse Aggregation (Our Proposed Topology). Both panels show layers F0 through F8.]
ResNet & DenseNet: each layer takes all previous outputs.
SparseNet: each layer takes only the previous outputs at exponentially increasing offsets (e.g., i − 1, i − 2, i − 4, i − 8, …).
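The two aggregation rules can be made concrete by listing which earlier layers feed layer i. This is a minimal sketch under stated assumptions: layers are indexed from 0, the offsets are powers of a `base` that defaults to 2, and the function names are hypothetical rather than taken from the SparseNet code.

```python
def dense_inputs(i):
    """Indices of all earlier layers, as used by dense aggregation."""
    return list(range(i - 1, -1, -1))


def sparse_inputs(i, base=2):
    """Indices i-1, i-2, i-4, ... (offsets are powers of `base`) that exist."""
    inputs = []
    offset = 1
    while offset <= i:
        inputs.append(i - offset)
        offset *= base
    return inputs


def total_links(num_layers, inputs_fn):
    """Total number of incoming links over layers 1..num_layers."""
    return sum(len(inputs_fn(i)) for i in range(1, num_layers + 1))


print(sparse_inputs(9))  # [8, 7, 5, 1]
print(total_links(64, dense_inputs), total_links(64, sparse_inputs))
```

With N layers, the dense rule creates on the order of N² links in total, while the exponential offsets give each layer only about log₂(i) inputs, so the total grows roughly as N log N.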
Manual Architecture Design
VGGNets, Inception Models, ResNets, DenseNets, …
Automatic Architecture Search
Human Expertise Machine Learning
Reinforcement Learning, Neuro-evolution, Bayesian Optimization, Monte Carlo Tree Search, …
Computational Resources
Learning Transferable Architectures for Scalable Image Recognition
4 days * 24 hours * 500 GPUs = 48,000 GPU hours
Net2Wider Net2Deeper
ResNet Inception DenseNet MobileNet ShuffleNet
Previous Paradigm: One CNN for all datasets. Our Work: Customize CNN for each dataset.
Previous Paradigm: One CNN for all platforms.
ResNet Inception DenseNet MobileNet ShuffleNet
Our Work: Customize CNN for each platform.
Proxyless NAS
Current neural architecture search (NAS) is VERY EXPENSIVE.
Therefore, previous work has to rely on proxy tasks:
…
*when searching directly on ImageNet, as we do
[Diagram: Learner, Architecture Updates on a Proxy Task, Transfer to Target Task & Hardware]
Proxies:
Limitations of Proxy
Goal: Directly learn architectures on the target task and hardware, while allowing all blocks to have different structures. We achieve this by …
[Diagram: proxyless flow (Learner, Architecture Update directly on Target Task & Hardware) next to the proxy-based flow (Learner, Architecture Update on a Proxy Task, Transfer to Target Task & Hardware)]
AI research institutes (Google, Facebook, NVIDIA): good weapons, that is, high-end GPU clusters and many engineers.
Us: poor equipment but smart students. Fewer GPUs, but a more efficient algorithm.
Prune redundant paths based on architecture parameters.
Simplify NAS to a single training process of an over-parameterized network. No meta-controller.
Stand on the shoulders of giants.
Build the cumbersome network with all candidate paths.
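The pruning step can be pictured as keeping, in every block, the candidate path with the largest architecture parameter. This is a minimal sketch assuming the final architecture is derived by a per-block argmax. The parameter values, the dictionary layout, and the function name are hypothetical and only illustrate the idea.

```python
def prune_to_best_paths(arch_params):
    """For each block, keep the candidate op with the largest architecture parameter."""
    return [max(block, key=block.get) for block in arch_params]


# One dict per block: candidate op -> learned architecture parameter (made-up values).
arch_params = [
    {"conv 3x3": 0.2, "conv 5x5": 1.3, "identity": -0.4},
    {"conv 3x3": 0.9, "conv 5x5": 0.1, "identity": 0.3},
]

print(prune_to_best_paths(arch_params))  # ['conv 5x5', 'conv 3x3']
```

Because every block is pruned independently, the resulting network is free to use a different operation in each block.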
Binarize the architecture parameters and allow only one path of activation to be active in memory at run-time. We propose gradient-based and RL methods to update the binarized parameters. Thereby, the memory footprint reduces from O(N) to O(1).
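The memory argument can be illustrated with a toy mixed operation: if the gates are one-hot, only the active path is evaluated, so only one output activation exists regardless of how many candidates there are. This is a minimal sketch assuming path probabilities come from a softmax over the architecture parameters; the candidate ops are scalar stand-ins, all names are hypothetical, and the gradient-based and RL update rules themselves are not shown.

```python
import math
import random


def softmax(alphas):
    """Numerically stable softmax over a list of architecture parameters."""
    m = max(alphas)
    exps = [math.exp(a - m) for a in alphas]
    total = sum(exps)
    return [e / total for e in exps]


def sample_binary_gates(alphas, rng):
    """Sample a one-hot gate vector: exactly one candidate path is active."""
    probs = softmax(alphas)
    active = rng.choices(range(len(alphas)), weights=probs, k=1)[0]
    return [1 if j == active else 0 for j in range(len(alphas))]


def mixed_op(x, candidate_ops, gates):
    """Evaluate only the active path, so a single output is computed and stored."""
    for op, gate in zip(candidate_ops, gates):
        if gate:
            return op(x)
    raise ValueError("no active path")


rng = random.Random(0)
ops = [lambda x: 3 * x, lambda x: 5 * x, lambda x: x]  # stand-ins for conv 3x3, conv 5x5, identity
gates = sample_binary_gates([0.5, 1.0, -0.2], rng)
print(gates, mixed_op(2.0, ops, gates))
```

Evaluating all N candidates and summing their outputs would instead keep N activations alive at once, which is the O(N) footprint the binarization avoids.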
10% FLOPs difference 60% latency difference
Latency lookup table (LUT), Op / Lat.: 4ms, 3ms, 7ms, …, 1ms
[Diagram: Learnable Block i (path probabilities P_α, P_β, P_σ, P_ε), Arch Info, Latency Lookup Table (LUT), Estimated Latency]
Query the latency from the lookup table.
\[
\mathbb{E}[\mathrm{LAT}_i] = P_\alpha \times F(\text{conv 3x3}) + P_\beta \times F(\text{conv 5x5}) + P_\sigma \times F(\text{identity}) + \dots + P_\zeta \times F(\text{pool 3x3})
\]
\[
\mathrm{Loss} = \mathrm{Loss}_{CE} + \lambda_1 \lVert w \rVert_2^2 + \lambda_2 \, \mathbb{E}[\mathrm{LAT}]
\]
\[
\mathbb{E}[\mathrm{LAT}] = \sum_{i}^{N} \mathbb{E}[\mathrm{LAT}_i]
\]
Gradient Based / Reinforce Based
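The latency term can be sketched numerically: each block's expected latency is the probability-weighted sum of lookup-table entries, and the per-block values are summed into the regularizer. This is a minimal sketch, not the authors' implementation. It assumes the path probabilities are a softmax over architecture parameters, and the pairing of ops with latencies in `lut`, the function names, and the λ values are hypothetical.

```python
import math


def softmax(alphas):
    """Numerically stable softmax over a list of architecture parameters."""
    m = max(alphas)
    exps = [math.exp(a - m) for a in alphas]
    total = sum(exps)
    return [e / total for e in exps]


def expected_block_latency(alphas, op_names, lut):
    """E[LAT_i] = sum_j P_j * F(o_j), with F read from the lookup table."""
    probs = softmax(alphas)
    return sum(p * lut[name] for p, name in zip(probs, op_names))


def total_loss(loss_ce, weights, block_alphas, op_names, lut, lambda1, lambda2):
    """Loss = Loss_CE + lambda1 * ||w||_2^2 + lambda2 * E[LAT]."""
    l2 = sum(w * w for w in weights)
    exp_lat = sum(expected_block_latency(a, op_names, lut) for a in block_alphas)
    return loss_ce + lambda1 * l2 + lambda2 * exp_lat


# Hypothetical lookup table in milliseconds; the op-to-latency pairing is made up.
lut = {"conv 3x3": 4.0, "conv 5x5": 7.0, "identity": 1.0, "pool 3x3": 3.0}
op_names = list(lut)
uniform = [0.0, 0.0, 0.0, 0.0]  # equal architecture parameters -> equal probabilities

print(expected_block_latency(uniform, op_names, lut))  # 3.75
print(total_loss(1.0, [1.0, 2.0], [uniform, uniform], op_names, lut, 0.1, 0.01))
```

Since the expectation is a smooth function of the architecture parameters, the latency term can be differentiated alongside the cross-entropy loss, which is what makes latency usable as a training objective here.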
When targeting the GPU platform, accuracy further improves to 75.1%, which is 3.1% higher than MobilenetV2.
the current industry standard.
Group              Model                    Top-1 (%)  Latency  Hardware Aware  No Proxy  No Repeat  Search Cost (GPU hours)
Manually Designed  MobilenetV1              70.6       113ms    -               -         -          -
Manually Designed  MobilenetV2              72.0       75ms     -               -         -          -
NAS                NASNet-A                 74.0       183ms    x               x         x          48000
NAS                AmoebaNet-A              74.4       190ms    x               x         x          75600
NAS                MNasNet                  74.0       76ms     yes             x         x          40000
ProxylessNAS       ProxylessNAS-G           71.8       83ms     yes             yes       yes        200
ProxylessNAS       ProxylessNAS-G + LL      74.2       79ms     yes             yes       yes        200
ProxylessNAS       ProxylessNAS-R           74.6       78ms     yes             yes       yes        200
ProxylessNAS       ProxylessNAS-R + MIXUP   75.1       78ms     yes             yes       yes        200

(under mobile latency constraint ≤ 80ms) with 200× less search cost in GPU hours. “LL” indicates latency regularization loss.
(1) The history of finding an efficient mobile model
(2) The history of finding an efficient CPU model
(3) The history of finding an efficient GPU model
https://hanlab.mit.edu/files/proxylessNAS/visualization.mp4
[Figure: three layer-by-layer architecture diagrams. Each starts from a 3x224x224 input and a Conv 3x3 stem, continues with a sequence of MB1, MB3 and MB6 blocks using 3x3, 5x5 or 7x7 kernels (feature maps shrinking from 112x112 to 7x7), and ends with Pooling and FC.]
(1) Efficient mobile architecture found by Proxyless NAS.
(2) Efficient CPU architecture found by Proxyless NAS.
(3) Efficient GPU architecture found by Proxyless NAS.
AMC: AutoML for Model Compression (He et al., ECCV’18)
Proxyless Neural Architecture Search (Cai et al., ICLR’19)
HAQ: Hardware-aware Automated Quantization (Wang et al., CVPR’19, oral)
Hardware-Centric AutoML: machine learning expert, hardware expert, non-expert
[Figure: HAQ overview. Labels: Policy, Quantized Model, Hardware Mapping, Hardware Accelerator, Feedback. Example bit widths (weight / activation): Layer 3: 3bit / 5bit, Layer 4: 6bit / 7bit, Layer 5: 4bit / 6bit, Layer 6: 5bit / 6bit. Accelerators shown as arrays of processing elements (PE): BitFusion (On the Edge) and BISMO (On the Edge), the latter processing from Cycle 0 (MSB) to Cycle T (LSB).]
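A per-layer bit-width policy like the one in the figure can be illustrated with a generic quantizer. This is a minimal sketch that uses plain uniform min-max quantization as a stand-in; it is not claimed to be the quantization scheme HAQ uses. The layer names in `policy` and the function name are hypothetical, while the bit widths are the ones listed in the figure.

```python
def quantize(values, num_bits):
    """Uniformly quantize values onto 2**num_bits levels spanning [min, max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return list(values)  # a constant tensor needs no quantization
    levels = 2 ** num_bits - 1
    scale = (hi - lo) / levels
    return [lo + round((v - lo) / scale) * scale for v in values]


# (weight bits, activation bits) per layer, as listed in the figure.
policy = {"layer3": (3, 5), "layer4": (6, 7), "layer5": (4, 6), "layer6": (5, 6)}

weights = [i / 100 for i in range(-50, 51)]  # made-up weights for illustration
for name, (w_bits, a_bits) in policy.items():
    q = quantize(weights, w_bits)
    print(name, w_bits, "bit weights ->", len(set(q)), "distinct values")
```

Fewer bits leave fewer distinct values per layer, which is the accuracy versus hardware-cost trade-off that a mixed-precision policy tunes layer by layer.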
# https://github.com/MIT-HAN-LAB/ProxylessNAS
from proxyless_nas import *

# Each call loads the pretrained model specialized for the named platform.
net = proxyless_cpu(pretrained=True)
net = proxyless_gpu(pretrained=True)
net = proxyless_mobile(pretrained=True)