

SLIDE 1

Deep Neural Network Pruning for Efficient Edge Computing in IoT

Rih-Teng Wu (1), Ankush Singla (2), Mohammad R. Jahanshahi (3), Elisa Bertino (4)

(1) Ph.D. Student, Lyles School of Civil Engineering, Purdue University
(2) Ph.D. Student, Department of Computer Science, Purdue University
(3) Assistant Professor, Lyles School of Civil Engineering, Purdue University
(4) Professor, Department of Computer Science, Purdue University

March 20th, 2019

SLIDE 2

Motivation – Internet of Things

Figure adapted from: https://www.axis.com/blog/secure-insights/internet-of-things-reshaping-security/ (source: https://tinyurl.com/yagpsakm)

SLIDE 3

Motivation – Current Inspection Practice in SHM (Structural Health Monitoring)

SLIDE 4

Motivation – Deep Neural Networks

Deep Convolutional Neural Network for SHM

- Specialized architecture? Needs a lot of data.
- Transfer learning? Not efficient for edge computing.

SLIDE 5

Network Pruning – Inspiration from Biology

Figure adapted from Hong et al. (2013), "Decreased Functional Brain Connectivity in Adolescents with Internet Addiction."

SLIDE 6

Existing Pruning Algorithms

- Magnitudes of filter weights
- Magnitudes of activation values
- Mutual information between activations and predictions
- Regularization-based approaches
- Taylor-expansion-based approach

Molchanov et al. (2017), "Pruning Convolutional Neural Networks for Resource Efficient Inference", arXiv:1611.06440v2.

SLIDE 7

Network Pruning with Filter Importance Ranking

Pruning loop: original network → evaluate the importance of neurons/filters → remove the least important neurons/filters → fine-tune → if more pruning is needed, repeat; otherwise stop pruning.

The least important filters are identified with the Taylor-expansion criterion (Molchanov et al., 2017).
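A minimal PyTorch sketch of that ranking step (the helper name and the toy layer below are illustrative, not the authors' code): the Taylor criterion scores each filter by |activation × ∂loss/∂activation|, averaged over the batch and spatial dimensions.

```python
import torch
import torch.nn as nn

def taylor_filter_scores(activation: torch.Tensor) -> torch.Tensor:
    """Taylor-expansion importance (Molchanov et al., 2017):
    |activation * d(loss)/d(activation)|, averaged over the batch
    and spatial dims, giving one score per filter (channel)."""
    return (activation * activation.grad).abs().mean(dim=(0, 2, 3))

# Toy example on a single convolution layer
conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
x = torch.randn(4, 3, 32, 32)

out = conv(x)               # activation of shape (N, C, H, W)
out.retain_grad()           # keep the gradient of this activation
loss = out.pow(2).mean()    # stand-in for the task loss
loss.backward()

scores = taylor_filter_scores(out)                  # shape: (8,)
print("least important filter:", scores.argmin().item())
```

In the paper the per-layer scores are additionally L2-normalized so that rankings are comparable across layers.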

SLIDE 8

Crack and Corrosion Datasets

- Non-crack: 25,313 training / 4,467 testing patches
- Crack: 25,048 training / 4,420 testing patches
- Non-corrosion: 29,026 training / 5,122 testing patches
- Corrosion: 28,083 training / 4,956 testing patches

SLIDE 9

Computing Units

[Figure panels: server device (TITAN X) and edge device (TX2)]

SLIDE 10

Result – Transfer Learning without Pruning

- VGG16 (Simonyan and Zisserman, 2014)

*Inference time: the total time required to classify 3,720 image patches of size 224×224.

Simonyan and Zisserman (2014), “Very Deep Convolutional Networks for Large-Scale Image Recognition”, arXiv:1409.1556v6.
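For reference, a minimal transfer-learning setup of this kind with PyTorch/torchvision (a sketch under assumed settings, not necessarily the authors' exact configuration): load ImageNet-pretrained VGG16, freeze the convolutional features, and re-target the classifier to the two classes (e.g., crack vs. non-crack).

```python
import torch.nn as nn
from torchvision import models

# ImageNet-pretrained VGG16 adapted to a 2-class task
model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)

for p in model.features.parameters():
    p.requires_grad = False               # freeze convolutional features

model.classifier[6] = nn.Linear(4096, 2)  # new 2-way output head
```

Only the classifier parameters are then updated during fine-tuning, which suits the limited labeled data typically available in SHM.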

SLIDE 11

Result – VGG16 with Pruning

- Pruning is conducted on the server device.
- Accuracy remains decent after pruning followed by fine-tuning.
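Because this approach removes whole filters, each pruning step is literal surgery on the layer. A minimal sketch for one Conv2d (hypothetical helper; a complete implementation must also shrink the next layer's input channels and any affected batch-norm parameters):

```python
import torch
import torch.nn as nn

def remove_filter(conv: nn.Conv2d, idx: int) -> nn.Conv2d:
    """Return a copy of `conv` with output filter `idx` removed."""
    keep = [i for i in range(conv.out_channels) if i != idx]
    new = nn.Conv2d(conv.in_channels, conv.out_channels - 1,
                    conv.kernel_size, conv.stride, conv.padding,
                    bias=conv.bias is not None)
    with torch.no_grad():
        new.weight.copy_(conv.weight[keep])   # keep surviving filters
        if conv.bias is not None:
            new.bias.copy_(conv.bias[keep])
    return new

conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
print(remove_filter(conv, idx=5))  # Conv2d(3, 7, kernel_size=(3, 3), ...)
```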

[Figure panels: Crack, Corrosion]

SLIDE 12

Distribution of Pruned Convolution Kernels

- Early layers are pruned less, indicating the importance of low-level features.
- Similar numbers of pruned kernels are observed in the layers between consecutive pooling layers.
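The per-layer counts behind this kind of plot can be tallied directly from the model; a sketch on an unpruned torchvision VGG16 (pooling layers mark the block boundaries referred to above):

```python
import torch.nn as nn
from torchvision import models

model = models.vgg16()
# Tally filters per conv layer; pooling layers mark block boundaries
for i, layer in enumerate(model.features):
    if isinstance(layer, nn.Conv2d):
        print(f"conv at index {i:2d}: {layer.out_channels} filters")
    elif isinstance(layer, nn.MaxPool2d):
        print(f"pool at index {i:2d}: block boundary")
```

Comparing these counts before and after pruning yields the distribution shown on this slide.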

SLIDE 13

Sensitivity Analysis – Number of Fine-tuning Epochs

- The accuracy is not sensitive to the number of fine-tuning epochs used in each pruning iteration.

[Figure panels: Crack, Corrosion]

SLIDE 14

Sensitivity Analysis – Number of Fine-tuning Epochs


[Figure panels: Crack, Corrosion]

SLIDE 15

Pruning Time Required on the Server

- With only 1 fine-tuning epoch per pruning iteration, the total pruning time drops to 1.5 hr, approximately 4.6 times faster than with 10 fine-tuning epochs.

[Figure panels: Crack, Corrosion]

SLIDE 16

Result – ResNet18 (He et al., 2015) with Pruning

SLIDE 17

Result – ResNet18 (He et al., 2015) with Pruning

- Pruning is conducted on the server device.
- Accuracy remains decent after pruning followed by fine-tuning.
- Pruning is sensitive to the network configuration; residual architectures such as ResNet18 constrain filter removal, since channel dimensions must still match across each skip connection.

[Figure panels: Crack, Corrosion]

SLIDE 18

Inference Time Required for Pruned VGG16

*Inference time: the total time required to classify 3,720 image patches of size 224×224.

- Server (TITAN X): 13.1 s reduced to 4.0 s for crack data; 13.2 s reduced to 3.7 s for corrosion data (reduction factor: ~3.5).
- Edge (TX2): 279.7 s reduced to 31.6 s for crack data; 275.7 s reduced to 30.6 s for corrosion data (reduction factor: ~9).
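Timings like these are easy to get wrong on a GPU because kernel launches are asynchronous; a minimal measurement sketch (the model and batch size below are illustrative):

```python
import time
import torch
from torchvision import models

device = "cuda" if torch.cuda.is_available() else "cpu"
model = models.vgg16().to(device).eval()
x = torch.randn(32, 3, 224, 224, device=device)  # batch of patches

with torch.no_grad():
    model(x)                       # warm-up pass
    if device == "cuda":
        torch.cuda.synchronize()   # flush queued kernels first
    t0 = time.perf_counter()
    model(x)
    if device == "cuda":
        torch.cuda.synchronize()   # wait for the timed pass to finish
print(f"batch inference time: {time.perf_counter() - t0:.3f} s")
```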

[Figure panels: Crack, Corrosion]

SLIDE 19

Inference Time on Edge Device: VGG16 vs. ResNet18

*Inference time: the total time required to classify 3,720 image patches of size 224×224.

- Inference time:
  - VGG16: 279.7 s to 31.6 s (reduction factor: 8.9)
  - ResNet18: 36.8 s to 8.9 s (reduction factor: 4.1)
- Memory:
  - VGG16: 525 MB to 125 MB (80% reduction)
  - ResNet18: 44 MB to 2 MB (95% reduction)
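The memory numbers can be estimated from parameter counts alone; a sketch on the unpruned torchvision models (float32 parameters plus buffers, which lands close to the 525 MB and 44 MB figures above):

```python
import torch
from torchvision import models

def model_size_mb(model: torch.nn.Module) -> float:
    """Approximate model size: bytes of all parameters and buffers."""
    n = sum(p.numel() * p.element_size() for p in model.parameters())
    n += sum(b.numel() * b.element_size() for b in model.buffers())
    return n / 1024**2

print(f"VGG16:    {model_size_mb(models.vgg16()):.0f} MB")     # ~528 MB
print(f"ResNet18: {model_size_mb(models.resnet18()):.0f} MB")  # ~45 MB
```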

[Figure panels: Crack, Corrosion]

SLIDE 20

Five-fold Cross Validation Test on VGG16

- Mean accuracy of a five-fold cross-validation test, conducted on the server.
- Network fine-tuning is necessary to enhance the accuracy.
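A minimal five-fold split of this kind with scikit-learn (the dataset size is a placeholder; in each fold the network is fine-tuned/pruned on the training split and scored on the held-out split, and the five accuracies are averaged):

```python
import numpy as np
from sklearn.model_selection import KFold

indices = np.arange(10000)  # placeholder: one index per image patch

fold_accuracies = []
for train_idx, test_idx in KFold(n_splits=5, shuffle=True,
                                 random_state=0).split(indices):
    # fine-tune / prune on train_idx, evaluate on test_idx
    # (training loop omitted in this sketch)
    fold_accuracies.append(0.0)  # placeholder for the fold accuracy
print("mean accuracy:", np.mean(fold_accuracies))
```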

[Figure panels: Crack, Corrosion]

SLIDE 21

Five-fold Cross Validation Test on VGG16 (Cont.)

- The variance in the accuracy after fine-tuning is very small; however, when 97% of the filters are pruned, the variance increases and the accuracy after fine-tuning drops.
- Pruning is stopped once the accuracy after fine-tuning drops by more than 3%.
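The stopping rule is simple enough to state in code (a sketch; `baseline_acc` stands for the accuracy of the unpruned, fine-tuned network):

```python
def should_stop_pruning(baseline_acc: float, current_acc: float,
                        max_drop: float = 0.03) -> bool:
    """Stop once fine-tuned accuracy falls more than 3% below baseline."""
    return (baseline_acc - current_acc) > max_drop

print(should_stop_pruning(0.98, 0.96))  # False: keep pruning
print(should_stop_pruning(0.98, 0.94))  # True: stop
```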

[Figure panels: Crack, Corrosion]

SLIDE 22

Five-fold Cross Validation Test on VGG16 (Cont.)


[Figure panels: Crack, Corrosion]

SLIDE 23

Five-fold Cross Validation Test on VGG16 (Cont.)


[Figure panels: Crack, Corrosion]

SLIDE 24

Five-fold Cross Validation Test on VGG16 (Cont.)


[Figure panels: Crack, Corrosion]

SLIDE 25

Summary

- Network pruning combined with transfer learning can achieve efficient inference when training data and computing power are limited.
- With network pruning, inference on the edge device is nine and four times faster than with the original VGG16 and ResNet18, respectively. The network size is reduced by 80% for VGG16 and by 95% for ResNet18.
- Different network configurations exhibit different behaviors with respect to pruning.
- Sensitivity analysis shows that pruning can be achieved with a smaller number of fine-tuning epochs without losing detection performance.
- The computation gain on the edge device is more prominent than the gain on the server device.

SLIDE 26

Thank you
