Reduce Number of Ops and Weights
- Exploit Activation Statistics
- Network Pruning
- Compact Network Architectures
- Knowledge Distillation
Sparsity in Fmaps
- Many zeros in output fmaps after ReLU

Input fmap      After ReLU
 9 -1 -3         9  0  0
 1 -5  5         1  0  5
-2  6 -1         0  6  0
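To make the sparsity claim concrete, here is a minimal Python sketch that applies ReLU to the 3x3 fmap from the slide and measures the fraction of zeros in the result. The helper names `relu` and `sparsity` are illustrative only and do not come from the slides.

```python
# 3x3 input fmap taken from the ReLU example on the slide.
fmap = [[ 9, -1, -3],
        [ 1, -5,  5],
        [-2,  6, -1]]

def relu(fmap):
    """Clamp every negative activation to zero."""
    return [[max(v, 0) for v in row] for row in fmap]

def sparsity(fmap):
    """Fraction of entries that are exactly zero."""
    values = [v for row in fmap for v in row]
    return sum(1 for v in values if v == 0) / len(values)

out = relu(fmap)
print(out)            # [[9, 0, 0], [1, 0, 5], [0, 6, 0]]
print(sparsity(out))  # 5 of the 9 entries are zero
```

In this small example the input has no zeros at all, while more than half of the output entries are zero, which is the property that compression and zero-skipping hardware can exploit.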
Run  Level  Run  Level  Run  Level  Term
2    12     4    53     2    22     0
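The run/level fields above can be read as run-length coding of a sparse activation stream: each "run" counts the zeros that precede a nonzero "level". The sketch below is one plausible reading, not the slide's exact format. The input stream `acts` is a hypothetical sequence chosen to reproduce the three pairs, the Term field is omitted because the excerpt does not define it, and the treatment of trailing zeros is an assumption of this sketch.

```python
def rle_encode(values):
    """Encode a sequence as (run, level) pairs.

    run   = number of zeros preceding a nonzero value
    level = that nonzero value
    Trailing zeros are not encoded (an assumption of this sketch), so the
    decoder needs the original length to restore them.
    """
    pairs = []
    run = 0
    for v in values:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    return pairs

def rle_decode(pairs, length):
    """Rebuild the original sequence from (run, level) pairs."""
    out = []
    for run, level in pairs:
        out.extend([0] * run)   # restore the skipped zeros
        out.append(level)       # then the nonzero value
    out.extend([0] * (length - len(out)))  # pad trailing zeros
    return out

# Hypothetical activation stream chosen to reproduce the pairs above.
acts = [0, 0, 12, 0, 0, 0, 0, 53, 0, 0, 22]
pairs = rle_encode(acts)
print(pairs)  # [(2, 12), (4, 53), (2, 22)]
```

Eleven activations are stored as three pairs here; the denser the zeros, the larger the saving.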
1.74x
- Input: DNN configuration file
- Output: DNN energy breakdown across layers
[Moons et al., VLSI 2016; Han et al., ICLR 2016]
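Sparse matrix-vector product with input activations $\vec{a} = (\ldots,\ a_3)$ and output $\vec{b}$, computed on PE0, PE1, PE2, PE3:

$$
\begin{pmatrix}
w_{0,0} & w_{0,1} & 0 & w_{0,3} \\
0 & 0 & w_{1,2} & 0 \\
0 & w_{2,1} & 0 & w_{2,3} \\
0 & 0 & 0 & 0 \\
0 & 0 & w_{4,2} & w_{4,3} \\
w_{5,0} & 0 & 0 & 0 \\
0 & 0 & 0 & w_{6,3} \\
0 & w_{7,1} & 0 & 0
\end{pmatrix}
\vec{a}
=
\begin{pmatrix}
b_0 \\ b_1 \\ -b_2 \\ b_3 \\ -b_4 \\ b_5 \\ b_6 \\ -b_7
\end{pmatrix}
\;\overset{\text{ReLU}}{\Rightarrow}\;
\begin{pmatrix}
b_0 \\ b_1 \\ 0 \\ b_3 \\ 0 \\ b_5 \\ b_6 \\ 0
\end{pmatrix}
$$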
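A sparse matrix-vector product followed by ReLU can skip work in two places: columns whose input activation is zero, and weights that were pruned to zero. The sketch below stores only the nonzero weights, column by column, using the same sparsity pattern as the matrix above. The numeric weight values, the activation vector `a`, and the function name `sparse_matvec_relu` are all hypothetical, and the code makes no attempt to model how the work is split across processing elements.

```python
# Nonzero weights stored per column as (row, value) pairs, following the
# sparsity pattern of the matrix above. The numeric values are made up.
W_cols = {
    0: [(0, 1.0), (5, -2.0)],                      # w0,0  w5,0
    1: [(0, 0.5), (2, -1.0), (7, -3.0)],           # w0,1  w2,1  w7,1
    2: [(1, 2.0), (4, -1.5)],                      # w1,2  w4,2
    3: [(0, -0.5), (2, 1.0), (4, 0.5), (6, 2.0)],  # w0,3  w2,3  w4,3  w6,3
}

def sparse_matvec_relu(w_cols, a, num_rows):
    """Compute ReLU(W a), touching only nonzero weights and activations.

    Returns the output vector and the number of multiply-accumulates done.
    """
    b = [0.0] * num_rows
    macs = 0
    for j, a_j in enumerate(a):
        if a_j == 0:
            continue  # zero activation: skip the whole column
        for i, w_ij in w_cols.get(j, []):
            b[i] += w_ij * a_j
            macs += 1
    return [max(v, 0.0) for v in b], macs

a = [0.0, 2.0, 0.0, 4.0]  # hypothetical activations, two of them zero
out, macs = sparse_matvec_relu(W_cols, a, num_rows=8)
print(out)   # [0.0, 0.0, 2.0, 0.0, 2.0, 0.0, 8.0, 0.0]
print(macs)  # 7, versus 32 for a dense 8x4 product
```

With these made-up values, only 7 of the 32 possible multiply-accumulates are performed, and ReLU then zeroes the negative outputs, so the next layer again sees a sparse input.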
Scatter network
- 5x5 filter: decompose into two 3x3 filters, apply sequentially
- 5x5 filter: decompose into a 5x1 filter and a 1x5 filter, apply sequentially
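Both decompositions can be checked numerically. The sketch below, in plain Python, shows that applying two small filters in sequence produces the same output as one 5x5 filter, provided there is no nonlinearity between the two stages. The helpers `correlate2d` and `compose`, the test image, and all kernel values are illustrative assumptions. Note that the reverse direction is not guaranteed: an arbitrary 5x5 filter can generally only be approximated by such a pair, and the 5x1/1x5 form is exact only for rank-1 filters.

```python
def correlate2d(x, k):
    """'Valid' 2D cross-correlation of image x with kernel k (lists of lists)."""
    kh, kw = len(k), len(k[0])
    oh, ow = len(x) - kh + 1, len(x[0]) - kw + 1
    return [[sum(x[i + u][j + v] * k[u][v]
                 for u in range(kh) for v in range(kw))
             for j in range(ow)]
            for i in range(oh)]

def compose(k1, k2):
    """Single kernel equivalent to applying k1 and then k2 (no nonlinearity)."""
    h = len(k1) + len(k2) - 1
    w = len(k1[0]) + len(k2[0]) - 1
    out = [[0] * w for _ in range(h)]
    for a, row1 in enumerate(k1):
        for b, w1 in enumerate(row1):
            for c, row2 in enumerate(k2):
                for d, w2 in enumerate(row2):
                    out[a + c][b + d] += w1 * w2
    return out

# Deterministic 7x7 test image with arbitrary integer values.
x = [[(3 * i + 5 * j) % 7 for j in range(7)] for i in range(7)]

# Case 1: two 3x3 filters (18 weights) versus one 5x5 filter (25 weights).
k3a = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
k3b = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]
k5 = compose(k3a, k3b)
two_step = correlate2d(correlate2d(x, k3a), k3b)
one_step = correlate2d(x, k5)
print(two_step == one_step)  # True

# Case 2: a 5x1 then a 1x5 filter (10 weights) versus their 5x5 outer product.
col = [[1], [4], [6], [4], [1]]
row = [[1, 4, 6, 4, 1]]
k5_sep = compose(col, row)
sep_two_step = correlate2d(correlate2d(x, col), row)
sep_one_step = correlate2d(x, k5_sep)
print(sep_two_step == sep_one_step)  # True
```

The weight counts explain the appeal: 25 weights for the 5x5 filter, 18 for the pair of 3x3 filters, and 10 for the 5x1 plus 1x5 pair, all covering the same 5x5 receptive field.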
compress expand compress
Original Approx.
[Kim et al., ICLR 2016]