Reza Yazdani Marc Riera Jose-Maria Arnau Antonio González
The Dark Side of DNN Pruning
45th International Symposium on Computer Architecture, Los Angeles, US, June 2018
DNN Pruning: efficient reduction of DNN size
✔ Higher performance
✔ Significant energy-saving
✔ Ultra-low power
✔ Lower area
The Dark Side of DNN Pruning, Session 9A, Wednesday June 6th, ISCA'18 2
– Speech network of acoustic modeling
[Figure: probability per output class, baseline vs. pruned model]
– Automatic Speech Recognition (ASR)
– Machine Translation
[Figure: normalized decoding time (%) for the DNN and Viterbi stages, and word error rate (WER, %)]
[Figure: average confidence]
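The summary slide describes confidence loss as "top-1's low likelihood". A minimal sketch of that reading, assuming confidence is the probability the DNN assigns to its top-1 class, averaged over frames. The function name and the probability vectors below are made up for illustration and are not taken from the talk.

```python
def average_confidence(frames):
    """Mean top-1 probability over a list of per-frame probability vectors."""
    return sum(max(probs) for probs in frames) / len(frames)

# Hypothetical per-frame class probabilities (illustrative values only).
baseline_frames = [[0.90, 0.05, 0.05], [0.80, 0.15, 0.05]]
pruned_frames = [[0.60, 0.25, 0.15], [0.50, 0.30, 0.20]]

baseline_conf = average_confidence(baseline_frames)
pruned_conf = average_confidence(pruned_frames)
print(baseline_conf, pruned_conf)
```

In this toy input both models pick the same top-1 class in every frame, so accuracy is unchanged while the average confidence drops.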
– DNN: computes probabilities of different phonemes at each frame
[Figure: Frame i is fed to the DNN (hidden layers), which produces a DNN score for each output class]
– Viterbi search: explores the weighted finite-state transducer (WFST) based on DNN scores
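One frame of such a search can be sketched as token passing with beam pruning. This is a simplified illustration, not the decoder from the talk: the toy graph `ARCS`, the function `viterbi_step`, and all numeric values are assumptions, and costs are taken to be negative log-likelihoods.

```python
import math

# Hypothetical toy graph: state -> list of (next_state, phoneme_id, arc_cost).
ARCS = {
    "S1": [("S1", 0, 0.5), ("S2", 1, 1.0)],
    "S2": [("S2", 1, 0.5), ("S3", 2, 1.0)],
    "S3": [("S3", 2, 0.5)],
}

def viterbi_step(tokens, dnn_probs, beam):
    """Advance all active tokens by one frame, then apply beam pruning.

    tokens: dict mapping state -> best accumulated cost so far.
    dnn_probs: per-phoneme probabilities produced by the DNN for this frame.
    beam: maximum allowed cost distance from the best token.
    """
    new_tokens = {}
    for state, cost in tokens.items():
        for nxt, phoneme, arc_cost in ARCS.get(state, []):
            # Costs are negative log-likelihoods, so lower is better.
            cand = cost + arc_cost - math.log(dnn_probs[phoneme])
            if cand < new_tokens.get(nxt, math.inf):
                new_tokens[nxt] = cand
    if not new_tokens:
        return {}
    best = min(new_tokens.values())
    return {s: c for s, c in new_tokens.items() if c <= best + beam}

# A confident frame versus a flatter one (illustrative values only).
confident = viterbi_step({"S1": 0.0}, [0.05, 0.90, 0.05], beam=2.0)
flat = viterbi_step({"S1": 0.0}, [0.30, 0.40, 0.30], beam=2.0)
print(sorted(confident), sorted(flat))
```

With the same beam, the flatter score distribution leaves more tokens alive after pruning, which is one way to picture why lower DNN confidence can mean more search work.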
[Figure: Viterbi search across Frames 0, 1 and 2 over states S1 to S4, using the DNN scores of Frame 2]
[Figure: states st0 and st1 with likelihoods 0.00015, 0.31, 0.0014, 0.0002, 0.00005]
Hash bottlenecks:
– Collision handling
– Overflows
– Access delay
– Known as Histogram Pruning
– Sorting tokens at every frame
– Expensive: O(m*log(m)) for m hypotheses
– Loosely keeping N-best using hash mechanism
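The contrast between exact and loose N-best selection can be sketched as follows. This is an illustration under assumptions, not the accelerator's design: the modulo hash, the set and way counts, and the hypothesis list are all hypothetical, and each state is assumed to appear only once per frame.

```python
def exact_n_best(hyps, n):
    """Sort all hypotheses by cost and keep the n cheapest: O(m log m)."""
    return sorted(hyps, key=lambda h: h[1])[:n]

def loose_n_best(hyps, num_sets, ways):
    """Keep at most num_sets * ways hypotheses without a global sort."""
    sets = [[] for _ in range(num_sets)]
    for state, cost in hyps:
        bucket = sets[state % num_sets]  # hypothetical hash: state id modulo set count
        if len(bucket) < ways:
            bucket.append((state, cost))
            continue
        # Set is full: find its max-cost entry and replace it if the newcomer is cheaper.
        worst = max(range(ways), key=lambda i: bucket[i][1])
        if cost < bucket[worst][1]:
            bucket[worst] = (state, cost)
    return [h for bucket in sets for h in bucket]

# (state_id, cost) pairs, illustrative values only.
hyps = [(0, 5.0), (1, 1.0), (2, 4.0), (3, 2.0),
        (4, 3.0), (5, 9.0), (6, 0.5), (7, 8.0)]
exact = exact_n_best(hyps, 4)
loose = loose_n_best(hyps, num_sets=2, ways=2)
print(exact)
print(loose)
```

The loose variant only compares hypotheses that hash to the same set, so its result can differ from the exact N-best when several good hypotheses collide in one set.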
– Replace when set is full
– Finding hypothesis with max cost
– Store index of each set based on max-heap
– Replace with the root of the tree
– Updating max-heap fits in one cycle
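A software analogue of the max-heap replacement policy for a single set might look like the sketch below. The class `BoundedSet` and its methods are hypothetical names introduced here; the one-cycle update is a property of the hardware described in the talk and is not modeled.

```python
import heapq

class BoundedSet:
    """One fixed-capacity set that evicts its max-cost hypothesis when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        # heapq is a min-heap, so costs are stored negated to get max-heap behavior.
        self._heap = []

    def insert(self, state, cost):
        """Try to store a hypothesis; return True if it was kept."""
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, (-cost, state))
            return True
        worst_cost = -self._heap[0][0]  # the root holds the current max cost
        if cost < worst_cost:
            heapq.heapreplace(self._heap, (-cost, state))  # replace the root
            return True
        return False

    def costs(self):
        """Return the stored costs in ascending order."""
        return sorted(-neg for neg, _ in self._heap)

s = BoundedSet(capacity=3)
# (state_id, cost) pairs, illustrative values only.
incoming = [(10, 4.0), (11, 2.0), (12, 6.0), (13, 5.0), (14, 7.0)]
results = [s.insert(state, cost) for state, cost in incoming]
print(results, s.costs())
```

Keeping the worst entry at the root means the replacement candidate is found without scanning the set; only the heap update remains after each replacement.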
– Verilog implementation of logic parts
– Synthesized with Design Compiler
– CACTI: cache and memory components
– Micron: main memory
– Decoding time
– Decoding power and energy consumption
– Accelerator's area usage
– Baseline: Unfold's design
– Beam: reduce beam without changing baseline
– N-Best: our proposal
– Non-pruned version
– Pruned version: 70%, 80% and 90% pruning
– Confidence loss: top-1's low likelihood
– 20% confidence loss, 33% slowdown
– Resilient to DNN pruning
– Less search activity while maintaining accuracy
– 9x energy-saving, 4.5x speedup, 2x area reduction