The Dark Side of DNN Pruning

Reza Yazdani, Marc Riera, Jose-Maria Arnau, Antonio González

45th International Symposium on Computer Architecture, Los Angeles, US, June 2018


slide-1
SLIDE 1

Reza Yazdani Marc Riera Jose-Maria Arnau Antonio González

The Dark Side of DNN Pruning

45th International Symposium on Computer Architecture, Los Angeles, US, June 2018

slide-2
SLIDE 2

DNN Pruning

  • Efficient reduction of DNN size

✔ Higher performance
✔ Significant energy-saving
✔ Ultra-low power
✔ Lower area

The Dark Side of DNN Pruning, Session 9A, Wednesday June 6th, ISCA'18 2

slide-3
SLIDE 3

Side-Effect of DNN Pruning

  • Lack of confidence in DNN classification

– Speech network of acoustic modeling

[Figure: output class probability (0 to 1), Baseline vs. Pruned Model]

slide-4
SLIDE 4

Confidence Issue

  • DNN-dependent applications

– Automatic Speech Recognition (ASR)
– Machine Translation

  • Example: ASR evaluation for pruned DNN

[Figure: normalized decoding time (%), split into DNN and Viterbi, with word error rate (WER, %)]

slide-5
SLIDE 5

Outline

  • Motivation
  • DNN pruning & Confidence loss
  • ASR using pruned DNN
  • Accelerator's baseline
  • Efficient design with DNN pruning
  • Experimental results
  • Conclusions
slide-6
SLIDE 6

DNN Pruning: Accuracy

  • Maintaining top-5 accuracy

[Figure: Top-1 and Top-5 accuracy, 0% to 100%]

slide-7
SLIDE 7

Loss of Confidence

  • The higher the DNN pruning rate, the lower the classification probability

[Figure: average confidence, axis from 0.5 to 0.7]
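A minimal sketch of how an "average confidence" figure could be measured, assuming it means the mean top-1 softmax probability over a set of frames (the slides do not define it). The logits below are made up for illustration: the "pruned" rows keep the same top-1 class as the baseline rows but with a flatter distribution.

```python
import math

def softmax(logits):
    """Convert raw output-layer scores into class probabilities."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def average_confidence(batch_logits):
    """Mean top-1 probability over a batch of frames (assumed definition)."""
    top1 = [max(softmax(logits)) for logits in batch_logits]
    return sum(top1) / len(top1)

# Hypothetical logits, not taken from the talk.
baseline_logits = [[4.0, 1.0, 0.5], [3.5, 0.2, 0.1]]
pruned_logits = [[2.0, 1.0, 0.5], [1.8, 0.2, 0.1]]

baseline_conf = average_confidence(baseline_logits)
pruned_conf = average_confidence(pruned_logits)
print(f"baseline: {baseline_conf:.3f}, pruned: {pruned_conf:.3f}")
```

In this toy data the top-1 class is unchanged, so accuracy would be identical, yet the pruned rows report a lower average confidence.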

slide-9
SLIDE 9

ASR

  • ASR systems include two phases

– DNN: computes probabilities of different phonemes at each frame
– Viterbi search: explores WFST based on DNN scores

[Figure: Frame i feeds the DNN's hidden layers; output class DNN scores on a 0 to 1 axis]

[Figure: Viterbi search across Frame 0, Frame 1, and Frame 2 over states S1 to S4]
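The two phases can be sketched as a token-passing Viterbi search that consumes one set of DNN scores per frame. This is only an illustration under simplifying assumptions: the graph is a tiny hand-written state machine rather than a real WFST, and the state names, arc probabilities, and phoneme scores are hypothetical.

```python
import math

def viterbi(frame_scores, arcs, start_state):
    """Token-passing Viterbi search over a small graph.

    frame_scores: one dict per frame, mapping phoneme -> DNN probability.
    arcs: dict mapping state -> list of (next_state, phoneme, transition_prob).
    Returns (best_cost, best_path); cost is a negative log-likelihood.
    """
    tokens = {start_state: (0.0, [start_state])}  # state -> (cost, path)
    for scores in frame_scores:
        next_tokens = {}
        for state, (cost, path) in tokens.items():
            for next_state, phoneme, trans_prob in arcs.get(state, []):
                new_cost = cost - math.log(trans_prob) - math.log(scores[phoneme])
                # Keep only the cheapest token that reaches each state.
                if next_state not in next_tokens or new_cost < next_tokens[next_state][0]:
                    next_tokens[next_state] = (new_cost, path + [next_state])
        tokens = next_tokens
    return min(tokens.values(), key=lambda t: t[0])

# Hypothetical graph and per-frame DNN scores.
arcs = {
    "S1": [("S1", "a", 0.5), ("S2", "b", 0.5)],
    "S2": [("S2", "b", 0.5), ("S3", "c", 0.5)],
    "S3": [("S3", "c", 1.0)],
}
frame_scores = [
    {"a": 0.8, "b": 0.1, "c": 0.1},
    {"a": 0.1, "b": 0.8, "c": 0.1},
    {"a": 0.1, "b": 0.1, "c": 0.8},
]
best_cost, best_path = viterbi(frame_scores, arcs, "S1")
print(best_path)
```

Every arc expanded in the inner loop is one unit of search work, so the number of live tokens per frame drives the decoding time of this phase.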

slide-13
SLIDE 13

ASR Evaluation

  • Viterbi search under pruned DNN model

[Figure: DNN scores of Frame 2, shown with Frame 2 of the Viterbi search]

slide-15
SLIDE 15

Viterbi Workload

  • Increase in Viterbi's search activity
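One way to see why lower confidence can increase search activity is beam pruning, a common Viterbi heuristic that keeps every hypothesis whose cost is within a fixed beam of the best one. The sketch below is an assumption-laden illustration, not the talk's experiment: the score vectors and the beam width are invented, and a real decoder also adds transition costs.

```python
import math

def count_survivors(probabilities, beam):
    """Count hypotheses whose cost lies within `beam` of the best cost."""
    costs = [-math.log(p) for p in probabilities]
    best = min(costs)
    return sum(1 for c in costs if c <= best + beam)

# Hypothetical per-frame scores: one confident, one flattened.
confident_scores = [0.90, 0.04, 0.03, 0.02, 0.01]
flat_scores = [0.40, 0.20, 0.15, 0.15, 0.10]

BEAM = 1.5  # hypothetical beam width, in negative-log units
confident_survivors = count_survivors(confident_scores, BEAM)
flat_survivors = count_survivors(flat_scores, BEAM)
print(confident_survivors, flat_survivors)
```

With the confident scores only the best hypothesis stays inside the beam, while the flattened scores keep all five alive, so the next frame has more tokens to expand.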


slide-17
SLIDE 17

Hardware Baseline

  • UNFOLD: state-of-the-art Viterbi accelerator

[Figure: UNFOLD accelerator diagram with states st0 and st1 and example likelihoods 0.00015, 0.31, 0.0014, 0.0002, 0.00005]

slide-23
SLIDE 23

Hardware Baseline

  • UNFOLD: state-of-the-art Viterbi accelerator

Hash Bottlenecks

Collision handling

  • Backup buffer

Overflows

  • Overflow buffer

Access delay

  • Backup
  • Overflow

slide-25
SLIDE 25

Efficient Hash Design

  • Keeping the best N hypotheses at each frame

– Known as Histogram Pruning

  • Implementation issue

– Sorting tokens at every frame
– Expensive: O(m*log(m)) for m hypotheses

  • Our scheme

– Loosely keeping N-best using hash mechanism
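For reference, the exact N-best selection that the slide describes as expensive can be sketched as a full sort of the frame's tokens. The token list and the `(state, cost)` layout are hypothetical; lower cost is assumed to be better.

```python
def n_best_by_sorting(hypotheses, n):
    """Keep the n lowest-cost hypotheses by fully sorting all m of them."""
    ranked = sorted(hypotheses, key=lambda h: h[1])  # O(m log m) comparisons
    return ranked[:n]

# Hypothetical (state, cost) tokens for one frame.
frame_tokens = [("s0", 7.2), ("s1", 1.4), ("s2", 3.9), ("s3", 0.8), ("s4", 5.5)]
kept = n_best_by_sorting(frame_tokens, 3)
print(kept)
```

This sort would have to run once per frame, which is the cost the hash-based scheme tries to avoid by keeping the N-best only approximately.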

slide-28
SLIDE 28

Efficient Hash Design

  • Direct-mapped
  • Way-Associative

slide-30
SLIDE 30

Efficient Hash Design

  • Our scheme efficiency


slide-31
SLIDE 31

Efficient Hash Design

  • Way-associative main challenge

– Replace when set is full
– Finding hypothesis with max cost

  • Our solution

– Store index of each set based on max-heap
– Replace with the root of tree
– Updating max-heap fits in one cycle
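A rough software analogue of the replacement policy, offered as a sketch rather than the accelerator's actual design. Assumptions: states are integer IDs, the hash is a simple modulo, each set keeps its entries directly in a max-heap (the slide mentions storing indices instead), and the class and method names are invented here.

```python
import heapq

class NBestHashTable:
    """Set-associative token table that evicts the max-cost entry of a full set."""

    def __init__(self, num_sets, ways):
        self.num_sets = num_sets
        self.ways = ways
        # Each set is a heap of (-cost, state); the root is the max-cost entry.
        self.sets = [[] for _ in range(num_sets)]

    def insert(self, state, cost):
        """Insert a hypothesis; return True if it is stored, False if dropped."""
        entries = self.sets[state % self.num_sets]
        for i, (neg_cost, stored_state) in enumerate(entries):
            if stored_state == state:
                if cost < -neg_cost:  # better path to a state already present
                    entries[i] = (-cost, state)
                    heapq.heapify(entries)
                return True
        if len(entries) < self.ways:
            heapq.heappush(entries, (-cost, state))
            return True
        if cost < -entries[0][0]:  # cheaper than the worst entry in the set
            heapq.heapreplace(entries, (-cost, state))
            return True
        return False

    def tokens(self):
        """Return all stored (state, cost) pairs."""
        return [(s, -c) for entries in self.sets for c, s in entries]

table = NBestHashTable(num_sets=2, ways=2)
results = [
    table.insert(0, 5.0),  # set 0, stored
    table.insert(2, 3.0),  # set 0, stored (set now full)
    table.insert(4, 4.0),  # set 0, replaces state 0 (cost 5.0)
    table.insert(6, 9.0),  # set 0, worse than the root, dropped
    table.insert(1, 7.0),  # set 1, stored
]
print(sorted(table.tokens()))
```

Because eviction is decided per set, the table holds at most num_sets * ways hypotheses without a global sort, at the price of keeping the N-best only loosely: a good hypothesis can be dropped if its set happens to be crowded.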

slide-34
SLIDE 34

Evaluation Methodology

  • Cycle-accurate simulation of DNN and Viterbi
  • Model accelerator's components in hardware

– Verilog implementation of logic parts
– Synthesized by Design Compiler
– CACTI: cache and memory components
– Micron: main memory

  • Combine simulation results with hardware models

– Decoding time
– Decoding power and energy consumption
– Accelerator's area usage

slide-35
SLIDE 35

Accelerator's Parameters

  • DNN and Viterbi accelerators


slide-36
SLIDE 36

Experiment Configs

  • Viterbi Search:

– Baseline: UNFOLD's design
– Beam: reduce beam without changing baseline
– N-Best: our proposal

  • DNN:

– Non-pruned version
– Pruned version: 70%, 80% and 90% pruning

slide-37
SLIDE 37

Experimental Results

  • Decoding time
  • Energy consumption
  • Area usage: 10.74 mm² (2x reduction)

slide-41
SLIDE 41

Conclusions

  • Major side effect of DNN pruning

– Confidence loss: top-1's low likelihood

  • DNN pruning in ASR systems

– 20% confidence loss, 33% slowdown

  • Our solution: A novel Viterbi accelerator

– Resilient to DNN pruning

– Less search activity while maintaining accuracy

  • Compared to a state-of-the-art accelerated ASR system

– 9x energy-saving, 4.5x speedup, 2x area reduction


slide-42
SLIDE 42

Reza Yazdani Marc Riera Jose-Maria Arnau Antonio González

The Dark Side of DNN Pruning

45th International Symposium on Computer Architecture, Los Angeles, US, June 2018