A Systematic Methodology for Analysis of Deep Learning Hardware and Software Platforms
Yu (Emma) Wang, Gu-Yeon Wei, David Brooks (Harvard University)
3/3/2020 Contact: ywang03@g.harvard.edu
ParaDnn
github.com/Emma926/paradnn
Acknowledgement: Frank Chen, Glenn Holloway, Dan Janni, Peter ...
Area            Benchmark               Dataset        Model         Reference Implementation
Vision          Image classification    ImageNet       ResNet-50     TensorFlow
Vision          Object detection        COCO 2017      Mask R-CNN    PyTorch
Vision          Object detection        COCO 2017      SSD-ResNet34  PyTorch
Language/Audio  Translation             WMT Eng-Germ   Transformer   TensorFlow
Language/Audio  Speech recognition      WMT Eng-Germ   GNMT          PyTorch
Commerce        Recommendation          MovieLens-20M  NCF           PyTorch
Action          Reinforcement Learning  Go             Mini-go       TensorFlow
several arbitrary models
end-to-end models
models
models, i.e. MLPerf
convergence with real datasets
people care about
[Figure: parameterized model templates. Fully connected: Input, # of Layers, # of Nodes, Output. CNN: Input, # of Res/Bottleneck Blocks (filter size) x 4, FC Layer, Output. RNN: Input, # of Layers, RNN or LSTM or GRU cell (size), Output.]
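The figure labels above name the knobs of a parameterized fully connected model: # of layers, # of nodes, input and output. As a rough illustration of how such a template can be swept, the sketch below enumerates every combination of those knobs and counts the trainable parameters of each resulting model. The sweep ranges, the dictionary keys and the function names are assumptions made for this example; they are not taken from the ParaDnn repository.

```python
from itertools import product

# Hypothetical sweep ranges, chosen only for illustration.
# They are NOT the ranges used by ParaDnn.
SWEEP = {
    "num_layers": [4, 8, 16],
    "num_nodes": [32, 256, 1024],
    "input_size": [2000],
    "output_size": [1000],
}


def fc_param_count(num_layers, num_nodes, input_size, output_size):
    """Count weights and biases of a fully connected model with
    num_layers hidden layers of num_nodes nodes each."""
    sizes = [input_size] + [num_nodes] * num_layers + [output_size]
    # Each consecutive pair of sizes is one dense layer: a weight
    # matrix of fan_in * fan_out entries plus fan_out biases.
    return sum(fan_in * fan_out + fan_out
               for fan_in, fan_out in zip(sizes, sizes[1:]))


def enumerate_fc_models(sweep):
    """Yield one config dict per point in the Cartesian product of the sweep."""
    keys = list(sweep)
    for values in product(*(sweep[k] for k in keys)):
        config = dict(zip(keys, values))
        config["params"] = fc_param_count(**config)
        yield config


if __name__ == "__main__":
    for cfg in enumerate_fc_models(SWEEP):
        print(cfg)
```

Sweeping the full Cartesian product, rather than picking a handful of models by hand, is what lets a parameterized suite cover a range of model sizes; the same pattern would extend to the CNN and RNN templates by swapping in their own knobs.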
Figure is from https://cloud.google.com/tpu/docs/system-architecture
[Figure: FC and Conv layer diagrams. Labels: FC, Conv, W, A, G, Gradient, Weighted Sum.]
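The diagram labels above (FC, W, A, G, Gradient, Weighted Sum) suggest the usual split of a fully connected layer into a forward weighted sum and a backward gradient computation. Assuming W denotes weights, A activations and G gradients (the extracted slide text does not define them), a minimal sketch of the arithmetic cost of each step could look like this. The function name and the convention of counting one multiply-add as 2 FLOPs are choices made for this example; bias and nonlinearity costs are ignored.

```python
def fc_layer_flops(batch, fan_in, fan_out):
    """Estimate FLOPs for one fully connected layer, forward and backward.

    Assumed notation: A = activations, W = weights, G = gradients.
    A multiply-add is counted as 2 FLOPs.
    """
    matmul = 2 * batch * fan_in * fan_out
    return {
        # Forward weighted sum: A_out = A_in @ W
        "forward": matmul,
        # Backward, gradient w.r.t. weights: dW = A_in^T @ G_out
        "grad_weights": matmul,
        # Backward, gradient w.r.t. inputs: G_in = G_out @ W^T
        "grad_inputs": matmul,
        "total": 3 * matmul,
    }


if __name__ == "__main__":
    print(fc_layer_flops(batch=64, fan_in=1024, fan_out=1024))
```

Under these assumptions the backward pass costs about twice the forward pass, because it performs two matrix multiplications of the same shape as the forward one.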