Integrating NVIDIA Deep Learning Accelerator (NVDLA) with RISC-V SoC on FireSim
Farzad Farshchi§, Qijing Huang¶, Heechul Yun§
§University of Kansas, ¶University of California, Berkeley
target
Figure credit: Donggyu Kim et al. “Strober: Fast and Accurate Sample-Based Energy Simulation for Arbitrary RTL”
in Chisel
the target. Added by FireSim.
constant latency
way, block sizes. No need to rebuild FPGA image
nv_large
matrix multiplication
function, pooling, etc.
Adapted from “The Nvidia Deep Learning Accelerator”, https://goo.gl/Znyba5
407x 5.5x
alternative to scratchpad
changing the LLC size
data reuse left
* Speedup is measured relative to the design with no LLC
1.6x
2.5x
* Normalized to solo execution time, i.e., execution time when running in isolation