SLIDE 5 Unified Computing Engine for Convolution—Supertile
[Figure: supertile block diagram. Labels: EPE (×2), each with a DSP (× +), Activation Input, Weight Update, and MUXes; Buf A, Buf B, Buf C, Buf D (×2); Weight Cache (×4).]
[1] E. Wu, X. Zhang, D. Berman, and I. Cho, “A high-throughput reconfigurable processing array for neural networks,” in 2017 27th International Conference on Field Programmable Logic and Applications (FPL), pp. 1–4. IEEE, 2017.
Performance = Freq. * Dsp_num * Ops_per_dsp
The supertile method runs the DSP at twice the clock rate of the surrounding logic [1].
EPE: Enhanced Processing Element
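As a rough illustration of the performance formula, the sketch below compares peak throughput with the DSP clocked at the fabric rate and at twice the fabric rate. The function names and all numeric values (250 MHz fabric clock, 1000 DSPs, 2 operations per DSP per cycle) are assumptions chosen for the example, not figures from the slide or from [1].

```python
def peak_ops_per_second(dsp_freq_hz, dsp_num, ops_per_dsp):
    """Performance = Freq. * Dsp_num * Ops_per_dsp (peak, ignoring stalls)."""
    return dsp_freq_hz * dsp_num * ops_per_dsp


def supertile_dsp_freq(fabric_freq_hz, clock_ratio=2):
    """DSP clock when it runs at `clock_ratio` times the surrounding logic."""
    return fabric_freq_hz * clock_ratio


# Hypothetical numbers, for illustration only.
FABRIC_FREQ_HZ = 250_000_000   # assumed fabric clock: 250 MHz
DSP_NUM = 1000                 # assumed number of DSP slices in use
OPS_PER_DSP = 2                # assumed: one multiply + one add per cycle

# Baseline: DSP runs at the same clock as the surrounding logic.
baseline = peak_ops_per_second(FABRIC_FREQ_HZ, DSP_NUM, OPS_PER_DSP)

# Supertile: DSP runs at twice the fabric clock.
supertile = peak_ops_per_second(
    supertile_dsp_freq(FABRIC_FREQ_HZ), DSP_NUM, OPS_PER_DSP
)

print(f"baseline : {baseline / 1e9:.0f} GOPS")
print(f"supertile: {supertile / 1e9:.0f} GOPS")
print(f"speedup  : {supertile / baseline:.1f}x")
```

With Dsp_num and Ops_per_dsp held fixed, doubling the DSP clock doubles the peak figure. This is an upper bound: it assumes the buffers and weight caches can keep every DSP fed on every cycle.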