Moving CNN Accelerator Computations Closer to Data


SLIDE 1

Moving CNN Accelerator Computations Closer to Data

Sumanth Gudaparthi, Surya Narayanan, Rajeev Balasubramonian


SLIDE 2

Evolution of CNN Accelerators

Moore's Law: transistor scaling is coming to an end.

  • Digital Accelerators (DianNao, DaDianNao, etc.): limited by memory bandwidth
  • Analog in-situ Accelerators (ISAAC, PRIME, etc.): complex analog circuits, lack of flexibility
  • Digital in-situ Accelerators (DRISA): higher-cost DRAM; can't be used as host memory

SLIDE 3

SRAM-based In-Situ Computation Accelerator

  • DA vs. SISCA: performs computations in-situ
  • AIA vs. SISCA: uses SRAM cells to perform the in-situ computations
  • DIA vs. SISCA: modifies the LLC, with trivial overhead on baseline cache operations

DA: Digital Accelerators; AIA: Analog In-situ Accelerators; DIA: Digital In-situ Accelerators; SISCA: Proposed Accelerator

SLIDE 4

Logic-In-Memory

[Figure: SRAM cell with bit-lines BL/BLB and word-lines WL/WLB (Jeloka et al., 2016)]


SLIDE 5

Logic-In-Memory

[Figure: two SRAM cells (Cell1, Cell2) sharing bit-lines BL/BLB, with word-lines WL1/WLB1 and WL2/WLB2 activated simultaneously]

  • Pre-charge the bit-lines
  • Activate the word-lines of both cells
  • Depending on the stored values, the bit-line voltage discharges through Cell1, discharges through both cells, or stays pre-charged

Jeloka et al., 2016

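A minimal behavioral sketch of this multi-row activation (an illustration, not code from the talk; it assumes, per Jeloka et al., that sensing BL yields the AND of the two stored bits and sensing BLB yields their NOR):

    # Behavioral model of simultaneous two-row activation on a shared
    # bit-line pair (illustrative sketch, not circuitry from the talk).
    def in_situ_and_nor(cell1, cell2):
        # BL discharges if either cell stores 0; a still-charged BL
        # therefore means both cells hold 1: BL senses cell1 AND cell2.
        bl = cell1 & cell2
        # BLB discharges if either cell stores 1; a still-charged BLB
        # means both cells hold 0: BLB senses cell1 NOR cell2.
        blb = int(not (cell1 | cell2))
        return bl, blb

    for c1 in (0, 1):
        for c2 in (0, 1):
            print(c1, c2, in_situ_and_nor(c1, c2))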


SLIDE 6

Enabling In-Situ Multiplication in Caches

[Figure: bit-level expansion of W0 * I0: each partial product W0-a AND I0-b is formed in place, then shifted and accumulated into the product]

Notation: Ca-b denotes bit b of the a-th variable of C.
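To make the expansion concrete, here is a small shift-and-add sketch of the multiplication the figure depicts (an illustration under the assumption that each partial product is one in-cache AND, with shifting and accumulation done by peripheral logic; not code from the talk):

    # Bit-level multiplication via AND-ed partial products (sketch).
    def in_situ_multiply(w_bits, i_bits):
        # w_bits and i_bits are lists of bits, least-significant first,
        # e.g. [W0-0, W0-1, W0-2] and [I0-0, I0-1, I0-2].
        product = 0
        for a, w in enumerate(w_bits):
            for b, i in enumerate(i_bits):
                partial = w & i                 # one bit-line AND per bit pair
                product += partial << (a + b)   # shift-and-add in the periphery
        return product

    # 3-bit example: W0 = 0b101 (5), I0 = 0b011 (3)
    assert in_situ_multiply([1, 0, 1], [1, 1, 0]) == 5 * 3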


SLIDE 7

SISCA Organization

[Figure: SISCA organization: sub-arrays SA1-SA4 holding kernel entries, feature-map entries, and unused entries; an H-Tree interconnect; and shifters with RAT1-RAT4 at the banks. Ca-b: bit b of the a-th variable of C.]


SLIDE 8

SISCA Dataflow

[Figure: dataflow across Sub-Arrays 1-3: a 6x6 input feature map convolved with two 3x3 kernel maps produces two 4x4 output feature maps]
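The shapes follow from standard convolution arithmetic: with stride 1 and no padding, a 6x6 input and a 3x3 kernel give a (6-3+1) x (6-3+1) = 4x4 output per kernel. A plain-Python sketch reproducing these shapes (illustrative only, not the SISCA dataflow itself):

    # Direct convolution matching the figure's shapes:
    # 6x6 input, two 3x3 kernels -> two 4x4 output feature maps.
    def conv2d(ifmap, kernel):
        H, W, K = len(ifmap), len(ifmap[0]), len(kernel)
        out = [[0] * (W - K + 1) for _ in range(H - K + 1)]
        for r in range(H - K + 1):
            for c in range(W - K + 1):
                out[r][c] = sum(ifmap[r + i][c + j] * kernel[i][j]
                                for i in range(K) for j in range(K))
        return out

    ifmap = [[1] * 6 for _ in range(6)]        # 6x6 input feature map
    kernels = [[[1] * 3 for _ in range(3)],    # two 3x3 kernel maps
               [[2] * 3 for _ in range(3)]]
    ofmaps = [conv2d(ifmap, k) for k in kernels]
    assert len(ofmaps) == 2 and len(ofmaps[0]) == 4 and len(ofmaps[0][0]) == 4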


SLIDE 9

Energy Improvements

SISCA is 6.3x more energy-efficient than the DaDianNao baseline.


SLIDE 10

Performance Improvements

SISCA achieves 2.7x higher throughput than DaDianNao.


SLIDE 11

Conclusions and Future Work


  • SISCA is an SRAM in-situ computation engine for Convolutional Neural Networks
  • It uses the on-chip Last Level Cache (LLC) to perform computations
  • SISCA is 6.3x more energy-efficient and has 2.7x higher throughput than DaDianNao
  • Better dataflow and mapping mechanisms can further improve energy and throughput
  • Better scheduling mechanisms are needed to distribute the general-purpose workload and CNN data across the cache

SLIDE 12

Questions?
