CRESITT EVENT: EMBEDDED AI AND UPSTREAM RESEARCH (CEA PowerPoint Presentation)



SLIDE 1

CEA Presentation for CRESITT | October 17th, 2019

CRESITT EVENT: EMBEDDED AI AND UPSTREAM RESEARCH

Sandrine Varenne, David Briand CEA LIST sandrine.varenne@cea.fr

SLIDE 2

1  THE WORK OF CEA DRT (DIRECTION DE LA RECHERCHE TECHNOLOGIQUE) IN ARTIFICIAL INTELLIGENCE
2  GENERAL OVERVIEW OF OUR EMBEDDED-AI ACTIVITIES
3  FOCUS ON OUR N2D2 TOOLS AND OUR HARDWARE ACCELERATORS (PNEURO, DNEURO…)
4  CONCLUSION

EMBEDDED AI AND UPSTREAM RESEARCH

SLIDE 3

CEA CONFIDENTIEL

CEA TECH & Artificial Intelligence

[Diagram] Architecture/Algorithm Adequation, centered on DATA (video, images, text & semantics, audio, other signals…):
  • Hardware know-how: IC conception, NVM, communication, 3D integration
  • Smart systems: architecture, tools…
  • Software know-how: algorithms, data analytics, certification & software verification

SLIDE 4

CEA TECH & Artificial Intelligence: to address the embedded challenges

[Diagram] Two phases of deep learning:
  • Training: labeled databases + machine learning algorithm → DNN model (topology, training set, parameters…); days or weeks on a multi-GPU server (e.g. Nvidia DGX-1, 8 Tesla P100) until the target accuracy is reached
  • Prediction: trained DNN model + new data → prediction ("A car"); low-latency inference (TPU, FPGA, GPU, PNeuro…)

SLIDE 5

KNOW-HOW OF CEA IN DEEP LEARNING & EMBEDDED AI

[Diagram] From experience with frameworks to hardware IP:
  • Code-generation modules for CPU, many-core CPU, GPU, FPGA and dedicated HW: C++, CUDA, CuDNN, TensorRT, optimized C, OpenMP, OpenCL, HLS
  • Off-the-shelf elements, HW libraries and HW IP: DNEURO, PNEURO, with a possible link to SPIKING and SPIKING + NVM

SLIDE 6

Database Handling and Data Preprocessing Help

  • Data conditioning
  • Semi-automatic data labelling

Standalone Code generation for

  • COTS* Components (CPU, GPU, FPGA)
  • Specific Hardware Targets (ST, Kalray, Renesas…)
  • NN Hardware Accelerators based on CEA IP

>> Well suited to embedded AI

Decision help for the implementation phase

  • Hardware Cost & Form Factor
  • Power Consumption
  • Latency

Spike Coding

N2D2: A European Platform to Address Embedded Systems’ Challenges

* COTS : Commercial Off-The-Shelf Components

  

N2D2 has been entirely developed by CEA

SLIDE 7

[Plot] Top-1 ImageNet accuracy (45-85%) versus complexity (10 to 100,000 MMACs, log scale)

  • Deep Neural Networks (DNNs) are very successful in the vast majority of classification/recognition benchmarks… on high-end clusters of 250 W GPUs

  • Embedding low-power DNNs remains challenging:
    • Must adapt and simplify DNN topologies
    • Reduce layer complexity (number of operations)
    • Reduce precision (8-bit integer or less)
  • Today’s general-purpose COTS components are inefficient for DNNs:
    • Too few cores
    • Computing cores too complex (floating-point computation)
    • Low MAC/cycle efficiency
    • Insufficient memory

➔ Balancing speed/power against applicative performance is a major challenge
➔ Need for a framework to automate DNN shrinking exploration and evaluation, performance projection, and porting to embedded platforms

Context / Motivations
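The "reduce precision" point above can be illustrated with a minimal symmetric 8-bit weight-quantization sketch. This is a generic illustration, not N2D2's actual calibration scheme:

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor quantization: map the largest |weight| to 127."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the 8-bit codes."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.1, size=(64, 64)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
# Round-to-nearest bounds the per-weight error by scale / 2
print(float(np.abs(w - w_hat).max()) <= scale / 2 + 1e-7)  # True
```

The error bound shows why 8-bit integers often suffice: the quantization noise shrinks with the weight range of each tensor.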


SLIDE 8

Deep learning for embedded computing

OPTIMIZED EMBEDDED CODE GENERATION | ASIC HARDWARE ACCELERATION | FPGA HARDWARE ACCELERATION

Embedded neural computing

N2D2: DNN design framework

  • Unified modeling and NN exploration tool
  • Custom applications building & optimization (CNN, Faster-RCNN…)
  • Hardware mapping & benchmarking (CPUs, GPUs, FPGAs, ASIPs)
  • N2D2 is available at https://github.com/CEA-LIST/N2D2/

Programmable processor PNeuro

  • Clustered 8-bit SIMD architecture
  • Designed for DNN processing chains and image processing
  • Published at DATE 2018

Dataflow FPGA IP DNeuro

  • Optimized RTL DNN layer kernels
  • Automatic RTL generation through N2D2
  • Dataflow computation, designed to use the DSP blocks available on FPGA


SLIDE 9

Deep learning for embedded computing


OPTIMIZED EMBEDDED CODE GENERATION

Embedded neural computing

N2D2: DNN design framework

  • Unified modeling and NN exploration tool
  • Custom applications building & optimization (CNN, Faster-RCNN…)
  • Hardware mapping & benchmarking (CPUs, GPUs, FPGAs, ASIPs)
  • N2D2 is available at https://github.com/CEA-LIST/N2D2/

Motivations

  • Deep Neural Networks (DNNs) are today extremely successful in the vast majority of classification/recognition benchmarks… on high-end clusters of 250 W GPUs

  • Embedding low-power DNN remains challenging:
  • Must adapt and simplify DNN topologies
  • Reduce layers complexity (number of operations)
  • Reduce precision (8-bit integer)

➔ Balancing speed/power against applicative performance is a major challenge

  • Need for a framework to automate DNN shrinking exploration and evaluation, performance projection, and porting to embedded platforms


SLIDE 10
  • A unique platform for the design and exploration of DNN applications

N2D2: DNN Design Environment

[Diagram] Design flow: data conditioning (learning & test databases) → modeling → learning → test → optimization → trained DNN → code generation → code execution.

CONSIDERED CRITERIA
  • Accuracy (approximate computing…)
  • Memory need
  • Computational complexity

Execution targets:
  • COTS: many-core CPUs (MPPA, P2012, ARM…), GPUs, FPGAs, with SW DNN libraries (OpenCL, OpenMP, CuDNN, CUDA, TensorRT)
  • HW accelerators: PNeuro, ASMP, with HW DNN libraries (DNeuro, C/HLS)

SLIDE 11
  • N2D2 integrates the building of data-processing and data-analysis dataflows
  • Genericity: process image and sound, 1D, 2D or 3D data
  • Associate a label for each data point, 1D or 2D labels
  • Support arbitrary label shapes (circular, rectangular, polygonal or pixel-wise defined)
  • Apply transformations to data, pixel-wise labels and geometrical labels
  • Basic operations: rescaling, flipping, normalization, affine, filtering, DFT…
  • Advanced operations: elastic distortion, random slices/labels extraction, morphological reconstructions…
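As a toy sketch of such a conditioning chain (a rescale followed by the Op -= mean and Op /= stdDev affine steps of the pipeline); the nearest-neighbour rescale and batch-level statistics are simplifying assumptions, not N2D2 code:

```python
import numpy as np

def rescale(img, out_h, out_w):
    """Nearest-neighbour rescale (stand-in for a rescale transformation)."""
    h, w = img.shape
    ys = np.arange(out_h) * h // out_h
    xs = np.arange(out_w) * w // out_w
    return img[np.ix_(ys, xs)]

def normalize(img, mean, std):
    """Affine steps of the pipeline: Op -= STATS.mean, then Op /= STATS.stdDev."""
    return (img - mean) / std

rng = np.random.default_rng(1)
batch = rng.uniform(0, 255, size=(8, 32, 32))   # 8 one-channel images
mean, std = batch.mean(), batch.std()           # STATS stage
out = np.stack([normalize(rescale(x, 24, 24), mean, std) for x in batch])
print(out.shape)  # (8, 24, 24)
```

The same statistics are reused for the learn, validation and test sets, mirroring the shared STATS stage in the diagram.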

N2D2: Data Augmentation, Conditioning and Analysis

[Diagram] Example conditioning pipeline, applied to the learn, validation and test sets:
Database → Rescale → Slice extract → Channel extract → STATS → Affine (Op -= STATS.mean) → STATS → Affine (Op /= STATS.stdDev) → DL core / spike coding

Modules: data channels, annotation data (geometric and pixel-wise), transformation module, data-analysis module.
The data-analysis module reports, per value: number of data (cumulative), min, max, mean.


SLIDE 12

N2D2: Typical Outputs

; Database
[database]
Type=MNIST_IDX_Database
Validation=0.2

; Environment
[env]
SizeX=24
SizeY=24
BatchSize=128

[env.Transformation]
Type=PadCropTransformation
Width=[env]SizeX
Height=[env]SizeY

[env.OnTheFlyTransformation]
Type=DistortionTransformation
ApplyTo=LearnOnly
ElasticGaussianSize=21
ElasticSigma=6.0
ElasticScaling=36.0
Scaling=10.0
Rotation=10.0

; First layer (convolutional)
[conv1]
Input=env
Type=Conv
KernelWidth=5
KernelHeight=5
NbChannels=6
Stride=2
ConfigSection=common.config

; Second layer (convolutional)
[conv2]
Input=conv1
Type=Conv
KernelWidth=5
KernelHeight=5
NbChannels=12
Stride=2
ConfigSection=common.config

; Third layer (fully connected)
[fc1]
Input=conv2
Type=Fc
NbOutputs=100
ConfigSection=common.config

; Output layer (fully connected)
[fc2]
Input=fc1
Type=Fc
NbOutputs=10
ConfigSection=common.config

; Softmax layer
[soft]
Input=fc2
Type=Softmax
NbOutputs=10
WithLoss=1
ConfigSection=common.config

; Common solvers config
[common.config]
WeightsSolver.LearningRate=0.05
WeightsSolver.Decay=0.0005
Solvers.LearningRatePolicy=StepDecay
Solvers.LearningRateStepSize=[sp]_EpochSize
Solvers.LearningRateDecay=0.993

N2D2 INI network description file
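To follow the topology in the INI file above, here is a small sketch of the layer-size arithmetic; the floor-based, zero-padding output formula is an assumption about N2D2's convolution model:

```python
def conv_out(size, kernel, stride, pad=0):
    """Spatial output size of a conv layer (floor model, no padding by default)."""
    return (size + 2 * pad - kernel) // stride + 1

s0 = 24                      # [env] SizeX = SizeY = 24
s1 = conv_out(s0, 5, 2)      # [conv1]: 5x5 kernel, stride 2 -> 10x10, 6 maps
s2 = conv_out(s1, 5, 2)      # [conv2]: 5x5 kernel, stride 2 -> 3x3, 12 maps
print(s1, s2, s2 * s2 * 12)  # 10 3 108 features feeding [fc1] (100 outputs)
```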

Layer-wise detailed memory and computing requirements

Results visualization:
  • Pixel-wise segmentation
  • ROI bounding-box extraction and classification
  • Pixel-wise and object-wise confusion-matrix reporting
  • Layer-wise output visualization and data-range analysis
  • Dataflow visualization
  • Layer-wise weights and kernels visualization, distribution and data-range analysis


SLIDE 13

N2D2: DNN Complexity Analysis

[Chart] Relative and absolute per-layer metrics, flagging layers with high weights memory, high in/out buffer memory and high computation load


SLIDE 14
  • Weights clamping and/or normalization
  • Quantization of the layers’ output-activation distributions
    • Histogram analysis and optimal quantization-threshold determination
    • Using the Kullback–Leibler divergence

➔ Goal: automatic and guaranteed best result without retraining

N2D2: Calibration for Integer Precision


SLIDE 15

N2D2: Hardware Exports

N2D2 export targets (a unified tool for multiple hardware targets):
  • CPU x86 / ARM / DSP: C/OpenMP, C++/OpenCL
  • GPU (NVidia): C++/CUDA/CuDNN/TensorRT
  • GPU generic: C++/OpenCL
  • HLS FPGA (Xilinx): C/HLS
  • HLS FPGA (Intel): C++/OpenCL
  • MPPA (Kalray): C++/OpenCL, KaNN API
  • ASMP (ST): C/OpenMP/CVA8
  • R-Car (Renesas): CNN-IP C API
  • PNeuro (CEA): RTL/ASM (DSP-like programmable SIMD processor)
  • DNeuro (CEA): RTL (dataflow configurable RTL library)
  • NeuroSpike (CEA): RTL
  • Generic spike: SystemC (generic, not optimized for a specific product)

N2D2 → TensorRT:
  • On NVIDIA Drive PX2
  • Supports SSD and Faster-RCNN


SLIDE 16

N2D2: DNN Design Environment

Example flow: data conditioning on the Cityscapes database → modeling, learning and test (C++ or Python interface) → code generation with TensorRT 3.0 → code execution on an Nvidia TX2 GPU


SLIDE 17

Try N2D2 NOW!

N2D2 is available at https://github.com/CEA-LIST/N2D2/

  • Fewest dependencies and requirements among the major frameworks:
    minimum requirements: GCC 4.4 or Visual Studio 12 / OpenCV 2.0.0

  • Easily extendable with a “plug-and-play” modular system for user-made modules

  • AppObjectRecognition/ : live object recognition application based on the ILSVRC2012 (ImageNet) dataset
  • AppFaceDetection/ : live face detection application, with gender recognition, based on the IMDB-WIKI dataset
  • AppRoadDetection/ : simple road segmentation application based on the KITTI Road dataset


SLIDE 18

Deep learning for embedded computing

ASIC HARDWARE ACCELERATION

Embedded neural computing

Programmable processor PNeuro

  • Clustered 8-bit SIMD architecture
  • Designed for DNN processing chains and image processing

  • Published at DATE 2018


SLIDE 19
  • Fully-programmable energy efficient hardware accelerator
  • N2D2 full-development flow
  • Full DNN framework for optimized embedded computing
  • Designed for DNN processing chains
  • Pre/post-processing phases
  • CNN, HMax, RNN (under development)
  • Supporting traditional image processing operations
  • Filtering, etc.
  • Clustered SIMD architecture
  • Optimized operators for MAC & Non-Linearity approximation
  • Optimized memory accesses to perform efficient data transfers to operators
  • ISA including ~50 instructions (control + computing)

PNeuro: a Neural DSP Processor

[Diagram] PNeuro engine architecture: IP-top and system interconnects; CPU subsystem with DMA and external I/O; a global controller driving clusters 0…N, each containing a cluster interconnect, a cluster controller and neuro cores (neural processing elements).


SLIDE 20
  • Low-power programmable solution for smart IoT
  • PNeuro neural computing IP (1 cluster)
    • 28nm FDSOI run via CMP / Silicon Impulse
    • 0.126 mm² for a single PNeuro cluster (8 PEs) and its control
    • PNeuro performance: 571 GMACs/s/W
  • PNeuro roadmap
    • Integration in the N2D2 toolflow (ongoing)
    • Integration of debug support in the architecture (ongoing)
    • 2 related Ph.D. theses started:
      • Online (on-chip) unsupervised specialization of DNNs
      • Automatic generation for optimal scalability and energy efficiency
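To put the 571 GMACs/s/W figure in context, a back-of-the-envelope sketch; the 100 mW power budget and the 50 MMAC-per-frame network are illustrative assumptions, not CEA measurements:

```python
# Throughput available from the quoted efficiency at an assumed power budget.
efficiency_gmacs_per_w = 571          # figure quoted for the PNeuro test-chip
power_budget_w = 0.100                # assumed smart-IoT power envelope
throughput_gmacs = efficiency_gmacs_per_w * power_budget_w  # 57.1 GMACs/s

macs_per_frame = 50e6                 # assumed small CNN: 50 MMACs per frame
fps = throughput_gmacs * 1e9 / macs_per_frame
print(round(throughput_gmacs, 1), int(fps))  # 57.1 1142
```

Even at a 100 mW budget, such an efficiency figure leaves ample frame rate for small networks, which is the point of the smart-IoT positioning.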

PNeuro: FDSOI 28nm Test-chip

[Diagram] Test-chip layout: PNeuro and AntX, instruction and data memories, frame buffer, memory controller, ROM, CVP-FLL, and DWT, SPI, I2C, UART, GPIO peripherals on the system bus.


SLIDE 21

Deep learning for embedded computing


FPGA HARDWARE ACCELERATION

Embedded neural computing

Dataflow FPGA IP DNeuro

  • Optimized RTL DNN layer kernels
  • Automatic RTL generation through N2D2
  • Dataflow computation, designed to use the DSP blocks available on FPGA


SLIDE 22

DNN generator

  • DNeuro, RTL HW library for FPGA
    • Complete and independent RTL IP for DNN integration on FPGA
    • Dataflow computation, designed to use the DSP blocks available on FPGA
    • Generated in a few steps from the DNN description and weights
  • Main features
    • Dataflow architecture requiring little memory (potentially no DDR)
    • Very high DSP utilization per cycle (> 90%)
    • Configurable precision (integers from 4 to 16 bits, typically 8 bits)
    • Up to 4 MAC/DSP operations per cycle
    • Low-complexity IP, optimized for Intel and Xilinx FPGAs
  • Supports convolutional layers (fully-CNN)
    • Convolution and max-pooling layers
    • Unit map connectivity and stride support
  • Future work: object-detector layer support (SSD, Yolo, Faster-RCNN…)
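The "up to 4 MAC/DSP operations per cycle" point relies on packing several narrow operands into one wide DSP multiplier. A minimal sketch of one well-known packing trick (two 8-bit products from a single multiply); the unsigned operands and the 18-bit guard shift are simplifying assumptions, not the DNeuro implementation:

```python
SHIFT = 18  # guard bits: an 8x8-bit product needs at most 16 bits

def packed_double_mac(a1, a2, w):
    """Compute a1*w and a2*w with a single wide multiplication,
    as a DSP block would: pack, multiply once, split the product."""
    packed = (a1 << SHIFT) | a2      # one wide operand holds both inputs
    p = packed * w                   # single 'DSP' multiply
    low = p & ((1 << SHIFT) - 1)     # = a2 * w (fits in the guard window)
    high = p >> SHIFT                # = a1 * w
    return high, low

print(packed_double_mac(200, 17, 93))  # (18600, 1581) == (200*93, 17*93)
```

Because the low product never overflows the 18-bit guard window, the two results can be separated exactly after the shared multiply.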

DNeuro: RTL HW Library

[Diagram] N2D2 INI network description file + constraints → DNeuro lib → DNN RTL → FPGA synthesis flow


SLIDE 23
  • DOTA dataset segmentation with a MobileNet-based DNN
  • Automated DNeuro IP RTL generation from the DNN description and weights
  • Achieves ~160 FPS on an Arria 10 SX270 for 640x480 images @ 200 MHz (without external DDR) ➔ 300 GOPS
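A quick consistency check on the figures above; the per-frame and per-pixel operation counts below are derived from the quoted 160 FPS and 300 GOPS, not independently measured:

```python
fps = 160                      # quoted frame rate on the Arria 10 SX270
total_ops_per_s = 300e9        # quoted throughput: 300 GOPS
ops_per_frame = total_ops_per_s / fps          # 1.875e9 ops per frame
ops_per_pixel = ops_per_frame / (640 * 480)    # ops per pixel of a frame
print(f"{ops_per_frame:.3e}", round(ops_per_pixel))  # 1.875e+09 6104
```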

Example 1: AI for Real-Time Image Segmentation in a Constrained Environment (Avionics)


SLIDE 24

  • 3D mapping with a single monocular camera
  • Car-type identification
  • Pedestrian recognition
  • Frugal algorithms based on deep learning

Example 2: AI for Real-Time Environment Perception (Transport)


SLIDE 25

CONSTRAINTS

  • Real time with very high throughput (20 m/s)
  • Tiny defects (~mm) with low contrast
  • Complex environment (oil vapor, little room for inspection…)

[Diagram] Explored CNN topologies: stacks of 3x3 and 5x5 convolutions with 8, 16 and 32 feature maps on 40x60 inputs.

[Plot] Computing complexity vs. recognition rate during NN exploration (learning and test curves).

SOLUTION
  • Database labelling and processing
  • Fast NN topology exploration
  • Performance vs. complexity analysis

Workflow: 1) defect labelling and visualization, 2) NN exploration and benchmarking, 3) defect identification after NN learning.

Part inspection (conformity, defects…)

Example 3: AI for Real-Time Quality Control (Manufacturing)


SLIDE 26

Advance your Deep Learning Roadmap

Software frameworks
  • N2D2 deep-learning framework
  • N2D2 HW exports
  • Benchmarking

Use cases
  • Security, defense, manufacturing, transport, marketing, automation…

Hardware architectures
  • PNeuro, DNeuro, HLS
  • RRAM synapses, 3D stacking, mixed A/D design, 28nm FDSOI

Neuro-computing platform: advanced implementations & deep-learning research
  • Event-based N2D2, spike coding, bio-inspired sensors, unsupervised learning

SLIDE 27

Centre de Saclay Nano-Innov PC 172 - 91191 Gif sur Yvette Cedex

Thank you

Neural Networks Design and Deployment for Constrained Embedded Systems with N2D2 Framework

Sandrine Varenne (sandrine.varenne@cea.fr) David Briand (david.briand@cea.fr)