Scaling-Up Deep Learning For Autonomous Vehicles
JOSE M. ALVAREZ | San Jose 2019
NVIDIA AI-Infra
AI-Infra Team. One of our top goals: industry-grade deep learning to take AV perception DNNs into production, tested in multiple…
[Pipeline diagram: PBs of data, large-scale labeling, large-scale training, etc. Datasets service (POST /datasets/{id}) → manually selected data → labeling → labels → train/test data → deep learning → metrics; simulation and verification results; output: inference-optimized DNN (TensorRT).]
[Pipeline diagram, scaled up: the same loop, now with trained models used to mine highly confused / most informative data back into the dataset.]
Active Learning
Accuracy / Efficiency of DL
Robustness (Domain Adaptation, …)
The active learning loop: collecting data, training models, and estimating model uncertainty to mine the most informative samples (a scoring sketch follows below).
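Sample selection is driven by the ensemble's predictive uncertainty. Below is a minimal sketch of one common ensemble acquisition score (BALD-style mutual information); this is an illustration, not the paper's code, and `models` / `unlabeled_loader` are assumed names:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def mutual_information_scores(models, unlabeled_loader, device="cuda"):
    """Score unlabeled samples by ensemble disagreement:
    H(mean prediction) - mean over members of H(member prediction).
    Higher scores mark the most informative samples to label next."""
    scores = []
    for x, _ in unlabeled_loader:
        x = x.to(device)
        # Per-member class probabilities, stacked: (members, batch, classes)
        probs = torch.stack([F.softmax(m(x), dim=1) for m in models])
        mean_p = probs.mean(dim=0)
        h_mean = -(mean_p * mean_p.clamp_min(1e-12).log()).sum(dim=1)
        h_members = -(probs * probs.clamp_min(1e-12).log()).sum(dim=2)
        scores.append(h_mean - h_members.mean(dim=0))
    return torch.cat(scores)
```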
[Chitta, Alvarez, Lesnikowski], Large-Scale Visual Active Learning with Deep Probabilistic Ensembles. arXiv 2018
Dataset     % of data
CIFAR-10    ~50
CIFAR-100   ~80
SVHN        ~25
[Chitta, Alvarez, Lesnikowski], Large-Scale Visual Active Learning with Deep Probabilistic Ensembles. Under review
Conditions: backlit, snow, day, clear, fog, rain, cloudy, artificial light, night, twilight. Scenes: urban, freeway, unmarked street.
Domain   Images   Annotations
Source   ☺        ☺
Target   ☺        –
[Saleh, Salzmann, Alvarez et al.], Effective Use of Synthetic Data for Urban Scene Semantic Segmentation. ECCV 2018
(using unlabeled real training data)
Stacking 3x3 convolutions instead of a single 5x5: same receptive field, an extra non-linearity, and more capacity with fewer parameters.
Validation Accuracy on a 3x3-based Convnet (orange) and the equivalent 5x5-based Convnet (blue)
https://blog.sicara.com/about-convolutional-layer-convolution-kernel-9a7325d34f7d
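A back-of-envelope count makes the figure's point concrete (C = 64 channels is an arbitrary illustrative choice):

```python
# Two stacked 3x3 convolutions cover the same 5x5 receptive field as one
# 5x5 convolution, add an extra non-linearity, and use fewer parameters.
C = 64                                  # assumed channel count
params_5x5 = 5 * 5 * C * C              # 102,400 weights
params_two_3x3 = 2 * (3 * 3 * C * C)    # 73,728 weights, ~28% fewer
print(params_5x5, params_two_3x3)
```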
Going further, factorize an n x n convolution as a [1 x n] and an [n x 1]: same receptive field, an extra non-linearity, and fewer parameters and FLOPS. A sketch follows.
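A minimal PyTorch sketch of the factorization (ReLU placement and the absence of BatchNorm are assumptions; DecomposeMe and the Efficient ConvNet use their own specific arrangements):

```python
import torch.nn as nn

def factorized_conv(in_ch, out_ch, n=3):
    """Replace one n x n conv by a [1 x n] then an [n x 1] conv with a
    non-linearity in between: same receptive field, fewer weights/FLOPS."""
    pad = n // 2
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=(1, n), padding=(0, pad), bias=False),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, kernel_size=(n, 1), padding=(pad, 0)),
        nn.ReLU(inplace=True),
    )
# Weights: n*in_ch*out_ch + n*out_ch*out_ch, vs. n*n*in_ch*out_ch for the
# full n x n kernel.
```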
[Romera, Alvarez et al.], Efficient ConvNet for Real-Time Semantic Segmentation. IEEE-IV 2017, T-ITS 2018
[Alvarez and Petersson], DecomposeMe: Simplifying ConvNets for End-to-End Learning. arXiv 2016
Train mode    Pixel accuracy   Class IoU   Category IoU
Scratch       94.7 %           70.0 %      86.0 %
Pre-trained   95.1 %           71.5 %      86.9 %
           TEGRA TX1                         TITAN X
Fwd pass   512x256   1024x512   2048x1024   512x256   1024x512   2048x1024
Time       85 ms     310 ms     1240 ms     8 ms      24 ms      89 ms
FPS        11.8      3.2        0.8         125.0     41.7       11.2
Cityscapes dataset (19 classes, 7 categories). Forward time: Cityscapes, 19 classes.
Optimize for specific hardware and for a specific application: start from a promising model and apply regularization at the parameter level.
Convolutional layer, 5x1x3x3 (5 filters, 1 input channel, 3x3 kernels).
[Alvarez and Salzmann], Learning the Number of Neurons in Deep Networks. NIPS 2016
[Alvarez and Salzmann], Compression-aware Training of Deep Networks. NIPS 2017
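A hedged sketch of the group-sparsity idea behind the NIPS 2016 paper: treat each output neuron/filter as one group and add an l2,1 penalty, so entire groups are driven to zero and can be pruned after training. The regularization weight `lam` and the module filter below are illustrative assumptions:

```python
import torch

def group_lasso(weights, lam=1e-4):
    """l2,1 penalty: sum over output units of the l2 norm of their weights.
    Zeroed rows correspond to neurons/filters that can be removed."""
    penalty = 0.0
    for W in weights:
        penalty = penalty + W.flatten(1).norm(dim=1).sum()
    return lam * penalty

# Usage sketch:
# reg = group_lasso([m.weight for m in model.modules()
#                    if isinstance(m, torch.nn.Conv2d)])
# loss = task_loss + reg
```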
1.2 million training images and 50,000 for validation, split into 1,000 categories. Between 5,000 and 30,000 training images per class. No data augmentation (only random flips).
Train an over-parameterized architecture with up to 768 neurons per layer (Dec8-768).
Train an over-parameterized architecture with up to 512 neurons per layer (Dec3-512).
[Bar chart: initial vs. learned number of neurons per layer (L1v/L1h through L8-2v/L8-2h); y-axis: 100 to 600 neurons. Architecture: Dec1 to Dec8, plus Dec7-1/Dec7-2 and Dec8-1/Dec8-2 branches, an FC layer, and two skip connections.]
KITTI
Cross-correlation of Gabor Filters.
[P. Rodríguez, J. Gonzàlez, G. Cucurull, J. M. Gonfaus, X. Roca], Regularizing CNNs with Locally Constrained Decorrelations. ICLR 2017
Significantly longer training time (prohibitive at large scale). Usually a drop in accuracy. Orthogonal filters are difficult to compress in post-processing.
[Alvarez and Salzmann], Compression-aware Training of Deep Networks. NIPS 2017
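A hedged sketch of the idea: push each layer's reshaped weight matrix toward low rank during training so it compresses well afterwards via truncated SVD. Directly penalizing the nuclear norm, as below, is a simplification of the paper's optimization scheme (which uses proximal updates), and `lam` is illustrative:

```python
import torch

def nuclear_norm_penalty(weights, lam=1e-4):
    """Sum of singular values of each layer's (out, in*k*k) weight matrix.
    A small nuclear norm encourages low rank, hence better post-hoc
    low-rank compression of the trained layer."""
    penalty = 0.0
    for W in weights:
        penalty = penalty + torch.linalg.svdvals(W.flatten(1)).sum()
    return lam * penalty
```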
[Block diagram: 256-d input, 1x1 conv (64), relu, 3x1 conv (64), relu, 1x3 conv (64), relu, 1x1 conv (256).]
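A minimal PyTorch rendering of this block; the residual connection and the absence of BatchNorm are assumptions based on the ResNet-style bottleneck the diagram mirrors:

```python
import torch
import torch.nn as nn

class FactorizedBottleneck(nn.Module):
    """256-d in -> 1x1 (64) -> relu -> 3x1 (64) -> relu -> 1x3 (64)
    -> relu -> 1x1 (256), with an assumed skip connection."""
    def __init__(self, channels=256, mid=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, mid, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid, mid, kernel_size=(3, 1), padding=(1, 0)),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid, mid, kernel_size=(1, 3), padding=(0, 1)),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid, channels, kernel_size=1),
        )

    def forward(self, x):
        return torch.relu(x + self.body(x))
```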
Additional training parameters are needed initially to help the optimizer. Small models are explicitly constrained, so the same training regime may not be a fair comparison. Other optimizers lead to slightly better results when training compact networks from scratch.
Data movement may be more significant than the current compute savings.
Revisiting the trade-off: receptive field, non-linearity, capacity, number of parameters, and now also the number of layers.
[Diagram: input 224x224; 11x11 conv (64) and 5x5 conv (192) layers vs. stacks of 3x3 conv (64) layers.]
[Guo, Alvarez, Salzmann], ExpandNets: Exploiting Linear Redundancy to Train Small Networks. arXiv 2018
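The core trick, sketched under stated assumptions (the expansion rate `r` is an illustrative choice): during training, a small k x k convolution is replaced by a chain of purely linear convolutions; with no non-linearities in between, the chain stays mathematically equivalent to one k x k conv and can be collapsed back for inference, so the deployed model keeps its original size.

```python
import torch.nn as nn

def expand_conv(in_ch, out_ch, k=3, r=4):
    """Linear over-parameterization of one k x k conv:
    1x1 expand -> k x k -> 1x1 project, with NO non-linearities,
    so the three kernels compose back into a single k x k kernel."""
    return nn.Sequential(
        nn.Conv2d(in_ch, r * in_ch, kernel_size=1, bias=False),
        nn.Conv2d(r * in_ch, r * out_ch, kernel_size=k, padding=k // 2, bias=False),
        nn.Conv2d(r * out_ch, out_ch, kernel_size=1, bias=False),
    )
```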
[Diagram: small baseline network. Input (3 channels), Conv1 to Conv5: N @ 3x3, N @ 3x3, 128 @ 3x3, 128 @ 3x3, 64 @ 3x3.]
ImageNet   Baseline   Expanded
N=128      46.72%     49.66%
N=256      54.08%     55.46%
N=512      58.35%     58.75%
Model                                    Top-1     Top-5
MobileNetV2                              70.78%    91.47%
MobileNetV2-expanded                     74.85%    92.15%
MobileNetV2-expanded-nonlinear           74.17%    91.61%
MobileNetV2-expanded (nonlinear init)    75.46%    92.58%
Source: "MobileNetV2: The Next Generation of On-Device Computer Vision Networks"
[Diagram: stack of five 3x3 conv (64) layers.]
CITYSCAPES
Thanks, Ian Ivanecky!
Internal Dataset
JOSE M. ALVAREZ | San Jose 2019