Parameter efficient training of deep convolutional neural networks by dynamic sparse reparameterization



SLIDE 1

Parameter efficient training of deep convolutional neural networks by dynamic sparse reparameterization

Hesham Mostafa (Intel AI) Xin Wang (Intel AI, Cerebras Systems)

SLIDE 2

Easy: post-training (sparse) compression
Hard: direct training of sparse networks

[Figure: compression]
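For reference, the "easy" path in its simplest form is one-shot magnitude pruning of an already-trained dense tensor (Zhu & Gupta actually prune gradually during fine-tuning; the sketch below, with an assumed target_sparsity parameter, is only a simplified illustration):

    import numpy as np

    def compress(trained_W, target_sparsity=0.9):
        # Post-training compression sketch: keep the largest-magnitude
        # weights of a trained dense tensor and zero out the rest.
        W = trained_W.copy()
        k = int(round(target_sparsity * W.size))           # weights to remove
        threshold = np.partition(np.abs(W).ravel(), k)[k]  # k-th smallest magnitude
        W[np.abs(W) < threshold] = 0.0
        return W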

SLIDE 3

“Winning lottery tickets” (Frankle & Carbin 2018): post hoc identification of trainable sparse nets

[Figure: compression]
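A rough sketch of the lottery-ticket procedure, which finds the trainable sparse structure only post hoc; train and prune_mask are hypothetical stand-ins for a full training loop and magnitude pruning:

    def find_winning_ticket(W_init, train, prune_mask):
        # Post hoc identification (Frankle & Carbin): the sparse topology
        # comes from a fully trained dense net, and surviving weights are
        # rewound to their original initialization before retraining.
        W_trained = train(W_init)      # 1. train the dense network
        mask = prune_mask(W_trained)   # 2. magnitude-prune the trained weights
        W_ticket = W_init * mask       # 3. rewind survivors to their init values
        return train(W_ticket)         # 4. retrain the sparse "winning ticket"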

SLIDE 4

Dynamic sparse reparameterization (ours): training-time structural exploration

SLIDE 5

Can directly trained sparse nets generalize as well as post-training compression? YES

Are directly trained sparse nets “winning lottery tickets”? NO

SLIDE 6

Dynamic sparse reparameterization

1: for each sparse parameter tensor W_i do
2:     (W_i, k_i) ← prune_by_threshold(W_i, H)    ◃ k_i is the number of pruned weights
3:     l_i ← number_of_nonzero_entries(W_i)       ◃ l_i is the number of surviving weights after pruning
4: end for
5: (K, L) ← (Σ_i k_i, Σ_i l_i)                   ◃ total pruned and surviving weights across tensors
6: H ← adjust_pruning_threshold(H, K, δ)         ◃ adjust the pruning threshold
7: for each sparse parameter tensor W_i do
8:     W_i ← grow_back(W_i, (l_i / L) · K)       ◃ grow (l_i/L)·K zero-initialized weights at random in W_i
9: end for

[Diagram: prune → grow cycle]
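As a rough illustration of the pseudocode, here is one prune/grow step in NumPy over per-tensor boolean masks; target_K, the multiplicative threshold update, and the mask bookkeeping are sketch-level assumptions rather than the paper's exact implementation:

    import numpy as np

    rng = np.random.default_rng(0)

    def reparameterize(weights, masks, H, target_K=20_000, delta=0.1):
        # One dynamic sparse reparameterization step over weight tensors W_i
        # and boolean masks M_i (True = active weight). Modifies in place.
        ks, ls = [], []
        # Prune: deactivate active weights whose magnitude falls below H.
        for W, M in zip(weights, masks):
            pruned = M & (np.abs(W) < H)
            ks.append(int(pruned.sum()))   # k_i: weights pruned in this tensor
            M &= ~pruned
            W[~M] = 0.0
            ls.append(int(M.sum()))        # l_i: surviving weights
        K, L = sum(ks), sum(ls)
        # Adjust H so that roughly target_K weights are pruned per step
        # (the multiplicative rule here is an illustrative assumption).
        if K < (1 - delta) * target_K:
            H *= 2.0
        elif K > (1 + delta) * target_K:
            H *= 0.5
        # Grow: redistribute the K freed parameters across tensors in
        # proportion to each tensor's surviving count l_i; regrown weights
        # are zero-initialized (their mask flips on, their value stays 0).
        for M, l in zip(masks, ls):
            free = np.flatnonzero(~M)
            n_grow = min(int(round(K * l / L)), free.size)
            if n_grow > 0:
                grown = rng.choice(free, size=n_grow, replace=False)
                M.flat[grown] = True
        return H

Running a step like this every few hundred training iterations lets the sparse topology migrate between layers while the total parameter count stays fixed.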

SLIDE 7

Closed the gap between post-training compression and direct training of sparse nets

ResNet-50 on ImageNet: top-1 / top-5 test accuracy (%), [difference from full dense]

Sparsity (# param)                      0.8 (7.3M)                  0.9 (5.1M)
Thin dense                              72.4 [-2.5] / 90.9 [-1.5]   70.7 [-4.2] / 89.9 [-2.5]
Static sparse                           71.6 [-3.3] / 90.4 [-2.0]   67.8 [-7.1] / 88.4 [-4.0]
DeepR (Bellec et al., 2017)             71.7 [-3.2] / 90.6 [-1.8]   70.2 [-4.7] / 90.0 [-2.4]
SET (Mocanu et al., 2018)               72.6 [-2.3] / 91.2 [-1.2]   70.4 [-4.5] / 90.1 [-2.3]
Dynamic sparse (ours)                   73.3 [-1.6] / 92.4 [0.0]    71.6 [-3.3] / 90.5 [-1.9]
Compressed sparse (Zhu & Gupta, 2017)   73.2 [-1.7] / 91.5 [-0.9]   70.3 [-4.6] / 90.0 [-2.4]

Full dense baseline: sparsity 0.0 (25.6M params), 74.9 / 92.4

[Plot: test accuracy (%) of WRN-28-2 on CIFAR10 (vs. number of parameters, K) and ResNet-50 on ImageNet (vs. global sparsity), comparing Static sparse, Compressed sparse, Dynamic sparse, Thin dense, SET, DeepR, and Full dense]

SLIDE 8

Directly trained sparse nets are not “winning tickets”: exploration of structural degrees of freedom is crucial

SLIDE 9

Visit our poster: Wednesday, Pacific Ballroom #248