Transparent parallelization of neural network training
Cyprien Noel Flickr / Yahoo - GTC 2015
Outline
▪ Neural Nets at Flickr ▪ Training Fast ▪ Parallel ▪ Distributed ▪ Q&A
Tagging Photos
Any photo on Flickr is classified using computer vision
Class      Probability
Flowers    0.98
Outdoors   0.95
Cat        0.001
Grass      0.6
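A minimal Python sketch of how per-class scores like the ones above might be turned into auto-tags; the 0.9 threshold and the `auto_tags` helper are illustrative assumptions, not Flickr's actual pipeline.

```python
# Hypothetical sketch: turning classifier scores into auto-tags.
# The threshold value is an assumption, not Flickr's real cutoff.
scores = {"Flowers": 0.98, "Outdoors": 0.95, "Cat": 0.001, "Grass": 0.6}

def auto_tags(scores, threshold=0.9):
    """Keep only classes whose probability clears the threshold."""
    return sorted(c for c, p in scores.items() if p >= threshold)

print(auto_tags(scores))  # ['Flowers', 'Outdoors']
```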
Auto Tags Feeding Search
Tagging the Flickr corpus
▪ Classify millions of new photos per day ▪ Apply new models to billions of photos ▪ Train new models using Caffe
Training new models
▪ Manual experimentation ▪ Hyperparameter search ▪ Limitation is training time → Parallelize Caffe
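Why training time is the limitation: every hyperparameter setting costs a full training run, so search time scales with run time. A toy grid-search sketch, where `train` is a placeholder for an hours-long Caffe run:

```python
import itertools

# Illustrative grid; the train() stub stands in for a full Caffe training
# run, which is the expensive step the talk sets out to parallelize.
grid = {"lr": [0.01, 0.001], "batch": [64, 256]}

def train(lr, batch):
    # placeholder "validation loss": a real run takes hours to days
    return abs(lr - 0.001) + abs(batch - 256) / 1000

best = min(itertools.product(*grid.values()), key=lambda p: train(*p))
print(best)  # (0.001, 256)
```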
Goals
▪ “Transparent” ▪ Code Isolation ▪ Existing Models ▪ Globally connected layers ▪ Existing Infrastructure
Outline
▪ Neural Nets at Flickr ▪ Training Fast ▪ Parallel ▪ Distributed ▪ Q&A
GoogLeNet, 2014
Ways to Parallelize
▪ Model ▪ Caffe team enabling this now ▪ Data ▪ Synchronous ▪ Asynchronous
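A minimal sketch of the synchronous data-parallel variant named above: each worker computes a gradient on its own data shard, and the averaged gradient drives one shared update. The `sync_step` helper and quadratic toy loss are illustrative, not Caffe internals.

```python
import numpy as np

# Synchronous data parallelism in miniature: N gradients, one averaged update.
def sync_step(params, shards, grad_fn, lr=0.1):
    grads = [grad_fn(params, s) for s in shards]  # computed in parallel in practice
    params -= lr * np.mean(grads, axis=0)         # single synchronized update
    return params

# toy objective: each shard pulls params toward its own mean
grad_fn = lambda p, s: 2 * (p - s.mean(axis=0))
shards = [np.ones((4, 2)), 3 * np.ones((4, 2))]
p = sync_step(np.zeros(2), shards, grad_fn)
print(p)  # one step toward the overall mean of 2.0
```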
Outline
▪ Neural Nets at Flickr ▪ Training Fast ▪ Parallel ▪ Distributed ▪ Q&A
First Approach: CPU
▪ Hogwild! (2011) ▪ Cores read and write a shared parameter buffer ▪ No synchronization ▪ Data races have surprisingly low impact
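A toy illustration of the Hogwild idea: threads update shared parameters with no locks, which works when updates are sparse and rarely collide. Here each thread happens to own one index, so collisions never occur; real Hogwild simply tolerates the occasional overlap.

```python
import threading
import numpy as np

# Hogwild-style sketch (after Niu et al. 2011): lock-free updates to a
# shared parameter vector from many threads.
params = np.zeros(8)

def worker(idx):
    for _ in range(1000):
        params[idx] += 0.001  # racy read-modify-write, intentionally unlocked

threads = [threading.Thread(target=worker, args=(i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(params)  # each entry near 1.0 despite the absence of synchronization
```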
MNIST CPU
Hogwild
▪ Plateaus with core counts ▪ Some Potential ▪ On a grid ▪ With model parallelism
But we are at GTC
GPU Cluster
▪ A lot of time spent preparing experiments ▪ Code deployment ▪ Data handling ▪ On-the-fly datasets for “big data”
Outline
▪ Neural Nets at Flickr ▪ Training Fast ▪ Parallel ▪ Distributed ▪ Q&A
Second Approach: Lots of Boxes
▪ Exchange gradients between nodes ▪ Parameter server setup ▪ Easy: move data fast
GPU memory → PCIe → Ethernet
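The parameter-server setup from the bullets above can be sketched as follows. The `ParamServer` class is a hypothetical stand-in; the actual transport (the GPU memory to PCIe to Ethernet path) is elided.

```python
import numpy as np

# Toy parameter server: workers push gradients, the server applies them
# and serves back fresh weights. No networking, no concurrency here.
class ParamServer:
    def __init__(self, dim, lr=0.1):
        self.params = np.zeros(dim)
        self.lr = lr

    def push(self, grad):       # a worker sends its gradient
        self.params -= self.lr * grad

    def pull(self):             # a worker fetches current weights
        return self.params.copy()

ps = ParamServer(4)
for worker_grad in (np.ones(4), 2 * np.ones(4)):
    ps.push(worker_grad)
print(ps.pull())  # [-0.3 -0.3 -0.3 -0.3]
```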
Second Approach: Lots of Boxes
▪ 230 MB × 2 × N per batch ▪ TCP/UDP chokes under the load ▪ Machines become unreachable ▪ No InfiniBand or RoCE available
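Back-of-envelope arithmetic for the first bullet, assuming a ~230 MB parameter blob sent out and back to each of N peers every batch:

```python
# Traffic estimate from the slide: ~230 MB of parameters, x2 directions,
# to each of N peer machines, per batch. N=8 is an arbitrary example.
model_mb = 230

def traffic_per_batch_gb(n_nodes):
    return model_mb * 2 * n_nodes / 1024

print(traffic_per_batch_gb(8))  # ~3.6 GB per batch
```

At 10 GbE's roughly 1.25 GB/s peak, ~3.6 GB is on the order of 3 seconds of pure transfer per batch, consistent with the slide's point that plain Ethernet becomes the bottleneck.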
Second Approach: Lots of Boxes
▪ Modify Caffe: chunk parameters
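A sketch of the chunking idea: split the flat serialized parameter buffer into packet-sized pieces that can be sent and synchronized independently. The 1500-byte size (roughly an Ethernet MTU) is an assumption for illustration.

```python
# Split a flat byte buffer into fixed-size chunks, so each chunk fits a
# network packet and can be exchanged (or retried) on its own.
def chunk(buf, size):
    return [buf[i:i + size] for i in range(0, len(buf), size)]

params = bytes(10_000)          # stand-in for a serialized weight blob
packets = chunk(params, 1500)   # roughly MTU-sized pieces
print(len(packets))  # 7
```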
packet_mmap
Diagram: packet buffer memory-mapped between kernel and application, avoiding per-packet copies.
MNIST
ImageNet
NVIDIA
▪ Large machines ▪ 4 or 8 GPUs ▪ PCIe root switches ▪ InfiniBand
Third Approach: CUDA P2P
▪ GPUs on single machine ▪ Data Feeding ▪ Caffe Pipeline ▪ Async Streams
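The data-feeding and async-streams points above can be sketched as a double-buffered producer-consumer loop: the producer stands in for host-side loading and copies, the consumer for GPU compute (Python threads here; CUDA streams in the real pipeline).

```python
import queue
import threading

# Double-buffered pipeline: the next batch is prepared while the current
# one is consumed. load and compute are stand-ins for I/O and GPU work.
def pipeline(n_batches):
    q = queue.Queue(maxsize=2)          # two slots = double buffering

    def producer():
        for i in range(n_batches):
            q.put(f"batch-{i}")         # stands in for disk I/O + host->GPU copy
        q.put(None)                     # sentinel: no more data

    threading.Thread(target=producer, daemon=True).start()
    done = []
    while (batch := q.get()) is not None:
        done.append(batch)              # stands in for GPU forward/backward
    return done

print(len(pipeline(5)))  # 5
```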
State of Things
▪ Async: ~8× speedup, but no momentum ▪ Sync: ~2× ▪ Combining both, plus model parallelism ▪ Working on auto-tuning of parameters (batch size, learning rate) ▪ Different ratios of compute vs. I/O
Takeaway
▪ Check out Caffe, including Flickr’s contributions ▪ CUDA + Docker = Love ▪ Small SoC servers might be interesting for ML
Flickr vision team · Flickr backend team · Yahoo Labs
cypof@yahoo-inc.com