DISTRIBUTED TRAINING OF DEEP LEARNING MODELS
Mathew Salvaris @msalvaris Ilia Karmanov @ikdeepl Miguel Fierro @miguelgfierro
more info: https://github.com/ilkarman/DeepLearningFrameworks
Rosetta Stone of Deep Learning
ImageNet Competition
[Chart: ImageNet top-5 error (%) by model]
AlexNet (2012): 15.3% · VGG (2014): 7.3% · Inception (2015): 6.7% · ResNet (2015): 3.6% · Inception-ResNet (2016): 3.1% · NASNet (2017): 3.8% · AmoebaNet (2017): 3.8% · ResNeXt Instagram (2018): 2.4% · human: 5.1%
Distributed training mode: Data parallelism
[Diagram: the job manager splits the dataset into subsets; each worker holds a full copy of the CNN model and trains on its own subset.]
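A minimal single-process sketch of the idea (plain NumPy, not any framework's API): every worker holds a full model copy, computes gradients on its own subset of the mini-batch, and the job manager applies the averaged gradient to the shared weights.

```python
import numpy as np

# Single-process sketch of data parallelism: the model is replicated,
# the data is partitioned, and gradients are averaged each step.

rng = np.random.default_rng(0)
w = np.zeros(3)                       # shared model weights (replicated)
X = rng.normal(size=(8, 3))           # one mini-batch of data
y = X @ np.array([1.0, -2.0, 0.5])    # targets from a known linear model

def worker_grad(w, X_sub, y_sub):
    """Mean-squared-error gradient on this worker's subset."""
    err = X_sub @ w - y_sub
    return 2.0 * X_sub.T @ err / len(y_sub)

subsets = np.array_split(np.arange(len(y)), 2)   # two simulated workers
for _ in range(500):
    grads = [worker_grad(w, X[i], y[i]) for i in subsets]
    w -= 0.05 * np.mean(grads, axis=0)           # synchronous update

print(np.round(w, 2))   # approaches [ 1. -2.  0.5]
```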
Distributed training mode: Model parallelism
[Diagram: the job manager splits the CNN model into sub-models; each worker holds one sub-model, and the dataset is visible to all workers.]
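A minimal sketch of the split (plain NumPy; the layer shapes are made up for illustration): each "worker" owns only its own layer's parameters, and activations flow between workers instead of the data being partitioned.

```python
import numpy as np

# Single-process sketch of model parallelism: the model is split by
# layer, and the workers pass activations to each other.

rng = np.random.default_rng(1)

W1 = rng.normal(size=(4, 8)) * 0.5    # sub-model held by worker 1
W2 = rng.normal(size=(8, 2)) * 0.5    # sub-model held by worker 2

def worker1_forward(x):
    return np.maximum(x @ W1, 0.0)    # worker 1: first layer + ReLU

def worker2_forward(h):
    return h @ W2                     # worker 2: second layer

x = rng.normal(size=(3, 4))           # one mini-batch of 3 examples
h = worker1_forward(x)                # worker 1 sends activations ...
out = worker2_forward(h)              # ... worker 2 finishes the pass
print(out.shape)                      # (3, 2)
```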
Data parallelism vs model parallelism
Data parallelism
▪ Easier implementation
▪ Stronger fault tolerance
▪ Higher cluster utilization

Model parallelism
▪ Better scalability of large models
▪ Less memory on each GPU

Why not both? Data parallelism for CNN layers and model parallelism in FC layers.

Source: Alex Krizhevsky. 2014. One weird trick for parallelizing convolutional neural networks. https://arxiv.org/abs/1404.5997
Training strategies: parameter averaging
[Diagram: each worker trains its own copy of the CNN model on its subset of the data; the weights of the workers are then averaged.]
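A sketch of parameter averaging in plain NumPy (illustrative; the learning rate is made up, and for simplicity the weights are averaged after every local step, although in practice synchronization is periodic):

```python
import numpy as np

# Single-process sketch of parameter averaging: each worker takes local
# SGD steps on its own model copy, then all copies are replaced by the
# element-wise average of the workers' weights.

rng = np.random.default_rng(2)
X = rng.normal(size=(16, 3))
y = X @ np.array([2.0, 0.0, -1.0])
splits = np.array_split(np.arange(16), 2)   # each worker's data subset
weights = [np.zeros(3) for _ in splits]     # one model copy per worker

for _ in range(200):
    # Local SGD step on each worker's own subset.
    for k, idx in enumerate(splits):
        err = X[idx] @ weights[k] - y[idx]
        weights[k] -= 0.05 * 2.0 * X[idx].T @ err / len(idx)
    # Synchronization: average the weights across workers.
    avg = np.mean(weights, axis=0)
    weights = [avg.copy() for _ in splits]

print(np.round(weights[0], 2))   # approaches [ 2.  0. -1.]
```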
Training strategies: distributed gradient based
[Diagram: each worker computes gradients of the CNN model on its subset of the data; the gradients are aggregated either synchronously or asynchronously.]
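The two modes can be contrasted in a single-process NumPy sketch (illustrative): the synchronous variant averages all workers' gradients before one update, while the asynchronous variant lets each worker push its gradient as soon as it is ready, possibly computed from stale weights.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(8, 2))
y = X @ np.array([1.0, 3.0])
splits = np.array_split(np.arange(8), 2)

def grad(w, idx):
    err = X[idx] @ w - y[idx]
    return 2.0 * X[idx].T @ err / len(idx)

# Synchronous: one update per round, after all gradients arrive.
w_sync = np.zeros(2)
for _ in range(300):
    g = np.mean([grad(w_sync, i) for i in splits], axis=0)
    w_sync -= 0.05 * g

# Asynchronous: updates interleave; worker 2's gradient is computed
# from a stale snapshot taken before worker 1's update landed.
w_async = np.zeros(2)
for _ in range(300):
    stale = w_async.copy()                        # snapshot read by worker 2
    w_async -= 0.05 * grad(w_async, splits[0])    # worker 1 updates first
    w_async -= 0.05 * grad(stale, splits[1])      # worker 2 used stale weights

print(np.round(w_sync, 2), np.round(w_async, 2))
```

With a small learning rate and one step of staleness both variants still converge here; in practice, asynchronous updates trade some statistical efficiency for not having to wait on the slowest worker.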
Overview of distributed training
▪ Provision clusters
▪ Install software and containers
▪ Distribute data
▪ Schedule jobs
▪ Scale resources
▪ Handle failures
▪ Share results
Azure Distributed Platforms
▪ Batch AI
▪ Batch Shipyard
▪ DL Workspace
Horovod
Batch Shipyard
https://github.com/Azure/batch-shipyard
▪ Run your Docker and Singularity containers within the same job, side-by-side or even concurrently
▪ Load data onto the compute nodes from accessible storage systems and remote filesystems such as Azure Blob or File Storage, and NFS
Batch AI
https://github.com/Azure/BatchAI
▪ Jobs can run in a container as well as on the Data Science Virtual Machine
▪ Data can be loaded from Azure Blob or File Storage, and NFS
DL Workspace
https://github.com/Microsoft/DLWorkspace
▪ Not limited to just Azure
1) Create scripts to run on Batch AI and transfer them to file storage
2) Write the data to storage
3) Create the Docker containers for each DL framework and transfer them to a container registry
Training with Batch AI
1) Create a Batch AI pool
2) Each job pulls in the appropriate container and script, and loads data from the chosen storage
3) Once the job is completed, all the results are written to the fileshare
Batch AI Interface
▪ CLI: az batchai cluster create
▪ Python SDK
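The cluster-creation call can be sketched with the CLI as below; the resource names (myrg, nc24cluster, demo) are placeholders, and the flags follow the 2018-era az batchai CLI, so check `az batchai cluster create --help` for the exact options in your version.

```shell
# Hypothetical resource names; flags per the 2018-era Batch AI CLI.
az batchai cluster create \
    --name nc24cluster \
    --resource-group myrg \
    --vm-size Standard_NC24 \
    --image UbuntuLTS \
    --min 2 --max 2 \
    --user-name demo \
    --password demo
```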
Distributed training with NFS
▪ Batch AI cluster configuration with NFS share
[Diagram: the data is copied to an NFS share, which is mounted on each node of the Batch AI pool alongside the mounted fileshare; the cluster is created with az batchai cluster create.]
Distributed training with blob storage
▪ Batch AI cluster configuration with mounted blob
[Diagram: the data is copied to blob storage, which is mounted on each node of the Batch AI pool alongside the mounted fileshare; the cluster is created with az batchai cluster create.]
Distributed training with local storage
▪ Batch AI cluster configuration with copying the data to the nodes
[Diagram: during node preparation the data is copied from the mounted fileshare to the local storage of each node in the Batch AI pool; the cluster is created with az batchai cluster create.]
Distributed training results

[Charts: distributed training throughput, measured in images/second]
Distributed training with Horovod
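Under the hood, Horovod averages gradients with ring all-reduce. The following is a single-process NumPy simulation of that algorithm (illustrative only; Horovod itself runs the exchange over MPI/NCCL and exposes wrappers such as hvd.DistributedOptimizer):

```python
import numpy as np

def ring_allreduce(tensors):
    """Average equally-shaped 1-D gradients as n ring workers would."""
    n = len(tensors)
    chunks = [list(np.array_split(t.astype(float), n)) for t in tensors]
    # Scatter-reduce: each worker passes one chunk to its right
    # neighbour per step; after n - 1 steps every chunk is fully
    # summed on some worker.
    for s in range(n - 1):
        sent = [chunks[i][(i - s) % n].copy() for i in range(n)]
        for i in range(n):
            chunks[(i + 1) % n][(i - s) % n] += sent[i]
    # All-gather: circulate the completed chunks so every worker ends
    # up holding the whole summed vector.
    for s in range(n - 1):
        sent = [chunks[i][(i + 1 - s) % n].copy() for i in range(n)]
        for i in range(n):
            chunks[(i + 1) % n][(i + 1 - s) % n] = sent[i]
    return [np.concatenate(c) / n for c in chunks]

grads = [np.full(6, float(i)) for i in range(3)]   # 3 workers' gradients
averaged = ring_allreduce(grads)
print(averaged[0])   # every worker now sees the mean: all 1.0
```

Each worker only ever talks to its neighbours, so the bandwidth per worker stays constant as the ring grows; that is the property that makes Horovod scale well.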
Distributed training with PyTorch
Distributed training with Chainer
Distributed training with CNTK
▪ 1-bit SGD with MPI
▪ Blocked Momentum with MPI
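The core trick in 1-bit SGD is error feedback: quantize each transmitted gradient to one bit per element plus a scale, and fold the quantization error into the next step's gradient. A minimal NumPy sketch of that mechanism (illustrative, not CNTK's actual implementation):

```python
import numpy as np

rng = np.random.default_rng(4)
g_stream = rng.normal(size=(100, 4))   # a stream of local gradients
residual = np.zeros(4)                 # quantization error carried forward
sent_total = np.zeros(4)               # what actually went over the wire

for g in g_stream:
    v = g + residual                         # add back the previous error
    q = np.sign(v) * np.mean(np.abs(v))      # 1 bit per element + one scale
    residual = v - q                         # remember what was lost
    sent_total += q

# Error feedback makes sent_total + residual equal the sum of the true
# gradients exactly, so the compressed stream loses no gradient mass
# over time.
print(np.allclose(sent_total + residual, g_stream.sum(axis=0)))  # True
```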
Hongzhi Li, Alex Sutton, Alex Yukhanov
Attribution of some images: http://morguefile.com/