

SLIDE 1

TRAINING NEURAL NETWORKS ON THE EDGE

Navjot Kukreja, Alena Shilova

SLIDE 2

Also: Olivier Beaumont, Jan Huckelheim, Nicola Ferrier, Paul Hovland, Gerard Gorman

SLIDE 3

BACKGROUND

SLIDE 4

Typical data flow pattern for adjoint problems

SLIDE 5

Memory consumption during an adjoint problem

SLIDE 6

Checkpointing (Revolve)

SLIDE 7

Setup

Legend: forward step; executing forward step; saved forward step; reverse step; executing reverse step; reverse step completed.
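A minimal sketch of the scheme these frames animate, assuming hypothetical single-step operators `forward` and `reverse` and a uniform checkpoint spacing `k` (the actual Revolve schedule places checkpoints non-uniformly to minimise recomputation):

```python
# Checkpoint-and-recompute for a forward/reverse (adjoint) sweep.
def forward(state):
    return state + 1            # stand-in for one forward step

def reverse(state, adjoint):
    return adjoint + state      # stand-in for one reverse step

n_steps, k = 12, 4              # total steps; checkpoint every k steps
state, checkpoints = 0, {0: 0}

# Forward sweep: save only every k-th state instead of all of them.
for t in range(n_steps):
    state = forward(state)
    if (t + 1) % k == 0:
        checkpoints[t + 1] = state

# Reverse sweep: restore the nearest earlier checkpoint, recompute the
# missing forward steps, then execute the reverse step.
adjoint = 0
for t in reversed(range(n_steps)):
    state = checkpoints[(t // k) * k]
    for _ in range(t - (t // k) * k):
        state = forward(state)
    adjoint = reverse(state, adjoint)
```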

SLIDE 8

Where else do we see the same data-access pattern?

VGGNet
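Backpropagation through a deep network such as VGGNet touches activations in exactly this forward-then-reverse order. A small illustration with hooks on a toy stand-in model (not VGG itself) makes the pattern visible:

```python
import torch
import torch.nn as nn

# Toy stand-in for a deep sequential network: the backward pass visits
# layers in the reverse of the order in which the forward pass stored
# their activations, the same data-access pattern as an adjoint problem.
net = nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 8), nn.ReLU())

for idx, layer in enumerate(net):
    layer.register_forward_hook(
        lambda m, i, o, idx=idx: print(f"forward  layer {idx}"))
    layer.register_full_backward_hook(
        lambda m, gi, go, idx=idx: print(f"backward layer {idx}"))

net(torch.randn(1, 8)).sum().backward()
# prints: forward 0..3, then backward 3..0
```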

SLIDE 9

ARRAY OF THINGS

SLIDE 10

SLIDE 11

WAGGLE PAYLOAD COMPUTER

ODROID XU4, based on the Samsung Exynos 5422 CPU: four A15 cores and four A7 cores; Mali-T628 MP6 GPU with OpenCL support; 2 GB LPDDR3 RAM; attached flash storage.

SLIDE 12

VIEWPOINT PROBLEM

SLIDE 13

SLIDE 14

SLIDE 15

STUDENT-TEACHER MODEL

SLIDE 16

CHALLENGES

Network (not a challenge)
Storage (not a challenge)
Computation (not necessarily a challenge)
Memory!

SLIDE 17

MEMORY REQUIRED TO TRAIN RESNET

SLIDE 18

Memory required (MB) for image size $224 \times 224$

SLIDE 19

Memory required (MB) for batch size 1

SLIDE 20

Memory required (GB) for batch size 8

SLIDE 21

CHECKPOINTING

SLIDE 22

PyTorch:
fast-evolving Python package widely applied in deep learning
uses Tensors as its basic class; Tensors are similar to NumPy arrays but can also be used on a GPU
dynamically defines the computational graph of the model
designed to be memory efficient: it ships with a checkpointing strategy
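A minimal usage sketch of that built-in strategy, `torch.utils.checkpoint.checkpoint_sequential` (layer sizes and the segment count below are illustrative):

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint_sequential

# Toy sequential model: with checkpointing, only the segment-boundary
# activations are kept during the forward pass; the interior activations
# are recomputed on demand during the backward pass.
model = nn.Sequential(*[nn.Sequential(nn.Linear(256, 256), nn.ReLU())
                        for _ in range(8)])

x = torch.randn(4, 256, requires_grad=True)
out = checkpoint_sequential(model, segments=2, input=x)
out.sum().backward()
```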

SLIDE 23

Checkpoint sequential: number of segments = 2

SLIDE 24

Checkpoint sequential: number of segments = 2

SLIDE 25

Checkpoint sequential: number of segments = 2

SLIDE 26

Checkpoint sequential: number of segments = 2

SLIDE 27

Checkpoint sequential: number of segments = 2

$$ \mbox{Memory} = s - 1 + \bigl(l - \left\lfloor l/s \right\rfloor (s - 1) \bigr). $$
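A quick numeric check of this count (in units of stored activations), for illustrative values of $l$ layers and $s$ segments:

```python
import math

def peak_activations(l, s):
    # s - 1 checkpointed segment boundaries, plus the activations of the
    # segment that is recomputed in full during the backward pass.
    return s - 1 + (l - math.floor(l / s) * (s - 1))

print(peak_activations(l=16, s=2))  # 1 + (16 - 8)  = 9
print(peak_activations(l=16, s=4))  # 3 + (16 - 12) = 7
```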

SLIDE 28

Revolve: dynamic programming

$$\mbox{Opt}[\ell, 1] = \frac{\ell (\ell + 1)}{2} u_f + (\ell + 1) u_b$$
$$\mbox{Opt}[1, c] = u_f + 2 u_b$$
$$\mbox{Opt}[\ell, c] = \min_{1 \leq i \leq \ell - 1} \bigl( i\, u_f + \mbox{Opt}[\ell - i, c - 1] + \mbox{Opt}[i - 1, c] \bigr)$$
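A direct transcription of the recurrence into memoised Python; a sketch assuming unit step costs $u_f$, $u_b$ and taking $\mbox{Opt}[0, c] = u_b$, the base case implied by the $(\ell + 1) u_b$ term above:

```python
from functools import lru_cache

u_f, u_b = 1.0, 1.0   # assumed costs of one forward / one backward step

@lru_cache(maxsize=None)
def opt(l, c):
    if l == 0:
        return u_b    # reversing an empty chain: one backward step
    if c == 1:        # a single checkpoint slot: rerun from the start
        return l * (l + 1) / 2 * u_f + (l + 1) * u_b
    if l == 1:
        return u_f + 2 * u_b
    # Place a checkpoint after i forward steps; recurse on the two parts
    # with c - 1 and c checkpoint slots respectively.
    return min(i * u_f + opt(l - i, c - 1) + opt(i - 1, c)
               for i in range(1, l))

print(opt(20, 3))     # minimal cost to reverse 20 steps with 3 checkpoints
```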

SLIDE 29

Comparison of Checkpoint sequential and Revolve

Batch Size: $1$, Image Size: $224 \times 224$

SLIDE 30

SLIDE 31

Batch Size: $8$, Image Size: $224 \times 224$

SLIDE 32

Batch Size: $1$, Image Size: $500 \times 500$

SLIDE 33

Batch Size: $8$, Image Size: $500 \times 500$

SLIDE 34

PRACTICAL IMPLEMENTATION AND CONCLUDING REMARKS

SLIDE 35

THANK YOU