

SLIDE 1

IMPLEMENTATION OF A PARALLEL BATCH TRAINING ALGORITHM FOR DEEP NEURAL NETWORK

YUPING LIN
IFLYTEK LABORATORY FOR NEURAL COMPUTING FOR MACHINE LEARNING
DEPARTMENT OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE
YORK UNIVERSITY, TORONTO
NOVEMBER 10, 2015

SLIDE 2

OUTLINE

Review

Neural Network Representation

Sequential Trainer

Concurrent Trainer

Dividing tasks

Collecting statistics

Using monitor

Using thread pool

Testing

Future Work

SLIDE 3

REVIEW -- NEURAL NETWORK TRAINING

Forward phase

$$a_k^{(m+1)} = G\Big(\sum_j x_{jk} \cdot a_j^{(m)} + c_k\Big)$$

where $G(y)$ is the nonlinear activation function.

[Figure: multi-layer network, from input layer to output layer]
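To make the forward phase concrete, here is a minimal Java sketch of one layer's computation; the class and method names are illustrative, not the thesis's actual API, and the logistic sigmoid is just one possible choice of $G$:

```java
// Minimal sketch of the forward phase for one layer (hypothetical names).
// Computes a_k^(m+1) = G(sum_j x[j][k] * a[j] + c[k]).
public final class ForwardPhase {
    // Logistic sigmoid as an example nonlinear activation G(y).
    static double activation(double y) {
        return 1.0 / (1.0 + Math.exp(-y));
    }

    // Propagates activations a (layer m) through weights x and biases c.
    static double[] forwardLayer(double[] a, double[][] x, double[] c) {
        double[] next = new double[c.length];
        for (int k = 0; k < c.length; k++) {
            double sum = c[k];
            for (int j = 0; j < a.length; j++) {
                sum += x[j][k] * a[j]; // x[j][k]: weight from unit j to unit k
            }
            next[k] = activation(sum); // a_k^(m+1) = G(...)
        }
        return next;
    }
}
```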

SLIDE 4

REVIEW -- NEURAL NETWORK TRAINING

Error back propagation

$$\varepsilon_l^{(\mathrm{out})} = a_l^{(\mathrm{out})} - U_l^{(\mathrm{out})}$$

$$\varepsilon_j^{(m)} = G'\big(a_j^{(m)}\big) \cdot \sum_k x_{jk} \cdot \varepsilon_k^{(m+1)}$$

where $G'(y)$ is the derivative of the activation function and $U$ is the desired output vector.

Weight updating

$$\Delta x_{jk} = a_j^{(m)} \cdot \varepsilon_k^{(m+1)}$$

$$x_{jk} = x_{jk} - \delta \cdot \Delta x_{jk}$$

where $\delta$ is the learning rate.

[Figure: multi-layer network, from input layer to output layer]
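A matching Java sketch of the backward phase and weight update for one layer, again with illustrative names; expressing the sigmoid derivative in terms of the activation $a$ matches the $G'(a_j^{(m)})$ form used above:

```java
// Sketch of error back-propagation and the weight update for one layer
// (hypothetical names; delta is the learning rate from the slides).
public final class BackwardPhase {
    // Derivative of the logistic sigmoid, written in terms of the activation a.
    static double activationDerivative(double a) {
        return a * (1.0 - a);
    }

    // errNext: epsilon_k^(m+1); a: activations a_j^(m); x: weights x[j][k].
    // Returns epsilon_j^(m) and applies the weight update in place.
    static double[] backwardLayer(double[] a, double[][] x,
                                  double[] errNext, double delta) {
        double[] err = new double[a.length];
        for (int j = 0; j < a.length; j++) {
            double sum = 0.0;
            for (int k = 0; k < errNext.length; k++) {
                sum += x[j][k] * errNext[k];
            }
            err[j] = activationDerivative(a[j]) * sum; // epsilon_j^(m)
        }
        for (int j = 0; j < a.length; j++) {
            for (int k = 0; k < errNext.length; k++) {
                double grad = a[j] * errNext[k]; // Delta x_jk
                x[j][k] -= delta * grad;         // x_jk = x_jk - delta * Delta x_jk
            }
        }
        return err;
    }
}
```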

SLIDE 5

REVIEW -- SEQUENTIAL TRAINING VS. CONCURRENT TRAINING

[Diagram: sequential training vs. concurrent training. Sequential: a single trainer performs the weight updates itself. Concurrent: a master trainer sends training data to multiple worker trainers, collects their statistics, and performs the weight update.]

SLIDE 6

BROAD VIEW OF THE IMPLEMENTATION

There are 3 major components in our implementation:

The neural network representation: package of classes that form a neural network.

Sequential trainers: classes that implement the sequential training algorithm.

Concurrent trainers: classes that implement the concurrent training algorithm.

SLIDE 7

NEURAL NETWORK REPRESENTATION

SLIDE 8

NEURAL NETWORK REPRESENTATION

Main class that represents a multi-layer perceptron.

Has attributes representing the components of a neural network:

Weights

Biases

Activation Functions
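A hypothetical skeleton of such a class is sketched below; the field names and the nested ActivationFunction interface are assumptions for illustration, not the actual implementation:

```java
// Hypothetical skeleton of the main network class; names are illustrative.
public class MultiLayerPerceptron {
    // One activation function per layer, modeled as a simple interface.
    public interface ActivationFunction {
        double apply(double y);       // G(y)
        double derivative(double a);  // G'(y), written in terms of the activation
    }

    private final double[][][] weights;             // weights[m][j][k] between layers m and m+1
    private final double[][] biases;                // biases[m][k] for layer m+1
    private final ActivationFunction[] activations; // activation G for each layer

    public MultiLayerPerceptron(double[][][] weights, double[][] biases,
                                ActivationFunction[] activations) {
        this.weights = weights;
        this.biases = biases;
        this.activations = activations;
    }
}
```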

SLIDE 9

NEURAL NETWORK REPRESENTATION

Represents a weight matrix

Supports forward and backward multiplications.
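As a sketch (with hypothetical names), the two multiplication directions could look like this; forward multiplication serves the forward phase, backward multiplication serves error back-propagation:

```java
// Sketch of a weight-matrix class supporting both multiplication directions
// (illustrative names; the real class may differ).
public class WeightMatrix {
    private final double[][] x; // x[j][k]: weight from unit j (layer m) to unit k (layer m+1)

    public WeightMatrix(double[][] x) {
        this.x = x;
    }

    // Forward multiplication: sum_j x[j][k] * a[j], used in the forward phase.
    public double[] forwardMultiply(double[] a) {
        double[] out = new double[x[0].length];
        for (int k = 0; k < out.length; k++)
            for (int j = 0; j < a.length; j++)
                out[k] += x[j][k] * a[j];
        return out;
    }

    // Backward multiplication: sum_k x[j][k] * err[k], used in back-propagation.
    public double[] backwardMultiply(double[] err) {
        double[] out = new double[x.length];
        for (int j = 0; j < out.length; j++)
            for (int k = 0; k < err.length; k++)
                out[j] += x[j][k] * err[k];
        return out;
    }
}
```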

SLIDE 10

THE SEQUENTIAL TRAINER

SLIDE 11

THE SEQUENTIAL TRAINER

Divide training into 3 layers:

train() for the whole training process

trainEpoch() for the training of each epoch

trainBatch() for the training of each mini-batch
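A hedged sketch of this three-layer structure, with signatures assumed for illustration:

```java
// Sketch of the sequential trainer's three-layer structure; the method
// names come from the slide, the signatures are assumptions.
public abstract class SequentialTrainer {
    // Whole training process: loop over epochs.
    public void train(double[][] data, double[][] targets, int epochs, int batchSize) {
        for (int e = 0; e < epochs; e++) {
            trainEpoch(data, targets, batchSize);
        }
    }

    // One epoch: loop over mini-batches.
    protected void trainEpoch(double[][] data, double[][] targets, int batchSize) {
        for (int start = 0; start < data.length; start += batchSize) {
            int end = Math.min(start + batchSize, data.length);
            trainBatch(data, targets, start, end);
        }
    }

    // One mini-batch: forward phase, back-propagation, weight update.
    protected abstract void trainBatch(double[][] data, double[][] targets,
                                       int start, int end);
}
```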

SLIDE 12

THE CONCURRENT TRAINER

SLIDE 13

THE CONCURRENT TRAINER

Similar to the sequential trainer.

Keeps references to the monitor object and a thread pool.

Concurrency occurs within the trainBatch() method.
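One way trainBatch() could fan work out to the pool and wait for it is sketched below; here a CountDownLatch stands in for the monitor described on the following slides, and all names are illustrative:

```java
// Sketch of a concurrent trainBatch(): the mini-batch is split among worker
// tasks, and the master waits for all of them before applying the update.
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;

public class ConcurrentTrainBatch {
    static void trainBatch(ExecutorService pool, Runnable[] tasks)
            throws InterruptedException {
        CountDownLatch done = new CountDownLatch(tasks.length);
        for (Runnable task : tasks) {
            pool.execute(() -> {
                try {
                    task.run();       // forward + backward pass on this task's samples
                } finally {
                    done.countDown(); // signal completion even on failure
                }
            });
        }
        done.await(); // master blocks until every worker has finished
        // ... apply the accumulated update statistics to the weights here ...
    }
}
```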

SLIDE 14

THE CONCURRENT TRAINER -- DIVIDING TASKS

Implements the java.lang.Runnable interface

Represents a training task for a worker thread.

Keeps references to the monitor object and the global shared variables for the update statistics.
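A hypothetical version of such a task; the fields and the merge/notify steps are assumptions based on this and the next slide:

```java
// Hypothetical worker task implementing java.lang.Runnable; the field names
// (samples, sharedStats, monitor) are assumptions for illustration.
public class TrainingTask implements Runnable {
    private final double[][] samples;   // this task's share of the mini-batch
    private final double[] sharedStats; // global shared update statistics
    private final Object monitor;       // monitor used to signal completion

    public TrainingTask(double[][] samples, double[] sharedStats, Object monitor) {
        this.samples = samples;
        this.sharedStats = sharedStats;
        this.monitor = monitor;
    }

    @Override
    public void run() {
        double[] localStats = new double[sharedStats.length];
        // ... forward and backward phases over `samples`, accumulating into localStats ...
        synchronized (sharedStats) {
            for (int i = 0; i < localStats.length; i++) {
                sharedStats[i] += localStats[i]; // merge local statistics (see next slide)
            }
        }
        synchronized (monitor) {
            monitor.notifyAll(); // tell the master trainer this task is done
        }
    }
}
```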

SLIDE 15

THE CONCURRENT TRAINER -- COLLECTING STATISTICS

The update statistics are accumulated locally within each task.

They are then merged into the shared global variables when the task finishes; multiple tasks may finish concurrently.

The merge needs synchronization: synchronized blocks, compare-and-set, etc.
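Both options could look roughly like this (illustrative code, not the thesis's):

```java
// Two ways the local statistics could be merged into the shared globals,
// matching the options named on the slide.
import java.util.concurrent.atomic.AtomicLong;

public class StatisticsMerge {
    private final double[] sharedGradients = new double[1024];
    private final AtomicLong sharedCount = new AtomicLong();

    // Option 1: a synchronized block guards the whole merge.
    public void mergeWithLock(double[] localGradients) {
        synchronized (sharedGradients) {
            for (int i = 0; i < localGradients.length; i++) {
                sharedGradients[i] += localGradients[i];
            }
        }
    }

    // Option 2: compare-and-set retries a lock-free update of a shared counter.
    public void addCount(long localCount) {
        long current;
        do {
            current = sharedCount.get();
        } while (!sharedCount.compareAndSet(current, current + localCount));
    }
}
```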

SLIDE 16

THE CONCURRENT TRAINER -- USING MONITOR
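The slide's diagram is not reproduced here. As a sketch of the idea, a Java monitor (synchronized methods with wait()/notifyAll()) lets the master trainer block until every worker task has reported completion; the class and method names are hypothetical:

```java
// Minimal monitor sketch: the master waits until all workers have checked in.
public class BatchMonitor {
    private int pending; // number of worker tasks still running

    public synchronized void reset(int workerCount) {
        pending = workerCount;
    }

    // Called by each worker task when it finishes.
    public synchronized void taskFinished() {
        pending--;
        if (pending == 0) {
            notifyAll(); // wake the waiting master trainer
        }
    }

    // Called by the master trainer; blocks until every task has finished.
    public synchronized void awaitAll() throws InterruptedException {
        while (pending > 0) {
            wait();
        }
    }
}
```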

SLIDE 17

THE CONCURRENT TRAINER -- USING THREAD POOL

Repeatedly creating and destroying threads can waste a lot of resources and time.

A fixed-size thread pool can be created in advance to avoid this problem.
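For example, a pool sized to the machine's core count could be created once and reused for every mini-batch; java.util.concurrent is a natural fit here, though the actual implementation may use a different API:

```java
// Sketch: a fixed-size pool created once and reused across all mini-batches.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PoolSetup {
    public static ExecutorService createPool() {
        // One thread per available core; threads are created once, not per batch.
        int threads = Runtime.getRuntime().availableProcessors();
        return Executors.newFixedThreadPool(threads);
    }
}
```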

SLIDE 18

TESTING

The concurrent training algorithm only parallelizes the computations over data samples within each mini-batch.

The computed update statistics should be the same for both the sequential and concurrent algorithms.

We define the concurrent implementation as correct if the model trained by the concurrent trainer is equivalent to the same model trained by the sequential trainer.

Two models are considered equivalent if the differences between all of their weights and biases are within some small error $\zeta$.
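A minimal sketch of that check, assuming the weights and biases of each model are flattened into a single array (names are illustrative):

```java
// Sketch of the equivalence test: every weight and bias of the two trained
// models must differ by less than some small tolerance zeta.
public class EquivalenceTest {
    static boolean equivalent(double[] paramsA, double[] paramsB, double zeta) {
        if (paramsA.length != paramsB.length) return false;
        for (int i = 0; i < paramsA.length; i++) {
            if (Math.abs(paramsA[i] - paramsB[i]) >= zeta) {
                return false; // a weight or bias drifted beyond the tolerance
            }
        }
        return true;
    }
}
```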

SLIDE 19

TESTING

We ran 100 comparison tests, and in all of them the two models were considered equivalent.

SLIDE 20

FUTURE WORK

Run the sequential and the concurrent algorithms on a multicore machine to measure how much training time the concurrent algorithm saves.
