New Modification of Restricted Boltzmann Machine that Considers the Stochasticity of Real Neural Network - PowerPoint PPT Presentation



SLIDE 1

New Modification of Restricted Boltzmann Machine that Considers the Stochasticity of Real Neural Network

Guillermo Barrios Morales · Ruoyu Zhao

SLIDE 2

Restricted Boltzmann Machines (RBM)

Each state (v, h) of an RBM has an energy:

E(v, h) = − Σ_i a_i v_i − Σ_j b_j h_j − Σ_{i,j} v_i W_{ij} h_j

Probability of each state: P(v, h) = e^{−E(v,h)} / Z, where Z = Σ_{v,h} e^{−E(v,h)} is the partition function.

Probability of each visible pattern: P(v) = Σ_h P(v, h).

The goal of the training process is to make the states we want the machine to learn be those with the largest probability. It can be proved that the RBM evolves towards states that are local minima of the energy.
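For a very small machine these quantities can be computed exactly by brute force. The sketch below (illustrative names, not from the slides) enumerates all states, so it is only feasible for a handful of units, but it makes the energy and the two probability definitions concrete:

```python
import numpy as np

def rbm_energy(v, h, W, a, b):
    """Energy of a joint state (v, h): E = -a.v - b.h - v.W.h."""
    return -(a @ v) - (b @ h) - (v @ W @ h)

def visible_probabilities(W, a, b):
    """Exact P(v) for a tiny RBM by enumerating every state (exponential cost)."""
    nv, nh = W.shape
    vs = np.array([[(i >> k) & 1 for k in range(nv)] for i in range(2 ** nv)])
    hs = np.array([[(j >> k) & 1 for k in range(nh)] for j in range(2 ** nh)])
    E = np.array([[rbm_energy(v, h, W, a, b) for h in hs] for v in vs])
    joint = np.exp(-E)          # unnormalized P(v, h)
    Z = joint.sum()             # partition function
    return joint.sum(axis=1) / Z  # marginalize over hidden states
```

Summing the returned vector gives 1, which is a quick sanity check on the normalization.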

SLIDE 3

1-step Contrastive Divergence (CD-1) Algorithm

Use Gibbs sampling to update each neuron: each unit is switched on with a probability given by the logistic sigmoid of its total input.

Update the parameters after 1 step of Gibbs sampling:

ΔW_{ij} = ε (⟨v_i h_j⟩_data − ⟨v_i h_j⟩_recon)

This update approximates the partial derivative of the log-probability of each visible state with respect to the weights.
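The slides give no code, so the following is a minimal numpy sketch of one CD-1 update, assuming binary (Bernoulli) units; the function names and the learning rate `lr` are illustrative:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, a, b, lr, rng):
    """One CD-1 parameter update for a Bernoulli RBM (sketch)."""
    # Up: sample hidden units given the data vector v0
    ph0 = sigmoid(b + v0 @ W)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Down: reconstruct the visibles, then recompute hidden probabilities
    pv1 = sigmoid(a + h0 @ W.T)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(b + v1 @ W)
    # Gradient approximation: <v h>_data - <v h>_recon
    W += lr * (np.outer(v0, ph0) - np.outer(v1, ph1))
    a += lr * (v0 - v1)
    b += lr * (ph0 - ph1)
    return W, a, b
```

Repeating this step over the training set drives the reconstructions toward the data, which is the "largest probability for the learned states" goal stated on the previous slide.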

SLIDE 4

Reduced MNIST/10 (RMNIST/10)

[Figure: 100 training samples of RMNIST/10]

  • MNIST is a dataset of handwritten digit images with 60000 training samples and 10000 testing samples.

  • RMNIST/10 is a dataset that takes 100 random training samples (10 samples for each digit) and the whole 10000 testing samples from MNIST.
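The slides do not show how the subset is drawn; a hedged numpy sketch of one way to build an RMNIST/10-style training set (function name and fixed seed are assumptions):

```python
import numpy as np

def make_rmnist10(images, labels, per_digit=10, seed=0):
    """Pick `per_digit` random training samples per class (10 x 10 = 100 total)."""
    rng = np.random.default_rng(seed)
    picked = []
    for d in range(10):
        idx = np.flatnonzero(labels == d)           # indices of class d
        picked.extend(rng.choice(idx, size=per_digit, replace=False))
    picked = np.array(picked)
    return images[picked], labels[picked]
```

The full 10000-sample MNIST test set is kept unchanged, so only the training data is reduced.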

SLIDE 5
Why use RMNIST/10?

  • State-of-the-art learning machines can reach a classification accuracy of more than 99% when trained on the whole training set.

Test correct rate (%)   Reference
99.65                   Ciresan et al., IJCAI 2011
99.73                   Ciresan et al., ICDAR 2011
99.77                   Ciresan et al., CVPR 2012

Children do not learn from 60000 samples of digits between 0 and 9, yet they can still recognize many different versions of the same number with a high accuracy rate.

SLIDE 6

On the biological basis of the GBM

➢ What we know …
○ Synaptic plasticity plays a key role in memory and learning processes. Long-term potentiation (LTP) and long-term depression (LTD) are mechanisms which strengthen or weaken the synapses, increasing or decreasing the probability of releasing neurotransmitters.
○ Synaptic transmission can be modeled as a stochastic process (W. Maass, A. M. Zador, 1999).
○ A positive bias lowers the firing threshold for an action potential, whereas a negative bias raises it, indirectly increasing or decreasing the probability of releasing neurotransmitters.

➢ What we infer …
○ LTP (LTD) can be realized in our model as an increase (decrease) in the bias of those neurons that have stronger (weaker) connections.
○ The value of the bias for a particular neuron i follows a Gaussian distribution with mean given by the average synapse strength over all the neurons connected to i.

SLIDE 7

GBM: Gaussian Boltzmann Machine

[Plot: Gaussian probability density p_n(n) from which the bias n of each neuron is drawn]
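Slide 6 states that each bias follows a Gaussian whose mean is the average synapse strength of the connected neurons. The slides give no code, so this is a minimal numpy sketch under that description; the function name and the use of a single shared standard deviation `sigma` for both layers are assumptions:

```python
import numpy as np

def sample_gbm_biases(W, sigma, rng):
    """Draw each bias from N(mean incoming weight, sigma^2), per the GBM rule."""
    a_mean = W.mean(axis=1)   # visible unit i: average weight to the hidden layer
    b_mean = W.mean(axis=0)   # hidden unit j: average weight to the visible layer
    a = rng.normal(a_mean, sigma)
    b = rng.normal(b_mean, sigma)
    return a, b
```

Setting `sigma = 0` recovers a deterministic bias equal to the average weight, which is the low-performance limit discussed on slide 17.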

SLIDE 8

Training and testing process of the GBM

[Diagram: Training set (3000 digits) → GBM learns the weights W_ij → Testing set (10000 digits) → "Did it recognize the number?". Each digit is a 28x28 px image with a label from the 10 digits, flattened into a 1x794 visible vector.]
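The 1x794 visible vector in the diagram is presumably the 784 flattened pixels concatenated with a 10-unit one-hot label (784 + 10 = 794). A sketch under that assumption; the binarization threshold of 0.5 is a guess:

```python
import numpy as np

def to_visible_vector(image_28x28, digit):
    """Flatten a 28x28 image and append a one-hot 10-unit label -> 794 units."""
    pixels = (np.asarray(image_28x28).reshape(784) > 0.5).astype(float)
    label = np.zeros(10)
    label[digit] = 1.0
    return np.concatenate([pixels, label])
```

At test time the label units can be left unclamped and read off after sampling, which is one common way such a joint visible layer is used for classification.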

SLIDE 9

Classification Task

[Figure: learned weight filters of the RBM and the GBM]

RBM: 76.7 %
GBM: 78.4 %

SLIDE 10

Reconstruction Task (Pattern Completion)

        40 % Covered    60 % Covered
GBM     61.6 %          26.0 %
RBM     31.4 %          1.2 %
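The slides do not specify how the images are covered; one simple assumption is blanking a fraction of the bottom rows before asking the machine to complete the pattern. A hedged numpy sketch:

```python
import numpy as np

def cover_bottom_rows(image_28x28, fraction):
    """Blank the bottom `fraction` of rows (input for pattern completion)."""
    img = np.array(image_28x28, dtype=float, copy=True)
    n_covered = int(round(28 * fraction))
    if n_covered:
        img[-n_covered:, :] = 0.0   # zero out the covered region
    return img
```

The covered pixels are then left free during Gibbs sampling while the visible rest is clamped, so the machine fills in the missing region.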

SLIDE 11

Reconstruction Task (Noisy input)

        25 % Noise    50 % Noise
GBM     51.5 %        26.0 %
RBM     70.0 %        42.7 %

SLIDE 12

Influence of Machine Parameters on Classification Task with RMNIST/10

Exploring how the stochasticity of the GBM influences its performance

SLIDE 13

Considerations about Repeatability

                                                  RBM                             GBM
Randomness of training samples (considered)       Mean: 0.5974, Var: 4.8836e-04   Mean: 0.6461, Var: 4.9067e-05
Variance within each testing process (ignored)    Mean: 0.6344, Var: 6.5270e-06   Mean: 0.5919, Var: 6.25e-06

SLIDE 14

Setting of parameters

➢ The number of epochs was set to 500.
➢ The number of hidden units was set to 2/3 of the number of visible units.
➢ We assess the variation in performance by changing …
○ The learning rate, for both RBM and GBM.
○ The variance of the Gaussian in the GBM, after setting the learning rate to its optimal value.
➢ For each value of the parameters, we repeat the training process 5 times with different sets of 100 random training vectors, and then test the resulting machine.

SLIDE 15

Influence of Learning Rate in RBM

SLIDE 16

Influence of Learning Rate in GBM

  • Evidently, if we extend the learning time, the optimal learning rate becomes smaller.

  • These results explain why, when we train both machines for a longer time with the same learning rate, the GBM shows better performance.

SLIDE 17

Influence of Gaussian Variance in GBM

  • If we calculate the biases deterministically from the weight matrix, the performance is low.

  • By adding some stochasticity to the updating process of the biases, we reach a higher accuracy.

SLIDE 18

Conclusions

➢ RBM and GBM show similar performance in classification tasks. However, while the RBM reconstructs noisy inputs better, the GBM shows greater accuracy in pattern-completion tasks: GBMs "guess" better than they analyze.
➢ For small training sets, the GBM shows less accuracy in classification tasks, but presents less variance as we increase the learning rate: GBMs can achieve stability with faster learning processes.
➢ We therefore expect GBMs to outperform RBMs in tasks involving very large training sets, which need higher learning rates to be computationally feasible, or whenever faster learning is needed.

One step further …

➢ Compare the performance of both machines using the complete MNIST dataset.
➢ Study the response of the GBM to random pruning of connections.