  1. New Modification of Restricted Boltzmann Machine that Considers the Stochasticity of Real Neural Network. Guillermo Barrios Morales, Ruoyu Zhao.

  2. Restricted Boltzmann Machines (RBM). Each state (v, h) of an RBM has an energy E(v, h) = -\sum_i a_i v_i - \sum_j b_j h_j - \sum_{i,j} v_i w_{ij} h_j. Probability of every state: p(v, h) = e^{-E(v, h)} / Z. Probability of every visible pattern: p(v) = (1/Z) \sum_h e^{-E(v, h)}. It can be proved that the RBM evolves to states that are local minima of the energy. The goal of the training process is to make the states we want the machine to learn be those with the largest probability.
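As a concrete illustration of these definitions, here is a minimal Python sketch of the energy and joint probability of a binary RBM. The variable names (W, a, b) and the brute-force partition sum are illustrative choices, not taken from the slides; the explicit Z is only feasible for tiny machines.

```python
import numpy as np

def energy(v, h, W, a, b):
    """E(v, h) = -a.v - b.h - v.W.h for binary state vectors v and h."""
    return -(a @ v) - (b @ h) - (v @ W @ h)

def joint_probability(v, h, W, a, b, all_states):
    """p(v, h) = exp(-E(v, h)) / Z, where Z sums over every (v, h) pair.
    `all_states` must enumerate all configurations (tiny machines only)."""
    Z = sum(np.exp(-energy(vs, hs, W, a, b)) for vs, hs in all_states)
    return np.exp(-energy(v, h, W, a, b)) / Z
```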

  3. 1-step Contrastive Divergence (CD-1) Algorithm. Partial derivative of the log-probability of each visible state: \partial \log p(v) / \partial w_{ij} = \langle v_i h_j \rangle_{data} - \langle v_i h_j \rangle_{model}. Use Gibbs sampling to update each neuron: turn unit j on with probability p(h_j = 1 | v) = \sigma(b_j + \sum_i v_i w_{ij}). Update the parameters after 1 step of Gibbs sampling: \Delta w_{ij} = \epsilon (\langle v_i h_j \rangle_0 - \langle v_i h_j \rangle_1).
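The update rule above can be written out as a short numpy sketch. This is a generic CD-1 step for binary units under the standard formulation; the batch handling and variable names are my own choices, not the authors' code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, a, b, lr, rng):
    """One CD-1 parameter update for a binary RBM.
    v0: data batch of visible vectors, shape (batch, n_visible)."""
    # Positive phase: sample the hidden units given the data.
    p_h0 = sigmoid(v0 @ W + b)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
    # One Gibbs step back: reconstruct the visibles, then the hiddens.
    p_v1 = sigmoid(h0 @ W.T + a)
    v1 = (rng.random(p_v1.shape) < p_v1).astype(float)
    p_h1 = sigmoid(v1 @ W + b)
    # Update parameters from the difference of data and model correlations.
    n = v0.shape[0]
    W += lr * (v0.T @ p_h0 - v1.T @ p_h1) / n
    a += lr * (v0 - v1).mean(axis=0)
    b += lr * (p_h0 - p_h1).mean(axis=0)
    return W, a, b
```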

  4. Reduced MNIST/10 (RMNIST/10)
● MNIST is a dataset of handwritten digit images with 60000 training samples and 10000 testing samples.
● RMNIST/10 is a dataset that takes 100 random training samples (10 samples for each digit) and the whole 10000 testing samples from MNIST.
[Figure: the 100 training samples of RMNIST/10]
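Building RMNIST/10 from the full MNIST arrays amounts to sampling 10 indices per digit. The sketch below assumes MNIST is already loaded into arrays X_train and y_train; the function name and seed handling are hypothetical.

```python
import numpy as np

def make_rmnist10(X_train, y_train, per_digit=10, seed=0):
    """Draw `per_digit` random training samples of each digit 0-9,
    yielding the 100-sample RMNIST/10 training set."""
    rng = np.random.default_rng(seed)
    idx = np.concatenate([
        rng.choice(np.flatnonzero(y_train == d), size=per_digit, replace=False)
        for d in range(10)
    ])
    rng.shuffle(idx)
    return X_train[idx], y_train[idx]
```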

  5. Why use RMNIST/10?
● State-of-the-art learning machines can reach a classification accuracy of more than 99% if they are taught the whole set of training patterns:

    Test correct rate (%)   Reference
    99.65                   Ciresan et al., IJCAI 2011
    99.73                   Ciresan et al., ICDAR 2011
    99.77                   Ciresan et al., CVPR 2012

● Children do not learn from 60000 samples of digits between 0 and 9, but they can still recognize many different versions of the same number with a high accuracy rate.

  6. On the biological basis of the GBM
➢ What we know …
○ Synaptic plasticity plays a key role in memory and learning processes. Long-term potentiation (LTP) and long-term depression (LTD) are mechanisms which strengthen or weaken the synapses, increasing or decreasing the probability of releasing neurotransmitters.
○ Synaptic transmission can be modeled as a stochastic process (W. Maass, A. M. Zador, 1999).
○ A positive bias lowers the firing threshold for an action potential, whereas a negative bias raises it, thereby indirectly increasing or decreasing the probability of releasing neurotransmitters.
➢ What we infer …
○ LTP (LTD) can be realized in our model as an increase (decrease) in the bias of those neurons that have stronger (weaker) connections.
○ The value of the bias for a particular neuron i follows a Gaussian distribution with mean given by the average synapse strength over all the neurons that are connected to i.

  7. GBM: Gaussian Boltzmann Machine. The bias of each neuron n is not fixed but drawn from a Gaussian distribution, p(b_n) = N(\mu_n, \sigma^2), whose mean \mu_n is the average strength of the weights connecting neuron n to the rest of the network: \mu_n = (1/K_n) \sum_j w_{nj}.
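Reading slides 6 and 7 together, the bias update can be sketched as below. The exact averaging (here, the mean over a neuron's weight row) and the parameter name sigma are my interpretation of the slides, not confirmed code.

```python
import numpy as np

def sample_biases(W, sigma, rng):
    """Draw each bias b_n from N(mu_n, sigma^2), where mu_n is the
    average strength of the weights connecting neuron n to the others."""
    mu = W.mean(axis=1)                      # average synapse strength per neuron
    return rng.normal(loc=mu, scale=sigma)   # scale is the std dev, sigma
```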

  8. Training and testing process of the GBM
[Diagram: Training: each 28x28 px training image and its 10-digit label are flattened into a 1x794 visible vector and fed to the GBM, which learns the weights W_ij from the training set (3000 digits). Testing: each 28x28 px image from the testing set (10000 digits) is presented as a 1x794 vector, and we check whether the machine recognized the number.]
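For reference, 794 = 784 image pixels + 10 one-hot label units. A minimal sketch of assembling that visible vector (the function name is hypothetical):

```python
import numpy as np

def to_visible_vector(image_28x28, digit):
    """Flatten a 28x28 image (784 units) and append a one-hot label
    (10 units) to form the 1x794 visible vector from the diagram."""
    label = np.zeros(10)
    label[digit] = 1.0
    return np.concatenate([image_28x28.ravel(), label])
```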

  9. Classification Task
RBM: 76.7 %    GBM: 78.4 %
[Figure: learned weights of the RBM and of the GBM]

  10. Reconstruction Task (Pattern Completion)
40 % covered: RBM 31.4 %, GBM 61.6 %
60 % covered: RBM 1.2 %, GBM 26.0 %

  11. Reconstruction Task (Noisy Input)
25 % noise: RBM 70.0 %, GBM 51.5 %
50 % noise: RBM 42.7 %, GBM 26.0 %

  12. Influence of Machine Parameters on the Classification Task with RMNIST/10. Exploring how the stochasticity of the GBM influences its performance.

  13. Considerations about Repeatability

    Source of variability                          RBM                           GBM
    Training-sample randomness (considered)        Mean 0.6461, Var 4.9067e-05   Mean 0.5974, Var 4.8836e-04
    Variance of each testing process (ignored)     Mean 0.6344, Var 6.5270e-06   Mean 0.5919, Var 6.25e-06

  14. Setting of parameters
➢ The number of epochs was set to 500.
➢ The number of hidden units was set to 2/3 of the number of visible ones.
➢ We assess the variation in performance by changing …
○ the learning rate, for both RBM and GBM;
○ the variance of the Gaussian in the GBM, after setting the learning rate at its optimal value.
➢ For each value of the parameters, we repeat the training process 5 times with different sets of 100 random training vectors, and then test the resulting machine (a sketch of this protocol follows).
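A sketch of this evaluation protocol, assuming a helper train_and_test(lr, seed) that trains on a fresh random RMNIST/10 subset and returns the test accuracy (the helper and all names are hypothetical):

```python
import numpy as np

def sweep_learning_rates(learning_rates, train_and_test, n_repeats=5, seed=0):
    """For each learning rate, repeat training 5 times on different random
    100-sample training sets and report the mean and variance of accuracy."""
    results = {}
    for lr in learning_rates:
        accs = [train_and_test(lr, seed + r) for r in range(n_repeats)]
        results[lr] = (np.mean(accs), np.var(accs))
    return results
```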

  15. Influence of Learning Rate in RBM

  16. Influence of Learning Rate in GBM
● Evidently, if we extend the learning time, the optimal learning rate becomes smaller.
● These results explain why, when we train both machines for a longer time with the same learning rate, the GBM shows better performance.

  17. Influence of Gaussian Variance in GBM
● If we calculate the biases deterministically from the weight matrix, the performance is low.
● By adding some stochasticity to the updating process of the biases, we reach a higher accuracy.

  18. Conclusions
➢ RBM and GBM show similar performance in classification tasks. On the other hand, while the RBM reconstructs noisy inputs better, the GBM shows greater accuracy in pattern-completion tasks. GBMs "guess" better than they analyze.
➢ For small training sets, the GBM shows lower classification accuracy, but presents less variance as we increase the learning rate. GBMs can achieve stability with faster learning processes.
➢ We therefore expect GBMs to outperform RBMs in tasks involving very large training sets, which need higher learning rates to be computationally feasible, or when faster learning is needed.
One step further …
➢ Compare the performance of both machines using the complete MNIST dataset.
➢ Study the response of the GBM to random pruning of connections.
