New Modification of Restricted Boltzmann Machine that Considers the Stochasticity of Real Neural Network
Guillermo Barrios Morales, Ruoyu Zhao
Restricted Boltzmann Machines (RBM)
Each state of the RBM has an energy:

$$E(v, h) = -\sum_i a_i v_i - \sum_j b_j h_j - \sum_{i,j} v_i w_{ij} h_j$$

Probability of every state:

$$P(v, h) = \frac{e^{-E(v,h)}}{Z}, \qquad Z = \sum_{v,h} e^{-E(v,h)}$$

Probability of every visible pattern:

$$P(v) = \frac{1}{Z} \sum_h e^{-E(v,h)}$$

The goal of the training process is to make the states we want the machine to learn be those with the largest probability. It can be proved that the RBM evolves towards states that are local minima of the energy.
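For a machine small enough to enumerate all states, these probabilities can be computed exactly. A minimal NumPy sketch (our illustration, not the authors' code; sizes and weights are arbitrary):

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(0)
n_v, n_h = 3, 2                      # tiny sizes so exact enumeration is feasible
W = rng.normal(0, 0.1, (n_v, n_h))   # weights w_ij
a = np.zeros(n_v)                    # visible biases a_i
b = np.zeros(n_h)                    # hidden biases b_j

def energy(v, h):
    # E(v,h) = -a.v - b.h - v.W.h
    return -a @ v - b @ h - v @ W @ h

# Partition function Z: sum over all (v, h) configurations.
Z = sum(np.exp(-energy(np.array(v), np.array(h)))
        for v in product([0, 1], repeat=n_v)
        for h in product([0, 1], repeat=n_h))

def p_visible(v):
    # P(v) = (1/Z) * sum_h exp(-E(v,h))
    return sum(np.exp(-energy(np.array(v), np.array(h)))
               for h in product([0, 1], repeat=n_h)) / Z

print(p_visible((1, 0, 1)))
```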
1-step Contrastive Divergence (CD-1) Algorithm
Use Gibbs sampling to update each neuron: update the neurons with a probability

$$P(h_j = 1 \mid v) = \sigma\Big(b_j + \sum_i v_i w_{ij}\Big), \qquad P(v_i = 1 \mid h) = \sigma\Big(a_i + \sum_j h_j w_{ij}\Big), \qquad \sigma(x) = \frac{1}{1 + e^{-x}}$$

Update the parameters after 1 step of Gibbs sampling, following the approximate partial derivatives of the log-likelihood:

$$\Delta w_{ij} = \varepsilon \left( \langle v_i h_j \rangle_{\text{data}} - \langle v_i h_j \rangle_{\text{recon}} \right)$$
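A minimal CD-1 update sketch in NumPy (our illustration of the standard algorithm, not the authors' code; `lr` is the learning rate ε):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(W, a, b, v0, lr, rng):
    """One CD-1 parameter update from a batch of visible vectors v0."""
    # Positive phase: sample hidden units given the data.
    ph0 = sigmoid(v0 @ W + b)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: one step of Gibbs sampling (reconstruction).
    pv1 = sigmoid(h0 @ W.T + a)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W + b)
    # Gradient approximation: <v h>_data - <v h>_recon.
    n = v0.shape[0]
    W += lr * (v0.T @ ph0 - v1.T @ ph1) / n
    a += lr * (v0 - v1).mean(axis=0)
    b += lr * (ph0 - ph1).mean(axis=0)
    return W, a, b
```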
[Figure: probability of each visible state, trained on 100 samples of RMNIST/10]
Reduced MNIST/10 (RMNIST/10)
- MNIST is a dataset of handwritten digit images with 60,000 training samples and 10,000 testing samples.
- RMNIST/10 is a dataset that takes 100 random training samples from MNIST (10 samples for each digit) and the whole 10,000 testing samples.
- State-of-the-art learning machines can reach a classification accuracy of more than 99 % when trained on the full training set.
Test accuracy (%)    Reference
99.65                Ciresan et al., IJCAI 2011
99.73                Ciresan et al., ICDAR 2011
99.77                Ciresan et al., CVPR 2012
Why use RMNIST/10?
Children do not learn from 60,000 samples of the digits 0 to 9, yet they can still recognize many different versions of the same number with high accuracy.
On the biological basis of the GBM
➢ What we know …
○ Synaptic plasticity plays a key role in memory and learning processes. Long-term potentiation (LTP) and long-term depression (LTD) are mechanisms that strengthen or weaken synapses, increasing or decreasing the probability of releasing neurotransmitters.
○ Synaptic transmission can be modeled as a stochastic process (W. Maass, A. M. Zador, 1999).
○ A positive bias lowers the firing threshold for an action potential, whereas a negative bias raises it, thereby indirectly increasing or decreasing the probability of releasing neurotransmitters.
➢ What we infer …
○ LTP (LTD) can be realized in our model as an increase (decrease) in the bias of those neurons that have stronger (weaker) connections.
○ The bias of a particular neuron i follows a Gaussian distribution whose mean is given by the average strength of the synapses connecting neuron i to the rest of the network.
GBM: Gaussian Boltzmann Machine
[Figure: Gaussian distribution p_n(n) of the bias noise n; 28x28 sample images]
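The bias rule described above can be sketched as follows (our reading of the slide: each bias is redrawn from a Gaussian centered on the unit's average synaptic strength; `sigma` is the tunable spread, a name we introduce):

```python
import numpy as np

def sample_gbm_biases(W, sigma, rng):
    """Draw each hidden bias from a Gaussian centered on the mean
    synaptic strength of that unit's connections (assumed reading of
    the GBM rule; sigma is the standard deviation of the Gaussian)."""
    mean_strength = W.mean(axis=0)   # average weight into each hidden unit
    return rng.normal(mean_strength, sigma)
```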
Training and testing process of the GBM
[Diagram: training and testing pipeline]
- Training set: 3,000 digits (10 classes, 28x28 px).
- Testing set: 10,000 digits (10 classes, 28x28 px).
- Each sample enters the machine as a 1x794 vector.
- After training, the learned weights W_ij of the RBM and the GBM are tested: did the machine recognize the number?
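A sketch of how each 1x794 input vector could be assembled, assuming the 794 components are the 784 flattened pixels plus a 10-unit one-hot label block (784 + 10 = 794; this layout is our inference from the diagram):

```python
import numpy as np

def make_input_vector(image_28x28, label):
    """Build the 1x794 input: 28*28 = 784 binarized pixels plus a
    10-unit one-hot label block (assumed layout: 784 + 10 = 794)."""
    pixels = (np.asarray(image_28x28).reshape(784) > 0.5).astype(float)
    one_hot = np.zeros(10)
    one_hot[label] = 1.0
    return np.concatenate([pixels, one_hot])  # shape (794,)
```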
Classification Task
RBM: 76.7 %
GBM: 78.4 %
Reconstruction Task (Pattern Completion)
Covered    GBM       RBM
40 %       61.6 %    31.4 %
60 %       26.0 %    1.2 %
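Pattern completion can be sketched as clamping the known visible pixels and Gibbs-sampling the covered ones (a generic RBM completion scheme, assumed rather than taken from the slides):

```python
import numpy as np

def complete_pattern(W, a, b, v_partial, known_mask, n_steps, rng):
    """Fill in covered pixels: re-clamp the known units after every
    Gibbs step and let the machine settle on the missing ones."""
    v = v_partial.copy()
    for _ in range(n_steps):
        ph = 1 / (1 + np.exp(-(v @ W + b)))      # P(h=1|v)
        h = (rng.random(ph.shape) < ph).astype(float)
        pv = 1 / (1 + np.exp(-(h @ W.T + a)))    # P(v=1|h)
        v = (rng.random(pv.shape) < pv).astype(float)
        v[known_mask] = v_partial[known_mask]    # clamp the uncovered pixels
    return v
```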
Reconstruction Task (Noisy input)
Noise    GBM       RBM
25 %     51.5 %    70.0 %
50 %     26.0 %    42.7 %
Influence of Machine Parameters on Classification Task with RMNIST/10
Exploring how the stochasticity of the GBM influences its performance
Source of randomness                RBM                            GBM
Training samples (considered)       mean 0.5974, var 4.8836e-04    mean 0.6461, var 4.9067e-05
Testing process (ignored)           mean 0.6344, var 6.5270e-06    mean 0.5919, var 6.25e-06

The variance within each testing process is orders of magnitude smaller than the variance due to the randomness of the training samples, so only the latter is considered.
Considerations about Repeatability
➢ The number of epochs was set to 500.
➢ The number of hidden units was set to 2/3 of the number of visible units.
➢ We assess the variation in performance by changing …
○ the learning rate, for both RBM and GBM;
○ the variance of the Gaussian in the GBM, after setting the learning rate to its optimal value.
➢ For each parameter value, we repeat the training process 5 times with different sets of 100 random training vectors, and then test the resulting machine, as sketched below.
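The protocol could look like this in code (our reconstruction; `train_machine` and `test_accuracy` are placeholder names, not the authors' code):

```python
import numpy as np

def evaluate(param_value, train_machine, test_accuracy,
             train_images, train_labels, test_set, rng):
    """Mean and variance of test accuracy over 5 repetitions,
    each with a fresh random RMNIST/10-style training set."""
    scores = []
    for _ in range(5):
        # Draw 10 random samples per digit -> 100 training vectors.
        idx = np.concatenate([
            rng.choice(np.where(train_labels == d)[0], size=10, replace=False)
            for d in range(10)
        ])
        machine = train_machine(train_images[idx], param_value, n_epochs=500)
        scores.append(test_accuracy(machine, test_set))
    return np.mean(scores), np.var(scores)
```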
Setting of parameters
Influence of Learning Rate in RBM
Influence of Learning Rate in GBM
- Evidently, if we extend the training time, the optimal learning rate becomes smaller.
- These results explain why, when we train both machines for a longer time with the same learning rate, the GBM shows better performance.
Influence of Gaussian Variance in GBM
- If we compute the biases deterministically from the weight matrix, the performance is low.
- By adding some stochasticity to the bias-update process, we reach a higher accuracy.
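The two bias rules side by side (a sketch; the matrix size and the value of `sigma` are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(0, 0.1, (784, 523))   # example weight matrix (sizes illustrative)
sigma = 0.05                          # Gaussian spread (hypothetical value)

b_deterministic = W.mean(axis=0)                   # deterministic rule: low accuracy
b_stochastic = rng.normal(W.mean(axis=0), sigma)   # Gaussian rule: higher accuracy
```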
Conclusions
➢ RBM and GBM show similar performance in classification tasks. On the other hand, while the RBM reconstructs noisy inputs better, the GBM shows greater accuracy in pattern-completion tasks: GBMs "guess" better than they analyze.
➢ For small training sets, the GBM shows less accuracy in classification tasks, but presents less variance as we increase the learning rate. GBMs can achieve stability with faster learning processes.
➢ We therefore expect GBMs to outperform RBMs in tasks involving very big training sets that need higher learning rates to be computationally feasible, or when faster learning is needed.
One step further …
➢ Compare the performance of both machines on the complete MNIST dataset.
➢ Study the response of the GBM to random pruning of connections.