

slide-1
SLIDE 1

Neural Networks

Hopfield Nets and Boltzmann Machines Fall 2017


slide-2
SLIDE 2

Recap: Hopfield network

  • At each time, each neuron receives a "field" $\sum_{j \ne i} w_{ji} y_j + b_i$
  • If the sign of the field matches its own sign, it does not respond
  • If the sign of the field opposes its own sign, it "flips" to match the sign of the field

$y_i = \Theta\left(\sum_{j \ne i} w_{ji} y_j + b_i\right)$,   $\Theta(z) = \begin{cases} +1 & \text{if } z > 0 \\ -1 & \text{if } z \le 0 \end{cases}$
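A minimal NumPy sketch of this single-neuron update (names are illustrative, not from the slides; it assumes a symmetric weight matrix with zero diagonal):

```python
import numpy as np

def hopfield_update(y, W, b, i):
    """Asynchronously update neuron i: flip it to match the sign of its local field."""
    field = W[i] @ y + b[i]          # sum_j w_ji * y_j + b_i
    return 1 if field > 0 else -1    # theta(z): +1 if z > 0, else -1
```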

slide-3
SLIDE 3

Recap: Energy of a Hopfield Network

$E = -\sum_{i,\, j < i} w_{ij} y_i y_j - \sum_i b_i y_i$

  • The system will evolve until the energy hits a local minimum
  • In vector form: $E = -\frac{1}{2} \mathbf{y}^T W \mathbf{y} - \mathbf{b}^T \mathbf{y}$

– Bias term may be viewed as an extra input pegged to 1.0

$y_i = \Theta\left(\sum_{j \ne i} w_{ji} y_j + b_i\right)$,   $\Theta(z) = +1$ if $z > 0$, $-1$ if $z \le 0$
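A short sketch of the energy in vector form (illustrative names; assumes NumPy arrays with $y \in \{-1,+1\}^N$ and symmetric $W$ with zero diagonal):

```python
import numpy as np

def hopfield_energy(y, W, b):
    """E = -1/2 y^T W y - b^T y."""
    return -0.5 * y @ W @ y - b @ y
```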

slide-4
SLIDE 4

Recap: Hopfield net computation

  • Very simple
  • Updates can be done sequentially, or all at once
  • Convergence: when $E = -\sum_i \sum_{j < i} w_{ji} y_j y_i$ does not change significantly any more

  • 1. Initialize network with initial pattern

$y_i(0) = x_i$,   $0 \le i \le N-1$

  • 2. Iterate until convergence

$y_i(t+1) = \Theta\left(\sum_{j \ne i} w_{ji} y_j\right)$,   $0 \le i \le N-1$
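Putting the pieces together, a minimal sketch of Hopfield recall by asynchronous updates until no neuron flips (a simplification of the convergence test above; names are illustrative):

```python
import numpy as np

def hopfield_recall(x, W, b, max_sweeps=100):
    """Initialize with pattern x (entries in {-1,+1}) and update neurons until the state is stable."""
    y = x.copy()
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(y)):
            new_yi = 1 if W[i] @ y + b[i] > 0 else -1
            if new_yi != y[i]:
                y[i] = new_yi
                changed = True
        if not changed:          # converged: no flip can lower the energy further
            break
    return y
```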

slide-5
SLIDE 5

Recap: Evolution

  • The network will evolve until it arrives at a local minimum in the energy contour

[Figure: energy (PE) vs. state, showing descent to a local minimum]

$E = -\frac{1}{2} \mathbf{y}^T W \mathbf{y}$

slide-6
SLIDE 6

Recap: Content-addressable memory

  • Each of the minima is a “stored” pattern

– If the network is initialized close to a stored pattern, it will inevitably evolve to the pattern

  • This is a content addressable memory

– Recall memory content from partial or corrupt values

  • Also called associative memory

[Figure: energy (PE) vs. state, with stored patterns at the minima]

slide-7
SLIDE 7

Examples: Content addressable memory

  • http://staff.itee.uq.edu.au/janetw/cmc/chapters/Hopfield/
slide-8
SLIDE 8

Examples: Content addressable memory

  • http://staff.itee.uq.edu.au/janetw/cmc/chapters/Hopfield/

Noisy pattern completion: Initialize the entire network and let the entire network evolve

slide-9
SLIDE 9

Examples: Content addressable memory

  • http://staff.itee.uq.edu.au/janetw/cmc/chapters/Hopfield/

Pattern completion: Fix the “seen” bits and only let the “unseen” bits evolve

slide-10
SLIDE 10

Training a Hopfield Net to “Memorize” target patterns

  • The Hopfield network can be trained to remember specific "target" patterns

– E.g. the pictures in the previous example

  • This can be done by setting the weights $W$ appropriately

Random Question: Can you use backprop to train Hopfield nets? Hint: Think RNN

slide-11
SLIDE 11

Training a Hopfield Net to “Memorize” target patterns

  • The Hopfield network can be trained to remember specific "target" patterns

– E.g. the pictures in the previous example

  • A Hopfield net with $N$ neurons can be designed to store up to $N$ target $N$-bit memories

– But it can store an exponential number of unwanted "parasitic" memories along with the target patterns

  • Training the network: design the weights matrix $W$ such that the energy of…

– Target patterns is minimized, so that they sit in energy wells
– Other, potentially parasitic, patterns is maximized so that they don't become parasitic

slide-12
SLIDE 12

Training the network

[Figure: energy vs. state. Minimize energy of target patterns; maximize energy of all other patterns]

$\hat{W} = \underset{W}{\mathrm{argmin}} \left( \sum_{\mathbf{y} \in Y_P} E(\mathbf{y}) - \sum_{\mathbf{y} \notin Y_P} E(\mathbf{y}) \right)$

slide-13
SLIDE 13

Optimizing W

  • Simple gradient descent:

$E(\mathbf{y}) = -\frac{1}{2} \mathbf{y}^T W \mathbf{y}$

$\hat{W} = \underset{W}{\mathrm{argmin}} \left( \sum_{\mathbf{y} \in Y_P} E(\mathbf{y}) - \sum_{\mathbf{y} \notin Y_P} E(\mathbf{y}) \right)$

$W = W + \eta \left( \sum_{\mathbf{y} \in Y_P} \mathbf{y}\mathbf{y}^T - \sum_{\mathbf{y} \notin Y_P} \mathbf{y}\mathbf{y}^T \right)$

(Minimize energy of target patterns; maximize energy of all other patterns)

slide-14
SLIDE 14

Training the network

$W = W + \eta \left( \sum_{\mathbf{y} \in Y_P} \mathbf{y}\mathbf{y}^T - \sum_{\mathbf{y} \notin Y_P} \mathbf{y}\mathbf{y}^T \right)$

[Figure: energy vs. state. Minimize energy of target patterns; maximize energy of all other patterns]
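A sketch of this update as a single NumPy step, assuming target patterns and "other" patterns are given as lists of ±1 vectors (illustrative names, not from the slides):

```python
import numpy as np

def hopfield_weight_step(W, target_patterns, other_patterns, eta=0.01):
    """W <- W + eta * (sum of y y^T over targets - sum of y y^T over other patterns)."""
    pos = sum(np.outer(y, y) for y in target_patterns)   # lowers the energy of targets
    neg = sum(np.outer(y, y) for y in other_patterns)    # raises the energy of everything else
    W = W + eta * (pos - neg)
    np.fill_diagonal(W, 0)                               # keep zero self-connections
    return W
```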

slide-15
SLIDE 15

Simpler: Focus on confusing parasites

  • Focus on minimizing parasites that can prevent the net from remembering target patterns

– Energy valleys in the neighborhood of target patterns

$W = W + \eta \left( \sum_{\mathbf{y} \in Y_P} \mathbf{y}\mathbf{y}^T - \sum_{\mathbf{y} \notin Y_P,\ \mathbf{y}\ \mathrm{valley}} \mathbf{y}\mathbf{y}^T \right)$

[Figure: energy vs. state]

slide-16
SLIDE 16

Training to maximize memorability of target patterns

[Figure: energy vs. state]

  • Lower energy at valid memories
  • Initialize the network at valid memories and let it evolve

– It will settle in a valley. If this is not the target pattern, raise it

$W = W + \eta \left( \sum_{\mathbf{y} \in Y_P} \mathbf{y}\mathbf{y}^T - \sum_{\mathbf{y} \notin Y_P,\ \mathbf{y}\ \mathrm{valley}} \mathbf{y}\mathbf{y}^T \right)$

slide-17
SLIDE 17

Training the Hopfield network

  • Initialize $W$
  • Compute the total outer product of all target patterns

– More important patterns presented more frequently

  • Initialize the network with each target pattern and let it evolve

– And settle at a valley

  • Compute the total outer product of valley patterns
  • Update weights

$W = W + \eta \left( \sum_{\mathbf{y} \in Y_P} \mathbf{y}\mathbf{y}^T - \sum_{\mathbf{y} \notin Y_P,\ \mathbf{y}\ \mathrm{valley}} \mathbf{y}\mathbf{y}^T \right)$

slide-18
SLIDE 18

Training the Hopfield network: SGD version

  • Initialize $W$
  • Do until convergence, satisfaction, or death from boredom:

– Sample a target pattern $\mathbf{y}_p$

  • Sampling frequency of pattern must reflect importance of pattern

– Initialize the network at $\mathbf{y}_p$ and let it evolve

  • And settle at a valley $\mathbf{y}_v$

– Update weights

  • $W = W + \eta \left( \mathbf{y}_p \mathbf{y}_p^T - \mathbf{y}_v \mathbf{y}_v^T \right)$

$W = W + \eta \left( \sum_{\mathbf{y} \in Y_P} \mathbf{y}\mathbf{y}^T - \sum_{\mathbf{y} \notin Y_P,\ \mathbf{y}\ \mathrm{valley}} \mathbf{y}\mathbf{y}^T \right)$

slide-19
SLIDE 19

More efficient training

  • Really no need to raise the entire surface, or even every valley
  • Raise the neighborhood of each target memory

– Sufficient to make the memory a valley
– The broader the neighborhood considered, the broader the valley

[Figure: energy vs. state]

slide-20
SLIDE 20

Training the Hopfield network: SGD version

  • Initialize $W$
  • Do until convergence, satisfaction, or death from boredom:

– Sample a target pattern $\mathbf{y}_p$

  • Sampling frequency of pattern must reflect importance of pattern

– Initialize the network at $\mathbf{y}_p$ and let it evolve a few steps (2-4)

  • And arrive at a down-valley position $\mathbf{y}_v$

– Update weights

  • $W = W + \eta \left( \mathbf{y}_p \mathbf{y}_p^T - \mathbf{y}_v \mathbf{y}_v^T \right)$

$W = W + \eta \left( \sum_{\mathbf{y} \in Y_P} \mathbf{y}\mathbf{y}^T - \sum_{\mathbf{y} \notin Y_P,\ \mathbf{y}\ \mathrm{valley}} \mathbf{y}\mathbf{y}^T \right)$

slide-21
SLIDE 21

Problem with Hopfield net

  • Why is the recalled pattern not perfect?


slide-22
SLIDE 22

A Problem with Hopfield Nets

  • Many local minima

– Parasitic memories

  • May be escaped by adding some noise during evolution

– Permit changes in state even if energy increases..

  • Particularly if the increase in energy is small

[Figure: energy vs. state, with parasitic memories at spurious minima]

slide-23
SLIDE 23

Recap: Stochastic Hopfield Nets

  • The evolution of the Hopfield net can be made stochastic
  • Instead of deterministically responding to the sign of the

local field, each neuron responds probabilistically

– This is much more in accord with thermodynamic models
– The evolution of the network is more likely to escape spurious "weak" memories

$z_i = \frac{1}{T} \sum_{j \ne i} w_{ji} y_j$

$P(y_i = 1) = \sigma(z_i)$,   $P(y_i = 0) = 1 - \sigma(z_i)$

slide-24
SLIDE 24

Recap: Stochastic Hopfield Nets

  • The evolution of the Hopfield net can be made stochastic
  • Instead of deterministically responding to the sign of the

local field, each neuron responds probabilistically

– This is much more in accord with thermodynamic models
– The evolution of the network is more likely to escape spurious "weak" memories

The field quantifies the energy difference obtained by flipping the current unit:

$P(y_i = 1) = \sigma(z_i)$,   where   $z_i = \frac{1}{T} \sum_{j \ne i} w_{ji} y_j$

slide-25
SLIDE 25

Recap: Stochastic Hopfield Nets

  • The evolution of the Hopfield net can be made stochastic
  • Instead of deterministically responding to the sign of the

local field, each neuron responds probabilistically

– This is much more in accord with thermodynamic models
– The evolution of the network is more likely to escape spurious "weak" memories

If the difference is not large, the probability of flipping approaches 0.5

$P(y_i = 1) = \sigma(z_i)$,   where   $z_i = \frac{1}{T} \sum_{j \ne i} w_{ji} y_j$

The field quantifies the energy difference obtained by flipping the current unit

slide-26
SLIDE 26

Recap: Stochastic Hopfield Nets

  • The evolution of the Hopfield net can be made stochastic
  • Instead of deterministically responding to the sign of the

local field, each neuron responds probabilistically

– This is much more in accord with thermodynamic models
– The evolution of the network is more likely to escape spurious "weak" memories

$P(y_i = 1) = \sigma(z_i)$,   where   $z_i = \frac{1}{T} \sum_{j \ne i} w_{ji} y_j$

If the difference is not large, the probability of flipping approaches 0.5.
The field quantifies the energy difference obtained by flipping the current unit.
$T$ is a "temperature" parameter: increasing it moves the probability of the bits towards 0.5.
At $T = 1.0$ we get the traditional definition of field and energy.
At $T = 0$, we get deterministic Hopfield behavior.

slide-27
SLIDE 27

Evolution of a stochastic Hopfield net

  • 1. Initialize network with initial pattern

!" 0 = %", 0 ≤ ( ≤ ) − 1

  • 2. Iterate 0 ≤ ( ≤ ) − 1

, = - .

/0"

1

/"!/

!" 2 + 1 ~ 5(678(9:(,)

27

Assuming T = 1
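A minimal sketch of one stochastic sweep at $T = 1$, using the 0/1 bit convention of the later Boltzmann-machine slides and Bernoulli sampling of each unit (illustrative names):

```python
import numpy as np

def stochastic_sweep(y, W, T=1.0, rng=np.random.default_rng()):
    """One sweep of stochastic updates: sample each bit from sigma(z_i / T)."""
    for i in range(len(y)):
        z = W[i] @ y / T                       # local field, scaled by temperature
        p = 1.0 / (1.0 + np.exp(-z))           # P(y_i = 1)
        y[i] = 1 if rng.random() < p else 0
    return y
```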

slide-28
SLIDE 28

Evolution of a stochastic Hopfield net

  • When do we stop?
  • What is the final state of the system

– How do we “recall” a memory?

  • 1. Initialize network with initial pattern

!" 0 = %", 0 ≤ ( ≤ ) − 1

  • 2. Iterate 0 ≤ ( ≤ ) − 1

, = - .

/0"

1

/"!/

!" 2 + 1 ~ 5(678(9:(,)

28

Assuming T = 1

slide-29
SLIDE 29

Evolution of a stochastic Hopfield net

  • When do we stop?
  • What is the final state of the system

– How do we “recall” a memory?

  • 1. Initialize network with initial pattern

!" 0 = %", 0 ≤ ( ≤ ) − 1

  • 2. Iterate 0 ≤ ( ≤ ) − 1

, = - .

/0"

1

/"!/

!" 2 + 1 ~ 5(678(9:(,)

29

Assuming T = 1

slide-30
SLIDE 30

Evolution of a stochastic Hopfield net

  • Let the system evolve to “equilibrium”
  • Let !", !$, !%, … , !' be the sequence of values (( large)
  • Final predicted configuration: from the average of the final few iterations

$\bar{y} = \frac{1}{M} \sum_{l = L-M+1}^{L} y_l$

– Estimates the probability that each bit is 1.0
– If it is greater than 0.5, set the bit to 1.0

  • 1. Initialize network with initial pattern

$y_i(0) = x_i$,   $0 \le i \le N-1$

  • 2. Iterate: for $0 \le i \le N-1$

$z_i = \sum_{j \ne i} w_{ji} y_j$,   $y_i(t+1) \sim \mathrm{Bernoulli}(\sigma(z_i))$

(Assuming $T = 1$)

slide-31
SLIDE 31

Annealing

  • Let the system evolve to “equilibrium”
  • Let !", !$, !%, … , !' be the sequence of values (( large)
  • Final predicted configuration: from the average of the final few iterations

$\bar{y} = \frac{1}{M} \sum_{l = L-M+1}^{L} y_l$

  • 1. Initialize network with initial pattern

$y_i(0) = x_i$,   $0 \le i \le N-1$

  • 2. For $T = T_0$ down to $T_{\min}$:

i. For iter $= 1 \ldots L$
   a) For $0 \le i \le N-1$:

$z_i = \frac{1}{T} \sum_{j \ne i} w_{ji} y_j$,   $y_i(t+1) \sim \mathrm{Bernoulli}(\sigma(z_i))$
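A sketch of this annealing loop with a geometric cooling schedule (the schedule and stopping rule are illustrative assumptions, not from the slides); the final bits are obtained by thresholding the average of the last sweeps:

```python
import numpy as np

def anneal(x, W, T0=10.0, Tmin=1.0, decay=0.9, sweeps_per_T=20, rng=np.random.default_rng()):
    """Evolve from pattern x while lowering T, then threshold the average of the final sweeps."""
    y = x.astype(float).copy()
    T, last = T0, []
    while T >= Tmin:
        for _ in range(sweeps_per_T):
            for i in range(len(y)):
                p = 1.0 / (1.0 + np.exp(-(W[i] @ y) / T))   # P(y_i = 1) at temperature T
                y[i] = 1.0 if rng.random() < p else 0.0
            if T * decay < Tmin:        # record states at the final temperature
                last.append(y.copy())
        T *= decay                      # cool the system
    return (np.mean(last, axis=0) > 0.5).astype(int)   # bit is 1 if it was 1 more than half the time
```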

slide-32
SLIDE 32

Evolution of the stochastic network

  • Let the system evolve to “equilibrium”
  • Let !", !$, !%, … , !' be the sequence of values (( large)
  • Final predicted configuration: from the average of the final few iterations

$\bar{y} = \frac{1}{M} \sum_{l = L-M+1}^{L} y_l$

  • 1. Initialize network with initial pattern

$y_i(0) = x_i$,   $0 \le i \le N-1$

  • 2. For $T = T_0$ down to $T_{\min}$:

i. For iter $= 1 \ldots L$
   a) For $0 \le i \le N-1$:

$z_i = \frac{1}{T} \sum_{j \ne i} w_{ji} y_j$,   $y_i(t+1) \sim \mathrm{Bernoulli}(\sigma(z_i))$

Pattern completion: fix the "seen" bits and only let the "unseen" bits evolve.
Noisy pattern completion: initialize the entire network and let the entire network evolve.

slide-33
SLIDE 33

Evolution of a stochastic Hopfield net

  • When do we stop?
  • What is the final state of the system

– How do we “recall” a memory?

  • 1. Initialize network with initial pattern

!" 0 = %", 0 ≤ ( ≤ ) − 1

  • 2. Iterate 0 ≤ ( ≤ ) − 1

, = - .

/0"

1

/"!/

!" 2 + 1 ~ 5(678(9:(,)

33

Assuming T = 1

slide-34
SLIDE 34

Recap: Stochastic Hopfield Nets

  • The probability of each neuron is given by a conditional distribution
  • What is the overall probability of the entire set of neurons taking any configuration $\mathbf{y}$?

$z_i = \frac{1}{T} \sum_{j \ne i} w_{ji} y_j$,   $P(y_i = 1 \mid y_{j \ne i}) = \sigma(z_i)$

slide-35
SLIDE 35

The overall probability

  • The probability of any state $\mathbf{y}$ can be shown to be given by the Boltzmann distribution

$E(\mathbf{y}) = -\frac{1}{2} \mathbf{y}^T W \mathbf{y}$,   $P(\mathbf{y}) = C \exp\left(-E(\mathbf{y})\right)$

– Minimizing energy maximizes log likelihood

$z_i = \frac{1}{T} \sum_{j \ne i} w_{ji} y_j$,   $P(y_i = 1 \mid y_{j \ne i}) = \sigma(z_i)$

slide-36
SLIDE 36

The Hopfield net is a distribution

  • The Hopfield net is a probability distribution over binary sequences

– The Boltzmann distribution: $E(\mathbf{y}) = -\frac{1}{2} \mathbf{y}^T W \mathbf{y}$,  $P(\mathbf{y}) = C \exp\left(-E(\mathbf{y})\right)$
– The parameter of the distribution is the weights matrix $W$

  • The conditional distribution of individual bits in the sequence is a logistic
  • We will call this a Boltzmann machine

$z_i = \sum_{j} w_{ji} y_j$,   $P(y_i = 1 \mid y_{j \ne i}) = \frac{1}{1 + e^{-z_i}}$

slide-37
SLIDE 37

The Boltzmann Machine

  • The entire model can be viewed as a generative model
  • Has a probability of producing any binary vector $\mathbf{y}$:

$E(\mathbf{y}) = -\frac{1}{2} \mathbf{y}^T W \mathbf{y}$,   $P(\mathbf{y}) = C \exp\left(-E(\mathbf{y})\right)$

$z_i = \sum_{j} w_{ji} y_j$,   $P(y_i = 1 \mid y_{j \ne i}) = \frac{1}{1 + e^{-z_i}}$

slide-38
SLIDE 38

Training the network

  • Training a Hopfield net: must learn weights to "remember" target states and "dislike" other states

– "State" == binary pattern of all the neurons

  • Training a Boltzmann machine: must learn weights to assign a desired probability distribution to states

– (vectors $\mathbf{y}$, which we will now call $S$ because I'm too lazy to normalize the notation)
– This should assign more probability to patterns we "like" (or try to memorize) and less to other patterns

$E(S) = -\sum_{i < j} w_{ij} s_i s_j$

$P(S) = \frac{\exp\left(-E(S)\right)}{\sum_{S'} \exp\left(-E(S')\right)} = \frac{\exp\left(\sum_{i<j} w_{ij} s_i s_j\right)}{\sum_{S'} \exp\left(\sum_{i<j} w_{ij} s_i' s_j'\right)}$

slide-39
SLIDE 39

Training the network

  • Must train the network to assign a desired probability distribution to states
  • Given a set of "training" inputs $S_1, \ldots, S_N$

– Assign higher probability to patterns seen more frequently
– Assign lower probability to patterns that are not seen at all

  • Alternately viewed: maximize likelihood of stored states

[Figure: visible neurons]

$E(S) = -\sum_{i < j} w_{ij} s_i s_j$,   $P(S) = \frac{\exp\left(-E(S)\right)}{\sum_{S'} \exp\left(-E(S')\right)} = \frac{\exp\left(\sum_{i<j} w_{ij} s_i s_j\right)}{\sum_{S'} \exp\left(\sum_{i<j} w_{ij} s_i' s_j'\right)}$

slide-40
SLIDE 40

Maximum Likelihood Training

  • Maximize the average log likelihood of all “training”

vectors $\mathbf{S} = \{S_1, S_2, \ldots, S_N\}$

– In the first summation, $s_i$ and $s_j$ are bits of $S$
– In the second, $s_i'$ and $s_j'$ are bits of $S'$

$\log P(S) = \sum_{i<j} w_{ij} s_i s_j - \log \sum_{S'} \exp\left(\sum_{i<j} w_{ij} s_i' s_j'\right)$

$E[\log P(\mathbf{S})] = \frac{1}{N} \sum_{S \in \mathbf{S}} \log P(S) = \frac{1}{N} \sum_{S} \sum_{i<j} w_{ij} s_i s_j - \log \sum_{S'} \exp\left(\sum_{i<j} w_{ij} s_i' s_j'\right)$

slide-41
SLIDE 41

Maximum Likelihood Training

  • We will use gradient descent, but we run into a problem..
  • The first term is just the average $s_i s_j$ over all training patterns
  • But the second term is summed over all states

– Of which there can be an exponential number!

$E[\log P(\mathbf{S})] \approx \frac{1}{N} \sum_{S} \sum_{i<j} w_{ij} s_i s_j - \log \sum_{S'} \exp\left(\sum_{i<j} w_{ij} s_i' s_j'\right)$

$\frac{\partial E[\log P(\mathbf{S})]}{\partial w_{ij}} \approx \frac{1}{N} \sum_{S} s_i s_j \; - \; ???$

slide-42
SLIDE 42

The second term

  • The second term is simply the expected value of $s_i s_j$ over all possible values of the state
  • We cannot compute it exhaustively, but we can compute it by sampling!

$\frac{\partial}{\partial w_{ij}} \log \sum_{S'} \exp\left(\sum_{i<j} w_{ij} s_i' s_j'\right) = \sum_{S'} \frac{\exp\left(\sum_{i<j} w_{ij} s_i' s_j'\right)}{\sum_{S''} \exp\left(\sum_{i<j} w_{ij} s_i'' s_j''\right)}\, s_i' s_j' = \sum_{S'} P(S')\, s_i' s_j'$
slide-43
SLIDE 43

The simulation solution

  • Initialize the network randomly and let it “evolve”

– By probabilistically selecting state values according to our model

  • After many many epochs, take a snapshot of the state
  • Repeat this many many times
  • Let the collection of states be

!"#$%& = {)"#$%&,+, )"#$%&,+,-, … , )"#$%&,/}

slide-44
SLIDE 44

The simulation solution for the second term

  • The second term in the derivative is computed

as the average of sampled states when the network is running “freely”

$\sum_{S'} P(S')\, s_i' s_j' \approx \frac{1}{M} \sum_{S' \in \mathbf{S}_{\mathrm{simul}}} s_i' s_j'$

$\frac{\partial}{\partial w_{ij}} \log \sum_{S'} \exp\left(\sum_{i<j} w_{ij} s_i' s_j'\right) = \sum_{S'} P(S')\, s_i' s_j'$

slide-45
SLIDE 45

Maximum Likelihood Training

  • The overall gradient ascent rule

$\log P(\mathbf{S}) = \frac{1}{N} \sum_{S} \sum_{i<j} w_{ij} s_i s_j - \log \sum_{S' \in \mathbf{S}_{\mathrm{simul}}} \exp\left(\sum_{i<j} w_{ij} s_i' s_j'\right)$

$\frac{\partial \log P(\mathbf{S})}{\partial w_{ij}} = \frac{1}{N} \sum_{S} s_i s_j - \frac{1}{M} \sum_{S' \in \mathbf{S}_{\mathrm{simul}}} s_i' s_j'$

$w_{ij} = w_{ij} + \eta \frac{\partial \log P(\mathbf{S})}{\partial w_{ij}}$

(Empirical estimate)
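A sketch of this gradient ascent step, given a batch of training states and a batch of simulated ("free-running") states as binary vectors (illustrative names; the sampling itself is sketched elsewhere):

```python
import numpy as np

def boltzmann_weight_step(W, train_states, simul_states, eta=0.01):
    """w_ij += eta * ( <s_i s_j>_data - <s_i s_j>_model ), both terms estimated by averaging."""
    pos = np.mean([np.outer(s, s) for s in train_states], axis=0)   # data term: (1/N) sum s_i s_j
    neg = np.mean([np.outer(s, s) for s in simul_states], axis=0)   # model term from simulation
    W = W + eta * (pos - neg)
    np.fill_diagonal(W, 0)   # no self-connections
    return W
```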

slide-46
SLIDE 46

Overall Training

  • Initialize weights
  • Let the network run to obtain simulated state samples
  • Compute gradient and update weights
  • Iterate

!"# = !"# + & ' log + , '!"#

' log + , '!"# = 1 . / 1"1

# − 1

3 /

04∈,6789:

1"

41 # 4

slide-47
SLIDE 47

Overall Training

!"# = !"# + & ' log + , '!"#

' log + , '!"# = 1 . / 1"1

# − 1

3 /

04∈,6789:

1"

41 # 4

state Energy

Note the similarity to the update rule for the Hopfield network

slide-48
SLIDE 48

Adding Capacity to the Hopfield Network / Boltzmann Machine

  • The network can store up to $N$ $N$-bit patterns
  • How do we increase the capacity?

slide-49
SLIDE 49

Expanding the network

  • Add a large number of neurons whose actual values you don't care about!

[Figure: a network of N neurons expanded with K additional neurons]

slide-50
SLIDE 50

Expanded Network

  • New capacity: ~$(N + K)$ patterns

– Although we only care about the pattern of the first N neurons
– We're interested in N-bit patterns

[Figure: a network of N neurons expanded with K additional neurons]

slide-51
SLIDE 51

Terminology

  • Terminology:

– The neurons that store the actual patterns of interest: visible neurons
– The neurons that only serve to increase the capacity but whose actual values are not important: hidden neurons
– These can be set to anything in order to store a visible pattern

[Figure: visible neurons and hidden neurons]

slide-52
SLIDE 52

Training the network

  • For a given pattern of visible neurons, there are any number of hidden patterns ($2^K$)
  • Which of these do we choose?

– Ideally choose the one that results in the lowest energy
– But that's an exponential search space!

[Figure: visible neurons and hidden neurons]

slide-53
SLIDE 53

The patterns

  • In fact we could have multiple hidden patterns coupled with any visible pattern

– These would be multiple stored patterns that all give the same visible output
– How many do we permit?

  • Do we need to specify one or more particular hidden patterns?

– How about all of them?
– What do I mean by this bizarre statement?

slide-54
SLIDE 54

Boltzmann machine without hidden units

  • This basic framework has no hidden units
  • Extended to have hidden units

!"# = !"# + & ' log + , '!"#

' log + , '!"# = 1 . / 1"1

# − 1

3 /

04∈,6789:

1"

41 # 4

slide-55
SLIDE 55

With hidden neurons

  • Now, with hidden neurons, the complete state pattern for even the training patterns is unknown

– Since they are only defined over visible neurons

[Figure: visible neurons and hidden neurons]

slide-56
SLIDE 56

With hidden neurons

  • We are interested in the marginal probabilities over visible bits

– We want to learn to represent the visible bits
– The hidden bits are the "latent" representation learned by the network

  • $S = (V, H)$

– $V$ = visible bits
– $H$ = hidden bits

[Figure: visible neurons and hidden neurons]

$P(S) = \frac{\exp\left(-E(S)\right)}{\sum_{S'} \exp\left(-E(S')\right)}$,   $P(V) = \sum_{H} P(S)$

slide-57
SLIDE 57

More simulations

  • Maximizing the marginal probability of $V$ requires summing over all values of $H$

– An exponential state space
– So we will use simulations again

[Figure: visible neurons and hidden neurons]

$P(S) = \frac{\exp\left(-E(S)\right)}{\sum_{S'} \exp\left(-E(S')\right)}$,   $P(V) = \sum_{H} P(S)$

slide-58
SLIDE 58

Step 1

  • For each training pattern $V_p$:

– Fix the visible units to $V_p$
– Let the hidden neurons evolve from a random initial point to generate $H_p$
– Generate $S_p = [V_p, H_p]$

  • Repeat $K$ times to generate synthetic training set

$\mathbf{S} = \{S_{1,1}, S_{1,2}, \ldots, S_{1,K}, S_{2,1}, \ldots, S_{N,K}\}$

[Figure: visible neurons and hidden neurons]
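A sketch of this clamped ("positive phase") sampling, assuming the state vector stacks the visible bits first and the hidden bits last, and reusing the sigmoid update from earlier (illustrative names):

```python
import numpy as np

def sample_clamped(v, W, n_hidden, n_sweeps=50, rng=np.random.default_rng()):
    """Clamp the visible units to v and let only the hidden units evolve stochastically."""
    s = np.concatenate([v, rng.integers(0, 2, n_hidden)]).astype(float)
    for _ in range(n_sweeps):
        for i in range(len(v), len(s)):          # only hidden indices are updated
            p = 1.0 / (1.0 + np.exp(-(W[i] @ s)))
            s[i] = 1.0 if rng.random() < p else 0.0
    return s                                     # full state [v, h] for the positive-phase average
```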

slide-59
SLIDE 59

Step 2

  • Now unclamp the visible units and let the entire network evolve several times to generate

$\mathbf{S}_{\mathrm{simul}} = \{S_{\mathrm{simul},1}, S_{\mathrm{simul},2}, \ldots, S_{\mathrm{simul},M}\}$

[Figure: visible neurons and hidden neurons]

slide-60
SLIDE 60

Gradients

  • Gradients are computed as before, except that

the first term is now computed over the expanded training data

$\frac{\partial \log P(\mathbf{S})}{\partial w_{ij}} = \frac{1}{NK} \sum_{S} s_i s_j - \frac{1}{M} \sum_{S' \in \mathbf{S}_{\mathrm{simul}}} s_i' s_j'$

slide-61
SLIDE 61

Overall Training

  • Initialize weights
  • Run simulations to get clamped and unclamped

training samples

  • Compute gradient and update weights
  • Iterate

!"# = !"# − & ' log + , '!"#

' log + , '!"# = 1 ./ 0

1

2"2

# − 1

3

45∈,789:;

2"

52 # 5

slide-62
SLIDE 62

Boltzmann machines

  • Stochastic extension of Hopfield nets
  • Enables storage of many more patterns than

Hopfield nets

  • But also enables computation of probabilities of patterns, and completion of patterns
slide-63
SLIDE 63

Boltzmann machines: Overall

  • Training: Given a set of training patterns

– Which could be repeated to represent relative probabilities

  • Initialize weights
  • Run simulations to get clamped and unclamped training samples
  • Compute gradient and update weights
  • Iterate

!"# = !"# − & ' log + , '!"#

' log + , '!"# = 1 ./ 0

1

2"2

# − 1

3

45∈,789:;

2"

52 # 5

<" = 0

#

!

#"2" + >"

+(2" = 1) = 1 1 + ABC8

slide-64
SLIDE 64

Boltzmann machines: Overall

  • Running: pattern completion

– "Anchor" the known visible units
– Let the network evolve
– Sample the unknown visible units

  • Choose the most probable value
slide-65
SLIDE 65

Applications

  • Filling out patterns
  • Denoising patterns
  • Computing conditional probabilities of patterns
  • Classification!!

– How?

slide-66
SLIDE 66

Boltzmann machines for classification

  • Training patterns:

– [f1, f2, f3, …, class]
– Features can have binarized or continuous-valued representations
– Classes have "one hot" representation

  • Classification:

– Given features, anchor features, estimate a posteriori probability distribution over classes

  • Or choose most likely class
slide-67
SLIDE 67

Boltzmann machines: Issues

  • Training takes forever
  • Doesn’t really work for large problems

– A small number of training instances over a small number of bits

slide-68
SLIDE 68

Solution: Restricted Boltzmann Machines

  • Partition visible and hidden units

– Visible units ONLY talk to hidden units
– Hidden units ONLY talk to visible units

  • Restricted Boltzmann machine..

– Originally proposed as “Harmonium Models” by Paul Smolensky

VISIBLE HIDDEN

slide-69
SLIDE 69

Solution: Restricted Boltzmann Machines

  • Still obeys the same rules as a regular Boltzmann machine
  • But the modified structure adds a big benefit..

[Figure: bipartite graph of VISIBLE and HIDDEN units]

$z_i = \sum_{j} w_{ji} s_j + b_i$,   $P(s_i = 1) = \frac{1}{1 + e^{-z_i}}$

slide-70
SLIDE 70

Solution: Restricted Boltzmann Machines

[Figure: bipartite graph of VISIBLE and HIDDEN units]

$z_j = \sum_{i} w_{ij} v_i + b_j$,   $P(h_j = 1) = \frac{1}{1 + e^{-z_j}}$

$z_i = \sum_{j} w_{ij} h_j + b_i$,   $P(v_i = 1) = \frac{1}{1 + e^{-z_i}}$

slide-71
SLIDE 71

Recap: Training full Boltzmann machines: Step 1

  • For each training pattern $V_p$:

– Fix the visible units to $V_p$
– Let the hidden neurons evolve from a random initial point to generate $H_p$
– Generate $S_p = [V_p, H_p]$

  • Repeat $K$ times to generate synthetic training set

$\mathbf{S} = \{S_{1,1}, S_{1,2}, \ldots, S_{1,K}, S_{2,1}, \ldots, S_{N,K}\}$

[Figure: visible neurons and hidden neurons]

slide-72
SLIDE 72

Sampling: Restricted Boltzmann machine

  • For each sample:

– Anchor visible units
– Sample from hidden units
– No looping!!

[Figure: bipartite graph of VISIBLE and HIDDEN units]

$z_j = \sum_{i} w_{ij} v_i + b_j$,   $P(h_j = 1) = \frac{1}{1 + e^{-z_j}}$

slide-73
SLIDE 73

Recap: Training full Boltzmann machines: Step 2

  • Now unclamp the visible units and let the entire network evolve several times to generate

$\mathbf{S}_{\mathrm{simul}} = \{S_{\mathrm{simul},1}, S_{\mathrm{simul},2}, \ldots, S_{\mathrm{simul},M}\}$

[Figure: visible neurons and hidden neurons]

slide-74
SLIDE 74

Sampling: Restricted Boltzmann machine

  • For each sample:

– Iteratively sample hidden and visible units for a long time
– Draw final sample of both hidden and visible units

[Figure: bipartite graph of VISIBLE and HIDDEN units]

$z_j = \sum_{i} w_{ij} v_i + b_j$,   $P(h_j = 1) = \frac{1}{1 + e^{-z_j}}$

$z_i = \sum_{j} w_{ij} h_j + b_i$,   $P(v_i = 1) = \frac{1}{1 + e^{-z_i}}$

slide-75
SLIDE 75

Pictorial representation of RBM training

  • For each sample:

– Initialize $v_0$ (visible) to the training instance value
– Iteratively generate hidden and visible units

  • For a very long time

[Figure: Gibbs chain $v_0 \to h_0 \to v_1 \to h_1 \to v_2 \to h_2 \to \cdots \to v_\infty, h_\infty$]

slide-76
SLIDE 76

Pictorial representation of RBM training

  • Gradient (showing only one edge, from visible node $i$ to hidden node $j$)
  • $\langle v_i h_j \rangle$ represents an average over many generated training samples

[Figure: Gibbs chain $v_0 \to h_0 \to v_1 \to h_1 \to v_2 \to h_2 \to \cdots \to v_\infty, h_\infty$]

$\frac{\partial \log p(v)}{\partial w_{ij}} = \langle v_i h_j \rangle^0 - \langle v_i h_j \rangle^\infty$

slide-77
SLIDE 77

Recall: Hopfield Networks

  • Really no need to raise the entire surface, or even every valley
  • Raise the neighborhood of each target memory

– Sufficient to make the memory a valley
– The broader the neighborhood considered, the broader the valley

[Figure: energy vs. state]

slide-78
SLIDE 78

A Shortcut: Contrastive Divergence

  • Sufficient to run one iteration!
  • This is sufficient to give you a good estimate of

the gradient

[Figure: one step of the Gibbs chain $v_0 \to h_0 \to v_1 \to h_1$]

$\frac{\partial \log p(v)}{\partial w_{ij}} = \langle v_i h_j \rangle^0 - \langle v_i h_j \rangle^1$
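A sketch of one CD-1 update for a single training vector, chaining the two conditional samplers together (illustrative names; biases omitted for brevity):

```python
import numpy as np

def cd1_step(W, v0, eta=0.01, rng=np.random.default_rng()):
    """Contrastive divergence with one Gibbs step: W += eta * (<v h>^0 - <v h>^1)."""
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    p_h0 = sigmoid(v0 @ W)                                   # P(h = 1 | v0)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
    p_v1 = sigmoid(h0 @ W.T)                                 # reconstruct visibles
    v1 = (rng.random(p_v1.shape) < p_v1).astype(float)
    p_h1 = sigmoid(v1 @ W)                                   # P(h = 1 | v1)
    return W + eta * (np.outer(v0, p_h0) - np.outer(v1, p_h1))
```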

slide-79
SLIDE 79

Restricted Boltzmann Machines

  • Excellent generative models for binary (or

binarized) data

  • Can also be extended to continuous-valued data

– “Exponential Family Harmoniums with an Application to Information Retrieval”, Welling et al., 2004

  • Useful for classification and regression

– How?
– More commonly used to pretrain models

slide-80
SLIDE 80

Continuous-valued RBMs

[Figure: bipartite graph of VISIBLE and HIDDEN units]

$z_j = \sum_{i} w_{ij} v_i + b_j$,   $P(h_j = 1) = \frac{1}{1 + e^{-z_j}}$

$z_i = \sum_{j} w_{ij} h_j + b_i$,   $P(v_i)$: a continuous density parameterized by $z_i$

Hidden units may also be continuous values

slide-81
SLIDE 81

Other variants

  • Left: “Deep” Boltzmann machines
  • Right: Helmholtz machine

– Trained by the “wake-sleep” algorithm

slide-82
SLIDE 82

Topics missed..

  • Other algorithms for learning and inference over RBMs

– Mean field approximations

  • RBMs as feature extractors

– Pre training

  • RBMs as generative models
  • More structured DBMs
