Local Representation Alignment: A Biologically Motivated Algorithm for Training Neural Systems
Alexander G. Ororbia II The Neural Adaptive Computing (NAC) Laboratory Rochester Institute of Technology
Collaborators: The Pennsylvania State University
Equilibrium propagation (EP)
Contrastive Divergence (CD)
Credit assignment algorithms: Backprop, CHL, LRA
Update rules / optimizers: SGD, Adam, RMSprop
Loss functions: MSE, MAE, CNLL
Datasets: MNIST
Architectures: MLP, AE, BM, RNN
MLP = Multilayer perceptron
AE = Autoencoder
BM = Boltzmann machine
RNN = Recurrent neural network
Global optimization, back-prop through whole graph.
Requires differentiability → difficulty in handling discrete-valued functions
Susceptibility to adversarial samples
Illustration: forward propagation in a multilayer perceptron (MLP) to collect activities (shared across most of the algorithms considered, i.e., backprop, random feedback alignment, direct feedback alignment, and local representation alignment)
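To make that shared inference pass concrete, here is a minimal NumPy sketch (my own illustration, not code from the talk): a bias-free tanh MLP whose forward pass records the pre- and post-activations that every algorithm below consumes.

```python
import numpy as np

def forward(x, weights, phi=np.tanh):
    """Forward pass through an MLP, collecting pre- and post-activations.

    A minimal sketch: `weights` is a list of matrices, `phi` is tanh, and
    biases are omitted; these are illustrative choices, not the talk's setup.
    """
    h_list, z_list = [], [x]   # pre-activations h, post-activations z (z[0] = input)
    z = x
    for W in weights:
        h = W @ z              # pre-activation of the next layer
        z = phi(h)             # post-activation
        h_list.append(h)
        z_list.append(z)
    return h_list, z_list
```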
Conducting credit assignment using the activities produced by the inference pass
Pass the error signal back through the post-activations (to get derivatives w.r.t. the pre-activations)
Pass the error signal back through the (incoming) synaptic weights to get the error signal transmitted to the post-activations in the layer below
Repeat the previous steps, layer by layer (the recursive treatment that defines the backprop procedure); see the sketch below
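A sketch of that recursion, continuing the hypothetical `forward` helper above; taking the output error to be the MSE gradient (prediction minus target) is an assumption for illustration.

```python
def backprop_deltas(weights, h_list, z_list, y,
                    dphi=lambda h: 1.0 - np.tanh(h)**2):
    """Backprop credit assignment over the activities collected by `forward`."""
    grads = [None] * len(weights)
    e = z_list[-1] - y                      # output error signal (MSE gradient)
    for l in reversed(range(len(weights))):
        d = e * dphi(h_list[l])             # back through the post-activation
        grads[l] = np.outer(d, z_list[l])   # weight gradient for layer l
        e = weights[l].T @ d                # back through the synaptic weights
    return grads
```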
Random feedback alignment: pass the error signal back through the post-activations (to get derivatives w.r.t. the pre-activations)
Pass the error signal back through fixed, random alignment weights (this replaces backprop's step of passing the error through the transpose of the forward weights)
Repeat the previous steps (structurally similar to backprop); a sketch follows
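The sketch differs from `backprop_deltas` above in a single line: fixed random matrices `B[l]` (assumed here to be shaped like `weights[l].T`) carry the error instead of the transposed forward weights.

```python
def feedback_alignment_deltas(weights, B, h_list, z_list, y,
                              dphi=lambda h: 1.0 - np.tanh(h)**2):
    """Random feedback alignment over the activities collected by `forward`."""
    grads = [None] * len(weights)
    e = z_list[-1] - y                      # output error signal
    for l in reversed(range(len(weights))):
        d = e * dphi(h_list[l])             # back through the post-activation
        grads[l] = np.outer(d, z_list[l])   # same local weight update as backprop
        e = B[l] @ d                        # fixed random feedback, not weights[l].T
    return grads
```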
Direct feedback alignment: pass the error signal back through the post-activations (to get derivatives w.r.t. the pre-activations)
Pass the error signal along the first set of direct alignment weights to the second layer
Pass the error signal along the next set of direct alignment weights to the first layer
Treat the signals propagated along the direct alignment connections as proxies for the error derivatives and run them through the post-activations of each layer, respectively (see the sketch below)
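A sketch under the same assumptions as above; the fixed direct alignment matrices `D[l]` (my notation) project the output error straight to each hidden layer, so no layer-to-layer backward recursion is needed.

```python
def dfa_deltas(weights, D, h_list, z_list, y,
               dphi=lambda h: 1.0 - np.tanh(h)**2):
    """Direct feedback alignment: each hidden layer gets the output error
    through its own fixed random matrix D[l] (layer_dim x output_dim)."""
    grads = [None] * len(weights)
    e = z_list[-1] - y                          # output error signal
    L = len(weights)
    for l in range(L):
        proxy = e if l == L - 1 else D[l] @ e   # direct projection of the error
        d = proxy * dphi(h_list[l])             # through the post-activation
        grads[l] = np.outer(d, z_list[l])       # local weight update, as before
    return grads
```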
Side by side: backpropagation of errors, random feedback alignment, and direct feedback alignment.
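In standard notation (my reconstruction, using the same symbols as the sketches above: forward weights W, fixed random feedback matrices B and D, activation derivative φ', output error e), the three backward rules are:

```latex
\begin{align*}
\text{Backpropagation of errors:}\quad \delta^{\ell} &= \big((W^{\ell+1})^{\top}\,\delta^{\ell+1}\big)\odot \phi'(\mathbf{h}^{\ell})\\
\text{Random feedback alignment:}\quad \delta^{\ell} &= \big(B^{\ell+1}\,\delta^{\ell+1}\big)\odot \phi'(\mathbf{h}^{\ell})\\
\text{Direct feedback alignment:}\quad \delta^{\ell} &= \big(D^{\ell}\,\mathbf{e}\big)\odot \phi'(\mathbf{h}^{\ell})
\end{align*}
```

Only the matrix carrying the error backward differs; the forward pass and the weight update $\Delta W^{\ell} \propto \delta^{\ell} (\mathbf{z}^{\ell-1})^{\top}$ are shared across all three.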
Global optimization, back-prop through whole graph. Local optimization, back-prop through sub-graphs.
Global feedback pathway
Will these yield coherent models?
[Figure: the positive and negative phases of a contrastive learning procedure]
Target representations
Image adapted from (Lillicrap et al., 2018)
[Figure: target propagation schematic with activities z_L, z_{L-1}, targets ẑ_L, ẑ_{L-1}, and the feedback mapping g(z_L)]
Transmit the error along error feedback weights, and error-correct the post-activations using the transmitted displacement/delta
Calculate the local error in the layer below, measuring the discrepancy between the corrected post-activation (the target) and the originally computed post-activation
Repeat the previous steps, error-correcting each layer further down within the network/system
Optional: substitute the corrected targets back in and repeat the cycle (see the sketch below)
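A minimal sketch of this target-creation loop, under my own illustrative assumptions (error feedback matrices E[l] map a layer-(l+1) mismatch down to layer l, targets are formed by nudging each post-activation with the transmitted delta at step size beta, local losses are squared error); the precise LRA-E rules are those of (Ororbia et al., 2018 Credit).

```python
def lra_error_correct(E, z_list, y, beta=0.1):
    """Error-correction pass of local representation alignment (a sketch)."""
    targets = [None] * len(z_list)      # no target is formed for the input layer
    targets[-1] = y                     # the output target is the label itself
    e = z_list[-1] - y                  # mismatch at the output layer
    for l in reversed(range(1, len(z_list) - 1)):
        # transmit the error along E[l] and error-correct the post-activation
        targets[l] = z_list[l] - beta * (E[l] @ e)
        # local error: discrepancy between the activity and its corrected target
        e = z_list[l] - targets[l]
    # each W[l] is then updated layer-locally to pull z_list[l+1] toward targets[l+1]
    return targets
```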
The Cauchy local loss:
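As a reference point, the standard Cauchy (log-quadratic) form of a local loss between a target ẑ and an activity z is shown below; whether the talk used exactly this scaling is an assumption on my part.

```latex
\mathcal{L}_{\text{Cauchy}}(\hat{\mathbf{z}}, \mathbf{z}) \;=\; \sum_{i} \log\!\big(1 + (\hat{z}_i - z_i)^2\big)
```

Compared to the squared error, the logarithm grows slowly, so large mismatches produce bounded error signals; this heavy-tailed robustness fits the predictive-coding motivation cited next.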
motivated/inspired by (Rao & Ballard, 1999)
There is more to the approach than these changes
[Figure: sample MNIST and Fashion MNIST images (trousers, dress, shirt)] (Ororbia et al., 2018 Bio)
[Figure: third-level filters acquired, after a single pass through the data, by a tanh network trained by (a) backprop, (b) LRA]
[Plots: "Angle between LRA, DFA, & DTP-σ against Backprop"; "Measuring Total Discrepancy in LRA-E"]
Equilibrium Propagation (8 layers): MNIST 59.03%, Fashion MNIST 67.33%
Equilibrium Propagation (3 layers): MNIST 6.00%, Fashion MNIST 16.71%
(Ororbia et al., 2018 Credit)
[Equations: the LWTA and SLWTA activation functions]
(Ororbia et al., 2018 Credit)
(Ororbia et al., 2018 Credit)
The Parallel Temporal Neural Coding Network (P-TNCN) (Ororbia et al., 2018 Continual)
Alternatives such as Equilibrium Propagation & alignment algorithms
Ororbia, Alexander G., …, and C. Lee Giles. "Deep Credit Assignment by Aligning Local Distributed Representations." arXiv:1803.01834 [cs.LG].
Ororbia, Alexander G., …, C. Lee Giles, and Daniel Kifer. "Continual Learning of Recurrent Neural Networks by Locally Aligning Distributed Representations." arXiv:1810.07411 [cs.LG].
Ororbia, Alexander G., …, David Reitter, and C. Lee Giles. "Learning to Adapt by Minimizing Discrepancy." arXiv:1711.11542 [cs.LG].
Ororbia, Alexander G., …, and C. Lee Giles. "Lifelong Neural Predictive Coding: Sparsity Yields Less Forgetting when Learning Cumulatively." arXiv:1905.10696 [cs.LG].
Ororbia, Alexander G., and Ankur Mali. "Biologically Motivated Algorithms for Propagating Local Target Representations." In: Thirty-Third AAAI Conference on Artificial Intelligence.