Speculations on possible brain substrates of symbolic processing and structured I/O from memory

Adam Marblestone, CS 379c, Stanford 2019 (slides based on pre-DeepMind work, much of it with Ken Hayworth)


SLIDE 1

Adam Marblestone CS 379c Stanford 2019 (slides based on pre-DeepMind work, much of it with Ken Hayworth)

Speculations on possible brain substrates of symbolic processing and structured I/O from memory

SLIDE 2

A tentative high-level template for AI cognitive architectures, based on some interpretations of modern neuroscience (such as it is)

SLIDE 3

A tentative high-level template for AI cognitive architectures, based on some interpretations of modern neuroscience (such as it is)

????????????

But raises more questions than it answers...

SLIDE 4

Working memory: reverberating activity? Qualitatively similar to ongoing activity in an LSTM?

  • but, in cortex? cortico-thalamic loops? unstructured versus pre-structured? variables/slots?
  • gating / routing of access to/from working memory

Episodic memory: rapid plasticity in hippocampus, supports pattern completion, linked to diverse cortical representations

  • many open questions… temporal, spatial, predictive and other relational organizing principles?
  • how is it consolidated into semantic memory or other cortically-encoded knowledge?
  • free association, chunking, hierarchical contexts...
  • how are memory recall, offline replay + prospective planning linked with RL?
  • interplay of feature-based generalization and sparse, arbitrary pattern-separated codes?
  • ...

Semantic memory: knowledge-graph-like representations in cortical association areas?

  • distinct from episodic memory? distinct from “unstructured” cortical weights?
  • is this a distinct architecture, or something that emerges from the other systems?

Procedural memory: cortico-striatal synapses governing basal-ganglia action selection? selectable cortical programs?

Other: how is the information encoded (e.g., based on which loss functions) before entering any of the above systems? Are VAE-like “latent vectors” able to capture enough structure when trained with the right loss functions (e.g., see MERLIN predictive losses)? Or does one need something more like “capsules” or other architectural features?

Psychological inspirations for knowledge representation in AI cognitive architectures... and assumptions to question

SLIDE 5

Neural Turing Machine: originally framed as an extension to LSTM “working memory”

SLIDE 6

NTM arguably solves long-standing complaints about the lack of symbolic “variable binding” in NNs (e.g., Gary Marcus)

SLIDE 7

Can we forge tighter links with neuroscience to constrain architectural choices for working + episodic memory analogs, symbolic structures, dynamic routing, and training procedures in ANNs?

SLIDE 8

neural attractors/assemblies/ensembles

http://fourier.eng.hmc.edu/e161/lectures/figures/energylandscape.gif

(cf., Hopfield…)

SLIDE 9

(cf., Hopfield…)

https://github.com/adammarblestone/AssociativeMemories
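As a rough illustration of the Hopfield-style attractor idea referenced on these slides, here is a minimal pure-Python sketch of pattern completion. It is not taken from the linked repository; the function names (`train_hopfield`, `recall`), the 8-unit patterns, and the Hebbian outer-product rule with asynchronous sign updates are assumptions chosen for brevity.

```python
def train_hopfield(patterns):
    """Hebbian outer-product rule; each pattern is a list of +1/-1 values."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    weights[i][j] += p[i] * p[j] / n
    return weights


def recall(weights, state, max_sweeps=10):
    """Asynchronous sign updates, repeated until no unit changes."""
    state = list(state)
    n = len(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            field = sum(weights[i][j] * state[j] for j in range(n))
            new_value = 1 if field >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state


# Two orthogonal patterns stored as attractors (illustrative values only).
pattern_a = [1, 1, 1, 1, -1, -1, -1, -1]
pattern_b = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train_hopfield([pattern_a, pattern_b])

# Corrupt one unit of pattern_a and let the network settle.
cue = list(pattern_a)
cue[0] = -1
completed = recall(weights, cue)
```

With these two orthogonal patterns, the corrupted cue falls back into the stored pattern, which is the "energy landscape" picture in miniature: stored assemblies are minima, and nearby states roll into them.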

SLIDE 10

Information represented via assemblies/attractors

SLIDE 11

Information represented via assemblies/attractors

SLIDE 12

Sequences of point attractors in the hippocampus?

SLIDE 13

Sequences of point attractors in the hippocampus?

SLIDE 14

The attractors may be in cortico-thalamo-cortical loops

SLIDE 15

Thalamic Latches and Working Memory Buffers

McFarland & Haber; Murray Sherman

SLIDE 16

Assumption: Information necessary to select an assembly passes through thalamus between cortical buffers

Thalamic Latches and Working Memory Buffers

SLIDE 17

Idea: Thalamic relay + attractor implementation of “dynamically partitionable auto-associative neural network” (Hayworth 2012)

  • Global attractors/assemblies/ensembles shared across source > thalamic relay > destination buffers
  • Gating the thalamic relay off allows “partitioning” of the buffers
  • Gating the thalamic relay on allows information to be “copied” from a source buffer to a destination buffer, forcing the destination buffer to occupy an attractor globally shared with that of the source

Gated communication using thalamic relay of attractors
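The gating behavior described on this slide can be sketched in a few lines of pure Python. This is a toy under stated assumptions, not the model from Hayworth 2012: the thalamic relay is collapsed into a multiplicative `gate` on the source-to-destination connections, both buffers store the same two symbols with Hebbian weights, and `hebbian`, `settle`, and `relay_gain` are hypothetical names and values introduced for illustration.

```python
def hebbian(post_patterns, pre_patterns, skip_self):
    """Outer-product weights from a 'pre' buffer onto a 'post' buffer."""
    n_post, n_pre = len(post_patterns[0]), len(pre_patterns[0])
    w = [[0.0] * n_pre for _ in range(n_post)]
    for post, pre in zip(post_patterns, pre_patterns):
        for i in range(n_post):
            for j in range(n_pre):
                if not (skip_self and i == j):
                    w[i][j] += post[i] * pre[j] / n_pre
    return w


def settle(dest, source, w_within, w_relay, gate, relay_gain=2.0, sweeps=5):
    """Update the destination buffer; gate=0 partitions it from the source."""
    dest = list(dest)
    for _ in range(sweeps):
        for i in range(len(dest)):
            local = sum(w * s for w, s in zip(w_within[i], dest))
            relayed = sum(w * s for w, s in zip(w_relay[i], source))
            field = local + gate * relay_gain * relayed
            dest[i] = 1 if field >= 0 else -1
    return dest


# Two symbols whose attractors are shared by both buffers (illustrative).
symbol_a = [1, 1, 1, 1, -1, -1, -1, -1]
symbol_b = [1, -1, 1, -1, 1, -1, 1, -1]
symbols = [symbol_a, symbol_b]
w_within = hebbian(symbols, symbols, skip_self=True)   # recurrent, per buffer
w_relay = hebbian(symbols, symbols, skip_self=False)   # source -> destination

source, destination = symbol_a, symbol_b
partitioned = settle(destination, source, w_within, w_relay, gate=0)
copied = settle(destination, source, w_within, w_relay, gate=1)
# After the copy, close the gate and change the source: the copy is latched.
latched = settle(copied, symbol_b, w_within, w_relay, gate=0)
```

With the gate closed, the destination keeps its own symbol; with the gate open, the relayed input outweighs the local recurrent field and pulls the destination into the source's attractor, where it stays once the gate closes again.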

SLIDE 18

Cortico-thalamic latched memory buffer

SLIDE 19

Cortico-thalamic latched memory buffer

Assembly/attractor/ensemble shared across connected cortical and thalamic areas…

SLIDE 20

Hayworth and Marblestone 2018

“Copy and paste” of symbols using partitionable attractors

SLIDE 21

Hayworth 2009

“Copy and paste” of symbols using partitionable attractors

Sequence of gating operations for copy-and-paste of assemblies (cf., symbolic variable binding)

During training / symbol allocation...

SLIDE 22

Hayworth 2009

“Copy and paste” of symbols using partitionable attractors

Sequence of gating operations for copy-and-paste of assemblies (cf., symbolic variable binding)

Later, executing a routing operation...

SLIDE 23

Hayworth and Marblestone 2018

“Copy and paste” of symbols using partitionable attractors

Sequence of gating operations for copy-and-paste of assemblies (cf., symbolic variable binding)

SLIDE 24

Lisman 2015

“Copy and paste” of symbols using partitionable attractors

“Latch” and “relay” control via basal ganglia discrete outputs

  • Evolutionarily ancient (homologies to simplest vertebrate brains, e.g., zebrafish)
  • Does RL
  • BG and superior colliculus may also contain innate control structures that could drive “training routines” / “internal curricula” / “bootstrap cost functions”...

Discrete inhibitory/disinhibitory control over target thalamic areas/relays/latches?

SLIDE 25

Hayworth and Marblestone 2018

Clamping in target patterns for “contrastive” learning

SLIDE 26

Clamping in target patterns for “contrastive” learning

Explicit basal ganglia directed control over the learning of invariances (not just unsupervised “slow feature” finding)?

Example:

  • Basal ganglia recognizes boundaries of “episode” with a given object (BG learns this policy via reinforcement learning?)

  • BG “clamps” target patterns into thalamo-cortical target buffer
  • BG trains upstream sensory hierarchy to map varying input to clamped target
  • Target pattern may be retrieved from memory on subsequent episode?

Hayworth and Marblestone 2018
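To make the clamping example above concrete, here is a small pure-Python sketch in which several varying "views" of one object are trained to evoke a single clamped target code. The learning rule is an assumption: the slides only say “contrastive”, and this sketch uses the single-layer delta rule, which is what the difference between a clamped-phase and a free-phase Hebbian term reduces to when the input is held fixed. The names `train_to_clamped_target` and `respond`, the view vectors, and the hyperparameters are all hypothetical.

```python
import random


def train_to_clamped_target(views, target, epochs=50, lr=0.1, seed=0):
    """Map every view onto the same clamped target with a delta rule."""
    rng = random.Random(seed)
    n_in, n_out = len(views[0]), len(target)
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    for _ in range(epochs):
        for view in views:
            for i in range(n_out):
                free_output = sum(w * x for w, x in zip(weights[i], view))
                # Clamped (target) activity minus free-running activity.
                error = target[i] - free_output
                for j in range(n_in):
                    weights[i][j] += lr * error * view[j]
    return weights


def respond(weights, view):
    """Thresholded response of the target buffer to one input view."""
    return [1 if sum(w * x for w, x in zip(row, view)) >= 0 else -1
            for row in weights]


# Three different "views" of the same object within one episode (made up).
views = [
    [1, 1, -1, -1, 1, -1],
    [1, -1, 1, -1, -1, 1],
    [-1, 1, 1, 1, 1, 1],
]
clamped_target = [1, -1, 1, -1]  # pattern held in the target buffer
trained = train_to_clamped_target(views, clamped_target)
responses = [respond(trained, view) for view in views]
```

After training, each view should drive the target buffer into the clamped pattern, which is the invariance the slide describes: different inputs, one shared code.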

SLIDE 27

Hayworth and Marblestone 2018

Structured I/O from an associative memory

Unstructured associative code

SLIDE 28

Hayworth and Marblestone 2018

Structured I/O from an associative memory

Structured representation across multiple buffers

SLIDE 29

Hayworth and Marblestone 2018

A crude, very partial, and speculative “integrative picture”

SLIDE 30

Returning to the current situation re integrated memory-based RL architectures in AI

SLIDE 31

Basically “soft attention” over a set of memory “slots”, with cosine-distance based similarity lookup…

Returning to the current situation re integrated memory-based RL architectures in AI
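The “soft attention over memory slots with cosine-based similarity lookup” described on this slide can be sketched as follows. This is a generic content-based read, not the code of any specific architecture; `soft_read`, the `sharpness` scaling (playing a role similar to a key-strength parameter), and the toy memory contents are assumptions for illustration.

```python
import math


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)  # small constant avoids 0/0


def soft_read(memory, key, sharpness=10.0):
    """Softmax over scaled cosine similarities, then a weighted sum of slots."""
    scores = [sharpness * cosine_similarity(slot, key) for slot in memory]
    max_score = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    width = len(memory[0])
    read = [sum(w * slot[k] for w, slot in zip(weights, memory))
            for k in range(width)]
    return weights, read


# Three memory slots and a noisy query that most resembles slot 0.
memory = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
key = [0.9, 0.1, 0.0]
attention, read_vector = soft_read(memory, key)
```

Every slot contributes to the read vector in proportion to its attention weight, so the lookup is differentiable but has no built-in notion of discrete routing between buffers, which is the contrast the following slides draw with thalamus-like gating.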

SLIDE 32

What about structured routing / potential thalamus analogs?

SLIDE 33

What about structured routing / potential thalamus analogs?