Where-What Network 3 (WWN-3): Developmental Top-Down Attention for Multiple Foregrounds and Complex Backgrounds


SLIDE 1

Where-What Network 3 (WWN-3): Developmental Top-Down Attention for Multiple Foregrounds and Complex Backgrounds

Matthew Luciw (www.cse.msu.edu/~luciwmat)
Juyang Weng (www.cse.msu.edu/~weng)
Embodied Intelligence Lab (www.cse.msu.edu/ei)

SLIDE 2


Attention with Multiple Contexts

  • What’s the foreground?
  • “Find the person in a pink shirt and red hat.”
  • “Find the person in a brown shirt and gray hat.”

SLIDE 3


General Purpose Attention and Recognition

  • Major issues:
    • Complex backgrounds
    • Binding problem
    • Lack of constraints
    • Chicken-and-egg problem: recognition requires segmentation, and segmentation requires recognition
  • Remains an open problem
SLIDE 4

Problems with Many Attention-Recognition Methods

  • Not utilizing top-down feedback simultaneously with bottom-up, at each layer, as in biological visual systems; such feedback is needed to:
    • Deal with multiple contexts
    • Learn better features
    • Handle border ownership and transparency
  • Not developmental:
    • Using pre-selected rules to find interest points (e.g., corner detection)
    • Detecting and recognizing only pre-selected objects
    • Pre-selected architecture (e.g., number of layers and neurons)
SLIDE 5


Some Other Approaches

  • Feature Integration Theory (Treisman 1980): a master map for location?
  • Saliency-based (Itti et al. 1998): feature types pre-selected
  • Bottom-up: traditional; top-down: gain-tuning (Backer et al. 2001)
  • Shift circuits (Anderson & Van Essen 1987; Olshausen et al. 1993): how were they developed?
  • SIFT: requires a pre-selected rule for interest points
  • Top-down in connectionist models:
    • Visual search and label-based top-down (Deco & Rolls, 2004): no top-down in training
    • Selective tuning (Tsotsos et al. 1995): uses inhibitory top-down
    • ARTSCAN (Fazl & Grossberg, 2007): excitatory top-down, form-fitting “attentional shroud”; potential difficulty with complex backgrounds

SLIDE 6

Visual System: Rich Bidirectional Connectivity

  • Cortical area connectivity, e.g., as seen in Felleman and Van Essen’s study (1993)…
  • But… this seems too complicated to model?
SLIDE 7

Evidence that Areas are Developed from Statistics

  • (1) Orientation-selective neurons: internal representation
  • (2) Blakemore and Cooper: representation is experience-dependent
  • (3) M. Sur: input-driven self-organization and feature development
  • Suggests functional representation is not hardcoded, but developed
SLIDE 8

Consider a Single Area with Bottom-Up and Top-Down

  • V: bottom-up weight matrix
  • M: top-down weight matrix
  • For a single neuron:
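The equation image on this slide did not survive extraction. A plausible reconstruction of the single-neuron pre-response, assuming it combines normalized bottom-up and top-down matches with the rho balance defined later on the “Response of a Layer” slide (the exact placement of rho is my assumption):

\[
\hat{z} \;=\; \rho\,\frac{\mathbf{v}\cdot\mathbf{b}}{\lVert\mathbf{v}\rVert\,\lVert\mathbf{b}\rVert} \;+\; (1-\rho)\,\frac{\mathbf{m}\cdot\mathbf{t}}{\lVert\mathbf{m}\rVert\,\lVert\mathbf{t}\rVert},
\qquad y = g\big(f(\hat{z})\big)
\]

Here v and m are the neuron’s rows of V and M, b and t are its bottom-up and top-down input vectors, f is the top-k lateral inhibition, and g is the activation function.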

SLIDE 9

Multilayer Bidirectional Where-What Networks

SLIDE 10

WWN-3

  • Two-way information flow in both training and testing
  • Different information-flow parameterizations allow different attention modes (i.e., what-imposed, where-imposed); a sketch of this parameterization follows this list
  • No hardcoded rules for interest points or features: each area learns through Lobe Component Analysis
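A minimal sketch of how such a mode parameterization might be expressed. The mode names, the rho values, and the idea of a mode table are my assumptions, not from the slides; the point is only that switching modes means changing which motor signal is clamped and how top-down input is weighted:

# Hypothetical attention-mode table. rho is the relative weight on
# bottom-up versus top-down input in each area; "imposed" names the
# motor signal clamped by the task.
MODES = {
    "free_viewing":  {"imposed": None,       "rho": 1.0},  # bottom-up only
    "what_imposed":  {"imposed": "type",     "rho": 0.5},  # type clamped; network finds the location
    "where_imposed": {"imposed": "location", "rho": 0.5},  # location clamped; network reports the type
}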

SLIDE 11

Each Layer: Lobe Component Analysis

LCA incrementally approximates the joint distribution of bottom-up and top-down input, in a dually optimal way. LCA is used to learn the bottom-up and top-down weights in each area (a sketch of the update follows the references below).

  • Weng & Zheng, WCCI, 2006
  • Weng & Luciw, TAMD, 2009
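A minimal sketch of the incremental LCA weight update, assuming the amnesic-mean schedule described in Weng & Luciw (TAMD, 2009); the schedule constants are illustrative defaults, and in a WWN only the top-k winning neurons of an area are updated:

import numpy as np

def amnesic(n, t1=20.0, t2=200.0, c=2.0, r=2000.0):
    # Amnesic-mean parameter mu(n): zero early on, a ramp in mid-life,
    # then slow growth, so the learning rate never decays to zero.
    if n <= t1:
        return 0.0
    if n <= t2:
        return c * (n - t1) / (t2 - t1)
    return c + (n - t2) / r

def lca_update(w, x, y, n):
    # One update for a winning neuron: w is its weight vector, x the
    # (normalized) input it matched, y its firing rate, n its own
    # update count (n >= 1).
    mu = amnesic(n)
    retain = (n - 1.0 - mu) / n   # weight kept on the old estimate
    learn = (1.0 + mu) / n        # weight given to the new evidence
    return retain * w + learn * y * x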
SLIDE 12

Learned Prototypes

(Above): Example training images from 5 classes, with 3 in-depth rotation variations. Location and type are imposed at the motors.
(Right): Response-weighted input of a slice of V4, showing bottom-up sensitivities. The current object representation pathway is limited.

SLIDE 13

Learned Features in IT and PP

(a): IT learned type-specific features (here: duck) while allowing location variation; shown is the response-weighted input of 4 single neurons.
(b): PP learned location-specific features while allowing type variation.
These effects are enabled by top-down connections during training.

SLIDE 14

Response of a Layer

  • V: bottom-up weights
  • M: top-down weights
  • f: lateral inhibition (approximation); k: number of nonzero firing units
  • rho: relative influence of bottom-up to top-down
  • g: activation function, e.g., sigmoid, tanh, or linear (a code sketch follows this list)
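Putting the five symbols above together, a minimal sketch of one area’s response, under my assumptions: rho multiplies the bottom-up term, matches are normalized inner products, and f keeps only the top-k pre-responses:

import numpy as np

def layer_response(V, M, b, t, rho=0.5, k=40, g=np.tanh):
    # Response of one area to bottom-up input b and top-down input t.
    def match(W, x):
        # Normalized inner product of every neuron's weight row with x.
        Wn = W / (np.linalg.norm(W, axis=1, keepdims=True) + 1e-12)
        return Wn @ (x / (np.linalg.norm(x) + 1e-12))
    z = rho * match(V, b) + (1.0 - rho) * match(M, t)  # pre-response
    y = np.zeros_like(z)
    winners = np.argsort(z)[-k:]   # f: approximate lateral inhibition,
    y[winners] = g(z[winners])     # only the top-k units fire
    return y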
SLIDE 15

WWN Operates Over Multiple Contexts

SLIDE 16

V4: “Find the Cat”

Type “cat” is imposed at the motor. Top-down input to V4 comes from IT (PP has a low weight in search tasks); output goes to IT and PP.
(Right): bottom-up response, top-k (40). (Below): integration of bottom-up and top-down, top-k (4).
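Reusing the layer_response sketch from the “Response of a Layer” slide, a search pass like this one might be driven as follows; every size and weight below is a random placeholder for illustration, not the trained network:

import numpy as np

rng = np.random.default_rng(0)
n_v4, n_feat, n_types = 200, 64, 5
V_v4 = rng.random((n_v4, n_feat))    # placeholder bottom-up weights into V4
M_v4 = rng.random((n_v4, n_types))   # placeholder top-down weights from IT
b = rng.random(n_feat)               # bottom-up drive from the image
t = np.zeros(n_types)
t[0] = 1.0                           # "cat" imposed at the type motor
y_v4 = layer_response(V_v4, M_v4, b, t, rho=0.5, k=4)  # top-k (4), as on the slide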

SLIDE 17

V4: “Find the Pig”

Type “pig” is imposed at the motor. Top-down input to V4 comes from IT; output goes to IT and PP.
(Right): bottom-up response, top-k (40). (Below): integration of bottom-up and top-down, top-k (4).

SLIDE 18

Attentional Context

SLIDE 19

Performance Over Learning

Disjoint views used in testing

SLIDE 20

Performance with Multiple Objects

SLIDE 21

Future: Multimodal SASE

  • The SASE (self-aware and self-effecting) architecture describes a highly recurrent, multi-sensor, multi-effector brain. Multi-sensory and multi-effector integration are achieved through developmental learning.

SLIDE 22

Conclusions

  • Novel methods for utilizing top-down excitatory connections in multilayer Hebbian networks
  • Top-down connections in WWN-3 enable:
    • Top-down attention and recognition without a master map or internal “canonical views” (combination neurons)
    • Multilayer synchronization
    • Top-down context switching based on an internal idea or an external percept
  • Hopefully contributes to the foundations of online learning based on cortex-inspired methods

SLIDE 23


Thank You

  • Questions?
SLIDE 24


Future: Synaptic Neuromodulation

  • Background has high variation; foreground has low variation
  • Automatic receptive field learning for larger recognition hierarchies (e.g., V1 <-> V2 <-> V4)