I-tutorial: Learning of Invariant Representations in Sensory Cortex




slide-1
SLIDE 1

The Center for Brains, Minds and Machines

Tomaso Poggio, CBMM, McGovern Institute, BCS, LCSL, CSAIL, MIT

I-tutorial

Learning of Invariant Representations in Sensory Cortex

slide-2
SLIDE 2

1. Intro and background
2. Mathematics of invariance
3. Biophysical mechanisms for tuning and pooling
4. Retina and V1: eccentricity dependent RFs; V2 and V4: pooling, crowding and clutter
5. IT: Class-specific approximate invariance and remarks

I-theory

Learning of Invariant Representations in Sensory Cortex

slide-3
SLIDE 3


Class 24 Mon Dec 1 Learning Invariant Representations: Retina and V1: eccentricity dependent RFs; V2 and V4: pooling, crowding and clutter

slide-4
SLIDE 4


Summary of previous class

slide-5
SLIDE 5

Hebb synapses imply that the tuning of a neuron converges to the top eigenvector of the covariance matrix of the “frames” of a movie of objects transforming. The convergence follows the Oja flow. Different cells are exposed (during development) to translations in different directions.

Unsupervised tuning (during development) and eigenvectors of the covariance matrix
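The convergence claim can be illustrated with a minimal sketch of Oja's rule on toy data. Everything here is an assumption for illustration and not taken from the slides: the 2-D "frames", the learning rate, the number of epochs, and the helper names `dot` and `oja_flow`. The frames are drawn with zero mean, so the second-moment matrix that Oja's rule tracks coincides with the covariance matrix.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def oja_flow(frames, lr=0.005, epochs=20, seed=0):
    """Discrete-time Oja updates over a list of frames (lists of floats)."""
    rng = random.Random(seed)
    w = [rng.gauss(0.0, 1.0) for _ in frames[0]]
    scale = math.sqrt(dot(w, w))
    w = [wi / scale for wi in w]          # start from a random unit vector
    for _ in range(epochs):
        for x in frames:
            y = dot(w, x)                 # linear response of the model neuron
            # Hebbian growth (y * x) with Oja's decay term (y^2 * w)
            w = [wi + lr * y * (xi - y * wi) for wi, xi in zip(w, x)]
    return w

# Toy zero-mean frames (hypothetical data): large variance along u, small along v.
rng = random.Random(1)
u = [1 / math.sqrt(2), 1 / math.sqrt(2)]
v = [1 / math.sqrt(2), -1 / math.sqrt(2)]
frames = []
for _ in range(300):
    a, b = rng.gauss(0.0, 2.0), rng.gauss(0.0, 0.3)
    frames.append([a * u[0] + b * v[0], a * u[1] + b * v[1]])

w = oja_flow(frames)
# |cosine| between the learned weights and the dominant direction u
alignment = abs(dot(w, u)) / math.sqrt(dot(w, w))
print(round(alignment, 3))
```

In this toy setting the learned weight vector should end up nearly parallel (up to sign) to `u`, the direction of largest variance.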

slide-6
SLIDE 6

Linking Conjecture

  • Predicts Gabor-like tuning of simple cells in V1
  • Qualitatively predicts tuning in V2/V4
  • Predicts/justifies mirror-symmetric tuning of cells in face patch AL
slide-7
SLIDE 7

The ventral stream hierarchy: V1, V2, V4, IT. A gradual increase in the receptive field size, in the complexity of the preferred stimulus, and in tolerance to position and scale changes.

Kobatake & Tanaka, 1994

Properties of the Ventral Stream

slide-8
SLIDE 8


End Summary

slide-9
SLIDE 9

Note: we focus on the sampling layout of the retinal ganglion cells (RGCs), the outputs of the retina.

(Also: focusing on the Parvo pathway, ignoring Magno.)

slide-10
SLIDE 10

Receptive field size vs. eccentricity - HW

Hubel and Wiesel, 1971

slide-11
SLIDE 11

Scatter of receptive field sizes in V1

Schiller, P., Finlay, B., Volman, S. Quantitative studies of single-cell properties in monkey striate cortex, 1976

slide-12
SLIDE 12

Retina and V1: eccentricity dependent RFs

  • Inverted truncated pyramid
  • Fovea and foveola
  • Scale and position invariance
slide-13
SLIDE 13

to have invariant representation

Usual recipe:

  • memorize a set of images/objects called templates and for each template memorize observed transformations

  • to generate an invariant signature
  • compute dot products of transformations with image
  • compute histogram of the resulting values
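The recipe above can be sketched on a toy 1-D example. The assumptions are mine and not from the slides: the transformation group is taken to be circular shifts, the templates and image are small integer vectors, and the names `shifts`, `histogram` and `signature` are hypothetical helpers. Because the stored transformations of each template cover the whole group, shifting the image only permutes the dot products, so their histogram is unchanged.

```python
def shifts(v):
    """All circular shifts of v: the stored transformations of one template."""
    return [v[k:] + v[:k] for k in range(len(v))]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def histogram(values, lo, hi, n_bins):
    """Count values in n_bins equal-width bins on [lo, hi]; outliers are clamped."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for val in values:
        idx = int((val - lo) / width)
        counts[min(max(idx, 0), n_bins - 1)] += 1
    return counts

def signature(image, templates, lo=0, hi=16, n_bins=8):
    """Concatenate, over templates, the histogram of <g t, image> for all shifts g."""
    sig = []
    for t in templates:
        dots = [dot(gt, image) for gt in shifts(t)]
        sig.extend(histogram(dots, lo, hi, n_bins))
    return sig

# Hypothetical templates and image (integers keep the dot products exact).
templates = [[2, 1, 0, 0, 0, 0, 0, 0], [1, 0, 1, 0, 0, 0, 0, 0]]
image = [0, 1, 3, 2, 0, 0, 1, 0]
shifted = image[3:] + image[:3]           # the same image, translated

same = signature(image, templates) == signature(shifted, templates)
print(same)
```

The histogram is only one possible pooling choice in this sketch; any function of the set of dot products that ignores their order would give the same invariance here.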


slide-14
SLIDE 14

s-x space: definitions

slide-15
SLIDE 15


Geometry of scaling

slide-16
SLIDE 16


The magic window for pooling over scale and x-y shifts

slide-17
SLIDE 17


Sampling in the window

slide-18
SLIDE 18


Sampling in the window

slide-19
SLIDE 19


Magic window in V1

25’: total 40x40 units. 5 degrees: total 40x40 units.

slide-20
SLIDE 20

Anstis, 1974

“Prediction” of Anstis observation

slide-21
SLIDE 21

V2 and V4: pooling, crowding and clutter

  • Why multilayer pooling
  • Decimating the array
  • Bouma’s law
slide-22
SLIDE 22


slide-23
SLIDE 23

Hierarchies of magic HW modules: key property is covariance

l=4 l=3 l=2 l=1

HW module
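Covariance here means that transforming the input transforms the module's output map in the same way, which is what lets modules be stacked. A minimal 1-D sketch, under my own assumptions (circular shifts as the transformation, max pooling over a sliding window, hypothetical helper names `simple_layer` and `local_pool`), not the slides' implementation:

```python
def shift(v, s):
    """Circular right shift of list v by s positions."""
    s %= len(v)
    return v[-s:] + v[:-s] if s else list(v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def simple_layer(image, template):
    """'Simple cell' map: dot product of the image with every shift of the template."""
    return [dot(shift(template, k), image) for k in range(len(image))]

def local_pool(responses, width):
    """'Complex cell' map: max over a sliding circular window of the given width."""
    n = len(responses)
    return [max(responses[(k + j) % n] for j in range(width)) for k in range(n)]

# Hypothetical template and image.
template = [1, 2, 0, 0, 0, 0, 0, 0]
image = [0, 1, 3, 2, 0, 0, 1, 0]
s = 3

r = simple_layer(image, template)
r_shifted = simple_layer(shift(image, s), template)
simple_covariant = r_shifted == shift(r, s)

c = local_pool(r, width=3)
c_shifted = local_pool(r_shifted, width=3)
pooled_covariant = c_shifted == shift(c, s)
print(simple_covariant, pooled_covariant)
```

Since the pooled map shifts along with the input, a second module can treat it as its own input and pool over a wider range.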

slide-24
SLIDE 24


slide-25
SLIDE 25
  • Compositionality: signatures for wholes and for parts of different size at different locations

  • Minimizing clutter effects
  • Invariance for certain non-global affine transformations
  • Retina to V1 map

Why V1, V2, V4, IT?

slide-26
SLIDE 26

V2 and V4: pooling, crowding and clutter

  • Why multilayer pooling
  • Decimating the array
  • Bouma’s law
slide-27
SLIDE 27

Top module

∑ = signature vector

Associative memory

slide-28
SLIDE 28


V1 and V2…

slide-29
SLIDE 29

(from Freeman & Simoncelli 2011)

Magic theory “predicts” eccentricity dependence of M in V1, V2, V4, IT

slide-30
SLIDE 30


slide-31
SLIDE 31

Predictions

  • Very small foveola ~25’
  • Scale invariance more important than position invariance
  • Uniform scale invariance at “all” eccentricities
  • Shift invariance proportional to spatial frequency
  • Bouma’s law for crowding: d = b x
  • Role of V2 (b = 0.5)
slide-32
SLIDE 32

Hierarchical network

l=4 l=3 l=2 l=1

HW module

slide-33
SLIDE 33

Predictions

  • Very small foveola ~25’
  • Scale invariance more important than position invariance
  • Uniform scale invariance at “all” eccentricities
  • Shift invariance proportional to spatial frequency
  • Crowding in the fovea: d = 2’40”
  • Bouma’s law for peripheral crowding: d = b x (role of V2: b = 0.5)
slide-34
SLIDE 34

The predictions are:

  • a. Consider a small target, such as a 5’-wide letter, placed in the center of the fovea, activating the smallest simple cells at the bottom of the inverted pyramid. The smallest critical distance to avoid interference should be the size of a complex cell at the smallest scale, that is d = 1’20” in V1 and d = 2’40” in V2. If the letter is made larger, the activation of the simple cells shifts to a larger scale and so does the critical spacing, which is proportional to the size of the target. It is remarkable that both these predictions match Figure 10 in Levi and Carney, 2011, quite well.

  • b. Usually the target is just large enough to be visible at that eccentricity (positive, say). The critical separation for avoiding crowding outside the foveola is d ~ b x (x being the eccentricity), since the RF size of the complex cells increases linearly with eccentricity, with b depending on the cortical area responsible for the recognition signal. Thus the theory “predicts” Bouma's law (Bouma, 1970) of crowding!

Predictions on crowding
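The two regimes can be put side by side in a small worked sketch. The values b = 0.5 and 2’40” come from the slides; treating 2’40” as a floor and joining the two regimes with `max()` is my own simplifying assumption, and the function name `critical_spacing_deg` is hypothetical.

```python
ARCMIN_PER_DEG = 60.0

def critical_spacing_deg(eccentricity_deg, b=0.5, foveal_spacing_arcmin=2 + 40 / 60):
    """Critical spacing d in degrees at eccentricity x (degrees).

    Peripheral regime: Bouma's law, d = b * x.
    Foveal regime: constant spacing (2'40" by default).
    Joining the regimes with max() is an assumption made for illustration only.
    """
    floor_deg = foveal_spacing_arcmin / ARCMIN_PER_DEG
    return max(floor_deg, b * abs(eccentricity_deg))

for x in (0.0, 1.0, 5.0, 10.0):
    d = critical_spacing_deg(x)
    print(f"x = {x:4.1f} deg  ->  d = {d:.3f} deg ({d * ARCMIN_PER_DEG:.1f} arcmin)")
```

With b = 0.5, a target at 10 degrees eccentricity needs flankers about 5 degrees away, while at the center of the fovea the spacing falls to the 2’40” value.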

slide-35
SLIDE 35


slide-36
SLIDE 36


slide-37
SLIDE 37


slide-38
SLIDE 38


slide-39
SLIDE 39

Collaborators (MIT-IIT, LCSL) in recent work

  • F. Anselmi, J. Mutch, J. Leibo, L. Rosasco, A. Tacchetti, Q. Liao

+ Evangelopoulos, Zhang, Voinea

Also: L. Isik, S. Ullman, S. Smale, C. Tan, M. Riesenhuber, T. Serre, G. Kreiman, S. Chikkerur, A. Wibisono, J. Bouvrie, M. Kouh, J. DiCarlo, C. Cadieu, S. Bileschi, L. Wolf, D. Ferster, I. Lampl, N. Logothetis, H. Buelthoff
slide-40
SLIDE 40
