The Center for Brains, Minds and Machines
Tomaso Poggio, Center for Brains, Minds and Machines; McGovern Institute, BCS, LCSL, CSAIL, MIT
I-tutorial
Learning of Invariant Representations in Sensory Cortex
1. Intro and background
2. Mathematics of invariance
3. Biophysical mechanisms for tuning and pooling
4. Retina and V1: eccentricity-dependent RFs; V2 and V4: pooling, crowding and clutter
5. IT: class-specific approximate invariance and remarks
Learning of Invariant Representations in Sensory Cortex
Class 21, Wed Nov 19: Learning Invariant Representations
– 10^10-10^11 neurons (~1 million flies)
– 10^14-10^15 synapses
– ~10^9 neurons in the ventral stream (350 × 10^6 in each hemisphere)
– ~15 × 10^6 neurons in AIT (Anterior InferoTemporal) cortex
Van Essen & Anderson, 1990
Source: Lennie, Maunsell, Movshon
using a class of models to summarize/interpret experimental results
[software available online]
Riesenhuber & Poggio 1999, 2000; Serre Kouh Cadieu Knoblich Kreiman & Poggio 2005; Serre Oliva Poggio 2007
models (Hubel & Wiesel, 1959: qualitative; Fukushima, 1980: quantitative; Oram & Perrett, 1993: qualitative; Wallis & Rolls, 1997; Riesenhuber & Poggio, 1999; Thorpe, 2002; Ullman et al., 2002; Mel, 1997; Wersing & Koerner, 2003; LeCun et al., 1998: not biological; Amit & Mascaro, 2003: not biological; Hinton, LeCun, Bengio: not biological; Deco & Rolls, 2006, …)
A model of the ventral stream, from V1 to PFC; it is perhaps the most quantitatively faithful to known neuroscience data
Feedforward Models: “predict” rapid categorization (82% model vs. 80% humans)
Parenthesis: a connection with classes on Supervised Learning
How then do the learning machines described in the theory compare with brains?
– One of the most obvious differences is the ability of people and animals to learn from very few examples. The algorithms we have described can learn an object recognition task from a few thousand labeled images, but a child, or even a monkey, can learn the same task from just a few examples.
– A comparison with real brains offers another, related, challenge to learning theory. The "learning algorithms" we have described in this paper correspond to one-layer architectures. Are hierarchical architectures with more layers justifiable in terms of learning theory? It seems that learning theory of the type we have outlined does not offer any general argument in favor of hierarchical learning machines for regression or classification.
– Why hierarchies? There may be reasons of efficiency: computational speed and the use of computational modules shared across multiple classification tasks.
– There may also be the more fundamental issue of sample complexity. Learning theory shows that the difficulty of a learning task depends on the size of the required hypothesis space. This complexity determines in turn how many training examples are needed to achieve a given level of generalization error. Thus our ability to learn from just a few examples, and its limitations, may be related to the hierarchical architecture of cortex.
Tomaso Poggio and Steve Smale, "The Mathematics of Learning: Dealing with Data," Notices of the American Mathematical Society (AMS), Vol. 50, No. 5, 537-544, 2003.
Classical learning theory and Kernel Machines (Regularization in RKHS) implies
Remark:
Kernel machines correspond to shallow networks
[Figure: a kernel machine drawn as a one-layer network, with inputs x_1, …, x_l feeding a single output unit f]
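As a concrete sketch of the remark above, a kernel machine (here, regularized least squares with a Gaussian kernel in an RKHS) really is a one-layer "network" of kernel units. All names and parameter values below are illustrative, not taken from the original papers.

```python
# A toy kernel machine (regularization in an RKHS with a Gaussian
# kernel): f(x) = sum_i c_i K(x, x_i), with the coefficients c solving
# (K + l*lambda*I) c = y.  All names/values are illustrative.
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    # pairwise K(a, b) = exp(-||a - b||^2 / (2 sigma^2))
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def fit(X, y, lam=1e-3, sigma=1.0):
    l = len(X)
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K + lam * l * np.eye(l), y)

def predict(Xtrain, c, Xnew, sigma=1.0):
    # the one-layer ("shallow") network: kernel units K(x, x_i),
    # linearly combined with coefficients c
    return gaussian_kernel(Xnew, Xtrain, sigma) @ c

X = np.linspace(0.0, 3.0, 20)[:, None]   # toy 1D inputs
y = np.sin(X[:, 0])                      # toy regression target
c = fit(X, y)
yhat = predict(X, c, X)
```

The single hidden layer of kernel units is what makes the architecture "shallow" in the sense of the remark.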
Closed Parenthesis
1. Problem of visual recognition, visual cortex
2. Historical background
3. Neurons and areas in the visual system
4. Feedforward hierarchical models
5. Beyond hierarchical models
Sinha & Poggio, Nature, 1997
Unconstrained visual recognition is a difficult problem (e.g., “is there an animal in the image?”)
Desimone & Ungerleider 1989
Feedforward connections only?
Database collected by Oliva & Torralba
Rapid categorization task (with a mask to test the feedforward model): animal present or not?
Stimulus: 20 ms; ISI: 30 ms; mask: 80 ms.
Thorpe et al 1996; Van Rullen & Koch 2003; Bacon-Macé et al 2005
(if the mask forces feedforward processing)… human-level performance: humans (n = 24) 80% correct vs. model 82%.
Serre, Oliva & Poggio 2007
1. Problem of visual recognition, visual cortex
2. Historical (personal) background
3. Neurons and areas in the visual system
4. Feedforward hierarchical models
5. Beyond hierarchical models
First step in developing a model: learning to recognize 3D objects in IT cortex
Poggio & Edelman 1990
Examples of Visual Stimuli
An architecture that accounts for invariance to 3D effects (more than one view is needed to learn!): a regularization network (GRBF) with Gaussian kernels
VIEW-INVARIANT, OBJECT-SPECIFIC UNIT
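A minimal sketch of the GRBF idea: a view-invariant, object-specific output unit pools Gaussian units tuned to a few stored training views. Here "views" are abstract 2D feature vectors and 3D rotation is replaced by a one-parameter transformation; all names are illustrative, not from Poggio & Edelman (1990).

```python
# Toy GRBF regularization network: a view-invariant unit as a weighted
# sum of Gaussian units, each centered on one stored training view.
import numpy as np

def view_tuned_unit(x, view, sigma=0.5):
    # Gaussian RBF centered on one stored training view
    return float(np.exp(-np.sum((x - view) ** 2) / (2 * sigma ** 2)))

def view_invariant_unit(x, stored_views, weights=None, sigma=0.5):
    # regularization-network output: weighted sum of view-tuned units
    acts = np.array([view_tuned_unit(x, v, sigma) for v in stored_views])
    w = np.ones(len(stored_views)) if weights is None else weights
    return float(w @ acts)

def view(theta):
    # a 1-parameter family of "views" (standing in for 3D rotation)
    return np.array([np.cos(theta), np.sin(theta)])

stored = [view(t) for t in np.linspace(0, 2 * np.pi, 8, endpoint=False)]
# the pooled response stays roughly constant across novel
# intermediate views of the same "object"
responses = [view_invariant_unit(view(t), stored)
             for t in np.linspace(0, np.pi, 50)]
```

With a handful of stored views, the summed Gaussian responses flatten out across the whole transformation range, which is the sense in which more than one view is needed to learn invariance.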
Prediction: neurons become view-tuned through learning
Poggio & Edelman 1990
Buelthoff and Edelman, PNAS, 1992
Class 20, 1999, CBCL/AI, MIT
Logothetis, Pauls, Buelthoff and Poggio, 1995
Logothetis Pauls & Poggio 1995
Examples of Visual Stimuli
After human psychophysics (Buelthoff, Edelman, Tarr, Sinha), which supported models based on view-tuned units, came monkey psychophysics and then… physiology!
Logothetis, Pauls, Buelthoff and Poggio, 1995
Logothetis, Pauls & Poggio 1995
…neurons tuned to faces are intermingled nearby….
Logothetis Pauls & Poggio 1995
[Figure: responses of a view-tuned IT neuron to target views (rotation angles 12°-168°) and distractors; scale bars: 60 spikes/sec, 800 msec]
But also view-invariant, object-specific neurons (5 of them over ~1,000 recordings)
Logothetis Pauls & Poggio 1995
Scale invariance (with one training view only) motivates the present model
Logothetis Pauls & Poggio 1995
Riesenhuber & Poggio 1999, 2000
How the new version of the model evolved from the original one:
– The two key operations, given originally in an idealized form (i.e., a multivariate Gaussian and an exact max, see Section 2), have been replaced by more plausible operations: a normalized dot-product and a softmax.
– S1 and C1 units of the original model were too broadly tuned to orientation and spatial frequency; we revised these units accordingly. In particular, at the S1 level we replaced Gaussian derivatives with Gabor filters to better fit parafoveal simple cells' tuning properties. We also modified both S1 and C1 receptive field sizes.
– Learning from natural images has been the key factor for the model to achieve a high level of performance on natural images (see Serre et al., 2002).
– C2 receptive field sizes were decreased so that C2 units now better fit V4 data.
– New unit types were added, including the S2b and C2b units (see Section 2 and above). The tuning of the S3 units is also learned from natural images.
– A bypass route was added from V1/V2 to PIT, bypassing V4 (see Nakamura et al., 1993).
Serre & Riesenhuber 2004
1. Problem of visual recognition, visual cortex
2. Historical background
3. Neurons and areas in the visual system
4. Feedforward hierarchical models
5. Beyond hierarchical models
– 10^10-10^11 neurons (~1 million flies)
– 10^14-10^15 synapses
– Fundamental spatial dimensions: membrane 5 nm thick; specific proteins: pumps, channels, receptors, enzymes
– Fundamental time scale: 1 msec
– ~10^9 neurons in the ventral stream (350 × 10^6 in each hemisphere)
– ~15 × 10^6 neurons in AIT (Anterior InferoTemporal) cortex
Van Essen & Anderson, 1990
Source: Lennie, Maunsell, Movshon
The ventral stream hierarchy, V1 → V2 → V4 → IT: a gradual increase in receptive field size, in the complexity of the preferred stimulus, and in tolerance to position and scale changes
Kobatake & Tanaka, 1994
(Thorpe and Fabre-Thorpe, 2001)
V1: hierarchy of simple and complex cells
(Hubel & Wiesel 1959)
1. Problem of visual recognition, visual cortex
2. Historical background
3. Neurons and areas in the visual system
4. Feedforward hierarchical models
5. Beyond hierarchical models
*Modified from (Gross, 1998)
[software available online with CNS (for GPUs)] Riesenhuber & Poggio 1999, 2000; Serre Kouh Cadieu Knoblich Kreiman & Poggio 2005; Serre Oliva Poggio 2007
Biederman 1972; Potter 1975; Thorpe et al 1996
Unit type | Computation | Operation
Simple | Selectivity / template matching | Gaussian tuning (AND-like)
Complex | Invariance / pooling | Soft-max (OR-like)
Gaussian tuning in IT around 3D views
Logothetis Pauls & Poggio 1995
Gaussian tuning in V1 for orientation
Hubel & Wiesel 1958
Max-like behavior in V1
Lampl Ferster Poggio & Riesenhuber 2004 see also Finn Prieber & Ferster 2007 Gawne & Martin 2002
Max-like behavior in V4
– Max-like operation (OR-like): complex units
Tuning: y = e^{−‖x−w‖²}
Normalized dot product: y ≈ (w · x) / ‖x‖
– Tuning operation (Gaussian-like, AND-like): simple units
Each operation ≈ a microcircuit of ~100 neurons
(Knoblich Koch Poggio in prep; Kouh & Poggio 2007; Knoblich Bouvrie Poggio 2007)
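The two operations above can be written down directly. This is a toy illustration of the math (Gaussian tuning and a softmax pool), not the biophysical microcircuit; parameter names and values are ours.

```python
# Toy versions of the two HMAX-style operations: Gaussian tuning
# (simple units, AND-like selectivity) and softmax pooling (complex
# units, OR-like invariance) over a vector of afferent inputs x.
import numpy as np

def gaussian_tuning(x, w, sigma=1.0):
    # y = exp(-||x - w||^2 / (2 sigma^2)): peaks when the input
    # pattern matches the stored template w
    return float(np.exp(-np.sum((x - w) ** 2) / (2 * sigma ** 2)))

def softmax_pool(x, q=8.0):
    # y = sum(x * x^q) / sum(x^q): a soft maximum over the afferents;
    # approaches max(x) as q grows
    xq = np.power(x, q)
    return float(np.sum(x * xq) / np.sum(xq))

x = np.array([0.1, 0.9, 0.3])
assert abs(softmax_pool(x, q=50.0) - 0.9) < 1e-3  # ~ max(x)
assert gaussian_tuning(x, x) == 1.0               # peak at preferred input
```

The exponent q interpolates between an average (small q) and a hard max (large q), which is one way to think about the "soft-max" of the complex units.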
A plausible biophysical implementation for both Gaussian tuning (~AND) and max (~OR): normalization circuits with divisive inhibition (Kouh & Poggio 2008; also Riesenhuber & Poggio 1999; Heeger, Carandini, Simoncelli, …). A canonical microcircuit of spiking neurons?
Of the same form as the model. Can be implemented by shunting inhibition (Grossberg 1973; Reichardt et al. 1983; Carandini & Heeger 1994) and spike-threshold variability (Anderson et al. 2000; Miller & Troyer 2002); see also Adelson & Bergen, and Hassenstein & Reichardt 1956. The basic circuit is closely related to other models.
A plausible biophysical implementation (Kouh & Poggio 2008): the normalized dot product y ≈ (w · x) / ‖x‖
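One way to sketch the canonical normalization circuit is the general form y = (Σⱼ wⱼ xⱼᵖ) / (k + (Σⱼ xⱼ^q)ʳ), following Kouh & Poggio (2008): one exponent setting gives a normalized dot product (tuning), another approximates a max (pooling). The exponents and variable names below are illustrative choices, not fitted values.

```python
# Sketch of a divisive-normalization circuit:
#   y = (sum_j w_j x_j^p) / (k + (sum_j x_j^q)^r)
# p=1, q=2, r=1/2 gives a normalized dot product (Gaussian-like
# tuning); uniform weights with large matched exponents give a soft
# maximum.  k is a small constant avoiding division by zero.
import numpy as np

def canonical_circuit(x, w, p=1.0, q=2.0, r=0.5, k=1e-6):
    num = np.sum(w * np.power(x, p))
    den = k + np.power(np.sum(np.power(x, q)), r)
    return float(num / den)

x = np.array([0.2, 0.8, 0.4])
w = x / np.linalg.norm(x)   # stored preferred pattern, unit norm

# tuning regime: maximal (~1) when the input matches the template w
tuned = canonical_circuit(x, w)
other = canonical_circuit(np.array([0.8, 0.2, 0.4]), w)

# max regime: uniform weights, large exponents -> soft max (~0.8 here)
pooled = canonical_circuit(x, np.ones(3), p=9.0, q=8.0, r=1.0)
```

The appeal of this form is that one circuit, with different parameter settings, yields both of the model's operations.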
Gabor filters; parameters fit to V1 data (Serre & Riesenhuber 2004); 17 spatial frequencies (= scales) and 4 orientations
Serre & Riesenhuber 2004
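An S1-style filter bank can be sketched as generic 2D Gabors at a few sizes and 4 orientations. These are not the parameters fitted to V1 data by Serre & Riesenhuber (2004); all values below are illustrative.

```python
# A generic Gabor filter bank: a few sizes (scales) x 4 orientations.
import numpy as np

def gabor(size, wavelength, theta, sigma, gamma=0.3):
    # oriented Gaussian envelope times a cosine carrier
    r = (size - 1) / 2.0
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    g = np.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2)) \
        * np.cos(2 * np.pi * xr / wavelength)
    g -= g.mean()                  # zero mean: no response to uniform input
    return g / np.linalg.norm(g)   # unit norm

bank = [gabor(s, wavelength=0.8 * s, theta=t, sigma=0.3 * s)
        for s in (7, 9, 11)
        for t in np.deg2rad([0, 45, 90, 135])]
```

Convolving an image with each filter in the bank yields S1-like simple-cell responses at several scales and orientations.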
Features of moderate complexity (n ≈ 1,000 types): combinations of V1-like complex units at different orientations. Synaptic weights w learned from natural images; 5-10 subunits chosen at random from all possible afferents (~100-1,000).
Akiyuki Anzai, Xinmiao Peng & David C. Van Essen, "Neurons in monkey visual area V2 encode combinations of orientations," Nature Neuroscience 10, 1313-1321 (2007); published online 16 September 2007, doi:10.1038/nn1975.
An overcomplete dictionary of "templates" (~ image "patches" ~ "parts") is learned during an unsupervised learning stage (from ~10,000 natural images) by tuning S units.
see also (Foldiak 1991; Perrett et al 1984; Wallis & Rolls, 1997; Lewicki and Olshausen, 1999; Einhauser et al 2002; Wiskott & Sejnowski 2002; Spratling 2005)
Units are organized in n feature maps. Database: ~1,000 natural images. At each iteration:
– present one image
– learn k feature maps: pick 1 unit from the first map at random and store in its synaptic weights the precise pattern of subunit activity, i.e., w = x; as the image "moves" (looming and shifting), the weight vector w is copied to all units in feature map 1 (across positions and scales)
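The imprinting step above (w = x) can be sketched as storing random patches of C1-like activity as templates. Random arrays stand in for real C1 encodings of natural images; all names are illustrative.

```python
# Sketch of the unsupervised "imprinting" stage: templates are learned
# by copying patches of afferent activity into the synaptic weights.
import numpy as np

rng = np.random.default_rng(0)

def imprint_templates(c1_maps, n_templates=5, patch=4):
    # for each template: pick a random map and patch location, and
    # store the raw activity pattern as the synaptic weights (w = x)
    templates = []
    for _ in range(n_templates):
        m = c1_maps[rng.integers(len(c1_maps))]
        i = rng.integers(m.shape[0] - patch + 1)
        j = rng.integers(m.shape[1] - patch + 1)
        templates.append(m[i:i + patch, j:j + patch].copy())
    return templates

c1_maps = [rng.random((16, 16)) for _ in range(10)]  # stand-in C1 activity
templates = imprint_templates(c1_maps)
```

Replicating each stored template across positions and scales (the weight-sharing step in the slide) is what makes the resulting S2 responses tolerant to where and how big the matching feature appears.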
Sample S2 units learned (from Serre, 2007)
Pasupathy & Connor 2001
increased tolerance to position and size of preferred stimulus
same selectivity but different positions and scales
Jay Hegdé and David C. Van Essen, "A Comparative Study of Shape Representation in Macaque Visual Areas V2 and V4," Cerebral Cortex, advance access published online 19 June 2006.
selectivities
(Fujita 1992)
(Weller & Steele 1992; Nakamura et al 1993; Buffalo et al 2005)
The most recent version of this straightforward class of models is consistent with many data at different levels, from the computational to the biophysical.
Being testable across all these levels is a high bar and an important one (it is too easy to develop models that explain one phenomenon or one area or one illusion... these models are not really scientific).
V1: simple and complex cell tuning (Schiller et al 1976; Hubel & Wiesel 1965; De Valois et al 1982); MAX-like operation in a subset of complex cells (Lampl et al 2004)
V2: subunits and their tuning (Anzai, Peng & Van Essen 2007)
V4: tuning for two-bar stimuli (Reynolds, Chelazzi & Desimone 1999); MAX-like operation (Gawne et al 2002); two-spot interaction (Freiwald et al 2005); tuning for boundary conformation (Pasupathy & Connor 2001; Cadieu, Kouh, Connor et al. 2007); tuning for Cartesian and non-Cartesian gratings (Gallant et al 1996)
IT: tuning and invariance properties (Logothetis et al 1995, paperclip objects); differential role of IT and PFC in categorization (Freedman et al 2001, 2002, 2003); read-out results (Hung, Kreiman, Poggio & DiCarlo 2005); pseudo-average effect in IT (Zoccolan, Cox & DiCarlo 2005; Zoccolan, Kouh, Poggio & DiCarlo 2007)
Human: rapid categorization (Serre, Oliva & Poggio 2007); face processing, fMRI + psychophysics (Riesenhuber et al 2004; Jiang et al 2006)
Hierarchical feedforward models are consistent with, or predict, neural data
Rapid categorization: the mask should force visual cortex to operate in feedforward mode (animal present or not?). Stimulus: 20 ms; ISI: 30 ms.
Thorpe et al 1996; Van Rullen & Koch 2003; Bacon-Macé et al 2005
Rapid Categorization
Feedforward models "predict" rapid categorization (82% model vs. 80% humans). Image-by-image correlation between model and humans: around 73%.
– Heads: ρ = 0.71
– Close-body: ρ = 0.84
– Medium-body: ρ = 0.71
– Far-body: ρ = 0.60
Reading out the neural code in AIT
Chou Hung, Gabriel Kreiman, James DiCarlo, Tomaso Poggio, Science, Nov 4, 2005
Recording at each site during passive viewing (each stimulus shown for 100 ms, followed by a 100 ms blank).
Given a set of data pairs (x, y), where x is the vector of activity of n neurons and y the object label, find (by training) a classifier, i.e., a function f such that f(x) is a good predictor of the object label y for a future pattern of neuronal activity x.
Decoding the Neural Code … population response (using a classifier)
Learning from (x, y) pairs, with labels y ∈ {1, …, 8}
Categorization
Video speed: 1 frame/s (actual presentation rate: 5 objects/s). 80% read-out accuracy from ~200 neurons: from the neuronal population activity, a classifier can decode what the monkey was seeing.
Hung*, Kreiman, Poggio, DiCarlo. Science 2005
A result (C. Hung et al., 2005): very rapid read-out of object information (80-100 ms from stimulus onset). Information is represented by the population of neurons over very short times (12.5 ms bins), not by firing rate over long intervals: a very strong constraint, consistent with our integrate-and-fire circuits for max and tuning.
It turns out that the model agrees with IT data: we can decode from model units as well as from IT
Reading out category and identity, "invariant" to position and scale
Hung, Kreiman, Poggio & DiCarlo 2005; Serre, Kouh, Cadieu, Knoblich, Kreiman & Poggio 2005
Reading out scale and position information: comparing the model to Hung et al. Model units sampled to match the 64 recording sites:
– 77.2 ± 1.25% (model) vs. ~63% (physiology)
– 64.9 ± 1.44% (model) vs. ~65% (physiology)
Tan, Serre, Poggio, 2008
Models of the ventral stream in cortex performed well compared to engineered computer vision systems (as of 2006)
Bileschi, Wolf, Serre, Poggio, 2007
Model extension to the dorsal stream: Recognition of actions
Thomas Serre, Hueihan Jhuang & Tomaso Poggio, in collaboration with David Sheinberg at Brown University
Behavioral analyses of mouse behavior are needed to:
– assess functional roles of genes
– validate models of mental diseases
– help assess efficacy of drugs
An automated quantitative system can help to:
– limit the subjectivity of human intervention
– provide 24/7 home-cage analysis of behavior
– provide 24/7 monitoring of animal well-being
Quantitative automatic phenotyping
Models of the dorsal stream in cortex lead to better systems for action recognition in videos: automatic phenotyping of mice. Hierarchical model of recognition, combining the ventral and dorsal streams (Giese & Poggio 2003).
Jhuang, Garrote, Yu, Khilnani, Poggio, Mutch, Steele, Serre, Nature Communications, 2010
Models of cortex lead to better systems for action recognition in videos: automatic phenotyping of mice
Human agreement: 72%; proposed system: 77%; commercial system: 61%; chance: 12%
Nicholas Pinto, PhD thesis, 2010
Efficient software implementation: a GPU-based framework for simulating cortically-organized networks (CNS, available on our Web site)
For more than 10 years I did not manage to understand how the model works... we need theories, not only models!
1. Problem of visual recognition, visual cortex
2. Historical background
3. Neurons and areas in the visual system
4. Feedforward hierarchical models
5. Beyond hierarchical models
Beyond even i-theory: an extension to attention, for dealing with clutter
See also Broadbent 1952, 1954; Treisman 1960; Treisman & Gelade 1980; Duncan & Desimone 1995; Wolfe 1997; Tsotsos; and many others. Zoccolan, Kouh, Poggio & DiCarlo 2007; Serre, Oliva & Poggio 2007. Parallel processing (no attention) vs. serial processing (with attention).
F. Anselmi, G. Spigler, J. Mutch, L. Rosasco,
Also: T. Serre, S. Chikkerur, A. Wibisono, J. Bouvrie, M. Kouh, M. Riesenhuber, J. DiCarlo, E. Miller, A. Oliva, C. Koch, A. Caponnetto, D. Walther, C. Cadieu, U. Knoblich, T. Masquelier,