NIPS'07 tutorial (preliminary): Visual Recognition in Primates and Machines (PowerPoint presentation)



SLIDE 1

Visual Recognition in Primates and Machines

Tomaso Poggio (with Thomas Serre)

McGovern Institute for Brain Research Center for Biological and Computational Learning Department of Brain & Cognitive Sciences Massachusetts Institute of Technology Cambridge, MA 02139 USA

NIPS’07 tutorial (preliminary)

SLIDE 2

Motivation for studying vision: trying to understand how the brain works

  • Old dream of all philosophers and, more recently, of AI:
    – understand how the brain works
    – make intelligent machines

SLIDE 3

This tutorial: using a class of models to summarize/interpret experimental results

  • Models are cartoons of reality, e.g. Bohr's model of the hydrogen atom
  • All models are "wrong"
  • Some models can be useful summaries of data and some can be a good starting point for more complete theories

SLIDE 4

1. Problem of visual recognition, visual cortex
2. Historical background
3. Neurons and areas in the visual system
4. Data and feedforward hierarchical models
5. What is next?

SLIDE 5

The problem: recognition in natural images (e.g., “is there an animal in the image?”)

SLIDE 6

How does visual cortex solve this problem? How can computers solve this problem?

Desimone & Ungerleider 1989

dorsal stream: "where"
ventral stream: "what"

SLIDE 7

A "feedforward" version of the problem: rapid categorization

Movie courtesy of Jim DiCarlo
Biederman 1972; Potter 1975; Thorpe et al 1996

SHOW RSVP MOVIE

SLIDE 8

Riesenhuber & Poggio 1999, 2000; Serre Kouh Cadieu Knoblich Kreiman & Poggio 2005; Serre Oliva Poggio 2007

*Modified from (Gross, 1998)

A model of the ventral stream which is also an algorithm

[software available online]

SLIDE 9

…solves the problem (if mask forces feedforward processing)…

Human observers (n = 24): 80%; Model: 82%

  • d' ~ standardized error rate
  • the higher the d', the better the performance

Serre Oliva & Poggio 2007
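The d' sensitivity index mentioned above can be computed from hit and false-alarm rates; a minimal sketch (the rates below are illustrative, not the study's raw data, and the 80%/82% figures are overall accuracy):

```python
from statistics import NormalDist

def d_prime(hit_rate: float, false_alarm_rate: float) -> float:
    """Sensitivity index: d' = z(hit rate) - z(false-alarm rate)."""
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)

# Illustrative: 80% hits with 20% false alarms
print(round(d_prime(0.80, 0.20), 2))  # 1.68
```

Unlike raw percent correct, d' is not inflated by a bias toward answering "animal", which is why the comparison uses it.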

SLIDE 10

1. Problem of visual recognition, visual cortex
2. Historical background
3. Neurons and areas in the visual system
4. Data and feedforward hierarchical models
5. What is next?

SLIDE 11

Object recognition for computer vision: personal historical perspective

Timeline (1990s–2000s) of tasks: face detection, face identification, car detection, pedestrian detection, multi-class / multi-object recognition, digit recognition

Kanade 1974; Turk & Pentland 1991; Brunelli & Poggio 1993; Sung & Poggio 1994; Beymer & Poggio 1995; Perona and colleagues 1996-now; Osuna & Girosi 1997*; LeCun et al. 1998; Schneiderman & Kanade 1998, 2000; Rowley Baluja & Kanade 1998; Mohan Papageorgiou & Poggio 1999; Amit and Geman 1999; Viola & Jones 2001; Belongie & Malik 2002; Agarwal & Roth 2002; Ullman et al 2002; Fergus et al 2003; Torralba et al 2004

*Best CVPR'07 paper 10 yrs ago

… Many more excellent algorithms in the past few years…

SLIDE 12

Examples: Learning Object Detection: Finding Frontal Faces

  • Training Database
  • 1,000+ real, 3,000+ virtual face patterns
  • 50,000+ non-face patterns

Sung & Poggio 1995

SLIDE 13

~10-year-old CBCL computer vision work: pedestrian detection system in Mercedes test car, now becoming a product (MobilEye)

SLIDE 14

Object recognition in cortex: historical perspective

Timeline (1960s–1990s) across areas: V1 cat, V1 monkey, extrastriate cortex, IT-STS

Hubel & Wiesel 1959, 1962, 1965, 1977; Gross et al 1969; Zeki 1973; Ungerleider & Mishkin 1982; Perrett Rolls et al 1982; Schwartz et al 1983; Desimone et al 1984; Schiller & Lee 1991; Kobatake & Tanaka 1994; Logothetis et al 1995

… Much progress in the past 10 yrs

SLIDE 15

Some personal history:

First step in developing a model: learning to recognize 3D objects in IT cortex

Poggio & Edelman 1990

Examples of Visual Stimuli

SLIDE 16

An idea for a module for view-invariant identification

Architecture that accounts for invariance to 3D effects (>1 view needed to learn!): a Regularization Network (GRBF) with Gaussian kernels

[Figure: view-tuned units over view angle feeding a VIEW-INVARIANT, OBJECT-SPECIFIC UNIT]

Prediction: neurons become view-tuned through learning

Poggio & Edelman 1990
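The module can be caricatured in one dimension: Gaussian view-tuned units centered on a few stored example views, summed into a single object-specific unit. A sketch under that simplification (the actual network places Gaussians on feature vectors of example views, with weights set by regularization; the angles and sigma here are illustrative):

```python
import numpy as np

def view_tuned(view, stored_view, sigma=25.0):
    """Gaussian tuning around one stored example view (degrees)."""
    return float(np.exp(-((view - stored_view) ** 2) / (2 * sigma ** 2)))

def view_invariant(view, stored_views):
    """Object-specific unit: sum over its view-tuned afferents."""
    return sum(view_tuned(view, v) for v in stored_views)

stored = [0, 60, 120, 180]  # >1 view is needed to learn invariance
```

With several stored views tiling the range, the summed response stays high across viewpoints even where each individual view-tuned unit has fallen off.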

SLIDE 17

Learning to Recognize 3D Objects in IT Cortex

Logothetis Pauls & Poggio 1995

Examples of Visual Stimuli

After human psychophysics (Buelthoff, Edelman, Tarr, Sinha, …), which supports models based on view-tuned units... … physiology!

SLIDE 18

Recording Sites in Anterior IT

[Figure: recording sites shown relative to sulci (LUN, LAT, IOS, STS, AMTS)]

Logothetis, Pauls & Poggio 1995

…neurons tuned to faces are intermingled nearby….

SLIDE 19

Neurons tuned to object views as predicted by model

Logothetis Pauls & Poggio 1995

SLIDE 20

A "View-Tuned" IT Cell

[Figure: spike rasters and tuning curves for target views (12–168 deg, in 12-deg steps) vs. distractors; scale bars: 60 spikes/sec, 800 msec]

Logothetis Pauls & Poggio 1995
SLIDE 21

But also view-invariant, object-specific neurons (5 of them over 1000 recordings)

Logothetis Pauls & Poggio 1995

SLIDE 22

Scale-Invariant Responses of an IT Neuron

[Figure: spike rates over time (msec) for stimulus sizes 1.0–6.25 deg, i.e. x0.4 to x2.5 of the 2.5-deg (x1.0) training size]

View-tuned cells: scale invariance (one training view only) motivates present model

Logothetis Pauls & Poggio 1995

SLIDE 23

From “HMAX” to the model now …

Riesenhuber & Poggio 1999, 2000; Serre Kouh Cadieu Knoblich Kreiman & Poggio 2005; Serre Oliva Poggio 2007

SLIDE 24

1. Problem of visual recognition, visual cortex
2. Historical background
3. Neurons and areas in the visual system
4. Data and feedforward hierarchical models
5. What is next?

SLIDE 25

Neural Circuits

Source: Modified from Jody Culham’s web slides

SLIDE 26

Neuron basics

INPUT = spikes (pulses) or graded potentials
COMPUTATION = analog
OUTPUT = chemical

SLIDE 27

Some numbers

  • Human Brain
    – 10^11–10^12 neurons (= 1 million flies ☺)
    – 10^14–10^15 synapses
  • Neuron
    – Fundamental spatial dimensions: fine dendrites ~0.1 µm diameter; lipid bilayer membrane ~5 nm thick; specific proteins: pumps, channels, receptors, enzymes
    – Fundamental time scale: 1 msec

SLIDE 28

The cerebral cortex

                          Human                       Macaque
Thickness                 3–4 mm                      1–2 mm
Total surface area        ~1600 cm^2 (~50 cm diam)    ~160 cm^2 (~15 cm diam)
  (both sides)
Neurons / mm^2            ~10^5                       ~10^5
Total cortical neurons    ~2 x 10^10                  ~2 x 10^9
Visual cortex             300–500 cm^2                80+ cm^2
Visual neurons            ~4 x 10^9                   ~10^9

SLIDE 29

Gross Brain Anatomy

A large percentage of the cortex is devoted to vision

SLIDE 30

The Visual System

[Van Essen & Anderson, 1990]

SLIDE 31

V1: hierarchy of simple and complex cells

LGN-type cells Simple cells Complex cells

(Hubel & Wiesel 1959)

SLIDE 32

V1: Orientation selectivity

Hubel & Wiesel movie

SLIDE 33

V1: Retinotopy

SLIDE 34

(Thorpe and Fabre-Thorpe, 2001)

SLIDE 35

Reproduced from [Kobatake & Tanaka, 1994] Reproduced from [Rolls, 2004]

Beyond V1: A gradual increase in RF size

SLIDE 36

Reproduced from (Kobatake & Tanaka, 1994)

Beyond V1: A gradual increase in the complexity of the preferred stimulus

SLIDE 37

AIT: Face cells

Reproduced from (Desimone et al. 1984)

SLIDE 38

AIT: Immediate recognition

Hung Kreiman Poggio & DiCarlo 2005

identification categorization

See also Oram & Perrett 1992; Tovee et al 1993; Celebrini et al 1993; Ringach et al 1997; Rolls et al 1999; Keysers et al 2001

SLIDE 39

1. Problem of visual recognition, visual cortex
2. Historical background
3. Neurons and areas in the visual system
4. Data and feedforward hierarchical models
5. What is next?

SLIDE 40

Source: Lennie, Maunsell, Movshon

The ventral stream

SLIDE 41

(Thorpe and Fabre-Thorpe, 2001)

We consider feedforward architecture only
SLIDE 42

Our present model of the ventral stream: feedforward, accounting only for "immediate recognition"

  • It is in the family of "Hubel-Wiesel" models (Hubel & Wiesel, 1959; Fukushima, 1980; Oram & Perrett, 1993; Wallis & Rolls, 1997; Riesenhuber & Poggio, 1999; Thorpe, 2002; Ullman et al., 2002; Mel, 1997; Wersing and Koerner, 2003; LeCun et al 1998; Amit & Mascaro 2003; Deco & Rolls 2006…)
  • As a biological model of object recognition in the ventral stream it is perhaps the most quantitative and faithful to known biology (though many details/facts are unknown or still to be incorporated)

SLIDE 43

Two key computations

Unit type    Computation                        Pooling operation
Simple       Selectivity / template matching    Gaussian tuning / and-like
Complex      Invariance                         Soft-max / or-like
SLIDE 44

Complex units: max-like operation (or-like)

Simple units: Gaussian-like tuning operation (and-like)
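A minimal sketch of the two computations (the full model uses a soft-max and a normalized dot-product variant of the tuning; the vectors and sigma here are illustrative):

```python
import numpy as np

def simple_unit(x, template, sigma=1.0):
    """Selectivity: Gaussian (and-like) tuning around a stored template."""
    d2 = np.sum((np.asarray(x, float) - np.asarray(template, float)) ** 2)
    return float(np.exp(-d2 / (2 * sigma ** 2)))

def complex_unit(afferents):
    """Invariance: max-like (or-like) pooling over afferent units."""
    return max(afferents)

s = [simple_unit([1, 0], t) for t in ([1, 0], [0, 1], [1, 1])]
c = complex_unit(s)  # responds as strongly as its best afferent
```

Alternating these two stages builds units that are both selective (via the templates) and tolerant (via the pooling).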

SLIDE 45

Gaussian tuning

Gaussian tuning in IT around 3D views

Logothetis Pauls & Poggio 1995

Gaussian tuning in V1 for orientation

Hubel & Wiesel 1958

SLIDE 46

Max-like operation

Max-like behavior in V1
(Lampl Ferster Poggio & Riesenhuber 2004; see also Finn Priebe & Ferster 2007)

Max-like behavior in V4
(Gawne & Martin 2002)

SLIDE 47

  • Biophysical implementation
  • Max and Gaussian-like tuning can be approximated with the same canonical circuit using shunting inhibition

(Knoblich Koch Poggio in prep; Kouh & Poggio 2007; Knoblich Bouvrie Poggio 2007)
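The canonical-circuit idea is that one divisive-normalization motif can produce both operations, depending on its exponents. A sketch of the general form y = Σ w_j x_j^p / (k + (Σ_j x_j^q)^r) as in Kouh & Poggio's work; the exponent settings below are chosen for illustration:

```python
import numpy as np

def canonical_circuit(x, w, p, q, r, k=1e-9):
    """Weighted sum divided by pooled input (shunting-inhibition-like
    normalization); exponents select tuning-like vs. max-like behavior."""
    x = np.asarray(x, dtype=float)
    return float(np.dot(w, x ** p) / (k + np.sum(x ** q) ** r))

x = [0.1, 0.9, 0.3]
w = np.ones(3)
tuning_like = canonical_circuit(x, w, p=1, q=2, r=0.5)  # normalized dot product
max_like = canonical_circuit(x, w, p=3, q=2, r=1.0)     # approaches max(x) as q grows
```

The appeal is parsimony: cortex would need only one circuit, tuned differently, rather than two distinct mechanisms for selectivity and invariance.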

SLIDE 48

Of the same form as model of MT (Rust et al., Nature Neuroscience, 2007)

Can be implemented by shunting inhibition (Grossberg 1973; Reichardt et al. 1983; Carandini and Heeger, 1994) and spike threshold variability (Anderson et al. 2000; Miller and Troyer, 2002)

Adelson and Bergen (see also Hassenstein and Reichardt, 1956)

SLIDE 49

  • Generic, overcomplete dictionary of reusable shape components (from V1 to IT) provides unique representation
    – Unsupervised learning (from ~10,000 natural images) during a developmental-like stage
  • Task-specific circuits (from IT to PFC)
    – Supervised learning: ~ Gaussian RBF

SLIDE 50

S2 units

  • Features of moderate complexity (n ~ 1,000 types)
  • Combination of V1-like complex units at different orientations
  • Synaptic weights w learned from natural images
  • 5-10 subunits chosen at random from all possible afferents (~100-1,000)

[Figure legend: stronger facilitation / stronger suppression]
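In code, an S2 unit is a Gaussian template match against a few C1 (complex-cell) afferents chosen at random across orientations and positions; a sketch with illustrative sizes (4 orientations over a 3x3 neighborhood, 8 subunits, random weights standing in for ones learned from natural images):

```python
import numpy as np

rng = np.random.default_rng(0)

# C1 responses: 4 orientations over a 3x3 neighborhood -> 36 possible afferents
c1_patch = rng.random((4, 3, 3)).ravel()

# An S2 template: a few subunits chosen at random from the possible afferents
subunits = rng.choice(c1_patch.size, size=8, replace=False)
weights = rng.random(8)  # stored during a developmental-like stage

def s2_unit(c1, subunits, weights, sigma=0.5):
    """Gaussian (template-matching) response on the selected afferents."""
    d2 = np.sum((c1[subunits] - weights) ** 2)
    return float(np.exp(-d2 / (2 * sigma ** 2)))

response = s2_unit(c1_patch, subunits, weights)
```

The unit fires maximally when the current C1 pattern reproduces the stored weights, giving the "moderate complexity" selectivity.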

SLIDE 51

Nature Neuroscience - 10, 1313 - 1321 (2007) / Published online: 16 September 2007 | doi:10.1038/nn1975

Neurons in monkey visual area V2 encode combinations of orientations

Akiyuki Anzai, Xinmiao Peng & David C Van Essen

SLIDE 52

C2 units

  • Same selectivity as S2 units but increased tolerance to position and size of preferred stimulus
  • Local pooling over S2 units with same selectivity but slightly different positions and scales
  • A prediction to be tested: S2 units in V2 and C2 units in V4?
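A C2 unit then pools, with a max, over S2 units that share a template but differ in position and scale; a sketch with hypothetical response maps:

```python
import numpy as np

def c2_unit(s2_maps):
    """Max over S2 responses of one template across positions and scales,
    giving tolerance to where and how big the preferred stimulus is."""
    return max(float(m.max()) for m in s2_maps)

# Responses of one S2 template at two scales (hypothetical values)
fine = np.array([[0.1, 0.7], [0.3, 0.2]])
coarse = np.array([[0.4, 0.6]])
print(c2_unit([fine, coarse]))  # 0.7
```

Because the max is taken over position and scale but within one template, selectivity survives while tolerance increases.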

SLIDE 53

A loose hierarchy

  • Bypass routes along with main routes:
    – From V2 to TEO (bypassing V4) (Morel & Bullier 1990; Baizer et al 1991; Distler et al 1991; Weller & Steele 1992; Nakamura et al 1993; Buffalo et al 2005)
    – From V4 to TE (bypassing TEO) (Desimone et al 1980; Saleem et al 1992)
  • "Replication" of simpler selectivities from lower to higher areas
  • Richer dictionary of features with various levels of selectivity and invariance

SLIDE 54

Comparison w| neural data

  • V1:
    – Simple and complex cells tuning (Schiller et al 1976; Hubel & Wiesel 1965; Devalois et al 1982)
    – MAX operation in subset of complex cells (Lampl et al 2004)
  • V4:
    – Tuning for two-bar stimuli (Reynolds Chelazzi & Desimone 1999)
    – MAX operation (Gawne et al 2002)
    – Two-spot interaction (Freiwald et al 2005)
    – Tuning for boundary conformation (Pasupathy & Connor 2001; Cadieu et al., 2007)
    – Tuning for Cartesian and non-Cartesian gratings (Gallant et al 1996)
  • IT:
    – Tuning and invariance properties (Logothetis et al 1995)
    – Differential role of IT and PFC in categorization (Freedman et al 2001, 2002, 2003)
    – Read out data (Hung Kreiman Poggio & DiCarlo 2005)
    – Pseudo-average effect in IT (Zoccolan Cox & DiCarlo 2005; Zoccolan Kouh Poggio & DiCarlo 2007)
  • Human:
    – Rapid categorization (Serre Oliva Poggio 2007)
    – Face processing (fMRI + psychophysics) (Riesenhuber et al 2004; Jiang et al 2006)

(Serre Kouh Cadieu Knoblich Kreiman & Poggio 2005)

SLIDE 55

Comparison w| V4

Tuning for curvature and boundary conformations?

Pasupathy & Connor 2001

SLIDE 56

V4 neuron tuned to boundary conformations vs. most similar model C2 unit: ρ = 0.78

No parameter fitting!

Pasupathy & Connor 1999
Serre Kouh Cadieu Knoblich Kreiman & Poggio 2005

SLIDE 57

J Neurophysiol 98: 1733-1750, 2007. First published June 27, 2007

A Model of V4 Shape Selectivity and Invariance

Charles Cadieu, Minjoon Kouh, Anitha Pasupathy, Charles E. Connor, Maximilian Riesenhuber and Tomaso Poggio

SLIDE 58

V4 neurons (with attention directed away from receptive field) (Reynolds et al 1999) vs. C2 units

Reference (fixed) and probe (varying) stimuli: selectivity = response(probe) – response(reference); sensory interaction = response(pair) – response(reference)

Prediction: the response to the pair falls between the responses elicited by the stimuli alone

(Serre Kouh Cadieu Knoblich Kreiman & Poggio 2005)

SLIDE 59

Agreement w | IT Readout data

Hung Kreiman Poggio DiCarlo 2005 Serre Kouh Cadieu Knoblich Kreiman & Poggio 2005

SLIDE 60

Remarks

  • The stage that includes (V4-PIT)-AIT-PFC represents a learning network of the Gaussian RBF type that is known (from learning theory) to generalize well
  • In the theory, the stage between IT and "PFC" is a linear classifier – like the one used in the read-out experiments
  • The inputs to IT are a large dictionary of selective and invariant features
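The read-out idea can be sketched on synthetic data: train a linear classifier on an IT-like population response. Least squares stands in here for the regularized linear classifiers used in the actual read-out experiments, and the "responses" are random vectors, not recorded data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "population responses": 64 units, two object categories
n, d = 200, 64
X = np.vstack([rng.normal(0.0, 1.0, (n, d)),    # category A
               rng.normal(0.6, 1.0, (n, d))])   # category B (shifted mean)
y = np.r_[-np.ones(n), np.ones(n)]

A = np.c_[X, np.ones(2 * n)]                    # add a bias column
perm = rng.permutation(2 * n)
train, test = perm[:n], perm[n:]

# Linear readout by least squares on the training half
w, *_ = np.linalg.lstsq(A[train], y[train], rcond=None)
accuracy = float((np.sign(A[test] @ w) == y[test]).mean())
```

The point of the remark above is exactly this: once the dictionary feeding IT is selective and invariant enough, a plain linear stage suffices for categorization.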

SLIDE 61

Rapid categorization

SHOW ANIMAL / NON_ANIMAL MOVIE

SLIDE 62

Database collected by Oliva & Torralba

SLIDE 63

Rapid categorization task (with mask to test feedforward model): animal present or not?

Image (20 ms) → interval / ISI (30 ms) → mask (1/f noise, 80 ms)

Thorpe et al 1996; Van Rullen & Koch 2003; Bacon-Mace et al 2005

SLIDE 64

…solves the problem (when mask forces feedforward processing)…

Human observers (n = 24): 80%; Model: 82%

  • d' ~ standardized error rate
  • the higher the d', the better the performance

Serre Oliva & Poggio 2007

SLIDE 65

Further comparisons

  • Image-by-image correlation:
    – Heads: ρ=0.71
    – Close-body: ρ=0.84
    – Medium-body: ρ=0.71
    – Far-body: ρ=0.60
  • Model predicts level of performance on rotated images (90 deg and inversion)

Serre Oliva & Poggio PNAS 2007

SLIDE 66

Source: Bileschi & Wolf

The street scene project

SLIDE 67

The StreetScenes Database

Object        # Labeled Examples
car           5799
pedestrian    1449
bicycle       209
building      5067
tree          4932
road          3400
sky           2562

3,547 images, all taken with the same camera, of the same type of scene, and hand labeled with the same objects, using the same labeling rules.

http://cbcl.mit.edu/software-datasets/streetscenes/

SLIDE 68

Examples

SLIDE 69

Examples

SLIDE 70

Examples

SLIDE 71

Examples

SLIDE 72

Benchmark systems compared against:
  • HoG (Dalal & Triggs 2005)
  • Part-based system (Leibe et al 2004)
  • Local patch correlation (Torralba et al 2004)

Serre Wolf Bileschi Riesenhuber & Poggio PAMI 2007

SLIDE 73

Serre Wolf Bileschi Riesenhuber & Poggio PAMI 2007

SLIDE 74

1. Problem of visual recognition, visual cortex
2. Historical background
3. Neurons and areas in the visual system
4. Data and feedforward hierarchical models
5. What is next?

SLIDE 75

What is next

  • A challenge for physiology: disprove basic aspects of the architecture
  • Extensions to color and stereo
  • More sophisticated unsupervised, developmental learning in V1, V4, PIT: how?
  • Extension to time and videos
  • Extending the simulation to integrate-and-fire neurons (~1 billion) and realistic synapses: towards the neural code

SLIDE 76

What is next

  • A challenge for physiology: disprove basic aspects of the architecture
  • Extensions to color and stereo
  • More sophisticated unsupervised, developmental learning in V1, V4, PIT: how?
  • Extension to time and videos
  • Extending the simulation to integrate-and-fire neurons (~1 billion) and realistic synapses: towards the neural code

SLIDE 77

  • Generic dictionary of shape components (from V1 to IT)
    – Unsupervised learning during a developmental-like stage: learning dictionaries of "templates" at different S levels
  • Task-specific circuits (from IT to PFC)
    – Supervised learning

Layers of cortical processing units

SLIDE 78

Learning the invariance from temporal continuity

w| T. Masquelier & S. Thorpe (CNRS, France)
Foldiak 1991; Perrett et al 1984; Wallis & Rolls, 1997; Einhauser et al 2002; Wiskott & Sejnowski 2002; Spratling 2005

✦ Simple cells learn correlation in space (at the same time)
✦ Complex cells learn correlation in time

SHOW MOVIE
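The temporal-continuity idea can be sketched with a Foldiak-style trace rule: Hebbian learning is driven by a low-pass-filtered ("trace") activity, so inputs that follow each other in time strengthen the same unit. The sequence and parameters below are illustrative (Masquelier & Thorpe's actual model is spike-based):

```python
import numpy as np

rng = np.random.default_rng(1)

def trace_learning(sequence, lr=0.1, decay=0.8):
    """Hebbian learning on a temporally smoothed activity: the unit
    becomes invariant to changes that occur over time (e.g., position)."""
    w = rng.random(sequence.shape[1])
    w /= np.linalg.norm(w)
    trace = 0.0
    for x in sequence:
        y = w @ x
        trace = decay * trace + (1 - decay) * y  # low-pass filtered activity
        w += lr * trace * x                      # Hebb on the trace, not on y
        w /= np.linalg.norm(w)                   # keep weights bounded
    return w

# The same pattern appearing at two positions in alternation over time
seq = np.array([[1.0, 0.2, 0.0], [0.0, 0.2, 1.0]] * 20)
w = trace_learning(seq)
# After learning, the unit responds to the pattern at either position
```

Because the trace outlasts each frame, the weight update after the pattern moves still credits the new position, which is exactly the "complex cells learn correlation in time" point above.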

SLIDE 79

What is next

  • A challenge for physiology: disprove basic aspects of the architecture
  • Extensions to color and stereo
  • More sophisticated unsupervised, developmental learning in V1, V4, PIT: how?
  • Extension to time and videos
  • Extending the simulation to integrate-and-fire neurons (~1 billion) and realistic synapses: towards the neural code

SLIDE 80

The problem

Training videos / testing videos
Actions: bend, jack, jump, pjump, run, walk, side, wave1, wave2
*each video ~4 s, 50~100 frames
Dataset from (Blank et al, 2005)

SLIDE 81

Previous work: recognizing biological motion using a model of the dorsal stream

Adapted from (Giese & Poggio, 2003)
See also (Casile & Giese 2005; Sigala et al, 2005)

SLIDE 82

Multi-class recognition accuracy

              Baseline    Our system
KTH Human     81.3 %      91.6 %
UCSD Mice     75.6 %      79.0 %
Weiz. Human   86.7 %      96.3 %
Average       81.2 %      89.6 %

* chance: 10%~20%

H. Jhuang, T. Serre, L. Wolf and T. Poggio, ICCV, 2007

SLIDE 83

What is next: beyond feedforward models: limitations

[Figure panels: Psychophysics, V4, IT]

Serre Oliva Poggio 2007; Zoccolan Kouh Poggio DiCarlo 2007; Reynolds Chelazzi & Desimone 1999

SLIDE 84

What is next: beyond feedforward models: limitations

  • Recognition in clutter is increasingly difficult
  • Need for attentional bottleneck (Wolfe, 1994), perhaps in V4 (see Gallant and Desimone, and models by Walther + Serre)
  • Notice: this is a "novel" justification for the need of attention!
SLIDE 85

Limitations: beyond 50 ms the model is not good enough

[Figure: performance for model vs. 20 ms SOA (ISI=0 ms), 50 ms SOA (ISI=30 ms), 80 ms SOA (ISI=60 ms), and no-mask conditions]

(Serre, Oliva and Poggio, PNAS, 2007)

SLIDE 86

Ongoing work….

SLIDE 87

Attention and cortical feedbacks

✦ Model implementation of Wolfe's guided search (1994)
✦ Parallel (feature-based top-down attention) and serial (spatial attention) to suppress clutter (Tsotsos et al)

Face detection: scanning vs. attention (detection accuracy vs. number of items)

Chikkerur Serre Walther Koch & Poggio in prep

SLIDE 88

Example Results

SLIDE 89

What is next

  • Image inference: attentional or Bayesian models?
  • Why hierarchies? Beyond a model, towards a theory
  • Against hierarchies and the ventral stream: subcortical pathways

SLIDE 90

What is next: image inference, backprojections and attentional mechanisms

  • Normal recognition by humans (for long times) is much better
  • Normal vision is much more than categorization or identification: image understanding/inference/parsing

SLIDE 91

Attention-based models with high-level specialized routines

  • Feedforward model + backprojections implementing featural and spatial attention may improve recognition performance
  • Backprojections also access/route information in/from lower areas to specific task-dependent routines in PFC (?). Open questions:
    – Which biophysical mechanisms for routing/gating?
    – Nature of routines in higher areas (e.g. PFC)?

SLIDE 92

Bayesian models

Analysis-by-synthesis models, e.g. probabilistic inference in the ventral stream: neurons represent conditional probabilities of the bottom-up sensory inputs given the top-down hypothesis, and converge to globally consistent values

Lee and Mumford, 2003; Dean, 2005; Rao, 2004; Hawkins, 2004; Ullman, 2007; Hinton, 2005

SLIDE 93

What is next

  • Image inference: attentional or Bayesian models?
  • Why hierarchies? Beyond a model, towards a theory
  • Against hierarchies and the ventral stream: subcortical pathways (Bar et al., 2006, …)

SLIDE 94

How then do the learning machines described in the theory compare with brains? One of the most obvious differences is the ability of people and animals to learn from very few examples.

A comparison with real brains offers another, related, challenge to learning theory. The "learning algorithms" we have described in this paper correspond to one-layer architectures. Are hierarchical architectures with more layers justifiable in terms of learning theory?

Why hierarchies? For instance, the lowest levels of the hierarchy may represent a dictionary of features that can be shared across multiple classification tasks. There may also be the more fundamental issue of sample complexity. Thus our ability of learning from just a few examples, and its limitations, may be related to the hierarchical architecture of cortex.

The Mathematics of Learning: Dealing with Data. Tomaso Poggio and Steve Smale. Notices of the American Mathematical Society (AMS), Vol. 50, No. 5, 537-544, 2003.

SLIDE 95

Formalizing the hierarchy: towards a theory

SLIDE 96

Smale, S., T. Poggio, A. Caponnetto, and J. Bouvrie. Derived Distance: towards a mathematical theory of visual cortex, CBCL Paper, Massachusetts Institute of Technology, Cambridge, MA, November, 2007.

SLIDE 97

From a model to a theory: math results on unsupervised learning of invariances and of a dictionary of shapes from image sequences

Caponnetto, Smale and Poggio, in preparation

SLIDE 98

(Obvious) caution remark!!! There is still much to do before we understand vision… and the brain!

SLIDE 99

Collaborators

Comparison w| humans

  • A. Oliva

Action recognition

  • H. Jhuang

Attention

  • S. Chikkerur
  • C. Koch
  • D. Walther

Computer vision

  • S. Bileschi
  • L. Wolf

Learning invariances

  • T. Masquelier
  • S. Thorpe
  • T. Serre

Model

  • A. Oliva
  • C. Cadieu
  • U. Knoblich
  • M. Kouh
  • G. Kreiman
  • M. Riesenhuber