Going after object recognition performance to discover how the ventral stream works

Going after object recognition performance to discover how the ventral stream works. Slide annotations: “invariance” is the crux problem; hierarchical, working system.
James DiCarlo, MD, PhD. Professor of Neuroscience; Head, Department of Brain and Cognitive Sciences; Investigator, The McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA

Systems neuroscience: the non-human primate model. Ventral visual stream.

Systems neuroscience: the non-human primate model. Ventral visual stream: a powerful set of visual features.

Understanding the brain and discovering game-changing information processing technology are two sides of the same coin. How the brain works

The convergence of three fields: psychophysics, computer science, and neuroscience, all aimed at how the brain works. Diagram labels: new phenomena; new ideas, algorithm parameters; falsifiable hypotheses; attempt to test/falsify those hypotheses. Two cases are marked: when biological brains perform better than computers, and when computers perform as well as or better than biological brains.

Common physical source (object) leads to many images: “identity-preserving image variation”. Sources of variation: view (position, size, pose, illumination); clutter, occlusion, illumination; intraclass variation; deformation, articulation. Poggio, Ullman, Grossberg, Edelman, Biederman, etc.; DiCarlo and Cox, TICS (2007); Pinto, Cox, and DiCarlo, PLoS Comp Bio (2008)

The convergence of three fields: psychophysics, computer science, and neuroscience, all aimed at how the brain works. Diagram labels: new phenomena; new ideas, algorithm parameters.

Brain-inspired computer algorithms: 1. Selectivity (“AND”) 2. Tolerance (“OR”)
Examples:
• Hubel & Wiesel (1962)
• Fukushima (1980)
• Perrett & Oram (1993)
• Wallis & Rolls (1997)
• LeCun et al. (1998)
• Riesenhuber & Poggio (1999)
• Serre, Kouh, et al. (2005)
FROM BIOLOGY:
• Hierarchy
• Spatially local filters
• Convolution
• Normalization
• Threshold NL
• Unsupervised learning
• ...
Serre, Kouh, Cadieu, Knoblich, Kreiman & Poggio 2005
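The two properties on this slide can be illustrated with a toy one-dimensional example. This is only a sketch under stated assumptions: selectivity is modeled as a normalized template match (an “AND”-like unit that responds fully only when every part of its preferred pattern is present) and tolerance as a max over positions (an “OR”-like unit). The function names and the toy signals are hypothetical and do not come from any of the cited models.

```python
import math


def template_match(patch, template):
    """AND-like selectivity: normalized dot product between patch and template."""
    dot = sum(p * t for p, t in zip(patch, template))
    norm = math.sqrt(sum(p * p for p in patch)) * math.sqrt(sum(t * t for t in template))
    return dot / norm if norm else 0.0


def max_pool(responses):
    """OR-like tolerance: respond if the pattern is present at any position."""
    return max(responses)


def selective_then_tolerant(signal, template):
    """Match the template at every position, then pool over positions."""
    width = len(template)
    matches = [
        template_match(signal[i:i + width], template)
        for i in range(len(signal) - width + 1)
    ]
    return max_pool(matches)


# Hypothetical toy inputs: the same pattern at two positions, plus a different pattern.
template = [1.0, 0.0, 1.0]
pattern_left = [1.0, 0.0, 1.0, 0.0, 0.0, 0.0]
pattern_right = [0.0, 0.0, 0.0, 1.0, 0.0, 1.0]
other_pattern = [0.0, 1.0, 0.0, 0.0, 1.0, 0.0]

response_left = selective_then_tolerant(pattern_left, template)
response_right = selective_then_tolerant(pattern_right, template)
response_other = selective_then_tolerant(other_pattern, template)
print(response_left, response_right, response_other)
```

The pooled unit responds equally to the preferred pattern at either position (tolerance) and more weakly to the other pattern (selectivity).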

The convergence of three fields: psychophysics, computer science, and neuroscience, all aimed at how the brain works. Diagram labels: falsifiable hypotheses; attempt to test/falsify those hypotheses. Example: HMAX.

HMAX successes (~2005) Serre, Kouh, Cadieu, Knoblich, Kreiman & Poggio 2005

HMAX successes (~2007) (under limited human viewing conditions) Serre, Oliva & Poggio 2007

Circa 2007. [Figure: performance scale with markers for pixels, HMAX, IT population, and human level.]

~2008: But HMAX and other models failed to explain neurons. [Figure: representational similarity analysis comparing models of the ventral stream (HMAX model) with the biological ventral stream.] Kriegeskorte, Frontiers in Neuroscience (2009)
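A minimal sketch of the kind of comparison representational similarity analysis makes, under stated assumptions: each representation (neural population or model) is summarized by a representational dissimilarity matrix (RDM) of 1 minus the Pearson correlation between response patterns for every image pair, and two RDMs are compared by correlating their upper triangles. The toy response values and all function names are hypothetical. Pearson correlation is used for the RDM comparison only to keep the sketch short; published analyses often use rank correlation instead.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rdm(responses):
    """responses[i] is the response pattern (units or features) to image i."""
    n = len(responses)
    return [
        [0.0 if i == j else 1.0 - pearson(responses[i], responses[j]) for j in range(n)]
        for i in range(n)
    ]


def upper_triangle(matrix):
    """Off-diagonal entries above the diagonal, as a flat list."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def rdm_similarity(rdm_a, rdm_b):
    """Correlation between the image-pair dissimilarities of two representations."""
    return pearson(upper_triangle(rdm_a), upper_triangle(rdm_b))


# Hypothetical toy data: 4 images x 3 recorded units.
neural = [[1.0, 2.0, 3.0], [1.1, 2.1, 2.9], [3.0, 1.0, 2.0], [2.9, 1.2, 2.1]]
# A model whose features share the neural geometry (same values, rescaled).
model_matched = [[2.0 * v + 1.0 for v in row] for row in neural]
# A model that groups the images differently (rows shuffled).
model_mismatched = [neural[0], neural[2], neural[1], neural[3]]

neural_rdm = rdm(neural)
similarity_matched = rdm_similarity(neural_rdm, rdm(model_matched))
similarity_mismatched = rdm_similarity(neural_rdm, rdm(model_mismatched))
print(similarity_matched, similarity_mismatched)
```

Because the comparison is made on image-pair dissimilarities, the model and the neural population do not need the same number of units or features.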

What went wrong? Stringency of these “Brains vs. Machines” tests was far too weak. Diagram labels: psychophysics, computer science, neuroscience; how the brain works; new phenomena; new ideas, algorithm parameters; falsifiable hypotheses; attempt to test/falsify those hypotheses.

~2008: Tests of performance were not stringent enough. One problem was insufficient variation in the test sets. [Figure: performance (%) from 50 to 100 on the Caltech 101 benchmark and on an animal vs. non-animal task (head, close-body, medium-body, far-body), comparing “V1-like” models, SLF (~HMAX), “HMAX 2.0” (Serre et al., PNAS 2007), and humans.] Pinto, Majaj, Barhomi, Salomon, Cox, DiCarlo, COSYNE 2010; Pinto, Cox, and DiCarlo, PLoS Comp Bio (2008)

[Figure: performance scale with markers for pixels, V1-like, HMAX, IT population, and human level.]

2009: More stringent, but compact, tests of “object recognition”. Example object recognition task: “car detection”. Image generation strategy: Pinto, Cox & DiCarlo, PLoS Comp Biol (2008); Pinto, DiCarlo and Cox, ECCV (2008); Pinto, Doukan, DiCarlo & Cox, PLoS Comp Biol (2009)

2009: Toward more stringent tests of “object recognition”. Example object recognition task: “car detection” (“car” vs. not “car”). Image generation strategy:
- Parametric control of task demand (esp. invariance)
- Few images needed to bring computer vision features to their knees
[Figure: example images for the basic car task at variation level 3, ordered from no variation through more variation to lots of variation; labels n>100 and n>700.] Pinto, Cox & DiCarlo, PLoS Comp Biol (2008); Pinto, DiCarlo and Cox, ECCV (2008); Pinto, Doukan, DiCarlo & Cox, PLoS Comp Biol (2009)

2010: Machines vs. human brains. [Figure: panel a) “cars vs. planes” task, panel b) controls. Axes: performance (%) from 50 (chance) to 100, and performance relative to Pixels (%), plotted against increasing composite variation. Features compared: Pixels, V1-like, SLF (~HMAX), SIFT, PHOW, PHOG, Geometric Blur. Annotations: “Machines beat humans!” at low variation, “Machines lose to humans” at high variation.]
Composite variation by level (seven axis values each):
position (x-axis): 0%, 10%, 20%, 30%, 40%, 50%, 60%
position (y-axis): 0%, 20%, 40%, 60%, 80%, 100%, 120%
scale: 0%, 10%, 20%, 30%, 40%, 50%, 60%
in-plane rotation: 0°, 15°, 30°, 45°, 60°, 75°, 90°
in-depth rotation: 0°, 15°, 30°, 45°, 60°, 75°, 90°
Data merged here: 48 basic-level tasks (8 labels x 6 levels of variation). Pinto, Barhomi, Cox & DiCarlo, WACV (2010)
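The parametric control of task demand can be sketched as a sampler over view parameters. This is an illustration under stated assumptions, not the authors' rendering pipeline: the maximum values are the largest axis values on the figure, intermediate levels scale linearly (as the axis values do), and each parameter is drawn uniformly and symmetrically around zero within its range. The names `MAX_VARIATION`, `variation_ranges`, and `sample_view` are hypothetical.

```python
import random

# Largest axis values from the figure (fractions of object size, and degrees).
MAX_VARIATION = {
    "position_x": 0.60,
    "position_y": 1.20,
    "scale": 0.60,
    "in_plane_rotation_deg": 90.0,
    "in_depth_rotation_deg": 90.0,
}
TOP_LEVEL = 6  # levels run from 0 (no variation) to 6 (most variation)


def variation_ranges(level):
    """Per-parameter limit at a given composite variation level."""
    if not 0 <= level <= TOP_LEVEL:
        raise ValueError("level must be between 0 and %d" % TOP_LEVEL)
    fraction = level / TOP_LEVEL
    return {name: limit * fraction for name, limit in MAX_VARIATION.items()}


def sample_view(level, rng):
    """Draw one set of view parameters (assumed uniform within +/- the limit)."""
    return {
        name: rng.uniform(-limit, limit)
        for name, limit in variation_ranges(level).items()
    }


rng = random.Random(0)
level_3_ranges = variation_ranges(3)
example_view = sample_view(3, rng)
print(level_3_ranges)
print(example_view)
```

Each sampled dictionary would then parameterize one rendered image of an object; the rendering step itself is outside this sketch.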

[Figure: performance scale with markers for pixels, V1-like, HMAX, IT population, and human level.]

[Figure: the same scale, with the IT population marker labeled “IT population simple decode”.]

[Figure: the same scale, with “V4 population” added.]

[Figure: performance scale with markers for pixels, V1-like, HMAX, V4 population, IT population simple decode, HMO, Zeiler & Fergus, Super Vision, and human level, with a “?” at the top.]

Neural population similarity of images along the ventral stream. [Figure, panel a: representational dissimilarity matrices for Pixels, V1-like, V2-like, V4 neuronal units, IT neuronal units, and the HMO model, with images grouped as Animals (8), Boats (8), Cars (8), Chairs (8), Faces (8), Fruits (8), Planes (8), Tables (8). Panel b: population similarity to IT (RDM correlation, axis from 0.0 to 0.9) under image generalization and object generalization, for Pixels, V1-like, V2-like, HMAX, SIFT, V4 units, other models, and the HMO model; the IT units split-half value marks the current maximum expected explanatory power.] Inspired by N. Kriegeskorte et al. (2008, 2009). Yamins, Hong, Solomon, Seibert and DiCarlo (under review)

Predictions of single-site IT responses from the current best model. [Figure, panel d: response of neural site and prediction of HMO model across images grouped by category (Animals, Boats, Cars, Chairs, Faces, Fruits, Planes, Tables). Unit 1: r² = 0.48. Unit 2: r² = 0.55.] Ability to predict IT responses to new images and new objects is dramatically better than previous models. Yamins, Hong, Solomon, Seibert and DiCarlo (under review)
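One way such a prediction score can be computed is sketched below, under stated assumptions: a linear readout is fit from model features to a single site's response on training images, then scored on held-out images by the squared Pearson correlation between predicted and measured responses. The slide does not state the fitting procedure or the exact definition of r², so both are assumptions here, and the data are synthetic. All names are hypothetical.

```python
import math
import random


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_linear(features, responses, ridge=1e-6):
    """Least-squares weights (last entry is the bias), with a tiny ridge term."""
    rows = [f + [1.0] for f in features]
    d = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
            for j in range(d)] for i in range(d)]
    xty = [sum(r[i] * y for r, y in zip(rows, responses)) for i in range(d)]
    return solve(xtx, xty)


def predict(weights, features):
    return [sum(w * x for w, x in zip(weights, f + [1.0])) for f in features]


def r_squared(predicted, measured):
    """Squared Pearson correlation between predicted and measured responses."""
    n = len(predicted)
    mp, mm = sum(predicted) / n, sum(measured) / n
    cov = sum((p - mp) * (q - mm) for p, q in zip(predicted, measured))
    vp = sum((p - mp) ** 2 for p in predicted)
    vm = sum((q - mm) ** 2 for q in measured)
    return cov * cov / (vp * vm)


# Synthetic stand-in data: 60 images, 3 model features, one simulated site.
rng = random.Random(0)
true_weights = [2.0, -1.0, 0.5]
features = [[rng.uniform(-1, 1) for _ in true_weights] for _ in range(60)]
responses = [sum(w * x for w, x in zip(true_weights, f)) + rng.gauss(0, 0.05)
             for f in features]

# Fit on the first 40 images, score on the 20 held-out images.
weights = fit_linear(features[:40], responses[:40])
held_out_r2 = r_squared(predict(weights, features[40:]), responses[40:])
print(held_out_r2)
```

Scoring only on images that were not used for fitting is what makes the number a test of generalization to new images.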

Basic bio-constrained model component inside HMO. Pinto, Doukan, DiCarlo & Cox, PLoS Comp Biol (2009). Basic operations: filter, threshold, saturate, pool, normalize. [Figure, panel a: neural-like basic operations in sequence (Filter, Threshold & Saturate, Pool, Normalize), with filters indexed 1, 2, ..., k, and hierarchical stacking of layers L1, L2, L3.] “Output” is thousands of visual features. Hubel & Wiesel (1962); Fukushima (1980); Perrett & Oram (1993); Wallis & Rolls (1997); LeCun et al. (1998); Riesenhuber & Poggio (1999); Serre, Kouh, et al. (2005), etc.
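The sequence of basic operations can be sketched on a one-dimensional signal. This is a simplified illustration under stated assumptions, not the HMO implementation: filtering is a sliding dot product, threshold and saturation are a clip to a fixed range, pooling is a max over non-overlapping windows, and normalization divides each feature map by its own L2 norm. The kernels, the input signal, and all function names are hypothetical.

```python
import math


def filter1d(signal, kernel):
    """Spatially local filter: sliding dot product (valid positions only)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def threshold_saturate(values, low=0.0, high=1.0):
    """Threshold below `low` and saturate above `high`."""
    return [min(max(v, low), high) for v in values]


def pool(values, size=2):
    """Max over non-overlapping windows (assumed pooling rule)."""
    return [max(values[i:i + size]) for i in range(0, len(values), size)]


def normalize(values, eps=1e-8):
    """Divide by the L2 norm of the feature map (assumed normalization rule)."""
    norm = math.sqrt(sum(v * v for v in values))
    return [v / (norm + eps) for v in values]


def layer(signal, kernels, pool_size=2):
    """One layer: filter, threshold and saturate, pool, normalize, per kernel."""
    return [normalize(pool(threshold_saturate(filter1d(signal, k)), pool_size))
            for k in kernels]


def hierarchy(signal, kernel_banks, pool_size=2):
    """Stack layers: every map from one layer feeds every kernel of the next."""
    maps = [signal]
    for kernels in kernel_banks:
        maps = [out for m in maps for out in layer(m, kernels, pool_size)]
    return maps


# Hypothetical input (a step edge) and two small kernel banks.
signal = [0.0] * 8 + [1.0] * 8
edge, blur = [-1.0, 0.0, 1.0], [1 / 3, 1 / 3, 1 / 3]
features = hierarchy(signal, [[edge, blur], [edge, blur]])
print(len(features), [len(f) for f in features])
```

With two kernels per layer, two stacked layers yield four feature maps; wider banks and a third layer grow the output quickly, which is consistent with the slide's note that the output is thousands of features.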

The better a model performs, the better it explains IT responses. [Figure (2013): ability of artificial visual features to predict IT responses (% variance explained, axis from 0% to 50%) plotted against performance of artificial visual features (% correct). Labels: “Exploration of basic model class”; “We are optimizing this way”.]

Today: [Figure: performance scale with markers for pixels, V1-like, HMAX, V4 population, IT population simple decode, HMO, Zeiler & Fergus, Super Vision, and human level, with two “?” marks.]

Follow the performance trail... Stringency of these tests is crucial. Must include “invariance”. Diagram labels: psychophysics, computer science, neuroscience; how the brain works; new phenomena; new ideas, algorithm parameters; falsifiable hypotheses; attempt to test/falsify those hypotheses.

The power of stringent tests to elucidate biological brains
1) (Dan Yamins, Ha Hong, Ethan Solomon)
• Discover IT neuronal codes that can explain behavior
• Demonstrate that other possible codes CANNOT
• Demonstrate which computer vision features CANNOT
2) (Dan Yamins, Ha Hong, Charles Cadieu, Dave Cox, Nicolas Pinto)
• Driving discovery (“learning?”) of new CV features
• These are becoming more and more capable of explaining what the brain is doing
