Modeling Visual Cortex V4 in Naturalistic Conditions with Invariant and Sparse Image Representations




SLIDE 1

Modeling Visual Cortex V4 in Naturalistic Conditions with Invariant and Sparse Image Representations

Bin Yu
Departments of Statistics and EECS
University of California at Berkeley
Rutgers University, May 2, 2014

Modeling Visual Cortex V4 in Naturalistic Conditions with Invariant 1/51

SLIDE 2

Co-authors

Yu Group: Julien Mairal and Yuval Benjamini (leads)
Gallant Lab at UC Berkeley: Ben Willmore, Michael Oliver, Jack Gallant
Supported by NSF STC, Center for Science of Information (CSoI)

SLIDE 3

Brain Science 2013 – new “genomics”

SLIDE 4

[Figure; caption not recoverable]

SLIDE 5

[Figure; caption not recoverable]

SLIDE 6

[Figure; caption not recoverable]

SLIDE 7

[Figure; caption not recoverable]

SLIDE 8

Sources of Knowledge About the Brain

Ways of Understanding the Visual Cortex

study of lesions and associated impairments;
electrodes (single or arrays);
imaging studies (fMRI, ...);
image below from Hansen et al. [2007].

SLIDE 9

Classical Models for V1 “Simple” Cells

Image from Olshausen and Field [2005]:

V1

Models based on Gabor filters achieve impressive prediction performance on experimental data (single neuron and fMRI);
V1 receptive fields are relatively small and well localized.
V1 is the best understood area, but not all is ...
It serves as a performance benchmark for other areas.

SLIDE 10

V4: an intermediate area on the “what” pathway

[see Roe et al., 2012]

What we know about V4

affected by attention;
diverse selectivity;
larger receptive fields and more invariance than V1/V2;
no good predictive model with natural image inputs.

Question about V4

What are the roles of V4? Roe et al. advocated a background-foreground thesis, among other things.

SLIDE 11

Experimental set-up in the Gallant Lab for single neuron data collection

Image Sequence Subject Recording Response Signal Filtering Spike Sorting Time Binning Firing Rate

SLIDE 12

Objectives and data

Aim: a statistical/computational model with

good prediction performance on natural scenes (validation data);
elucidation of properties of a population of V4 neurons;
biological interpretability.

Data

consists of 4000–12000 grayscale images (with no motion or color content) and average firing rates for 71 neurons;
the image sequence is shown at 30 Hz;
the stimulus is centered on an estimated receptive field (RF) while the subject performs a fixation task;
the stimulus size is 2–4 times larger than the estimated RF.
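To make the "average firing rates" in the data description concrete, here is a minimal sketch of time binning. It assumes spike times in seconds for one sorted unit, one bin per stimulus frame at the 30 Hz frame rate, and averaging over repeated trials; the function names and the trial structure are hypothetical and not taken from the Gallant Lab pipeline.

```python
import numpy as np

def binned_firing_rate(spike_times, n_frames, frame_rate=30.0):
    """Count spikes per stimulus frame and convert counts to rates (spikes/s)."""
    spike_times = np.asarray(spike_times, dtype=float)
    # Frame k covers the interval [k / frame_rate, (k + 1) / frame_rate).
    edges = np.arange(n_frames + 1) / frame_rate
    counts, _ = np.histogram(spike_times, bins=edges)
    return counts * frame_rate

def average_over_trials(trials, n_frames, frame_rate=30.0):
    """Average the binned rate over repeated presentations (assumed trial list)."""
    rates = np.stack([binned_firing_rate(t, n_frames, frame_rate) for t in trials])
    return rates.mean(axis=0)
```

With one bin per frame, each response value lines up with exactly one image of the sequence.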

SLIDE 13

Outline of Today’s Talk

Multi-layer invariant feature extraction
Prediction model via low-rank regularization
Model interpretation

SLIDE 14

Part I: Invariant Image Representation: Feature Extraction

SLIDE 15

Methodology

Classical computer vision image representation for scene analysis

1. dense low-level feature extraction (local histograms of gradient orientations) [Lowe, 2004];
2. feature encoding into visual words using vector quantization or sparse coding [Olshausen and Field, 1997];
3. feature pooling.

A state-of-the-art pipeline for scene and object recognition [Lazebnik et al., 2006, Yang et al., 2009, Boureau et al., 2010].

SLIDE 16

Methodology

Can we exploit some ideas from this line of thinking to

mimic invariance properties of V4 neurons?
obtain a model tuned to natural images?
be as biologically compatible as possible?

SLIDE 17

Our pipeline

SLIDE 18

First Layer

SLIDE 19

Patch Extraction Across Orientation Maps

The 3D-patches

are of size 4 × 4 × 8 = 128 (8 orientations);
correspond in the original image domain to 32 × 32 patches;
are invariant to local deformations in the original image domain.
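The patch extraction step can be sketched as below. This is a rough illustration only: it assumes the orientation maps are already computed and subsampled into an array of shape (height, width, 8), and that patches are taken at every position with stride 1. Under the further assumption of a 32 × 32 map, this gives 29 × 29 = 841 patches, which agrees with the 29 ∗ 29 count on the pooling slide, but the map size itself is not stated in the slides. The function name is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def extract_3d_patches(orientation_maps, size=4):
    """Return every size x size x n_orient patch, flattened to one row each."""
    maps = np.asarray(orientation_maps, dtype=float)
    h, w, n_orient = maps.shape
    # windows has shape (h - size + 1, w - size + 1, 1, size, size, n_orient).
    windows = sliding_window_view(maps, (size, size, n_orient))
    return windows.reshape(-1, size * size * n_orient)

# Example: a random stand-in for a 32 x 32 x 8 stack of orientation maps.
maps = np.random.default_rng(0).random((32, 32, 8))
patches = extract_3d_patches(maps)  # one 128-dimensional row per position
```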

SLIDE 20

Patch Extraction Across Orientation Maps

What is the effect of the first processing layer?

Comparing patches on orientation maps and in the image domain leads to a different similarity measure.

Correlation values from the figure: 1.0, 0.92, 0.92, 0.91, 0.86, 0.76, 1.0, 0.65, 0.76, 0.97, 0.95, 0.93.

In blue, correlation in the original image domain. In red, correlation in the new domain.

SLIDE 21

Contrast Normalization

Given a 3D-patch $x$ in $\mathbb{R}^{128}_{+}$, we apply $x \leftarrow x / \max(\|x\|_2, c)$.

In short, our 3D-patches

have similar properties to dense SIFT descriptors [Lowe, 2004] but are based on simple filtering/subsampling and normalization steps.
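The normalization rule translates almost directly into code. A minimal sketch follows; the value of the floor c is not given in the slides, so the default below is an arbitrary placeholder, and the function name is hypothetical.

```python
import numpy as np

def contrast_normalize(patches, c=0.1):
    """Apply x <- x / max(||x||_2, c) to each row of `patches`.

    c is an assumed placeholder value; it keeps low-contrast patches
    from being amplified to unit norm.
    """
    patches = np.asarray(patches, dtype=float)
    norms = np.linalg.norm(patches, axis=-1, keepdims=True)
    return patches / np.maximum(norms, c)
```

Patches with norm above c end up with unit ℓ2-norm; patches below it are only rescaled by 1/c.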

SLIDE 22

Second Layer

SLIDE 23

Sparse Coding

1. learn a dictionary on a database of 3D-patches once and for all;
2. encode all 3D-patches of an image to obtain feature maps.

Dictionary Learning Formulation

\[
\min_{A \in \mathbb{R}^{p \times n},\, D \in \mathcal{D}} \; \sum_{i=1}^{n} \underbrace{\frac{1}{2} \| x_i - D\alpha_i \|_2^2}_{\text{data fitting}} + \lambda \underbrace{\| \alpha_i \|_1}_{\text{sparsity}},
\]

We also force the codes α and the dictionary D to be non-negative.
The dictionary is fairly large (p = 2048).
The original formulation of Olshausen and Field [1997] was in the image domain.
It was successfully used on image descriptors in computer vision [Yang et al., 2009, Boureau et al., 2010].
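The encoding step (step 2) for a fixed dictionary can be sketched with a projected proximal-gradient (ISTA-style) iteration. This is one standard solver for the non-negative ℓ1 problem, not necessarily the one used by the authors; the dictionary update of step 1 is not shown, and the function name, λ value and iteration count are illustrative assumptions.

```python
import numpy as np

def nonneg_sparse_code(x, D, lam=0.1, n_iter=200):
    """Approximately solve min_{a >= 0} 0.5 * ||x - D a||_2^2 + lam * ||a||_1.

    For a >= 0 the l1 penalty equals lam * sum(a), so the proximal step is a
    shift by lam / L followed by projection onto the non-negative orthant.
    """
    x = np.asarray(x, dtype=float)
    D = np.asarray(D, dtype=float)
    L = np.linalg.norm(D, 2) ** 2  # Lipschitz constant of the smooth part
    alpha = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ alpha - x)
        alpha = np.maximum(0.0, alpha - (grad + lam) / L)
    return alpha
```

With an identity dictionary the solution reduces to max(0, x − λ), which is a handy sanity check.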

SLIDE 24

Third Layer

SLIDE 25

Feature Pooling

\[
A = \begin{bmatrix} \alpha_1 & \alpha_2 & \cdots & \alpha_{29 \times 29} \end{bmatrix}
\ \text{(sparse codes of an image)}
\quad \Longrightarrow \quad
\begin{bmatrix} \beta_1 \\ \vdots \\ \beta_p \end{bmatrix}
\ \text{(single vector)}
\]

The pooling operation is the ℓ2-norm of features

\[
\beta_k = \sqrt{\sum_{i=1}^{29 \times 29} \left( \alpha_i^k \right)^2}
\]

for one pooling region and k = 1, ..., 2048. Another alternative is the max-pooling operation [Riesenhuber et al., 1999, Cadieu et al., 2007], often used in computer vision [Lazebnik et al., 2006, Yang et al., 2009, Boureau et al., 2010].
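Both pooling operations are one-liners over the spatial axis. The sketch below assumes the sparse codes of one pooling region are stored as an array with one row per spatial position and one column per dictionary element; the function names are hypothetical.

```python
import numpy as np

def l2_pool(codes):
    """l2 pooling: beta_k = sqrt(sum_i codes[i, k] ** 2)."""
    codes = np.asarray(codes, dtype=float)
    return np.sqrt((codes ** 2).sum(axis=0))

def max_pool(codes):
    """Max pooling over positions, the alternative mentioned above."""
    return np.asarray(codes, dtype=float).max(axis=0)
```

Because both functions reduce over positions, reordering the rows (for example, shifting a feature inside the pooling region) leaves the pooled vector unchanged.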

SLIDE 26

Summary

SLIDE 27

Our Feature Extraction Process in a Nutshell

It consists of simple local operations and uses the sparse coding principle. In particular, it

has some invariance to small image deformations (first layer);
has some selectivity to features learned from natural image statistics (second layer);
is shift invariant within the receptive field (third layer).

SLIDE 28

Part II: Prediction Model based on Extracted Features

SLIDE 29

Prediction Pipeline

Input Image Sequence Neuron Responses Nonlinear Encoding Linear Model

SLIDE 30

Temporal Aspect of Data

Typical Time Response to an Excitatory Stimulus

SLIDE 31

Prediction with Low-Rank Regularization

Input: image sequence and neuron responses $y_1, y_2, \ldots, y_T$;
Preprocessing: center and normalize the neuron responses;
Image feature vector: (2048*5)-dimensional vector $\beta_t$ at time t = 1, ..., T;
Model: $y_t \approx \sum_{j=1}^{\tau} \beta_{t-j}^\top w_j$ with a lag τ = 9 starting at time t + 1;
Constraint on the weights $w_j$: $W = [w_1, w_2, \ldots, w_\tau] \in \mathbb{R}^{p \times \tau}$ (the columns span the time window) should be low-rank.

Formulation:

\[
\min_{W \in \mathbb{R}^{p \times \tau}} \; \sum_{t=1}^{T} \frac{1}{2} \Big( y_t - \sum_{j=1}^{\tau} \beta_{t-j}^\top w_j \Big)^2 + \gamma \|W\|_* .
\]

$\|\cdot\|_*$ is the trace norm (the ℓ1-norm of the singular values, i.e. their sum) [see Fazel et al., 2001].
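One common way to solve a trace-norm regularized least-squares problem of this form is proximal gradient descent with singular value thresholding. The sketch below illustrates that technique under stated assumptions: it is not necessarily the authors' solver, the first τ responses are dropped because their lagged features are unavailable, and all function names, the step-size rule and the iteration count are hypothetical choices.

```python
import numpy as np

def lagged_design(B, tau):
    """B: (T, p) features. Returns X with X[n, :, j - 1] = B[t - j], t = tau + n."""
    T, _ = B.shape
    return np.stack([B[tau - j: T - j] for j in range(1, tau + 1)], axis=2)

def svt(W, thresh):
    """Singular value thresholding: proximal operator of thresh * ||W||_*."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U * np.maximum(s - thresh, 0.0)) @ Vt

def fit_low_rank(B, y, tau=9, gamma=1.0, n_iter=300):
    """Proximal gradient on 0.5 * sum_t (y_t - <X_t, W>)^2 + gamma * ||W||_*."""
    X = lagged_design(np.asarray(B, dtype=float), tau)  # (N, p, tau)
    y = np.asarray(y, dtype=float)[tau:]                # align with t >= tau
    N, p, _ = X.shape
    Xf = X.reshape(N, p * tau)
    L = np.linalg.norm(Xf, 2) ** 2                      # Lipschitz constant
    W = np.zeros((p, tau))
    for _ in range(n_iter):
        resid = Xf @ W.ravel() - y
        grad = (Xf.T @ resid).reshape(p, tau)
        W = svt(W - grad / L, gamma / L)
    return W
```

A large γ drives W to zero, while a small γ approaches the unregularized least-squares fit; in practice γ would be chosen by cross-validation.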

SLIDE 32

Prediction Performance on Validation Data

The baseline is a well-engineered non-linear Gabor model [see, e.g., Nishimoto and Gallant, 2011] (state of the art for V1/V2 prediction).

SLIDE 33

Prediction Performance on Validation Data

SLIDE 34

Prediction Performance on Validation Data

Example for a neuron with ρ = 0.67 on validation data

SLIDE 35

Prediction Performance:

first model to achieve prediction performance on natural images similar to that achieved for V1 and V2 cells;
significantly outperforms the Gabor model;
5-fold cross-validation leads to semi time-separable models.

SLIDE 36

Part III: Model Interpretation

SLIDE 37

Visualization of Dictionary Elements

Difficulty: 3D-patches are not in the image domain

We build a database of one million correspondence pairs of (32 × 32 image patches, 4 × 4 × 8 3D-patches) and find best matches.
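The best-match lookup can be sketched as a nearest-neighbor search over the correspondence database. The slides do not say which similarity measure is used; cosine similarity is assumed here purely for illustration, and the function and argument names are hypothetical.

```python
import numpy as np

def best_matches(element, patches_3d, image_patches, k=5):
    """Return the k image patches whose paired 3D-patches best match `element`.

    patches_3d: (N, 128) array; image_patches: (N, 32, 32) array, same order.
    Similarity is cosine similarity, an assumed choice.
    """
    element = np.asarray(element, dtype=float)
    P = np.asarray(patches_3d, dtype=float)
    denom = np.linalg.norm(P, axis=1) * np.linalg.norm(element) + 1e-12
    sims = (P @ element) / denom
    top = np.argsort(-sims)[:k]
    return np.asarray(image_patches)[top], sims[top]
```

Averaging or tiling the returned image patches then gives a picture of what a dictionary element responds to.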

(A) Complex Feature Visualization


SLIDE 38

Categorization of Dictionary Elements 1/2

(B) Feature Classification by Type

Others …

Straight line; Curve inward; Curve outward; Multiple curves; White bar; Black bar; Stripes; White corner; Black corner; Acute white corner; Acute black corner; Smooth transition

SLIDE 39

Categorization of Dictionary Elements 2/2

Others …

White blob; Black blob; Double white blobs; Double black blobs; Complex blobs; Crosses and junctions

We also categorize each dictionary element by

orientation;
scale;
texture affinity.

SLIDE 40

Visualization of Feature Channels

Panels: curves/edges, bars, corners, blobs, image

SLIDE 41

Model Analysis

Methodology for Population Analysis

1. find the peak excitation lag;
2. look at the center pooling region;
3. compute 19 impact values: 8 feature types, 3 scales, 3 texture affinities, 5 orientations;
4. perform a sparse principal component analysis [Zou et al., 2006].

Methodology for Individual Neuron Analysis

retrieve the most excitatory and inhibitory images;
compute impact values (possibly with refined categories).

SLIDE 42

Individual Neuron Analysis

neuron 64, ρ = 0.64

SLIDE 43

Individual Neuron Analysis

neuron 40, ρ = 0.76

SLIDE 44

Individual Neuron Analysis

neuron 71, ρ = 0.75

SLIDE 45

Individual Neuron Analysis

neuron 22, ρ = 0.64

SLIDE 46

Individual Neuron Analysis

neuron 24, ρ = 0.63

SLIDE 47

Individual Neuron Analysis

neuron 31, ρ = 0.66

SLIDE 48

Individual Neuron Analysis

neuron 59, ρ = 0.67

SLIDE 49

Population Analysis via Sparse PCA

SLIDE 50

Population Analysis via Sparse PCA

SLIDE 51

Summary

Main conclusions

first quantitative model with natural images that meets the benchmark performance;
the most striking observation is the role of contours vs texture discrimination;
V4 neurons are selective to a large diversity of feature types such as bars, edges, corners...
some V4 neurons are selective to orientation, some are not.

SLIDE 52

Current Directions

include time and color to deal with fully naturalistic conditions;
use the new V4 model for movie reconstruction based on fMRI data;
theoretical and simulation studies of deep learning methods;
sparse coding: a key step in analyzing fruit fly TF images.

SLIDE 53

References I

Y-L. Boureau, F. Bach, Y. LeCun, and J. Ponce. Learning mid-level features for recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010.

C. Cadieu, M. Kouh, A. Pasupathy, C.E. Connor, M. Riesenhuber, and T. Poggio. A model of V4 shape selectivity and invariance. Journal of Neurophysiology, 98(3):1733–1750, 2007.

M. Fazel, H. Hindi, and S.P. Boyd. A rank minimization heuristic with application to minimum order system approximation. In Proceedings of the 2001 American Control Conference, volume 6, pages 4734–4739, 2001.

K.A. Hansen, K.N. Kay, and J.L. Gallant. Topographic organization in and near human visual area V4. The Journal of Neuroscience, 27(44):11896–11911, 2007.

SLIDE 54

References II

S. Lazebnik, C. Schmid, and J. Ponce. Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2006.

D.G. Lowe. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2):91–110, 2004.

S. Nishimoto and J.L. Gallant. A three-dimensional spatiotemporal receptive field model explains responses of area MT neurons to naturalistic movies. The Journal of Neuroscience, 31(41):14551–14564, 2011.

B.A. Olshausen and D.J. Field. Sparse coding with an overcomplete basis set: A strategy employed by V1? Vision Research, 37:3311–3325, 1997.

B.A. Olshausen and D.J. Field. How close are we to understanding V1? Neural Computation, 17(8):1665–1699, 2005.

SLIDE 55

References III

M. Riesenhuber, T. Poggio, et al. Hierarchical models of object recognition in cortex. Nature Neuroscience, 2:1019–1025, 1999.

A.W. Roe, L. Chelazzi, C.E. Connor, B.R. Conway, I. Fujita, J.L. Gallant, H. Lu, and W. Vanduffel. Toward a unified theory of visual area V4. Neuron, 74(1):12–29, 2012.

J. Yang, K. Yu, Y. Gong, and T. Huang. Linear spatial pyramid matching using sparse coding for image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009.

H. Zou, T. Hastie, and R. Tibshirani. Sparse principal component analysis. Journal of Computational and Graphical Statistics, 15(2):265–286, 2006.

SLIDE 56

SIFT Representation

[Lowe, 2004]

1. compute pixel orientations: $(\rho = \|\nabla I\|,\ \theta = \arctan(\nabla_y I / \nabla_x I))$.
2. binning + histograms.
3. normalization.

standard parameters: 8 orientations, 16 × 16 patches, 4 × 4 bins.
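The three steps can be illustrated with a simplified descriptor. This sketch uses hard binning only and omits the Gaussian weighting, interpolation and clipping of the full method in Lowe [2004]; it also uses `arctan2` rather than `arctan` so that orientations cover the full circle. The function name is hypothetical.

```python
import numpy as np

def sift_like_descriptor(patch, n_orient=8, n_bins=4):
    """Simplified SIFT-style descriptor of a square patch (e.g. 16 x 16)."""
    patch = np.asarray(patch, dtype=float)
    # Step 1: gradient magnitude and orientation at each pixel.
    gy, gx = np.gradient(patch)  # derivatives along rows (y) and columns (x)
    rho = np.hypot(gx, gy)
    theta = np.arctan2(gy, gx) % (2 * np.pi)
    o_idx = np.floor(theta / (2 * np.pi) * n_orient).astype(int) % n_orient
    # Step 2: magnitude-weighted orientation histograms per spatial cell.
    cell = patch.shape[0] // n_bins
    desc = np.zeros((n_bins, n_bins, n_orient))
    for i in range(patch.shape[0]):
        for j in range(patch.shape[1]):
            ci = min(i // cell, n_bins - 1)
            cj = min(j // cell, n_bins - 1)
            desc[ci, cj, o_idx[i, j]] += rho[i, j]
    # Step 3: l2 normalization of the flattened descriptor.
    desc = desc.ravel()
    norm = np.linalg.norm(desc)
    return desc / norm if norm > 0 else desc
```

With the standard parameters above, the output has 4 × 4 × 8 = 128 entries, the same dimension as the 3D-patches of the first layer.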
