SLIDE 1
A. Hyvärinen and P. O. Hoyer: A Two-Layer Sparse Coding Model Learns Simple and Complex Cell Receptive Fields and Topography from Natural Images

Presented by Hsin-Hao Yu, Department of Cognitive Science, November 7, 2001

SLIDE 2

An overview of the visual pathway

SLIDE 3

Basic V1 physiology

Simple cells
  • approximately linear filters
  • localized, oriented, band-pass
  • phase sensitive

Complex cells
  • non-linear
  • phase insensitive

Question: Why do we have these neurons?

SLIDE 4

The principle of redundancy reduction

The principle of redundancy reduction: The world is highly structured. The purpose of early sensory processing is to transform the redundant sensory input into an efficient code. [Barlow 1961]

Two approaches have been developed to apply this idea to study the visual cortex:

  • 1. Sparse coding (e.g. Olshausen and Field)
  • 2. Independent Component Analysis (e.g. Bell and Sejnowski)

SLIDE 5

Compact coding vs. Sparse coding

What does an efficient code mean?

Strategy 1: Compact coding represents data with the minimum number of units.

This requirement often produces solutions similar to Principal Component Analysis, but the principal components do not resemble any receptive field structures found in the visual cortex.

SLIDE 6

Principal components of natural images

Not localized, and no orientational selectivity.
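As a concrete illustration (not part of the original slides), here is a minimal Python sketch of how such principal components can be computed; the images argument is a hypothetical list of grayscale natural-image arrays. Displaying the leading components as 16x16 patches gives global, Fourier-like patterns rather than localized, oriented receptive fields.

    import numpy as np

    def patch_principal_components(images, patch=16, n_patches=20000, seed=0):
        # images: hypothetical list of 2-D grayscale natural-image arrays
        rng = np.random.default_rng(seed)
        X = []
        for _ in range(n_patches):
            img = images[rng.integers(len(images))]
            r = rng.integers(img.shape[0] - patch + 1)
            c = rng.integers(img.shape[1] - patch + 1)
            X.append(img[r:r + patch, c:c + patch].ravel())
        X = np.asarray(X, dtype=float)
        X -= X.mean(axis=0)
        # principal components = eigenvectors of the patch covariance matrix
        cov = X.T @ X / len(X)
        eigvals, eigvecs = np.linalg.eigh(cov)
        order = np.argsort(eigvals)[::-1]          # sort by decreasing variance
        return eigvecs[:, order].T.reshape(-1, patch, patch)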

SLIDE 7

Compact coding vs. Sparse coding

Strategy 2: Sparse coding represents data with the minimum number of active units, but the dimensionality of the representation is the same as (or even larger than) the dimensionality of the input data.

SLIDE 8

Learning sparse codes: image model

We use the linear generative model. That is,

    I(x, y) = Σ_i a_i φ_i(x, y)

where I(x, y) is a patch of natural image, and {a_i} are the coefficients of the basis functions {φ_i(x, y)}.

A neural network interpretation: writing images as column vectors, I = ΦA, where the columns of Φ are the basis functions φ_i and A = (a_1, ..., a_n)^T. Thus A = WI, where W = Φ^(−1). A is the output layer of a linear network, and W is the weight matrix (i.e. the filters).
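A toy numerical illustration of this interpretation (the random, invertible basis below is purely for the example, not a learned set of basis functions): the image is generated as I = ΦA, and the coefficients are recovered by the linear network A = WI with W = Φ^(−1).

    import numpy as np

    rng = np.random.default_rng(0)
    d = 64                                 # an 8x8 patch, flattened to a 64-dim column vector
    Phi = rng.standard_normal((d, d))      # columns are toy basis functions phi_i
    A = rng.standard_normal(d)             # coefficients a_1 ... a_n
    I = Phi @ A                            # generative model: I = Phi A

    W = np.linalg.inv(Phi)                 # filter matrix W = Phi^(-1)
    A_rec = W @ I                          # network output: A = W I
    assert np.allclose(A, A_rec)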

SLIDE 9

Learning sparse codes: algorithm

[Olshausen and Field, 1996] For the image model

    I(x, y) = Σ_i a_i φ_i(x, y)

we require that the distributions of the coefficients a_i are "sparse". This can be achieved by minimizing the following cost function:

    E = −[fidelity] − λ [sparseness]
    fidelity = − Σ_{x,y} [ I(x, y) − Σ_i a_i φ_i(x, y) ]²
    sparseness = − Σ_i S(a_i)
    S(x) = log(1 + x²)
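A minimal sketch of this cost function and of inferring the coefficients by gradient descent, assuming the vectorized notation I = Φa from the previous slide (learning the basis Φ itself is omitted here):

    import numpy as np

    def energy(I, Phi, a, lam=0.1):
        # E = -[fidelity] - lambda [sparseness]
        #   = sum_xy [I - sum_i a_i phi_i]^2 + lambda sum_i log(1 + a_i^2)
        residual = I - Phi @ a
        return np.sum(residual ** 2) + lam * np.sum(np.log1p(a ** 2))

    def infer_coefficients(I, Phi, lam=0.1, lr=0.01, n_iter=500):
        # gradient descent on E with respect to a, holding the basis Phi fixed
        a = np.zeros(Phi.shape[1])
        for _ in range(n_iter):
            residual = I - Phi @ a
            grad = -2.0 * Phi.T @ residual + lam * 2.0 * a / (1.0 + a ** 2)
            a -= lr * grad
        return a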

SLIDE 10

Maximum-likelihood and sparse codes

The sparse-coding algorithm can be interpreted as finding φ that maximizes the average log-likelihood of the images under a sparse, independent prior.

fidelity: the negative log-likelihood of the image given φ and a, assuming Gaussian noise,

    P(I | a, φ) = (1 / Z_σN) exp( −|I − Φa|² / (2 σ_N²) )

sparseness: a sparse, independent prior for a,

    P(a) = Π_i exp( −β S(a_i) )

So E ∝ −log( P(I | a, φ) P(a) ): taking negative logs gives |I − Φa|² / (2 σ_N²) + β Σ_i S(a_i) + const, which matches E up to the scaling absorbed into λ. It can be shown that, under some approximating assumptions, minimizing E is equivalent to maximizing P(I | φ).

SLIDE 11

Supergaussian distributions

Cauchy distribution:   S(a_i) = log(1 + a_i²),   P(a_i) = 1 / (1 + a_i²)
Laplace distribution:  S(a_i) = |a_i|,           P(a_i) = exp(−|a_i|)
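Both priors are supergaussian: heavier tails and a sharper peak than a Gaussian of the same variance. A rough numerical check of this (not from the slides), using the Laplace case and excess kurtosis as the measure:

    import numpy as np

    def excess_kurtosis(x):
        x = x - x.mean()
        return np.mean(x ** 4) / np.mean(x ** 2) ** 2 - 3.0

    rng = np.random.default_rng(0)
    gauss = rng.standard_normal(100_000)
    laplace = rng.laplace(size=100_000)
    print(excess_kurtosis(gauss))    # ~0 for a Gaussian
    print(excess_kurtosis(laplace))  # ~3, i.e. supergaussian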

SLIDE 12

Independent Component Analysis

In the context of natural image analysis:

    I(x, y) = Σ_i a_i φ_i(x, y)

where the number of a_i equals the dimensionality of I. We require that the {a_i}, as random variables, are independent of each other. That is, P(a_i | a_j) = P(a_i).

In a more general context, let I be a random vector. The goal of Independent Component Analysis is to find a matrix W such that the components of A = WI are non-gaussian and independent of each other.

SLIDE 13

The Infomax ICA

[Bell and Sejnowski 1995] derived a learning rule for ICA by maximizing the output entropy of a neural network with logistic (or Laplace) neurons. Similar or equivalent algorithms can be derived from many other frameworks.

Let H(X) be the entropy of X. The joint entropy of a_1 and a_2 can be written as

    H(a_1, a_2) = H(a_1) + H(a_2) − I(a_1, a_2)

where I(a_1, a_2) is the mutual information between a_1 and a_2. The variables a_1 and a_2 are independent of each other when I(a_1, a_2) = 0. We approximate the solution by maximizing H(a_1, a_2).
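For concreteness, a compact sketch of the resulting update rule in its standard natural-gradient form with logistic neurons (written from the usual Bell-Sejnowski/Amari formulation rather than copied from the slides; the data X is assumed to be centered):

    import numpy as np

    def infomax_ica(X, lr=0.01, n_iter=200, batch=256, seed=0):
        # X: (n_channels, n_samples) centered data; returns the unmixing matrix W
        rng = np.random.default_rng(seed)
        n = X.shape[0]
        W = np.eye(n)
        for _ in range(n_iter):
            idx = rng.choice(X.shape[1], size=batch, replace=False)
            U = W @ X[:, idx]                    # source estimates u = W x
            Y = 1.0 / (1.0 + np.exp(-U))         # logistic neurons
            # natural-gradient Infomax update: dW = (I + (1 - 2y) u^T) W
            dW = (np.eye(n) + (1.0 - 2.0 * Y) @ U.T / batch) @ W
            W += lr * dW
        return W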

SLIDE 14

Independent components of natural images

Olshausen and Field 1996: 16x16 basis patches. Bell and Sejnowski 1996: 12x12 filters.

SLIDE 15

More ICA applications

  • 1. Direction selectivity [van Hateren et al., 1998]
  • 2. Flow-field templates [Park and Jabri, 2000]
  • 3. Color [Hoyer, 2000; Tailor, 2000; Lee, 2001]
  • 4. Binocular vision [Hoyer, 2000]
  • 5. Audition [Bell and Sejnowski 1996; Lewicki??]

SLIDE 16

Complex cells and topography

[Hyvärinen and Hoyer, 2001] use a hierarchical network and the sparse coding principle to explain the emergence of complex-cell-like receptive fields and the topographic structure of simple cells.
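In rough outline, the model's objective has the following structure (a sketch of the two-layer idea, not a verbatim reproduction of the paper's code; the pooling matrix H and the square-root nonlinearity are assumptions of this sketch): the first layer computes linear simple-cell responses on whitened image patches with the filters constrained to be orthogonal, a fixed second layer pools the squared responses of topographically neighboring units, and sparseness is imposed on the pooled energies.

    import numpy as np

    def two_layer_objective(W, Z, H, eps=1e-3):
        # W: (n_units, dim) orthogonal simple-cell filters (first layer, learned)
        # Z: (dim, n_patches) whitened image patches
        # H: (n_units, n_units) fixed topographic neighborhood / pooling matrix (second layer)
        S = W @ Z                       # simple-cell responses
        E = H @ (S ** 2)                # pooled "complex cell" energies
        # sparseness of the pooled energies; objective is maximized under W W^T = I
        return np.sum(-np.sqrt(E + eps))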

SLIDE 17

from [Hübener et al. 1997]

SLIDE 18

The “ice-cube” model of V1 layer 4c

SLIDE 19

Network architecture

SLIDE 20

SLIDE 21

Results: summary

Simple cell physiology
  • orientation/freq selective
  • phase/position sensitive

Simple cell topography
  • orientation continuity, but not phase continuity
  • orientation singularities, or "pinwheels"
  • "blobs": grouping of low-freq cells

Complex cell physiology
  • orientation/freq selective
  • phase/position insensitive
