

SLIDE 1

Overcomplete models & Lateral interactions and Feedback

Teppo Niinimäki
April 22, 2010

SLIDE 2

Contents

1. Overcomplete models
   • Overcomplete basis
   • Energy based models
2. Lateral interaction and feedback
   • Feedback and Bayesian inference
   • End-stopping
   • Predictive coding

SLIDE 4

Motivation

So far:
  • Sparse coding models: feature detector weights orthogonal
  • Generative models: A invertible ⇒ square matrix
⇒ no. of features ≤ no. of dimensions in data ≤ no. of pixels

Why more features?
  • Processing is location independent ⇒ same set of features for every location
  • no. of simple cells in V1 ≫ no. of retinal ganglion cells (≈ 25 times)

SLIDE 5

Overcomplete basis: Generative model

Generative model:

$$I(x,y) = \sum_{i=1}^{m} A_i(x,y)\, s_i$$

  • basis vectors: A_i
  • features: s_i
  • no. of features: m > |I| (i.e. m exceeds the dimension of the data)

SLIDE 6

Overcomplete basis: Generative model

Generative model with noise:

$$I(x,y) = \sum_{i=1}^{m} A_i(x,y)\, s_i + N(x,y)$$

  • basis vectors: A_i
  • features: s_i
  • no. of features: m > |I| (i.e. m exceeds the dimension of the data)
  • Gaussian noise: N(x,y) ⇒ simplifies computations
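A minimal sketch of this generative model in NumPy. The dimensions, the random basis A, the Laplacian feature prior, and the noise level are all illustrative assumptions, not values from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

n = 64        # data dimension (an assumed 8 x 8 patch, flattened)
m = 256       # number of features: m > n makes the basis overcomplete
sigma = 0.1   # assumed standard deviation of the Gaussian noise N(x,y)

# Columns of A play the role of the basis vectors A_i (here just random).
A = rng.standard_normal((n, m))
A /= np.linalg.norm(A, axis=0)     # normalize each basis vector

# Sparse features s_i: a Laplacian is one common choice of sparse prior.
s = rng.laplace(scale=1.0, size=m)

# I = sum_i A_i s_i + N, written as a matrix-vector product.
I = A @ s + sigma * rng.standard_normal(n)
```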

SLIDE 7

Overcomplete basis: Computation of features

$$I(x,y) = \sum_{i=1}^{m} A_i(x,y)\, s_i + N(x,y)$$

How to compute the coefficients s_i for a given I?
  • A is not invertible: more unknowns than equations
  ⇒ many (infinitely many) different solutions

Find the sparsest solution (most s_i close to 0):
  • assume a sparse distribution for the s_i
  • find the most probable values of the s_i

SLIDE 8

Overcomplete basis: Computation of features

Aim: find the s which maximizes p(s|I). By Bayes' rule,

$$p(s \mid I) = \frac{p(I \mid s)\, p(s)}{p(I)}.$$

Ignore the constant p(I) and maximize the logarithm instead:

$$\log p(s \mid I) = \log p(I \mid s) + \log p(s) + \text{const.}$$

For the prior distribution p(s), assume sparsity and independence:

$$\log p(s) = \sum_{i=1}^{m} G(s_i).$$

SLIDE 9

Overcomplete basis: Computation of features

$$I(x,y) = \sum_{i=1}^{m} A_i(x,y)\, s_i + N(x,y), \qquad \log p(s \mid I) = \log p(I \mid s) + \log p(s) + \text{const.}$$

Next compute log p(I|s). The probability of I(x,y) given s is the Gaussian pdf of

$$N(x,y) = I(x,y) - \sum_{i=1}^{m} A_i(x,y)\, s_i.$$

Inserting this into

$$p(N(x,y)) = \frac{1}{\sqrt{2\pi}} \exp\!\left( -\frac{1}{2\sigma^2} N(x,y)^2 \right)$$

gives

$$\log p(I(x,y) \mid s) = -\frac{1}{2\sigma^2} \left( I(x,y) - \sum_{i=1}^{m} A_i(x,y)\, s_i \right)^{\!2} - \frac{1}{2}\log 2\pi.$$

SLIDE 10

Overcomplete basis: Computation of features

Because the noise is independent across pixels, we can sum over x, y to get the log-pdf of the whole image:

$$\log p(I \mid s) = -\frac{1}{2\sigma^2} \sum_{x,y} \left( I(x,y) - \sum_{i=1}^{m} A_i(x,y)\, s_i \right)^{\!2} - \frac{n}{2}\log 2\pi.$$

Combining the above: find the s that maximizes

$$\log p(s \mid I) = -\frac{1}{2\sigma^2} \sum_{x,y} \left( I(x,y) - \sum_{i=1}^{m} A_i(x,y)\, s_i \right)^{\!2} + \sum_{i=1}^{m} G(s_i) + \text{const.}$$

⇒ numerical optimization ⇒ the cell activities s_i are non-linear functions of the image
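As a rough illustration of that optimization, the sketch below finds the MAP features with an off-the-shelf optimizer, taking G(u) = -log cosh(u) as an assumed smooth sparse log-prior (the slides leave G unspecified):

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_posterior(s, I, A, sigma):
    """-log p(s|I) up to a constant: reconstruction error plus sparsity penalty."""
    residual = I - A @ s
    # With G(u) = -log cosh(u), the term -sum_i G(s_i) becomes +sum_i log cosh(s_i).
    return residual @ residual / (2 * sigma**2) + np.sum(np.log(np.cosh(s)))

def map_features(I, A, sigma=0.1):
    """Numerically maximize log p(s|I); the result is non-linear in I."""
    s0 = np.zeros(A.shape[1])
    res = minimize(neg_log_posterior, s0, args=(I, A, sigma), method="L-BFGS-B")
    return res.x
```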

How about learning the A_i?

SLIDE 11

Overcomplete basis: Basis estimation

Assume a flat prior for the A_i
⇒ the p(s|I) above is actually p(s, A|I).

Maximize the probability (likelihood) of the A_i over independent image samples I_1, I_2, ..., I_T:

$$\sum_{t=1}^{T} \log p(s(t), A \mid I_t) = -\frac{1}{2\sigma^2} \sum_{t=1}^{T} \sum_{x,y} \left( I_t(x,y) - \sum_{i=1}^{m} A_i(x,y)\, s_i(t) \right)^{\!2} + \sum_{t=1}^{T} \sum_{i=1}^{m} G(s_i(t)) + \text{const.}$$

At the same time we obtain both the basis vectors A_i and the cell outputs s_i(t).
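In practice this joint objective is typically maximized by alternating between the two sets of unknowns. A hedged sketch (the learning rate, iteration count, and normalization step are illustrative choices; `map_features` is the sketch from the previous slide):

```python
import numpy as np

def learn_basis(images, m, sigma=0.1, n_iter=100, lr=0.01):
    """Alternate: MAP-estimate s(t) for every image, then a gradient step on A."""
    n, T = images.shape                  # one image sample per column
    rng = np.random.default_rng(0)
    A = rng.standard_normal((n, m))
    A /= np.linalg.norm(A, axis=0)
    S = np.zeros((m, T))
    for _ in range(n_iter):
        for t in range(T):
            S[:, t] = map_features(images[:, t], A, sigma)
        # Gradient of the squared-error term of the log-likelihood w.r.t. A.
        A += lr * (images - A @ S) @ S.T / sigma**2
        A /= np.linalg.norm(A, axis=0)   # keep the basis vectors normalized
    return A, S
```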

SLIDE 12

Energy based models

Another approach:
  • no generative model
  • instead, relax ICA to allow more linear feature detectors W_i
⇒ not a basis, but an overcomplete representation

In ICA we maximized

$$\log L(v_1,\ldots,v_m;\, z_1,\ldots,z_T) = T \log|\det(V)| + \sum_{i=1}^{m} \sum_{t=1}^{T} G_i(v_i^T z_t).$$

Recall z_t ∼ I_t, v_i ∼ W_i, m = n and G_i(u) = log p_i(u). If m > n, then V is not square and log|det(V)| is not defined.

SLIDE 13

Energy based models: estimation

Actually, log|det(V)| plays the role of a normalization constant. Replace it and maximize instead

$$\log L(v_1,\ldots,v_m;\, z_1,\ldots,z_T) = -T \log Z(V) + \sum_{i=1}^{m} \sum_{t=1}^{T} G_i(v_i^T z_t),$$

where

$$Z(V) = \int \exp\!\left( \sum_{i=1}^{m} G_i(v_i^T z) \right) dz.$$

The integral above is extremely difficult to evaluate. However, it can be estimated, or the model can be estimated directly without it: score matching and contrastive divergence.
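To make the role of Z(V) concrete, here is a small sketch of the part of the model one can actually evaluate, again with the assumed choice G_i(u) = -log cosh(u):

```python
import numpy as np

def unnormalized_log_density(z, V):
    """sum_i G_i(v_i^T z), i.e. log p(z; V) + log Z(V).

    Rows of V are the m > n overcomplete feature detectors v_i.
    """
    u = V @ z                              # detector outputs v_i^T z
    return -np.sum(np.log(np.cosh(u)))

# Score matching and contrastive divergence fit V from this quantity (and its
# derivatives) alone, never evaluating the intractable integral Z(V).
```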

SLIDE 14

Energy based models: results

Estimated overcomplete representation with an energy-based model:
  • G_i(u) = α_i log cosh(u)
  • estimated by score matching
  • patches of 16 × 16 = 256 pixels
  • preprocessing ⇒ n = 128
  • m = 512 receptive fields

(Fig 13.1: Random sample of the W_i.)

SLIDE 16

Motivation

So far we have considered "bottom-up" (feedforward) frameworks. In reality there are also
  • "top-down" connections ⇒ feedback
  • lateral (horizontal) interactions

How to model them too?
⇒ using Bayesian inference!

SLIDE 17

Feedback as Bayesian inference: contour integrator

Why feedback connections?
  • to enhance responses consistent with the broader visual context
  • to reduce noise (activity inconsistent with the model)
⇒ combine bottom-up sensory information with top-down priors

Example: contour cells and complex cells. Define the generative model

$$c_k = \sum_{i=1}^{m} a_{ki}\, s_i + n_k,$$

where n_k is Gaussian noise.

SLIDE 18

Feedback as Bayesian inference: contour integrator

$$c_k = \sum_{i=1}^{m} a_{ki}\, s_i + n_k$$

Now we model just the feedback! First calculate s for a given image:

1. compute c normally (feedforward)
2. find the s = ŝ that maximizes log p(s|c)
⇒ ŝ should be non-linear in c (why?)

Then reconstruct the complex cell outputs using the linear generative model, but ignoring the noise:

$$\hat{c}_k = \sum_{i=1}^{m} a_{ki}\, \hat{s}_i$$

(for instance by sending the feedback signal $u_k = \left( \sum_{i=1}^{m} a_{ki}\, \hat{s}_i \right) - c_k$)
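A sketch of one such bottom-up / top-down cycle; `estimate_s` stands for any routine that returns the maximizer of log p(s|c) (for instance a variant of the `map_features` sketch above), and all names are illustrative:

```python
import numpy as np

def feedback_pass(c, A, estimate_s):
    """One feedforward / feedback cycle of the contour-integrator model.

    c          : feedforward complex-cell outputs c_k
    A          : weight matrix with entries a_ki (rows index k, columns index i)
    estimate_s : routine returning s_hat = argmax_s log p(s|c)
    """
    s_hat = estimate_s(c, A)    # higher-level (contour-coding) activities
    c_hat = A @ s_hat           # top-down reconstruction, noise ignored
    u = c_hat - c               # feedback signal u_k = c_hat_k - c_k
    return s_hat, c_hat, u
```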

SLIDE 19

Feedback as Bayesian inference: contour integrator example

(Fig. 14.1)

Example results:
  • left: patches with random Gabor functions (three collinear ones in the upper case)
  • middle: c_k
  • right: ĉ_k (based on the contour-coding unit activities s_i)

⇒ noise reduction emphasizes collinear activations but suppresses others

SLIDE 20

Feedback as Bayesian inference: higher-order activities

How to estimate the higher-order activities ŝ = argmax_s p(s|c)? As before, Bayes' rule gives

$$\log p(s \mid c) = \log p(c \mid s) + \log p(s) + \text{const.}$$

Again we assume that log p(s) is sparse. Analogously to the overcomplete basis case:

$$\log p(s \mid c) = -\frac{1}{2\sigma^2} \sum_{k=1}^{K} \left( c_k - \sum_{i=1}^{m} a_{ki}\, s_i \right)^{\!2} + \sum_{i=1}^{m} G(s_i) + \text{const.}$$

Next assume A is invertible and orthogonal
⇒ multiplying c − As by A^T inside the squared sum turns c − As into A^T c − s without changing the norm:

$$\log p(s \mid c) = -\frac{1}{2\sigma^2} \sum_{i=1}^{m} \left( \sum_{k=1}^{K} a_{ki}\, c_k - s_i \right)^{\!2} + \sum_{i=1}^{m} G(s_i) + \text{const.}$$
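A quick numerical check of the norm-preservation step, using an assumed random orthogonal A (obtained here from a QR decomposition):

```python
import numpy as np

rng = np.random.default_rng(0)
m = 8
# QR of a random matrix yields an orthogonal A, so A.T preserves norms.
A, _ = np.linalg.qr(rng.standard_normal((m, m)))

c = rng.standard_normal(m)
s = rng.standard_normal(m)

# ||c - A s|| and ||A^T c - s|| agree up to floating-point error.
print(np.linalg.norm(c - A @ s))
print(np.linalg.norm(A.T @ c - s))
```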

SLIDE 21

Feedback as Bayesian inference: higher-order activities

Maximize separately for each i:

$$\log p(s_i \mid c) = -\frac{1}{2\sigma^2} \left( \sum_{k=1}^{K} a_{ki}\, c_k - s_i \right)^{\!2} + G(s_i) + \text{const.}$$

The maximum point can be represented as

$$\hat{s}_i = f\!\left( \sum_{k=1}^{K} a_{ki}\, c_k \right),$$

where f depends on G = log p(s_i). For example, for the Laplacian distribution, $f(y) = \operatorname{sign}(y)\,\max(|y| - \sqrt{2}\,\sigma^2,\, 0)$.
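The Laplacian case is a plain soft-thresholding rule; a sketch, vectorized over all units, with the noise level sigma as an assumed parameter:

```python
import numpy as np

def laplacian_shrinkage(y, sigma):
    """f(y) = sign(y) * max(|y| - sqrt(2) * sigma^2, 0)."""
    return np.sign(y) * np.maximum(np.abs(y) - np.sqrt(2) * sigma**2, 0.0)

# y_i = sum_k a_ki c_k: small feedforward inputs are shrunk exactly to zero.
y = np.array([-1.5, -0.05, 0.02, 0.8])
print(laplacian_shrinkage(y, sigma=0.5))   # -> [-1.146..., 0., 0., 0.446...]
```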

SLIDE 22

Feedback as Bayesian inference: higher-order activities

(Fig. 14.3)

Sparseness leads to shrinkage/thresholding. Left image: f for the Laplacian distribution (solid line) and for a highly sparse distribution [Eq. 7.22 in the book] (dash-dotted line)

⇒ cell activities considered noise are shrunk to zero

SLIDE 23

Feedback as Bayesian inference: Categorization

The generative model is applicable to any two cell groups. Example: category variables s_i ∈ {0,1}, where the value is 1 if the object in the visual field belongs to a certain category
⇒ jumpy behaviour

SLIDE 24

Overcomplete basis and end-stopping

(Fig. 14.4: Receptive fields and stimuli.)

End-stopping: some (simple) cells reduce their firing rate if the Gabor stimulus is elongated
⇒ receptive fields not linear?

How to model? Overcomplete basis and Bayesian inference
⇒ competition between overlapping cells

SLIDE 25

Predictive coding

Predictive coding:
  • the upper level predicts the activity in the lower level
  • the lower level sends the errors back to the upper level

In the noisy generative model, the prediction is implicit.
⇒ estimating the noisy generative model ≈ minimization of the prediction error

To infer the most likely s_i, repeat the above steps and update the estimates using a gradient method with

$$\frac{\partial \log p(s \mid c)}{\partial s_i} = \frac{1}{\sigma^2} \sum_{k} a_{ki} \left( c_k - \sum_{j=1}^{m} a_{kj}\, s_j \right) + G'(s_i).$$
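A sketch of this update loop in NumPy; the learning rate, iteration count, and the choice G(u) = -log cosh(u) (so that G′(u) = -tanh(u)) are illustrative assumptions:

```python
import numpy as np

def infer_s(c, A, sigma=0.5, lr=0.01, n_steps=500):
    """Gradient ascent on log p(s|c): predict, feed back the error, update."""
    s = np.zeros(A.shape[1])
    for _ in range(n_steps):
        error = c - A @ s                            # prediction error at the lower level
        grad = A.T @ error / sigma**2 - np.tanh(s)   # d log p(s|c) / ds
        s += lr * grad                               # update the upper-level estimate
    return s
```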

SLIDE 26

Summary

Overcomplete models:
  • overcomplete basis
  • energy based models

Interactions:
  • noisy model and Bayesian inference ⇒ feedback
  • overcomplete basis ⇒ end-stopping
  • predictive coding
