
SLIDE 1

Information Theory Slides

Jonathan Pillow

SLIDE 2

Barlow’s “Efficient Coding Hypothesis”

SLIDE 3

Efficient Coding Hypothesis:

redundancy: R = 1 − I(X;Y)/C (mutual information I; channel capacity C)

  • goal of nervous system: maximize information about environment

(one of the core “big ideas” in theoretical neuroscience)

Barlow 1961; Atick & Redlich 1990

SLIDE 4

Efficient Coding Hypothesis:

Barlow 1961; Atick & Redlich 1990

redundancy: R = 1 − I(X;Y)/C

channel capacity C:

  • upper bound on mutual information
  • determined by physical properties of encoder

mutual information: I(X;Y) = H(Y) − H(Y|X)

  • avg # of yes/no questions you can answer about x given y ("bits")
  • entropy: H(Y) = −∑ p(y) log₂ p(y)
  • H(Y) is the response entropy; H(Y|X) is the "noise" entropy

  • goal of nervous system: maximize information about environment

(one of the core “big ideas” in theoretical neuroscience)
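To make these definitions concrete, here is a minimal numpy sketch (an illustration, not code from the slides) that computes response entropy, noise entropy, and mutual information from a discrete joint table; the 2×2 joint distribution is invented for the example.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits; zero-probability entries contribute 0."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def mutual_information(p_xy):
    """I(X;Y) = H(Y) - H(Y|X) for a joint table p_xy[i, j] = P(x_i, y_j)."""
    p_x = p_xy.sum(axis=1)                  # marginal over stimuli x
    p_y = p_xy.sum(axis=0)                  # marginal over responses y
    H_resp = entropy(p_y)                   # response entropy H(Y)
    H_noise = sum(p_x[i] * entropy(p_xy[i] / p_x[i])      # "noise" entropy H(Y|X)
                  for i in range(len(p_x)) if p_x[i] > 0)
    return H_resp - H_noise

# toy joint distribution: 2 stimuli x 2 responses (values made up for illustration)
p_xy = np.array([[0.4, 0.1],
                 [0.1, 0.4]])
print(mutual_information(p_xy))             # ~0.278 bits
```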

SLIDE 5

Barlow’s original version:

redundancy: R = 1 − I(X;Y)/C

mutual information: I(X;Y) = H(Y) − H(Y|X) (response entropy − "noise" entropy)

if responses are noiseless, H(Y|X) = 0, so I(X;Y) = H(Y)

Barlow 1961; Atick & Redlich 1990

SLIDE 6

Barlow’s original version:

mutual information: I(X;Y) = H(Y) − H(Y|X) (response entropy − "noise" entropy)

noiseless system ⇒ brain should maximize response entropy H(Y)

  • use full dynamic range
  • decorrelate ("reduce redundancy")
  • mega impact: huge number of theory and experimental papers focused on decorrelation / information-maximizing codes in the brain

Barlow 1961; Atick & Redlich 1990

SLIDE 7

basic intuition

[Figure: two scatter plots. Left, "pixels": pixel i vs. pixel i+1 in a natural image; nearby pixels exhibit strong dependencies. Right, "desired encoding": neural response i vs. neural response i+1 in the neural representation, with those dependencies removed.]
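A quick sketch of this intuition (an illustration, not code from the slides): correlated Gaussian pairs stand in for nearby pixels, and a linear whitening transform plays the role of the decorrelating neural code.

```python
import numpy as np

rng = np.random.default_rng(0)

# correlated Gaussian pairs stand in for "nearby pixels" of a natural image
n = 10_000
common = rng.normal(size=n)
pixels = np.stack([common + 0.3 * rng.normal(size=n),
                   common + 0.3 * rng.normal(size=n)])    # shape (2, n)

C = np.cov(pixels)
print(np.round(C, 2))                    # strong off-diagonal correlation

# whitening transform W = C^(-1/2) decorrelates the "neural responses"
evals, evecs = np.linalg.eigh(C)
W = evecs @ np.diag(evals ** -0.5) @ evecs.T
responses = W @ pixels
print(np.round(np.cov(responses), 2))    # ~identity: decorrelated
```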

SLIDE 8

Application example: a single neuron encoding stimuli from a distribution P(x)

[Figure: Gaussian stimulus prior P(x); noiseless, discrete encoding into response levels]

Q: what solution for infomax?

SLIDE 9

Application example: a single neuron encoding stimuli from a distribution P(x)

[Figure: Gaussian stimulus prior P(x); noiseless, discrete encoding; the infomax nonlinearity traces the cdf of P(x)]

Q: what solution for infomax?
A: histogram equalization (make the encoding nonlinearity the cdf of the stimulus prior, so every response level is used equally often)
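A minimal sketch of histogram equalization (my illustration, not code from the slides): passing stimuli through their empirical cdf before binning makes all discrete response levels equally likely, which maximizes response entropy.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=100_000)      # Gaussian stimulus prior P(x)
n_levels = 8                      # discrete, noiseless response levels

def entropy(counts):
    p = counts / counts.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# naive encoding: equally spaced bins over the stimulus range
naive = np.digitize(x, np.linspace(x.min(), x.max(), n_levels + 1)[1:-1])

# infomax encoding: pass x through its empirical cdf (ranks), then bin
cdf_vals = np.argsort(np.argsort(x)) / len(x)
infomax = np.digitize(cdf_vals, np.linspace(0, 1, n_levels + 1)[1:-1])

print(entropy(np.bincount(naive, minlength=n_levels)))    # < 3 bits
print(entropy(np.bincount(infomax, minlength=n_levels)))  # = 3 bits = log2(8), maximal
```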

SLIDE 10

Laughlin 1981: blowfly light response

[Figure: measured response data superimposed on the cdf of light level; the neuron's response function closely follows the cdf]

  • first major validation of Barlow’s theory
SLIDE 11

luminance-dependent receptive fields

Atick & Redlich 1990 - extended theory to noisy responses

[Figure: receptive field weighting as a function of space at three noise levels. High SNR: "whitening" / decorrelating. Middle SNR: partial whitening. Low SNR: averaging / correlating.]
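The sketch below illustrates this SNR dependence qualitatively. It assumes a 1/f² power spectrum for natural scenes and builds the filter from a whitening term times a Wiener-style noise-suppression factor; this is a stylized illustration of the idea, not the exact Atick & Redlich derivation.

```python
import numpy as np

f = np.linspace(0.1, 10, 200)    # spatial frequency
S = 1.0 / f ** 2                 # assumed natural-scene power spectrum (~1/f^2)

def filter_gain(S, noise_var):
    """Whitening (S^-1/2), tempered by a Wiener-style factor that
    suppresses frequencies where noise dominates the signal."""
    return S ** -0.5 * (S / (S + noise_var))

for noise_var, label in [(1e-4, "high SNR: ~whitening (gain rises with f)"),
                         (1e-1, "middle SNR: partial whitening (band-pass)"),
                         (1e1,  "low SNR: averaging (low-pass)")]:
    g = filter_gain(S, noise_var)
    print(f"{label:45s} peak gain at f = {f[np.argmax(g)]:.2f}")
```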

SLIDE 12

estimating entropy and MI from data

SLIDE 13

1. the "direct method" (Strong et al 1998)

[Figure: repeated-stimulus raster]

  • fix bin size Δ
  • fix word length N

e.g., Δ = 10 ms, N = 3 ⇒ 2³ = 8 possible words

samples from P(R|S): 001, 010, 010, 110, ... ⇒ estimate Ĥ

i.e., from a histogram-based estimate of the probabilities P(R|Sⱼ), then H = −∑ P log₂ P

SLIDE 14

1. the "direct method" (Strong et al 1998)

[Figure: repeated-stimulus raster]

  • fix bin size Δ
  • fix word length N

e.g., Δ = 10 ms, N = 3 ⇒ 2³ = 8 possible words; samples: 001, 010, 010, 110, ...

Estimate is: Ĥ = −∑ P(w) log₂ P(w), summed over all words w and averaged over all blocks of size N (see the sketch below)
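A minimal plug-in version of the word-entropy estimate (an illustration under the assumptions above: binary bins, sliding length-N words); real analyses must also correct for finite-sampling bias, which this sketch ignores.

```python
import numpy as np

def word_entropy(spikes, N):
    """Plug-in entropy (bits) of length-N binary words from a binned spike train.
    spikes: 1-D array of 0s and 1s, one entry per time bin of width Δ."""
    words = [int("".join(map(str, spikes[i:i + N])), 2)      # word -> integer code
             for i in range(len(spikes) - N + 1)]
    p = np.bincount(words, minlength=2 ** N) / len(words)    # word probabilities
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# toy binned train (Δ = 10 ms bins), N = 3  =>  2^3 = 8 possible words
rng = np.random.default_rng(2)
spikes = (rng.random(5000) < 0.2).astype(int)
print(word_entropy(spikes, N=3))    # ~2.17 bits for independent bins with p=0.2
```

In the full direct method, the total entropy is estimated from words pooled over all times, the noise entropy from the word distribution at each time across repeats, and the information is their difference.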

SLIDE 15

2. "single-spike information" (Brenner et al 2000)

[Figure: repeated-stimulus raster and PSTH r(t)]

Information per spike: I₁ = (1/T) ∫₀ᵀ (r(t)/r̄) log₂ (r(t)/r̄) dt, with mean rate r̄

  • equal to the information carried by an inhomogeneous Poisson process

SLIDE 16

derivation of single-spike information

I₁ = H(Unif([0, T])) − H(p(t_sp | stim))

  • entropy of Unif([0, T]): log₂ T
  • p(t_sp | stim) = r(t) / (r̄ T): the normalized PSTH, with mean rate r̄

SLIDE 17

derivation of single-spike information (continued)

I₁ = H(Unif([0, T])) − H(p(t_sp | stim))
   = log₂ T + ∫₀ᵀ p(t_sp | stim) log₂ p(t_sp | stim) dt_sp

substituting the normalized PSTH p(t_sp | stim) = r(t) / (r̄ T), with mean rate r̄:

I₁ = (1/T) ∫₀ᵀ (r(t)/r̄) log₂ (r(t)/r̄) dt
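This formula translates directly into code; the sketch below (my illustration) assumes the PSTH r(t) has already been estimated from the repeated-stimulus raster.

```python
import numpy as np

def single_spike_info(psth, dt):
    """Single-spike information in bits/spike (Brenner et al 2000):
    I1 = (1/T) * integral of (r(t)/rbar) * log2(r(t)/rbar) dt."""
    rbar = psth.mean()                 # mean rate
    ratio = psth / rbar
    ratio = ratio[ratio > 0]           # 0 * log(0) contributes 0
    T = len(psth) * dt
    return np.sum(ratio * np.log2(ratio)) * dt / T

t = np.linspace(0, 1, 1000)
flat = np.full_like(t, 20.0)                    # unmodulated rate
modulated = 20.0 * (1 + np.sin(2 * np.pi * 4 * t))
print(single_spike_info(flat, dt=1e-3))         # 0: no information per spike
print(single_spike_info(modulated, dt=1e-3))    # > 0: modulation carries information
```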

SLIDE 18

3. decoding-based methods

So far we have focused on the formulation: I(S;R) = H(R) − H(R|S)

Decoding-based approaches focus on the alternative version: I(S;R) = H(S) − H(S|R)
SLIDE 19

3. decoding-based methods

Suppose we have a decoder to estimate the stimulus from spikes: Ŝ = f(R)
(e.g., MAP, or the Optimal Linear Estimator)

[Diagram: Stimulus → Response → Decoder → Ŝ]

Bound #1 (Data Processing Inequality): I(S; Ŝ) ≤ I(S; R)

Bound #2: H(S|R) ≤ entropy of a Gaussian with cov = cov(residual errors), since the Gaussian is the maximum-entropy distribution with that covariance; hence I(S;R) ≥ H(S) − ½ log₂ |2πe cov(S − Ŝ)|
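A sketch of the Gaussian residual-covariance bound on simulated data (everything here is synthetic and illustrative): the decoder is an optimal linear estimator fit by least squares, and the bound I(X;Y) ≥ H(X) − H_gauss(cov of residuals) is evaluated numerically.

```python
import numpy as np

rng = np.random.default_rng(3)

# toy linear-Gaussian system: stimulus x -> response y = A x + noise
n, d = 20_000, 3
x = rng.normal(size=(n, d))
A = rng.normal(size=(d, d))
y = x @ A.T + 0.5 * rng.normal(size=(n, d))

# decoder: optimal linear estimator, fit by least squares
W, *_ = np.linalg.lstsq(y, x, rcond=None)
resid = x - y @ W

def gaussian_entropy(cov):
    """Entropy in bits of a Gaussian with covariance cov: 0.5 * log2|2*pi*e*cov|."""
    return 0.5 * np.log2(np.linalg.det(2 * np.pi * np.e * cov))

# lower bound: I(X;Y) >= H(X) - H_gauss(cov(residuals)),
# because the Gaussian is the max-entropy distribution with that covariance
H_x = gaussian_entropy(np.cov(x.T))          # exact here, since x is Gaussian
print("I(X;Y) >=", H_x - gaussian_entropy(np.cov(resid.T)), "bits")
```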