SLIDE 1

Learning visual motion Statistical models of spike trains

Learning visual motion with recurrent neural networks

Marius Pachitariu

Gatsby Unit, UCL. Adviser: Maneesh Sahani

Marius Pachitariu Learning visual motion with RNNs 1 / 48

SLIDE 2

Outline

Learning visual motion
◮ Spatiotemporal filtering
◮ Recurrent neural networks can compute visual motion
◮ Learning in a generative RNN

Statistical models of spike trains
◮ Recurrent GLM
◮ Instantaneous noise
◮ Results

SLIDE 3

Marr’s three levels of analysis

Levels of analysis

◮ Computational
◮ Algorithmic / Representational
◮ Physical

SLIDE 4

Sequential data types

◮ Movies
◮ Spike trains
◮ Language

SLIDE 6

Spatio-temporal filters

◮ Dominant in both visual neuroscience and computer vision.
◮ Caveats:
  ◮ not real-time / requires copies of the past
    → bad for real-world systems, like the brain.
  ◮ too many parameters
    → bad for learning and generalization.
  ◮ high computational complexity
    → bad with high-bandwidth data.

SLIDE 9

Neural candidates for ST filters

◮ lagged LGN cells (Mastronarde, 1987)
◮ but the LGN is an information bottleneck
◮ but the LGN responds precisely to natural movies (Butts et al., 2011)

SLIDE 12

Compact parametrization of ST filters with an RNN

[Diagram: the output $x_t$ computed either from delayed inputs $y_t, y_{t-1}, \dots, y_{t-\tau}$ through filters $W_0, W_1, \dots, W_\tau$, or from the current input $y_t$ and the previous output $x_{t-1}$ through $W_0$ and recurrent weights $R$.]

Spatiotemporal filtering:

$x_t = \sum_{\tau=0}^{\infty} W_\tau\, y_{t-\tau}$

Recurrent neural network, with the weight sharing $W_\tau = R^\tau W_0$:

$x_t = W_0 y_t + R \sum_{\tau=0}^{\infty} R^\tau W_0\, y_{t-1-\tau} = W_0 y_t + R\, x_{t-1}$

As a simple example, we fit $R$ to a diverse bank of spatiotemporal filters.
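The weight-sharing equivalence can be checked numerically. The sketch below (illustrative sizes and random weights, not the talk's code) confirms that the recurrence $x_t = W_0 y_t + R x_{t-1}$ produces the same outputs as explicit filtering with the induced bank $W_\tau = R^\tau W_0$:

```python
import numpy as np

# Illustrative check that a linear RNN x_t = W0 y_t + R x_{t-1}
# implements the spatiotemporal filter bank W_tau = R^tau W0.
rng = np.random.default_rng(0)
N, D, T = 8, 5, 50                    # hidden units, input pixels, time steps
W0 = rng.standard_normal((N, D))
R = 0.8 * rng.standard_normal((N, N)) / np.sqrt(N)  # scaled for stability
y = rng.standard_normal((T, D))       # input "movie": one vector per frame

# Recurrent computation: one step per frame, no stored history of y.
x_rnn = np.zeros((T, N))
for t in range(T):
    x_rnn[t] = W0 @ y[t] + (R @ x_rnn[t - 1] if t > 0 else 0.0)

# Feedforward filtering with the induced filters R^tau W0.
x_ff = np.zeros((T, N))
for t in range(T):
    for tau in range(t + 1):
        x_ff[t] += np.linalg.matrix_power(R, tau) @ W0 @ y[t - tau]

assert np.allclose(x_rnn, x_ff)
```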

SLIDE 13

Reconstructions of the ST filters are good

SLIDE 15

What do the connections look like?

◮ Spectrum of R
◮ Strongest connections to a given neuron (animation).

SLIDE 19

Computational complexity and memory requirements

◮ $l_x \times l_y \times n_t$ filters ($12 \times 12 \times 30$)
◮ $N$ (1600) ST filters
◮ Feedforward flops $= 2 N\, l_x^2\, l_y^2\, n_t$
◮ Recurrent flops $= 2 N^2 + 2 N\, l_x^2\, l_y^2$
◮ 5% non-zero connections in $R$:
◮ Recurrent flops $= 2 \cdot 0.05 \cdot N^2 + 2 N\, l_x^2\, l_y^2$
◮ Recurrent flops < Feedforward flops
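Plugging the slide's numbers into these formulas confirms the claimed inequality; this small check uses the flop counts exactly as stated:

```python
# Illustrative check of the flop counts with the slide's numbers.
lx, ly, nt, N = 12, 12, 30, 1600
sparsity = 0.05  # fraction of non-zero entries in R

feedforward = 2 * N * lx**2 * ly**2 * nt
recurrent_dense = 2 * N**2 + 2 * N * lx**2 * ly**2
recurrent_sparse = 2 * sparsity * N**2 + 2 * N * lx**2 * ly**2

# The recurrent computation is roughly nt times cheaper per frame.
assert recurrent_sparse < recurrent_dense < feedforward
print(f"feedforward: {feedforward:.3g} flops, recurrent (sparse R): {recurrent_sparse:.3g} flops")
```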

SLIDE 24

Advantages of recurrent neural networks

◮ the brain already has the hardware
◮ do not require copies of the past
  → less memory usage
  → the brain has short timescales + a bottleneck in the LGN
  → no evidence for true delay lines in cortex
◮ fewer parameters
  → important for learning and generalization
◮ reduced computational complexity
  → good for high-bandwidth data
◮ can integrate over long time periods
  → natural visual motion can be slow and noisy

SLIDE 25

Neural sequence learning via STDP (toy model)

Rao & Sejnowski, NIPS 2000

SLIDE 27

Spike and Slab Sparse Coding

◮ Olshausen & Millman, 2000; Rehn & Sommer, 2007; Goodfellow et al., 2012

[Graphical model: variables $z^t, x^t, h^t, y^t$; parameters $p, \tau_x, W, \tau_y$.]

$h^t_k \sim \mathrm{Bernoulli}(p_k)$
$x^t \sim \mathcal{N}(0, \tau_x^2 I)$
$z^t = h^t \circ x^t$
$y^t \sim \mathcal{N}(W z^t, \tau_y^2)$
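The generative model can be sketched as a short ancestral-sampling routine; sizes and parameter values below are illustrative assumptions:

```python
import numpy as np

# Ancestral sampling of the spike-and-slab generative model.
# Sizes and parameter values are illustrative.
rng = np.random.default_rng(1)
N, D = 64, 144                       # latent units, observed pixels
p = np.full(N, 0.1)                  # Bernoulli probabilities p_k
tau_x, tau_y = 1.0, 0.1              # slab scale and observation noise
W = rng.standard_normal((D, N))      # dictionary of basis functions

h = rng.random(N) < p                          # h_k ~ Bernoulli(p_k)
x = tau_x * rng.standard_normal(N)             # x ~ N(0, tau_x^2 I)
z = h * x                                      # z = h ∘ x: sparse code
y = W @ z + tau_y * rng.standard_normal(D)     # y ~ N(W z, tau_y^2)

assert np.count_nonzero(z) == h.sum()          # z is exactly zero off the spikes
```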

SLIDE 29

Spike and Slab Recurrent Neural Network

[Graphical model: $z^t, x^t, h^t, y^t, h^{t+1}$; parameters $R, b, \tau_x, W, \tau_y$.]

$P(h^{t+1} \mid z^t) = \sigma(R\, z^t + b)$, where $\sigma(x) = 1/(1 + \exp(-x))$.

For approximate inference:
◮ Assume we have already set $\hat{x}^t, \hat{h}^t$ for $t = 1$ to $T$.
◮ At $T + 1$ we only need to solve a sparse coding (SC) problem.

SLIDE 30

Natural movies and artificial stimuli

[Figure: training data (ZCA whitened); a full single frame of training data; test data (whitened)]

SLIDE 32

Results - Speed tuning

[Figure: response (a.u.) vs. speed (pix/frame), preferred vs. non-preferred direction.]

Rectified neural responses $\max(z, 0)$ to drifting square gratings.

Orban et al., 1986

SLIDE 33

Results - Direction selectivity indices

[Figure: histograms of the number of units vs. direction index, split by preferred speed (pix/frame).]

$DI = 1 - r_{\text{non-pref}} / r_{\text{pref}}$

SLIDE 35

Results - Direction selectivity indices

$DI = 1 - r_{\text{non-pref}} / r_{\text{pref}}$ (Peterson et al., 2004; Gur et al., 2007)

SLIDE 37

Results - connectomics in silico

Largest outgoing connections of one unit

[Figure: presynaptic and postsynaptic connection patterns]

Connected units are co-oriented

SLIDE 39

Results - connectomics in silico

Outgoing connections of 15 randomly chosen DS units (and animation during learning).
Responses to small drifting Gabors (polar plots).

SLIDE 41

Inference and learning

[Graphical model as before: $z^t, x^t, h^t, y^t, h^{t+1}$; parameters $R, b, \tau_x, W, \tau_y$.]

$\mathcal{L}_{\text{ss-RNN}} = \sum_t \mathcal{L}^t_{\text{ss-RNN}}$

$\mathcal{L}^t_{\text{ss-RNN}} = \text{const} - \|y^t - W(x^t \circ h^t)\|^2 / 2\tau_y^2 - \|x^t\|^2 / 2\tau_x^2 + \sum_{j=1}^{N} h^t_j \log \sigma\!\left(R(h^{t-1} \circ x^{t-1}) + b\right)_j + \sum_{j=1}^{N} (1 - h^t_j) \log\left(1 - \sigma\!\left(R(h^{t-1} \circ x^{t-1}) + b\right)_j\right)$

For approximate inference, we use greedy filtering:
◮ Assume we have already set $\hat{x}^t, \hat{h}^t$ for $t = 1$ to $T$.
◮ At step $T + 1$ we only need to solve a sparse coding problem given by the slice $\mathcal{L}^{T+1}_{\text{ss-RNN}}$.
◮ We solve the SC problem with standard matching pursuit / coordinate descent methods.
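One greedy-filtering step can be sketched as follows. This is an illustrative implementation, not the talk's code: it uses plain matching pursuit for the SC slice, and letting the Bernoulli prior multiplicatively bias atom selection is an assumption of this sketch.

```python
import numpy as np

# One greedy-filtering step: fit z^t to y^t by matching pursuit, with the
# temporal prior sigma(R z^{t-1} + b) biasing which units activate.
def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def greedy_filter_step(y, W, R, b, z_prev, n_active=5):
    prior = sigmoid(R @ z_prev + b)            # P(h^t_j = 1 | z^{t-1})
    residual = y.copy()
    z = np.zeros(W.shape[1])
    for _ in range(n_active):
        scores = W.T @ residual                # correlation with each atom
        j = np.argmax(np.abs(scores) * prior)  # prior-biased selection (assumption)
        coef = scores[j] / (W[:, j] @ W[:, j])
        z[j] += coef
        residual -= coef * W[:, j]
    return z

rng = np.random.default_rng(2)
D, N = 144, 64
W = rng.standard_normal((D, N))
R = 0.1 * rng.standard_normal((N, N))
b = -2.0 * np.ones(N)
y = rng.standard_normal(D)
z = greedy_filter_step(y, W, R, b, z_prev=np.zeros(N))
assert np.count_nonzero(z) <= 5                # at most n_active units used
```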

SLIDE 42

Learning of ss-RNN

Rao & Sejnowski, NIPS 2000

Gradients for learning $R$ are similar to the STDP learning rule used in Rao & Sejnowski, 2000:

$\frac{\partial \mathcal{L}^t_{\text{ss-RNN}}}{\partial R_{jk}} = \left(h^{t-1}_k x^{t-1}_k\right) \cdot \left(h^t_j - \sigma\!\left(R(h^{t-1} \circ x^{t-1}) + b\right)_j\right)$

SLIDE 44

DS is learned, OS is not?

◮ Orientation selectivity (OS) and ocular dominance (OD) do not require visual experience
◮ Visual deprivation has little impact on OS and OD
◮ However, DS does require visual experience in ferrets (Li et al., 2006)

SLIDE 45

Conclusions

◮ Recurrent neural networks can analyze visual motion in an online fashion without delayed inputs.
◮ Formulating a generative model allows learning the recurrent connections via an STDP rule.
◮ As a model of V1, the RNN makes testable predictions about the lateral connectivity of neurons.
◮ Responses to stimuli may however be similar to those of spatiotemporal filters.

SLIDE 46

Learning visual motion Statistical models of spike trains Recurrent GLM Instantaneous noise Results

Statement of the problem

SLIDE 47

The equivalent of spatiotemporal filters: Generalized Linear Models (GLMs)

Pillow et al, 2008.

SLIDE 49

Recurrent Generalized Linear Models

[Diagram: predictive RNN; latent states $x^t$ receive input from observations $y^t$ through $W_0$, recur through $R$, and read out through $Z$.]

$x_t = W_0 y_t + R\, x_{t-1}$, with $y_t \perp\!\!\!\perp \{y_{t-1}, y_{t-2}, \dots\} \mid x_t$

Similar to state-of-the-art language models: Sutskever et al., 2011; Mikolov et al., 2011.
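The predictive structure can be sketched generatively. The exponential link, the Poisson observations per time bin, and all sizes are assumptions of this sketch; the defining property is only that $y_t$ depends on the past through $x_t$ alone:

```python
import numpy as np

# Generative sketch of the predictive R-GLM structure: spikes depend on the
# past only through the deterministic state x_t = W0 y_t + R x_{t-1}.
rng = np.random.default_rng(4)
n_lat, n_cells, T = 10, 5, 200
W0 = 0.1 * rng.standard_normal((n_lat, n_cells))
R = 0.9 * np.eye(n_lat) + 0.01 * rng.standard_normal((n_lat, n_lat))
Z = 0.1 * rng.standard_normal((n_cells, n_lat))   # assumed readout
bias = -2.0

x = np.zeros(n_lat)
spikes = np.zeros((T, n_cells), dtype=int)
for t in range(T):
    rate = np.exp(np.minimum(Z @ x + bias, 3.0))  # clipped for stability
    spikes[t] = rng.poisson(rate)
    x = W0 @ spikes[t] + R @ x                    # x_t = W0 y_t + R x_{t-1}

assert spikes.shape == (T, n_cells) and (spikes >= 0).all()
```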

SLIDE 50

Relationship to linear dynamical system (LDS)

[Diagrams: the R-GLM (with instantaneous readout $Z$) and the LDS share the same chain structure over $x^t$ and $y^t$; the LDS lacks the $Z$ connections.]

SLIDE 52

Problem: cannot have instantaneous connections in a GLM

◮ where are instantaneous correlations coming from?
◮ we can add an Ising observation model, $p(x) = \frac{1}{Z} e^{-x^T A x / 2 - x^T b}$, but:
  ◮ the partition function $Z$ is intractable
  ◮ it is not available for Poisson observations
  ◮ we cannot add a nonlinear link function as in a GLM

SLIDE 57

Our solution: sequential prediction of each neuron

◮ constrain $A$ to be strictly lower triangular:

$p(x) = \mathrm{Poisson}\left(f(Ax + b)\right)$

◮ Can do the same with Gaussian observation noise.
  → Equivalent to full-covariance Gaussians.
◮ What about the ordering?
◮ Can do the same with Bernoulli observation noise.
  → Performance matches the Ising model.
◮ Similar to recent image models: Theis et al., 2011; Larochelle & Murray, 2011.
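Because $A$ is strictly lower triangular, neuron $i$ depends only on neurons $1 \dots i-1$, so one ordered pass draws an exact joint sample with no partition function. A sketch, with an assumed exponential link $f$ and illustrative parameters:

```python
import numpy as np

# Sampling the correlated Poisson model by sequential prediction:
# each neuron's count is drawn conditioned on the earlier neurons only.
rng = np.random.default_rng(5)
N = 8
A = np.tril(0.3 * rng.standard_normal((N, N)), k=-1)  # strictly lower triangular
b = -1.0 * np.ones(N)

def sample_counts(rng):
    x = np.zeros(N)
    for i in range(N):                         # visit neurons in order
        rate = np.exp(A[i] @ x + b[i])         # uses only x[0..i-1]
        x[i] = rng.poisson(min(rate, 20.0))    # clipped for stability
    return x

samples = np.array([sample_counts(rng) for _ in range(1000)])
assert samples.shape == (1000, N) and (samples >= 0).all()
```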

SLIDE 58

Sampling the correlated Poisson model

[Figure: data correlations vs. model correlations]

SLIDE 59

Instantaneous noise and recurrence increase the likelihood

Likelihood (bits/spike):
◮ fully independent: −3.15
◮ correlated Poisson: +0.175
◮ GLM: +0.225
◮ GLM with correlated Poisson: +0.03
◮ R-GLM with correlated Poisson: +0.03

SLIDE 60

R-GLM learns long timescales

[Figure: eigenvalues of the recurrent matrix in the complex plane, R-GLM vs. LDS; real parts roughly 0.85 to 1, imaginary parts within ±0.4]

SLIDE 62

Samples of the model reproduce spatiotemporal correlations in the data

[Figures: cross-correlations vs. time lag (−100 to 100 ms) for neuron groups 1–23, 24–46, 47–69, and 70–92; shown for the LDS and GLM, and for the R-GLM]

SLIDE 63

Hidden units integrate information about the stimulus

[Figure: simulated vs. recorded activity, 0–1200 ms; delay period]

SLIDE 64

Hidden units generate dynamics

[Figure: simulated vs. recorded activity, 0–1800 ms; delay + movement period]

SLIDE 67

Joint estimation with hand position improves decoding

[Diagram: the predictive R-GLM with readout $Z$, as before. Panels: the task; the mixture-of-trajectories model.]

Our result: 6.45 mm.

SLIDE 68

Speed profiles

[Figure: grid of speed-profile panels]

SLIDE 69

Conclusions

◮ Recurrent GLMs with correlated Poisson observations improve statistical models of spike trains.
◮ The low-dimensional parametrization improves decoding of hand trajectories from neural data.
◮ This work was funded by the Gatsby Charitable Foundation.

SLIDE 72

The Bayesian Sampling hypothesis

◮ Bayesian brain hypothesis
◮ how does the brain do inference? sampling
◮ interesting data from V1 supports the Bayesian sampling hypothesis
◮ visual word reading time $\sim -\log P(\text{word})$
◮ the Bayesian reader (Norris, 2006) collects visual samples until $P(\text{word} \mid \text{visual samples})$ is large
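The surprisal predictor in the bullets above can be illustrated with made-up probabilities (the values below are invented for the example):

```python
import math

# Toy illustration: reading time is proposed to scale with -log P(word),
# so rarer words should take longer to read.
def surprisal(p):
    return -math.log2(p)  # in bits

p_common, p_rare = 0.05, 1e-6          # e.g. a frequent vs. a rare word
assert surprisal(p_rare) > surprisal(p_common)
```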

SLIDE 73

The sequential Bayesian Reader

◮ Hypothesis: visual word reading time $\sim P(\text{word} \mid \text{history})$
◮ Need two ingredients:
  ◮ data
  ◮ good language models

SLIDE 74

Language modelling: statement of the problem

The Great Gatsby, by F. Scott Fitzgerald In my younger and more vulnerable years my father gave me some advice that I’ve been turning over in my mind ever since. ”Whenever you feel like criticizing any one,” he told me, ”just remember that all the people in this world haven’t had the advantages that you’ve had.”

SLIDE 75

The equivalent of spatio-temporal filters and GLMs: N-grams

4-gram (frequency)

◮ serve as the incoming (92)
◮ serve as the incubator (99)
◮ serve as the independent (794)
◮ serve as the index (223)
◮ serve as the indication (72)
◮ serve as the indicator (120)
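A 4-gram model of this kind is just a table of counts, normalized per context. The toy corpus below is invented for illustration:

```python
from collections import Counter

# Toy 4-gram model: estimate P(next word | previous three words) by counting.
corpus = ("serve as the incoming signal serve as the index term "
          "serve as the index key serve as the indicator of change").split()

counts = Counter(tuple(corpus[i:i + 4]) for i in range(len(corpus) - 3))
context = ("serve", "as", "the")
continuations = {w: c for (*ctx, w), c in counts.items() if tuple(ctx) == context}
total = sum(continuations.values())
probs = {w: c / total for w, c in continuations.items()}

assert probs["index"] == 0.5  # "index" follows "serve as the" in 2 of 4 cases
```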

SLIDE 76

Neural network language models

[Diagram: feedforward neural network language model predicting the next word after "serve as the incoming".]

$h_{t_0} = f\left(\sum_{t=1}^{\infty} W_t^T I_{t_0 - t}\right), \qquad P(I_{t_0}) = \mathrm{softmax}(Z^T h_{t_0})$

SLIDE 77

Recurrent neural network language models

[Diagrams: the feedforward NNLM vs. the recurrent NNLM, both predicting the next word after "serve as the incoming".]

◮ State of the art (Mikolov et al., 2011).
◮ Our simplification: a linear RNN (the R-GLM):

$h_{t_0} = R\, h_{t_0 - 1} + W_0^T I_{t_0} = \sum_{t=0}^{\infty} R^t\, W_0^T I_{t_0 - t}$
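The linear R-GLM language model can be sketched directly from this recurrence; the vocabulary, dimensions, and random weights below are illustrative:

```python
import numpy as np

# Sketch of a linear R-GLM language model: a linear recurrence over
# one-hot word vectors with a softmax readout.
rng = np.random.default_rng(6)
vocab = ["serve", "as", "the", "incoming", "index"]
V, H = len(vocab), 16
W0 = 0.5 * rng.standard_normal((H, V))  # input weights W_0^T
R = 0.7 * np.eye(H)                     # linear recurrence (here a simple decay)
Z = rng.standard_normal((H, V))         # softmax readout

def next_word_probs(words):
    h = np.zeros(H)
    for w in words:
        h = R @ h + W0[:, vocab.index(w)]  # h_t = R h_{t-1} + W0^T I_t
    logits = Z.T @ h
    e = np.exp(logits - logits.max())
    return e / e.sum()

p = next_word_probs(["serve", "as", "the"])
assert p.shape == (V,) and np.isclose(p.sum(), 1.0)
```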

SLIDE 78

What are the relevant time scales of language?

◮ The R-GLM learns caching.

[Figure: "The timescales of language": learned time scale (words in the past, 0.25–60) vs. hidden dimension (128–512)]

◮ Long time scales are good: the dynamic R-GLM further adapts parameters at test time.

SLIDE 79

Perplexity results on Penn Corpus (930k tokens, 10k vocab) - single models

Model                     | Single | +KN5+cache | x10   | x10+KN5+cache
5-gram Kneser-Ney¹        | 141.2  | 125.7      |       |
feedforward NNLM¹         | 140.2  | 106.6      |       |
Log-bilinear LM¹          | 144.5  | 105.8      |       |
RNN¹                      | 124.7  | 97.5       | 102.1 | 89.4
dynamic RNN¹              | 123.2  | 98.0       | 101.0 | 90.0
R-GLM (no reg)            | 137    |            |       |
R-GLM (L1 reg)            | 125    |            |       |
R-GLM (2DO&CN)            | 102    | 94         | 98.8  | 92.5
dynamic R-GLM (2DO&CN)    | 98.4   | 90.7       | 95.1  | 89.1

¹ copied from Tomas Mikolov's thesis
² trained with random dropout and column normalization

SLIDE 80

Conclusions

◮ None yet. Need to collect data.
