Neural Encoding
Mark van Rossum, School of Informatics, University of Edinburgh. January 2015.

Overview: Understanding the neural code
- Encoding: prediction of the neural response to a given stimulus.
- Decoding (the homunculus problem): given the response, what was the stimulus? Given a firing pattern, what will the motor output be? (Important for prostheses.)
- Measuring information rates.
- Books: [Rieke et al., 1996] (a very good book on these issues); [Dayan and Abbott, 2002] (chapters 2 and 3); [Schetzen, 2006]; [Schetzen, 1981] (review paper on the method).

The neural code
- Understanding the neural code is like building a dictionary:
  - translate from the outside world (sensory stimulus or motor action) to the internal neural representation;
  - translate from the neural representation back to the outside world.
- As in real dictionaries, there are both one-to-many and many-to-one entries (think of examples).

Encoding: Stimulus-response relation
- Predict the response $r$ to a stimulus $s$: a black-box approach.
- This is a supervised learning problem, but:
  - the stimulus $s$ can be synaptic input or a sensory stimulus;
  - responses are noisy and unreliable, so we use probabilities;
  - there are typically many input (and sometimes output) dimensions;
  - responses are non-linear.¹ Assume the non-linearity is weak and make a series expansion? Or impose a parametric non-linear model with few parameters.
- Need to assume causality and stationarity (the system remains the same). This excludes adaptation!

¹ Linear means $r(\alpha s_1 + \beta s_2) = \alpha r(s_1) + \beta r(s_2)$ for all $\alpha, \beta$.

Response: Spikes and rates
- The response consists of spikes, and spikes are (largely) stochastic.
- Compute rates by trial-to-trial averaging and hope that the system is stationary and that the noise really is noise.
- Initially we try to predict the rate $r$, rather than the spikes themselves. (Note: there are methods to estimate the most accurate histogram from data.)

Paradigm: Early Visual Pathways
[Figure: the early visual pathway; Dayan and Abbott, 2001, after Nicholls et al., 1992]

Retinal/LGN cell response types
- On-centre off-surround and off-centre on-surround cells.
- Also colour-opponent cells.
- Other pathways (e.g. auditory).

V1 cell response types (Hubel & Wiesel)
- Simple cells, modelled by Gabor functions (odd and even).
- Also complex cells, and spatio-temporal receptive fields.
- Higher areas.

Not all cells are so simple...
- The methods work well under limited conditions and for early sensory systems.
- But intermediate sensory areas (e.g. IT) do things like face recognition: very non-linear, and hard with these methods.
- In even higher areas the receptive field (RF) is not purely sensory. Example: pre-frontal cells that are task dependent [Wallis et al., 2001].

Overview
- Volterra and Wiener expansions
- Spike-triggered average & covariance
- Linear-nonlinear-Poisson (LNP) models
- Integrate & fire and Generalized Linear Models
- Networks

Simple example
- A thermometer: temperature $T$ gives a response $r = g(T)$, where $r$ is measured in cm of mercury.
- $g(T)$ is monotonic, so $g^{-1}(r)$ probably exists.
- It could be somewhat non-linear, and could in principle be noisy.
- It will not indicate the instantaneous $T$, but its recent history. For example
  $r(t) = g\!\left(\int dt'\, T(t')\, k(t - t')\right)$
- $k$ is called a (filter) kernel. The argument of $g(\cdot)$ is a convolution: $T \star k \equiv \int dt'\, T(t')\, k(t - t')$.
- Note: if $k(t) = \delta(t)$ then $r(t) = g(T(t))$. (A small numerical sketch of this filter-then-nonlinearity model follows below.)
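Below is a minimal numerical sketch of the thermometer model just described. The time step, the exponential kernel $k$, and the saturating read-out $g$ are illustrative assumptions, not taken from the slides.

```python
import numpy as np

dt = 0.01                                        # time step in seconds (illustrative)
t = np.arange(0.0, 10.0, dt)
T = 20.0 + 2.0 * np.sin(2 * np.pi * 0.5 * t)     # hypothetical temperature trace

# Hypothetical causal filter kernel k(tau): exponential with a 0.5 s time constant
tau = np.arange(0.0, 3.0, dt)
k = np.exp(-tau / 0.5)
k /= k.sum() * dt                                # unit-area filter, so r tracks T's scale

# The argument of g() is the convolution (T * k)(t) = int dt' T(t') k(t - t')
filtered = np.convolve(T, k)[:len(T)] * dt

def g(x):
    """Monotonic, mildly saturating read-out in cm of mercury (illustrative)."""
    return 10.0 * np.tanh(x / 25.0)

r = g(filtered)                                  # r(t) = g((T * k)(t))
```

With a single-bin kernel of height $1/\Delta t$ (the discrete stand-in for $k(\tau) = \delta(\tau)$), the filtered signal reduces to $T(t)$ itself, as noted on the slide.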

Volterra Kernels
- Inspiration from the Taylor expansion:
  $r(s) = r(0) + r'(0)\,s + \tfrac{1}{2} r''(0)\,s^2 + \ldots = r_0 + r_1 s + \tfrac{1}{2} r_2 s^2 + \ldots$
- But include the temporal response (a Taylor series with memory):
  $r(t) = h_0 + \int_0^\infty d\tau_1\, h_1(\tau_1)\, s(t-\tau_1) + \int_0^\infty\!\!\int_0^\infty d\tau_1\, d\tau_2\, h_2(\tau_1,\tau_2)\, s(t-\tau_1)\, s(t-\tau_2) + \ldots$, with higher terms $h_3(\tau_1,\tau_2,\tau_3), \ldots$
- Note that $h_2(\tau_1,\tau_2) = h_2(\tau_2,\tau_1)$.
- Hope that $\lim_{\tau_i \to \infty} h_k = 0$, that $h_k$ is smooth, and that $h_k$ is small for large $k$.

Noise and power spectra
- At each time step draw an independent sample from a zero-mean Gaussian:
  $\langle s(t_1) \rangle = 0$, and all odd moments vanish, $\langle s(t_1) \ldots s(t_{2k+1}) \rangle = 0$;
  $\langle s(t_1)\, s(t_2) \rangle = C(t_1 - t_2) = \sigma^2 \delta(t_1 - t_2)$;
  $\langle s(t_1)\, s(t_2)\, s(t_3)\, s(t_4) \rangle = \sigma^4 \left[ \delta(t_1-t_2)\delta(t_3-t_4) + \delta(t_1-t_3)\delta(t_2-t_4) + \delta(t_1-t_4)\delta(t_2-t_3) \right]$
- The noise is called white because in the Fourier domain all frequencies are equally strong.
- The power spectrum of the signal and its autocorrelation are directly related via the Wiener-Khinchin theorem:
  $S(f) = 4 \int_0^\infty C(\tau) \cos(2\pi f \tau)\, d\tau$

Wiener Kernels
- Wiener kernels are a rearrangement of the Volterra expansion, used when $s(t)$ is Gaussian white noise with $\langle s(t_1)\, s(t_2) \rangle = \sigma^2 \delta(t_1 - t_2)$.
- The 0th- and 1st-order Wiener kernels are identical to the Volterra ones.
  $r(t) = g_0 + \int_0^\infty d\tau_1\, g_1(\tau_1)\, s(t-\tau_1) + \left[ \int_0^\infty\!\!\int_0^\infty d\tau_1\, d\tau_2\, g_2(\tau_1,\tau_2)\, s(t-\tau_1)\, s(t-\tau_2) - \sigma^2 \int_0^\infty d\tau_1\, g_2(\tau_1,\tau_1) \right] + \ldots \quad (1)$
- The predicted rate is given by Eq. (1).

Estimating Wiener Kernels
- To find the kernels, stimulate with Gaussian white noise (see the sketches below):
  $g_0 = \langle r \rangle$
  $g_1(\tau) = \frac{1}{\sigma^2} \langle r(t)\, s(t-\tau) \rangle$ (a correlation)
  $g_2(\tau_1,\tau_2) = \frac{1}{2\sigma^4} \langle r(t)\, s(t-\tau_1)\, s(t-\tau_2) \rangle$ for $\tau_1 \neq \tau_2$
- In the Wiener, but not the Volterra, expansion successive terms are independent: including a quadratic term won't affect the estimate of the linear term, etc.
- Technical point [Schetzen, 1981]: lower terms do enter higher-order correlations, e.g.
  $\langle r(t)\, s(t-\tau_1)\, s(t-\tau_2) \rangle = 2\sigma^4 g_2(\tau_1,\tau_2) + \sigma^2 g_0\, \delta(\tau_1 - \tau_2)$
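As a quick numerical check of the white-noise statements above (a sketch; the per-sample variance $\sigma^2$ plays the role of the continuous-time $\sigma^2\delta$): the sample autocorrelation is concentrated at zero lag, and the power spectrum is flat.

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma = 100_000, 1.5
s = rng.normal(0.0, sigma, size=n)          # zero-mean Gaussian white noise

# Autocorrelation C(lag): ~ sigma^2 at lag 0, ~ 0 at all other lags
for lag in range(4):
    print(lag, np.mean(s[lag:] * s[:n - lag]))

# Power spectrum (periodogram): roughly flat, i.e. all frequencies equally strong
power = np.abs(np.fft.rfft(s)) ** 2 / n
print(power[1:].mean(), "~", sigma ** 2)     # flat spectrum <-> delta-like autocorrelation
```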

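A sketch of the first-order estimation formulas on the "Estimating Wiener Kernels" slide: drive an invented linear system with Gaussian white noise and recover $g_0 = \langle r \rangle$ and $g_1(\tau) = \langle r(t)\, s(t-\tau) \rangle / \sigma^2$ by correlation. The "true" kernel and offset are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n, sigma = 200_000, 1.0
s = rng.normal(0.0, sigma, size=n)           # white-noise stimulus

# Hypothetical true system: offset g0 plus a causal linear filter of length L
L = 40
g1_true = np.exp(-np.arange(L) / 8.0) * np.sin(np.arange(L) / 4.0)
r = 5.0 + np.convolve(s, g1_true)[:n]        # r(t) = g0 + sum_i g1(i) s(t - i)

# 0th-order kernel: g0 = <r>
g0_hat = r.mean()

# 1st-order kernel by reverse correlation: g1(tau) = <r(t) s(t - tau)> / sigma^2
g1_hat = np.array([np.mean(r[tau:] * s[:n - tau]) for tau in range(L)]) / sigma**2

print(g0_hat)                                # close to 5.0
print(np.max(np.abs(g1_hat - g1_true)))      # small estimation error
```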
Remarks
- The predicted rate can be negative.
- In biology, unlike physics, there is no obvious small parameter that justifies neglecting higher orders. Check the accuracy of the approximation post hoc.
- Averaging and ergodicity: $\langle x \rangle$ formally means an average over many realisations of the random variables of the system (both stimuli and internal state). This definition is good to remember when conceptual problems occur.
- An ergodic system visits all realisations if one waits long enough. That means one can measure from a system repeatedly and still get the true average.

Wiener Kernels in Discrete Time
- Model:
  $r(n\Delta t) = g_0 + \sum_{i=0}^{L-1} g_{1i}\, s\big((n-i)\Delta t\big) + \ldots$
- In discrete time this is just linear/polynomial regression. Solve, e.g., by minimising the squared error $E = (\mathbf{r} - S\mathbf{g})^T (\mathbf{r} - S\mathbf{g})$.
- E.g. for $L = 3$, $\mathbf{g} = (g_0, g_{10}, g_{11}, g_{12})^T$ and $\mathbf{r} = (r_1, r_2, \ldots, r_n)^T$:
  $S = \begin{pmatrix} 1 & s_1 & s_0 & s_{-1} \\ 1 & s_2 & s_1 & s_0 \\ \vdots & & & \vdots \\ 1 & s_n & s_{n-1} & s_{n-2} \end{pmatrix}$
- $S$ is an $n \times (1+L)$ matrix (the 'design matrix').
- The least-squares solution $\hat{\mathbf{g}}$ for any stimulus (differentiate $E$ w.r.t. $\mathbf{g}$):
  $\hat{\mathbf{g}} = (S^T S)^{-1} S^T \mathbf{r}$
- Note that on average, for Gaussian white noise, $\langle S^T S \rangle_{ij} = n\, \delta_{ij}\,\big(\sigma^2 + (1-\sigma^2)\delta_{i1}\big)$. After substitution we obtain
  $\hat{g}_0 = \frac{1}{n}\sum_{i=1}^n r_i = \langle r \rangle, \qquad \hat{g}_{1j} = \frac{1}{\sigma^2}\,\frac{1}{n}\sum_{i=1}^n s_{i-j}\, r_i = \frac{1}{\sigma^2}\,\mathrm{corr}(s, r)$
- Note the parallel with the continuous-time equations. (A least-squares sketch follows below.)

Linear case: Fourier domain
- Convolution becomes simple multiplication in the Fourier domain.
- Assume the neuron is purely linear ($g_j = 0$ for $j > 1$); otherwise the Fourier representation is not helpful.
  $r(t) = r_0 + (s * g_1)(t)$
  $\langle s(t)\, r(t+\tau) \rangle = \langle s\, r_0 \rangle + \langle s(t)\, (g_1 * s) \rangle$
- Hence $g_1(\omega) = \dfrac{\langle rs \rangle(\omega)}{\langle ss \rangle(\omega)}$.
- For Gaussian white noise $\langle ss \rangle(\omega) = \sigma^2$ (note that $\langle s \rangle = 0$), so $g_1(\omega) = \frac{1}{\sigma^2} \langle rs \rangle(\omega)$.
- $g_1$ can be interpreted as the impedance of the system.
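A minimal sketch of the discrete-time regression above, with $L = 3$ as in the slide's example. The stimulus, the "true" parameters, and the output noise are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n, L, sigma = 50_000, 3, 1.0
s = rng.normal(0.0, sigma, size=n + L - 1)    # padded so s_0, s_{-1} exist for the first rows

# Hypothetical true parameters g = (g0, g10, g11, g12)
g_true = np.array([2.0, 0.8, -0.3, 0.1])

# Design matrix S: row i is (1, s_i, s_{i-1}, s_{i-2})
S = np.column_stack([np.ones(n)] + [s[L - 1 - j : L - 1 - j + n] for j in range(L)])
r = S @ g_true + 0.1 * rng.normal(size=n)     # responses with a little added noise

# Least-squares solution: g_hat = (S^T S)^{-1} S^T r
g_hat = np.linalg.solve(S.T @ S, S.T @ r)
print(g_hat)                                   # close to g_true
```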

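The same first-order estimate can be done in the Fourier domain, as on the "Linear case" slide: divide the cross-spectrum of $r$ and $s$ by the power spectrum of $s$. The sketch below uses an invented kernel and a circular convolution so that the FFT algebra is exact; for white noise the denominator is simply $\sigma^2$.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1 << 16
s = rng.normal(0.0, 1.0, size=n)

# Toy linear system r = r0 + s * g1 (circular convolution via the FFT)
g1_true = np.zeros(n)
g1_true[:40] = np.exp(-np.arange(40) / 8.0)
r = 1.0 + np.real(np.fft.ifft(np.fft.fft(s) * np.fft.fft(g1_true)))

S_f = np.fft.fft(s)
R_f = np.fft.fft(r - r.mean())                 # remove the offset r0 before estimating

# g1(omega) = <r s>(omega) / <s s>(omega), from the cross- and power spectra
g1_omega = (R_f * np.conj(S_f)) / (S_f * np.conj(S_f))
g1_est = np.real(np.fft.ifft(g1_omega))
print(np.max(np.abs(g1_est[:40] - g1_true[:40])))   # small reconstruction error
```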
Regularization
- Fits with many parameters typically require regularization to prevent over-fitting.
- Regularization: punish fluctuations (a smoothness prior).
- Non-white stimulus, Fourier domain: $g_1(\omega) = \dfrac{\langle rs \rangle(\omega)}{\langle ss \rangle(\omega) + \lambda}$ (prevents division by zero as $\omega \to \infty$).
- In the time domain: $\hat{\mathbf{g}} = (S^T S + \lambda I)^{-1} S^T \mathbf{r}$. (See the sketch below.)
- Set $\lambda$ by hand.
[Figure: spike-triggered average (STA) estimated without ('unreg') and with ('regul') regularization, plotted against time.]

Figure: Over-fitting. Left: the stars are the data points. Although the dashed line might fit the data better, it is over-fitted and is likely to perform worse on new data; the solid line appears a more reasonable model. Right: when you over-fit, the error on the training data decreases, but the error on new (validation) data increases. Ideally both errors are minimal.

Spatio-temporal kernels
- The kernel can also be defined in the spatio-temporal domain.
- This V1 kernel does not respond to a static stimulus, but will respond to a moving grating (see [Dayan and Abbott, 2002], §2.4, for more on motion detectors).
[Figure: spatio-temporal V1 receptive field; Dayan and Abbott, 2002]

Higher-order kernels
- Including higher orders leads to more accurate estimates.
- Example: chinchilla auditory system [Temchin et al., 2005].
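A sketch of the time-domain regularized fit $\hat{\mathbf{g}} = (S^T S + \lambda I)^{-1} S^T \mathbf{r}$; the toy design matrix, the tiny data set, and the value of $\lambda$ are all invented for illustration.

```python
import numpy as np

def fit_kernel(S, r, lam=0.0):
    """Least-squares kernel fit; lam > 0 gives the regularized (ridge) solution
    g_hat = (S^T S + lam I)^{-1} S^T r from the slide."""
    return np.linalg.solve(S.T @ S + lam * np.eye(S.shape[1]), S.T @ r)

# Tiny, noisy toy problem: few samples and many parameters, so the plain fit over-fits
rng = np.random.default_rng(4)
S = np.column_stack([np.ones(20), rng.normal(size=(20, 10))])
r = S[:, 1] - 0.5 * S[:, 2] + rng.normal(scale=2.0, size=20)

g_unreg = fit_kernel(S, r, lam=0.0)
g_reg = fit_kernel(S, r, lam=5.0)              # lambda set by hand, as on the slide
print(np.linalg.norm(g_unreg), np.linalg.norm(g_reg))   # regularization shrinks the estimate
```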
