

  1. Population Coding. Maneesh Sahani (maneesh@gatsby.ucl.ac.uk), Gatsby Computational Neuroscience Unit, University College London. Term 1, Autumn 2010.

  2. Coding so far ...
     • Time-series for both spikes and stimuli
     • Empirical: estimate function(s) given measured stimuli or movements and spikes

  3. Population codes
     • High dimensionality (cells × stimulus × time):
       – usually limited to simple rate codes;
       – even prosthetic work assumes instantaneous (lagged) coding.
     • Limited empirical data:
       – we can record 10s to 100s of neurons;
       – population sizes are more like $10^4$ to $10^6$;
       – so inferences are partly theoretical, based on single-cell and aggregate (fMRI, LFP, optical) measurements.

  4. Common approach
     The most common questions asked of population codes:
     • Given assumed encoding functions, how well can we (or downstream areas) decode the encoded stimulus value?
     • What encoding schemes would be optimal, in the sense of allowing decoders to estimate stimulus values as well as possible?
     Before considering populations, we need to formulate some ideas about rate coding in the context of single cells.

  5. Rate coding
     In the rate-coding context, we imagine that the firing rate $r$ of a cell represents a single (possibly multidimensional) stimulus value $s$ at any one time: $r = f(s)$. Even if $s$ and $r$ are embedded in time-series, we assume:
     1. that coding is instantaneous (with a fixed lag), and
     2. that $r$ (and therefore $s$) is constant over a short time $\Delta$.
     The actual number of spikes $n$ produced in $\Delta$ is then taken to be distributed around $r\Delta$, often according to a Poisson distribution.
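     As a concrete illustration (not from the slides), here is a minimal Python sketch of this generative picture; the Gaussian tuning curve f and all parameter values are assumptions made for the example.

     ```python
     import numpy as np

     rng = np.random.default_rng(0)

     def f(s, r0=5.0, rmax=50.0, s_pref=0.0, sigma=1.0):
         """Hypothetical Gaussian tuning curve: firing rate (Hz) as a function of s."""
         return r0 + rmax * np.exp(-0.5 * ((s - s_pref) / sigma) ** 2)

     delta = 0.1           # coding window Delta (s), assumed
     s = 0.5               # stimulus value, assumed constant over the window
     r = f(s)              # rate implied by the rate code
     n = rng.poisson(r * delta, size=1000)  # spike counts, distributed around r*Delta

     print(f"rate*Delta = {r * delta:.2f}, mean count = {n.mean():.2f}")
     ```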

  6. Tuning curves
     The function $f(s)$ is known as a tuning curve. Commonly assumed forms:
     • Gaussian: $r_0 + r_{\max} \exp\left(-\frac{1}{2\sigma^2}(x - x_{\text{pref}})^2\right)$
     • Cosine: $r_0 + r_{\max} \cos(\theta - \theta_{\text{pref}})$
     • Wrapped Gaussian: $r_0 + r_{\max} \sum_n \exp\left(-\frac{1}{2\sigma^2}(\theta - \theta_{\text{pref}} - 2\pi n)^2\right)$
     • von Mises ("circular Gaussian"): $r_0 + r_{\max} \exp\left(\kappa \cos(\theta - \theta_{\text{pref}})\right)$
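     A short Python sketch of these four forms (parameter names follow the slide; truncating the wrapped-Gaussian sum to a few terms is an implementation choice, adequate for moderate sigma):

     ```python
     import numpy as np

     def gaussian(x, r0, rmax, x_pref, sigma):
         return r0 + rmax * np.exp(-0.5 * ((x - x_pref) / sigma) ** 2)

     def cosine(theta, r0, rmax, theta_pref):
         return r0 + rmax * np.cos(theta - theta_pref)

     def wrapped_gaussian(theta, r0, rmax, theta_pref, sigma, n_terms=5):
         # Truncate the infinite sum over wraps n; a few terms suffice for moderate sigma.
         theta = np.asarray(theta, dtype=float)
         n = np.arange(-n_terms, n_terms + 1)
         d = theta[..., None] - theta_pref - 2 * np.pi * n
         return r0 + rmax * np.exp(-0.5 * (d / sigma) ** 2).sum(axis=-1)

     def von_mises(theta, r0, rmax, theta_pref, kappa):
         return r0 + rmax * np.exp(kappa * np.cos(theta - theta_pref))
     ```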

  7. Measuring the performance of rate codes: discrete choice
     Suppose we want to make a binary choice based on firing rate:
     • present / absent (signal detection)
     • up / down
     • horizontal / vertical
     Call one potential stimulus $s_0$ and the other $s_1$, with response distributions $P(n|s_0)$ and $P(n|s_1)$.
     [Figure: overlapping probability densities $P(n|s_0)$ and $P(n|s_1)$ plotted against the response.]

  8. ROC curves
     [Figure: densities $P(n|s_0)$ and $P(n|s_1)$ over the response axis, with the resulting ROC curve: hit rate against false alarm rate, both from 0 to 1.]

  9. ROC curves
     [Figure as on the previous slide: hit rate against false alarm rate.]

  10. Summary measures
     • Area under the ROC curve:
       – given $n_1 \sim P(n|s_1)$ and $n_0 \sim P(n|s_0)$, this equals $P(n_1 > n_0)$.
     • Discriminability $d'$:
       – for equal-variance Gaussians, $d' = \frac{\mu_1 - \mu_0}{\sigma}$;
       – for any threshold, $d' = \Phi^{-1}(1 - FA) - \Phi^{-1}(1 - HR)$, where $\Phi$ is the standard normal cdf;
       – the definition is unclear for non-Gaussian distributions.
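     To make these measures concrete, here is a hedged Python sketch, assuming hypothetical Poisson responses under the two stimuli; it traces the ROC curve, estimates the area as $P(n_1 > n_0)$ (counting ties as half), and computes $d'$ from one operating point.

     ```python
     import numpy as np
     from scipy.stats import norm

     rng = np.random.default_rng(1)

     # Hypothetical spike counts under the two stimuli (assumed rates 5 and 9).
     n0 = rng.poisson(5.0, size=5000)   # counts under s0
     n1 = rng.poisson(9.0, size=5000)   # counts under s1

     # Sweep a decision threshold over counts to trace the ROC curve.
     thresholds = np.arange(0, n1.max() + 2)
     hit_rate = np.array([(n1 >= t).mean() for t in thresholds])
     fa_rate = np.array([(n0 >= t).mean() for t in thresholds])

     # Area under the ROC curve equals P(n1 > n0), with ties counted half.
     auc = (n1[:, None] > n0[None, :]).mean() + 0.5 * (n1[:, None] == n0[None, :]).mean()

     # d' from a single operating point, using the slide's formula with the inverse normal cdf.
     t = 7
     dprime = norm.ppf(1 - (n0 >= t).mean()) - norm.ppf(1 - (n1 >= t).mean())
     print(f"AUC = {auc:.3f}, d' at threshold {t} = {dprime:.2f}")
     ```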

  11. Continuous estimation
     Now consider a (one-dimensional) stimulus that takes any real value (or angle):
     • contrast
     • orientation
     • motion direction
     • movement speed
     Consider a neuron that fires $n$ spikes in response to a stimulus $s$, according to $P(n \mid f(s)\Delta)$. Given $n$, we attempt to estimate $s$. How well can we do?

  12. Continuous estimation
     It is useful to consider the limit of $N \to \infty$ measurements $n_i$, all generated by the same stimulus $s^*$. The posterior over $s$ is
     $$\log P(s \mid \{n_i\}) = \sum_i \log P(n_i \mid s) + \log P(s) - \log Z(\{n_i\})$$
     and so, taking $N \to \infty$,
     $$\frac{1}{N} \log P(s \mid \{n_i\}) \to \big\langle \log P(n \mid s) \big\rangle_{n \mid s^*} + 0 - \log Z(s^*)$$
     giving
     $$P(s \mid \{n_i\}) \to e^{N \langle \log P(n \mid s) \rangle_{n \mid s^*}} / Z = e^{-N\,\mathrm{KL}[P(n \mid s^*) \,\|\, P(n \mid s)]} / Z.$$

  13. Continuous estimation
     Now, Taylor-expand the KL divergence in $s$ around $s^*$:
     $$\mathrm{KL}\big[P(n|s^*)\,\big\|\,P(n|s)\big] = \big\langle \log P(n|s^*) \big\rangle_{n|s^*} - \big\langle \log P(n|s) \big\rangle_{n|s^*}$$
     $$= \big\langle \log P(n|s^*) \big\rangle_{n|s^*} - \big\langle \log P(n|s^*) \big\rangle_{n|s^*} - (s-s^*) \left\langle \frac{d\log P(n|s)}{ds}\bigg|_{s^*} \right\rangle_{n|s^*} - \frac{1}{2}(s-s^*)^2 \left\langle \frac{d^2\log P(n|s)}{ds^2}\bigg|_{s^*} \right\rangle_{n|s^*} + \dots$$
     The zeroth-order terms cancel and the expected score (the first-order term) is zero, so
     $$\mathrm{KL}\big[P(n|s^*)\,\big\|\,P(n|s)\big] = \frac{1}{2}(s-s^*)^2 J(s^*) + \dots$$
     So in asymptopia the posterior tends to $\mathcal{N}\big(s^*, 1/(N J(s^*))\big)$. $J(s^*)$ is called the Fisher information:
     $$J(s^*) = -\left\langle \frac{d^2 \log P(n|s)}{ds^2} \right\rangle_{s^*} = \left\langle \left( \frac{d \log P(n|s)}{ds} \right)^2 \right\rangle_{s^*}$$
     (You will show that these are identical in the homework.)
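     A numerical sanity check of this asymptotic claim, as a sketch: it assumes a hypothetical Gaussian tuning curve with Poisson counts and compares the variance of the gridded posterior with the predicted $1/(N J(s^*))$, using $J = f'^2/f$ for Poisson counts (derived on a later slide).

     ```python
     import numpy as np

     rng = np.random.default_rng(2)

     def f(s):
         # Hypothetical tuning curve, in expected counts per window.
         return 5.0 + 40.0 * np.exp(-0.5 * (s - 1.0) ** 2)

     def fprime(s):
         return -40.0 * (s - 1.0) * np.exp(-0.5 * (s - 1.0) ** 2)

     s_star, N = 0.5, 200
     counts = rng.poisson(f(s_star), size=N)

     # Unnormalised log posterior on a grid (flat prior, Poisson likelihood).
     s_grid = np.linspace(-1.0, 2.0, 2001)
     ds = s_grid[1] - s_grid[0]
     loglik = np.array([np.sum(counts * np.log(f(s)) - f(s)) for s in s_grid])
     post = np.exp(loglik - loglik.max())
     post /= post.sum() * ds

     # Posterior variance vs. the asymptotic prediction 1 / (N J(s*)).
     mean = (s_grid * post).sum() * ds
     var = ((s_grid - mean) ** 2 * post).sum() * ds
     J = fprime(s_star) ** 2 / f(s_star)
     print(f"posterior var = {var:.2e}, 1/(N J) = {1 / (N * J):.2e}")
     ```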

  14. Cramér-Rao bound
     The Fisher information is important even outside the large-data limit, due to a deeper result of Cramér and Rao. It states that, for any $N$, any unbiased estimator $\hat{s}(\{n_i\})$ of $s$ satisfies
     $$\left\langle \big(\hat{s}(\{n_i\}) - s^*\big)^2 \right\rangle_{n_i \mid s^*} \geq \frac{1}{N J(s^*)}.$$
     Thus the inverse Fisher information gives a lower bound on the variance of any unbiased estimator. This is called the Cramér-Rao bound. (There is also a version for biased estimators.)
     The Fisher information will be our primary tool for quantifying the performance of a population code.
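     A minimal Monte Carlo illustration of the bound (not from the slides), using a Gaussian-mean toy model in which the sample mean is unbiased and actually attains the bound:

     ```python
     import numpy as np

     rng = np.random.default_rng(3)

     # Toy model: n_i ~ N(s, sig^2), for which J(s) = 1/sig^2 per sample.
     s_star, sig, N, trials = 1.0, 2.0, 50, 20000
     samples = rng.normal(s_star, sig, size=(trials, N))
     s_hat = samples.mean(axis=1)   # sample mean: unbiased, efficient

     crb = sig ** 2 / N             # Cramer-Rao bound: 1 / (N J(s*))
     print(f"estimator var = {s_hat.var():.4f}, CR bound = {crb:.4f}")
     ```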

  15. Fisher info and tuning curves
     With $n = r\Delta + \text{noise}$ and $r = f(s)$, the chain rule gives
     $$J(s^*) = \left\langle \left( \frac{d}{ds} \log P(n|s) \right)^2 \right\rangle_{s^*} = \left\langle \left( \frac{d}{d(r\Delta)} \log P(n \mid r\Delta)\; \Delta f'(s^*) \right)^2 \right\rangle_{r\Delta = f(s^*)\Delta} = J_{\text{noise}}\big(f(s^*)\Delta\big)\, \Delta^2 f'(s^*)^2$$
     [Figure: a tuning curve $f(s)$ and the corresponding Fisher information $J(s)$, plotted as firing rate / Fisher info against $s$.]

  16. Fisher info for Poisson neurons
     For Poisson neurons,
     $$P(n \mid f(s)) = \frac{e^{-f(s)} f(s)^n}{n!}$$
     so
     $$J_{\text{noise}}[f(s^*)] = \left\langle \left( \frac{d}{df} \log P(n \mid f(s)) \right)^2 \right\rangle_{f(s^*)} = \left\langle \left( \frac{d}{df} \left[ -f(s) + n \log f(s) - \log n! \right] \right)^2 \right\rangle_{f(s^*)}$$
     $$= \left\langle \left( -1 + n/f(s^*) \right)^2 \right\rangle_{s^*} = \frac{\left\langle (n - f(s^*))^2 \right\rangle_{s^*}}{f(s^*)^2} = \frac{f(s^*)}{f(s^*)^2} = \frac{1}{f(s^*)}$$
     [not surprising: for a Poisson count, $\langle (n - f)^2 \rangle = f = \langle n \rangle$]
     and so $J[s^*] = f'(s^*)^2 / f(s^*)$.
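     The resulting $J(s) = f'(s)^2 / f(s)$ is easy to tabulate along a tuning curve; a sketch with assumed parameter values, illustrating the shape in the figure on the previous slide ($J$ vanishes at the peak of $f$, where $f' = 0$, and is largest on the flanks):

     ```python
     import numpy as np

     r0, rmax, s_pref, sigma = 1.0, 40.0, 0.0, 1.0   # assumed values

     def f(s):
         return r0 + rmax * np.exp(-0.5 * ((s - s_pref) / sigma) ** 2)

     def fprime(s):
         return -rmax * (s - s_pref) / sigma ** 2 * np.exp(-0.5 * ((s - s_pref) / sigma) ** 2)

     for s in np.linspace(-3, 3, 7):
         J = fprime(s) ** 2 / f(s)   # Poisson Fisher information along the curve
         print(f"s = {s:+.1f}: f = {f(s):6.2f}, J = {J:6.2f}")
     # J is zero at the tuning-curve peak and largest on the flanks.
     ```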

  17. Cooperative coding
     [Figure: tuning-curve sketches (firing rate against $s$) for scalar coding, labelled-line coding, and distributed encodings.]

  18. Cooperative coding
     All of these are found in biological systems. Issues:
     1. redundancy and robustness (not scalar)
     2. efficiency (not labelled line)
     3. local computation (not scalar or distributed)
     4. multiple values (not scalar)

  19. Coding in multiple dimensions
     [Figure: Cartesian vs. distributed coding of a two-dimensional stimulus $(s_1, s_2)$.]
     • Cartesian: efficient, but has problems with multiple values.
     • Distributed: can represent multiple values, but may require more neurons.

  20. Cricket cercal system
     Four interneurons with preferred directions $c_a$ satisfying $c_1^\top c_2 = 0$, $c_3 = -c_1$, $c_4 = -c_2$, and rectified cosine tuning to the wind direction $v$:
     $$r_a(s) = r_{\max} [\cos(\theta - \theta_a)]_+ = r_{\max} [c_a^\top v]_+$$
     So, writing $\tilde{r}_a = r_a / r_{\max}$:
     $$\begin{pmatrix} \tilde{r}_1 - \tilde{r}_3 \\ \tilde{r}_2 - \tilde{r}_4 \end{pmatrix} = \begin{pmatrix} c_1^\top \\ c_2^\top \end{pmatrix} v$$
     $$v = (c_1 \ c_2) \begin{pmatrix} \tilde{r}_1 - \tilde{r}_3 \\ \tilde{r}_2 - \tilde{r}_4 \end{pmatrix} = \tilde{r}_1 c_1 + \tilde{r}_3 c_3 + \tilde{r}_2 c_2 + \tilde{r}_4 c_4 = \sum_a \tilde{r}_a c_a$$
     This is called population vector decoding.
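     A small Python sketch of this decoder for the four-cell rectified-cosine code; the preferred directions at 90-degree intervals follow the slide, while the noiseless unit-vector stimulus is an assumption for illustration.

     ```python
     import numpy as np

     # Preferred directions at 90 degree intervals: c3 = -c1, c4 = -c2.
     thetas = np.array([0.0, np.pi / 2, np.pi, 3 * np.pi / 2])
     C = np.stack([np.cos(thetas), np.sin(thetas)], axis=1)   # rows are c_a

     def encode(v):
         """Normalised rates r~_a = [c_a . v]_+ (rectified cosine tuning)."""
         return np.maximum(C @ v, 0.0)

     def decode(r):
         """Population vector: v_hat = sum_a r~_a c_a."""
         return r @ C

     theta = 1.0                                   # wind direction (rad), assumed
     v = np.array([np.cos(theta), np.sin(theta)])  # unit stimulus vector
     v_hat = decode(encode(v))
     print(v, v_hat)   # v_hat recovers v exactly for this noiseless four-cell code
     ```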

  21. Motor cortex (simplified)
     Cosine tuning, with randomly distributed preferred directions. In general, population vector decoding works for:
     • cosine tuning
     • Cartesian or dense (tightly spaced) preferred directions
     But:
     • is it optimal?
     • does it generalise (e.g. to Gaussian tuning curves)?
     • how accurate is it?

  22. Bayesian decoding
     Take $n_a \sim \text{Poisson}[f_a(s)\Delta]$, independently for different cells. Then
     $$P(\mathbf{n} \mid s) = \prod_a \frac{e^{-f_a(s)\Delta} (f_a(s)\Delta)^{n_a}}{n_a!}$$
     and
     $$\log P(s \mid \mathbf{n}) = \sum_a \left[ -f_a(s)\Delta + n_a \log(f_a(s)\Delta) - \log n_a! \right] + \log P(s) + \text{const.}$$
     Assume $\sum_a f_a(s)$ is independent of $s$ (a homogeneous population), and that the prior is flat. Then
     $$\frac{d}{ds} \log P(s \mid \mathbf{n}) = \frac{d}{ds} \sum_a n_a \log(f_a(s)\Delta) = \sum_a n_a \frac{f_a'(s)}{f_a(s)}$$

  23. Bayesian decoding
     Now consider $f_a(s) = e^{-(s - s_a)^2 / 2\sigma^2}$, so that $f_a'(s) = -\frac{(s - s_a)}{\sigma^2}\, e^{-(s - s_a)^2 / 2\sigma^2}$, and set the derivative to zero:
     $$\sum_a n_a (s - s_a)/\sigma^2 = 0 \quad \Rightarrow \quad \hat{s}_{\text{MAP}} = \frac{\sum_a n_a s_a}{\sum_a n_a}$$
     So the MAP estimate is a population average of preferred values: similar to, but not exactly, a population vector.
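     A sketch of this closed-form MAP decoder for a hypothetical homogeneous population; the centres, sigma, r_max, and Delta are made-up values chosen so that $\sum_a f_a(s)$ is approximately constant over the region of interest.

     ```python
     import numpy as np

     rng = np.random.default_rng(4)

     # Hypothetical homogeneous population: evenly spaced Gaussian tuning curves.
     centres = np.linspace(-5, 5, 41)
     sigma, rmax, delta = 1.0, 30.0, 0.2

     def rates(s):
         return rmax * np.exp(-0.5 * ((s - centres) / sigma) ** 2)

     s_true = 0.7
     n = rng.poisson(rates(s_true) * delta)   # independent Poisson spike counts

     # Closed-form MAP estimate for this model: spike-count-weighted mean of centres.
     s_map = (n * centres).sum() / n.sum()
     print(f"s_true = {s_true}, s_MAP = {s_map:.3f}")
     ```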

  24. Population Fisher info
     Fisher informations for independent random variates add:
     $$J_{\mathbf{n}}(s) = \left\langle -\frac{d^2}{ds^2} \log P(\mathbf{n} \mid s) \right\rangle = \left\langle -\frac{d^2}{ds^2} \sum_a \log P(n_a \mid s) \right\rangle = \sum_a \left\langle -\frac{d^2}{ds^2} \log P(n_a \mid s) \right\rangle = \sum_a J_{n_a}(s).$$
     For the independent Poisson population above, this gives
     $$J_{\mathbf{n}}(s) = \Delta \sum_a \frac{f_a'(s)^2}{f_a(s)}$$
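     The additive formula is easy to evaluate numerically; a sketch reusing the hypothetical population from the previous example (note $J$ is roughly constant in $s$ for a dense, uniform population):

     ```python
     import numpy as np

     centres = np.linspace(-5, 5, 41)
     sigma, rmax, delta = 1.0, 30.0, 0.2   # assumed values

     def J_pop(s):
         """Population Fisher information: Delta * sum_a f_a'(s)^2 / f_a(s)."""
         f = rmax * np.exp(-0.5 * ((s - centres) / sigma) ** 2)
         fp = -(s - centres) / sigma ** 2 * f
         return delta * (fp ** 2 / f).sum()

     for s in (-2.0, 0.0, 2.0):
         print(f"s = {s:+.1f}: J = {J_pop(s):.1f}, CR bound on std = {J_pop(s) ** -0.5:.3f}")
     ```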

  25. Optimal tuning properties
     A considerable amount of work has been done in recent years on finding optimal properties of tuning curves for rate-based population codes. Here we reproduce one such argument (from Zhang and Sejnowski, 1999).
     Consider a population of cells that codes the value of a D-dimensional stimulus $s$. Let the $a$-th cell emit $r$ spikes in an interval $\tau$ with a probability distribution that is conditionally independent of the other cells (given $s$) and has the form
     $$P_a(r \mid s, \tau) = S(r, f_a(s), \tau).$$
     The tuning curve of the $a$-th cell, $f_a(s)$, has the form
     $$f_a(s) = F \cdot \phi\big( (\xi^a)^2 \big), \qquad (\xi^a)^2 = \sum_{i=1}^{D} (\xi_i^a)^2, \qquad \xi_i^a = \frac{s_i - c_i^a}{\sigma},$$
     where $F$ is a maximal rate and the function $\phi$ is monotonically decreasing. The parameters $c^a$ and $\sigma$ give the centre of the $a$-th tuning curve and the (common) width.
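     For concreteness, a sketch of such a radially symmetric tuning curve; the Gaussian-shaped phi is one admissible monotonically decreasing choice, and all parameter values are assumptions.

     ```python
     import numpy as np

     def make_tuning(c_a, sigma, F=50.0, phi=lambda z: np.exp(-0.5 * z)):
         """Radially symmetric tuning curve f_a(s) = F * phi(sum_i ((s_i - c_i^a)/sigma)^2)."""
         c_a = np.asarray(c_a, dtype=float)
         def f_a(s):
             xi2 = (((np.asarray(s, dtype=float) - c_a) / sigma) ** 2).sum()
             return F * phi(xi2)
         return f_a

     f = make_tuning(c_a=[0.0, 0.0], sigma=1.0)   # a D = 2 cell centred at the origin
     print(f([0.0, 0.0]), f([1.0, 0.0]))           # rate falls off with distance from the centre
     ```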
