

SLIDE 1

Statistical challenges and opportunities in neural data analysis

Liam Paninski

Department of Statistics
Center for Theoretical Neuroscience
Grossman Center for the Statistics of Mind
Columbia University
http://www.stat.columbia.edu/~liam
liam@stat.columbia.edu
November 8, 2013

— with Y. Ahmadian, J. Huggins, S. Keshri, T. Machado, T. Paige, A. Pakman, D. Pfau, E. Pnevmatikakis, B. Shababo, C. Smith, J. Vogelstein Support: NIH/NSF CRCNS, Sloan, NSF CAREER, DARPA, ARO, McKnight.

SLIDE 2

A golden age of statistical neuroscience

Some notable recent developments:

  • machine learning / statistics / optimization methods for extracting information from high-dimensional data in a computationally tractable, systematic fashion
  • computing (Moore’s law, massively parallel computing)
  • optical and optogenetic methods for recording from and perturbing neuronal populations, at multiple scales
  • large-scale, high-density multielectrode recordings
  • growing acceptance that many fundamental neuroscience questions are in fact statistics questions in disguise

SLIDE 3

Circuit inference via optical methods

SLIDE 4

Aim 1: Model-based estimation of spike rates

Note: each component here can be generalized easily.

SLIDE 5

Fast maximum a posteriori (MAP) estimation

Start by writing out the posterior:

log p(C|F) = log p(C) + log p(F|C) + const.
           = Σ_t log p(C_{t+1}|C_t) + Σ_t log p(F_t|C_t) + const.

Three basic observations:

  • If log p(C_{t+1}|C_t) and log p(F_t|C_t) are concave, then so is log p(C|F).
  • The Hessian H of log p(C|F) is tridiagonal: log p(F_t|C_t) contributes a diagonal term, and log p(C_{t+1}|C_t) contributes a tridiagonal term (Paninski et al., 2010).
  • C is a linear function of n.

Newton’s method: iteratively solve H C_dir = ∇. A tridiagonal solver requires only O(T) time. The nonnegativity constraint n_t ≥ 0 can be included via a log-barrier (Koyama and Paninski, 2010), enabling real-time deconvolution (Vogelstein et al., 2010).
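The pieces above can be sketched in code. The following is a minimal interior-point MAP deconvolution for a single neuron under an AR(1) calcium model with Gaussian observations; the model, barrier schedule, and parameter values are illustrative assumptions rather than the published implementation, but it shows the key point: each Newton step is a symmetric tridiagonal solve, so the cost per iteration is O(T).

```python
import numpy as np
from scipy.linalg import solveh_banded

def deconvolve_map(F, gamma=0.95, sigma=0.2, lam=1.0,
                   n_outer=8, n_newton=15):
    """Interior-point MAP deconvolution sketch for one neuron.
    AR(1) calcium model: n[0] = C[0], n[t] = C[t] - gamma*C[t-1] >= 0.
    Minimizes 0.5/sigma^2 * ||F - C||^2 + lam*sum(n) - mu*sum(log(n)),
    shrinking the barrier weight mu. Each Newton step solves a
    symmetric tridiagonal system, so the cost per iteration is O(T)."""
    T = len(F)

    def diffs(C):
        n = C.copy()
        n[1:] -= gamma * C[:-1]
        return n

    def objective(C, mu):
        n = diffs(C)
        if np.any(n <= 0):
            return np.inf
        return (0.5 / sigma**2) * np.sum((F - C)**2) \
            + lam * np.sum(n) - mu * np.sum(np.log(n))

    # strictly feasible start: constant positive spikes run through the AR(1)
    C = np.empty(T)
    c = 0.0
    for t in range(T):
        c = gamma * c + 0.1
        C[t] = c

    mu = 1.0
    for _ in range(n_outer):
        for _ in range(n_newton):
            n = diffs(C)
            v = lam - mu / n
            # gradient: (C - F)/sigma^2 + D^T (lam - mu/n)
            g = (C - F) / sigma**2 + v
            g[:-1] -= gamma * v[1:]
            # Hessian: I/sigma^2 + mu * D^T diag(1/n^2) D  (tridiagonal)
            w = mu / n**2
            diag = w.copy()
            diag[:-1] += gamma**2 * w[1:]
            diag += 1.0 / sigma**2
            ab = np.zeros((2, T))          # banded storage, upper form
            ab[0, 1:] = -gamma * w[1:]     # superdiagonal
            ab[1] = diag
            step = solveh_banded(ab, -g)   # O(T) Newton direction
            # backtrack to stay strictly feasible and decrease the objective
            f0, s = objective(C, mu), 1.0
            while s > 1e-12 and objective(C + s * step, mu) > f0:
                s *= 0.5
            if objective(C + s * step, mu) <= f0:
                C = C + s * step
        mu *= 0.2                          # shrink the barrier
    return C, diffs(C)
```

The log-barrier keeps n strictly positive throughout, and shrinking mu tightens the approximation to the constrained MAP path.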

SLIDE 6

Example: nonnegative MAP filtering

SLIDE 7

Multineuronal case: spatiotemporal demixing

[Figure: data matrix Y, voxel # × timestep.]

Model: Y = C + ε, with

C(x, t) = Σ_{i=1}^{r} s_i(x) f_i(t),   f_i(t + dt) = (1 − dt/τ_i) f_i(t) + n_i(t)

Goal: infer the low-rank matrix C from the noisy Y. The rank r = the number of visible neurons. Additional structure: the jumps in f_i(t) are nonnegative. Locally rank-penalized convex optimization with nonnegativity constraints infers C, followed by iterative matrix factorization under nonnegativity constraints to infer s_i(x) and f_i(t) (Pnevmatikakis et al., 2013). Examples: Machado, Lacefield, Shababo.

SLIDE 8

Compressed sensing imaging

Idea: instead of raster scans, take randomized projections in each frame.

(from Studer et al, 2011)

Estimating C given the randomized projections Y can be cast as a similar convex optimization.
SLIDE 9

Compressed sensing imaging

[Figure: spike raster (neuron id × frame), true vs. estimated fluorescence traces, and an example trace (true vs. estimated).]

2 measurements per timestep (30x undersampling); Pnevmatikakis et al (2013)

SLIDE 10

Compressed sensing imaging

[Figure: spike raster (neuron id × frame), true vs. estimated fluorescence traces, and an example trace (true vs. estimated).]

4 measurements per timestep (15x undersampling); Pnevmatikakis et al (2013)

SLIDE 11

Compressed sensing imaging

[Figure: spike raster (neuron id × frame), true vs. estimated fluorescence traces, and an example trace (true vs. estimated).]

8 measurements per timestep (7.5x undersampling); Pnevmatikakis et al (2013)

SLIDE 12

Phase transitions in decoding accuracy

New tool: “statistical dimension” (Amelunxen, Lotz, McCoy, Tropp ’13). An interesting feature of this problem: the phase transition depends on the pattern of spikes, not just on sparsity (as in the standard LASSO problem).

SLIDE 13

Fully Bayesian approaches

Can we sample over {n_i(t)} instead? In general this is challenging: a high-dimensional binary vector with not much structure to exploit. Currently exploring Hamiltonian Monte Carlo (HMC) methods. Two tricks:

  • Piecewise log-quadratic (PLQ) densities (e.g., truncated multivariate normals) are easy to sample from using exact integration of the Hamiltonian dynamics; no step-size parameter is needed (Pakman and Paninski, ’13a). Can use similar O(T) tricks as before.
  • Arbitrary binary vectors or spike-and-slab posteriors can be embedded in a PLQ density via a simple augmented-variable approach (Pakman and Paninski, ’13b).
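The first trick can be illustrated in one dimension. For a standard normal truncated to x ≥ 0, the Hamiltonian trajectories are exact sinusoids, so the dynamics can be integrated analytically and the wall handled by momentum reflection. The sketch below is a 1-d toy version of the idea, with illustrative parameters.

```python
import numpy as np

def exact_hmc_truncated_normal(n_samples, x0=1.0, travel=np.pi / 2, seed=0):
    """Exact HMC for a standard normal truncated to x >= 0 (1-d toy
    illustration of the idea in Pakman & Paninski '13a). With
    H = (x^2 + p^2)/2, Hamilton's equations give x(t) = x cos t + p sin t,
    integrated exactly (no step-size parameter); when the trajectory hits
    the wall x = 0, the momentum is reflected."""
    rng = np.random.default_rng(seed)
    x = x0
    out = np.empty(n_samples)
    for i in range(n_samples):
        p = rng.standard_normal()            # momentum refresh
        t_left = travel
        while True:
            # x(t) = r cos(t - phi); the wall x = 0 is hit at t = phi + pi/2
            phi = np.arctan2(p, x)
            t_hit = phi + np.pi / 2
            if t_hit >= t_left:              # no wall hit before time runs out
                x, p = (x * np.cos(t_left) + p * np.sin(t_left),
                        -x * np.sin(t_left) + p * np.cos(t_left))
                break
            # advance to the wall, then reflect the momentum
            p_wall = -x * np.sin(t_hit) + p * np.cos(t_hit)
            x, p = 0.0, -p_wall
            t_left -= t_hit
        out[i] = x
    return out
```

With travel time π/2 and the momentum refreshed each iteration, successive samples here are essentially independent draws from the half-normal distribution.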

SLIDE 14

Exact HMC truncated spike-and-slab sampling

(Pakman and Paninski ‘13b)

SLIDE 15

Aim 2: estimating network connectivity

Coupled GLM structure; concave loglikelihoods, optimization is straightforward (Paninski, 2004; Pillow et al., 2008).

SLIDE 16

The dreaded common input problem

How to distinguish direct connectivity from common input?

(from Nykamp ‘07)

Previous work (e.g., Vidne et al, 2012) modeled common input terms explicitly as latent variables; works well given enough a priori information, but not a general solution.

SLIDE 17

A “shotgun sampling” approach

We can only observe K cells at a time. Idea: don’t observe the same subset of K cells throughout the experiment. Instead, observe as many different K-subsets as possible. Hard with multi-electrode arrays; easy with imaging approaches. Statistics problem: how to patch together all of the estimated subnetworks? Want to integrate over {ni(t)}, but scaling to large networks is a big challenge.

SLIDE 18

Approximate sufficient statistics in large Poisson regressions

Model: n_{i,t} ∼ Poiss(λ_{i,t}),  λ_{i,t} = exp(b_i + W_i n_{t−1})

LL_i = Σ_t n_{i,t} (b_i + W_i n_{t−1}) − Σ_t exp(b_i + W_i n_{t−1})

Idea: use a CLT approximation for the second term (W_i n_{t−1} is a big sum; appeal to Diaconis-Freedman). Dramatic simplification: the profile approximate log-likelihood is quadratic! (Ramirez and Paninski ’13)

Approximate sufficient statistics: E(n_t), E(n_t n_{t−1}ᵀ). These can be estimated from just the observed data; there is no need to impute the unobserved {n_{i,t}}.
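A sketch of the idea: the linear term of LL_i is exact given Σ_t n_{i,t} and Σ_t n_{i,t} n_{t−1}, and the second term is approximated by treating W_i n_{t−1} as Gaussian, so only moments of the data enter the objective. The code below is an illustrative reconstruction, not the Ramirez and Paninski ’13 implementation, and the optimizer choice is an assumption.

```python
import numpy as np
from scipy.optimize import minimize

def fit_row_approx(n, i):
    """Fit (b_i, W_i) of the Poisson GLM n_{i,t} ~ Poiss(exp(b_i + W_i n_{t-1}))
    from moment statistics only. The linear log-likelihood term is exact given
    sum_t n_{i,t} and sum_t n_{i,t} n_{t-1}; for the second term, W_i n_{t-1}
    is treated as Gaussian (CLT), so E exp(z) = exp(E z + Var(z)/2)."""
    X, y = n[:-1], n[1:, i]           # predictors n_{t-1}, responses n_{i,t}
    T, N = X.shape
    s1 = y.sum()                       # sum_t n_{i,t}
    s2 = y @ X                         # sum_t n_{i,t} n_{t-1}
    mu = X.mean(axis=0)                # E(n_{t-1})
    Sig = np.cov(X.T)                  # cov(n_{t-1})

    def neg_ll(theta):
        b, w = theta[0], theta[1:]
        lin = b * s1 + w @ s2                    # exact linear term
        arg = b + w @ mu + 0.5 * w @ Sig @ w     # CLT / log-normal moment
        return -(lin - T * np.exp(np.minimum(arg, 50.0)))  # clip for safety

    res = minimize(neg_ll, np.zeros(N + 1), method="L-BFGS-B")
    return res.x[0], res.x[1:]
```

The approximate negative log-likelihood is convex in (b_i, W_i), so a standard quasi-Newton solver suffices, and the data appear only through the moment statistics.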

SLIDE 19

Simulated “shotgun” results

[Figure: inferred weight matrices. (a) True weights; (b) all observed; (c) random varying subset observed; (d) fixed subset observed.]

K = 20% of network size; spike-and-slab priors (Keshri et al, 2013)

SLIDE 20

Aim 3: Optimal control of spike timing

To test our results, we want to perturb the network at will. How can we make a neuron fire exactly when we want it to? Assume bounded inputs; otherwise the problem is trivial.

Start with a simple model:

λ_t = f(V_t + h_t)
V_{t+dt} = V_t + dt(−g V_t + a I_t) + √dt σ ε_t,  ε_t ∼ N(0, 1)

Now we can just optimize the likelihood of the desired spike train as a function of the input I_t, with I_t bounded. Concave objective function over a convex set of possible inputs I_t, and the Hessian is tridiagonal ⇒ O(T) optimization. Again, this can be done in real time (Ahmadian et al., 2011), though some open challenges remain when I_t is high-dimensional and spatiotemporal.
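A first-order sketch of the control problem: with deterministic mean dynamics and an exponential nonlinearity, the log-likelihood of a desired spike train is concave in the bounded input, so even projected gradient ascent works. All parameters below are illustrative, and the tridiagonal Newton method of Ahmadian et al. (2011) is replaced by this simpler scheme.

```python
import numpy as np

def design_input(y, alpha=0.9, a=0.5, h=-2.0, I_max=5.0,
                 n_iter=400, lr=0.005):
    """Sketch of bounded-input control of spike timing. Discrete-time mean
    dynamics V_{t+1} = alpha*V_t + a*I_t (dt absorbed into alpha, a) and
    conditional intensity lambda_t = exp(V_t + h). Maximizes the Poisson
    log-likelihood of the desired spike train y over inputs |I_t| <= I_max
    by projected gradient ascent."""
    T = len(y)
    I = np.zeros(T)

    def forward(I):
        V = np.empty(T)
        v = 0.0
        for t in range(T):
            V[t] = v                 # V_t depends on I_0..I_{t-1}
            v = alpha * v + a * I[t]
        return V

    def loglik(I):
        V = forward(I)
        return float(np.sum(y * (V + h) - np.exp(V + h)))

    lls = [loglik(I)]
    for _ in range(n_iter):
        V = forward(I)
        gV = y - np.exp(V + h)       # dLL/dV_t for the exp nonlinearity
        # adjoint recursion: dLL/dI_t = a * p_{t+1}, p_t = gV_t + alpha*p_{t+1}
        gI = np.empty(T)
        p = 0.0
        for t in range(T - 1, -1, -1):
            gI[t] = a * p
            p = gV[t] + alpha * p
        I = np.clip(I + lr * gI, -I_max, I_max)   # projected ascent step
        lls.append(loglik(I))
    return I, lls
```

Because V is a linear (causal) function of I, the adjoint recursion computes the exact gradient in O(T) per iteration, and clipping to [−I_max, I_max] implements the bound constraint.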

SLIDE 21

Applications

  • sensory prosthetics, e.g. retinal prosthetics
  • online adaptive experimental design: choose stimuli which provide as much information about the network as possible. A major problem here is updating sparse posteriors; factorized approximations to spike-and-slab posteriors are effective in this problem (Shababo, Paige et al., ’13)

SLIDE 22

Extension: Connectivity at the dendritic scale

Ramon y Cajal, 1888.

SLIDE 23

The filtering problem

Spatiotemporal imaging data opens an exciting window on the computations performed by single neurons, but we have to deal with noise and intermittent observations.

SLIDE 24

Basic paradigm: compartmental models

  • write neuronal dynamics in terms of equivalent nonlinear, time-varying RC circuits
  • leads to a coupled system of stochastic differential equations
SLIDE 25

Simplest case: Kalman filter

Dynamics and observation equations:

dV/dt = A V + ε_t
y_t = B_t V + η_t

V_i(t) = voltage at compartment i
A = cable dynamics matrix: includes leak terms (A_ii = −g_l) and intercompartmental terms (A_ij = 0 unless compartments i and j are adjacent)
B_t = observation matrix: the point-spread function of the microscope

Even this case is challenging, since d = dim(V) is very large: the standard Kalman filter requires O(d³) computation per timestep (matrix inversion).
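For concreteness, a textbook Kalman filter on this model looks as follows; the dense O(d³) covariance update per timestep is exactly the bottleneck motivating the low-rank methods on the next slide. The matrix choices in the sketch are illustrative.

```python
import numpy as np

def kalman_filter(Y, Bs, A, Q, R):
    """Standard Kalman filter for the discretized compartmental model
    V_{t+1} = A V_t + eps_t (eps ~ N(0, Q)), y_t = B_t V_t + eta_t
    (eta ~ N(0, R)). The dense covariance update costs O(d^3) per
    timestep; this is the bottleneck that the low-rank approximations
    on the following slide avoid."""
    d = A.shape[0]
    mu, P = np.zeros(d), np.eye(d)      # prior mean/covariance (illustrative)
    means, covs = [], []
    for y, B in zip(Y, Bs):
        mu = A @ mu                     # predict
        P = A @ P @ A.T + Q
        S = B @ P @ B.T + R             # innovation covariance
        K = P @ B.T @ np.linalg.inv(S)  # Kalman gain
        mu = mu + K @ (y - B @ mu)      # update
        P = (np.eye(d) - K @ B) @ P
        means.append(mu.copy())
        covs.append(P.copy())
    return np.array(means), np.array(covs)
```

Note that B_t changes every timestep here, mirroring the intermittent, low-dimensional observations typical of imaging experiments.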

SLIDE 26

Low-rank approximations

Key fact: current experimental methods provide just a few low-SNR observations per time step.

Basic idea: if the dynamics are approximately linear and time-invariant, we can approximate the Kalman covariance C_t = cov(q_t|Y_{1:t}) as a perturbation of the marginal covariance, C_t ≈ C_0 + U_t D_t U_tᵀ, with C_0 = lim_{t→∞} cov(q_t).

C_0 is the solution to a Lyapunov equation. It turns out that we can solve linear equations involving C_0 in O(dim(q)) time via Gaussian belief propagation, using the fact that the dendrite is a tree. The necessary recursions — i.e., updating U_t, D_t and the Kalman mean E(q_t|Y_{1:t}) — involve linear manipulations of C_0, using

C_t = [(A C_{t−1} Aᵀ + Q)^{−1} + B_t]^{−1}, i.e.,
C_0 + U_t D_t U_tᵀ = {[A (C_0 + U_{t−1} D_{t−1} U_{t−1}ᵀ) Aᵀ + Q]^{−1} + B_t}^{−1},

and can be done in O(dim(q)) time (Paninski, 2010). Generalizable to many other state-space models (Pnevmatikakis and Paninski, 2011). Examples: speckle, vertical.

SLIDE 27

Application: synaptic locations/weights

SLIDE 28

Application: synaptic locations/weights

SLIDE 29

Application: synaptic locations/weights

Including known input terms: dV/dt = A V(t) + W U(t) + ε(t); U(t) are the known presynaptic spike times, and we want to detect which compartments are connected (i.e., infer the weight matrix W). The log-likelihood is quadratic and W is a sparse vector, so the L1-penalized log-likelihood can be optimized efficiently with a homotopy (LARS) approach. Total computation time: O(dTk); d = # compartments, T = # timesteps, k = # nonzero weights.
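The homotopy/LARS path algorithm itself is involved; as a sketch, the same L1-penalized quadratic objective can be minimized with plain iterative soft-thresholding (ISTA), which already shows the sparse weight recovery. The design matrix and penalty below are illustrative.

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of t*||.||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def lasso_ista(X, y, lam, n_iter=500):
    """Minimize 0.5*||y - X w||^2 + lam*||w||_1 by iterative
    soft-thresholding (ISTA): a simple stand-in for the homotopy/LARS
    solver mentioned on the slide. Step size 1/L, with L = ||X||_2^2
    the Lipschitz constant of the smooth part's gradient."""
    L = np.linalg.norm(X, 2) ** 2
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        w = soft_threshold(w - X.T @ (X @ w - y) / L, lam / L)
    return w
```

On synthetic regressions with a few true nonzero weights, the L1 penalty zeroes out the spurious coordinates and leaves the true support, which is the behavior exploited for synaptic-weight detection.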

SLIDE 30

Example: real neural geometry

700 timesteps observed; 40 random compartments (of > 2000) observed per timestep. Compressed sensing measurements improve accuracy further (Pakman et al., 2013).

SLIDE 31

Conclusions

  • Modern statistical approaches provide flexible, powerful methods for answering key questions in neuroscience — many of these problems are statistics problems in disguise
  • Close relationships between biophysics, statistical modeling, and experimental design
  • Modern optimization methods make the computations very tractable and suitable for closed-loop experiments
  • Dimensionality reduction is a key area for further research: new methods, new ideas
  • Experimental methods are progressing rapidly; many new challenges and opportunities for breakthroughs based on statistical ideas

SLIDE 32

References

Ahmadian, Y., Packer, A. M., Yuste, R., and Paninski, L. (2011). Designing optimal stimuli to control neuronal spike timing. Journal of Neurophysiology, 106(2):1038–1053.

Djurisic, M., Antic, S., Chen, W. R., and Zecevic, D. (2004). Voltage imaging from dendrites of mitral cells: EPSP attenuation and spike trigger zones. J. Neurosci., 24(30):6703–6714.

Knopfel, T., Diez-Garcia, J., and Akemann, W. (2006). Optical probing of neuronal circuit dynamics: genetically encoded versus classical fluorescent sensors. Trends in Neurosciences, 29:160–166.

Koyama, S. and Paninski, L. (2010). Efficient computation of the MAP path and parameter estimation in integrate-and-fire and more general state-space models. Journal of Computational Neuroscience, 29:89–105.

Paninski, L. (2004). Maximum likelihood estimation of cascade point-process neural encoding models. Network: Computation in Neural Systems, 15:243–262.

Paninski, L. (2010). Fast Kalman filtering on quasilinear dendritic trees. Journal of Computational Neuroscience, 28:211–228.

Paninski, L., Ahmadian, Y., Ferreira, D., Koyama, S., Rahnama, K., Vidne, M., Vogelstein, J., and Wu, W. (2010). A new look at state-space models for neural data. Journal of Computational Neuroscience, 29:107–126.

Pillow, J., Shlens, J., Paninski, L., Sher, A., Litke, A., Chichilnisky, E., and Simoncelli, E. (2008). Spatiotemporal correlations and visual signaling in a complete neuronal population. Nature, 454:995–999.

Pnevmatikakis, E. and Paninski, L. (2011). Fast interior-point inference in high-dimensional sparse, penalized state-space models. Submitted.

Vogelstein, J., Packer, A., Machado, T., Sippy, T., Babadi, B., Yuste, R., and Paninski, L. (2010). Fast non-negative deconvolution for spike train inference from population calcium imaging. J. Neurophys., 104:3691–3704.