Topics in Brain Computer Interfaces (CS295-7): Kalman Filtering


SLIDE 1

Topics in Brain Computer Interfaces
CS295-7
Professor: Michael Black
TA: Frank Wood
Spring 2005

Kalman Filtering

SLIDE 2

Approximation: Linear Gaussian (generative) model

Observation model:

  $z_t = H x_t + q_t, \qquad q_t \sim \mathcal{N}(0, Q)$

The full covariance matrix $Q$ models correlations between cells. $H$ models how firing rates relate to the full kinematic state (position, velocity, and acceleration).
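For concreteness, a minimal NumPy sketch of sampling a firing-rate observation from this generative model (the dimensions and the tuning matrix H below are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
C, d = 42, 6                        # assumed: 42 cells, 6-D kinematics (pos/vel/acc in x and y)
H = rng.normal(size=(C, d))         # hypothetical tuning matrix
L = rng.normal(size=(C, C))
Q = L @ L.T + C * np.eye(C)         # full covariance: correlated noise across cells

x_t = rng.normal(size=d)            # some kinematic state
q_t = rng.multivariate_normal(np.zeros(C), Q)
z_t = H @ x_t + q_t                 # simulated firing-rate observation
```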

SLIDE 3

Bayesian Inference

  $p(\text{kinematics} \mid \text{firing}) = \frac{p(\text{firing} \mid \text{kinematics})\, p(\text{kinematics})}{p(\text{firing})}$

• Posterior: the a posteriori probability (after the evidence).
• Likelihood: the evidence.
• Prior: the a priori probability (before the evidence).
• $p(\text{firing})$: a normalization constant (independent of the kinematics).

We infer hand kinematics from uncertain evidence and our prior knowledge of how hands move.

SLIDE 4

Multi-Modal Likelihood

[Figure: a multi-modal likelihood over the state (e.g. position).]

How can we represent this?

SLIDE 5

Non-Parametric Approximation

We could sample at regular intervals. Problems?

• How finely to discretize?
• In a high-dimensional space, discretization is impractical.
• Most samples have low probability: wasted computation.

SLIDE 6

Factored Sampling

Represent the distribution by a set of weighted samples:

  $S = \{(x_t^{(i)}, w^{(i)});\; i = 1 \ldots N\}$

with normalized likelihood weights:

  $w^{(i)} = \frac{p(z_t \mid x_t^{(i)})}{\sum_{n=1}^{N} p(z_t \mid x_t^{(n)})}$
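A minimal sketch of computing these normalized weights with NumPy, assuming a user-supplied likelihood function (the name `likelihood(z, x)` is hypothetical):

```python
import numpy as np

def factored_sampling_weights(samples, z, likelihood):
    """Normalize per-sample likelihoods p(z | x^(i)) into weights summing to 1."""
    p = np.array([likelihood(z, x) for x in samples])
    return p / p.sum()
```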

SLIDE 7

Monte-Carlo Sampling

Given a weighted sample set

  $S = \{(x^{(i)}, w^{(i)});\; i = 1 \ldots N\}$,

build the cumulative distribution of the weights. To draw a sample, pick $u \sim \text{Uniform}(0, 1]$ and select the first sample whose cumulative weight exceeds $u$.
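A sketch of this selection step (assuming `samples` is an (N, d) NumPy array); `np.searchsorted` performs exactly the "first cumulative weight exceeding u" lookup:

```python
import numpy as np

def resample(samples, weights, rng=np.random.default_rng()):
    """Draw N samples (with replacement) using the cumulative weight distribution."""
    cdf = np.cumsum(weights)
    u = rng.random(len(samples))
    idx = np.minimum(np.searchsorted(cdf, u), len(samples) - 1)
    return samples[idx]
```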

SLIDE 8

Bayesian Tracking

  $p(x_t \mid Z_t) = \kappa\, p(z_t \mid x_t) \int p(x_t \mid x_{t-1})\, p(x_{t-1} \mid Z_{t-1})\, dx_{t-1}$

• $p(x_t \mid Z_t)$: posterior over model parameters given the observation sequence.
• $p(z_t \mid x_t)$: likelihood of observing the firing rates given the hand kinematics.
• $p(x_t \mid x_{t-1})$: temporal model (prior).
• $p(x_{t-1} \mid Z_{t-1})$: posterior from the previous time instant.
• The integral is computed by Monte Carlo integration.

SLIDE 9

Particle Filter (Isard & Blake '96)

Begin with the posterior from the previous time step, $p(x_{t-1} \mid Z_{t-1})$, represented by weighted particles.

slide-10 marker removed

SLIDE 10

Particle Filter (Isard & Blake '96)

Sample particles from the posterior $p(x_{t-1} \mid Z_{t-1})$.

SLIDE 11

Particle Filter (Isard & Blake '96)

Propagate each sample through the temporal dynamics by sampling from $p(x_t \mid x_{t-1})$.

SLIDE 12

Particle Filter (Isard & Blake '96)

Weight each propagated particle by the likelihood $p(z_t \mid x_t)$.

SLIDE 13

Particle Filter (Isard & Blake '96)

Normalize the weights; the weighted particles represent the new posterior $p(x_t \mid Z_t)$.

SLIDE 14

Pseudocode

condense1step

% generate the cumulative distribution for the posterior at t-1
% generate a vector of uniform random numbers (one per particle)
% if the number is greater than refreshRate then
%   generate a vector of uniform random numbers
%   use these to search the cumulative probability
%   find the indices of the corresponding particles
%   for each of these particles, predict the new state (e.g. add Gaussian noise)
%   for each of these new states, compute the log likelihood
% else generate a particle at random and compute its log likelihood
% find the maximum log likelihood and subtract it from all the other log likelihoods
% construct the posterior at time t by exponentiating all the log likelihoods and normalizing so they sum to 1
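A runnable NumPy sketch of this condensation step (my interpretation of the pseudocode above; the flat state array, noise scale, and `loglik` callback are assumptions):

```python
import numpy as np

def condense_1step(particles, weights, loglik, noise_std=0.1,
                   refresh_rate=0.05, rng=np.random.default_rng()):
    """One CONDENSATION step: resample, diffuse, and reweight the particles."""
    N, d = particles.shape
    cdf = np.cumsum(weights)                         # cumulative distribution at t-1
    keep = rng.random(N) > refresh_rate              # resampled vs. refreshed particles
    new = np.empty_like(particles)
    idx = np.minimum(np.searchsorted(cdf, rng.random(keep.sum())), N - 1)
    new[keep] = particles[idx] + noise_std * rng.standard_normal((keep.sum(), d))
    new[~keep] = rng.standard_normal(((~keep).sum(), d))   # random restarts
    log_lik = np.array([loglik(x) for x in new])
    w = np.exp(log_lik - log_lik.max())              # subtract max for numerical stability
    return new, w / w.sum()
```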

SLIDE 15

Particle Filter

[Particle-filter tracking demo by Michael Isard.]

SLIDE 16

Linear Gaussian Likelihood

Generative model for the observation:

  $z_k = H_k x_k + q_k, \qquad k = 1, 2, \ldots, M,$

where $H_k \in \mathbb{R}^{c \times d}$ and $q_k \sim \mathcal{N}(0, Q_k)$, $Q_k \in \mathbb{R}^{c \times c}$.

SLIDE 17

Explicit Form

The conditional probability has the explicit form

  $p(z_k \mid x_k) = \frac{1}{(2\pi)^{c/2} \det(Q_k)^{1/2}} \exp\!\left(-\tfrac{1}{2}(z_k - H_k x_k)^T Q_k^{-1} (z_k - H_k x_k)\right)$

The likelihood model is equivalent to

  $z_k \mid x_k \sim \mathcal{N}(H_k x_k, Q_k)$
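As a sanity check, this density can be evaluated directly, e.g. with SciPy (the matrices below are made-up illustrations):

```python
import numpy as np
from scipy.stats import multivariate_normal

H = np.array([[1.0, 0.5],
              [0.2, 1.0],
              [0.0, 0.3]])        # hypothetical: c = 3 cells, d = 2 state dims
Q = 0.5 * np.eye(3)
x = np.array([1.0, -0.5])
z = np.array([1.1, -0.2, -0.1])   # some observed firing rates

lik = multivariate_normal(mean=H @ x, cov=Q).pdf(z)   # p(z_k | x_k)
```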

SLIDE 18

Linear Gaussian Temporal Prior

Temporal prior on the state:

  $x_k = A_k x_{k-1} + w_k, \qquad k = 2, 3, \ldots, M,$

where $A_k \in \mathbb{R}^{d \times d}$ and $w_k \sim \mathcal{N}(0, W_k)$, $W_k \in \mathbb{R}^{d \times d}$.

SLIDE 19

Explicit Form

The conditional probability has the explicit form

  $p(x_{k+1} \mid x_k) = \frac{1}{(2\pi)^{d/2} \det(W_k)^{1/2}} \exp\!\left(-\tfrac{1}{2}(x_{k+1} - A_k x_k)^T W_k^{-1} (x_{k+1} - A_k x_k)\right)$

The prior model is equivalent to

  $x_{k+1} \mid x_k \sim \mathcal{N}(A_k x_k, W_k)$

SLIDE 20

Kalman Filter Model

Definition:

  System equation:      $x_k = A_k x_{k-1} + w_k, \quad w_k \sim \mathcal{N}(0, W_k), \quad k = 2, 3, \ldots$

  Measurement equation: $z_k = H_k x_k + q_k, \quad q_k \sim \mathcal{N}(0, Q_k), \quad k = 1, 2, \ldots$

Assumption: all random variables have Gaussian distributions and are linearly related.

SLIDE 21

Linear Gaussian Model

  $p(x_t \mid Z_t) = \kappa\, p(z_t \mid x_t) \int p(x_t \mid x_{t-1})\, p(x_{t-1} \mid Z_{t-1})\, dx_{t-1}$

Some basic facts about Gaussians:

• The marginal of a Gaussian is Gaussian.
• A Gaussian times a Gaussian is Gaussian: if $p(x_1) = \mathcal{N}(\mu_1, \Sigma_1)$ and $p(x_2) = \mathcal{N}(\mu_2, \Sigma_2)$, then

  $p(x_1)\, p(x_2) = z\, \mathcal{N}(\mu_3, \Sigma_3), \qquad \mu_3 = \Sigma_3 (\Sigma_1^{-1} \mu_1 + \Sigma_2^{-1} \mu_2), \qquad \Sigma_3 = (\Sigma_1^{-1} + \Sigma_2^{-1})^{-1},$

  where $z$ is a normalization constant.
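A quick 1-D numerical illustration of the product rule (my own numbers, not from the slides):

```python
# product of N(0, 2) and N(3, 1), parameterized by variances
mu1, var1 = 0.0, 2.0
mu2, var2 = 3.0, 1.0

var3 = 1.0 / (1.0 / var1 + 1.0 / var2)   # (Sigma1^-1 + Sigma2^-1)^-1 = 2/3
mu3 = var3 * (mu1 / var1 + mu2 / var2)   # Sigma3 (Sigma1^-1 mu1 + Sigma2^-1 mu2) = 2.0
# so N(0, 2) * N(3, 1) is proportional to N(2.0, 2/3)
```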

SLIDE 22

Linear Gaussian Model

  $p(x_t \mid Z_t) = \kappa\, p(z_t \mid x_t) \int p(x_t \mid x_{t-1})\, p(x_{t-1} \mid Z_{t-1})\, dx_{t-1}$

A linear transformation of a Gaussian-distributed random variable is also Gaussian:

  $x \sim \mathcal{N}(\mu_x, Q), \quad y = A x + b \;\Rightarrow\; y \sim \mathcal{N}(A \mu_x + b,\, A Q A^T)$

SLIDE 23

Linear Gaussian Model

  $p(x_t \mid Z_t) = \kappa\, p(z_t \mid x_t) \int p(x_t \mid x_{t-1})\, p(x_{t-1} \mid Z_{t-1})\, dx_{t-1}$

Prediction: combining the temporal model $\mathcal{N}(A x_{t-1}, W)$ with the previous posterior $\mathcal{N}(\hat{x}_{t-1}, P_{t-1})$ gives

  $x_t \mid Z_{t-1} \sim \mathcal{N}(A \hat{x}_{t-1},\, A P_{t-1} A^T + W) \equiv \mathcal{N}(\hat{x}_t^-, P_t^-)$

SLIDE 24

Linear Gaussian Model

  $p(x_t \mid Z_t) = \kappa\, p(z_t \mid x_t) \int p(x_t \mid x_{t-1})\, p(x_{t-1} \mid Z_{t-1})\, dx_{t-1}$

Multiplying the likelihood $\mathcal{N}(H x_t, Q)$ by the prediction $\mathcal{N}(\hat{x}_t^-, P_t^-)$:

  $p(x_t \mid Z_t) \propto \exp\!\left(-\tfrac{1}{2}(z_t - H x_t)^T Q^{-1} (z_t - H x_t) - \tfrac{1}{2}(x_t - \hat{x}_t^-)^T (P_t^-)^{-1} (x_t - \hat{x}_t^-)\right) \times \mathrm{const}$

SLIDE 25

Linear Gaussian Model

Completing the square shows the posterior is again Gaussian, $p(x_t \mid z_t, Z_{t-1}) = \mathcal{N}(\hat{x}_t, P_t)$, with

  $\hat{x}_t = P_t \left((P_t^-)^{-1} \hat{x}_t^- + H^T Q^{-1} z_t\right), \qquad P_t = \left((P_t^-)^{-1} + H^T Q^{-1} H\right)^{-1}$

SLIDE 26

Linear Gaussian Model

Some algebra puts this in innovation form:

  $\hat{x}_t = \hat{x}_t^- + K_t (z_t - H \hat{x}_t^-)$

  $K_t = P_t H^T Q^{-1} = \left((P_t^-)^{-1} + H^T Q^{-1} H\right)^{-1} H^T Q^{-1}$

SLIDE 27

Simplifying

By the matrix inversion lemma,

  $\left((P_t^-)^{-1} + H^T Q^{-1} H\right)^{-1} = P_t^- - P_t^- H^T (H P_t^- H^T + Q)^{-1} H P_t^- = P_t$

and the gain is

  $K_t = \left((P_t^-)^{-1} + H^T Q^{-1} H\right)^{-1} H^T Q^{-1}$

SLIDE 28

KALMAN FILTER ALGORITHM (Welch and Bishop 2002)

Initial estimates of $\hat{x}_{t-1}$ and $P_{t-1}$.

Time update:

  Prior estimate:     $\hat{x}_t^- = A \hat{x}_{t-1}$
  Error covariance:   $P_t^- = A P_{t-1} A^T + W$

Measurement update:

  Error covariance:   $P_t = P_t^- - P_t^- H^T (H P_t^- H^T + Q)^{-1} H P_t^-$
  Kalman gain:        $K_t = P_t H^T Q^{-1}$
  Posterior estimate: $\hat{x}_t = \hat{x}_t^- + K_t (z_t - H \hat{x}_t^-)$

The two updates alternate: the posterior at time $t$ seeds the next time update.
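A compact NumPy sketch of one filter step under exactly these update equations (matrix names follow the slides; this is an illustration, not the course code):

```python
import numpy as np

def kalman_step(x_hat, P, z, A, W, H, Q):
    """One Kalman time update + measurement update."""
    # Time update
    x_pred = A @ x_hat                         # prior estimate
    P_pred = A @ P @ A.T + W                   # prior error covariance
    # Measurement update
    S = H @ P_pred @ H.T + Q                   # innovation covariance
    P_new = P_pred - P_pred @ H.T @ np.linalg.solve(S, H @ P_pred)
    K = P_new @ H.T @ np.linalg.inv(Q)         # Kalman gain, K_t = P_t H^T Q^-1
    x_new = x_pred + K @ (z - H @ x_pred)      # posterior estimate
    return x_new, P_new
```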

SLIDE 29

Learning the Kalman Model

• In practice, the parameters of the model must be estimated from training data. (In training data we know both the hidden states and the measurements.)
• Common simplification: $A_k, H_k, W_k, Q_k$ are constant over time (independent of $k$), i.e. $A, H, W, Q$.
• $A, H, W, Q$ can then be estimated by maximizing the joint probability $p(X_M, Z_M)$.

SLIDE 30

Bayesian Graphical Model

  $p(X_M, Z_M) = p(X_M)\, p(Z_M \mid X_M) = \left[p(x_1) \prod_{k=2}^{M} p(x_k \mid x_{k-1})\right] \left[\prod_{k=1}^{M} p(z_k \mid x_k)\right]$

The model is a Markov chain of states $\cdots \to x_{k-1} \to x_k \to x_{k+1} \to \cdots$ with an observation $z_k$ attached to each state: transitions follow $x_k = A x_{k-1} + w_k$ and observations follow $z_k = H x_k + q_k$.

SLIDE 31

Splitting the Joint Distribution

  $\arg\max_{A, W, H, Q} p(X_M, Z_M)$

splits into two independent problems:

  $\arg\max_{A,W} p(X_M) = \arg\min_{A,W} f(A, W), \qquad \arg\max_{H,Q} p(Z_M \mid X_M) = \arg\min_{H,Q} g(H, Q),$

where

  $f(A, W) = -\log p(X_M) = \alpha + \sum_{k=2}^{M} \left[\log\det W + (x_k - A x_{k-1})^T W^{-1} (x_k - A x_{k-1})\right]$

  $g(H, Q) = -\log p(Z_M \mid X_M) = \beta + \sum_{k=1}^{M} \left[\log\det Q + (z_k - H x_k)^T Q^{-1} (z_k - H x_k)\right]$

How do we optimize functions with matrix variables?

SLIDE 32

Closed-form Solutions

  $A = \left(\sum_{k=2}^{M} x_k x_{k-1}^T\right)\left(\sum_{k=2}^{M} x_{k-1} x_{k-1}^T\right)^{-1}, \qquad W = \frac{1}{M-1}\sum_{k=2}^{M}\left(x_k - A x_{k-1}\right)x_k^T,$

  $H = \left(\sum_{k=1}^{M} z_k x_k^T\right)\left(\sum_{k=1}^{M} x_k x_k^T\right)^{-1}, \qquad Q = \frac{1}{M}\sum_{k=1}^{M}\left(z_k - H x_k\right)z_k^T.$
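In matrix form these are ordinary least-squares fits. A NumPy sketch, assuming the training kinematics X (d x M) and firing rates Z (c x M) are fully observed:

```python
import numpy as np

def learn_kalman_params(X, Z):
    """Closed-form estimates of A, W, H, Q from fully observed training data."""
    d, M = X.shape
    X1, X2 = X[:, :-1], X[:, 1:]                 # x_{k-1} / x_k pairs
    A = X2 @ X1.T @ np.linalg.inv(X1 @ X1.T)     # state dynamics
    W = (X2 - A @ X1) @ X2.T / (M - 1)           # process noise covariance
    H = Z @ X.T @ np.linalg.inv(X @ X.T)         # observation (tuning) matrix
    Q = (Z - H @ X) @ Z.T / M                    # measurement noise covariance
    return A, W, H, Q
```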

SLIDE 33

Off-line Reconstruction

[Figure: estimated/decoded position (reconstruction) vs. actual hand position; 69 cells with 1.5 minutes of training data.]

SLIDE 34

Accuracy

Continuous 2D hand motion (off-line reconstruction):

  Method                     RMSE (cm)
  Population vector          8.66
  Linear regression method   2.55
  Kalman filter              2.18

[Figure: correlation coefficient (CorrCoeff) as the number of cells increases.]

SLIDE 35

Optimal "Lag"

Firing precedes motion, so the measurement equation

  $z_k = H x_k + q_k$

is replaced by a lagged version:

  $z_k = H x_{k-j} + q_k$

• Uniform: lag of $j$ time steps for all cells (1 time step = 70 ms), $j = 1, 2, 3, 4$.
• Non-uniform: per-cell lags $(j_1, j_2, \ldots, j_{42})$ time steps.
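A small sketch of building uniformly lagged training pairs before fitting H (column k of Z is paired with column k - lag of X; the function name is mine):

```python
def lag_training_data(X, Z, lag):
    """Pair firing rates z_k with kinematics x_{k-lag} (lag in 70 ms time steps)."""
    if lag == 0:
        return X, Z
    return X[:, :-lag], Z[:, lag:]
```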

SLIDE 36

Reconstruction and Lag

  Method                 CC (x, y)      MSE (cm^2)
  Kalman (0 ms lag)      (0.77, 0.91)   6.96
  Kalman (70 ms lag)     (0.79, 0.93)   6.67
  Kalman (140 ms lag)    (0.81, 0.93)   6.09
  Kalman (210 ms lag)    (0.81, 0.89)   6.98
  Kalman (280 ms lag)    (0.76, 0.82)   8.91
  Kalman (non-uniform)   (0.82, 0.93)   5.24

SLIDE 37

Reconstruction (Test Data)

[Figure: true vs. reconstructed trajectories on test data.]

SLIDE 38

Covariance

A diagonal covariance assumes the noise in the firing rates is statistically independent across cells:

  $p(z_t \mid x_t) = \prod_{i=1}^{C} p(z_{t,i} \mid x_t) = \prod_{i=1}^{C} \frac{1}{\sqrt{2\pi}\,\sigma_i} \exp\!\left(-\frac{(z_{t,i} - H_i x_t)^2}{2\sigma_i^2}\right)$

Kalman decoding:

• Full covariance: MSE = 5.99 cm^2
• Diagonal covariance: MSE = 6.35 cm^2
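The diagonal model is just a product of per-cell 1-D Gaussians, so its log likelihood is a sum (a minimal sketch; the argument layout is assumed):

```python
import numpy as np

def diag_log_lik(z, x, H, sigma):
    """log p(z | x) with independent per-cell noise std devs sigma_i."""
    r = z - H @ x                                # per-cell residuals z_i - H_i x
    return np.sum(-0.5 * (r / sigma) ** 2
                  - np.log(np.sqrt(2.0 * np.pi) * sigma))
```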

SLIDE 39

Off-line Decoding

  # of cells   Kalman filter               Linear regression
               CC             RMSE (cm)    CC             RMSE (cm)
  23           (0.79, 0.82)   3.61         (0.70, 0.72)   4.35
  30           (0.88, 0.79)   3.26         (0.85, 0.72)   3.49
  36           (0.75, 0.74)   4.36         (0.77, 0.64)   4.38
  26           (0.71, 0.76)   4.48         (0.71, 0.74)   4.39
  69           (0.88, 0.89)   3.11         (0.72, 0.78)   5.30
  69           (0.86, 0.88)   3.26         (0.71, 0.80)   3.99

  Kalman filter:     CC = 0.81 ± 0.06, RMSE = 3.7 ± 0.6 cm
  Linear regression: CC = 0.74 ± 0.05, RMSE = 4.3 ± 0.6 cm

SLIDE 40

Decoding

[Figure: true vs. reconstructed x-position and y-position over time (seconds); decoding results over all six experiments.]

SLIDE 41

On-line Neural Control

• Kalman filter decoder, using only 18 cells.
• Directly exploits the generative encoding model.
• Neural control of a computer cursor in real time: the brain substitutes for the hand.
• The subject acquires targets using visual feedback of the cursor.

SLIDE 42

On-line Task Performance

  # of cells   Kalman filter               Linear regression
               time      targets   rate    time     targets   rate
  17           60 sec    38        38/min  -        -         -
  30           105 sec   55        31/min  58 sec   24        25/min
  36           57 sec    28        29/min  42 sec   15        21/min
  69           45 sec    28        37/min  60 sec   22        22/min

  Average results: Kalman filter 33.75 targets/min, linear regression 22.67 targets/min: a 50% improvement.

SLIDE 43

Non-Gaussian Likelihood

Inhomogeneous Poisson model:

  $p(z \mid x) = \frac{1}{z!} (H x)^z e^{-H x}$

where $H x$ is the predicted mean firing rate and $z$ is the observed rate for a single cell.

Problems?

• $H x$ may be negative.
• No clear way to model correlated noise.
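A small sketch of this per-cell Poisson log likelihood; the clamp makes the negative-rate problem explicit and is my ad hoc workaround, not part of the model:

```python
import numpy as np
from scipy.special import gammaln

def poisson_log_lik(z, x, h, eps=1e-6):
    """log p(z | x) for one cell under the inhomogeneous Poisson model."""
    rate = h @ x                   # predicted mean firing rate, Hx
    rate = max(rate, eps)          # Hx can go negative: clamp it (ad hoc)
    return z * np.log(rate) - rate - gammaln(z + 1)   # log[(1/z!) rate^z e^-rate]
```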