Kalman Filtering Notes

Greg Mori

Portions of these notes are adapted from [3], [5], [4], [2], and [1].

What is the Kalman Filter?

Optimal recursive data processing algorithm for processing a series of measurements generated from a linear dynamic system. Define x_t ∈ Rⁿ to be the (unobserved) state of the dynamic system at time t, t ∈ {0, 1, . . ., T}. Define z_t ∈ Rᵐ to be the (observed) measurement at time t. The Kalman filter is an algorithm for determining P(x_t|z_{0:t}), given some particular assumptions about these random variables.

What is it used for?

Applications of the Kalman filter:

  • Radar tracking of planes/missiles (classical application)
  • Tracking heads/hands/people from video data
  • Economics (stock market data)
  • Navigation

Simple example

Let’s say that Greg and Huyen are hungry and lost in downtown Vancouver, and are trying to get to Fatburger, located at x_f. Greg thinks that Fatburger is located at z_g, and his estimates of any location x come from a Gaussian distribution N(x, σ_g²) (for a small σ_g). Huyen thinks that Fatburger is located at z_h, and her estimates come from a Gaussian distribution N(x, σ_h²) (no comment on the relative size of σ_h).

Given these two measurements (let’s say z_h and z_g are scalars to make things easier), how do we combine them to get an estimate of the location of Fatburger?

P(x_f|z_g, z_h) = P(z_g, z_h|x_f) P(x_f) / P(z_g, z_h)   (by Bayes’ rule)   (1)

= α P(z_g, z_h|x_f) P(x_f)   (2)

= α P(z_g|x_f) P(z_h|x_f) P(x_f)   (assuming conditional independence)   (3)

As another simplification, let’s assume that the prior P(x_f) is uniform (a frequentist sort of assumption).


Then,

P(x_f|z_g, z_h) ∝ P(z_g|x_f) P(z_h|x_f)   (5)

∝ exp(−(1/2)(z_g − x_f)²/σ_g²) exp(−(1/2)(z_h − x_f)²/σ_h²)   (6)

= exp(−(1/2) [(z_g − x_f)²σ_h² + (z_h − x_f)²σ_g²] / (σ_h²σ_g²))   (7)

= exp(−(1/2) [(σ_h² + σ_g²)x_f² − 2(z_gσ_h² + z_hσ_g²)x_f + (z_g²σ_h² + z_h²σ_g²)] / (σ_h²σ_g²))   (8)

∝ exp(−(1/2) [(σ_h² + σ_g²)x_f² − 2(z_gσ_h² + z_hσ_g²)x_f] / (σ_h²σ_g²))   (9)

This looks messy, but can be converted into something familiar, using the trick of “completing the square”:

ax² + bx + c = a(x + b/(2a))² + (c − b²/(4a))   (10)

Applying this trick to Equation 9 gives us:

P(x_f|z_g, z_h) ∝ exp(−(1/2) · ((σ_g² + σ_h²)/(σ_g²σ_h²)) · (x_f − (2z_gσ_h² + 2z_hσ_g²)/(2(σ_h² + σ_g²)))² + R)   (11)

where R collects the terms that do not depend on x_f. Hence

P(x_f|z_g, z_h) ∝ exp(−(1/2) (x_f − (z_gσ_h² + z_hσ_g²)/(σ_h² + σ_g²))² / (σ_g²σ_h²/(σ_g² + σ_h²)))   (12)

i.e. a Gaussian distribution with mean (z_gσ_h² + z_hσ_g²)/(σ_h² + σ_g²) and variance σ_h²σ_g²/(σ_g² + σ_h²).
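The combined mean and variance above can be checked numerically. A minimal sketch (the function name is my own, not from the notes); σ_g² and σ_h² are passed as variances:

```python
# Combine two scalar Gaussian measurements of the same quantity,
# returning the posterior mean and variance from Equation 12.
def fuse(z_g, sigma_g2, z_h, sigma_h2):
    mean = (z_g * sigma_h2 + z_h * sigma_g2) / (sigma_g2 + sigma_h2)
    var = (sigma_g2 * sigma_h2) / (sigma_g2 + sigma_h2)
    return mean, var
```

Note that the measurement with the smaller variance receives the larger weight, and the combined variance is smaller than either individual variance.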

More General Assumptions

We will make the following assumptions:

  • Markovian conditional independence for states, P(x_t|x_{0:t−1}) = P(x_t|x_{t−1}), and for measurements, P(z_t|x_{0:t}, z_{0:t−1}) = P(z_t|x_t)
  • x_0 is drawn from a Gaussian distribution N(µ_0, Σ_0)
  • P(x_t|x_{t−1}) is a linear Gaussian distribution, P(x_t|x_{t−1}) = N(Ax_{t−1}, Σ_x); i.e. x_t = Ax_{t−1} + w_t, a linear transformation of x_{t−1} plus (white) Gaussian noise
  • P(z_t|x_t) is a linear Gaussian distribution, P(z_t|x_t) = N(Cx_t, Σ_z); i.e. z_t = Cx_t + v_t

Note that A, Σ_x, C, Σ_z could also vary over time t in general. An important general fact about Bayesian networks of this sort, in which all conditional distributions are linear Gaussians, is that the joint probability distribution is a multivariate Gaussian distribution. Further, all conditional distributions are also multivariate Gaussians.
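A tiny scalar simulation may help make the model concrete (a hypothetical sketch; the function and its defaults are my own):

```python
import random

# Simulate the scalar linear-Gaussian model x_t = a*x_{t-1} + w_t,
# z_t = c*x_t + v_t, where w_t ~ N(0, sx2) and v_t ~ N(0, sz2).
def simulate(x0, a, c, sx2, sz2, T, seed=0):
    rng = random.Random(seed)
    xs, zs = [x0], []
    for _ in range(T):
        x = a * xs[-1] + rng.gauss(0.0, sx2 ** 0.5)    # state transition
        xs.append(x)
        zs.append(c * x + rng.gauss(0.0, sz2 ** 0.5))  # measurement
    return xs, zs
```

With sx2 = sz2 = 0 the system is deterministic, which is a convenient sanity check.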


The general case

P(x_t|z_{0:t}) = P(x_t|z_{0:t−1}, z_t)   (13)

= α P(z_t|x_t, z_{0:t−1}) P(x_t|z_{0:t−1})   (14)

= α P(z_t|x_t) ∫ P(x_t, x_{t−1}|z_{0:t−1}) dx_{t−1}   (15)

= α P(z_t|x_t) ∫ P(x_{t−1}|z_{0:t−1}) P(x_t|x_{t−1}, z_{0:t−1}) dx_{t−1}   (16)

= α P(z_t|x_t) ∫ P(x_{t−1}|z_{0:t−1}) P(x_t|x_{t−1}) dx_{t−1}   (17)

P(x_t|z_{0:t}) is a multivariate Gaussian for all t; denote its mean by µ_t and its variance by Σ_t. Equation 17 defines the recurrence relation between these parameters for t and t − 1.

A slightly less simple 1-D example

We will derive the Kalman filter updates for a 1-D state vector. Let P(x_t|x_{t−1}) = N(ax_{t−1}, σ_x²) and P(z_t|x_t) = N(cx_t, σ_z²).

Following the derivation in [1], use the notation

g(x; µ, ν) = exp(−(x − µ)²/(2ν))   (18)

where the third argument ν is a variance. The following identities then hold:

g(x; µ, ν) = g(x − µ; 0, ν)   (19)

g(m; n, ν) = g(n; m, ν)   (20)

g(ax; µ, ν) = g(x; µ/a, ν/a²)   (21)

Also,

∫_{−∞}^{∞} g(x − u; µ, ν_a) g(u; 0, ν_b) du ∝ g(x; µ, ν_a + ν_b)   (22)

This fact can be obtained by thinking about the distribution of Z = X + Y where X and Y are normally distributed random variables. Finally,

g(x; a, b) g(x; c, d) = g(x; (ad + cb)/(b + d), bd/(b + d)) f(a, b, c, d)   (23)

Note that f(·) does not depend on x. The derivation of this fact is the same as that in the first simple example. Using these facts, we can evaluate the integral in Equation 17.


P(x_t|z_{0:t−1}) = ∫ P(x_{t−1}|z_{0:t−1}) P(x_t|x_{t−1}) dx_{t−1}   (24)

= ∫ P(x_t|x_{t−1}) P(x_{t−1}|z_{0:t−1}) dx_{t−1}   (25)

∝ ∫ g(x_t; ax_{t−1}, σ_x²) g(x_{t−1}; µ_{t−1}, σ_{t−1}²) dx_{t−1}   (26)

∝ ∫ g(x_t − ax_{t−1}; 0, σ_x²) g(x_{t−1} − µ_{t−1}; 0, σ_{t−1}²) dx_{t−1}   (27)

∝ ∫ g(x_t − a(u + µ_{t−1}); 0, σ_x²) g(u; 0, σ_{t−1}²) du   (28)

∝ ∫ g(x_t − au; aµ_{t−1}, σ_x²) g(u; 0, σ_{t−1}²) du   (29)

∝ ∫ g(x_t − v; aµ_{t−1}, σ_x²) g(v; 0, (aσ_{t−1})²) dv   (30)

∝ g(x_t; aµ_{t−1}, σ_x² + (aσ_{t−1})²)   (31)

(Equation 28 substitutes u = x_{t−1} − µ_{t−1}; Equation 30 substitutes v = au.) Denote the above mean and variance by µ⁻_t = aµ_{t−1} and (σ⁻_t)² = σ_x² + (aσ_{t−1})². The final update, multiplying Equation 31 into Equation 17, gives:

P(x_t|z_{0:t}) = α P(z_t|x_t, z_{0:t−1}) P(x_t|z_{0:t−1})   (32)

∝ g(z_t; cx_t, σ_z²) g(x_t; µ⁻_t, (σ⁻_t)²)   (33)

= g(cx_t; z_t, σ_z²) g(x_t; µ⁻_t, (σ⁻_t)²)   (34)

= g(x_t; z_t/c, (σ_z/c)²) g(x_t; µ⁻_t, (σ⁻_t)²)   (35)

Applying our identities, we obtain:

µ_t = (µ⁻_t σ_z² + c z_t (σ⁻_t)²) / (σ_z² + c²(σ⁻_t)²)   (36)

σ_t² = σ_z² (σ⁻_t)² / (σ_z² + c²(σ⁻_t)²)   (37)
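The 1-D prediction and update steps can be collected into one function (an illustrative sketch, not code from the notes; all variance arguments are variances, not standard deviations):

```python
# One step of the 1-D Kalman filter: prediction (Equation 31)
# followed by the measurement update (Equations 36 and 37).
def kalman_1d_step(mu, s2, z, a, sx2, c, sz2):
    mu_m = a * mu                # predicted mean, mu_t^-
    s2_m = sx2 + (a ** 2) * s2   # predicted variance, (sigma_t^-)^2
    denom = sz2 + (c ** 2) * s2_m
    mu_new = (mu_m * sz2 + c * z * s2_m) / denom   # Equation 36
    s2_new = (sz2 * s2_m) / denom                  # Equation 37
    return mu_new, s2_new
```

With a = c = 1 and sx2 = 0, one step reduces to the two-measurement fusion of the first simple example.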

The multivariate case

Similar identities for multivariate Gaussian distributions can be derived. In the full multivariate case, the final update equations for µ_t and Σ_t are:

µ_t = Aµ_{t−1} + K_t(z_t − CAµ_{t−1})   (38)

Σ_t = (I − K_tC)(AΣ_{t−1}Aᵀ + Σ_x)   (39)

where

K_t = (AΣ_{t−1}Aᵀ + Σ_x)Cᵀ (C(AΣ_{t−1}Aᵀ + Σ_x)Cᵀ + Σ_z)⁻¹   (40)
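The multivariate updates can be sketched in code (a hypothetical helper of my own, assuming NumPy):

```python
import numpy as np

# One step of the multivariate Kalman filter (Equations 38-40).
def kalman_step(mu, Sigma, z, A, C, Sigma_x, Sigma_z):
    P = A @ Sigma @ A.T + Sigma_x                       # predicted covariance
    K = P @ C.T @ np.linalg.inv(C @ P @ C.T + Sigma_z)  # gain, Equation 40
    mu_new = A @ mu + K @ (z - C @ A @ mu)              # Equation 38
    Sigma_new = (np.eye(len(mu)) - K @ C) @ P           # Equation 39
    return mu_new, Sigma_new
```

In practice one would solve the linear system rather than form the inverse explicitly; the inverse is kept here to mirror Equation 40.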


Other issues

The data association problem arises when trying to track multiple (possibly interacting) objects. The basic problem is deciding which measurement goes with which state variable.

A simple approach is nearest-neighbour data association, where each measurement is assigned to the closest forward-projected state variable. Distance can be measured using the Mahalanobis distance, which reweights coordinates based on the measurement covariance matrix:

||x − y||²_Σ = (x − y)ᵀ Σ⁻¹ (x − y)

Probabilistic techniques that average over possible assignments (there are m(m − 1)(m − 2) · · · (m − n + 1) assignments with n objects and m measurements) are also used.
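The Mahalanobis distance used for nearest-neighbour association can be sketched as (a hypothetical helper of my own, assuming NumPy):

```python
import numpy as np

# Squared Mahalanobis distance (x - y)^T Sigma^{-1} (x - y).
def mahalanobis2(x, y, Sigma):
    d = x - y
    return d @ np.linalg.inv(Sigma) @ d
```

With Sigma equal to the identity this reduces to the squared Euclidean distance; a larger variance along a coordinate downweights differences along that coordinate.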

References

[1] D. Forsyth and J. Ponce. Computer Vision: A Modern Approach. Prentice Hall, 2003.

[2] M. Jordan and C. Bishop. An introduction to graphical models.

[3] P. S. Maybeck. Stochastic Models, Estimation, and Control, volume 141 of Mathematics in Science and Engineering. 1979.

[4] S. Russell and P. Norvig. Artificial Intelligence: A Modern Approach. Prentice-Hall, Englewood Cliffs, NJ, second edition, 2003.

[5] G. Welch and G. Bishop. Kalman filter webpage. http://www.cs.unc.edu/~welch/kalman/.
