

SLIDE 1

CS 287 Lecture 12 (Fall 2019) Kalman Filtering

Lecturer: Ignasi Clavera. Slides by Pieter Abbeel, UC Berkeley EECS.

Many slides adapted from Thrun, Burgard and Fox, Probabilistic Robotics

SLIDE 2

Outline

• Gaussians
• Kalman filtering
• Extended Kalman Filter (EKF)
• Unscented Kalman Filter (UKF) [aka “sigma-point filter”]

SLIDE 3

Outline

• Gaussians
• Kalman filtering
• Extended Kalman Filter (EKF)
• Unscented Kalman Filter (UKF) [aka “sigma-point filter”]

SLIDE 4

Multivariate Gaussians

SLIDE 5

Multivariate Gaussians

(integral of a vector = vector of integrals of each entry)

(integral of a matrix = matrix of integrals of each entry)
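The slide's formulas are the standard definitions; restated here for reference (a reconstruction, not verbatim from the deck), for $x \in \mathbb{R}^n$:

$$p(x) = (2\pi)^{-n/2}\,|\Sigma|^{-1/2}\exp\!\big(-\tfrac{1}{2}(x-\mu)^\top \Sigma^{-1} (x-\mu)\big)$$

$$\mu = \mathbb{E}[X] = \int x\, p(x)\, dx \;\;\text{(a vector of integrals)}, \qquad \Sigma = \mathbb{E}\big[(X-\mu)(X-\mu)^\top\big] = \int (x-\mu)(x-\mu)^\top p(x)\, dx \;\;\text{(a matrix of integrals)}$$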
SLIDE 6

• µ = [1; 0], Σ = [1 0; 0 1]
• µ = [-.5; 0], Σ = [1 0; 0 1]
• µ = [-1; -1.5], Σ = [1 0; 0 1]

Multivariate Gaussians: Examples

SLIDE 7

• µ = [0; 0], Σ = [1 0; 0 1]
• µ = [0; 0], Σ = [.6 0; 0 .6]
• µ = [0; 0], Σ = [2 0; 0 2]

Multivariate Gaussians: Examples

SLIDE 8

• µ = [0; 0], Σ = [1 0; 0 1]
• µ = [0; 0], Σ = [1 0.5; 0.5 1]
• µ = [0; 0], Σ = [1 0.8; 0.8 1]

Multivariate Gaussians: Examples

SLIDE 9

• µ = [0; 0], Σ = [1 0; 0 1]
• µ = [0; 0], Σ = [1 0.5; 0.5 1]
• µ = [0; 0], Σ = [1 0.8; 0.8 1]

Multivariate Gaussians: Examples

SLIDE 10

• µ = [0; 0], Σ = [1 -0.5; -0.5 1]
• µ = [0; 0], Σ = [1 -0.8; -0.8 1]
• µ = [0; 0], Σ = [3 0.8; 0.8 1]

Multivariate Gaussians: Examples

SLIDE 11

Partitioned Multivariate Gaussian

• Consider a multivariate Gaussian and partition the random vector into (X, Y):
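The partition referred to is the standard block form (reconstructed here for reference):

$$\begin{bmatrix} X \\ Y \end{bmatrix} \sim \mathcal{N}\!\left( \begin{bmatrix} \mu_x \\ \mu_y \end{bmatrix},\; \begin{bmatrix} \Sigma_{xx} & \Sigma_{xy} \\ \Sigma_{yx} & \Sigma_{yy} \end{bmatrix} \right)$$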

SLIDE 12

Partitioned Multivariate Gaussian: Dual Representation

• Precision matrix: Γ = Σ⁻¹, partitioned into the same blocks; see equation (1) below
• Straightforward to verify from (1) that the covariance blocks can be written in terms of the precision blocks
• And, swapping the roles of Σ and Γ, the precision blocks can be written in terms of the covariance blocks
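The identities being verified are the standard block-inverse relations between the covariance Σ and the precision matrix Γ (a reconstruction, not verbatim from the deck):

$$\Gamma = \Sigma^{-1} = \begin{bmatrix} \Gamma_{xx} & \Gamma_{xy} \\ \Gamma_{yx} & \Gamma_{yy} \end{bmatrix} \qquad (1)$$

$$\Sigma_{xx} = \big(\Gamma_{xx} - \Gamma_{xy}\Gamma_{yy}^{-1}\Gamma_{yx}\big)^{-1}, \qquad \Sigma_{xy} = -\big(\Gamma_{xx} - \Gamma_{xy}\Gamma_{yy}^{-1}\Gamma_{yx}\big)^{-1}\Gamma_{xy}\Gamma_{yy}^{-1}$$

and, swapping the roles of Σ and Γ:

$$\Gamma_{xx} = \big(\Sigma_{xx} - \Sigma_{xy}\Sigma_{yy}^{-1}\Sigma_{yx}\big)^{-1}, \qquad \Gamma_{xy} = -\big(\Sigma_{xx} - \Sigma_{xy}\Sigma_{yy}^{-1}\Sigma_{yx}\big)^{-1}\Sigma_{xy}\Sigma_{yy}^{-1}$$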

SLIDE 13

Marginalization: p(x) = ?

We integrate out y to find the marginal. Hence we have:
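Reconstructing the standard result the slide displays:

$$p(x) = \int p(x, y)\, dy = \mathcal{N}\big(x;\; \mu_x,\; \Sigma_{xx}\big)$$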

Note: if we had known beforehand that p(x) would be a Gaussian distribution, then we could have found the result more quickly: we would only have needed E[X] = µx and Cov(X) = Σxx, both of which are directly available from the partitioned parameterization.

SLIDE 14

If

$$\begin{bmatrix} X \\ Y \end{bmatrix} \sim \mathcal{N}\!\left( \begin{bmatrix} \mu_x \\ \mu_y \end{bmatrix},\; \begin{bmatrix} \Sigma_{xx} & \Sigma_{xy} \\ \Sigma_{yx} & \Sigma_{yy} \end{bmatrix} \right)$$

then

$$X \sim \mathcal{N}(\mu_x, \Sigma_{xx}), \qquad Y \sim \mathcal{N}(\mu_y, \Sigma_{yy})$$

Marginalization Recap

SLIDE 15

Self-quiz

SLIDE 16

Conditioning: p(x | Y = y0) = ?

We have

$$p(x \mid Y = y_0) = \frac{p(x, y_0)}{p(y_0)}$$

Hence we have:

$$X \mid Y = y_0 \sim \mathcal{N}\big(\mu_x + \Sigma_{xy}\Sigma_{yy}^{-1}(y_0 - \mu_y),\;\; \Sigma_{xx} - \Sigma_{xy}\Sigma_{yy}^{-1}\Sigma_{yx}\big)$$

  • The conditional mean shifts according to the correlation and the measurement variance
  • The conditional covariance does not depend on y0
SLIDE 17

If

$$\begin{bmatrix} X \\ Y \end{bmatrix} \sim \mathcal{N}\!\left( \begin{bmatrix} \mu_x \\ \mu_y \end{bmatrix},\; \begin{bmatrix} \Sigma_{xx} & \Sigma_{xy} \\ \Sigma_{yx} & \Sigma_{yy} \end{bmatrix} \right)$$

then

$$X \mid Y = y_0 \sim \mathcal{N}\big(\mu_x + \Sigma_{xy}\Sigma_{yy}^{-1}(y_0 - \mu_y),\;\; \Sigma_{xx} - \Sigma_{xy}\Sigma_{yy}^{-1}\Sigma_{yx}\big)$$

Conditioning Recap

SLIDE 18

Outline

• Gaussians
• Kalman filtering
• Extended Kalman Filter (EKF)
• Unscented Kalman Filter (UKF) [aka “sigma-point filter”]

SLIDE 19

• Kalman Filter = special case of a Bayes’ filter in which the dynamics and sensor models are linear Gaussian:

Kalman Filter
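The models referred to are the standard linear-Gaussian ones, written with the A_t, B_t, C_t, Q_t, R_t notation that appears later in the deck:

$$x_{t+1} = A_t x_t + B_t u_t + w_t, \qquad w_t \sim \mathcal{N}(0, Q_t)$$

$$z_t = C_t x_t + v_t, \qquad v_t \sim \mathcal{N}(0, R_t)$$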
SLIDE 20

Time update

• Assume we have a current belief for x_t:
• Then, after one time step passes:

[Figure: two-node graphical model, X_t → X_{t+1}]

SLIDE 21

• Now we can choose to continue in either of two ways:
• (i) mold it into the standard multivariate Gaussian format so we can read off the joint distribution’s mean and covariance
• (ii) observe that this is a quadratic form in x_t and x_{t+1} in the exponent, and the exponent is the only place they appear; hence we know this is a multivariate Gaussian, and we can directly compute its mean and covariance [usually simpler!]

Time Update: Finding the joint

SLIDE 22

• We follow (ii) and find the mean and covariance matrices of the joint (reconstructed below).

[Exercise: Try to prove each of these without referring to this slide!]

Time Update: Finding the joint
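For reference, the standard result (a reconstruction, not verbatim from the deck): if x_t ~ N(µ_t, Σ_t) and x_{t+1} = A_t x_t + B_t u_t + w_t with w_t ~ N(0, Q_t) independent of x_t, then

$$\begin{bmatrix} x_t \\ x_{t+1} \end{bmatrix} \sim \mathcal{N}\!\left( \begin{bmatrix} \mu_t \\ A_t\mu_t + B_t u_t \end{bmatrix},\; \begin{bmatrix} \Sigma_t & \Sigma_t A_t^\top \\ A_t \Sigma_t & A_t \Sigma_t A_t^\top + Q_t \end{bmatrix} \right)$$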

SLIDE 23

Time Update Recap

• Assume we have the belief over x_t
• Then we have the joint over (x_t, x_{t+1})
• Marginalizing the joint, we immediately get the predicted belief over x_{t+1} (see below)

[Figure: two-node graphical model, X_t → X_{t+1}]
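Concretely, the standard predicted-belief equations, with the bar marking the pre-measurement belief (a reconstruction):

$$x_t \sim \mathcal{N}(\mu_t, \Sigma_t) \;\Longrightarrow\; x_{t+1} \sim \mathcal{N}(\bar{\mu}_{t+1}, \bar{\Sigma}_{t+1}), \qquad \bar{\mu}_{t+1} = A_t \mu_t + B_t u_t, \quad \bar{\Sigma}_{t+1} = A_t \Sigma_t A_t^\top + Q_t$$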

SLIDE 24

Generality!

• Assume we have
• Then we have
• Marginalizing the joint, we immediately get

[Figure: graphical model including the noise variables W and V]

SLIDE 25

Observation update

• Assume we have:
• Then:
• And, by conditioning on z_{t+1} (see the earlier slides on Gaussians) we readily get:

[Figure: two-node graphical model, X_{t+1} → Z_{t+1}]
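The joint and the resulting conditional are the standard ones (a reconstruction, using the bar notation for the predicted belief):

$$\begin{bmatrix} x_{t+1} \\ z_{t+1} \end{bmatrix} \sim \mathcal{N}\!\left( \begin{bmatrix} \bar{\mu}_{t+1} \\ C_{t+1}\bar{\mu}_{t+1} \end{bmatrix},\; \begin{bmatrix} \bar{\Sigma}_{t+1} & \bar{\Sigma}_{t+1} C_{t+1}^\top \\ C_{t+1}\bar{\Sigma}_{t+1} & C_{t+1}\bar{\Sigma}_{t+1} C_{t+1}^\top + R_{t+1} \end{bmatrix} \right)$$

so conditioning on the observed value z_{t+1} via the Gaussian conditioning formula gives

$$\mu_{t+1} = \bar{\mu}_{t+1} + \bar{\Sigma}_{t+1} C_{t+1}^\top \big(C_{t+1}\bar{\Sigma}_{t+1} C_{t+1}^\top + R_{t+1}\big)^{-1}\big(z_{t+1} - C_{t+1}\bar{\mu}_{t+1}\big)$$

$$\Sigma_{t+1} = \bar{\Sigma}_{t+1} - \bar{\Sigma}_{t+1} C_{t+1}^\top \big(C_{t+1}\bar{\Sigma}_{t+1} C_{t+1}^\top + R_{t+1}\big)^{-1} C_{t+1}\bar{\Sigma}_{t+1}$$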

SLIDE 26

• At time 0:
• For t = 1, 2, …
  • Dynamics update:
  • Measurement update:
• Often written as: (Kalman gain) × “innovation” (see the equations below)

Complete Kalman Filtering Algorithm
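For reference, the standard equations of the complete algorithm (as in Probabilistic Robotics; a reconstruction, not verbatim from the deck):

$$\bar{\mu}_{t+1} = A_t \mu_t + B_t u_t, \qquad \bar{\Sigma}_{t+1} = A_t \Sigma_t A_t^\top + Q_t$$

$$K_{t+1} = \bar{\Sigma}_{t+1} C_{t+1}^\top \big( C_{t+1} \bar{\Sigma}_{t+1} C_{t+1}^\top + R_{t+1} \big)^{-1} \quad \text{(Kalman gain)}$$

$$\mu_{t+1} = \bar{\mu}_{t+1} + K_{t+1}\underbrace{\big(z_{t+1} - C_{t+1}\bar{\mu}_{t+1}\big)}_{\text{innovation}}, \qquad \Sigma_{t+1} = (I - K_{t+1} C_{t+1})\, \bar{\Sigma}_{t+1}$$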

SLIDE 27

Kalman Filter Summary

• Highly efficient: polynomial in measurement dimensionality k and state dimensionality n: O(k^2.376 + n^2)
• Optimal for linear Gaussian systems!

SLIDE 28

Outline

• Gaussians
• Kalman filtering
• Extended Kalman Filter (EKF)
• Unscented Kalman Filter (UKF) [aka “sigma-point filter”]

SLIDE 29

Nonlinear Dynamical Systems

• Most realistic robotic problems involve nonlinear functions (see below):
• Versus the linear setting:
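In symbols (writing f for the dynamics function and h for the measurement function; the symbol names are assumed, as the slide's equations did not survive):

$$x_{t+1} = f(x_t, u_t) + w_t, \qquad z_t = h(x_t) + v_t$$

versus the linear setting $x_{t+1} = A_t x_t + B_t u_t + w_t$, $\; z_t = C_t x_t + v_t$.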

SLIDE 30

Linearity Assumption Revisited

[Figure: a Gaussian p(x) pushed through a linear function y = ax + b; the resulting p(y) is again Gaussian]

SLIDE 31

Linearity Assumption Revisited

[Figure: a Gaussian p(x) pushed through a linear function y = ax + b; the resulting p(y) is again Gaussian]

SLIDE 32

Non-linear Function

The “Gaussian of p(y)” has the mean and variance of y under p(y).

[Figure: a Gaussian p(x) pushed through a nonlinear function; the resulting p(y) is non-Gaussian, shown with its moment-matched Gaussian]

SLIDE 33

EKF Linearization (1)

SLIDE 34

EKF Linearization (2)

p(x) has HIGH variance relative to the region in which the linearization is accurate.

SLIDE 35

EKF Linearization (3)

p(x) has LOW variance relative to the region in which the linearization is accurate.

SLIDE 36

• Dynamics model: for x_t “close to” µ_t we have:
• Measurement model: for x_t “close to” µ_t we have:

EKF Linearization: First Order Taylor Series Expansion
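A reconstruction of the expansions, writing F_t and H_{t+1} for the Jacobians and linearizing the measurement model around the predicted mean (these symbol choices are assumptions, not from the deck):

$$f(x_t, u_t) \approx f(\mu_t, u_t) + F_t\,(x_t - \mu_t), \qquad F_t = \frac{\partial f}{\partial x}\Big|_{(\mu_t,\, u_t)}$$

$$h(x_{t+1}) \approx h(\bar{\mu}_{t+1}) + H_{t+1}\,(x_{t+1} - \bar{\mu}_{t+1}), \qquad H_{t+1} = \frac{\partial h}{\partial x}\Big|_{\bar{\mu}_{t+1}}$$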

SLIDE 37

• At time 0:
• For t = 1, 2, …
  • Dynamics update:
  • Measurement update:

EKF Algorithm
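A reconstruction of the standard EKF recursions under the linearization above (Jacobians F_t and H_{t+1} evaluated at the current mean estimates):

$$\bar{\mu}_{t+1} = f(\mu_t, u_t), \qquad \bar{\Sigma}_{t+1} = F_t \Sigma_t F_t^\top + Q_t$$

$$K_{t+1} = \bar{\Sigma}_{t+1} H_{t+1}^\top \big( H_{t+1} \bar{\Sigma}_{t+1} H_{t+1}^\top + R_{t+1} \big)^{-1}$$

$$\mu_{t+1} = \bar{\mu}_{t+1} + K_{t+1}\big( z_{t+1} - h(\bar{\mu}_{t+1}) \big), \qquad \Sigma_{t+1} = (I - K_{t+1} H_{t+1})\, \bar{\Sigma}_{t+1}$$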

SLIDE 38

EKF Summary

• Highly efficient: polynomial in measurement dimensionality k and state dimensionality n: O(k^2.376 + n^2)
• Not optimal!
• Can diverge if nonlinearities are large!
• Works surprisingly well even when all assumptions are violated!

SLIDE 39

Outline

• Gaussians
• Kalman filtering
• Extended Kalman Filter (EKF)
• Unscented Kalman Filter (UKF) [aka “sigma-point filter”]

SLIDE 40

Linearization via Unscented Transform

[Figure: side-by-side comparison panels, EKF vs. UKF]

SLIDE 41

UKF Sigma-Point Estimate (2)

[Figure: side-by-side comparison panels, EKF vs. UKF]

SLIDE 42

UKF Sigma-Point Estimate (3)

[Figure: side-by-side comparison panels, EKF vs. UKF]

SLIDE 43

UKF Sigma-Point Estimate (4)

SLIDE 44

• Assume we know the distribution over X and that it has mean x̄
• Y = f(X)
• EKF approximates f to first order and ignores higher-order terms
• UKF uses f exactly, but approximates p(x)

UKF intuition: why it can perform better

[Julier and Uhlmann, 1997]

SLIDE 45

• Picks a minimal set of sample points that match the 1st, 2nd and 3rd moments of a Gaussian (see below):
• x̄ = mean, P_xx = covariance, i → i’th column, x ∈ R^n
• κ: an extra degree of freedom to fine-tune the higher-order moments of the approximation; when x is Gaussian, n + κ = 3 is a suggested heuristic
• L = √(P_xx) can be chosen to be any matrix satisfying L L^T = P_xx

Original Unscented Transform

[Julier and Uhlmann, 1997]
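The sigma points and weights of the original unscented transform [Julier and Uhlmann, 1997], restated in the slide's notation:

$$\mathcal{X}_0 = \bar{x}, \quad W_0 = \frac{\kappa}{n+\kappa}$$

$$\mathcal{X}_i = \bar{x} + \big(\sqrt{(n+\kappa)\,P_{xx}}\big)_i, \qquad \mathcal{X}_{i+n} = \bar{x} - \big(\sqrt{(n+\kappa)\,P_{xx}}\big)_i, \qquad W_i = W_{i+n} = \frac{1}{2(n+\kappa)}, \quad i = 1, \dots, n$$

The transformed mean and covariance are then estimated as $\bar{y} = \sum_i W_i f(\mathcal{X}_i)$ and $P_{yy} = \sum_i W_i \big(f(\mathcal{X}_i) - \bar{y}\big)\big(f(\mathcal{X}_i) - \bar{y}\big)^\top$.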

SLIDE 46

• Dynamics update:
  • Can simply use the unscented transform and estimate the mean and variance at the next time step from the sample points
• Observation update:
  • Use sigma points from the unscented transform to compute the covariance matrix between x_t and z_t; then do the standard update (sketch below)

Unscented Kalman filter
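A minimal Python/NumPy sketch of the unscented transform as described above; the function name and the Cholesky choice for the matrix square root are assumptions, not from the deck:

```python
import numpy as np

def unscented_transform(f, x_mean, P_xx, kappa=0.0):
    """Propagate a Gaussian N(x_mean, P_xx) through a nonlinearity f
    using the original Julier-Uhlmann sigma points.
    f maps an (n,)-shaped state to an (m,)-shaped output."""
    n = x_mean.shape[0]
    # Any L with L @ L.T == (n + kappa) * P_xx works; Cholesky is one choice
    # (requires n + kappa > 0; e.g. kappa = 3 - n per the Gaussian heuristic).
    L = np.linalg.cholesky((n + kappa) * P_xx)
    # 2n + 1 sigma points: the mean, then mean +/- each column of L.
    sigma_points = np.vstack([x_mean, x_mean + L.T, x_mean - L.T])
    weights = np.full(2 * n + 1, 1.0 / (2.0 * (n + kappa)))
    weights[0] = kappa / (n + kappa)
    # Push every sigma point through f exactly (no Jacobians needed).
    y = np.array([f(p) for p in sigma_points])
    y_mean = weights @ y
    diff = y - y_mean
    P_yy = (weights[:, None] * diff).T @ diff
    return y_mean, P_yy
```

In the dynamics update one would apply this with f(·) = f(·, u_t) to the current belief; the observation update additionally needs the cross-covariance between x_t and z_t, computed from the same sigma points, before performing the standard Kalman update.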

SLIDE 47

[Table 3.4 in Probabilistic Robotics]

SLIDE 48

UKF Summary

• Highly efficient: same complexity as EKF, a constant factor slower in typical practical applications
• Better linearization than EKF: accurate in the first two terms of the Taylor expansion (EKF: only the first term), plus capturing more aspects of the higher-order terms
• Derivative-free: no Jacobians needed
• Still not optimal!

SLIDE 49

• How to estimate A_t, B_t, C_t, Q_t, R_t from data (z_{0:T}, u_{0:T})
  • EM algorithm
• How to compute p(x_t | z_{0:T}) (= smoothing) (note the capital “T”)

Forthcoming

SLIDE 50

• Square-root Kalman filter: keeps track of the square root of the covariance matrices; equally fast, numerically more stable (a bit more complicated conceptually)
• Very large systems with sparsity structure:
  • Sparse Information Filter
• Very large systems with low-rank structure:
  • Ensemble Kalman Filter
• Kalman filtering over SE(3)
• How to estimate A_t, B_t, C_t, Q_t, R_t from data (z_{0:T}, u_{0:T})
  • EM algorithm
• How to compute p(x_t | z_{0:T}) (= smoothing) (note the capital “T”)

Things to be aware of (but we won’t cover)

SLIDE 51

• If A_t = A, Q_t = Q, C_t = C, R_t = R:
  • If the system is “observable”, then the covariances and the Kalman gain converge to steady-state values as t → ∞
  • Can take advantage of this: pre-compute them and only track the mean, which is done by multiplying the Kalman gain with the “innovation”
• The system is observable if and only if the following holds: if there were zero noise, you could determine the initial state after a finite number of time steps
• Observable if and only if: rank([C; CA; CA^2; CA^3; …; CA^(n-1)]) = n (see the sketch after this list)
• Typically, if a system is not observable, you will want to add a sensor to make it observable
• The Kalman filter can also be derived as the (recursively computed) least-squares solution to a (growing) set of linear equations

Things to be aware of (but we won’t cover)
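A small NumPy sketch of the rank test above (the helper name is an assumption, not from the deck):

```python
import numpy as np

def is_observable(A, C):
    """Rank test: the pair (A, C) is observable iff
    O = [C; CA; CA^2; ...; CA^(n-1)] has rank n."""
    n = A.shape[0]
    blocks = [C]
    for _ in range(n - 1):
        blocks.append(blocks[-1] @ A)  # next block: C A^k from C A^(k-1)
    O = np.vstack(blocks)
    return np.linalg.matrix_rank(O) == n
```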

SLIDE 52

• If the system is observable (= dual of controllable!), then the Kalman filter will converge to the true state.
• The system is observable if and only if O = [C; CA; CA^2; …; CA^(n-1)] has full column rank. (1)
  Intuition: with zero noise we observe y_0, y_1, …, and the unknown initial state x_0 satisfies:
  y_0 = C x_0, y_1 = CA x_0, …, y_K = CA^K x_0
  This system of equations has a unique solution x_0 iff the matrix [C; CA; …; CA^K] has full column rank. Because any power of a matrix of order n or higher can be written in terms of its lower powers, condition (1) is sufficient to check (i.e., the column rank will not grow any more after reaching K = n − 1).

Kalman filter property