

SLIDE 1

CS 287 Lecture 12 (Fall 2019) Kalman Filtering

Lecturer: Ignasi Clavera. Slides by Pieter Abbeel, UC Berkeley EECS.

Many slides adapted from Thrun, Burgard and Fox, Probabilistic Robotics

SLIDE 2

Outline

• Gaussians
• Kalman filtering
• Extended Kalman Filter (EKF)
• Unscented Kalman Filter (UKF) [aka “sigma-point filter”]

SLIDE 3

Outline

• Gaussians
• Kalman filtering
• Extended Kalman Filter (EKF)
• Unscented Kalman Filter (UKF) [aka “sigma-point filter”]

SLIDE 4

Multivariate Gaussians

SLIDE 5

Multivariate Gaussians

(integral of a vector = vector of integrals of each entry)

(integral of a matrix = matrix of integrals of each entry)
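The slide's formulas are the standard definitions; restated here for reference (a reconstruction, not verbatim from the deck), for $x \in \mathbb{R}^n$:

$$p(x) = (2\pi)^{-n/2}\,|\Sigma|^{-1/2}\exp\!\big(-\tfrac{1}{2}(x-\mu)^\top \Sigma^{-1} (x-\mu)\big)$$

$$\mu = \mathbb{E}[X] = \int x\, p(x)\, dx \;\;\text{(a vector of integrals)}, \qquad \Sigma = \mathbb{E}\big[(X-\mu)(X-\mu)^\top\big] = \int (x-\mu)(x-\mu)^\top p(x)\, dx \;\;\text{(a matrix of integrals)}$$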
SLIDE 6

• µ = [1; 0], Σ = [1 0; 0 1]
• µ = [-.5; 0], Σ = [1 0; 0 1]
• µ = [-1; -1.5], Σ = [1 0; 0 1]

Multivariate Gaussians: Examples

SLIDE 7

• µ = [0; 0], Σ = [1 0; 0 1]
• µ = [0; 0], Σ = [.6 0; 0 .6]
• µ = [0; 0], Σ = [2 0; 0 2]

Multivariate Gaussians: Examples

SLIDE 8

• µ = [0; 0], Σ = [1 0; 0 1]
• µ = [0; 0], Σ = [1 0.5; 0.5 1]
• µ = [0; 0], Σ = [1 0.8; 0.8 1]

Multivariate Gaussians: Examples

SLIDE 9

• µ = [0; 0], Σ = [1 0; 0 1]
• µ = [0; 0], Σ = [1 0.5; 0.5 1]
• µ = [0; 0], Σ = [1 0.8; 0.8 1]

Multivariate Gaussians: Examples

SLIDE 10

• µ = [0; 0], Σ = [1 -0.5; -0.5 1]
• µ = [0; 0], Σ = [1 -0.8; -0.8 1]
• µ = [0; 0], Σ = [3 0.8; 0.8 1]

Multivariate Gaussians: Examples

SLIDE 11

Partitioned Multivariate Gaussian

• Consider a multivariate Gaussian and partition the random vector into (X, Y):
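The partition referred to is the standard block form (reconstructed here for reference):

$$\begin{bmatrix} X \\ Y \end{bmatrix} \sim \mathcal{N}\!\left( \begin{bmatrix} \mu_x \\ \mu_y \end{bmatrix},\; \begin{bmatrix} \Sigma_{xx} & \Sigma_{xy} \\ \Sigma_{yx} & \Sigma_{yy} \end{bmatrix} \right)$$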

SLIDE 12

Partitioned Multivariate Gaussian: Dual Representation

• Precision matrix: Γ = Σ⁻¹, partitioned into the same blocks; see equation (1) below
• Straightforward to verify from (1) that the covariance blocks can be written in terms of the precision blocks
• And, swapping the roles of Σ and Γ, the precision blocks can be written in terms of the covariance blocks
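The identities being verified are the standard block-inverse relations between the covariance Σ and the precision matrix Γ (a reconstruction, not verbatim from the deck):

$$\Gamma = \Sigma^{-1} = \begin{bmatrix} \Gamma_{xx} & \Gamma_{xy} \\ \Gamma_{yx} & \Gamma_{yy} \end{bmatrix} \qquad (1)$$

$$\Sigma_{xx} = \big(\Gamma_{xx} - \Gamma_{xy}\Gamma_{yy}^{-1}\Gamma_{yx}\big)^{-1}, \qquad \Sigma_{xy} = -\big(\Gamma_{xx} - \Gamma_{xy}\Gamma_{yy}^{-1}\Gamma_{yx}\big)^{-1}\Gamma_{xy}\Gamma_{yy}^{-1}$$

and, swapping the roles of Σ and Γ:

$$\Gamma_{xx} = \big(\Sigma_{xx} - \Sigma_{xy}\Sigma_{yy}^{-1}\Sigma_{yx}\big)^{-1}, \qquad \Gamma_{xy} = -\big(\Sigma_{xx} - \Sigma_{xy}\Sigma_{yy}^{-1}\Sigma_{yx}\big)^{-1}\Sigma_{xy}\Sigma_{yy}^{-1}$$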

SLIDE 13

Marginalization: p(x) = ?

We integrate out y to find the marginal. Hence we have:
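Reconstructing the standard result the slide displays:

$$p(x) = \int p(x, y)\, dy = \mathcal{N}\big(x;\; \mu_x,\; \Sigma_{xx}\big)$$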

Note: if we had known beforehand that p(x) would be a Gaussian distribution, then we could have found the result more quickly: we would only have needed E[X] = µx and Cov(X) = Σxx, both of which are directly available from the partitioned parameterization.

SLIDE 14

If

$$\begin{bmatrix} X \\ Y \end{bmatrix} \sim \mathcal{N}\!\left( \begin{bmatrix} \mu_x \\ \mu_y \end{bmatrix},\; \begin{bmatrix} \Sigma_{xx} & \Sigma_{xy} \\ \Sigma_{yx} & \Sigma_{yy} \end{bmatrix} \right)$$

then

$$X \sim \mathcal{N}(\mu_x, \Sigma_{xx}), \qquad Y \sim \mathcal{N}(\mu_y, \Sigma_{yy})$$

Marginalization Recap

SLIDE 15

Self-quiz

SLIDE 16

Conditioning: p(x | Y = y0) = ?

We have

$$p(x \mid Y = y_0) = \frac{p(x, y_0)}{p(y_0)}$$

Hence we have:

$$X \mid Y = y_0 \sim \mathcal{N}\big(\mu_x + \Sigma_{xy}\Sigma_{yy}^{-1}(y_0 - \mu_y),\;\; \Sigma_{xx} - \Sigma_{xy}\Sigma_{yy}^{-1}\Sigma_{yx}\big)$$

  • The conditional mean shifts according to the correlation and the measurement variance
  • The conditional covariance does not depend on y0
SLIDE 17

If

$$\begin{bmatrix} X \\ Y \end{bmatrix} \sim \mathcal{N}\!\left( \begin{bmatrix} \mu_x \\ \mu_y \end{bmatrix},\; \begin{bmatrix} \Sigma_{xx} & \Sigma_{xy} \\ \Sigma_{yx} & \Sigma_{yy} \end{bmatrix} \right)$$

then

$$X \mid Y = y_0 \sim \mathcal{N}\big(\mu_x + \Sigma_{xy}\Sigma_{yy}^{-1}(y_0 - \mu_y),\;\; \Sigma_{xx} - \Sigma_{xy}\Sigma_{yy}^{-1}\Sigma_{yx}\big)$$

Conditioning Recap

SLIDE 18

Outline

• Gaussians
• Kalman filtering
• Extended Kalman Filter (EKF)
• Unscented Kalman Filter (UKF) [aka “sigma-point filter”]

SLIDE 19

• Kalman Filter = special case of a Bayes’ filter in which the dynamics and sensor models are linear Gaussian:

Kalman Filter
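The models referred to are the standard linear-Gaussian ones, written with the A_t, B_t, C_t, Q_t, R_t notation that appears later in the deck:

$$x_{t+1} = A_t x_t + B_t u_t + w_t, \qquad w_t \sim \mathcal{N}(0, Q_t)$$

$$z_t = C_t x_t + v_t, \qquad v_t \sim \mathcal{N}(0, R_t)$$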
SLIDE 20

Time update

• Assume we have a current belief for x_t:
• Then, after one time step passes:

[Figure: two-node graphical model, X_t → X_{t+1}]

SLIDE 21

• Now we can choose to continue in either of two ways:
• (i) mold it into the standard multivariate Gaussian format so we can read off the joint distribution’s mean and covariance
• (ii) observe that this is a quadratic form in x_t and x_{t+1} in the exponent, and the exponent is the only place they appear; hence we know this is a multivariate Gaussian, and we can directly compute its mean and covariance [usually simpler!]

Time Update: Finding the joint

SLIDE 22

• We follow (ii) and find the mean and covariance matrices of the joint (reconstructed below).

[Exercise: Try to prove each of these without referring to this slide!]

Time Update: Finding the joint
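For reference, the standard result (a reconstruction, not verbatim from the deck): if x_t ~ N(µ_t, Σ_t) and x_{t+1} = A_t x_t + B_t u_t + w_t with w_t ~ N(0, Q_t) independent of x_t, then

$$\begin{bmatrix} x_t \\ x_{t+1} \end{bmatrix} \sim \mathcal{N}\!\left( \begin{bmatrix} \mu_t \\ A_t\mu_t + B_t u_t \end{bmatrix},\; \begin{bmatrix} \Sigma_t & \Sigma_t A_t^\top \\ A_t \Sigma_t & A_t \Sigma_t A_t^\top + Q_t \end{bmatrix} \right)$$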

SLIDE 23

Time Update Recap

• Assume we have the belief over x_t
• Then we have the joint over (x_t, x_{t+1})
• Marginalizing the joint, we immediately get the predicted belief over x_{t+1} (see below)

[Figure: two-node graphical model, X_t → X_{t+1}]
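Concretely, the standard predicted-belief equations, with the bar marking the pre-measurement belief (a reconstruction):

$$x_t \sim \mathcal{N}(\mu_t, \Sigma_t) \;\Longrightarrow\; x_{t+1} \sim \mathcal{N}(\bar{\mu}_{t+1}, \bar{\Sigma}_{t+1}), \qquad \bar{\mu}_{t+1} = A_t \mu_t + B_t u_t, \quad \bar{\Sigma}_{t+1} = A_t \Sigma_t A_t^\top + Q_t$$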

SLIDE 24

Generality!

• Assume we have
• Then we have
• Marginalizing the joint, we immediately get

[Figure: graphical model including the noise variables W and V]

SLIDE 25

Observation update

• Assume we have:
• Then:
• And, by conditioning on z_{t+1} (see the earlier slides on Gaussians) we readily get:

[Figure: two-node graphical model, X_{t+1} → Z_{t+1}]
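The joint and the resulting conditional are the standard ones (a reconstruction, using the bar notation for the predicted belief):

$$\begin{bmatrix} x_{t+1} \\ z_{t+1} \end{bmatrix} \sim \mathcal{N}\!\left( \begin{bmatrix} \bar{\mu}_{t+1} \\ C_{t+1}\bar{\mu}_{t+1} \end{bmatrix},\; \begin{bmatrix} \bar{\Sigma}_{t+1} & \bar{\Sigma}_{t+1} C_{t+1}^\top \\ C_{t+1}\bar{\Sigma}_{t+1} & C_{t+1}\bar{\Sigma}_{t+1} C_{t+1}^\top + R_{t+1} \end{bmatrix} \right)$$

so conditioning on the observed value z_{t+1} via the Gaussian conditioning formula gives

$$\mu_{t+1} = \bar{\mu}_{t+1} + \bar{\Sigma}_{t+1} C_{t+1}^\top \big(C_{t+1}\bar{\Sigma}_{t+1} C_{t+1}^\top + R_{t+1}\big)^{-1}\big(z_{t+1} - C_{t+1}\bar{\mu}_{t+1}\big)$$

$$\Sigma_{t+1} = \bar{\Sigma}_{t+1} - \bar{\Sigma}_{t+1} C_{t+1}^\top \big(C_{t+1}\bar{\Sigma}_{t+1} C_{t+1}^\top + R_{t+1}\big)^{-1} C_{t+1}\bar{\Sigma}_{t+1}$$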

SLIDE 26

• At time 0:
• For t = 1, 2, …
  • Dynamics update:
  • Measurement update:
• Often written as: (Kalman gain) × “innovation” (see the equations below)

Complete Kalman Filtering Algorithm
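For reference, the standard equations of the complete algorithm (as in Probabilistic Robotics; a reconstruction, not verbatim from the deck):

$$\bar{\mu}_{t+1} = A_t \mu_t + B_t u_t, \qquad \bar{\Sigma}_{t+1} = A_t \Sigma_t A_t^\top + Q_t$$

$$K_{t+1} = \bar{\Sigma}_{t+1} C_{t+1}^\top \big( C_{t+1} \bar{\Sigma}_{t+1} C_{t+1}^\top + R_{t+1} \big)^{-1} \quad \text{(Kalman gain)}$$

$$\mu_{t+1} = \bar{\mu}_{t+1} + K_{t+1}\underbrace{\big(z_{t+1} - C_{t+1}\bar{\mu}_{t+1}\big)}_{\text{innovation}}, \qquad \Sigma_{t+1} = (I - K_{t+1} C_{t+1})\, \bar{\Sigma}_{t+1}$$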

SLIDE 27

Kalman Filter Summary

• Highly efficient: polynomial in measurement dimensionality k and state dimensionality n: O(k^2.376 + n^2)
• Optimal for linear Gaussian systems!

SLIDE 28

Outline

• Gaussians
• Kalman filtering
• Extended Kalman Filter (EKF)
• Unscented Kalman Filter (UKF) [aka “sigma-point filter”]

SLIDE 29

Nonlinear Dynamical Systems

• Most realistic robotic problems involve nonlinear functions (see below):
• Versus the linear setting:
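In symbols (writing f for the dynamics function and h for the measurement function; the symbol names are assumed, as the slide's equations did not survive):

$$x_{t+1} = f(x_t, u_t) + w_t, \qquad z_t = h(x_t) + v_t$$

versus the linear setting $x_{t+1} = A_t x_t + B_t u_t + w_t$, $\; z_t = C_t x_t + v_t$.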

SLIDE 30

Linearity Assumption Revisited

[Figure: a Gaussian p(x) pushed through a linear function y = ax + b; the resulting p(y) is again Gaussian]

SLIDE 31

Linearity Assumption Revisited

[Figure: a Gaussian p(x) pushed through a linear function y = ax + b; the resulting p(y) is again Gaussian]

SLIDE 32

Non-linear Function

The “Gaussian of p(y)” has the mean and variance of y under p(y).

[Figure: a Gaussian p(x) pushed through a nonlinear function; the resulting p(y) is non-Gaussian, shown with its moment-matched Gaussian]

SLIDE 33

EKF Linearization (1)

SLIDE 34

EKF Linearization (2)

p(x) has HIGH variance relative to the region in which the linearization is accurate.

SLIDE 35

EKF Linearization (3)

p(x) has LOW variance relative to the region in which the linearization is accurate.

SLIDE 36

• Dynamics model: for x_t “close to” µ_t we have:
• Measurement model: for x_t “close to” µ_t we have:

EKF Linearization: First Order Taylor Series Expansion
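A reconstruction of the expansions, writing F_t and H_{t+1} for the Jacobians and linearizing the measurement model around the predicted mean (these symbol choices are assumptions, not from the deck):

$$f(x_t, u_t) \approx f(\mu_t, u_t) + F_t\,(x_t - \mu_t), \qquad F_t = \frac{\partial f}{\partial x}\Big|_{(\mu_t,\, u_t)}$$

$$h(x_{t+1}) \approx h(\bar{\mu}_{t+1}) + H_{t+1}\,(x_{t+1} - \bar{\mu}_{t+1}), \qquad H_{t+1} = \frac{\partial h}{\partial x}\Big|_{\bar{\mu}_{t+1}}$$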

SLIDE 37

• At time 0:
• For t = 1, 2, …
  • Dynamics update:
  • Measurement update:

EKF Algorithm
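A reconstruction of the standard EKF recursions under the linearization above (Jacobians F_t and H_{t+1} evaluated at the current mean estimates):

$$\bar{\mu}_{t+1} = f(\mu_t, u_t), \qquad \bar{\Sigma}_{t+1} = F_t \Sigma_t F_t^\top + Q_t$$

$$K_{t+1} = \bar{\Sigma}_{t+1} H_{t+1}^\top \big( H_{t+1} \bar{\Sigma}_{t+1} H_{t+1}^\top + R_{t+1} \big)^{-1}$$

$$\mu_{t+1} = \bar{\mu}_{t+1} + K_{t+1}\big( z_{t+1} - h(\bar{\mu}_{t+1}) \big), \qquad \Sigma_{t+1} = (I - K_{t+1} H_{t+1})\, \bar{\Sigma}_{t+1}$$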

SLIDE 38

EKF Summary

• Highly efficient: polynomial in measurement dimensionality k and state dimensionality n: O(k^2.376 + n^2)
• Not optimal!
• Can diverge if nonlinearities are large!
• Works surprisingly well even when all assumptions are violated!

SLIDE 39

Outline

• Gaussians
• Kalman filtering
• Extended Kalman Filter (EKF)
• Unscented Kalman Filter (UKF) [aka “sigma-point filter”]

SLIDE 40

Linearization via Unscented Transform

[Figure: side-by-side comparison panels, EKF vs. UKF]

SLIDE 41

UKF Sigma-Point Estimate (2)

[Figure: side-by-side comparison panels, EKF vs. UKF]

SLIDE 42

UKF Sigma-Point Estimate (3)

[Figure: side-by-side comparison panels, EKF vs. UKF]

SLIDE 43

UKF Sigma-Point Estimate (4)

SLIDE 44

• Assume we know the distribution over X and that it has mean x̄
• Y = f(X)
• EKF approximates f to first order and ignores higher-order terms
• UKF uses f exactly, but approximates p(x)

UKF intuition: why it can perform better

[Julier and Uhlmann, 1997]

SLIDE 45

• Picks a minimal set of sample points that match the 1st, 2nd and 3rd moments of a Gaussian (see below):
• x̄ = mean, P_xx = covariance, i → i’th column, x ∈ R^n
• κ: an extra degree of freedom to fine-tune the higher-order moments of the approximation; when x is Gaussian, n + κ = 3 is a suggested heuristic
• L = √(P_xx) can be chosen to be any matrix satisfying L L^T = P_xx

Original Unscented Transform

[Julier and Uhlmann, 1997]
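The sigma points and weights of the original unscented transform [Julier and Uhlmann, 1997], restated in the slide's notation:

$$\mathcal{X}_0 = \bar{x}, \quad W_0 = \frac{\kappa}{n+\kappa}$$

$$\mathcal{X}_i = \bar{x} + \big(\sqrt{(n+\kappa)\,P_{xx}}\big)_i, \qquad \mathcal{X}_{i+n} = \bar{x} - \big(\sqrt{(n+\kappa)\,P_{xx}}\big)_i, \qquad W_i = W_{i+n} = \frac{1}{2(n+\kappa)}, \quad i = 1, \dots, n$$

The transformed mean and covariance are then estimated as $\bar{y} = \sum_i W_i f(\mathcal{X}_i)$ and $P_{yy} = \sum_i W_i \big(f(\mathcal{X}_i) - \bar{y}\big)\big(f(\mathcal{X}_i) - \bar{y}\big)^\top$.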

SLIDE 46

• Dynamics update:
  • Can simply use the unscented transform and estimate the mean and variance at the next time step from the sample points
• Observation update:
  • Use sigma points from the unscented transform to compute the covariance matrix between x_t and z_t; then do the standard update (sketch below)

Unscented Kalman filter
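A minimal Python/NumPy sketch of the unscented transform as described above; the function name and the Cholesky choice for the matrix square root are assumptions, not from the deck:

```python
import numpy as np

def unscented_transform(f, x_mean, P_xx, kappa=0.0):
    """Propagate a Gaussian N(x_mean, P_xx) through a nonlinearity f
    using the original Julier-Uhlmann sigma points.
    f maps an (n,)-shaped state to an (m,)-shaped output."""
    n = x_mean.shape[0]
    # Any L with L @ L.T == (n + kappa) * P_xx works; Cholesky is one choice
    # (requires n + kappa > 0; e.g. kappa = 3 - n per the Gaussian heuristic).
    L = np.linalg.cholesky((n + kappa) * P_xx)
    # 2n + 1 sigma points: the mean, then mean +/- each column of L.
    sigma_points = np.vstack([x_mean, x_mean + L.T, x_mean - L.T])
    weights = np.full(2 * n + 1, 1.0 / (2.0 * (n + kappa)))
    weights[0] = kappa / (n + kappa)
    # Push every sigma point through f exactly (no Jacobians needed).
    y = np.array([f(p) for p in sigma_points])
    y_mean = weights @ y
    diff = y - y_mean
    P_yy = (weights[:, None] * diff).T @ diff
    return y_mean, P_yy
```

In the dynamics update one would apply this with f(·) = f(·, u_t) to the current belief; the observation update additionally needs the cross-covariance between x_t and z_t, computed from the same sigma points, before performing the standard Kalman update.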

SLIDE 47

[Table 3.4 in Probabilistic Robotics]

SLIDE 48

UKF Summary

• Highly efficient: same complexity as EKF, a constant factor slower in typical practical applications
• Better linearization than EKF: accurate in the first two terms of the Taylor expansion (EKF: only the first term), plus capturing more aspects of the higher-order terms
• Derivative-free: no Jacobians needed
• Still not optimal!

SLIDE 49

• How to estimate A_t, B_t, C_t, Q_t, R_t from data (z_{0:T}, u_{0:T})
  • EM algorithm
• How to compute p(x_t | z_{0:T}) (= smoothing) (note the capital “T”)

Forthcoming

SLIDE 50

• Square-root Kalman filter: keeps track of the square root of the covariance matrices; equally fast, numerically more stable (a bit more complicated conceptually)
• Very large systems with sparsity structure:
  • Sparse Information Filter
• Very large systems with low-rank structure:
  • Ensemble Kalman Filter
• Kalman filtering over SE(3)
• How to estimate A_t, B_t, C_t, Q_t, R_t from data (z_{0:T}, u_{0:T})
  • EM algorithm
• How to compute p(x_t | z_{0:T}) (= smoothing) (note the capital “T”)

Things to be aware of (but we won’t cover)

SLIDE 51

• If A_t = A, Q_t = Q, C_t = C, R_t = R:
  • If the system is “observable”, then the covariances and the Kalman gain converge to steady-state values as t → ∞
  • Can take advantage of this: pre-compute them and only track the mean, which is done by multiplying the Kalman gain with the “innovation”
• The system is observable if and only if the following holds: if there were zero noise, you could determine the initial state after a finite number of time steps
• Observable if and only if: rank([C; CA; CA^2; CA^3; …; CA^(n-1)]) = n (see the sketch after this list)
• Typically, if a system is not observable, you will want to add a sensor to make it observable
• The Kalman filter can also be derived as the (recursively computed) least-squares solution to a (growing) set of linear equations

Things to be aware of (but we won’t cover)
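A small NumPy sketch of the rank test above (the helper name is an assumption, not from the deck):

```python
import numpy as np

def is_observable(A, C):
    """Rank test: the pair (A, C) is observable iff
    O = [C; CA; CA^2; ...; CA^(n-1)] has rank n."""
    n = A.shape[0]
    blocks = [C]
    for _ in range(n - 1):
        blocks.append(blocks[-1] @ A)  # next block: C A^k from C A^(k-1)
    O = np.vstack(blocks)
    return np.linalg.matrix_rank(O) == n
```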

SLIDE 52

• If the system is observable (= dual of controllable!), then the Kalman filter will converge to the true state.
• The system is observable if and only if O = [C; CA; CA^2; …; CA^(n-1)] has full column rank. (1)
  Intuition: with zero noise we observe y_0, y_1, …, and the unknown initial state x_0 satisfies:
  y_0 = C x_0, y_1 = CA x_0, …, y_K = CA^K x_0
  This system of equations has a unique solution x_0 iff the matrix [C; CA; …; CA^K] has full column rank. Because any power of a matrix of order n or higher can be written in terms of its lower powers, condition (1) is sufficient to check (i.e., the column rank will not grow any more after reaching K = n − 1).

Kalman filter property