CS 287 Advanced Robotics (Fall 2019) Lecture 13: Kalman Smoother, Maximum A Posteriori, Maximum Likelihood, Expectation Maximization
Pieter Abbeel UC Berkeley EECS
Outline
- Kalman smoothing
- Maximum a posteriori sequence
- Maximum likelihood
- Maximum a posteriori parameters
- Expectation maximization
- Note: by now it should be clear that the "u" (control input) variables don't change anything conceptually, so we leave them out to reduce the number of symbols in our equations.
[Figure: graphical models — filtering: states x_0, …, x_{t-1}, x_t with observations z_0, …, z_t; smoothing: states x_0, …, x_T with observations z_0, …, z_T.]
- Generally, recursively compute:
  - Forward (same as the filter): a_t(x_t) = P(x_t, z_0, …, z_t)
    a_{t+1}(x_{t+1}) = P(z_{t+1} | x_{t+1}) ∫ P(x_{t+1} | x_t) a_t(x_t) dx_t
  - Backward: b_t(x_t) = P(z_{t+1}, …, z_T | x_t), with b_T(x_T) = 1
    b_t(x_t) = ∫ P(x_{t+1} | x_t) P(z_{t+1} | x_{t+1}) b_{t+1}(x_{t+1}) dx_{t+1}
  - Combine: P(x_t, z_0, …, z_T) = a_t(x_t) b_t(x_t)
- Note 1: one forward+backward pass yields the result for all times t
- Note 2: find P(x_t | z_0, …, z_T) by renormalizing
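For a discrete-state model the integrals become sums, and the forward-backward recursion above can be sketched in a few lines of NumPy. This is an illustrative sketch (function name and brute-force check are not from the slides):

```python
import numpy as np

def forward_backward(P_trans, P_obs, pi, obs):
    """Smoothed marginals P(x_t | z_0..z_T) for a discrete HMM.
    P_trans[i, j] = P(x_{t+1}=j | x_t=i); P_obs[i, k] = P(z=k | x=i);
    pi = initial distribution; obs = observed symbol indices z_0..z_T."""
    T, n = len(obs), len(pi)
    a = np.zeros((T, n))          # a_t(x) = P(x_t = x, z_0..z_t)
    b = np.zeros((T, n))          # b_t(x) = P(z_{t+1}..z_T | x_t = x)
    a[0] = pi * P_obs[:, obs[0]]
    for t in range(T - 1):        # forward pass (same as the filter)
        a[t + 1] = (a[t] @ P_trans) * P_obs[:, obs[t + 1]]
    b[T - 1] = 1.0
    for t in range(T - 2, -1, -1):  # backward pass
        b[t] = P_trans @ (P_obs[:, obs[t + 1]] * b[t + 1])
    joint = a * b                 # P(x_t = x, z_0..z_T)
    return joint / joint.sum(axis=1, keepdims=True)  # renormalize
```

One forward and one backward pass give the smoothed marginals for all t at once, matching Note 1 above.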
- Find the pairwise marginal P(x_t, x_{t+1}, z_0, …, z_T):
  P(x_t, x_{t+1}, z_0, …, z_T)
  = P(x_t, z_0, …, z_t) P(x_{t+1} | x_t, z_0, …, z_t) P(z_{t+1} | x_{t+1}, x_t, z_0, …, z_t) P(z_{t+2}, …, z_T | x_{t+1}, x_t, z_0, …, z_{t+1})   (chain rule)
  = P(x_t, z_0, …, z_t) P(x_{t+1} | x_t) P(z_{t+1} | x_{t+1}) P(z_{t+2}, …, z_T | x_{t+1})   (Markov assumptions)
  = a_t(x_t) P(x_{t+1} | x_t) P(z_{t+1} | x_{t+1}) b_{t+1}(x_{t+1})   (definitions of a, b)
- Recall that a_t and b_t are available from the forward and backward passes, so we can readily compute this.
- Find P(x_t | z_0, …, z_T) in the linear Gaussian setting = the smoother algorithm just covered, for this particular case
- We already know how to compute the forward pass: it is the Kalman filter
- Backward pass: propagate b_t backward in time
- Combination: multiply a_t and b_t and renormalize
- Exercise: work out the integral for b_t
Example system (Matlab):

T = 100;   % horizon length (example value; not specified on the slide)
A = [0.99 0.0074; -0.0136 0.99];
C = [1 1; -1 1];
x(:,1) = [-3; 2];
Sigma_w = diag([.3 .7]);
Sigma_v = [2 .05; .05 1.5];
w = sqrtm(Sigma_w)*randn(2,T);
v = sqrtm(Sigma_v)*randn(2,T);
for t=1:T-1
    x(:,t+1) = A*x(:,t) + w(:,t);
    z(:,t) = C*x(:,t) + v(:,t);
end
% now recover the state from the measurements
P_0 = diag([100 100]); x0 = [0; 0];
% run Kalman filter and smoother here
% + plot
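The "run Kalman filter and smoother here" step can be sketched in Python/NumPy as a standard Kalman filter followed by a Rauch-Tung-Striebel (RTS) backward pass. This is a minimal sketch for the slide's model, not the lecture's reference implementation:

```python
import numpy as np

def kalman_filter_smoother(z, A, C, Sigma_w, Sigma_v, x0, P0):
    """Kalman filter + RTS smoother for x_{t+1} = A x_t + w, z_t = C x_t + v.
    z: (T, k) measurements; x0, P0: prior mean/covariance on x_0.
    Returns filtered and smoothed means and covariances."""
    T, n = z.shape[0], x0.shape[0]
    xf = np.zeros((T, n)); Pf = np.zeros((T, n, n))   # filtered
    xp = np.zeros((T, n)); Pp = np.zeros((T, n, n))   # one-step predicted
    x_pred, P_pred = x0, P0
    for t in range(T):
        xp[t], Pp[t] = x_pred, P_pred
        # measurement update
        S = C @ P_pred @ C.T + Sigma_v
        K = P_pred @ C.T @ np.linalg.inv(S)
        xf[t] = x_pred + K @ (z[t] - C @ x_pred)
        Pf[t] = P_pred - K @ C @ P_pred
        # time update
        x_pred = A @ xf[t]
        P_pred = A @ Pf[t] @ A.T + Sigma_w
    # backward (RTS) pass
    xs = xf.copy(); Ps = Pf.copy()
    for t in range(T - 2, -1, -1):
        J = Pf[t] @ A.T @ np.linalg.inv(Pp[t + 1])
        xs[t] = xf[t] + J @ (xs[t + 1] - xp[t + 1])
        Ps[t] = Pf[t] + J @ (Ps[t + 1] - Pp[t + 1]) @ J.T
    return xf, Pf, xs, Ps
```

Smoothing can only shrink the posterior covariance relative to filtering, since it conditions on strictly more measurements.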
Outline
- Kalman smoothing
- Maximum a posteriori sequence
- Maximum likelihood
- Maximum a posteriori parameters
- Expectation maximization
[Figure: graphical model — states x_0, …, x_{t-1}, x_t, x_{t+1}, …, x_T with observations z_0, …, z_T.]
- Generally: find the most likely state sequence, argmax over x_0, …, x_T of P(x_0, …, x_T | z_0, …, z_T)
- Naively solving by enumerating all possible combinations of x_0, …, x_T is exponential in T
- Dynamic programming (the Viterbi algorithm) finds the MAP sequence in O(T n^2) time for n discrete states
- Kalman (continuous-state) setting: summations → integrals
  - But: can't enumerate over all instantiations
  - However, we can still find the solution efficiently:
    - the joint conditional P(x_{0:T} | z_{0:T}) is a multivariate Gaussian
    - for a multivariate Gaussian the most likely instantiation equals the mean → we just need to find the mean of P(x_{0:T} | z_{0:T})
    - the marginal conditionals P(x_t | z_{0:T}) are Gaussians with mean equal to the mean of x_t under the joint conditional, so it suffices to find all marginal conditionals
    - we already know how to do so: the marginal conditionals can be computed by running the Kalman smoother
- Alternatively: solve a convex optimization problem (the negative log of the joint conditional is a convex quadratic in x_{0:T})
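For the discrete case, the O(T n^2) dynamic program mentioned above is the Viterbi algorithm. A minimal NumPy sketch (illustrative, not from the slides):

```python
import numpy as np

def viterbi(P_trans, P_obs, pi, obs):
    """MAP state sequence argmax P(x_0..x_T | z_0..z_T) for a discrete HMM,
    in O(T n^2) via dynamic programming."""
    T, n = len(obs), len(pi)
    delta = np.zeros((T, n))              # best joint prob ending in state j at time t
    back = np.zeros((T, n), dtype=int)    # best predecessor of state j at time t
    delta[0] = pi * P_obs[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] * P_trans   # scores[i, j]: come from i, go to j
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) * P_obs[:, obs[t]]
    # backtrack from the best final state
    path = [int(delta[T - 1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]
```

Unlike the smoother, which sums over all sequences, Viterbi maximizes over them; only the max and argmax in the recursion differ.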
Outline
- Kalman smoothing
- Maximum a posteriori sequence
- Maximum likelihood
- Maximum a posteriori parameters
- Expectation maximization
- Let θ = P(up), 1-θ = P(down)
- How to determine θ?
- Empirical estimate: 8 up, 2 down → θ = 8/10 = 0.8

Source: http://web.me.com/todd6ton/Site/Classroom_Blog/Entries/2009/10/7_A_Thumbtack_Experiment.html
- Likelihood of the data: L(θ) = θ^8 (1-θ)^2
- Set the derivative to zero: dL/dθ = θ^7 (1-θ) (8 - 10θ) = 0
  → extrema at θ = 0, θ = 1, θ = 0.8
  → inspection of each extremum yields θ_ML = 0.8
- More generally, consider a binary-valued random variable with θ = P(1), 1-θ = P(0); assume we observe n_1 ones and n_0 zeros
- Likelihood: L(θ) = θ^{n_1} (1-θ)^{n_0}
- Derivative: dL/dθ = θ^{n_1 - 1} (1-θ)^{n_0 - 1} (n_1 (1-θ) - n_0 θ)
- Hence we have for the extrema: θ = 0, θ = 1, θ = n_1/(n_0 + n_1)
- θ = n_1/(n_0 + n_1) is the maximum → θ_ML = n_1/(n_0 + n_1), the ratio of empirical counts.
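A quick numeric sanity check of this extremum, here by brute-force grid search over θ rather than the derivative (counts n_1 = 8, n_0 = 2 from the thumbtack example):

```python
import numpy as np

# Likelihood L(theta) = theta^n1 * (1 - theta)^n0 for n1 = 8 ones, n0 = 2 zeros
n1, n0 = 8, 2
theta = np.linspace(0.0, 1.0, 100001)
L = theta**n1 * (1.0 - theta)**n0
theta_ml = theta[np.argmax(L)]
print(theta_ml)  # close to n1 / (n0 + n1) = 0.8
```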
- In practice we typically work with the log-likelihood:
  - Likelihood: L(θ) = θ^{n_1} (1-θ)^{n_0}
  - Log-likelihood: ℓ(θ) = n_1 log θ + n_0 log(1-θ)
- log is monotonically increasing, so maximizing ℓ gives the same θ_ML, and sums are easier to differentiate than products
[Figure: a concave function; for any x_1, x_2, the function value at λ x_1 + (1-λ) x_2 lies above the chord between (x_1, f(x_1)) and (x_2, f(x_2)).]
- Consider having received samples x^(1), …, x^(m), e.g., 3.1, 8.2, 1.7, assumed drawn i.i.d. from a Gaussian N(μ, σ²)

[Figure: Gaussian densities. Source: Wikipedia]

- Log-likelihood: ℓ(μ, σ²) = -m log(σ √(2π)) - Σ_i (x^(i) - μ)² / (2σ²)
- Setting the derivatives w.r.t. μ and σ² equal to zero gives:
  μ_ML = (1/m) Σ_i x^(i)
  σ²_ML = (1/m) Σ_i (x^(i) - μ_ML)²
- Similarly, for samples from a multivariate Gaussian N(μ, Σ):
  μ_ML = (1/m) Σ_i x^(i)
  Σ_ML = (1/m) Σ_i (x^(i) - μ_ML)(x^(i) - μ_ML)^T
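These closed-form estimates are easy to check numerically on the slide's three samples; note the 1/m normalizer (not the unbiased 1/(m-1)):

```python
import numpy as np

# ML estimates for a Gaussian from the samples on the slide
x = np.array([3.1, 8.2, 1.7])
mu_ml = x.mean()                       # (1/m) * sum_i x(i)  ->  13/3 ~= 4.33
var_ml = ((x - mu_ml) ** 2).mean()     # (1/m) * sum_i (x(i) - mu_ml)^2
print(mu_ml, var_ml)
```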
Outline
- Kalman smoothing
- Maximum a posteriori sequence
- Maximum likelihood
- Maximum a posteriori parameters
- Expectation maximization
- Alternatively, consider θ to be a random variable
- Prior: P(θ) = C θ(1-θ)
- Measurements: P(x^(i) | θ)
- Posterior: P(θ | x^(1), …, x^(m)) ∝ P(θ) Π_i P(x^(i) | θ)
- Maximum A Posteriori (MAP) estimation = find the θ that maximizes the posterior
  - e.g., for 8 up / 2 down: posterior ∝ θ^9 (1-θ)^3 → θ_MAP = 9/12 = 0.75

Figure source: Wikipedia
- The Beta distribution, Beta(a, b) ∝ θ^{a-1} (1-θ)^{b-1}, generalizes this prior
- The MAP estimate then corresponds to adding fake counts to the data: θ_MAP = (n_1 + a - 1)/(n_0 + n_1 + a + b - 2)
- MAP for the mean of a Gaussian; assume the variance known. (Can be extended to also find the MAP for the variance.)
- Prior: μ ~ N(μ_0, σ_0²)
- Posterior: P(μ | x^(1), …, x^(m)) ∝ exp(-(μ - μ_0)²/(2σ_0²)) Π_i exp(-(x^(i) - μ)²/(2σ²))
- μ_MAP = (μ_0/σ_0² + Σ_i x^(i)/σ²) / (1/σ_0² + m/σ²), a precision-weighted average of the prior mean and the data [Interpret!]

[Figure: true parameter, samples, ML estimate, and MAP estimate.]
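A numeric check of the MAP idea on the earlier thumbtack example (prior P(θ) = Cθ(1-θ), data 8 up / 2 down), again via a brute-force grid search sketch:

```python
import numpy as np

# Posterior proportional to prior * likelihood: theta(1-theta) * theta^8 (1-theta)^2
n1, n0 = 8, 2
theta = np.linspace(0.0, 1.0, 100001)
post = theta * (1 - theta) * theta**n1 * (1 - theta)**n0   # = theta^9 (1-theta)^3
theta_map = theta[np.argmax(post)]
print(theta_map)  # (n1 + 1) / (n1 + n0 + 2) = 0.75, pulled toward 1/2 vs. theta_ML = 0.8
```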
How to choose the prior (e.g., the fake counts)? Cross-validation:
- Train: compute θ_MAP on the training set
- Cross-validate: evaluate performance on the validation set by evaluating the likelihood of the validation data under the θ_MAP just found
- For the chosen prior, compute θ_MAP on the (training + validation) set
- Typical training / validation splits:
  - 1-fold: 70/30, random split
  - 10-fold: partition into 10 sets; average performance over the 10 runs, each with one set as validation and the other 9 as training
Outline
- Kalman smoothing
- Maximum a posteriori sequence
- Maximum likelihood
- Maximum a posteriori parameters
- Expectation maximization
- With latent variables, setting derivatives w.r.t. θ, μ, Σ equal to zero does not yield the ML estimates in closed form
- We can evaluate the likelihood function → we can in principle perform local optimization
- In this lecture: the "EM" algorithm, which is typically used to efficiently optimize the objective (locally)
- Example: mixture of two Gaussians
- Model: hidden component label x^(i) ∈ {1, 2}; observation z^(i) | x^(i) = k ~ N(μ_k, σ²)
- Goal: given data z^(1), …, z^(m) (but no x^(i) observed), find maximum likelihood estimates of μ_1, μ_2
- EM basic idea: if the x^(i) were known → two easy-to-solve separate ML problems
- EM iterates over:
  - E-step: for i = 1, …, m, fill in the missing data x^(i) according to what is most likely given the current model θ
  - M-step: run ML for the completed data, which gives a new model θ
- EM solves a maximum likelihood problem of the form max_θ log P(z; θ) = max_θ log Σ_x P(x, z; θ), where
  - θ: parameters of the probabilistic model we try to find
  - x: unobserved variables
  - z: observed variables
Jensen’s Inequality
x1
x2
E[X] = λx1+(1-λ)x2
Illustration: P(X=x1) = 1-λ, P(X=x2) = λ
EM algorithm: iterate
1. E-step: set q(x) = P(x | z; θ) for the current θ
2. M-step: θ ← argmax_θ Σ_x q(x) log ( P(x, z; θ) / q(x) )

This maximizes the lower bound obtained from Jensen's inequality (log is concave):
log P(z; θ) = log Σ_x q(x) P(x, z; θ)/q(x) ≥ Σ_x q(x) log ( P(x, z; θ)/q(x) )

- Jensen's inequality: equality holds when P(x, z; θ)/q(x) is a constant (in x); this is achieved for q(x) = P(x | z; θ)
- The M-step optimization can be done efficiently in most cases
- The E-step is usually the more expensive step; it does not fill in the missing data x with hard values, but finds a distribution q(x)
EM for a mixture of Gaussians:
- X ~ Multinomial distribution, P(X = k; θ) = θ_k
- Z | X = k ~ N(μ_k, Σ_k)
- Observed: z^(1), z^(2), …, z^(m); the component labels x^(i) are unobserved
- E-step: γ_k^(i) = P(x^(i) = k | z^(i); θ, μ, Σ) ∝ θ_k N(z^(i); μ_k, Σ_k)
- M-step:
  θ_k = (1/m) Σ_i γ_k^(i)
  μ_k = Σ_i γ_k^(i) z^(i) / Σ_i γ_k^(i)
  Σ_k = Σ_i γ_k^(i) (z^(i) - μ_k)(z^(i) - μ_k)^T / Σ_i γ_k^(i)
- Without EM: no simple decomposition into independent ML problems for each θ_k and each (μ_k, Σ_k); no closed-form solution found by setting derivatives equal to zero
- With EM: θ, μ, Σ are computed in closed form from the "soft" counts γ_k^(i)
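A minimal 1-D sketch of these E/M updates (illustrative, not the lecture's code); it also tracks the log-likelihood, which EM guarantees never decreases across iterations:

```python
import numpy as np

def em_gmm_1d(z, mu, sigma2, theta, iters=50):
    """EM for a 1-D mixture of Gaussians.
    z: (m,) data; mu, sigma2, theta: (K,) initial means, variances, weights.
    Returns updated parameters and the per-iteration log-likelihoods."""
    lls = []
    for _ in range(iters):
        # E-step: soft counts gamma[i, k] = P(x(i) = k | z(i); current params)
        dens = np.exp(-0.5 * (z[:, None] - mu) ** 2 / sigma2) / np.sqrt(2 * np.pi * sigma2)
        joint = theta * dens                       # theta_k * N(z; mu_k, sigma2_k)
        lls.append(np.log(joint.sum(axis=1)).sum())
        gamma = joint / joint.sum(axis=1, keepdims=True)
        # M-step: closed-form ML updates from the soft counts
        Nk = gamma.sum(axis=0)
        theta = Nk / len(z)
        mu = (gamma * z[:, None]).sum(axis=0) / Nk
        sigma2 = (gamma * (z[:, None] - mu) ** 2).sum(axis=0) / Nk
    return mu, sigma2, theta, lls
```

If the γ's were rounded to hard 0/1 assignments, this would become k-means-style hard EM; the soft counts are what make the updates exact M-steps for the bound.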
- No need to compute the conditional over the full joint; run the smoother to find the marginals P(x_t | z_0, …, z_T) and P(x_t, x_{t+1} | z_0, …, z_T), which suffice for the M-step
- Linear Gaussian setting: given z_0, …, z_T; ML objective: max over A, C, Σ_w, Σ_v of log P(z_0, …, z_T; A, C, Σ_w, Σ_v); EM derivation: same as for the HMM
- Forward and backward passes: the Kalman filter and smoother recursions covered earlier
- When running EM, it can be good to keep track of the log-likelihood score: it is supposed to increase with every iteration
- As the linearization is only an approximation (e.g., in the extended Kalman filter setting), an update might actually decrease the log-likelihood
- → Solution: instead of updating the parameters to the newly estimated ones, search between the current and the newly estimated parameters and keep the setting with the highest log-likelihood
Outline
- Kalman smoothing
- Maximum a posteriori sequence
- Maximum likelihood
- Maximum a posteriori parameters
- Expectation maximization