
SLIDE 1

Variational mapping particle filter: combining particle filters with local optimal transport

Manuel Pulido 1,2 and Peter Jan van Leeuwen 1,3

1 Data Assimilation Research Centre, University of Reading, UK
2 Department of Physics, Universidad Nacional del Nordeste, Argentina
3 Department of Atmospheric and Oceanic Sciences, University of Colorado, USA

SLIDE 2

Motivation

[Figure: 2-D marginalized posterior pdf (contours), particles from the prediction pdf, and resampled particles.]

Distribution of the particles in the SIR filter for the Lorenz-63 system using 100 particles after 115 cycles.

SLIDE 3

Motivation

[Figure: marginalized densities in x, y and z (panels d, e, f); legend: KDE, proposal q, posterior p; prior and posterior pdfs for the SIR filter.]

Resulting marginalized densities. Red lines are the inferences of the posterior made by the SIR filter; the true posterior density is in black. Sample impoverishment leads to a very crude representation of the posterior density, even for a reasonable number of particles.

Can we improve the sample representation of densities beyond resampling?

SLIDE 4

Homotopic particle flows

Given two densities, p and q, we want to express them as a homotopy:

$$q_\lambda(x|y) = \frac{p(y|x)^{\lambda}\, q(x)}{C_\lambda}$$

This is a transformation from the prior to the posterior density: $\lim_{\lambda \to 1} q_\lambda(x|y) = p(x|y)$. The parameter λ varies from 0 to 1 (a pseudo-time).

Working with sample points, the deformations can be thought of as particles moving in a flow according to a set of stochastic differential equations:

[Figure: target p and proposal q (left); intermediate densities $q_\lambda$ deforming from $q_0$ to $q_1$ (right).]

$$\frac{dx_\lambda}{d\lambda} = v_\lambda(x_\lambda) + \eta_\lambda(x_\lambda)$$

The particles are tracers that are moved along the streamlines of the flow.

Daum and Huang (SPST, 2008)
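As a minimal sketch (an assumed implementation, not the authors' code), a single Euler-Maruyama step of this flow in pseudo-time could look as follows; the drift `v` and the noise amplitude `sigma` are placeholders to be supplied by a particular flow:

```python
import numpy as np

def flow_step(x, v, d_lambda, sigma=0.0, rng=None):
    """One Euler-Maruyama step of dx/dlambda = v(x) + eta(x) in pseudo-time.

    x        : (Np, n) particle positions at pseudo-time lambda
    v        : callable, v(x) -> (Np, n) velocity field (drift term)
    d_lambda : pseudo-time increment
    sigma    : noise amplitude (sigma = 0 gives a deterministic flow)
    """
    rng = np.random.default_rng() if rng is None else rng
    eta = sigma * np.sqrt(d_lambda) * rng.standard_normal(x.shape)
    return x + d_lambda * v(x) + eta
```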

SLIDE 5

Approximations: Gaussian particle flows

Most of the work on particle flows (Daum and Huang 2008, 2011, 2013; Bunch and Godsill, JASA 2016; Li and Coates, IEEE 2017) is based on (different flavors of) the Gaussian approximation. The velocity field (drift term) is given by

$$v(x^{(j)}_\lambda, \lambda) = A(\lambda)\, x^{(j)}_\lambda + b(\lambda)$$

$$A(\lambda) = -\tfrac{1}{2} P H^{\mathrm{T}} (H P H^{\mathrm{T}} + \lambda R)^{-1} H$$
$$b(\lambda) = (I + 2\lambda A)\left[(I + \lambda A) P H^{\mathrm{T}} R^{-1} y + A x_0\right]$$

Gaussian particle flows are different from approximating the proposal distribution with the result of a Kalman filter: here the particles are moved smoothly, and the covariances P, and hence the velocity field, evolve in pseudo-time.
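A direct NumPy transcription of these formulas might read as below; it is a sketch under the assumption that $x_0$ denotes the prior mean, with `P` the prior covariance, `H` the (linearized) observation operator, `R` the observation error covariance and `y` the observation:

```python
import numpy as np

def gaussian_flow_velocity(x, lam, P, H, R, y, x0):
    """Drift v(x, lambda) = A(lambda) x + b(lambda) of the Gaussian flow,
    transcribed from the slide's expressions for A(lambda) and b(lambda)."""
    n = P.shape[0]
    I = np.eye(n)
    S = H @ P @ H.T + lam * R                     # (H P H^T + lambda R)
    A = -0.5 * P @ H.T @ np.linalg.solve(S, H)    # A(lambda)
    b = (I + 2.0 * lam * A) @ ((I + lam * A) @ (P @ H.T) @ np.linalg.solve(R, y)
                               + A @ x0)          # b(lambda)
    return A @ x + b
```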

SLIDE 6

In this work we propose to combine particle flows and optimal transport

SLIDE 7

Optimal transport

Deterministic flow (no diffusion processes), η = 0. The only unknown is v(x). Can we determine v(x) from optimal transport?

Given the proposal 'mass' distribution q(x) and the target 'mass' distribution p(z), we want to transport the mass units from q to p via a mapping T : x → z that minimizes the cost of transportation, given by

$$C = \int q(x)\, c(x, T(x))\, dx = \mathbb{E}_{x \sim q}\left[c(x, T(x))\right]$$

Rigorously, T is assumed to be a diffeomorphic transport map $T : \mathbb{R}^n \to \mathbb{R}^n$ between the two probability measures (i.e. smooth and invertible). The optimal transport map T is the one that gives the minimal C. This is the classical Monge-Kantorovich problem.
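Since the cost is an expectation over q, it admits a direct Monte Carlo estimate from samples. A minimal sketch (names are ours, not from the slides):

```python
import numpy as np

def transport_cost(samples_q, T, c):
    """Monte Carlo estimate of C = E_{x~q}[ c(x, T(x)) ].

    samples_q : (Np, n) draws from the proposal q
    T         : callable, candidate transport map x -> z
    c         : callable, per-pair transportation cost c(x, z)
    """
    return float(np.mean([c(x, T(x)) for x in samples_q]))
```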

SLIDE 8

Kullback-Leibler divergence

For the sake of feasibility we take the distance measure between the random fields X and Z to be

$$c(X, Z) = \log \frac{q(z)}{p(x)}$$

This is a divergence measure (enough for our purposes). Considering the coupling z = T(x), the resulting transportation cost is

$$C(p, q_T) = \int q_T(x) \log \frac{q_T(x)}{p(x)}\, dx = \mathrm{KL}(q_T \,\|\, p)$$

This is the Kullback-Leibler divergence between the transformed proposal $q_T$ and the target p. We want a map T that minimizes this KL divergence. In information terms, we are seeking to maximize the amount of information about p that is present in q, i.e. to minimize the relative entropy between q and p.
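As a concrete reference point (a standard closed form, not taken from the slides): for two Gaussians $q = \mathcal{N}(\mu_0, \Sigma_0)$ and $p = \mathcal{N}(\mu_1, \Sigma_1)$ in $\mathbb{R}^n$, the divergence being minimized is

$$\mathrm{KL}(q \,\|\, p) = \tfrac{1}{2}\left[\operatorname{tr}\!\left(\Sigma_1^{-1}\Sigma_0\right) + (\mu_1 - \mu_0)^{\mathrm{T}} \Sigma_1^{-1} (\mu_1 - \mu_0) - n + \ln\frac{\det \Sigma_1}{\det \Sigma_0}\right],$$

which vanishes exactly when the two densities coincide, the target of the mapping sequence.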

SLIDE 9

Gradient flows

T is a sequence of mappings $T_i$: we want a T that is a composition $T_1 \circ (T_2 \circ (T_3 \circ \cdots))$ that minimizes the difference between p and $q_T$ in the sense of the Kullback-Leibler divergence. Each transformation is a local optimal transport problem (Villani, 2008).

Suppose we produce a smooth transformation along an arbitrary direction v(x),

$$x_{\lambda + \epsilon} = T_\lambda(x_\lambda) = x_\lambda + \epsilon\, v_\lambda(x_\lambda)$$

assuming ε small, so that T(x) is a small perturbation of the identity transformation.

The optimal local map is the one that gives the steepest-descent direction of the transport cost function. Under mild smoothness conditions, these local maps achieve the optimal transport cost (Angenent et al., SIAM 2003).

We aim to find the direction v at x that gives the steepest descent of the KL divergence.

SLIDE 10

Embedding the map in a reproducing kernel Hilbert space

Consider as the space of mapping functions the unit ball of an RKHS, $\|v\|_{\mathcal{F}} \le 1$ (Liu and Wang, 2016). Any function in the RKHS can be expressed as an inner product with the kernel that defines the RKHS:

$$v(x) = \langle K(\cdot, x), v(\cdot) \rangle_{\mathcal{F}}$$

The directional derivative of the KL divergence results in

$$D_v \mathrm{KL} = -\left\langle \int q(x') \left[ K(x', \cdot)\, \nabla_{x'} \log p(x'|y) + \nabla_{x'} K(x', \cdot) \right] dx',\; v(\cdot) \right\rangle_{\mathcal{F}}$$

Since the gradient field is defined by $D_v f(x) = \langle \nabla f(x), v \rangle$ for $\|v\| = 1$, we have found the gradient of the KL divergence:

$$\nabla \mathrm{KL}(x) = -\mathbb{E}_{x' \sim q}\left[ K(x', x)\, \nabla_{x'} \log p(x'|y) + \nabla_{x'} K(x', x) \right].$$

SLIDE 11

Monte Carlo representation of the KL gradient

The density q is represented through a set of samples $x_i^{(1:N_p)}$, where i is the mapping iteration. Using Monte Carlo integration for the expectation over q in the KL gradient:

$$\nabla \mathrm{KL}(x) \approx -\frac{1}{N_p} \sum_{l=1}^{N_p} \left[ K(x_i^{(l)}, x)\, \nabla \log p(x_i^{(l)}) + \nabla K(x_i^{(l)}, x) \right].$$

The steepest-descent direction of the transport cost (KL) is $v(x) = -\nabla \mathrm{KL}(x)$.

Density transformation: $q_i(x_i^{(j)}) = q_{i-1}(x_{i-1}^{(j)})\, |J(x_{i-1}^{(j)})|$

The local mappings are defined as:

$$x_{i+1}^{(j)} = T(x_i^{(j)}) = x_i^{(j)} - \epsilon\, \nabla \mathrm{KL}(x_i^{(j)})$$

The particles are pushed forward in the gradient flow to minimize the KL divergence. The flow itself depends on the interacting particles at that mapping iteration.
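A NumPy sketch of this estimator and of the resulting local map (function names and array conventions are ours, not from the slides):

```python
import numpy as np

def kl_gradient(particles, grad_log_p, kernel, grad_kernel):
    """Monte Carlo estimate of grad KL at every particle position x^(j):

    grad KL(x) ~= -(1/Np) sum_l [ K(x_l, x) grad log p(x_l) + grad_{x_l} K(x_l, x) ]

    particles   : (Np, n) current sample positions
    grad_log_p  : callable, (Np, n) -> (Np, n), grad log p at each sample
    kernel      : callable, (X, Y) -> (Nx, Ny) Gram matrix K(x_l, x_j)
    grad_kernel : callable, (X, Y) -> (Nx, Ny, n), grad of K in its 1st argument
    """
    n_p = particles.shape[0]
    Kmat = kernel(particles, particles)         # K[l, j] = K(x^(l), x^(j))
    glp = grad_log_p(particles)                 # (Np, n)
    gK = grad_kernel(particles, particles)      # (Np, Np, n)
    # sum over l of K(x_l, x_j) grad log p(x_l) + grad_{x_l} K(x_l, x_j)
    drift = Kmat.T @ glp + gK.sum(axis=0)
    return -drift / n_p

def mapping_step(particles, grad_log_p, kernel, grad_kernel, eps):
    """One local map: x_{i+1}^(j) = x_i^(j) - eps * grad KL(x_i^(j))."""
    return particles - eps * kl_gradient(particles, grad_log_p,
                                         kernel, grad_kernel)
```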

SLIDE 12

Variational mapping particle filter

The VMPF conducts an optimization: it determines the optimal positions of the sample points. The sample points are the variational parameters.

Input: $\{x_{k-1}^{(j)}\}_{j=1}^{N}$, $y_k$, $M(\cdot)$, $p(y|x)$ and $p(\eta)$   ⊲ Parameters ε, $N_t$

$x_{k,0}^{(1:N)} \leftarrow M(x_{k-1}^{(1:N)}) + \eta_k$   ⊲ Forecast stage

Optimization module: given the initial $x_{k,0}^{(1:N)}$, we seek a new set of positions that minimizes the KL divergence.

while $i < I$ and convergence criterion do   ⊲ $N_{\mathrm{eff}} < N_t$ or $|\nabla \mathrm{KL}_i| / |\nabla \mathrm{KL}_{i-1}| < \delta$
  for $j = 1, \ldots, N$ do
    $x_{k,i}^{(j)} \leftarrow x_{k,i-1}^{(j)} - \epsilon\, \nabla \mathrm{KL}(x_{k,i-1}^{(j)})$   ⊲ ∇KL requires ∇ log p, K, ∇K
  end for
end while

Output: $\{x_{k,i}^{(j)}\}_{j=1}^{N}$
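A sketch of one assimilation cycle of this pseudocode, reusing `kl_gradient` from the earlier sketch; all names are illustrative, and only the gradient-ratio stopping rule is implemented here:

```python
import numpy as np

def vmpf_cycle(particles, model, sample_noise, grad_log_post,
               kernel, grad_kernel, eps=0.1, max_iter=100, delta=1e-2):
    """One VMPF cycle: forecast stage, then KL-minimizing mapping iterations.

    model         : the forward model M
    sample_noise  : draws the model error eta_k, given an array shape
    grad_log_post : grad log p(x | y_k), with the current observation baked in
    """
    # Forecast stage: x_{k,0} = M(x_{k-1}) + eta_k
    particles = model(particles) + sample_noise(particles.shape)

    # Optimization module: move the samples to minimize the KL divergence
    prev_norm = None
    for _ in range(max_iter):
        grad = kl_gradient(particles, grad_log_post, kernel, grad_kernel)
        norm = np.linalg.norm(grad)
        # convergence criterion from the slide: |grad KL_i|/|grad KL_{i-1}| < delta
        if prev_norm is not None and norm / prev_norm < delta:
            break
        particles = particles - eps * grad
        prev_norm = norm
    return particles
```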

SLIDE 13

Experiment with Lorenz-63

[Figure: distribution of particles of the prediction density, and of the posterior density after the VMPF, for the Lorenz-63 system.]

Model error: N(0, Q). Observational error: N(0, R).

Radial basis functions as kernel:

$$K(x, x') = \exp\left(-\tfrac{1}{2} \|x - x'\|_A^2\right)$$

a Mahalanobis distance. Kernel covariance: $A = \gamma^{1/2} Q$, with γ the bandwidth.
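This kernel and its gradient can be written in closed form; a sketch in the shapes expected by the `kl_gradient` code above (the factory-function structure is ours):

```python
import numpy as np

def make_rbf_kernel(A):
    """RBF kernel with Mahalanobis distance, K(x, x') = exp(-0.5 ||x - x'||_A^2).

    A is the kernel covariance (on this slide, proportional to the model
    error covariance Q through the bandwidth gamma). Returns the kernel
    and its gradient with respect to the first argument.
    """
    A_inv = np.linalg.inv(A)

    def kernel(X, Y):
        diff = X[:, None, :] - Y[None, :, :]                  # (Nx, Ny, n)
        sq = np.einsum('ijk,kl,ijl->ij', diff, A_inv, diff)   # Mahalanobis^2
        return np.exp(-0.5 * sq)

    def grad_kernel(X, Y):
        diff = X[:, None, :] - Y[None, :, :]                  # (Nx, Ny, n)
        K = kernel(X, Y)
        # grad_x K(x, x') = -A^{-1} (x - x') K(x, x')
        return -np.einsum('kl,ijl->ijk', A_inv, diff) * K[:, :, None]

    return kernel, grad_kernel
```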
SLIDE 14

Marginal distributions

[Figure: marginalized densities in x, y and z. Top row (a, b, c): MPF experiment, showing the last iteration, the initial q and the posterior p; the convergence is excellent. Bottom row (d, e, f): SIR experiment, showing the KDE of the proposal q and the posterior p.]

SLIDE 15

Mean square error

[Figure: RMSE as a function of assimilation cycle for (a) 100 particles, (b) 20 particles and (c) 5 particles; MPF vs SIR.]

The time-averaged RMSE exhibits robust behavior in the MPF: 0.482 for 100 particles and 0.489 for 5 particles!

SLIDE 16

Nonlinear observational operator

[Figure: scalar example with $H(x) = |x|$, $p(x) = \mathcal{N}(0.25, 1)$, $y = 2.0$, $R = 0.5$: particle positions in pseudo-time (left) and the prior, posterior and MPF densities (right).]

[Figure: Lorenz-63 with $H(x) = |x|$. Panels (a), (b), (c): marginalized densities in x, y and z (MPF, initial q, posterior); 2-D marginals of the prior and posterior.]

SLIDE 17

Lorenz-96. MPF comparison with stochastic EnKF

[Figure: RMSE and spread as functions of time for MPF (20 particles) and EnKF (20 and 100 members). Lorenz-96 with 40 variables, the full state observed, 20-100 particles/members.]

[Figure: (a) RMSE and (b) spread as functions of time for MPF and EnKF. Lorenz-96 with 40 variables, 13 sparse observations, 20 particles/members.]

SLIDE 18

High-dimensional test: Lorenz-96 with 1000 variables

[Figure: time series of $x_1$ for the MPF with the full state observed (50 particles) and with half the state observed (20 particles), against the truth and the observations.]

Time series of $x_1$, partially observed state.

An evaluation of the MPF in geophysical models, i.e. quasi-geostrophic and shallow-water models, is underway.

SLIDE 19

Conclusions

The MPF is a well-grounded bridge between Monte Carlo sampling and optimization.

Pros:
- In the limit of a single particle it gives the MAP estimate: for $N_p = 1$, $\nabla \mathrm{KL} = -\nabla \log p$ (the kernel-gradient term vanishes at the particle's own position and $K(x,x) = 1$), so the mapping iterations perform gradient ascent on the log posterior.
- As a result, it exhibits robust RMSE behavior for small numbers of particles.
- By construction, the MPF maximizes the amount of information about p that is present in q.

Cons:
- Complexity $O(N_p^2)$: more expensive than SIR (but without extra evaluations of M).
- It requires a good representation of the model error covariance Q; estimation methods are required!
- The kernel covariance (or its bandwidth) is also a free parameter.

Pulido and van Leeuwen, 2018. Available at arXiv:1805.11380.