SLIDE 1

Algorithms for high-dimensional non-linear filtering and smoothing problems

Jana de Wiljes, Sahani Pathiraja, Sebastian Reich

Kobe 2019

slide-2
SLIDE 2

Setting

Model: x_{k+1} = f(x_k, . . . , x_{k−n}) + ε_k, ε_k ∼ N(0, Q), (1) where x_{k+1} ∈ R^{N_x} and Q ∈ R^{N_x×N_x}.

Observations: y_{k+1} = h(x_{k+1}) + ν_k, ν_k ∼ N(0, R), (2) where y_{k+1} ∈ R^{N_y} and R ∈ R^{N_y×N_y}.
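
The model (1) and observation equation (2) can be simulated directly; a minimal sketch for the memoryless case n = 0, with illustrative choices of f and h (not from the slides):

```python
import numpy as np

def simulate(f, h, x0, Q, R, T, rng=None):
    """Simulate the state-space model (1)-(2) with n = 0:
    x_{k+1} = f(x_k) + eps_k,  eps_k ~ N(0, Q)
    y_{k+1} = h(x_{k+1}) + nu_k,  nu_k ~ N(0, R)."""
    rng = np.random.default_rng(rng)
    Nx, Ny = Q.shape[0], R.shape[0]
    xs, ys = [np.asarray(x0, float)], []
    for _ in range(T):
        x = f(xs[-1]) + rng.multivariate_normal(np.zeros(Nx), Q)
        y = h(x) + rng.multivariate_normal(np.zeros(Ny), R)
        xs.append(x)
        ys.append(y)
    return np.array(xs), np.array(ys)

# toy example: stable linear dynamics, first component observed
f = lambda x: 0.9 * x
h = lambda x: x[:1]
xs, ys = simulate(f, h, np.zeros(2), 0.1 * np.eye(2), 0.05 * np.eye(1), T=50, rng=0)
```
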

SLIDE 3

Filtering and Smoothing Problem

Goal: Approximate

  • p(x0:T|y1:T) (joint smoothing density)
  • p(xk|y1:T) (marginal smoothing density)
  • p(xk|y1:k) (filtering density)

Here focus on fixed-lag smoothing: p(x_{k−L:k}|y_{1:k}) (3)

Remarks:

  • For L = 0, fixed-lag smoothing equals filtering.
  • For L > 0, fixed-lag smoothing can be interpreted as filtering on an augmented state space.

SLIDE 4

State of the art

Linear Model: Kalman Filter/Smoother

  • Gaussian posterior
  • deterministic formulas

Nonlinear model: approximate with an empirical measure, i.e.,

p(x_{0:T}|y_{1:T}) ≈ (1/M) Σ_{i=1}^{M} δ(x − x^{(i)}_{0:T|1:T}).

→ This approach leads to a variety of smoothers

  • Particle smoother
  • Ensemble Kalman smoother

SLIDE 5

Particle collapse

Degeneracy

[Schematic: particle smoother degeneracy; after a few steps only particles x^{(i)}_{k−L:k} close to the observation y_k retain weights w^{(i)}_{k−L:k} > 0]

SLIDE 6

Alternative approach

Move particles towards the observation: transform the ensemble x^{(i)}_{k−L:k} towards y_k via a transformation D_{k−L:k|k}.

SLIDE 7

Linear Ensemble Transform Smoothers

SLIDE 8

Linear Ensemble Transform Smoothers

Update via linear transformation:

X_{k−L:k|k} = X_{k−L:k|k−1} D_{k−L:k|k} = X_{k−L:k|k−L} D_{k−L:k|k−L+1} · · · D_{k−L:k|k},

where X_{k−L:k|k} = [x^{(1)}_{k−L:k|k}, . . . , x^{(M)}_{k−L:k|k}] ∈ R^{N_x L×M} and

x^{(i)}_{k−L:k|s} ∼ p(x_{k−L:k}|y_{1:s}) (4)

with s ∈ {k − L, . . . , k}.

SLIDE 9

Motivation: LETS

Benefits:

  • localisation [Reich, 2013][Poterjoy, 2015]
  • hybrid formulations [Frei and Künsch, 2013][Chustagulprom et al., 2016]
  • moment matching [Tödter and Ahrens, 2015, Bishop et al., 2001, Xiong et al., 2006, Lei and Bickel, 2011, Acevedo et al., 2017]

Motivation for hybrid formulations:

  • Ensemble Kalman Filters (EnKFs)
    + robust and moderately affordable
    − Gaussian assumptions
  • Traditional Particle Filters (PFs)
    + no Gaussianity assumptions
    + consistent in the ensemble limit
    − liable to the "curse of dimensionality"

SLIDE 10

Example: Ensemble Kalman smoother

Analysis step: X_{k|k} = X_{k|k−1} D_{k|k} with

{D_{k|k}}_{ij} = δ_{ij} − (1/(M−1)) (x^{(i)}_{k|k−1} − m_{k|k−1})^T H^T (H P_{k|k−1} H^T + R)^{−1} (H x^{(j)}_{k|k−1} + ξ^{(j)} − y_k)

Pros and cons:

  + robust and moderately affordable
  + works well in practice
  − Gaussian assumptions
  − mathematical foundation for nonlinear models largely missing
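
This analysis step can be sketched numerically; a minimal version for a single time window (no lag), assuming a linear observation operator given as a matrix H, with perturbed observations ξ^{(j)} ∼ N(0, R):

```python
import numpy as np

def enkf_transform(X, H, R, y, rng=None):
    """Stochastic EnKF analysis step written as an ensemble transform
    X_a = X D, following the D_{k|k} formula on this slide.
    X is Nx x M with ensemble members as columns."""
    rng = np.random.default_rng(rng)
    Nx, M = X.shape
    m = X.mean(axis=1, keepdims=True)
    A = (X - m) / np.sqrt(M - 1)
    P = A @ A.T                                   # ensemble covariance
    S = H @ P @ H.T + R
    xi = rng.multivariate_normal(np.zeros(R.shape[0]), R, size=M).T  # perturbed obs
    innov = H @ X + xi - y[:, None]               # H x^(j) + xi^(j) - y
    D = np.eye(M) - ((X - m).T @ H.T @ np.linalg.solve(S, innov)) / (M - 1)
    return X @ D, D

rng = np.random.default_rng(1)
X = rng.normal(size=(2, 20))
H = np.array([[1.0, 0.0]])
Xa, D = enkf_transform(X, H, 0.5 * np.eye(1), y=np.array([0.3]), rng=2)
```

Since Σ_i (x^{(i)} − m) = 0, the columns of D sum to one, so the transform preserves affine combinations of the members.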

SLIDE 11

Ensemble Square Root Smoother

Transform coefficients:

(D^ESRS_{k−L:k|k})_{ij} := (S_{k−L:k|k})_{ij} + ŵ^{(i)}_{k−L:k|k} − 1/M, (5)

with square-root matrix

S_{k−L:k|k} = [ I + (1/(M−1)) (H̃ A_{k−L:k|k−1})^T R^{−1} H̃ A_{k−L:k|k−1} ]^{−1/2},

where

A_{k−L:k|k−1} = (1/√(M−1)) [ x^{(1)}_{k−L:k|k−1} − m_{k−L:k|k−1}, . . . , x^{(M)}_{k−L:k|k−1} − m_{k−L:k|k−1} ]

and

ŵ^{(i)}_{k−L:k|k} = 1/M − (1/(M−1)) e_i^T S²_{k−L:k|k} (H̃ A_{k−L:k|k−1})^T R^{−1} (H̃ m_{k−L:k|k−1} − y_k).

SLIDE 12

Nonlinear Ensemble Transform Smoother [Tödter and Ahrens, 2015]

NETF: D^NETS_{k−L:k|k} = w_{k|k} 1^T + √M [W_{k|k} − w_{k|k} (w_{k|k})^T]^{1/2} Ω, (6)

where w_{k|k} = (w^{(1)}_{k|k}, . . . , w^{(M)}_{k|k})^T ∈ R^{M×1} and W_{k|k} = diag(w_{k|k}).
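
The transform (6) can be sketched directly; below Ω = I is used (any orthogonal Ω with Ω1 = 1 works), and the matrix square root is computed by a symmetric eigendecomposition, which is one possible choice:

```python
import numpy as np

def netf_transform(X, w):
    """NETF transform (6): D = w 1^T + sqrt(M) [W - w w^T]^{1/2} Omega,
    here with Omega = I. X is Nx x M (members as columns), w normalized weights."""
    M = len(w)
    C = np.diag(w) - np.outer(w, w)               # W - w w^T, symmetric PSD
    vals, vecs = np.linalg.eigh(C)
    sqrtC = (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.T
    return np.outer(w, np.ones(M)) + np.sqrt(M) * sqrtC

rng = np.random.default_rng(0)
M = 50
X = rng.normal(size=(3, M))                       # prior ensemble
w = rng.random(M)
w /= w.sum()                                      # normalized importance weights
D = netf_transform(X, w)
Xa = X @ D
```

Because (W − w w^T) 1 = 0, the square-root factor annihilates 1, so the posterior mean equals the weighted mean and the posterior covariance matches the weighted covariance.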

SLIDE 13

Ensemble Transform Particle Smoother [de Wiljes et al., 2019]

Given:

  • M samples x^{(i)}_{k−L:k|k−1} ∼ p(x_{k−L:k}|y_{1:k−1}) (prior ensemble)
  • normalized importance weights w^{(j)}_k ∝ p(y_k|x^{(j)}_{k−L:k|k−1}) (likelihood)

Ansatz: replace the resampling step with a linear transformation by interpreting it as a discrete Markov chain with transition matrix [Reich, 2013] D^ETPS_{k−L:k|k} ∈ R^{M×M}, subject to:

  (A) d^ETPS_{ij} = (D^ETPS_{k−L:k|k})_{ij} ≥ 0 ∀ i, j
  (B) D^ETPS_{k−L:k|k} 1 = M w_k
  (C) (D^ETPS_{k−L:k|k})^T 1 = 1 (7)

SLIDE 14

Ensemble Transform Particle Smoother [de Wiljes et al., 2019]

Then the posterior ensemble members are distributed according to the columns of the transformation,

X̃^{(j)} ∼ (d^ETPS_{1j}, d^ETPS_{2j}, . . . , d^ETPS_{Mj})^T,

and

x^{(j)}_{k−L:k|k} = E[X̃^{(j)}] = Σ_i x^{(i)}_{k−L:k|k−1} d^ETPS_{ij}.

Example: monomial resampling,

D^Mono_{k−L:k|k} := w ⊗ 1^T, i.e. row i of D^Mono is (w^{(i)}, w^{(i)}, . . . , w^{(i)}).

Here X_{1:k|k−1} and X_{1:k|k} are independent. Idea: increase the correlation between X_{1:k|k−1} and X_{1:k|k}.
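
The transition-matrix view of monomial resampling can be sketched as follows; `monomial_transform` and `resample` are illustrative helper names, and drawing each posterior member from its column of D is the stochastic interpretation described above:

```python
import numpy as np

def monomial_transform(w):
    """D^Mono = w (x) 1^T: every column equals the weight vector w, i.e. each
    posterior member is drawn from the same categorical distribution,
    independently of the prior member index j."""
    M = len(w)
    return np.tile(w[:, None], (1, M))

def resample(X, D, rng=None):
    """Stochastic resampling: column j of D is the distribution over prior
    members from which posterior member j is drawn."""
    rng = np.random.default_rng(rng)
    M = D.shape[1]
    idx = [rng.choice(M, p=D[:, j] / D[:, j].sum()) for j in range(M)]
    return X[:, idx]

w = np.array([0.5, 0.3, 0.2])
D = monomial_transform(w)
```

The row sums of D^Mono are M w^{(i)} and the column sums are 1, so constraints (B) and (C) of (7) hold.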

SLIDE 15

Ensemble Transform Particle Smoother [de Wiljes et al., 2019]

Idea: increase the correlation between X_{1:k|k−1} and X_{1:k|k}.

Underlying ansatz: approximate a transfer map between X_{1:k|k−1} and X_{1:k|k}.

Note: choose the transfer map that induces the coupling µ* minimizing

µ* = arg inf_{µ ∈ Π(p_{X_{1:k|k−1}}, p_{X_{1:k|k}})} E ||X_{1:k|k−1} − X_{1:k|k}||².

SLIDE 16

Ensemble Transform Particle Smoother [de Wiljes et al., 2019]

Discretization: solve the optimization problem

D^ETPS_{k−L:k|k} = arg min_D Σ_{i,j=1}^{M} D_{ij} ||x^{(i)}_{k−L:k|k−1} − x^{(j)}_{k−L:k|k−1}||²

to find the transformation matrix D^ETPS_{k−L:k|k} that increases the correlation ([Reich, 2013]).

Remarks:

  • consistent for M → ∞
  • maximizes the correlation between prior and posterior ensembles
  • deterministic, preserving the regularity of the state fields

SLIDE 17

OT Solvers

Exact solution

  • computational complexity O(M³ ln M)
  • efficient Earth Mover's Distance algorithms available, e.g. FastEMD [Pele and Werman, 2009]

Entropic regularized approximation [Cuturi, 2013]:

J(D) = Σ_{i,j=1}^{M} [ D_{ij} ||x^{(i)}_{k−L:k|k−1} − x^{(j)}_{k−L:k|k−1}||² + (1/λ) D_{ij} ln D_{ij} ]

  • Sinkhorn's fixed-point iteration can be used
  • computational complexity O(M² · C(λ))

1D approximation (each variable updated independently):

  • the OT problem reduces to a reordering
  • computational complexity O(M ln M)
  • no particle distances needed
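
The 1D reduction can be sketched as a quantile reordering between the weighted prior measure and M uniform posterior points; `ot_1d_reorder` and the systematic quantile positions (j − 0.5)/M are illustrative choices, not necessarily the paper's exact scheme:

```python
import numpy as np

def ot_1d_reorder(x, w):
    """1D optimal transport between the weighted empirical measure
    sum_i w_i delta(x_i) and a uniform measure on M posterior points:
    sort the particles and match quantiles; no pairwise distances needed."""
    M = len(x)
    order = np.argsort(x)
    cdf = np.cumsum(w[order])                 # weighted CDF over sorted particles
    u = (np.arange(M) + 0.5) / M              # systematic uniform quantiles
    idx = np.clip(np.searchsorted(cdf, u), 0, M - 1)
    return x[order[idx]]

x = np.array([0.1, -1.2, 2.3, 0.7])
w = np.array([0.1, 0.2, 0.4, 0.3])
xa = ot_1d_reorder(x, w)
```

The dominant cost is the sort, giving the O(M ln M) complexity stated above.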

SLIDE 18

Combating challenges of nonlinear smoothing

SLIDE 19

Adaptive spread correction and rotation

SLIDE 20

Moment Matching

Idea: design the filter to match specific moments for a finite ensemble [Xiong et al., 2006][Lei and Bickel, 2011][Tödter and Ahrens, 2015][Acevedo et al., 2017].

Linear case (Kalman formulas):

m_{k−L:k|k} = m_{k−L:k|k−1} − K(H m_{k−L:k|k−1} − y_k) (8)

P_{k−L:k|k} = P_{k−L:k|k−1} − K H P_{k−L:k|k−1} (9)

Nonlinear case (particle filter), e.g. first and second moments:

m_{k−L:k|k} = Σ_{i=1}^{M} w^{(i)}_k x^{(i)}_{k−L:k|k−1}

P_{k−L:k|k} = Σ_{i=1}^{M} w^{(i)}_k (x^{(i)}_{k−L:k|k−1} − m_{k−L:k|k})(x^{(i)}_{k−L:k|k−1} − m_{k−L:k|k})^T

SLIDE 21

First-order accurate LETSs

A LETS is first-order accurate if the weighted mean

m_{k−L:k|k} = Σ_{i=1}^{M} w^{(i)}_k x^{(i)}_{k−L:k|k−1} (10)

is equal to the posterior ensemble mean

m̄_{k−L:k|k} = (1/M) Σ_{i=1}^{M} x^{(i)}_{k−L:k|k}. (11)

SLIDE 22

First-order accurate LETSs

Class of first-order accurate LETSs:

D1 = {D ∈ R^{M×M} | D^T 1 = 1, D 1 = M w}

Examples:

  • D^EnKS ∉ D1
  • D^ESRS ∉ D1
  • D^ETPS ∈ D1
  • D^Mono ∈ D1
  • D^NETS ∈ D1
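
Membership in D1 is a pair of linear conditions and is easy to test numerically; `in_D1` is an illustrative helper name:

```python
import numpy as np

def in_D1(D, w, tol=1e-10):
    """Check membership in D1 = {D : D^T 1 = 1, D 1 = M w}."""
    M = D.shape[0]
    return (np.allclose(D.T @ np.ones(M), np.ones(M), atol=tol)
            and np.allclose(D @ np.ones(M), M * w, atol=tol))

M = 4
w = np.array([0.4, 0.3, 0.2, 0.1])
D_mono = np.tile(w[:, None], (1, M))   # monomial transform: in D1
D_bad = np.eye(M)                      # identity: columns sum to 1 but D 1 = 1 != M w
```

For non-uniform weights the identity fails only the row-sum condition, which is exactly why it is not first-order accurate.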

SLIDE 23

First-order accurate LETSs

Linear transformations D^λ replacing the resampling step:

D^λ_{k−L:k|k} = arg min_D Σ_{i,j=1}^{M} [ D_{ij} ||x^{(i)}_{k−L:k|k−1} − x^{(j)}_{k−L:k|k−1}||² + (1/λ) D_{ij} ln D_{ij} ] (12)

for given λ > 0, subject to

D_{ij} ≥ 0, Σ_i D_{ij} = 1, (1/M) Σ_j D_{ij} = w^{(i)}_{k|k}.

Remarks:

  • λ → 0: D^0 = w ⊗ 1^T (monomial resampling)
  • λ → ∞: D^∞ solves the optimal coupling/transport problem
  • effective iterative solvers are available [Cuturi, 2013]
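
Problem (12) can be solved with Sinkhorn's fixed-point iteration; a minimal sketch with the marginal conventions above (row sums M w_i, column sums 1) and an illustrative choice of λ and iteration count:

```python
import numpy as np

def sinkhorn_transform(X, w, lam=1.0, n_iter=2000):
    """Entropically regularized transform D^lambda of (12): cost
    ||x_i - x_j||^2, row sums M w_i, column sums 1, via Sinkhorn's
    fixed-point iteration [Cuturi, 2013]. X is Nx x M."""
    M = X.shape[1]
    diff = X[:, :, None] - X[:, None, :]
    C = np.sum(diff**2, axis=0)               # pairwise squared distances
    K = np.exp(-lam * C)                      # Gibbs kernel
    r, c = M * w, np.ones(M)                  # target row / column sums
    u = np.ones(M)
    for _ in range(n_iter):
        v = c / (K.T @ u)                     # match column sums
        u = r / (K @ v)                       # match row sums
    return u[:, None] * K * v[None, :]

rng = np.random.default_rng(0)
X = rng.normal(size=(2, 30))
w = rng.random(30)
w /= w.sum()
D = sinkhorn_transform(X, w)
```

Each iteration costs O(M²), matching the O(M² · C(λ)) complexity quoted on the OT-solvers slide.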

SLIDE 24

Moment Matching

The analysis covariance matrix

P̄_{k−L:k|k} = (1/M) Σ_{i=1}^{M} (x^{(i)}_{k−L:k|k} − m_{k−L:k|k})(x^{(i)}_{k−L:k|k} − m_{k−L:k|k})^T

should equal the covariance matrix defined by the importance weights, i.e.

P_{k−L:k|k} = Σ_{i=1}^{M} w^{(i)}_{k|k} (x^{(i)}_{k−L:k|k−1} − m_{k−L:k|k})(x^{(i)}_{k−L:k|k−1} − m_{k−L:k|k})^T.

→ The following equation has to be satisfied:

(D_{k−L:k|k} − w_{k|k} 1^T)(D_{k−L:k|k} − w_{k|k} 1^T)^T = M (W_{k|k} − w_{k|k} w_{k|k}^T).

SLIDE 25

Second-order corrected LETSs

Given: D ∈ D1. Goal: correct the transformation so that D̃ ∈ D2. Ansatz:

D̃ = D + ∆

with ∆ ∈ R^{M×M} such that ∆1 = 0, ∆^T 1 = 0 ([Acevedo et al., 2017]). One needs to solve the algebraic Riccati equation

M(W − w w^T) − (D − w 1^T)(D − w 1^T)^T = (D − w 1^T)∆^T + ∆(D − w 1^T)^T + ∆∆^T.

SLIDE 26

Second-order corrected LETSs

Given: D ∈ D1 and D̃ = D + ∆ ∈ D2. Every transformation of the form

D̃_S = D + ∆S

with S any orthogonal matrix such that S1 = 1 is also second-order accurate.

Remark: second-order corrected LETSs have been proposed by [Xiong et al., 2006] and [Tödter and Ahrens, 2015], corresponding to D = D^0 with S = Id or S randomly chosen.

SLIDE 27

Numerical example

Gaussian prior, non-Gaussian likelihood:

Figure 1: Prior and posterior distribution for the single Bayesian inference step

SLIDE 28

Numerical example

Figure 2: Absolute errors in the first four moments of the posterior distribution.

SLIDE 29

Lorenz 63 model

SLIDE 30

Memory of Model

SLIDE 31

Mackey-Glass model

SLIDE 32

Hybrid

SLIDE 33

Hybrid

Splitting the likelihood:

p(y_k|x_k) ∝ exp( −(α/2) (h(x_k) − y_k)^T R^{−1} (h(x_k) − y_k) ) × exp( −((1−α)/2) (h(x_k) − y_k)^T R^{−1} (h(x_k) − y_k) )

Hybrid update example:

X_{k−L:k|k} = X_{k−L:k|k−1} D^ETPS_{k−L:k|k}(α) D^ESRS_{k−L:k|k}(1 − α)

Note that

  • α = 0 ⇒ pure Ensemble Square Root Smoother
  • α = 1 ⇒ pure Ensemble Transform Particle Smoother
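
The splitting itself is easy to verify numerically; the sketch below only checks that the two tempered Gaussian factors recombine to the full likelihood (the composition of the two smoother transforms is not reproduced here), with `split_likelihood` as an illustrative helper name:

```python
import numpy as np

def split_likelihood(hx, y, Rinv, alpha):
    """Split the Gaussian likelihood into tempered factors:
    p(y|x) ∝ exp(-a/2 d^T R^{-1} d) * exp(-(1-a)/2 d^T R^{-1} d), d = h(x) - y.
    The first factor feeds the particle update, the second the EnKF-type update."""
    d = hx - y
    q = d @ Rinv @ d
    return np.exp(-0.5 * alpha * q), np.exp(-0.5 * (1 - alpha) * q)

alpha = 0.3
Rinv = np.linalg.inv(0.5 * np.eye(2))
hx = np.array([0.1, -0.2])
l_pf, l_enkf = split_likelihood(hx, np.zeros(2), Rinv, alpha)
full = np.exp(-0.5 * hx @ Rinv @ hx)
```
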

SLIDE 34

Lorenz 63 model

ẋ₁ = 10(x₂ − x₁), ẋ₂ = x₁(28 − x₃) − x₂, ẋ₃ = x₁x₂ − (8/3)x₃

Setting:

  • ∆t = 0.00005
  • ε ∈ {10⁻¹, . . . , 10⁻⁴, 10⁻⁵}
  • 10⁷ time-steps
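
The model above can be integrated with the stated step size ∆t; the integration scheme and initial condition are not given on the slide, so RK4 from (1, 1, 1) is an illustrative choice:

```python
import numpy as np

def lorenz63(x):
    """Lorenz 63 vector field with the classical parameters from the slide."""
    return np.array([10.0 * (x[1] - x[0]),
                     x[0] * (28.0 - x[2]) - x[1],
                     x[0] * x[1] - (8.0 / 3.0) * x[2]])

def rk4_step(f, x, dt):
    """One fourth-order Runge-Kutta step."""
    k1 = f(x)
    k2 = f(x + 0.5 * dt * k1)
    k3 = f(x + 0.5 * dt * k2)
    k4 = f(x + dt * k3)
    return x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

dt = 0.00005
x = np.array([1.0, 1.0, 1.0])
for _ in range(1000):                 # shortened run; the slide uses 10^7 steps
    x = rk4_step(lorenz63, x, dt)
```
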

SLIDE 35

Lorenz 63 model

Figure 3: RMSEs for hybrid ESRS (α = 0) and 2nd-order corrected ETPS (α = 1) as a function of the sample size, M.

SLIDE 36

Localisation

SLIDE 37

Localisation

R-localisation

  • Motivation: only take nearby observations into account
  • localisation takes place in observation space
  • R is tuned for every grid point with respect to the spatio-temporal distance to y

Approach:

(R_loc(g_s(k − l)))^{−1} = {C̃_s(k − l)} R^{−1}, (14)

where

{C̃_s(k − l)}_{uu} = ρ( r_{s,u} / r_loc ) with u ∈ {1, . . . , N_y}, (15)

with time-dependent spatial distance function

r_{s,u}(k − l) = |g_s(k − l) − g^y_u(k)|, (16)

where ρ is an appropriately chosen localisation function.
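
A minimal sketch of (14)-(16) for a diagonal R on a 1D grid; the slide does not fix ρ, so a Gaussian taper is used here purely as an illustrative choice:

```python
import numpy as np

def localized_R_inv(grid_point, obs_locations, R, r_loc):
    """R-localisation sketch: scale the entries of R^{-1} by a taper
    rho(r / r_loc) of the distance between the analysed grid point and
    each observation location. Assumes diagonal R; rho is a Gaussian
    taper chosen for illustration."""
    r = np.abs(obs_locations - grid_point)        # distances r_{s,u}
    rho = np.exp(-0.5 * (r / r_loc) ** 2)         # taper weights in (0, 1]
    return np.diag(rho) @ np.linalg.inv(R)        # tapered R^{-1}, cf. (14)

R = 0.5 * np.eye(3)
Rinv_loc = localized_R_inv(grid_point=0.0,
                           obs_locations=np.array([0.0, 1.0, 5.0]),
                           R=R, r_loc=2.0)
```

Distant observations get their inverse error variance damped towards zero, so they barely influence the local analysis.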

SLIDE 38

Autocorrelation

[Figure: spatial autocorrelation of the ensemble at lags 0, 1, 5, and 10; colour scale from −1 to 1]

SLIDE 39

Lorenz 96 model

[Figure: RMSE as a function of the smoothing lag for the Lorenz 96 model, comparing autocorrelation-based localisation with fixed localisation]

SLIDE 40

Lorenz 96 model

[Figure: RMSE as a function of the bridging parameter α for filter and smoother with M = 30 and M = 40]

SLIDE 41

Thanks!

SLIDE 42

References i

Acevedo, W., de Wiljes, J., and Reich, S. (2017). Second-order accurate ensemble transform particle filters. SIAM J. Sci. Comput., 39(5):1834–1850.

Bishop, C. H., Etherton, B. J., and Majumdar, S. J. (2001). Adaptive sampling with the ensemble transform Kalman filter. Part I: Theoretical aspects. Monthly Weather Review, 129.

Chustagulprom, N., Reich, S., and Reinhardt, M. (2016). A hybrid ensemble transform particle filter for nonlinear and spatially extended dynamical systems. SIAM/ASA J. Uncertainty Quantification, 4(1):592–608.

SLIDE 43

References ii

Cuturi, M. (2013). Sinkhorn distances: Lightspeed computation of optimal transport. In Advances in Neural Information Processing Systems 26, pages 2292–2300. Curran Associates, Inc.

de Wiljes, J., Pathiraja, S., and Reich, S. (2019). Algorithms for high-dimensional non-linear filtering and smoothing problems. https://arxiv.org/pdf/1901.06300.pdf, submitted to SIAM J. Sci. Comput.

Frei, M. and Künsch, H. R. (2013). Bridging the ensemble Kalman and particle filters. Biometrika.

SLIDE 44

References iii

Lei, J. and Bickel, P. (2011). A moment matching ensemble filter for nonlinear non-Gaussian data assimilation. Mon. Wea. Rev., 139(12):3964–3973.

Pele, O. and Werman, M. (2009). Fast and robust Earth Mover's Distances. In ICCV.

Poterjoy, J. (2015). A localized particle filter for high-dimensional nonlinear systems. Mon. Wea. Rev., 144(1):59–76.

SLIDE 45

References iv

Reich, S. (2013). A non-parametric ensemble transform method for Bayesian inference. SIAM J. Sci. Comput., 35:A2013–A2024.

Tödter, J. and Ahrens, B. (2015). A second-order exact ensemble square root filter for nonlinear data assimilation. Mon. Wea. Rev., 143:1347–1367.

Xiong, X., Navon, I., and Uzungoglu, B. (2006). A note on the particle filter with posterior Gaussian resampling. Tellus, 58A:456–460.
