Robust method for EnKF in the presence of observation outliers/Multivariate localization methods for EnKF



SLIDE 1

Robust method for EnKF in the presence of observation outliers/Multivariate localization methods for EnKF

Mikyoung Jun

Department of Statistics Texas A&M University

May 19, 2015

  • M. Jun (TAMU)

Multivariate Localization/Robust Methods for EnKF May 19 1 / 32

SLIDE 2

Setup

◮ $y_t \in \mathbb{R}^p$: observations at time $t$
◮ $x_t \in \mathbb{R}^n$: unobservable state at time $t$
◮ Nonlinear system eq. $x_t = M(x_{t-1}) + e_t$
◮ Observation eq. $y_t = H_t x_t + \epsilon_t$
◮ System error $e_t \sim N_n(0, Q_t)$
◮ Observation error $\epsilon_t \sim N_p(0, R_t)$
◮ System and observation errors are uncorrelated.
◮ $M$, $H_t$, $Q_t$, $R_t$ assumed to be known.

SLIDE 3

Definitions

◮ $M$-member background ensemble $x^b = \{x^{b(k)} : k = 1, \dots, M\} \in \mathbb{R}^{n \times M}$
◮ background mean $\bar{x}^b = \frac{1}{M} \sum_{k=1}^{M} x^{b(k)}$
◮ ensemble-based estimate of the background error covariance $P^b = \frac{1}{M-1} \sum_{k=1}^{M} X^{b(k)} [X^{b(k)}]^T$, where $X^{b(k)} = x^{b(k)} - \bar{x}^b$
◮ analysis mean $\bar{x}^a = \bar{x}^b + K(y - H\bar{x}^b)$
◮ analysis covariance $P^a = (I - KH)P^b$
◮ Kalman gain $K = P^b H^T (H P^b H^T + R)^{-1}$
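The definitions above can be sketched in NumPy as follows. This is a minimal illustration rather than the speaker's code: the function name `enkf_analysis`, the toy dimensions, and the random inputs are assumptions made for the example.

```python
import numpy as np

def enkf_analysis(xb, y, H, R):
    """Analysis mean, analysis covariance and Kalman gain from a background ensemble.

    xb: (n, M) background ensemble, y: (p,) observation,
    H: (p, n) observation operator, R: (p, p) observation error covariance.
    """
    n, M = xb.shape
    xb_mean = xb.mean(axis=1)                  # background mean
    Xb = xb - xb_mean[:, None]                 # perturbations X^{b(k)}
    Pb = Xb @ Xb.T / (M - 1)                   # ensemble estimate of P^b
    S = H @ Pb @ H.T + R                       # H P^b H^T + R
    K = np.linalg.solve(S, H @ Pb).T           # K = P^b H^T S^{-1} (S and P^b are symmetric)
    xa_mean = xb_mean + K @ (y - H @ xb_mean)  # analysis mean
    Pa = (np.eye(n) - K @ H) @ Pb              # analysis covariance
    return xa_mean, Pa, K

# Toy usage (all values hypothetical): 5 state variables, 20 members, 3 observations.
rng = np.random.default_rng(0)
n, M, p = 5, 20, 3
xb = rng.normal(size=(n, M))
H = np.eye(p, n)                               # observe the first p state components
R = 0.1 * np.eye(p)
y = rng.normal(size=p)
xa_mean, Pa, K = enkf_analysis(xb, y, H, R)
print(xa_mean.shape, Pa.shape, K.shape)
```

Solving against $H P^b H^T + R$ instead of forming its inverse is a numerical convenience; the result is the same gain as in the formula above.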

SLIDE 4

Part 1: Multivariate Localization for EnKF

Outline

Part 1: Multivariate Localization for EnKF
Part 2: Robust EnKF in the presence of observation outliers

SLIDE 5

Motivation

Joint work with Soojin Roh, Istvan Szunyogh, and Marc Genton

◮ Localization: Schur (elementwise) product of $P^b$ and a localization matrix from a compactly supported correlation function $\rho(\cdot)$
◮ In statistics, a localization function is called a “taper” or compactly supported covariance function; it needs to be positive definite
◮ For multivariate state variables, current practice is to apply the same localization function to each “block” of $P^b$
◮ Does it not matter? ($K = P^b H^T (H P^b H^T + R)^{-1}$)
◮ Kang et al. (2011, JGR) zero out covariances between physically unrelated variables (not about positive-definiteness)

SLIDE 6

Motivation

◮ Problem of rank deficiency:
  ◮ Localization matrix $\begin{pmatrix} L & L \\ L & L \end{pmatrix}$
  ◮ Problem is more serious when the $P^b_{ij}$'s are “significantly” non-zero
◮ We need $\rho(\cdot) = \{\rho_{ij}(\cdot)\}_{i,j=1,\dots,N}$: a matrix-valued correlation (positive definite) function, where $N$ is the number of state variables. A multivariate version of Gaspari-Cohn functions?
◮ In the statistics literature, not many such “valid” (parametric) $\rho$ functions are available yet
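The rank deficiency can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses a univariate Askey-type taper on a hypothetical 1-D grid of 10 points for $L$, and compares the block matrix built from the same $L$ in every block with one whose off-diagonal blocks are scaled by a constant $\beta = 0.5$.

```python
import numpy as np

def askey(d, c, nu=2):
    """Univariate Askey-type taper (1 - |d|/c)_+^nu, used here only as an example of L."""
    return np.clip(1.0 - np.abs(d) / c, 0.0, None) ** nu

n = 10
pts = np.arange(n, dtype=float)               # hypothetical 1-D grid locations
D = np.abs(pts[:, None] - pts[None, :])       # pairwise distances
L = askey(D, c=3.0)                           # n x n localization matrix

same_taper = np.kron(np.ones((2, 2)), L)      # [[L, L], [L, L]]
beta = 0.5
scaled = np.kron(np.array([[1.0, beta], [beta, 1.0]]), L)

rank_same = np.linalg.matrix_rank(same_taper, tol=1e-8)
rank_scaled = np.linalg.matrix_rank(scaled, tol=1e-8)
print(rank_same, rank_scaled)                 # prints: 10 20
```

The first matrix has rank $n$ out of $2n$ because the all-ones $2 \times 2$ factor has a zero eigenvalue; scaling the off-diagonal blocks by $|\beta| < 1$ removes it.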

SLIDE 7

One simple idea

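◮ Use $\rho_{ij}(\cdot) = \beta_{ij} \cdot \rho(\cdot)$ with $|\beta_{ij}| < 1$, $|\beta_{ji}| < 1$, and $\beta_{ii} = \beta_{jj} = 1$. The matrix $\begin{pmatrix} 1 & \beta \\ \beta & 1 \end{pmatrix}$ is positive-definite and of full rank for any $\beta$ with $|\beta| < 1$.
◮ For $\rho$, use any localization functions in Gaspari and Cohn (1999).
◮ Example:

$$\rho(d;c) = \begin{cases} -\frac{1}{4}(|d|/c)^5 + \frac{1}{2}(d/c)^4 + \frac{5}{8}(|d|/c)^3 - \frac{5}{3}(d/c)^2 + 1, & 0 \le |d| \le c; \\ \frac{1}{12}(|d|/c)^5 - \frac{1}{2}(d/c)^4 + \frac{5}{8}(|d|/c)^3 + \frac{5}{3}(d/c)^2 - 5(|d|/c) + 4 - \frac{2}{3}\,c/|d|, & c \le |d| \le 2c; \\ 0, & 2c \le |d| \end{cases} \tag{1}$$

and $\beta_{11} = \beta_{22} = 1$, $0 \le \beta_{ij} \le 1$.

◮ This multivariate localization function is separable in the sense that the multivariate component (in the above example, $\begin{pmatrix} 1 & \beta \\ \beta & 1 \end{pmatrix}$) and the localization function (in the above example, $\rho$) are factored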
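A minimal sketch of this separable construction is given below. The grid, ensemble size, $c = 2$ and $\beta = 0.5$ are hypothetical values chosen for illustration, the function names are not from the talk, and the state vector is assumed to be ordered as variable 1 at all locations followed by variable 2 at all locations.

```python
import numpy as np

def gaspari_cohn(d, c):
    """Fifth-order Gaspari-Cohn function of equation (1); zero beyond |d| = 2c."""
    r = np.abs(np.asarray(d, dtype=float)) / c
    rho = np.zeros_like(r)
    inner = r <= 1.0
    outer = (r > 1.0) & (r < 2.0)
    ri, ro = r[inner], r[outer]
    rho[inner] = -0.25 * ri**5 + 0.5 * ri**4 + 0.625 * ri**3 - (5.0 / 3.0) * ri**2 + 1.0
    rho[outer] = (ro**5 / 12.0 - 0.5 * ro**4 + 0.625 * ro**3
                  + (5.0 / 3.0) * ro**2 - 5.0 * ro + 4.0 - (2.0 / 3.0) / ro)
    return rho

def bivariate_localization(D, c, beta):
    """Separable localization matrix: [[1, beta], [beta, 1]] Kronecker rho(D; c)."""
    B = np.array([[1.0, beta], [beta, 1.0]])
    return np.kron(B, gaspari_cohn(D, c))

rng = np.random.default_rng(1)
n, M = 12, 5                                   # hypothetical grid size and ensemble size
pts = np.arange(n, dtype=float)
D = np.abs(pts[:, None] - pts[None, :])        # pairwise distances on a line
loc = bivariate_localization(D, c=2.0, beta=0.5)

ens = rng.normal(size=(2 * n, M))              # stand-in for a background ensemble
Pb = np.cov(ens)                               # rank-deficient sample covariance
Pb_loc = loc * Pb                              # Schur (elementwise) product
print(np.linalg.matrix_rank(Pb), np.linalg.matrix_rank(Pb_loc))
```

Because both factors are positive definite, their Kronecker product is too, and the Schur product with the sample covariance stays positive semidefinite.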

SLIDE 8

Another idea

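◮ Use one of a few multivariate compactly supported correlation functions available in statistics literature.
◮ e.g. Bivariate Askey function (Porcu et al. 2012)

$$\rho_{ij}(d; \nu, c) = \beta_{ij} \left(1 - \frac{d}{c}\right)_+^{\nu + \mu_{ij}},$$

◮ $c > 0$, $\mu_{12} = \mu_{21} \ge \frac{1}{2}(\mu_{11} + \mu_{22})$, $\nu \ge [\frac{1}{2}s] + 2$, and $s$ is space dimension.
◮ $|\beta_{ij}| \le \frac{\Gamma(1+\mu_{12})}{\Gamma(1+\nu+\mu_{12})} \sqrt{\frac{\Gamma(1+\nu+\mu_{11})\,\Gamma(1+\nu+\mu_{22})}{\Gamma(1+\mu_{11})\,\Gamma(1+\mu_{22})}}$, $\beta_{ii} = \beta_{jj} = 1$
◮ $|\beta_{ij}| \le 1$ if $\mu_{11} = \mu_{22}$.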
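The bound on $\beta_{ij}$ and the block matrix implied by the bivariate Askey function can be evaluated as in the sketch below. The function names and the 1-D grid are assumptions for illustration; the parameter values $\mu_{11} = 0$, $\mu_{22} = 2$, $\mu_{12} = 1$, $\nu = 3$ are the ones used later in the talk, and the bound is read as containing a square root over the second Gamma ratio.

```python
import numpy as np
from math import gamma, sqrt

def beta_bound(nu, mu11, mu22, mu12):
    """Upper bound on |beta_12| for the bivariate Askey function."""
    return (gamma(1 + mu12) / gamma(1 + nu + mu12)
            * sqrt(gamma(1 + nu + mu11) * gamma(1 + nu + mu22)
                   / (gamma(1 + mu11) * gamma(1 + mu22))))

def bivariate_askey(D, c, nu, mu, beta):
    """Block matrix {rho_ij(D)} with rho_ij = beta_ij * (1 - D/c)_+^(nu + mu_ij)."""
    base = np.clip(1.0 - D / c, 0.0, None)
    blocks = [[beta[i][j] * base ** (nu + mu[i][j]) for j in range(2)]
              for i in range(2)]
    return np.block(blocks)

nu = 3
mu = [[0, 1], [1, 2]]                          # mu_11, mu_12; mu_21, mu_22
bound = beta_bound(nu, mu[0][0], mu[1][1], mu[0][1])
print(round(bound, 4))                         # about 0.79

pts = np.arange(15, dtype=float)               # hypothetical 1-D grid
D = np.abs(pts[:, None] - pts[None, :])
beta = [[1.0, 0.7], [0.7, 1.0]]                # 0.7 lies below the bound
rho = bivariate_askey(D, c=5.0, nu=nu, mu=mu, beta=beta)
print(rho.shape, np.linalg.eigvalsh(rho).min())
```

With $\mu_{11} = \mu_{22} = \mu_{12}$ the two Gamma ratios cancel and the bound equals 1, which matches the last bullet above.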

SLIDE 9

[Figure: covariance vs. distance for the Gaspari-Cohn function and the Askey function with ν = 1, 2, 3]

SLIDE 10

Experiment with bivariate Lorenz Model

◮ $X_k$ and $Y_{j,k}$ are equally spaced on a latitude circle ($j = 1, \dots, J$ and $k = 1, \dots, K$).
◮ With boundary conditions $X_{k \pm K} = X_k$, $Y_{j,k \pm K} = Y_{j,k}$, $Y_{j-J,k} = Y_{j,k-1}$, and $Y_{j+J,k} = Y_{j,k+1}$,

$$\frac{dX_k}{dt} = -X_{k-1}(X_{k-2} - X_{k+1}) - X_k - (ha/b) \sum_{j=1}^{J} Y_{j,k} + F,$$

$$\frac{dY_{j,k}}{dt} = -ab\,Y_{j+1,k}(Y_{j+2,k} - Y_{j-1,k}) - aY_{j,k} + (ha/b)X_k$$
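The two equations and their boundary conditions can be coded compactly by storing $Y$ as a $K \times J$ array and flattening it so that $j$ runs fastest. This is a sketch under assumptions: the forcing $F = 10$ and the step size are not given in the slides, the Runge-Kutta integrator is a generic choice, and the function names are illustrative; $a$, $b$, $h$, $K$, $J$ follow the values quoted later in the talk.

```python
import numpy as np

def lorenz_two_scale(X, Y, F=10.0, a=10.0, b=10.0, h=2.0):
    """Tendencies (dX/dt, dY/dt); X has shape (K,), Y has shape (K, J) with Y[k, j] = Y_{j,k}."""
    K, J = Y.shape
    coupling = h * a / b
    dX = (-np.roll(X, 1) * (np.roll(X, 2) - np.roll(X, -1))
          - X - coupling * Y.sum(axis=1) + F)
    y = Y.reshape(-1)      # j runs fastest, so shifting by one index moves j -> j+1
                           # and carries over to the next k at j = J
    dy = (-a * b * np.roll(y, -1) * (np.roll(y, -2) - np.roll(y, 1))
          - a * y + coupling * np.repeat(X, J))
    return dX, dy.reshape(K, J)

def rk4_step(X, Y, dt, **params):
    """One classical fourth-order Runge-Kutta step (integrator choice is an assumption)."""
    k1x, k1y = lorenz_two_scale(X, Y, **params)
    k2x, k2y = lorenz_two_scale(X + 0.5 * dt * k1x, Y + 0.5 * dt * k1y, **params)
    k3x, k3y = lorenz_two_scale(X + 0.5 * dt * k2x, Y + 0.5 * dt * k2y, **params)
    k4x, k4y = lorenz_two_scale(X + dt * k3x, Y + dt * k3y, **params)
    return (X + dt / 6.0 * (k1x + 2 * k2x + 2 * k3x + k4x),
            Y + dt / 6.0 * (k1y + 2 * k2y + 2 * k3y + k4y))

rng = np.random.default_rng(2)
K, J = 36, 10                                  # 36 X variables and 360 Y variables
X = 10.0 + 0.01 * rng.normal(size=K)
Y = 0.01 * rng.normal(size=(K, J))
for _ in range(10):
    X, Y = rk4_step(X, Y, dt=0.001)            # hypothetical step size
print(X.shape, Y.shape)
```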

SLIDE 11

Experiment details

◮ True model states are generated by a long integration of the model
◮ Ensembles are initialized by adding Gaussian noise to the true state
◮ The first 3000 time steps are discarded
◮ Simulated observations are generated by adding mean-zero noise (variance 0.02 for X and 0.005 for Y) to the truth
◮ 20 ensemble members are used (40 members were tested as well)
◮ Covariance inflation of 1.015
◮ RMSE is calculated using the last 1000 time steps, and the experiment is repeated 50 times to produce boxplots
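Two of these ingredients, inflation and RMSE, are easy to write down. The sketch below assumes that the factor 1.015 multiplies the ensemble covariance (so perturbations are scaled by its square root); the slides do not say whether the factor applies to the covariance or to the perturbations, and the function names are illustrative.

```python
import numpy as np

def inflate(ensemble, factor=1.015):
    """Multiplicative inflation: the sample covariance grows by `factor`, the mean is unchanged."""
    mean = ensemble.mean(axis=1, keepdims=True)
    return mean + np.sqrt(factor) * (ensemble - mean)

def rmse(estimate, truth):
    """Root-mean-square error over all entries."""
    return np.sqrt(np.mean((np.asarray(estimate) - np.asarray(truth)) ** 2))

rng = np.random.default_rng(3)
ens = rng.normal(size=(3, 20))                 # toy ensemble: 3 variables, 20 members
ens_inflated = inflate(ens)
truth = np.zeros(3)
print(rmse(ens_inflated.mean(axis=1), truth))
```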

SLIDE 12

Bivariate Lorenz Model

◮ 36 variables of X, 360 variables of Y, a = 10, b = 10, h = 2

[Figure: two panels, the locations of the X and Y variables and their longitudinal profiles (state vs. variable)]

SLIDE 13

Experiment set up

◮ Two scenarios for observation
  1 Observe 20% of X and 90% of Y at locations where X is not observed.
  2 Fully observe X and Y
◮ Four localization schemes
  S1 No localization
  S2 No localization and let $P^b_{12} = P^b_{21} = 0$.
  S3 Localize $P^b_{11}$ and $P^b_{22}$, but let $P^b_{12} = P^b_{21} = 0$.
  S4 Localize $P^b_{11}$, $P^b_{22}$, $P^b_{12}$, $P^b_{21}$
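The four schemes differ only in how the blocks of $P^b$ are treated, which the following sketch makes explicit. The function `apply_scheme` and its arguments are hypothetical names; `L11`, `L22` and `L12` stand for localization matrices for the corresponding blocks (in the bivariate Lorenz setting `L12` would be rectangular, built from distances between X and Y locations).

```python
import numpy as np

def apply_scheme(Pb, n1, scheme, L11=None, L22=None, L12=None):
    """Return a modified copy of Pb; the first n1 rows/columns belong to variable 1."""
    if scheme not in ("S1", "S2", "S3", "S4"):
        raise ValueError(f"unknown scheme: {scheme}")
    P = Pb.copy()
    if scheme == "S1":                         # no localization
        return P
    if scheme in ("S3", "S4"):                 # localize the diagonal blocks
        P[:n1, :n1] *= L11
        P[n1:, n1:] *= L22
    if scheme in ("S2", "S3"):                 # drop the cross-covariance blocks
        P[:n1, n1:] = 0.0
        P[n1:, :n1] = 0.0
    else:                                      # S4: localize the cross-covariance blocks
        P[:n1, n1:] *= L12
        P[n1:, :n1] *= L12.T
    return P

# Toy check with constant matrices (values are arbitrary).
Pb = np.ones((4, 4))
L_diag = 0.5 * np.ones((2, 2))
L_cross = 0.25 * np.ones((2, 2))
results = {s: apply_scheme(Pb, 2, s, L_diag, L_diag, L_cross)
           for s in ("S1", "S2", "S3", "S4")}
print(results["S4"])
```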

SLIDE 14

Localization (S4)

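1 Gaspari-Cohn function: $\rho_{ij}(d; c) = \beta_{ij}\,\rho(d; c)$, $i, j = 1, 2$, where

$$\rho(d;c) = \begin{cases} -\frac{1}{4}(|d|/c)^5 + \frac{1}{2}(d/c)^4 + \frac{5}{8}(|d|/c)^3 - \frac{5}{3}(d/c)^2 + 1, & 0 \le |d| \le c; \\ \frac{1}{12}(|d|/c)^5 - \frac{1}{2}(d/c)^4 + \frac{5}{8}(|d|/c)^3 + \frac{5}{3}(d/c)^2 - 5(|d|/c) + 4 - \frac{2}{3}\,c/|d|, & c \le |d| \le 2c; \\ 0, & 2c \le |d| \end{cases}$$

and $\beta_{11} = \beta_{22} = 1$, $0 \le \beta_{ij} \le 1$. (support = $2c$)

2 Bivariate Askey function

$$\rho_{ij}(d; c) = \beta_{ij} \left(1 - \frac{|d|}{c}\right)_+^{\nu + \mu_{ij}}, \quad i, j = 1, 2,$$

with $\mu_{11} = 0$, $\mu_{22} = 2$, $\mu_{ij} = 1$, $\nu = 3$, and $\beta_{11} = \beta_{22} = 1$, $0 \le \beta_{ij} \le 0.7$. (support = $c$)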

SLIDE 15

Results for X in scenario 1

[Figure: RMSE for X in scenario 1, in four panels for support 50, 70, 100, and 160; x-axis: S1, S2, S3, and six S4 settings labeled 5e−3, 1e−2, 0.1, 0.4, 0.7, 1; legend: Gaspari-Cohn, Askey]

SLIDE 16

Results for X in scenario 1

◮ S1 vs S2: ignoring the cross-covariance (S2) is better than no localization at all (S1)
◮ S3 is worse than S1
◮ S4 with β = 0.01 performs the best regardless of localization radius
◮ Askey seems to be better than the Gaspari-Cohn function

SLIDE 17

Results for Y in scenario 1

[Figure: RMSE for Y in scenario 1, in four panels for support 50, 70, 100, and 160; x-axis: S1, S2, S3, and six S4 settings labeled 5e−3, 1e−2, 0.1, 0.4, 0.7, 1; legend: Gaspari-Cohn, Askey]

SLIDE 18

Results for Y in scenario 1

◮ Askey clearly performs better
◮ Smaller localization radius is advantageous
◮ S3 is better than S1 or S2 and S4 performs only slightly better (only for lowest value of support)

SLIDE 19

Results for X in scenario 2

[Figure: RMSE for X in scenario 2, in four panels for support 50, 70, 100, and 160; x-axis: S1, S2, S3, and six S4 settings labeled 5e−3, 1e−2, 0.1, 0.4, 0.7, 1; legend: Gaspari-Cohn, Askey]

SLIDE 20

Results for X in scenario 2

◮ Less sensitive to the localization radius compared to Scenario 1
◮ State estimates are more accurate
◮ S3 and S4 are comparable and better than S1 and S2

SLIDE 21

Results for Y in scenario 2

[Figure: RMSE for Y in scenario 2, in four panels for support 50, 70, 100, and 160; x-axis: S1, S2, S3, and six S4 settings labeled 5e−3, 1e−2, 0.1, 0.4, 0.7, 1; legend: Gaspari-Cohn, Askey]

SLIDE 22

Results for Y in scenario 2

◮ Best result by Askey with a short localization radius
◮ S3 and S4 perform similarly

SLIDE 23

Some issues

◮ More flexible multivariate tapers? (different localization length for each state variable)
◮ Estimation of “tuning parameters”
◮ Experiment with more realistic system

SLIDE 24

Part 2: Robust EnKF in the presence of observation outliers

Outline

Part 1: Multivariate Localization for EnKF
Part 2: Robust EnKF in the presence of observation outliers

SLIDE 25

Motivation

Joint work with Soojin Roh, Istvan Szunyogh, Marc Genton, and Ibrahim Hoteit (MWR 2013)

◮ The ensemble Kalman filter is known not to be robust to observational outliers (Ruckdeschel 2010, Luo and Hoteit 2011)

SLIDE 26

Robust Ensemble Kalman Filter (REnKF)

◮ What’s been done in practice: usually discard suspicious observations.
◮ Huberization (Ruckdeschel 2010)

$$\hat{x}^a = \bar{x}^b + K G_c(y - H\bar{x}^b),$$

where for any $c \in \mathbb{R}^p_+$ and $u \in \mathbb{R}^p$, the Huber function $G_c(u)$ is ($i = 1, \dots, p$)

$$\{G_c(u)\}_i = \begin{cases} u_i, & \text{if } |u_i| < c_i, \\ c_i, & \text{if } u_i \ge c_i, \\ -c_i, & \text{if } u_i \le -c_i. \end{cases}$$
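The Huber function is a componentwise clip of the innovation, so the robust update takes only a few lines. The sketch below uses illustrative function names and toy numbers; it is not taken from the cited papers.

```python
import numpy as np

def huber_clip(u, c):
    """Componentwise Huber function G_c(u): u_i if |u_i| < c_i, otherwise sign(u_i) * c_i."""
    return np.clip(u, -np.asarray(c), np.asarray(c))

def robust_analysis_mean(xb_mean, K, H, y, c):
    """Huberized analysis mean: xb_mean + K G_c(y - H xb_mean)."""
    return xb_mean + K @ huber_clip(y - H @ xb_mean, c)

u = np.array([0.5, 3.0, -4.0])                 # toy innovation
c = np.array([1.0, 1.0, 2.0])                  # toy clipping heights
clipped = huber_clip(u, c)
print(clipped)                                 # prints: [ 0.5  1.  -2. ]
```

Unlike discarding, a clipped innovation still moves the analysis in the direction of the observation, only by a bounded amount.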

SLIDE 27

Choosing c

1 Efficiency criterion

$$\frac{E|x - \bar{x}^a|^2}{E|x - \hat{x}^a|^2} = \delta,$$

for a given efficiency $\delta \in (0, 1)$. Denominator $= E\big| x - \bar{x}^b - (K)_i\, G_{c_i}\big( [H(x - \bar{x}^b) + \epsilon]_i \big) \big|^2$, where $x - \bar{x}^b \sim N_n(0, P^b)$.

2 Radius criterion

$$E\big(|(y - H\bar{x}^b)_i| - c_i\big)_+ : c_i = r : (1 - r),$$

for a given radius $r \in (0, 1)$. Here, $r$ is a proportion of the amount of clipping in the innovation.

◮ Computing a common $c$
  ◮ Use $\lim_{t \to \infty} P^b_t$ with sufficiently large ensemble size.
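As a worked example of the radius criterion, the sketch below solves for a single clipping height $c_i$ by bisection. It rests on two assumptions that are not stated in the slides: the colon notation is read as the ratio equality $E(|\cdot| - c_i)_+ / c_i = r/(1-r)$, and the $i$-th innovation component is taken to be Gaussian with mean zero and standard deviation $s$ (for instance the square root of the $i$-th diagonal entry of $H P^b H^T + R$). Under that Gaussian assumption, $E(|Z| - c)_+ = 2\,[s\,\varphi(c/s) - c\,(1 - \Phi(c/s))]$.

```python
import math

def expected_excess(c, s):
    """E(|Z| - c)_+ for Z ~ N(0, s^2), from the closed form above."""
    z = c / s
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)   # standard normal density at z
    tail = 0.5 * math.erfc(z / math.sqrt(2.0))                # 1 - Phi(z)
    return 2.0 * (s * pdf - c * tail)

def clipping_height(s, r, iterations=200):
    """Solve expected_excess(c, s) / c = r / (1 - r) for c by bisection."""
    def g(c):
        return (1.0 - r) * expected_excess(c, s) - r * c      # decreasing in c
    lo, hi = 0.0, s
    while g(hi) > 0.0:                                        # expand until the root is bracketed
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

r = 0.001
c1 = clipping_height(1.0, r)
c2 = clipping_height(2.0, r)
print(c1, c2)
```

Under the Gaussian assumption the solution scales linearly with $s$, so one solve for $s = 1$ gives the clipping height for every component after multiplication by its innovation standard deviation.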

SLIDE 28

Effects of Outliers

1 Additive outliers (AO): $y_t = Hx_t + \xi_t + \epsilon_t$, where $\xi_t \in \mathbb{R}^p$ is the outlying value.
2 Innovations outliers (IO): $\epsilon_t \sim (1 - \alpha) N_p(0, R_t) + \alpha N_p(0, k_t \cdot R_t)$, where $0 < \alpha < 1$, $k_t = \mathrm{diag}(k_{t1}, \dots, k_{tp})$, and some $k_{ti} > 1$.
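Both contamination models are straightforward to simulate. In the sketch below the function names are illustrative, $R$ is assumed diagonal so that $k_t R_t$ is itself a valid covariance matrix, and the toy setup borrows the 40-variable configuration used later in the talk with a very small noise level to make the additive shift visible.

```python
import numpy as np

def additive_outlier_obs(x, H, R, xi, rng):
    """AO model: y = H x + xi + eps, with eps ~ N(0, R)."""
    eps = rng.multivariate_normal(np.zeros(R.shape[0]), R)
    return H @ x + xi + eps

def innovations_outlier_noise(R, k, alpha, rng):
    """IO model: eps ~ (1 - alpha) N(0, R) + alpha N(0, k R), with k and R diagonal."""
    cov = k @ R if rng.random() < alpha else R
    return rng.multivariate_normal(np.zeros(R.shape[0]), cov)

rng = np.random.default_rng(4)
p = 40
x = rng.normal(size=p)
H = np.eye(p)
xi = np.zeros(p)
xi[10] = 10.0                                  # outlier of size 10 at variable x11 (index 10)
y_ao = additive_outlier_obs(x, H, 1e-12 * np.eye(p), xi, rng)

k = np.eye(p)
k[10, 10] = 100.0                              # inflated variance at x11
eps_io = innovations_outlier_noise(0.05 * np.eye(p), k, alpha=0.2, rng=rng)
print(y_ao[10] - x[10], eps_io.shape)
```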

SLIDE 29

Lorenz model

◮ 40-variable Lorenz 96 model with $F = 8$
◮ Observation eq. $y_t = x_t + \epsilon_t$ with $\epsilon_t \sim N_{40}(0, 0.05 \cdot I_{40})$
◮ Additive outliers $\xi_t = 10$ at variable $x_{11}$ at $t = 71, \dots, 73$
◮ Traditional EnKF vs Huberizing vs Discarding

[Figure: state vs. time in two panels (δ = 0.999 and r = 0.001), showing truth, EnKF, Huberize, and discard]

SLIDE 30

Lorenz model

◮ Additive outliers $\xi_{71} = 10$ at variable $x_{11}$
◮ Traditional EnKF vs Huberizing vs Discarding

[Figure: bias in two panels, one for various δ (0.9999, 0.999, 0.99, 0.985, 0.98) and one for various r (0.00001, 0.00005, 0.001, 0.01, 0.05)]

SLIDE 31

Lorenz model

◮ Innovations outliers with $k_{71} = 100$ and $\alpha = 0.2$ at variable $x_{11}$
◮ Traditional EnKF vs Huberizing vs Discarding

[Figure: bias in two panels, one for various δ (0.9999, 0.999, 0.99, 0.985, 0.98) and one for various r (0.00001, 0.00005, 0.001, 0.01, 0.05)]

SLIDE 32

Summary

◮ REnKF reduces the estimation bias at the expense of increasing the error variance.
◮ Huberizing filter performs better than discarding suspect observations
◮ Multivariate extension
