Ensemble Kalman Inversion Derivative-Free Optimization Andrew - PowerPoint PPT Presentation

Ensemble Kalman Inversion Derivative-Free Optimization Andrew Stuart Computing and Mathematical Sciences California Institute of Technology Allen Philanthropies, Mission Control for Earth, NSF, ONR, Schmidt Futures. ISDA 2019, Kobe, Japan January 23rd 2019

Overview What Is EKI? EKI For Complex Dynamics EKI With Constraints Conclusions and References

What Is EKI? Anderson, J.L. (2001) Lorentzen, R., K. Fjelde, J. Froyen, A. Lage, G. Naevdal and E. Vefring (2001) Naevdal, G., T. Mannseth and E. Vefring (2002) Skjervheim, J. and G. Evensen (2011) Chen, Y. and D. Oliver (2002) Emerick, A. and A. Reynolds (2013) Evensen, G. (2018)

Ensemble Kalman Filter (Kalman [1], Evensen [2]) State Space Model n ∈ Z + Dynamics Model: v n +1 = Ψ( v n ) + ξ n , n ∈ Z + Data Model: y n +1 = Hv n +1 + η n +1 , Noise Structure: ξ n ∼ N (0 , Σ) , η n ∼ N (0 , Γ) . Sequential Optimization Perspective v ( k ) n +1 = Ψ( v ( k ) n ) + ξ ( k ) n ∈ Z + Predict: � n , n ( v ) = 1 n +1 ) � 2 + 1 − 1 2 � Γ − 1 v ( k ) 2 ( y ( k ) J ( k ) 2 � � n +1 − Hv ) � 2 Model/Data Compromise: C n +1 ( v − � 2 v ( k ) n +1 = argmin v J ( k ) Optimize: n ( v ) . � v ( k ) C n +1 is empirical covariance of the { � n +1 } .

Ensemble Kalman Inversion Standard Inverse Problem Formulation Find θ from y where G : U �→ Y , η is noise and y = G( θ ) + η, η ∼ N(0 , Γ) . State Space Estimation Formulation (Oliver et al [3], Iglesias et al [4]) n ∈ Z + Dynamics Model: θ n +1 = θ n , n ∈ Z + Dynamics Model: w n +1 = G( θ n ) , n ∈ Z + Data Model: y n +1 = w n +1 + η n +1 , Employ Ensemble Kalman Filter with y n +1 ≡ y .

Ensemble Kalman Inversion Standard Inverse Problem Formulation Find θ from y where G : U �→ Y , η is noise and y = G( θ ) + η, η ∼ N(0 , Γ) . State Space Estimation Formulation v = ( θ, w ), Ψ( v ) = ( θ, G( θ )) and H = (0 , I ): n ∈ Z + Dynamics Model: v n +1 = Ψ( v n ) + ξ n , n ∈ Z + Data Model: y n +1 = Hv n +1 + η n +1 , Employ Ensemble Kalman Filter with y n +1 ≡ y .

Ensemble Kalman Inversion EKI: Discrete Time + Γ) − 1 � � θ ( j ) n +1 = θ ( j ) n + C θ G n ( C GG y ( j ) − G ( θ ( j ) θ ( j ) (0) = θ ( j ) n ) , 0 . n n Let Γ �→ h − 1 Γ, θ ( j ) n ≈ θ ( j ) ( nh ) and h → 0: EKI: Continuous Time Limit � G , G ( θ ( j ) ) − y ( j ) � � J θ ( j ) = − 1 ˙ G ( θ ( k ) ) − ¯ Γ θ ( k ) , θ ( j ) (0) = θ ( j ) 0 . J k =1

Electrical Impedance Tomography (EIT) 1. Chada et al [5] Forward Problem Given ( κ, I ) ∈ L ∞ ( D ; R + ) × R m find ( ν, V ) ∈ H 1 ( D ) × R m : e l −∇ · ( κ ∇ ν ) = 0 ∈ D , ∂D ν + z ℓ κ ∇ ν · n = V ℓ ∈ e ℓ , ℓ = 1 , . . . , m , ∈ ∂ D \ ∪ m ∇ ν · n = 0 ℓ =1 e ℓ , � κ ∇ ν · n ds = I ℓ ∈ e ℓ , ℓ = 1 , . . . , m . D Ohm’s Law: V = R ( κ ) × I . Inverse Problem Set κ = exp( u ) . Given a set of K noisy measurements of voltage V ( k ) from � � currents I ( k ), and G k ( u ) = R exp( u ) × I ( k ), find u from y where: η ∼ N(0 , γ 2 ) , y ( k ) = G k ( u ) + η, k = 1 , . . . , K .

EIT 2 5 4.5 4 3.5 3 2.5 2 1.5 1 Figure: True Conductivity. Parameterization ◮ Continuous level set function. ◮ Lengthscale of level set function. ◮ Smoothness of level set function.

EIT 3 Figure: Five succesive iterations: level set function u .

EIT 4 Figure: Five succesive iterations: thresholded level set function v = S ( u ).

EKI For Complex Dynamics collaboration with: Cleary, Garbuno-Inigo, Lan, Schneider (2019)

Data From Dynamics Parameter-Dependent Dynamics du dt = F ( u ; θ ) , u (0) = u 0 . Time-Averaged Data � T � � y = G T ( θ ; u 0 ) = 1 ϕ u ( t ) dt . T 0 Central Limit Theorem 1 √ G T ( θ ; u 0 ) = G ( θ ) + N (0 , Σ) , T 1 y = G ( θ ) + √ N (0 , Σ) . T

Example 1 – Lorenz ’63 Governing Dynamics x = 10 ( y − x ) ˙ y = r x − y − x z ˙ z = x y − b z ˙ ◮ 2-dimensional unknown: θ = [ r , b ] ⊤ ◮ G only available to us approximately via G T . ◮ η only available to us approximately via G T .

Example 1 – Lorenz ’63 EKI: Continuous Time Limit � G , G ( θ ( j ) ) − y ( j ) � � J θ ( j ) = − 1 ˙ G ( θ ( k ) ) − ¯ Γ θ ( k ) , θ ( j ) (0) = θ ( j ) 0 J k =1 ◮ ϕ ( · ) is the vector of first and second order moments. ◮ The observation y is simulated from a truth θ ∗ ◮ θ ∗ = [28 , 8 / 3] ◮ Each y ( j ) i . i . d ∼ N ( y , Γ) ◮ G only available to us approximately via G T .

Example 1 – Lorenz ’63 7 30 10 6 25 10 5 20 10 4 15 10 3 10 10 2 5 10 1 0 10 0 5 10 15 20 25 30

Example 1 – Lorenz ’63 Objective Function 1500 1500 1250 1250 G ( )| 2 G ( )| 2 1000 1000 750 750 2 | y 2 | y 500 500 1 1 250 250 0 0 25 26 27 28 29 30 2.0 2.5 3.0 3.5 4.0 r b ◮ Marginals of the objective function Φ( u ) = 1 2 � y − G ( θ ) � 2 . ◮ In blue, noisy evaluations of the misfit Φ( θ ) ◮ In orange: the GP mean from emulation of log Φ. ◮ In shaded orange: is the GP 95% uncertainty bands. ◮ k ( θ, θ ′ ) = σ 2 exp( −|� ℓ, θ − θ ′ �| 2 ) + δ 2 .

Example 1 – Lorenz ’63 Bayesian Inversion 0.120 0.021 28.1 0.105 28.1 0.018 0.090 0.015 28.0 28.0 0.075 0.012 0.060 r r 0.009 27.9 27.9 0.045 0.006 0.030 27.8 27.8 0.003 0.015 0.000 0.000 2.60 2.65 2.70 2.60 2.65 2.70 b b (a) MCMC using noisy Φ( θ ) (b) MCMC using GP Figure: MCMC-RWM Sample Density Plots MCMC-RWM: Comparisons using ∼ 5 × 10 4 samples ◮ Noisy Φ: 1% acceptance rate; not converged. ◮ GP Φ: 30% acceptance rate; converged.

Example 2 – Simplified Betts-Miller Scheme (’86, ’07, ’08) Forward Model ∂ q ∂ t + v · ∇ q = − q − q ref ( T ; θ ) Moisture Conservation: τ q ( q , T ; θ ) ∂ T ∂ t + v · ∇ T = T − T ref ( q , T ; θ ) Energy Conservation: + RAD + ... τ T ( q , T ; θ ) ◮ Coupled with mass and momentum conservation equations. ◮ Model computes precipitation rates; initially P q � = P T . ◮ Adjusts reference profiles and/or timescales until P q = P T . ◮ Enthalpy constraints imposed. ◮ O (10 5 ) unknowns ◮ Employs 2 unknown parameters: ◮ θ RH : reference relative humidity ◮ θ τ : default relaxation timescale

Example 2 – Betts-Miller EKI: Discrete Time + Γ) − 1 � � θ ( j ) n +1 = θ ( j ) y ( j ) − G ( θ ( j ) θ ( j ) (0) = θ ( j ) n + C θ G n ( C GG n ) , 0 . n n ◮ ϕ ( θ ) = [ � RH ( q , T ; θ ) � , � P ( q , T ; θ ) � , F ( q , T ; θ )] ⊤ ◮ RH : relative humidity ◮ P : daily precipitation rate ◮ �·� : denotes longitudinal average ◮ F : longutudinal probability of extreme precipitation (90th %ile) ◮ ϕ ( θ ) is evaluated over all latitudes at fixed altitude of 5 km ◮ ϕ ( θ ) ∈ R 96

Example 2 – Betts-Miller

Example 2 – Betts-Miller Objective Function 300 500 250 400 G ( )| 2 G ( )| 2 200 300 150 2 | y 2 | y 200 1 1 100 100 50 0.600 0.625 0.650 0.675 0.700 0.725 1 2 3 4 [hrs] RH Figure: GP trained from EKI data In orange, the GP mean from emulation of Φ.

Example 2 – Betts-Miller MCMC 2.3 0.016 2.3 0.024 2.2 0.014 2.2 0.021 2.1 0.012 2.1 0.018 2.0 0.010 2.0 0.015 [hrs] [hrs] 1.9 0.008 1.9 0.012 1.8 0.006 1.8 0.009 1.7 0.004 1.7 0.006 1.6 0.002 1.6 0.003 1.5 0.000 1.5 0.000 0.690 0.695 0.700 0.705 0.710 0.690 0.695 0.700 0.705 0.710 RH RH (a) GP from uniform samples (b) GP from EKI Figure: MCMC Sample density plot RWM with ∼ 10 6 iterations

EKI With Constraints collaboration with Albers, Blancquart, Levine, Esmaeilzadeh-Seylabi [6] Wang D, Chen Y and Cai X 2009 Janji´ c T, McLaughlin D, Cohn S E and Verlaan M 2014

Constrained Ensemble Kalman Filter State Space Model n ∈ Z + Dynamics Model: v n +1 = Ψ( v n ) + ξ n , n ∈ Z + Data Model: y n +1 = Hv n +1 + η n +1 , Noise Structure: ξ n ∼ N (0 , Σ) , η n ∼ N (0 , Γ) . Sequential Optimization Perspective v ( k ) n +1 = Ψ( v ( k ) n ) + ξ ( k ) n ∈ Z + Predict: n , � n ( v ) = 1 n +1 ) � 2 + 1 − 1 2 � Γ − 1 J ( k ) v ( k ) 2 ( y ( k ) n +1 − Hv ) � 2 2 � � Model/Data Compromise: n +1 ( v − � 2 C v ( k ) n +1 = argmin v ∈A J ( k ) Optimize: n ( v ) Constraint Set: A = { v : Fv = f , Gv � g } .

Example 3 – Seismology Governing Dynamics � � − ∂ 2 d ( z , t ) ∂ c 2 ( z ; θ ) ∂ d ( z , t ) = 0 , ∂ z ∂ z ∂ t 2 d ( H , t ) = d 0 ( t ) , ∂ d (0 , t ) /∂ z = 0 , d ( z , 0) = 0 , ∂ d ( z , 0) /∂ t = 0 . Observation Operator G ( θ ) = ∂ 2 d (0 , t ) /∂ t 2 .

Example 3 – Seismology Parameterization  c 0 0 ≤ z ≤ z 0   � � n  c 0 1 + k ( z − z 0 ) z 0 ≤ z ≤ z 1 c ( z ) = .  � � n   α c 0 1 + k ( z 1 − z 0 ) z 1 ≤ z ≤ H θ = ( c 0 , k , z 0 , n , z 1 , α ) . Constraints 0 ≤ c 0 ≤ 1000 , 0 ≤ k ≤ 100 , 0 ≤ z 0 ≤ z 1 , 0 ≤ n ≤ 1 , z 0 ≤ z 1 ≤ H , 1 ≤ α ≤ 10 .

Example 3 – Seismology Evolution of Ensemble

Example 3 – Seismology Percentage Of Violations c s 0 < 0 60 k < 0 50 z 0 < 0 n < 0 40 z 0 > z 1 , < 1 30 c s 0 > 1000 k > 100 20 n > 1 10 z 1 > H , > 10 0 5 10 15 20 Iteration

Example 3 – Seismology Velocity 0 u y u j =0 7 -0.2 7 u j =40 -0.4 z=H -0.6 -0.8 -1 10 2 10 4 c s (m/s)

Ensemble Kalman Inversion Derivative-Free Optimization Andrew - PowerPoint PPT Presentation

Ensemble Kalman Inversion Derivative-Free Optimization Andrew Stuart Computing and Mathematical Sciences California Institute of Technology Allen Philanthropies, Mission Control for Earth, NSF, ONR, Schmidt Futures. ISDA 2019, Kobe, Japan

Multimodality in the Kalman Filter and Ensemble Kalman Filter Maxime Conjard, Henning Omre

Recursive State Estimation 2 Lecture 8 Recap Today Kalman Filter Extended Kalman Filter

Progress Report of Local Ensemble Kalman Progress Report of Local Ensemble Kalman Filter/fvGCM

ECE 516: Adaptive Digital Filters Lecture 8 (Kalman Filtering) Mojtaba Soltanalian Kalman

UNSCENTED KALMAN FILTER UNSCENTED KALMAN FILTER MATTHIEU BLOCH April 21, 2020 1 / 9 RECAP:

Kalman filter Kalman Filter Kalman filter is used to filter true system states from noisy

The Kalman Filter (part 1) Administrative Stuff Rudolf Emil Kalman

Linear ensemble transform filters: A unified perspective on ensemble Kalman and particle filters

Convergence of ensemble Kalman filters in the large ensemble limit and infinite dimension Jan

THE KALMAN FILTER RAUL ROJAS Abstract. This paper provides a gentle introduction to the Kalman

Boosting (ensemble) Module 4 - Ensemble classifiers - Objectives module 4: boosting (ensemble

Strengthening the inversion Tactic in Coq Dependent Types Inversion Lemmas Implications Anne

S E I S C O P E Using the Ensemble Transform Kalman Filter to estimate uncertainty in Full

Derivative-Free Methods for Machine Learning Tasks Inverse Problem Formulations The Ensemble

Variable selection bias Bias in Ensemble Bias in Ensemble Methods Methods Variable selection

Using the Mixture Kalman Filter to Track a Hidden State in Changepoint Models Sarah Oscroft

Multivariate generalizations of the Foata-Sch utzenberger equidistribution Fourth Colloquium

Approximate Neumann Series or Exact Matrix Inversion for Massive MIMO? Oscar Gustafsson, Erik

New Ideas about Traditional Approach . . . Joint Inversion First Idea: in Brief Details: A

Invertible Convolutional Flow M. Karami , J. Sohl-Dickstein, D. Schuurmans, L. Dinh, D. Duckworth

Probabilistic dipole inversion for adaptive quantitative susceptibility mapping Jinwei Zhang 1,2

Frequency domain electromagnetic data inversion Caterina Fenu Joint work with Gian Piero Deidda

IN3170/4170, Spring 2020 Philipp Hfliger hafliger@ifi.uio.no Excerpt of Sedra/Smith Chapter 5:

Correcting errors in a matrix inverse (or product) Daniel S. Roche Computer Science Department

Sambuz

Useful Links

Newsletter

Mail Us