Optimal transport for Seismic Imaging Bjorn Engquist In - PowerPoint PPT Presentation

Optimal transport for Seismic Imaging Bjorn Engquist In collaboration with Brittany Froese and Yunan Yang ICERM Workshop - Recent Advances in Seismic Modeling and Inversion: From Analysis to Applications, Brown University, November 6-10, 2017

Outline 1. Remarks on Full Waveform Inversion (FWI) 2. Measures of mismatch 3. Optimal transport and Wasserstein metric 4. Monge-Ampère equation and its numerical approximation 5. Applications to full waveform inversion 6. Conclusions “Background: matching problem and technique”

1. Remarks on Full Waveform Inversion (FWI) • Full Waveform Inversion is an increasingly important technique in the inverse seismic imaging process • It is a PDE constrained optimization formulation • Model parameters v are determined to fit data ( ) min u comp ( m ) − u data A + λ Lv B m ( x )

FWI: PDE constrained optimization • FWI: Measured and processed data is compared to a computed wave field based on model parameters v to be determined (for example, P- wave velocity) ( ) min u comp ( m ) − u data A + λ Lv B Over determined m ( x ) boundary conditions • || . || A measure of mismatch at the surface – L 2 the standard choice • || Lc || B potential regularization term, which we will omit for this presentation • at the surface

Mathematical and computational challenges • Important computational steps • Relevant measure of mismatch ( ✔ ) • Fast wave field solver – In our case scalar wave equation in time or frequency domain, for example u tt = m ( x ) 2 Δ u , • Efficient optimization – Adjoint state method for gradient computation

2. Measures of mismatch • We will denote the computed wave field by f( x , t ; v ) and the data by g ( x , t ), u comp ( x , t ; m ) = f ( x , t ; m ), u data ( x , t ) = g ( x , t ) • The common and original measure of mismatch between the computed signal f and the measured data g is L 2 , [Tarantola, 1984, 1986] min f m − g L 2 m • We will make some remarks on different Measures of mismatch starting with local estimates to more global

Global minimum • It can be expected that the mismatch functional will have local minima that complicates minimization algorithms • Ideally, local minima different from the global min should be avoided for some natural parameterizations as “shift” and “dilations” ( f ( t ) = g ( at - s )) • Shift as a function of t , dilation as a function of x u tt = m 2 u xx , x > 0, t > 0 u (0, t ) = u 0 ( t ) → u = u 0 ( t − x / m )

Local measures • In the L 2 local mismatch, estimators f and g are compared point wise, 1/2 ( ) 2 ∑ J ( v ) = f − g L 2 = f ( x i , t j ) − g ( x i , t j ) i , j • This works well if the starting values for the model parameters are good otherwise there is risk for trapping in local minima “cycle skipping”

“Cycle skipping” • The need for better mismatch functionals can be seen from a simple shift example – small basin of attraction • For other examples, [Vireux et al 2009] L 2 2 “Cycle skipping” Shift or displacement Local minima

Global measures • Different measures have been introduced to to compare all of f and g – not just point wise. • Integrated functions – NIM [Liu et al 2014] – [Donno et al 2014] • Stationary marching filters – Example AWI, [Warner et al 2014] • Non-stationary marching filters – Example [Fomel et al 2013] • Measures based on optimal transport ( ✔ )

Integrated functions • f and g are integrated, typically in 1D-time, before L 2 comparison 1/2 ( ) 2 ∑ J = F − G L 2 = F ( x i , t j ) − G ( x i , t j ) , i , j j j ∑ ∑ F f ( x i , t k ), G i , j = g ( x i , t k ), i , j = k = 1 k = 1 • In mathematical notation this is the H -1 semi-norm • Slight increase in wave length for short signals (Ricker wavelet) • Often applied to modified signals like squaring scaling or envelope to have f and g positive and with equal integral

Matching filters • The filter based measures typically has two steps – Computing filter coefficients K K = argmin K ∗ f − g L 2 – Estimation of difference between computed filter and the identity map. K − I • The filter can be stationary or non-stationary • The optimal transport based techniques Does this in one step – Minimization is of a measure of transform K or as it is called transport

3. Optimal transport and Wasserstein metric • Wasserstein metric measures the “ cost ” for optimally transport one measure (signal) f to the other: g – Monge-Kantorivich optimal transport measure “earth movers distance” in computer science f ( x ) g(y) Compare travel time distance Classic in seismology

Optimal transport and Wasserstein metric • The Wasserstein metric is directly based on one cost function • Signals in exploration seismology are not as clean as above and a robust functional combining features of L 2 and travel time is desirable • Extensive mathematical foundation f ( x ) g ( y )

Wasserstein distance 1/ p ⎛ ⎞ d ( x , y ) p d γ ( x , y ) ∫ W p ( f , g ) = inf ⎜ ⎟ γ ⎝ ⎠ X × Y γ ∈ Γ⊂ X × Y , the set of product measure : f and g ∫ ∫ f ( x ) dx = g ( y ) dy , f , g ≥ 0 X Y 1/2 ⎛ ⎞ 2 f ( x ) dx ∫ W 2 ( f , g ) = inf x − T f , g ( x ) ⎜ ⎟ 2 T f , g ⎝ ⎠ X • Here the “plan” T is the optimal transport map from positive Borel measures f to g of equal mass • Well developed mathematical theory, [Villani, 2003, 2009]

Wasserstein distance s “distance” = s 2 (mass) f g

Wasserstein distance s f g • In this model example W 2 and L 2 is equal (modulo a constant) to leading order when separation distance s is small. Recall L 2 is the standard measure

Wasserstein distance s f g • When s is large W 2 = s = travel distance (time), ( “ higher frequency ” ), L 2 independent of s • Potential for avoiding cycle skipping

Wasserstein distance vs L 2 • Fidelity measure “Cycle skipping” Local minima L 2 2 Function of displacement

Wasserstein distance vs L 2 • Fidelity measure L 2 2 W 2 2

Wasserstein distance vs L 2 • Fidelity measure This is the basic motivation for exploring Wasserstein metric to measure the misfit Local min are well known problems L 2 2 W 2 2

Wasserstein distance vs L 2 • Fidelity measure We will see that there are hidden difficulties in making this work in practice Normalized signal in right frame L 2 2 W 2 2

Analysis • Theorem 1: W 2 2 is convex with respect to translation, s and dilation, a, 2 ( f , g )[ α , s ], f ( x ) = g ( ax − s ) α d , a > 0, x , s ∈ R d W 2 • Theorem 2: W 2 2 is convex with respect to local amplitude change , λ # g ( x ) λ , x ∈ Ω 1 % 2 ( f , g )[ β ], f ( x ) = W 2 β ∈ R , Ω = Ω 1 ∪Ω 2 $ β g ( x ) λ , x ∈ Ω 2 % & ( ) ∫ ∫ ∫ gdx / gdx gdx λ = + β Ω Ω 1 Ω 2 • ( L 2 only satisfies 2 nd theorem)

Remarks • The scalar dilation ax can be generalized to Ax where A is a positive definite matrix. Convexity is then in terms of the eigenvalues • The proof of theorem 1 is based on c-cyclic monotonicity { ( ) } ∈ Γ → ∑ ( ) ∑ ( ) x j , x j c x j , x j c x j , x σ ( j ) ≤ j j • The proof of theorem two is based on the inequality 2 ( sf 1 + (1 − s ) f 2 , g ) ≤ sW 2 2 ( f 1 , g ) + (1 − s ) W 2 2 ( f 2 , g ) W 2

Illustration: discrete proof (theorem 1) • Equal point masses then weak limit for generl theorem alternative to using the c-cyclic propery

Illustration: discrete proof 2 J 2 = min ∑ ( ) W 2 x ο j − ( x j − s ξ ) σ : permutation = σ j = 1 $ ' 2 J J 2 ( ) ∑ ∑ & ) min x ο j − x j − 2 s x ο j − x j ⋅ ξ + J s ξ ) = & σ % ( j = 1 j = 1 $ ' 2 J J J 2 ∑ ∑ ∑ & ) min x ο j − x j + J s ξ ) , from x ο j x j = & σ % ( j = 1 j = 1 j = 1 → x ο j = x j → σ j = j

Noise • W 2 2 less sensitive to noise than L 2 • Theorem 3: f = g + δ , δ uniformly distributed uncorrelated random noise, ( f > 0), discrete i.e. piecewise constant: N intervals 2 = O (1), 2 f − g ) = O ( N − 1 ) ( f − g L 2 W 2 ( ) f = f 1 , f 2 ,.., f J • Proof by “domain decomposition” dimension by dimension and standard deviation estimates using closed 1D formula

Computing the optimal transport • In 1D, optimal transport is equivalent to sorting with efficient numerical algorithms O( N log( N )) complexity, N data points ( ) dy ∫ F − 1 ( y ) − G − 1 ( y ) W 2 ( f , g ) = x x ∫ ∫ F ( x ) = f ( ξ ) d ξ , g ( x ) = g ( ξ ) d ξ • In higher dimensions such combinatorial methods as the Hungarian algorithm are very costly O( N 3 ), Alternatives: linear programming, sliced Wasserstein, ADMM

Computing of optimal transport • For higher dimensions fortunately the optimal transport related to W 2 can be solved via a Monge-Ampère equation [Brenier 1991, 1998] 1/2 ⎛ ⎞ 2 f ( x ) dx ∫ W 2 ( f , g ) = x − ∇ u ( x ) 2 ⎜ ⎟ ⎝ ⎠ X det D 2 ( u ) ( ) = f ( x ) / g ( ∇ u ( x )) Brenier map T ( x ) = ∇ u ( x ) • Recently there are now alternative PDF formulations

Optimal transport for Seismic Imaging Bjorn Engquist In - PowerPoint PPT Presentation

Optimal transport for Seismic Imaging Bjorn Engquist In collaboration with Brittany Froese and Yunan Yang ICERM Workshop - Recent Advances in Seismic Modeling and Inversion: From Analysis to Applications, Brown University, November 6-10, 2017

Seismic Design Seismic Design Seismic Design Seismic Design General Requirements General

Nuclear Imaging Medical Imaging Medical Imaging Nuclear Imaging Nuclear Imaging Nuclear

Seismic Re-Engineering of the VMT TOEM Committee RCAC Staff Seismic Hazards Safety Commission

PACIFIC ISLAND REGION SEISMIC NETWORK WITH EMPHASIS ON SAMOA SEISMIC MONITORING. FAATALI

Martingale Optimal Transport in Higher Hadrien De March Dimension Optimal transport

Seismic inverse scattering by reverse time migration Chris Stolk University of Amsterdam RICAM

A seismic reflection imaging workflow based on the Common-Reflection-Surface (CRS) stack:

Compressive seismic imaging with simultaneous acquisition Felix J. Herrmann*

SEISMIC CONTROL MEASURES By Richard Lvesque, ing., P.Eng., LEED AP Outline What and Why of

Wide-geometry 3D land seismic Wide-geometry 3D land seismic acquisition in limited channel-count

Seismic Response and Seismic Response and Capacity Evaluation of Exterior Capacity Evaluation of

Seismic Assessment of Buildings What Happens After the Council Risk Assessment? What are ISAs,

ET: Underground Seismic Array and Seismic Noise Gaetano De Luca ET WP1 meeting (Ver.

Uncertainty of anticipation of seismic intensities Uncertainty of anticipation of seismic

Introduction to Seismic Essentials in Groningen 7.3 Timber Structures By Prof Andr Jorissen

Introduction to Medical Imaging Dr Kevin Ho-Shon Head of Medical Imaging Macquarie Medical

Analytic Combinatorics A Calculus of Discrete Structures Philippe Flajolet INRIA

History and goals of NLU; course plan and goals Bill MacCartney and Christopher Potts CS 244U:

Tricks for Statistical Semantic Tricks for Statistical Semantic Knowledge Discovery: Knowledge

Cambridge Cybercrime Centre Richard Clayton Director San Diego 27 th February 2020 My

Classification of Rare Recipes Requires Linguistic Features as Special Ingredients Elham

!"#$%&'()$+,--,.($ !"#$%&'()&"+,-%.**&(

Reconstruction and pattern recognition via the Petitot model J.P. Gauthier, U. Boscain, Dario

s r Michael Charleston and Zoltn Szab

Sambuz

Useful Links

Newsletter

Mail Us

Optimal transport for Seismic Imaging Bjorn Engquist In - PowerPoint PPT Presentation

Optimal transport for Seismic Imaging Bjorn Engquist In collaboration with Brittany Froese and Yunan Yang ICERM Workshop - Recent Advances in Seismic Modeling and Inversion: From Analysis to Applications, Brown University, November 6-10, 2017

Seismic Design Seismic Design Seismic Design Seismic Design General Requirements General

Nuclear Imaging Medical Imaging Medical Imaging Nuclear Imaging Nuclear Imaging Nuclear

Seismic Re-Engineering of the VMT TOEM Committee RCAC Staff Seismic Hazards Safety Commission

PACIFIC ISLAND REGION SEISMIC NETWORK WITH EMPHASIS ON SAMOA SEISMIC MONITORING. FAATALI

Martingale Optimal Transport in Higher Hadrien De March Dimension Optimal transport

Seismic inverse scattering by reverse time migration Chris Stolk University of Amsterdam RICAM

A seismic reflection imaging workflow based on the Common-Reflection-Surface (CRS) stack:

Compressive seismic imaging with simultaneous acquisition Felix J. Herrmann*

SEISMIC CONTROL MEASURES By Richard Lvesque, ing., P.Eng., LEED AP Outline What and Why of

Wide-geometry 3D land seismic Wide-geometry 3D land seismic acquisition in limited channel-count

Seismic Response and Seismic Response and Capacity Evaluation of Exterior Capacity Evaluation of

Seismic Assessment of Buildings What Happens After the Council Risk Assessment? What are ISAs,

ET: Underground Seismic Array and Seismic Noise Gaetano De Luca ET WP1 meeting (Ver.

Uncertainty of anticipation of seismic intensities Uncertainty of anticipation of seismic

Introduction to Seismic Essentials in Groningen 7.3 Timber Structures By Prof Andr Jorissen

Introduction to Medical Imaging Dr Kevin Ho-Shon Head of Medical Imaging Macquarie Medical

Analytic Combinatorics A Calculus of Discrete Structures Philippe Flajolet INRIA

History and goals of NLU; course plan and goals Bill MacCartney and Christopher Potts CS 244U:

Tricks for Statistical Semantic Tricks for Statistical Semantic Knowledge Discovery: Knowledge

Cambridge Cybercrime Centre Richard Clayton Director San Diego 27 th February 2020 My

Classification of Rare Recipes Requires Linguistic Features as Special Ingredients Elham

!&quot;#$%&amp;'()*$+,--,.($ !&quot;#$%&amp;'()&amp;*&quot;+,-%.**&amp;(

Reconstruction and pattern recognition via the Petitot model J.P. Gauthier, U. Boscain, Dario

s r Michael Charleston and Zoltn Szab

Sambuz

Useful Links

Newsletter

Mail Us

!"#$%&'()$+,--,.($ !"#$%&'()&"+,-%.**&(