Inverse Modeling with the aid of Surrogate Models



slide-1
SLIDE 1

Inverse Modeling with the aid of Surrogate Models

Dongxiao Zhang, Qinzhuo Liao, Haibin Chang
College of Engineering, Peking (Beijing) University
dxz@pku.edu.cn
The 2017 EnKF Workshop

slide-2
SLIDE 2

Inverse modeling

  • Estimate parameters from physical models and observations
  • Input parameter: θ; model: f; output observation: s

$$ s = f(\theta) + e $$

Input

  • conductivity
  • porosity
  • boundary condition
  • source & sink

Output

  • hydraulic head
  • velocity/flux
  • phase saturation
  • solute concentration

Model

  • groundwater supply
  • contaminant control
  • oil and gas production
  • CO2 sequestration
slide-3
SLIDE 3

Stochastic approach

  • Bayesian inference

$$ s = f(\theta) + e, \qquad p(\theta \mid s) = \frac{p(\theta)\, p(s \mid \theta)}{p(s)} $$

where $p(\theta)$ is the prior density, $p(\theta \mid s)$ the posterior density, $p(s \mid \theta)$ the likelihood function, and $p(s)$ the normalization factor.

  • Markov chain Monte Carlo

− Use Monte Carlo simulations to construct a Markov chain
− Computationally expensive: repeated evaluations of the forward model

  • Surrogate model

− Can generate a large number of samples at low cost
− Posterior error depends on forward solution error
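As a minimal sketch of the MCMC bullet on this slide, the random-walk Metropolis sampler below draws from $p(\theta \mid s)$ for a scalar θ. The linear forward model, the N(0,1) prior, the noise level `sigma_e`, and the step size are illustrative assumptions, not values from the talk. In practice `forward` would be the expensive simulator, or a surrogate standing in for it.

```python
import numpy as np

def log_posterior(theta, s_obs, forward, sigma_e):
    """Unnormalized log posterior: N(0,1) prior times Gaussian likelihood (assumed)."""
    log_prior = -0.5 * theta**2
    residual = s_obs - forward(theta)
    log_like = -0.5 * (residual / sigma_e) ** 2
    return log_prior + log_like

def metropolis(s_obs, forward, sigma_e, n_steps=20000, step=0.5, seed=0):
    """Random-walk Metropolis chain; each step costs one forward evaluation."""
    rng = np.random.default_rng(seed)
    chain = np.empty(n_steps)
    theta = 0.0
    logp = log_posterior(theta, s_obs, forward, sigma_e)
    for i in range(n_steps):
        proposal = theta + step * rng.standard_normal()
        logp_new = log_posterior(proposal, s_obs, forward, sigma_e)
        # Accept with probability min(1, p_new / p_old)
        if np.log(rng.uniform()) < logp_new - logp:
            theta, logp = proposal, logp_new
        chain[i] = theta
    return chain

# Toy forward model (assumption): linear, so the posterior mean is known to be 8/17
forward = lambda theta: 2.0 * theta
chain = metropolis(s_obs=1.0, forward=forward, sigma_e=0.5)
posterior_mean = chain[2000:].mean()   # discard burn-in
```

Every step of the chain calls `forward` once, which is why replacing it with a cheap surrogate pays off when the simulator is slow.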

slide-4
SLIDE 4

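Forward Stochastic Formulation

  • SPDE:

$$ L(u; \mathbf{x}, \boldsymbol{\xi}) = g(\mathbf{x}, \boldsymbol{\xi}), \quad \boldsymbol{\xi} \in P,\ \mathbf{x} \in D, $$

where $\boldsymbol{\xi} = (\xi_1, \xi_2, \ldots, \xi_N)^T$, which has a finite (random) dimensionality.

  • Weak form solution:

$$ \int_P L(\hat{u}; \mathbf{x}, \boldsymbol{\xi})\, w(\boldsymbol{\xi})\, p(\boldsymbol{\xi})\, d\boldsymbol{\xi} = \int_P g(\mathbf{x}, \boldsymbol{\xi})\, w(\boldsymbol{\xi})\, p(\boldsymbol{\xi})\, d\boldsymbol{\xi}, $$

where

− $\hat{u}(\mathbf{x}, \boldsymbol{\xi}) \in V$, where $V$ is the trial function space
− $w(\boldsymbol{\xi}) \in W$, where $W$ is the test (weighting) function space
− $p(\boldsymbol{\xi})$ is the probability density function of $\boldsymbol{\xi}$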

slide-5
SLIDE 5

Forward Stochastic Methods

  • Galerkin polynomial chaos expansion (PCE) [e.g., Ghanem and Spanos, 1991]:

$$ V = \mathrm{span}\{\Psi_i(\boldsymbol{\xi})\}_{i=1}^{M}, \quad W = \mathrm{span}\{\Psi_i(\boldsymbol{\xi})\}_{i=1}^{M} $$

  • Probabilistic collocation method (PCM) [Tatang et al., 1997; Sarma et al., 2005; Li and Zhang, 2007, 2009]:

$$ V = \mathrm{span}\{\Psi_i(\boldsymbol{\xi})\}_{i=1}^{M}, \quad W = \mathrm{span}\{\delta(\boldsymbol{\xi} - \boldsymbol{\xi}_i)\}_{i=1}^{M} $$

  • Stochastic collocation method (SCM) [Mathelin et al., 2005; Xiu and Hesthaven, 2005; Chang and Zhang, 2009]:

$$ V = \mathrm{span}\{L_i(\boldsymbol{\xi})\}_{i=1}^{M}, \quad W = \mathrm{span}\{\delta(\boldsymbol{\xi} - \boldsymbol{\xi}_i)\}_{i=1}^{M} $$

where $\{\Psi_i(\boldsymbol{\xi})\}_{i=1}^{M}$ are orthogonal polynomials and $\{L_i(\boldsymbol{\xi})\}_{i=1}^{M}$ is the Lagrange interpolation basis.

slide-6
SLIDE 6

Key Components for Stochastic Methods

  • Random dimensionality of underlying stochastic fields

− How to effectively approximate the input random fields with finite dimensions
− Karhunen-Loeve and other expansions may be used

  • Trial function space

− How to approximate the dependent random fields
− Perturbation series, polynomial chaos expansion, or Lagrange interpolation basis

  • Test (weighting) function space

− How to evaluate the integration in random space?
− Intrusive or non-intrusive schemes?
slide-7
SLIDE 7

Stochastic collocation method (SCM)

  • Based on polynomial interpolations in random space
  • Collocation points: Smolyak sparse grid algorithm
  • Converge fast in case of smooth functions

$$ \hat{f}(\boldsymbol{\theta}) = \sum_{i=1}^{M} f(\boldsymbol{\theta}_i)\, L_i(\boldsymbol{\theta}), \quad L_i(\boldsymbol{\theta}_j) = \delta_{ij}, \quad 1 \le i, j \le M, $$

where $\{\boldsymbol{\theta}_i\}_{i=1}^{M}$ is a set of nodes in $N$-dimensional random space.

$$ A(q, N) = \sum_{q-N+1 \le |\mathbf{i}| \le q} (-1)^{q-|\mathbf{i}|} \binom{N-1}{q-|\mathbf{i}|} \left( U^{i_1} \otimes \cdots \otimes U^{i_N} \right), $$

where $U^{i}$ is univariate interpolation.

Xiu and Hesthaven (2005); Chang & Zhang (2009); Lin & Tartakovsky (2009)
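The interpolation idea behind SCM can be illustrated in one random dimension. The sketch below builds a Lagrange surrogate from M forward runs at collocation nodes and then evaluates it anywhere at negligible cost. The smooth toy response `exp(0.3 * theta)` and the choice of five Gauss-Hermite nodes are assumptions made for illustration only.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def lagrange_basis(nodes, i, theta):
    """L_i(theta): equals 1 at nodes[i] and 0 at every other node."""
    theta = np.asarray(theta, dtype=float)
    value = np.ones_like(theta)
    for j, node in enumerate(nodes):
        if j != i:
            value = value * (theta - node) / (nodes[i] - node)
    return value

def build_surrogate(forward, nodes):
    """Run the forward model once per node and return the interpolant."""
    samples = np.array([forward(t) for t in nodes])   # M forward runs
    def surrogate(theta):
        return sum(samples[i] * lagrange_basis(nodes, i, theta)
                   for i in range(len(nodes)))
    return surrogate

forward = lambda theta: np.exp(0.3 * theta)   # smooth toy response (assumption)
nodes, _ = hermegauss(5)                      # nodes suited to theta ~ N(0,1)
surrogate = build_surrogate(forward, nodes)

theta_test = np.linspace(-2.0, 2.0, 41)
max_error = np.max(np.abs(surrogate(theta_test) - forward(theta_test)))
```

For a smooth response such as this one the interpolation error is small with only five runs, which is the fast convergence the slide refers to.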

slide-8
SLIDE 8

Choices of Collocation Points

  • Tensor product of one-dimensional nodal sets

− Each dimension: $m$ knots; $N$ dimensions: $M = m^N$

  • Smolyak sparse grid (level: k=q-N)

− For N>1, preserving interpolation property of N=1 with a small number of knots

  • Tensor product vs. level-2 sparse grid

− N=2, 49 knots vs. 17 (shown right)
− N=6, 117,649 knots vs. 97

[Figure: two-dimensional collocation grids; both axes run from −3 to 3]
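The knot counts quoted on this slide can be checked with a short script. The tensor counts follow from $M = m^N$ with m = 7 knots per dimension (inferred from 49 = 7²). For the sparse grid, the sketch assumes one-dimensional Gauss-Hermite rules with 1, 3, and 5 nodes at levels 1, 2, and 3; this choice is an assumption that happens to reproduce the quoted 17 and 97, and the talk may have used a different rule.

```python
import itertools
from numpy.polynomial.hermite_e import hermegauss

def tensor_count(m, n_dim):
    """Number of knots in a full tensor grid with m knots per dimension."""
    return m ** n_dim

def nodes_1d(level):
    """Assumed 1D rules: level 1, 2, 3 -> 1, 3, 5 Gauss-Hermite nodes."""
    if level == 1:
        return [0.0]
    x, _ = hermegauss(2 * level - 1)
    # Round so that shared points (the center) compare equal across rules
    return [round(float(v), 10) + 0.0 for v in x]

def smolyak_points(n_dim, k):
    """Union of the tensor grids with |i| <= n_dim + k (level-k sparse grid)."""
    points = set()
    for idx in itertools.product(range(1, k + 2), repeat=n_dim):
        if sum(idx) <= n_dim + k:
            for p in itertools.product(*(nodes_1d(i) for i in idx)):
                points.add(p)
    return points

counts = {n: (tensor_count(7, n), len(smolyak_points(n, 2))) for n in (2, 6)}
```

The sparse-grid count grows roughly quadratically with N at fixed level, while the tensor count grows exponentially.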
slide-9
SLIDE 9

MCS vs. PCM/SCM

[Figure: head h versus x, two panels. 2nd order PCM: 28 representations. MC: 1000 realizations. Both panels are labeled σ_Y² = 1.0 and a second parameter (symbol lost) = 4.0.]

PCM/SCM:

  • Structured sampling (collocation points)
  • Non-equal weights for h_j (representations)

MCS:

  • Random sampling (realizations)
  • Equal weights for h_j (realizations)

slide-10
SLIDE 10

Stochastic collocation method

  • Inaccurate results

− Non-physical realizations/Gibbs oscillation
− Inaccurate statistical moments and probability density functions

Zhang et al. (2010); Lin & Tartakovsky (2009)

slide-11
SLIDE 11

Stochastic collocation method

  • Inaccurate results

− When: advection dominated (Pe = 100) → low regularity
− Why: physical space → random space

  • Illustration

− Unit mass instantaneously released at x = 0, t = 0
− Input parameter: conductivity k = exp(0.3θ), θ ~ N(0,1)
− Output response: concentration c at x = 0.3, t = 1

slide-12
SLIDE 12

Transformed stochastic collocation method

  • Stochastic collocation method (SCM)

− Approximate s as a function of θ at fixed x and t: $s(\mathbf{x}, t; \theta)$

  • Transformed stochastic collocation method (TSCM)

− Approximate x as a function of θ for a given s at fixed t: $\mathbf{x}(s, t; \theta)$
− Approximate t as a function of θ for a given s at fixed x: $t(\mathbf{x}, s; \theta)$

Liao & Zhang (WRR, 2016)
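The benefit of the transformation can be seen on a toy sharp front. In the sketch below, the response s is a steep tanh profile whose location moves with θ, so s(θ) at a fixed x is nearly a step and polynomial interpolation in θ performs poorly, while the front location x(θ) at a fixed s is smooth. The tanh profile, the front width, the equispaced nodes, and the test point are all assumptions chosen for illustration; this is not the solver or the test case from the talk.

```python
import numpy as np

def lagrange_weights(nodes, theta):
    """Values L_i(theta) of the Lagrange basis at a scalar theta."""
    w = np.ones(len(nodes))
    for i in range(len(nodes)):
        for j in range(len(nodes)):
            if j != i:
                w[i] *= (theta - nodes[j]) / (nodes[i] - nodes[j])
    return w

def front(x, theta, t=1.0, width=0.05):
    """Toy sharp front (assumption): s falls from 1 to 0 near x = exp(0.3*theta)*t."""
    return 0.5 * (1.0 - np.tanh((x - np.exp(0.3 * theta) * t) / width))

nodes = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])   # collocation points in theta
x_grid = np.linspace(-1.0, 4.0, 5001)           # physical grid of each forward run
levels = np.linspace(0.001, 0.999, 500)         # s values at which x is tabulated
theta_new, x_obs = 0.5, 1.1

w = lagrange_weights(nodes, theta_new)

# SCM: interpolate s(theta) directly at the fixed location x_obs
s_scm = w @ np.array([front(x_obs, th) for th in nodes])

# TSCM: for each s level, find x at every node, interpolate x(theta),
# then invert the resulting profile to recover s at x_obs
x_of_s = np.array([np.interp(levels, front(x_grid, th)[::-1], x_grid[::-1])
                   for th in nodes])
x_new = w @ x_of_s                               # x(s; theta_new) per level
s_tscm = np.interp(x_obs, x_new[::-1], levels[::-1])

s_true = front(x_obs, theta_new)
err_scm, err_tscm = abs(s_scm - s_true), abs(s_tscm - s_true)
```

Both surrogates use the same five forward runs; only the quantity being interpolated differs.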

slide-13
SLIDE 13

1D example

  • Continuous injection

− Input parameter: conductivity k = exp(0.3θ), θ ~ N(0,1)
− Output response: concentration c at x = 0.3, t = 1

slide-14
SLIDE 14

1D example

  • Forward solution approximation and posterior approximation

− True observation: c = 0.842
− Observation error: e ~ N(0, 0.01)
− True parameter: θ = 0.2

slide-15
SLIDE 15

1D example

  • Convergence rate

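Surrogate model:

$$ s_M(\theta) = \sum_{i=1}^{M} f(\theta_i)\, L_i(\theta) $$

Error:

$$ \| s - s_M \|_{L^2} = \left( \int \left( s(\theta) - s_M(\theta) \right)^2 p(\theta)\, d\theta \right)^{1/2} \to 0, \quad (M \to \infty) $$

Posterior: $p_M(\theta \mid s)$

Kullback-Leibler divergence:

$$ D(p_M \,\|\, p) = \int p_M(\theta \mid s) \log \frac{p_M(\theta \mid s)}{p(\theta \mid s)}\, d\theta \to 0, \quad (M \to \infty) $$

Marzouk & Xiu (2009)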

slide-16
SLIDE 16

2D example

  • Assume conductivity field is known
  • Top and bottom are no-flow boundaries, right head h2=0
  • One instantaneous release location (circle), four observation wells (triangles)
  • Input parameters: release time t0∈[0,20], mass m0∈[1,2], left head h1∈[3,10]
  • Output responses: concentration c from t = 0 to 80, observation error e ~ N(0, 0.001)

slide-17
SLIDE 17

2D example

  • Compare true concentration and approximated concentration


slide-18
SLIDE 18

2D example

  • Surrogate approximation error
  • Adaptive transformed SCM (ATSCM)

− Dimension-adaptive: automatically select important dimensions
− Further reduce the number of collocation points

Klimke (2006); Liao et al. (JCP, 2016)

slide-19
SLIDE 19

2D example

  • Marginal PDF

− MCMC with 10^5 model runs as a reference
− ATSCM with 67 model runs is more accurate than SCM with 6017 model runs

slide-20
SLIDE 20

2D example

  • Marginal PDF

− Black: MCMC, red: SCM, blue: ATSCM


slide-21
SLIDE 21

Inverse modeling

  • Maximizing the posterior PDF
  • For Gaussian prior and error, minimizing an objective function
  • Ensemble based methods

EnKF

  • Sequentially assimilate the data
  • One step method
  • Moderate simulation effort (restart required)

Iterative ES

  • Simultaneously assimilate all the data
  • Multi-step method
  • Large simulation effort (no restart, iteration)
  • Suitable for highly non-linear problems

ES

  • Simultaneously assimilate all the data
  • One step method
  • Small simulation effort (no restart)
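For orientation, a one-step ensemble smoother update can be sketched as follows. It uses the textbook form $\mathbf{m}_j^{u} = \mathbf{m}_j + C_{MD}(C_{DD} + C_D)^{-1}(\mathbf{d}^{obs} + \boldsymbol{\varepsilon}_j - g(\mathbf{m}_j))$ with ensemble-estimated covariances, which is a standard formulation and not copied from this slide. The scalar linear model, the ensemble size, and all numerical values are illustrative assumptions.

```python
import numpy as np

def es_update(m_prior, d_sim, d_obs, c_d, rng):
    """One-step ES. m_prior: (Nm, Ne), d_sim: (Nd, Ne), d_obs: (Nd,), c_d: (Nd, Nd)."""
    n_e = m_prior.shape[1]
    dm = m_prior - m_prior.mean(axis=1, keepdims=True)
    dd = d_sim - d_sim.mean(axis=1, keepdims=True)
    c_md = dm @ dd.T / (n_e - 1)          # parameter-data cross-covariance
    c_dd = dd @ dd.T / (n_e - 1)          # covariance of simulated data
    # Perturb the observations once per realization
    noise = rng.multivariate_normal(np.zeros(len(d_obs)), c_d, size=n_e).T
    innovation = d_obs[:, None] + noise - d_sim
    return m_prior + c_md @ np.linalg.solve(c_dd + c_d, innovation)

rng = np.random.default_rng(0)
n_e = 2000
m_prior = rng.standard_normal((1, n_e))   # prior ensemble, m ~ N(0, 1)
g = lambda m: 2.0 * m                     # toy linear forward model (assumption)
c_d = np.array([[0.25]])                  # observation error variance
m_post = es_update(m_prior, g(m_prior), np.array([1.0]), c_d, rng)
post_mean, post_var = m_post.mean(), m_post.var()
```

All data are assimilated in a single update and the forward model is run once per realization, which matches the "one step, no restart" entries in the ES column. For this linear Gaussian toy problem the exact posterior has mean 8/17 and variance 1/17.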

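Posterior PDF:

$$ p(\mathbf{m} \mid \mathbf{d}^{obs}) = \frac{p(\mathbf{m})\, p(\mathbf{d}^{obs} \mid \mathbf{m})}{p(\mathbf{d}^{obs})} $$

Objective function:

$$ J(\mathbf{m}) = \frac{1}{2} \left( g(\mathbf{m}) - \mathbf{d}^{obs} \right)^T C_D^{-1} \left( g(\mathbf{m}) - \mathbf{d}^{obs} \right) + \frac{1}{2} \left( \mathbf{m} - \mathbf{m}^{pr} \right)^T C_M^{-1} \left( \mathbf{m} - \mathbf{m}^{pr} \right). $$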

slide-22
SLIDE 22

Iterative ensemble smoother

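  • Gauss-Newton method

$$ \mathbf{m}_{l+1} = \mathbf{m}_l - \left[ (1+\lambda_l) C_M^{-1} + G_l^T C_D^{-1} G_l \right]^{-1} \left[ C_M^{-1} (\mathbf{m}_l - \mathbf{m}^{pr}) + G_l^T C_D^{-1} \left( g(\mathbf{m}_l) - \mathbf{d}^{obs} \right) \right] $$

Size of the matrix to invert: Nm x Nm

  • Equivalent form

$$ \mathbf{m}_{l+1} = \mathbf{m}_l - \frac{1}{1+\lambda_l} \left[ C_M - C_M G_l^T \left( (1+\lambda_l) C_D + G_l C_M G_l^T \right)^{-1} G_l C_M \right] C_M^{-1} (\mathbf{m}_l - \mathbf{m}^{pr}) - C_M G_l^T \left( (1+\lambda_l) C_D + G_l C_M G_l^T \right)^{-1} \left( g(\mathbf{m}_l) - \mathbf{d}^{obs} \right). $$

Size of the matrix to invert: Nd x Nd

  • Ensemble based implementation

$$ \mathbf{m}_{l+1,j} = \mathbf{m}_{l,j} - \frac{1}{1+\lambda_l} \left[ C_M - C_M G_{l,j}^T \left( (1+\lambda_l) C_D + G_{l,j} C_M G_{l,j}^T \right)^{-1} G_{l,j} C_M \right] C_M^{-1} (\mathbf{m}_{l,j} - \mathbf{m}_j^{pr}) - C_M G_{l,j}^T \left( (1+\lambda_l) C_D + G_{l,j} C_M G_{l,j}^T \right)^{-1} \left( g(\mathbf{m}_{l,j}) - \mathbf{d}_j^{obs} \right), \quad j = 1, \ldots, N_e, $$

Can be further approximated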
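The step from the Nm x Nm form to the Nd x Nd form is an application of the Woodbury matrix identity. The short derivation below reads the slide's update as the Levenberg-Marquardt form with damping $\lambda_l$ (an interpretation of the garbled source), and introduces the shorthand $H_l$ and $A_l$, which are not on the slide.

$$ H_l = (1+\lambda_l) C_M^{-1} + G_l^T C_D^{-1} G_l, \qquad A_l = (1+\lambda_l) C_D + G_l C_M G_l^T $$

$$ H_l^{-1} = \frac{1}{1+\lambda_l} \left[ C_M - C_M G_l^T A_l^{-1} G_l C_M \right], \qquad H_l^{-1} G_l^T C_D^{-1} = C_M G_l^T A_l^{-1} $$

$$ \mathbf{m}_{l+1} = \mathbf{m}_l - H_l^{-1} C_M^{-1} (\mathbf{m}_l - \mathbf{m}^{pr}) - H_l^{-1} G_l^T C_D^{-1} \left( g(\mathbf{m}_l) - \mathbf{d}^{obs} \right) $$

Substituting the two identities in the second line into the third gives an update in which only $A_l$, an Nd x Nd matrix, has to be inverted. This is cheaper whenever the number of data is smaller than the number of parameters.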

slide-23
SLIDE 23

Iterative ensemble smoother

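  • Further modification

$$ \mathbf{m}_{l+1,j} = \mathbf{m}_{l,j} - \frac{1}{1+\lambda_l} \left[ C_{M_l} - C_{M_l} \bar{G}_l^T \left( (1+\lambda_l) C_D + \bar{G}_l C_{M_l} \bar{G}_l^T \right)^{-1} \bar{G}_l C_{M_l} \right] C_M^{-1} (\mathbf{m}_{l,j} - \mathbf{m}_j^{pr}) - C_{M_l} \bar{G}_l^T \left( (1+\lambda_l) C_D + \bar{G}_l C_{M_l} \bar{G}_l^T \right)^{-1} \left( g(\mathbf{m}_{l,j}) - \mathbf{d}_j^{obs} \right), \quad j = 1, \ldots, N_e, $$

− $C_{M_l}$: change with iteration
− $\bar{G}_l$: mean sensitivity
− $C_{M_l} \bar{G}_l^T$ can be approximated by $C_{M_l D_l}$
− $\bar{G}_l C_{M_l} \bar{G}_l^T$ can be approximated by $C_{D_l D_l}$

  • One has (Chen & Oliver, 2013)

$$ \mathbf{m}_{l+1,j} = \mathbf{m}_{l,j} - \frac{1}{1+\lambda_l} \left[ C_{M_l} - C_{M_l D_l} \left( (1+\lambda_l) C_D + C_{D_l D_l} \right)^{-1} C_{D_l M_l} \right] C_M^{-1} (\mathbf{m}_{l,j} - \mathbf{m}_j^{pr}) - C_{M_l D_l} \left( (1+\lambda_l) C_D + C_{D_l D_l} \right)^{-1} \left( g(\mathbf{m}_{l,j}) - \mathbf{d}_j^{obs} \right), \quad j = 1, \ldots, N_e. $$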
slide-24
SLIDE 24

Surrogate model based iterative ES

  • Independent model parameters

$$ \mathbf{m} := \boldsymbol{\xi} = \{\xi_1, \xi_2, \ldots, \xi_N\}^T $$

  • Random fields can be represented via Karhunen-Loeve Expansion (KLE):

$$ Y(\mathbf{x}, \boldsymbol{\xi}) = \sum_{n=1}^{N} \sqrt{\lambda_n}\, f_n(\mathbf{x})\, \xi_n $$

[Figure: eigenvalues versus n (n up to 40); eigenfunctions over (x1, x2) for (a) n=1, (b) n=4, (c) n=10, (d) n=20]
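A discrete version of the expansion can be sketched by taking the leading eigenpairs of a covariance matrix on a grid. The one-dimensional grid, the exponential kernel, its variance and correlation length, and the number of retained terms are assumptions for illustration; quadrature weights are also ignored, so this is a plain eigen-decomposition of the sampled covariance and not the continuous KLE of the talk.

```python
import numpy as np

def kle_basis(x, variance, corr_len, n_terms):
    """Leading eigenpairs of a discretized exponential covariance (assumed kernel)."""
    c = variance * np.exp(-np.abs(x[:, None] - x[None, :]) / corr_len)
    eigval, eigvec = np.linalg.eigh(c)             # returned in ascending order
    order = np.argsort(eigval)[::-1][:n_terms]     # keep the n_terms largest
    return eigval[order], eigvec[:, order]

x = np.linspace(0.0, 10.0, 101)
lam, f = kle_basis(x, variance=1.0, corr_len=4.0, n_terms=10)

# One realization of the fluctuation field from independent N(0,1) variables xi
rng = np.random.default_rng(0)
xi = rng.standard_normal(10)
y = f @ (np.sqrt(lam) * xi)

# Fraction of the total variance (trace of the covariance) kept by 10 terms
energy = lam.sum() / len(x)
```

The retained fraction `energy` indicates how many terms are needed; the independent variables `xi` then play the role of the model parameters to be updated.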

slide-25
SLIDE 25

Surrogate model based iterative ES

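  • Independent model parameters

$$ \mathbf{m} := \boldsymbol{\xi} = \{\xi_1, \xi_2, \ldots, \xi_N\}^T $$

  • Independent parameter based iterative ES

$$ \boldsymbol{\xi}_{l+1,j} = \boldsymbol{\xi}_{l,j} - \frac{1}{1+\lambda_l} \left[ C_{\xi_l} - C_{\xi_l D_l} \left( (1+\lambda_l) C_D + C_{D_l D_l} \right)^{-1} C_{D_l \xi_l} \right] (\boldsymbol{\xi}_{l,j} - \boldsymbol{\xi}_j^{pr}) - C_{\xi_l D_l} \left( (1+\lambda_l) C_D + C_{D_l D_l} \right)^{-1} \left( g(\boldsymbol{\xi}_{l,j}) - \mathbf{d}_j^{obs} \right), \quad j = 1, \ldots, N_e. $$

  • Surrogate model based iterative ES (Chang, Liao & Zhang, AWR, 2016)

$$ \boldsymbol{\xi}_{l+1,j} = \boldsymbol{\xi}_{l,j} - \frac{1}{1+\lambda_l} \left[ C_{\xi_l} - C^{surr}_{\xi_l D_l} \left( (1+\lambda_l) C_D + C^{surr}_{D_l D_l} \right)^{-1} C^{surr}_{D_l \xi_l} \right] (\boldsymbol{\xi}_{l,j} - \boldsymbol{\xi}_j^{pr}) - C^{surr}_{\xi_l D_l} \left( (1+\lambda_l) C_D + C^{surr}_{D_l D_l} \right)^{-1} \left( \mathbf{d}^{surr}_{l,j} - \mathbf{d}_j^{obs} \right), \quad j = 1, \ldots, N_e. $$

The terms marked "surr" can be obtained from the surrogate.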
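The independent-parameter update can be sketched in a few lines. The code reads the slide's formula as $\boldsymbol{\xi}_{l+1,j} = \boldsymbol{\xi}_{l,j} - \frac{1}{1+\lambda_l}[C_{\xi_l} - C_{\xi_l D_l} A_l^{-1} C_{D_l \xi_l}](\boldsymbol{\xi}_{l,j} - \boldsymbol{\xi}_j^{pr}) - C_{\xi_l D_l} A_l^{-1}(g(\boldsymbol{\xi}_{l,j}) - \mathbf{d}_j^{obs})$ with $A_l = (1+\lambda_l) C_D + C_{D_l D_l}$, which is an interpretation of the garbled source. The fixed damping `lam`, the iteration count, and the linear toy model are assumptions; passing a surrogate as `forward` gives the surrogate-based variant.

```python
import numpy as np

def cov(a, b):
    """Sample cross-covariance of two ensembles with shape (n, Ne)."""
    da = a - a.mean(axis=1, keepdims=True)
    db = b - b.mean(axis=1, keepdims=True)
    return da @ db.T / (a.shape[1] - 1)

def iterative_es(xi_pr, forward, d_obs_pert, c_d, lam=1.0, n_iter=10):
    """Iterative ES for independent N(0, I) parameters; lam is kept fixed (assumption)."""
    xi = xi_pr.copy()
    for _ in range(n_iter):
        d = forward(xi)                    # (Nd, Ne); simulator or surrogate
        c_xx, c_xd, c_dd = cov(xi, xi), cov(xi, d), cov(d, d)
        a = (1.0 + lam) * c_d + c_dd       # only an Nd x Nd system is solved
        prior_term = (c_xx - c_xd @ np.linalg.solve(a, c_xd.T)) @ (xi - xi_pr)
        data_term = c_xd @ np.linalg.solve(a, d - d_obs_pert)
        xi = xi - prior_term / (1.0 + lam) - data_term
    return xi

rng = np.random.default_rng(0)
n_e = 2000
xi_pr = rng.standard_normal((1, n_e))                      # prior ensemble
forward = lambda xi: 2.0 * xi                              # toy linear model (assumption)
c_d = np.array([[0.25]])
d_obs_pert = 1.0 + 0.5 * rng.standard_normal((1, n_e))     # perturbed observations
xi_post = iterative_es(xi_pr, forward, d_obs_pert, c_d)
post_mean, post_var = xi_post.mean(), xi_post.var()
```

For this linear Gaussian toy problem the iterations should settle near the exact posterior (mean 8/17, variance 1/17). With a surrogate, each iteration costs only surrogate evaluations, so the ensemble can be large.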
slide-26
SLIDE 26

Case study: single-phase flow

  • Model size: 800 m x 800 m x 1 m
  • Grid: 40 x 40 x 1
  • Boundary: constant head (left and right); no flow (two lateral boundaries)
  • Well: pumping well at block (12, 12); injecting well at block (28, 28)
  • Observation: hydraulic heads
  • Observation duration: start from day 0.2, at every 0.6 day, up to 5 days

$$ S_s \frac{\partial h(\mathbf{x}, t)}{\partial t} - \nabla \cdot \left[ K(\mathbf{x}) \nabla h(\mathbf{x}, t) \right] = q(\mathbf{x}, t) $$

[Figure: model domain; axes X (m) and Y (m)]

slide-27
SLIDE 27

Uncertain random field

  • log-transformed conductivity is Gaussian random field

$$ \langle \ln K(\mathbf{x}) \rangle = \ldots\ \ln(\text{m/day}), \qquad \sigma^2_{\ln K} = 0.75, $$

$$ C_{\ln K}(\mathbf{x}_1, \mathbf{x}_2) = \sigma^2_{\ln K} \exp\left[ -\left( \frac{x_1 - x_2}{\eta_x} \right)^2 - \left( \frac{y_1 - y_2}{\eta_y} \right)^2 \right], \quad \eta_x / L_x = 0.4, \quad \eta_y / L_y = 0.2, $$

[Figure: ln K field; axes X and Y, color scale from −2.0 to 2.0]

  • Retained terms in KLE: 11
  • Define errors:

Surrogate error:

$$ e_{surr} = \frac{1}{N N_d} \sum_{j=1}^{N} \sum_{i=1}^{N_d} \frac{\left| d_j^{i,sim} - d_j^{i,surr} \right|}{\left| d_j^{i,sim} \right|}, $$

Data match error:

$$ e_d = \frac{1}{N_e N_d} \sum_{j=1}^{N_e} \sum_{i=1}^{N_d} \frac{\left| d_j^{i,obs} - d_j^{i,update} \right|}{\left| d_j^{i,obs} \right|}, $$

Parameter estimation error:

$$ e_m = \sqrt{ \frac{1}{N_e N_m} \sum_{j=1}^{N_e} \sum_{i=1}^{N_m} \left( m^{i,ref} - m_j^{i,update} \right)^2 }. $$

slide-28
SLIDE 28

Results

  • Updated mean fields:

[Figure: updated mean fields from the order 2-PCM and level 2-Smolyak surrogates; axes X (m) and Y (m), color scale from −2.0 to 2.0]

slide-29
SLIDE 29

Traditional iterative ES

  • Updated mean fields:

[Figure: updated mean fields for Ne=20, Ne=60, Ne=100, and Ne=1000; axes X (m) and Y (m), color scale from −2.0 to 2.0]

slide-30
SLIDE 30

Standard deviation comparison

  • Level 2-Smolyak based iterative ES:
  • Traditional iterative ES with Ne being 1000:

[Figure: standard deviation fields at iteration 1 and iteration 2 for each method; axes X (m) and Y (m), color scales from 0.15 to 0.70 and from 0.09 to 0.70]

slide-31
SLIDE 31

Case study: Multi-phase flow

Water-oil two-phase system:

  • Model size: 410 m x 410 m x 1 m
  • Grid: 41 x 41 x 1
  • Boundary: no flow
  • Well: injector at the center; producers at the corners
  • Observation: WCT and OPR
  • Observation duration: every 30 days, up to 510 days

$$ \nabla \cdot \left[ \frac{k(\mathbf{x})\, k_{ri}}{\mu_i} \left( \nabla p_i - \rho_i g \nabla z \right) \right] + q_i = \phi(\mathbf{x}) \frac{\partial S_i}{\partial t}, \quad i = w, o $$

[Figure: model domain; axes X (m) and Y (m)]

slide-32
SLIDE 32

Uncertain random field

  • log-transformed permeability is Gaussian random field

$$ \langle \ln k(\mathbf{x}) \rangle = 4\ \ln(\text{mD}), \qquad \sigma^2_{\ln k} = 0.16, $$

$$ C_{\ln k}(\mathbf{x}_1, \mathbf{x}_2) = \sigma^2_{\ln k} \exp\left[ -\frac{|x_1 - x_2|}{\eta_x} - \frac{|y_1 - y_2|}{\eta_y} \right], \quad \eta_x / L_x = 0.4, \quad \eta_y / L_y = 0.4. $$

  • Retained terms in KLE: 55
  • Surrogate model: tTPCM (Liao & Zhang, 2016)
  • Simulation number: 111

[Figure: ln k field; axes X (m) and Y (m), color scale from 3.0 to 5.2]

slide-33
SLIDE 33

Results

  • tTPCM based iterative ES ($e_m = 0.27$)
  • Traditional iterative ES (each run takes 10 iterations): Ne=20 ($e_m = 0.445$), Ne=100 ($e_m = 0.268$)

[Figure: panels labeled Mean, standard deviation, Ne=20, and Ne=100; axes X (m) and Y (m), color scales from 3.0 to 5.2 and from 0.12 to 0.40]

slide-34
SLIDE 34

Data match of water cut from tTPCM

[Figure: water cut (WCT) of Producers 1 to 4 versus observation step (steps 2 to 16, WCT from 0.0 to 1.0), shown for the initial ensemble and the updated ensemble]

slide-35
SLIDE 35

Conclusions

  • Inverse problems may be solved efficiently with the aid of surrogate models

− Use a surrogate model to approximate the forward solution
− Apply a transformation to address the low regularity
− Select the important dimensions adaptively to reduce the number of collocation points

  • Performance

− Fast convergence of the surrogate solution to the exact forward solution
− Fast convergence of the surrogate posterior to the true posterior

slide-36
SLIDE 36

Thanks ! Q&A


slide-37
SLIDE 37

References

  • Chang, H., & Zhang, D. (2009). A comparative study of stochastic collocation methods for flow in spatially correlated random fields. Communications in Computational Physics, 6(3), 509-535.
  • Chang, H., Liao, Q., & Zhang, D. (2016). Surrogate model based iterative ensemble smoother for subsurface flow data assimilation. Advances in Water Resources. doi:10.1016/j.advwatres.2016.12.001
  • Liao, Q., & Zhang, D. (2016). Probabilistic collocation method for strongly nonlinear problems: 3. Transform by time. Water Resources Research, 52(3), 7911–7928.
  • Liao, Q., Zhang, D., & Tchelepi, H. (2016). A two-stage adaptive stochastic collocation method on nested sparse grids for multiphase flow in randomly heterogeneous porous media. Journal of Computational Physics. In press.
  • Lin, G., & Tartakovsky, A. M. (2009). An efficient, high-order probabilistic collocation method on sparse grids for three-dimensional flow and solute transport in randomly heterogeneous porous media. Adv. Water Resour., 32(5), 712–722.
  • Marzouk, Y., & Xiu, D. (2009). A stochastic collocation approach to Bayesian inference in inverse problems. Commun. Comput. Phys., 6(4), 826–847.
  • Klimke, W. A. (2006). Uncertainty modeling using fuzzy arithmetic and sparse grids. Citeseer.
  • Xiu, D., & Hesthaven, J. S. (2005). High-order collocation methods for differential equations with random inputs. SIAM J. Sci. Comput., 27(3), 1118-1139.
  • Zhang, D., Shi, L., Chang, H., & Yang, J. (2010). A comparative study of numerical approaches to risk assessment. Stoch. Environ. Res. Risk. Assess., 24, 971–984.

slide-38
SLIDE 38

Backup: transformed EnKF (TEnKF)

  • 2D synthetic case: 20 random variables after Karhunen-Loeve expansion

Liao, Q., & Zhang, D. (2015). Data assimilation for strongly nonlinear problems by transformed ensemble Kalman filter. SPE Journal, 20(1), 202-221.