Adaptive and stochastic algorithms for piecewise constant EIT and DC resistivity problems with many measurements (PowerPoint presentation transcript)

SLIDE 1

Adaptive and stochastic algorithms for piecewise constant EIT and DC resistivity problems with many measurements

Uri Ascher, Department of Computer Science, University of British Columbia. October 2011.

Joint work with Kees van den Doel (UBC).

SLIDE 2

From the horse's mouth (Wikipedia)

Electrical impedance tomography (EIT) is a medical imaging technique in which an image of the conductivity or permittivity of part of the body is inferred from surface electrical measurements. Typically, conducting electrodes are attached to the skin of the subject and small alternating currents are applied to some or all of the electrodes. The resulting electrical potentials are measured, and the process may be repeated for numerous different configurations of applied current. Proposed applications include monitoring of lung function, detection of cancer in the skin and breast and location of epileptic foci [...] In 2011 the first commercial EIT device for lung function monitoring in intensive care patients was introduced. [...]

SLIDE 4

Lung imaging

[Figure: lung imaging with EIT.]

SLIDE 5

From the horse's mouth, cont. (Wikipedia)

Mathematically [...] non-linear inverse problem and is severely ill-posed. The mathematical formulation of the problem is due to [Calderon, 1980]. There is extensive mathematical research on the problem of uniqueness of solution and numerical algorithms for this problem [Uhlmann, 1999]. In geophysics a similar technique (called electrical resistivity tomography) is used, with electrodes on the surface of the earth or in boreholes, to locate resistivity anomalies, and in industrial process monitoring arrays of electrodes are used, for example, to monitor mixtures of conductive fluids in vessels or pipes. [...] broadly similar to the medical case. In geophysics, the idea dates from the 1930s.

SLIDE 7

Inverse problem setup

Goal: recover a distributed parameter function m(x), given:

• forward operator F(m)
• data b s.t. b = F(m) + ε, where ε is noise
• sensitivity matrix J = ∂F/∂m

SLIDE 9

Forward operator

F(m) = Qu, where the field u solves the discretized elliptic PDE Au = q, with A depending on m.

Multi-experiment version: Fi(m) = Qi ui, i = 1, ..., s, with data bi. Thus F = (F1, ..., Fs), b = (b1, ..., bs). The field ui solves the discretized elliptic PDE A(m) ui = qi. Note: the same A appears in all experimental settings i. (A sketch follows below.)

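A minimal sketch of this multi-experiment forward operator, assuming a routine assemble_A(m) that returns the sparse discretized operator, and lists Q_list, q_list of projections and sources (all names here are illustrative, not from the talk):

```python
import numpy as np
from scipy.sparse.linalg import splu

def forward(m, assemble_A, Q_list, q_list):
    """Evaluate F(m) = (Q_1 A(m)^{-1} q_1, ..., Q_s A(m)^{-1} q_s).

    Since the same A(m) appears in every experiment, one sparse
    factorization can be reused for all s sources."""
    A = splu(assemble_A(m).tocsc())                  # factor A(m) once
    return [Q @ A.solve(q) for Q, q in zip(Q_list, q_list)]
```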

SLIDE 11

EIT and DC resistivity problem setup

PDE with multiple sources:

∇ · (σ(x) ∇ui) = qi, i = 1, ..., s, ∂ui/∂ν = 0 on ∂Ω.

Conductivity σ(x) is expressed as a pointwise function of m(x). The operator A(m) is the above PDE discretized on a staggered grid. The sources qi and observation locations Qi correspond to different selections of sources and receivers. Need to solve s PDEs just to evaluate the forward operator F: proceed with caution. (A one-dimensional sketch of the assembly follows below.)

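A minimal one-dimensional analogue of this staggered-grid assembly, with σ = e^m at cell centres and u at nodes; this is an illustrative sketch, not the talk's 3D discretization:

```python
import numpy as np
import scipy.sparse as sp

def assemble_A_1d(m, h):
    """Discretize -d/dx (sigma du/dx) on a staggered 1D grid with
    sigma = exp(m) at the n cell centres and u at the n+1 nodes.
    Homogeneous Neumann BCs are natural here, so A is symmetric
    positive semidefinite with the constant vector in its null space."""
    n = m.size
    e = np.ones(n)
    G = sp.diags([-e, e], [0, 1], shape=(n, n + 1)) / h  # node-to-cell gradient
    S = sp.diags(np.exp(m))                              # cell conductivities
    return G.T @ S @ G
```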

SLIDE 13

EIT theory

If the entire Dirichlet-to-Neumann map (equivalently, the boundary data u|∂Ω expressed in terms of the source q, for any q(x) restricted to ∂Ω) is given, then σ(x) can be uniquely recovered, provided that it is at least differentiable on the domain Ω. Much theoretical effort has been spent on reducing the continuity requirements on σ [Astala & Paivarinta, 2006]. In practice there is noise, discretization of the PDEs, and discretization of the Dirichlet-to-Neumann map. Still, the theory correctly predicts that (i) things can go wrong when σ has discontinuities, and (ii) many experiments are useful.

The problem is much stabilized if we may add a priori information: assume σ(x) can take only one of two (or four) values at each x. Thus, shape optimization. This is a stabilizing assumption [Alessandrini & Vessella, 2005].

∇ · (σ(x) ∇ui) = qi, i = 1, ..., s.

SLIDE 16

Goals of this work

Consider EIT / DC resistivity problems with piecewise constant solutions σ, and algorithms that may or may not take the latter information into account. Construct efficient algorithms in the presence of many experiments.

SLIDE 17

Outline

• Motivation
  - From the horse's mouth
  - EIT, DC resistivity, and many measurements
• Previous work
  - Level set functions and shape optimization
  - Tikhonov, PCG and dynamical regularization
• Adaptive algorithms
  - Selecting the number of inner PCG iterations
  - Stochastic algorithm
• Conclusions

SLIDE 18

Forward problem reformulations

1. Define m(x) = ln σ(x), obtaining the PDE

   ∇ · (e^m ∇ui) = qi, i = 1, ..., s, ∂ui/∂ν = 0 on ∂Ω.

   Note σ > 0. Discretize on a staggered grid [Ascher & Haber, 2001].

2. Assuming σ(x) takes only one of two values, σI or σII, let σ = e^{P(m;0)} = e^{P(m)}, with

   P(ξ; h) = ((ln σI − ln σII)/2) tanh(ξ/h) + (ln σI + ln σII)/2.

   The function P(m; h) depends on the resolution or grid width h. Note that m(x) has bounded first derivatives, whereas σ(x) is discontinuous. (A small sketch of P follows below.)

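A minimal sketch of the transfer function P (NumPy; function and argument names are illustrative):

```python
import numpy as np

def P(xi, h, sigma_I, sigma_II):
    """Smoothed two-level transfer function from the slide: tends to
    ln(sigma_I) for xi >> h and to ln(sigma_II) for xi << -h."""
    half_diff = 0.5 * (np.log(sigma_I) - np.log(sigma_II))
    half_sum = 0.5 * (np.log(sigma_I) + np.log(sigma_II))
    return half_diff * np.tanh(xi / h) + half_sum

# sigma = np.exp(P(m, h, sigma_I, sigma_II)); as h -> 0 this approaches a
# piecewise constant conductivity: sigma_I where m > 0, sigma_II where m < 0,
# so m acts as a level set function.
```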

SLIDE 20

A level set approach

Set σ(x) = e^{P(m(x))}, e.g.

P(s; h) = ((ln σI − ln σII)/2) tanh(s/h) + (ln σI + ln σII)/2.

Sharpening happens at each iteration n across the 0-level set of mn(x).

SLIDE 21

Motivating experiments

From [van den Doel & Ascher, 2007]; setups and the code RESINVM3D from [Pidlisecky, Haber & Knight, 2007].

∇ · (e^m ∇ui) = qi, i = 1, ..., s, ∂ui/∂ν = 0 on ∂Ω, discretized on a staggered grid.

• Surface electrodes, mI = −2.3, mII = −5.3, s = 41, 3% noise
• Subsurface electrodes, 3% noise
• Subsurface electrodes and receivers, 1% noise

SLIDE 22

Experiment M.1

[Figure.]

SLIDE 23

Experiment M.2

[Figure.]

SLIDE 24

Experiment M.3

[Figure.]

SLIDE 25

Tikhonov-type regularization

This highly ill-posed problem requires regularization. Tikhonov-type: solve

min_m φ(m; β) ≡ (1/2)∥F(m) − b∥² + β R(m).

Necessary conditions: ∇φ = Jᵀ(F(m) − b) + β R′(m) = 0.

Consider for the regularization operator a same-grid discretization of (1/2)∫|∇m|². Then min_m φ is a nonlinear least squares problem. (A sketch of the objective and gradient follows below.)

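A minimal sketch of evaluating φ and ∇φ, assuming callables F(m) and JT(m, r) for the forward map and the action of Jᵀ, and a sparse matrix L discretizing the Laplacian so that R(m) = (1/2) mᵀL m (all names are illustrative):

```python
import numpy as np

def phi_and_grad(m, F, JT, L, b, beta):
    """Tikhonov objective phi(m) = 0.5*||F(m) - b||^2 + beta * 0.5 * m'L m
    and its gradient J'(F(m) - b) + beta * L m."""
    r = F(m) - b                        # stacked residual over all s experiments
    phi = 0.5 * np.dot(r, r) + 0.5 * beta * np.dot(m, L @ m)
    grad = JT(m, r) + beta * (L @ m)    # the J' product costs additional PDE solves
    return phi, grad
```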

SLIDE 28

Solving the minimization problem

Modified Gauss-Newton (GN): set m = mn at the nth iteration, and obtain the next iterate via

(JᵀJ + β0 R″) δm = −∇φ(m; β),
γ = argmin_γ φ(m + γ δm; β),
mn+1 ← m + γ δm.

Unmodified GN: β0 = β, γ = 1. However, in general β0 > 0 is used to stabilize the iteration for Tikhonov with β > 0. Even for one experiment, F(m) = Q A(m)⁻¹ q, the Jacobian J(m) = −Q A(m)⁻¹ G is very expensive to calculate and store. So use conjugate gradients (CG), preconditioned with the Laplacian R″. (A matrix-free sketch follows below.)

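A matrix-free sketch of one such step, assuming callables Jv and JTv for products with J and Jᵀ and a sparse R″; SciPy's cg performs the truncated inner solve (names are illustrative):

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def gauss_newton_step(m, grad_phi, Jv, JTv, Rpp, beta0, inner_iters):
    """Approximately solve (J'J + beta0 R'') dm = -grad_phi with a few CG
    iterations, never forming J explicitly."""
    n = m.size
    H = LinearOperator((n, n), matvec=lambda v: JTv(Jv(v)) + beta0 * (Rpp @ v),
                       dtype=float)
    dm, _ = cg(H, -grad_phi, maxiter=inner_iters)   # truncated inner solve
    return dm   # follow with a line search on phi(m + gamma * dm)
```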

SLIDE 30

What is PCG?

Consider the basic problem Ãx̃ = b̃, with Ã a symmetric positive definite matrix. We can efficiently carry out matrix-vector products Ãv, but cannot decompose the matrix directly.

Typical iterative algorithm: x̃k+1 = x̃k + αk pk, k = 0, 1, ..., with search direction pk and step size αk > 0.

Gradient descent method: pk = rk = b̃ − Ãx̃k. Steepest descent: choose αk by exact line search minimizing f(x̃) = (1/2) x̃ᵀÃx̃ − b̃ᵀx̃ along the line x̃ = x̃k + α pk.

SLIDE 33

What is PCG? (cont.)

Conjugate gradient (CG) method: choose pk = rk + βk pk−1, with βk chosen to obtain Ã-conjugacy. The number of iterations required to reduce the residual by a fixed amount is O(√cond(Ã)).

Preconditioned conjugate gradient (PCG) method: use a preconditioner P̃ to effectively solve P̃Ãx̃ = P̃b̃, where P̃Ã is much better conditioned than Ã. (A compact implementation follows below.)

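A compact generic PCG sketch: Amv applies Ã and Psolve applies the preconditioner, an approximation of Ã⁻¹ (in the talk's setting, a Laplacian solve with R″):

```python
import numpy as np

def pcg(Amv, b, Psolve, x0, max_iter, tol=1e-8):
    """Preconditioned conjugate gradients for A x = b with A SPD.
    Amv(v) returns A @ v; Psolve(r) approximates A^{-1} r."""
    x = x0.copy()
    r = b - Amv(x)
    z = Psolve(r)
    p, rz = z.copy(), r @ z
    for _ in range(max_iter):
        Ap = Amv(p)
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        z = Psolve(r)
        rz, rz_old = r @ z, rz
        p = z + (rz / rz_old) * p   # beta chosen for A-conjugate directions
    return x
```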

SLIDE 36

What's the cost?

Estimate the cost W by the number of PDE solves (i.e., evaluations of A(m)⁻¹q). Assume N outer GN iterations. At the nth outer iteration (1 ≤ n ≤ N) assume:

• Mn inner PCG iterations (each costs one product with J and one with Jᵀ, i.e., two PDE solves per experiment),
• ln additional PDE solves (e.g. line search),
• sn ≤ s experiments are used.

Then the total cost is

W = Σ_{n=1}^{N} (2 Mn + ln) sn.

Our essential goal is to reduce this number as much as possible while still obtaining quality reconstructions.

SLIDE 38

Solving by iterative regularization

Simplest way of reducing cost: use small Mn, no longer aiming to solve for δm exactly. This has a regularizing effect on the outer iteration [Hanke, 1997; Lechleiter & Rieder, 2009].

May set β0 = 0, δm = 0, and apply the iterations directly to the singular linear system: iterative regularization [Hansen, 1998]. Apply CGLS iterations, preconditioned by R″, to the underdetermined problem min_{δm} ∥J δm − δb∥.

Theorem: as the iteration number j increases,
• ∥J δm_j − δb∥ decreases monotonically;
• R(δm_j) increases monotonically.

But how should we choose Mn so that N does not grow significantly?! (A plain CGLS sketch follows below.)

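A minimal sketch of plain (unpreconditioned) CGLS for min ∥J δm − δb∥, where early stopping acts as the regularizer; the talk's variant additionally preconditions with R″ (Jv and JTv are assumed callables):

```python
import numpy as np

def cgls(Jv, JTv, db, n_iter):
    """CGLS for min ||J dm - db|| starting from dm = 0; stopping after a
    few iterations is itself a regularization (the residual norm decreases
    monotonically while the size of dm grows)."""
    s = JTv(db)
    dm = np.zeros_like(s)
    r = np.array(db, dtype=float)
    p = s.copy()
    gamma = s @ s
    for _ in range(n_iter):
        q = Jv(p)
        alpha = gamma / (q @ q)
        dm += alpha * p
        r -= alpha * q
        s = JTv(r)
        gamma, gamma_old = s @ s, gamma
        p = s + (gamma / gamma_old) * p
    return dm
```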

SLIDE 42

Dynamical regularization

May also set β = 0, replacing Tikhonov by dynamical regularization [van den Doel & Ascher, 2007].

The finite number of outer iterations N, plus the stabilization at each iteration by small Mn, lead to regularization of the nonlinear inverse problem. Terminate the iteration by misfit size control (the discrepancy principle; a sketch follows below). The effect of R is still felt, but only through the preconditioner R″!

Advantage over Tikhonov: no need to determine β by costly trial and error; nothing is evaluated very accurately.

Disadvantage over Tikhonov: less controlled convergence, less theory.

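A minimal sketch of the discrepancy-principle stopping test; the safety factor eta is an illustrative assumption, not a value from the talk:

```python
import numpy as np

def discrepancy_stop(Fm, b, noise_norm, eta=1.2):
    """Stop the outer iteration once the misfit ||F(m) - b|| reaches the
    (estimated) noise level, scaled by a safety factor eta >= 1."""
    return np.linalg.norm(Fm - b) <= eta * noise_norm
```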

SLIDE 44

Determining Mn adaptively

Treat each outer iteration as if it were the last. Starting from the current iterate m = mn and a guess for Mn, we want to find Mn that obtains sufficient decrease in φ(m + δm(Mn)) without evaluating all of φ(m + δm(j)), j = 1, ..., Mn. So, modify the outer iteration:

δm(Mn) = PCG(JᵀJ, −∇φ, R″, Mn);
Mn = argmin_M φ(m + δm(M)) s.t. ∥J δm(M) + ∇φ∥ > ∥ε∥;
if Mn > 0: γ = 1; else Mn = 1, γ = argmin_γ φ(m + γ δm(1));
mn+1 = m + γ δm(Mn).

Determine Mn by "sufficient decrease" in φ, starting from its current value. (One plausible realization is sketched below.)

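One plausible realization of this rule, assuming a helper phi_at(j) that returns φ(m + δm(j)) for the PCG iterate after j inner steps; the geometric checkpoints and the sufficient-decrease factor are illustrative assumptions, not the talk's exact recipe:

```python
def adaptive_Mn(phi_at, M_guess, M_max, rho=0.9):
    """Grow the inner PCG iteration count geometrically and keep the last
    checkpoint that still gave sufficient decrease in phi; note that each
    checkpoint evaluation of phi costs s PDE solves."""
    best_M, best_val = 0, phi_at(0)     # phi at the current iterate m
    M = max(M_guess, 1)
    while M <= M_max:
        val = phi_at(M)
        if val > rho * best_val:        # insufficient decrease: stop growing
            break
        best_M, best_val = M, val
        M *= 2                          # geometric checkpoints
    return max(best_M, 1)
```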

SLIDE 45

Numerical experiments (sn = s)

DC resistivity problem on the unit square. Display resistivities 1/σ.

Sources of the form qi(x) = δ_{x,pLi} − δ_{x,pRi}; obtain many sources by varying the positions pLi, pRi. K equidistant points on the right and left boundaries give s = K² data sets. Measure voltages on the entire boundary.

Ground truth: piecewise constant with σI = 1, σII = 0.1.

SLIDE 46

Example 1

Noise level 2.5%, s = 9 experiments, 65² grid. Use Tikhonov L2 regularization (directly on ln σ) with β = 6.5e-5.

    Mn        W
    20        1440
    3         1404
    adaptive  540

Details: N = 4 for adaptive and Mn = 20; N = 26 for Mn = 3; adaptive Mn = (5, 8, 7, 10). Reconstruction quality is poor, as expected.

SLIDE 47

Example 2

Same as the previous example, but use knowledge of the piecewise constant solution structure and values: use the level set function representation and regularization. Use Tikhonov with β = 1.e-3 (T) or dynamical regularization (D), with different initial guesses.

    Mn(T)     W(T)   Mn(D)     W(D)
    2         1548   5         900
    5         1000   20        ∞
    20        ∞      adaptive  792
    adaptive  756

(W = ∞: the iteration failed to converge.)

SLIDE 48

Example 2, cont.

[Figure: six reconstruction panels: (a) Initial guess I; (b) Initial guess II; (c) Tikhonov; (d) Dynamical from (a); (e) Dynamical from (b); (f) Dynamical from m = 0.]

SLIDE 49

Example 3

Same as the previous example, but noise level 0.8%. Use the level set function representation and regularization. Use a 257² grid, still with s = 9. Use dynamical regularization with zero initial guess.

    Mn        W
    5         19089
    10        1260
    adaptive  720

Note: reconstruction quality does not improve!

SLIDE 50

Using only sn ≤ s experiments?

A large number of experiments s may imply redundancy; however, assume no "embarrassing redundancy". We may not need all experiments at each iteration n, especially in early outer iterations. Hence, it makes sense to let sn depend on the outer iteration n.

SLIDE 53

Using only sn ≤ s experiments

Rewrite the objective function as

φ(m; β) = (1/2) Σ_{i=1}^{s} ∥Fi(m) − bi∥² + β R(m).

Introduce a stochastic variable w = (w1, ..., ws) with a probability distribution satisfying ⟨wi wj⟩ = δij. Hence

φ(m; β) = (1/2) ⟨∥ Σ_{i=1}^{s} wi (Fi(m) − bi) ∥²⟩ + β R(m).

(A small numerical check of this identity follows below.)

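A quick numerical check of this identity for Rademacher weights (the residual vectors below are random stand-ins for Fi(m) − bi):

```python
import numpy as np

rng = np.random.default_rng(0)
s, d = 36, 50
r = rng.standard_normal((s, d))          # stand-ins for the residuals F_i(m) - b_i
exact = np.sum(r ** 2)                   # sum_i ||r_i||^2
est = np.mean([np.sum((rng.choice([-1.0, 1.0], size=s) @ r) ** 2)
               for _ in range(20000)])   # Monte Carlo estimate of <||sum_i w_i r_i||^2>
print(exact, est)                        # the two agree up to sampling error
```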

SLIDE 54

Using only sn ≤ s experiments: constant Q

[Shapiro, Dentcheva & Ruszczynski, 2009], [Romberg et al., 2010], [Haber, Chung & Herrmann, 2011]

Consider approximating the expectation value at outer iteration n by random sampling from a set of sn vectors w, with sn ≤ s. Simultaneous random sources: select the wi's to be randomly ±1.

• A single realization (sn = 1) combined with averaging: stochastic approximation (SA).
• A small fixed number of realizations 1 < sn ≪ s: sample average approximation (SAA).

Very efficient if Qi = Q for all i, i.e. the data in different experiments are measured at the same locations (receivers), because then

Σ_{i=1}^{s} wi Fi = Σ_{i=1}^{s} wi Qi A(m)⁻¹ qi = Q A(m)⁻¹ ( Σ_{i=1}^{s} wi qi )

can be computed with a single PDE solve. (A sketch follows below.)

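A minimal sketch of the single-solve trick when Qi = Q for all i; A_solve is assumed to apply A(m)⁻¹, and qs, bs stack the sources and data column-wise (names are illustrative):

```python
import numpy as np

def combined_residual(A_solve, Q, qs, bs, w):
    """By linearity, A^{-1} (sum_i w_i q_i) = sum_i w_i u_i, so the combined
    residual sum_i w_i (Q u_i - b_i) costs ONE PDE solve instead of s."""
    u = A_solve(qs @ w)      # qs: (ndof, s), w: (s,) Rademacher weights
    return Q @ u - bs @ w    # bs: (ndata, s), blended with the same weights
```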

SLIDE 56

Using only sn ≤ s experiments: Qi varies

Note: other, non-stochastic simplifications are also possible when Qi = Q for all i, but here we proceed with the general case.

• At each outer iteration, estimate φ based on a set S of sn random samples of w; use a scaled random column of the identity matrix for each w.
• Perform a GN step using the previous adaptive algorithm.
• Cross-validate by checking that φ estimated with a different set also decreases.
• Start with a small set K of random experiments. At each step use some (half) of them for S and the rest, C, to validate.
• If validation fails, double the size of K (and S) by adding random columns. If the outer iteration goes well, halve sn for the start of the next iteration.
• After the outer iteration, use all of K to estimate the misfit. If the discrepancy principle is satisfied, then use all experiments to validate termination.

(A control-loop sketch follows below.)

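A rough control-loop sketch of this cross-validated sample-size schedule; gn_step, estimate_phi, draw_samples and the exact doubling/halving policy are illustrative assumptions (the discrepancy-principle termination test is omitted for brevity):

```python
def stochastic_outer_loop(m, gn_step, estimate_phi, draw_samples,
                          s_total, s0=2, max_outer=50):
    """Fit on a sample set S, validate on a disjoint set C; double the
    sample size on validation failure, halve it after a successful step."""
    sn = s0
    for n in range(max_outer):
        K = draw_samples(2 * sn)       # scaled random columns of the identity
        S, C = K[:sn], K[sn:]          # fitting half / validation half
        m_new = gn_step(m, S)          # adaptive-Mn GN step on the S-estimate
        if estimate_phi(m_new, C) < estimate_phi(m, C):
            m, sn = m_new, max(sn // 2, 1)   # success: try fewer samples
        else:
            sn = min(2 * sn, s_total)        # failure: use more samples
    return m
```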

SLIDE 61

Numerical experiments using both adaptive algorithms

With a similar setting to Example 1 on the unit square, no new advantage is gained with s = 9. Example 4: set s = 36. Example 5: set s = 625. Using dynamical regularization:

    regularization  experiments  W (all)  W (stochastic)
    L2              36           2880     401
    levelset(0)     36           2952     686
    levelset        36           2880     1528
    L2              625          162500   3196
    levelset(0)     625          23125    706

SLIDE 63

Example 4 reconstructions

[Figure: six panels: (g) True model; (h) L2; (i) L2 stochastic; (j) Level set; (k) Level set stochastic; (l) Level set starting at 0.]

SLIDE 64

Example 5 reconstructions

[Figure: four panels: (m) L2 stochastic; (n) L2 deterministic; (o) Level set stochastic; (p) Level set deterministic.]

SLIDE 65

Experiment in 3D: Example 6

Employ surface electrodes to image the injection of a saline tracer into the subsurface. σI⁻¹ = 10 Ω·m, σII⁻¹ = 200 Ω·m. On the surface, 96 electrodes. s = 41 independent current pairs, 93 data values each. 3% noise, 16³ grid. "Vanilla" = a good fixed Mn, with sn = s.

    regularization  vanilla  stochastic
    L2              4920     2160
    Levelset(0)     3690     1577
    Levelset        4100     1785

Not exactly super-impressive savings!

SLIDE 67

Example 6

[Figure: four panels: (q) True model; (r) Initial shape; (s) Level set from initial shape; (t) Level set from zero.]

SLIDE 68

Experiment in 3D: Example 7

Setup as in Example 6, but with these differences: data at 81 receivers located at the surface, plus four vertical boreholes near the corners used to place virtual sources and sinks. In total, s = 578 experiments. Generate data with 1% noise. Compare work with and without the stochastic algorithm:

    regularization  Adaptive Mn  Plus stochastic
    L2              102350       24369
    Levelset(0)     55488        14116
    Levelset        21964        400

SLIDE 70

Example 7

[Figure: four panels: (u) True model; (v) Initial shape; (w) Level set from initial shape; (x) Level set from zero.]

SLIDE 72

Conclusions

• Highly ill-posed problems can be much harder to solve than mildly ill-posed ones.
• Dynamical regularization is simpler, and thus better than stabilized Tikhonov, in certain challenging situations.
• The adaptive algorithm for selecting Mn works very well, consistently (somewhat) better than the best fixed value.
• The relative advantage of the stochastic algorithm varies greatly, from modest to highly impressive.
• The advantage of the cross-validation approach is that it makes it possible to claim great success when that is possible, while avoiding failure in other situations.

SLIDE 75

References

• K. van den Doel and U. M. Ascher, Adaptive and stochastic algorithms for EIT and DC resistivity problems with piecewise constant solutions and many measurements, SIAM J. Scientific Computing, to appear.
• K. van den Doel and U. M. Ascher, Dynamic level set regularization for large distributed parameter estimation problems, Inverse Problems 23, 1271-1288, 2007.
• K. van den Doel and U. M. Ascher, On level set regularization for highly ill-posed distributed parameter estimation problems, J. Computational Physics 216, 707-723, 2006.