High-Order Quasi Monte-Carlo Integration for Bayesian Inversion of - - PowerPoint PPT Presentation
High-Order Quasi Monte-Carlo Integration for Bayesian Inversion of - - PowerPoint PPT Presentation
High-Order Quasi Monte-Carlo Integration for Bayesian Inversion of Parametric Operator Equations Christoph Schwab Seminar f ur Angewandte Mathematik ETH Z urich, Switzerland Joint with J. Dick, F. Kuo and T. Le Gia (Sydney), D. Nuyens
Outline
- Infinite-Dimensional Parametric Operator Equations
- Quasi Monte-Carlo Integration and High Order Digital Nets
- Quasi Monte-Carlo Petrov-Galerkin FEM convergence rate
- Bayesian Inverse Problem
- Numerical Results
- Conclusions
- References
Infinite-Dimensional Parametric Operator Equations
- Example: Linear, Affine-Parametric Operator Equation
Givenf ∈ Y′ , for every y ∈ U find u(y) ∈ X : A(y) u(y) = f . (1) Here A(y) = A0 +
- j≥1
yj Aj ∈ L(X; Y′) , ∀y := (yj)j≥1 ∈ U := [−1/2, 1/2]N . (2)
- Assumptions:
Small Fluctuations:
- j≥1
AjL(X,Y′) “small” , Sparsity: ∃0 < p < 1 :
- j≥1
Ajp
L(X,Y′) < ∞ .
(3)
- Example: Karh´
unen-Loeve expansion: a(x, y) = ¯ a(x) +
j≥1 yjψj(x)
A(y) = −∇x · a(x, y)∇x = −∇x · ¯ a(x)∇x
- A0
+
- j≥1
−∇x · ψj(x)∇x
- Aj
, X = Y = H1
0(D) .
- Goal: given G ∈ X ′, find E[G(u(·))] with respect to y ∈ U, i.e.,
I(G(u)) :=
- U
G(u(y)) dy . (4)
Infinite-Dimensional Parametric Operator Equations
- Strategy:
- 1. Dimension Truncation: truncate (2) to s terms,
- 2. solve s-dimensional equation (1) by Petrov-Galerkin discretization from {X h} ⊂ X,
- 3. approximate s-dimensional integral using QMC integration,
1 N
N−1
- n=0
G
- uh
s
- yn − 1
2
- ,
(5) where y0, . . . , yN−1 ∈ [0, 1]s denote N points from a higher order digital net.
- 4. Error bounds explicit w.r. to N, h and truncation dimension s.
Infinite-Dimensional Parametric Operator Equations
- Variational Formulation: aj(·, ·) : X × Y → R via
∀v ∈ X, w ∈ Y : aj(v, w) = Yw, AjvY′ , j = 0, 1, 2, . . . . Assumption: The sequence {Aj}j≥0 satisfies:
- 1. the nominal operator A0 ∈ L(X, Y′) is boundedly invertible:
inf
0=v∈X sup 0=w∈Y
a0(v, w) vXwY ≥ µ0 > 0 , inf
0=w∈Y sup 0=v∈X
a0(v, w) vXwY ≥ µ0 > 0 . (6)
- 2. the fluctuation operators {Aj}j≥1 are small with respect to A0: exists 0 < κ < 2 such that
- j≥1
βj ≤ κ < 2 , where βj := A−1
0 AjL(X,Y′) ,
j = 1, 2, . . . . (7)
Infinite-Dimensional Parametric Operator Equations
Assumption is sufficient for bounded invertibility of A(y) uniformly w.r. to y ∈ U: for every realization y ∈ U of the parameter vector a(y; v, w) := Yw, A(y)vY′ , (8) satisfies uniform (with respect to y ∈ U) inf-sup conditions: with µ = (1 − κ/2)µ0 > 0, ∀y ∈ U : inf
0=v∈X sup 0=w∈Y
a(y; v, w) vXwY ≥ µ , inf
0=w∈Y sup 0=v∈X
a(y; v, w) vXwY ≥ µ . (9) For every f ∈ Y′ and for every y ∈ U, the parametric problem find u(y) ∈ X : a(y; u(y), w) = Yw, fY′ ∀w ∈ Y (10) admits unique solution u(y) which satisfies the a-priori estimate u(y)X ≤ 1 µ fY′ . (11)
Infinite-Dimensional Parametric Operator Equations
- Parametric regularity of solutions
∀y ∈ U : (∂ν
yu)(y)X ≤ C0 |ν|! βνfY′
for all ν ∈ NN
0 with |ν| < ∞ ,
(12) where 0! := 1, βν :=
j≥1 β νj j , with βj = A−1 0 AjL(X,X) and |ν| = j≥1 νj.
- Spatial regularity: scales of smoothness spaces {Xt}t≥0, {Yt}t≥0, with
X = X0 ⊃ X1 ⊃ X2 ⊃ · · · , Y = Y0 ⊃ Y1 ⊃ Y2 ⊃ · · · , and X ′ = X ′
0 ⊃ X ′ 1 ⊃ X ′ 2 ⊃ · · · , Y′ = Y′ 0 ⊃ Y′ 1 ⊃ Y′ 2 ⊃ · · · .
(13)
- Uniform regularity shift (sufficient for single-level QMC-PG): for 0 < t ≤ ¯
t, ∀y ∈ U : f ∈ Y′
t =
⇒ u(y) = A(y)−1f ∈ Xt .
- Mixed regularity shift (necessary for multi-level QMC-PG):
f ∈ Y′
t =
⇒ sup
y∈U
(∂ν
yu)(y)Xt ≤ C0 |ν|! βν t fY′
t
for all ν ∈ NN
0 with |ν| < ∞ .
Infinite-Dimensional Parametric Operator Equations
- Best N-term polynomial chaos approximation
Under Assumption, u(y) : U → X can be expanded in (unconditionally convergent in L2(U; dy)) gpc series ∀y ∈ U : u(y) =
- F
uνLν(y) , where uν = (u, Lν)L2(U;dy) ∈ X . Here F = {ν ∈ NN
0 : |ν| < ∞}, and Lν is the (L2(U; dy)-normalized) tensorized Legendre polynomial
∀ν ∈ F, ∀y ∈ U : Lν(y) :=
- j≥1
Lνj(yj) (note L0 ≡ 1) .
- [Cohen & DeVore & CS (2011), (Chkifa & Cohen & CS 2014)]
Assume that β ∈ ℓp(N) for some 0 < p < 1. Then for every N ∈ N exists Λ ⊂ F such that, for q = 2, ∞ #(Λ) = N and u − uΛLq(U,dy;X) ≤ C(q)N −(1/p−1/q′) . q′ conjugate of q = 2, ∞, constant C > 0 independent of N and of dimension.
- Proof nonconstructive. “Constructive versions” (Chkifa & Cohen & DeVore & CS 2012-2014).
Infinite-Dimensional Parametric Operator Equations
- Petrov-Galerkin discretization:
Let {X h}h>0 ⊂ X and {Yh}h>0 ⊂ Y dense families of subspaces in X and Y.
- Approximation Properties: for 0 < t ≤ ¯
t and 0 < t′ ≤ ¯ t′, and for 0 < h ≤ h0, there hold ∀v ∈ Xt : inf
vh∈X h v − vhX ≤ Ct ht vXt ,
∀w ∈ Yt′ : inf
wh∈Yh w − whY ≤ Ct′ ht′ wYt′ .
(14) ∀ 0 ≤ t ≤ ¯ t : sup
y∈U
A(y)−1L(Y′
t,Xt) < ∞ .
(15)
- Stability: assume that (X h, Yh) satisfy discr. inf-sup condition for A0.
Then there hold uniform (with respect to y ∈ U) discrete inf-sup conditions ∀y ∈ U : inf
0=vh∈X h
sup
0=wh∈Yh
a(y; vh, wh) vhXwhY ≥ ¯ µ > 0 , (16) ∀y ∈ U : inf
0=wh∈Yh
sup
0=vh∈X h
a(y; vh, wh) vhXwhY ≥ ¯ µ > 0 . (17)
Infinite-Dimensional Parametric Operator Equations
- For every 0 < h ≤ h0 and for every y ∈ U, Petrov-Galerkin approximation
find uh(y) ∈ X h : a(y; uh(y), wh) = Ywh, fY′ ∀wh ∈ Yh , (18) admits a unique solution uh(y) which satisfies the a-priori estimate uh(y)X ≤ 1 ¯ µ fY′ . (19)
- Quasioptimality: exists a constant C > 0 such that for all y ∈ U
u(y) − uh(y)X ≤ C ¯ µ inf
0=vh∈X h u(y) − vhX .
(20)
- Convergence Rate: Ex. C > 0 such that for every f ∈ Y′
t with 0 < t ≤ ¯
t as h → 0 u(y) − uh(y)X ≤ C ht fY′
t .
(21)
- Dimension-Truncation: For y ∈ U and s ∈ N, (y1, y2, ..., ys, 0, 0, ...) ∈ U, so
all bounds valid for [−1/2, 1/2]s uniformly w.r. to s.
Infinite-Dimensional Parametric Operator Equations
- Regularity:
∃ 0 < t′ ≤ ¯ t : G(·) ∈ X ′
t′ ,
(22)
- Adjoint Regularity: exists Ct′ > 0 such that for every y ∈ U,
∀y ∈ U : w(y) = (A∗(y))−1G ∈ Yt′ , w(y)Yt′ ≤ Ct′ GX ′
t′ .
(23)
- Superconvergence (Aubin-Nitsche):
for every f ∈ Y′
t with 0 < t ≤ ¯
t, for every G(·) ∈ X ′
t′ with 0 < t′ ≤ ¯
t sup
y∈U
- G(u(y)) − G(uh(y))
- ≤ C hτ fY′
t GX ′ t′ .
(24) where 0 < τ := t + t′ ≤ 2¯ t.
Dimension Truncation
Assume the Aj are enumerated s.t. βj := A−1
0 AjX satisfy
β1 ≥ β2 ≥ · · · ≥ βj ≥ · · · . (25) Then, for every f ∈ Y′, every y ∈ U and for every s ∈ N, sup
y∈U
u(y) − us(y)X ≤ C µ fY′
- j≥s+1
βj . (26) For every G(·) ∈ X ′, |I(G(u)) − I(G(us))| ≤ ˜ C µ fY′ GX ′
j≥s+1
βj 2 (27) for ˜ C > 0 independent of s, f and G. If conditions (25) hold, for any 0 < p < 1 and for any s ∈ N holds the dimension-truncation error bound
- j≥s+1
βj ≤ min
- 1
1/p − 1, 1
j≥1
βp
j
1/p s−(1/p−1) .
Quasi Monte-Carlo Integration and High Order Digital Nets
- Consider general s-variate integrand F ∈ C0([0, 1]s). Approximate s-dimensional integral
Is(F) :=
- [0,1]s F(y) dy
(28) where F(y) = G(uh
s(y − 1 2)) by
- N-point QMC method: an equal-weight quadrature rule
QN,s(F) := 1 N
N−1
- n=0
F(yn) , (29) with judiciously chosen points y0, . . . , yN−1 ∈ [0, 1]s.
Quasi Monte-Carlo Integration and High Order Digital Nets
Theorem 1 [Dick, Gia, Kuo, Nuyens, CS 2013] Let s ≥ 1 and N = bm for m ≥ 1 and prime b. Let β = (βj)j≥1 be a sequence of positive numbers s.t. ∃ 0 < p < 1 :
∞
- j=1
βp
j < ∞ .
(30) Define β{1:s} = (βj)1≤j≤s and α := ⌊1/p⌋ + 1 . (31) Assume for every s ∈ N, for every y ∈ [−1/2, 1/2]N ∀ ν ∈ {0, 1, . . . , α}s : |(∂ν
yF)(y{1:s})| ≤ c |ν|! βν {1:s}
(32) Then, an interlaced polynomial lattice rule of order α ≥ 1 with N points can be constructed using a fast component-by-component algorithm, with cost O(α s N log N + α2 s2N) operations, such that |Is(F) − QN,s(F)| ≤ Cα,β,b,p N −1/p , where Cα,β,b,p < ∞ is independent of s and N.
Combined Error Bound
Theorem 2 [Single-Level High-Order Quasi Monte-Carlo Galerkin Error bound]
- 1. Approximate I(G(u(·))) by dimension-truncation and interlaced polynomial lattice rule (5)
- rder α = ⌊1/p⌋ + 1 , withN = bm points in s dimensions,
with Petrov-Galerkin discretization in D with subspace X h with Mh = dim(X h) DoF, cost O(Mh). Then ex. C > 0 independent of s, h and N such that with τ = t + t′ |I(G(u)) − QN,s(G(uh
s))| ≤ C
- (s−2(1/p−1) + N −1/p)fY′ G(·)X ′ + hτ fY′
tG(·)X ′ t′
- .
- 2. Cost for evaluation of QN,s(G(uh
s)) is O(sNMh) operations.
- 3. Cost for CBC construction of the interlaced polynomial lattice rule
O(α s N log N + α2s2N) operations, O(α s N) memory . Proof I(G(u)) − QN,s(G(uh
s)) = [I(G(u)) − I(G(us))] + [I(G(us)) − I(G(uh s))] + [I(G(uh s)) − QN,s(G(uh s))] .
✷
Bayesian Inverse Problems
- uncertain input data u(y) = u +
j≥1 yjψj ∈ X, uniform prior π0 on u,
- forward equation: A(u; q) = f, f ∈ Y′ known, forward solution map G(u(y)) holomorphic w.r. to y.
- noisy observation data: δ = G(u) + η ∈ Y , G = O ◦ G : X → Y , η ∼ N(0, Γ) ∈ Y = RK,
- QoI: φ(u).
Bayes’ Estimate: IΓ(φ(u)) := 1 Z
- y∈[−1/2,1/2]N exp
- −1
Γδ − G(u(y))2
Y
- φ(u(y))
- =:F(y)
π0(dy) Theorem 3 [High-Order Quasi Monte-Carlo for Bayesian Inverse Problems] For every Γ > 0, the parametric, deterministic integrand function F(y) satisfies the derivative bounds (32). The preceding error bounds apply to Bayesian Estimation. Proof: Analytic continuation and Cauchy’s Theorem. ✷
10 10
1
10
2
10
3
Number of Dimensions s
10
2
10
3
10
4
10
5
10
6
10
7
10
8
CBC Generation Time [ms] 1 2
Generation Time vs. Dimension. θ =0.1, C =0.1, ζ =4, m =15 α =2 α =3 α =4
(a) SPOD, C = 0.1
10 10
1
10
2
10
3
Number of Dimensions s
10
2
10
3
10
4
10
5
CBC Generation Time [ms] 1 1
Generation Time vs. Dimension. θ =0.1, C =1, ζ =4, m =15 α =2 α =3 α =4
(b) product, C = 1
Figure 1: CPU time required for the construction of generating vectors of varying order α = 2, 3, 4 for product
and SPOD weights vs. the dimension s in (a) and (b) and vs. the number of points N = 2m in (a) and (b).
10 10
1
10
2
10
3
10
4
10
5
10
6
10
7
Number of Points N =bm
10
1
10
2
10
3
10
4
10
5
10
6
10
7
CBC Generation Time [ms] 1 1
Generation Time vs. N. s =100, θ =0.1, C =0.1, ζ =4 α =2 α =3 α =4
(a) SPOD, s = 100
10 10
1
10
2
10
3
10
4
10
5
10
6
10
7
Number of Points N =bm
10
1
10
2
10
3
10
4
10
5
10
6
10
7
CBC Generation Time [ms] 1 1
Generation Time vs. N. s =1000, θ =0.1, C =0.1, ζ =4 α =2 α =3 α =4
(b) product, s = 1000
Figure 2: CPU time required for the construction of generating vectors of varying order α = 2, 3, 4 for product
and SPOD weights vs. the dimension s in (a) and (b) and vs. the number of points N = 2m in (a) and (b).
10 10
1
10
2
10
3
10
4
10
5
10
6
10
7
Number of Points N =bm
10
- 15
10
- 14
10
- 13
10
- 12
10
- 11
10
- 10
10
- 9
10
- 8
10
- 7
10
- 6
10
- 5
10
- 4
10
- 3
10
- 2
10
- 1
Quadrature Error |I[f]−Q[f]| 1 −2
Error vs. Number of Points N. s =100, θ =0.2, C =Cα,b, ζ =4 QMC, α =2 QMC, α =3 QMC, α =4 QMC (pruned), α =3 QMC (pruned), α =4
(a) C = Cα,b
10 10
1
10
2
10
3
10
4
10
5
10
6
10
7
Number of Points N =bm
10
- 15
10
- 14
10
- 13
10
- 12
10
- 11
10
- 10
10
- 9
10
- 8
10
- 7
10
- 6
10
- 5
10
- 4
10
- 3
10
- 2
10
- 1
Quadrature Error |I[f]−Q[f]| 1 −2
Error vs. Number of Points N. s =100, θ =0.2, C =1, ζ =4 QMC, α =2 QMC, α =3 QMC, α =4 QMC (pruned), α =3 QMC (pruned), α =4
(b) C = 1
Figure 3: Convergence of QMC approximation for SPOD integrand (inv. Karh´
unen-Loeve ) in s = 100 dimensions with interlacing parameter α = 2, 3, 4, with and without pruning.
10 10
1
10
2
10
3
10
4
10
5
10
6
10
7
Number of Points N =bm
10
- 15
10
- 14
10
- 13
10
- 12
10
- 11
10
- 10
10
- 9
10
- 8
10
- 7
10
- 6
10
- 5
10
- 4
10
- 3
10
- 2
10
- 1
Quadrature Error |I[f]−Q[f]| 1 −2 1 −4
Error vs. Number of Points N. s =100, θ =0.1, C =0.1, ζ =2 α =2 α =3 α =4
(a) s = 100, ζ = 2
10 10
1
10
2
10
3
10
4
10
5
10
6
10
7
Number of Points N =bm
10
- 15
10
- 14
10
- 13
10
- 12
10
- 11
10
- 10
10
- 9
10
- 8
10
- 7
10
- 6
10
- 5
10
- 4
10
- 3
10
- 2
10
- 1
Quadrature Error |I[f]−Q[f]| 1 −2 1 −4
Error vs. Number of Points N. s =100, θ =0.1, C =0.1, ζ =4 α =2 α =3 α =4
(b) s = 100, ζ = 4
Figure 4: Convergence of QMC approximation to integral for product weight integrand in s = 100, 1000 dimensions
with interlacing parameter α = 2, 3, 4.
10 10
1
10
2
10
3
10
4
10
5
10
6
10
7
Number of Points N =bm
10
- 15
10
- 14
10
- 13
10
- 12
10
- 11
10
- 10
10
- 9
10
- 8
10
- 7
10
- 6
10
- 5
10
- 4
10
- 3
10
- 2
10
- 1
Quadrature Error |I[f]−Q[f]| 1 −2 1 −4
Error vs. Number of Points N. s =1000, θ =0.1, C =0.1, ζ =2 α =2 α =3 α =4
(a) s = 1000, ζ = 2
10 10
1
10
2
10
3
10
4
10
5
10
6
10
7
Number of Points N =bm
10
- 15
10
- 14
10
- 13
10
- 12
10
- 11
10
- 10
10
- 9
10
- 8
10
- 7
10
- 6
10
- 5
10
- 4
10
- 3
10
- 2
10
- 1
Quadrature Error |I[f]−Q[f]| 1 −2 1 −4
Error vs. Number of Points N. s =1000, θ =0.1, C =0.1, ζ =4 α =2 α =3 α =4
(b) s = 1000, ζ = 4
Figure 5: Convergence of QMC approximation to integral for product weight integrand in s = 100, 1000 dimensions
with interlacing parameter α = 2, 3, 4.
10 10
1
10
2
10
3
10
4
10
5
10
6
10
7
Number of Points N =bm
10
- 15
10
- 14
10
- 13
10
- 12
10
- 11
10
- 10
10
- 9
10
- 8
10
- 7
10
- 6
10
- 5
10
- 4
10
- 3
10
- 2
10
- 1
Quadrature Error |I[f]−Q[f]| 1 −2 1 −4
Error vs. Number of Points N. s =100, θ =0.1, C =0.1, ζ =2 α =2 α =3 α =4
(a) ζ = 2
10 10
1
10
2
10
3
10
4
10
5
10
6
10
7
Number of Points N =bm
10
- 15
10
- 14
10
- 13
10
- 12
10
- 11
10
- 10
10
- 9
10
- 8
10
- 7
10
- 6
10
- 5
10
- 4
10
- 3
10
- 2
10
- 1
Quadrature Error |I[f]−Q[f]| 1 −2 1 −4
Error vs. Number of Points N. s =100, θ =0.1, C =0.1, ζ =4 α =2 α =3 α =4
(b) ζ = 4
Figure 6: Convergence of QMC approximation for SPOD weight integrand in s = 100 dimensions with interlacing
parameter α = 2, 3, 4.
Conclusions
- Infinite-dimensional, holomorphic-parametric operator equations,
- Advection-Diffusion, Helmholtz in random media, random domains,
- Sparsity:
– affine-parametric equations
- j≥1 A−1
0 Ajp L(X,X) < ∞ some 0 < p < 1 .
– holomorphic-parametric equations A(y; ·) : X → Y′ is holomorphic in polydiscs.
- Petrov-Galerkin discretization in x ∈ D, Dimension truncation at dimension s ∈ N,
- N - point QMC quadrature based on digital net of order α = 1 + ⌊1/p⌋ imply convergence rate
- E[G(u)] − QN,s[G(uh
s)]
- ≤ C(N −1/p + s−2(1/p−1) + ht+t′) .
- Work = O(NsMh) analogous to (Single-Level) Monte-Carlo for p = 1.
- Multi-Level Extension: Report 2014-16, SAM ETH Z¨
urich (Dick, Kuo, LeGia & CS)
- Holomorphic-parametric operator equations: Report 2014-23, SAM ETH Z¨
urich (Dick, Kuo, LeGia & CS).
- Bayesian Inverse Problems:
for holomorphic-parametric problems, countably-parametric posterior density satisfies derivative bounds (Sparsity: Schillings, CS and Stuart 2013, weighted RKHS: Dick, LeGia and CS 2014) = ⇒ QMC Rate N −1/p vs. 1/2 for MCMC, N −(1/p−1) for adaptive Smolyak, CS-based sampling,
References
- J. Dick and F.Y. Kuo and Q.T. Le Gia and D. Nuyens and Ch. Schwab:
Higher order QMC Galerkin discretization for parametric operator equations, Research Report 2013-29, SAM ETH Z¨ urich (to appear in SINUM 2015)
- J. Dick and F.Y. Kuo and Q.T. Le Gia and Ch. Schwab:
Multi-level Higher order QMC Galerkin discretization for parametric operator equations, Research Report 2014-14, SAM ETH Z¨ urich (in review)
- J. Dick and Q.T. Le Gia and Ch. Schwab:
Higher order QMC Petrov-Galerkin discretizations for holomorphic parametric operator equations Research Report 2014-23, SAM ETH Z¨ urich (in review)
- Ch. Schwab
QMC Galerkin discretization of parametric operator equations
- Proc. MCMQC 2012, Springer Publ. 2014.
- A. Chkifa, A. Cohen and Ch. Schwab:
Breaking the curse of dimensionality in sparse polynomial approximation of parametric PDEs
- Journ. Math. Pures & Appliquees (2014).
- R.N. Gantner and Ch. Schwab: