Data-Driven Multifidelity Methods for Monte Carlo Estimation
Benjamin Peherstorfer Courant Institute of Mathematical Sciences New York University Karen Willcox Massachusetts Institute of Technology Max Gunzburger Florida State University
◮ Large-scale numerical simulation ◮ Achieves required accuracy ◮ Computationally expensive
◮ Approximate high-fidelity f (1) ◮ Often orders of magnitude cheaper
Figure: costs vs. error of the high-fidelity model and of surrogate models
◮ Data-fit models, response surfaces, machine learning ◮ Coarse-grid approximations ◮ Reduced basis, proper orthogonal decomposition ◮ Simplified models, linearized models
Figure: solutions $u(z_1), u(z_2), \dots, u(z_M)$ from the solution set $\{u(z) \mid z \in D\}$ in $\mathbb{R}^N$
◮ Costs of outer loop reduced ◮ Often orders of magnitude speedups
◮ Control with error bounds/estimators ◮ Rebuild if accuracy too low ◮ No guarantees without bounds/estimators
◮ Propagation of surrogate error on estimate ◮ Surrogates without error control ◮ Costs of rebuilding a surrogate model
◮ Leverage surrogate models for speedup ◮ Recourse to high-fidelity for accuracy
◮ Occasional recourse to high-fidelity model ◮ High-fidelity model is kept in the loop ◮ Independent of error control for surrogates
◮ Adapt, fuse, filter with surrogate models ◮ Balance #solves among models
[Brandt, 1977], [Hackbusch, 1985], [Bramble et al, 1990], [Booker et al, 1999], [Jones et al, 1998], [Alexandrov et al, 1998], [Christen et al, 2005], [Cui et al, 2014]
[P., Willcox, Gunzburger, Survey of multifidelity methods in uncertainty propagation, inference, and optimization; SIAM Review, 2018 (to appear)]
1 P., Willcox & Gunzburger Optimal model management for multifidelity Monte Carlo estimation. SISC, 2016.
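Monte Carlo estimator of $E[f^{(1)}(Z)]$ from $n$ realizations $z_1, \dots, z_n$ of $Z$:
$$\bar{y}^{(1)}_n = \frac{1}{n} \sum_{i=1}^{n} f^{(1)}(z_i), \qquad e(\bar{y}^{(1)}_n) = \frac{\mathrm{Var}[f^{(1)}(Z)]}{n}$$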
◮ Each high-fidelity model solve is computationally expensive ◮ Repeated model solves become prohibitive
[Rozza, Carlberg, Manzoni, Ohlberger, Veroy-Grepl, Willcox, Kramer, Benner, Ullmann, Nouy, Zahm, etc]
◮ Correlation coefficient −1 ≤ ρ ≤ 1 of A and B ◮ If ρ = 0, same MSE as regular Monte Carlo ◮ If |ρ| > 0, lower MSE ◮ The higher the correlation, the lower the MSE of the control variate estimator
[Nelson, 87]
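The effect of correlation on the MSE can be illustrated with a small experiment. The sketch below is not from the slides: it assumes toy random variables A = exp(Z) and B = Z with Z uniform on (0, 1), so that E[B] = 0.5 is known, and it uses the standard coefficient γ = Cov[A, B] / Var[B] estimated from the samples. The function names are hypothetical.

```python
import math
import random
import statistics


def control_variate_estimate(a, b, mean_b):
    """Estimate E[A] from paired samples (a_i, b_i), using B with known mean as control variate."""
    n = len(a)
    mean_a = sum(a) / n
    sample_mean_b = sum(b) / n
    cov_ab = sum((x - mean_a) * (y - sample_mean_b) for x, y in zip(a, b)) / (n - 1)
    var_b = sum((y - sample_mean_b) ** 2 for y in b) / (n - 1)
    gamma = cov_ab / var_b  # assumed coefficient choice: Cov[A, B] / Var[B]
    return mean_a + gamma * (mean_b - sample_mean_b)


def compare_variances(n=100, repetitions=500, seed=0):
    """Return the empirical variances of the plain and the control variate estimators."""
    rng = random.Random(seed)
    plain, with_cv = [], []
    for _ in range(repetitions):
        z = [rng.random() for _ in range(n)]   # toy input Z ~ U(0, 1)
        a = [math.exp(x) for x in z]           # toy quantity of interest A = exp(Z)
        plain.append(sum(a) / n)
        with_cv.append(control_variate_estimate(a, z, 0.5))  # B = Z, E[B] = 0.5
    return statistics.variance(plain), statistics.variance(with_cv)


var_plain, var_cv = compare_variances()
```

Because exp(Z) and Z are strongly correlated on (0, 1), `var_cv` should come out well below `var_plain` at the same number of samples.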
◮ High-fidelity model f (1) : D → Y ◮ Surrogates f (2), . . . , f (k) : D → Y
[P., Willcox & Gunzburger: Optimal model management for multifidelity Monte Carlo estimation. SISC, 2016]
◮ Multilevel Monte Carlo [Giles 2008], [Heinrich 2001], [Speight, 2009] ◮ RBM and control variates [Boyaval et al, 2010, 2012], [Vidal et al 2015] ◮ Data-fit models and control variates [Tracey et al 2013] ◮ Monte Carlo with low-/high-fidelity model [Ng & Eldred 2012] ◮ Two models and control variates [Ng & Willcox 2012, 2014]
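Monte Carlo estimators from realizations $z_1, \dots, z_{m_k}$ of $Z$, with $0 < m_1 \le m_2 \le \dots \le m_k$:
$$\bar{y}^{(i)}_{m_i} = \frac{1}{m_i} \sum_{j=1}^{m_i} f^{(i)}(z_j)$$
◮ MFMC estimator
$$\hat{s} = \bar{y}^{(1)}_{m_1} + \sum_{i=2}^{k} \gamma_i \left( \bar{y}^{(i)}_{m_i} - \bar{y}^{(i)}_{m_{i-1}} \right)$$
◮ Costs of each model evaluation $0 < w_1, \dots, w_k \in \mathbb{R}$ give costs of MFMC estimator
$$c(\hat{s}) = \sum_{i=1}^{k} w_i m_i = w^T m$$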
◮ Selection of coefficients γ2, . . . , γk and model evaluations m1, . . . , mk? ◮ Comparison in terms of costs/MSE to regular Monte Carlo estimation?
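A minimal sketch of how such an estimator could be evaluated, assuming the MFMC form from the cited SISC paper: the high-fidelity Monte Carlo mean on the first m_1 samples plus, for each surrogate, γ_i times the difference between its mean on m_i samples and its mean on the first m_{i-1} samples. The function name `mfmc_estimate`, the toy models, and the uniform input are assumptions made for illustration only.

```python
import math
import random


def mfmc_estimate(models, m, gamma, sample_input, seed=0):
    """MFMC estimate of E[f1(Z)].

    models: [f1, ..., fk] with f1 the high-fidelity model
    m: numbers of model evaluations with m[0] <= m[1] <= ... <= m[k-1]
    gamma: coefficients; gamma[0] is unused (it belongs to f1)
    sample_input: function taking a random.Random and returning one realization of Z
    """
    rng = random.Random(seed)
    z = [sample_input(rng) for _ in range(m[-1])]  # realizations shared by all models
    # Monte Carlo estimator of the high-fidelity model on the first m_1 realizations
    estimate = sum(models[0](x) for x in z[: m[0]]) / m[0]
    for i in range(1, len(models)):
        values = [models[i](x) for x in z[: m[i]]]
        mean_mi = sum(values) / m[i]                    # mean over m_i realizations
        mean_prev = sum(values[: m[i - 1]]) / m[i - 1]  # mean over first m_{i-1} realizations
        estimate += gamma[i] * (mean_mi - mean_prev)
    return estimate


# Toy setup (hypothetical): exp as "high-fidelity" model, Taylor polynomials as surrogates.
toy_models = [math.exp, lambda x: 1.0 + x + 0.5 * x * x, lambda x: 1.0 + x]


def draw_uniform(rng):
    return rng.random()


toy_estimate = mfmc_estimate(toy_models, [10, 100, 1000], [1.0, 1.0, 1.0], draw_uniform)
```

The coefficients here are simply set to 1; how to choose γ_i and m_i well is exactly the selection question raised in the bullets above.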
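Variance of the MFMC estimator
$$\mathrm{Var}[\hat{s}] = \frac{\sigma_1^2}{m_1} + \sum_{i=2}^{k} \left( \frac{1}{m_{i-1}} - \frac{1}{m_i} \right) \left( \gamma_i^2 \sigma_i^2 - 2 \gamma_i \rho_i \sigma_1 \sigma_i \right)$$
◮ Variance $\sigma_i^2$ of $f^{(i)}(Z)$ ◮ Correlation coefficient $\rho_i$ between $f^{(1)}(Z)$ and $f^{(i)}(Z)$
Optimization problem for budget $p$
$$\min_{m \in \mathbb{R}^k,\ \gamma_2, \dots, \gamma_k \in \mathbb{R}} \mathrm{Var}[\hat{s}] \quad \text{s.t.} \quad 0 \le m_1 \le m_2 \le \dots \le m_k, \quad w^T m = p$$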
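If $\rho_1^2 > \dots > \rho_k^2 > 0$ (with $\rho_1 = 1$ and $\rho_{k+1} = 0$) and
$$\frac{w_{i-1}}{w_i} > \frac{\rho_{i-1}^2 - \rho_i^2}{\rho_i^2 - \rho_{i+1}^2}, \qquad i = 2, \dots, k,$$
then the optimal coefficients and numbers of model evaluations for budget $p$ are
$$\gamma_i^* = \frac{\rho_i \sigma_1}{\sigma_i}, \qquad m_1^* = \frac{p}{w^T r^*}, \qquad m_i^* = m_1^* r_i^*, \qquad r_i^* = \sqrt{\frac{w_1 (\rho_i^2 - \rho_{i+1}^2)}{w_i (1 - \rho_2^2)}}$$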
◮ Establish necessary condition for local optima with Karush-Kuhn-Tucker ◮ Only one local optimum with m1 < m2 < · · · < mk ◮ This local optimum has smaller objective value than any with “≤”
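MSE of the MFMC estimator with optimal $m^*$, $\gamma^*$ and budget $p$ (with $\rho_1 = 1$, $\rho_{k+1} = 0$):
$$e(\hat{s}) = \frac{\sigma_1^2}{p} \left( \sum_{i=1}^{k} \sqrt{w_i (\rho_i^2 - \rho_{i+1}^2)} \right)^2$$
[P., Willcox & Gunzburger: Optimal model management for multifidelity Monte Carlo estimation. SISC, 2016]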
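The closed-form allocation from the cited SISC paper can be sketched in a few lines. The sketch assumes the formulas $r_i^* = \sqrt{w_1(\rho_i^2 - \rho_{i+1}^2) / (w_i(1 - \rho_2^2))}$, $m_1^* = p / (w^T r^*)$, $m_i^* = m_1^* r_i^*$ and $\gamma_i^* = \rho_i \sigma_1 / \sigma_i$ with $\rho_1 = 1$ and $\rho_{k+1} = 0$. It returns real-valued $m_i$ and leaves rounding to integers out. The function names and the example numbers are hypothetical.

```python
import math


def mfmc_allocation(rho, sigma, w, p):
    """Return (m, gamma) for correlations rho (rho[0] = 1), std. deviations sigma, costs w, budget p."""
    k = len(rho)
    rho_ext = list(rho) + [0.0]  # convention rho_{k+1} = 0
    for i in range(1, k):
        # ordering condition on the squared correlation coefficients
        if not rho_ext[i - 1] ** 2 > rho_ext[i] ** 2 > 0.0:
            raise ValueError("squared correlations must be strictly decreasing and positive")
        # condition relating cost ratios to correlation differences
        bound = (rho_ext[i - 1] ** 2 - rho_ext[i] ** 2) / (rho_ext[i] ** 2 - rho_ext[i + 1] ** 2)
        if w[i - 1] / w[i] <= bound:
            raise ValueError("cost/correlation condition violated for model %d" % (i + 1))
    r = [
        math.sqrt(w[0] * (rho_ext[i] ** 2 - rho_ext[i + 1] ** 2) / (w[i] * (1.0 - rho_ext[1] ** 2)))
        for i in range(k)
    ]
    m1 = p / sum(wi * ri for wi, ri in zip(w, r))
    m = [m1 * ri for ri in r]  # real-valued numbers of evaluations
    gamma = [rho[i] * sigma[0] / sigma[i] for i in range(k)]
    return m, gamma


def mfmc_mse(rho, sigma1, w, p):
    """MSE of the MFMC estimator under the optimal allocation (same assumptions as above)."""
    rho_ext = list(rho) + [0.0]
    total = sum(math.sqrt(w[i] * (rho_ext[i] ** 2 - rho_ext[i + 1] ** 2)) for i in range(len(rho)))
    return sigma1 ** 2 / p * total ** 2


# Hypothetical example: one high-fidelity model and two much cheaper surrogates.
rho = [1.0, 0.99, 0.9]
sigma = [1.0, 1.0, 1.0]
w = [1.0, 1e-2, 1e-4]
m_opt, gamma_opt = mfmc_allocation(rho, sigma, w, p=100.0)
```

In this example most evaluations go to the cheap surrogates, while the MSE from `mfmc_mse` is smaller than the plain Monte Carlo value $\sigma_1^2 w_1 / p$ for the same budget.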
◮ Inputs: nominal thickness, load, damage ◮ Output: maximum deflection of plate ◮ Only distribution of inputs known ◮ Estimate expected deflection
◮ High-fidelity model: FEM, 300 DoFs ◮ Reduced model: POD, 10 DoFs ◮ Reduced model: POD, 5 DoFs ◮ Reduced model: POD, 2 DoFs ◮ Data-fit model: linear interp., 256 pts ◮ Support vector machine: 256 pts
Figure: (a) wing panel; (b) damaged plate, thickness over spatial coordinates x1 and x2
◮ Monte Carlo needs 12 h of runtime for an estimate with error below $10^{-7}$ ◮ Multifidelity provides an estimator with error below $10^{-7}$ after 9 seconds
Computed on MAC cluster: 10 nodes with 64 cores each, 640 cores in total
◮ Largest improvement from “single → two” and “two → three” ◮ Adding yet another reduced/SVM model reduces variance only slightly
Share of samples among the models:
one model: 100.00%
two models: 99.99%, 1.95e-3%
three models: 99.69%, 0.30%, 1.35e-4%
six models: 98.29%, 1.36%, 0.31%, 0.03%, 2.11e-3%, 3.47e-5%
◮ Identify the parameters of model with largest
◮ Large-scale variance estimation problem ◮ Multifidelity makes tractable global
◮ Highly flexible, high-aspect-ratio wing ◮ Air density and root angle of attack uncertain ◮ Estimate expected flutter speed ◮ MFMC reduced runtime by more
Figure: variance of mean estimate vs. computational budget (s) for MC and MFMC, hydraulic conductivity estimation
Figure: Elizabeth Qian Figure: Philip S. Beran
1 P., Gunzburger & Willcox Convergence analysis of multifidelity Monte Carlo estimation. Numerische Mathematik, 2018.
◮ Existence and uniqueness ◮ Unbiased estimator of statistics of high-fidelity model f (1) ◮ MSE in terms of costs and correlation coefficients
◮ Estimate
◮ MSE of MFMC estimator ˆ
variance
◮ Find L ∈ N, #model evaluations m, coefficients γ such that e(ˆ
◮ Bound costs c(ˆ
Example: f (1), f (2), . . . correspond to multilevel discretization of f
[Brandt, 1977], [Goodman et al, 1989], [Heinrich, 2001], [Giles, 2008], [Cliffe, 2011]
◮ $|E[f - f^{(\ell)}]| \lesssim h^{-\alpha \ell}$,
◮ $w_\ell \lesssim h^{\beta \ell}$,
◮ Var
q
q ) ǫ with
q ) ǫ−1ǫ−β/(2α)
◮ Costs bound independent of rates α and β ◮ Agrees with results in multilevel Monte Carlo estimation [Giles, 2008]
[P., Gunzburger & Willcox: Convergence analysis of multifidelity Monte Carlo estimation. Numerische Mathematik, 2018]
1 P., Multifidelity Monte Carlo estimation with adaptive low-fidelity models. submitted, 2017.
◮ Trade off adaptation (“deterministic approximation”) and sampling ◮ Surrogate model is constructed with outer-loop result in mind ◮ Related to “exploration vs. exploitation” in Bayesian optimization ◮ Constructing goal-oriented surrogates [Oden et al, 2000], [Bui-Thanh et al, 2007], [Lieberman and Willcox, 2013], [Spantini et al, 2017], [Li et al, 2018]
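Correlation coefficient $\rho_n$ between $f$ and surrogate $f^{(n)}$, costs $w_n$ of evaluating $f^{(n)}$:
$$1 - \rho_n^2 \le c_1 n^{-\alpha}, \qquad w_n \le c_2 n^{\beta}$$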
◮ Constructing f (n) requires n evaluations of f ◮ Evaluations of f dominate construction costs ◮ Construction costs are significant (e.g., model reduction)
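MSE of the MFMC estimator $\hat{s}_n$ that combines $f$ and $f^{(n)}$ with sampling budget $q$ (costs normalized so that one evaluation of $f$ costs 1):
$$e(\hat{s}_n) \le \frac{2 \sigma_1^2}{q} \left( c_1 n^{-\alpha} + c_2 n^{\beta} \right)$$
◮ If $n$ is spent on constructing $f^{(n)}$, the budget $q = p - n$ remains for sampling
$$e(\hat{s}_n) \le \frac{2 \sigma_1^2}{p - n} \left( c_1 n^{-\alpha} + c_2 n^{\beta} \right)$$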
◮ Measures error with respect to goal of estimating E[f (Z)] ◮ Takes construction costs n into account ◮ Measures efficacy of surrogate model for variance reduction (context)
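$$\hat{n}^* = \arg\min_{n \in (0,p)} \frac{1}{p - n} \left( c_1 n^{-\alpha} + c_2 n^{\beta} \right)$$
◮ Computes $\hat{n}^*$ ◮ Constructs surrogate $f^{(\hat{n}^*)}$ ◮ Use MFMC to combine $f$ and surrogate $f^{(\hat{n}^*)}$ with budget $p - \hat{n}^*$
[P., Multifidelity Monte Carlo estimation with adaptive low-fidelity models, 2017 (submitted)]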
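The trade-off between adaptation and sampling can be explored numerically. The sketch below assumes, for illustration only, that the quantity to minimize over n in (0, p) has the form $(c_1 n^{-\alpha} + c_2 n^{\beta}) / (p - n)$, with an error rate α for $1 - \rho_n^2$ and a cost rate β for $w_n$. The constants `c1`, `c2` are hypothetical, and a plain golden-section search stands in for whatever solver is actually used; it relies on the objective being unimodal on (0, p).

```python
import math


def adaptation_objective(n, p, c1, c2, alpha, beta):
    """Assumed objective: surrogate error and cost terms divided by the remaining sampling budget."""
    return (c1 * n ** (-alpha) + c2 * n ** beta) / (p - n)


def optimal_adaptation_samples(p, c1, c2, alpha, beta, iterations=200):
    """Golden-section search for the minimizer on (0, p); assumes a unimodal objective."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = 1e-9 * p, (1.0 - 1e-9) * p  # stay strictly inside (0, p)
    for _ in range(iterations):
        a = hi - inv_phi * (hi - lo)
        b = lo + inv_phi * (hi - lo)
        if adaptation_objective(a, p, c1, c2, alpha, beta) < adaptation_objective(b, p, c1, c2, alpha, beta):
            hi = b  # minimizer lies in [lo, b]
        else:
            lo = a  # minimizer lies in [a, hi]
    return 0.5 * (lo + hi)


# Rates taken as example values; constants c1, c2 are hypothetical.
alpha, beta, c1, c2 = 1.3187, 0.56, 1.0, 1e-3
n_small_budget = optimal_adaptation_samples(1e3, c1, c2, alpha, beta)
n_large_budget = optimal_adaptation_samples(1e6, c1, c2, alpha, beta)
```

Under these assumptions, increasing the budget by three orders of magnitude moves the minimizer only slightly: it stays below $(\alpha c_1 / (\beta c_2))^{1/(\alpha + \beta)}$, the point where $c_1 n^{-\alpha} + c_2 n^{\beta}$ itself is smallest.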
◮ Number of adaptations $\hat{n}^*$ is bounded independently of the budget $p$
◮ Stop adapting surrogate model even with unlimited budget p → ∞ ◮ Surrogate models can be “too accurate” for multifidelity methods
◮ If $w_n = 0$, then $e(\hat{s}_{\hat{n}^*}) \in O(p^{-1-\alpha})$ ◮ Can interpret $w_n = 0$ as $E[f^{(\hat{n}^*)}(Z)]$ being known
◮ Helps to understand the case $w_n \ll 1$ ($f^{(\hat{n}^*)}$ much cheaper than $f$)
◮ Measure velocity of fluid ◮ Three inputs uniformly distributed in
◮ Output is velocity ◮ Estimate expected velocity
◮ Based on convection-diffusion equation ◮ Discretized with finite elements ◮ High-fidelity model has 29008 DoFs
Figures: MORWiki https://morwiki.mpi-magdeburg.mpg.de/morwiki/index.php/Anemometer
◮ Gaussian process regression ◮ Take n realizations of Z ◮ Train on corresponding n outputs of f
◮ One-dimensional convex problem ◮ Numerically solve for $\hat{n}^*$
◮ Numerically estimate rates from pilot runs ◮ Optimize for $\hat{n}^*$
Figure: estimate of $1 - \rho_n^2$ (error) vs. #adaptation samples n, rate α ≈ 1.3187
Figure: measurements of costs $w_n$ [s] vs. #adaptation samples n, rate β ≈ 0.56
◮ Speedups of up to 3 orders of magnitude compared to crude Monte Carlo ◮ MSE of AMFMC decays with $p^{-1-\alpha}$ in pre-asymptotic regime
◮ Approximation of ˆ
◮ Lower and upper bounds seem tight in pre-asymptotic regime
◮ AMFMC optimally trades off adaptation and sampling costs ◮ Up to two orders of magnitude speedups compared to static models
◮ Length and height uniformly distributed
◮ Output is displacement of beam ◮ Estimate expected displacement
◮ High-fidelity finite element model ◮ Surrogate is Gaussian process model ◮ Measure rates numerically
Figure: estimate of $1 - \rho_n^2$ (error) vs. #adaptation samples n, rate α ≈ 0.91
Figure: measurements of costs $w_n$ [s] vs. #adaptation samples n, rate β ≈ 0.46
◮ AMFMC achieves about an order of magnitude speedup ◮ Decay of MSE slows down from $p^{-1-\alpha}$ to $p^{-1}$
◮ Estimate statistics in optimization iteration ◮ Robust optimization
◮ Estimate probability of rare event ◮ Crucial for risk-averse optimization
◮ Identify parameters of model that lead to
◮ Large-scale variance estimation problem
◮ Markov chain Monte Carlo sampling ◮ Increase acceptance probability of moves
Figure: mean density over t (Figure: Elizabeth Qian)
[P., Willcox, Gunzburger, Survey of multifidelity methods in uncertainty propagation, inference, and optimization; SIAM Review, 2018 (to appear)]
Figure: share of samples [%] for one model, two models, three models, and six models; models: high-fidelity f(1), reduced f(2), data f(3), reduced f(4), reduced f(5), SVM f(6)
Figure: estimated MSE vs. budget p (runtime [s]) for AMFMC, static MFMC with n = 57, and static MFMC with n = 568
◮ Leverage surrogate models for runtime speedup ◮ Recourse to high-fidelity model for accuracy guarantees ◮ Optimally trade off approximation, sampling, and construction ◮ Context aware construction of surrogate models
1 P., Willcox & Gunzburger: Optimal model management for multifidelity Monte Carlo estimation. SISC, 2016.
2 P., Gunzburger & Willcox: Convergence analysis of multifidelity Monte Carlo estimation. Numerische Mathematik, 2018.
3 P.: Multifidelity Monte Carlo estimation with adaptive low-fidelity models. submitted, 2017.