A Multifidelity Cross-entropy Method for Rare Event Simulation
Benjamin Peherstorfer, Courant Institute of Mathematical Sciences, New York University; Karen Willcox and Boris Kramer, MIT
1 / 22
Problem setup
◮ High-fidelity model f^(1): D → Y with evaluation costs w_1
◮ Estimate the rare event probability
  P_t = E_p[I_t(Z)], with I_t(z) = 1 if f^(1)(z) ≤ t, and 0 otherwise,
  where Z is a random variable with nominal density p and t is the threshold
[Figure: realizations of f^(1)(Z) and the corresponding density]
2 / 22
◮ Costs of uncertainty quantification are reduced ◮ Often orders-of-magnitude speedups
◮ Control with error bounds/estimators ◮ Rebuild if accuracy is too low ◮ No guarantees without bounds/estimators
◮ Propagation of surrogate error onto the estimate ◮ Surrogates without error control ◮ Costs of rebuilding a surrogate model
3 / 22
◮ Leverage surrogate models for speedup ◮ Recourse to high-fidelity for accuracy
◮ Balance #solves among models ◮ Adapt, fuse, filter with surrogate models
◮ Occasional recourse to high-fidelity model ◮ High-fidelity model is kept in the loop ◮ Independent of error control for surrogates
[P., Willcox, Gunzburger, Survey of multifidelity methods in uncertainty propagation, inference, and optimization; SIAM Review, 2017 (to appear)]
4 / 22
◮ Rewrite P_t with respect to a biasing density q:
  P_t = E_p[I_t] = E_q[I_t p/q]
◮ Estimate with realizations Z̃_1, ..., Z̃_m drawn from the biasing density q:
  P̂_t = (1/m) Σ_{i=1}^m I_t(Z̃_i) p(Z̃_i) / q(Z̃_i)
[Figure: realizations under the nominal density p and the biasing density q]
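The estimator can be sketched in a few lines. This is a minimal sketch with an illustrative one-dimensional limit state f(z) = z and a hand-picked Gaussian biasing density; both choices are assumptions for illustration, not the talk's models:

```python
import numpy as np

rng = np.random.default_rng(0)

# Rare event {Z <= t} under the nominal density p = N(0, 1)
t = -3.0
m = 100_000

def norm_pdf(z, mu=0.0, sigma=1.0):
    return np.exp(-0.5 * ((z - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

# Biasing density q = N(-3.5, 1), shifted toward the rare event (picked by hand
# here; the cross-entropy method later constructs such a q automatically)
mu_q, sigma_q = -3.5, 1.0
Z = rng.normal(mu_q, sigma_q, m)

# P_hat = (1/m) sum_i I_t(Z_i) p(Z_i) / q(Z_i), with Z_i ~ q
P_hat = np.mean((Z <= t) * norm_pdf(Z) / norm_pdf(Z, mu_q, sigma_q))
# P_hat approximates P[Z <= -3] ~ 1.35e-3 with far fewer samples than plain Monte Carlo
```

Plain Monte Carlo would need on the order of 10^5 samples just to see a handful of events of this size; the reweighted samples from q hit the event often and the weights p/q correct for the shift.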
5 / 22
◮ Switch between models [Li, Xiu et al., 2010, 2011, 2014]
◮ Reduced basis models with error estimators [Chen and Quarteroni, 2013]
◮ Kriging models and importance sampling [Dubourg et al., 2013]
◮ Subset method with machine-learning-based models [Bourinet et al., 2011], [Papadopoulos et al., 2012]
◮ Surrogates and importance sampling [P., Cui, Marzouk, Willcox, 2016]
◮ Variance reduction via control variates [Giles et al., 2015], [Elfverson et al., 2014, 2016], [Fagerlund et al., 2016]
◮ Subset method with coarse-grid approximations [Ullmann and Papaioannou, 2015]
◮ Importance sampling + control variates [P., Kramer, Willcox, 2017]
[P., Willcox, Gunzburger, Survey of multifidelity methods in uncertainty propagation, inference, and optimization; SIAM Review, 2017 (to appear)] 6 / 22
◮ Reduces costs per sample ◮ Number of samples to construct the biasing density remains the same ◮ Works well for probabilities > 10^-5
[P., Cui, Marzouk, Willcox, 2016], [P., Kramer, Willcox, 2017] 7 / 22
◮ Cross-entropy method: construct the biasing density from the optimal (zero-variance) density
  q*_t(z) = I_t(z) p(z) / P_t
[Rubinstein, 1999], [Rubinstein, 2001] 8 / 22
◮ Optimal biasing density that reduces the variance to 0:
  q*_{t_i}(z) = I^(1)_{t_i}(z) p(z) / P_{t_i}
◮ Find q_{v_i} ∈ Q = {q_v : v ∈ P} with minimal Kullback-Leibler distance to q*_{t_i}:
  v_i = argmin_{v ∈ P} D_KL(q*_{t_i} || q_v)
◮ Reformulate as (independent of the normalizing constant of q*_{t_i}):
  v_i = argmax_{v ∈ P} E_p[I^(1)_{t_i} log(q_v)]
◮ Solve approximately by replacing E_p with a Monte Carlo estimator:
  v_i = argmax_{v ∈ P} (1/m) Σ_{j=1}^m I^(1)_{t_i}(Z_j) log(q_v(Z_j)),  Z_1, ..., Z_m ∼ p
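For a Gaussian parametric family Q = {N(μ, σ²) : (μ, σ) ∈ P}, the sample-based optimization problem has a closed-form solution: the mean and standard deviation of the realizations that satisfy the indicator. A sketch under that assumed family (the talk's parametric family may differ):

```python
import numpy as np

rng = np.random.default_rng(1)

def ce_update(Z, indicator):
    """Closed-form argmax of (1/m) sum_j I(Z_j) log q_v(Z_j) over a
    one-dimensional Gaussian family q_v = N(mu, sigma^2)."""
    elite = Z[indicator]
    return elite.mean(), elite.std()

# Realizations from the nominal density p = N(0, 1); event {Z <= -2}
Z = rng.standard_normal(100_000)
mu, sigma = ce_update(Z, Z <= -2.0)
# (mu, sigma) fit the conditional distribution of Z given the event:
# mu is near -2.37 and sigma is well below 1
```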
[Rubinstein, 1999], [Rubinstein, 2001] 9 / 22
◮ Reweight so that samples can be drawn from the previous biasing density:
  v_i = argmax_{v ∈ P} E_{q_{v_{i-1}}}[I^(1)_{t_i} (p / q_{v_{i-1}}) log(q_v)]
◮ Choose t_1 ≫ t, solve for v_1 ∈ P with
  v_1 = argmax_{v ∈ P} E_p[I^(1)_{t_1} log(q_v)]
◮ Select t_2 < t_1, solve for v_2 ∈ P with
  v_2 = argmax_{v ∈ P} E_{q_{v_1}}[I^(1)_{t_2} (p / q_{v_1}) log(q_v)]
◮ Repeat until threshold t is reached and parameter v* is obtained
◮ Reweighted optimization problems have the same optimum as the original problems
◮ Estimate with the final biasing density q_{v*}:
  P̂_t = (1/m) Σ_{i=1}^m I^(1)_t(Z*_i) p(Z*_i) / q_{v*}(Z*_i),  Z*_1, ..., Z*_m ∼ q_{v*}
10 / 22
◮ Quantile parameter typically ρ ∈ [10^-2, 10^-1] ◮ Step t_i is the ρ-quantile corresponding to q_{v_{i-1}}
◮ Introduce a minimal step size δ > 0 ◮ The number of steps T is then bounded in terms of the minimal step size δ and the distance of t_1 to the threshold t
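Putting the pieces together, the multilevel iteration with ρ-quantile threshold steps can be sketched as follows. The one-dimensional limit state f(z) = z and the Gaussian biasing family are illustrative assumptions, and the minimal step-size safeguard δ is omitted for brevity:

```python
import numpy as np

rng = np.random.default_rng(2)

def norm_pdf(z, mu=0.0, sigma=1.0):
    return np.exp(-0.5 * ((z - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def cross_entropy(f, t, m=10_000, rho=0.1):
    """Lower the threshold step by step via the rho-quantile, refitting a
    Gaussian biasing density by weighted maximum likelihood at each step."""
    mu, sigma = 0.0, 1.0                      # start from the nominal p = N(0, 1)
    while True:
        Z = rng.normal(mu, sigma, m)
        y = f(Z)
        ti = max(np.quantile(y, rho), t)      # next threshold, capped at t
        w = (y <= ti) * norm_pdf(Z) / norm_pdf(Z, mu, sigma)  # I * p/q weights
        mu = np.sum(w * Z) / np.sum(w)
        sigma = np.sqrt(np.sum(w * (Z - mu) ** 2) / np.sum(w))
        if ti <= t:                           # threshold t reached: v* = (mu, sigma)
            return mu, sigma

mu, sigma = cross_entropy(lambda z: z, t=-3.0)

# Importance sampling with the final biasing density q_{v*} = N(mu, sigma^2)
Z = rng.normal(mu, sigma, 10_000)
P_hat = np.mean((Z <= -3.0) * norm_pdf(Z) / norm_pdf(Z, mu, sigma))
```

Each pass moves the biasing density toward the rare event {Z ≤ −3}; only a handful of threshold steps are needed before the cap at t is reached.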
11 / 22
[Figure: cross-entropy threshold steps t_1 > t_2 > ... > t_6 over t ∈ [0.5, 4]; left: mean, density, and realizations of the surrogate model f^(2)(Z); right: mean, density, and realizations of the high-fidelity model f^(1)(Z)]
◮ Find biasing density q^(k)_{v*} with surrogate f^(k) ◮ Start with q^(k)_{v*} to find biasing density q^(k-1)_{v*}
◮ Repeat until q^(1)_{v*} is found with f^(1)
12 / 22
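The loop over the model hierarchy can be sketched by warm-starting the cross-entropy iterations of each model with the biasing parameters found for the next-cheaper model. The two one-dimensional toy models and the Gaussian biasing family below are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)

def norm_pdf(z, mu=0.0, sigma=1.0):
    return np.exp(-0.5 * ((z - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def ce(f, t, mu, sigma, m=5_000, rho=0.1):
    """Cross-entropy iterations for model f, started from the biasing
    density N(mu, sigma^2); returns the parameters and iteration count."""
    iters = 0
    while True:
        iters += 1
        Z = rng.normal(mu, sigma, m)
        y = f(Z)
        ti = max(np.quantile(y, rho), t)
        w = (y <= ti) * norm_pdf(Z) / norm_pdf(Z, mu, sigma)
        mu = np.sum(w * Z) / np.sum(w)
        sigma = np.sqrt(np.sum(w * (Z - mu) ** 2) / np.sum(w))
        if ti <= t:
            return mu, sigma, iters

# Hypothetical two-model hierarchy: f2 is a cheap surrogate of f1
f1 = lambda z: z
f2 = lambda z: z + 0.05 * np.sin(z)

# MFCE: build the biasing density with the surrogate first, then warm-start
mu, sigma, iters2 = ce(f2, t=-3.0, mu=0.0, sigma=1.0)   # all surrogate solves
mu, sigma, iters1 = ce(f1, t=-3.0, mu=mu, sigma=sigma)  # few high-fidelity solves
# iters1 is much smaller than iters2: most steps use only the cheap surrogate
```

Because the surrogate's biasing density already concentrates near the failure region, the high-fidelity model typically needs only a single iteration to confirm and finish.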
◮ Let q^(2)_{v*} be the biasing density found with surrogate f^(2)
◮ Set t^(1)_1 to the ρ-quantile corresponding to q^(2)_{v*}
◮ Set t_p to be the ρ-quantile corresponding to the nominal density p
◮ Starting the f^(1) iterations from q^(2)_{v*} instead of p moves the first threshold t^(1)_1 closer to t than t_p
13 / 22
◮ A “local” bound on the model error: let 0 < α < 1; the models satisfy |f^(1)(z) − f^(2)(z)| ≤ α
◮ Lipschitz continuous distribution functions F^(i)_q(t) = P_q[f^(i)(Z) ≤ t] with Lipschitz constant L
◮ Consider the set B = {z ∈ D : |f^(2)(z) − t| ≤ α} ◮ Show that the corresponding indicator functions are equal on D \ B ◮ Use Lipschitz continuity to bound the probability of B
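The chain of estimates behind this sketch can be written out. This is a plausible reconstruction, assuming the model-error bound |f^(1) − f^(2)| ≤ α and a Lipschitz constant L for F^(2)_q; the exact constants in the talk may differ. Outside B the indicators agree, so the difference of expectations is controlled by the probability of B:

```latex
\left|\mathbb{E}_q\big[I_t^{(1)}\big] - \mathbb{E}_q\big[I_t^{(2)}\big]\right|
\le \mathbb{E}_q\big[\,|I_t^{(1)} - I_t^{(2)}|\,\big]
\le \mathbb{P}_q[B]
= F_q^{(2)}(t+\alpha) - F_q^{(2)}(t-\alpha)
\le 2L\alpha .
```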
14 / 22
◮ Coefficient a is given as
  a(ξ; z) = 0.0225 + e^{z_2} exp(−0.5 |ξ − 0.8|^2 / 0.0225)
◮ Random vector Z = [z_1, z_2] is normally distributed ◮ System response is the quantity of interest
◮ Discretize with varying mesh width h ∈ {2−8, 2−7, . . . , 2−3} ◮ Obtain models f (1), . . . , f (6)
15 / 22
[Figure: (a) biasing runtime, (b) total runtime]
◮ Single-fidelity approach uses f^(1) ◮ MFCE uses f^(1), ..., f^(6) ◮ MFCE reduces the runtime by almost two orders of magnitude
16 / 22
◮ Number of iterations averaged over 30 runs ◮ MFCE performs most iterations with coarse-grid models
17 / 22
[Figure: temperature field over x_1 ∈ [0, 1.5] cm and x_2 ∈ [0, 0.8] cm, temp between 500 K and 2000 K]
[Figure: (a) biasing runtime, (b) total runtime]
◮ Three models: data-fit, projection-based, and high-fidelity model ◮ Estimate the probability that the temperature is below a threshold, reference P_f ≈ 10^-6 ◮ MFCE reduces the iterations with the high-fidelity model from ≈ 5 to ≈ 1
18 / 22
[Figure: temperature field over x_1 ∈ [0, 1.5] cm and x_2 ∈ [0, 0.8] cm, temp between 500 K and 2000 K]
[Figure: (a) biasing runtime, (b) total runtime]
◮ Same setup as the previous reacting-flow example, except now with five inputs ◮ The first two inputs follow Gaussian distributions, the third and fourth Gamma distributions, and the fifth a log-normal ◮ MFCE achieves a speedup compared to using the high-fidelity model alone
19 / 22
[Jasa et al. 2018] https://github.com/johnjasa/OpenAeroStruct/
◮ Design variables are thickness and position of control points ◮ Uncertain flight conditions (angle of attack, air density, Mach number) ◮ Output is fuel burn
◮ Take a 3 × 3 × 3 grid in stochastic domain ◮ Evaluate high-fidelity model at those 27 points ◮ Derive linear interpolant of output
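A minimal sketch of such a data-fit surrogate on the 3 × 3 × 3 grid (the grid ranges and the stand-in for the high-fidelity model are illustrative assumptions, not the talk's values):

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

# Hypothetical 3 x 3 x 3 grid over the stochastic domain
alpha = np.linspace(-1.0, 5.0, 3)     # angle of attack [deg]
rho_air = np.linspace(0.3, 0.5, 3)    # air density [kg/m^3]
mach = np.linspace(0.7, 0.9, 3)       # Mach number

def high_fidelity(a, r, m):
    # Placeholder for the 27 expensive aerostructural solves (illustrative)
    return 1.0e4 + 500.0 * a - 2.0e4 * r + 3.0e3 * m   # "fuel burn"

A, R, M = np.meshgrid(alpha, rho_air, mach, indexing="ij")
values = high_fidelity(A, R, M)       # shape (3, 3, 3): one solve per grid point

# Piecewise-linear interpolant of the output, used as a cheap surrogate
surrogate = RegularGridInterpolator((alpha, rho_air, mach), values)
fuel_burn = surrogate([[2.0, 0.4, 0.8]])[0]
```

The surrogate costs a single table lookup per sample, so it can absorb almost all of the cross-entropy iterations before the high-fidelity model is consulted.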
20 / 22
◮ Computing the 10^-6 quantile for a fixed design point ◮ Multifidelity approach achieves up to one order of magnitude speedup
21 / 22
◮ Leverage surrogate models for runtime speedup ◮ Recourse to high-fidelity model for accuracy guarantees
3. P., Kramer, Willcox: Multifidelity preconditioning of the cross-entropy method for rare event simulation and failure probability estimation. SIAM/ASA Journal on Uncertainty Quantification, 2018.
2. P., Kramer, Willcox: Combining multiple surrogate models to accelerate failure probability estimation with expensive high-fidelity models. Journal of Computational Physics, 341:61-75, 2017.
1. P., Cui, Marzouk, Willcox: Multifidelity Importance Sampling. Computer Methods in Applied Mechanics and Engineering, 300:490-509, 2016.
22 / 22