Forecasting Volcanic Plume Hazards With Fast UQ
Elena Ramona Stefanescu, A.K. Patra, Marcus Bursik, E Bruce Pitman, P . Webley, and M. D. Jones
SUNY Buffalo Buffalo, NY 14260 . University of Alaska-Fairbanks Fairbanks, AK
June 1, 2015
Forecasting Volcanic Plume Hazards With Fast UQ Elena Ramona - - PowerPoint PPT Presentation
Forecasting Volcanic Plume Hazards With Fast UQ Elena Ramona Stefanescu, A.K. Patra , Marcus Bursik, E Bruce Pitman, P . Webley, and M. D. Jones SUNY Buffalo Buffalo, NY 14260 . University of Alaska-Fairbanks Fairbanks, AK June 1, 2015
Elena Ramona Stefanescu, A.K. Patra, Marcus Bursik, E Bruce Pitman, P . Webley, and M. D. Jones
SUNY Buffalo Buffalo, NY 14260 . University of Alaska-Fairbanks Fairbanks, AK
June 1, 2015
Motivation Volcanic hazard Mathematical models Multilevel-Multiscale Methods MLA Concluding Remarks
When modeling volcanic eruptions we deal with two types of flows:
above ground (i.e ash dispersion)
Figure: Volcanic eruptions1
1pictures from public domain
Is a particular village likely to be affected by flows in a given time frame? Should we construct a road along this valley or along a different one? Should a village be evacuated and villagers moved to a different location? How safe it is to fly over this country/state? “A volcanic hazard refers to the probability that a given area will be affected by a potentially destructive volcanic process" – d’Albe, 1979 Hazard maps portray the impact of harmful future volcanic events In a crisis situation these have to be generated and updated in “hours" ) DDDAS.
We have to deal with complex physics; Poorly characterized observation data; Uncertainties in flow characteristics and model parameters complicate the construction of accurate hazard maps; Large datasets that characterize the terrain and the input space.
Motivation Mathematical models
Clustering and Lower Order Representations Multilevel-Multiscale Methods MLA Concluding Remarks
The Bent integral eruption column model was used to produce eruption column parameters (mass loading, column height, grain size distribution) given a specific atmospheric sounding and source conditions 2.
is estimated as a function of axial jet velocity, U, bent radius, b0 and bulk density of the erupting mixture, 0:
00U
(1)
2Bursik, M.I. et al. (2009) Volcanic plumes and wind: Jetstream interaction
examples and implications for air traffic. J. Volc. and Geo. Res., 186, 60-67
The Puff Lagrangian model tracks a finite number of Lagrangian point particles of different sizes, whose location R is propagated from timestep k to timestep k + 1 via an advection/diffusion equation 3.
(2)
3Searcy, C. et al. (1998) PUFF: A high-resolution volcanic ash tracking model. J.
To obtain three-digit accuracy in the expected value would require O(106) simulations; One million 20 minute simulations running non-stop in parallel on 64 processors will take 217 days to complete.
A simulator with uncertain inputs is a stochastic process; An emulator is a statistical model of a stochastic process that can be built from multiple sources of different fidelity data; To construct a Gausian Stochastic Process (GASP) emulator, the covariance structure of the Gaussian must be assumed and parameters determined by Bayesian methodology ( Conti et al. 2007, Barry et al. 2010); A Bayes Linear Model (BLM) emulator is a least square fit plus a gaussian error model that maps inputs to outputs and interpolates the "simulated" data ( O’Hagan et al. 2006); Stochastic Collocation Methods - efficiently use appropriate orthogonal polynomials to model various probability distributions ( Xiu et al. 2002, Dalbey et al. 2008); Multilevel-multiscale methods that combine clustering, low rank approximation and out-of-sample extension.
s(x) = µ(x) + (x) Cov(ˆ
(x), ˆ (yi)) = 2ri(x)
ri(x) = r(xyi) = exp(PNin
iin iin (xiin yi,iin )p)
Ri,j = ri(yj) = rj(yi)
s - represents the simulator output, x is an arbitrary input
E(sBLM(x)|sy) = g(x)T + r(x)TR1
(3)
Var(sBLM(x)|sy) = 2(1r(x)TR1r(x))
(4) g - are the least squares basis functions, are their coefficients, (x) is "Gaussian" model of the error, p=2,1 (absolute val.)
4dalbeythesis.
a) b)
Figure: Probability that a flow will exceed 1 m in depth as a function of position on Mammoth Mountain, CA, given the uncertainties in DEM and input parameters (a) 4 uncertain parameters (east and north location, basal friction, height)– the arrow indicates the center of the town (b) 8 uncertain parameters (including DEM)
Given geospatial data (e.g. elevation ! DEM), a weighted graph is set up G(V,E) where each data point of the is a node in the graph G points form the edges of the graph.
Figure: 120m resolution Mammoth Mountain, CA
values
2
F
2
x
(5) where F(i) represents the feature vector for node i, and x(i) represents the coordinate location of ith node. F and x are positive tuning parameter that controls the decay of the affinity.
The Laplacian matrix L is symmetric with the zero smallest eigenvalue and constant eigenvector (free boundary). The second eigenvector, called the Fiedler vector, describes the partitioning.
(6) where D is diagonal degree matrix.
Compute eigenvalues and eigenvectors of the Laplacian matrix; Map each point to a lower - dimensional representation based on one or more eigenvectors.
Results
3.14 3.16 3.18 3.2 3.22 3.24 3.26 3.28 x 10
5
4.163 4.164 4.165 4.166 4.167 4.168 4.169 4.17 4.171 x 10
6
easting northing
Figure: Feature space represented by the elevation
−0.5 0.5 −0.5 −0.4 −0.3 −0.2 −0.1 0.1 0.2 0.3 0.4 0.5 easting northing
Figure: Feature space represented by the elevation, slope and curvature
Motivation Mathematical models Multilevel-Multiscale Methods Randomized Decomposition MLA Concluding Remarks
Advances in instrumentation and computation making the spatiotemporal data even bigger. Usually DEMs have a very high resolution and cover large area resulting in O(109 1010) data points. Data analysis require efficient low rank approximation. Computational cost: O(N3/2) (most efficient case).
solver for a given problem involves four main tasks: Choosing an appropriate local process Choosing appropriate coarse (large-scale) variables Choosing appropriate methods for transferring information across scale and, Developing appropriate equations (or processes) for the coarse variables
Every cluster S is associated with a state vector
1
N )
i
if i 2 S(m), if i < S(m). The restriction matrix is defined as:
l
if i 2 C,j = i,
k2Ni wik
Coarse-level graph weights are calculated by:
l
l
(7)
a) b) c) d)
Figure: Multilevel aggregation
low-rank approximation to A with rank k. Traditional deterministic approaches (via truncated SVD, rank-revealing QR, Kyrlov space methods) cost at least
communications costs. Solution: Use randomness to assist in the design of algorithms for finding low-rank approximations of large matrices.
Capture the range of the “important" part of A using a sampling matrix
Three factors determine probability of getting a good approximation: Spectral decay of A.
Figure: The numerical rank decays and so does the number of significant eigenvalues.
Type of randomness used to generate Ω. Amount of oversampling (as k ! n, ˜
A natural choice for Ω is matrix of i.i.d. N(0,1) Gaussians, proposed in ( Martinsson et al. 2006). Computation of AΩ takes O(mnk) time for general A. The columns of A are well-mixed. ( Woolfe et al. 2008) proposed using structured random projections. Computation of AΩ takes reduced time O(mnlog(s)). Mixing not as uniform, so potential accuracy loss.
Algorithm 1: Single-scale extension Data: The sampled data Ds = {x1,...,xsl(s)}, B(s), a new data point x⇤ 2 Rd, a function f = [f1,...,fn]T to be extended Output: f (s)
⇤
Calculate the pseudo-inverse (B(s))† of B(s); Calculate the coordinates vector of the orthogonal projection of f (s) on the range of B(s) in the basis of
Calculate the orthogonal projection of f on the columns
Form the matrix G(s)
⇤
Calculate the extension f (s)
⇤
Motivation Mathematical models Multilevel-Multiscale Methods MLA Application to DEMs MLA Emulator Concluding Remarks
(a) 120m DEM - Mammoth Mountain with 6552 points (b) Interpolation using 26 points (c) Interpolation using 64 points (d) Interpolation using 140 points
Figure: DEM interpolation at different levels
Our scheme relies on graph-based algorithms and low-rank approximations of the entire adjacency matrix of the graph. Unlike PCQ or other weighted schemes, our methods do not weight the samples. The framework is not computationaly intensive and it is comparable to MC.
Goal: A map showing the probability of absolute airborne concentration (mgm3) > 0 on April 16 1200 UTC, at 2000 m at each location in the parcel, for Eyjafjallajokull eruption on 18 April, 2010 Ranges of the vent radius were uniformly distributed between 65 and 150 m, while the vent velocity followed the same distribution with values between 45 and 124 ms
50 100 150 40 60 80 100 120 140 0.5 1 1.5 2 x 10
−14
vent velocity vent radius ash
(a) Ash concentration at location 12,55 lat lon - 200 re-sample points
60 80 100 120 140 160 50 100 150 0.5 1 1.5 2 2.5 3 3.5 4 4.5 x 10
−13
vent radius vent velocity ash
(b) Ash concentration at location 22,48 lat lon - 1000 re-sample points
Figure: Ash concentration in terms of vent velocity and vent radius. Interpolation in the parameter space: blue dots are simulator samples and red dots are emulator samples
Figure: a) Ash at April 16 0012 level 2000m b) Sample points 55 out of 274
Figure: a) Ash at April 16 0012 level 2000m b) Interpolated value
RMS=0.03
Figure: a) MC 3500 sample runs b) PCQ 255 sample runs
Figure: a) Multiscale 128 sample runs and 50 re-samples b) MC 3500 sample runs
Figure: a) Multiscale 128 sample runs and 200 re-samples b) MC 3500 sample run
Motivation Mathematical models Multilevel-Multiscale Methods MLA Concluding Remarks Conclusions
Create numerically-stable multiscale approach in creating a fast surrogate of the expensive numerical model; Combine graph representation with low-rank approximation to determine representative data points; Perform extensions in both spatial and parameter space, resulting in a high-resolution representation of the space.