[PPT] - Forecasting Volcanic Plume Hazards With Fast UQ Elena Ramona PowerPoint Presentation

SLIDE 1

Forecasting Volcanic Plume Hazards With Fast UQ

Elena Ramona Stefanescu, A.K. Patra, Marcus Bursik, E Bruce Pitman, P . Webley, and M. D. Jones

SUNY Buffalo Buffalo, NY 14260 . University of Alaska-Fairbanks Fairbanks, AK

June 1, 2015

SLIDE 2

Volcanic events

When modeling volcanic eruptions we deal with two types of flows:

n the ground (i.e pyroclastic flows)

above ground (i.e ash dispersion)

Figure: Volcanic eruptions1

1pictures from public domain

SLIDE 4

Questions when studying hazardous volcanic eruptions

Is a particular village likely to be affected by flows in a given time frame? Should we construct a road along this valley or along a different one? Should a village be evacuated and villagers moved to a different location? How safe it is to fly over this country/state? “A volcanic hazard refers to the probability that a given area will be affected by a potentially destructive volcanic process" – d’Albe, 1979 Hazard maps portray the impact of harmful future volcanic events In a crisis situation these have to be generated and updated in “hours" ) DDDAS.

SLIDE 5

Some problems

We have to deal with complex physics; Poorly characterized observation data; Uncertainties in flow characteristics and model parameters complicate the construction of accurate hazard maps; Large datasets that characterize the terrain and the input space.

SLIDE 6

bent & Puff Puff

Clustering and Lower Order Representations Multilevel-Multiscale Methods MLA Concluding Remarks

SLIDE 7

Bent

The Bent integral eruption column model was used to produce eruption column parameters (mass loading, column height, grain size distribution) given a specific atmospheric sounding and source conditions 2.

The mass eruption rate, M

is estimated as a function of axial jet velocity, U, bent radius, b0 and bulk density of the erupting mixture, 0:

M = b2

00U

(1)

2Bursik, M.I. et al. (2009) Volcanic plumes and wind: Jetstream interaction

examples and implications for air traffic. J. Volc. and Geo. Res., 186, 60-67

SLIDE 8

Puff

The Puff Lagrangian model tracks a finite number of Lagrangian point particles of different sizes, whose location R is propagated from timestep k to timestep k + 1 via an advection/diffusion equation 3.

Ri(tk+1) = Ri(tk) + W(tk)∆t + Z(tk)∆t + Si(tk)∆t

(2)

3Searcy, C. et al. (1998) PUFF: A high-resolution volcanic ash tracking model. J.

Volc. and Geo. Res., 80, 1-16

SLIDE 9

Computational Issues

To obtain three-digit accuracy in the expected value would require O(106) simulations; One million 20 minute simulations running non-stop in parallel on 64 processors will take 217 days to complete.

SLIDE 10

Hazard Map Construction

A simulator with uncertain inputs is a stochastic process; An emulator is a statistical model of a stochastic process that can be built from multiple sources of different fidelity data; To construct a Gausian Stochastic Process (GASP) emulator, the covariance structure of the Gaussian must be assumed and parameters determined by Bayesian methodology ( Conti et al. 2007, Barry et al. 2010); A Bayes Linear Model (BLM) emulator is a least square fit plus a gaussian error model that maps inputs to outputs and interpolates the "simulated" data ( O’Hagan et al. 2006); Stochastic Collocation Methods - efficiently use appropriate orthogonal polynomials to model various probability distributions ( Xiu et al. 2002, Dalbey et al. 2008); Multilevel-multiscale methods that combine clustering, low rank approximation and out-of-sample extension.

SLIDE 11

Hierarchical Emulator4

s(x) = µ(x) + (x) Cov(ˆ

(x), ˆ (yi)) = 2ri(x)

ri(x) = r(xyi) = exp(PNin

iin iin (xiin yi,iin )p)

Ri,j = ri(yj) = rj(yi)

s - represents the simulator output, x is an arbitrary input

E(sBLM(x)|sy) = g(x)T + r(x)TR1

(3)

Var(sBLM(x)|sy) = 2(1r(x)TR1r(x))

(4) g - are the least squares basis functions, are their coefficients, (x) is "Gaussian" model of the error, p=2,1 (absolute val.)

4dalbeythesis.

SLIDE 12

Hazard Map

a) b)

Figure: Probability that a flow will exceed 1 m in depth as a function of position on Mammoth Mountain, CA, given the uncertainties in DEM and input parameters (a) 4 uncertain parameters (east and north location, basal friction, height)– the arrow indicates the center of the town (b) 8 uncertain parameters (including DEM)

SLIDE 13

Graph Representation of Geospatial Data

Given geospatial data (e.g. elevation ! DEM), a weighted graph is set up G(V,E) where each data point of the is a node in the graph G points form the edges of the graph.

Figure: 120m resolution Mammoth Mountain, CA

Figure: Graph based representation of DEM

values

SLIDE 14

The Affinity Matrix

Gij = 8 > > < > > : exp kF(i)F(j)k

2

F

⇤expkx(i)x(j)k

2

x

, if kx(i) x(j)k  r 0,

therwise,

(5) where F(i) represents the feature vector for node i, and x(i) represents the coordinate location of ith node. F and x are positive tuning parameter that controls the decay of the affinity.

SLIDE 15

Spectral Clustering for Graphs

The Laplacian matrix L is symmetric with the zero smallest eigenvalue and constant eigenvector (free boundary). The second eigenvector, called the Fiedler vector, describes the partitioning.

L = DG,

(6) where D is diagonal degree matrix.

Spectral representation

Compute eigenvalues and eigenvectors of the Laplacian matrix; Map each point to a lower - dimensional representation based on one or more eigenvectors.

SLIDE 16

Spectral Clustering

Results

3.14 3.16 3.18 3.2 3.22 3.24 3.26 3.28 x 10

5

4.163 4.164 4.165 4.166 4.167 4.168 4.169 4.17 4.171 x 10

6

easting northing

Figure: Feature space represented by the elevation

−0.5 0.5 −0.5 −0.4 −0.3 −0.2 −0.1 0.1 0.2 0.3 0.4 0.5 easting northing

Figure: Feature space represented by the elevation, slope and curvature

SLIDE 17

Why use a Multilevel-multiscale approach?

Advances in instrumentation and computation making the spatiotemporal data even bigger. Usually DEMs have a very high resolution and cover large area resulting in O(109 1010) data points. Data analysis require efficient low rank approximation. Computational cost: O(N3/2) (most efficient case).

SLIDE 19

Multilevel Approximations (MLA)

In general, developing a multiscale

solver for a given problem involves four main tasks: Choosing an appropriate local process Choosing appropriate coarse (large-scale) variables Choosing appropriate methods for transferring information across scale and, Developing appropriate equations (or processes) for the coarse variables

SLIDE 20

SLIDE 21

AMG - Weighted Aggregation Clustering

Every cluster S is associated with a state vector

u(m) = (u(m)

1

,...,u(m)

N )

u(m)

i

= 8 > > < > > : 1

if i 2 S(m), if i < S(m). The restriction matrix is defined as:

(Rij)l+1

l

= 8 > > > > > < > > > > > : 1

if i 2 C,j = i,

wij/P

k2Ni wik

i 2 F,j 2 Ni,

therwise.

Coarse-level graph weights are calculated by:

(Gij)l+1 = ⇣ (Rij)l+1

l

⌘T (Gij)l(Rij)l+1

l

(7)

SLIDE 22

a) b) c) d)

Figure: Multilevel aggregation

SLIDE 23

SLIDE 24

Randomized Decomposition

A 2 Rm⇥n is a huge matrix. Given k ⌧ min{m,n}, we would like a

low-rank approximation to A with rank k. Traditional deterministic approaches (via truncated SVD, rank-revealing QR, Kyrlov space methods) cost at least

O(mnklogmin{m,n}) operations, and can have high

communications costs. Solution: Use randomness to assist in the design of algorithms for finding low-rank approximations of large matrices.

SLIDE 25

The random projection methodology

Capture the range of the “important" part of A using a sampling matrix

Ω, then project A onto this space to reduce rank.

SLIDE 26

Three factors determine probability of getting a good approximation: Spectral decay of A.

Figure: The numerical rank decays and so does the number of significant eigenvalues.

Type of randomness used to generate Ω. Amount of oversampling (as k ! n, ˜

A ! A).

SLIDE 27

Choice of Sampling Matrix

A natural choice for Ω is matrix of i.i.d. N(0,1) Gaussians, proposed in ( Martinsson et al. 2006). Computation of AΩ takes O(mnk) time for general A. The columns of A are well-mixed. ( Woolfe et al. 2008) proposed using structured random projections. Computation of AΩ takes reduced time O(mnlog(s)). Mixing not as uniform, so potential accuracy loss.

SLIDE 28

Prolongation - G⇤

Algorithm 1: Single-scale extension Data: The sampled data Ds = {x1,...,xsl(s)}, B(s), a new data point x⇤ 2 Rd, a function f = [f1,...,fn]T to be extended Output: f (s)

⇤

Calculate the pseudo-inverse (B(s))† of B(s); Calculate the coordinates vector of the orthogonal projection of f (s) on the range of B(s) in the basis of

B(s)’s columns c = (B(s))†f ;

Calculate the orthogonal projection of f on the columns

f B(s), f (s) = B(s)c;

Form the matrix G(s)

⇤

= [g(x⇤,xs1)...g(x⇤,xsl(s))];

Calculate the extension f (s)

⇤

= G⇤c;

SLIDE 29

DEM Representation

(a) 120m DEM - Mammoth Mountain with 6552 points (b) Interpolation using 26 points (c) Interpolation using 64 points (d) Interpolation using 140 points

Figure: DEM interpolation at different levels

SLIDE 31

MLA Hazard Map

Our scheme relies on graph-based algorithms and low-rank approximations of the entire adjacency matrix of the graph. Unlike PCQ or other weighted schemes, our methods do not weight the samples. The framework is not computationaly intensive and it is comparable to MC.

SLIDE 32

Goal: A map showing the probability of absolute airborne concentration (mgm3) > 0 on April 16 1200 UTC, at 2000 m at each location in the parcel, for Eyjafjallajokull eruption on 18 April, 2010 Ranges of the vent radius were uniformly distributed between 65 and 150 m, while the vent velocity followed the same distribution with values between 45 and 124 ms

SLIDE 33

Input Parameters Extension

50 100 150 40 60 80 100 120 140 0.5 1 1.5 2 x 10

−14

vent velocity vent radius ash

(a) Ash concentration at location 12,55 lat lon - 200 re-sample points

60 80 100 120 140 160 50 100 150 0.5 1 1.5 2 2.5 3 3.5 4 4.5 x 10

−13

vent radius vent velocity ash

(b) Ash concentration at location 22,48 lat lon - 1000 re-sample points

Figure: Ash concentration in terms of vent velocity and vent radius. Interpolation in the parameter space: blue dots are simulator samples and red dots are emulator samples

SLIDE 34

Spatial Interpolation

Figure: a) Ash at April 16 0012 level 2000m b) Sample points 55 out of 274

SLIDE 35

Hazard Map

Figure: a) Ash at April 16 0012 level 2000m b) Interpolated value

RMS=0.03

SLIDE 36

Hazard Map

Figure: a) MC 3500 sample runs b) PCQ 255 sample runs

SLIDE 37

Hazard Map

Figure: a) Multiscale 128 sample runs and 50 re-samples b) MC 3500 sample runs

SLIDE 38

Hazard Map

Figure: a) Multiscale 128 sample runs and 200 re-samples b) MC 3500 sample run

SLIDE 39

Conclusions

Create numerically-stable multiscale approach in creating a fast surrogate of the expensive numerical model; Combine graph representation with low-rank approximation to determine representative data points; Perform extensions in both spatial and parameter space, resulting in a high-resolution representation of the space.

Forecasting Volcanic Plume Hazards With Fast UQ

Table of Contents

Volcanic events

Questions when studying hazardous volcanic eruptions

Some problems

Table of Contents

bent & Puff Puff

Bent

The mass eruption rate, M

M = b2

Puff

Ri(tk+1) = Ri(tk) + W(tk)∆t + Z(tk)∆t + Si(tk)∆t

Computational Issues

Hazard Map Construction

Hierarchical Emulator4

Hazard Map

Graph Representation of Geospatial Data

The Affinity Matrix

Gij = 8 > > < > > : exp kF(i)F(j)k

⇤expkx(i)x(j)k

, if kx(i) x(j)k  r 0,

Spectral Clustering for Graphs

L = DG,

Spectral representation

Spectral Clustering

Table of Contents

Why use a Multilevel-multiscale approach?

Multilevel Approximations (MLA)

AMG - Weighted Aggregation Clustering

u(m) = (u(m)

,...,u(m)

u(m)

= 8 > > < > > : 1

(Rij)l+1

= 8 > > > > > < > > > > > : 1

wij/P

i 2 F,j 2 Ni,

(Gij)l+1 = ⇣ (Rij)l+1

⌘T (Gij)l(Rij)l+1

Randomized Decomposition

A 2 Rm⇥n is a huge matrix. Given k ⌧ min{m,n}, we would like a

O(mnklogmin{m,n}) operations, and can have high

The random projection methodology

Ω, then project A onto this space to reduce rank.

A ! A).

Choice of Sampling Matrix

Prolongation - G⇤

B(s)’s columns c = (B(s))†f ;

= [g(x⇤,xs1)...g(x⇤,xsl(s))];

= G⇤c;

Table of Contents

DEM Representation

MLA Hazard Map

Input Parameters Extension

Spatial Interpolation

Hazard Map

Hazard Map

Hazard Map

Hazard Map

Table of Contents

Conclusions