Matrixfree conditional simulations of GMRF Somak Dutta Joint Work - - PowerPoint PPT Presentation

matrix free conditional simulations of gmrf
SMART_READER_LITE
LIVE PREVIEW

Matrixfree conditional simulations of GMRF Somak Dutta Joint Work - - PowerPoint PPT Presentation

Matrixfree conditional simulations of GMRF Somak Dutta Joint Work with Debashis Mondal. University of Chicago June 24, 2014 Data on a regular grid. Image of an dummy array of plot D 1 . Black = missing observations. Mixed linear model. y


slide-1
SLIDE 1

Matrix–free conditional simulations of GMRF

Somak Dutta Joint Work with Debashis Mondal.

University of Chicago

June 24, 2014

slide-2
SLIDE 2

Data on a regular grid.

Image of an dummy array of plot D1. Black = missing observations.

slide-3
SLIDE 3

Mixed linear model.

y = Tτ + Fx + ǫ. Array dimension = r × c. (VERY LARGE). y = n × 1 response vector. τ = m × 1 vector of fixed effects. T = n × m known design matrix. x = rc × 1 vector of underlying spatial random field. F = known sparse incidence matrix - Fx gives back the values of the spatial random field on n observed plots. ǫ ∼ N(0, λ−1

y In) : nugget effects.

slide-4
SLIDE 4

Intrinsic auto-regression model for x.

y = Tτ + Fx + ǫ.

◮ x is Gaussian with sparse singular precision matrix W,

xTWx = λ10 (xi,j−xi−1,j)2+λ01 (xi,j−xi,j−1)2.

◮ W has analytically known spectral decomposition

W = P(λ01D01 + λ10D10)PT.

◮ P correspond to the two dimensional discrete cosine

transformation.

slide-5
SLIDE 5

Conditional simulation.

Interested in sampling from: x|y ∼ N(λyA−1FT(y − Tτ) , A−1), A = λyFTF + W.

◮ Step 1: First draw z ∼ N(0, I). ◮ Traditional way: Compute Cholesky factor L such that

LLT = A. And let x = L−1z.

◮ Costs: memory =O((rc)1.5),

#FLOPs = O((rc)2).

◮ We will create algorithm that has costs:

memory = O(rc), #FLOPs = O(rc log rc)

slide-6
SLIDE 6

An “exact” method

◮ x|y ∼ N(A−1(y − Tτ) , A−1),

A = λyFTF + W

◮ Analytically known spectral decomposition: W = PDPT. ◮ Square root of A:

S = [λ

1 2

y FT

PD1/2] then SST = A Simulation algorithm:

◮ Strike 1: First draw z ∼ N(0, I). ◮ Strike 2: Sample with A as the covariance matrix

b = Sz + λy(y − Tτ) ∼ N(λy(y − Tτ), A)

◮ Strike 3: Solve x = A−1b ∼ N(A−1(y − Tτ) , A−1)

slide-7
SLIDE 7

Lanczos algorithm and Incomplete Cholesky Preconditioner

To solve: Ax = b using Lanczos algorithm (Dutta and Mondal, 2012).

◮ Condition number of A → ∞. ◮ L = incomplete Cholesky factorization (lower triangular):

LLT ≈ A ⇒ L−1AL−T ≈ I.

◮ solve L−1AL−Tx1 = L−1b, then x = L−Tx1. ◮ Geometric convergence of Lanczos algo in O(log rc)

iterations.

slide-8
SLIDE 8

Arsenic concentration in Bangladesh (Dutta and Mondal, 2013).

  • ● ●
  • ● ●
  • ● ●
  • 2

4 6 log−arsenic 1027

  • 1020

606 463

  • 418
  • Arsenic conc. (in ppb)

0 − 0.5 0.5 − 10 10 − 50 50 − 150 150 − 1660

Embed the data in a 500 x 300 array.

slide-9
SLIDE 9

Application: Maximal simultaneous exceedance region

◮ D is a 90% exceedance region of x for a given threshold c

P(xi,j ≥ c, ∀(i, j) ∈ D | y) ≥ 90%

◮ Finding the largest such set is not possible (NP hard?). ◮ Put a constraint: highest marginal exceedance

probabilities P(xi,j ≥ c | y) ≥ P(xi′,j′ ≥ c | y) ∀(i, j) ∈ D and (i′, j′) / ∈ D.

◮ Can be thought as a highest probability density

simultaneous exceedance region parallel to the Bayesian highest posterior density credible region.

◮ But still cannot be computed analytically.

slide-10
SLIDE 10

Simulating maximal simultaneous exceedance regions

Step 1. Rank the locations:

◮ Draw an ensemble of realizations x(1), . . . , x(N) of size N

from p(x|y).

◮ Compute marginal exceedance probabilities. ◮ Rank the locations according to decreasing marginal

exceedance probabilities.

Step 2. Compute the exeedance region.

◮ Starting from the top location keep on adding locations

until the simultaneous exceedance probability falls below 90%.

slide-11
SLIDE 11
slide-12
SLIDE 12
slide-13
SLIDE 13

References

  • 1. Dutta and Mondal (2012) An h-likelihood method for

spatial mixed linear model based on intrinsic

  • autoregressions. JRSS-B, Forthcoming.
  • 2. Dutta and Mondal (2013) REML Analysis for Spatial

Mixed Linear Models Based on Approximate Intrinsic Mat´ ern Dependence with Nugget Effects. Submitted.

  • 3. Dutta and Mondal (2014) Matrix–free conditional

simulations of Gaussian Markov random fields and their

  • applications. Preprint.
slide-14
SLIDE 14

Various details

◮ Latitude: 20 – 27 North, Longitude: 88 – 93 East. ◮ Area of each rectangular cell: 2.64 square kilometers. ◮ Embedded in 500 × 300 array ◮ Estimates:

λy = 4.72(0.02), λ01 = 3.14(0.05) and

  • λ10 = 1.17(0.13).