Approximate Bayesian inference for latent Gaussian models

Håvard Rue¹
Department of Mathematical Sciences
NTNU, Norway

December 4, 2009

¹ With S. Martino and N. Chopin

Latent Gaussian models

Latent Gaussian models often have the following hierarchical structure:

  • Observed data y, with yi | xi ∼ π(yi | xi, θ)
  • Latent Gaussian field x ∼ N(·, Σ(θ))
  • Hyperparameters θ, controlling
    • variability
    • length/strength of dependence
    • parameters in the likelihood

π(x, θ | y) ∝ π(θ) π(x | θ) ∏_{i∈I} π(yi | xi, θ)


Example: Generalised additive (mixed) models

g(µi) = ∑j fj(zji) + ∑k βk zki + εi,   where

  • each fj(·) is a smooth (random) function
  • βk is the linear effect of zk

The observations {yi} come from an exponential family with mean {µi}.


Examples

1D: Smoothing count data, general spline smoothing, semi-parametric regression, GLM(M), GAM(M), etc.
2D: Disease mapping, log-Gaussian Cox processes, model-based geostatistics, 1D models with spatial effect(s).
3D: Time series of images, spatio-temporal models.

Features

  • The dimension of the latent Gaussian field, n, is large: 10² to 10⁵, but the field is often Markov.
  • The dimension of the hyperparameters, dim(θ), is small: 1 to 5, say.
  • The dimension of the data, dim(y), may vary; the data are often non-Gaussian.


Examples of latent Gaussian models: 1D


Longitudinal mixed effects model: Epil-example from BUGS


Examples of latent Gaussian models: 2D

Disease mapping: Poisson data


Joint disease mapping: Poisson data


Spatial GLM with Binomial data


[Figure: point pattern on a 100 m × 100 m plot, x [m] vs y [m]]

Log-Gaussian Cox process: Oaks data


Examples of latent Gaussian models: 2D+

Spatial logit-model with semiparametric covariates


Tasks

Compute, from

π(x, θ | y) ∝ π(θ) π(x | θ) ∏_{i∈I} π(yi | xi, θ),

the posterior marginals

π(xi | y) for some or all i, and/or π(θj | y) for some or all j.


Our approach: Approximate Bayesian Inference

  • Can we compute (approximate) marginals directly, without using MCMC?
  • YES!
  • Gains:
    • huge speedup and accuracy
    • the ability to treat latent Gaussian models properly ;-)

Main ideas (I)

The main ideas are simple, and based on the identity

π(z) = π(x, z) / π(x | z).

When π(x | z) in the denominator is replaced by its Gaussian approximation, this is the Laplace approximation.


Main ideas (II)

Construct approximations to

  1. π̃(θ | y)
  2. π̃(xi | θ, y),

then integrate:

π̃(xi | y) = ∫ π̃(θ | y) π̃(xi | θ, y) dθ

π̃(θj | y) = ∫ π̃(θ | y) dθ−j

GMRFs: def

A Gaussian Markov random field (GMRF), x = (x1, . . . , xn)ᵀ, is a normally distributed random vector with additional Markov properties:

xi ⊥ xj | x−ij  ⟺  Qij = 0

where Q is the precision matrix (inverse covariance). Sparse matrices give fast computations!
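
To make the sparsity point concrete, here is a minimal C sketch (illustrative, not from the talk): it builds the tridiagonal precision matrix Q of a stationary AR(1)-type GMRF, where Qij = 0 whenever |i − j| > 1, and solves Qµ = b in O(n) time with the Thomas algorithm instead of an O(n³) dense factorisation. All names and parameter values are assumptions for the example.

    #include <stdio.h>
    #include <stdlib.h>

    /* Solve Q*mu = b for a symmetric tridiagonal precision matrix Q
     * (Thomas algorithm): diag[i] = Q_ii, off[i] = Q_{i,i+1}. O(n) time. */
    static void tridiag_solve(int n, const double *diag, const double *off,
                              const double *b, double *mu)
    {
        double *c = malloc(n * sizeof(double));
        double *d = malloc(n * sizeof(double));
        c[0] = off[0] / diag[0];
        d[0] = b[0] / diag[0];
        for (int i = 1; i < n; i++) {
            double m = diag[i] - off[i - 1] * c[i - 1];
            c[i] = (i < n - 1) ? off[i] / m : 0.0;
            d[i] = (b[i] - off[i - 1] * d[i - 1]) / m;
        }
        mu[n - 1] = d[n - 1];
        for (int i = n - 2; i >= 0; i--)
            mu[i] = d[i] - c[i] * mu[i + 1];
        free(c); free(d);
    }

    int main(void)
    {
        int n = 1000;
        double tau = 1.0, phi = 0.9;   /* assumed AR(1) parameters */
        double *diag = malloc(n * sizeof(double));
        double *off  = malloc(n * sizeof(double));
        double *b    = malloc(n * sizeof(double));
        double *mu   = malloc(n * sizeof(double));

        /* Tridiagonal precision of a stationary AR(1): Q_ij = 0 for |i-j| > 1,
         * which is exactly the pairwise Markov property above. */
        for (int i = 0; i < n; i++) {
            diag[i] = (i == 0 || i == n - 1) ? tau : tau * (1.0 + phi * phi);
            b[i] = 1.0;                /* toy right-hand side */
        }
        for (int i = 0; i < n - 1; i++)
            off[i] = -tau * phi;

        tridiag_solve(n, diag, off, b, mu);
        printf("mu[0] = %.4f, mu[n/2] = %.4f\n", mu[0], mu[n / 2]);
        free(diag); free(off); free(b); free(mu);
        return 0;
    }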


The GMRF-approximation

π(x | y) ∝ exp( −(1/2) xᵀQx + ∑i log π(yi | xi) )
         ≈ exp( −(1/2) (x − µ)ᵀ (Q + diag(ci)) (x − µ) )
         = πG(x | θ, y)

Constructed as follows:

  • Locate the mode x∗
  • Expand to second order

Markov and computational properties are preserved.
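
As a toy illustration of this construction (locate the mode, expand to second order), a hedged one-node sketch in C: Gaussian prior x ∼ N(0, 1/τ) and one Poisson count y | x ∼ Poisson(exp(x)). Newton iterations find the mode x∗, and the negative second derivative there supplies the extra diagonal term c. The full GMRF case does the same with Q + diag(ci); the values below are illustrative only.

    #include <stdio.h>
    #include <math.h>

    /* log pi(x | y) up to a constant: -0.5*tau*x^2 + y*x - exp(x) */
    int main(void)
    {
        double tau = 1.0;    /* assumed prior precision */
        double y   = 5.0;    /* assumed observed count  */
        double x   = 0.0;    /* Newton iterate for the mode x* */

        for (int it = 0; it < 100; it++) {
            double grad = -tau * x + y - exp(x);   /* first derivative  */
            double hess = -tau - exp(x);           /* second derivative */
            double step = grad / hess;
            x -= step;                             /* Newton update */
            if (fabs(step) < 1e-12) break;
        }
        /* Second-order expansion at the mode: Gaussian with mean x* and
         * precision tau + c, where c = exp(x*) = -(log-lik)'' at the mode. */
        double c = exp(x);
        printf("mode = %.6f, precision = %.6f, sd = %.6f\n",
               x, tau + c, 1.0 / sqrt(tau + c));
        return 0;
    }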


Part I Some more background: The Laplace approximation


Outline I

Background: The Laplace approximation
    The Laplace approximation for π(θ | y)
    The Laplace approximation for π(xi | θ, y)
The integrated nested Laplace approximation (INLA)
    Summary
    Assessing the error
Examples
    Stochastic volatility
    Longitudinal mixed effect model
    Log-Gaussian Cox process
Extensions
    Model choice
    Automatic detection of “surprising” observations
Summary and discussion
Bonus


Outline II

    High(er) number of hyperparameters
    Parallel computing using OpenMP
    Spatial GLMs


The Laplace approximation: The classic case

Compute an approximation to the integral

∫ exp(n g(x)) dx

where n is a parameter going to ∞. Let x0 be the mode of g(x), and assume g(x0) = 0; then

g(x) = (1/2) g″(x0)(x − x0)² + · · ·


The Laplace approximation: The classic case...

Then

∫ exp(n g(x)) dx = √( 2π / (n (−g″(x0))) ) + · · ·

  • As n → ∞, the integrand gets more and more peaked.
  • The error should tend to zero as n → ∞.
  • Detailed analysis gives

relative error(n) = 1 + O(1/n)
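
A quick numerical check of this claim (a sketch, not from the talk): take g(x) = cos(x) − 1, so the mode is x0 = 0 with g(x0) = 0 and g″(x0) = −1, and the Laplace approximation is √(2π/n). The C snippet compares it with trapezoidal quadrature over one period; the relative error shrinks roughly like 1/n.

    #include <stdio.h>
    #include <math.h>

    /* g(x) = cos(x) - 1: mode at x0 = 0, g(0) = 0, g''(0) = -1 */
    static double g(double x) { return cos(x) - 1.0; }

    int main(void)
    {
        for (int n = 1; n <= 1000; n *= 10) {
            /* trapezoidal quadrature of int exp(n g(x)) dx over [-pi, pi] */
            int K = 200000;
            double h = 2.0 * M_PI / K, quad = 0.0;
            for (int k = 0; k <= K; k++) {
                double w = (k == 0 || k == K) ? 0.5 : 1.0;
                quad += w * exp(n * g(-M_PI + k * h));
            }
            quad *= h;

            /* Laplace: sqrt(2*pi / (n * (-g''(x0)))), with -g''(0) = 1 */
            double laplace = sqrt(2.0 * M_PI / n);
            printf("n = %4d  quad = %.6f  Laplace = %.6f  rel.err = %+.4f\n",
                   n, quad, laplace, laplace / quad - 1.0);
        }
        return 0;
    }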


Extension I

If

gn(x) = (1/n) ∑_{i=1}^{n} gi(x),

then the mode x0 depends on n as well.


Extension II

Consider ∫ exp(n g(x)) dx where x is multivariate. Then

∫ exp(n g(x)) dx = √( (2π)^d / |−nH| ) + · · ·

where d = dim(x) and H is the Hessian (matrix) at the mode,

Hij = ∂²g(x) / ∂xi ∂xj, evaluated at x = x0.

Computing marginals

  • Our main issue is to compute marginals.
  • We can use the Laplace approximation for this as well.
  • A more “statistical” derivation might be appropriate.

Computing marginals...

Consider the general problem:

  • θ is a hyperparameter with prior π(θ)
  • x is latent with density π(x | θ)
  • y is observed with likelihood π(y | x)

Then

π(θ | y) = π(x, θ | y) / π(x | θ, y)   for any x!


Computing marginals...

Further,

π(θ | y) = π(x, θ | y) / π(x | θ, y)
         ∝ π(θ) π(x | θ) π(y | x) / π(x | θ, y)
         ≈ π(θ) π(x | θ) π(y | x) / πG(x | θ, y),   evaluated at x = x∗(θ),

where πG(x | θ, y) is the Gaussian approximation of π(x | θ, y) and x∗(θ) is its mode.
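
A hedged sanity check of this expression in C (illustrative, not the talk's implementation): with x | θ ∼ N(0, 1/θ), y | x ∼ N(x, 1) and an assumed prior π(θ) ∝ exp(−θ), the full conditional π(x | θ, y) is exactly Gaussian, so the ratio above reproduces the analytic log marginal log π(θ) + log N(y; 0, 1/θ + 1) exactly; with a non-Gaussian likelihood the same code structure yields the approximation.

    #include <stdio.h>
    #include <math.h>

    /* log N(x; mean, 1/prec) */
    static double lnorm(double x, double mean, double prec)
    {
        return 0.5 * log(prec / (2.0 * M_PI))
               - 0.5 * prec * (x - mean) * (x - mean);
    }

    int main(void)
    {
        double y = 1.3;   /* assumed observation */
        for (double th = 0.5; th <= 2.51; th += 0.5) {
            /* Gaussian approximation of pi(x | theta, y); here it is exact,
             * with precision q = theta + 1 and mode xstar = y / q. */
            double q = th + 1.0, xstar = y / q;

            /* log pi~(theta | y) = log pi(theta) + log pi(x | theta)
             *   + log pi(y | x) - log piG(x | theta, y), at x = xstar */
            double lap = -th + lnorm(xstar, 0.0, th) + lnorm(y, xstar, 1.0)
                         - lnorm(xstar, xstar, q);

            /* analytic check: log pi(theta) + log N(y; 0, 1/theta + 1) */
            double exact = -th + lnorm(y, 0.0, th / (th + 1.0));
            printf("theta = %.2f  Laplace = %.4f  exact = %.4f\n",
                   th, lap, exact);
        }
        return 0;
    }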


Computing marginals...

Error: With n repeated measurements of the same x, the error is

π̃(θ | y) = π(θ | y) (1 + O(n^−3/2))

after renormalisation. A relative error is a very nice property!


The Laplace approximation

The Laplace approximation for π(θ | y) is

π(θ | y) = π(x, θ | y) / π(x | y, θ)   (any x)
        ≈ π(x, θ | y) / πG(x | y, θ),   evaluated at x = x∗(θ)
        = π̃(θ | y)    (1)


Remarks

The Laplace approximation π̃(θ | y) turns out to be accurate, because x | y, θ appears almost Gaussian in most cases:

  • x is a priori Gaussian.
  • y is typically not very informative.
  • The observational model is usually ‘well-behaved’.

Note: π(θ | y) itself does not look Gaussian, so a Gaussian approximation of (θ, x) jointly would be inaccurate.


Approximating π(xi|y, θ)

This task is more challenging, since

  • the dimension of x, n, is large, and
  • there are potentially n marginals to compute, or at least O(n) of them.

An obvious, simple and fast alternative is to use the GMRF approximation:

π̃(xi | θ, y) = N(xi; µi(θ), σi²(θ))

Laplace approximation of π(xi|θ, y)

The Laplace approximation:

π̃(xi | y, θ) ≈ π(x, θ | y) / π̃(x−i | xi, y, θ),   evaluated at x−i = x∗−i(xi, θ)

  • Again, the approximation is very good, as x−i | xi, θ is ‘almost Gaussian’,
  • but it is expensive: to get all n marginals we must
    • perform n optimisations, and
    • n factorisations of (n − 1) × (n − 1) matrices.

This can be solved.


Simplified Laplace Approximation

A series expansion of the Laplace approximation for π(xi | θ, y):

  • computationally much faster: O(n log n) for each i
  • corrects the Gaussian approximation for errors in shift and skewness:

log π(xi | θ, y) = −(1/2) xi² + b xi + (1/6) d xi³ + · · ·

  • fit a skew-normal density, 2 φ(x) Φ(a x)
  • sufficiently accurate for most applications
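
For reference, the skew-normal density 2φ(x)Φ(ax) is cheap to evaluate; a minimal C sketch (the shape a, and in practice also a location and scale, would be matched to b and d from the expansion above; the values here are illustrative):

    #include <stdio.h>
    #include <math.h>

    static double phi(double x)   /* standard normal density */
    {
        return exp(-0.5 * x * x) / sqrt(2.0 * M_PI);
    }

    static double Phi(double x)   /* standard normal CDF via erfc (C99) */
    {
        return 0.5 * erfc(-x / sqrt(2.0));
    }

    /* skew-normal density with shape a: f(x) = 2 phi(x) Phi(a x) */
    static double dsn(double x, double a)
    {
        return 2.0 * phi(x) * Phi(a * x);
    }

    int main(void)
    {
        /* a = 0 recovers the standard normal; a != 0 adds skewness */
        for (double x = -3.0; x <= 3.0; x += 1.0)
            printf("x = %+.1f  a=0: %.4f  a=3: %.4f\n",
                   x, dsn(x, 0.0), dsn(x, 3.0));
        return 0;
    }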

The integrated nested Laplace approximation (INLA) I

Step I: Explore π̃(θ | y)

  • Locate the mode
  • Use the Hessian to construct new variables
  • Grid search
  • Can be case-specific

The integrated nested Laplace approximation (INLA) II

Step II: For each θj,

  • for each i, evaluate the Laplace approximation for selected values of xi
  • build a skew-normal or log-spline corrected Gaussian,

N(xi; µi, σi²) × exp(spline),

to represent the conditional marginal density.

The integrated nested Laplace approximation (INLA) III

Step III: Sum out θj

  • For each i, sum out θ:

π̃(xi | y) ∝ ∑j π̃(xi | y, θj) × π̃(θj | y)

  • Build a log-spline corrected Gaussian,

N(xi; µi, σi²) × exp(spline),

to represent π̃(xi | y).
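
A sketch of Step III in C (all numbers are made-up placeholders): given integration points θj with normalised weights wj ∝ π̃(θj | y) and, for one fixed i, Gaussian conditionals N(µj, σj²), the posterior marginal is evaluated as the finite mixture below; the log-spline correction is omitted here.

    #include <stdio.h>
    #include <math.h>

    static double dnorm(double x, double mu, double sd)
    {
        double z = (x - mu) / sd;
        return exp(-0.5 * z * z) / (sd * sqrt(2.0 * M_PI));
    }

    int main(void)
    {
        /* hypothetical integration points theta_j */
        double w[3]  = {0.2, 0.5, 0.3};  /* normalised weights pi~(theta_j | y) */
        double mu[3] = {0.9, 1.0, 1.2};  /* means of pi~(x_i | y, theta_j)      */
        double sd[3] = {0.4, 0.5, 0.7};

        /* pi~(x_i | y) = sum_j pi~(x_i | y, theta_j) * w_j */
        for (double x = -1.0; x <= 3.01; x += 0.5) {
            double dens = 0.0;
            for (int j = 0; j < 3; j++)
                dens += w[j] * dnorm(x, mu[j], sd[j]);
            printf("x = %+.2f  dens = %.4f\n", x, dens);
        }
        return 0;
    }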


Computing posterior marginals for θj (I)

Main idea

  • Use the integration points and build an interpolant
  • Use numerical integration on that interpolant

Computing posterior marginals for θj (II)

Practical approach (high accuracy)

  • Rerun using a fine integration grid
  • Possibly with no rotation
  • Just sum up at grid points, then interpolate
Practical approach (lower accuracy)

  • Use the Gaussian approximation at the mode θ∗
  • ...BUT, adjust the standard deviation in each direction
  • Then use numerical integration

[Figure: dnorm(x)/dnorm(0) against x]


How can we assess the error in the approximations?

Tool 1: Compare a sequence of improved approximations:

  1. Gaussian approximation
  2. Simplified Laplace
  3. Laplace

Tool 2: Estimate the error using Monte Carlo:

πu(θ | y) / π̃(θ | y) ∝ E_{π̃G} [ exp{ r(x; θ, y) } ]

where r(·) is the sum of the log-likelihood terms minus their second-order Taylor expansion.


Tool 3: Estimate the “effective” number of parameters, as defined for the Deviance Information Criterion,

pD(θ) = Mean(D(x; θ)) − D(Mean(x); θ),

and compare it with the number of observations; a low ratio is good. This criterion has theoretical justification.


Stochastic Volatility model

[Figure: time series of length 1000]

Log of the daily difference of the pound-dollar exchange rate from October 1st, 1981, to June 28th, 1985.


Stochastic Volatility model

A simple model:

xt | x1, . . . , xt−1, τ, φ ∼ N(φ xt−1, 1/τ),

where |φ| < 1 to ensure a stationary process. The observations are taken to be

yt | x1, . . . , xt, µ ∼ N(0, exp(µ + xt))
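
To make the model concrete, a small C simulation sketch (not from the talk; φ, τ and µ are illustrative values): the latent log-volatility follows the AR(1) above, and each observation is mean-zero Gaussian with variance exp(µ + xt).

    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>

    static double randn(void)   /* standard normal via Box-Muller */
    {
        double u1 = (rand() + 1.0) / (RAND_MAX + 2.0);
        double u2 = (rand() + 1.0) / (RAND_MAX + 2.0);
        return sqrt(-2.0 * log(u1)) * cos(2.0 * M_PI * u2);
    }

    int main(void)
    {
        int T = 1000;
        double phi = 0.95, tau = 10.0, mu = -1.0;  /* assumed parameters */
        srand(42);

        /* start from the stationary distribution N(0, 1/(tau*(1 - phi^2))) */
        double x = randn() / sqrt(tau * (1.0 - phi * phi));

        for (int t = 0; t < T; t++) {
            x = phi * x + randn() / sqrt(tau);        /* x_t | x_{t-1}            */
            double y = exp(0.5 * (mu + x)) * randn(); /* y_t ~ N(0, exp(mu + x_t)) */
            if (t < 5)
                printf("t = %d  x = %+.3f  y = %+.3f\n", t, x, y);
        }
        return 0;
    }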


Results

Using only the first 50 data points, which makes the problem much harder.


[Figure: posterior marginal for ν = logit(2φ − 1)]


[Figure: posterior marginal for log(κx)]


Using the full dataset

[Figure: predictions for µ + xt+k]


Student-tν

[Figure: posterior marginal for ν]


Epil-example from Win/Open-BUGS


[Figure: marginals for a0]


[Figure: marginals for τb1]


Log-Gaussian Cox process

[Figure: locations of trees of a particular type on a 200 × 100 grid]

The data come from a 50-hectare permanent tree plot, established in 1980 in the tropical moist forest of Barro Colorado Island, in Gatun Lake in central Panama.


[Figure: covariate, altitude]


[Figure: covariate, norm of gradient]


Model

Model for the log-density at each “pixel” of a 200 × 100 lattice:

ηi = β0 + β1 c1i + β2 c2i + ui + vi,   with ∑i ui = 0.

The spatial term u is an IGMRF; in the interior, its full conditionals are given by the second-order 5 × 5 stencil,

E(ui | u−i) = (1/20) × [ 8 × (the four nearest neighbours) − 2 × (the four diagonal neighbours) − 1 × (the four second-order neighbours) ]

Prec(ui | u−i) = 20 κu

Results

[Figure: posterior expectation of the spatial field]


[Figure: locations with high KLD]


[Figure: effect of altitude]


[Figure: effect of the norm of the gradient]


Extensions

  • Model choice/selection
  • Automatic detection of “surprising” observations

Will not discuss:

  • High(er) number of hyperparameters
  • Parallel computing using OpenMP
  • Sensitivity analysis

Model choice

Choosing and comparing models is important but difficult:

  • Bayes factors (generally available)
  • Deviance information criterion (DIC) (for hierarchical models)

Marginal likelihood

The marginal likelihood is the normalising constant for π(θ | y):

π̃(y) = ∫ π(θ) π(x | θ) π(y | x, θ) / πG(x | θ, y) |_{x = x⋆(θ)} dθ.   (2)

In many hierarchical GMRF models the prior is intrinsic/improper, so this is difficult to use.


Deviance Information Criterion

Based on the deviance

D(x; θ) = −2 ∑i log π(yi | xi, θ),

we have

DIC = 2 × Mean(D(x; θ)) − D(Mean(x); θ∗).

This is quite easy to compute.
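
A hedged sketch of the computation in C (not the talk's implementation; the posterior draws are fabricated placeholders): DIC needs the posterior mean of the deviance and the deviance at the posterior mean, here for a single Poisson observation with log link.

    #include <stdio.h>
    #include <math.h>

    /* Poisson deviance contribution -2 log pi(y | x), log link */
    static double dev(double y, double x)
    {
        return -2.0 * (y * x - exp(x) - lgamma(y + 1.0));
    }

    int main(void)
    {
        double y = 4.0;                                /* observed count       */
        double draws[5] = {1.2, 1.4, 1.3, 1.5, 1.35};  /* fake posterior draws */
        int S = 5;

        double mean_dev = 0.0, mean_x = 0.0;
        for (int s = 0; s < S; s++) {
            mean_dev += dev(y, draws[s]) / S;  /* Mean(D(x; theta)) */
            mean_x   += draws[s] / S;
        }
        double dic = 2.0 * mean_dev - dev(y, mean_x);  /* 2 Mean(D) - D(Mean(x)) */
        double pD  = mean_dev - dev(y, mean_x);        /* effective #parameters  */
        printf("DIC = %.4f  pD = %.4f\n", dic, pD);
        return 0;
    }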


Bayesian Cross-validation

Easy to compute using the INLA approach:

π(yi | y−i) = ∫θ [ ∫xi π(yi | xi, θ) π(xi | y−i, θ) dxi ] π(θ | y−i) dθ

where

π(xi | y−i, θ) ∝ π(xi | y, θ) / π(yi | xi, θ).

This requires only a one-dimensional integral for each i and θ.


Automatic detection of “surprising” observations

Compute

Prob(yi^new ≤ yi | y−i)

and look for unusually large or small values.


Summary and discussion

  • Latent Gaussian models are an important class of models with a wide range of applications!
  • The integrated nested Laplace approximation works extremely well, way beyond my expectations!!!
    • In practice, “exact” results
    • Relative error only
    • Computationally FAST, even for large n
    • Takes advantage of multicore architectures using OpenMP
  • Extensions
    • Compare models (DIC/Bayes factors)
    • Cross-validation and “surprising” observations
    • High(er) number of hyperparameters
    • Sensitivity analysis

High(er) number of hyperparameters

Numerical (grid) integration is costly: it needs at least 3^dim(θ) integration points. We need another approach for “high-dimensional” hyperparameters.


Borrow ideas from experimental design...

www.wikipedia.org: In statistics, a central composite design is an experimental design, useful in response surface methodology, for building a second order (quadratic) model for the response variable without needing to use a complete three-level factorial experiment.

Idea

[Figure]

Number of integration points

Dimension    #Int.pts CCD    #Int.pts GRID
 2               9                 8
 3              15                27
 4              25                64
 5              27               125
 6              45               216
 7              79               343
 8              81               512
 9             147               729
10             149              1000
14             285              2744
18             549              5832
22            1069             10648


Experience so far

  • Works quite well
  • The integration problem is well-behaved.

Parallel computing using OpenMP

Why?

  • Speed (primary)
  • Ability to run larger models (secondary)

Why are so few doing this?

  • (Seemingly) difficult
  • Better to wait more than to code more
  • Lack of local parallel machines.

Result

The gain/pain ratio is simply too low! But there is hope, due to

  • new trends in computing
  • the inclusion of parallel tools in mainstream compilers

Trends in computing

Once upon a time, chip makers made computer chips faster every year by increasing their processing speeds. But lately, the microprocessor industry has run into some fundamental limits to those speeds.


The latest solution: Design chips with multiple processor cores.


The result: Today’s big-brained chips that can do more processing than ever before, if the software is modified to take advantage of their design.


Parallel machines are now everywhere...


How to make use of multicore machines?

May 13, 2007, GCC 4.2 release series:

“OpenMP is now supported for the C, C++ and Fortran compilers.”


OpenMP: coding

  • An easy way to parallelise code
  • Start with a serial version
  • Parallelise parts of the code when you have time
  • Will still run on a serial machine
  • Very little interference with the code itself, mainly compiler directives


OpenMP: running

  • Just run the program; the run-time environment will take care of the rest.
  • This includes how many CPUs are used at any time.
  • This will change during the execution of the program.
slide-146
SLIDE 146

Approximative Bayesian inference Bonus Parallel computing using OpenMP

Example from GMRFLib

    #pragma omp parallel for private(i)
    for (i = 0; i < n; i++) {
        GMRFLib_2order_approx(NULL, &bb[i], &cc[i], d[i], mode[i], i, mode,
                              loglFunc, loglFunc_arg,
                              &(blockupdate_par->step_len));
        cc[i] = MAX(0.0, cc[i]);
    }


GMRFLib

  • INLA-routines make quite good use of OpenMP
  • and so does the inla-program.

Spatial GLMs (w/S.Martino/J.Eidsvik)

Model

  • Stationary Gaussian field on a torus
  • Non-Gaussian observations
  • n is huge: n = 512² or n = 1024²
  • The number of observations, m, is small: a few hundred.

Solve using

  • INLA, but the computational tools are now very different:
  • exploit the block-Toeplitz structure using DFTs,
  • and simply rank-m correct for the observations using soft constraints.


Example: Rongelap data

[Figure: Rongelap Island with 157 measurement locations, East (m) vs North (m)]


Example: Rongelap data, results

[Figure: marginal predictions, East (m) vs North (m)]


[Figure: marginal predicted standard deviations, East (m) vs North (m)]


Spatial GLMs: Summary

  • The main interest is to predict unobserved sites
  • Gaussian approximations seem sufficient
  • they are O(m) times faster to compute...
  • Can also use GMRFs for large m, using GMRF proxies for Gaussian fields
