Predictive Coarse-Graining. M. Schöberl (TUM), N. Zabaras (Warwick/TUM-IAS), P.S. Koutsourelakis (TUM). Predictive Multiscale Materials Modelling, Turing Gateway to Mathematics, Isaac Newton Institute for Mathematical Sciences, December 1, 2015.


SLIDE 1

Predictive Coarse-Graining

  • M. Schöberl (TUM), N. Zabaras (Warwick/TUM-IAS), P.S. Koutsourelakis (TUM)

Predictive Multiscale Materials Modelling, Turing Gateway to Mathematics, Isaac Newton Institute for Mathematical Sciences

December 1, 2015

p.s.koutsourelakis@tum.de Predictive Coarse-Graining 1 / 32



SLIDE 4

Problem Definition - Equilibrium Statistical Mechanics

Fine scale

p_f(x) \propto e^{-\beta V_f(x)}

  • x: fine-scale dofs
  • V_f(x): atomistic potential
  • Observables: E_{p_f}[a] = \int a(x)\, p_f(x)\, dx

Coarse scale

X = R(x), \quad \dim(X) \ll \dim(x)

  • X: coarse-scale dofs
  • R: restriction operator (fine → coarse)

Goal: How can one simulate X and still predict E_{p_f}[a]?

SLIDE 5

Problem Definition - Equilibrium Statistical Mechanics

  • Suppose the observable of interest a(x) depends on x only through X, i.e.:

a(x) = A(X) = A(R(x))

  • Then:

E_{p_f}[a] = \int a(x)\, p_f(x)\, dx
           = \int A(R(x))\, p_f(x)\, dx
           = \int \left[ \int A(X)\, \delta(X - R(x))\, dX \right] p_f(x)\, dx
           = \int A(X) \left[ \int \delta(X - R(x))\, p_f(x)\, dx \right] dX
           = \int A(X)\, p_c(X)\, dX

where p_c is the (marginal) PDF of the CG variables X:

p_c(X) = \int \delta(X - R(x))\, p_f(x)\, dx \propto e^{-\beta V_c(X)}

and the CG potential V_c(X) is the potential of mean force (PMF) w.r.t. X:

V_c(X) = -\beta^{-1} \log \int \delta(X - R(x))\, p_f(x)\, dx
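As a numerical sanity check of the identity E_{p_f}[a] = \int A(X) p_c(X) dX, here is a minimal Monte Carlo sketch. It is not from the talk: it uses a hypothetical two-dimensional Gaussian fine-scale model with R(x) = x_1 and A(X) = X^2, for which the coarse marginal p_c happens to be a standard normal.

```python
import random

random.seed(0)
N = 200_000

def sample_fine():
    # p_f: standard bivariate Gaussian over x = (x1, x2)
    return (random.gauss(0, 1), random.gauss(0, 1))

R = lambda x: x[0]       # restriction operator (fine -> coarse)
A = lambda X: X * X      # observable as a function of the coarse variable

# Estimate E_{p_f}[a] by simulating the full fine scale ...
est_fine = sum(A(R(sample_fine())) for _ in range(N)) / N

# ... and by simulating only the coarse marginal p_c (standard normal here)
est_coarse = sum(A(random.gauss(0, 1)) for _ in range(N)) / N

print(est_fine, est_coarse)  # both estimate E_{p_f}[a] = 1
```

Both estimators target the same expectation, which is the point of the derivation: once p_c is available, the fine-scale degrees of freedom never need to be simulated.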

SLIDE 6

Existing Methods

  • Free-energy methods (for low-dimensional X) [Lelièvre et al. 2010]
  • Lattice systems [Katsoulakis et al. 2003], Soft matter [Peter & Kremer 2010]
  • Inversion-based methods: Iterative Boltzmann Inversion [Reith et al. 2003], Inverse Monte Carlo [Lyubartsev & Laaksonen 1995, Soper 1996], Molecular RG-CG [Savelyev & Papoian 2009]
  • Variational methods: Multiscale CG [Izvekov et al. 2005, Noid et al. 2007], Relative Entropy [Shell 2008], Ultra-Coarse-Graining [Dama et al. 2013]

SLIDE 7

Motivation

  • What are good coarse-grained variables X (how many, and what is the fine-to-coarse mapping R)?
  • What is the right CG potential (or CG model)?
  • How much information is lost during coarse-graining, and how does this affect predictive uncertainty?
  • Given finite simulation data at the fine scale, how (un)certain can we be in our predictions?
  • Can one use the same CG variables X to make predictions about observables a(x) ≠ A(X)?

SLIDE 8

Motivation

Existing methods (fine → coarse):

p_f(x)  --[ R(x) = X ]-->  \bar{p}_c(X)

Proposed (generative model, coarse → fine):

p_c(X)  --[ p_{cf}(x|X) ]-->  \bar{p}_f(x) = \int p_{cf}(x|X)\, p_c(X)\, dX

Notes

  • No restriction operator (no fine-to-coarse map R(x) = X) is needed.
  • A probabilistic coarse-to-fine map p_{cf}(x|X) is prescribed instead.
  • The coarse model p_c(X) is not the marginal of X (given R(x) = X).

SLIDE 9

Motivation

Relative Entropy CG [Shell 2008]

  • A fine → coarse map R and an (approximate) CG density \bar{p}_c(X) imply:

\bar{p}_f(x) = \frac{\bar{p}_c(R(x))}{\Omega(R(x))}, \quad \text{where} \quad \Omega(X) = \int \delta(X - R(x))\, dx

  • Find \bar{p}_c(X) that minimizes the KL divergence between p_f(x) (exact) and \bar{p}_f(x) (approximate):

KL(p_f(x) \,||\, \bar{p}_f(x)) = \underbrace{KL(p_c(X) \,||\, \bar{p}_c(X))}_{\text{inf. loss due to error in PMF}} + \underbrace{S_{map}(R)}_{\text{inf. loss due to map } R}

where S_{map} = \int p_f(x) \log \Omega(R(x))\, dx

SLIDE 10

Motivation

Proposed Probabilistic Generative Model

  • A probabilistic coarse → fine map p_{cf}(x|X) and a CG model p_c(X) imply:

\bar{p}_f(x) = \int p_{cf}(x|X)\, p_c(X)\, dX

  • Find p_{cf}(x|X) and p_c(X) that minimize:

KL(p_f(x) \,||\, \bar{p}_f(x)) = -\int p_f(x) \log \frac{\int p_{cf}(x|X)\, p_c(X)\, dX}{p_f(x)}\, dx

SLIDE 11

Learning

Proposed Probabilistic Generative Model

  • Parametrize:

p_c(X|\theta_c) (coarse model), \quad p_{cf}(x|X, \theta_{cf}) (coarse → fine map)

  • Optimize:

\min_{\theta_c, \theta_{cf}} KL(p_f(x) \,||\, \bar{p}_f(x|\theta_c, \theta_{cf}))
↔ \min_{\theta_c, \theta_{cf}} -\int p_f(x) \log \frac{\int p_{cf}(x|X, \theta_{cf})\, p_c(X|\theta_c)\, dX}{p_f(x)}\, dx
↔ \max_{\theta_c, \theta_{cf}} \int p_f(x) \log \left[ \int p_{cf}(x|X, \theta_{cf})\, p_c(X|\theta_c)\, dX \right] dx
↔ \max_{\theta_c, \theta_{cf}} \sum_{i=1}^{N} \log \int p_{cf}(x^{(i)}|X, \theta_{cf})\, p_c(X|\theta_c)\, dX
↔ \max_{\theta_c, \theta_{cf}} L(\theta_c, \theta_{cf})   (MLE)

  • MAP estimate: \max_{\theta_c, \theta_{cf}} L(\theta_c, \theta_{cf}) + \log p(\theta_c, \theta_{cf})   (log-prior)
  • Fully Bayesian, i.e. posterior: p(\theta_c, \theta_{cf} | x^{(1:N)}) \propto \exp\{L(\theta_c, \theta_{cf})\}\, p(\theta_c, \theta_{cf})




SLIDE 15

Prediction

Probabilistic Prediction

  • For an observable a(x) (reconstruction, [Katsoulakis et al. 2006, Trashorras et al. 2010]):

E_{p_f}[a] ≈ E_{\bar{p}_f}[a \,|\, \underbrace{x^{(1:N)}}_{\text{data}}]
= \int a(x)\, \bar{p}_f(x | x^{(1:N)})\, dx
= \int a(x)\, \bar{p}_f(x, \theta_c, \theta_{cf} | x^{(1:N)})\, d\theta_c\, d\theta_{cf}\, dx
= \int a(x)\, \bar{p}_f(x | \theta_c, \theta_{cf})\, p(\theta_c, \theta_{cf} | x^{(1:N)})\, d\theta_c\, d\theta_{cf}\, dx
= \int a(x) \left[ \int p_{cf}(x|X, \theta_{cf})\, p_c(X|\theta_c)\, dX \right] p(\theta_c, \theta_{cf} | x^{(1:N)})\, d\theta_c\, d\theta_{cf}\, dx
= \int \underbrace{\left[ \int a(x)\, p_{cf}(x|X, \theta_{cf})\, p_c(X|\theta_c)\, dX\, dx \right]}_{\hat{a}(\theta_c, \theta_{cf})} \underbrace{p(\theta_c, \theta_{cf} | x^{(1:N)})}_{\text{posterior}}\, d\theta_c\, d\theta_{cf}

  • For each (\theta_c, \theta_{cf}) drawn from the posterior, one gets an estimate \hat{a} of the observable.
  • Not just point estimates anymore, but whole distributions!
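The last step can be sketched in a few lines: given draws (\theta_c, \theta_{cf}) from the posterior, each yields an estimate \hat{a}(\theta), and the collection is summarized by a predictive mean and central credible intervals. The helper name `predictive_summary` and the stand-in posterior draws below are illustrative, not from the talk.

```python
def predictive_summary(a_hat, posterior_samples, levels=(0.95, 0.99)):
    """Posterior mean and central credible intervals of a_hat(theta)."""
    vals = sorted(a_hat(th) for th in posterior_samples)
    n = len(vals)
    summary = {"mean": sum(vals) / n}
    for lv in levels:
        lo = vals[int((1 - lv) / 2 * (n - 1))]   # lower empirical quantile
        hi = vals[int((1 + lv) / 2 * (n - 1))]   # upper empirical quantile
        summary[lv] = (lo, hi)
    return summary

# Stand-in posterior draws and observable map (illustrative only)
thetas = [i / 100 for i in range(101)]
summary = predictive_summary(lambda t: t, thetas)
```

This is exactly how the confidence bands in the later prediction figures would be produced from posterior samples.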


SLIDE 17

Learning/Inference

MCMC-SA (Expectation-Maximization) [Gu & Kong 1998]

L(\theta_c, \theta_{cf}) = \sum_{i=1}^{N} \log \int p_{cf}(x^{(i)} | X^{(i)}, \theta_{cf})\, p_c(X^{(i)} | \theta_c)\, dX^{(i)}
= \sum_{i=1}^{N} \log \int q(X^{(i)}) \frac{p_{cf}(x^{(i)} | X^{(i)}, \theta_{cf})\, p_c(X^{(i)} | \theta_c)}{q(X^{(i)})}\, dX^{(i)}
≥ \sum_{i=1}^{N} \int q(X^{(i)}) \log \frac{p_{cf}(x^{(i)} | X^{(i)}, \theta_{cf})\, p_c(X^{(i)} | \theta_c)}{q(X^{(i)})}\, dX^{(i)}
= \sum_{i=1}^{N} F(q(X^{(i)}), \theta_c, \theta_{cf})

(Graphical model: parameters \theta_c, \theta_{cf}; latent X^{(i)}; data x^{(i)}, i = 1, ..., N)

  • E-step (given (\theta_c, \theta_{cf})): sample each X^{(i)} from q^{opt}(X^{(i)}):

q^{opt}(X^{(i)}) \propto p_{cf}(x^{(i)} | X^{(i)}, \theta_{cf})\, p_c(X^{(i)} | \theta_c)

  • M-step: compute the gradients \sum_{i=1}^{N} \nabla_{\theta_c} F, \sum_{i=1}^{N} \nabla_{\theta_{cf}} F (and Hessians) and update (\theta_c, \theta_{cf}).

Robbins-Monro: \theta_{t+1} = \theta_t + \alpha_t \nabla_\theta F, \quad \sum_t \alpha_t = \infty, \quad \sum_t \alpha_t^2 < \infty
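The E/M loop above can be sketched on a toy binary model, not the implementation from the talk: one coarse spin X ∈ {−1, +1} per datum with p_c(X|\theta_c) ∝ e^{\theta_c X}, S fine spins that each copy X with probability \theta_{cf} (held fixed here so only \theta_c is learned), an exact E-step (X has only two states, so no MCMC is needed), and a Robbins-Monro update with decaying step size. All names and the specific setup are illustrative.

```python
import math
import random

random.seed(1)
theta_true, theta_cf, S, N = 0.5, 0.8, 8, 400

def draw_datum():
    # X ~ p_c(X|theta_true), then S children that copy X w.p. theta_cf
    X = 1 if random.random() < math.exp(theta_true) / (2 * math.cosh(theta_true)) else -1
    return [X if random.random() < theta_cf else -X for _ in range(S)]

data = [draw_datum() for _ in range(N)]

# q_opt(X) ∝ p_cf(x^(i)|X, theta_cf) p_c(X|theta_c) is a two-point distribution,
# so its mean is available in closed form: <X>_q = tanh(theta_c + h_i).
log_odds = 0.5 * math.log(theta_cf / (1 - theta_cf))
def post_mean(x, theta_c):
    return math.tanh(theta_c + log_odds * sum(x))

# Robbins-Monro ascent on (1/N) sum_i grad_{theta_c} F
theta = 0.0
for t in range(500):
    m_bar = sum(post_mean(x, theta) for x in data) / N   # E-step (exact)
    grad = m_bar - math.tanh(theta)                      # moment-matching gradient
    theta += grad / (1 + t) ** 0.6                       # decaying step size alpha_t

print(theta)  # close to theta_true = 0.5
```

At convergence the stationarity condition holds: the posterior-averaged coarse moment matches the model moment tanh(\theta_c), which is the \theta_c-gradient formula on the next slide specialized to \phi(X) = X.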

SLIDE 18

Learning/Inference

  • For exponential-family distributions:

p_c(X | \theta_c) = \exp\{\theta_c^T \phi(X) - A(\theta_c)\}, \quad e^{A(\theta_c)} = \int e^{\theta_c^T \phi(X)}\, dX
p_{cf}(x | X, \theta_{cf}) = \exp\{\theta_{cf}^T \psi(x, X) - B(X, \theta_{cf})\}, \quad e^{B(X, \theta_{cf})} = \int e^{\theta_{cf}^T \psi(x, X)}\, dx

  • Gradients:

\sum_{i=1}^{N} \nabla_{\theta_c} F = \sum_{i=1}^{N} \langle \phi(X^{(i)}) \rangle_{q(X^{(i)})} - N \langle \phi(X) \rangle_{p_c(X|\theta_c)}

(compare Relative Entropy [Shell 2008]: \nabla_{\theta_c} KL = \sum_{i=1}^{N} \phi(R(x^{(i)})) - N \langle \phi(X) \rangle_{p_c(X|\theta_c)})

\sum_{i=1}^{N} \nabla_{\theta_{cf}} F = \sum_{i=1}^{N} \left( \langle \psi(x^{(i)}, X^{(i)}) \rangle_{q(X^{(i)})} - \langle \psi(x, X^{(i)}) \rangle_{p_{cf}(x|X^{(i)}, \theta_{cf})\, q(X^{(i)})} \right)

  • Hessians:

\sum_{i=1}^{N} \nabla^2_{\theta_c} F = -N\, \mathrm{Cov}_{p_c(X|\theta_c)}[\phi(X)]
\sum_{i=1}^{N} \nabla^2_{\theta_{cf}} F = -\sum_{i=1}^{N} \mathrm{Cov}_{p_{cf}(x|X^{(i)}, \theta_{cf})\, q(X^{(i)})}[\psi(x, X^{(i)})]

→ F is concave in (\theta_c, \theta_{cf}).
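The first gradient formula is a moment-matching rule: sum of posterior feature averages minus N times the model feature average. A sketch with hypothetical helper names; the sample lists would come from the E-step and from simulating p_c(X|\theta_c):

```python
def grad_theta_c(phi, posterior_samples, model_samples):
    """sum_i <phi(X^(i))>_{q(X^(i))} - N * <phi(X)>_{p_c(X|theta_c)}.

    posterior_samples: per data point i, a list of X-samples from q(X^(i))
    model_samples:     X-samples from the current coarse model p_c(X|theta_c)
    """
    d = len(phi(model_samples[0]))
    grad = [0.0] * d
    for samples in posterior_samples:            # sum over data points
        for j in range(d):
            grad[j] += sum(phi(X)[j] for X in samples) / len(samples)
    n_data = len(posterior_samples)
    for j in range(d):
        grad[j] -= n_data * sum(phi(X)[j] for X in model_samples) / len(model_samples)
    return grad

# Example features for a small spin chain: phi(X) = (sum_i X_i, sum_i X_i X_{i+1})
phi = lambda X: [sum(X), sum(a * b for a, b in zip(X, X[1:]))]
```

At the optimum the two moment sets agree and the gradient vanishes, which is what makes the update a moment-matching condition.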


SLIDE 20

Learning/Inference

  • MAP estimates:

\max_{\theta_c, \theta_{cf}} L(\theta_c, \theta_{cf}) + \log p(\theta_c, \theta_{cf})   (log-prior)

  • Approximate the Bayesian posterior using the Laplace approximation:

p(\theta_{cf} | x^{(1:N)}) ≈ N(\mu, S), where:

  • \mu = \theta_{cf, MAP}
  • S^{-1} = -\sum_{i=1}^{N} \nabla^2_{\theta_{cf}} F

Figure: Laplace approximation (exact posterior vs. Gaussian)

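A minimal one-dimensional sketch of the Laplace approximation: Newton ascent to the MAP, then the negative inverse curvature gives the Gaussian variance. The Gamma-shaped log-posterior is a hypothetical stand-in, not a posterior from the talk.

```python
import math

def laplace_1d(log_post, x0, iters=50, h=1e-5):
    """Fit N(mu, S): mu = argmax log_post (Newton), S = -1 / log_post''(mu)."""
    x = x0
    for _ in range(iters):
        d1 = (log_post(x + h) - log_post(x - h)) / (2 * h)          # numeric 1st deriv
        d2 = (log_post(x + h) - 2 * log_post(x) + log_post(x - h)) / h ** 2  # 2nd deriv
        x -= d1 / d2                                                # Newton step (d2 < 0 near a mode)
    return x, -1.0 / d2

# Stand-in log-posterior a*log(t) - b*t: mode at a/b, curvature -a/t^2 there
a, b = 5.0, 2.0
mu, S = laplace_1d(lambda t: a * math.log(t) - b * t, x0=1.0)
print(mu, S)  # approx 2.5 and 1.25
```

In higher dimensions \mu is the MAP vector and S^{-1} is the negative Hessian, matching S^{-1} = -\sum_i \nabla^2_{\theta_{cf}} F on the slide.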

SLIDE 22

Ising Model - Fine Scale

Fine-scale variables x_i ∈ {−1, +1} following p_f(x) \propto e^{-\beta V_f(x)}:

Fine-scale potential

V_f(x) = -\frac{1}{2} \sum_{k=1}^{L_f} J_k \sum_{|i-j|=k} x_i x_j - \mu \sum_{i=1}^{n_f} x_i

with i, j ∈ {1, ..., n_f}, where n_f is the number of lattice sites.

  • Interactions of at most L_f sites apart are included in the potential.
  • |i − j| = k: neighbors k sites apart.
  • J_k: strength of the k-th interaction.

The J_k follow a power law for a given overall strength J_0 and exponent a,

J_k = \frac{K}{L k^a}, \quad \text{with} \quad K = \frac{J_0 L^{1-a}}{\sum_{k=1}^{L} k^{-a}}

in order to normalize the interaction strength [Katsoulakis et al. 2007].
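The fine-scale potential and the power-law couplings translate directly into code. A sketch only: free boundary conditions on a 1-D chain are assumed, and the normalization constant K follows the formula reconstructed above.

```python
def power_law_couplings(J0, a, L):
    """J_k = K / (L * k^a), with K chosen so the overall strength is set by J0."""
    K = J0 * L ** (1 - a) / sum(k ** (-a) for k in range(1, L + 1))
    return [K / (L * k ** a) for k in range(1, L + 1)]

def V_f(x, J, mu):
    """V_f(x) = -1/2 * sum_k J_k * sum_{|i-j|=k} x_i x_j - mu * sum_i x_i."""
    nf, Lf = len(x), len(J)
    pair = sum(J[k - 1] * sum(x[i] * x[i + k] for i in range(nf - k))
               for k in range(1, Lf + 1))
    return -0.5 * pair - mu * sum(x)
```

For example, with x = (+1, −1, +1), a single nearest-neighbour coupling J_1 = 1 and \mu = 0, both bonds are unsatisfied and V_f = +1.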

SLIDE 23

Ising Model - Coarse → Fine Map

Coarse-to-fine mapping p_{cf}(x | X, \theta_{cf}):

p_{cf}(x | X, \theta_{cf}) = \prod_{\text{parent } r} \prod_{\text{child } s} \theta_{cf}^{(1 + x_{r,s} X_r)/2}\, (1 - \theta_{cf})^{(1 - x_{r,s} X_r)/2}

Figure: Probabilistic coarse → fine map
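Sampling from this map is a per-child Bernoulli draw: each fine spin x_{r,s} copies its coarse parent X_r with probability \theta_{cf} and is flipped otherwise. A sketch with hypothetical names:

```python
import random

def sample_coarse_to_fine(X, S, theta_cf, rng):
    """Draw x ~ p_cf(x|X, theta_cf): child s of parent r equals X_r w.p. theta_cf."""
    return [Xr if rng.random() < theta_cf else -Xr
            for Xr in X          # loop over coarse parents
            for _ in range(S)]   # S children per parent
```

With \theta_{cf} = 1 every child copies its parent exactly, so the map degenerates to deterministic block replication.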

SLIDE 24

Illustration


SLIDE 25

Overview of Results

  • Comparison with Relative Entropy at various µ (magnetization)
  • Probabilistic predictions with various amounts of data
  • Probabilistic predictions at various levels of coarse-graining dim(x)/dim(X)
  • Model selection

SLIDE 26

Comparison

Figure: \theta_c    Figure: \theta_{cf}

V_f(x) = -\frac{1}{2} \sum_{k=1}^{L_f} J_k \sum_{|i-j|=k} x_i x_j - \mu \sum_{i=1}^{n_f} x_i

SLIDE 27

Comparison of Predicted Magnetization

Figure: Predicted magnetization ⟨m(µ)⟩ with Relative Entropy (relEntr) and the proposed method (predCg)

Fine-to-coarse map in Relative Entropy:

X_r = \begin{cases} +1, & \frac{1}{S} \sum_{s} x_{r,s} > 0 \\ -1, & \frac{1}{S} \sum_{s} x_{r,s} < 0 \\ U\{-1, +1\}, & \text{otherwise} \end{cases}
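The majority-rule fine-to-coarse map used in the Relative Entropy comparison can be sketched as follows (hypothetical names; ties are resolved uniformly in {−1, +1}):

```python
import random

def R_majority(x, S, rng):
    """X_r = sign of block r's average over its S children; ties broken uniformly."""
    X = []
    for r in range(0, len(x), S):
        m = sum(x[r:r + S])   # block sum has the same sign as the block average
        X.append(1 if m > 0 else -1 if m < 0 else rng.choice([-1, 1]))
    return X
```

Note this map is many-to-one and lossy, which is exactly the S_map information loss the proposed generative formulation avoids by never requiring a restriction operator.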

SLIDE 28

Probabilistic Predictions

Figure: Probabilistic predictions of m(µ) for N = 5 data (posterior mean, 95% and 99% confidence intervals, truth, and data)

SLIDE 29

Probabilistic Predictions

Figure: Probabilistic predictions of m(µ) for N = 10 data (posterior mean, 95% and 99% confidence intervals, truth, and data)

SLIDE 30

Probabilistic Predictions

Figure: Probabilistic predictions of m(µ) for N = 20 data (posterior mean, 95% and 99% confidence intervals, truth, and data)

SLIDE 31

Probabilistic Predictions

Figure: Probabilistic predictions of m(µ) for N = 50 data (posterior mean, 95% and 99% confidence intervals, truth, and data)

SLIDE 32

Probabilistic Predictions

Figure: Error in magnetization as a function of training data N
Figure: Error in correlation as a function of training data N

SLIDE 33

Effect of Coarse-Graining Level

Figure: Probabilistic predictions of m(µ) for dim(x)/dim(X) = 2 (95% and 99% confidence intervals, truth, and data)

SLIDE 34

Effect of Coarse-Graining Level

Figure: Probabilistic predictions of m(µ) for dim(x)/dim(X) = 4 (95% and 99% confidence intervals, truth, and data)

SLIDE 35

Effect of Coarse-Graining Level

Figure: Probabilistic predictions of m(µ) for dim(x)/dim(X) = 8 (95% and 99% confidence intervals, truth, and data)

SLIDE 36

Model Selection

What is the right CG potential V_c(X)?

p_c(X | \theta_c) = \exp\{\theta_c^T \phi(X) - A(\theta_c)\}, \quad V_c(X) = \theta_c^T \phi(X)   (features)

  • The number of candidate features \phi(X) grows exponentially fast.
  • Which features are the most important?
  • Can one search across models?
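For the 1-D spin chain, the second-order interaction features used in the later results (\phi_j(X) = \sum_i X_i X_{i+j}) can be enumerated directly; even this restricted family grows with the maximum lag, and higher-order products would multiply the count further. A sketch, with free boundary conditions assumed:

```python
def phi_features(X, max_lag):
    """Second-order features phi_j(X) = sum_i X_i * X_{i+j}, for j = 1..max_lag."""
    n = len(X)
    return [sum(X[i] * X[i + j] for i in range(n - j))
            for j in range(1, max_lag + 1)]
```

Each entry of the returned vector multiplies one component of \theta_c in V_c(X) = \theta_c^T \phi(X); model selection then amounts to deciding which of these entries to keep.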

SLIDE 37

Model Selection

  • Sparsity-enforcing hierarchical priors (ARD, [MacKay 1994]):

p(\theta_c | \tau) = \prod_j p(\theta_{c,j} | \tau_j), \quad \theta_{c,j} \sim N(0, \tau_j^{-1}), \quad \tau_j \sim \text{Gamma}(a_0, b_0)

  • As a hyperparameter \tau_j \to \infty, \theta_{c,j} \to 0.
  • Readily integrated in the EM framework:
  • E-step: compute \langle \tau_j \rangle_{p(\tau_j | \theta_{c,j})} = \frac{a_0 + 1/2}{b_0 + \theta_{c,j}^2 / 2}
  • M-step: the log-prior contributes \partial / \partial \theta_{c,j} = -\langle \tau_j \rangle\, \theta_{c,j} to the gradient.

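One ARD E-step and the resulting log-prior contribution to the M-step gradient can be sketched as below (the function name is hypothetical):

```python
def ard_step(theta_c, a0, b0):
    """E-step: <tau_j> = (a0 + 1/2) / (b0 + theta_j^2 / 2).

    Returns <tau> and the log-prior gradient terms -<tau_j> * theta_j
    that are added to the data gradient in the M-step."""
    tau = [(a0 + 0.5) / (b0 + t * t / 2.0) for t in theta_c]
    grad_log_prior = [-tj * t for tj, t in zip(tau, theta_c)]
    return tau, grad_log_prior
```

The mechanism is self-reinforcing: a coefficient near zero receives a large \langle \tau_j \rangle and is pulled further toward zero, while a clearly nonzero coefficient receives a small \langle \tau_j \rangle and is left almost unpenalized.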

SLIDE 39

Model Selection

Figure: \theta_{c,j} over iterations, no ARD (uniform prior)
Figure: \theta_{c,j} over iterations, with ARD prior (legend: \theta_{c,1}, \theta_{c,4})
Figure: 2nd-order interactions - feature functions \phi_j(X) = \sum_i X_i X_{i+j}

SLIDE 40

Model Selection

Figure: Probabilistic predictions with ARD prior for N = 20 data (posterior mean, 95% and 99% confidence intervals, truth, and data)


SLIDE 42

Conclusions

Summary

  • A generative probabilistic model is proposed.
  • It consists of a CG density and a probabilistic coarse → fine map.
  • It can account for information loss due to CG.
  • It can quantify predictive uncertainty in fine-scale observables.
  • It can be used for model selection.

Outlook

  • Explore alternative definitions of the coarse variables X and alternative coarse → fine maps p_{cf}, e.g.:
  • discrete states indicating free-energy wells
  • hierarchical coarse-graining:

\bar{p}_f(x) = \int p_{cf}(x | X_1)\, p_c(X_1 | X_2)\, p_c(X_2 | X_3) \cdots p_c(X_M)\, dX_1 \cdots dX_M

  • Fully Bayesian or Variational Bayesian inference
  • Improvements in learning via advanced sampling (instead of MCMC) and stochastic BFGS [Byrd et al. 2014, Moritz et al. 2015]