Konstantin Turitsyn, T-4 & CNLS, Los Alamos National Laboratory




SLIDE 1

Accelerating MCMC algorithms by breaking detailed balance

Konstantin Turitsyn

T-4 & CNLS, Los Alamos National Laboratory
Landau Institute for Theoretical Physics, Moscow

In collaboration with

Misha Chertkov (LANL)
Marija Vucelja (Weizmann)
http://arXiv.org/abs/0809.0916

Thursday, September 3, 2009

SLIDE 2

MCMC Algorithms

  • Problem: produce samples $x$ from a given distribution $\pi_x$ defined up to a normalization constant.
  • Solution: use a Markovian random walk $x_t$ converging to the required stationary distribution.
  • Detailed balance: $T_{yx}\pi_x = T_{xy}\pi_y$
  • Only local moves are allowed: $x_{t+1}$ is close to $x_t$
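
As a minimal sketch of the recipe above (not code from the talk), a Metropolis random walk on a small ring of states samples from unnormalized weights using only local moves and satisfies detailed balance. The function name `metropolis_chain`, the ring geometry, and the example weights are assumptions chosen for illustration.

```python
import random


def metropolis_chain(weights, n_steps, seed=0):
    """Sample states 0..n-1 with probability proportional to `weights` (all > 0).

    Proposals are local: step to the left or right neighbour on a ring.
    Returns the empirical visit frequencies.
    """
    rng = random.Random(seed)
    n = len(weights)
    x = 0
    counts = [0] * n
    for _ in range(n_steps):
        y = (x + rng.choice((-1, 1))) % n  # symmetric local proposal
        # Metropolis acceptance needs only the ratio of weights,
        # so the normalization constant is never required.
        if rng.random() < min(1.0, weights[y] / weights[x]):
            x = y
        counts[x] += 1
    return [c / n_steps for c in counts]


# Illustrative usage with hypothetical weights.
example_weights = [1.0, 2.0, 4.0, 2.0, 1.0]
print(metropolis_chain(example_weights, 50_000))
```

With enough steps the printed frequencies should approach the normalized weights; how fast they do so is the mixing question the rest of the talk addresses.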

SLIDE 3

Detailed balance

Random walker distribution evolves according to

$$P^{t+1}_x = \sum_y T_{xy} P^t_y$$

Assuming $\sum_x T_{xy} = 1$ one can rewrite the equation as

$$\sum_y \left[ T_{yx} P^{t+1}_x - T_{xy} P^t_y \right] = 0$$

Balance condition: $\forall t: P^t_x = \pi_x$

$$\sum_y \left[ T_{yx}\pi_x - T_{xy}\pi_y \right] = 0$$

Detailed balance (reversibility, equilibrium): $T_{yx}\pi_x = T_{xy}\pi_y$

Sufficient but not necessary!
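
The gap between the balance condition and detailed balance can be checked numerically. The sketch below assumes the convention used above, `T[x][y]` being the probability of moving from `y` to `x`, and a hypothetical biased walk on a three-state ring with uniform $\pi$; the helper names are illustrative.

```python
def ring_chain(n, p_fwd, p_bwd):
    """Transition matrix T[x][y] (from y to x) for a walk on a ring of n >= 3 states."""
    T = [[0.0] * n for _ in range(n)]
    for y in range(n):
        T[(y + 1) % n][y] = p_fwd            # step forward
        T[(y - 1) % n][y] = p_bwd            # step backward
        T[y][y] = 1.0 - p_fwd - p_bwd        # stay put
    return T


def satisfies_balance(T, pi, tol=1e-12):
    """Check sum_y [T_yx pi_x - T_xy pi_y] = 0 for every state x."""
    n = len(pi)
    return all(
        abs(sum(T[y][x] * pi[x] - T[x][y] * pi[y] for y in range(n))) < tol
        for x in range(n)
    )


def satisfies_detailed_balance(T, pi, tol=1e-12):
    """Check T_yx pi_x = T_xy pi_y for every pair of states."""
    n = len(pi)
    return all(
        abs(T[y][x] * pi[x] - T[x][y] * pi[y]) < tol
        for x in range(n)
        for y in range(n)
    )


uniform = [1.0 / 3] * 3
biased = ring_chain(3, 0.6, 0.1)      # net circulation around the ring
symmetric = ring_chain(3, 0.3, 0.3)   # no net circulation
print(satisfies_balance(biased, uniform), satisfies_detailed_balance(biased, uniform))
print(satisfies_balance(symmetric, uniform), satisfies_detailed_balance(symmetric, uniform))
```

The biased walk keeps the uniform distribution stationary even though each pair of states carries a net flow, which is the sense in which detailed balance is sufficient but not necessary.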

SLIDE 4

Loop decomposition

It is useful to consider the ergodic flux matrix $Q_{xy} = T_{xy}\pi_y$.

The asymmetric part of $Q_{xy}$ can be decomposed:

$$Q_{xy} - Q_{yx} = \sum_\alpha J_\alpha \left[ C^\alpha_{xy} - C^\alpha_{yx} \right]$$

  • Here $J_\alpha$ is the amplitude of the probability flow and $C^\alpha_{xy}$ is the adjacency matrix of the loop.

Flow amplitudes are bounded by the reversible part.

Detailed balance = symmetry of ergodic flow: $Q_{xy} = Q_{yx}$
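
For a concrete instance of the decomposition, the sketch below builds the flux matrix of a hypothetical biased walk on a three-state ring with uniform $\pi$ and compares its asymmetric part with a single loop 0 → 1 → 2 → 0. The step probabilities and all names are assumptions for illustration; `T[x][y]` is the probability of moving from `y` to `x`.

```python
def flux_matrix(T, pi):
    """Ergodic flux Q[x][y] = T[x][y] * pi[y], the stationary flow from y to x."""
    n = len(pi)
    return [[T[x][y] * pi[y] for y in range(n)] for x in range(n)]


def antisymmetric_part(Q):
    """Return Q - Q^T, which vanishes exactly when detailed balance holds."""
    n = len(Q)
    return [[Q[x][y] - Q[y][x] for y in range(n)] for x in range(n)]


# Hypothetical ring walk: forward with probability 0.6, backward with 0.1.
n, p_fwd, p_bwd = 3, 0.6, 0.1
T = [[0.0] * n for _ in range(n)]
for y in range(n):
    T[(y + 1) % n][y] = p_fwd
    T[(y - 1) % n][y] = p_bwd
    T[y][y] = 1.0 - p_fwd - p_bwd
pi = [1.0 / n] * n

A = antisymmetric_part(flux_matrix(T, pi))

# One directed loop: C[x][y] = 1 when the loop steps from y to x.
C = [[1 if x == (y + 1) % n else 0 for y in range(n)] for x in range(n)]
J = (p_fwd - p_bwd) / n  # loop amplitude for this example

loop_term = [[J * (C[x][y] - C[y][x]) for y in range(n)] for x in range(n)]
print(A)
print(loop_term)
```

Here one loop is enough; the two printed matrices agree up to rounding.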

SLIDE 5

Physical analogies

  • PDF evolution ⇔ Diffusion-advection of passive scalar
  • Balance condition ⇔ Flow Incompressibility
  • Reversibility ⇔ Diffusion
  • Irreversible motion ⇔ Advection
  • Loops ⇔ Vortices

SLIDE 6

Slow convergence

Several types of distributions are characterized by slow mixing:

  • Glassy landscapes: regions that dominate the partition function are separated by “energy barriers”
  • Entropy barriers: regions of high probability are separated by narrow paths (high probability but small entropy)
  • Single region with high probability of large size (entropy)

SLIDE 7

Acceleration with loops

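Irreversibility can significantly accelerate random walks on regular lattices: on a lattice of linear size $L$, the convergence time scales as $T \sim L^2$ for the reversible walk and as $T \sim L$ for the irreversible one.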

SLIDE 8

Other approaches ?

Naive way:

  • Exponentially many loops required for real systems
  • Flow amplitude cannot be determined based on local information

Proposed approach:

  • Calculate irreversible transition probabilities “on the fly”
  • Do not enforce the balance condition; instead, compensate for compressibility

SLIDE 9

Skewed detailed balance

  • Create two copies of the system (‘+’ and ‘-’)
  • Decompose transition probabilities as $T_{xy} = T^{(+)}_{xy} + T^{(-)}_{xy}$, with $T^{(+)}_{yx}\pi_x = T^{(-)}_{xy}\pi_y$
  • Compensate the compressibility by introducing transition between copies:

$$\Lambda^{(\pm,\mp)}_{xx} = \max\left\{ 0, \sum_y \left[ T^{(\pm)}_{yx} - T^{(\mp)}_{yx} \right] \right\}$$
SLIDE 10

Skewed detailed balance 2

  • Extended matrix satisfies balance condition and corresponds to irreversible process:

$$\hat{T} = \begin{pmatrix} \hat{T}^{(+)} & \hat{\Lambda}^{(+,-)} \\ \hat{\Lambda}^{(-,+)} & \hat{T}^{(-)} \end{pmatrix}$$

  • Random walk becomes non-Markovian in original space.
  • System copy index is analogous to momentum in physics: diffusive motion turns into ballistic/super-diffusive.
  • No complexity overhead for Glauber and other local dynamics.
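
A minimal sketch of the two-copy construction, under assumptions not taken from the talk: states sit on a line, the '+' copy only attempts moves to the right and the '-' copy only to the left, both with Metropolis acceptance (which makes the pair satisfy skewed detailed balance), and the copy-switching probability is the positive part of the difference between the two copies' total outflows from the current state. The function `lifted_step` and the example weights are hypothetical.

```python
def lifted_step(p_plus, p_minus, pi):
    """Propagate the probabilities of the two copies by one step.

    pi must be strictly positive. Returns the new (p_plus, p_minus).
    """
    n = len(pi)
    new_plus = [0.0] * n
    new_minus = [0.0] * n
    for x in range(n):
        # Total outflow from x in each copy (Metropolis acceptance).
        out_plus = min(1.0, pi[x + 1] / pi[x]) if x + 1 < n else 0.0
        out_minus = min(1.0, pi[x - 1] / pi[x]) if x >= 1 else 0.0
        # Switching probabilities that compensate the unequal outflows.
        plus_to_minus = max(0.0, out_minus - out_plus)
        minus_to_plus = max(0.0, out_plus - out_minus)

        # Walker currently in the '+' copy: move right, switch copy, or stay.
        if x + 1 < n:
            new_plus[x + 1] += out_plus * p_plus[x]
        new_minus[x] += plus_to_minus * p_plus[x]
        new_plus[x] += (1.0 - out_plus - plus_to_minus) * p_plus[x]

        # Walker currently in the '-' copy: move left, switch copy, or stay.
        if x >= 1:
            new_minus[x - 1] += out_minus * p_minus[x]
        new_plus[x] += minus_to_plus * p_minus[x]
        new_minus[x] += (1.0 - out_minus - minus_to_plus) * p_minus[x]
    return new_plus, new_minus


# Illustrative check: pi/2 in each copy is left unchanged by one step.
weights = [1.0, 2.0, 4.0, 2.0, 1.0]
pi = [w / sum(weights) for w in weights]
half = [p / 2 for p in pi]
print(lifted_step(half, half, pi))
```

The walker keeps moving in one direction until moves become unlikely, then flips copy, which is the momentum-like behaviour described above. Each step still evaluates only local weight ratios.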

SLIDE 11

Spin cluster model

  • Simple example: classical Ising model defined on a full graph:

$$\pi_{s_1 \ldots s_N} = Z^{-1} \exp\left( \frac{J}{N} \sum_{i,j} s_i s_j \right)$$

System experiences phase transition at $J = 1$. Anomalous fluctuations of magnetization, $\delta S \sim N^{3/4}$, lead to critical slowdown of Glauber dynamics: $T \sim N^{3/2}$.

Irreversible dynamics: flip only positive spins in first copy, and only negative in second.
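
The irreversible dynamics can be sketched for small systems as follows. This is an illustration under stated assumptions, not the authors' implementation: the weight is taken as $\exp(J S^2 / 2N)$ with $S = \sum_i s_i$ (a pair-counting convention that places the transition at $J = 1$), single-spin flips use Metropolis acceptance, only the number of up spins is tracked, and the copy is switched with the positive part of the difference between the two copies' total flip probabilities. All function names are hypothetical.

```python
import math
import random


def irreversible_spin_cluster(n_spins, coupling, n_steps, seed=0):
    """Return empirical frequencies of k = number of +1 spins."""
    rng = random.Random(seed)
    k = n_spins // 2
    copy = +1  # +1: flip only positive spins, -1: flip only negative spins
    counts = [0] * (n_spins + 1)
    for _ in range(n_steps):
        s = 2 * k - n_spins
        # Probability of picking a spin of the allowed sign times acceptance.
        out_plus = (k / n_spins) * min(1.0, math.exp(2 * coupling * (1 - s) / n_spins))
        out_minus = ((n_spins - k) / n_spins) * min(1.0, math.exp(2 * coupling * (1 + s) / n_spins))
        out_here, out_other = (out_plus, out_minus) if copy == +1 else (out_minus, out_plus)
        u = rng.random()
        if u < out_here:
            k += -1 if copy == +1 else 1   # flip one spin of the allowed sign
        elif u < max(out_here, out_other):
            copy = -copy                   # compensating switch between copies
        counts[k] += 1
    return [c / n_steps for c in counts]


def exact_distribution(n_spins, coupling):
    """Exact distribution of k under the assumed weight exp(J S^2 / 2N)."""
    w = [
        math.comb(n_spins, k) * math.exp(coupling * (2 * k - n_spins) ** 2 / (2 * n_spins))
        for k in range(n_spins + 1)
    ]
    z = sum(w)
    return [x / z for x in w]


print(irreversible_spin_cluster(10, 1.0, 50_000))
print(exact_distribution(10, 1.0))
```

Comparing the two printed lists is a sanity check that breaking detailed balance leaves the sampled distribution unchanged; measuring the speed-up would require tracking the correlation function of $S$ for much larger $N$.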

SLIDE 12

Spin cluster model: results

[Figure: log2 T versus log2 N]

Dynamics is strongly accelerated: convergence time (defined via correlation function of S) decreases to $T \sim N^{3/4}$ ($T \sim N^{0.85}$ in simulations).

SLIDE 13

Ising model

At the critical point, the two-dimensional Ising model shares many properties with the spin cluster model, so one can try the same algorithm. A constant-factor acceleration (~3x) is observed; however, the factor does not depend on system size. Flipping between the copies happens too frequently ($T \sim L^{1/2}$).


SLIDE 14

Extensions 1

Mix irreversible fluxes in the same direction (2N copies instead of 2)

$$\sum_y \left[ T^{(\alpha)}_{yx}\pi_x - T^{(\alpha)}_{xy}\pi_y \right] - \sum_\beta \Lambda^{(\alpha,\beta)}_{xx}\pi_x = 0$$

SLIDE 15

Extensions 2

Mix irreversible fluxes in different directions (i.e. horizontal and vertical).

SLIDE 16

Extensions 3

Introduce generalized “momentum” variable. Break symmetry in momentum space.

SLIDE 17

Phase transitions

  • Explore the full phase space high-wavelength spatial harmonics of the order parameter (i.e. magnetization)
  • Make the dynamics more adaptive:

Reversible dynamics: $\partial_t M_k = -\Gamma_k M_k + \xi_k(t)$
Irreversible (Hamiltonian): $\partial_{tt} M_k = -\Gamma_k^2 M_k$
$\Gamma_k \sim k^{\chi}$, $k \to 0$

  • Separate momentum variables for different parts of space

SLIDE 18

Other approaches ?

Broken DB:

  • Lifting operation (Chen, Lovasz, Pak 99) - theoretical limits of acceleration ($T_{irr} > \sqrt{T_{rev}}$). Some toy models: (Diaconis, Holmes, Neal 97). Applications to distributed computing: (Jung, Shah, Shin 07)
  • Hamiltonian (Hybrid) Monte Carlo (Horvath, Kennedy 88) - continuum limit of our construction.
  • Successive over-relaxation (Adler 81), sequential updating (Ren, Orkoulas 06) - another way of producing irreversible fluxes.

Reversible algorithms:

  • Cluster algorithms (Swendsen, Wang 87) - teleportation instead of ballistic motion
  • Simulated tempering (Marinari, Parisi 92) - several copies of the system, but with different distributions
