Sequential Monte Carlo: Selected Methodological Applications


SLIDE 1

Sequential Monte Carlo: Selected Methodological Applications

Adam M. Johansen
a.m.johansen@warwick.ac.uk
Warwick University Centre for Scientific Computing


SLIDE 2

Outline

◮ Sequential Monte Carlo
◮ Applications
  ◮ Parameter Estimation
  ◮ Rare Event Simulation
  ◮ Filtering of Piecewise Deterministic Processes

SLIDE 3

Background


SLIDE 4

Estimating π

◮ Rain is uniform.
◮ A circle is inscribed in the square.
◮ $A_{\text{square}} = 4r^2$.
◮ $A_{\text{circle}} = \pi r^2$.
◮ $p = A_{\text{circle}} / A_{\text{square}} = \pi / 4$.
◮ 383 of 500 "successes".
◮ $\hat\pi = 4 \times \frac{383}{500} = 3.06$.
◮ Also obtain confidence intervals.
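A minimal sketch of this rain experiment in Python (the random number generator, sample size, and confidence-interval construction are illustrative choices, not part of the original slides):

```python
import numpy as np

rng = np.random.default_rng(1)

def estimate_pi(n=500):
    """Estimate pi by 'raining' points uniformly on [-1, 1]^2."""
    x = rng.uniform(-1.0, 1.0, size=(n, 2))
    hits = (x ** 2).sum(axis=1) <= 1.0   # inside the inscribed circle
    p_hat = hits.mean()                  # estimates pi / 4
    # Normal-approximation 95% confidence interval for pi
    se = 4.0 * np.sqrt(p_hat * (1.0 - p_hat) / n)
    return 4.0 * p_hat, (4.0 * p_hat - 1.96 * se, 4.0 * p_hat + 1.96 * se)

print(estimate_pi())
```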

SLIDE 5

The Monte Carlo Method

◮ Given a probability density, $f$,
$$I = \int_E \varphi(x) f(x)\, dx.$$
◮ Simple Monte Carlo solution:
  ◮ Sample $X_1, \ldots, X_N \overset{\text{iid}}{\sim} f$.
  ◮ Estimate $\hat I = \frac{1}{N} \sum_{i=1}^{N} \varphi(X_i)$.
◮ Justified by the law of large numbers. . .
◮ and the central limit theorem.

SLIDE 6

Importance Sampling

◮ Given $g$, such that
  ◮ $f(x) > 0 \Rightarrow g(x) > 0$
  ◮ and $f(x)/g(x) < \infty$,
define $w(x) = f(x)/g(x)$ and:
$$I = \int \varphi(x) f(x)\, dx = \int \varphi(x)\, w(x)\, g(x)\, dx.$$
◮ This suggests the importance sampling estimator:
  ◮ Sample $X_1, \ldots, X_N \overset{\text{iid}}{\sim} g$.
  ◮ Estimate $\hat I = \frac{1}{N} \sum_{i=1}^{N} w(X_i)\, \varphi(X_i)$.
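As an illustration, a small Python sketch of importance sampling; the Gaussian target, shifted-exponential proposal, and tail-probability test function are assumptions chosen for the example, not from the slides:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

def tail_prob_is(n=10_000, a=4.0):
    """Estimate P(X > a), X ~ N(0,1), with a shifted-exponential proposal.

    The proposal g(x) = exp(-(x - a)) on (a, inf) dominates the target
    on the event of interest, so w = f/g is finite there.
    """
    x = a + rng.exponential(1.0, size=n)        # X_i ~ g
    w = stats.norm.pdf(x) / np.exp(-(x - a))    # importance weights f/g
    return w.mean()                             # phi = 1 on (a, inf)

print(tail_prob_is())          # compare with:
print(stats.norm.sf(4.0))      # exact tail probability
```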

SLIDE 7

Markov Chain Monte Carlo

◮ Typically difficult to construct a good proposal density.
◮ MCMC works by constructing an ergodic Markov chain $(X_n)$ with invariant distribution $\pi$, using its ergodic averages
$$\frac{1}{N} \sum_{i=1}^{N} \varphi(X_i)$$
to approach $\mathbb{E}_\pi[\varphi]$.
◮ Justified by ergodic theorems / central limit theorems.
◮ We aren't going to take this approach.
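Although the talk does not pursue MCMC further, a minimal random-walk Metropolis sketch in Python makes the ergodic-average idea concrete (the Gaussian target and step size are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)

def rw_metropolis(log_pi, x0=0.0, n=50_000, step=1.0):
    """Random-walk Metropolis: ergodic averages approximate E_pi[phi]."""
    x, chain = x0, np.empty(n)
    lp = log_pi(x)
    for i in range(n):
        y = x + step * rng.standard_normal()
        lq = log_pi(y)
        if np.log(rng.uniform()) < lq - lp:  # accept w.p. min(1, pi(y)/pi(x))
            x, lp = y, lq
        chain[i] = x
    return chain

chain = rw_metropolis(lambda x: -0.5 * x * x)  # pi = N(0,1) up to a constant
print(chain.mean(), chain.var())               # ergodic averages: ~0 and ~1
```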

SLIDE 8

A Motivating Example: Filtering

◮ Let $X_1, \ldots$ denote the position of an object which follows Markovian dynamics.
◮ Let $Y_1, \ldots$ denote a collection of observations: $Y_i \,|\, X_i = x_i \sim g(\cdot|x_i)$.
◮ We wish to estimate, as observations arrive, $p(x_{1:n}|y_{1:n})$.
◮ A recursion obtained from Bayes' rule exists (displayed below) but is intractable in most cases.
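For concreteness, the recursion alluded to in the last bullet takes the standard form (this display is added here and does not appear on the original slide; $f$ denotes the Markov transition density):

$$p(x_{1:n}|y_{1:n}) \propto p(x_{1:n-1}|y_{1:n-1})\, f(x_n|x_{n-1})\, g(y_n|x_n),$$

whose normalising constant involves an integral over the whole path, which is what makes the recursion intractable in general.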

SLIDE 9

More Generally

◮ The problem in the previous example is really tracking a sequence of distributions.
◮ Key structural property of the smoothing distributions: increasing state spaces.
◮ Other problems with the same structure exist.
◮ Any problem of sequentially approximating a sequence of such distributions, $p_n$, can be addressed in the same way.

SLIDE 10

Importance Sampling in This Setting

◮ Given $p_n(x_{1:n})$ for $n = 1, 2, \ldots$.
◮ We could sample from a sequence $q_n(x_{1:n})$ for each $n$.
◮ Or we could let $q_n(x_{1:n}) = q_n(x_n|x_{1:n-1})\, q_{n-1}(x_{1:n-1})$ and re-use our samples.
◮ The importance weights become:
$$w_n(x_{1:n}) \propto \frac{p_n(x_{1:n})}{q_n(x_{1:n})} = \frac{p_n(x_{1:n})}{q_n(x_n|x_{1:n-1})\, q_{n-1}(x_{1:n-1})} = \frac{p_n(x_{1:n})}{q_n(x_n|x_{1:n-1})\, p_{n-1}(x_{1:n-1})}\, w_{n-1}(x_{1:n-1}).$$

SLIDE 11

Sequential Importance Sampling

At time 1:
For $i = 1:N$, sample $X_1^{(i)} \sim q_1(\cdot)$.
For $i = 1:N$, compute
$$W_1^{(i)} \propto w_1\big(X_1^{(i)}\big) = \frac{p_1\big(X_1^{(i)}\big)}{q_1\big(X_1^{(i)}\big)}.$$

At time $n$, $n \geq 2$:
Sampling Step: For $i = 1:N$, sample $X_n^{(i)} \sim q_n\big(\cdot \,\big|\, X_{n-1}^{(i)}\big)$.
Weighting Step: For $i = 1:N$, compute
$$w_n\big(X_{1:n-1}^{(i)}, X_n^{(i)}\big) = \frac{p_n\big(X_{1:n-1}^{(i)}, X_n^{(i)}\big)}{p_{n-1}\big(X_{1:n-1}^{(i)}\big)\, q_n\big(X_n^{(i)} \,\big|\, X_{n-1}^{(i)}\big)}$$
and $W_n^{(i)} \propto W_{n-1}^{(i)}\, w_n\big(X_{1:n-1}^{(i)}, X_n^{(i)}\big)$.
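A minimal sketch of sequential importance sampling in Python, using the bootstrap choice $q_n = f$ for a linear-Gaussian state-space model (the model, its parameters, and the function names are assumptions for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

def sis(y, n_particles=1000):
    """SIS for X_n = 0.9 X_{n-1} + U_n, Y_n = X_n + V_n, U_n, V_n ~ N(0,1).

    Proposing from the transition density makes the incremental weight
    the likelihood g(y_n | x_n); weights are never reset, so they
    degenerate as n grows (which motivates resampling).
    """
    x = rng.standard_normal(n_particles)       # X_1 ~ q_1 = N(0, 1)
    logw = stats.norm.logpdf(y[0], loc=x)      # w_1 = g(y_1 | x_1)
    for yn in y[1:]:
        x = 0.9 * x + rng.standard_normal(n_particles)  # X_n ~ f(.|X_{n-1})
        logw += stats.norm.logpdf(yn, loc=x)            # times g(y_n | x_n)
    w = np.exp(logw - logw.max())
    return (w / w.sum() * x).sum()             # filtering-mean estimate

y = np.cumsum(rng.standard_normal(20)) * 0.5   # synthetic observations
print(sis(y))
```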

SLIDE 12

Sequential Importance Resampling

At time $n$, $n \geq 2$:
Sampling Step: For $i = 1:N$, sample $X_{n,n}^{(i)} \sim q_n\big(\cdot \,\big|\, \widetilde X_{n-1}^{(i)}\big)$.
Resampling Step: For $i = 1:N$, compute
$$w_n\big(\widetilde X_{n-1}^{(i)}, X_{n,n}^{(i)}\big) = \frac{p_n\big(\widetilde X_{n-1}^{(i)}, X_{n,n}^{(i)}\big)}{p_{n-1}\big(\widetilde X_{n-1}^{(i)}\big)\, q_n\big(X_{n,n}^{(i)} \,\big|\, \widetilde X_{n-1}^{(i)}\big)}$$
and
$$W_n^{(i)} = \frac{w_n\big(\widetilde X_{n-1}^{(i)}, X_{n,n}^{(i)}\big)}{\sum_{j=1}^{N} w_n\big(\widetilde X_{n-1}^{(j)}, X_{n,n}^{(j)}\big)}.$$
For $i = 1:N$, sample
$$\widetilde X_n^{(i)} \sim \sum_{j=1}^{N} W_n^{(j)}\, \delta_{\big(\widetilde X_{n-1}^{(j)},\, X_{n,n}^{(j)}\big)}(dx_{1:n}).$$
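The resampling step takes only a few lines; a sketch of multinomial and systematic schemes (systematic resampling is a standard lower-variance alternative, not something this slide specifies):

```python
import numpy as np

rng = np.random.default_rng(5)

def resample_multinomial(weights, rng=rng):
    """Draw ancestor indices i.i.d. from the normalised weights."""
    n = len(weights)
    return rng.choice(n, size=n, p=weights)

def resample_systematic(weights, rng=rng):
    """One uniform draw, then a stratified sweep through the CDF;
    same marginal expectations, lower Monte Carlo variance."""
    n = len(weights)
    positions = (rng.uniform() + np.arange(n)) / n
    return np.searchsorted(np.cumsum(weights), positions)

w = np.array([0.1, 0.2, 0.3, 0.4])
print(resample_multinomial(w), resample_systematic(w))
```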

SLIDE 13

SMC Samplers

Actually, these techniques can be used to sample from any sequence of distributions (Del Moral et al., 2006).

◮ Given a sequence of target distributions, $\eta_n$, on $E_n$ . . . ,
◮ construct a synthetic sequence $\widetilde\eta_n$ on the spaces $\prod_{p=1}^{n} E_p$
◮ by introducing Markov kernels, $L_p$, from $E_{p+1}$ to $E_p$:
$$\widetilde\eta_n(x_{1:n}) = \eta_n(x_n) \prod_{p=1}^{n-1} L_p(x_{p+1}, x_p).$$
◮ These distributions
  ◮ have the target distributions as time marginals,
  ◮ have the correct structure to employ SMC techniques.

SLIDE 14

SMC Outline

◮ Given a sample $\{X_{1:n-1}^{(i)}\}_{i=1}^{N}$ targeting $\widetilde\eta_{n-1}$,
◮ sample $X_n^{(i)} \sim K_n(X_{n-1}^{(i)}, \cdot)$,
◮ calculate
$$W_n\big(X_{1:n}^{(i)}\big) = \frac{\eta_n\big(X_n^{(i)}\big)\, L_{n-1}\big(X_n^{(i)}, X_{n-1}^{(i)}\big)}{\eta_{n-1}\big(X_{n-1}^{(i)}\big)\, K_n\big(X_{n-1}^{(i)}, X_n^{(i)}\big)}.$$
◮ Resample, yielding $\{X_{1:n}^{(i)}\}_{i=1}^{N}$ targeting $\widetilde\eta_n$.
◮ Hints that we'd like to use
$$L_{n-1}(x_n, x_{n-1}) = \frac{\eta_{n-1}(x_{n-1})\, K_n(x_{n-1}, x_n)}{\int \eta_{n-1}(x'_{n-1})\, K_n(x'_{n-1}, x_n)\, dx'_{n-1}}.$$
A sketch of the resulting algorithm in a common special case follows below.
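A compact Python sketch of an SMC sampler with tempered targets $\eta_b \propto \mu^{1-b}\pi^{b}$, $\eta_b$-invariant random-walk Metropolis kernels, and the time-reversal auxiliary kernel, for which the incremental weight reduces to $\eta_n(x_{n-1})/\eta_{n-1}(x_{n-1})$. The bimodal target, reference $\mu$, and schedule are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(6)

def log_pi(x):
    """Illustrative bimodal target, known up to a normalising constant."""
    return np.logaddexp(-0.5 * (x - 4.0) ** 2, -0.5 * (x + 4.0) ** 2)

def log_mu(x):
    """Reference distribution mu = N(0, 10^2), sampled exactly at n = 0."""
    return -0.5 * (x / 10.0) ** 2

def smc_sampler(N=2000, T=50, step=1.0):
    betas = np.linspace(0.0, 1.0, T + 1)               # tempering schedule
    x = 10.0 * rng.standard_normal(N)                  # X ~ mu = eta_0
    for b_prev, b in zip(betas[:-1], betas[1:]):
        # Time-reversal kernel choice: incremental weight
        # eta_n(x_{n-1}) / eta_{n-1}(x_{n-1}) = (pi/mu)^(b - b_prev)
        logw = (b - b_prev) * (log_pi(x) - log_mu(x))
        w = np.exp(logw - logw.max()); w /= w.sum()
        x = x[rng.choice(N, size=N, p=w)]              # resample

        def log_eta(z, b=b):                           # current target
            return (1.0 - b) * log_mu(z) + b * log_pi(z)

        y = x + step * rng.standard_normal(N)          # eta_n-invariant RWM move
        accept = np.log(rng.uniform(size=N)) < log_eta(y) - log_eta(x)
        x = np.where(accept, y, x)
    return x

samples = smc_sampler()
print(samples.mean(), (samples > 0).mean())  # roughly 0 and 0.5 by symmetry
```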

SLIDE 15

Parameter Estimation in Latent Variable Models

Joint work with Arnaud Doucet and Manuel Davy.


SLIDE 16

Maximum {Likelihood|a Posteriori} Estimation

◮ Consider a model with:
  ◮ parameters, $\theta$,
  ◮ latent variables, $x$, and
  ◮ observed data, $y$.
◮ Aim to maximise the marginal likelihood
$$p(y|\theta) = \int p(x, y|\theta)\, dx$$
or posterior
$$p(\theta|y) \propto \int p(x, y|\theta)\, p(\theta)\, dx.$$
◮ The traditional approach is Expectation-Maximisation (EM):
  ◮ Requires the objective function in closed form.
  ◮ Susceptible to trapping in local optima.

SLIDE 17

A Probabilistic Approach

◮ A distribution of the form
$$\pi_\gamma(\theta|y) \propto p(\theta)\, p(y|\theta)^\gamma$$
will become concentrated, as $\gamma \to \infty$, on the maximisers of $p(y|\theta)$ under weak conditions (Hwang, 1980).
◮ Key point: synthetic distributions of the form
$$\bar\pi_\gamma(\theta, x_{1:\gamma}|y) \propto p(\theta) \prod_{i=1}^{\gamma} p(x_i, y|\theta)$$
admit the marginals $\bar\pi_\gamma(\theta|y) \propto p(\theta)\, p(y|\theta)^\gamma$.
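The marginal claim follows by integrating out the replicated latent variables, one copy per factor (this one-line verification is added for completeness):

$$\int \bar\pi_\gamma(\theta, x_{1:\gamma}|y)\, dx_{1:\gamma} \propto p(\theta) \prod_{i=1}^{\gamma} \int p(x_i, y|\theta)\, dx_i = p(\theta)\, p(y|\theta)^{\gamma}.$$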

SLIDE 18

Maximum Likelihood via SMC

◮ Use a sequence of distributions $\eta_n = \pi_{\gamma_n}$ for some schedule $\{\gamma_n\}$.
◮ Has previously been suggested in an MCMC context (Doucet et al., 2002).
  ◮ Requires extremely slow "annealing".
  ◮ Separation between distributions is large.
◮ SMC has two main advantages:
  ◮ Introducing bridging distributions for non-integer $\gamma$ (write $\gamma = \lfloor\gamma\rfloor + \langle\gamma\rangle$ with fractional part $\langle\gamma\rangle$),
$$\bar\pi_\gamma(\theta, x_{1:\lfloor\gamma\rfloor+1}|y) \propto p(\theta)\, p(x_{\lfloor\gamma\rfloor+1}, y|\theta)^{\langle\gamma\rangle} \prod_{i=1}^{\lfloor\gamma\rfloor} p(x_i, y|\theta),$$
is straightforward.
  ◮ A population of samples improves robustness.

SLIDE 19

Three Algorithms

◮ A generic SMC sampler can be written down directly. . .
◮ Easy case:
  ◮ Sample from $p(x_n|y, \theta_{n-1})$ and $p(\theta_n|x_n, y)$.
  ◮ Weight according to $p(y|\theta_{n-1})^{\gamma_n - \gamma_{n-1}}$.
◮ General case:
  ◮ Sample existing variables from an $\eta_{n-1}$-invariant kernel:
  $$(\theta_n, X_{n,1:\gamma_{n-1}}) \sim K_{n-1}\big((\theta_{n-1}, X_{n-1}), \cdot\big).$$
  ◮ Sample new variables from an arbitrary proposal: $X_{n,\gamma_{n-1}+1:\gamma_n} \sim q(\cdot|\theta_n)$.
  ◮ Use the composition of a time-reversal and optimal auxiliary kernel.
◮ The weight expression does not involve the marginal likelihood.

SLIDE 20

Toy Example

◮ Student t-distribution of unknown location parameter $\theta$ with $\nu = 0.05$.
◮ Four observations are available: $y = (-20, 1, 2, 3)$.
◮ The log-likelihood is
$$\log p(y|\theta) = -0.525 \sum_{i=1}^{4} \log\left(0.05 + (y_i - \theta)^2\right).$$
◮ The global maximum is at $1.997$.
◮ Local maxima at $\{-19.993, 1.086, 2.906\}$.
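This surface is cheap to inspect directly; a short Python check of the stated optima (the grid resolution is an arbitrary choice):

```python
import numpy as np

y = np.array([-20.0, 1.0, 2.0, 3.0])

def log_lik(theta):
    """Log-likelihood (up to a constant) of the nu = 0.05 Student-t
    location model for the four observations above."""
    return -0.525 * np.log(0.05 + (y[:, None] - theta) ** 2).sum(axis=0)

theta = np.linspace(-25.0, 10.0, 350_001)   # spacing 1e-4
ll = log_lik(theta)
# Local maxima: grid points higher than both neighbours
peaks = np.where((ll[1:-1] > ll[:-2]) & (ll[1:-1] > ll[2:]))[0] + 1
print(theta[peaks])   # approx: -19.993, 1.086, 1.997, 2.906
```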

SLIDE 21

SLIDE 22

It actually works. . .

[Figure: distributions of the final particles for increasing numbers of intermediate distributions, T = 15, 30, 60; horizontal axis θ from 1 to 3.]

SLIDE 23

Example: Gaussian Mixture Model – MAP Estimation

◮ Likelihood: $p(y|x, \omega, \mu, \sigma) = \mathcal{N}(y|\mu_x, \sigma_x^2)$.
◮ Marginal likelihood: $p(y|\omega, \mu, \sigma) = \sum_{j=1}^{3} \omega_j\, \mathcal{N}(y|\mu_j, \sigma_j^2)$.
◮ Diffuse conjugate priors were employed.
◮ All full conditional distributions of interest are available.
◮ The marginal posterior can be calculated.

SLIDE 24

Example: GMM (Galaxy Data Set)

[Figure: log marginal posterior (vertical axis, roughly −168 to −152) against computational cost (horizontal axis, 100 to 10^6); curves labelled smc, same and em.]

SLIDE 25

Rare Event Simulation

Joint work with Pierre Del Moral and Arnaud Doucet.


SLIDE 26

Problem Formulation

◮ Consider the canonical Markov chain:
$$\left( \Omega = \prod_{t=0}^{\infty} E_t,\ \mathcal{F} = \prod_{t=0}^{\infty} \mathcal{F}_t,\ (X_t)_{t\in\mathbb{N}},\ \mathbb{P}_{\mu_0} \right).$$
◮ The law $\mathbb{P}_{\mu_0}$ is defined by its finite-dimensional distributions:
$$\mathbb{P}_{\mu_0} \circ X_{0:p}^{-1}(dx_{0:p}) = \mu_0(dx_0) \prod_{i=1}^{p} M_i(x_{i-1}, dx_i).$$
◮ We are interested in rare events.

SLIDE 27

Static Rare Events

We term the first type of rare events which we consider static rare events:
◮ The first $P + 1$ elements of the canonical Markov chain lie in a rare set, $T$.
◮ That is, we are interested in
$$\mathbb{P}_{\mu_0}(X_{0:P} \in T) \quad\text{and}\quad \mathbb{P}_{\mu_0}(X_{0:P} \in dx_{0:P} \,|\, X_{0:P} \in T).$$
◮ We assume that the rare event is characterised as a level set of a suitable potential function: $V : T \to [\hat V, \infty)$ and $V : E_{0:P} \setminus T \to (-\infty, \hat V)$.

SLIDE 28

Dynamic Rare Events

The other class of rare events in which we are interested are termed dynamic rare events:
◮ A Markov process hits some rare set, $T$, before its first entrance to some recurrent set $R$.
◮ That is, given the stopping time $\tau = \inf\{p : X_p \in T \cup R\}$, we seek $\mathbb{P}_{\mu_0}(X_\tau \in T)$ and the associated conditional distribution
$$\mathbb{P}_{\mu_0}(\tau = t,\ X_{0:t} \in dx_{0:t} \,|\, X_\tau \in T).$$

SLIDE 29

Intuition

◮ Principal novelty: applying an efficient sampling technique which allows us to operate directly on the path space of the Markov chain.
◮ Two components to this approach:
  ◮ constructing a sequence of synthetic distributions;
  ◮ applying sequential importance sampling and resampling strategies.

SLIDE 30

Static Rare Events: Our Approach

◮ Initialise by sampling from the law of the Markov chain.
◮ Iteratively obtain samples from a sequence of distributions which moves "smoothly" towards the target.
◮ Proposed sequence of distributions (a sketch of the weighting follows below):
$$\eta_n(dx_{0:P}) \propto \mathbb{P}_{\mu_0}(dx_{0:P})\, g_{n/T}(x_{0:P}),$$
$$g_\theta(x_{0:P}) = \left(1 + \exp\left(-\alpha(\theta)\big(V(x_{0:P}) - \hat V\big)\right)\right)^{-1}.$$
◮ Estimate the normalising constant of the final distribution and correct via importance sampling.
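A sketch of this logistic tempering of the potential and the resulting incremental weight between successive distributions; the function names and inputs are illustrative assumptions:

```python
import numpy as np

def g(v, alpha, v_hat):
    """Logistic soft indicator of the rare set {V >= v_hat}.

    alpha -> infinity recovers the hard indicator; small alpha keeps
    the distribution close to the law of the chain."""
    return 1.0 / (1.0 + np.exp(-alpha * (v - v_hat)))

def log_incremental_weight(v, alpha_prev, alpha_new, v_hat):
    """log[eta_n / eta_{n-1}] evaluated on a path with potential v:
    the P_mu0 factor cancels, leaving the ratio of the g terms."""
    return np.log(g(v, alpha_new, v_hat)) - np.log(g(v, alpha_prev, v_hat))

# Example: weights favour paths whose potential is already near v_hat
v = np.array([20.0, 24.0, 26.0])
print(log_incremental_weight(v, alpha_prev=0.5, alpha_new=0.6, v_hat=25.0))
```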

SLIDE 31

Path Sampling [See ⋆⋆ or Gelman and Meng, 1998]

◮ Given a sequence of densities $p(x|\theta) = q(x|\theta)/z(\theta)$:
$$\frac{d}{d\theta} \log z(\theta) = \mathbb{E}_\theta\left[\frac{d}{d\theta} \log q(\cdot|\theta)\right] \qquad (\star)$$
where the expectation is taken with respect to $p(\cdot|\theta)$.
◮ Consequently, we obtain:
$$\log \frac{z(1)}{z(0)} = \int_0^1 \mathbb{E}_\theta\left[\frac{d}{d\theta} \log q(\cdot|\theta)\right] d\theta.$$
◮ In our case, we use our particle system to approximate both integrals.

SLIDE 32

Approximate the path sampling identity to estimate the normalising constant:
$$\hat Z_1 = \frac{1}{2} \exp\left( \sum_{n=1}^{T} \big(\alpha(n/T) - \alpha((n-1)/T)\big)\, \frac{\hat E_{n-1} + \hat E_n}{2} \right),$$
$$\hat E_n = \frac{\displaystyle\sum_{j=1}^{N} W_n^{(j)}\, \frac{V\big(X_n^{(j)}\big) - \hat V}{1 + \exp\left(\alpha_n\big(V\big(X_n^{(j)}\big) - \hat V\big)\right)}}{\displaystyle\sum_{j=1}^{N} W_n^{(j)}}.$$
Estimate the rare event probability:
$$\hat p_\star = \hat Z_1\, \frac{\displaystyle\sum_{j=1}^{N} W_T^{(j)} \left(1 + \exp\left(\alpha(1)\big(V\big(X_T^{(j)}\big) - \hat V\big)\right)\right) \mathbb{I}_{[\hat V, \infty)}\left(V\big(X_T^{(j)}\big)\right)}{\displaystyle\sum_{j=1}^{N} W_T^{(j)}}.$$
A numerical sketch of these two estimators follows below.
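A minimal numerical sketch of the two estimators above, given per-iteration particle potentials and weights; the linear α schedule and mock inputs are assumptions, not the output of a real run:

```python
import numpy as np

def z1_hat(E_hat, alphas):
    """Trapezoidal path-sampling estimate of Z_1; Z_0 = 1/2 because
    g_0 = (1 + exp(0))^(-1) = 1/2 for every path."""
    incr = np.diff(alphas) * (E_hat[:-1] + E_hat[1:]) / 2.0
    return 0.5 * np.exp(incr.sum())

def rare_event_prob(z1, W_T, V_T, alpha_1, v_hat):
    """Importance-sampling correction from eta_T back to P_mu0,
    restricted to the rare set {V >= v_hat}."""
    unbias = 1.0 + np.exp(alpha_1 * (V_T - v_hat))   # 1 / g_1
    hit = (V_T >= v_hat).astype(float)
    return z1 * (W_T * unbias * hit).sum() / W_T.sum()

# Toy inputs of the right shape only:
alphas = np.linspace(0.0, 3.0, 101)    # alpha(n/T), linear schedule
E_hat = -np.linspace(20.0, 5.0, 101)   # mock E_n estimates
W_T = np.ones(1000)
V_T = np.random.default_rng(7).normal(24.0, 2.0, 1000)
print(rare_event_prob(z1_hat(E_hat, alphas), W_T, V_T, alphas[-1], v_hat=25.0))
```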

SLIDE 33

Example: Gaussian Random Walk

◮ A toy example: $M_t(R_{t-1}, R_t) = \mathcal{N}(R_t | R_{t-1}, 1)$.
◮ $T = \mathbb{R}^P \times [\hat V, \infty)$.
◮ Proposal kernel:
$$K_n(X_{n-1}, X_n) = \sum_{j=-S}^{S} \alpha_{n+1}(X_{n-1}, X_n) \prod_{i=1}^{P} \delta_{X_{n-1,i}+ij}(X_{n,i}),$$
where the weighting of individual moves is given by $\alpha_n(X_{n-1}, X_n) \propto \eta_n(X_n)$.
◮ Linear annealing schedule.
◮ Number of distributions $T \propto \hat V^{3/2}$ ($T = 2500$ when $\hat V = 25$).

SLIDE 34

[Figure: Gaussian random walk example results. Log probability against threshold, comparing True Values with IPS(100), IPS(1,000), IPS(20,000) and SMC(100).]

SLIDE 35

[Figure: typical SMC run, all particles; Markov chain state value against state number.]

SLIDE 36

[Figure: typical IPS run, particles which hit the rare set; Markov chain state value against state number.]

SLIDE 37

Filtering of Piecewise Deterministic Processes

Joint work with Nick Whiteley and Simon Godsill.


SLIDE 38

Motivation: Observing a Manoeuvring Object

◮ For $t \in \mathbb{R}_0^+$, consider an object with position $s_t$, velocity $v_t$ and acceleration $a_t$.
◮ Summarise the state by $\zeta_t = (s_t, v_t, a_t)$.
◮ From the initial condition $\zeta_0$, the state evolves until a random time $\tau_1$, at which the acceleration jumps to a new random value, yielding $\zeta_{\tau_1}$.
◮ From $\zeta_{\tau_1}$, evolution continues until $\tau_2$, when the state becomes $\zeta_{\tau_2}$, etc.
◮ Observation times $(t_n)_{n\in\mathbb{N}}$: at each $t_n$ a noisy measurement of the object's position is made.

SLIDE 39

[Figure: example trajectory of the manoeuvring object, sx/km against sy/km.]

SLIDE 40

An Abstract Formulation

◮ Pair Markov chain $(\tau_j, \theta_j)_{j\in\mathbb{N}}$, $\tau_j \in \mathbb{R}^+$, $\theta_j \in \Theta$:
$$p(d(\tau_j, \theta_j) \,|\, \tau_{j-1}, \theta_{j-1}) = q(d\theta_j \,|\, \theta_{j-1}, \tau_j, \tau_{j-1})\, f(d\tau_j \,|\, \tau_{j-1}).$$
◮ Count the jumps: $\nu_t := \sum_j \mathbb{I}[\tau_j \le t]$.
◮ Deterministic evolution function $F : \mathbb{R}_0^+ \times \Theta \to \Theta$, s.t. $\forall \theta \in \Theta$, $F(0, \theta) = \theta$.
◮ Signal process $(\zeta_t)_{t \in \mathbb{R}_0^+}$ (a simulation sketch follows below):
$$\zeta_t := F(t - \tau_{\nu_t}, \theta_{\nu_t}).$$
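A sketch of simulating such a piecewise deterministic process, with exponential holding times, Gaussian parameter jumps, and a linear deterministic flow; all three choices are illustrative assumptions, not the slides' tracking model:

```python
import numpy as np

rng = np.random.default_rng(8)

def simulate_pdp(t_end=10.0, rate=0.5):
    """Jump times with Exp(rate) increments; theta jumps by a Gaussian
    perturbation; between jumps zeta_t = F(t - tau, theta)."""
    taus, thetas = [0.0], [rng.standard_normal()]
    while taus[-1] < t_end:
        taus.append(taus[-1] + rng.exponential(1.0 / rate))  # f(d tau | tau)
        thetas.append(thetas[-1] + rng.standard_normal())    # q(d theta | ...)
    return np.array(taus), np.array(thetas)

def zeta(t, taus, thetas, slope=1.0):
    """Evaluate the signal: find the last jump before t (nu_t), then
    flow deterministically with F(s, theta) = theta + slope * s."""
    j = np.searchsorted(taus, t, side="right") - 1
    return thetas[j] + slope * (t - taus[j])

taus, thetas = simulate_pdp()
print([float(zeta(t, taus, thetas)) for t in (1.0, 5.0, 9.0)])
```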

SLIDE 41

Filtering 1

◮ This describes a piecewise deterministic process.
◮ It is partially observed via observations $(Y_n)_{n\in\mathbb{N}}$, e.g.,
$$Y_n = G(\zeta_{t_n}) + V_n$$
with likelihood function $g_n(y_n|\zeta_{t_n})$.
◮ Filtering: given observations $y_{1:n}$, estimate $\zeta_{t_n}$.
◮ How can we approximate $p(\zeta_{t_n}|y_{1:n})$, $p(\zeta_{t_{n+1}}|y_{1:n+1})$, . . . ?

SLIDE 42

Filtering 2

◮ Sequence of spaces $(E_n)_{n\in\mathbb{N}}$,
$$E_n = \bigcup_{k=0}^{\infty} \{k\} \times T_{n,k} \times \Theta^{k+1}, \qquad T_{n,k} = \{\tau_{1:k} : 0 < \tau_1 < \tau_2 < \ldots < \tau_k \le t_n\}.$$
◮ Define $k_n := \nu_{t_n}$ and $X_n = (\zeta_0, k_n, \tau_{1:k_n}, \theta_{1:k_n}) \in E_n$.
◮ Sequence of posterior distributions $(\eta_n)_{n\in\mathbb{N}}$:
$$\eta_n(x_n) \propto q(\zeta_0) \prod_{j=1}^{k_n} f(\tau_j|\tau_{j-1})\, q(\theta_j|\theta_{j-1}, \tau_j, \tau_{j-1}) \times \prod_{p=1}^{n} g_p(y_p|\zeta_{t_p})\, S(\tau_{k_n}, t_n),$$
where $S(\tau_{k_n}, t_n)$ denotes the survivor function (no jump between $\tau_{k_n}$ and $t_n$).

SLIDE 43

SMC Filtering

◮ Recall $X_n = (\zeta_0, k_n, \tau_{1:k_n}, \theta_{1:k_n})$ specifies a path $(\zeta_t)_{t\in[0,t_n]}$.
◮ If the forward kernel $K_n$ only alters the recent components of $x_{n-1}$ and adds new jumps/parameters in $E_n \setminus E_{n-1}$, online operation is possible:
$$p(d\zeta_{t_n}|y_{1:n}) \approx \sum_{i=1}^{N} W_n^{(i)}\, \delta_{F(t_n - \tau_{k_n}^{(i)},\, \theta_{k_n}^{(i)})}(d\zeta_{t_n}).$$
◮ A mixture proposal:
$$K_n(x_{n-1}, x_n) = \sum_m \alpha_{n,m}(x_{n-1})\, K_{n,m}(x_{n-1}, x_n).$$

SLIDE 44

SMC Filtering

◮ When $K_n$ corresponds to extending $x_{n-1}$ into $E_n$ by sampling from the prior, we obtain the algorithm of Godsill et al. (2007).
◮ This is inefficient, as it involves propagating multiple copies of particles after resampling.
◮ A more efficient strategy is to propose births and to perturb the most recent jump time/parameter, $(\tau_k, \theta_k)$.
◮ To minimise the variance of the importance weights, we would like to draw from $\eta_n(\tau_k, \theta_k \,|\, x_{n-1} \setminus (\tau_k, \theta_k))$, or sensible approximations thereof.

SLIDE 45

[Figure: filtering results for the manoeuvring object, sx/km against sy/km.]

SLIDE 46

            Godsill et al. 2007     Whiteley et al. 2007
    N       RMSE / km   CPU / s     RMSE / km   CPU / s
    50        42.62       0.24        0.88        1.32
    100       33.49       0.49        0.66        2.62
    250       22.89       1.23        0.54        6.56
    500       17.26       2.42        0.51       12.98
    1000      12.68       5.00        0.50       26.07
    2500       6.18      13.20        0.49       67.32
    5000       3.52      28.79        0.48      142.84

Root mean square filtering error and CPU time, over 200 runs.

[Figure: log RMSE against CPU time in seconds.]

SLIDE 47

Convergence

◮ This framework allows us to analyse the algorithm of Godsill et al. (2007).
◮ Let $\mu_n(\varphi) := \int \varphi(\zeta_{t_n})\, p(d\zeta_{t_n}|y_{1:n})$ and let $\mu_n^N(\varphi)$ be the corresponding SMC approximation.
◮ Under standard regularity conditions,
$$\sqrt{N}\big(\mu_n^N(\varphi) - \mu_n(\varphi)\big) \Rightarrow \mathcal{N}\big(0, \sigma_n^2(\varphi)\big).$$
◮ Under rather strong assumptions*,
$$\mathbb{E}\left[\big|\mu_n^N(\varphi) - \mu_n(\varphi)\big|^p\right]^{1/p} \le \frac{c_p(\varphi)}{\sqrt{N}}.$$

*which include: $(\zeta_{t_n})_{n\in\mathbb{N}}$ is a uniformly ergodic Markov chain, and the likelihood is bounded above and away from zero uniformly in time.

SLIDE 48

Summary


SLIDE 49

SMCTC: C++ Template Class for SMC Algorithms

◮ Implementing SMC algorithms in C/C++ isn't hard.
◮ Software for implementing general SMC algorithms.
◮ The C++ element is largely confined to the library.
◮ Available (under a GPL-3 license) from
www2.warwick.ac.uk/fac/sci/statistics/staff/academic/johansen/smctc/
or type "smctc" into Google.
◮ Example code includes estimation of Gaussian tail probabilities using the method described here.
◮ Particle filters can also be implemented easily.

SLIDE 50

In Conclusion

◮ Monte Carlo methods have uses beyond the calculation of posterior means.
◮ SMC provides a viable alternative to MCMC.
◮ SMC is effective at:
  ◮ ML and MAP estimation;
  ◮ rare event estimation;
  ◮ filtering outside the standard particle filtering framework;
  ◮ . . .
◮ Other published applications include: approximate Bayesian computation, Bayesian estimation in GLMMs, option pricing, and estimation in partially observed marked point processes.
◮ A huge amount of work remains to be done. . .

SLIDE 51

References

[1] P. Del Moral, A. Doucet, and A. Jasra. Sequential Monte Carlo methods for Bayesian computation. In Bayesian Statistics 8. Oxford University Press, 2006.
[2] P. Del Moral, A. Doucet, and A. Jasra. Sequential Monte Carlo samplers. Journal of the Royal Statistical Society B, 68(3):411–436, 2006.
[3] A. Doucet, S. J. Godsill, and C. P. Robert. Marginal maximum a posteriori estimation using Markov chain Monte Carlo. Statistics and Computing, 12:77–84, 2002.
[4] A. Gelman and X.-L. Meng. Simulating normalizing constants: From importance sampling to bridge sampling to path sampling. Statistical Science, 13(2):163–185, 1998.
[5] S. J. Godsill, J. Vermaak, K.-F. Ng, and J.-F. Li. Models and algorithms for tracking of manoeuvring objects using variable rate particle filters. Proceedings of the IEEE, 95(5):925–952, 2007.
[6] C.-R. Hwang. Laplace's method revisited: Weak convergence of probability measures. The Annals of Probability, 8(6):1177–1182, December 1980.
[7] A. M. Johansen. SMCTC: Sequential Monte Carlo in C++. Research Report 08:16, University of Bristol, Department of Mathematics – Statistics Group, University Walk, Bristol, BS8 1TW, UK, July 2008.
[8] A. M. Johansen, P. Del Moral, and A. Doucet. Sequential Monte Carlo samplers for rare events. In Proceedings of the 6th International Workshop on Rare Event Simulation, pages 256–267, Bamberg, Germany, October 2006.
[9] A. M. Johansen, A. Doucet, and M. Davy. Particle methods for maximum likelihood parameter estimation in latent variable models. Statistics and Computing, 18(1):47–57, March 2008.
[10] N. Whiteley, A. M. Johansen, and S. Godsill. Efficient Monte Carlo filtering for discretely observed jumping processes. In Proceedings of the IEEE Statistical Signal Processing Workshop, pages 89–93, Madison, WI, USA, August 2007. IEEE.
[11] N. Whiteley, A. M. Johansen, and S. Godsill. Monte Carlo filtering of piecewise-deterministic processes. In revision, 2008.

SLIDE 52

Path Sampling Identity

Given a probability density, $p(x|\theta) = q(x|\theta)/z(\theta)$:
$$\frac{\partial}{\partial\theta} \log z(\theta) = \frac{1}{z(\theta)} \frac{\partial}{\partial\theta} z(\theta) = \frac{1}{z(\theta)} \frac{\partial}{\partial\theta} \int q(x|\theta)\, dx$$
$$= \int \frac{1}{z(\theta)} \frac{\partial}{\partial\theta} q(x|\theta)\, dx \qquad (\star\star)$$
$$= \int \frac{p(x|\theta)}{q(x|\theta)} \frac{\partial}{\partial\theta} q(x|\theta)\, dx = \int p(x|\theta) \frac{\partial}{\partial\theta} \log q(x|\theta)\, dx = \mathbb{E}_{p(\cdot|\theta)}\left[\frac{\partial}{\partial\theta} \log q(x|\theta)\right],$$
wherever the interchange of differentiation and integration at $(\star\star)$ is permissible. Back to $\star$.