Introduction Estimation Rare Events Filtering Summary References
Sequential Monte Carlo:
Selected Methodological Applications

Adam M. Johansen
a.m.johansen@warwick.ac.uk
Warwick University Centre for Scientific Computing
◮ Sequential Monte Carlo
◮ Applications
  ◮ Parameter Estimation
  ◮ Rare Event Simulation
  ◮ Filtering of Piecewise Deterministic Processes
Monte Carlo
◮ Rain falls uniformly on a square of side 2r.
◮ A circle of radius r is inscribed in the square.
◮ Asquare = 4r².
◮ Acircle = πr².
◮ p = Acircle/Asquare = π/4.
◮ 383 of 500 drops land inside the circle.
◮ π̂ = 4 × 383/500 = 3.06.
◮ Also obtain an assessment of the estimate's accuracy.
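The rain-drop estimate of π can be sketched in a few lines of Python (the 383/500 count on the slide is one realisation; the code below draws its own):

```python
import random

def estimate_pi(n_drops, seed=0):
    """Estimate pi by dropping points uniformly on [-1, 1]^2 and
    counting the fraction that land inside the inscribed unit circle."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_drops):
        x, y = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:
            inside += 1
    p_hat = inside / n_drops      # estimates Acircle / Asquare = pi / 4
    return 4.0 * p_hat

print(estimate_pi(500))
```

The error of the estimate shrinks at the usual O(1/√N) Monte Carlo rate, which is the "assessment of accuracy" the slide alludes to.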
◮ Given a probability density, f, we wish to estimate Ef[ϕ] = ∫ ϕ(x)f(x) dx.
◮ Simple Monte Carlo solution:
  ◮ Sample X1, . . . , XN iid ∼ f.
  ◮ Estimate ϕ̂ = (1/N) Σ_{n=1}^{N} ϕ(Xn).
◮ Justified by the law of large numbers. . .
◮ and the central limit theorem.
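A minimal sketch of this estimator; the density f = N(0, 1), the test function ϕ(x) = x² and the sample size are illustrative choices, not anything fixed by the slides:

```python
import math, random

def mc_expectation(phi, sample_f, n, seed=1):
    """Plain Monte Carlo: draw X_1, ..., X_N iid from f and average phi(X_n).
    The LLN justifies convergence; the CLT gives the O(1/sqrt(N)) error rate."""
    rng = random.Random(seed)
    return sum(phi(sample_f(rng)) for _ in range(n)) / n

# Example: E[X^2] = 1 for X ~ N(0, 1).
est = mc_expectation(lambda x: x * x, lambda rng: rng.gauss(0.0, 1.0), 100_000)
print(est)
```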
◮ Given g, such that
  ◮ f(x) > 0 ⇒ g(x) > 0
  ◮ and f(x)/g(x) < ∞,
  we have Ef[ϕ] = Eg[ϕ f/g].
◮ This suggests the importance sampling estimator:
  ◮ Sample X1, . . . , XN iid ∼ g.
  ◮ Estimate ϕ̂ = (1/N) Σ_{n=1}^{N} ϕ(Xn) f(Xn)/g(Xn).
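An importance-sampling sketch, with an assumed target f = N(0, 1) and proposal g = N(0, 2²), chosen so that f/g is bounded as the slide requires:

```python
import math, random

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def importance_sampling(phi, n, seed=2):
    """Importance sampling: sample from g = N(0, 2^2) and reweight by f/g
    to estimate an expectation under f = N(0, 1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, 2.0)                                   # X_n ~ g
        w = normal_pdf(x, 0.0, 1.0) / normal_pdf(x, 0.0, 2.0)     # f(X_n)/g(X_n)
        total += w * phi(x)
    return total / n

est = importance_sampling(lambda x: x * x, 100_000)   # E_f[X^2] = 1
print(est)
```

Note that applying the estimator to ϕ ≡ 1 estimates Eg[f/g] = 1, a useful sanity check on the weights.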
◮ Typically difficult to construct a good proposal density.
◮ MCMC works by constructing an ergodic Markov chain of invariant distribution f, and estimating Ef[ϕ] by the ergodic average (1/N) Σ_{n=1}^{N} ϕ(Xn).
◮ Justified by ergodic theorems / central limit theorems.
◮ We aren’t going to take this approach.
Sequential Monte Carlo
◮ Let X1, . . . denote the position of an object which follows Markovian dynamics.
◮ Let Y1, . . . denote a collection of observations: Yn depends only upon Xn.
◮ We wish to estimate, as observations arrive, p(x1:n|y1:n).
◮ A recursion obtained from Bayes rule exists but is not available in closed form in general.
◮ The problem in the previous example is really tracking a sequence of distributions of increasing dimension.
◮ Key structural property of the smoothing distributions: p(x1:n|y1:n) extends p(x1:n−1|y1:n−1) by a single coordinate.
◮ Other problems with the same structure exist.
◮ Any problem of sequentially approximating a sequence of such distributions can be treated within the same framework.
◮ Given pn(x1:n) for n = 1, 2, . . . .
◮ We could sample from a sequence qn(x1:n) for each n.
◮ Or we could let qn(x1:n) = qn(xn|x1:n−1)qn−1(x1:n−1) and reuse the samples obtained at time n − 1.
◮ The importance weights become:
  wn(x1:n) ∝ wn−1(x1:n−1) pn(x1:n) / [pn−1(x1:n−1) qn(xn|x1:n−1)].
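The sequential importance sampling recursion can be sketched as follows; the target (a product of standard normals) and the wider proposal are toy choices made purely for illustration:

```python
import math, random

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def sis(n_steps, n_particles, seed=3):
    """Sequential importance sampling for the toy target
    p_n(x_{1:n}) = prod_i N(x_i; 0, 1), with proposal q_n(x_n|x_{1:n-1}) = N(0, 1.5^2).
    The incremental weight p_n / (p_{n-1} q_n) reduces to N(x_n; 0, 1) / N(x_n; 0, 1.5)."""
    rng = random.Random(seed)
    particles = [[] for _ in range(n_particles)]
    logw = [0.0] * n_particles
    for _ in range(n_steps):
        for i in range(n_particles):
            x = rng.gauss(0.0, 1.5)                    # X_n^(i) ~ q_n
            particles[i].append(x)
            logw[i] += math.log(normal_pdf(x, 0.0, 1.0) / normal_pdf(x, 0.0, 1.5))
    m = max(logw)                                      # normalise stably in log space
    w = [math.exp(lw - m) for lw in logw]
    total = sum(w)
    W = [wi / total for wi in w]
    return particles, W

particles, W = sis(5, 5000)
est = sum(Wi * xs[-1] for Wi, xs in zip(W, particles))   # E[X_5] = 0 under p_5
print(est)
```

Running this for many steps exhibits the well-known weight degeneracy that motivates resampling.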
At time 1:
  ◮ Sample X(i)_1 ∼ q1.
  ◮ Weight W(i)_1 ∝ w1(X(i)_1) = p1(X(i)_1) / q1(X(i)_1).
At time n:
  ◮ Sample X(i)_n ∼ qn(· | X(i)_{n−1}).
  ◮ Weight W(i)_n ∝ W(i)_{n−1} wn(X(i)_{1:n−1}, X(i)_n), where
    wn(X(i)_{1:n−1}, X(i)_n) = pn(X(i)_{1:n−1}, X(i)_n) / [pn−1(X(i)_{1:n−1}) qn(X(i)_n | X(i)_{n−1})].
At time n, with resampling:
  ◮ Sample X(i)_{n,n} ∼ qn(· | X̃(i)_{n−1}).
  ◮ Compute normalised weights
    W(i)_n = wn(X̃(i)_{n−1}, X(i)_{n,n}) / Σ_{j=1}^{N} wn(X̃(j)_{n−1}, X(j)_{n,n}),
    with wn as before.
  ◮ Resample:
    X̃(i)_{1:n} ∼ Σ_{j=1}^{N} W(j)_n δ_{(X̃(j)_{n−1}, X(j)_{n,n})}(dx1:n).
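Adding a resampling step gives a sketch of sequential importance resampling; the target is the same toy product of standard normals as before, and only the newest coordinate is retained, which suffices for the current marginal:

```python
import math, random

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def sir_last_coordinate(n_steps, n_particles, seed=4):
    """SIS with multinomial resampling after every step: the weights are
    reset to 1/N each iteration, avoiding weight degeneracy (at the price
    of impoverishing the path history)."""
    rng = random.Random(seed)
    x = [0.0] * n_particles
    for _ in range(n_steps):
        x = [rng.gauss(0.0, 1.5) for _ in range(n_particles)]            # propose
        w = [normal_pdf(xi, 0.0, 1.0) / normal_pdf(xi, 0.0, 1.5) for xi in x]
        x = rng.choices(x, weights=w, k=n_particles)                     # resample
    return sum(x) / n_particles                                          # E[X_n] = 0

print(sir_last_coordinate(20, 5000))
```

Multinomial resampling via `random.choices` is the simplest scheme; lower-variance alternatives (residual, systematic) are generally preferred in practice.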
◮ Given a sequence of target distributions, ηn, on En . . . ,
◮ construct a synthetic sequence η̃n
◮ by introducing Markov kernels, Lp, from Ep+1 to Ep:
  η̃n(x1:n) = ηn(xn) Π_{p=1}^{n−1} Lp(xp+1, xp).
◮ These distributions
  ◮ have the target distributions as time marginals,
  ◮ have the correct structure to employ SMC techniques.
◮ Given a sample {X(i)_{1:n−1}}_{i=1}^{N} targeting η̃n−1:
  ◮ sample X(i)_n ∼ Kn(X(i)_{n−1}, ·),
  ◮ calculate
    wn(X(i)_{1:n}) = ηn(X(i)_n) Ln−1(X(i)_n, X(i)_{n−1}) / [ηn−1(X(i)_{n−1}) Kn(X(i)_{n−1}, X(i)_n)].
◮ Resample, yielding {X(i)_{1:n}}_{i=1}^{N} targeting η̃n.
◮ Hints that we’d like to use the optimal auxiliary kernel
  Ln−1(xn, xn−1) = ηn−1(xn−1) Kn(xn−1, xn) / ∫ ηn−1(x′_{n−1}) Kn(x′_{n−1}, xn) dx′_{n−1}.
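A minimal SMC sampler illustrating these ideas (an invented toy, not the algorithm of any particular slide): it anneals from N(0, 3²) to N(0, 1), uses a random-walk Metropolis kernel for Kn, and takes Ln−1 to be its time reversal, so the incremental weight reduces to γn(xn−1)/γn−1(xn−1); as a by-product it estimates the ratio of normalising constants, which is exactly 1/3 here:

```python
import math, random

def smc_sampler(n_particles=2000, n_steps=20, seed=5):
    """SMC sampler from eta_0 = N(0, 3^2) to eta_T = N(0, 1) through
    intermediate unnormalised targets gamma_n(x) = exp(-x^2 / (2 sigma_n^2)).
    Returns an estimate of Z_T / Z_0 = sigma_T / sigma_0 = 1/3."""
    rng = random.Random(seed)
    sigmas = [3.0 - 2.0 * n / n_steps for n in range(n_steps + 1)]
    x = [rng.gauss(0.0, sigmas[0]) for _ in range(n_particles)]
    log_ratio = 0.0
    for n in range(1, n_steps + 1):
        s_prev, s = sigmas[n - 1], sigmas[n]
        # incremental weight gamma_n(x)/gamma_{n-1}(x), in log space
        lw = [xi * xi * (1.0 / (2 * s_prev * s_prev) - 1.0 / (2 * s * s)) for xi in x]
        m = max(lw)
        w = [math.exp(l - m) for l in lw]
        log_ratio += m + math.log(sum(w) / n_particles)   # accumulates log(Z_n/Z_{n-1})
        x = rng.choices(x, weights=w, k=n_particles)      # resample
        for i in range(n_particles):                      # MH move invariant for eta_n
            prop = x[i] + rng.gauss(0.0, 1.0)
            log_a = (x[i] * x[i] - prop * prop) / (2 * s * s)
            if log_a >= 0 or rng.random() < math.exp(log_a):
                x[i] = prop
    return math.exp(log_ratio)

print(smc_sampler())
```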
Parameter Estimation in Latent Variable Models
◮ Consider a model with:
  ◮ parameters, θ,
  ◮ latent variables, x, and
  ◮ observed data, y.
◮ Aim to maximise the marginal likelihood p(y|θ) = ∫ p(x, y|θ) dx.
◮ Traditional approach is Expectation-Maximisation (EM):
  ◮ Requires objective function in closed form.
  ◮ Susceptible to trapping in local optima.
◮ A distribution of the form πγ(θ) ∝ p(θ) p(y|θ)^γ concentrates on the maximisers of interest as γ → ∞.
◮ Key point: synthetic distributions of the form
  π̃γ(θ, x1:γ) ∝ p(θ) Π_{i=1}^{γ} p(xi, y|θ)
  admit πγ(θ) as a marginal and can be sampled without evaluating p(y|θ).
◮ Use a sequence of distributions ηn = πγn for some schedule {γn}.
◮ Has previously been suggested in an MCMC context:
  ◮ Requires extremely slow “annealing”.
  ◮ Separation between successive distributions is large.
◮ SMC has two main advantages:
  ◮ Bridging distributions can be introduced: writing γ = ⌊γ⌋ + ⟨γ⟩, use
    π̃γ(θ, x1:⌊γ⌋+1) ∝ p(θ) [Π_{i=1}^{⌊γ⌋} p(xi, y|θ)] p(x⌊γ⌋+1, y|θ)^{⟨γ⟩}.
  ◮ Population of samples improves robustness.
◮ A generic SMC sampler can be written down directly. . .
◮ Easy case:
  ◮ Sample from p(xn|y, θn−1) and p(θn|xn, y).
  ◮ Weight according to p(y|θn−1)^{γn−γn−1}.
◮ General case:
  ◮ Sample existing variables from an ηn−1-invariant kernel.
  ◮ Sample new variables from an arbitrary proposal.
  ◮ Use the composition of a time-reversal and optimal auxiliary kernel.
◮ Weight expression does not involve the marginal likelihood.
◮ Student t-distribution of unknown location parameter θ, with known, small degrees of freedom ν.
◮ Four observations are available, y = (−20, 1, 2, 3).
◮ Log likelihood (up to an additive constant) is:
  l(θ) = −((ν + 1)/2) Σ_{i=1}^{4} log(1 + (yi − θ)²/ν).
◮ Global maximum is at 1.997.
◮ Local maxima at {−19.993, 1.086, 2.906}.
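The multimodal likelihood surface can be reproduced directly; note that the degrees-of-freedom value ν = 0.05 below is an assumption made for illustration (the slide does not state it):

```python
import math

def log_likelihood(theta, nu=0.05, ys=(-20.0, 1.0, 2.0, 3.0)):
    """Log-likelihood, up to an additive constant, of a Student t location model:
    l(theta) = -((nu + 1)/2) * sum_i log(1 + (y_i - theta)^2 / nu).
    nu = 0.05 is an assumed, illustrative degrees-of-freedom value."""
    return -0.5 * (nu + 1.0) * sum(math.log1p((y - theta) ** 2 / nu) for y in ys)

# The surface has a sharp mode near each observation; the mode near
# theta = 2 dominates the one near theta = -20.
print(log_likelihood(1.997), log_likelihood(-19.993))
```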
[Figure: estimated distributions of θ on [1, 3] for annealing schedules of length T = 15, 30 and 60.]
◮ Likelihood p(y|x, ω, µ, σ) = N(y|µx, σ²_x).
◮ Marginal likelihood p(y|ω, µ, σ) = Σ_{j=1}^{3} ωj N(y|µj, σ²_j).
◮ Diffuse conjugate priors were employed.
◮ All full conditional distributions of interest are available.
◮ Marginal posterior can be calculated.
[Figure: comparison of the SMC, SAME and EM algorithms for sample sizes from 100 to 1e+06.]
Rare Event Simulation — Formulation
◮ Consider the canonical Markov chain with state space E and path space Ω = E^∞.
◮ The law Pµ0 is defined by its finite dimensional distributions:
  Pµ0,0:p(dx0:p) = µ0(dx0) Π_{q=1}^{p} Mq(xq−1, dxq).
◮ We are interested in rare events.
◮ The first P + 1 elements of the canonical Markov chain lie in some rare set A.
◮ That is, we are interested in Pµ0(X0:P ∈ A), which is small.
◮ We assume that the rare event is characterised as a level set of a function V:
  A = {x0:P : V(x0:P) ≥ V̂}.
◮ A Markov process hits some rare set, T, before its first entrance to a recurrent set, R.
◮ That is, given the stopping time τ = inf {p : Xp ∈ T ∪ R}, we wish to estimate P(Xτ ∈ T).
◮ Principal novelty: applying an efficient sampling technique to this class of problems.
◮ Two components to this approach:
  ◮ Constructing a sequence of synthetic distributions.
  ◮ Applying sequential importance sampling and resampling.
◮ Initialise by sampling from the law of the Markov chain.
◮ Iteratively obtain samples from a sequence of distributions which moves smoothly towards the rare event.
◮ Proposed sequence of distributions: the path law reweighted by potential functions which increasingly favour paths with large values of V.
◮ Estimate the normalising constant of the final distribution; this yields the rare event probability.
◮ Given a family of densities p(x|θ) = q(x|θ)/z(θ), the path sampling identity gives:
  (d/dθ) log z(θ) = E_{p(·|θ)}[(∂/∂θ) log q(X|θ)].
◮ Consequently, we obtain:
  log [z(θ1)/z(θ0)] = ∫_{θ0}^{θ1} E_{p(·|θ)}[(∂/∂θ) log q(X|θ)] dθ.
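The identity can be checked numerically on a family with known z(θ); the choice q(x|α) = exp(−αx²/2), for which z(α) = √(2π/α) and (∂/∂α) log q = −x²/2, is an illustrative assumption:

```python
import math, random

def path_sampling_log_ratio(a0=0.5, a1=1.0, n_grid=21, n_samples=2000, seed=6):
    """Path sampling for q(x|a) = exp(-a x^2 / 2), z(a) = sqrt(2 pi / a):
    estimate E_{p(.|a)}[-x^2 / 2] on a grid of a values, then integrate
    (trapezoidal rule) to recover log z(a1) - log z(a0) = 0.5 * log(a0/a1)."""
    rng = random.Random(seed)
    alphas = [a0 + (a1 - a0) * k / (n_grid - 1) for k in range(n_grid)]
    derivs = []
    for a in alphas:
        sd = 1.0 / math.sqrt(a)                            # p(.|a) = N(0, 1/a)
        derivs.append(sum(-rng.gauss(0.0, sd) ** 2 / 2
                          for _ in range(n_samples)) / n_samples)
    h = (a1 - a0) / (n_grid - 1)                           # trapezoidal rule
    return h * (sum(derivs) - 0.5 * (derivs[0] + derivs[-1]))

print(path_sampling_log_ratio())   # exact value is 0.5 * log(0.5)
```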
◮ Smooth potential functions: gn(x) = [1 + exp(−αn(V(x) − V̂))]^{−1}, with (αn) increasing.
◮ The corresponding path sampling estimate of the derivative of the log normalising constant is
  (1/N) Σ_{j=1}^{N} (V(X(j)_n) − V̂) / [1 + exp(αn(V(X(j)_n) − V̂))].
◮ A final importance weighting against the indicator I_{[V̂,∞)}(V(x)) restricts attention to the rare set T.
Example: Gaussian Random Walk
◮ A toy example: Mt(Rt−1, Rt) = N(Rt|Rt−1, 1).
◮ T = R^P × [V̂, ∞).
◮ Proposal kernel: a Markov kernel leaving the current distribution invariant.
◮ Linear annealing schedule.
◮ Number of distributions T ∝ V̂.
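As a simple stand-in for the SMC scheme (not the algorithm of the slides), the rare-event probability for this toy walk can be estimated by exponentially tilting the increments; the values P = 15 and V̂ = 10 are assumptions made for the example:

```python
import math, random

def rare_event_is(P=15, v_hat=10.0, n=100_000, seed=7):
    """Estimate P(R_P >= v_hat) for a standard Gaussian random walk
    R_t = R_{t-1} + N(0, 1), by adding drift delta = v_hat / P to each
    increment and correcting with the likelihood ratio
    exp(-delta * R_P + P * delta^2 / 2)."""
    rng = random.Random(seed)
    delta = v_hat / P
    total = 0.0
    for _ in range(n):
        r = sum(rng.gauss(delta, 1.0) for _ in range(P))   # tilted walk endpoint
        if r >= v_hat:
            total += math.exp(-delta * r + P * delta * delta / 2)
    return total / n

est = rare_event_is()
exact = 0.5 * math.erfc((10.0 / math.sqrt(15.0)) / math.sqrt(2.0))  # Gaussian tail
print(est, exact)
```

Crude Monte Carlo would need millions of samples for comparable accuracy here; the tilted proposal concentrates effort on the rare set, which is also the intuition behind the SMC construction above.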
[Figure: log probability against threshold (5–40): true values compared with IPS(100), IPS(1,000), IPS(20,000) and SMC(100).]
[Figure: typical SMC run, all particles: Markov chain state value against state number.]
[Figure: typical IPS run, particles which hit the rare set: Markov chain state value against state number.]
Filtering of PD Processes
◮ For t ∈ R+_0, consider an object with position st, velocity vt and acceleration at.
◮ Summarise the state by ζt = (st, vt, at).
◮ From initial condition ζ0, the state evolves deterministically until a random time τ1, at which it jumps.
◮ From ζτ1, deterministic evolution until τ2, when the state becomes ζτ2, etc.
◮ Observation times, (tn)n∈N: at each tn a noisy observation of the state is received.
[Figure: example object trajectory in the (sx/km, sy/km) plane.]
◮ Pair Markov chain (τj, θj)j∈N, with τj ∈ R+ and θj ∈ Θ.
◮ Count the jumps: νt := Σj I[τj ≤ t].
◮ Deterministic evolution function F : R+_0 × Θ → Θ, such that
◮ the signal process (ζt)t∈R+_0 satisfies ζt = F(t − τνt, θνt).
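A toy process in this notation can be simulated directly; the state (position, velocity), the exponential jump rate and the Gaussian velocity-redraw kernel below are illustrative assumptions, not those of the tracking model:

```python
import random

def simulate_pdp(t_max=10.0, rate=0.5, seed=8):
    """Toy piecewise deterministic process: theta = (position, velocity),
    deterministic flow F(u, (s, v)) = s + v * u between jumps, and at
    exponential jump times the velocity is redrawn from N(0, 1)."""
    rng = random.Random(seed)
    jumps = [(0.0, (0.0, rng.gauss(0.0, 1.0)))]       # (tau_j, theta_j), tau_0 = 0
    t = 0.0
    while True:
        t += rng.expovariate(rate)                    # next jump time
        if t >= t_max:
            break
        tau_prev, (s_prev, v_prev) = jumps[-1]
        # position is continuous across the jump; velocity is redrawn
        jumps.append((t, (s_prev + v_prev * (t - tau_prev), rng.gauss(0.0, 1.0))))

    def state(query_t):
        """zeta_t = F(t - tau_{nu_t}, theta_{nu_t}): flow from the last jump."""
        tau, (s, v) = max(j for j in jumps if j[0] <= query_t)
        return s + v * (query_t - tau)

    return jumps, state

jumps, state = simulate_pdp()
print(len(jumps), state(5.0))
```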
◮ This describes a Piecewise Deterministic Process.
◮ It’s partially observed via observations (Yn)n∈N, e.g., noisy measurements of the position at the times (tn)n∈N.
◮ Filtering: given observations, y1:n, estimate ζtn.
◮ How can we approximate p(ζtn|y1:n), p(ζtn+1|y1:n+1), . . . ?
◮ Sequence of spaces (En)n∈N, each containing the configurations of jumps on [0, tn].
◮ Define kn := νtn and Xn = (ζ0, kn, τ1:kn, θ1:kn) ∈ En.
◮ Sequence of posterior distributions (ηn)n∈N over these spaces, each proportional to the prior law of Xn times the likelihood of y1:n.
◮ Recall Xn = (ζ0, kn, τ1:kn, θ1:kn) specifies a path (ζt)t∈[0,tn].
◮ If the forward kernel Kn only alters the recent components of the path, the current state can still be approximated consistently:
  Σ_{i=1}^{N} W(i)_n δ_{F(tn−τ(i)_{kn}, θ(i)_{kn})}(dζtn).
◮ A mixture proposal over several move types can be employed.
◮ When Kn corresponds to extending xn−1 into En by adding new jumps only, we recover a standard particle filtering structure.
◮ This is inefficient, as it involves propagating multiple copies of identical trajectories.
◮ A more efficient strategy is to propose births and to adjust the most recent jumps.
◮ To minimize the variance of the importance weights, we use (an approximation of) the optimal auxiliary kernel.
[Figure: filtering results for the trajectory in the (sx/km, sy/km) plane.]
[Figure: log RMSE against CPU time/s.]
◮ This framework allows us to analyse the algorithm of Godsill et al. [5].
◮ Let µn(ϕ) denote the filtering expectation of interest and µ^N_n(ϕ) the corresponding particle estimate.
◮ Under standard regularity conditions:
  √N (µ^N_n(ϕ) − µn(ϕ)) ⇒ N(0, σ²_n(ϕ)).
◮ Under rather strong assumptions*:
  E[|µ^N_n(ϕ) − µn(ϕ)|^p]^{1/p} ≤ cp(ϕ)/√N.
Summary
◮ Implementing SMC algorithms in C/C++ isn’t hard.
◮ Software for implementing general SMC algorithms:
  ◮ C++ element largely confined to the library.
  ◮ Available under a GPL-3 licence.
◮ Example code includes estimation of Gaussian tail probabilities.
◮ Particle filters can also be implemented easily.
◮ Monte Carlo methods have uses beyond the calculation of posterior expectations.
◮ SMC provides a viable alternative to MCMC.
◮ SMC is effective at:
  ◮ ML and MAP estimation;
  ◮ rare event estimation;
  ◮ filtering outside the standard particle filtering framework;
  ◮ . . .
◮ Other published applications include approximate Bayesian computation, . . .
◮ A huge amount of work remains to be done. . .
References
[1]
[2] P. Del Moral, A. Doucet, and A. Jasra. Sequential Monte Carlo samplers. Journal of the Royal Statistical Society B, 68(3):411–436, 2006.
[3] A. Doucet, S. J. Godsill, and C. P. Robert. Marginal maximum a posteriori estimation using Markov chain Monte Carlo. Statistics and Computing, 12:77–84, 2002.
[4] A. Gelman and X.-L. Meng. Simulating normalizing constants: From importance sampling to bridge sampling to path sampling. Statistical Science, 13(2):163–185, 1998.
[5] S. Godsill, J. Vermaak, W. Ng, and J. Li. Models and algorithms for tracking of manoeuvring objects using variable rate particle filters. Proceedings of the IEEE, 95(5):925–952, 2007.
[6] C.-R. Hwang. Laplace’s method revisited: Weak convergence of probability measures. The Annals of Probability, 8(6):1177–1182, December 1980.
[7] Technical report, University of Bristol, Department of Mathematics – Statistics Group, University Walk, Bristol, BS8 1TW, UK, July 2008.
[8] A. M. Johansen, P. Del Moral, and A. Doucet. Sequential Monte Carlo samplers for rare events. In Proceedings of the 6th International Workshop on Rare Event Simulation, pages 256–267, Bamberg, Germany, October 2006.
[9] A. M. Johansen, A. Doucet, and M. Davy. Particle methods for maximum likelihood estimation in latent variable models. Statistics and Computing, 18(1):47–57, March 2008.
[10] In proceedings, pages 89–93, Madison, WI, USA, August 26th–29th 2007. IEEE.
[11] N. Whiteley, A. M. Johansen, and S. Godsill. Monte Carlo filtering of piecewise-deterministic processes. In revision, 2008.
Justification of path sampling identity