SLIDE 1

Particle Gibbs with Ancestor Sampling

Fredrik Lindsten⋆, Michael I. Jordan†, Thomas B. Schön‡ Chamonix, January 6, 2014

⋆Division of Automatic Control

Linköping University, Sweden

†Departments of EECS and Statistics

University of California, Berkeley, USA

‡Department of Information Technology

Uppsala University, Sweden

Fredrik Lindsten, Michael I. Jordan, Thomas B. Schön Inference in nonlinear state-space models using Particle Gibbs with Ancestor Sampling

AUTOMATIC CONTROL REGLERTEKNIK LINKÖPINGS UNIVERSITET

SLIDE 2

Identification of state-space models


Consider a nonlinear discrete-time state-space model,

$$x_t \sim f_\theta(x_t \mid x_{t-1}), \qquad y_t \sim g_\theta(y_t \mid x_t),$$

and $x_1 \sim \pi(x_1)$. We observe $y_{1:T} = (y_1, \dots, y_T)$ and wish to estimate θ.

  • Frequentists: Find $\hat{\theta}_{\mathrm{ML}} = \arg\max_\theta p_\theta(y_{1:T})$.
    • Use e.g. the Monte Carlo EM algorithm.
  • Bayesians: Find $p(\theta \mid y_{1:T})$.
    • Use e.g. Gibbs sampling.
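To make the model class concrete, here is a minimal simulator (a sketch in Python/NumPy, not part of the talk) for the stochastic volatility model used later in the deck; drawing the initial state from the stationary distribution of the AR(1) state process is our assumption:

```python
import numpy as np

def simulate_sv(theta, T, seed=None):
    """Simulate x_{t+1} = 0.9 x_t + v_t, v_t ~ N(0, theta),
    y_t = e_t * exp(x_t / 2), e_t ~ N(0, 1).
    Assumption: x_1 ~ N(0, theta / (1 - 0.9**2)), the stationary
    distribution of the AR(1) state process."""
    rng = np.random.default_rng(seed)
    x = np.empty(T)
    y = np.empty(T)
    x[0] = rng.normal(0.0, np.sqrt(theta / (1 - 0.9 ** 2)))
    for t in range(T):
        y[t] = rng.standard_normal() * np.exp(0.5 * x[t])
        if t + 1 < T:
            x[t + 1] = 0.9 * x[t] + rng.normal(0.0, np.sqrt(theta))
    return x, y

x, y = simulate_sv(theta=0.5, T=200, seed=1)
```

Both identification strategies below take such a record $y_{1:T}$ as input, with θ (here the transition noise variance) unknown.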

SLIDE 3

Gibbs sampler for SSMs


Aim: Find $p(\theta, x_{1:T} \mid y_{1:T})$.

MCMC: Gibbs sampling for state-space models. Iterate:

  • Draw $\theta[k] \sim p(\theta \mid x_{1:T}[k-1], y_{1:T})$. OK!
  • Draw $x_{1:T}[k] \sim p_{\theta[k]}(x_{1:T} \mid y_{1:T})$. Hard!

Problem: $p_\theta(x_{1:T} \mid y_{1:T})$ is not available!
Idea: Approximate $p_\theta(x_{1:T} \mid y_{1:T})$ using a particle filter.

SLIDE 4

The particle filter


  • Resampling: $\mathbb{P}(a^i_t = j \mid \mathcal{F}^N_{t-1}) = w^j_{t-1} / \sum_l w^l_{t-1}$.
  • Propagation: $x^i_t \sim R^\theta_t(x_t \mid x^{a^i_t}_{1:t-1})$ and $x^i_{1:t} = \{x^{a^i_t}_{1:t-1}, x^i_t\}$.
  • Weighting: $w^i_t = W^\theta_t(x^i_{1:t})$.

$\Rightarrow \{x^i_{1:t}, w^i_t\}_{i=1}^N$

SLIDE 5

The particle filter


Algorithm: Particle filter (PF)

  1. Initialize (t = 1):
     (a) Draw $x^i_1 \sim R^\theta_1(x_1)$ for $i = 1, \dots, N$.
     (b) Set $w^i_1 = W^\theta_1(x^i_1)$ for $i = 1, \dots, N$.
  2. For t = 2, …, T:
     (a) Draw $a^i_t \sim \mathrm{Discrete}(\{w^j_{t-1}\}_{j=1}^N)$ for $i = 1, \dots, N$.
     (b) Draw $x^i_t \sim R^\theta_t(x_t \mid x^{a^i_t}_{1:t-1})$ for $i = 1, \dots, N$.
     (c) Set $x^i_{1:t} = \{x^{a^i_t}_{1:t-1}, x^i_t\}$ and $w^i_t = W^\theta_t(x^i_{1:t})$ for $i = 1, \dots, N$.
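The algorithm above can be sketched in a few lines of code. This is a hedged illustration (not the authors' implementation) of the bootstrap special case, where the proposal is the transition density, $R^\theta_t = f_\theta$, so the weight function reduces to $W^\theta_t(x_{1:t}) = g_\theta(y_t \mid x_t)$; it stores current particle positions rather than full paths:

```python
import numpy as np

def bootstrap_pf(y, f_sample, g_logpdf, init_sample, N, seed=None):
    """Bootstrap particle filter (R_t = f_theta, W_t = g_theta(y_t | x_t)).

    f_sample(x_prev, rng) -> propagated particles (elementwise)
    g_logpdf(y_t, x)      -> log g_theta(y_t | x), elementwise
    init_sample(N, rng)   -> N draws from the initial distribution
    Returns particle positions X (T x N) and final normalized weights."""
    rng = np.random.default_rng(seed)
    T = len(y)
    X = np.empty((T, N))
    X[0] = init_sample(N, rng)                 # 1(a): initialize
    logw = g_logpdf(y[0], X[0])                # 1(b): weight
    for t in range(1, T):
        w = np.exp(logw - logw.max())
        w /= w.sum()
        a = rng.choice(N, size=N, p=w)         # 2(a): resample ancestor indices
        X[t] = f_sample(X[t - 1][a], rng)      # 2(b): propagate
        logw = g_logpdf(y[t], X[t])            # 2(c): weight
    w = np.exp(logw - logw.max())
    return X, w / w.sum()
```

For the stochastic volatility model later in the talk one would pass, e.g., `f_sample = lambda x, rng: 0.9 * x + rng.normal(0, np.sqrt(theta), x.shape)` and `g_logpdf = lambda yt, x: -0.5 * yt**2 * np.exp(-x) - 0.5 * x` (constants dropped).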

SLIDE 6

The particle filter


[Figure: particle filter output; particle trajectories, state vs. time.]

SLIDE 7

Sampling based on the PF


With $\mathbb{P}(x^\star_{1:T} = x^i_{1:T} \mid \mathcal{F}^N_T) \propto w^i_T$ we get $x^\star_{1:T} \overset{\text{approx.}}{\sim} p_\theta(x_{1:T} \mid y_{1:T})$.

[Figure: the sampled trajectory among the particle trajectories, state vs. time.]

SLIDE 8

Conditional particle filter with ancestor sampling


Problems with this approach,

  • Based on a PF ⇒ approximate sample.
  • Does not leave p(θ, x1:T | y1:T) invariant!
  • Relies on large N to be successful.
  • A lot of wasted computations.

Conditional particle filter with ancestor sampling (CPF-AS)

Let $x'_{1:T} = (x'_1, \dots, x'_T)$ be a fixed reference trajectory.

  • At each time t, sample only N − 1 particles in the standard way.
  • Set the Nth particle deterministically: $x^N_t = x'_t$.
  • Generate an artificial history for $x^N_t$ by ancestor sampling.

SLIDE 9

Conditional particle filter with ancestor sampling


Algorithm: CPF-AS, conditioned on $x'_{1:T}$

  1. Initialize (t = 1):
     (a) Draw $x^i_1 \sim R^\theta_1(x_1)$ for $i = 1, \dots, N-1$.
     (b) Set $x^N_1 = x'_1$.
     (c) Set $w^i_1 = W^\theta_1(x^i_1)$ for $i = 1, \dots, N$.
  2. For t = 2, …, T:
     (a) Draw $a^i_t \sim \mathrm{Discrete}(\{w^j_{t-1}\}_{j=1}^N)$ for $i = 1, \dots, N-1$.
     (b) Draw $x^i_t \sim R^\theta_t(x_t \mid x^{a^i_t}_{1:t-1})$ for $i = 1, \dots, N-1$.
     (c) Set $x^N_t = x'_t$.
     (d) Draw $a^N_t$ with $\mathbb{P}(a^N_t = i \mid \mathcal{F}^N_{t-1}) \propto w^i_{t-1} f_\theta(x'_t \mid x^i_{t-1})$.
     (e) Set $x^i_{1:t} = \{x^{a^i_t}_{1:t-1}, x^i_t\}$ and $w^i_t = W^\theta_t(x^i_{1:t})$ for $i = 1, \dots, N$.
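A compact code sketch of CPF-AS for the bootstrap case ($R^\theta_t = f_\theta$, $W^\theta_t = g_\theta$) may help; this is our illustrative assumption, not the authors' code. It keeps explicit ancestor indices, performs the final draw $\mathbb{P}(x^\star_{1:T} = x^i_{1:T} \mid \mathcal{F}^N_T) \propto w^i_T$, and traces the ancestry back to return one trajectory:

```python
import numpy as np

def cpf_as(y, x_ref, f_sample, f_logpdf, g_logpdf, init_sample, N, seed=None):
    """One pass of CPF-AS conditioned on x_ref, returning a sampled
    trajectory x_star. Bootstrap proposal assumed: R_t = f, W_t = g."""
    rng = np.random.default_rng(seed)
    T = len(y)
    X = np.empty((T, N))
    A = np.zeros((T, N), dtype=int)                  # ancestor indices
    X[0, :N - 1] = init_sample(N - 1, rng)           # 1(a)
    X[0, N - 1] = x_ref[0]                           # 1(b): condition on reference
    logw = g_logpdf(y[0], X[0])                      # 1(c)
    for t in range(1, T):
        w = np.exp(logw - logw.max())
        w /= w.sum()
        A[t, :N - 1] = rng.choice(N, size=N - 1, p=w)         # 2(a)
        X[t, :N - 1] = f_sample(X[t - 1][A[t, :N - 1]], rng)  # 2(b)
        X[t, N - 1] = x_ref[t]                                # 2(c)
        # 2(d): ancestor sampling, P(a_t^N = i) ∝ w_{t-1}^i f(x'_t | x_{t-1}^i)
        logv = np.log(w) + f_logpdf(x_ref[t], X[t - 1])
        v = np.exp(logv - logv.max())
        A[t, N - 1] = rng.choice(N, p=v / v.sum())
        logw = g_logpdf(y[t], X[t])                           # 2(e)
    w = np.exp(logw - logw.max())
    i = rng.choice(N, p=w / w.sum())                 # P(x* = x^i | F_T) ∝ w_T^i
    x_star = np.empty(T)
    for t in range(T - 1, -1, -1):                   # trace ancestry backwards
        x_star[t] = X[t, i]
        if t > 0:
            i = A[t, i]
    return x_star
```

Note that only step 2(d) distinguishes this from plain conditional particle Gibbs: the reference particle's ancestor is resampled, which lets the output trajectory differ from $x'_{1:T}$ even in its early segments.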

SLIDE 10

The PGAS Markov kernel (I/II)


Consider the procedure:

  1. Run CPF-AS($N, x'_{1:T}$) targeting $p_\theta(x_{1:T} \mid y_{1:T})$.
  2. Sample $x^\star_{1:T}$ with $\mathbb{P}(x^\star_{1:T} = x^i_{1:T} \mid \mathcal{F}^N_T) \propto w^i_T$.

[Figure: CPF-AS particle system with the reference trajectory, state vs. time.]

SLIDE 11

The PGAS Markov kernel (II/II)


This procedure:

  • Maps $x'_{1:T}$ stochastically into $x^\star_{1:T}$.
  • Implicitly defines a Markov kernel $P^N_\theta$ on $(\mathsf{X}^T, \mathcal{X}^T)$, referred to as the PGAS (Particle Gibbs with ancestor sampling) kernel.

Theorem
For any number of particles N ≥ 1 and for any θ ∈ Θ, the PGAS kernel $P^N_\theta$ leaves $p_\theta(x_{1:T} \mid y_{1:T})$ invariant:
$$p_\theta(dx^\star_{1:T} \mid y_{1:T}) = \int P^N_\theta(x'_{1:T}, dx^\star_{1:T})\, p_\theta(dx'_{1:T} \mid y_{1:T}).$$

  • F. Lindsten, M. I. Jordan and T. B. Schön, Ancestor sampling for particle Gibbs, in P. Bartlett, F. C. N. Pereira, C. J. C. Burges, L. Bottou and K. Q. Weinberger (Eds.), Advances in Neural Information Processing Systems (NIPS) 25, 2600–2608, 2012.
  • F. Lindsten, M. I. Jordan and T. B. Schön, Particle Gibbs with ancestor sampling, arXiv:1401.0604, 2014.

SLIDE 12

Ergodicity


Theorem
Assume that there exist constants ε > 0 and κ < ∞ such that, for any θ ∈ Θ, t ∈ {1, …, T} and $x_{1:t} \in \mathsf{X}^t$, $W^\theta_t(x_{1:t}) \le \kappa$ and $p_\theta(y_{1:T}) \ge \varepsilon$.* Then, for any N ≥ 2, the PGAS kernel $P^N_\theta$ is uniformly ergodic. That is, there exist constants R < ∞ and ρ ∈ [0, 1) such that
$$\left\| (P^N_\theta)^k(x'_{1:T}, \cdot) - p_\theta(\cdot \mid y_{1:T}) \right\|_{\mathrm{TV}} \le R \rho^k, \quad \forall x'_{1:T} \in \mathsf{X}^T.$$

*N.B. These conditions are simple, but unnecessarily strong; see (Lindsten, Douc and Moulines, 2014).

  • F. Lindsten, M. I. Jordan and T. B. Schön, Particle Gibbs with ancestor sampling, arXiv:1401.0604, 2014.

SLIDE 13

PGAS for Bayesian identification


Bayesian identification: PGAS + Gibbs

Algorithm: PGAS for Bayesian identification

  1. Initialize: Set {θ[0], x_{1:T}[0]} arbitrarily.
  2. For k ≥ 1, iterate:
     (a) Draw $x_{1:T}[k] \sim P^N_{\theta[k-1]}(x_{1:T}[k-1], \cdot)$.
     (b) Draw $\theta[k] \sim p(\theta \mid x_{1:T}[k], y_{1:T})$.

For any number of particles N ≥ 2, the Markov chain $\{\theta[k], x_{1:T}[k]\}_{k \ge 1}$ has limiting distribution $p(\theta, x_{1:T} \mid y_{1:T})$.
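Step (b) depends on the model. As an illustrative sketch (our assumptions: an inverse-Gamma(a, b) prior on θ and the initial-state term ignored), here is the conditional draw for the stochastic volatility model on the next slide, whose transition noise variance is θ; the measurements drop out because θ enters only the transition:

```python
import numpy as np

def draw_theta(x, a=1.0, b=0.1, seed=None):
    """Draw theta | x_{1:T} for x_{t+1} = 0.9 x_t + v_t, v_t ~ N(0, theta),
    under an assumed inverse-Gamma(a, b) prior on theta. Conjugacy gives
    theta | x ~ IG(a + (T-1)/2, b + s/2), with s the sum of squared
    transition residuals."""
    rng = np.random.default_rng(seed)
    r = x[1:] - 0.9 * x[:-1]                  # transition residuals
    a_post = a + 0.5 * len(r)
    b_post = b + 0.5 * np.sum(r ** 2)
    # If G ~ Gamma(shape=a_post, scale=1/b_post), then 1/G ~ IG(a_post, b_post).
    return 1.0 / rng.gamma(a_post, 1.0 / b_post)
```

Alternating this draw with one pass of the PGAS kernel (step (a)) gives the full sampler.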

SLIDE 14

ex) Stochastic volatility model


Stochastic volatility model:

$$x_{t+1} = 0.9\, x_t + v_t, \quad v_t \sim \mathcal{N}(0, \theta),$$
$$y_t = e_t \exp\!\left(\tfrac{1}{2} x_t\right), \quad e_t \sim \mathcal{N}(0, 1).$$

Consider the ACF of $\theta[k] - \mathbb{E}[\theta \mid y_{1:T}]$.

[Figure: ACF vs. lag for PGAS and PG, T = 1000, N ∈ {5, 20, 100, 1000}.]

SLIDE 15

PGAS vs. PG


[Figure: sampled trajectories for PGAS and PG, state vs. time.]

SLIDE 16

Examples


  • F. Lindsten, T. B. Schön and M. I. Jordan, Bayesian semiparametric Wiener system identification, Automatica, 49(7): 2053–2063, July 2013.

[Figure: Bode plot (magnitude and phase vs. frequency) and static nonlinearity h(z); true values, posterior mean, 99% credibility intervals.]

  • R. Frigola, F. Lindsten, T. B. Schön and C. E. Rasmussen, Bayesian inference and learning in Gaussian process state-space models with particle MCMC, Conference on Neural Information Processing Systems (NIPS), Lake Tahoe, NV, USA, 2013.

[Figure: learned transition function f(x, u) and state trajectory x vs. t.]

SLIDE 17

Maximum likelihood identification


Back to the frequentist objective: $\hat{\theta}_{\mathrm{ML}} = \arg\max_\theta p_\theta(y_{1:T})$.

  • Common strategy: particle smoothing + EM (PSEM).
  • Alternative strategy: PGAS + stochastic approximation EM (PSAEM).

[Diagram relating EM, MCEM, PSEM, SAEM and PSAEM.]

  • F. Lindsten, An efficient stochastic approximation EM algorithm using conditional particle filters, Proceedings of the 38th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2013.

SLIDE 18

PGAS for maximum likelihood identification


Maximum likelihood identification: PGAS + SAEM = PSAEM

Algorithm: PSAEM

  1. Initialize: Set {θ[0], x_{1:T}[0]} arbitrarily. Set $\widehat{Q}_0 \equiv 0$.
  2. For k ≥ 1, iterate:
     (a) Draw $x_{1:T}[k] \sim P^N_{\theta[k-1]}(x_{1:T}[k-1], \cdot)$.
     (b) Compute $\widehat{Q}_k(\theta) = \widehat{Q}_{k-1}(\theta) + \gamma_k \left( \log p_\theta(x_{1:T}[k], y_{1:T}) - \widehat{Q}_{k-1}(\theta) \right)$.
     (c) Set $\theta[k] = \arg\max_\theta \widehat{Q}_k(\theta)$.
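In exponential-family models the $\widehat{Q}_k$ recursion is carried out on complete-data sufficient statistics rather than on the function itself. A toy sketch of this (our construction, not from the talk): for a Gaussian residual model the statistic is the mean squared residual, $\widehat{Q}_k$ is summarized by the running value $S_k = S_{k-1} + \gamma_k \left(s(x_{1:T}[k]) - S_{k-1}\right)$, and the M-step argmax in (c) is simply $\theta[k] = S_k$:

```python
import numpy as np

def saem_variance(draws, gamma=lambda k: 1.0 / k):
    """Stochastic-approximation update of the sufficient statistic
    (mean squared residual) for a variance parameter; returns theta[k]
    for k = 1, 2, ....  `draws` plays the role of the residual sequences
    computed from the CPF-AS draws x_{1:T}[k]."""
    S = 0.0
    thetas = []
    for k, r in enumerate(draws, start=1):
        s_k = float(np.mean(np.asarray(r) ** 2))  # complete-data statistic s(x[k])
        S = S + gamma(k) * (s_k - S)              # step (b) on sufficient statistics
        thetas.append(S)                          # step (c): argmax is S itself
    return thetas
```

With $\gamma_k = 1/k$ this is exactly the running mean of the per-iteration statistics; faster-decaying step sizes trade averaging for responsiveness.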

SLIDE 19

ex) Nonlinear time series


Consider,

$$x_{t+1} = a x_t + b \frac{x_t}{1 + x_t^2} + c \cos(1.2\, t) + v_t, \quad v_t \sim \mathcal{N}(0, \sigma_v^2),$$
$$y_t = 0.05\, x_t^2 + e_t, \quad e_t \sim \mathcal{N}(0, \sigma_e^2).$$

  • Parameterization: $\theta = (a, b, c, \sigma_v^2, \sigma_e^2)$.
  • Relative error: $\|(\theta[k] - \hat{\theta}_{\mathrm{ML}})\, ./\, \hat{\theta}_{\mathrm{ML}}\|_2$.

SLIDE 20

ex) Nonlinear time series


[Figure (log-log axes): computational time (seconds) vs. average relative error, for PSAEM (N = 15) and PSEM (N = 15, 50, 100, 500).]

SLIDE 21

Summary


Particle Gibbs with ancestor sampling (PGAS):

  • Uses SMC within MCMC in a systematic manner.
  • Defines an ergodic Markov kernel on $(\mathsf{X}^T, \mathcal{X}^T)$ leaving $p_\theta(x_{1:T} \mid y_{1:T})$ invariant for any number of particles N ≥ 2.
  • Seems to work well with a moderate N (say 5–50).
  • Consists of two parts:
    • Conditioning: ensures the correct stationary distribution for any N.
    • Ancestor sampling: mitigates path degeneracy and enables movement around the conditioned path.
  • Useful for both Bayesian and maximum-likelihood-based parameter inference, as well as for state inference.

SLIDE 22

Ongoing and future work


Ongoing and future work:

  • Adapting PGAS to other types of models:
  • General probabilistic graphical models (factor graphs).
  • Non-Markovian latent variable models.
  • Ancestor sampling for models with intractable transitions.
  • How to scale N with T? — establish more informative rates.

SLIDE 23

A few selected references

Particle Gibbs with ancestor sampling:

  • F. Lindsten, M. I. Jordan and T. B. Schön, Particle Gibbs with ancestor sampling, arXiv:1401.0604, 2014.
  • F. Lindsten, M. I. Jordan and T. B. Schön, Ancestor sampling for particle Gibbs, in P. Bartlett, F. C. N. Pereira, C. J. C. Burges, L. Bottou and K. Q. Weinberger (Eds.), Advances in Neural Information Processing Systems (NIPS) 25, 2600–2608, 2012.

Particle Gibbs with backward simulation (related procedure):

  • N. Whiteley, Discussion on Particle Markov chain Monte Carlo methods, Journal of the Royal Statistical Society: Series B, 72(3), 306–307, 2010.
  • N. Whiteley, C. Andrieu and A. Doucet, Efficient Bayesian inference for switching state-space models using discrete particle Markov chain Monte Carlo methods, Bristol Statistics Research Report 10:04, 2010.
  • F. Lindsten and T. B. Schön, On the use of backward simulation in the particle Gibbs sampler, Proceedings of the 37th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2012.

Seminal PMCMC paper:

  • C. Andrieu, A. Doucet and R. Holenstein, Particle Markov chain Monte Carlo methods, Journal of the Royal Statistical Society: Series B, 72(3), 269–342, 2010.

SLIDE 24

A few selected references

Ergodicity of Particle Gibbs:

  • F. Lindsten, R. Douc and E. Moulines, Uniform ergodicity of the Particle Gibbs sampler, arXiv:1401.0683, 2014.
  • C. Andrieu, A. Lee and M. Vihola, Uniform ergodicity of the iterated conditional SMC and geometric ergodicity of particle Gibbs samplers, arXiv:1312.6432, 2013.
  • N. Chopin and S. S. Singh, On the particle Gibbs sampler, arXiv:1304.1887, 2013.

PMCMC and SAEM:

  • F. Lindsten, An efficient stochastic approximation EM algorithm using conditional particle filters, Proceedings of the 38th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2013.
  • C. Andrieu and M. Vihola, Markovian stochastic approximation with expanding projections, arXiv:1111.5421, 2011.
  • S. Donnet and A. Samson, EM algorithm coupled with particle filter for maximum likelihood parameter estimation of stochastic differential mixed-effects models, Tech. Rep. hal-00519576, v2, Université Paris Descartes, MAP5, 2011.