

SLIDE 1

Least-Action Filtering

L. C. G. Rogers
Statistical Laboratory, University of Cambridge



SLIDE 5

Summary

  • Basics of least-action filtering
  • Finding the least-action path
  • The approximate conditional distribution of the hidden path
  • Example(s)
  • Relationship to particle filtering (SMC)

SAMSI Program on Sequential Monte Carlo Methods, 9/08-9/09. Organisers: Arnaud Doucet, Simon Godsill. Working group on continuous-time methods (Fearnhead, Voss, ...). Markussen (SPA 119, 208-231, 2009) uses similar techniques to approximate the density of a discretely-sampled diffusion process.


SLIDE 13

The setting.

Diffusion Z_t ≡ [X_t; Y_t] in R^d, solving

    dZ_t = \sigma(t, Z_t)\, dW_t + \mu(t, Z_t)\, dt,

where σ, µ and σ^{-1} are C^2_b. We observe (Y_t)_{0≤t≤T} and want to find the conditional distribution of (X_t)_{0≤t≤T}. Closely related is the Euler scheme

    dz^{(n)}_t = \sigma(t_n, z^{(n)}_{t_n})\, dW_t + \mu(t_n, z^{(n)}_{t_n})\, dt,

where t_n ≡ 2^{-n} \lfloor 2^n t \rfloor. Despite appearances, this can be viewed as a discrete scheme. We also have

    \sup_{0 \le t \le T} |z^{(n)}_t - Z_t| \to 0 \quad \text{a.s.}

Use continuous time for guidance, discrete time for numerics and proof.
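To fix ideas, here is a minimal Python sketch (ours, not from the talk) of the Euler scheme above in one dimension; sigma and mu are placeholder coefficient functions:

    import numpy as np

    def euler_path(sigma, mu, z0, T, n, rng=None):
        # Simulate the Euler scheme z^(n) on [0, T] with step h = 2**(-n).
        # sigma, mu: callables (t, z) -> float; placeholder scalar coefficients.
        if rng is None:
            rng = np.random.default_rng(0)
        h = 2.0 ** (-n)
        N = int(round(T / h))
        z = np.empty(N + 1)
        z[0] = z0
        for j in range(N):
            t = j * h
            dW = rng.normal(scale=np.sqrt(h))   # Brownian increment over [t, t+h]
            z[j + 1] = z[j] + sigma(t, z[j]) * dW + mu(t, z[j]) * h
        return z

    # Example: an OU-type diffusion dZ = dW - Z dt on [0, 1], with n = 8
    path = euler_path(lambda t, z: 1.0, lambda t, z: -z, z0=0.0, T=1.0, n=8)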


SLIDE 18

Log Likelihoods.

We see (y_{j2^{-n}})_{0≤j≤2^n T} and want the conditional law of (x_{j2^{-n}})_{0≤j≤2^n T}. The log-likelihood is (to within an additive constant, with N ≡ 2^n T, h ≡ 2^{-n})

    \lambda(x|y) = -\frac{1}{2} \sum_{j=0}^{N-1} \frac{1}{h} \Bigl| \sigma(jh, z_{jh})^{-1} \bigl( z_{jh+h} - z_{jh} - h\,\mu(jh, z_{jh}) \bigr) \Bigr|^2 - \varphi(x_0)
                 = -\frac{1}{2} \sum_{j=0}^{N-1} h \Bigl| \sigma(jh, z_{jh})^{-1} \Bigl( \frac{z_{jh+h} - z_{jh}}{h} - \mu(jh, z_{jh}) \Bigr) \Bigr|^2 - \varphi(x_0)
               "=" -\frac{1}{2} \int_0^T \Bigl| \sigma(s, z_s)^{-1} \bigl( \dot z_s - \mu(s, z_s) \bigr) \Bigr|^2 ds - \varphi(x_0),

where exp(−ϕ) is the (prior) density of X_0. Maximising the log-likelihood is like maximising

    \Lambda(x|y) = -\frac{1}{2} \int_0^T \Bigl| \sigma(s, z_s)^{-1} \bigl( \dot z_s - \mu(s, z_s) \bigr) \Bigr|^2 ds - \varphi(x_0)
                 \equiv -\int_0^T \psi(s, x_s, p_s)\, ds - \varphi(x_0), \qquad p_s \equiv \dot x_s.

This is a task for the calculus of variations ...
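A minimal 1-d Python sketch (ours, not from the talk) evaluating the discrete log-likelihood above; sigma, mu and phi are placeholder callables:

    import numpy as np

    def log_lik(z, h, sigma, mu, phi):
        # Discrete log-likelihood lambda(x|y) up to an additive constant,
        # for a 1-d sampled path z = (z_0, z_h, ..., z_Nh).
        # sigma, mu: callables (t, z) -> float; phi: -log prior density of X_0.
        N = len(z) - 1
        lam = -phi(z[0])
        for j in range(N):
            t = j * h
            resid = (z[j + 1] - z[j] - h * mu(t, z[j])) / sigma(t, z[j])
            lam -= 0.5 * resid ** 2 / h
        return lam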


SLIDE 21

Calculus of Variations.

If we perturb the optimal x* to x* + ξ, the first-order change is

    \Delta\Lambda = \Delta \Bigl( -\int_0^T \psi(s, x^*_s, p^*_s)\, ds - \varphi(x_0) \Bigr)
                  = -\int_0^T \bigl\{ \xi \cdot D_x\psi + \dot\xi \cdot D_p\psi \bigr\}\, ds - \xi(0) \cdot D_x\varphi
                  = -[\xi \cdot D_p\psi]_0^T + \int_0^T \xi \cdot \bigl\{ D_{tp}\psi + (D_{px}\psi)\dot x + (D_{pp}\psi)\dot p - D_x\psi \bigr\}\, ds - \xi(0) \cdot D_x\varphi.

Since ξ is arbitrary, we conclude that

    0 = D_p\psi(0, x^*_0, p^*_0) - D_x\varphi(x^*_0),
    0 = D_{tp}\psi + (D_{px}\psi)\dot x^* + (D_{pp}\psi)\dot p^* - D_x\psi,
    0 = D_p\psi(T, x^*_T, p^*_T),

which is a second-order ODE for the optimal x*, with boundary conditions at 0 and at T.
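The two boundary conditions make this a two-point boundary value problem. As a sketch only, under toy assumptions of ours (σ ≡ 1, µ(x) = −βx, ϕ(x) = (x−1)²/2, so ψ(s,x,p) = (p − µ(x))²/2 with no observation coupling), the Euler-Lagrange equation reduces to ẍ = µ′(x)µ(x) and scipy's generic BVP solver applies:

    import numpy as np
    from scipy.integrate import solve_bvp

    beta, T = 1.0, 1.0
    mu = lambda x: -beta * x
    dmu = lambda x: -beta + 0.0 * x          # mu'(x), vectorised
    dphi = lambda x: x - 1.0                 # phi(x) = (x - 1)**2 / 2

    def rhs(t, u):
        # u[0] = x, u[1] = p = xdot; Euler-Lagrange: xddot = mu'(x) mu(x)
        x, p = u
        return np.vstack([p, dmu(x) * mu(x)])

    def bc(u0, uT):
        # At 0: D_p psi(0) - D_x phi = 0  =>  p(0) - mu(x(0)) - dphi(x(0)) = 0
        # At T: D_p psi(T) = 0            =>  p(T) - mu(x(T)) = 0
        return np.array([u0[1] - mu(u0[0]) - dphi(u0[0]),
                         uT[1] - mu(uT[0])])

    t = np.linspace(0.0, T, 50)
    sol = solve_bvp(rhs, bc, t, np.ones((2, len(t))))
    x_star = sol.sol(t)[0]                   # least-action path on the grid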


SLIDE 25

Discrete Calculus of Variations.

With p_j ≡ (x_{jh+h} − x_{jh})/h, we must minimize

    \sum_{j=0}^{N-1} h\,\psi(t_j, x_j, p_j) + \varphi(x_0).

Perturbing x* to x* + ξ as before gives the first-order change

    h \sum_{j=0}^{N-1} \Bigl\{ \xi_j \cdot D_x\psi(t_j, x_j, p_j) + \frac{\xi_{j+1} - \xi_j}{h} \cdot D_p\psi(t_j, x_j, p_j) \Bigr\} + \xi_0 \cdot D\varphi(x_0)
    = h \sum_{j=1}^{N-1} \xi_j \cdot \bigl\{ D_x\psi(t_j, x_j, p_j) - h^{-1}\bigl( D_p\psi(t_j, x_j, p_j) - D_p\psi(t_{j-1}, x_{j-1}, p_{j-1}) \bigr) \bigr\}
      + \xi_0 \cdot \bigl\{ D\varphi(x_0) - D_p\psi(t_0, x_0, p_0) \bigr\} + \xi_N \cdot D_p\psi(T, x_{N-1}, p_{N-1}).

This is an (implicit) discrete scheme for the ODE we got by calculus of variations.
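Rather than solving the implicit scheme directly, one can also minimize the discrete action numerically; a minimal sketch (our illustration, with placeholder psi and phi) using scipy:

    import numpy as np
    from scipy.optimize import minimize

    def discrete_action(x, h, psi, phi):
        # Discrete action  sum_j h*psi(t_j, x_j, p_j) + phi(x_0)  for a 1-d path x.
        p = np.diff(x) / h                      # p_j = (x_{j+1} - x_j)/h
        t = h * np.arange(len(p))
        return h * np.sum(psi(t, x[:-1], p)) + phi(x[0])

    # Toy instance (assumption): psi = (p - mu(x))**2/2 with mu(x) = -x,
    # and phi(x0) = (x0 - 1)**2/2
    psi = lambda t, x, p: 0.5 * (p + x) ** 2
    phi = lambda x0: 0.5 * (x0 - 1.0) ** 2
    h, N = 0.01, 100
    res = minimize(discrete_action, np.zeros(N + 1), args=(h, psi, phi))
    x_star = res.x                              # approximate least-action path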


SLIDE 32

The second-order story.

Classical ML theory gives an asymptotic approximate Gaussian law for the MLE; what's the analogue here? Minimizing the action functional

    -\Lambda(x|y) = \int_0^T \psi(s, x_s, p_s)\, ds + \varphi(x_0)

gives the optimal x*; expanding in ξ = x − x* to second order gives

    Q(\xi) = \int_0^T \bigl\{ \tfrac{1}{2}\,\xi \cdot (D_{xx}\psi)\,\xi + \xi \cdot (D_{xp}\psi)\,\dot\xi + \tfrac{1}{2}\,\dot\xi \cdot (D_{pp}\psi)\,\dot\xi \bigr\}\, dt + \tfrac{1}{2}\,\xi(0) \cdot (D_{xx}\varphi)\,\xi(0)
           \equiv \int_0^T \bigl\{ \tfrac{1}{2}\,\xi \cdot A_t\,\xi + \xi \cdot B_t\,\dot\xi + \tfrac{1}{2}\,\dot\xi \cdot q_t\,\dot\xi \bigr\}\, dt + \tfrac{1}{2}\,\xi(0) \cdot (D_{xx}\varphi)\,\xi(0).

If we have

    \tfrac{1}{2}\,\xi \cdot A_t\,\xi + \xi \cdot B_t\,\dot\xi + \tfrac{1}{2}\,\dot\xi \cdot q_t\,\dot\xi = \tfrac{1}{2}\,(\dot\xi + K_t\xi) \cdot q_t(\dot\xi + K_t\xi)

for some K_t, then set \dot w \equiv q^{1/2}(\dot\xi + K_t\xi) and see

    Q(\xi) = \tfrac{1}{2} \int_0^T |\dot w_t|^2\, dt + \tfrac{1}{2}\,\xi(0) \cdot (D_{xx}\varphi)\,\xi(0).


SLIDE 38

The second-order story.

Thus ξ(0) has a N(0, (D_{xx}ϕ)^{-1}) law, and w is a Brownian motion:

    d\xi_t = q_t^{-1/2}\, dw_t - K_t\,\xi_t\, dt,

exhibiting ξ as a zero-mean Gaussian process, whose variance v_t solves

    \dot v_t = -K_t v_t - v_t K_t^T + q_t^{-1}.

Now suppose that θ_t is a symmetric-matrix-valued function of time, with θ(T) = 0. Then

    Q(\xi) = \int_0^T \bigl\{ \tfrac{1}{2}\,\xi \cdot A_t\,\xi + \xi \cdot B_t\,\dot\xi + \tfrac{1}{2}\,\dot\xi \cdot q_t\,\dot\xi \bigr\}\, dt + \tfrac{1}{2}\,\xi(0) \cdot (D_{xx}\varphi)\,\xi(0)
           = \bigl[\tfrac{1}{2}\,\xi \cdot \theta\xi\bigr]_0^T + \int_0^T \bigl\{ \tfrac{1}{2}\,\xi \cdot A_t\,\xi + \xi \cdot B_t\,\dot\xi + \tfrac{1}{2}\,\dot\xi \cdot q_t\,\dot\xi \bigr\}\, dt + \tfrac{1}{2}\,\xi(0) \cdot (D_{xx}\varphi + \theta(0))\,\xi(0)
           = \int_0^T \bigl\{ \tfrac{1}{2}\,\xi \cdot A_t\,\xi + \xi \cdot B_t\,\dot\xi + \tfrac{1}{2}\,\dot\xi \cdot q_t\,\dot\xi + \dot\xi \cdot \theta\xi + \tfrac{1}{2}\,\xi \cdot \dot\theta\xi \bigr\}\, dt + \tfrac{1}{2}\,\xi(0) \cdot (D_{xx}\varphi + \theta(0))\,\xi(0).

Inside the integral is

    \tfrac{1}{2}\,\dot\xi \cdot q\,\dot\xi + \xi \cdot (B + \theta)\,\dot\xi + \tfrac{1}{2}\,\xi \cdot (A + \dot\theta)\,\xi = \tfrac{1}{2}\,(\dot\xi + K_t\xi) \cdot q_t(\dot\xi + K_t\xi) \;\ldots
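A scalar sketch (our illustration; K_t, q_t and the initial variance are placeholders) simulating the fluctuation SDE and integrating the variance ODE alongside it:

    import numpy as np

    K = lambda t: 1.0          # placeholder K_t
    q = lambda t: 2.0          # placeholder q_t (scalar, so q^{-1/2} = q**-0.5)
    T, N = 1.0, 1000
    h = T / N
    rng = np.random.default_rng(0)

    v0 = 0.5                   # (D_xx phi)^{-1}, the variance of xi(0)
    xi = np.sqrt(v0) * rng.normal()
    v = v0
    for j in range(N):
        t = j * h
        dw = rng.normal(scale=np.sqrt(h))
        xi += q(t) ** -0.5 * dw - K(t) * xi * h    # d xi = q^{-1/2} dw - K xi dt
        v += (-2.0 * K(t) * v + 1.0 / q(t)) * h    # v' = -Kv - vK^T + q^{-1} (scalar)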


SLIDE 45

The second-order story.

... PROVIDED ...

    q K = B^T + \theta, \qquad
    A + \dot\theta = K^T q K = (B + \theta)\, q^{-1} (B^T + \theta).

This gives an ODE for θ with terminal condition θ(T) = 0. REMARKS:

  • This gives a simple probabilistic picture for (X_t)_{0≤t≤T} conditional on (Y_t)_{0≤t≤T}: X_t ≃ x*_t + ξ_t.
  • In particular, if there are no observations, this gives a way to approximate the law of X ...
  • The methodology relies on solving ODEs, which is well explored, and may stand a chance in higher dimensions ...
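A scalar sketch (our illustration; constant placeholder coefficients A, B, q) stepping the θ equation backwards from θ(T) = 0 and recovering K_t = q^{-1}(B^T + θ_t):

    import numpy as np

    A, B, q = 1.0, 0.5, 2.0    # placeholder scalar coefficients (constant in t)
    T, N = 1.0, 1000
    h = T / N

    theta = np.empty(N + 1)
    theta[N] = 0.0                         # terminal condition theta(T) = 0
    for j in range(N, 0, -1):
        # theta' = (B + theta) q^{-1} (B^T + theta) - A, stepped backwards in time
        dtheta = (B + theta[j]) ** 2 / q - A
        theta[j - 1] = theta[j] - h * dtheta
    K = (B + theta) / q                    # K_t = q^{-1}(B^T + theta_t)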


SLIDE 49

Example(s).

Suppose X, y are independent 1-d OU processes, and we see Y = X + y. Then Z = [X; Y] solves

    dZ = \begin{pmatrix} \sigma_X & 0 \\ \sigma_X & \sigma_y \end{pmatrix} \begin{pmatrix} dW \\ dW' \end{pmatrix}
       + \begin{pmatrix} -\beta_X & 0 \\ -\beta_X + \beta_y & -\beta_y \end{pmatrix} \begin{pmatrix} X \\ Y \end{pmatrix} dt
       \equiv \sigma \begin{pmatrix} dW \\ dW' \end{pmatrix} + A \begin{pmatrix} X \\ Y \end{pmatrix} dt.

Here (writing q ≡ (σσ^T)^{-1}, as in the log-likelihood),

    \psi(t, x, p) = \frac{1}{2} \left( \begin{pmatrix} p \\ \dot Y \end{pmatrix} - A \begin{pmatrix} x \\ Y \end{pmatrix} \right)^T q \left( \begin{pmatrix} p \\ \dot Y \end{pmatrix} - A \begin{pmatrix} x \\ Y \end{pmatrix} \right).
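A short Python sketch (parameter values are our placeholders) assembling σ and A for this example and simulating Z by the Euler scheme:

    import numpy as np

    beta_X, beta_y, sigma_X, sigma_y = 1.0, 2.0, 1.0, 0.5   # placeholder parameters
    Sigma = np.array([[sigma_X, 0.0],
                      [sigma_X, sigma_y]])
    A = np.array([[-beta_X,            0.0],
                  [-beta_X + beta_y, -beta_y]])
    q = np.linalg.inv(Sigma @ Sigma.T)       # weight matrix in psi

    T, N = 1.0, 1000
    h = T / N
    rng = np.random.default_rng(0)
    Z = np.zeros((N + 1, 2))                 # Z = [X; Y], started at the origin
    for j in range(N):
        dW = rng.normal(scale=np.sqrt(h), size=2)   # (dW, dW')
        Z[j + 1] = Z[j] + Sigma @ dW + A @ Z[j] * h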


SLIDE 51

How did it do?

[Figure: time series comparing the least-action estimate, the true X, and the observation Y.]


SLIDE 58

Particle filtering.

The volume of a ball of radius 1 in dimension d is

    V_d \equiv (2\pi)^{d/2}\, 2^{-d/2}\, \Gamma(1 + \tfrac{d}{2})^{-1}.

Suppose we have a ball B of radius b hidden somewhere in the unit cube, perhaps the ball where f(Y_t|x) > ε. Fire random uniform points into the unit cube trying to hit B; how many tries do we need to stand a reasonable chance of finding B? Roughly b^{-d} V_d^{-1}.

[Figure: log_10 of the number of particles needed to get within 0.1 of a hidden point, plotted against dimension.]
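A quick sketch (our reconstruction of the quantity plotted in the figure) computing log_10 of the roughly b^{-d} V_d^{-1} draws needed, with b = 0.1:

    import numpy as np
    from scipy.special import gammaln

    def log10_particles(d, b=0.1):
        # log10 of b**(-d) / V_d, with V_d the volume of the unit ball in R^d.
        log_Vd = 0.5 * d * np.log(np.pi) - gammaln(1.0 + 0.5 * d)   # log V_d
        return (-d * np.log(b) - log_Vd) / np.log(10.0)

    for d in (1, 5, 10, 20, 50):
        print(d, round(log10_particles(d), 1))   # grows to ~60+ digits by d = 50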


SLIDE 65

Particle filtering.

We are forced to use importance sampling in higher dimensions. If we have Y_t = X_t + ε_t, we can use a proposal distribution centred at Y_t (see the sketch below); if we have Y_t = Φ(X_t) + ε_t for some very complicated function Φ, we have to search for an X-region in which to place points;

  • in effect, we are doing ML!
  • Need to bundle ML methods with SMC methods routinely ...
  • Avoid using Gaussian observation noise in high-dimensional problems ...
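As a sketch of the first case (ours, not from the talk): one importance-sampling step for Y_t = X_t + ε_t with Gaussian noise, proposing particles from a distribution centred at Y_t and weighting by prior times likelihood over proposal:

    import numpy as np
    from scipy.stats import norm

    def guided_step(Yt, N=1000, prior_std=5.0, obs_std=0.5,
                    rng=np.random.default_rng(0)):
        # One importance-sampling step: propose particles near Yt,
        # weight by prior * likelihood / proposal (all placeholders).
        x = rng.normal(loc=Yt, scale=obs_std, size=N)        # proposal centred at Y_t
        log_w = (norm.logpdf(Yt, loc=x, scale=obs_std)       # likelihood f(Y_t | x)
                 + norm.logpdf(x, loc=0.0, scale=prior_std)  # prior on X_t (placeholder)
                 - norm.logpdf(x, loc=Yt, scale=obs_std))    # divide by proposal density
        w = np.exp(log_w - log_w.max())
        return x, w / w.sum()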


slide-73
SLIDE 73

Conclusions.

  • Minimizing action (= maximizing likelihood) picks out ‘most likely’ path;
  • Second-order expansion gives Gaussian structure of ξ ≡ x − x∗;
  • For simulation/SMC, this can guide you in path generation;
  • For large T,

we need to develop recursive methodology;

  • Offers some hope for high-dimensional problems;
  • More work needed on theory and examples.

Least-Action Filtering – p. 14/1

slide-74
SLIDE 74

Conclusions.

  • Minimizing action (= maximizing likelihood) picks out ‘most likely’ path;
  • Second-order expansion gives Gaussian structure of ξ ≡ x − x∗;
  • For simulation/SMC, this can guide you in path generation;
  • For large T,

we need to develop recursive methodology;

  • Offers some hope for high-dimensional problems;
  • More work needed on theory and examples.

———————————————————————————–

Least-Action Filtering – p. 14/1