

  1. Information Geometric Nonlinear Filtering: a Hilbert Space Approach. Nigel Newton (University of Essex). Information Geometry and its Applications IV, Liblice, June 2016. In honour of Shun-ichi Amari on the occasion of his 80th birthday.

  2. Overview
  • Nonlinear filtering (recursive Bayesian estimation): the need for a proper state space for posterior distributions
  • The infinite-dimensional Hilbert manifold of probability measures, $M$ (and Banach variants)
  • An $M$-valued Itô stochastic differential equation for the nonlinear filter
  • Information geometric properties of the nonlinear filter
  NJN U of E 2016

  3-4. Nonlinear Filtering
  • Markov "signal" process: $(X_t,\ t \in [0,\infty))$, $X_t \in \mathbb{X}$, where $\mathbb{X}$ is a metric space with reference probability measure $\mu$. E.g. $\mathbb{X} = \mathbb{R}^d$, $\mu = N(0, I)$.
  • Partial "observation" process: $(Y_t \in \mathbb{R},\ t \in [0,\infty))$, with
      $Y_t = \int_0^t h(X_s)\,ds + W_t$,
    where $W$ is a Brownian motion independent of $X$.
  • Estimate $X_t$ at each time $t$ from its prior distribution $P_t$ and the history of the observation, $Y_0^t := (Y_s,\ s \in [0,t])$.
  • The linear-Gaussian case yields the Kalman-Bucy filter.
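The signal-observation pair on this slide is easy to simulate. A minimal sketch, assuming a hypothetical one-dimensional instance not specified in the talk: an Ornstein-Uhlenbeck signal whose stationary law plays the role of the reference measure $\mu = N(0,1)$, observed through $h(x) = x$ plus an independent Brownian motion, discretised by Euler-Maruyama.

```python
import numpy as np

# Hypothetical 1-d instance of the slide's model (illustrative, not from the
# talk): X is an Ornstein-Uhlenbeck diffusion, a Markov process whose
# stationary law is mu = N(0, 1), and Y_t = int_0^t h(X_s) ds + W_t with
# h(x) = x and W an independent Brownian motion.
rng = np.random.default_rng(0)
dt, n_steps = 1e-3, 5000

x = rng.standard_normal()          # X_0 ~ N(0, 1)
X, Y = [x], [0.0]
for _ in range(n_steps):
    # dX_t = -X_t dt + sqrt(2) dB_t  (stationary measure N(0, 1))
    x += -x * dt + np.sqrt(2 * dt) * rng.standard_normal()
    X.append(x)
    # dY_t = h(X_t) dt + dW_t, with h(x) = x
    Y.append(Y[-1] + x * dt + np.sqrt(dt) * rng.standard_normal())

X, Y = np.array(X), np.array(Y)
print(X.shape, Y.shape)
```

Any drift/diffusion pair with stationary law $\mu$ would serve equally well; the filtering problem is to recover $X_t$ from the path of $Y$ alone.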

  5-6. Nonlinear Filtering
  • Regular conditional (posterior) distribution: $\Pi_t \in \mathcal{P}(\mathbb{X})$,
      $\Pi_t(B) := P(X_t \in B \mid Y_0^t)$.
  • $\Pi_t$ is a random probability measure evolving on $\mathcal{P}(\mathbb{X})$. How should we represent it?
  • We could consider the conditional density (w.r.t. $\mu$), $p_t$; its typical stochastic differential equation (Shiryaev, Wonham, Stratonovich, Kushner) is
      $dp_t = ``\mathcal{A}p_t\textrm''\,dt + p_t(h - \hat{h}_t)(dY_t - \hat{h}_t\,dt)$, where $\hat{h}_t := \int h(x)\,p_t(x)\,\mu(dx)$.
  • Spaces of densities are not necessarily optimal.
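The density recursion behind this equation can be sketched numerically. The following toy example (mine, not from the talk) uses a two-state signal so the generator term reduces to a transition matrix: each step applies prediction (the Markov kernel) and then a Bayes correction by the Girsanov likelihood $\exp(h\,\Delta Y - \tfrac{1}{2}h^2\,\Delta t)$ of the observation increment, which is a standard discrete-time analogue of the continuous-time filter.

```python
import numpy as np

# Minimal sketch (not from the talk) of the filtering density recursion for a
# finite-state signal: alternate a prediction step (transition kernel) with a
# Bayes correction using the likelihood exp(h dY - h^2 dt / 2).
rng = np.random.default_rng(1)
states = np.array([-1.0, 1.0])          # signal switches between two levels
Q = np.array([[0.9, 0.1], [0.1, 0.9]])  # one-step transition matrix
h = states                               # observation function h(x) = x
dt, n_steps = 0.01, 2000

x_idx = 0
p = np.array([0.5, 0.5])                 # filter density w.r.t. counting measure
for _ in range(n_steps):
    x_idx = rng.choice(2, p=Q[x_idx])               # signal step
    dY = h[x_idx] * dt + np.sqrt(dt) * rng.standard_normal()
    p = p @ Q                                        # prediction
    p = p * np.exp(h * dY - 0.5 * h**2 * dt)         # Bayes correction
    p /= p.sum()                                     # normalise

print(p, x_idx)
```

The repeated multiply-and-normalise structure is exactly what makes the choice of topology on $\mathcal{P}(\mathbb{X})$ matter in the slides that follow.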

  7-10. Mean-Square Errors
  • Suppose $E f(X_t)^2 < \infty$ for some $f: \mathbb{X} \to \mathbb{R}$.
  • Then $\Pi_t f := E(f(X_t) \mid Y_0^t)$ minimises the mean-square error: for any $Y_0^t$-measurable approximation $\hat{\Pi}_t f$,
      $E(f(X_t) - \hat{\Pi}_t f)^2 = \underbrace{E(f(X_t) - \Pi_t f)^2}_{\text{estimation error}} + \underbrace{E(\Pi_t f - \hat{\Pi}_t f)^2}_{\text{approximation error}}$.
  • If $\hat{\Pi}_t \ll \mu$ for some $\hat{\Pi}_t \in \mathcal{P}(\mathbb{X})$, with $\hat{p}_t := d\hat{\Pi}_t/d\mu$, then
      $E(\Pi_t f - \hat{\Pi}_t f)^2 \le \|f\|_{L^2(\mu)}^2\, E\|p_t - \hat{p}_t\|_{L^2(\mu)}^2$,
    and so the $L^2(\mu)$ norm on densities may be useful.
  • Not if $f = 1_B$ and $\Pi_t(B)$ is very small (e.g. fault detection).
  • When topologised in this way, $\mathcal{P}(\mathbb{X})$ has a boundary.
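The orthogonal decomposition of the mean-square error on this slide can be checked exactly on a small discrete model, where the conditional expectation is computable in closed form. A toy verification (mine, not from the talk), with $f(X) = X$ and an arbitrary $Y$-measurable approximation $g$:

```python
import numpy as np

# Exact check (toy discrete model, not from the talk) of the decomposition
#   E(f(X) - g)^2 = E(f(X) - E[f(X)|Y])^2 + E(E[f(X)|Y] - g)^2
# for any Y-measurable approximation g.
p = np.array([[0.1, 0.3],    # joint law of (X, Y): X indexes rows, Y columns
              [0.4, 0.2]])
f = np.array([0.0, 1.0])     # f(X) = X
g = np.array([0.25, 0.9])    # an arbitrary Y-measurable approximation g(Y)

p_y = p.sum(axis=0)          # marginal law of Y
cond = (f @ p) / p_y         # E[f(X) | Y = y], the optimal estimator

total = sum(p[i, j] * (f[i] - g[j])**2 for i in range(2) for j in range(2))
estim = sum(p[i, j] * (f[i] - cond[j])**2 for i in range(2) for j in range(2))
approx = sum(p_y[j] * (cond[j] - g[j])**2 for j in range(2))

print(total, estim + approx)   # the two sides agree
```

The identity holds because $E[f(X) \mid Y]$ is the $L^2$ projection of $f(X)$ onto $Y$-measurable random variables, so the estimation and approximation errors are orthogonal.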

  11-12. Multi-Objective Mean-Square Errors
  • Maximising the ratio of approximation error to estimation error over square-integrable functions:
      $\mathcal{M}(\hat{\Pi}_t \mid \Pi_t) := \sup_{f \in L^2(\Pi_t)} \dfrac{E(\Pi_t f - \hat{\Pi}_t f)^2 \ \text{(approximation error)}}{E(f(X_t) - \Pi_t f)^2 \ \text{(estimation error)}} = \sup_{f \in F_t} E\big(\Pi_t f\,(d\hat{\Pi}_t/d\Pi_t - 1)\big)^2 = E\,\Pi_t\big(d\hat{\Pi}_t/d\Pi_t - 1\big)^2$,
    where $F_t := \{ f \in L^2(\Pi_t) : \Pi_t f = 0,\ \Pi_t f^2 = 1 \}$.
  • In time-recursive approximations, the accuracy of $\hat{\Pi}_t$ is affected by that of $\hat{\Pi}_s$ ($s < t$). This naturally induces multi-objective criteria at time $s$ (nonlinear dynamics).
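On a finite state space this criterion is just Pearson's chi-squared divergence, $\mathcal{M}(Q \mid P) = \sum_x P(x)\,(Q(x)/P(x) - 1)^2$, and its "geometric sensitivity" is easy to exhibit: the same absolute error costs far more on a small probability. A toy illustration (mine, not from the talk):

```python
import numpy as np

# Illustration (not from the talk) of M as Pearson's chi-squared divergence on
# a finite state space. The same absolute error of 0.01 is penalised far more
# heavily when it falls on a small atom of P than on the large ones.
def chi2(q, p):
    return float(np.sum(p * (q / p - 1.0)**2))

p = np.array([0.01, 0.49, 0.50])

q1 = np.array([0.02, 0.48, 0.50])   # error of 0.01 on the small atom
q2 = np.array([0.01, 0.48, 0.51])   # same-size error on the large atoms

print(chi2(q1, p), chi2(q2, p))     # q1 is judged much worse
```

This is the multi-objective character of $\mathcal{M}$: dividing by the estimation error weights each event by the reciprocal of its probability.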

  13-16. Geometric Sensitivity
  • $\mathcal{M}$ is "geometrically sensitive": it requires small probabilities to be approximated with greater absolute accuracy than large probabilities.
  • When topologised by $\mathcal{M}$, $\mathcal{P}(\mathbb{X})$ does not have a boundary.
  • This is highly desirable in the context of recursive Bayesian estimation, where conditional probabilities are repeatedly multiplied by the likelihood functions of new observations.
  • $\mathcal{M}$ is Pearson's $\chi^2$ divergence. It belongs to the one-parameter family of $\alpha$-divergences: $\mathcal{M} = \mathcal{D}_3$.
  • It is too restrictive to use in practice.
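The $\alpha$-divergence family can be evaluated directly on discrete measures. A sketch (conventions vary across the literature; this uses one common normalisation, under which $\mathcal{D}_3$ comes out proportional to Pearson's $\chi^2$, with KL recovered in the limit $\alpha \to \pm 1$ and the squared Hellinger distance at $\alpha = 0$):

```python
import numpy as np

# Sketch (not from the talk; normalisation conventions vary) of the
# alpha-divergence family on a finite state space:
#   D_alpha(P | Q) = 4/(1 - alpha^2) * (1 - sum_x P(x)^{(1-a)/2} Q(x)^{(1+a)/2})
def d_alpha(p, q, a):
    return 4.0 / (1.0 - a * a) * (1.0 - np.sum(p**((1 - a) / 2) * q**((1 + a) / 2)))

p = np.array([0.2, 0.3, 0.5])
q = np.array([0.4, 0.4, 0.2])

hellinger2 = np.sum((np.sqrt(p) - np.sqrt(q))**2)   # squared Hellinger distance
chi2_pq = float(np.sum(p * (q / p - 1.0)**2))       # Pearson chi^2 of Q w.r.t. P

print(d_alpha(p, q, 0.0), 2 * hellinger2)           # alpha = 0: Hellinger
print(d_alpha(p, q, 0.999), np.sum(q * np.log(q / p)))  # alpha -> 1: KL(Q | P)
print(d_alpha(p, q, 3.0), chi2_pq / 2)              # alpha = 3: chi^2 / 2 here
```

In this normalisation $\mathcal{D}_3$ equals one half of the $\chi^2$ divergence; the slide's identification $\mathcal{M} = \mathcal{D}_3$ presumably uses a convention absorbing that constant.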

  17-19. α-Divergences
  • As $|\alpha|$ becomes larger, $\mathcal{D}_\alpha$ becomes increasingly "geometrically sensitive".
  • The case $\alpha = 0$ yields the Hellinger metric.
  • The case $\alpha = \pm 1$ yields the KL divergence:
      $\mathcal{D}(P \mid Q) := \mathcal{D}_{-1}(P \mid Q) = E_Q\,\dfrac{dP}{dQ}\log\dfrac{dP}{dQ}$.
    This is widely used in practice.
  • Symmetric error criteria may be appropriate, such as $\mathcal{D}(\hat{\Pi}_t \mid \Pi_t) + \mathcal{D}(\Pi_t \mid \hat{\Pi}_t)$.
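The KL divergence as displayed here, $E_Q\,(dP/dQ)\log(dP/dQ)$, coincides with the more familiar $E_P \log(dP/dQ)$ by a change of measure. A toy discrete check of that identity, together with the symmetrised criterion suggested on the slide (example mine, not from the talk):

```python
import numpy as np

# Numerical check (toy discrete example, not from the talk) that the slide's
# form of the KL divergence, E_Q[(dP/dQ) log(dP/dQ)], agrees with the usual
# E_P[log(dP/dQ)], plus the symmetrised criterion D(P|Q) + D(Q|P).
p = np.array([0.3, 0.7])
q = np.array([0.5, 0.5])

dpdq = p / q
kl_slide = float(np.sum(q * dpdq * np.log(dpdq)))   # E_Q (dP/dQ) log(dP/dQ)
kl_usual = float(np.sum(p * np.log(p / q)))          # E_P log(dP/dQ)
sym = kl_slide + float(np.sum(q * np.log(q / p)))    # D(P|Q) + D(Q|P)

print(kl_slide, kl_usual, sym)
```

Symmetrising removes the directional bias of KL while retaining its geometric sensitivity, which is why criteria of this form are natural for assessing filter approximations.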
