Information Geometric Nonlinear Filtering: a Hilbert Space Approach


SLIDE 1

Information Geometric Nonlinear Filtering: a Hilbert Space Approach

Nigel Newton (University of Essex)

Information Geometry and its Applications IV, Liblice, June 2016
In honour of Shun-ichi Amari on the occasion of his 80th birthday
SLIDE 2

Overview

  • Nonlinear filtering (recursive Bayesian estimation)
    – The need for a proper state space for posterior distributions
  • The infinite-dimensional Hilbert manifold of probability measures, M (and Banach variants)
  • An M-valued Itô stochastic differential equation for the nonlinear filter
  • Information geometric properties of the nonlinear filter

NJN U of E 2016

SLIDES 3–4

Nonlinear Filtering

  • Markov "signal" process: $X = (X_t,\ t \in [0,\infty))$
    – $\mathbb{X}$ is a metric space, with reference probability measure $\mu$
    – E.g. $\mathbb{X} = \mathbb{R}^d$, $\mu = N(0, I)$
  • Partial "observation" process: $Y = (Y_t,\ t \in [0,\infty))$, $Y_t \in \mathbb{R}$,

      $Y_t = \int_0^t h(X_s)\,ds + W_t$,

    where $W$ is a Brownian motion, independent of $X$
  • Estimate $X_t$ at each time $t$ from its prior distribution $P_t$ and the history of the observation: $\mathcal{Y}_t := \sigma(Y_s,\ s \in [0,t])$
  • The linear-Gaussian case yields the Kalman–Bucy filter
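The signal–observation pair above is easy to simulate by Euler discretisation. A minimal sketch, assuming (purely for illustration, not from the slides) an Ornstein–Uhlenbeck signal on $\mathbb{X} = \mathbb{R}$ and $h(x) = x$:

```python
import numpy as np

# Euler discretisation of Y_t = int_0^t h(X_s) ds + W_t.
# Illustrative assumptions: dX_t = -X_t dt + dB_t (Ornstein-Uhlenbeck)
# and h(x) = x; the slides only require X Markov and W an independent
# Brownian motion.
rng = np.random.default_rng(0)
T, n = 1.0, 1000
dt = T / n

X = np.zeros(n + 1)  # signal path
Y = np.zeros(n + 1)  # observation path
for k in range(n):
    X[k + 1] = X[k] - X[k] * dt + np.sqrt(dt) * rng.standard_normal()
    Y[k + 1] = Y[k] + X[k] * dt + np.sqrt(dt) * rng.standard_normal()
```

The same skeleton works for any signal generator and any square-integrable $h$.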

SLIDES 5–6

Nonlinear Filtering

  • Regular conditional (posterior) distribution: $\Pi_t(B) = \mathbf{P}(X_t \in B \mid \mathcal{Y}_t)$
  • $\Pi_t : \Omega \to \mathcal{P}(\mathbb{X})$ is a random probability measure evolving on $\mathcal{P}(\mathbb{X})$. How should we represent it?
  • We could consider the conditional density (w.r.t. $\mu$), $p_t$
    – typical differential equation (Shiryayev, Wonham, Stratonovich, Kushner):

      $dp_t = \mathcal{A}p_t\,dt + p_t(h - \bar{h}_t)(dY_t - \bar{h}_t\,dt)$, where $\bar{h}_t := \int h(x)\,\Pi_t(dx)$

  • Spaces of densities are not necessarily optimal
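In discrete time the corresponding density recursion is just Bayes' rule: a Markov prediction step followed by multiplication by the likelihood of the new observation increment. A grid-based sketch (the grid, transition kernel, $h$ and the increment values are illustrative assumptions):

```python
import numpy as np

# One prediction/correction cycle for the conditional density p_t on a
# grid: a discrete-time analogue of the filtering equation above.
x = np.linspace(-3.0, 3.0, 61)   # state grid for X
dx = x[1] - x[0]
h = x                            # observation function h(x) = x (assumed)

# Transition kernel K[i, j] ~ density of X_{t+dt} near x_i given X_t = x_j
# (an assumed AR(1)-style kernel); columns integrate to 1.
K = np.exp(-0.5 * (x[:, None] - 0.9 * x[None, :]) ** 2 / 0.19)
K /= K.sum(axis=0, keepdims=True) * dx

p = np.exp(-0.5 * x**2)
p /= p.sum() * dx                # initial density w.r.t. dx

dt, dY = 0.01, 0.03              # one observation increment (assumed)

p_pred = (K @ p) * dx                       # prediction: apply the Markov kernel
lik = np.exp(h * dY - 0.5 * h**2 * dt)      # likelihood of the increment dY
p_post = p_pred * lik
p_post /= p_post.sum() * dx                 # Bayes normalisation
```

Iterating this cycle over successive increments gives a grid approximation of $p_t$.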

SLIDES 7–10

Mean-Square Errors

  • Suppose $\mathbf{E}f(X_t)^2 < \infty$ for some $f : \mathbb{X} \to \mathbb{R}$
  • Then $\bar{f}_t := \mathbf{E}[f(X_t) \mid \mathcal{Y}_t]$ minimises the mean-square error:

      $\mathbf{E}\big(f(X_t) - \hat{f}_t\big)^2 = \underbrace{\mathbf{E}\big(f(X_t) - \bar{f}_t\big)^2}_{\text{estimation error}} + \underbrace{\mathbf{E}\big(\bar{f}_t - \hat{f}_t\big)^2}_{\text{approximation error}}$

  • If $\hat{\Pi}_t : \Omega \to \mathcal{P}(\mathbb{X})$ for some approximation, and $\Pi_t, \hat{\Pi}_t \ll \mu$, then

      $\mathbf{E}\big(\bar{f}_t - \hat{f}_t\big)^2 \le \mathbf{E}_\mu f^2\ \mathbf{E}\lVert p_t - \hat{p}_t \rVert^2_{L^2(\mu)}$

    and so the $L^2(\mu)$ norm on densities may be useful
  • Not if $f = 1_B$ and $\Pi_t(B)$ is very small (e.g. fault detection)
  • When topologised in this way, $\mathcal{P}(\mathbb{X})$ has a boundary
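The orthogonal decomposition into estimation and approximation error can be checked by Monte Carlo. A sketch with an assumed Gaussian toy model, $X \sim N(0,1)$, $Y = X + $ noise, $f(x) = x$, for which the conditional mean is $\bar{f} = \mathbf{E}[X \mid Y] = Y/2$:

```python
import numpy as np

# Monte Carlo check of
#   E(f(X) - fhat)^2 = E(f(X) - fbar)^2 + E(fbar - fhat)^2,
# which holds because fbar is the conditional mean and fhat is any
# other Y-measurable estimate (the cross term has zero mean).
rng = np.random.default_rng(1)
n = 200_000
X = rng.standard_normal(n)
Y = X + rng.standard_normal(n)

fbar = 0.5 * Y   # exact conditional mean E[X | Y] for this toy model
fhat = 0.4 * Y   # an arbitrary Y-measurable approximation of fbar

lhs = np.mean((X - fhat) ** 2)
rhs = np.mean((X - fbar) ** 2) + np.mean((fbar - fhat) ** 2)
```

Up to Monte Carlo error, `lhs` and `rhs` agree.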

SLIDES 11–12

Multi-Objective Mean-Square Errors

  • Maximising the $L^2$ error over square-integrable functions:

      $\mathcal{E}_M(\hat{\Pi}_t \mid \Pi_t) := \sup_{f \in F_t} \mathbf{E}\big(f(X_t) - \hat{f}_t\big)^2 = \text{estimation error} + \mathbf{E}\,\Pi_t\big(1 - d\hat{\Pi}_t/d\Pi_t\big)^2$

    where $F_t := \{ f \in L^2(\Pi_t) : \mathbf{E}_{\Pi_t} f^2 \le 1 \}$
  • In time-recursive approximations, the accuracy of $\hat{\Pi}_t$ is affected by that of $\hat{\Pi}_s$ ($s < t$). This naturally induces multi-objective criteria at time $s$ (nonlinear dynamics).

SLIDES 13–16

Geometric Sensitivity

  • $\mathcal{E}_M$ is "geometrically sensitive": it requires small probabilities to be approximated with greater absolute accuracy than large probabilities.
  • When topologised by $\mathcal{E}_M$, $\mathcal{P}(\mathbb{X})$ does not have a boundary.
  • This is highly desirable in the context of recursive Bayesian estimation, where conditional probabilities are repeatedly multiplied by the likelihood functions of new observations.
  • $\mathcal{E}_M$ is Pearson's $\chi^2$ divergence. It belongs to the one-parameter family of $\alpha$-divergences: $\mathcal{E}_M = D_3$.
  • It is too restrictive to use in practice.

SLIDES 17–19

α-Divergences

  • As $|\alpha|$ becomes larger, $D_\alpha$ becomes increasingly "geometrically sensitive"
  • The case $\alpha = 0$ yields the Hellinger metric
  • The case $\alpha = \pm 1$ yields the KL divergence:

      $\mathcal{D}(P \mid Q) := \mathcal{D}_1(P \mid Q) = \mathbf{E}_Q\Big[\frac{dP}{dQ}\log\frac{dP}{dQ}\Big]$

  • This is widely used in practice.
  • Symmetric error criteria may be appropriate, such as $\mathcal{D}(\hat{\Pi}_t \mid \Pi_t) + \mathcal{D}(\Pi_t \mid \hat{\Pi}_t)$
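For discrete distributions these divergences are one-liners. A sketch (the two distributions are arbitrary example values):

```python
import numpy as np

# The alpha = 0, alpha = +-1 and chi-squared cases from the slides,
# for two discrete distributions P and Q on three points.
p = np.array([0.5, 0.3, 0.2])
q = np.array([0.4, 0.4, 0.2])

kl_pq = np.sum(p * np.log(p / q))                    # D(P|Q) = E_Q[(dP/dQ) log(dP/dQ)]
kl_qp = np.sum(q * np.log(q / p))                    # D(Q|P)
hellinger_sq = np.sum((np.sqrt(p) - np.sqrt(q))**2)  # squared Hellinger distance (alpha = 0)
chi2 = np.sum((p - q)**2 / q)                        # Pearson chi-squared of P from Q

sym = kl_pq + kl_qp   # the symmetric criterion suggested on the slide
```

All four quantities vanish exactly when `p == q` and are strictly positive otherwise.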

SLIDES 20–21

Connections with Information Theory

  • Conditional mutual information (un-averaged):

      $I(X; Y \mid Z) := \mathcal{D}\big(P_{XY|Z} \mid P_{X|Z} \otimes P_{Y|Z}\big)$

  • Additivity property:

      $I(X; (Y, Z)) = I(X; Z) + \mathbf{E}\,I(X; Y \mid Z)$

  • Information supply to the nonlinear filter: $S(t) := I(X; \mathcal{Y}_t)$, with

      $S(t) - S(s) = \mathbf{E}\,I(X; Y^t_s \mid \mathcal{Y}_s)$

  • The filter continuously fuses new observation information
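The additivity property can be verified exactly on a small discrete joint law. A sketch with an arbitrary positive distribution on $\{0,1\}^3$:

```python
import numpy as np

# Check I(X; (Y,Z)) = I(X; Z) + E[I(X; Y | Z)] on a discrete example.
rng = np.random.default_rng(2)
P = rng.random((2, 2, 2))   # joint law P[x, y, z] (arbitrary, positive)
P /= P.sum()

def mutual_info(J):
    """Mutual information of a 2-D joint distribution J."""
    px = J.sum(axis=1, keepdims=True)
    py = J.sum(axis=0, keepdims=True)
    return float(np.sum(J * np.log(J / (px * py))))

I_x_yz = mutual_info(P.reshape(2, 4))   # I(X; (Y,Z)), treating (Y,Z) as one variable
I_x_z = mutual_info(P.sum(axis=1))      # I(X; Z), marginalising out Y

pz = P.sum(axis=(0, 1))                 # marginal law of Z
E_I_xy_given_z = sum(pz[z] * mutual_info(P[:, :, z] / pz[z]) for z in range(2))
```

The identity holds to floating-point precision for any positive joint law.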

SLIDES 22–23

Appropriate Metrics on $\mathcal{P}(\mathbb{X})$

  • The KL divergence is bilinear in the density and its log (regarded as elements of dual spaces of functions).
  • For $P, Q \in \mathcal{P}(\mathbb{X})$ with $P, Q \ll \mu$,

      $\mathcal{D}(P \mid Q) = \langle p, \log p \rangle - \langle p, \log q \rangle$,

    where $p$ and $q$ are the densities
  • So we would like the metric to "control" both $p$ and $\log p$

SLIDES 24–26

Maximal Exponential Model (G. Pistone et al.)

  • Model space (exponential Orlicz):

      $B_\mu := \{ a : \mathbb{X} \to \mathbb{R} \mid \mathbf{E}_\mu a = 0,\ \mathbf{E}_\mu \cosh(\epsilon a) < \infty \text{ for some } \epsilon > 0 \}$

      $E(\mu) := \{ P \in \mathcal{P}(\mathbb{X}) \mid p := dP/d\mu = \exp(a - K(a)),\ a \in S_\mu \subset B_\mu \}$

  • Global chart: $s_\mu : E(\mu) \to B_\mu$,

      $s_\mu(P) := \log(p) - \mathbf{E}_\mu \log(p)$

  • Mixture map: $\eta_\mu : E(\mu) \to {}^*\!B_\mu$,

      $\eta_\mu(P) := p - 1$

    Injective and of class $C^\infty$, but not homeomorphic

SLIDES 27–29

The Hilbert Manifold M

  • M is the subset of $\mathcal{P}(\mathbb{X})$ whose members have the following properties:

      $P \sim \mu$, and $\mathbf{E}_\mu p^2 < \infty$ and $\mathbf{E}_\mu \log^2 p < \infty$

  • Model space:

      $H := \{ a : \mathbb{X} \to \mathbb{R} \mid \mathbf{E}_\mu a = 0,\ \mathbf{E}_\mu a^2 < \infty \} \subset L^2(\mu)$

  • Global chart: $\phi : M \to H$,

      $\phi(P) := p - 1 + \log p - \mathbf{E}_\mu \log p$

  • Proposition 1: $\phi$ is a bijection onto $H$
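On a finite state space the chart is easy to compute, and the property $\mathbf{E}_\mu \phi(P) = 0$ (so that $\phi(P)$ lands in $H$) can be checked directly. A sketch with $\mu$ uniform (an illustrative choice):

```python
import numpy as np

# The balanced chart phi(P) = p - 1 + log p - E_mu log p on a 5-point
# state space, with mu uniform and p = dP/dmu.
n = 5
mu = np.full(n, 1.0 / n)                       # reference measure
P = np.array([0.10, 0.30, 0.20, 0.25, 0.15])   # a probability vector
p = P / mu                                     # density of P w.r.t. mu

phi = p - 1 + np.log(p) - np.sum(mu * np.log(p))
```

$\mathbf{E}_\mu \phi = \mathbf{E}_\mu p - 1 = 0$, since both the $p - 1$ and centred-log parts have $\mu$-mean zero.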

SLIDES 30–31

M as a Generalised Exponential Family

  • The exponential function is replaced by the inverse $\psi$ of the function $(0, \infty) \ni y \mapsto y - 1 + \log y \in \mathbb{R}$:

      $p(x) = \psi\big(a(x) + Z(a)\big)$, where $a = \phi(P)$

  • $\psi$ is convex, with linear growth and bounded derivatives of all orders.

[Figure: graph of $\psi(z)$ against $z$]
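Since $g(y) = y - 1 + \log y$ is strictly increasing on $(0,\infty)$, its inverse $\psi$ can be evaluated numerically. A sketch using Newton iteration in $w = \log y$ (the iteration scheme is an illustrative choice, not from the slides):

```python
import numpy as np

def psi(z, iters=60):
    """Inverse of g(y) = y - 1 + log y, via Newton iteration in w = log y."""
    z = np.asarray(z, dtype=float)
    w = np.zeros_like(z)                # start at y = exp(w) = 1, where g = 0
    for _ in range(iters):
        f = np.exp(w) - 1 + w - z       # g(exp(w)) - z, convex and increasing in w
        w = w - f / (np.exp(w) + 1)     # Newton step: d/dw g(exp(w)) = exp(w) + 1
    return np.exp(w)
```

Working in $w$ makes the objective convex, so the iteration converges globally; one can check numerically that $\psi(0) = 1$, and that $\psi(z) \approx z$ for large $z$, matching the linear growth noted on the slide.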

SLIDES 32–33

Mixture and Exponential Maps

  • The maps $m, e : M \to H$, defined by

      $m(P) := p - 1$ and $e(P) := \log p - \mathbf{E}_\mu \log p$,

    are injective, but not homeomorphic (like $\eta_\mu$ of $E(\mu)$)
  • They satisfy:

      $\lVert m(P) - m(Q) \rVert_H^2 + \lVert e(P) - e(Q) \rVert_H^2 \le \lVert \phi(P) - \phi(Q) \rVert_H^2$

  • So that

      $\mathcal{D}(P \mid Q) + \mathcal{D}(Q \mid P) = \langle m(P) - m(Q),\ e(P) - e(Q) \rangle_H$

    and

      $\mathcal{D}(P \mid Q) + \mathcal{D}(Q \mid P) \le \tfrac{1}{2} \lVert \phi(P) - \phi(Q) \rVert_H^2$
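Both relations can be checked exactly on a finite state space. A sketch with $\mu$ uniform and $\langle a, b \rangle_H = \mathbf{E}_\mu[ab]$ (illustrative choices):

```python
import numpy as np

# Finite-state check of
#   D(P|Q) + D(Q|P) = <m(P)-m(Q), e(P)-e(Q)>_H
#   D(P|Q) + D(Q|P) <= (1/2) ||phi(P)-phi(Q)||_H^2
n = 4
mu = np.full(n, 1.0 / n)
p = np.array([0.40, 0.30, 0.20, 0.10]) / mu   # densities w.r.t. mu
q = np.array([0.25, 0.25, 0.25, 0.25]) / mu

inner = lambda a, b: float(np.sum(mu * a * b))       # <.,.>_H = E_mu[a b]
m_map = lambda d: d - 1
e_map = lambda d: np.log(d) - np.sum(mu * np.log(d))
phi = lambda d: m_map(d) + e_map(d)

sym_kl = float(np.sum(mu * (p - q) * np.log(p / q)))  # D(P|Q) + D(Q|P)
pairing = inner(m_map(p) - m_map(q), e_map(p) - e_map(q))
bound = 0.5 * inner(phi(p) - phi(q), phi(p) - phi(q))
```

The centring constant in `e_map` drops out of the pairing because $p - q$ has $\mu$-mean zero, which is why the first identity is exact.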

SLIDES 34–37

The Tangent Bundle

  • Global chart: $\Phi : TM \to H \times H$,

      $\Phi(P, U) := (\phi(P),\ U\phi_P)$

  • m and e representations:

      $\Phi_m(P, U) := (\phi(P),\ Um_P) \in H \times H, \quad \Phi_e(P, U) := (\phi(P),\ Ue_P) \in H \times H$

    Injective but not homeomorphic
  • The Fisher metric (Eguchi):

      $\langle U, V \rangle_P := \langle Um_P,\ Ve_P \rangle_H$, for $U, V \in T_P M$

  • $(T_P M, \langle\,\cdot\,,\,\cdot\,\rangle_P)$ is an inner product space, with $U\phi_P = Um_P + Ue_P$

SLIDES 38–39

e and m Parallel Transport

  • These are obtained by considering the inclusions:

      $\Phi_m(TM) \subset H \times H$ and $\Phi_e(TM) \subset H \times H$,

    together with the parallel transport on $H \times H$ defined by:

      $\tau_{a,b}(a, u) := (b, u)$

  • Like the m parallel transport on the maximal exponential model, they coincide with parallel transport on the tangent bundle only in special cases.
  • α-parallel transports can be defined in the same way on statistical Hilbert bundles.

SLIDES 40–42

Submanifolds

Like the maximal exponential model, M admits many useful submanifolds. For example…

  • Proposition 2: If $N \subset M$ is a finite-dimensional exponential family, then it is a $C^\infty$-embedded submanifold of M, on which $m$, $e$ and $\mathcal{D}$ are of class $C^\infty$
  • Example: the non-singular Gaussian measures on $\mathbb{R}^m$ form a $C^\infty$-embedded submanifold of $M(\mathbb{R}^m, \mu)$, where

      $\mu(dx) := 2^{-m} \exp(-|x|)\,dx$

  • Similar results hold for mixture models and α-models
  • Subspaces of H also provide natural submanifolds of M

SLIDE 43

Banach Variants

  • The α-divergences are twice differentiable on M.
  • Greater regularity can be obtained by the use of stronger topologies on the model space: $L^\lambda(\mu)$, for $\lambda \ge 2$
  • This enables the definition of α-covariant derivatives on the statistical bundles mentioned above.
  • Details in: N.J. Newton, Infinite-dimensional statistical manifolds based on a balanced chart, Bernoulli 22, 711–731 (2016)

SLIDE 44

Nonlinear Filtering

  • Markov "signal" process: $X = (X_t,\ t \in [0,\infty))$
    – $\mathbb{X}$ is a metric space, with reference probability measure $\mu$
    – E.g. $\mathbb{X} = \mathbb{R}^d$, $\mu = N(0, I)$
  • Partial "observation" process: $Y_t = \int_0^t h(X_s)\,ds + W_t$, with $W$ a Brownian motion independent of $X$
  • Estimate $X_t$ at each time $t$ from its prior distribution $P_t$ and the history of the observation: $\mathcal{Y}_t := \sigma(Y_s,\ s \in [0,t])$
  • Typical equation for the density:

      $dp_t = \mathcal{A}p_t\,dt + p_t(h - \bar{h}_t)\,d\bar{W}_t$, where $d\bar{W}_t := dY_t - \bar{h}_t\,dt$

SLIDES 45–46

M-Valued Nonlinear Filters

Proposition 3: Under some technical conditions:

  1. $\mathbf{P}\big(\Pi_t \in M \text{ for all } t \ge 0\big) = 1$
  2. The coordinate representation $\phi(\Pi)$ satisfies the following (infinite-dimensional) Itô equation:

      $d\phi(\Pi_t) = u_t\,dt + v_t\,d\bar{W}_t$

    where the coefficients $u_t$ and $v_t$ are explicit functions of $\mathcal{A}p_t/p_t$ and $h - \bar{h}_t$; in particular $v_t := (1 + \bar{\pi}_t)(h - \bar{h}_t)$, with $\bar{\pi}_t f := \Pi_t f - \mathbf{E}_\mu f$ for $f \in L^2(\mathbb{X}, \mu)$.

SLIDES 47–49

Components

  • Since H is a separable Hilbert space, it admits a complete orthonormal basis $(h_i,\ i = 1, 2, 3, \ldots)$
  • So the filter equations can be written in terms of the components:

      $\phi^i(\Pi_t) := \langle \phi(\Pi_t),\ h_i \rangle_H$, for $i = 1, 2, 3, \ldots$

  • The Fisher metric can be expressed in terms of the $(h_i)$:

      $\langle U, V \rangle_P = \sum_{i,j} G_{i,j}(P)\, u^i v^j$

    where $U = \sum_i u^i D_i$, $D_i := \Phi^{-1}(\phi(P), h_i)$, and $G_{i,j}(P) := \langle D_i, D_j \rangle_P$
  • The basis can be chosen to suit the problem (wavelets)
  • Truncated series could be used in approximations
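On a finite state space H is finite-dimensional, so the component construction can be carried out exactly. A sketch using a Gram–Schmidt basis of $\mu$-mean-zero functions (the polynomial starting set is an arbitrary choice):

```python
import numpy as np

# Components phi^i = <phi, h_i>_H of the chart in an orthonormal basis
# of H, on a 5-point state space with mu uniform (illustrative).
n = 5
mu = np.full(n, 1.0 / n)
x = np.arange(n, dtype=float)
inner = lambda a, b: float(np.sum(mu * a * b))   # <.,.>_H = E_mu[a b]

# Gram-Schmidt on centred monomials x, x^2, ... (constants are excluded:
# H contains only mean-zero functions, so dim H = n - 1 here).
basis = []
for k in range(1, n):
    v = x**k - np.sum(mu * x**k)
    for h in basis:
        v = v - inner(v, h) * h
    basis.append(v / np.sqrt(inner(v, v)))

P = np.array([0.10, 0.15, 0.20, 0.25, 0.30])
p = P / mu
phi = p - 1 + np.log(p) - np.sum(mu * np.log(p))

coeffs = np.array([inner(phi, h) for h in basis])    # the components phi^i
recon = sum(c * h for c, h in zip(coeffs, basis))    # full series recovers phi
```

Truncating `coeffs` to its first few entries gives the kind of finite-dimensional approximation the slide mentions.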

SLIDES 50–52

Quadratic Variation

  • Semimartingales on M have well-defined quadratic variation in the Fisher metric; in particular

      $\langle\langle \Pi \rangle\rangle_t := \int_0^t \sum_{i,j} G_{i,j}(\Pi_s)\, d\langle \phi^i(\Pi),\ \phi^j(\Pi) \rangle_s$

  • Proposition 4: Under the conditions of Proposition 3:

      $I(X; Y^t_s \mid \mathcal{Y}_s) = \tfrac{1}{2}\,\mathbf{E}\big[\langle\langle \Pi \rangle\rangle_t - \langle\langle \Pi \rangle\rangle_s \mid \mathcal{Y}_s\big]$

  • Results of this type are of interest in non-equilibrium statistical mechanics, where interactions between systems set up "flows of entropy".

SLIDES 53–54

Finite Dimensional Filters

  • A number of filters are known to evolve on finite-dimensional exponential manifolds (Kalman–Bucy, Beneš…)
  • Proposition 5: Under some technical conditions, $\Pi$ is the unique strong solution of the following intrinsic Stratonovich equation on such a manifold:

      $d\Pi_t = \big(U_t - \tfrac{1}{2}\nabla^{(1)}_{V_t} V_t\big)(\Pi_t)\,dt + V_t(\Pi_t) \circ d\bar{W}_t$

    where $\nabla^{(1)}$ is Amari's (1)-covariant derivative, and $U$ and $V$ are suitably regular, time-dependent vector fields.

SLIDES 55–56

Projections onto Submanifolds (Brigo, Pistone, Hanzon, Le Gland, Armstrong…)

  1. Choose a suitable $C^2$-embedded finite-dimensional submanifold $N \subset M$.
  2. The tangent space $T_P N$ is complete w.r.t. the Fisher metric.
  3. Evaluate $u_t$ and $v_t$ at points of $N$. (These are tangent vectors of M.)
  4. Project onto $T_P N$ in the Fisher metric to obtain an evolution equation on $N$.

  • The Hilbert manifold is well suited to this purpose
  • One could also project in the model space metric
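Step 4 is a linear-algebra operation once a basis of $T_P N$ is fixed: solve the normal equations in the Fisher (weighted) inner product. A sketch with arbitrary stand-in weights and vectors (none of the data below come from the slides):

```python
import numpy as np

# Orthogonal projection of a vector u onto span(columns of D) in the
# weighted inner product <a, b> = sum_i w_i a_i b_i, a finite-dimensional
# stand-in for projecting a tangent vector of M onto T_P N.
rng = np.random.default_rng(3)
n, k = 8, 3
w = rng.random(n) + 0.1           # positive weights defining the metric
D = rng.standard_normal((n, k))   # columns: a basis of the tangent space T_P N
u = rng.standard_normal(n)        # vector to project

G = D.T @ (w[:, None] * D)        # Gram matrix G_ij = <D_i, D_j>
b = D.T @ (w * u)                 # right-hand side b_i = <u, D_i>
c = np.linalg.solve(G, b)         # normal equations G c = b
proj = D @ c                      # the metric projection of u onto T_P N
```

The residual `u - proj` is orthogonal to every basis vector in the weighted metric, which is exactly the defining property of the projection.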

SLIDE 57

Related Work

Details in:

1. N.J. Newton, An infinite-dimensional statistical manifold modelled on Hilbert space, J. Functional Anal. 263, 1661–1681 (2012).
2. N.J. Newton, Information Geometric Nonlinear Filtering, Infin. Dimens. Anal. Quantum Probab. Relat. Top. 18, 1550014 (2015).
3. N.J. Newton, Infinite-dimensional statistical manifolds based on a balanced chart, Bernoulli 22, 711–731 (2016).
4. J. Armstrong and D. Brigo, Stochastic filtering via L2 projection on mixture manifolds with computer algorithms and numerical examples, arXiv:1303.6236 (2013).
5. D. Brigo, B. Hanzon and F. Le Gland, Approximate nonlinear filtering on exponential manifolds of densities, Bernoulli 5, 495–534 (1999).
6. D. Brigo and G. Pistone, Projection-based dimensionality reduction for measure-valued evolution equations in statistical manifolds, arXiv:1601.04189 (2016).
7. A. Cena and G. Pistone, Exponential statistical manifold, Ann. Inst. Statist. Math. 59, 27–56 (2007).

SLIDE 58

Related Work (cont.)

8. P. Gibilisco and G. Pistone, Connections on non-parametric statistical manifolds by Orlicz space geometry, Infin. Dimens. Anal. Quantum Probab. Relat. Top. 1, 325–347 (1998).
9. M.R. Grasselli, Dual connections in non-parametric classical information geometry, Ann. Inst. Statist. Math. 62, 873–896 (2010).
10. G. Pistone and M.P. Rogantin, The exponential statistical manifold: mean parameters, orthogonality and space transformations, Bernoulli 5, 721–760 (1999).
11. G. Pistone and C. Sempi, An infinite-dimensional geometric structure on the space of all probability measures equivalent to a given one, Ann. Statist. 23, 1543–1561 (1995).