SLIDE 1

Applications of geometric optimisation techniques to engineering problems

Jochen Trumpf

Jochen.Trumpf@anu.edu.au

Department of Information Engineering Research School of Information Sciences and Engineering The Australian National University and National ICT Australia Ltd.

Applications of geometric optimisation techniques to engineering problems – p. 1/31

SLIDE 9
Overview

What is geometric optimisation?
Ex 1: Blind Source Separation (BSS)
    Independent Component Analysis (ICA)
Ex 2: face recognition
    dominant eigenspaces of matrix pencils (LDA)
Ex 3: time series clustering
    “on-the-fly” geometry
state of the art and open problems

SLIDE 10

What is geometric optimisation?

Given a real-valued function

$$f : M \to \mathbb{R}, \quad x \mapsto f(x)$$

defined on some geometric object $M$, here a smooth manifold, find a method to compute (if it exists)

$$x^* := \operatorname*{argmin}_{x \in M} f(x)$$

that utilises the (local) geometry of $M$.
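As a minimal sketch of what "utilising the geometry" can mean, the snippet below minimises $f(x) = x^\top A x$ over the unit sphere by projecting the Euclidean gradient onto the tangent space and mapping each step back onto the sphere. The cost function, the matrix `A`, the step size and the function name are all assumptions chosen for illustration; they are not taken from the talk.

```python
import numpy as np

def sphere_gradient_descent(A, x0, step=0.1, iters=500):
    """Minimise f(x) = x^T A x over the unit sphere by projected gradient steps."""
    x = x0 / np.linalg.norm(x0)
    for _ in range(iters):
        egrad = 2.0 * A @ x                  # Euclidean gradient of f
        rgrad = egrad - (x @ egrad) * x      # project onto the tangent space at x
        x = x - step * rgrad                 # move along the tangent direction
        x = x / np.linalg.norm(x)            # map back onto the sphere
    return x

A = np.diag([1.0, 2.0, 3.0])                 # illustrative symmetric matrix
x_star = sphere_gradient_descent(A, np.array([1.0, 1.0, 1.0]))
print(x_star, x_star @ A @ x_star)
```

For this choice of `A` the minimiser is an eigenvector for the smallest eigenvalue, so the printed cost should be close to 1.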

SLIDE 11

Ex 1: Blind Source Separation

The cocktail party problem.

Image: http://www.lnt.de/LMS/research/projects/BSS

SLIDE 12

Ex 1: Blind Source Separation

source signals

observed mixtures

audio, EEG, MEG, fMRI, wireless, ...

Image: http://www.cis.hut.fi/aapo/papers/NCS99web/node17.html

SLIDE 13

BSS – the model

Individual signals ($i = 1, \dots, d$)

$$x_i : [0, T] \to \mathbb{R}, \quad t \mapsto x_i(t)$$

are being uniformly sampled and the samples collected into row vectors

$$x_i = \begin{pmatrix} x_i(t_0) & x_i(t_0 + \Delta) & \dots & x_i(t_0 + (N-1) \cdot \Delta) \end{pmatrix}$$

which are then stacked into a matrix

$$X = \begin{pmatrix} x_1 \\ \vdots \\ x_d \end{pmatrix} \in \mathbb{R}^{d \times N}.$$
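The sampling and stacking step can be sketched as follows. The two example signals, the sampling parameters and the helper name `sample_signals` are hypothetical and serve only to make the shapes concrete.

```python
import numpy as np

def sample_signals(signals, t0, delta, N):
    """Stack uniformly sampled signals into a d x N matrix, one row per signal."""
    t = t0 + delta * np.arange(N)            # t0, t0 + delta, ..., t0 + (N-1) delta
    return np.vstack([x(t) for x in signals])

# Two illustrative signals; these particular functions are assumptions.
signals = [np.sin, lambda t: np.sign(np.cos(3.0 * t))]
X = sample_signals(signals, t0=0.0, delta=0.01, N=1000)
print(X.shape)  # (2, 1000)
```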

SLIDE 15

BSS – the model

It is assumed that there are as many source signals as observed signals and that they are related by

$$X_o = M \cdot X_s$$

where $X_o, X_s \in \mathbb{R}^{d \times N}$ and $M \in GL_d(\mathbb{R})$.

Task: Find $X_s$ (or $M^{-1}$) from knowing $X_o$ subject to some plausible criterion.
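To make the model concrete, here is a small sketch of the mixing step. The sources and the mixing matrix are made up for illustration. It only shows that recovery is trivial once $M$ is known; the actual task is to do this without knowing $M$.

```python
import numpy as np

rng = np.random.default_rng(0)
d, N = 3, 1000

X_s = rng.uniform(-1.0, 1.0, size=(d, N))    # hypothetical source signals, one per row
M = np.array([[1.0, 0.5, 0.2],
              [0.3, 1.0, 0.4],
              [0.1, 0.6, 1.0]])              # hypothetical invertible mixing matrix

X_o = M @ X_s                                # observed mixtures
X_rec = np.linalg.solve(M, X_o)              # M^{-1} X_o, possible only if M is known
print(np.allclose(X_rec, X_s))
```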

SLIDE 17

BSS as ICA problem

We treat the columns of $X_o$ as i.i.d. samples of an observed random variable vector $Y$ given by

$$Y = M \cdot X$$

where $X$ is the unknown random variable source vector.

The ICA paradigm is now that the components of $X$, i.e. the individual signals, are mutually independent.

SLIDE 19

BSS as ICA problem

Hence, we are trying to find the invertible $M$ that makes the components of the corresponding $X$ “as independent as possible”.

Note: The matrix $M$ in

$$Y = M \cdot X$$

is identifiable up to scaling and permutations if and only if the components of $X$ are mutually independent and at most one of them is Gaussian.
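The scaling and permutation ambiguity can be written out explicitly. For any invertible diagonal matrix $D$ and any permutation matrix $P$ (both symbols are introduced here only for illustration), the same observation $Y$ is explained by a rescaled and reordered source vector:

```latex
Y = M \cdot X = \left( M P^{\top} D^{-1} \right) \cdot \left( D P X \right)
```

Since the components of $D P X$ are again mutually independent, independence alone cannot distinguish $M$ from $M P^{\top} D^{-1}$.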

SLIDE 21

BSS as ICA problem

A computational trick is centering and prewhitening: subtract the mean and multiply by the inverse square root of the covariance matrix of $Y$ (assuming finite second moments) to obtain

$$Y = Q \cdot X$$

where $Q \in O_d(\mathbb{R})$ and $X$ and $Y$ are zero mean and unit variance.

Note: Prewhitening from samples works best in the Gaussian case ...

see IEEE TSP, 53(10):3625–3632, 2005
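A minimal sketch of centering and prewhitening from samples, assuming the sample covariance is used in place of the true covariance and that it is nonsingular. The sources, the mixing matrix and the function name are illustrative assumptions.

```python
import numpy as np

def center_and_whiten(Y):
    """Return whitened data Z = C^{-1/2} (Y - mean) and the whitening matrix."""
    Yc = Y - Y.mean(axis=1, keepdims=True)           # centering
    C = Yc @ Yc.T / Yc.shape[1]                      # sample covariance (d x d)
    evals, evecs = np.linalg.eigh(C)                 # C = V diag(evals) V^T
    W = evecs @ np.diag(evals ** -0.5) @ evecs.T     # symmetric inverse square root
    return W @ Yc, W

rng = np.random.default_rng(0)
S = rng.uniform(-1.0, 1.0, size=(2, 5000))           # hypothetical independent sources
M = np.array([[2.0, 1.0], [1.0, 3.0]])               # hypothetical mixing matrix
Z, W = center_and_whiten(M @ S)
print(np.round(Z @ Z.T / Z.shape[1], 6))             # sample covariance of Z
```

By construction the sample covariance of `Z` is the identity up to rounding, which is why the remaining unknown factor can be sought in $O_d(\mathbb{R})$.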

SLIDE 23

ICA as geometric optimisation problem

We arrive at the geometric optimisation problem of minimising mutual information between the components of $Q^\top Y$ over $Q \in O_d(\mathbb{R})$.

One-unit FastICA maximises $E[G(q^\top Y)]$ over $q \in S^{d-1}$ where

$$G : \mathbb{R} \to \mathbb{R}, \quad z \mapsto \frac{1}{a} \log\cosh(az)$$

is a contrast function.

The expectation is computed from samples, the optimisation method is an approximate Newton-on-manifold algorithm.

http://www.cis.hut.fi/aapo/papers/IJCNN99_tutorialweb
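The one-unit iteration can be sketched as below, using the commonly stated fixed-point update $q \leftarrow E[G'(q^\top Y)Y] - E[G''(q^\top Y)]q$ followed by normalisation, with expectations replaced by sample means. The test data (uniform sources, a rotation as mixing matrix, a fixed iteration count) are assumptions made for this sketch, not part of the talk.

```python
import numpy as np

def fastica_one_unit(Y, q0, a=1.0, iters=100):
    """One-unit FastICA fixed-point iteration on whitened data Y (d x N)."""
    q = q0 / np.linalg.norm(q0)
    for _ in range(iters):
        z = q @ Y                                    # projections q^T y, shape (N,)
        g = np.tanh(a * z)                           # G'(z) for G(z) = log cosh(a z) / a
        dg = a * (1.0 - g ** 2)                      # G''(z)
        q_new = (Y * g).mean(axis=1) - dg.mean() * q # E[G'(q^T Y) Y] - E[G''(q^T Y)] q
        q = q_new / np.linalg.norm(q_new)            # back onto the sphere S^{d-1}
    return q

rng = np.random.default_rng(0)
N = 50000
S = rng.uniform(-1.0, 1.0, size=(2, N))              # hypothetical independent sources
S = (S - S.mean(axis=1, keepdims=True)) / S.std(axis=1, keepdims=True)
theta = 0.4
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])      # orthogonal mixing, as after prewhitening
Y = Q @ S

q = fastica_one_unit(Y, np.array([1.0, 0.0]))
print(np.abs(Q.T @ q))                               # one entry should be close to 1
```

The absolute value is taken because a column of $Q$ is only determined up to sign.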

SLIDE 24

Ex 2: face recognition

Image: IEEE TPAMI, 23(2):228–233, 2001

SLIDE 26

face recognition – the model

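An image is represented as a vector $X \in \mathbb{R}^t$. Images are divided into $c$ classes with $N_j$ images $X_i^j$, $i = 1, \dots, N_j$, in class $j = 1, \dots, c$.

Consider the within-class scatter matrix

$$S_w = \sum_{i,j} (X_i^j - \mu_j)(X_i^j - \mu_j)^\top$$

and the between-class scatter matrix

$$S_b = \sum_j (\mu_j - \mu)(\mu_j - \mu)^\top,$$

where $\mu_j$ denotes the mean of class $j$ and $\mu$ the mean of all images.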
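The two scatter matrices can be computed as in the following sketch, which stores images as columns and takes the class mean and the overall mean as the centres. The synthetic data and the function name are assumptions for illustration only.

```python
import numpy as np

def scatter_matrices(X, labels):
    """Within- and between-class scatter of the columns of X (t x N)."""
    mu = X.mean(axis=1, keepdims=True)               # overall mean
    t = X.shape[0]
    S_w, S_b = np.zeros((t, t)), np.zeros((t, t))
    for j in np.unique(labels):
        Xj = X[:, labels == j]                       # images of class j
        mu_j = Xj.mean(axis=1, keepdims=True)        # class mean
        S_w += (Xj - mu_j) @ (Xj - mu_j).T
        S_b += (mu_j - mu) @ (mu_j - mu).T           # unweighted sum over classes
    return S_w, S_b

rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 20)                 # c = 3 classes, N_j = 20 each
X = rng.normal(size=(5, 60)) + 3.0 * rng.normal(size=(5, 3))[:, labels]
S_w, S_b = scatter_matrices(X, labels)
print(S_w.shape, S_b.shape)  # (5, 5) (5, 5)
```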

SLIDE 29

face recognition as LDA problem

Orthogonally projecting the image vectors into a lower dimensional space $Y = Q^\top X$ yields projected scatter matrices $Q^\top S_{\{w,b\}} Q$.

The aim is to maximise

$$\frac{\det(Q^\top S_b Q)}{\det(Q^\top S_w Q)}$$

over $Q \in \mathrm{St}(d, t)$, the orthogonal Stiefel manifold.

This amounts to finding the dominant $d$-dimensional eigenspace of the pencil $(S_b, S_w)$.

SLIDE 30

LDA as geometric optimisation problem

Given a symmetric/positive-definite matrix pencil $(A, B)$ with eigenvalues ($Ax = \lambda Bx$)

$$\lambda_1 \ge \dots \ge \lambda_d > \lambda_{d+1} \ge \dots \ge \lambda_n$$

the unique $d$-dimensional dominant eigenspace is the unique global maximum of

$$f : \mathrm{Grass}(d, n) \to \mathbb{R}, \quad [Q] \mapsto \operatorname{tr}\!\left(Q^\top A Q \, (Q^\top B Q)^{-1}\right)$$

see J Comp and Appl Math, 189(1):274–285, 2006
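The following sketch checks this statement numerically on a random pencil. It computes the generalised eigenpairs by a Cholesky reduction (a standard dense approach chosen here for self-containment, not the geometric method of the talk), evaluates $f$ at the dominant eigenspace, and confirms that $f$ does not depend on the basis chosen for a subspace. The matrices and helper names are hypothetical.

```python
import numpy as np

def pencil_eig(A, B):
    """Eigenpairs of A x = lambda B x for symmetric A and positive-definite B."""
    L = np.linalg.cholesky(B)                        # B = L L^T
    Linv = np.linalg.inv(L)
    lam, U = np.linalg.eigh(Linv @ A @ Linv.T)       # ordinary symmetric problem
    order = np.argsort(lam)[::-1]                    # sort eigenvalues descending
    return lam[order], Linv.T @ U[:, order]          # back-transform the eigenvectors

def trace_quotient(Q, A, B):
    """f([Q]) = tr(Q^T A Q (Q^T B Q)^{-1})."""
    return np.trace(Q.T @ A @ Q @ np.linalg.inv(Q.T @ B @ Q))

rng = np.random.default_rng(0)
n, d = 6, 2
G = rng.normal(size=(n, n))
A = G + G.T                                          # hypothetical symmetric A
H = rng.normal(size=(n, n))
B = H @ H.T + n * np.eye(n)                          # hypothetical positive-definite B

lam, V = pencil_eig(A, B)
Q_dom = V[:, :d]                                     # basis of the dominant eigenspace
R = np.array([[1.0, 2.0], [0.0, 3.0]])               # change of basis within the subspace
Q_rand = rng.normal(size=(n, d))                     # basis of some other subspace
print(trace_quotient(Q_dom, A, B), lam[:d].sum())
```

The two printed numbers should agree, since at the dominant eigenspace the cost equals the sum of the $d$ largest eigenvalues of the pencil.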

SLIDE 31

Ex 3: time-series clustering

A time series is a (finite) sequence $\{x_t\}_{t=1,\dots,N}$ of vectors (in $\mathbb{R}^n$), e.g. arising from (sampling) a trajectory of a dynamical system. A popular method of time-series clustering works in delay space

$$\left\{ \begin{pmatrix} x_p \\ x_{p-1} \\ \vdots \\ x_{p-l+1} \end{pmatrix} \;\middle|\; p = l, \dots, N \right\}$$
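Constructing the delay vectors is a simple windowing operation, sketched below with a hypothetical helper `delay_vectors` and a toy scalar series.

```python
import numpy as np

def delay_vectors(x, l):
    """Delay vectors (x_p, x_{p-1}, ..., x_{p-l+1}) for p = l, ..., N, one per row."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]                               # scalar series: vectors in R^1
    N = x.shape[0]
    # 0-based index p-1 holds x_p; reverse each window so the newest sample comes first
    return np.stack([x[p - l:p][::-1].ravel() for p in range(l, N + 1)])

D = delay_vectors([1, 2, 3, 4, 5], l=3)
print(D)
# [[3. 2. 1.]
#  [4. 3. 2.]
#  [5. 4. 3.]]
```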

SLIDE 32

Ex 3: time-series clustering

Knowl. Inf. Syst., 8(2):154–177, 2005

SLIDE 33

Ex 3: time-series clustering

ICDM 2005, pp. 114–121

SLIDE 34

state of the art

Let $M$ be a $d$-dimensional Riemannian manifold and let $f : M \to \mathbb{R}$ be smooth. The derivative of $f$ at $x \in M$ is a linear form

$$D f(x) : T_x M \to \mathbb{R}$$

A point $x^* \in M$ is called a critical point of $f$ if

$$D f(x^*)\xi = 0, \quad \forall \xi \in T_{x^*} M.$$

SLIDE 36

state of the art

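Fact: $x^* \in M$ is a strict local minimum of $f$ if

(a) $x^*$ is a critical point of $f$,
(b) the Hessian form

$$\operatorname{hess} f(x^*) : T_{x^*} M \times T_{x^*} M \to \mathbb{R}$$

is positive definite.

Geodesics of $M$: $\forall x \in M$ and $\xi \in T_x M$

$$\gamma_x : \mathbb{R} \supset (-\varepsilon, \varepsilon) \to M, \quad \varepsilon \mapsto \gamma_x(\varepsilon)$$

such that $\gamma_x(0) = x$ and $\dot{\gamma}_x(0) = \xi$.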

SLIDE 37

state of the art

Riemannian Newton direction $\xi \in T_x M$ by solving

$$\operatorname{hess} f(x) \cdot \xi = \operatorname{grad} f(x)$$

[Figure: the points $x_k$ and $x_{k+1}$ on $M$ and the direction $\xi$.]

SLIDE 38

state of the art

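Local parameterisation of $M$ around $x \in M$

$$\mu_x : \mathbb{R}^d \to M, \quad \kappa \mapsto \mu_x(\kappa); \quad \mu_x(0) = x$$

Construct locally

$$f \circ \mu_x : \mathbb{R}^d \to \mathbb{R}$$

Euclidean Newton direction $\kappa \in \mathbb{R}^d$ by solving

$$H(f \circ \mu_x)(0)\,\kappa = \nabla(f \circ \mu_x)(0)$$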

SLIDE 39

state of the art

[Figure: the points $x_k$ and $x_{k+1}$ on $M$, the maps $\mu_x^{-1}$ and $\nu_x$, and the direction $\kappa$ in $\mathbb{R}^d$.]

SLIDE 40

state of the art

Let $x^* \in M$ be a nondegenerate critical point. Let $\{\mu_x\}_{x \in M}$ and $\{\nu_x\}_{x \in M}$ be locally smooth around $x^*$. Consider the following iteration on $M$

$$x_0 \in M, \quad x_{k+1} = \nu_{x_k}\!\left(N_{f \circ \mu_{x_k}}(0)\right) \qquad (N)$$

Theorem: (Hüper-T.) Under the condition

$$D \mu_{x^*}(0) = D \nu_{x^*}(0)$$

there exists an open neighborhood $V \subset M$ of $x^*$ such that the point sequence generated by (N) converges quadratically to $x^*$ provided $x_0 \in V$.
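A small sketch of iteration (N) on the sphere $S^{n-1}$ for the illustrative cost $f(q) = q^\top A q$. It assumes that $N_g(0)$ denotes the usual Euclidean Newton iterate $-H g(0)^{-1} \nabla g(0)$ started at $0$ (with the minus sign placed in the update), and it uses the same map $\mu_q(\kappa) = \nu_q(\kappa) = (q + B_q\kappa)/\|q + B_q\kappa\|$ for both families, where the columns of $B_q$ form an orthonormal basis of the tangent space. The cost, the matrix `A` and all names are assumptions made for this example.

```python
import numpy as np

def tangent_basis(q):
    """Orthonormal basis (n x (n-1)) of the tangent space of the sphere at q."""
    _, _, Vt = np.linalg.svd(q.reshape(1, -1))       # first row of Vt is +-q
    return Vt[1:].T

def newton_step_on_sphere(A, q):
    """One step x_{k+1} = nu_x(N_{f o mu_x}(0)) for f(q) = q^T A q with mu = nu."""
    Bq = tangent_basis(q)
    a = q @ A @ q
    grad = 2.0 * Bq.T @ A @ q                        # gradient of f o mu_q at kappa = 0
    hess = 2.0 * (Bq.T @ A @ Bq - a * np.eye(len(q) - 1))   # Hessian of f o mu_q at 0
    kappa = -np.linalg.solve(hess, grad)             # Euclidean Newton step from 0
    y = q + Bq @ kappa                               # apply nu_q, then normalise
    return y / np.linalg.norm(y)

A = np.diag([1.0, 2.0, 3.0, 4.0]) + 0.1 * (np.ones((4, 4)) - np.eye(4))
q = np.array([1.0, 0.0, 0.0, 0.0])                   # start near the minimiser
for k in range(6):
    residual = np.linalg.norm(A @ q - (q @ A @ q) * q)   # vanishes at critical points
    print(k, residual)
    q = newton_step_on_sphere(A, q)
```

The printed residuals should collapse to rounding level within a few steps, which is the fast local convergence the theorem describes.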

SLIDE 41

state of the art

know how to construct computable families of coordinate charts for St, Grass
can deal with approximate Newton
local convergence theory for more general iterations (Manton-T.)
some global convergence results of trust region on manifold schemes (Absil et al.)

SLIDE 42

trust-region methods

Image: http://www.inma.ucl.ac.be/˜blondel/workshops/2004/Absil.pdf

SLIDE 43

state of the art – ICA

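One-unit ICA problem as an optimisation problem on $S^{d-1}$

$$f : S^{d-1} \to \mathbb{R}, \quad q \mapsto E[G(q^\top Y)].$$

Geodesics, gradient, Hessian (Hüper-Shen)

$$\gamma_q : \mathbb{R} \to S^{d-1}, \quad \varepsilon \mapsto \exp\!\left(\varepsilon(\xi q^\top - q\xi^\top)\right) q.$$

$$\operatorname{grad} f(q) = \left(I - qq^\top\right) E[G'(q^\top Y)\, Y]$$

$$\operatorname{hess} f(q) \cdot \xi = \Big( \underbrace{E[G''(q^\top Y)\, Y Y^\top]}_{\in \mathbb{R}^{d \times d}} - \underbrace{E[G'(q^\top Y)\, q^\top Y]}_{\in \mathbb{R}} \, I \Big) \cdot \xi$$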

SLIDE 44

state of the art – ICA

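Alternative to geodesics on $S^{d-1}$

$$\rho_q : \mathbb{R} \to S^{d-1}, \quad \varepsilon \mapsto \frac{q + \varepsilon\xi}{\|q + \varepsilon\xi\|}$$

ANICA as a selfmap on $S^{d-1}$

$$q \mapsto \frac{\frac{1}{\tau(q)}\left(E[G'(q^\top Y)\,Y] - E[G''(q^\top Y)]\,q\right)}{\left\|\frac{1}{\tau(q)}\left(E[G'(q^\top Y)\,Y] - E[G''(q^\top Y)]\,q\right)\right\|},$$

where

$$\tau : S^{d-1} \to \mathbb{R}, \quad \tau(q) := E[G'(q^\top Y)\, q^\top Y] - E[G''(q^\top Y)]$$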
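The selfmap can be sketched with sample means in place of expectations and $G(z) = \frac{1}{a}\log\cosh(az)$. The data below (Laplace-distributed sources mixed by a rotation) and the iteration count are assumptions for illustration. Because the inner product of $q$ with the unnormalised image equals $\tau(q)/\tau(q) = 1$, consecutive iterates cannot jump between $q$ and $-q$, which the sketch records in `dots`.

```python
import numpy as np

def anica_map(q, Y, a=1.0):
    """One application of the ANICA selfmap for G(z) = log cosh(a z) / a."""
    z = q @ Y                                        # projections q^T y, shape (N,)
    g = np.tanh(a * z)                               # G'(z)
    dg = a * (1.0 - g ** 2)                          # G''(z)
    tau = (g * z).mean() - dg.mean()                 # tau(q)
    v = ((Y * g).mean(axis=1) - dg.mean() * q) / tau
    return v / np.linalg.norm(v)                     # normalise back onto S^{d-1}

rng = np.random.default_rng(1)
N = 50000
S = rng.laplace(size=(2, N))                         # hypothetical independent sources
S = (S - S.mean(axis=1, keepdims=True)) / S.std(axis=1, keepdims=True)
theta = 0.4
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])      # orthogonal mixing matrix
Y = Q @ S

q = np.array([1.0, 0.0])
dots = []                                            # inner products of consecutive iterates
for _ in range(50):
    q_next = anica_map(q, Y)
    dots.append(q @ q_next)
    q = q_next
print(min(dots), np.abs(Q.T @ q))
```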

SLIDE 46

state of the art – ICA

FastICA vs ANICA

[Figure: $\|x^{(k)} - x^*\|$ on a logarithmic axis against iteration $k$, for FastICA and ANICA.]

Parallel version (ANLICA, Hüper-Shen) with cost function

$$f : O_d(\mathbb{R}) \to \mathbb{R}, \quad Q \mapsto \sum_{i=1}^{m} E[G(q_i^\top Y)]$$

SLIDE 47

state of the art – ICA

[Figure, two panels: Norm($x^{(i)} - x^{(i)*}$) on a logarithmic axis against sweep, for Parallel FastICA and for ANLICA.]

SLIDE 48

the end

Thank you.
