SLIDE 1

Covariance Matrix Simplification For Efficient Uncertainty Management

André Jalobeanu, Jorge A. Gutiérrez
PASEO Research Group, LSIIT (CNRS / Univ. Strasbourg), Illkirch, France

*part of SpaceFusion project ANR “Jeunes Chercheurs” 2005-2008

MaxEnt 2007

SLIDE 2

Outline

๏ Uncertainty and error propagation

  • Why do we need to simplify the covariances?
  • Computing/storing/using uncertainties

๏ Inverse covariance simplification

  • Information theory-based methods
  • The proposed algorithm
  • Some results

๏ Conclusions

SLIDE 3

Uncertainties and error propagation

  • Error propagation from the source to the end result
  • Computing uncertainties
  • Storing uncertainties
  • Using uncertainties

SLIDE 4

Error modeling: from source to result

[Diagram: input data (pdf) → processing algorithm (model) → output (transformed pdf)]

๏ Input noise: stochastic process

(observation = realization of a random variable)

  • Several additive processes, zero-mean
  • Stochastic independence between detectors (white noise)
  • Stationary process

๏ Processing algorithm: deterministic transform in general (first-order propagation is sketched below)

๏ Output noise: stochastic process

(result = realization of a random variable)

  • Additive & zero-mean assumption, stochastic independence, stationarity
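To make the propagation concrete: below is a minimal sketch of first-order (delta-method) error propagation through a deterministic transform, using a finite-difference Jacobian. The transform `f` and the numbers are illustrative placeholders, not the processing algorithms considered in the talk.

```python
import numpy as np

# First-order (delta-method) error propagation through a deterministic
# transform f: the output covariance is J @ Sigma_in @ J.T, where J is the
# Jacobian of f at the operating point. 'f' and the values below are
# illustrative placeholders, not the talk's actual processing algorithm.

def propagate_covariance(f, x, sigma_in, eps=1e-6):
    """Propagate an input covariance through f via a finite-difference Jacobian."""
    x = np.asarray(x, dtype=float)
    y0 = np.asarray(f(x))
    jac = np.empty((y0.size, x.size))
    for j in range(x.size):
        dx = np.zeros_like(x)
        dx[j] = eps
        jac[:, j] = (np.asarray(f(x + dx)) - y0) / eps
    return jac @ sigma_in @ jac.T

# Example: white (diagonal) zero-mean input noise, as assumed on the slide.
sigma_in = np.diag([0.1, 0.2, 0.15])
f = lambda x: np.array([x[0] + x[1], x[1] * x[2]])  # toy transform
print(propagate_covariance(f, [1.0, 2.0, 3.0], sigma_in))
```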
SLIDE 5

Example: 2D Super-Resolution in astronomy

Experimental setting: 4 noisy images, undersampled (factor 2), shifted by 1/2 pixel

[Figure: pointing pattern in model space (positions 1, 2, 3); the image fusion result (mean); the inverse covariance of the result, showing the diagonal terms and the near-diagonal (covariance) terms]

[ADA 2006]

SLIDE 6

Uncertain knowledge of model variables

๏ Diagonal terms

  • Inverse variance, if all other terms are zero (1 parameter per variable)

๏ Near-diagonal terms

  • Nearest neighbors (left/right and up/down in 2D)
  • Longer range (diagonal directions in 2D)

๏ Long-range terms (n per pixel)

  • Should be zero (more convenient, and realistic)

Posterior pdf: P(X | Y) ∝ exp(−U(X))

Gaussian approximation of the posterior pdf: the inverse covariance matrix is given by the 2nd derivatives of U at the optimum (sketched below). Sparse, but not enough!
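To illustrate, here is a small sketch of this Gaussian (Laplace) approximation, recovering the inverse covariance as the Hessian of a toy energy U at its minimum. The tridiagonal matrix A is a stand-in for the sparse structure mentioned above, not an actual posterior from the talk.

```python
import numpy as np

# Laplace approximation: near the optimum x*, P(X|Y) ∝ exp(-U(X)) is
# approximated by a Gaussian whose inverse covariance is the Hessian of U
# at x*. The energy U below is a toy quadratic with nearest-neighbor
# coupling (a 1D chain), for illustration only.

def numerical_hessian(U, x, eps=1e-4):
    """Second derivatives of U at x by central finite differences."""
    n = x.size
    H = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.eye(n)[i] * eps
            ej = np.eye(n)[j] * eps
            H[i, j] = (U(x + ei + ej) - U(x + ei - ej)
                       - U(x - ei + ej) + U(x - ei - ej)) / (4 * eps**2)
    return H

A = np.diag([4.0] * 5) + np.diag([-1.0] * 4, 1) + np.diag([-1.0] * 4, -1)
U = lambda x: 0.5 * x @ A @ x
x_opt = np.zeros(5)                     # minimizer of U
inv_cov = numerical_hessian(U, x_opt)   # ≈ A: sparse (tridiagonal), as on the slide
print(np.round(inv_cov, 3))
```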

SLIDE 7

Approximating uncertainties

๏ Inverse covariance approximation

Goal: provide a 1st-order Markovian model

  • Drop the long-range interactions for simplicity
  • Minimize a distance between Gaussian distributions, e.g. the Kullback-Leibler divergence between the true model G(0, Σ_X) and the approximate model G(0, Σ̃_X):

inf_Σ̃ D_KL( G(0, Σ_X), G(0, Σ̃_X) )

  • Preserve the variance and nearest-neighbor covariance

[Figure: true vs. approximate covariance and inverse covariance matrices]

Inverse covariance matrix: sparse, but not enough...

SLIDE 8

Storing uncertainties - 2D images, 1 band

Optimum: N×N pixels. Uncertainties: N×N × (1 + 2 [+ 2]) parameters

Types of interaction: self, vertical/horizontal neighbors, and optionally the two diagonal neighbors

Limited redundancy: 3 or 5 values per pixel

SLIDE 9

Multispectral uncertainties

Add interactions between bands

[Figure: M spectral bands with inter-band interactions]

Optimum: M bands of N×N pixels. Uncertainties: N×N × (M + 2M [+ 2M] + M−1) parameters

Limited redundancy: max. 4 or 6 values per pixel and band

SLIDE 10

Using uncertainties

Example: recursive data fusion

[Figure: fusing Result #1 and Result #2: plain average vs. probabilistic fusion]

๏ Bayesian updating and uncertainty propagation

  • Use a simplified posterior pdf (approx. inverse covariance matrix) as a prior density for subsequent data processing
  • Recursive (vs. batch) data fusion: allows for model updates (a fusion sketch follows below)

Prior energy at step k+1: Φ^(k+1)(X) = Xᵀ Σ̃⁻¹_(k) X
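A minimal sketch of one recursive fusion step under this Gaussian model, assuming the standard update in which precisions (inverse covariances) add and the fused mean is precision-weighted; the tridiagonal matrices and values are illustrative stand-ins for the simplified 1st-order Markov models.

```python
import numpy as np

# Recursive Gaussian fusion: precisions (inverse covariances) add, and the
# fused mean is the precision-weighted combination. The simplified posterior
# of step k then serves as the prior for step k+1, as on the slide.

def fuse(mean_a, prec_a, mean_b, prec_b):
    """Fuse two Gaussian estimates given their means and inverse covariances."""
    prec = prec_a + prec_b
    mean = np.linalg.solve(prec, prec_a @ mean_a + prec_b @ mean_b)
    return mean, prec

# Two toy estimates with tridiagonal (1st-order Markov) precisions.
n = 4
prec1 = np.diag([2.0] * n) + np.diag([-0.5] * (n - 1), 1) + np.diag([-0.5] * (n - 1), -1)
prec2 = 2.0 * prec1                    # the second result is more certain
m1, m2 = np.ones(n), 2.0 * np.ones(n)
mean, prec = fuse(m1, prec1, m2, prec2)
print(mean)                            # between m1 and m2, pulled toward m2
```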

SLIDE 11

Analyzing processed images when uncertainties are provided

๏ Bayesian approach: data analysis from processed images

  • Goal: compute the posterior pdf of the parameters of interest
  • Use the extended data term: approximate posterior pdf
  • Update existing statistical methods (Bayesian or not) to use this extended term - no other changes required!
  • If possible, provide uncertainties on the analysis result as well
  • Example: prediction X = F(parameters), F = star profile...
  • Extended data term: the usual per-pixel term plus extra off-diagonal terms (the horizontal and vertical inverse covariances); a sketch follows below:

D(X) = Σ_p [ ½ d_p (X − X̂)_p² + c^h_p (X − X̂)_p (X − X̂)_r(p) + c^v_p (X − X̂)_p (X − X̂)_u(p) ]

where r(p) and u(p) denote the horizontal and vertical neighbors of pixel p.
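A hedged sketch of evaluating this extended data term on a 2D image, with `d`, `c_h` and `c_v` holding the stored per-pixel inverse variances and horizontal/vertical inverse covariances; the neighbor convention (r = right, u = up) and the boundary handling are assumptions, as the slide does not spell them out.

```python
import numpy as np

# Extended data term for a 2D image: the usual per-pixel quadratic term plus
# off-diagonal terms coupling each pixel to its right and upper neighbors
# through the stored inverse covariances. Names follow the slide's notation;
# the boundary handling (drop couplings at the image edge) is an assumption.

def extended_data_term(X, X_hat, d, c_h, c_v):
    r = X - X_hat                                        # residual (X - X̂)
    usual = 0.5 * np.sum(d * r**2)                       # diagonal (self) terms
    horiz = np.sum(c_h[:, :-1] * r[:, :-1] * r[:, 1:])   # right-neighbor terms
    vert = np.sum(c_v[:-1, :] * r[:-1, :] * r[1:, :])    # upper-neighbor terms
    return usual + horiz + vert

# Toy usage with constant maps of the stored uncertainty parameters.
rng = np.random.default_rng(0)
X, X_hat = rng.normal(size=(8, 8)), np.zeros((8, 8))
d = np.full((8, 8), 2.0)       # inverse variances
c_h = np.full((8, 8), -0.3)    # horizontal inverse covariances
c_v = np.full((8, 8), -0.3)    # vertical inverse covariances
print(extended_data_term(X, X_hat, d, c_h, c_v))
```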

SLIDE 12

Inverse covariance simplification

  • State of the art
  • The proposed algorithm
  • Some results
  • Approximations for large matrices

SLIDE 13

Problem statement for Markov Chains

[Figure: given model (the computed posterior pdf) vs. simplified model, each shown as an inverse covariance matrix (Σ⁻¹ vs. Σ̃⁻¹) with its graphical model]

SLIDE 14

Information-theory based approaches

๏ Simply setting the unwanted terms to zero - bad idea (positive definiteness may be lost)

๏ Minimize a distance between distributions instead

  • Kullback-Leibler divergence (relative entropy)
  • Symmetric Kullback-Leibler
  • Bhattacharyya distance...

D_KL(P_Σ̃ | P_Σ) ∝ log(det Σ / det Σ̃) + tr(Σ⁻¹ Σ̃)

  • Relative entropy maximization: maximum entropy subject to the constraints Σ̃⁻¹_uv = 0, with Σ̃ positive definite

Nonlinear problem, and the constraints are difficult to enforce. Any ideas?
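For reference, the full zero-mean Gaussian KL divergence behind the proportionality above is D_KL = ½ [tr(Σ⁻¹Σ̃) − n + log(det Σ / det Σ̃)]; here is a quick numerical sanity check of that expression (the matrices are arbitrary SPD examples):

```python
import numpy as np

# KL divergence between zero-mean Gaussians P_Sigma_t (approximate) and
# P_Sigma (true):
#   D_KL = 0.5 * [ tr(Sigma^{-1} Sigma_t) - n + log(det Sigma / det Sigma_t) ]
# The slide keeps only the Sigma_t-dependent part (hence the '∝').

def kl_zero_mean(sigma_t, sigma):
    n = sigma.shape[0]
    _, logdet_t = np.linalg.slogdet(sigma_t)
    _, logdet = np.linalg.slogdet(sigma)
    return 0.5 * (np.trace(np.linalg.solve(sigma, sigma_t)) - n + logdet - logdet_t)

# Sanity check: identical covariances give zero divergence.
A = np.random.default_rng(2).normal(size=(4, 4))
sigma = A @ A.T + 4 * np.eye(4)
print(kl_zero_mean(sigma, sigma))   # ~0.0
```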

SLIDE 15

Proposed approach

๏ Enforce the constraints exactly:

  • Σ̃⁻¹_uv = 0 if (u, v) ∈ Ω (the dropped interactions)
  • C_kl = Σ_kl if (k, l) ∈ Ω̄ (the preserved covariances)

๏ Relax Σ̃⁻¹ C = I and minimize the norm of the residual

E = ‖Σ̃⁻¹ C − I‖²

to find the unknown inverse covariance entries X and the unknown covariances Z (even if not needed)

SLIDE 16

Alternate optimization scheme

E = ‖Σ̃⁻¹ C − I‖² is not a quadratic form in (X, Z) jointly, but it is quadratic in each variable separately, which suggests alternating the two half-steps (a sketch follows below):

๏ Z fixed, minimize E with respect to X

  • Quadratic form - use a conjugate gradient descent

E_Z(X) = ½ Xᵀ A_Z X + B_Zᵀ X + const

๏ X fixed, minimize E with respect to Z

  • Quadratic form - use a conjugate gradient descent

E_X(Z) = ½ Zᵀ A_X Z + B_Xᵀ Z + const
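Below is a small-scale sketch of this alternate scheme. Each half-step is solved here as an exact linear least-squares problem, a direct stand-in for the conjugate gradient descent the talk uses on large problems; the support masks, sizes and initialization are illustrative assumptions, and symmetry of Σ̃⁻¹ is not enforced in this simplified version.

```python
import numpy as np

# Alternate scheme for E(X, Z) = ||P C - I||_F^2, where P = Sigma_tilde^{-1}
# is free only on the allowed support S, and C equals Sigma on the preserved
# entries and is free (the unknowns Z) elsewhere. Each half-step is an exact
# least-squares solve; E is non-increasing across the sweeps.

def update_P(C, S):
    """Z fixed: minimize ||P C - I||^2 row by row over the support S of P."""
    n = C.shape[0]
    P = np.zeros((n, n))
    for i in range(n):
        idx = np.where(S[i])[0]
        sol, *_ = np.linalg.lstsq(C.T[:, idx], np.eye(n)[i], rcond=None)
        P[i, idx] = sol
    return P

def update_C(P, Sigma, keep):
    """X fixed: minimize ||P C - I||^2 column by column over the free entries of C."""
    n = P.shape[0]
    C = np.where(keep, Sigma, 0.0)          # preserved covariances (constraint)
    for j in range(n):
        free = np.where(~keep[:, j])[0]     # the unknown covariances Z
        if free.size == 0:
            continue
        rhs = np.eye(n)[:, j] - P @ C[:, j]
        z, *_ = np.linalg.lstsq(P[:, free], rhs, rcond=None)
        C[free, j] = z
    return C

# Example: simplify a dense 6x6 SPD matrix onto tridiagonal support.
rng = np.random.default_rng(1)
A = rng.normal(size=(6, 6))
Sigma = A @ A.T + 6 * np.eye(6)
k = np.arange(6)
S = np.abs(k[:, None] - k[None, :]) <= 1    # allowed support of P
keep = S.copy()                             # preserve Sigma there
C = np.where(keep, Sigma, 0.0)
for _ in range(100):                        # alternate the two half-steps
    P = update_P(C, S)
    C = update_C(P, Sigma, keep)
print(np.linalg.norm(P @ C - np.eye(6)))    # residual sqrt(E), non-increasing
```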

SLIDE 17

Some details...

A_Z = Σ_ij (a_Z)_ij (a_Z)ᵀ_ij        B_Z = − Σ_i (a_Z)_ii

A_X = Σ_ij (a_X)_ij (a_X)ᵀ_ij        B_X = − Σ_i (a_X)_ii

(here (a_Z)_ij presumably collects the linear coefficients of entry (i, j) of Σ̃⁻¹C with respect to X, Z being fixed, and (a_X)_ij its counterpart with respect to Z, up to constant factors)

SLIDE 18

Test 1 (simulation, 6x6 matrix)

[Figure: convergence monitoring, and the matrices Σ⁻¹ (input), Σ̃⁻¹ (simplified), and C]

SLIDE 19

Test 2 (simulation, 6x6 matrix)

[Figure: the simplified inverse covariance Σ̃⁻¹ and the covariance C]

SLIDE 20

How to simplify large matrices?

Block sweeping/averaging technique

[Figure: original method vs. block sweeping and averaging]
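As a rough illustration only: one plausible reading of block sweeping/averaging is to simplify overlapping diagonal blocks with the small-matrix routine and average the results where blocks overlap. The sketch below implements that reading; `simplify_block` is a hypothetical placeholder (it could wrap the alternate least-squares scheme sketched earlier), and this is not necessarily the authors' exact procedure.

```python
import numpy as np

# Hypothetical block sweeping/averaging pass (one plausible reading of the
# slide, not necessarily the authors' exact procedure): simplify overlapping
# diagonal blocks with a small-matrix routine, then average the simplified
# entries wherever blocks overlap. Entries covered by no block stay zero,
# i.e. the long-range terms are dropped.

def sweep_and_average(Sigma, simplify_block, block=32, step=16):
    n = Sigma.shape[0]
    acc = np.zeros_like(Sigma)        # accumulated simplified entries
    cnt = np.zeros_like(Sigma)        # how many blocks covered each entry
    for start in range(0, max(n - block, 0) + 1, step):
        sl = slice(start, min(start + block, n))
        acc[sl, sl] += simplify_block(Sigma[sl, sl])
        cnt[sl, sl] += 1.0
    return acc / np.maximum(cnt, 1.0)  # average over the overlaps
```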

SLIDE 21

Block size and 2D Markov Random Fields?

Example: simplification of an 8-neighbor MRF to obtain a 4-neighbor MRF

What is the minimum block size?

[Figure: input inverse covariance vs. simplified result (block sweeping)]

SLIDE 22

Conclusions

Accomplishments

  • New algorithm to simplify inverse covariance matrices using covariance and support constraints
  • Fast alternate optimization scheme

What’s next?

  • Extension to general 2D Markov Random Fields
  • Improve the block size determination technique
  • Application to image processing (e.g. data fusion) in remote sensing and astronomy