SLIDE 1 Covariance Matrix Simplification For Efficient Uncertainty Management
André Jalobeanu, Jorge A. Gutiérrez, PASEO Research Group, LSIIT (CNRS / Univ. Strasbourg), Illkirch, France
*part of the SpaceFusion project, ANR “Jeunes Chercheurs” 2005-2008
MaxEnt 2007
SLIDE 2
Outline
Uncertainty and error propagation
Why do we need to simplify the covariances?
Computing, storing, and using uncertainties
Inverse covariance simplification
Information theory-based methods
The proposed algorithm
Some results
Conclusions
SLIDE 3
Uncertainties and error propagation
Error propagation from the source to the end result
Computing uncertainties
Storing uncertainties
Using uncertainties
SLIDE 4 Error modeling: from source to result
[Diagram: input data and a model enter the processing algorithm, which transforms the input pdf into the output pdf]
๏ Input noise: stochastic process (observation = realization of a random variable)
- Several additive processes, zero-mean
- Stochastic independence between detectors (white noise)
- Stationary process
๏ Processing algorithm: deterministic transform in general
๏ Output noise: stochastic process (result = realization of a random variable)
- Additive & zero mean assumption, stochastic independence, stationarity
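To make the propagation step concrete, here is a minimal first-order (delta method) sketch in Python: when the processing algorithm is a differentiable map F, the output covariance is approximated by J Σ_in J^T, with J the Jacobian of F at the input. The transform F below is a made-up example for illustration, not the fusion algorithm of this talk.

```python
import numpy as np

def propagate_covariance(F, x, Sigma_in, eps=1e-6):
    """First-order error propagation through a deterministic transform F:
    Sigma_out ~= J Sigma_in J^T, with J the Jacobian of F at x
    (estimated here by finite differences)."""
    y = F(x)
    J = np.zeros((y.size, x.size))
    for j in range(x.size):
        dx = np.zeros_like(x)
        dx[j] = eps
        J[:, j] = (F(x + dx) - y) / eps
    return J @ Sigma_in @ J.T

# Toy example: a smooth nonlinear transform of a 3-vector
F = lambda x: np.array([x[0] + x[1], x[1] * x[2]])
x = np.array([1.0, 2.0, 0.5])
Sigma_in = 0.01 * np.eye(3)          # white, stationary input noise
print(propagate_covariance(F, x, Sigma_in))
```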
SLIDE 5 Example: 2D Super-Resolution in astronomy
[Figure: pointing pattern in model space, exposures 1-3]
Experimental setting: 4 noisy images, undersampled (factor 2), shifted (1/2 pixel)
Image fusion result (mean)
Inverse covariance of the result: diagonal terms and near-diagonal (covariance) terms
[ADA 2006]
SLIDE 6 Uncertain knowledge of model variables
๏ Diagonal terms
- Inverse variance, if all other terms are zero (1 per variable)
๏ Near-diagonal terms
- Nearest neighbors (left/right and up/down in 2D)
- Longer range (diagonal directions in 2D)
๏ Long-range terms (n per pixel)
- Should be zero (both more convenient and realistic)
Posterior pdf: $P(X \mid Y) \propto \exp(-U(X))$
Gaussian approximation of the posterior pdf: the inverse covariance matrix is the matrix of 2nd derivatives of $U$ at the optimum. Sparse, but not enough!
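Written out, this Gaussian (Laplace) approximation of the posterior is the standard one:

$$\hat{X} = \arg\min_X U(X), \qquad P(X \mid Y) \approx \mathcal{N}\big(\hat{X},\, \Sigma\big), \qquad \Sigma^{-1} = \nabla^2 U(X)\big|_{X=\hat{X}}$$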
SLIDE 7 Approximating uncertainties
๏ Inverse covariance approximation
Goal: provide a 1st-order Markovian model
- Drop the long-range interactions for simplicity
- Minimize a distance between Gaussian distributions, e.g. $\inf_{\tilde\Sigma}\, D_{KL}(P_{\tilde\Sigma} \,\|\, P_{\Sigma_X})$
- Preserve the variance and nearest neighbor covariance
[Figure: approximate vs. true covariance and inverse covariance matrices]
- Inverse covariance matrix: sparse, but not enough...
SLIDE 8 Storing uncertainties - 2D images, 1 band
Optimum: N×N pixels. Uncertainties: N×N × (1 + 2 [+ 2]) parameters
(type of interaction: self, horizontal, vertical [, two diagonals])
Limited redundancy: only 3 or 5 values per pixel (see the storage sketch below)
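A minimal sketch of this storage scheme, assuming the 3-parameter variant (self, horizontal, vertical): each interaction type is stored as one N×N plane, from which a sparse precision matrix can be assembled on demand. The plane values here are placeholders.

```python
import numpy as np
import scipy.sparse as sp

N = 64                                   # image is N x N pixels
d  = np.ones((N, N))                     # "self" terms (diagonal of the precision)
ch = -0.2 * np.ones((N, N))              # horizontal nearest-neighbor terms
cv = -0.2 * np.ones((N, N))              # vertical nearest-neighbor terms

n = N * N
idx = np.arange(n).reshape(N, N)
rows, cols, vals = [idx.ravel()], [idx.ravel()], [d.ravel()]
# horizontal couplings: pixel (i, j) with (i, j+1), stored symmetrically
rows += [idx[:, :-1].ravel(), idx[:, 1:].ravel()]
cols += [idx[:, 1:].ravel(),  idx[:, :-1].ravel()]
vals += [ch[:, :-1].ravel(),  ch[:, :-1].ravel()]
# vertical couplings: pixel (i, j) with (i+1, j), stored symmetrically
rows += [idx[:-1, :].ravel(), idx[1:, :].ravel()]
cols += [idx[1:, :].ravel(),  idx[:-1, :].ravel()]
vals += [cv[:-1, :].ravel(),  cv[:-1, :].ravel()]

# sparse N^2 x N^2 precision matrix, built from just 3 N x N planes
P = sp.coo_matrix((np.concatenate(vals),
                   (np.concatenate(rows), np.concatenate(cols))),
                  shape=(n, n)).tocsr()
print(P.nnz, "nonzeros for", n, "variables")
```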
SLIDE 9 Multispectral uncertainties
Add interactions between bands
Optimum: M bands of N×N pixels. Uncertainties: N×N × (M + 2M [+ 2M] + M-1) parameters
(per-band self and neighbor interactions, plus M-1 inter-band terms)
Limited redundancy: at most a factor of 4 or 6
SLIDE 10 Using uncertainties
Example: recursive data fusion
[Figure: results #1 and #2 combined by simple averaging vs. probabilistic fusion]
๏ Bayesian updating and uncertainty propagation
- Use a simplified posterior pdf (approx. inverse covariance matrix)
as a prior density for subsequent data processing
- Recursive (vs. batch) data fusion: allows for model updates
$\Phi^{(k+1)}(X) = X^t\, \tilde\Sigma^{-1}_{(k)}\, X$
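As a minimal illustration of one such update step (this is standard Gaussian fusion, not specific to this talk): precisions add, and the fused mean is the precision-weighted combination of the two estimates.

```python
import numpy as np

def fuse(m1, P1, m2, P2):
    """One Bayesian update step for two Gaussian estimates of the same X,
    given means m and (simplified) inverse covariances P: precisions add,
    and the fused mean is the precision-weighted combination."""
    P = P1 + P2
    m = np.linalg.solve(P, P1 @ m1 + P2 @ m2)
    return m, P

m1, P1 = np.array([1.0, 2.0]), np.diag([4.0, 1.0])
m2, P2 = np.array([1.5, 1.0]), np.diag([1.0, 3.0])
m, P = fuse(m1, P1, m2, P2)
print(m)                    # fused estimate
print(np.linalg.inv(P))     # its covariance
```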
SLIDE 11 Analyzing processed images when uncertainties are provided
๏ Bayesian approach: data analysis from processed images
- Goal: compute the posterior pdf of the parameters of interest
- Use the extended data term: approximate posterior pdf
- Update existing statistical methods (Bayesian or not)
to use this extended term - no other changes required!
- If possible, provide uncertainties on the analysis result as well
- Example: prediction X=F(parameters), F=star profile...
$D(X) = \frac{1}{2}\sum_p d_p\,(X-\hat X)_p^2 \;+\; \sum_p c^h_p\,(X-\hat X)_p\,(X-\hat X)_{r(p)} \;+\; \sum_p c^v_p\,(X-\hat X)_p\,(X-\hat X)_{u(p)}$
usual term + extra off-diagonal terms (horizontal and vertical inverse covariances); $r(p)$ and $u(p)$ denote the right and upper neighbors of pixel $p$
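A direct transcription of this extended data term, assuming the per-pixel planes d, ch, cv from the storage scheme above, and taking r(p), u(p) as the right and vertical neighbors:

```python
import numpy as np

def extended_data_term(X, Xhat, d, ch, cv):
    """D(X): the usual weighted quadratic term plus the horizontal and
    vertical off-diagonal (inverse covariance) couplings."""
    e = X - Xhat
    D = 0.5 * np.sum(d * e ** 2)                     # usual (diagonal) term
    D += np.sum(ch[:, :-1] * e[:, :-1] * e[:, 1:])   # p with right neighbor r(p)
    D += np.sum(cv[:-1, :] * e[:-1, :] * e[1:, :])   # p with vertical neighbor u(p)
    return D

# toy usage with random 8x8 images and constant coupling planes
rng = np.random.default_rng(0)
X, Xhat = rng.normal(size=(8, 8)), np.zeros((8, 8))
d, ch, cv = np.ones((8, 8)), -0.2 * np.ones((8, 8)), -0.2 * np.ones((8, 8))
print(extended_data_term(X, Xhat, d, ch, cv))
```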
SLIDE 12
Inverse covariance simplification
State of the art
The proposed algorithm
Some results
Approximations for large matrices
SLIDE 13
Problem statement for Markov Chains
[Figure: given model (the computed posterior pdf) vs. simplified model, each shown as an inverse covariance matrix ($\Sigma^{-1}$, $\tilde\Sigma^{-1}$) and the corresponding graphical model]
SLIDE 14 Information-theory based approaches
๏ Setting entries to zero directly: a bad idea (naively zeroing inverse covariance entries can destroy positive definiteness)
๏ Minimize a distance between distributions instead:
- Kullback-Leibler divergence (relative entropy)
- Symmetric Kullback-Leibler
- Bhattacharyya distance...
$D_{KL}(P_{\tilde\Sigma} \,\|\, P_{\Sigma}) \propto \log\frac{\det \Sigma}{\det \tilde\Sigma} + \mathrm{tr}\!\left(\Sigma^{-1} \tilde\Sigma\right) - n$
- Relative entropy maximization: maximum entropy subject to the support constraints $\tilde\Sigma^{-1}_{uv} = 0$ and to matching entries of $\Sigma$ and $\tilde\Sigma$
Nonlinear problem and constraints difficult to enforce. Any ideas?
$\tilde\Sigma$ must remain positive definite
SLIDE 15
Proposed approach
๏ Enforce the constraints: $C$ plays the role of the covariance, with
$\tilde\Sigma^{-1} C = I$, $\qquad \tilde\Sigma^{-1}_{uv} = 0$ if $(u,v) \in \Omega$, $\qquad C_{kl} = \Sigma_{kl}$ if $(k,l) \in \bar\Omega$
๏ Minimize the norm of the residual $E = \|\tilde\Sigma^{-1} C - I\|^2$ (so that $C$ approaches the inverse of $\tilde\Sigma^{-1}$)
to find the unknown inverse covariance entries $X$ and the unknown covariances $Z$ (even if the latter are not needed)
SLIDE 16 Alternate optimization scheme
๏ Z fixed, minimize E with respect to X
- Quadratic form - use a conjugate gradient descent
๏ X fixed, minimize E with respect to Z
- Quadratic form - use a conjugate gradient descent
$E = \|\tilde\Sigma^{-1} C - I\|^2$
This is not a quadratic form in $(X, Z)$, but it is quadratic in each variable separately:
$E_Z(X) = \tfrac{1}{2}\, X^t A_Z X + B_Z^t X + \text{const}$
$E_X(Z) = \tfrac{1}{2}\, Z^t A_X Z + B_X^t Z + \text{const}$
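A runnable toy version of this alternating scheme (a sketch, not the authors' implementation): numpy/scipy, a 6×6 test covariance like in the simulations that follow, a tridiagonal target support (1st-order Markov chain), and scipy's general-purpose CG minimizer standing in for the dedicated conjugate gradient descent.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
B = rng.normal(size=(6, 6))
Sigma = B @ B.T + 6 * np.eye(6)           # SPD test covariance
n = len(Sigma)

S_mask = np.abs(np.subtract.outer(np.arange(n), np.arange(n))) <= 1  # tridiagonal support
C_free = ~S_mask                          # Omega: long-range entries, left free in C

def build(x, z):
    """Assemble tilde-Sigma^{-1} from x (entries on the support) and C from z."""
    P = np.zeros((n, n)); P[S_mask] = x; P = (P + P.T) / 2   # symmetrize
    C = Sigma.copy();     C[C_free] = z; C = (C + C.T) / 2   # C = Sigma on Omega-bar
    return P, C

def E(x, z):
    P, C = build(x, z)
    return np.sum((P @ C - np.eye(n)) ** 2)   # residual E = ||P C - I||^2

x = np.eye(n)[S_mask]       # init X: identity precision on the allowed support
z = Sigma[C_free]           # init Z: true long-range covariances
for _ in range(30):         # alternate the two conjugate gradient minimizations
    x = minimize(lambda v: E(v, z), x, method="CG").x   # Z fixed: quadratic in X
    z = minimize(lambda v: E(x, v), z, method="CG").x   # X fixed: quadratic in Z

P, C = build(x, z)
print("residual:", E(x, z))
# check how well variances and nearest-neighbor covariances are preserved
print(np.max(np.abs((np.linalg.inv(P) - Sigma)[S_mask])))
```

Each half-step minimizes a smooth quadratic problem, so the residual E cannot increase from one iteration to the next.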
SLIDE 17 Some details...
$A_Z = \sum_{ij} (a_Z)_{ij} (a_Z)_{ij}^t, \qquad B_Z = -\sum_i (a_Z)_{ii}$
$A_X = \sum_{ij} (a_X)_{ij} (a_X)_{ij}^t, \qquad B_X = -\sum_i (a_X)_{ii}$
SLIDE 18
Test 1 (simulation, 6x6 matrix)
Convergence monitoring
[Figure: convergence plot; matrices $\Sigma^{-1}$ (input), $\tilde\Sigma^{-1}$ (simplified), and $C$]
SLIDE 19
Test 2 (simulation, 6x6 matrix)
[Figure: simplified $\tilde\Sigma^{-1}$ and $C$]
SLIDE 20
How to simplify large matrices?
Block sweeping/averaging technique
[Figure: original method vs. block sweeping and averaging]
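The slides do not spell the technique out in full; the sketch below shows one plausible reading, in which overlapping principal blocks of Σ are simplified independently (e.g. with the alternating scheme above, wrapped as the hypothetical `simplify_block`) and the local precision estimates are averaged where the blocks overlap.

```python
import numpy as np

def block_sweep(Sigma, K, simplify_block, stride=None):
    """Sweep overlapping K x K principal blocks of Sigma, simplify each block
    independently, and average the local inverse covariance estimates where
    the blocks overlap. `simplify_block` is assumed to return a K x K
    simplified inverse covariance."""
    n = len(Sigma)
    stride = stride or max(1, K // 2)
    P = np.zeros_like(Sigma)
    counts = np.zeros_like(Sigma)
    for b in range(0, n - K + 1, stride):
        s = slice(b, b + K)
        P[s, s] += simplify_block(Sigma[s, s])   # local simplified precision
        counts[s, s] += 1.0
    return P / np.maximum(counts, 1.0)           # average the overlaps
```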
SLIDE 21 Block size and 2D Markov Random Fields?
Example: simplification of an 8-neighbor MRF to obtain a 4-neighbor MRF
What is the minimum block size?
[Figure: input inverse covariance vs. simplified result (block sweeping)]
SLIDE 22
Conclusions
Accomplishments
New algorithm to simplify inverse covariance matrices using covariance and support constraints
Fast alternate optimization scheme
What’s next?
Extension to general 2D Markov Random Fields
Improve the block size determination technique
Application to image processing (e.g. data fusion) in remote sensing and astronomy