SLIDE 1

On two variants of incremental condition estimation

Jurjen Duintjer Tebbens

joint work with

Miroslav Tůma

Institute of Computer Science, Academy of Sciences of the Czech Republic

SNA11, Rožnov pod Radhoštěm, January 27, 2011.

SLIDE 2

J. Duintjer Tebbens, M. Tůma

1. Motivation: BIF

  • This work is motivated by the recently introduced Balanced Incomplete Factorization (BIF) method for Cholesky (or LDU) decomposition [Bru, Marín, Mas, Tůma - 2008].
  • The method is remarkable, among other things, in that it computes the inverse triangular factors simultaneously during the factorization process.
  • In BIF, the presence of the inverse can be used to obtain robust dropping criteria.
  • In this talk we are interested in complete factorization (BIF without dropping any entries).

SLIDE 3

1. Motivation: BIF

  • The main question is: how can the presence of the inverse factors be exploited in complete factorization?
  • Perhaps the first thing that comes to mind is to use the inverse triangular factors for improved condition estimation. Recall that κ(L) = σ_max(L)/σ_min(L) = ‖L‖ ‖L^{-1}‖.
  • In this talk we concentrate on estimation of the 2-norm condition number.
  • We will see that exploiting the inverse factors for better condition estimation is possible, but not as straightforward as it may seem.

SLIDE 4

2. Condition estimation

Traditionally, 2-norm condition estimators assume a triangular decomposition and compute estimates for the factors. E.g., if A is symmetric positive definite with Cholesky decomposition A = LL^T, then the condition number of A satisfies κ(A) = κ(L)² = κ(L^T)². κ(L^T) can be cheaply estimated with a technique called incremental condition estimation. Main idea: successive estimation of the leading submatrices of L^T, growing from dimension k to k+1, so that all columns are accessed only once.

SLIDE 5

2. Condition estimation

We will call the original incremental technique, introduced by Bischof [Bischof - 1990], simply incremental condition estimation (ICE): let R be upper triangular with a given approximate maximal singular value σ_maxICE(R) and corresponding approximate singular vector y, ‖y‖ = 1,

  σ_maxICE(R) = ‖y^T R‖ ≈ σ_max(R) = max_{‖x‖=1} ‖x^T R‖.

ICE approximates the maximal singular value of the extended matrix

  R′ = [ R  v ]
       [ 0  γ ]

by maximizing ‖[s y; c]^T R′‖ over all s, c satisfying c² + s² = 1.

SLIDE 6

2. Condition estimation

We have

  max_{c²+s²=1} ‖ [s y; c]^T R′ ‖²
    = max_{c²+s²=1} [s y; c]^T R′ (R′)^T [s y; c]
    = max_{c²+s²=1} [s, c] [ σ_maxICE(R)² + (y^T v)²   γ(v^T y) ] [ s ]
                           [ γ(v^T y)                  γ²       ] [ c ].

The maximum is attained at the normalized eigenvector corresponding to the maximal eigenvalue λ_max(B_ICE) of

  B_ICE ≡ [ σ_maxICE(R)² + (y^T v)²   γ(v^T y) ]
          [ γ(v^T y)                  γ²       ].

We denote the normalized eigenvector by [ŝ, ĉ]^T and put ŷ ≡ [ŝ y; ĉ].
SLIDE 7

2. Condition estimation

Then we define the approximate maximal singular value of the extended matrix as σ_maxICE(R′) ≡ ‖ŷ^T R′‖ ≈ σ_max(R′). Similarly, if for some y with unit norm

  σ_minICE(R) = ‖y^T R‖ ≈ σ_min(R) = min_{‖x‖=1} ‖x^T R‖,

then ICE uses the minimal eigenvalue λ_min(B_ICE) of

  B_ICE = [ σ_minICE(R)² + (y^T v)²   γ(v^T y) ]
          [ γ(v^T y)                  γ²       ].

The corresponding eigenvector of B_ICE yields the new vector ŷ, and σ_minICE(R′) ≡ ‖ŷ^T R′‖ ≈ σ_min(R′).

SLIDE 8

2. Condition estimation

Experiment:

  • We generate 50 random matrices B of dimension 100 with the Matlab command B = randn(100, 100).
  • We compute the Cholesky decompositions LL^T of the 50 symmetric positive definite matrices A = BB^T.
  • We compute the estimates σ_maxICE(L^T) and σ_minICE(L^T).
  • In the following graph we display the quality of the estimates through the ratio

      (σ_maxICE(L^T) / σ_minICE(L^T))² / κ(A),

    where κ(A) is the true condition number. Note that we always have (σ_maxICE(L^T) / σ_minICE(L^T))² ≤ κ(A).
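The ICE recurrence and the experiment above can be sketched in NumPy (a sketch, not the talk's code: the function name `ice_extreme` and all variable names are ours, and `numpy.random` replaces Matlab's `randn`). Since the estimate is ‖ŷ^T R‖ for a unit vector ŷ, it is one-sided: the maximal estimate never exceeds σ_max and the minimal estimate never falls below σ_min, so the quality ratio is at most 1.

```python
import numpy as np

def ice_extreme(R, which="max"):
    """Incremental condition estimation (ICE) sketch for upper-triangular R.

    Maintains sigma = ||y^T R_k|| and the unit vector y for each leading
    submatrix R_k, so every column of R is accessed exactly once.
    `which` selects the maximal or minimal singular value estimate.
    """
    n = R.shape[0]
    sigma = abs(R[0, 0])               # exact for the 1x1 leading submatrix
    y = np.array([1.0])
    pick = -1 if which == "max" else 0  # eigh orders eigenvalues ascending
    for k in range(1, n):
        v, gamma = R[:k, k], R[k, k]
        vty = y @ v
        # the 2x2 matrix B_ICE from the slides
        B = np.array([[sigma**2 + vty**2, gamma * vty],
                      [gamma * vty,       gamma**2]])
        w, V = np.linalg.eigh(B)
        s_hat, c_hat = V[:, pick]
        y = np.concatenate([s_hat * y, [c_hat]])  # y_hat = [s_hat*y; c_hat]
        sigma = np.sqrt(max(w[pick], 0.0))        # = ||y_hat^T R_{k+1}||
    return sigma

# One instance of the experiment (dimension 100, one matrix instead of 50):
rng = np.random.default_rng(0)
B = rng.standard_normal((100, 100))
A = B @ B.T
L = np.linalg.cholesky(A)
smax = ice_extreme(L.T, "max")
smin = ice_extreme(L.T, "min")
quality = (smax / smin) ** 2 / np.linalg.cond(A)   # always <= 1
```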

SLIDE 9

2. Condition estimation

[Figure] Quality of the estimator ICE for 50 random s.p.d. matrices of dimension 100.

SLIDE 10

2. Condition estimation

With the decomposition method from [Bru, Marín, Mas, Tůma - 2008] we have at our disposal not only the Cholesky decomposition A = LL^T, but also the inverse Cholesky factors, as is the case in balanced factorization, i.e. we have A^{-1} = L^{-T} L^{-1}. Then we can run ICE on L^{-T} and use the additional estimates

  1/σ_maxICE(L^{-T}) ≈ σ_min(L^T),   1/σ_minICE(L^{-T}) ≈ σ_max(L^T).

In the following graph we take the best of both estimates; we display

  ( max(σ_maxICE(L^T), σ_minICE(L^{-T})^{-1}) / min(σ_minICE(L^T), σ_maxICE(L^{-T})^{-1}) )² / κ(A).

SLIDE 11

2. Condition estimation

[Figure] Quality of ICE for 50 random s.p.d. matrices of dimension 100.

SLIDE 12

2. Condition estimation

[Figure] Quality of the estimator ICE without (black) and with (green) exploiting the inverse, for 50 random s.p.d. matrices of dimension 100.

SLIDE 13

2. Condition estimation

Plugging in the inverse gives no improvement! By comparing σ_maxICE(R′),

  R′ = [ R  v ]
       [ 0  γ ],

with 1/σ_minICE((R′)^{-1}),

  (R′)^{-1} = [ R^{-1}   -R^{-1}v/γ ]
              [ 0         1/γ       ],

it can be proven that

  σ_maxICE(A) = 1/σ_minICE(A^{-1}),   σ_minICE(A) = 1/σ_maxICE(A^{-1}),

which we denote as ICE(A) = ICE(A^{-1}).

SLIDE 14

2. Condition estimation

An alternative technique called incremental norm estimation (INE) was proposed in [Duff, Vömel - 2002]: let R be upper triangular with a given approximate maximal singular value σ_maxINE(R) and corresponding approximate right singular vector z, ‖z‖ = 1,

  σ_maxINE(R) = ‖Rz‖ ≈ σ_max(R) = max_{‖x‖=1} ‖Rx‖.

INE approximates the maximal singular value of the extended matrix

  R′ = [ R  v ]
       [ 0  γ ]

by maximizing ‖R′ [s z; c]‖ over all s, c satisfying c² + s² = 1.
SLIDE 15

2. Condition estimation

We have

  max_{c²+s²=1} ‖ R′ [s z; c] ‖²
    = max_{c²+s²=1} [s z; c]^T (R′)^T R′ [s z; c]
    = max_{c²+s²=1} [s, c] [ z^T R^T R z   z^T R^T v  ] [ s ]
                           [ z^T R^T v     v^T v + γ² ] [ c ].

The maximum is attained at the normalized eigenvector corresponding to the maximal eigenvalue λ_max(B_INE) of

  B_INE ≡ [ z^T R^T R z   z^T R^T v  ]
          [ z^T R^T v     v^T v + γ² ].

We denote the normalized eigenvector by [ŝ, ĉ]^T and put ẑ ≡ [ŝ z; ĉ].
SLIDE 16

2. Condition estimation

Then we define the approximate maximal singular value of the extended matrix as σ_maxINE(R′) ≡ ‖R′ ẑ‖ ≈ σ_max(R′). Similarly, if for a unit vector z

  σ_minINE(R) = ‖Rz‖ ≈ σ_min(R) = min_{‖x‖=1} ‖Rx‖,

then INE uses the minimal eigenvalue λ_min(B_INE) of

  B_INE = [ z^T R^T R z   z^T R^T v  ]
          [ z^T R^T v     v^T v + γ² ].

The corresponding eigenvector of B_INE yields the new vector ẑ, and σ_minINE(R′) ≡ ‖R′ ẑ‖ ≈ σ_min(R′).
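A matching NumPy sketch of the INE recurrence (again with our own naming, not the authors' code). Carrying the product u = R_k z along with z keeps the one-pass property: each column of R is still accessed only once, and the estimates are again one-sided.

```python
import numpy as np

def ine_extreme(R, which="max"):
    """Incremental norm estimation (INE) sketch for upper-triangular R.

    Maintains an approximate right singular vector z with sigma = ||R_k z||;
    the product u = R_k z is updated incrementally so each column of R is
    accessed only once.
    """
    n = R.shape[0]
    z = np.array([1.0])
    u = np.array([R[0, 0]])            # u = R_1 z
    sigma = abs(R[0, 0])
    pick = -1 if which == "max" else 0  # eigh orders eigenvalues ascending
    for k in range(1, n):
        v, gamma = R[:k, k], R[k, k]
        utv = u @ v                     # = z^T R^T v
        # the 2x2 matrix B_INE from the slides
        B = np.array([[sigma**2, utv],
                      [utv,      v @ v + gamma**2]])
        w, V = np.linalg.eigh(B)
        s_hat, c_hat = V[:, pick]
        z = np.concatenate([s_hat * z, [c_hat]])
        u = np.concatenate([s_hat * u + c_hat * v, [c_hat * gamma]])  # R' z_hat
        sigma = np.linalg.norm(u)       # = ||R' z_hat||
    return sigma

# Since sigma = ||R z_hat|| with ||z_hat|| = 1, the estimates are one-sided:
rng = np.random.default_rng(1)
R = np.triu(rng.standard_normal((60, 60))) + 5 * np.eye(60)
sv = np.linalg.svd(R, compute_uv=False)
smax_ine = ine_extreme(R, "max")   # <= sigma_max(R)
smin_ine = ine_extreme(R, "min")   # >= sigma_min(R)
```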

SLIDE 17

2. Condition estimation

By comparing σ_maxINE(R′),

  R′ = [ R  v ]
       [ 0  γ ],

with 1/σ_minINE((R′)^{-1}),

  (R′)^{-1} = [ R^{-1}   -R^{-1}v/γ ]
              [ 0         1/γ       ],

it can be checked that for INE we have in general INE(A) ≠ INE(A^{-1})! Let us demonstrate this in the same experiment as before.

SLIDE 18

2. Condition estimation

[Figure] Quality of the estimator INE(A).

SLIDE 19

2. Condition estimation

[Figure] Quality of the estimator INE(A) (black) and of INE(A^{-1}) (green).

SLIDE 20

2. Condition estimation

[Figure] Quality of the estimators INE(A) (black), INE(A^{-1}) (green), and the best of both INE(A) and INE(A^{-1}) (red).

SLIDE 21

2. Condition estimation

[Figure] Quality of the estimators INE(A) (black), INE(A^{-1}) (green), the best of both INE(A) and INE(A^{-1}) (red), and ICE(A) (blue).

SLIDE 22

2. Condition estimation

Why this improvement?

  • In general, both ICE and INE give a satisfactory approximation of σ_max(R), though INE tends to be better.
  • The problem is to approximate σ_min(R), for ICE as well as for INE.
  • The trick is to translate this into the problem of finding the maximal singular value σ_max(R^{-1}) of R^{-1}, which INE in general does better than ICE.
  • This has an important impact on the estimate, because σ_min(R) is typically small and appears in the denominator of σ_max(R)/σ_min(R).

We see that the main reason for the improvement is that INE tends to give a better estimate of maximal singular values. And why is that?

SLIDE 23

2. Condition estimation

Note: INE does not always give a better estimate of the maximal singular value. But we have the following rather technical result.

Theorem [DT, Tůma - ?]. Consider condition estimation of the matrix

  R′ = [ R  v ]
       [ 0  γ ],

where both ICE and INE start with the same approximation of σ_max(R), denoted by δ. Let y, ‖y‖ = 1, be the approximate singular vector for ICE, δ = ‖y^T R‖ ≈ σ_max(R), and let z, ‖z‖ = 1, be the approximate singular vector for INE, δ = ‖Rz‖ ≈ σ_max(R).

SLIDE 24

2. Condition estimation

Theorem (continued). Then we have superiority of INE, σ_maxINE(R′) ≥ σ_maxICE(R′), if

  (v^T Rz)² ≥ δ²(v^T y)² + (1/2) (v^T v − (v^T y)²) (α − √(α² + 4δ²(v^T y)²)),

where α = δ² − γ² − (v^T y)². Let us use the notation

  ρ ≡ δ²(v^T y)² + (1/2) (v^T v − (v^T y)²) (α − √(α² + 4δ²(v^T y)²)).

If ρ ≤ 0, then INE is unconditionally superior to ICE (i.e. regardless of the approximate singular vector z).
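The criterion can be checked numerically for a single extension step. The sketch below is our own construction: we start both estimators from the exact σ_max(R) with the exact extreme singular vectors, so that δ = ‖y^T R‖ = ‖Rz‖ = σ_max(R) as the theorem requires, build both 2×2 eigenproblems, and evaluate ρ.

```python
import numpy as np

rng = np.random.default_rng(2)
k = 8
R = np.triu(rng.standard_normal((k, k))) + 3 * np.eye(k)
v = rng.standard_normal(k)
gamma = 0.7

# Common starting data: exact sigma_max(R) with exact left/right vectors,
# so delta = ||y^T R|| = ||R z||.
U, S, Vt = np.linalg.svd(R)
delta, y, z = S[0], U[:, 0], Vt[0, :]

# One ICE step: eigenproblem for B_ICE.
vty = v @ y
B_ice = np.array([[delta**2 + vty**2, gamma * vty],
                  [gamma * vty,       gamma**2]])
sig_ice = np.sqrt(np.linalg.eigvalsh(B_ice)[-1])

# One INE step: eigenproblem for B_INE.
rtv = (R @ z) @ v                     # = v^T R z
B_ine = np.array([[delta**2, rtv],
                  [rtv,      v @ v + gamma**2]])
sig_ine = np.sqrt(np.linalg.eigvalsh(B_ine)[-1])

# The quantity rho from the theorem.
alpha = delta**2 - gamma**2 - vty**2
rho = delta**2 * vty**2 + 0.5 * (v @ v - vty**2) * (
    alpha - np.sqrt(alpha**2 + 4 * delta**2 * vty**2))

# True maximal singular value of the extended matrix, for comparison.
Rp = np.zeros((k + 1, k + 1))
Rp[:k, :k], Rp[:k, k], Rp[k, k] = R, v, gamma
true_max = np.linalg.norm(Rp, 2)
```

Both estimates stay below the true σ_max(R′), and whenever (v^T Rz)² ≥ ρ the INE estimate is at least as large as the ICE estimate, as the theorem states.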

SLIDE 25

2. Condition estimation

To conclude, we demonstrate the previous theorem.

  • Assume that at some stage of an incremental condition estimation process we have σ_maxICE(R) = σ_maxINE(R) = 1.
  • Consider possible vectors v in the new column of R′ that have unit norm, i.e. v^T v = 1.
  • Then (v^T y)² ≤ 1. The x-axes of the following figures represent the possible values of (v^T y)² ≤ 1.
  • The y-axes represent values of γ², i.e. the square of the new diagonal entry.
  • The superiority criterion for INE, expressed by the value of ρ, is given by the z-axes.

SLIDE 26

2. Condition estimation

[Figure] Value of ρ as a function of (v^T y)² (x-axis) and γ² (y-axis), with v^T v = 1.

SLIDE 27

2. Condition estimation

[Figure] Value of ρ as a function of (v^T y)² (x-axis) and γ² (y-axis), with v^T v = 10.

SLIDE 28

2. Condition estimation

  • Exploiting the presence of inverse factors combined with INE gives a significant improvement of incremental condition estimation.
  • This may be an important advantage of methods where inverse triangular factors are just a by-product of the factorization.
  • We did not consider sparse Cholesky factors, which ask for a modified ICE [Bischof, Pierce, Lewis - 1990].
  • Matlab does not offer a 2-norm condition number estimator, only a 1-norm condition number estimator (previous versions offered the 2-norm condition number estimator ICE).

SLIDE 29

Thank you for your attention!

Supported by project number IAA100300802 of the Grant Agency of the Academy of Sciences of the Czech Republic.