Backward Perturbation Analysis for Scaled Total Least Squares Problems

David Titley-Peloquin
Joint work with Xiao-Wen Chang and Chris Paige

McGill University, School of Computer Science
Research supported by NSERC

Computational Methods with Applications, Harrachov, 2007

CMA Harrachov 2007 — 1 of 27

Outline

- The scaled total least squares problem
- Backward perturbation analysis
- A pseudo-minimal backward error µ
- A lower bound for µ
- An asymptotic estimate for µ
- Numerical experiments and conclusion


Notation

- Matrices: A, ∆A, E, ...
- Vectors: b, ∆b, f, ...
- Scalars: γ, β, ξ, ...
- Vector norms: ‖v‖₂² ≡ vᵀv
- Matrix norms: ‖A‖₂ ≡ σmax(A), ‖A‖_F² ≡ trace(AᵀA)
- σmin(A): smallest singular value of A
- λmin(A): smallest eigenvalue of (real symmetric) A
- A†: Moore–Penrose generalized inverse of A (for a non-zero vector v, v† = vᵀ/‖v‖₂²)


The scaled total least squares problem

Given A ∈ Rᵐˣⁿ and b ∈ Rᵐ, the least squares problem is

    min_{f,x} { ‖f‖₂² : Ax = b + f }

The scaled total least squares (STLS) problem is

    min_{E,f,x} { ‖[E, γf]‖_F² : (A + E)x = b + f }

The STLS problem reduces to
- the least squares (LS) problem as γ → 0
- the total least squares (TLS) problem if γ = 1
- the data least squares (DLS) problem as γ → ∞


STLS optimality conditions

The STLS problem is equivalent to

    min_x ‖b − Ax‖₂² / (γ⁻² + ‖x‖₂²)

Lemma (Paige and Strakoš, 2002). Under mild conditions on A and b, a unique STLS solution exists.

Assuming these conditions hold, x̂ is optimal if and only if

    Aᵀ(b − Ax̂) = − (‖b − Ax̂‖₂² / (γ⁻² + ‖x̂‖₂²)) x̂   and   ‖b − Ax̂‖₂² / (γ⁻² + ‖x̂‖₂²) < σmin²(A)
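For γ = 1 these conditions can be checked numerically at the SVD-based TLS solution; a small sketch (sizes and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 30, 6
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
gamma = 1.0   # STLS with gamma = 1 is the classical TLS problem

# TLS solution from the smallest right singular vector of [A, b]
v = np.linalg.svd(np.column_stack([A, b]))[2][-1]
x_hat = -v[:n] / v[n]

r = b - A @ x_hat
ratio = (r @ r) / (gamma**-2 + x_hat @ x_hat)

# first-order optimality: A^T (b - A x_hat) + ratio * x_hat = 0
h = A.T @ r + ratio * x_hat
print(np.linalg.norm(h))                                   # ~ 0

# second condition: ratio < sigma_min(A)^2 (strict inequality holds generically)
print(ratio < np.linalg.svd(A, compute_uv=False)[-1]**2)   # True
```

The second condition follows from singular value interlacing: σmin([A, b]) ≤ σmin(A), with strict inequality for generic data.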



Backward perturbation analysis

Recent research:
- Consistent linear systems: Oettli & Prager (64), Rigal & Gaches (67), D. Higham & N. Higham (92), Varah (94), J.G. Sun & Z. Sun (97), Sun (99), etc.
- LS problems: Stewart (77), Waldén, Karlson & Sun (95), Sun (96, 97), Gu (98), Grcar, Saunders & Su (04), Golub & Su (05)
- DLS problems: Chang, Golub & Paige (06)


Backward perturbation analysis for STLS

Given an approximate STLS solution 0 ≠ y ∈ Rⁿ, we seek minimal perturbations ∆A and ∆b such that y is the exact STLS solution of the perturbed problem:

    y = arg min_x ‖(b + ∆b) − (A + ∆A)x‖₂² / (γ⁻² + ‖x‖₂²)

Applications:
- if ∆A and ∆b are small, we can say y is a backward stable (i.e. numerically acceptable) solution
- this can be used to design stopping criteria for iterative methods for large sparse problems


The minimal backward error problem

The minimal backward error problem:

    min_{[∆A,∆b] ∈ G} ‖[∆A, ∆b]‖_F

where

    G ≡ { [∆A, ∆b] : y = arg min_x ‖(b + ∆b) − (A + ∆A)x‖₂² / (γ⁻² + ‖x‖₂²) }
The set G

Recall the STLS optimality conditions:

    h(A, b, x̂) ≡ Aᵀ(b − Ax̂) + (‖b − Ax̂‖₂² / (γ⁻² + ‖x̂‖₂²)) x̂ = 0,
    ‖b − Ax̂‖₂² / (γ⁻² + ‖x̂‖₂²) < σmin²(A)

Therefore

    G ≡ { [∆A, ∆b] : y = arg min_x ‖(b + ∆b) − (A + ∆A)x‖₂² / (γ⁻² + ‖x‖₂²) }
      = { [∆A, ∆b] : h(A + ∆A, b + ∆b, y) = 0,
          ‖(b + ∆b) − (A + ∆A)y‖₂² / (γ⁻² + ‖y‖₂²) < σmin²(A + ∆A) }

The superset G+

The inequality makes the problem difficult...

    G = { [∆A, ∆b] : h(A + ∆A, b + ∆b, y) = 0,
          ‖(b + ∆b) − (A + ∆A)y‖₂² / (γ⁻² + ‖y‖₂²) < σmin²(A + ∆A) }

We ignore it, and define

    G+ ≡ { [∆A, ∆b] : h(A + ∆A, b + ∆b, y) = 0 }

We will consider the following pseudo-minimal backward error µ:

    µ ≡ min_{[∆A,∆b] ∈ G+} ‖[∆A, ∆b]‖_F


Can we really ignore the inequality?

We are ignoring the inequality in the set G and solving

    µ ≡ min_{[∆A,∆b] ∈ G+} ‖[∆A, ∆b]‖_F

This gives the minimal backward distance such that y is a stationary point (but not necessarily the global minimum).

Theorem. Let x̂ be the true STLS solution. There exists an ε > 0 such that when ‖y − x̂‖₂ < ε, µ(y) is indeed the true minimal backward error.

A pseudo-minimal backward error

We can find an explicit representation of G+ and minimize over this set.

Theorem. Define r ≡ b − Ay and

    M ≡ A(I − yy†)Aᵀ − rrᵀ / (1 + ‖y‖₂²) + (Ay + γ²‖y‖₂² b)(Ay + γ²‖y‖₂² b)ᵀ / (‖y‖₂² + γ⁴‖y‖₂⁴)

Then

    µ² = ‖r‖₂² / (1 + ‖y‖₂²) + min{λmin(M), 0}
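The formula can be exercised numerically. A numpy sketch, assuming γ = 1 so the exact STLS solution x̂ is the classical SVD-based TLS solution (sizes, seed, and perturbation size are arbitrary choices of the sketch):

```python
import numpy as np

def mu_squared(A, b, y, gamma=1.0):
    """mu^2 from the theorem: ||r||^2/(1+||y||^2) + min{lambda_min(M), 0}."""
    n = A.shape[1]
    r = b - A @ y
    ny2 = y @ y
    P = np.eye(n) - np.outer(y, y) / ny2                 # I - y y^dagger
    w = A @ y + gamma**2 * ny2 * b
    M = (A @ P @ A.T
         - np.outer(r, r) / (1.0 + ny2)
         + np.outer(w, w) / (ny2 + gamma**4 * ny2**2))
    return (r @ r) / (1.0 + ny2) + min(np.linalg.eigvalsh(M)[0], 0.0)

rng = np.random.default_rng(2)
m, n = 25, 5
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

# exact TLS solution (gamma = 1) from the SVD of [A, b]
v = np.linalg.svd(np.column_stack([A, b]))[2][-1]
x_hat = -v[:n] / v[n]

print(mu_squared(A, b, x_hat))                       # ~ 0 at the exact solution
print(mu_squared(A, b, x_hat + 1e-3 * np.ones(n)))   # small but positive
```

At y = x̂ the two terms cancel (numerically, λmin(M) = −‖r‖₂²/(1 + ‖x̂‖₂²)), consistent with a zero backward error at the exact solution; perturbing y makes µ² positive.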


Limits γ → 0 and γ → ∞

When γ → 0,

    M → AAᵀ − rrᵀ / (1 + ‖y‖₂²)

This is consistent with the result of Waldén, Karlson & Sun (95).

When γ → ∞,

    M → A(I − yy†)Aᵀ − rrᵀ / (1 + ‖y‖₂²) + bbᵀ

This is consistent with the result of Chang, Golub & Paige (06).
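Both limits can be checked numerically against the definition of M above; a sketch with arbitrary random data:

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 15, 4
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
y = rng.standard_normal(n)
r = b - A @ y
ny2 = y @ y
P = np.eye(n) - np.outer(y, y) / ny2   # I - y y^dagger

def M(gamma):
    w = A @ y + gamma**2 * ny2 * b
    return (A @ P @ A.T - np.outer(r, r) / (1 + ny2)
            + np.outer(w, w) / (ny2 + gamma**4 * ny2**2))

M_ls  = A @ A.T - np.outer(r, r) / (1 + ny2)                        # gamma -> 0 (WKS)
M_dls = A @ P @ A.T - np.outer(r, r) / (1 + ny2) + np.outer(b, b)   # gamma -> inf

print(np.linalg.norm(M(1e-8) - M_ls))    # ~ 0
print(np.linalg.norm(M(1e8) - M_dls))    # ~ 0
```

In the γ → 0 limit the rank-one term (Ay)(Ay)ᵀ/‖y‖₂² = A yy† Aᵀ fills in the projector, recovering AAᵀ.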



A lower bound for µ

For any [∆A, ∆b] ∈ G+, h(A + ∆A, b + ∆b, y) = 0. Rearranging terms and taking the 2-norm:

    ‖[∆A, ∆b]‖₂² + β₁ · ‖[∆A, ∆b]‖₂ − β₀ ≥ 0

where β₀, β₁ ≥ 0 are independent of ∆A and ∆b. Solving this quadratic inequality for ‖[∆A, ∆b]‖₂ and using ‖[∆A, ∆b]‖_F ≥ ‖[∆A, ∆b]‖₂ gives:

Theorem.

    µ ≥ 2β₀ / (β₁ + √(β₁² + 4β₀)) ≡ µlb

This lower bound can be estimated in O(mn) flops.
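In more detail, treating the inequality as a quadratic in t ≡ ‖[∆A, ∆b]‖₂ ≥ 0 and rationalizing the positive root:

```latex
t^{2} + \beta_{1} t - \beta_{0} \ge 0,\ t \ge 0
\quad\Longrightarrow\quad
t \;\ge\; \frac{-\beta_{1} + \sqrt{\beta_{1}^{2} + 4\beta_{0}}}{2}
\;=\; \frac{2\beta_{0}}{\beta_{1} + \sqrt{\beta_{1}^{2} + 4\beta_{0}}} \;\equiv\; \mu_{\mathrm{lb}}
```

Since µ is a Frobenius norm and ‖·‖_F ≥ ‖·‖₂, the same bound holds for µ.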



An asymptotic estimate for µ

Any [∆A, ∆b] ∈ G+ satisfies the normal equations: h(A + ∆A, b + ∆b, y) = 0. Expanding h(A + ∆A, b + ∆b, y) in a Taylor series and truncating higher-order terms, we obtain an asymptotic estimate µ̃.

Theorem.

    lim_{y → x̂} µ̃/µ = 1


Computing the asymptotic estimate

Theorem. Define r ≡ b − Ay,

    ξ₀ ≡ √(1 + ‖y‖₂²),   ξ₁ ≡ √(γ⁻² + ‖y‖₂²),   ξ₂ ≡ (‖y‖₂² − γ⁻² + 2) / (ξ₀ξ₁²),

and

    B ≡ [ ξ₀A + ξ₂ryᵀ,  ξ₀⁻¹‖r‖₂ I ;  ξ₀⁻¹‖r‖₂‖y‖₂ (I − yy†),  0 ],
    c ≡ [ ξ₀⁻¹ r ;  −ξ₁‖r‖₂ y ]

Then µ̃ = ‖BB†c‖₂.



Sample numerical test

Tests:
- When is µ actually the minimal backward error? (How close does y have to be to the true STLS solution for
      ‖(b + ∆b) − (A + ∆A)y‖₂² / (γ⁻² + ‖y‖₂²) < σmin²(A + ∆A)
  to hold?)
- Are µlb and µ̃ good estimates of µ?


Sample numerical test

Data:
- γ = 1
- 100 × 40 "random" A with ‖A‖_F = 1 and κ₂(A) = 10⁵
- E and f "random" with ‖E‖_F, ‖f‖₂ ≤ 10⁻²
- b = (A + E)x + f where x = [1, 1, . . . , 1]ᵀ

Procedure:
- Obtain the STLS solution x̂
- Perturb x̂ "randomly" to obtain y, with ‖y − x̂‖₂ / ‖x̂‖₂ ≤ δx
- Run 1000 tests with δx = 10⁻² and δx = 10⁻¹
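One way to generate such data (the slides do not specify how A was built; the construction via prescribed singular values below is an assumption of this sketch):

```python
import numpy as np

rng = np.random.default_rng(4)
m, n = 100, 40

# Random orthogonal factors with geometrically spaced singular values,
# so that sigma_max / sigma_min = 1e5 by construction
U = np.linalg.qr(rng.standard_normal((m, n)))[0]
V = np.linalg.qr(rng.standard_normal((n, n)))[0]
s = np.logspace(0, -5, n)
A = U @ np.diag(s) @ V.T
A /= np.linalg.norm(A)          # scale so ||A||_F = 1 (kappa_2 unchanged)

E = rng.standard_normal((m, n)); E *= 1e-2 / np.linalg.norm(E)
f = rng.standard_normal(m);      f *= 1e-2 / np.linalg.norm(f)
b = (A + E) @ np.ones(n) + f    # x = [1, ..., 1]^T

sv = np.linalg.svd(A, compute_uv=False)
print(np.linalg.norm(A))        # ~ 1.0
print(sv[0] / sv[-1])           # ~ 1e5
```

Scaling A by a constant leaves its singular value ratio, and hence κ₂(A), unchanged.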

Sample test result: δx = 10⁻²

[Scatter plot of µlb vs µ (blue dots) and µ̃ vs µ (red stars); both axes range over 0.5×10⁻³ to 3×10⁻³.]

Sample test result: δx = 10⁻¹

[Scatter plot of µlb vs µ (blue dots) and µ̃ vs µ (red stars); both axes range over 0.005 to 0.035.]


Summary and future work

Given an approximate STLS solution 0 ≠ y ∈ Rⁿ, we have found:
- a pseudo-minimal backward error µ (if y is close enough to x̂, this is the minimal backward error)
- a lower bound for µ (a good, cheap estimate)
- an asymptotic estimate for µ (an excellent estimate)

Future work:
- find constant bounds for µ̃/µ
- find a cheap way to compute the asymptotic estimate
- use these results to design stopping criteria for iterative methods

Thank you for your attention!

CMA Harrachov 2007 — 27 of 27