SLIDE 1

SINGULAR VALUE DECOMPOSITION

Lesson 20

SLIDES 2–3

  • Given a matrix A ∈ C^{m×n}, m ≥ n, the singular value decomposition (SVD) is the factorization

    A = UΣV* = [u₁ | ⋯ | uₘ] diag_{m×n}(σ₁, …, σₙ) [v₁ | ⋯ | vₙ]*,

    where σₖ ≥ 0 and U, V are unitary
  • The SVD is fundamental in applied mathematics
  • Some applications (2 of thousands): image compression, principal component analysis in statistics
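To make the factorization concrete, here is a minimal NumPy sketch (the random test matrix is illustrative, not from the slides):

```python
import numpy as np

# Random complex test matrix with m >= n (illustrative only)
m, n = 5, 3
rng = np.random.default_rng(0)
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))

# Full SVD: U is m x m, s holds sigma_1 >= ... >= sigma_n >= 0, Vh = V*
U, s, Vh = np.linalg.svd(A, full_matrices=True)

# Rebuild the m x n diagonal factor Sigma and confirm A = U Sigma V*
Sigma = np.zeros((m, n))
Sigma[:n, :n] = np.diag(s)
assert np.allclose(U @ Sigma @ Vh, A)
assert np.allclose(U.conj().T @ U, np.eye(m))    # U unitary
assert np.allclose(Vh @ Vh.conj().T, np.eye(n))  # V unitary
```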


SLIDE 4

CONNECTION WITH EIGENVALUE DECOMPOSITION OF SYMMETRIC MATRIX

SLIDES 5–9

  • Suppose A ∈ R^{n×n} is symmetric, A = Aᵀ, with eigenvectors/eigenvalues Avₖ = λₖvₖ
  • Recall that the eigenvectors {v₁, …, vₙ} are orthogonal (assume the λₖ are distinct):

    vₖᵀvⱼ = (1/λₖ)(Avₖ)ᵀvⱼ = (1/λₖ) vₖᵀAvⱼ = (λⱼ/λₖ) vₖᵀvⱼ,

    which forces vₖᵀvⱼ = 0 whenever λⱼ ≠ λₖ
  • We can normalize qₖ := vₖ/‖vₖ‖ so that Q = [q₁ | ⋯ | qₙ] is orthogonal and satisfies AQ = QΛ, i.e., A = QΛQᵀ
  • We aren't quite done: we have to make sure the diagonal factor is positive:

    A = Q diag(|λ₁|, …, |λₙ|) diag(sign λ₁, …, sign λₙ) Qᵀ = QΣVᵀ with V := Q diag(sign λ₁, …, sign λₙ),

    i.e., the singular values are the absolute values of the eigenvalues
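This correspondence is easy to check numerically; a small NumPy sketch on an arbitrary symmetric matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((4, 4))
A = B + B.T                          # symmetric, typically with eigenvalues of both signs

lam = np.linalg.eigvalsh(A)          # real eigenvalues, in ascending order
s = np.linalg.svd(A, compute_uv=False)

# Singular values are |eigenvalues|, sorted in decreasing order
assert np.allclose(s, np.sort(np.abs(lam))[::-1])
```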

SLIDE 10

EXISTENCE AND UNIQUENESS

SLIDES 11–14

Theorem: Every matrix A ∈ C^{m×n} has an SVD.

  • Set σ₁ = ‖A‖₂. By compactness, we know there exist u₁, v₁ satisfying ‖u₁‖ = ‖v₁‖ = 1 with Av₁ = σ₁u₁.
  • Assume without loss of generality that e₁*v₁ ≠ 0, and use the QR decomposition to factor

    [v₁ | e₂ | ⋯ | eₙ] = V₁R₁ = [v₁ | ⋯ | vₙ] R₁,

    so that V₁ is unitary and its columns span Cⁿ, and similarly define U₁
  • Then we have

    U₁*AV₁ = U₁* [σ₁u₁ | Av₂ | ⋯ | Avₙ] = [σ₁ w*; 0 B],

    where B ∈ C^{(m−1)×(n−1)}
  • We have

    ‖[σ₁ w*; 0 B] [σ₁; w]‖ ≥ σ₁² + w*w = √(σ₁² + w*w) ‖[σ₁; w]‖,

    implying that w = 0, as otherwise ‖A‖₂ ≥ √(σ₁² + w*w) > σ₁

SLIDES 15–16

  • Thus we have

    U₁*AV₁ = [σ₁ 0; 0 B]

  • Assume (induction) that we have B = U₂Σ₂V₂*
  • Thus

    A = U₁ [1 0; 0 U₂] [σ₁ 0; 0 Σ₂] [1 0; 0 V₂]* V₁* = UΣV*

  • Induction and the fact that the 1 × 1 case is trivial complete the construction
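The proof is constructive, and one can carry it out directly. Below is a minimal real-valued sketch (our illustration, not the slides' method): the top singular pair is found by power iteration on AᵀA, which assumes a gap σ₁ > σ₂, and the basis extension uses QR.

```python
import numpy as np

def extend_to_orthogonal(x):
    """Orthogonal matrix whose first column is the unit vector x (QR-based)."""
    n = x.size
    M = np.column_stack([x, np.random.default_rng(0).standard_normal((n, n - 1))])
    Q, R = np.linalg.qr(M)
    if R[0, 0] < 0:                 # QR may flip the sign of the first column
        Q[:, 0] *= -1
    return Q

def svd_by_deflation(A, iters=1000):
    """Constructive SVD following the proof: peel off (sigma_1, u_1, v_1),
    then recurse on the (m-1) x (n-1) block B. Assumes sigma_1 > sigma_2 > 0."""
    m, n = A.shape
    if m == 0 or n == 0:            # empty base case; the 1 x 1 case is also trivial
        return np.eye(m), np.zeros((m, n)), np.eye(n)
    v = np.ones(n) / np.sqrt(n)
    for _ in range(iters):          # power iteration on A^T A for v_1
        w = A.T @ (A @ v)
        v = w / np.linalg.norm(w)
    s1 = np.linalg.norm(A @ v)      # = ||A||_2 at the maximizer
    u = (A @ v) / s1
    U1, V1 = extend_to_orthogonal(u), extend_to_orthogonal(v)
    B = (U1.T @ A @ V1)[1:, 1:]     # U1^T A V1 = [s1 0; 0 B], w = 0 as proved
    U2, S2, V2 = svd_by_deflation(B, iters)
    U, V = np.eye(m), np.eye(n)
    U[1:, 1:], V[1:, 1:] = U2, V2
    S = np.zeros((m, n))
    S[0, 0] = s1
    S[1:, 1:] = S2
    return U1 @ U, S, V1 @ V

A = np.random.default_rng(2).standard_normal((5, 3))
U, S, V = svd_by_deflation(A)
assert np.allclose(U @ S @ V.T, A, atol=1e-6)
```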

SLIDE 17

Theorem: The σⱼ are uniquely determined, and if they are distinct, then U and V are uniquely determined up to sign.

SLIDE 18

SOME PROPERTIES

SLIDES 19–22

  • Property: ‖A‖₂ = σ₁:

    max_{‖u‖=1} ‖Au‖ = max_{‖u‖=1} ‖UΣV*u‖ = max_{‖v‖=1} ‖Σv‖ = maxⱼ σⱼ = σ₁

  • Property: The rank of A is r, the number of non-zero singular values: rank A = rank Σ = r
  • Property: range(A) = span{u₁, …, uᵣ} and null(A) = span{v_{r+1}, …, vₙ}
  • Property: If A is square, |det A| = ∏_{k=1}^n σₖ, since

    det A = det U · det Σ · det V* = ± det Σ = ± ∏_{k=1}^n σₖ
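Each of these properties can be confirmed numerically; a quick NumPy sketch on a random square matrix:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4))
s = np.linalg.svd(A, compute_uv=False)

assert np.isclose(np.linalg.norm(A, 2), s[0])         # ||A||_2 = sigma_1
assert np.linalg.matrix_rank(A) == np.sum(s > 1e-12)  # rank = # non-zero sigma_k
assert np.isclose(abs(np.linalg.det(A)), np.prod(s))  # |det A| = prod of sigma_k
```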

SLIDES 23–24

Theorem: The non-zero singular values of A are the square roots of the non-zero eigenvalues of A*A or AA*:

    A*A = (UΣV*)*UΣV* = VΣ*U*UΣV* = VΣ*ΣV*
    AA* = UΣV*(UΣV*)* = UΣV*VΣ*U* = UΣΣ*U*
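A quick numerical check (sketch):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 3))
s = np.linalg.svd(A, compute_uv=False)

lam = np.linalg.eigvalsh(A.T @ A)       # eigenvalues of A*A, ascending
lam = np.clip(lam[::-1], 0.0, None)     # guard against tiny negative round-off
assert np.allclose(np.sqrt(lam), s)     # sigma_k = sqrt(lambda_k(A*A))
```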

SLIDE 25

SVD AS BEST LOW-RANK APPROXIMATION

SLIDES 26–27

  • We define the outer product of two vectors a = (a₁, …, aₘ)ᵀ and b = (b₁, …, bₙ)ᵀ using the (not dot product!!) notation

    ab* := [a₁b̄₁ ⋯ a₁b̄ₙ; ⋮ ⋱ ⋮; aₘb̄₁ ⋯ aₘb̄ₙ]

  • The outer product is a rank-1 matrix, as verified by the trivial SVD

    ab* = U diag_{m×n}(‖a‖‖b‖, 0, …, 0) V*,

    where U = [a/‖a‖ | ⋯] and V = [b/‖b‖ | ⋯]
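In NumPy the outer product is `np.outer` (conjugating b explicitly in the complex case); a quick check of the rank-1 claim on illustrative values:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0])

M = np.outer(a, b.conj())               # ab* as an m x n matrix
assert np.linalg.matrix_rank(M) == 1

s = np.linalg.svd(M, compute_uv=False)
# The single non-zero singular value is ||a|| * ||b||
assert np.isclose(s[0], np.linalg.norm(a) * np.linalg.norm(b))
assert np.allclose(s[1:], 0)
```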

SLIDES 28–30

  • Note that if A ∈ C^{μ×m} then we have

    A(ab*) = A[b̄₁a | ⋯ | b̄ₙa] = [b̄₁Aa | ⋯ | b̄ₙAa] = (Aa)b*

  • Similarly, if B ∈ C^{n×ℓ} then (ab*)B = a(B*b)*
  • The SVD can thus be viewed as a sum of rank-1 matrices:

    A = UΣV* = ∑_{k=1}^n σₖ U eₖeₖ* V* = ∑_{k=1}^n σₖ uₖvₖ*
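A direct check that the rank-1 sum reproduces A (NumPy sketch):

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((5, 3))
U, s, Vh = np.linalg.svd(A, full_matrices=False)

# Rebuild A as a sum of rank-1 terms sigma_k u_k v_k*; the rows of Vh are v_k*
A_sum = sum(s[k] * np.outer(U[:, k], Vh[k, :]) for k in range(len(s)))
assert np.allclose(A_sum, A)
```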

SLIDE 31

Theorem: For any ν = 0, …, r, the partial sum of the SVD

    A_ν := ∑_{k=1}^ν σₖuₖvₖ*

is the best rank-ν approximation to A:

    ‖A − A_ν‖₂ = min_{rank(B)≤ν} ‖A − B‖₂ = σ_{ν+1}
SLIDES 32–34

  • We first note that, for some (orthogonal) permutation matrices P₁ and P₂,

    A − A_ν = U diag_{m×n}(0, …, 0, σ_{ν+1}, …, σₙ) V* = U P₁ diag_{m×n}(σ_{ν+1}, …, σₙ, 0, …, 0) P₂ V*,

    so that ‖A − A_ν‖₂ = σ_{ν+1}
  • Suppose that there exists some B ∈ C^{m×n} with rank at most ν satisfying ‖A − B‖₂ < σ_{ν+1}
  • B has a kernel W of dimension at least n − ν, so that for all w ∈ W we have

    ‖Aw‖ = ‖(A − B)w‖ < σ_{ν+1}‖w‖

  • But for span(v₁, …, v_{ν+1}), which has dimension ν + 1, we have, for v = ∑_{k=1}^{ν+1} αₖvₖ,

    ‖Av‖ = ‖∑_{k=1}^{ν+1} σₖαₖuₖ‖ = √(∑_{k=1}^{ν+1} σₖ²|αₖ|²) ≥ σ_{ν+1}‖v‖

  • These two subspaces can intersect only at 0, yet their dimensions sum to at least n + 1, so we have a contradiction
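A numerical confirmation of the theorem (NumPy sketch; the choice ν = 2 is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((6, 4))
U, s, Vh = np.linalg.svd(A, full_matrices=False)

nu = 2
A_nu = U[:, :nu] @ np.diag(s[:nu]) @ Vh[:nu, :]        # best rank-nu approximation
assert np.isclose(np.linalg.norm(A - A_nu, 2), s[nu])  # error equals sigma_{nu+1}
```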

SLIDE 35

IMAGE COMPRESSION

SLIDES 36–37

  • We can represent a grayscale image by a matrix A whose entries are within [0, 1], with 0 representing black and 1 representing white
  • If the image has m pixels in the y direction and n pixels in the x direction, then the storage requires mn numbers
  • But we can approximate using the SVD

    A ≈ A_ν = ∑_{k=1}^ν σₖuₖvₖ*,

    which requires only (m + n)ν numbers: mν to store u₁, …, u_ν and nν to store σ₁v₁, …, σ_νv_ν
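A compact way to carry this out (a sketch; `img` is assumed to be a 2-D float array in [0, 1], here a synthetic stand-in so the snippet runs on its own):

```python
import numpy as np

def compress(img, nu):
    """Rank-nu SVD approximation of a grayscale image (2-D float array)."""
    U, s, Vh = np.linalg.svd(img, full_matrices=False)
    return U[:, :nu] @ np.diag(s[:nu]) @ Vh[:nu, :]

# Synthetic "image", just so the demo is self-contained
img = np.clip(np.add.outer(np.linspace(0, 1, 200),
                           np.sin(np.linspace(0, 6, 298))) / 2, 0, 1)
m, n = img.shape
for nu in (5, 10, 25):
    approx = compress(img, nu)
    err = np.linalg.norm(img - approx, 2) / np.linalg.norm(img, 2)
    print(f"rank {nu}: {(m + n) * nu} numbers vs {m * n}, rel. error {err:.3f}")
```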

SLIDE 38

[Figure: a grayscale image stored in full (88,804 unknowns, i.e. 298 × 298 pixels) alongside rank-10, rank-20, and rank-50 SVD approximations requiring 5,960, 11,920, and 29,800 unknowns.]