Álgebra Linear e Aplicações (Linear Algebra and Applications)
MATRIX ALGEBRA
Basic definitions
- A scalar is a complex number (unless specified otherwise)
- n-tuples of complex numbers are denoted by C^n
- n-tuples of real numbers are denoted by R^n
  (R^2 is the space of points in the plane, etc.)
- R^{m×n} and C^{m×n} denote the sets of real and
  complex m × n matrices, respectively
- Equality A = B
  - Iff [A]ij = [B]ij and the shapes are the same!
$$v = \begin{pmatrix} 1 \\ 2 \end{pmatrix} \neq u = \begin{pmatrix} 1 & 2 \end{pmatrix}$$
Addition
- Addition A + B
- [A+B]ij = [A]ij + [B]ij
- Same shape
- Matrix addition vs. scalar addition
- Additive inverse –A
- [–A]ij = –[A]ij
- Matrix difference A – B
- [A–B]ij = [A]ij – [B]ij
- Same shape!
$$\begin{pmatrix} -2 & x & 3 \\ z+3 & 4 & -y \end{pmatrix} + \begin{pmatrix} 2 & 1-x & -2 \\ -3 & 4+x & 4+y \end{pmatrix} = \begin{pmatrix} 0 & 1 & 1 \\ z & 8+x & 4 \end{pmatrix}$$
Addition properties
- Let A, B, and C be m × n matrices
- Closure: A + B is still an m × n matrix
- Associative: (A + B) + C = A + (B + C)
- Commutative: A + B = B + A
- Additive identity: [0]ij = 0 ⇒A + 0 = A
- Additive inverse: A + (–A) = 0
Scalar multiplication and properties
- Scalar multiplication αA
  - [αA]ij = α[A]ij
- Closure: αA is still an m × n matrix
- Associative: (αβ)A = α(βA)
- Distributive: α(A + B) = αA + αB
- Distributive: (α + β)A = αA + βA
- Identity: 1A = A
  - Where 1 is a scalar!
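These axioms are easy to spot-check numerically (a minimal sketch using NumPy; the random matrices and the scalars are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
A, B = rng.standard_normal((2, 3, 4))  # two random 3 x 4 matrices
alpha, beta = 2.0, -1.5

assert (alpha * A).shape == A.shape                           # closure
assert np.allclose((alpha * beta) * A, alpha * (beta * A))    # associative
assert np.allclose(alpha * (A + B), alpha * A + alpha * B)    # distributive
assert np.allclose((alpha + beta) * A, alpha * A + beta * A)  # distributive
assert np.allclose(1 * A, A)                                  # identity
```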
Transpose of a matrix
- Transpose A^T
  - A is m × n, A^T is n × m
  - $[A^T]_{ij} = [A]_{ji}$
- Conjugate matrix $\bar{A}$
  - $[\bar{A}]_{ij} = \overline{[A]_{ij}}$
- Conjugate transpose $A^* = \bar{A}^T = \overline{A^T}$
  - Also called the adjoint
$$\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}^{T} = \begin{pmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{pmatrix}, \qquad \begin{pmatrix} 1-4i & i & 2 \\ 3 & 2+i & \cdot \end{pmatrix}^{*} = \begin{pmatrix} 1+4i & 3 \\ -i & 2-i \\ 2 & \cdot \end{pmatrix}$$
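NumPy mirrors these operators directly (a minimal sketch; `.T` gives the transpose and `.conj().T` the conjugate transpose):

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])
print(A.T)         # transpose: a 3 x 2 matrix

B = np.array([[1 - 4j, 1j],
              [3 + 0j, 2 + 1j]])
print(B.conj().T)  # conjugate transpose (adjoint) B*
```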
Properties of transpose
- (A*)* = A
- (A^T)^T = A
$$(A+B)^T = A^T + B^T, \qquad (A+B)^* = A^* + B^*, \qquad (\alpha A)^T = \alpha A^T, \qquad (\alpha A)^* = \bar{\alpha}A^*$$
- Proof of the last property:
$$[(\alpha A)^*]_{ij} = \big[\overline{(\alpha A)^T}\big]_{ij} = \overline{[\alpha A]_{ji}} = \bar{\alpha}\,\overline{[A]_{ji}} = \bar{\alpha}[\bar{A}]_{ji} = \bar{\alpha}[(\bar{A})^T]_{ij} = \bar{\alpha}[A^*]_{ij}$$
Symmetries
- Let A = [aij] be a matrix
- A is a symmetric matrix if A = A^T, i.e. a_ij = a_ji
- A is a skew-symmetric matrix if A = –A^T, i.e. a_ij = –a_ji
- A is a Hermitian matrix if A = A*, i.e. a_ij = conj(a_ji)
  - Complex analogue of symmetry
- A is a skew-Hermitian matrix if A = –A*, i.e. a_ij = –conj(a_ji)
- Hooke’s law
- Force exerted by a spring is proportional to
displacement from rest position.
- f = –k x
Example of symmetry in action
[Figure: nodes 1, 2, 3 with displacements x1, x2, x3; spring k1 joins nodes 1 and 2, spring k2 joins nodes 2 and 3; external forces F1 and F3 act at the end nodes, with reactions –F1 and –F3]
The stiffness matrix
- Force balance at each node:
$$F_1 = k_1(x_1 - x_2), \qquad F_3 = k_2(x_3 - x_2), \qquad F_2 = -F_1 - F_3$$
- Written as a system in the displacements:
$$\begin{aligned} k_1x_1 - k_1x_2 &= F_1 \\ -k_1x_1 + (k_1+k_2)x_2 - k_2x_3 &= F_2 \\ -k_2x_2 + k_2x_3 &= F_3 \end{aligned}$$
- The stiffness matrix K is symmetric:
$$K = \begin{pmatrix} k_1 & -k_1 & 0 \\ -k_1 & k_1+k_2 & -k_2 \\ 0 & -k_2 & k_2 \end{pmatrix}$$
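A quick check of that symmetry (a minimal NumPy sketch; the spring constants are arbitrary values):

```python
import numpy as np

k1, k2 = 3.0, 5.0  # arbitrary spring constants
K = np.array([[ k1,     -k1,   0.0],
              [-k1, k1 + k2,   -k2],
              [0.0,     -k2,    k2]])
assert np.array_equal(K, K.T)  # the stiffness matrix equals its transpose
```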
LINEARITY
Definition of linear function
- Let D and R be sets that possess the operations of addition and multiplication by a scalar
- A function f from D to R is said to be a linear function iff
$$f(x + y) = f(x) + f(y) \qquad\text{and}\qquad f(\alpha x) = \alpha f(x)$$
- Equivalently, iff
$$f(\alpha x + y) = \alpha f(x) + f(y)$$
More explicitly
- Lines or planes through the origin:
$$y = f(x) = \alpha x, \qquad z = f(x, y) = \alpha x + \beta y$$
- Not going through the origin?
$$y = f(x) = \alpha x + \beta, \qquad z = f(x, y) = \alpha x + \beta y + \gamma$$
  - Not a linear function, but an affine function
- In higher dimensions
  - Harder to visualize: hyperplanes through the origin
$$f(x_1, x_2, \ldots, x_n) = \alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_n x_n$$
Linearity in other contexts
- Derivative operator
- Integral operator
- Matrix transpose operator
$$\frac{d(\alpha f + g)}{dx} = \alpha\frac{df}{dx} + \frac{dg}{dx}, \qquad \int (\alpha f + g)\,dx = \alpha\int f\,dx + \int g\,dx, \qquad (\alpha A + B)^T = \alpha A^T + B^T$$
Linear systems and linear functions
- Think of the linear system
$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= u_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= u_2 \\ &\;\,\vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= u_m \end{aligned}$$
- as an equation f(x) = u from R^n to R^m, where
$$x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}, \qquad u = \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_m \end{pmatrix}, \qquad\text{and}\qquad [f(x)]_i = \sum_{j=1}^{n} a_{ij}[x]_j = [u]_i$$
Linear systems and linear functions
- Is f linear? It is:
$$[f(\alpha x + y)]_i = \sum_{j=1}^{n} a_{ij}[\alpha x + y]_j = \sum_{j=1}^{n} a_{ij}\big(\alpha[x]_j + [y]_j\big) = \alpha\sum_{j=1}^{n} a_{ij}[x]_j + \sum_{j=1}^{n} a_{ij}[y]_j = [\alpha f(x) + f(y)]_i$$
- Solving a linear system is the inverse of applying a linear function
  - What is the x such that f(x) = u?
Linear functions (R^n to R^m) and matrices
- Let
$$x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \qquad\text{and}\qquad e_1 = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix},\; e_2 = \begin{pmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix},\; \cdots,\; e_n = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix}$$
- Then
$$x = \sum_{j=1}^{n} x_j e_j \qquad\text{and}\qquad f(x) = f\Big(\sum_{j=1}^{n} x_j e_j\Big) = \sum_{j=1}^{n} x_j f(e_j), \qquad f(e_j) = \begin{pmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{mj} \end{pmatrix}$$
- Now organize all in a matrix:
$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} = \begin{pmatrix} f(e_1) & f(e_2) & \cdots & f(e_n) \end{pmatrix}$$
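The recipe translates directly to code: apply f to each basis vector and stack the results as columns (a minimal NumPy sketch; the map `f` is an arbitrary linear function chosen for illustration):

```python
import numpy as np

def f(x):
    # an arbitrary linear map from R^3 to R^2 (illustrative)
    return np.array([2 * x[0] - x[1], x[1] + 3 * x[2]])

n = 3
# columns of A are f(e_1), ..., f(e_n)
A = np.column_stack([f(e) for e in np.eye(n)])

x = np.array([1.0, 2.0, 3.0])
assert np.allclose(A @ x, f(x))  # the matrix reproduces f exactly
```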
MATRIX MULTIPLICATION
What about matrix multiplication?
- An m × n matrix is a neat way to specify a linear
function from R^n to R^m
- Our definitions of addition and multiplication by
scalar work well with this interpretation
- How would you define matrix multiplication?
- Point-wise definition is not very useful
  - (fg)(x) = f(x)g(x), but then fg is not even linear anymore!
- Matrix element-wise multiplication is not good either
- Contrast with addition and scaling, which do match the function operations:
  (f + g)(x) = f(x) + g(x) and (αf)(x) = α f(x)
“Multiplication” as composition
- Proposed by Arthur Cayley in 1855
- Very recent!
- Linear functions are closed under composition
- And, composition is associative
- This is promising
$$(g \circ f)(\alpha x + y) = g\big(f(\alpha x + y)\big) = g\big(\alpha f(x) + f(y)\big) = \alpha\, g\big(f(x)\big) + g\big(f(y)\big) = \alpha\,(g \circ f)(x) + (g \circ f)(y)$$
Potential issues
- Function composition is not commutative
- So matrix multiplication is not commutative either
- Not all function compositions make sense
- Matrices must be compatible
- An m × n matrix is a function from R^n to R^m
- Product between m × n and n × p matrices is OK
- But what is the resulting matrix?
$$g \circ f \neq f \circ g, \qquad f\colon A \to B, \quad g\colon B \to C$$
Matrix multiplication
- Let g have matrix A = [aij]m×n and f have matrix B = [bjk]n×p
- Consider g ∘ f, with matrix C = [cik]m×p
$$[g(y)]_i = \sum_{j=1}^{n} a_{ij}[y]_j, \qquad [f(x)]_j = \sum_{k=1}^{p} b_{jk}[x]_k$$
$$[(g \circ f)(x)]_i = \big[g\big(f(x)\big)\big]_i = \sum_{j=1}^{n} a_{ij}[f(x)]_j = \sum_{j=1}^{n} a_{ij}\sum_{k=1}^{p} b_{jk}[x]_k = \sum_{k=1}^{p}\Big(\sum_{j=1}^{n} a_{ij}b_{jk}\Big)[x]_k = \sum_{k=1}^{p} c_{ik}[x]_k$$
$$\text{with}\qquad c_{ik} = \sum_{j=1}^{n} a_{ij}b_{jk}$$
Step by step multiplication
- A = [aij]m×n , B = [bjk]n×p , C = AB = [cik]m×p
$$c_{ij} = \sum_{k=1}^{n} a_{ik}b_{kj}$$
(entry c_ij combines row i of the m × n matrix A with column j of the n × p matrix B)
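The index formula becomes a triple loop (a minimal sketch, validated against NumPy's built-in product):

```python
import numpy as np

def matmul(A, B):
    m, n = A.shape
    n2, p = B.shape
    assert n == n2, "matrices must be conformable"
    C = np.zeros((m, p))
    for i in range(m):          # row of A
        for j in range(p):      # column of B
            for k in range(n):  # c_ij = sum_k a_ik * b_kj
                C[i, j] += A[i, k] * B[k, j]
    return C

A = np.arange(6.0).reshape(2, 3)
B = np.arange(12.0).reshape(3, 4)
assert np.allclose(matmul(A, B), A @ B)
```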
Example #1
- Multiplication of a row and a column
- Result is a scalar value
- Also known as the scalar product of r and c
- Also standard inner product, or dot product
$$r^T = \begin{pmatrix} r_1 & r_2 & \cdots & r_n \end{pmatrix}, \qquad c = \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{pmatrix}, \qquad r^T c = r_1c_1 + r_2c_2 + \cdots + r_nc_n = \sum_{i=1}^{n} r_i c_i$$
More insight into multiplication
- A = [aij]m×n , B = [bjk]n×p , C = AB = [cik]m×p
$$c_{ij} = \sum_{k=1}^{n} a_{ik}b_{kj} = A_{i*}B_{*j}$$
(each entry of C is the scalar product of a row of A with a column of B)
Example #2
- Multiplication of a column and a row
- Result is an m × n matrix
- Also known as the outer product of c and r
- What is the rank of the result?
$$c = \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_m \end{pmatrix}, \qquad r^T = \begin{pmatrix} r_1 & r_2 & \cdots & r_n \end{pmatrix}, \qquad c\,r^T = \begin{pmatrix} c_1r_1 & c_1r_2 & \cdots & c_1r_n \\ c_2r_1 & c_2r_2 & \cdots & c_2r_n \\ \vdots & \vdots & & \vdots \\ c_mr_1 & c_mr_2 & \cdots & c_mr_n \end{pmatrix}$$
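Both special cases in NumPy (a minimal sketch; the rank computation answers the question above, since the outer product always has rank 1):

```python
import numpy as np

r = np.array([1.0, 2.0, 3.0])
c = np.array([4.0, 5.0, 6.0])

inner = r @ c            # scalar product: 1*4 + 2*5 + 3*6 = 32
outer = np.outer(c, r)   # the matrix c r^T

print(inner)
print(np.linalg.matrix_rank(outer))  # 1: every row is a multiple of r^T
```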
Example #3
- Two general matrices:
$$F = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}, \qquad G = \begin{pmatrix} a & b & c \\ d & e & f \end{pmatrix}, \qquad H = FG = \begin{pmatrix} \alpha a + \beta d & \alpha b + \beta e & \alpha c + \beta f \\ \gamma a + \delta d & \gamma b + \delta e & \gamma c + \delta f \end{pmatrix}$$
How expensive is multiplication?
- Let A = [aij]m×n, B = [bjk]n×p, C = AB = [cik]m×p
- n multiplications and (n–1) additions per cij
- There are m × p entries cij
- Total cost is
- m × n × p multiplications
- m × (n-1) × p additions
- Or about n^3 operations for two square matrices of size n
$$c_{ij} = \sum_{k=1}^{n} a_{ik}b_{kj}$$
Rows and columns of result
- Let A = [aij]m×n and B = [bjk]n×p
- Consider each row and column of result in turn
- Rows of AB are linear combinations of rows of B
- Columns of AB are linear combinations of columns of A
$$[AB]_{i*} = A_{i*}B = a_{i1}B_{1*} + a_{i2}B_{2*} + \cdots + a_{in}B_{n*} = \sum_{k=1}^{n} a_{ik}B_{k*}$$
$$[AB]_{*j} = AB_{*j} = A_{*1}b_{1j} + A_{*2}b_{2j} + \cdots + A_{*n}b_{nj} = \sum_{k=1}^{n} A_{*k}b_{kj}$$
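A numerical check of both identities (a minimal NumPy sketch with random matrices):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 5))
AB = A @ B

i, j = 1, 2
assert np.allclose(AB[i, :], A[i, :] @ B)  # row i of AB = (row i of A) B
assert np.allclose(AB[:, j], A @ B[:, j])  # column j of AB = A (column j of B)
```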
Linear systems
- Every linear system of m equations in n unknowns
$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\ &\;\,\vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m \end{aligned}$$
- can be written as a matrix equation Ax = b, where
$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}, \qquad x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}, \qquad b = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix}$$
Column/row view of linear systems
- Row view of the system
$$x - y = 1, \qquad x + 2y = 4$$
  - Geometrically: each equation is a line, and the solution is their intersection
- The column view is
$$\begin{pmatrix} 1 \\ 1 \end{pmatrix} x + \begin{pmatrix} -1 \\ 2 \end{pmatrix} y = \begin{pmatrix} 1 \\ 4 \end{pmatrix}$$
  - Geometrically: write the right-hand side as a linear combination of the columns
Multiplication properties
- Left-hand distributive: A(B + C) = AB + AC
- Right-hand distributive: (D + E)F = DF + EF
  - Proof similar to the left-hand case (below)
- Associative: A(BC) = (AB)C
  - Proof obvious from definition
  - Function composition is associative
$$[A(B + C)]_{ij} = \sum_{k} [A]_{ik}[B + C]_{kj} = \sum_{k} [A]_{ik}\big([B]_{kj} + [C]_{kj}\big) = \sum_{k} [A]_{ik}[B]_{kj} + \sum_{k} [A]_{ik}[C]_{kj} = [AB]_{ij} + [AC]_{ij}$$
Beware of pitfalls
- What is (A + B)^2?
$$(A + B)^2 = (A + B)(A + B) = A(A + B) + B(A + B) = A^2 + AB + BA + B^2$$
- Not the same as A^2 + 2AB + B^2!
- Simpler or more complicated than scalars?
- How many terms? Which terms? Coefficients?
Example #4
- Interested in studying immigration patterns
- Consider the transition diagram (yearly)
- What happens to the population distribution?
- All to N, all to S, some other equilibrium point?
[Transition diagram: states N and S; each year N stays with probability .5 and moves to S with probability .5, while S stays with probability .75 and moves to N with probability .25]
Example #4
- Modeling with equations:
$$n_{t+1} = .5\,n_t + .25\,s_t, \qquad s_{t+1} = .5\,n_t + .75\,s_t \qquad (n_{t+2} = \,?, \quad s_{t+2} = \,?)$$
- In matrix form:
$$p_t = \begin{pmatrix} n_t \\ s_t \end{pmatrix}, \qquad p_t = \begin{pmatrix} .5 & .25 \\ .5 & .75 \end{pmatrix} p_{t-1} = T\,p_{t-1} = T^2 p_{t-2} = \cdots = T^t p_0$$
Example #4
- General term: $p_t = T^t p_0$
- In the limit:
$$\lim_{t\to\infty} p_t = \lim_{t\to\infty} T^t p_0$$
- Look at powers of T:
$$T = \begin{pmatrix} .500 & .250 \\ .500 & .750 \end{pmatrix}, \quad T^2 = \begin{pmatrix} .375 & .312 \\ .625 & .687 \end{pmatrix}, \quad T^3 = \begin{pmatrix} .344 & .328 \\ .656 & .672 \end{pmatrix}$$
$$T^4 = \begin{pmatrix} .328 & .332 \\ .672 & .668 \end{pmatrix}, \quad T^5 = \begin{pmatrix} .334 & .333 \\ .666 & .667 \end{pmatrix}, \quad T^6 = \begin{pmatrix} .333 & .333 \\ .667 & .667 \end{pmatrix}$$
- Looks like
$$\lim_{t\to\infty} T^t = \begin{pmatrix} 1/3 & 1/3 \\ 2/3 & 2/3 \end{pmatrix}$$
- Which means
$$\lim_{t\to\infty} p_t = \begin{pmatrix} 1/3 & 1/3 \\ 2/3 & 2/3 \end{pmatrix}\begin{pmatrix} n_0 \\ s_0 \end{pmatrix} = \begin{pmatrix} \tfrac{1}{3}n_0 + \tfrac{1}{3}s_0 \\ \tfrac{2}{3}n_0 + \tfrac{2}{3}s_0 \end{pmatrix}$$
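The convergence is easy to reproduce (a minimal NumPy sketch):

```python
import numpy as np

T = np.array([[0.50, 0.25],
              [0.50, 0.75]])
for t in (1, 2, 4, 8, 16):
    print(t, np.linalg.matrix_power(T, t).round(3))
# The powers approach [[1/3, 1/3], [2/3, 2/3]], so any initial
# distribution p0 = (n0, s0) converges to ((n0+s0)/3, 2*(n0+s0)/3).
```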
Can we find this limit directly?
- Go back to linear system
- What does it look like in the limit?
- Necessary condition for the limit to exist?
- For a generic transition matrix
- Do all transition matrices have this property?
- How do we find the solution?
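One way to answer these questions numerically: in the limit p = Tp, so an equilibrium solves (T − I)p = 0; since that system is redundant, append the normalization n + s = 1 (a minimal NumPy sketch):

```python
import numpy as np

T = np.array([[0.50, 0.25],
              [0.50, 0.75]])
# Solve (T - I) p = 0 together with p_1 + p_2 = 1
M = np.vstack([T - np.eye(2), np.ones((1, 2))])
rhs = np.array([0.0, 0.0, 1.0])
p, *_ = np.linalg.lstsq(M, rhs, rcond=None)
print(p)  # [1/3, 2/3], matching the observed limit of T^t p0
```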
Reverse rule for transposition
- For conformable matrices A and B
$$(AB)^T = B^T A^T, \qquad (AB)^* = B^* A^*$$
- Proof:
$$[(AB)^T]_{ij} = [AB]_{ji} = [A]_{j*}[B]_{*i} = \sum_{k} [A]_{jk}[B]_{ki} = \sum_{k} [B^T]_{ik}[A^T]_{kj} = [B^T]_{i*}[A^T]_{*j} = [B^T A^T]_{ij}$$
- What can we say about A^T A and AA^T ?
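A quick check, which also answers the question above: A^T A and AA^T are always symmetric (a minimal NumPy sketch):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 2))

assert np.allclose((A @ B).T, B.T @ A.T)  # reverse rule for transposition
assert np.allclose(A.T @ A, (A.T @ A).T)  # (A^T A)^T = A^T (A^T)^T = A^T A
assert np.allclose(A @ A.T, (A @ A.T).T)  # same argument for A A^T
```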
Trace of a matrix
- Let A = [aij] be an n × n matrix
$$\operatorname{trace}(A) = a_{11} + a_{22} + \cdots + a_{nn} = \sum_{i=1}^{n} a_{ii}$$
- Show that trace is a linear function
- Show that trace(AB) = trace(BA)
  - Even though AB and BA may have different dimensions!
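Numerical evidence for both exercises (a minimal NumPy sketch; AB is 3 × 3 while BA is 4 × 4, yet the traces agree):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 3))
X, Y = rng.standard_normal((2, 3, 3))
alpha = 2.5

assert np.isclose(np.trace(alpha * X + Y),
                  alpha * np.trace(X) + np.trace(Y))  # linearity
assert np.isclose(np.trace(A @ B), np.trace(B @ A))   # trace(AB) = trace(BA)
```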
Block matrix-multiplication
- Partition matrices A and B into blocks
- If the pairs (Aik, Bkj) are conformable, then
block (i,j) of the product matrix AB is
$$A = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1r} \\ A_{21} & A_{22} & \cdots & A_{2r} \\ \vdots & \vdots & \ddots & \vdots \\ A_{s1} & A_{s2} & \cdots & A_{sr} \end{pmatrix}, \qquad B = \begin{pmatrix} B_{11} & B_{12} & \cdots & B_{1t} \\ B_{21} & B_{22} & \cdots & B_{2t} \\ \vdots & \vdots & \ddots & \vdots \\ B_{r1} & B_{r2} & \cdots & B_{rt} \end{pmatrix}$$
$$[AB]_{ij} = A_{i1}B_{1j} + A_{i2}B_{2j} + \cdots + A_{ir}B_{rj} = \sum_{k=1}^{r} A_{ik}B_{kj}$$
- Formula almost as if blocks were scalars!
Reducible systems
- Want to solve T x = b with T block-triangular
and partition x and b conformally
- From block multiplication we have
- Reduces to two smaller linear systems
- What about the computational cost of solving
the partitioned system?
$$T = \begin{pmatrix} A & B \\ 0 & C \end{pmatrix}, \qquad x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}, \qquad b = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} \qquad\Longrightarrow\qquad Ax_1 + Bx_2 = b_1, \quad Cx_2 = b_2$$
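A sketch of the resulting two-stage solve (assuming the diagonal blocks A and C are nonsingular; the block sizes are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(4)
A, B, C = rng.standard_normal((3, 2, 2))  # three random 2 x 2 blocks
T = np.block([[A, B],
              [np.zeros((2, 2)), C]])
b1, b2 = rng.standard_normal((2, 2))

x2 = np.linalg.solve(C, b2)           # solve the small lower system first
x1 = np.linalg.solve(A, b1 - B @ x2)  # then substitute into the top block
assert np.allclose(T @ np.concatenate([x1, x2]), np.concatenate([b1, b2]))
```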
Derivative of a matrix product
- Let each entry of A(t) and B(t) be a differentiable function of a real variable t, and define
$$[dA(t)/dt]_{ij} = d[A(t)]_{ij}/dt$$
- Find dC(t)/dt where C = AB:
$$[d(AB)(t)/dt]_{ij} = d[(AB)(t)]_{ij}/dt = d\Big[\sum_{k} [A(t)]_{ik}[B(t)]_{kj}\Big]\Big/dt = \sum_{k} d\big([A(t)]_{ik}[B(t)]_{kj}\big)/dt$$
$$= \sum_{k} \big(d[A(t)]_{ik}/dt\big)[B(t)]_{kj} + \sum_{k} [A(t)]_{ik}\big(d[B(t)]_{kj}/dt\big) = \big[(dA(t)/dt)\,B\big]_{ij} + \big[A\,(dB(t)/dt)\big]_{ij}$$
- So the product rule holds for matrices, as long as the order of the factors is preserved
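A finite-difference sanity check of the matrix product rule (a minimal NumPy sketch; the matrix-valued functions are hand-picked examples):

```python
import numpy as np

def A(t):  # illustrative matrix-valued functions of t
    return np.array([[np.sin(t), t**2], [1.0, np.exp(t)]])

def B(t):
    return np.array([[t, 1.0], [np.cos(t), t**3]])

def num_diff(F, t, h=1e-6):  # central finite difference
    return (F(t + h) - F(t - h)) / (2 * h)

t = 0.7
lhs = num_diff(lambda s: A(s) @ B(s), t)
rhs = num_diff(A, t) @ B(t) + A(t) @ num_diff(B, t)
assert np.allclose(lhs, rhs, atol=1e-5)
```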
MATRIX INVERSION
Scalar analogue of linear system
- When α ≠ 0, and for every β, the equation
α x = β
- has a solution x = α-1 β, since
α (α-1 β) = (α α-1) β = (1) β = β
- Solution is unique, since
α x1 = β = α x2 ⇒ α-1 (α x1) = α-1 (α x2) ⇒ (α-1 α) x1 = (α-1 α) x2 ⇒ (1) x1 = (1) x2 ⇒ x1 = x2
- Want the same for matrices!
What do we need?
- Need a matrix I analogue to the scalar 1,
identity under multiplication
- AI = A
- IA = A
- Also need a matrix analogue A-1 to the scalar
multiplication inverse α-1
- A-1A = I just like α α-1 = 1
- AA-1 = I just like α-1 α = 1
Identity Matrix
- An n × n matrix with 1s on the diagonal and 0s otherwise:
$$I_n = \begin{pmatrix} 1 & & & \\ & 1 & & \\ & & \ddots & \\ & & & 1 \end{pmatrix}$$
- For every m × n matrix A, Im A = A and A In = A
- Proof, column by column (only the term i = j survives):
$$[AI]_{*j} = AI_{*j} = \sum_{i=1}^{n} [I]_{ij}A_{*i} = [I]_{jj}A_{*j} = A_{*j}$$
(and similarly, row by row, $[IA]_{i*} = [I]_{ii}A_{i*} = A_{i*}$)
Matrix inversion
- Let A and B be square matrices
- If AB = I and BA = I,
then B = A-1 is the inverse of A
- Not all matrices A have inverses
- Not all scalars α have inverses either
- If A has an inverse, it is said to be nonsingular
- If A has no inverse, it is said to be singular
- What happens if A is rectangular?
How to compute an inverse?
- From the definition
- Solve for each column separately
$$AA^{-1} = I \;\Rightarrow\; [AA^{-1}]_{*i} = [I]_{*i} \;\Rightarrow\; A[A^{-1}]_{*i} = [I]_{*i}$$
How to compute an inverse
- Use Gauss-Jordan to reduce
[A|I] to [I|A-1]
- How expensive is the procedure?
- About n^3 multiplications/divisions
- About n^3 additions/subtractions
- Gaussian elimination followed by substitution
gives similar operation count
- Elimination with n right-hand-sides
- Back-substitution with n right-hand-sides
Example #1
- General 2 × 2 case
$$\left(\begin{array}{cc|cc} a & b & 1 & 0 \\ c & d & 0 & 1 \end{array}\right) \to \left(\begin{array}{cc|cc} a & b & 1 & 0 \\ 0 & d - bc/a & -c/a & 1 \end{array}\right) \to \left(\begin{array}{cc|cc} a & b & 1 & 0 \\ 0 & 1 & -c/(ad-bc) & a/(ad-bc) \end{array}\right)$$
$$\to \left(\begin{array}{cc|cc} a & 0 & 1 + bc/(ad-bc) & -ab/(ad-bc) \\ 0 & 1 & -c/(ad-bc) & a/(ad-bc) \end{array}\right) \to \left(\begin{array}{cc|cc} 1 & 0 & d/(ad-bc) & -b/(ad-bc) \\ 0 & 1 & -c/(ad-bc) & a/(ad-bc) \end{array}\right)$$
$$\Longrightarrow\qquad \begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \frac{1}{ad - bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$$
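Checking the closed form against NumPy (a minimal sketch; the entries are chosen so that ad − bc ≠ 0):

```python
import numpy as np

a, b, c, d = 2.0, 7.0, 1.0, 6.0  # ad - bc = 5, so A is invertible
A = np.array([[a, b], [c, d]])
A_inv = np.array([[d, -b], [-c, a]]) / (a * d - b * c)
assert np.allclose(A_inv, np.linalg.inv(A))
assert np.allclose(A @ A_inv, np.eye(2))
```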
Existence of inverse
- These conditions are equivalent
  1. A^{-1} exists (A is nonsingular)
  2. rank(A) = n
  3. Gauss-Jordan on A leads to I
  4. Ax = 0 implies x = 0
- 2 ⇔ 3 (definition of rank)
- 2 ⇔ 4 (when studying homogeneous systems)
- What about 1 ⇔ 4?
Existence of Inverse
- A^{-1} exists ⇔ (Ax = 0 implies x = 0)
- Proof (⇒):
$$Ax = 0 \;\Rightarrow\; A^{-1}(Ax) = A^{-1}0 \;\Rightarrow\; (A^{-1}A)x = 0 \;\Rightarrow\; Ix = 0 \;\Rightarrow\; x = 0$$
- Proof (⇐): each system $Ax_i = I_{*i}$ has a unique solution, so there is an X with AX = I. But is XA = I?
$$A(XA - I) = AXA - A = IA - A = 0 \;\Rightarrow\; A[XA - I]_{*i} = 0 \;\Rightarrow\; [XA - I]_{*i} = 0 \;\Rightarrow\; XA - I = 0 \;\Rightarrow\; XA = I$$
Properties of Inverse
- Let A and B be nonsingular matrices
- The product AB is also nonsingular
- Reverse rule for matrix inversion: (AB)^{-1} = B^{-1}A^{-1}
- (A^{-1})^T = (A^T)^{-1} and (A^{-1})^* = (A^*)^{-1}
- (A^{-1})^{-1} = A
Derivative of inverse
- Let each entry of A(t) be a differentiable function of a real variable t, and define $[dA(t)/dt]_{ij} = d[A(t)]_{ij}/dt$
- Assume A(t) is nonsingular and find $d\big(A^{-1}(t)\big)/dt$
- Differentiate both sides of $A(t)\,A^{-1}(t) = I$ with the product rule:
$$d\big(A(t)\,A^{-1}(t)\big)/dt = 0 \;\Rightarrow\; \big(dA(t)/dt\big)A^{-1}(t) + A(t)\,d\big(A^{-1}(t)\big)/dt = 0$$
$$\Rightarrow\; A(t)\,d\big(A^{-1}(t)\big)/dt = -\big(dA(t)/dt\big)A^{-1}(t) \;\Rightarrow\; d\big(A^{-1}(t)\big)/dt = -A^{-1}(t)\big(dA(t)/dt\big)A^{-1}(t)$$
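The same finite-difference trick verifies this formula too (a minimal NumPy sketch; A(t) is an arbitrary smooth, nonsingular matrix function):

```python
import numpy as np

def A(t):  # illustrative nonsingular matrix function
    return np.array([[2.0 + np.sin(t), t],
                     [0.5 * t, 3.0 + np.cos(t)]])

def num_diff(F, t, h=1e-6):
    return (F(t + h) - F(t - h)) / (2 * h)

t = 0.3
Ainv = np.linalg.inv(A(t))
lhs = num_diff(lambda s: np.linalg.inv(A(s)), t)
rhs = -Ainv @ num_diff(A, t) @ Ainv
assert np.allclose(lhs, rhs, atol=1e-5)
```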
ELEMENTARY MATRICES
Elementary operations as matrix multiplications
- Consider the matrix product row by row
- Can we find matrix A corresponding to each
elementary row operation on B?
- Exchange two rows
- Multiply a row by a non-zero constant
- Add a multiple of a row into another
$$[AB]_{i*} = A_{i*}B = a_{i1}B_{1*} + a_{i2}B_{2*} + \cdots + a_{in}B_{n*} = \sum_{k=1}^{n} a_{ik}B_{k*}$$
Exchanging two rows
- Let us try to exchange rows r and s
- First, the rows that do not change (i ≠ r, i ≠ s)
  - aii = 1 and aij = 0 for j ≠ i
- Now the rows being replaced
  - row r: ars = 1 and arj = 0 for j ≠ s
  - row s: asr = 1 and asj = 0 for j ≠ r
- A is simply I with the same rows exchanged!
$$[AB]_{i*} = A_{i*}B = a_{i1}B_{1*} + a_{i2}B_{2*} + \cdots + a_{in}B_{n*} = \sum_{k=1}^{n} a_{ik}B_{k*}$$
What about other operations?
- Multiply row i by α ≠ 0?
- I with aii replaced by α
- Add α times row i to row j?
- I with aji replaced by α
- What about column operations?
- Multiply on the right side!
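All three row operations as left-multiplications, and a column operation on the right (a minimal NumPy sketch; each E is the identity with one edit):

```python
import numpy as np

B = np.arange(9.0).reshape(3, 3)

E1 = np.eye(3); E1[[0, 2]] = E1[[2, 0]]  # Type I: exchange rows 0 and 2
E2 = np.eye(3); E2[1, 1] = 5.0           # Type II: multiply row 1 by 5
E3 = np.eye(3); E3[2, 0] = -2.0          # Type III: add -2 x row 0 to row 2

print(E1 @ B)  # rows 0 and 2 of B exchanged
print(E2 @ B)  # row 1 scaled by 5
print(E3 @ B)  # row operation applied on the left
print(B @ E1)  # multiplying on the right exchanges columns instead
```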
Elementary matrices
- These are elementary matrices
  - Type I: row/column exchange
  - Type II: multiply a row/column by a non-zero constant
  - Type III: add a multiple of a row/column to another
- Can all be written in the form E = I − uv^T
- Can we invert a matrix of this form?
  - Start from the ansatz E^{-1} = I − αuv^T and solve for α:
$$E^{-1} = I - \frac{1}{v^T u - 1}\,uv^T, \qquad v^T u \neq 1$$
- So elementary matrices are invertible
  - Why? What happens when v^T u = 1?
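Verifying the inversion formula on a Type III matrix (a minimal NumPy sketch; u and v are chosen so that v^T u = 0 ≠ 1):

```python
import numpy as np

# Type III: add -2 x row 0 to row 2, written as E = I - u v^T
u = np.array([[0.0], [0.0], [2.0]])  # column vectors
v = np.array([[1.0], [0.0], [0.0]])
E = np.eye(3) - u @ v.T

vTu = (v.T @ u).item()  # 0 here, so the formula applies
E_inv = np.eye(3) - u @ v.T / (vTu - 1)
assert np.allclose(E @ E_inv, np.eye(3))
assert np.allclose(E_inv, np.linalg.inv(E))
```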
Products of Elementary Matrices
- A is a non-singular matrix iff it is the product of elementary matrices of types I, II, and III
- Proof: Gauss-Jordan reduces A to I using row operations
$$G_k \cdots G_2 G_1 A = I \;\Rightarrow\; A = G_1^{-1} G_2^{-1} \cdots G_k^{-1}$$
- Converse is obvious
  - Product of non-singular matrices is non-singular
Reduced row-echelon form is unique
- In other words, if
$$E_m \cdots E_2 E_1 A = U \qquad\text{and}\qquad F_n \cdots F_2 F_1 A = V$$
are both reduced row-echelon forms of A, then U = V
- First thing to notice: with $C = (E_m \cdots E_2 E_1)(F_n \cdots F_2 F_1)^{-1}$ we get U = CV and V = C^{-1}U
  - C is not necessarily the identity
  - Nevertheless, U = V!
- Proof by induction on the columns

Zero columns do not matter
- Since $U_{*i} = CV_{*i}$ with C nonsingular, $U_{*i} = 0 \Leftrightarrow V_{*i} = 0$

Basis
- The first non-zero column of a reduced row-echelon form is $I_1$, the first column of the identity, so
$$U_{*1} = I_1 = V_{*1}, \qquad CV_{*1} = U_{*1} \;\Rightarrow\; CI_1 = I_1 \;\Rightarrow\; C_{*1} = I_1$$
- A non-basic column that follows is a multiple of $I_1$, say $U_{*2} = \alpha I_1$ and $V_{*2} = \beta I_1$:
$$\alpha I_1 = U_{*2} = CV_{*2} = \beta C_{*1} = \beta I_1 \;\Rightarrow\; \alpha = \beta \;\Rightarrow\; U_{*2} = V_{*2}$$

Intuition for inductive step
- If column 2 is instead the second basic column, the same argument gives $C_{*2} = I_2$ and $U_{*2} = I_2 = V_{*2}$

Inductive step
- Assume $C_{*j} = I_j$ and $U_{*j} = V_{*j}$ for the basic columns $I_1, \ldots, I_i$
- A non-basic column i+1 is a combination $V_{*(i+1)} = \sum_{j=1}^{i} \beta_j I_j$ of the previous basic columns, so
$$U_{*(i+1)} = CV_{*(i+1)} = \sum_{j=1}^{i} \beta_j C_{*j} = \sum_{j=1}^{i} \beta_j I_j = V_{*(i+1)}$$

Concluding the inductive step
- If column i+1 is the next basic column, then
$$U_{*(i+1)} = I_{i+1} = V_{*(i+1)}, \qquad CV_{*(i+1)} = U_{*(i+1)} \;\Rightarrow\; CI_{i+1} = I_{i+1} \;\Rightarrow\; C_{*(i+1)} = I_{i+1}$$
Equivalence
- Let P and Q be non-singular matrices
- And therefore products of elementary matrices
- Equivalence
- Row equivalence
- Column equivalence
$$A \sim B \;\Leftrightarrow\; PAQ = B, \qquad A \overset{\text{row}}{\sim} B \;\Leftrightarrow\; PA = B, \qquad A \overset{\text{col}}{\sim} B \;\Leftrightarrow\; AQ = B$$
Rank normal form
- Start from reduced row-echelon form
- Use column operations to simplify even more
- Final form is
$$PA = E_A \;\longrightarrow\; PAF_1 \cdots F_n \;\longrightarrow\; PAF_1 \cdots F_n G_1 \cdots G_m = PAQ = \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix}$$
(column operations first clear the non-basic columns of the reduced row-echelon form $E_A$, then permute the basic columns to the front)
Properties of equivalence
- Row equivalence preserves column relations
- Column equivalence preserves row relations
- Testing for equivalence
- rank(A) = rank(B)
- Multiplication by non-singular matrix preserves rank
$$A \overset{\text{row}}{\sim} B \;\Leftrightarrow\; E_A = E_B, \qquad A \overset{\text{col}}{\sim} B \;\Leftrightarrow\; E_{A^T} = E_{B^T}, \qquad A \sim B \;\Leftrightarrow\; \operatorname{rank}(A) = \operatorname{rank}(B)$$
- Proof (⇒): $N_r \sim A \sim B \sim N_s$ gives $N_r \sim N_s$, which forces r = s
- Proof (⇐): equal ranks give $A \sim N_r \sim B$
- For row equivalence: (⇒) $E_A \overset{\text{row}}{\sim} A \overset{\text{row}}{\sim} B \overset{\text{row}}{\sim} E_B$, and uniqueness of the reduced form gives $E_A = E_B$; (⇐) $A \overset{\text{row}}{\sim} E_A = E_B \overset{\text{row}}{\sim} B$
(Here $N_r$ denotes the rank normal form of rank r.)
Rank of transpose
- Transposition does not change rank
- Not very intuitive
- Use equivalence
- A ∼ N_r means PAQ = N_r for some non-singular P and Q
- Transposing: $Q^T A^T P^T = N_r^T$, so $A^T \sim N_r^T$
$$\operatorname{rank}(A) = \operatorname{rank}(N_r) = \operatorname{rank}(N_r^T) = \operatorname{rank}(A^T)$$
THE LU DECOMPOSITION
Back to Gaussian Elimination
- Assume for now there are no zero pivots
- Row operations are products by elementary matrices:
$$\begin{pmatrix} 2 & 2 & 2 \\ 4 & 7 & 7 \\ 6 & 18 & 22 \end{pmatrix} \xrightarrow{\;R_2 - 2R_1,\; R_3 - 3R_1\;} \begin{pmatrix} 2 & 2 & 2 \\ 0 & 3 & 3 \\ 0 & 12 & 16 \end{pmatrix} \xrightarrow{\;R_3 - 4R_2\;} \begin{pmatrix} 2 & 2 & 2 \\ 0 & 3 & 3 \\ 0 & 0 & 4 \end{pmatrix} = U$$
$$G_3 G_2 G_1 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -4 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -3 & 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 5 & -4 & 1 \end{pmatrix}$$
- So we have $G_3 G_2 G_1 A = U$, and finally
$$A = (G_3 G_2 G_1)^{-1} U = G_1^{-1} G_2^{-1} G_3^{-1} U = LU, \qquad L = \begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 3 & 4 & 1 \end{pmatrix}$$
Back to Gaussian Elimination
- Assume for now there are no zero pivots
- How to transform U back to A?
- Use the inverses of the row operations, in reverse order:
$$G_1^{-1} G_2^{-1} G_3^{-1} U = \begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 3 & 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 4 & 1 \end{pmatrix}\begin{pmatrix} 2 & 2 & 2 \\ 0 & 3 & 3 \\ 0 & 0 & 4 \end{pmatrix} = \begin{pmatrix} 2 & 2 & 2 \\ 4 & 7 & 7 \\ 6 & 18 & 22 \end{pmatrix} = A$$
$$A = LU, \qquad L = G_1^{-1} G_2^{-1} G_3^{-1} = \begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 3 & 4 & 1 \end{pmatrix}$$
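A bare-bones LU factorization without pivoting, run on the example above (a minimal sketch; it assumes no zero pivots appear, as the slides do):

```python
import numpy as np

def lu_nopivot(A):
    n = A.shape[0]
    L, U = np.eye(n), A.astype(float).copy()
    for k in range(n - 1):                # eliminate below pivot k
        for i in range(k + 1, n):
            L[i, k] = U[i, k] / U[k, k]   # multiplier, stored in L
            U[i, :] -= L[i, k] * U[k, :]  # row operation R_i - l_ik R_k
    return L, U

A = np.array([[2.0, 2, 2], [4, 7, 7], [6, 18, 22]])
L, U = lu_nopivot(A)
print(L)  # [[1,0,0],[2,1,0],[3,4,1]]
print(U)  # [[2,2,2],[0,3,3],[0,0,4]]
assert np.allclose(L @ U, A)
```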
Quick proof by induction
- Assume we have completed i steps, and are starting step i+1
- Look at the product column by column, or by blocks: each new inverse factor deposits its multipliers in one more column, without disturbing the columns already filled in (schematically):
$$\begin{pmatrix} 1 & & & \\ \alpha_1 & 1 & & \\ \alpha_2 & & \ddots & \\ \vdots & & & 1 \end{pmatrix}\begin{pmatrix} 1 & & & \\ & 1 & & \\ & \ast & \ddots & \\ & \vdots & \ast & 1 \end{pmatrix} = \begin{pmatrix} 1 & & & \\ \alpha_1 & 1 & & \\ \alpha_2 & \ast & \ddots & \\ \vdots & \vdots & \ast & 1 \end{pmatrix}$$
LU decomposition
- Let A be an n × n matrix for which
no zero pivots appear during elimination
- Then A can be factored as A = L U
- L is lower triangular and U is upper triangular
- lii = 1 and uii ≠ 0
- L and U are unique
- It is the LU factorization of A
- Proof later
Solving a linear system from L and U
- If we know L and U, we don’t need A
- Equivalent to solving two linear systems
- How expensive is to solve each of them?
- L by forward substitution, U by back substitution
- Only about n^2/2 operations each
- For each right-hand-side, only n^2 operations
- Much faster than factoring, which takes about n^3/3 operations!
$$Ax = LUx = b \qquad\Longleftrightarrow\qquad Ly = b, \quad Ux = y$$
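Forward and back substitution written out (a minimal sketch; `scipy.linalg.solve_triangular` does the same job in practice):

```python
import numpy as np

def forward_sub(L, b):  # solve L y = b for unit lower-triangular L
    y = np.zeros_like(b, dtype=float)
    for i in range(len(b)):
        y[i] = b[i] - L[i, :i] @ y[:i]
    return y

def back_sub(U, y):     # solve U x = y for upper-triangular U
    n = len(y)
    x = np.zeros_like(y, dtype=float)
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - U[i, i+1:] @ x[i+1:]) / U[i, i]
    return x

L = np.array([[1.0, 0, 0], [2, 1, 0], [3, 4, 1]])
U = np.array([[2.0, 2, 2], [0, 3, 3], [0, 0, 4]])
b = np.array([6.0, 24.0, 70.0])
x = back_sub(U, forward_sub(L, b))
assert np.allclose((L @ U) @ x, b)
```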
Representing L and U
- Factorization can be done in place
- Store lij in the position it annihilates
- Overwrites what was in position (i,j) of A
- No extra memory needed
- Good for speed (caching)
$$\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} \;\longrightarrow\; \begin{pmatrix} u_{11} & u_{12} & u_{13} \\ l_{21} & u_{22} & u_{23} \\ l_{31} & l_{32} & u_{33} \end{pmatrix}$$
Existence of L and U
- Two equivalent conditions to A = L U
- A zero pivot does not appear during
row reduction by Type III operations
- Each of the leading principal submatrices
Ak is nonsingular
$$A_1 = \begin{pmatrix} a_{11} \end{pmatrix}, \qquad A_2 = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}, \qquad A_3 = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}, \quad \ldots$$
Outline of proof
- Proof (⇒): If some A_k were singular, row reduction would zero out its entire last row
  - Including the kth pivot! Contradiction.
- Proof (⇐): If the kth pivot is zero, then all entries to its left in that row are also zero, so the reduced A_k has a zero row and A_k is singular. Contradiction.
Row exchanges
- Necessary to avoid zero pivots
- In fact, necessary for numerical reasons
- After k elimination steps, $U_k = T_k T_{k-1} \cdots T_1 A$, where each $T_i$ is a unit lower-triangular elimination matrix with multipliers $\alpha_1, \ldots, \alpha_{n-i}$ below the pivot in column i
- A row exchange E can be pushed past the elimination matrices: $EU_k = ET_k T_{k-1} \cdots T_1 A = \tilde T_k \tilde T_{k-1} \cdots \tilde T_1 EA$
- Collecting all exchanges $E_p, \ldots, E_1$ next to A:
$$U = \tilde T_n \tilde T_{n-1} \cdots \tilde T_1 E_p E_{p-1} \cdots E_1 A = (\tilde T_n \tilde T_{n-1} \cdots \tilde T_1)(E_p E_{p-1} \cdots E_1) A = L^{-1} P A \;\Rightarrow\; PA = LU$$
Proof
- Must show that $ET_k = \tilde T_k E$, where $\tilde T_k$ is again an elimination matrix for column k
- Assume E exchanges rows i and j (both below row k)
- Multiplying $T_k$ by E on the left exchanges rows i and j, moving the multipliers $\alpha_{i-k}$ and $\alpha_{j-k}$ and the two diagonal 1s out of place
- Apply the same exchange to the columns: multiply on the right by E. This restores the 1s to the diagonal and leaves an elimination matrix with $\alpha_{i-k}$ and $\alpha_{j-k}$ swapped:
$$ET_k E = \tilde T_k$$
- Since EE = I, we are done: $ET_k = ET_k EE = \tilde T_k E$
Representing P, L, and U
- Augmented matrix: append the row indices as an extra column
$$\begin{pmatrix} a_{11} & a_{12} & a_{13} & 1 \\ a_{21} & a_{22} & a_{23} & 2 \\ a_{31} & a_{32} & a_{33} & 3 \end{pmatrix}$$
- The last column is affected only by row exchanges
- Proceed with standard elimination; the result is
$$\begin{pmatrix} u_{11} & u_{12} & u_{13} & p_1 \\ l_{21} & u_{22} & u_{23} & p_2 \\ l_{31} & l_{32} & u_{33} & p_3 \end{pmatrix}$$
- Don't need the matrix P: the vector of $p_i$ is enough
  - Why? How?
- L and U as before
Solving a linear system from P, L and U
- If we know P, L and U, we don’t need A
- Equivalent to solving two linear systems
- Simply permute the right hand side!
$$Ax = b \;\Rightarrow\; PAx = Pb \;\Rightarrow\; LUx = Pb \qquad\Longleftrightarrow\qquad Ly = Pb, \quad Ux = y$$
Summary of LU with row exchanges
- For every non-singular matrix A, there is a permutation matrix P such that PA = LU
- Compute L, U overwriting entries in A
- Row exchanges fix coefficients in L and U
- Augmented vector represents P
- Apply permutation to identity to recover P if curious
- Solve Ax = b as in LUx = Pb
- Compute Pb without a matrix product
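This is exactly the representation LAPACK-style routines return (a minimal sketch using SciPy, which stores L and U packed in one array plus a pivot vector instead of an explicit P):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(5)
A = rng.standard_normal((4, 4))
b = rng.standard_normal(4)

lu, piv = lu_factor(A)      # packed L\U matrix and pivot indices
x = lu_solve((lu, piv), b)  # permutes b, then forward/back substitution
assert np.allclose(A @ x, b)
```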
Proof of uniqueness of the LU decomposition
- Lemmas
- The product of two lower (upper) triangular
matrices is lower (upper) triangular
- And diagonals of 1s are preserved!
- The inverse of a lower (upper) triangular matrix is
lower (upper) triangular
- Given the lemmas:
$$L_1 U_1 = L_2 U_2 \;\Rightarrow\; U_1 = L_1^{-1} L_2 U_2 \;\Rightarrow\; U_1 U_2^{-1} = L_1^{-1} L_2$$
- The left-hand side is upper triangular and the right-hand side is lower triangular with a diagonal of 1s, so both must equal one and the same diagonal matrix D, and the unit diagonal forces D = I:
$$U_1 U_2^{-1} = D = L_1^{-1} L_2, \qquad D = I \;\Rightarrow\; U_1 = U_2 \;\text{and}\; L_1 = L_2$$
The LDU factorization
- Eliminate asymmetry in decomposition
- L has ones on the diagonal; U does not
- Matrix U can be factored further
- So A = LDU
- What if A is symmetric?
- A = LDL^T (Why? Look at uniqueness)
$$\begin{pmatrix} u_{11} & u_{12} & \cdots & u_{1n} \\ & u_{22} & \cdots & u_{2n} \\ & & \ddots & \vdots \\ & & & u_{nn} \end{pmatrix} = \begin{pmatrix} u_{11} & & & \\ & u_{22} & & \\ & & \ddots & \\ & & & u_{nn} \end{pmatrix}\begin{pmatrix} 1 & u_{12}/u_{11} & \cdots & u_{1n}/u_{11} \\ & 1 & \cdots & u_{2n}/u_{22} \\ & & \ddots & \vdots \\ & & & 1 \end{pmatrix}$$
Computing LDL^T
$$A = LDL^T \;\Rightarrow\; A_{*j} = L(DL^T)_{*j} = \sum_{k=1}^{n} L_{*k}(DL^T)_{kj} = \sum_{k=1}^{j} L_{*k}(DL^T)_{kj} = \sum_{k=1}^{j} L_{*k} D_{kk} L_{jk}$$
(the sum stops at k = j because $L^T$ is upper triangular), and since $L_{jj} = 1$,
$$A_{*j} = L_{*j} D_{jj} + \sum_{k=1}^{j-1} L_{*k} D_{kk} L_{jk}$$
which can be solved column by column:
$$D_{jj} = A_{jj} - \sum_{k=1}^{j-1} L_{jk} D_{kk} L_{jk}, \qquad L_{*j} = \Big(A_{*j} - \sum_{k=1}^{j-1} L_{*k} D_{kk} L_{jk}\Big)\Big/ D_{jj}$$
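The recurrence translates almost line by line (a minimal sketch; it assumes all pivots D_jj are non-zero):

```python
import numpy as np

def ldlt(A):
    n = A.shape[0]
    L, d = np.eye(n), np.zeros(n)
    for j in range(n):
        # subtract the contributions of the previous columns
        v = A[:, j] - L[:, :j] @ (d[:j] * L[j, :j])
        d[j] = v[j]                  # D_jj = A_jj - sum_k L_jk D_kk L_jk
        L[j+1:, j] = v[j+1:] / d[j]  # rest of column j of L
    return L, d

A = np.array([[4.0, 2, -2], [2, 5, 1], [-2, 1, 6]])  # symmetric
L, d = ldlt(A)
assert np.allclose(L @ np.diag(d) @ L.T, A)
```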
The Cholesky factorization
- A symmetric matrix A is positive definite iff it
has an LU decomposition with positive pivots
- Many useful equivalent definitions
- Behaves much like a positive scalar
- We know that A = LDL^T
- But with positive pivots, $D = \operatorname{diag}(p_1, p_2, \ldots, p_n)$ has a square root $D^{1/2} = \operatorname{diag}(\sqrt{p_1}, \sqrt{p_2}, \ldots, \sqrt{p_n})$
- So
$$A = LDL^T = (LD^{1/2})(D^{1/2}L^T) = R^T R$$
- With $R = D^{1/2}L^T$ upper triangular
Computing R^T R
$$A = R^T R \;\Rightarrow\; A_{*j} = R^T R_{*j} = \sum_{k=1}^{n} R^T_{*k} R_{kj} = \sum_{k=1}^{j} R^T_{*k} R_{kj} = R^T_{*j} R_{jj} + \sum_{k=1}^{j-1} R^T_{*k} R_{kj}$$
which gives, column by column,
$$R_{jj}^2 = A_{jj} - \sum_{k=1}^{j-1} R_{kj}^2, \qquad R^T_{*j} = \Big(A_{*j} - \sum_{k=1}^{j-1} R^T_{*k} R_{kj}\Big)\Big/ R_{jj}$$
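The corresponding code, checked against NumPy's built-in routine (a minimal sketch; note that np.linalg.cholesky returns the lower-triangular factor R^T):

```python
import numpy as np

def cholesky_upper(A):
    n = A.shape[0]
    R = np.zeros((n, n))
    for j in range(n):
        # diagonal: R_jj^2 = A_jj - sum_{k<j} R_kj^2
        R[j, j] = np.sqrt(A[j, j] - R[:j, j] @ R[:j, j])
        # rest of row j: (A_ji - sum_{k<j} R_kj R_ki) / R_jj
        R[j, j+1:] = (A[j, j+1:] - R[:j, j] @ R[:j, j+1:]) / R[j, j]
    return R

A = np.array([[4.0, 2, -2], [2, 5, 1], [-2, 1, 6]])  # positive definite
R = cholesky_upper(A)
assert np.allclose(R.T @ R, A)
assert np.allclose(R, np.linalg.cholesky(A).T)
```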