Álgebra Linear e Aplicações (Linear Algebra and Applications)
VECTOR SPACES
Avoid rediscovering the wheel
- Many mathematical objects that seem to have
nothing in common with matrices do in fact share very similar properties
- Points in plane, in 3-space, polynomials,
continuous functions, differentiable functions, etc
- All possess addition and scalar multiplication
- Rather than study these objects separately,
develop a general theory that applies to all
- Vector Spaces describe objects of this class
Definition of Vector Space #1
- Composed of 4 items
- A non-empty set V of vectors
- For most of this course, n-tuples or matrices
- A scalar field F
- We will use R or C
- Vector addition (x + y) from V × V to V
- Scalar multiplication (α x) from F × V to V
Definition of Vector Space #2
- Required properties of vector addition
- (A1) Closure
- x + y ∈ V , ∀x, y
- (A2) Associative
- (x + y) + z = x + (y + z)
- (A3) Commutative
- x + y = y + x
- (A4) Neutral element
- ∃0 ∈ V, x + 0 = x, ∀x
- (A5) Additive inverse
- ∀x, ∃(−x) ∈ V, x + (−x) = 0
Definition of Vector Space #3
- Required properties of scalar multiplication
- (M1) Closure
- α x ∈ V , ∀α, x
- (M2) Associative
- (αβ)x = α(βx), ∀α, β, x
- (M3) Distributive #1
- (α+β)x = αx+βx, ∀α, β, x
- (M4) Distributive #2
- α(x+y) = αx+αy, ∀α, x, y
- (M5) Neutral element
- 1x = x, ∀x
Examples of Vector Space
- Rm×n over R, and Cm×n over C
- Follows directly from our definitions of matrix
addition and scalar multiplication
- Real coordinate spaces R1×n or Rn×1
- Special case of above
- Will denote as Rn and distinguish only if needed
- Focus of this course!
Other examples of Vector Space
- With function addition and scalar multiplication
defined pointwise
- The following are vector spaces over R
- Set of functions mapping [0,1] into R
- Set of all real-valued continuous functions on [0,1]
- Set of real-valued functions differentiable on [0,1]
- Set of all polynomials with real coefficients
(f + g)(x) = f(x) + g(x) (αf)(x) = αf(x)
Subspace
- Let S be a non-empty subset of a
vector space V over F
- S is said to be a subspace of V if it is also
a vector space
- Only need to check the closure properties of
addition and scalar multiplication
- If closure is respected, the other properties
are inherited from V
Proof
- Only (A1), (A4), (A5), and (M1) are non-trivial
- But (A1) and (M1) together give (A4) and (A5)
- (M1) gives –x = (-1) x which implies (A5)
- Since x and –x are both in V, (A1) implies (A4)
Examples of subspaces (or not)
- The trivial subspace Z of V contains only one
element, the zero vector 0
- Does it satisfy closure?
- Every subspace contains the zero vector!
- The first quadrant in R2
- Does it satisfy closure?
- What about lower-triangular matrices?
- What about symmetric matrices?
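The closure test can also be spot-checked numerically. The sketch below (a hypothetical helper, not from the slides; random sampling gives evidence, not a proof) checks that random linear combinations of symmetric matrices stay symmetric, lower-triangular matrices stay lower-triangular, and that the first quadrant of R2 fails closure:

```python
import numpy as np

def is_closed_under_ops(vectors, predicate, trials=100, seed=0):
    """Spot-check closure: random linear combinations of sample
    vectors should still satisfy the subspace predicate."""
    rng = np.random.default_rng(seed)
    for _ in range(trials):
        a, b = rng.standard_normal(2)
        u, v = rng.choice(len(vectors), 2)
        if not predicate(a * vectors[u] + b * vectors[v]):
            return False
    return True

sym = [np.array([[1., 2.], [2., 3.]]), np.array([[0., -1.], [-1., 5.]])]
low = [np.tril(np.arange(1., 5.).reshape(2, 2)), np.tril(np.ones((2, 2)))]
quad = [np.array([1., 2.]), np.array([3., 1.])]  # first quadrant of R2

sym_closed = is_closed_under_ops(sym, lambda M: np.allclose(M, M.T))
low_closed = is_closed_under_ops(low, lambda M: np.allclose(M, np.tril(M)))
quad_closed = is_closed_under_ops(quad, lambda v: np.all(v >= 0))
```

Symmetric and lower-triangular matrices pass; the first quadrant fails as soon as a negative scalar appears.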
Subspaces look “flat”
- Can’t be curved
- Think “flat” line,
surface, etc
- Through the origin
[Figure: vectors u, v, u + v, and αv illustrating that subspaces are "flat" and pass through the origin]
Spanning sets #1
- Take a set of vectors
from a vector space V over F
- Consider all possible linear combinations of vi
- Then span(S) is a subspace of V
- Show closure properties
- All in span(S)
S = {v1, v2, . . . , vr}
span(S) = {α1v1 + α2v2 + · · · + αrvr | αi ∈ F}
x = Σi ξivi, y = Σi ηivi
x + y = Σi (ξi + ηi)vi
βx = Σi (βξi)vi
Spanning sets #2
- Take a set of vectors
from a vector space V over F
- Consider all possible linear combinations of vi
- We call span(S) the space spanned by S
- If U is a vector space such that U = span(S),
we say S is a spanning set for U
- In other words, S spans U
S = {v1, v2, . . . , vr}
span(S) = {α1v1 + α2v2 + · · · + αrvr | αi ∈ F}
Examples of spanning sets
- S = {(1 1), (2 2)} spans the line x = y in R2
- The set S = {ei | i ∈ {1, . . . , n}} spans Rn
- The set S = {1, x, x2, . . . , xn} spans the set
of all polynomials of degree n or less
Important exercise (4.1.7)
- Take a subset S with n vectors from V,
a subspace of Rm×1
- Form a matrix A in Rm×n in which each
column is a vector of S
- Show that S spans V iff for each b in V there is
at least one x in Rn×1 such that Ax = b
- In other words, iff Ax = b is a consistent linear
system for each b in V
Leads to important test
- How can you tell if a subset of vectors, say, in
R3, spans the whole of R3?
- S = {(1 1 1), (1 –1 –1), (3 1 1)}
- Place these rows as columns in a matrix
- Run elimination to find the rank
- If rank is 3, matrix is invertible, system is
consistent for any right-hand side
- If less than 3, certainly does not span R3. Why?
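The rank test above is easy to run in practice. A small sketch (function name is my own) using `numpy.linalg.matrix_rank`, applied to the set S from the slide:

```python
import numpy as np

def spans_R3(vectors):
    """S spans R^3 iff the matrix with these vectors as columns
    has rank 3 (then Ax = b is consistent for every b)."""
    A = np.column_stack(vectors)
    return np.linalg.matrix_rank(A) == 3

S = [np.array([1., 1., 1.]), np.array([1., -1., -1.]), np.array([3., 1., 1.])]
# (3 1 1) = 2(1 1 1) + (1 -1 -1), so the rank is 2 and S does not span R^3
print(spans_R3(S))
```

Note that this particular S is dependent: the third vector is a combination of the first two, so the test reports that S does not span R3.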
Sum of subspaces
- Let X and Y be subspaces of a vector space V
- Define the sum X + Y as the set of all
possible sums of a vector of X and a vector of Y
- The sum X + Y is also a subspace of V
- Check closure properties for vectors in X + Y
- If SX, SY span X, Y then SX ∪ SY spans X + Y
- Write any vector in X + Y as a linear combination
of vectors from SX ∪ SY
X + Y = {x + y | x ∈ X, y ∈ Y}
FOUR FUNDAMENTAL SUBSPACES
Subspaces and Linear Functions #1
- Rules (A1) and (M1) remind us of linearity
- Let f : Rn → Rm be a linear function and
let R(f) denote the range of f
R(f) = {f(x) | x ∈ Rn} ⊆ Rm
- The range of any linear function f : Rn → Rm
is a subspace of Rm
- Proof: check (A1) and (M1)
- For y1 = f(x1) and y2 = f(x2),
αy1 + y2 = αf(x1) + f(x2) = f(αx1 + x2) ∈ R(f)
Range spaces #1
- The range of a matrix A in Rm×n is the
subspace R(A) of Rm corresponding to the range of f(x) = Ax
R(A) = {Ax | x ∈ Rn} ⊆ Rm
- The range of AT is the corresponding subspace of Rn
R(AT) = {ATy | y ∈ Rm} ⊆ Rn
Range spaces #2
- R(A) is also known as column-space of A
- R(AT ) is also known as row-space of A
Ax = Σi ξi[A]∗i for x = (ξ1 ξ2 · · · ξn)T
(ATy)T = yTA = Σj ηj[A]j∗ for y = (η1 η2 · · · ηm)T
b ∈ R(A) ⇔ b = Ax for some x
a ∈ R(AT) ⇔ aT = yTA for some y
Equal Ranges
- For two matrices A and B of the same shape
- Proof
- Proof
- Similarly,
R(A) = R(B) ⇔ A ∼col B
R(AT) = R(BT) ⇔ A ∼row B
- Proof (⇒)
- Each column of A lies in R(B): taking x = ei,
[A]∗i = Ax = Byi ⇒ A = B[y1 y2 · · · yn] = BP
- Proof (⇐)
A = BP ⇒ b = Ax = BPx ⇒ b = By with y = Px
- But is P invertible?
Testing spanning sets
- When do two sets of vectors in Rn span the
same subspace?
- Set them as rows of matrices A and B
- Run Gauss-Jordan to compute EA and EB
- Look at non-zero rows of EA and EB
- They must agree!
Example
A = [1 2 2 3; 2 4 1 3; 3 6 1 4]
B = [0 0 1 1; 1 2 3 4]
EA = [1 2 0 1; 0 0 1 1; 0 0 0 0]
EB = [1 2 0 1; 0 0 1 1]
The non-zero rows of EA and EB agree, so the rows of A and B span the same subspace.
Spanning the row and column spaces
- Let U be any row echelon form for matrix A
- Spanning sets for A and AT are as follows
- Nonzero rows of U span R(AT )
- Because U ∼row A
- Basic columns in A span R(A)
- Non-basic columns are combinations of basic columns
- They are redundant
- If you need one, use the corresponding linear
combination of basic columns instead
Where are the other two spaces?
- Consider a general linear function f from
Rn to Rm and focus on
N(f) = {x | f(x) = 0}
- From linearity, this is certainly a subspace
- Check (A1) and (M1)
- Given a matrix A, there are two linear
functions to consider
f(x) = Ax g(y) = AT y
Nullspace
- For an m × n matrix A, the set
N(A) = {x | Ax = 0} ⊆ Rn
is the nullspace of A
- I.e., the set of solutions of the homogeneous
linear system Ax = 0
- The set N(AT ) ⊆ Rm is the left nullspace of A,
because it is the set of solutions of yT A = 0T
Example
- Find a spanning set for N(A) where
A = [1 2 3; 2 4 6]
- Simply the general solution to Ax = 0
EA = [1 2 3; 0 0 0] ⇒ x1 = −2x2 − 3x3, so
x = x2 (−2, 1, 0)T + x3 (−3, 0, 1)T
Spanning the Nullspace
- To find the Nullspace of an m×n matrix A,
- Find an echelon form U for A and
- Find the general solution to Ux = 0
xh = xf1h1 + xf2h2 + · · · + xfn−rhn−r
- Then the set H = {h1, h2, . . . , hn−r} spans N(A)
- In particular
- N(A) = {0} iff rank(A) = n
- N(AT ) = {0} iff rank(A) = m
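This computation is what `sympy`'s `Matrix.nullspace` does: it reduces A to echelon form and reads one spanning vector off each free variable. A quick check (the matrix is one used in later examples of this deck):

```python
import sympy as sp

A = sp.Matrix([[1, 2, 2, 3],
               [2, 4, 1, 3],
               [3, 6, 1, 4]])

H = A.nullspace()   # one basis vector h_i per free variable
r = A.rank()        # here r = 2, so N(A) has dimension n - r = 2
```

Each returned vector satisfies Ah = 0, and there are exactly n − r of them.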
Computing N(AT )
- The obvious idea is to run elimination until we
reach an echelon form for AT
- Echelon forms of A and AT are not the same
- Want to obtain all spaces with a single reduction
- Let PA = U, for U in echelon form
- Assume rank(A) = r
- Then last n – r rows of P span N(AT )
Proof
- Let U = (Ur; 0), P = (P1; P2), and P−1 = (Q1 Q2)
- Proof (⇒): rows of P2 lie in N(AT )
PA = (P1A; P2A) = (Ur; 0) = U ⇒ P2A = 0
- Proof (⇐): any y in N(AT ) is a combination of rows of P2
P−1P = I ⇒ Q1P1 + Q2P2 = I ⇒ Q1P1 = I − Q2P2
yT A = 0 ⇒ yT P−1U = 0 ⇒ yT (Q1 Q2)(Ur; 0) = 0 ⇒ yT Q1Ur = 0 ⇒ yT Q1 = 0
⇒ yT Q1P1 = 0 ⇒ yT (I − Q2P2) = 0 ⇒ yT = (yT Q2)P2
Example
- Using Gauss-Jordan
- Find EA
- So
- From which
Row reduce (A | I) for A = [1 2 2 3; 2 4 1 3; 3 6 1 4]:
P = [−1/3 2/3 0; 2/3 −1/3 0; 1/3 −5/3 1], so that PA = EA
N(AT ) = span{(1/3, −5/3, 1)T}
Additional insights
- We have shown that N(AT ) = R(P2T )
- It turns out that R(A) = N(P2)
- Proof
- R(A) ⊆ N(P2): y = Ax ⇒ P2y = P2Ax = (P2A)x = 0x = 0
- R(A) ⊇ N(P2): if P2y = 0 then
Py = (P1y; P2y) = (P1y; 0), so P(A | y) = (PA | Py) = (Ur P1y; 0 0)
is a consistent system and ∃x | Ax = y
Equal Nullspaces #1
- We already know how to test for equality in
range spaces: row and column equivalence
- How do we test for Nullspace equality?
- Use equivalence again
- For two matrices A and B of the same shape
N(A) = N(B) ⇔ A ∼row B
- Similarly,
N(AT ) = N(BT ) ⇔ A ∼col B
Equal Nullspaces #2
- Let’s prove one of them
- Proof (⇒)
N(AT ) = N(BT ) means yT A = 0 ⇔ yT B = 0
- With PA = U and P2A = 0 we also get P2B = 0, so
(A | Bx2) → (PA | PBx2) = (P1A P1Bx2; P2A P2Bx2) = (P1A P1Bx2; 0 0)
- Each system Az = Bx2 is consistent, i.e., z = Ax1 ⇔ z = Bx2
N(AT ) = N(BT ) ⇒ R(A) = R(B) ⇔ A ∼col B
Equal Nullspaces #3
- Let’s prove one of them
- Proof (⇐)
- A ∼col B means A = BQ with Q nonsingular
P2A = 0 ⇒ P2BQ = 0 ⇒ P2B = 0 ⇒ N(AT ) ⊆ N(BT )
- Conversely, N(BT ) ⊆ N(AT )
(replace A and B in the proof)
Summary #1
- The four fundamental subspaces associated to
a matrix Am×n are
- The range or column space
- The row-space or left-hand range
- The nullspace
- The left-hand nullspace
R(A) = {Ax | x ∈ Rn} ⊆ Rm
R(AT ) = {AT y | y ∈ Rm} ⊆ Rn
N(A) = {x | Ax = 0} ⊆ Rn
N(AT ) = {y | yT A = 0T} ⊆ Rm
Summary #2
- Let P be a nonsingular matrix such that
PA = U, where U is in echelon form and let rank(A) = r
- Spanning sets for
- R(A): Basic columns of A
- R(AT): Non-zero rows in U (transposed)
- N(A): The hi in the general solution of Ax = 0
- N(AT): The last m – r rows in P (transposed)
Summary #3
- If A and B are matrices of the same shape
A ∼col B ⇔ R(A) = R(B) ⇔ N(AT ) = N(BT )
A ∼row B ⇔ N(A) = N(B) ⇔ R(AT ) = R(BT )
LINEAR INDEPENDENCE, BASIS, AND DIMENSION
Linear independence
- Matrix dimensions give an incomplete picture
of the true size of a linear system
- The important number is the rank
- Number of pivots
- Number of non-zero rows in echelon form
- Better interpretation
- Number of genuinely independent rows in matrix
- Other rows are redundant
Formally
- Take a set of vectors
- Look at linear combinations
- Vectors vi are linearly independent (l.i.) iff the only
linear combination that produces 0 is trivial
- Otherwise they are linearly dependent (l.d.)
- One of them is a linear combination of the others
S = {v1, v2, . . . , vr}
α1v1 + α2v2 + · · · + αrvr = 0 ⇒ αi = 0 for all i
Easy to visualize in R3
- 2 vectors are dependent if they lie on a common line through the origin
- 3 vectors are dependent if they lie on a common plane through the origin
- Or line
- 4 vectors are always dependent
- 3 random vectors should be independent
Example
- Determine if the set of
vectors is l.i.
- Look for a non-trivial
solution to
- I.e., non-trivial solution
to the homogeneous linear system
- From Gauss-Jordan
- So, they are l.d. and e.g.
S = {(1, 2, 1)T, (1, 0, 2)T, (5, 6, 7)T}
α1 (1, 2, 1)T + α2 (1, 0, 2)T + α3 (5, 6, 7)T = 0, i.e.,
[1 1 5; 2 0 6; 1 2 7] (α1, α2, α3)T = 0
EA = [1 0 3; 0 1 2; 0 0 0]
α1 = −3, α2 = −2, α3 = 1
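The claimed dependence is easy to verify directly. A quick check (note: the middle entry of the second vector was garbled in my copy of the slides; 0 is assumed here, consistent with the coefficients found by Gauss-Jordan):

```python
import numpy as np

v1 = np.array([1., 2., 1.])
v2 = np.array([1., 0., 2.])  # middle entry assumed 0 (garbled in source)
v3 = np.array([5., 6., 7.])

# Non-trivial combination from Gauss-Jordan: (alpha1, alpha2, alpha3) = (-3, -2, 1)
combo = -3 * v1 - 2 * v2 + 1 * v3  # should be the zero vector
```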
Linear independence and Matrices
- Let A be an m × n matrix.
- These are equivalent to saying the columns of A
form a linearly independent set
- N(A) = {0} rank(A) = n
- These are equivalent to saying the rows of A form
a linearly independent set
- N(AT ) = {0} rank(A) = m
- If A is square, these are equivalent to saying
matrix A is non-singular
- Columns of A form a linearly independent set
- Rows of A form a linearly independent set
Diagonal dominance
- An n × n matrix A = [aij] is diagonally
dominant whenever
- I.e., diagonal elements are larger in magnitude than
the sum of magnitudes of other row elements
- These matrices appear frequently in practical
applications. Two important properties are
- They are never singular
- Don’t need to use partial pivoting
|akk| > Σj≠k |akj|, k ∈ {1, 2, . . . , n}
Diagonal dominance
- Diagonally dominant matrices are non-singular
- Proof by contradiction
- Assume there is a non-zero vector in N(A)
- Find a contradiction
- Let Ax = 0, and let xk be the entry of largest
magnitude in x
[Ax]k = 0 = Σj akjxj ⇒ akkxk = −Σj≠k akjxj
⇒ |akk||xk| = |Σj≠k akjxj| ≤ Σj≠k |akj||xj| ≤ |xk| Σj≠k |akj|
⇒ |akk| ≤ Σj≠k |akj|, contradicting diagonal dominance
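The dominance condition is a one-line check in code. A small sketch (the helper name and example matrix are my own) that tests the condition and confirms the guaranteed non-singularity via the determinant:

```python
import numpy as np

def is_diagonally_dominant(A):
    """Strict row diagonal dominance: |a_kk| > sum_{j != k} |a_kj|."""
    A = np.asarray(A, dtype=float)
    off = np.sum(np.abs(A), axis=1) - np.abs(np.diag(A))
    return bool(np.all(np.abs(np.diag(A)) > off))

A = np.array([[4., 1., -1.],
              [1., 5., 2.],
              [0., -2., 6.]])

dominant = is_diagonally_dominant(A)
det = np.linalg.det(A)  # nonzero, as the theorem guarantees
```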
Polynomial interpolation
- Given a set of m points S = {(x1, y1), . . . , (xm, ym)}
where the xi are distinct, there is a unique polynomial
p(t) = α0 + α1t + · · · + αm−1t^(m−1)
of degree m−1 (or less) that goes through each point in S
α0 + α1x1 + α2x1² + · · · + αm−1x1^(m−1) = p(x1) = y1
α0 + α1x2 + α2x2² + · · · + αm−1x2^(m−1) = p(x2) = y2
. . .
α0 + α1xm + α2xm² + · · · + αm−1xm^(m−1) = p(xm) = ym
Polynomial interpolation
- Same as saying the following system has a
unique solution for any right-hand side yi
- The matrix is non-singular whenever the xi are distinct
- Such matrices are called Vandermonde Matrices
[1 x1 x1² · · · x1^(m−1); 1 x2 x2² · · · x2^(m−1); . . . ; 1 xm xm² · · · xm^(m−1)] (α0, α1, . . . , αm−1)T = (y1, y2, . . . , ym)T
Vandermonde Matrices
- An m × n Vandermonde matrix has independent
columns whenever the xi are distinct and n ≤ m
- Proof
- Suppose a combination of the columns vanishes; then for every i
p(xi) = α0 + α1xi + α2xi² + · · · + αn−1xi^(n−1) = 0
- So p(x) has m distinct roots but degree at most n − 1
- By the fundamental theorem of algebra, a nonzero polynomial
of degree at most n − 1 has at most n − 1 < m roots,
so p must be the zero polynomial: all αj = 0
Lagrange interpolator
- In particular, when n = m the system
[1 x1 · · · x1^(m−1); . . . ; 1 xm · · · xm^(m−1)] (α0, . . . , αm−1)T = (y1, . . . , ym)T
has a unique solution
- The solution is the Lagrange interpolator
ℓ(t) = Σi=1..m yi ( Πj≠i (t − xj) / Πj≠i (xi − xj) )
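The Lagrange formula translates directly into code. A sketch (the function name is my own) that builds the interpolator as a closure and checks it on three points lying on 1 + t + t²:

```python
import numpy as np

def lagrange(xs, ys):
    """Return l(t) = sum_i y_i * prod_{j!=i}(t - x_j) / prod_{j!=i}(x_i - x_j)."""
    xs, ys = np.asarray(xs, float), np.asarray(ys, float)
    def l(t):
        total = 0.0
        for i in range(len(xs)):
            num = np.prod([t - xs[j] for j in range(len(xs)) if j != i])
            den = np.prod([xs[i] - xs[j] for j in range(len(xs)) if j != i])
            total += ys[i] * num / den
        return total
    return l

p = lagrange([0., 1., 2.], [1., 3., 7.])  # the unique parabola is 1 + t + t^2
```

Since the degree-2 interpolant through three points is unique, p agrees with 1 + t + t² everywhere, e.g. p(3) = 13.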
Example of interpolation
2 4 6 8 x 2 4 6 8 H L 2 4 6 8 x 5 10 15 20 25 H L 2 4 6 8 x 6
- 4
- 2
2 4 6 P4HxL 2 4 6 8 x 6
- 4
- 2
2 4 6 P5HxL
Maximal independent subsets #1
- We know that if rank(Am×n) < n then the
columns of A must be a dependent set
- In such cases, we often want to extract the
maximal independent subset of columns
- An l.i. set with as many columns of A as possible
- Such columns are sufficient to span R(A)
Maximal independent subsets #2
- If rank(Am×n) = r, then the following hold
- Any maximal independent subset of the columns
of A contains exactly r columns
- Any maximal independent subset of rows from A
contains exactly r rows
- In particular, the r basic columns in A constitute a
maximal independent subset of the columns of A
Maximal independent subsets #3
- Any maximal independent subset of the
columns of A contains exactly r columns
- Proof
- Every column of matrix A can be written as a
linear combination of the basic columns in A
- Pick k > r columns of A and show they are l.d.
- Each chosen column is a combination of the basic columns:
(A∗s1 A∗s2 · · · A∗sk) = (A∗b1 A∗b2 · · · A∗br) B, with B = [βij] an r × k matrix
- Since k > r, there is α = (α1, . . . , αk)T ≠ 0 with Bα = 0
- Then (A∗s1 · · · A∗sk) α = (A∗b1 · · · A∗br) Bα = 0, a non-trivial dependence
Basic facts about Independence
- The following hold for a set S = {u1, u2, . . . , un} of vectors in V
- If S contains an l.d. subset, then S itself must be l.d.
- If S is l.i., then every subset of S is also l.i.
- If S is l.i. and v ∈ V, then S ∪ {v} is l.i. iff v ∉ span(S)
- If S ⊆ Rm and n > m, then S is l.d.
- Proof (⇐)
α1u1 + α2u2 + · · · + αnun + αn+1v = 0 ⇒ αn+1 = 0
(otherwise v ∈ span(S))
⇒ α1u1 + α2u2 + · · · + αnun = 0 ⇒ α1 = α2 = · · · = αn = 0
BASIS AND DIMENSION
Bases
- A basis for a vector space V is a set S that
- Spans V
- Is linearly independent
- Spanning sets can contain redundant vectors
- Bases, on the other hand, contain only
necessary and sufficient information
- Every vector space V has a basis
- General proof depends on the axiom of choice
- Bases are not unique
Examples
- The unit vectors S = {e1, e2, . . . , en} are a
basis for Rn: the standard or canonic basis of Rn
- If A is an n × n non-singular matrix, then the set
of rows and the set of columns of A each
constitute a basis of Rn
- What about Z = {0}?
- The set S = {1, x, x2, . . . , xn} is a basis for all
polynomials of degree n or less
- What about the vector space of all polynomials?
Characterizations of a Basis
- With V a subspace of Rm and
the following are equivalent
- (1) B is a basis for V
- (2) B is a minimal spanning set for V
- (3) B is a maximal l.i. subset of V
B = {b1, b2, . . . , bn}
Proof #1
- (basis) ⇒ (minimal spanning set)
- Assume B is a basis and X = {x1, . . . , xk} is a smaller spanning set, k < n
(b1 b2 · · · bn) = (x1 x2 · · · xk) A, i.e., B = XA with A a k × n matrix
rank(A) ≤ k < n ⇒ ∃y ≠ 0 | Ay = 0 ⇒ By = XAy = 0 ⇒ B is l.d. (contradiction)
- (minimal spanning set) ⇒ (basis)
- A minimal spanning set must be l.i.
- Otherwise remove a dependent vector and reduce size
- So it wasn’t minimal (contradiction)
Proof #2
- (maximal l.i. set) ⇒ (basis)
- If a maximal l.i. set B of V is not a basis for V,
then there is v ∈ V with v ∉ span(B)
- So B ∪ {v} is l.i. and B is not maximal (contradiction)
- (basis) ⇒ (maximal l.i. set)
- If basis B is not maximal l.i., then take a
larger set Y that is maximal l.i.
- We know Y is a basis, hence a minimal spanning set
- But B is a spanning set smaller than Y (contradiction)
- So B must also be maximal
Dimension
- We have just proven that, although there are
many bases for V, each of them has the same number of vectors
- The dimension of a space V, dim V, is the
number of vectors in
- Any basis of V
- Any minimal spanning set for V
- Any maximal independent set for V
Examples
- If Z = {0}, then dim Z = 0
- The basis is the empty set
- L is a line through the origin in R3, dim L = 1
- Any non-zero vector along L forms a basis for L
- P is a plane through the origin in R3, dim P = 2
- How would we find a basis?
- dim Rn = n
- The canonic vectors form a basis
Further insights
- Dimension measures the “amount of stuff”
in a subspace
- Point < Line < Plane < R3
- Also measures the number of degrees of
freedom in the subspace
- Z: no freedom, Line: 1 degree, Plane: 2 etc
- Do not confuse with number of components
in a vector! Related, but not equal!
Subspace dimension
- Let M and N be subspaces of a vector space, with M ⊆ N
- (1) dim M ≤ dim N
- (2) If dim M = dim N, then M = N
- Proof
- (1) Assume dim M > dim N
- A basis of M (an l.i. subset of N) would have more
vectors than a maximal independent set of N
- (2) Assume M ⊂ N strictly
- Augment a basis of M with some v ∈ N \ M
- Independent set with more than dim N vectors!
Four Fundamental Subspaces: Dimension
- For an m × n matrix A with rank(A) = r
- dim R(A) = r
- dim N(A) = n – r
- dim R(AT) = r
- dim N(AT) = m – r
Rank Plus Nullity Theorem
- For all m × n matrices A
dim R(A) + dim N(A) = n
- As the “amount of stuff” in R(A) grows,
the “amount of stuff” in N(A) shrinks
- (dim N(A) was traditionally known as nullity)
Completing a Basis
- If Sr = {v1, v2, . . . , vr} is an l.i. subset of an
n-dimensional space V, where r < n, show how to extend Sr with {vr+1, . . . , vn} so that Sn = {v1, . . . , vr, vr+1, . . . , vn} forms a basis for V
- Solution
- Create a matrix A with Sr as columns
- Augment A with the identity matrix to form (A|I)
- Reduce (A|I) to echelon form to find the basic columns
- Return the n basic columns of (A|I)
Example
- Take two l.i. vectors in R4 and augment to a
complete basis for R4
- Solution
S2 = {(1, −1, 2, 0)T, (1, 0, −2, 0)T}
(A|I) = [1 1 1 0 0 0; −1 0 0 1 0 0; 2 −2 0 0 1 0; 0 0 0 0 0 1]
E(A|I) = [1 0 0 −1 0 0; 0 1 0 −1 −1/2 0; 0 0 1 2 1/2 0; 0 0 0 0 0 1]
The basic columns are 1, 2, 3, and 6, so
S4 = {(1, −1, 2, 0)T, (1, 0, −2, 0)T, e1, e4}
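The (A|I) recipe can be automated with `sympy`'s `rref`, which reports the pivot (basic) columns. A sketch (the helper name is my own; the vectors here match my reconstruction of the garbled example):

```python
import sympy as sp

def complete_basis(vectors):
    """Extend an l.i. set in R^n to a basis: put the vectors as
    columns, augment with I, keep the n pivot columns."""
    n = len(vectors[0])
    A = sp.Matrix([list(row) for row in zip(*vectors)])  # vectors as columns
    aug = A.row_join(sp.eye(n))
    _, pivots = aug.rref()                               # pivot column indices
    return [aug[:, j] for j in pivots]

basis = complete_basis([(1, -1, 2, 0), (1, 0, -2, 0)])
# basic columns turn out to be the two given vectors plus e1 and e4
```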
Graphs
- A graph G is defined by a pair (V, E), where
V is a set of vertices and E a set of edges
- Each edge connects two vertices
- So E ⊆ V × V
e1 = (v2, v1), e2 = (v1, v4), . . .
[Figure: example graph with vertices v1, . . . , v4 and edges e1, . . . , e6]
Incidence Matrices
- For a graph G with m vertices and n edges
- Associate an m × n matrix E such that
[E]ij = 1 if ej = (∗, vi); −1 if ej = (vi, ∗); 0 otherwise
[Example: incidence matrix E of the graph above, one row per vertex v1, . . . , v4 and one column per edge e1, . . . , e6]
Rank and Connectivity
- Each edge is associated to two vertices
- Each column contains two nonzero entries (a 1 and a −1)
- All columns add up to zero
- In other words, if eT = (1 1 · · · 1),
then eT E = 0 and therefore e ∈ N(ET )
- So rank(E) = rank(ET ) = m − dim N(ET ) ≤ m − 1
- Equality holds iff the graph is connected!
- I.e., when there is a sequence of edges connecting
any pair of vertices
Proof of Rank and Connectivity #1
- Proof
- Assume G is connected, prove dim N(ET) = 1
- I.e., prove e = (1 1 … 1)T spans N(ET)
- Let x ∈ N(ET ) and take any two entries xi and xk of x
- There is a path {vj1 = vi, vj2, . . . , vjr = vk} from vi to vk
- Take the subset of vertices visited along the way
- For each p, there is an edge q linking vjp and vjp+1
Proof of Rank and Connectivity #2
- There is an edge q linking vjp and vjp+1
- So column q of E is −1 at row jp and 1 at row jp+1 (or vice versa)
- But xT E = 0, so xT E∗q = 0 = xjp+1 − xjp
- Since this is true for all p, it turns out xi = xk
- But i and k were arbitrary
- And so finally we reach x = αe
- So dim N(ET ) = 1
- Which leads to rank(E) = m − 1
Proof of Rank and Connectivity #3
- Proof (⇐)
- If the graph is not connected, we can partition it
into two disconnected subgraphs G1 and G2
- Reorder vertices so vertices/edges of G1 appear
before vertices/edges of G2 in E
E = [E1 0; 0 E2]
- Now compute the rank
rank(E) = rank(E1) + rank(E2) ≤ (m1 − 1) + (m2 − 1) = m − 2 < m − 1
Application of Rank and Connectivity
- Nodes (current equations)
1: I1 − I2 − I5 = 0
2: −I1 − I3 + I4 = 0
3: I3 + I5 + I6 = 0
4: I2 − I4 − I6 = 0
- Loops (voltage equations)
A: I1R1 − I3R3 + I5R5 = E1 − E3
B: I2R2 − I5R5 + I6R6 = E2
C: I3R3 + I4R4 − I6R6 = E3 + E4
Rank of a product
- Equivalent matrices have the same rank
- Recall the rank normal form
- Multiplication by invertible matrices preserves rank
- Multiplication by rectangular or singular
matrices can reduce the rank
- If A is m × n and B is n × p then
rank(AB) = rank(B) − dim(N(A) ∩ R(B))
Proof #1
- Start with a basis S = {x1, x2, . . . , xs} for N(A) ∩ R(B)
- Augment it to form a basis Sext = {x1, . . . , xs, z1, . . . , zt} for R(B)
- Let us prove that dim R(AB) = t, so that
rank(AB) = rank(B) − dim(N(A) ∩ R(B))
- Sufficient to prove that T = {Az1, . . . , Azt} is a basis for R(AB)
Proof #2
- T spans R(AB)
b ∈ R(AB) ⇒ b = ABy; By ∈ R(B) ⇒ By = Σ ξixi + Σ ηizi
b = A(Σ ξixi) + A(Σ ηizi) = Σ ηiAzi
- T is l.i.
Σ αiAzi = 0 ⇒ A(Σ αizi) = 0 ⇒ Σ αizi ∈ N(A) ∩ R(B)
⇒ Σ αizi = Σ βixi ⇒ Σ αizi − Σ βixi = 0 ⇒ αi = βi = 0
(since Sext is l.i.)
Small perturbations can’t reduce rank
- We already know that we can’t increase rank
by means of matrix product
- We now show it is impossible to reduce rank
by adding a matrix that is “small enough”
- “Small” in a sense that will be clarified later,
but for now here is some intuition
rank(AB) ≤ rank(B)
rank(A + E) ≥ rank(A)
Proof
- Suppose rank(A) = r and let P and Q reduce
A to rank normal form
PAQ = [Ir 0; 0 0]
- Apply P and Q to A + E
PEQ = [E11 E12; E21 E22] ⇒ P(A + E)Q = [Ir + E11, E12; E21, E22]
- But Ir + E11 is invertible (for small E). Keep eliminating
P2P(A + E)QQ2 = [Ir + E11, 0; 0, S]
- From which
rank(A + E) = r + rank(S) ≥ rank(A)
Pitfall solving singular systems
- Due to floating-point precision,
we do not really solve Ax = b
- We solve some perturbed system (A+E)x = b
- If A is non-singular, so is A+E and we are fine
- If A is singular, A+E may have higher rank!
- All we need is rank(S) > 0, where
S = E22 − E21(Ir + E11)−1E12
- So fewer free variables than the actual system
- Significant loss of information
Products ATA and AAT
- For A in Rm×n, the following statements hold
- rank(ATA) = rank(A) = rank(AAT )
- R(ATA) = R(AT )
and R(AAT ) = R(A)
- N(ATA) = N(A)
and N(AAT ) = N(AT )
- For A in Cm×n, replace transposition by
conjugate transpose operation
Proof #1
- rank(ATA) = rank(A)
- We know that rank(AT A) = rank(A) − dim(N(AT ) ∩ R(A))
- So prove N(AT ) ∩ R(A) = {0}
x ∈ N(AT ) ∩ R(A) ⇒ AT x = 0 and x = Ay ⇒ AT Ay = 0
⇒ yT AT Ay = 0 ⇒ xT x = 0 ⇒ Σ xi² = 0 ⇒ x = 0
Proof #2
- R(ATA) = R(AT )
R(BC) ⊆ R(B) ⇒ R(AT A) ⊆ R(AT )
dim R(AT A) = rank(AT A) = rank(A) = rank(AT ) = dim R(AT )
- N(ATA) = N(A)
N(B) ⊆ N(CB) ⇒ N(A) ⊆ N(AT A)
dim N(A) = n − rank(A) = n − rank(AT A) = dim N(AT A)
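These rank identities are easy to observe numerically. A quick sketch (the random matrix here is my own example) building a rank-deficient A and comparing the ranks of A, ATA, and AAT:

```python
import numpy as np

rng = np.random.default_rng(1)
# 5x4 matrix factored through R^3, so rank(A) <= 3
A = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 4))

r = np.linalg.matrix_rank(A)
r_gram = np.linalg.matrix_rank(A.T @ A)    # equals rank(A)
r_gram2 = np.linalg.matrix_rank(A @ A.T)   # also equals rank(A)
```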
Application for ATA
- Consider an m×n system Ax = b that may or
may not be consistent
- Multiply on the left by AT to reach
ATAx = AT b
- This is known as the associated system of
normal equations
- It has many nice properties
Application for ATA
- ATAx = AT b is always consistent, since AT b ∈ R(AT ) = R(AT A)
- If Ax = b is consistent, then both systems
have the same solution set
- Take a particular solution p for Ax = b
- If Ap = b, then ATAp = AT b
- General solution is p + N(A) = p + N(ATA)
- If Ax = b has a unique solution, then
N(A) = {0} = N(ATA) and x = (ATA)−1AT b
- (warning: A may not even be square, so A itself need not be invertible!)
Normal equations
- For an m×n system Ax = b, the associated system of
normal equations is the n×n system ATAx = AT b
- ATAx = AT b is always consistent,
even when Ax = b is not
- When both are consistent, the solution sets agree
- Otherwise, ATAx = AT b gives the least-squares
solution to Ax = b
- When Ax = b is consistent and has a unique solution,
so does ATAx = AT b, and the solution is x = (ATA)−1AT b
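The normal equations can be solved directly and compared against a library least-squares routine. A sketch (the small system is my own example) using `numpy.linalg.solve` and `numpy.linalg.lstsq`:

```python
import numpy as np

A = np.array([[1., 1.],
              [1., 2.],
              [1., 3.]])
b = np.array([1., 2., 2.])  # Ax = b is inconsistent

# The (always consistent) normal equations A^T A x = A^T b
x_normal = np.linalg.solve(A.T @ A, A.T @ b)

# numpy's least-squares routine minimizes ||Ax - b|| directly
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
```

Both approaches return the same least-squares solution. (In floating point, forming ATA squares the condition number, which is why libraries prefer factorization-based routines like `lstsq`.)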
LEAST SQUARES
Motivating problem
- Assume we observe a phenomenon that varies
with time and record observations
D = {(t1, b1), (t2, b2), . . . , (tm, bm)}
- Want to be able to infer the value of an
observation at an arbitrary point in time: f(t̂) = b̂
- Assume we have a sensible model for f,
e.g., f(t) = α + βt
- Find “good” values for α and β given D
Proposed solution
- Want to find the “best” α and β
- Find values for α and β that minimize
Σi=1..m εi² = Σi=1..m (f(ti) − bi)², with f(t) = α + βt
- Turns out this reduces to a linear problem
- Let us express in vector form and generalize
Changing to vector form
- In our example, define
A = [1 t1; 1 t2; . . . ; 1 tm], x = (α, β)T, b = (b1, b2, . . . , bm)T, ε = Ax − b
- Then [ε]i = α + βti − bi = εi and
Σi=1..m εi² = εT ε = (Ax − b)T (Ax − b)
= xT AT Ax − xT AT b − bT Ax + bT b = xT AT Ax − 2xT AT b + bT b
The minimization problem
- Our goal is to find arg minx ε(x), where the scalar function is
ε(x) = xT AT Ax − 2xT AT b + bT b
- From calculus, at the minimum ∇ε(x) = 0, where
[∇ε(x)]i = ∂ε(x)/∂xi
- Both Ax and xT AT can be seen as matrix
functions of each xi
- We can use our rules for differentiation of
matrix functions
Finding the minimum
- Differentiating ε(x) = xT AT Ax − 2xT AT b + bT b
w.r.t. each component of x we get
[∇ε(x)]i = ∂ε(x)/∂xi = (∂x/∂xi)T AT Ax + xT AT A (∂x/∂xi) − 2 (∂x/∂xi)T AT b
- Since ∂x/∂xi = ei and eiT AT = [AT ]i∗,
[∇ε(x)]i = eiT AT Ax + xT AT Aei − 2eiT AT b
= 2eiT AT Ax − 2eiT AT b = 2[AT Ax]i − 2[AT b]i
- Equating to zero and grouping all rows
ATAx = AT b
Is there a favorite solution?
- Calculus tells us that the minimum of
ε(x) = xT AT Ax − 2xT AT b + bT b
can only happen at some solution of the normal equations ATAx = AT b
- Are all solutions equally good?
- Take any two solutions z1 and z2 = z1 + u
ε(z2) = ε(z1 + u) = ε(z1) + uT AT Au = ε(z1), with ε(z1) = bT b − z1T AT b
- Same argument proves no other vector can
produce a lower value for ε(x)
General Least Squares
- For A in Rm×n and b in Rm, let ε = Ax − b
- The general least squares problem is to find
a vector x that minimizes the quantity
Σi=1..m εi² = εT ε = (Ax − b)T (Ax − b)
- Any such vector is a least-squares solution
- The solution set is the same as that of ATAx = AT b
- Unique iff rank(A) = n, in which case
x = (ATA)-1AT b
- If Ax = b is consistent, the solution sets are the same
Example of Linear Regression
- Predict the amount of weight that a pint of ice-
cream loses when stored at low temperatures
- Assume a linear model for the phenomenon
y = α0 + α1t1 + α2t2 + ε
with t1 (time), t2 (temperature), ε (random noise)
- Assume random noise “averages out”
- Use measurements to find the least-squares
solution for the parameters in
E(t1, t2) = α0 + α1t1 + α2t2
Result of experiments
- Assume the following measurements

Time (weeks):   1    1    1    2    2    2    3    3    3
Temp (°C):    −10   −5    0  −10   −5    0  −10   −5    0
Loss (grams): 0.15 0.18 0.20 0.17 0.19 0.22 0.20 0.23 0.25

- In vector form, we get
A = [1 1 −10; 1 1 −5; 1 1 0; 1 2 −10; 1 2 −5; 1 2 0; 1 3 −10; 1 3 −5; 1 3 0]
b = (0.15, 0.18, 0.20, 0.17, 0.19, 0.22, 0.20, 0.23, 0.25)T
x = (α0, α1, α2)T
Solution
- The system is almost surely inconsistent:
Ax = b would require every εi = 0
- Best we can do is solve the normal equations
ATAx = AT b
- Which leads to
[9 18 −45; 18 42 −90; −45 −90 375] (α0, α1, α2)T = (1.79, 3.73, −8.2)T
- So, for example,
E(t1, t2) = 0.174 + 0.025 t1 + 0.005 t2
E(9, −35) = 0.174 + 0.025(9) + 0.005(−35) = 0.224
Example of Curve Fitting
- Interestingly, least squares can be used to fit
non-linear models to data
- Model must be linear on the parameters we
are solving for
- Does not need to be linear on the variables
- I.e., linear on , maybe not on ti
- For example, we can use least squares to fit a
polynomial to a set of points αi
Polynomial fitting problem
- Find a polynomial
p(t) = α0 + α1t + α2t2 + · · · + αn−1t^(n−1)
of a given degree that fits the data
D = {(t1, b1), (t2, b2), . . . , (tm, bm)}
as well as possible in the least squares sense
- Assume the ti are distinct and n ≤ m
- Here we are more interested in n < m
- Otherwise we can fit the data perfectly
- With Lagrange interpolation
- Fitting “perfectly” is not always a good idea
In matrix form
- Analogous to the earlier derivation, since
Σi=1..m εi² = Σi=1..m (p(ti) − bi)² = (Ax − b)T (Ax − b)
- This time with
A = [1 t1 t1² · · · t1^(n−1); 1 t2 t2² · · · t2^(n−1); . . . ; 1 tm tm² · · · tm^(n−1)]
x = (α0, α1, . . . , αn−1)T, b = (b1, b2, . . . , bm)T
Example of Polynomial Fitting
- Use measurements of the height of a
projectile at different positions to find out where it will land
- A sensible model for the trajectory is a parabola
- So we have p(t) = α0 + α1t + α2t2
- Assume we have more than three data points
- Only three points are needed to determine a parabola
- Don’t want to discard useful data
- Using all points improves results when measurements are noisy
Fitting the data
Measured data (the launch point (0, 0) included):

Distance from source (km):  0  .25  .50  .75   1
Height (m):                 0    8   15   19  20

A = [1 0 0; 1 0.25 0.0625; 1 0.5 0.25; 1 0.75 0.5625; 1 1 1]
ATA = [5 2.5 1.875; 2.5 1.875 1.5625; 1.875 1.5625 1.38281]
AT b = (62, 43.75, 34.9375)T
x = (−0.2286, 39.83, −19.43)T, so p(t) = −0.2286 + 39.83t − 19.43t²
p(t) = 0 ⇒ t ∈ {0.005755, 2.044}: the projectile lands about 2.044 km from the source

[Plots of the fitted parabola over the data points, extended out to the landing point]
What about Lagrange Interpolation?
- Also uses all the data, as long as we increase
the polynomial degree
ℓ(t) = Σi=1..m bi ( Πj≠i (t − tj) / Πj≠i (ti − tj) )
- For the projectile data this gives
ℓ(t) = (1/3)(88t + 68t² − 160t³ + 64t⁴)
LINEAR TRANSFORMATIONS
Linear Transformations
- Given two vectors spaces U and V over field F
- A linear transformation from U to V is a linear
function T mapping U into V
- A linear operator on U is a linear
transformation mapping U into itself
T(αx + y) = αT(x) + T(y)
Examples #1
- The zero transformation, 0(x) = 0, maps any
vector in U to the zero vector in V
- The identity operator I(x) = x, maps every
vector from U back into itself
- For any m × n matrix A, the function
T(x) = Ax is a linear transformation from Rn to Rm
Geometric examples
- The rotator Q in R2 by an angle θ
Q(u) = [cos θ, −sin θ; sin θ, cos θ] u
- The projector P from R3 to the xy-plane
P(u) = [1 0 0; 0 1 0; 0 0 0] u
- The reflector R about the xy-plane
R(u) = [1 0 0; 0 1 0; 0 0 −1] u
Infinite dimensional examples
- Let V be the space of differentiable functions,
and W the space of all functions (from R to R). The mapping D(f) = df/dx is a linear transformation from V to W:
d(αf + g)/dx = α df/dx + dg/dx
- Let C be the set of all continuous functions
from R to R. The mapping T(f) = ∫₀ˣ f(t)dt is a linear operator on C:
∫₀ˣ (αf(t) + g(t)) dt = α ∫₀ˣ f(t)dt + ∫₀ˣ g(t)dt
Coordinates of a Vector
- Let B = {b1, b2, . . . , bn} be a basis for U
- Take a vector v in U
- The coefficients αi in the expansion
  v = α1b1 + α2b2 + · · · + αnbn
  are called the coordinates of v w.r.t. B
- To denote the column vector with these coefficients, we write
  [v]B = (α1 α2 · · · αn)ᵀ
- Order is important!
- When no basis is specified, the canonical basis (in
standard order) is assumed
Change of basis as a linear system
- Find the coordinates of the vector
  v = (8, 7, 4)ᵀ
  in the basis
  B = {(1, 1, 1)ᵀ, (1, 2, 2)ᵀ, (1, 2, 3)ᵀ}
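Finding the coordinates amounts to solving one linear system whose coefficient columns are the basis vectors; a minimal sketch:

```python
import numpy as np

# Basis vectors as columns of a matrix
B = np.array([[1., 1., 1.],
              [1., 2., 2.],
              [1., 2., 3.]])
v = np.array([8., 7., 4.])

# The coordinates alpha satisfy B @ alpha = v
alpha = np.linalg.solve(B, v)
print(alpha)  # [9, 2, -3]: v = 9*b1 + 2*b2 - 3*b3
```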
Space of Linear Transformations
- For each pair of vector spaces U and V over F,
the set of linear transformations L(U, V) from U to V is itself a vector space over F
- Let B = {u1, u2, . . . , un} and B′ = {v1, v2, . . . , vm}
be bases for U and V; find a basis for L(U, V)
- Write L(uj) = Σ(i=1..m) αij vi and u = ξ1u1 + ξ2u2 + · · · + ξnun, so that

  L(u) = Σ(j=1..n) ξj L(uj) = Σ(j=1..n) ξj Σ(i=1..m) αij vi = Σ(j=1..n) Σ(i=1..m) αij (ξj vi) = Σ(j=1..n) Σ(i=1..m) αij Bji(u)

  where Bji(u) = ξj vi
- The set with all Bji forms a basis for L(U, V)
- dim L(U, V) = (dim U)(dim V)
Proof
- That the Bji span L(U, V) should be clear
- We started from an arbitrary transformation
- To prove that the Bji are l.i., try writing

  Σ(j,i) ηji Bji = 0 ⇒ (Σ(j,i) ηji Bji)(uk) = 0 ⇒ Σ(j,i) ηji Bji(uk) = 0 ⇒ Σ(j,i) ηji [uk]j vi = 0 ⇒ Σ(i) ηki vi = 0 ⇒ ηki = 0
Coordinate Matrix Representation
- Let B = {u1, u2, . . . , un} and B′ = {v1, v2, . . . , vm}
be bases for U and V, respectively
- The coordinate matrix of L in L(U, V) with
respect to the basis pair (B, B′) is

  [L]BB′ = ( [L(u1)]B′  [L(u2)]B′  · · ·  [L(un)]B′ )

- Since L(uk) = Σ(i=1..m) αik vi, we have [L(uk)]B′ = (α1k, α2k, . . . , αmk)ᵀ, and so

  [L]BB′ = [α11 α12 · · · α1n; α21 α22 · · · α2n; . . . ; αm1 αm2 · · · αmn]

  with L(u) = Σ(j=1..n) Σ(i=1..m) αij ξj vi
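As a concrete illustration (not from the slides), take the derivative operator on polynomials of degree ≤ 2 with B = B′ = {1, t, t²}: column k of the coordinate matrix holds the coordinates of D applied to the k-th basis vector, since D(1) = 0, D(t) = 1, and D(t²) = 2t.

```python
import numpy as np

# Coordinate matrix of D w.r.t. {1, t, t^2}: column k is [D(u_k)]_{B'}
D = np.array([[0., 1., 0.],   # D(1) = 0, D(t) = 1, D(t^2) = 2t
              [0., 0., 2.],
              [0., 0., 0.]])

# q(t) = 3 + 2t + 4t^2  has coordinates (3, 2, 4); q'(t) = 2 + 8t
q = np.array([3., 2., 4.])
print(D @ q)  # [2, 8, 0]: the coordinates of q'
```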
Change of basis and similarities
- Coordinate matrices depend on basis choice
- Some bases may force coordinate matrix to
have desirable properties
- Learn to choose basis that simplifies our work
- Other properties do not change regardless of
basis: they are invariant to the choice of basis
- These properties are inherent to the
transformation itself and are worthy of study
Change of basis operator
- Assume we have two bases for V:
  B = {x1, x2, . . . , xn} (old basis) and B′ = {y1, y2, . . . , yn} (new basis)
- Let T be a linear operator on V such that
T(yi) = xi
- T is the change of basis operator from B′ to B
- And we have

  [T]B = [T]B′ = [I]BB′

- Indeed, [T(yi)]B′ = [xi]B′, and writing xi = Σ(j=1..n) αj yj gives

  T(xi) = Σ(j=1..n) αj T(yj) = Σ(j=1..n) αj xj, so [T(xi)]B = [xi]B′

- Finally, [I(xi)]B′ = [xi]B′ (note that [I]BB′ is not necessarily the identity matrix!)
Change of basis matrix
- Assume we have two bases for V:
  B = {x1, x2, . . . , xn} (old basis) and B′ = {y1, y2, . . . , yn} (new basis)
- If T(yi) = xi is the change of basis operator, then
  P = [T]B = [T]B′ = [I]BB′
  is the change of basis matrix from B to B′
- We have [v]B′ = P[v]B for all v in V
- P is nonsingular
- P is unique
Example
- Here are two different bases for the space of
polynomials of degree two or less:
  B = {1, t, t²} and B′ = {1, 1 + t, 1 + t + t²}
- Find the coordinates of q(t) = 3 + 2t + 4t²
relative to B′
- Answer
  q(t) = 1(1) − 2(1 + t) + 4(1 + t + t²), so [q]B′ = (1, −2, 4)ᵀ
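Since each element of B′ has known coordinates in B, the answer comes from solving one triangular system; a sketch:

```python
import numpy as np

# Columns: coordinates of 1, 1+t, 1+t+t^2 in the basis {1, t, t^2}
M = np.array([[1., 1., 1.],
              [0., 1., 1.],
              [0., 0., 1.]])
q = np.array([3., 2., 4.])  # q(t) = 3 + 2t + 4t^2 in the basis B

coords = np.linalg.solve(M, q)
print(coords)  # [1, -2, 4]: q = 1(1) - 2(1+t) + 4(1+t+t^2)
```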
Changing Matrix Coordinates
- Let A be a linear operator on V, and
let B and B′ be two bases for V
- With P = [I]BB′ and Q = [I]B′B = P⁻¹, the coordinate matrices [A]B and [A]B′
are related as follows:
  [A]B = P⁻¹[A]B′P
- Conversely,
  [A]B′ = Q⁻¹[A]BQ
Similarity
- Matrices Bn×n and Cn×n are said to be similar
whenever there exists a nonsingular matrix Q such that
  B = Q⁻¹CQ
- We write B ≃ C when B and C are similar
- The linear operator f : Rn×n → Rn×n
defined by f(C) = Q⁻¹CQ is called a similarity transformation
Relevance of similarity
- Any two coordinate matrices for a given linear
operator must be similar
- We know how to transform one into the other
- The process is a similarity
- Conversely, any two similar matrices are
coordinates of the same linear operator
- For different choices of bases
- We use similarity transformations to study
linear operators in a basis that simplifies the associated coordinate matrix
Example of invariance to similarity
- The trace of a square matrix C
- It is similarity invariant: since trace(AB) = trace(BA),

  trace((QC)Q⁻¹) = trace(Q⁻¹(QC)) = trace(C)

- We can talk about the trace of a linear operator
- Regardless of the particular coordinate matrix
- Even though they are different for different bases
- Rank is also similarity invariant
- Why?
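A quick numerical check of both invariants, assuming an arbitrary (almost surely nonsingular) random similarity:

```python
import numpy as np

rng = np.random.default_rng(1)
C = rng.standard_normal((4, 4))
Q = rng.standard_normal((4, 4))  # random square matrix: almost surely nonsingular

S = np.linalg.inv(Q) @ C @ Q  # a matrix similar to C

# Trace and rank survive the similarity transformation
print(np.isclose(np.trace(S), np.trace(C)))                  # True
print(np.linalg.matrix_rank(S) == np.linalg.matrix_rank(C))  # True
```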
INVARIANT SUBSPACES
Invariant Subspaces
- Take a linear operator T on V, a subspace X ⊆ V, and consider the
image T(X) of X under T
- Certainly it is a subset of V, but in general not
related to X in any obvious way
- If T(X) ⊆ X, we say that X is an invariant
subspace under T
- We can now look at T as a linear operator on X
if we ignore the rest of V and restrict it to X
- Denote the restriction of T to X by T/X
Relevance to coordinate matrices #1
- Assume we have a basis BX = {x1, x2, . . . , xr} for the invariant subspace X
- Augment it to a basis B = {x1, x2, . . . , xr, y1, y2, . . . , yq} for V
- Write the coordinate matrix of T with basis B:

  [T]B = ( · · · [T(xj)]B · · · [T(yj)]B · · · )

- But T(xj) = Σ(i=1..r) αij xi, so

  [T(xj)]B = (α1j, . . . , αrj, 0, . . . , 0)ᵀ

- While T(yj) = Σ(i=1..r) βij xi + Σ(i=1..q) γij yi, so

  [T(yj)]B = (β1j, . . . , βrj, γ1j, . . . , γqj)ᵀ
Relevance to coordinate matrices #2
- So
  [T]B = [ A(r×r)  B(r×q) ; 0  C(q×q) ]
  where A = (αij) = [T/X]BX, B = (βij), and C = (γij)
- It is block triangular
- What if Y = span{y1, y2, . . . , yq} was also
an invariant subspace?
- Then B(r×q) = 0, and
  [T]B = [ [T/X]BX  0 ; 0  [T/Y]BY ]
  would be block diagonal
Invariant subspaces and matrix representation
- Let T be a linear operator on V
- Let BX, BY, . . . , BZ be bases for subspaces X, Y, . . . , Z with
dimensions r1, r2, . . . , rk, and let B = BX ∪ BY ∪ · · · ∪ BZ be a basis for V
- Subspace X is invariant under T iff
  [T]B = [ A(r1×r1)  B ; 0  C ] with A = [T/X]BX
- Subspaces X, Y, . . . , Z are all invariant under T iff
  [T]B = diag( A(r1×r1), B(r2×r2), . . . , C(rk×rk) )
  with A = [T/X]BX, B = [T/Y]BY, . . . , C = [T/Z]BZ
Triangular and diagonal block forms
- When T is an n × n matrix:
- Q is a nonsingular matrix such that
  Q⁻¹TQ = [ A(r×r)  B(r×q) ; 0  C(q×q) ]
  iff the first r columns of Q span a subspace invariant under T
- Q is a nonsingular matrix such that
  Q⁻¹TQ = diag( A(r1×r1), . . . , C(rk×rk) )
  iff Q = ( Q1  Q2  · · ·  Qk ), where each Qi is n × ri and the columns of each Qi span a subspace invariant under T
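A sketch of the triangular case, using the matrix T = [0 1; −2 3] of the example below: its subspace span{(1, 1)ᵀ} is invariant (T maps (1, 1)ᵀ to itself), so putting that vector in the first column of Q, with any independent second column, already yields block-triangular form.

```python
import numpy as np

T = np.array([[0., 1.],
              [-2., 3.]])

# First column of Q spans an invariant subspace (T @ (1,1) = (1,1));
# the second column merely completes a basis of R^2
Q = np.array([[1., 0.],
              [1., 1.]])

B = np.linalg.inv(Q) @ T @ Q
print(B)  # lower-left entry is 0: upper block triangular
```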
Example
- Find all subspaces of R² that are invariant under
  A = [ 0  1 ; −2  3 ]
- Solution
- Zero-dimensional and 2-dimensional invariant subspaces are trivial
- One-dimensional invariants M are trickier:
  x ∈ M ⇒ Ax ∈ M ⇒ Ax = λx ⇒ (A − λI)x = 0 ⇒ x ∈ N(A − λI), and we need N(A − λI) ≠ {0}
- Row reduction:
  [ −λ  1 ; −2  3−λ ] → [ −2  3−λ ; −λ  1 ] → [ −2  3−λ ; 0  1 + (λ² − 3λ)/2 ]
  so N(A − λI) ≠ {0} iff λ² − 3λ + 2 = 0, giving λ1 = 1 and λ2 = 2
Example (continued)
- Find all subspaces of R² that are invariant under
  A = [ 0  1 ; −2  3 ]
- Solution
  λ1 = 1: M1 = N(A − I) = span{ (1, 1)ᵀ }
  λ2 = 2: M2 = N(A − 2I) = span{ (1, 2)ᵀ }
- Notice that B = { (1, 1)ᵀ, (1, 2)ᵀ } spans R², so with
  Q = [ 1  1 ; 1  2 ],  [A]B = Q⁻¹AQ = [ 1  0 ; 0  2 ]
- The scalars λi are called eigenvalues of A, and the
nonzero vectors in N(A − λiI) are the associated eigenvectors of A
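The whole computation can be checked with NumPy; a minimal sketch:

```python
import numpy as np

A = np.array([[0., 1.],
              [-2., 3.]])

# Eigenvalues (the invariant directions come from the eigenvectors)
w, V = np.linalg.eig(A)
print(np.sort(w))  # [1, 2]

# Diagonalize using the eigenvectors (1,1) and (1,2) as columns of Q
Q = np.array([[1., 1.],
              [1., 2.]])
D = np.linalg.inv(Q) @ A @ Q
print(D)  # diag(1, 2)
```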