Álgebra Linear e Aplicações: VECTOR SPACES - PowerPoint PPT Presentation



slide-1
SLIDE 1

Álgebra Linear e Aplicações

slide-2
SLIDE 2

VECTOR SPACES

slide-3
SLIDE 3

Avoid rediscovering the wheel

  • Many mathematical objects that seem to have

nothing in common with matrices do in fact share very similar properties

  • Points in plane, in 3-space, polynomials,

continuous functions, differentiable functions, etc

  • All possess addition and scalar multiplication
  • Rather than study these objects separately,

develop a general theory that applies to all

  • Vector Spaces describe objects of this class
slide-4
SLIDE 4

Definition of Vector Space #1

  • Composed of 4 items
  • A non-empty set V of vectors
  • For most of this course, n-tuples or matrices
  • A scalar field F
  • We will use R or C
  • Vector addition (x + y) from V × V to V
  • Scalar multiplication (α x) from F × V to V
slide-5
SLIDE 5

Definition of Vector Space #2

  • Required properties of vector addition
  • (A1) Closure
  • x + y ∈ V , ∀x, y
  • (A2) Associative
  • (x + y) + z = x + (y + z)
  • (A3) Commutative
  • x + y = y + x
  • (A4) Neutral element
  • ∃0 ∈ V, x + 0 = x, ∀x
  • (A5) Addition inverse
  • ∀x, ∃(–x) ∈ V, x + (–x) = 0
slide-6
SLIDE 6

Definition of Vector Space #3

  • Required properties of scalar multiplication
  • (M1) Closure
  • α x ∈ V , ∀α, x
  • (M2) Associative
  • (αβ)x = α(βx), ∀α, β, x
  • (M3) Distributive #1
  • (α+β)x = αx+βx, ∀α, β, x
  • (M4) Distributive #2
  • α(x+y) = αx+αy, ∀α, x, y
  • (M5) Neutral element
  • 1x = x, ∀x
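As a quick illustration, the addition and scalar-multiplication axioms above can be checked numerically for R^4 with numpy arrays standing in for vectors; this is a minimal sketch, with the sample vectors and scalars chosen arbitrarily:

```python
import numpy as np

rng = np.random.default_rng(0)
x, y, z = rng.standard_normal((3, 4))  # three random vectors in R^4
a, b = 2.0, -3.0                       # two scalars from F = R

# Addition axioms (A2)-(A5)
assert np.allclose((x + y) + z, x + (y + z))    # (A2) associative
assert np.allclose(x + y, y + x)                # (A3) commutative
assert np.allclose(x + np.zeros(4), x)          # (A4) neutral element
assert np.allclose(x + (-x), np.zeros(4))       # (A5) additive inverse

# Scalar multiplication axioms (M2)-(M5)
assert np.allclose((a * b) * x, a * (b * x))    # (M2) associative
assert np.allclose((a + b) * x, a * x + b * x)  # (M3) distributive #1
assert np.allclose(a * (x + y), a * x + a * y)  # (M4) distributive #2
assert np.allclose(1.0 * x, x)                  # (M5) neutral scalar
checked = True
```

Closure (A1, M1) holds automatically here because numpy's `+` and `*` always return another array of the same shape.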
slide-7
SLIDE 7

Examples of Vector Space

  • Rm×n over R, and Cm×n over C
  • Follows directly from our definitions of matrix

addition and scalar multiplication

  • Real coordinate spaces R1×n or Rn×1
  • Special case of above
  • Will denote as Rn and distinguish only if needed
  • Focus of this course!
slide-8
SLIDE 8

Other examples of Vector Space

  • With function addition and scalar multiplication

defined pointwise

  • The following are vector spaces over R
  • Set of functions mapping [0,1] into R
  • Set of all real-valued continuous functions on [0,1]
  • Set of real-valued functions differentiable on [0,1]
  • Set of all polynomials with real coefficients

(f + g)(x) = f(x) + g(x) (αf)(x) = αf(x)

slide-9
SLIDE 9

Subspace

  • Let S be a non-empty subset of a

vector space V over F

  • S is said to be a subspace of V if it is also

a vector space

  • Only need to check for closure properties of

addition and scalar multiplication

  • If closure is respected, the other properties are inherited from V
slide-10
SLIDE 10

Proof

  • Only (A1), (A4), (A5), and (M1) are non-trivial
  • But (A1) and (M1) together give (A4) and (A5)
  • (M1) gives –x = (-1) x which implies (A5)
  • Since x and –x are both in V, (A1) implies (A4)
slide-11
SLIDE 11

Examples of subspaces (or not)

  • The trivial subspace Z of V contains only one

element, the zero vector 0

  • Does it satisfy closure?
  • Every subspace contains the zero vector!
  • The first quadrant in R2
  • Does it satisfy closure?
  • What about lower-triangular matrices?
  • What about symmetric matrices?
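The first-quadrant question can be settled with a tiny numerical check: addition stays in the quadrant, but scalar multiplication by a negative scalar does not, so (M1) fails and the quadrant is not a subspace. A minimal sketch, with the two sample vectors chosen arbitrarily:

```python
import numpy as np

def in_first_quadrant(v):
    """Membership test for the first quadrant of R^2 (both components >= 0)."""
    return bool(np.all(v >= 0))

u = np.array([1.0, 2.0])
v = np.array([3.0, 0.5])

closed_under_addition = in_first_quadrant(u + v)    # sums of quadrant vectors stay in it
closed_under_scaling = in_first_quadrant(-1.0 * u)  # False: (M1) fails for alpha = -1
```

The same membership-test pattern shows that lower-triangular and symmetric matrices are closed under both operations, so those two sets are subspaces.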
slide-12
SLIDE 12

Subspaces look “flat”

  • Can’t be curved
  • Think “flat” line,

surface, etc

  • Through the origin

u v αv u + v u v u + v

slide-13
SLIDE 13

Spanning sets #1

  • Take a set of vectors

from a vector space V over F

  • Consider all possible linear combinations of vi
  • Then span(S) is a subspace of V
  • Show closure properties
  • All in span(S)

S = {v1, v2, . . . , vr}

span(S) = {α1v1 + α2v2 + · · · + αrvr | αi ∈ F}

x = Σi ξi vi    y = Σi ηi vi

x + y = Σi (ξi + ηi) vi    βx = Σi (βξi) vi

slide-14
SLIDE 14

Spanning sets #2

  • Take a set of vectors

from a vector space V over F

  • Consider all possible linear combinations of vi
  • We call span(S) the space spanned by S
  • If U is a vector space such that U = span(S),

we say S is a spanning set for U

  • In other words, S spans U

S = {v1, v2, . . . , vr}

span(S) = {α1v1 + α2v2 + · · · + αrvr | αi ∈ F}

slide-15
SLIDE 15

Examples of spanning sets

  • S = {(1 1), (2 2)} spans the line x = y in R2
  • The set S = { ei | i ∈ {1, . . . , n} } spans Rn
  • The set S = {1, x, x^2, . . . , x^n} spans the set

of all polynomials of degree n or less
slide-16
SLIDE 16

Important exercise (4.1.7)

  • Take a subset S with n vectors from V,

a subspace of Rm×1

  • Form a matrix A in Rm×n in which each

column is a vector of S

  • Show that S spans V iff for each b in V there is

at least one x in Rn×1 such that Ax = b

  • In other words, iff Ax = b is a consistent linear

system for each b in V
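The consistency criterion in this exercise is easy to apply numerically: Ax = b is consistent iff rank([A|b]) = rank(A). A minimal sketch, with A and b chosen for illustration (b is built as a combination of the columns, so it lies in their span):

```python
import numpy as np

# Columns of A are the vectors of S (illustrative choice)
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([2.0, 3.0, 5.0])  # b = 2*col1 + 3*col2, so b is in span(S)

# Ax = b is consistent  <=>  rank([A|b]) == rank(A)
consistent = (np.linalg.matrix_rank(np.column_stack([A, b]))
              == np.linalg.matrix_rank(A))
```

Repeating this test for every b in (a basis of) V decides whether S spans V.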

slide-17
SLIDE 17

Leads to important test

  • How can you tell if a subset of vectors, say, in

R3, spans the whole of R3?

  • S = {(1 1 1), (1 –1 –1), (3 1 1)}
  • S = {(1 1 1), (1 –1 –1), (3 1 1)}
  • Place these rows as columns in a matrix
  • Run elimination to find the rank
  • If rank is 3, matrix is invertible, system is

consistent for any right-hand side

  • If less than 3, certainly does not span R3. Why?
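The rank test on the set S given above can be sketched directly; here the three vectors are placed as columns and the rank computed with numpy (standing in for hand elimination):

```python
import numpy as np

# The vectors of S = {(1 1 1), (1 -1 -1), (3 1 1)} placed as columns
S = np.array([[1.0,  1.0, 3.0],
              [1.0, -1.0, 1.0],
              [1.0, -1.0, 1.0]])

r = np.linalg.matrix_rank(S)
spans_R3 = (r == 3)
# Here r = 2: the third vector is 2*(1,1,1) + 1*(1,-1,-1), so S does not span R3
```

With rank 2 the columns span only a plane through the origin, which is why a rank below 3 rules out spanning all of R3.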
slide-18
SLIDE 18

Sum of subspaces

  • Let X and Y be subspaces of a vector space V
  • Define the sum of X + Y as the set of all

possible sums between vectors of X and Y

  • The sum X + Y is also a subspace of V
  • Check closure properties for vectors in X + Y
  • If SX, SY span X, Y then SX ∪ SY spans X + Y
  • Write any vector in X + Y as a linear combination

of vectors from SX ∪ SY

X + Y = {x + y | x ∈ X, y ∈ Y}

slide-19
SLIDE 19

FOUR FUNDAMENTAL SUBSPACES

slide-20
SLIDE 20

Subspaces and Linear Functions #1

  • Rules (A1) and (M1) remind us of linearity
  • Let f : Rn → Rm be a linear function and

let R(f) denote the range of f

  • The range of any linear function

is a subspace of Rm

  • Proof
  • (A1) and (M1)

R(f) = { f(x) | x ∈ Rn } ⊆ Rm

y1 = f(x1)    y2 = f(x2)

αy1 + y2 = αf(x1) + f(x2) = f(αx1 + x2)

slide-21
SLIDE 21

Range spaces #1

  • The range of a matrix A in Rm×n is the

subspace R(A) of Rm corresponding to the range of f(x) = Ax

  • The range of AT is the subspace R(AT ) of Rn

R(A) = { Ax | x ∈ Rn } ⊆ Rm

R(AT ) = { AT y | y ∈ Rm } ⊆ Rn

slide-22
SLIDE 22

Range spaces #2

  • R(A) is also known as column-space of A
  • R(AT ) is also known as row-space of A

Ax = Σi ξi [A]∗i,  x = ( ξ1 ξ2 · · · ξn )T

(AT y)T = yT A = Σj ηj [A]j∗,  y = ( η1 η2 · · · ηm )T

b ∈ R(A) ⇔ b = Ax

a ∈ R(AT ) ⇔ aT = yT A

slide-23
SLIDE 23

Equal Ranges

  • For two matrices A and B of the same shape
R(A) = R(B) ⇔ A ∼col B

  • Similarly, R(AT ) = R(BT ) ⇔ A ∼row B
  • Proof (⇒)

∀x, ∃y | Ax = By ⇒ [A]∗i = A[I]∗i = Byi ⇒ A = B( y1 y2 · · · yn ) = BP ⇒ A ∼col B

  • Proof (⇐)

A = BP ⇒ b = Ax = BPx ⇒ b = By with y = Px, so R(A) ⊆ R(B) (and symmetrically)

But is P invertible?

slide-24
SLIDE 24

Testing spanning sets

  • When do two sets of vectors in Rn span the

same subspace?

  • Set them as rows of matrices A and B
  • Run Gauss-Jordan to compute EA and EB
  • Look at non-zero rows of EA and EB
  • They must agree!
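The steps above can be sketched with a small Gauss-Jordan routine; the matrices A and B below are an illustrative pair whose row spaces agree, and comparing the nonzero rows of their reduced echelon forms decides equality:

```python
import numpy as np

def rref(M, tol=1e-10):
    """Reduced row echelon form via Gauss-Jordan with partial pivoting."""
    R = M.astype(float).copy()
    rows, cols = R.shape
    pivot_row = 0
    for col in range(cols):
        if pivot_row >= rows:
            break
        pivot = pivot_row + int(np.argmax(np.abs(R[pivot_row:, col])))
        if abs(R[pivot, col]) < tol:
            continue  # no pivot in this column
        R[[pivot_row, pivot]] = R[[pivot, pivot_row]]
        R[pivot_row] /= R[pivot_row, col]
        for r in range(rows):
            if r != pivot_row:
                R[r] -= R[r, col] * R[pivot_row]
        pivot_row += 1
    return R

A = np.array([[1, 2, 2, 3], [2, 4, 1, 3], [3, 6, 1, 4]])
B = np.array([[0, 0, 1, 1], [1, 2, 3, 4]])

EA, EB = rref(A), rref(B)
nz = lambda E: E[np.any(np.abs(E) > 1e-10, axis=1)]  # keep nonzero rows only
same_row_space = np.allclose(nz(EA), nz(EB))
```

Both reduce to the same two nonzero rows (1 2 0 1) and (0 0 1 1), so A and B span the same row space even though they have different shapes.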
slide-25
SLIDE 25

Example

A = 1 2 2 3 , 2 4 1 3 , 3 6 1 4 B = 1 1 , 1 2 3 4 B = ✓0 1 1 1 2 3 4 ◆ A =   1 2 2 3 2 4 1 3 3 6 1 4   EA =   1 2 1 1 1   EB = ✓1 2 1 1 1 ◆

slide-26
SLIDE 26

Spanning the row and column spaces

  • Let U be any row echelon form for matrix A
  • Spanning sets for R(A) and R(AT ) are as follows
  • Nonzero rows of U span R(AT )
  • Because U ∼row A
  • Basic columns in A span R(A)
  • Non-basic columns are combinations of basic columns
  • They are redundant
  • If you need one, use the corresponding linear

combination of basic columns instead

slide-27
SLIDE 27

Where are the other two spaces?

  • Consider a general linear function f from

Rm to Rn and focus on

  • From linearity, this is certainly a subspace
  • Check (A1) and (M1)
  • Given a matrix A, there are two linear

functions to consider

N(f) = {x | f(x) = 0}

f(x) = Ax g(y) = AT y

slide-28
SLIDE 28

Nullspace

  • For an m × n matrix A, the set

is the nullspace of A

  • I.e., the set of solutions for the homogeneous

linear system Ax = 0

  • The set N(AT ) ⊆ Rm is the left nullspace of A,

because it is the set of solutions of yT A = 0T

N(A) = { x | Ax = 0 } ⊆ Rn

slide-29
SLIDE 29

Example

  • Find a spanning set for N(A) where
  • Simply the general solution to Ax = 0

A = [ 1 2 3; 2 4 6 ]    EA = [ 1 2 3; 0 0 0 ]

( x1; x2; x3 ) = ( −2x2 − 3x3; x2; x3 ) = x2 ( −2; 1; 0 ) + x3 ( −3; 0; 1 )
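The general-solution recipe can be verified numerically; this sketch uses a rank-1 matrix chosen for illustration, whose single equation x1 + 2x2 + 3x3 = 0 leaves x2 and x3 free:

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])  # rank-1 matrix used for illustration

# From EA = (1 2 3; 0 0 0): x1 = -2*x2 - 3*x3, with x2 and x3 free
h1 = np.array([-2.0, 1.0, 0.0])  # set x2 = 1, x3 = 0
h2 = np.array([-3.0, 0.0, 1.0])  # set x2 = 0, x3 = 1

assert np.allclose(A @ h1, 0) and np.allclose(A @ h2, 0)  # both lie in N(A)

# dim N(A) = n - r = 3 - 1 = 2, and h1, h2 are independent, so they span N(A)
n_minus_r = A.shape[1] - np.linalg.matrix_rank(A)
```

Since the two independent solutions match dim N(A) = n − r, they form a spanning set (in fact a basis) for the nullspace.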

slide-30
SLIDE 30

Spanning the Nullspace

  • To find the Nullspace of an m×n matrix A,
  • Find an echelon form U for A and
  • Find the general solution to Ux = 0
  • Then the set spans N(A)
  • In particular
  • N(A) = {0} iff rank(A) = n
  • N(AT ) = {0} iff rank(A) = m

xh = xf1h1 + xf2h2 + · · · + xfn−rhn−r H = {h1, h2, . . . , hn−r}

slide-31
SLIDE 31

Computing N(AT )

  • The obvious idea is to run elimination until we

reach an echelon form for AT

  • Echelon forms of A and AT are not the same
  • Want to obtain all spaces with a single reduction
  • Let PA = U, for U in echelon form
  • Assume rank(A) = r
  • Then the last m − r rows of P span N(AT )
slide-32
SLIDE 32

Proof

  • Let U = [ Ur; 0 ],  P = [ P1; P2 ],  P−1 = ( Q1 Q2 )
  • Proof (⇐)

PA = [ P1A; P2A ] = [ Ur; 0 ] = U ⇒ P2A = 0, so the last m − r rows of P all lie in N(AT )

  • Proof (⇒)

yT A = 0 ⇒ yT P−1U = 0 ⇒ ( yT Q1  yT Q2 ) [ Ur; 0 ] = 0 ⇒ yT Q1Ur = 0 ⇒ yT Q1 = 0

P−1P = I ⇒ Q1P1 + Q2P2 = I ⇒ yT Q1P1 = 0 ⇒ yT (I − Q2P2) = 0 ⇒ yT = yT Q2P2 ⇒ yT = (yT Q2)P2

so every y in N(AT ) is a combination of the rows of P2.

slide-33
SLIDE 33

Example

  • Using Gauss-Jordan
  • Find EA
  • So
  • From which

  1 2 2 3 1 2 4 1 3 1 3 6 1 4 1     1 2 2 3 −1/3 2/3 1 1 2/3 −1/3 1/3 −5/3 1   P =   −1/3 2/3 2/3 −1/3 1/3 −5/3 1   N(AT ) = span      1/3 −5/3 1     

slide-34
SLIDE 34

Additional insights

  • We have shown that N(AT ) = R(P2T )
  • It turns out that R(A) = N(P2)
  • Proof

(R(A) ⊆ N(P2))  y = Ax ⇒ P2y = P2Ax = (P2A)x = 0x = 0

(R(A) ⊇ N(P2))  P2y = 0 ⇒ Py = [ P1y; P2y ] = [ P1y; 0 ], so

P(A | y) = (PA | Py) = [ Ur  P1y; 0  0 ]

and the system Ax = y is consistent: P2y = 0 ⇒ ∃x | Ax = y

slide-35
SLIDE 35

Equal Nullspaces #1

  • We already know how to test for equality in

range spaces: row and column equivalence

  • How do we test for Nullspace equality?
  • Use equivalence again
  • For two matrices A and B of the same shape

N(A) = N(B) ⇔ A ∼row B

  • Similarly,

N(AT ) = N(BT ) ⇔ A ∼col B

slide-36
SLIDE 36

Equal Nullspaces #2

  • Let’s prove one of them
  • Proof (⇒)

N(AT ) = N(BT ) ⇒ R(A) = R(B) ⇔ A ∼col B

yT A = 0 ⇔ yT B = 0, so with PA = [ Ur; 0 ] we get P2A = 0 and also P2B = 0. Then for any x2

(A | Bx2) → (PA | PBx2) = [ P1A  P1Bx2; P2A  P2Bx2 ] = [ P1A  P1Bx2; 0  0 ]

so Ax = Bx2 is consistent, i.e., z = Bx2 ⇒ z = Ax1; by symmetry R(A) = R(B)

slide-37
SLIDE 37

Equal Nullspaces #3

  • Let’s prove one of them
  • Proof (⇐)

A ∼col B ⇒ A = BQ with Q nonsingular

P2A = 0 ⇒ P2BQ = 0 ⇒ P2B = 0 ⇒ N(AT ) ⊆ N(BT )

  • Conversely, N(BT ) ⊆ N(AT )

(replace A and B in the proof)

slide-38
SLIDE 38

Summary #1

  • The four fundamental subspaces associated to

a matrix Am×n are

  • The range or column space
  • The row-space or left-hand range
  • The nullspace
  • The left-hand nullspace

R(A) = { Ax | x ∈ Rn } ⊆ Rm
N(A) = { x | Ax = 0 } ⊆ Rn
R(AT ) = { AT y | y ∈ Rm } ⊆ Rn
N(AT ) = { y | yT A = 0T } ⊆ Rm

slide-39
SLIDE 39

Summary #2

  • Let P be a nonsingular matrix such that

PA = U, where U is in echelon form and let rank(A) = r

  • Spanning sets for
  • R(A): Basic columns of A
  • R(AT): Non-zero rows in U (transposed)
  • N(A): The hi in the general solution of Ax = 0
  • N(AT): The last m – r rows in P (transposed)
slide-40
SLIDE 40

Summary #3

  • If A and B are matrices of the same shape

A ∼col B ⇔ R(A) = R(B) ⇔ N(AT ) = N(BT )

A ∼row B ⇔ N(A) = N(B) ⇔ R(AT ) = R(BT )

slide-41
SLIDE 41

LINEAR INDEPENDENCE, BASIS, AND DIMENSION

slide-42
SLIDE 42

Linear independence

  • Matrix dimensions give an incomplete picture

of the true size of a linear system
  • The important number is the rank
  • Number of pivots
  • Number of non-zero rows in echelon form
  • Better interpretation
  • Number of genuinely independent rows in matrix
  • Other rows are redundant
slide-43
SLIDE 43

Formally

  • Take a set of vectors
  • Look at linear combinations
  • Vectors vi are linearly independent (l.i.) iff the only

linear combination that produces 0 is trivial

  • Otherwise they are linearly dependent (l.d.)
  • One of them is a linear combination of the others

S = {v1, v2, . . . , vr}

α1v1 + α2v2 + · · · + αrvr = 0  only for  α1 = α2 = · · · = αr = 0

slide-44
SLIDE 44

Easy to visualize in R3

  • 2 vectors are dependent if they lie on a line
  • 3 vectors are dependent if they lie on a plane
  • Or line
  • 4 vectors are always dependent
  • 3 random vectors should be independent
slide-45
SLIDE 45

Example

  • Determine if the set of

vectors is l.i.

  • Look for a non-trivial

solution to

  • I.e., non-trivial solution

to the homogeneous linear system

  • From Gauss-Jordan
  • So, they are l.d. and e.g.

S = { (1, 2, 1)T , (1, 0, 2)T , (5, 6, 7)T }

α1 (1, 2, 1)T + α2 (1, 0, 2)T + α3 (5, 6, 7)T = 0  ⇔  [ 1 1 5; 2 0 6; 1 2 7 ] ( α1; α2; α3 ) = 0

EA = [ 1 0 3; 0 1 2; 0 0 0 ]

α1 = −3    α2 = −2    α3 = 1

slide-46
SLIDE 46

Linear independence and Matrices

  • Let A be an m × n matrix.
  • These are equivalent to saying the columns of A

form a linearly independent set

  • N(A) = {0} rank(A) = n
  • These are equivalent to saying the rows of A form

a linearly independent set

  • N(AT ) = {0} rank(A) = m
  • If A is square, these are equivalent to saying

matrix A is non-singular

  • Columns of A form a linearly independent set
  • Rows of A form a linearly independent set
slide-47
SLIDE 47

Diagonal dominance

  • An n × n matrix A = [aij] is diagonally

dominant whenever

  • I.e., diagonal elements are larger in magnitude than

the sum of magnitudes of other row elements

  • These matrices appear frequently in practical

applications. Two important properties are
  • They are never singular
  • Don’t need to use partial pivoting

|akk| > Σ_{j≠k} |akj|,  k ∈ {1, 2, . . . , n}
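The row-by-row dominance test translates directly into code; this sketch checks an illustrative 3 × 3 matrix and confirms the non-singularity property stated above:

```python
import numpy as np

def is_diagonally_dominant(A):
    """Check |a_kk| > sum over j != k of |a_kj| for every row k."""
    A = np.asarray(A, dtype=float)
    diag = np.abs(np.diag(A))
    off = np.abs(A).sum(axis=1) - diag  # off-diagonal row sums
    return bool(np.all(diag > off))

A = np.array([[4.0, 1.0, -2.0],
              [1.0, 5.0,  2.0],
              [0.0, 2.0, -3.0]])

dominant = is_diagonally_dominant(A)               # 4 > 3, 5 > 3, 3 > 2
nonsingular = np.linalg.matrix_rank(A) == 3        # as the theorem guarantees
```

The same predicate is what an LU factorization routine would consult before deciding it can safely skip partial pivoting.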

slide-48
SLIDE 48

Diagonal dominance

  • Diagonally dominant matrices are non-singular
  • Proof by contradiction
  • Assume there is a non-zero vector in N(A)
  • Find a contradiction
  • Let Ax = 0, and let xk be the entry of largest

magnitude in x

[Ax]k = 0 = Σ_{j=1..n} akj xj  ⇒  akk xk = − Σ_{j≠k} akj xj

⇒ |akk||xk| = | Σ_{j≠k} akj xj | ≤ Σ_{j≠k} |akj||xj| ≤ |xk| Σ_{j≠k} |akj|

⇒ |akk| ≤ Σ_{j≠k} |akj|  (contradicts diagonal dominance)

slide-49
SLIDE 49

Polynomial interpolation

  • Given a set of m points S = {(x1, y1), . . . , (xm, ym)},

where the xi are distinct, there is a unique polynomial

ℓ(t) = α0 + α1t + · · · + αm−1 t^(m−1)

of degree m − 1 that goes through each point in S

α0 + α1x1 + α2x1^2 + · · · + αm−1 x1^(m−1) = ℓ(x1) = y1
α0 + α1x2 + α2x2^2 + · · · + αm−1 x2^(m−1) = ℓ(x2) = y2
. . .
α0 + α1xm + α2xm^2 + · · · + αm−1 xm^(m−1) = ℓ(xm) = ym

slide-50
SLIDE 50

Polynomial interpolation

  • Same as saying the following system has a

unique solution for any right-hand side yi

  • The matrix is non-singular whenever the xi are distinct
  • Such matrices are called Vandermonde Matrices

[ 1  x1  x1^2  · · ·  x1^(m−1) ] [ α0   ]   [ y1 ]
[ 1  x2  x2^2  · · ·  x2^(m−1) ] [ α1   ]   [ y2 ]
[ .   .    .     .        .    ] [  .   ] = [  . ]
[ 1  xm  xm^2  · · ·  xm^(m−1) ] [ αm−1 ]   [ ym ]

slide-51
SLIDE 51

Vandermonde Matrices

  • Vandermonde matrices have independent

columns whenever n ≤ m

  • Proof
  • Suppose Vα = 0 for the m × n Vandermonde matrix V; then for each i

p(xi) = α0 + α1xi + α2xi^2 + · · · + αn−1 xi^(n−1) = 0

  • So p(x) has m distinct roots but degree at most n − 1 < m
  • The fundamental theorem of algebra then implies

p = 0, i.e., every αj = 0

slide-52
SLIDE 52

Lagrange interpolator

  • In particular, when n = m we have that

has a unique solution

  • The solution is the Lagrange interpolator

ℓ(t) = Σ_{i=1..m} yi ( Π_{j≠i} (t − xj) / Π_{j≠i} (xi − xj) )

[ 1  x1  x1^2  · · ·  x1^(m−1) ] [ α0   ]   [ y1 ]
[ 1  x2  x2^2  · · ·  x2^(m−1) ] [ α1   ]   [ y2 ]
[ .   .    .     .        .    ] [  .   ] = [  . ]
[ 1  xm  xm^2  · · ·  xm^(m−1) ] [ αm−1 ]   [ ym ]
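The Lagrange formula can be sketched directly; the data points below are illustrative (samples of y = t^2), and a degree-2 interpolator must reproduce them exactly:

```python
import numpy as np

def lagrange(xs, ys):
    """Return the Lagrange interpolator l(t) = sum_i y_i * prod_{j!=i}(t - x_j)/(x_i - x_j)."""
    xs, ys = np.asarray(xs, float), np.asarray(ys, float)
    def l(t):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            num = np.prod([t - xj for j, xj in enumerate(xs) if j != i])
            den = np.prod([xi - xj for j, xj in enumerate(xs) if j != i])
            total += yi * num / den
        return total
    return l

# Interpolate y = t^2 through three points; degree-2 data is recovered exactly
p = lagrange([0.0, 1.0, 3.0], [0.0, 1.0, 9.0])
value_at_2 = p(2.0)   # equals 4.0, since the unique quadratic through the data is t^2
```

Each term of the sum is 1 at its own node xi and 0 at every other node, which is exactly why the formula solves the Vandermonde system above.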

slide-53
SLIDE 53

Example of interpolation

[Figure: data points and their interpolating polynomials, including P4(x) and P5(x), plotted over 2 ≤ x ≤ 8]

slide-54
SLIDE 54

Maximal independent subsets #1

  • We know that if rank(Am×n) < n then the

columns of A must be a dependent set

  • In such cases, we often want to extract the

maximal independent subset of columns

  • An l.i. set with as many columns of A as possible
  • Such columns are sufficient to span R(A)
slide-55
SLIDE 55

Maximal independent subsets #2

  • If rank(Am×n) = r, then the following hold
  • Any maximal independent subset of the columns

of A contains exactly r columns

  • Any maximal independent subset of rows from A

contains exactly r rows

  • In particular, the r basic columns in A constitute a

maximal independent subset of the columns of A

slide-56
SLIDE 56

Maximal independent subsets #3

  • Any maximal independent subset of the

columns of A contains exactly r columns

  • Proof
  • Every column of matrix A can be written as a

linear combination of the basic columns in A

  • Pick k > r columns of A and show they are l.d.

Pick k > r columns A∗s1 , A∗s2 , . . . , A∗sk . Each is a combination of the r basic columns, so

( A∗s1 A∗s2 · · · A∗sk ) = ( A∗b1 A∗b2 · · · A∗br ) [ β11 β12 · · · β1k; β21 β22 · · · β2k; . . . ; βr1 βr2 · · · βrk ]

The r × k coefficient matrix has rank ≤ r < k, so some ( α1; α2; . . . ; αk ) ≠ 0 satisfies

( A∗s1 A∗s2 · · · A∗sk ) ( α1; α2; . . . ; αk ) = 0

slide-57
SLIDE 57

Basic facts about Independence

  • The following hold about a set of vectors in V
  • If S contains an l.d. subset, then S itself must be l.d.
  • If S is l.i., then every subset of S is also l.i.
  • If S is l.i. and v ∈ V, then S ∪ {v} is l.i. iff v ∉ span(S)
  • Proof (⇐)
  • If S ⊆ Rm and n > m, then S is l.d.

S = {u1, u2, . . . , un}

α1u1 + α2u2 + · · · + αnun + αn+1v = 0 ⇒ αn+1 = 0 (else v ∈ span(S)) ⇒ α1u1 + α2u2 + · · · + αnun = 0 ⇒ α1 = α2 = · · · = αn = 0

slide-58
SLIDE 58

BASIS AND DIMENSION

slide-59
SLIDE 59

Bases

  • A basis for a vector space V is a set S that
  • Spans V
  • Is linearly independent
  • Spanning sets can contain redundant vectors
  • Bases, on the other hand, contain only

necessary and sufficient information

  • Every vector space V has a basis
  • General proof depends on the axiom of choice
  • Bases are not unique
slide-60
SLIDE 60

Examples

  • The unit vectors S = {e1, e2, . . . , en} in Rn are a

basis for Rn, the standard or canonic basis of Rn

  • If A is an n × n non-singular matrix, then the set

of rows and the set of columns of A each constitute a basis of Rn

  • What about Z = {0}?
  • The set S = {1, x, x^2, . . . , x^n} is a basis for all

polynomials of degree n or less

  • What about the vector space of all polynomials?

slide-61
SLIDE 61

Characterizations of a Basis

  • With V a subspace of Rm and B = {b1, b2, . . . , bn} ⊆ V,

the following are equivalent

  • (1) B is a basis for V
  • (2) B is a minimal spanning set for V
  • (3) B is a maximal l.i. subset of V

slide-62
SLIDE 62

Proof #1

  • (basis) ⇒ (minimal spanning set)
  • Assume B is a basis and X is a smaller spanning set
  • (minimal spanning set) ⇒ (basis)
  • A minimal spanning set must be l.i.
  • Otherwise remove an l.d. vector and reduce size
  • So it wasn’t minimal (contradiction)

( b1 b2 · · · bn ) = ( x1 x2 · · · xk ) [ α11 α12 · · · α1n; α21 α22 · · · α2n; . . . ; αk1 αk2 · · · αkn ]

i.e., B = XA with k < n, so rank(A) ≤ k < n ⇒ ∃y ≠ 0 | Ay = 0 ⇒ By = XAy = 0 ⇒ B is l.d. (contradiction)

slide-63
SLIDE 63

Proof #2

  • (maximal l.i. set) ⇒ (basis)
  • If a maximal l.i. set B of V is not a basis for V

then there is

  • So is l.i. and B is not maximal (contradiction)
  • (basis) ⇒ (maximal l.i. set)
  • If basis B is not maximal l.i., then take a

larger set Y that is maximal l.i.

  • We know Y is a basis
  • But a basis is minimal and B is smaller
  • So B must also be maximal

v ∈ V | v ∉ span(B),  and then B ∪ {v}

slide-64
SLIDE 64

Dimension

  • We have just proven that, although there are

many bases for V, each of them has the same number of vectors

  • The dimension of a space V, dim V, is the

number of vectors in

  • Any basis of V
  • Any minimal spanning set for V
  • Any maximal independent set for V
slide-65
SLIDE 65

Examples

  • If Z = {0}, then dim Z = 0
  • The basis is the empty set
  • L is a line through the origin in R3, dim L = 1
  • Any non-zero vector along L forms a basis for L
  • P is a plane through the origin in R3, dim P = 2
  • How would we find a basis?
  • dim Rn = n
  • The canonic vectors form a basis

Z = {0}

slide-66
SLIDE 66

Further insights

  • Dimension measures the “amount of stuff”

in a subspace

  • Point < Line < Plane < R3
  • Also measures the number of degrees of

freedom in the subspace

  • Z: no freedom, Line: 1 degree, Plane: 2 etc
  • Do not confuse with number of components

in a vector! Related, but not equal!

slide-67
SLIDE 67

Subspace dimension

  • Let M and N be vector spaces with M ⊆ N
  • (1) dim M ≤ dim N
  • (2) If dim M = dim N , then M = N
  • Proof
  • (1) Assume dim M > dim N
  • Basis of M (all l.i. elements of N ) would have more

vectors than the maximum independent set of N

  • (2) Assume M ⊂ N with dim M = dim N
  • Augment a basis of M with some v ∈ N \ M
  • Independent set with more than dim N vectors! (contradiction)

slide-68
SLIDE 68

Four Fundamental Subspaces: Dimension

  • For an m × n matrix A with rank(A) = r
  • dim R(A) = r
  • dim N(A) = n – r
  • dim R(AT) = r
  • dim N(AT) = m – r
slide-69
SLIDE 69

Rank Plus Nullity Theorem

  • For all m × n matrices A

dim R(A) + dim N(A) = n

  • As the “amount of stuff” in R(A) grows,

the “amount of stuff” in N(A) shrinks

  • (dim N(A) was traditionally known as nullity)
slide-70
SLIDE 70

Completing a Basis

  • If Sr = {v1, v2, . . . , vr} is an l.i. subset of an

n-dimensional space V, where r < n, show how to extend Sr with {vr+1, . . . , vn} so that Sn = {v1, . . . , vr, vr+1, . . . , vn} forms a basis for V

  • Solution
  • Create a matrix A with Sr as columns
  • Augment A to (A|I) by the identity matrix
  • Reduce to echelon form to find basic columns
  • Return the n basic columns of (A|I)
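The procedure above can be sketched numerically; here a rank test stands in for reading basic columns off the echelon form, and the two starting vectors are a hypothetical l.i. pair chosen for illustration:

```python
import numpy as np

def complete_basis(vectors, n):
    """Extend an l.i. set of vectors in R^n to a basis via basic columns of (A|I)."""
    # Columns of (A|I): the given vectors followed by the identity columns
    A = np.column_stack(vectors + [np.eye(n)[:, i] for i in range(n)])
    basis, cols = [], []
    for j in range(A.shape[1]):
        trial = cols + [j]
        # Column j is basic iff it enlarges the rank of the columns kept so far
        if np.linalg.matrix_rank(A[:, trial]) == len(trial):
            cols = trial
            basis.append(A[:, j])
        if len(cols) == n:
            break
    return basis

v1 = np.array([1.0, -1.0, 2.0, 0.0])
v2 = np.array([0.0, 1.0, -2.0, 1.0])   # hypothetical l.i. pair in R^4
B = complete_basis([v1, v2], 4)
full_rank = np.linalg.matrix_rank(np.column_stack(B)) == 4
```

Because v1 and v2 are scanned first, they are guaranteed to survive into the returned basis; the identity columns only fill the remaining n − r slots.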

slide-71
SLIDE 71

Example

  • Take two l.i. vectors in R4 and augment to a

complete basis for R4

  • Solution

S4 = { v1, v2, ei, ej }, where ei and ej are the two columns of I that appear as basic columns in E(A|I)

slide-72
SLIDE 72

Graphs

  • A graph G is a pair (V, E), where

V is a set of vertices and E is a set of edges

  • Each edge connects two vertices
  • So E ⊆ V × V

e1 = (v2, v1) e2 = (v1, v4)

. . .

[Figure: a graph with vertices v1, . . . , v4 and edges e1, . . . , e6]

slide-73
SLIDE 73

Incidence Matrices

  • For a graph G with m vertices and n edges
  • Associate an m × n matrix E such that

[E]ij = 1 if ej = (∗, vi),  −1 if ej = (vi, ∗),  0 otherwise

[Figure: the graph with vertices v1, . . . , v4 and edges e1, . . . , e6, and its 4 × 6 incidence matrix E, rows indexed by v1, . . . , v4 and columns by e1, . . . , e6]

slide-74
SLIDE 74

Rank and Connectivity

  • Each edge is associated to two vertices
  • Each column contains two entries (1, and –1)
  • All columns add up to zero
  • In other words, if eT = (1 1 … 1),

then eT E = 0 and therefore

  • So
  • Equality holds iff the graph is connected!
  • I.e., when there is a sequence of edges connecting

any pair of vertices

e ∈ N(ET )

rank(E) = rank(ET ) = m − dim N(ET ) ≤ m − 1
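The rank bound can be checked on a small example; the edge list below is an illustrative connected graph on four vertices, so the theorem predicts rank(E) = m − 1 = 3:

```python
import numpy as np

def incidence_matrix(m, edges):
    """m x n incidence matrix: column j is -1 at the tail and +1 at the head of edge j."""
    E = np.zeros((m, len(edges)))
    for j, (tail, head) in enumerate(edges):
        E[tail, j] = -1.0
        E[head, j] = 1.0
    return E

# A connected graph on 4 vertices (illustrative edge list, contains a spanning tree)
edges = [(0, 1), (1, 2), (2, 3), (0, 3), (1, 3)]
E = incidence_matrix(4, edges)

assert np.allclose(np.ones(4) @ E, 0)   # e^T E = 0, so e is in N(E^T)
rank_E = np.linalg.matrix_rank(E)       # m - 1 = 3 for a connected graph
```

Deleting all edges touching one vertex would disconnect the graph and drop the rank further, matching the ≤ m − 2 bound proved below for disconnected graphs.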

slide-75
SLIDE 75

Proof of Rank and Connectivity #1

  • Proof (⇒)
  • Assume G is connected, prove dim N(ET ) = 1
  • I.e., prove e = (1 1 … 1)T spans N(ET )
  • Let x ∈ N(ET ) and take any xi and xk from x
  • There is a path from vi to vk
  • Take the subset of vertices visited along the way:

{ vj1 = vi, vj2 , . . . , vjr = vk }

  • There is an edge q linking vjp and vjp+1

slide-76
SLIDE 76

Proof of Rank and Connectivity #2

  • There is an edge q linking vjp and vjp+1
  • So column q in E is −1 at row jp and 1 at row jp+1
  • But xT E = 0, and so xT E∗q = 0 = xjp+1 − xjp
  • Since this is true for all p, it turns out xi = xk
  • But i and k were arbitrary
  • And so finally we reach x = αe
  • So dim N(ET ) = 1
  • Which leads to rank(E) = m − 1

slide-77
SLIDE 77

Proof of Rank and Connectivity #3

  • Proof
  • If the graph is not connected, we can partition it

into two disconnected subgraphs G1 and G2

  • Reorder vertices so vertices/edges in G1 appear

before vertices/edges of G2 in E.

  • Now compute the rank

(⇐)

E = [ E1 0; 0 E2 ]

rank(E) = rank(E1) + rank(E2) ≤ (m1 − 1) + (m2 − 1) = m − 2 < m − 1

slide-78
SLIDE 78

Application of Rank and Connectivity

  • Nodes

1 : I1 − I2 − I5 = 0
2 : −I1 − I3 + I4 = 0
3 : I3 + I5 + I6 = 0
4 : I2 − I4 − I6 = 0

  • Loops

A : I1R1 − I3R3 + I5R5 = E1 − E3
B : I2R2 − I5R5 + I6R6 = E2
C : I3R3 + I4R4 − I6R6 = E3 + E4

slide-79
SLIDE 79

Rank of a product

  • Equivalent matrices have the same rank
  • Recall the rank normal form
  • Multiplication by invertible matrices preserves rank
  • Multiplication by rectangular or singular

matrices can reduce the rank

  • If A is m × n and B is n × p then

rank(AB) = rank(B) − dim( N(A) ∩ R(B) )

slide-80
SLIDE 80

Proof #1

  • Start with a basis S = {x1, x2, . . . , xs} for N(A) ∩ R(B)
  • Augment to form a basis Sext = {x1, . . . , xs, z1, . . . , zt} for R(B)
  • Let us prove that dim R(AB) = t, so that

rank(AB) = rank(B) − dim( N(A) ∩ R(B) )

  • Sufficient to prove that T = {Az1, . . . , Azt} is a basis for R(AB)

slide-81
SLIDE 81

Proof #2

  • T spans R(AB)

b ∈ R(AB) ⇒ b = ABy;  By ∈ R(B) ⇒ By = Σ ξixi + Σ ηizi

b = A( Σ ξixi ) + A( Σ ηizi ) = Σ ηi Azi  (since each xi ∈ N(A))

  • T is l.i.

Σ αi Azi = 0 ⇒ A( Σ αizi ) = 0 ⇒ Σ αizi ∈ N(A) ∩ R(B)

⇒ Σ αizi = Σ βixi ⇒ Σ αizi − Σ βixi = 0 ⇒ αi = βi = 0  (Sext is l.i.)

slide-82
SLIDE 82

Small perturbations can’t reduce rank

  • We already know that we can’t increase rank

by means of matrix product

  • We now show it is impossible to reduce rank

by adding a matrix that is “small enough”

  • “Small” in a sense that will be clarified later,

but for now here is some intuition

rank(AB) ≤ rank(B) rank(A + E) ≥ rank(A)

slide-83
SLIDE 83

Proof

  • Suppose rank(A) = r and let P and Q reduce

A to rank normal form

  • Apply P and Q to A + E
  • But Ir + E11 is invertible. Keep eliminating
  • From which

PAQ = [ Ir 0; 0 0 ]    PEQ = [ E11 E12; E21 E22 ]

P(A + E)Q = [ Ir + E11  E12; E21  E22 ]

P2P(A + E)QQ2 = [ Ir + E11  0; 0  S ]

rank(A + E) = rank(Ir + E11) + rank(S) = rank(A) + rank(S) ≥ rank(A)

slide-84
SLIDE 84

Pitfall solving singular systems

  • Due to floating-point precision,

we do not really solve Ax = b

  • We solve some perturbed system (A+E)x = b
  • If A is non-singular, so is A+E and we are fine
  • If A is singular, A+E may have higher rank!
  • All we need is for rank(S) > 0!
  • But
  • So fewer free variables than actual system
  • Significant loss of information

S = E22 − E21(I + E11)−1E12

slide-85
SLIDE 85

Products ATA and AAT

  • For A in Rm×n, the following statements hold
  • rank(ATA) = rank(A) = rank(AAT )
  • R(ATA) = R(AT )

and R(AAT ) = R(A)

  • N(ATA) = N(A)

and N(AAT ) = N(AT )

  • For A in Cm×n, replace transposition by

conjugate transpose operation

slide-86
SLIDE 86

Proof #1

  • rank(ATA) = rank(A)
  • We know that rank(AT A) = rank(A) − dim( N(AT ) ∩ R(A) )
  • So prove N(AT ) ∩ R(A) = {0}

x ∈ N(AT ) ∩ R(A) ⇒ AT x = 0 and x = Ay ⇒ AT Ay = 0 ⇒ yT AT Ay = 0 ⇒ xT x = 0 ⇒ Σ xi^2 = 0 ⇒ x = 0

slide-87
SLIDE 87

Proof #2

  • R(ATA) = R(AT )
  • N(ATA) = N(A)

R(BC) ⊆ R(B) ⇒ R(AT A) ⊆ R(AT )
dim R(AT A) = rank(AT A) = rank(A) = rank(AT ) = dim R(AT ), so the inclusion is an equality

N(B) ⊆ N(CB) ⇒ N(A) ⊆ N(AT A)
dim N(A) = n − rank(A) = n − rank(AT A) = dim N(AT A), so the inclusion is an equality

slide-88
SLIDE 88

Application for ATA

  • Consider an m×n system Ax = b that may or

may not be consistent

  • Multiply on the left by AT to reach

ATAx = AT b

  • This is known as the associated system of

normal equations

  • It has many nice properties
slide-89
SLIDE 89

Application for ATA

  • ATAx = AT b is always consistent!
  • If Ax = b is consistent, then both systems

have the same solution set

  • Take a particular solution p for Ax = b
  • If Ap = b, then ATAp = AT b
  • General solution is p + N(A) = p + N(ATA)
  • If Ax = b has a unique solution, then it is
  • x = (ATA)−1AT b, since N(A) = {0} = N(ATA)

AT b ∈ R(AT ) = R(AT A), so the normal equations are always consistent

(warning: A may not even be square, so A itself need not be invertible!)

slide-90
SLIDE 90

Normal equations

  • For an m×n system Ax = b, the associated system of

normal equations is the n×n system ATAx = AT b

  • ATAx = AT b is always consistent,

even when Ax = b is not

  • When both are consistent, the solution sets agree
  • Otherwise, ATAx = AT b gives the least-squares

solution to Ax = b

  • When Ax = b is consistent and has a unique solution,

so does ATAx = AT b and the solution x = (ATA)-1AT b
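The guarantees above are easy to demonstrate: the small system below is inconsistent by construction, yet its normal equations solve cleanly and agree with numpy's built-in least-squares solver. A minimal sketch with illustrative data:

```python
import numpy as np

# An inconsistent 3x2 system: b is not in R(A)
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 1.0, 0.0])

# The normal equations A^T A x = A^T b are always consistent;
# here rank(A) = 2 = n, so A^T A is invertible and the solution is unique
x = np.linalg.solve(A.T @ A, A.T @ b)

# The same least-squares solution via numpy's built-in solver
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
agree = bool(np.allclose(x, x_lstsq))
```

In practice `lstsq` (or a QR factorization) is preferred over forming ATA explicitly, since squaring A also squares its condition number.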

slide-91
SLIDE 91

LEAST SQUARES

slide-92
SLIDE 92

Motivating problem

  • Assume we observe a phenomenon that varies

with time and record observations

  • Want to be able to infer the value of an

observation at an arbitrary point in time: f(t̂) = b̂

  • Assume we have a sensible model for f,

e.g. f(t) = α + βt

  • Find “good” values for α and β given

D = { (t1, b1), (t2, b2), . . . , (tm, bm) }
slide-93
SLIDE 93

Proposed solution

  • Want to find the “best” α and β
  • Find values for α and β that minimize

Σ_{i=1..m} εi^2 = Σ_{i=1..m} ( f(ti) − bi )^2,  f(t) = α + βt

  • Turns out this reduces

to a linear problem

  • Let us express in vector

form and generalize
slide-94
SLIDE 94

Changing to vector form

  • In our example, define
  • Then and

A = [ 1 t1
      1 t2
      .  .
      1 tm ]    x = ( α; β )    b = ( b1; b2; . . . ; bm )    ε = Ax − b

[ε]i = α + βti − bi = εi

Σ_{i=1..m} εi^2 = εT ε = (Ax − b)T (Ax − b)
= xT AT Ax − xT AT b − bT Ax + bT b = xT AT Ax − 2xT AT b + bT b

slide-95
SLIDE 95

The minimization problem

  • Our goal is to find argmin over x of ε(x), where the scalar

function ε(x) = xT AT Ax − 2xT AT b + bT b

  • From calculus, at the minimum ∇ε(x) = 0, where

[∇ε(x)]i = ∂ε(x)/∂xi

  • Both Ax and xT AT can be seen as matrix

functions of each xi

  • We can use our rules for differentiation of

matrix functions
slide-96
SLIDE 96

Finding the minimum

  • Differentiating ε(x) = xT AT Ax − 2xT AT b + bT b

w.r.t. each component in x we get

[∇ε(x)]i = ∂ε(x)/∂xi = (∂x/∂xi)T AT Ax + xT AT A(∂x/∂xi) − 2(∂x/∂xi)T AT b

  • Since ∂x/∂xi = ei and eiT AT = [AT ]i∗

[∇ε(x)]i = eiT AT Ax + xT AT Aei − 2eiT AT b = 2eiT AT Ax − 2eiT AT b = 2[AT Ax]i − 2[AT b]i

  • Equating to zero and grouping all rows

ATAx = AT b

slide-97
SLIDE 97

Is there a favorite solution?

  • Calculus tells us that the minimum of

can only happen at some solution of the normal equations ATAx = AT b

  • Are all solutions equally good?
  • Take any two solutions z1 and z2 = z1 + u
  • Same argument proves no other vector can

produce a lower value for ε(x)

ε(x) = xT AT Ax − 2xT AT b + bT b

ε(z2) = ε(z1 + u) = ε(z1) + uT AT Au = ε(z1)

ε(z1) = bT b − z1T AT b

slide-98
SLIDE 98

General Least Squares

  • For A in R^{m×n} and b in R^m, let ε = Ax − b
  • The general least squares problem is to find

a vector x that minimizes the quantity

sum_{i=1}^m ε_i^2 = ε^T ε = (Ax − b)^T (Ax − b)

  • Any such vector is a least-squares solution
  • The solution set is the same as that of A^T A x = A^T b
  • The solution is unique iff rank(A) = n, in which case

x = (A^T A)^{−1} A^T b

  • If Ax = b is consistent, the solution sets coincide
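As a sanity check, the normal-equations route can be compared against a library least-squares solver; a minimal numpy sketch (the data values here are made up for illustration):

```python
import numpy as np

# Hypothetical data: fit f(t) = alpha + beta*t to noisy points.
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
b = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

# Design matrix A with columns [1, t], as in the derivation.
A = np.column_stack([np.ones_like(t), t])

# Solve the normal equations A^T A x = A^T b directly...
x_normal = np.linalg.solve(A.T @ A, A.T @ b)

# ...and compare with the library least-squares routine.
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

print(x_normal)                         # [alpha, beta]
print(np.allclose(x_normal, x_lstsq))   # True
```

In practice `np.linalg.lstsq` (QR/SVD based) is preferred over forming A^T A explicitly, since A^T A squares the condition number of A.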

slide-99
SLIDE 99

Example of Linear Regression

  • Predict amount of weight that a pint of ice-

cream loses when stored at low temperatures

  • Assume a linear model for the phenomenon

y = α_0 + α_1 t_1 + α_2 t_2 + ε

with t_1 (time), t_2 (temperature), and ε (random noise)

  • Assume the random noise “averages out”
  • Use measurements to find the least-squares

solution for the parameters in

E(t_1, t_2) = α_0 + α_1 t_1 + α_2 t_2

slide-100
SLIDE 100

Result of experiments

  • Assume the following measurements

Time (weeks):   1     1     1     2     2     2     3     3     3
Temp (°C):    −10    −5     0   −10    −5     0   −10    −5     0
Loss (grams): 0.15  0.18  0.20  0.17  0.19  0.22  0.20  0.23  0.25

  • In vector form, we get

A = [1 1 −10; 1 1 −5; 1 1 0; 1 2 −10; 1 2 −5; 1 2 0; 1 3 −10; 1 3 −5; 1 3 0]
b = (0.15, 0.18, 0.20, 0.17, 0.19, 0.22, 0.20, 0.23, 0.25)^T
x = (α_0, α_1, α_2)^T

slide-101
SLIDE 101

Solution

  • The system is almost certainly inconsistent:

Ax = b would require every ε_i = 0

  • The best we can do is solve the normal equations

A^T A x = A^T b

  • Which leads to

[9 18 −45; 18 42 −90; −45 −90 375] (α_0, α_1, α_2)^T = (1.79, 3.73, −8.2)^T

  • Solving gives E(t_1, t_2) = 0.174 + 0.025 t_1 + 0.005 t_2
  • So, for example,

E(9, −35) = 0.174 + 0.025(9) + 0.005(−35) = 0.224
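A quick numerical check of this example, using numpy and the measurement table above:

```python
import numpy as np

# Measurements from the slides: time (weeks), temperature (°C), loss (g).
t1 = np.array([1, 1, 1, 2, 2, 2, 3, 3, 3], dtype=float)
t2 = np.array([-10, -5, 0, -10, -5, 0, -10, -5, 0], dtype=float)
loss = np.array([0.15, 0.18, 0.20, 0.17, 0.19, 0.22, 0.20, 0.23, 0.25])

# Design matrix for E(t1, t2) = a0 + a1*t1 + a2*t2.
A = np.column_stack([np.ones_like(t1), t1, t2])

# Normal equations A^T A x = A^T b.
a0, a1, a2 = np.linalg.solve(A.T @ A, A.T @ loss)
print(round(a0, 3), round(a1, 3), round(a2, 3))   # 0.174 0.025 0.005

# Predicted loss after 9 weeks at -35 °C.
print(round(a0 + a1 * 9 + a2 * (-35), 3))         # 0.224
```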

slide-102
SLIDE 102

Example of Curve Fitting

  • Interestingly, least squares can be used to fit

non-linear models to data

  • The model must be linear in the parameters we

are solving for

  • It does not need to be linear in the variables
  • I.e., linear in the α_i, though maybe not in t_i
  • For example, we can use least squares to fit a

polynomial to a set of points

slide-103
SLIDE 103

Polynomial fitting problem

  • Find a polynomial p(t) = α_0 + α_1 t + α_2 t^2 + · · · + α_{n−1} t^{n−1}

of a given degree that fits the data D = {(t_1, b_1), (t_2, b_2), . . . , (t_m, b_m)} as well as possible in the least squares sense

  • Assume the t_i are distinct and n ≤ m
  • Here we are more interested in n < m
  • Otherwise we can fit the data perfectly
  • With Lagrange interpolation
  • Fitting “perfectly” is not always a good idea
slide-104
SLIDE 104

In matrix form

  • Analogous to the earlier derivation, since

sum_{i=1}^m ε_i^2 = sum_{i=1}^m (p(t_i) − b_i)^2 = (Ax − b)^T (Ax − b)

  • This time with

A = [1 t_1 t_1^2 · · · t_1^{n−1};
     1 t_2 t_2^2 · · · t_2^{n−1};
     . . .
     1 t_m t_m^2 · · · t_m^{n−1}]

x = (α_0, α_1, . . . , α_{n−1})^T   b = (b_1, b_2, . . . , b_m)^T
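The matrix A above is a Vandermonde matrix, which numpy can build directly; a small sketch (the sample points are illustrative):

```python
import numpy as np

# Sample abscissas; n is the number of polynomial coefficients.
t = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
n = 3  # fit a degree-2 polynomial (n - 1 = 2)

# Vandermonde matrix with columns [1, t, t^2], matching A above.
A = np.vander(t, N=n, increasing=True)
print(A.shape)  # (5, 3)
print(A[1])     # [1.   0.5  0.25]
```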

slide-105
SLIDE 105

Example of Polynomial Fitting

  • Use measurements of the height of a

projectile at different positions to find out where it will land

  • Sensible model for trajectory is a parabola
  • So we have p(t) = α_0 + α_1 t + α_2 t^2
  • Assume we have more than three data points
  • Only three data points are strictly needed
  • But we don’t want to discard useful data
  • Using all of it improves results when measurements are noisy

slide-106
SLIDE 106

Fitting the data

  • Measurements (the launch point (0, 0) is included as a data point)

Distance from source (km): 0    0.25  0.50  0.75  1
Height (m):                0    8     15    19    20

  • In matrix form

A = [1 0 0; 1 0.25 0.0625; 1 0.5 0.25; 1 0.75 0.5625; 1 1 1]

A^T A = [5 2.5 1.875; 2.5 1.875 1.5625; 1.875 1.5625 1.38281]
A^T b = (62, 43.75, 34.9375)^T

  • Solving the normal equations gives

x = (−0.2286, 39.83, −19.43)^T   p(t) = −0.2286 + 39.83 t − 19.43 t^2

  • The projectile lands where the fitted parabola returns to zero:

p(t) = 0 ⇒ t ∈ {0.005755, 2.044}

[Plot: measured points with the fitted parabola p(t) over 0 ≤ t ≤ 1]
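The whole fit can be reproduced numerically; a numpy sketch using the table above:

```python
import numpy as np

# Projectile data from the slides, including the launch point (0, 0).
t = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
h = np.array([0.0, 8.0, 15.0, 19.0, 20.0])

# Vandermonde design matrix for p(t) = a0 + a1*t + a2*t^2.
A = np.vander(t, N=3, increasing=True)

# Least-squares coefficients via the normal equations.
a = np.linalg.solve(A.T @ A, A.T @ h)
print(np.round(a, 2))  # approx. [-0.23  39.83 -19.43]

# Landing point: larger root of p(t) = 0 (np.roots wants the highest degree first).
roots = np.roots(a[::-1]).real
print(round(max(roots), 3))  # approx. 2.044
```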

slide-107
SLIDE 107

[Plots: fitted curves extrapolated over 0 ≤ t ≤ 2]

What about Lagrange Interpolation?

  • Also uses all the data, as long as we increase the

polynomial degree

Distance from source (km): 0    0.25  0.50  0.75  1
Height (m):                0    8     15    19    20

ℓ(t) = sum_{i=1}^m b_i [ prod_{j≠i} (t − t_j) / prod_{j≠i} (t_i − t_j) ]

ℓ(t) = (1/3)(88t + 68t^2 − 160t^3 + 64t^4)
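A direct implementation of the Lagrange formula makes it easy to see both the perfect fit at the data points and the wild behavior outside the data range; a plain-Python sketch:

```python
def lagrange_eval(t, ts, bs):
    """Evaluate the Lagrange interpolant of the points (ts, bs) at t."""
    total = 0.0
    for i in range(len(ts)):
        # Basis polynomial: prod_{j != i} (t - t_j) / (t_i - t_j)
        basis = 1.0
        for j in range(len(ts)):
            if j != i:
                basis *= (t - ts[j]) / (ts[i] - ts[j])
        total += bs[i] * basis
    return total

ts = [0.0, 0.25, 0.5, 0.75, 1.0]
bs = [0.0, 8.0, 15.0, 19.0, 20.0]

# The interpolant reproduces every data point exactly...
print([round(lagrange_eval(t, ts, bs), 6) for t in ts])

# ...but extrapolates wildly: far above the parabola at t = 2.
print(round(lagrange_eval(2.0, ts, bs), 2))  # 64.0
```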

slide-108
SLIDE 108

LINEAR TRANSFORMATIONS

slide-109
SLIDE 109

Linear Transformations

  • Given two vector spaces U and V over a field F
  • A linear transformation from U to V is a linear

function T mapping U into V, i.e.

T(αx + y) = αT(x) + T(y)

  • A linear operator on U is a linear

transformation mapping U into itself

slide-110
SLIDE 110

Examples #1

  • The zero transformation, 0(x) = 0, maps any

vector in U to the zero vector in V

  • The identity operator I(x) = x, maps every

vector from U back into itself

  • For any m × n matrix A, the function

T(x) = Ax is a linear transformation from Rn to Rm

slide-111
SLIDE 111

Geometric examples

  • The rotator Q in R^2 by an angle θ

Q(u) = [cos θ  −sin θ; sin θ  cos θ] u

  • The projector P from R^3 onto the xy-plane

P(u) = [1 0 0; 0 1 0; 0 0 0] u

  • The reflector R about the xy-plane

R(u) = [1 0 0; 0 1 0; 0 0 −1] u
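These three operators are easy to check numerically; a small numpy sketch:

```python
import numpy as np

theta = np.pi / 2  # rotate by 90 degrees

# Rotator in R^2.
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# Projector onto the xy-plane and reflector about it, in R^3.
P = np.diag([1.0, 1.0, 0.0])
R = np.diag([1.0, 1.0, -1.0])

print(np.round(Q @ np.array([1.0, 0.0])))  # e1 rotates to e2: [0. 1.]
print(P @ np.array([3.0, 4.0, 5.0]))       # [3. 4. 0.]
print(R @ np.array([3.0, 4.0, 5.0]))       # [3. 4. -5.]
```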

slide-112
SLIDE 112

Infinite dimensional examples

  • Let V be the space of differentiable functions,

and W the space of all functions (from R to R). The mapping D(f) = df/dx is a linear transformation from V to W, since

d(αf + g)/dx = α df/dx + dg/dx

  • Let C be the set of all continuous functions

from R to R. The mapping T(f) = ∫_0^x f(t) dt is a linear operator on C, since

∫_0^x (αf(t) + g(t)) dt = α ∫_0^x f(t) dt + ∫_0^x g(t) dt

slide-113
SLIDE 113

Coordinates of a Vector

  • Let B = {b_1, b_2, . . . , b_n} be a basis for U
  • Take a vector v in U
  • The coefficients α_i in the expansion

v = α_1 b_1 + α_2 b_2 + · · · + α_n b_n

are called the coordinates of v w.r.t. B

  • To denote the column vector with these

coefficients, we write [v]_B = (α_1, α_2, · · ·, α_n)^T

  • Order is important!
  • When no basis is specified, the canonical basis (in

standard order) is assumed

slide-114
SLIDE 114

Change of basis as a linear system

  • Find the coordinates of the vector v = (8, 7, 4)^T

in the basis B = {(1, 1, 1)^T, (1, 2, 2)^T, (1, 2, 3)^T}

  • I.e., solve the linear system [b_1 b_2 b_3] [v]_B = v
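Finding the coordinates amounts to solving a 3 × 3 linear system with the basis vectors as columns; a numpy sketch:

```python
import numpy as np

# Basis vectors as the columns of a matrix.
B = np.array([[1.0, 1.0, 1.0],
              [1.0, 2.0, 2.0],
              [1.0, 2.0, 3.0]])
v = np.array([8.0, 7.0, 4.0])

# Coordinates of v w.r.t. B solve the linear system B c = v.
c = np.linalg.solve(B, v)
print(np.round(c))  # [ 9.  2. -3.]
```

Check: 9(1, 1, 1) + 2(1, 2, 2) − 3(1, 2, 3) = (8, 7, 4).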

slide-115
SLIDE 115

Space of Linear Transformations

  • For each pair of vector spaces U and V over F,

the set of linear transformations L(U, V) from U to V is itself a vector space over F

  • Let B = {u_1, u_2, . . . , u_n} and B′ = {v_1, v_2, . . . , v_m}

be bases for U and V; let us find a basis for L(U, V)

  • Write L(u_j) = sum_{i=1}^m α_{ij} v_i and

u = ξ_1 u_1 + ξ_2 u_2 + · · · + ξ_n u_n; then

L(u) = sum_{j=1}^n ξ_j L(u_j)
     = sum_{j=1}^n ξ_j sum_{i=1}^m α_{ij} v_i
     = sum_{j=1}^n sum_{i=1}^m α_{ij} (ξ_j v_i)
     = sum_{j=1}^n sum_{i=1}^m α_{ij} B_{ji}(u),   where B_{ji}(u) = ξ_j v_i

  • The set of all the B_{ji} forms a basis for L(U, V)
  • dim L(U, V) = (dim U)(dim V)

slide-116
SLIDE 116

Proof

  • That the B_{ji} span L(U, V) should be clear
  • We started from an arbitrary transformation
  • To prove that the B_{ji} are l.i., try writing

sum_{j,i} η_{ji} B_{ji} = 0
  ⇒ ( sum_{j,i} η_{ji} B_{ji} )(u_k) = 0
  ⇒ sum_{j,i} η_{ji} B_{ji}(u_k) = 0
  ⇒ sum_{j,i} η_{ji} [u_k]_j v_i = 0
  ⇒ sum_i η_{ki} v_i = 0
  ⇒ η_{ki} = 0

slide-117
SLIDE 117

Coordinate Matrix Representation

  • Let B = {u_1, u_2, . . . , u_n} and B′ = {v_1, v_2, . . . , v_m}

be bases for U and V, respectively

  • The coordinate matrix of L in L(U, V) with

respect to the basis pair (B, B′) is

[L]_{BB′} = ( [L(u_1)]_{B′}  [L(u_2)]_{B′}  · · ·  [L(u_n)]_{B′} )
          = [α_11 α_12 · · · α_1n; α_21 α_22 · · · α_2n; . . . ; α_m1 α_m2 · · · α_mn]

since L(u_k) = sum_{i=1}^m α_{ik} v_i, i.e., [L(u_k)]_{B′} = (α_1k, α_2k, . . . , α_mk)^T

  • Recall L(u) = sum_{j=1}^n sum_{i=1}^m α_{ij} ξ_j v_i

slide-118
SLIDE 118

Change of basis and similarities

  • Coordinate matrices depend on basis choice
  • Some bases may force coordinate matrix to

have desirable properties

  • Learn to choose basis that simplifies our work
  • Other properties do not change regardless of

basis: they are invariant to the choice of basis

  • These properties are inherent to the

transformation itself and are worthy of study

slide-119
SLIDE 119

Change of basis operator

  • Assume we have two bases for V:

B = {x_1, x_2, . . . , x_n} (old basis) and B′ = {y_1, y_2, . . . , y_n} (new basis)

  • Let T be the linear operator on V such that

T(y_i) = x_i

  • T is the change of basis operator from B′ to B
  • And we have

[T]_B = [T]_{B′} = [I]_{BB′}   (not necessarily the identity matrix!)

  • Indeed, [T(y_i)]_{B′} = [x_i]_{B′} = [I(x_i)]_{B′}, and writing

x_i = sum_{j=1}^n α_j y_j ⇒ T(x_i) = sum_{j=1}^n α_j T(y_j) = sum_{j=1}^n α_j x_j

so [T(x_i)]_B = [x_i]_{B′}

slide-120
SLIDE 120

Change of basis matrix

  • Assume we have two bases for V:

B = {x_1, x_2, . . . , x_n} (old basis) and B′ = {y_1, y_2, . . . , y_n} (new basis)

  • If T(y_i) = x_i is the change of basis operator, then

P = [T]_B = [T]_{B′} = [I]_{BB′}

is the change of basis matrix from B to B′

  • We have [v]_{B′} = P[v]_B for all v in V
  • P is nonsingular
  • P is unique

slide-121
SLIDE 121

Example

  • Here are two different bases for the space of

polynomials of degree two or less:

B = {1, t, t^2}   B′ = {1, 1 + t, 1 + t + t^2}

  • Find the coordinates of q(t) = 3 + 2t + 4t^2

relative to B′

  • Answer: [q]_{B′} = (1, −2, 4)^T, since

q(t) = 1(1) − 2(1 + t) + 4(1 + t + t^2)
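The answer can be checked by solving a (triangular) linear system whose columns are the B-coordinates of the B′ vectors; a numpy sketch:

```python
import numpy as np

# Coordinates (w.r.t. B = {1, t, t^2}) of the B' vectors
# 1, 1+t, 1+t+t^2, placed as columns.
M = np.array([[1.0, 1.0, 1.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])

# q(t) = 3 + 2t + 4t^2 in B-coordinates.
q_B = np.array([3.0, 2.0, 4.0])

# Solve M [q]_{B'} = [q]_B for the B'-coordinates.
q_Bp = np.linalg.solve(M, q_B)
print(q_Bp)  # [ 1. -2.  4.]
```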

slide-122
SLIDE 122

Changing Matrix Coordinates

  • Let A be a linear operator on V,

and let B and B′ be two bases for V

  • With P = [I]_{BB′} and Q = [I]_{B′B} = P^{−1},

the coordinate matrices [A]_B and [A]_{B′} are related as follows:

[A]_B = P^{−1} [A]_{B′} P

  • Conversely, [A]_{B′} = Q^{−1} [A]_B Q

slide-123
SLIDE 123

Similarity

  • Matrices B_{n×n} and C_{n×n} are said to be similar

whenever there exists a nonsingular matrix Q such that B = Q^{−1} C Q

  • We write B ≃ C when B and C are similar
  • The linear operator f : R^{n×n} → R^{n×n}

defined by f(C) = Q^{−1} C Q is called a similarity transformation

slide-124
SLIDE 124

Relevance of similarity

  • Any two coordinate matrices for a given linear

operator must be similar

  • We know how to transform one into the other
  • The process is a similarity transformation
  • Conversely, any two similar matrices are

coordinates of the same linear operator

  • For different choices of bases
  • We use similarity transformations to study

linear operators in a basis that simplifies the associated coordinate matrix

slide-125
SLIDE 125

Example of invariance to similarity

  • The trace of a square matrix C is the sum of its diagonal entries
  • It is similarity invariant, since trace(AB) = trace(BA):

trace(Q^{−1}(CQ)) = trace((CQ)Q^{−1}) = trace(C)

  • So we can talk about the trace of a linear operator
  • Regardless of the particular coordinate matrix
  • Even though the matrices differ for different bases
  • Rank is also similarity invariant
  • Why?
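Both invariants are easy to verify numerically on a random similarity; a numpy sketch (the matrices are randomly generated for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# A random operator C and a random (almost surely nonsingular) Q.
C = rng.standard_normal((4, 4))
Q = rng.standard_normal((4, 4))

# Similar matrix B = Q^{-1} C Q.
B = np.linalg.solve(Q, C @ Q)

# Trace and rank are invariant under similarity.
print(np.isclose(np.trace(B), np.trace(C)))                   # True
print(np.linalg.matrix_rank(B) == np.linalg.matrix_rank(C))   # True
```

Rank is invariant because multiplying by a nonsingular matrix cannot change the dimension of the range.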
slide-126
SLIDE 126

INVARIANT SUBSPACES

slide-127
SLIDE 127

Invariant Subspaces

  • Take a linear operator T on V and consider the

image T(X) of a subspace X ⊆ V under T

  • Certainly it is a subset of V, but in general not

related to X in any obvious way

  • If T(X) ⊆ X, we say that X is an invariant

subspace under T

  • We can then look at T as a linear operator on X,

ignoring the rest of V and restricting it to X

  • Denote the restriction of T to X by T/X

slide-128
SLIDE 128

Relevance to coordinate matrices #1

  • Assume we have a basis B_X = {x_1, x_2, . . . , x_r} for the invariant subspace X
  • Augment it to a basis for V:

B = {x_1, x_2, . . . , x_r, y_1, y_2, . . . , y_q}

  • Write the coordinate matrix of T in basis B:

[T]_B = ( · · · [T(x_j)]_B · · · [T(y_j)]_B · · · )

  • But T(x_j) = sum_{i=1}^r α_{ij} x_i, so

[T(x_j)]_B = (α_1j, . . . , α_rj, 0, . . . , 0)^T

while T(y_j) = sum_{i=1}^r β_{ij} x_i + sum_{i=1}^q γ_{ij} y_i, so

[T(y_j)]_B = (β_1j, . . . , β_rj, γ_1j, . . . , γ_qj)^T

slide-129
SLIDE 129

Relevance to coordinate matrices #2

  • So [T]_B is block triangular:

[T]_B = [α_11 · · · α_1r  β_11 · · · β_1q;
         . . .
         α_r1 · · · α_rr  β_r1 · · · β_rq;
         0 · · · 0  γ_11 · · · γ_1q;
         . . .
         0 · · · 0  γ_q1 · · · γ_qq]

i.e., [T]_B = [ [T/X]_{B_X}  B_{r×q} ;  0  C_{q×q} ]

  • What if Y = span{y_1, y_2, . . . , y_q} were also

an invariant subspace?

  • Then [T]_B would be block diagonal:

[T]_B = [ [T/X]_{B_X}  0 ;  0  [T/Y]_{B_Y} ]

slide-130
SLIDE 130

Invariant subspaces and matrix representation

  • Let T be a linear operator on V
  • Let B_X, B_Y, . . . , B_Z be bases for subspaces X, Y, . . . , Z with

dimensions r_1, r_2, . . . , r_k, and let B = B_X ∪ B_Y ∪ · · · ∪ B_Z be a basis for V

  • Subspace X is invariant under T iff

[T]_B = [ A_{r_1×r_1}  B ;  0  C ]   with A = [T/X]_{B_X}

  • Subspaces X, Y, . . . , Z are all invariant under T iff [T]_B is block diagonal:

[T]_B = diag( A_{r_1×r_1}, B_{r_2×r_2}, . . . , C_{r_k×r_k} )

with A = [T/X]_{B_X}, B = [T/Y]_{B_Y}, . . . , C = [T/Z]_{B_Z}

slide-131
SLIDE 131

Triangular and diagonal block forms

  • When T is an n × n matrix:
  • Q is a nonsingular matrix such that

Q^{−1} T Q = [ A_{r×r}  B_{r×q} ;  0  C_{q×q} ]

iff the first r columns of Q span an invariant subspace under T

  • Q = ( Q_1 Q_2 · · · Q_k ) is a nonsingular matrix such that

Q^{−1} T Q = diag( A_{r_1×r_1}, B_{r_2×r_2}, . . . , C_{r_k×r_k} )

iff each Q_i is n × r_i and the columns of each Q_i span an invariant subspace under T

slide-132
SLIDE 132

Example

  • Find all subspaces of R^2 that are invariant under

A = [0 1; −2 3]

  • Solution
  • The zero-dimensional and 2-dimensional subspaces are trivially invariant
  • One-dimensional invariant subspaces M are trickier:

x ∈ M ⇒ Ax ∈ M ⇒ Ax = λx ⇒ (A − λI)x = 0 ⇒ x ∈ N(A − λI)

  • For a nontrivial M we need N(A − λI) ≠ {0}:

[−λ 1; −2 3 − λ] → [−2 3 − λ; −λ 1] → [−2 3 − λ; 0 1 + (λ^2 − 3λ)/2]

so λ^2 − 3λ + 2 = 0, giving λ_1 = 1 and λ_2 = 2

slide-133
SLIDE 133

Example (continued)

  • Find all subspaces of R^2 that are invariant under

A = [0 1; −2 3]

  • Solution (continued): for λ_1 = 1 and λ_2 = 2,

M_1 = N(A − I) = span{(1, 1)^T}   M_2 = N(A − 2I) = span{(1, 2)^T}

  • Notice that B = {(1, 1)^T, (1, 2)^T} spans R^2, so with

Q = [1 1; 1 2]   we get   [A]_B = Q^{−1} A Q = [1 0; 0 2]

  • The scalars λ are called eigenvalues of A and the

nonzero vectors in N(A − λI) are the associated eigenvectors of A
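The example can be verified numerically; a numpy sketch:

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [-2.0, 3.0]])

# Columns of Q are the eigenvectors found in the example.
Q = np.array([[1.0, 1.0],
              [1.0, 2.0]])

# In the eigenvector basis, the coordinate matrix is diagonal.
A_B = np.linalg.solve(Q, A @ Q)   # Q^{-1} A Q
print(np.round(A_B))              # diagonal matrix diag(1, 2)

# Cross-check with numpy's eigen-solver.
eigvals = np.linalg.eigvals(A)
print(np.sort(eigvals.real))      # [1. 2.]
```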