Matrix Calculations: Kernels & Images, Matrix Multiplication

SLIDE 1

Matrix multiplication Matrix inverse Kernel and image

Radboud University Nijmegen

Matrix Calculations: Kernels & Images, Matrix Multiplication

  • A. Kissinger (and H. Geuvers)

Institute for Computing and Information Sciences – Intelligent Systems Radboud University Nijmegen

Version: spring 2016

  • A. Kissinger

Version: spring 2016 Matrix Calculations 1 / 43

SLIDE 2

Outline

  • Matrix multiplication
  • Matrix inverse
  • Kernel and image

SLIDE 3

From last time

  • Vector spaces $V, W, \ldots$ are special kinds of sets whose elements are called vectors.

  • Vectors can be added together, or multiplied by a real number. For $v, w \in V$ and $a \in \mathbb{R}$:

\[ v + w \in V \qquad a \cdot v \in V \]

  • The simplest examples are:

\[ \mathbb{R}^n := \{(a_1, \ldots, a_n) \mid a_i \in \mathbb{R}\} \]

  • Linear maps are special kinds of functions which satisfy two properties:

\[ f(v + w) = f(v) + f(w) \qquad f(a \cdot v) = a \cdot f(v) \]
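
The two linearity properties can be spot-checked numerically. The sketch below is only an illustration: the map `f` and the sample vectors are assumptions chosen for this example, and passing the check on a few samples does not prove that a map is linear.

```python
from fractions import Fraction

def f(v):
    # Illustrative map R^3 -> R^2 (an assumption for this sketch).
    x1, x2, x3 = v
    return (x1 - x2, x2 + x3)

def add(v, w):
    # Componentwise vector addition.
    return tuple(a + b for a, b in zip(v, w))

def scale(a, v):
    # Scalar multiplication a · v.
    return tuple(a * x for x in v)

v = (Fraction(1), Fraction(2), Fraction(3))
w = (Fraction(-4), Fraction(0), Fraction(5))
a = Fraction(7, 2)

additive = f(add(v, w)) == add(f(v), f(w))       # f(v + w) = f(v) + f(w)
homogeneous = f(scale(a, v)) == scale(a, f(v))   # f(a · v) = a · f(v)
print(additive, homogeneous)
```

Exact fractions are used so that the equality tests are not disturbed by floating-point rounding.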

SLIDE 4

From last time

  • Whereas there exist LOTS of functions between the sets $V$ and $W$...

  • ...there actually aren’t that many linear maps:

Theorem

For every linear map $f : \mathbb{R}^n \to \mathbb{R}^m$, there exists an $m \times n$ matrix $A$ where:

\[ f(v) = A \cdot v \]

(where “$\cdot$” is the matrix multiplication of $A$ and a vector $v$)

  • More generally, every linear map $f : V \to W$ is representable as a matrix, but you have to fix a basis for $V$ and $W$ first:

\[ \{v_1, \ldots, v_m\} \subseteq V \qquad \{w_1, \ldots, w_n\} \subseteq W \]

  • ...whereas in $\mathbb{R}^n$ there is an obvious choice:

\[ \{(1, 0, \ldots, 0), (0, 1, \ldots, 0), \ldots, (0, \ldots, 0, 1)\} \subseteq \mathbb{R}^n \]

SLIDE 5

Matrix-vector multiplication

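For a matrix $A$ and a vector $v$, $w := A \cdot v$ is the vector whose $i$-th row is the dot product of the $i$-th row of $A$ with $v$:

\[
\begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix}
\cdot
\begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix}
=
\begin{pmatrix} a_{11}v_1 + \cdots + a_{1n}v_n \\ \vdots \\ a_{m1}v_1 + \cdots + a_{mn}v_n \end{pmatrix}
\]

i.e. $w_i := a_{i1}v_1 + \cdots + a_{in}v_n = \sum_{j=1}^{n} a_{ij}v_j$.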
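
The row-by-row rule can be written out directly. This is a minimal sketch, assuming a matrix is stored as a list of rows; the name `mat_vec` and the sample data are illustrative choices, not part of the lecture.

```python
def mat_vec(A, v):
    """Return A · v, where A is a list of rows and v is a list of numbers."""
    if any(len(row) != len(v) for row in A):
        raise ValueError("each row of A must have as many entries as v")
    # w_i is the dot product of the i-th row of A with v.
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in A]

A = [[1, 2, 3],
     [4, 5, 6]]
v = [1, 0, -1]
w = mat_vec(A, v)
print(w)  # [-2, -2]
```

A 2 × 3 matrix applied to a vector with 3 entries gives a vector with 2 entries, matching the shape rule in the formula.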

SLIDE 6

Example: systems of equations

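A general system of equations:

\[
\begin{array}{c} a_{11}x_1 + \cdots + a_{1n}x_n = b_1 \\ \vdots \\ a_{m1}x_1 + \cdots + a_{mn}x_n = b_m \end{array}
\quad\Rightarrow\quad A \cdot x = b
\qquad
\begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix}
\cdot
\begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}
=
\begin{pmatrix} b_1 \\ \vdots \\ b_m \end{pmatrix}
\]

A homogeneous system of equations:

\[
\begin{array}{c} a_{11}x_1 + \cdots + a_{1n}x_n = 0 \\ \vdots \\ a_{m1}x_1 + \cdots + a_{mn}x_n = 0 \end{array}
\quad\Rightarrow\quad A \cdot x = 0
\qquad
\begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix}
\cdot
\begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}
=
\begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix}
\]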

SLIDE 7

Matrix multiplication

  • Consider linear maps g, f represented by matrices A, B:

\[ g(v) = A \cdot v \qquad f(w) = B \cdot w \]

  • Can we find a matrix $C$ that represents their composition?

\[ g(f(v)) = C \cdot v \]

  • Let’s try:

\[ g(f(v)) = g(B \cdot v) = A \cdot (B \cdot v) \overset{(*)}{=} (A \cdot B) \cdot v \]

(where step $(*)$ is currently ‘wishful thinking’)

  • Great! Let C := A · B.
  • But we don’t know what “·” means for two matrices yet...

SLIDE 8

Matrix multiplication

  • Solution: generalise from A · v
  • A vector is a matrix with one column:

The number in the i-th row and the first column of A · v is the dot product of the i-th row of A with the first column of v.

  • So for matrices A, B:

The number in the i-th row and the j-th column of A·B is the dot product of the i-th row of A with the j-th column of B.

SLIDE 9

Matrix multiplication

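For $A$ an $m \times n$ matrix, $B$ an $n \times p$ matrix: $A \cdot B = C$ is an $m \times p$ matrix.

\[
\begin{pmatrix} \vdots & & \vdots \\ a_{i1} & \cdots & a_{in} \\ \vdots & & \vdots \end{pmatrix}
\cdot
\begin{pmatrix} \cdots & b_{1j} & \cdots \\ & \vdots & \\ \cdots & b_{nj} & \cdots \end{pmatrix}
=
\begin{pmatrix} \ddots & \vdots & \\ \cdots & c_{ij} & \cdots \\ & \vdots & \ddots \end{pmatrix}
\qquad
c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}
\]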
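
The entry formula translates almost literally into code. The following is a sketch under the assumption that matrices are lists of rows; `mat_mul` and the sample matrices are names and data introduced only for this illustration.

```python
def mat_mul(A, B):
    """Return A · B for an m x n matrix A and an n x p matrix B (lists of rows)."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("A must have as many columns as B has rows")
    p = len(B[0])
    # c_ij = sum over k of a_ik * b_kj
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(len(A))]

A = [[1, 2],
     [3, 4],
     [5, 6]]           # 3 x 2
B = [[1, 0, 2, 0],
     [0, 1, 0, 2]]     # 2 x 4
C = mat_mul(A, B)      # 3 x 4
print(C)  # [[1, 2, 2, 4], [3, 4, 6, 8], [5, 6, 10, 12]]
```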

SLIDE 10

Special case: vectors

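For $A$ an $m \times n$ matrix, $b$ an $n \times 1$ matrix: $A \cdot b = c$ is an $m \times 1$ matrix.

\[
\begin{pmatrix} \vdots & & \vdots \\ a_{i1} & \cdots & a_{in} \\ \vdots & & \vdots \end{pmatrix}
\cdot
\begin{pmatrix} b_{11} \\ \vdots \\ b_{n1} \end{pmatrix}
=
\begin{pmatrix} \vdots \\ c_{i1} \\ \vdots \end{pmatrix}
\qquad
c_{i1} = \sum_{k=1}^{n} a_{ik} b_{k1}
\]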

SLIDE 11

Matrix composition

Theorem

Matrix composition is associative: $(A \cdot B) \cdot C = A \cdot (B \cdot C)$

  • Proof. Let $X := A \cdot B$. This is a matrix with entries:

\[ x_{ip} = \sum_k a_{ik} b_{kp} \]

Then, the matrix entries of $X \cdot C$ are:

\[ \sum_p x_{ip} c_{pj} = \sum_p \Big( \sum_k a_{ik} b_{kp} \Big) c_{pj} = \sum_{k,p} a_{ik} b_{kp} c_{pj} \]

(because sums can always be pulled outside, and combined)

SLIDE 12

Associativity of matrix composition

Proof (cont’d). Now, let $Y := B \cdot C$. This has matrix entries:

\[ y_{kj} = \sum_p b_{kp} c_{pj} \]

Then, the matrix entries of $A \cdot Y$ are:

\[ \sum_k a_{ik} y_{kj} = \sum_k a_{ik} \Big( \sum_p b_{kp} c_{pj} \Big) = \sum_{k,p} a_{ik} b_{kp} c_{pj} \]

...which is the same as before! So:

\[ (A \cdot B) \cdot C = X \cdot C = A \cdot Y = A \cdot (B \cdot C) \]

So we can drop those pesky parentheses:

\[ A \cdot B \cdot C := (A \cdot B) \cdot C = A \cdot (B \cdot C) \]
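
Associativity can also be checked on concrete numbers. The sketch below is a sanity check, not a proof; the helper `mat_mul` and the three sample matrices are assumptions made for this illustration.

```python
def mat_mul(A, B):
    # Entry (i, j) is the dot product of row i of A with column j of B.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 1]]
C = [[2, 0], [1, 3]]

left = mat_mul(mat_mul(A, B), C)    # (A · B) · C
right = mat_mul(A, mat_mul(B, C))   # A · (B · C)
print(left == right)
```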

SLIDE 13

Matrix product and composition

Corollary

The composition of linear maps is given by matrix product.

  • Proof. Let g(w) = A · w and f (v) = B · v. Then:

g(f (v)) = g(B · v) = A · B · v

  • No wishful thinking necessary!

SLIDE 14

Example 1

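Consider the following two linear maps, and their associated matrices:

\[ \mathbb{R}^3 \xrightarrow{\;f\;} \mathbb{R}^2 \qquad \mathbb{R}^2 \xrightarrow{\;g\;} \mathbb{R}^2 \]

\[ f(x_1, x_2, x_3) = (x_1 - x_2,\; x_2 + x_3) \qquad g(y_1, y_2) = (2y_1 - y_2,\; 3y_2) \]

\[ M_f = \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & 1 \end{pmatrix} \qquad M_g = \begin{pmatrix} 2 & -1 \\ 0 & 3 \end{pmatrix} \]

  • We can compute the composition directly:

\[
\begin{aligned}
(g \circ f)(x_1, x_2, x_3) &= g(f(x_1, x_2, x_3)) \\
&= g(x_1 - x_2,\; x_2 + x_3) \\
&= (2(x_1 - x_2) - (x_2 + x_3),\; 3(x_2 + x_3)) \\
&= (2x_1 - 3x_2 - x_3,\; 3x_2 + 3x_3)
\end{aligned}
\]

So:

\[ M_{g \circ f} = \begin{pmatrix} 2 & -3 & -1 \\ 0 & 3 & 3 \end{pmatrix} \]

  • ...which is just the product of the matrices: $M_{g \circ f} = M_g \cdot M_f$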
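
The claim of this example can be replayed in a few lines. This sketch uses the maps and matrices of the example; the helper names `mat_mul` and `mat_vec` and the sample input `x` are assumptions introduced for the illustration.

```python
def mat_mul(A, B):
    # Entry (i, j) is the dot product of row i of A with column j of B.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def f(x1, x2, x3):
    return (x1 - x2, x2 + x3)

def g(y1, y2):
    return (2 * y1 - y2, 3 * y2)

M_f = [[1, -1, 0],
       [0, 1, 1]]
M_g = [[2, -1],
       [0, 3]]

M_gf = mat_mul(M_g, M_f)
print(M_gf)  # [[2, -3, -1], [0, 3, 3]]

# Compare composing the functions with multiplying by the product matrix.
x = (5, -2, 7)
direct = list(g(*f(*x)))
via_matrix = mat_vec(M_gf, x)
print(direct, via_matrix)  # [9, 15] [9, 15]
```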

SLIDE 15

Note: matrix composition is not commutative

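In general, $A \cdot B \neq B \cdot A$. For instance: Take

\[ A = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. \]

Then:

\[
A \cdot B
= \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \cdot \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}
= \begin{pmatrix} 1 \cdot 0 + 0 \cdot (-1) & 1 \cdot 1 + 0 \cdot 0 \\ 0 \cdot 0 + (-1) \cdot (-1) & 0 \cdot 1 + (-1) \cdot 0 \end{pmatrix}
= \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
\]

\[
B \cdot A
= \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \cdot \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}
= \begin{pmatrix} 0 \cdot 1 + 1 \cdot 0 & 0 \cdot 0 + 1 \cdot (-1) \\ (-1) \cdot 1 + 0 \cdot 0 & (-1) \cdot 0 + 0 \cdot (-1) \end{pmatrix}
= \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix}
\]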
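
The two products can be recomputed mechanically. In this sketch the matrices are those of the example, while the helper `mat_mul` is an assumed name used only here.

```python
def mat_mul(A, B):
    # Entry (i, j) is the dot product of row i of A with column j of B.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[1, 0],
     [0, -1]]
B = [[0, 1],
     [-1, 0]]

AB = mat_mul(A, B)
BA = mat_mul(B, A)
print(AB)        # [[0, 1], [1, 0]]
print(BA)        # [[0, -1], [-1, 0]]
print(AB == BA)  # False
```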

SLIDE 16

But it is...

...associative, as we’ve already seen:

\[ A \cdot B \cdot C := (A \cdot B) \cdot C = A \cdot (B \cdot C) \]

It also has a unit given by the identity matrix $I$:

\[ A \cdot I = I \cdot A = A \]

where:

\[ I := \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix} \]

SLIDE 17

Example: political swingers, part I

  • We take an extremely crude view on politics and distinguish only left and right wing political supporters
  • We study changes in political views, per year
  • Suppose we observe, for each year:
      • 80% of lefties remain lefties and 20% become righties
      • 90% of righties remain righties, and 10% become lefties

Questions . . .

  • start with a population L = 100, R = 150, and compute the number of lefties and righties after one year;
  • similarly, after 2 years, and 3 years, . . .
  • Find a convenient way to represent these computations.

SLIDE 18

Political swingers, part II

  • So if we start with a population L = 100, R = 150, then after one year we have:
      • lefties: 0.8 · 100 + 0.1 · 150 = 80 + 15 = 95
      • righties: 0.2 · 100 + 0.9 · 150 = 20 + 135 = 155
  • Two observations:
      • this looks like a matrix-vector multiplication
      • long-term developments can be calculated via iterated matrices

SLIDE 19

Political swingers, part III

  • We can write the political transition matrix as

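\[ P = \begin{pmatrix} 0.8 & 0.1 \\ 0.2 & 0.9 \end{pmatrix} \]

  • If $\begin{pmatrix} L \\ R \end{pmatrix} = \begin{pmatrix} 100 \\ 150 \end{pmatrix}$, then after one year we have:

\[
P \cdot \begin{pmatrix} 100 \\ 150 \end{pmatrix}
= \begin{pmatrix} 0.8 & 0.1 \\ 0.2 & 0.9 \end{pmatrix} \cdot \begin{pmatrix} 100 \\ 150 \end{pmatrix}
= \begin{pmatrix} 0.8 \cdot 100 + 0.1 \cdot 150 \\ 0.2 \cdot 100 + 0.9 \cdot 150 \end{pmatrix}
= \begin{pmatrix} 95 \\ 155 \end{pmatrix}
\]

  • After two years we have:

\[
P \cdot \begin{pmatrix} 95 \\ 155 \end{pmatrix}
= \begin{pmatrix} 0.8 & 0.1 \\ 0.2 & 0.9 \end{pmatrix} \cdot \begin{pmatrix} 95 \\ 155 \end{pmatrix}
= \begin{pmatrix} 0.8 \cdot 95 + 0.1 \cdot 155 \\ 0.2 \cdot 95 + 0.9 \cdot 155 \end{pmatrix}
= \begin{pmatrix} 91.5 \\ 158.5 \end{pmatrix}
\]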
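
The year-by-year computation is a loop that applies the same matrix repeatedly. The sketch below uses the transition percentages and starting population of the example; the helper `mat_vec`, the variable names, and the use of exact fractions (to avoid floating-point rounding) are choices made for this illustration.

```python
from fractions import Fraction

# Transition matrix with exact entries 0.8, 0.1, 0.2, 0.9.
P = [[Fraction(8, 10), Fraction(1, 10)],
     [Fraction(2, 10), Fraction(9, 10)]]

def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

population = [Fraction(100), Fraction(150)]   # (L, R) at year 0
history = [population]
for _ in range(3):
    population = mat_vec(P, population)       # one more year
    history.append(population)

for year, (left, right) in enumerate(history):
    print(year, float(left), float(right))
```

Each column of `P` sums to 1, so the total population stays at 250 in every year.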

SLIDE 20

Political swingers, part IV

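The situation after two years is obtained as:

\[
\begin{aligned}
P \cdot P \cdot \begin{pmatrix} L \\ R \end{pmatrix}
&= \underbrace{\begin{pmatrix} 0.8 & 0.1 \\ 0.2 & 0.9 \end{pmatrix} \cdot \begin{pmatrix} 0.8 & 0.1 \\ 0.2 & 0.9 \end{pmatrix}}_{\text{do this multiplication first}} \cdot \begin{pmatrix} L \\ R \end{pmatrix} \\
&= \begin{pmatrix} 0.8 \cdot 0.8 + 0.1 \cdot 0.2 & 0.8 \cdot 0.1 + 0.1 \cdot 0.9 \\ 0.2 \cdot 0.8 + 0.9 \cdot 0.2 & 0.2 \cdot 0.1 + 0.9 \cdot 0.9 \end{pmatrix} \cdot \begin{pmatrix} L \\ R \end{pmatrix} \\
&= \begin{pmatrix} 0.66 & 0.17 \\ 0.34 & 0.83 \end{pmatrix} \cdot \begin{pmatrix} L \\ R \end{pmatrix}
\end{aligned}
\]

  • The situation after $n$ years is described by the $n$-fold iterated matrix:

\[ P^n = \underbrace{P \cdot P \cdots P}_{n \text{ times}} \]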
  • A. Kissinger

Version: spring 2016 Matrix Calculations 21 / 43

slide-21
SLIDE 21

Matrix multiplication Matrix inverse Kernel and image

Radboud University Nijmegen

Political swingers, part V

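Interpret the following iterations:

\[ P^2 = P \cdot P = \begin{pmatrix} 0.66 & 0.17 \\ 0.34 & 0.83 \end{pmatrix} \]

\[
P^3 = P \cdot P \cdot P
= \begin{pmatrix} 0.8 & 0.1 \\ 0.2 & 0.9 \end{pmatrix} \cdot \begin{pmatrix} 0.66 & 0.17 \\ 0.34 & 0.83 \end{pmatrix}
= \begin{pmatrix} 0.562 & 0.219 \\ 0.438 & 0.781 \end{pmatrix}
\]

\[
P^4 = P \cdot P \cdot P \cdot P
= \begin{pmatrix} 0.8 & 0.1 \\ 0.2 & 0.9 \end{pmatrix} \cdot \begin{pmatrix} 0.562 & 0.219 \\ 0.438 & 0.781 \end{pmatrix}
= \begin{pmatrix} 0.4934 & 0.2533 \\ 0.5066 & 0.7467 \end{pmatrix}
\]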

  • Etc. Does this stabilise? We’ll talk about fixed points later on...
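
One informal way to explore the stabilisation question is to compute higher powers and look at the entries. The sketch below is only an experiment, not an argument: `mat_mul` and `mat_pow` are assumed helper names, and the choice of exponents is arbitrary.

```python
from fractions import Fraction

def mat_mul(A, B):
    # Entry (i, j) is the dot product of row i of A with column j of B.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def mat_pow(P, n):
    """n-fold product P · P · ... · P, for n >= 1."""
    result = P
    for _ in range(n - 1):
        result = mat_mul(P, result)
    return result

P = [[Fraction(8, 10), Fraction(1, 10)],
     [Fraction(2, 10), Fraction(9, 10)]]

# Print a few powers, rounded for readability.
for n in (2, 3, 4, 50):
    print(n, [[round(float(x), 4) for x in row] for row in mat_pow(P, n)])
```

Comparing the printed matrices for small and large `n` shows how much the entries still change from one power to the next.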

SLIDE 22

Solving equations the old fashioned way...

  • We now know that systems of equations look like this:

\[ A \cdot x = b \]

  • The goal is to solve for $x$, in terms of $A$ and $b$.
  • Here comes some more wishful thinking:

\[ x = \frac{1}{A} \cdot b \]

  • Well, we can’t really divide by a matrix, but if we are lucky, we can find another matrix called $A^{-1}$ which acts like $\frac{1}{A}$.

SLIDE 23

Inverse

Definition

The inverse of a matrix $A$ is another matrix $A^{-1}$ such that:

\[ A^{-1} \cdot A = A \cdot A^{-1} = I \]

  • Not all matrices have inverses, but when they do, we are happy, because:

\[ A \cdot x = b \implies A^{-1} \cdot A \cdot x = A^{-1} \cdot b \implies x = A^{-1} \cdot b \]

  • So, how do we compute the inverse of a matrix?

SLIDE 24

Remember me?

SLIDE 25

Gaussian elimination as matrix multiplication

  • Each step of Gaussian elimination can be represented by a matrix multiplication:

\[ A \Rightarrow A' \qquad A' := G \cdot A \]

  • For instance, multiplying the $i$-th row by $c$ is given by:

\[ G_{(R_i := cR_i)} \cdot A \]

where $G_{(R_i := cR_i)}$ is just like the identity matrix, but $g_{ii} = c$.

  • Exercise. What are the other Gaussian elimination matrices?

\[ G_{(R_i \leftrightarrow R_j)} \qquad G_{(R_i := R_i + cR_j)} \]
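
The row-scaling matrix described above can be built and tested directly. This sketch covers only that one case and leaves the exercise open; the function names `identity`, `scale_row_matrix` and `mat_mul` are hypothetical, and rows are counted from 0 as is usual in Python.

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def scale_row_matrix(n, i, c):
    """The n x n identity matrix with entry (i, i) replaced by c (0-based i)."""
    G = identity(n)
    G[i][i] = c
    return G

def mat_mul(A, B):
    # Entry (i, j) is the dot product of row i of A with column j of B.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[1, 2],
     [3, 4],
     [5, 6]]
G = scale_row_matrix(3, 1, 10)   # multiply the second row by 10
result = mat_mul(G, A)
print(result)  # [[1, 2], [30, 40], [5, 6]]
```

Multiplying on the left by `G` changes only the chosen row of `A`, which is what the row operation is supposed to do.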

SLIDE 26

Reduction to Echelon form

  • The idea: treat $A$ as a coefficient matrix, and compute its reduced Echelon form

  • If $A$ is an $n \times n$ matrix and its Echelon form has $n$ pivots, then its reduced Echelon form is the identity matrix:

\[ A \Rightarrow A_1 \Rightarrow A_2 \Rightarrow \cdots \Rightarrow A_p = I \]

  • Now, we can use our Gauss matrices to remember what we did:

\[
\begin{aligned}
A_1 &:= G_1 \cdot A \\
A_2 &:= G_2 \cdot G_1 \cdot A \\
&\;\;\vdots \\
A_p &:= G_p \cdots G_1 \cdot A = I
\end{aligned}
\]

SLIDE 27

Computing the inverse

  • A ha!

\[ G_p \cdots G_1 \cdot A = I \implies A^{-1} = G_p \cdots G_1 \]

  • So all we have to do is construct $p$ different matrices and multiply them all together!

  • Since I already have plans for this afternoon, let’s take a shortcut:

Theorem

For $C$ a matrix and $(A|B)$ an augmented matrix:

\[ C \cdot (A|B) = (C \cdot A \mid C \cdot B) \]

SLIDE 28

Computing the inverse

  • Since Gaussian elimination is just multiplying by a certain matrix on the left...

\[ A \Rightarrow G \cdot A \]

  • ...doing Gaussian elimination (for $A$) on an augmented matrix applies $G$ to both parts:

\[ (A|B) \Rightarrow (G \cdot A \mid G \cdot B) \]

  • So, if $G = A^{-1}$:

\[ (A|B) \Rightarrow (A^{-1} \cdot A \mid A^{-1} \cdot B) = (I \mid A^{-1} \cdot B) \]

SLIDE 29

Computing the inverse

  • We already (secretly) used this trick to solve:

\[ A \cdot x = b \implies x = A^{-1} \cdot b \]

  • Here, we are only interested in the vector $A^{-1} \cdot b$
  • Which is exactly what Gaussian elimination on the augmented matrix gives us:

\[ (A|b) \Rightarrow (I \mid A^{-1} \cdot b) \]

  • To get the entire matrix, we just need to choose something clever to the right of the line
  • Like this:

\[ (A|I) \Rightarrow (I \mid A^{-1} \cdot I) = (I \mid A^{-1}) \]

SLIDE 30

Computing the inverse: example

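For example, we compute the inverse of:

\[ A := \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix} \]

  • as follows:

\[
\left(\begin{array}{cc|cc} 1 & 1 & 1 & 0 \\ 1 & 2 & 0 & 1 \end{array}\right)
\Rightarrow
\left(\begin{array}{cc|cc} 1 & 1 & 1 & 0 \\ 0 & 1 & -1 & 1 \end{array}\right)
\Rightarrow
\left(\begin{array}{cc|cc} 1 & 0 & 2 & -1 \\ 0 & 1 & -1 & 1 \end{array}\right)
\]

  • So:

\[ A^{-1} := \begin{pmatrix} 2 & -1 \\ -1 & 1 \end{pmatrix} \]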
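
The reduction of $(A|I)$ to $(I|A^{-1})$ can be sketched as a short routine. This is a minimal illustration under stated assumptions, not a production algorithm: the function name `inverse` is hypothetical, exact fractions are used instead of floats, and the second test matrix `B` (all ones) is added here as an example of a matrix with too few pivots.

```python
from fractions import Fraction

def inverse(A):
    """Row-reduce (A | I) to (I | A^-1); return None if A has too few pivots."""
    n = len(A)
    # Build the augmented matrix (A | I) with exact rational entries.
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        # Look for a row at or below the diagonal with a non-zero entry here.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None                       # no pivot: A is not invertible
        M[col], M[pivot] = M[pivot], M[col]   # swap two rows
        p = M[col][col]
        M[col] = [x / p for x in M[col]]      # scale the pivot row to get a 1
        for r in range(n):
            if r != col and M[r][col] != 0:
                c = M[r][col]
                # Subtract c times the pivot row to clear this column entry.
                M[r] = [x - c * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]             # the right-hand block is A^-1

A = [[1, 1],
     [1, 2]]
B = [[1, 1],
     [1, 1]]
A_inv = inverse(A)
B_inv = inverse(B)
print(A_inv)
print(B_inv)
```

For `A` the routine reproduces the inverse computed by hand above, and for `B` it reports that no inverse exists.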

SLIDE 31

Computing the inverse: non-example

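Unlike transpose, not every matrix has an inverse. For example, if we try to compute the inverse for:

\[ B := \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \]

  • we have:

\[
\left(\begin{array}{cc|cc} 1 & 1 & 1 & 0 \\ 1 & 1 & 0 & 1 \end{array}\right)
\Rightarrow
\left(\begin{array}{cc|cc} 1 & 1 & 1 & 0 \\ 0 & 0 & -1 & 1 \end{array}\right)
\]

  • We don’t have enough pivots to continue reducing. So $B$ does not have an inverse.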

SLIDE 32

Subspace definition

Definition

A subset S ⊆ V of a vector space V is called a (linear) subspace if S is closed under addition and scalar multiplication:

  • $0 \in S$
  • $v, v' \in S$ implies $v + v' \in S$
  • $v \in S$ and $a \in \mathbb{R}$ implies $a \cdot v \in S$.

Note

  • A subspace $S \subseteq V$ is a vector space itself, and thus also has a basis.
  • Also $S$ has its own dimension, where $\dim(S) \le \dim(V)$.

SLIDE 33

Subspace examples

1 Earlier we saw that the subset of solutions of a homogeneous system of equations is closed under addition and (scalar) multiplication, and thus is a linear subspace.

2 The diagonal $D = \{(x, x) \mid x \in \mathbb{R}\} \subseteq \mathbb{R}^2$ is a linear subspace:

  • if $(x_1, x_1), (x_2, x_2) \in D$, then also $(x_1, x_1) + (x_2, x_2) = (x_1 + x_2, x_1 + x_2) \in D$
  • if $(x, x) \in D$ and $a \in \mathbb{R}$, also $a \cdot (x, x) = (a \cdot x, a \cdot x) \in D$

Also:

  • $D$ has a single vector as basis, for example $(1, 1)$
  • thus, $D$ has dimension 1

SLIDE 34

Basis for subspaces

Let the space V have dimension n, and a subspace S ⊆ V dimension p, where p ≤ n. Then:

  • any set of > p vectors in S is linearly dependent
  • any set of < p vectors in S does not span S
  • any set of p independent vectors in S is a basis for S
  • any set of p vectors that spans S is a basis for S

SLIDE 35

Kernel and image: definitions

Definition

Let $f : V \to W$ be a linear map

  • the kernel of $f$ is the subset of $V$ given by:

\[ \ker(f) = \{v \in V \mid f(v) = 0\} \]

  • the image of $f$ is the subset of $W$ given by:

\[ \operatorname{im}(f) = \{f(v) \mid v \in V\} \]

Example

Consider the function $f : \mathbb{R}^2 \to \mathbb{R}^2$ given by $f(x, y) = (x, 0)$

  • the kernel is $\{(x, y) \in \mathbb{R}^2 \mid f(x, y) = (0, 0)\}$, which is $\{(0, y) \mid y \in \mathbb{R}\}$, i.e. the y-axis.
  • the image is the x-axis $\{(x, 0) \mid x \in \mathbb{R}\}$

SLIDE 36

Kernels and images are subspaces

Theorem

For a linear map $f : V \to W$,

  • $\ker(f) = \{v \mid f(v) = 0\} \subseteq V$ is a linear subspace
  • $\operatorname{im}(f) = \{f(v) \mid v \in V\} \subseteq W$ is a linear subspace.

Proof: We check two cases (do the others yourself!)

  • Closure of $\ker(f)$ under addition: if $v, v' \in \ker(f)$, then $f(v) = 0$ and $f(v') = 0$. By linearity of $f$, $f(v + v') = f(v) + f(v') = 0 + 0 = 0$, so $v + v' \in \ker(f)$.

  • Closure of $\operatorname{im}(f)$ under scalar multiplication: Assume $w \in \operatorname{im}(f)$, say $w = f(v)$, and $a \in \mathbb{R}$. Again by linearity: $a \cdot w = a \cdot f(v) = f(a \cdot v)$, so $a \cdot w \in \operatorname{im}(f)$.

SLIDE 37

Injectivity and surjectivity

  • A linear map $f : V \to W$ is surjective, i.e.

\[ \forall w \, \exists v . \; f(v) = w, \]

if and only if $\operatorname{im}(f) = W$.

  • A linear map $f : V \to W$ is injective, i.e.

\[ f(v) = f(w) \implies v = w, \]

if and only if $\ker(f) = \{0\}$.

SLIDE 38

The kernel as solution space

With this kernel (and image) terminology we can connect some previous concepts.

Theorem

Suppose a linear map $f : V \to W$ has matrix $A$. Then:

\[ v \in \ker(f) \iff f(v) = 0 \iff A \cdot v = 0 \iff v \text{ solves a system of homogeneous equations} \]

Moreover, the dimension of the kernel $\dim(\ker(f))$ is the same as the number of basic solutions of $A \cdot x = 0$, that is the number of columns without pivots in the echelon form of $A$.
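
Counting the pivot-free columns can be automated. The sketch below row-reduces a matrix to echelon form and counts pivots; the function name `pivot_columns` and the sample matrix are assumptions made for this illustration, and exact fractions are used to avoid rounding issues.

```python
from fractions import Fraction

def pivot_columns(A):
    """Row-reduce a copy of A to echelon form; return the pivot column indices."""
    M = [[Fraction(x) for x in row] for row in A]
    rows, cols = len(M), len(M[0])
    pivots = []
    r = 0                                    # row where the next pivot would go
    for c in range(cols):
        if r == rows:
            break
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue                         # no pivot in this column
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, rows):
            factor = M[i][c] / M[r][c]
            M[i] = [x - factor * y for x, y in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
    return pivots

A = [[1, 2, 3],
     [2, 4, 6],
     [1, 0, 1]]
pivots = pivot_columns(A)
kernel_dim = len(A[0]) - len(pivots)         # columns without a pivot
print(pivots, kernel_dim)  # [0, 1] 1
```

For this sample matrix the vector (1, 1, −1) solves $A \cdot x = 0$, which is consistent with a one-dimensional kernel.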

SLIDE 39

We can learn a lot about a matrix...

  • ...by looking at its columns.
  • Suppose a linear map $f$ is represented by a matrix $A$ with columns $\{v_1, \ldots, v_n\}$:

\[ f(w) = \begin{pmatrix} | & & | \\ v_1 & \cdots & v_n \\ | & & | \end{pmatrix} \cdot w \]

  • Then, $\dim(\operatorname{im}(f))$ is the dimension of the space spanned by $\{v_1, \ldots, v_n\}$
  • ...which is the same as the number of pivots in the echelon form of $A$

SLIDE 40

Kernel-image-dimension theorem (aka. rank-nullity)

Theorem

For a linear map $f : V \to W$ one has:

\[ \dim(\ker(f)) + \dim(\operatorname{im}(f)) = \dim(V) \]

Proof: Let $A$ be a matrix that represents $f$. It has $\dim(V)$ columns. $\dim(\operatorname{im}(f))$ of those are pivots, and the rest correspond to basic solutions to $A \cdot x = 0$, which give a basis for $\ker(f)$.
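
The counting argument in the proof can be replayed on an example. This sketch computes a reduced echelon form, reads off one basic solution per pivot-free column, and checks that the counts add up; the function names `rref` and `basic_solutions` and the sample matrix are assumptions made for this illustration.

```python
from fractions import Fraction

def rref(A):
    """Return (reduced echelon form of A, list of pivot column indices)."""
    M = [[Fraction(x) for x in row] for row in A]
    rows, cols = len(M), len(M[0])
    pivots, r = [], 0
    for c in range(cols):
        if r == rows:
            break
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue                              # no pivot in this column
        M[r], M[pivot] = M[pivot], M[r]
        p = M[r][c]
        M[r] = [x / p for x in M[r]]              # make the pivot equal to 1
        for i in range(rows):
            if i != r and M[i][c] != 0:
                factor = M[i][c]
                M[i] = [x - factor * y for x, y in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
    return M, pivots

def basic_solutions(A):
    """One solution of A · x = 0 per pivot-free column (free variable set to 1)."""
    M, pivots = rref(A)
    cols = len(M[0])
    solutions = []
    for free in (c for c in range(cols) if c not in pivots):
        x = [Fraction(0)] * cols
        x[free] = Fraction(1)
        # Row k of the reduced form has its pivot in column pivots[k].
        for k, pc in enumerate(pivots):
            x[pc] = -M[k][free]
        solutions.append(x)
    return solutions, pivots

A = [[1, 2, 0, 1],
     [0, 0, 1, 1],
     [1, 2, 1, 2]]
kernel_basis, pivots = basic_solutions(A)
rank, nullity, n = len(pivots), len(kernel_basis), len(A[0])
print(rank, nullity, n)  # 2 2 4
```

Here `rank` plays the role of $\dim(\operatorname{im}(f))$, `nullity` the role of $\dim(\ker(f))$, and `n` is the number of columns, i.e. $\dim(V)$.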
