Quiz Parts 1 and 2: Describe two interpretations of the matrix-vector product A v , one involving rows and one involving columns. Part 3: Describe an interpretation of the matrix-matrix product AB , one involving either rows or columns. Parts 4
SLIDE 1
SLIDE 2
Matrix-vector equation for sensor node
Define D = {'radio', 'sensor', 'memory', 'CPU'}.
Goal: Compute a D-vector u that, for each hardware component, gives the current drawn by that component.
Four test periods:
- total milliampere-seconds in these test periods: b = [140, 170, 60, 170]
- for each test period, a vector specifying how long each hardware device was operating:
    duration1 = Vec(D, {'radio':.1, 'CPU':.3})
    duration2 = Vec(D, {'sensor':.2, 'CPU':.4})
    duration3 = Vec(D, {'memory':.3, 'CPU':.1})
    duration4 = Vec(D, {'memory':.5, 'CPU':.4})
To get u, solve $A * x = b$, where the rows of A are the duration vectors:
$$A = \begin{bmatrix} \text{duration1} \\ \text{duration2} \\ \text{duration3} \\ \text{duration4} \end{bmatrix}$$
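The sensor-node system is small enough to solve without the course modules. The following is a minimal pure-Python sketch, not the course's solve procedure: solve_square is a hypothetical helper that uses Gaussian elimination with partial pivoting, and the rows of A are built from the four duration vectors with missing entries taken to be zero.

```python
# Hypothetical helper for illustration; this is not the course's solver module.
def solve_square(rows, rhs):
    """Solve a square linear system by Gaussian elimination with partial pivoting."""
    n = len(rows)
    # Build an augmented matrix [A | b] of floats so the inputs are not modified.
    aug = [[float(a) for a in row] + [float(r)] for row, r in zip(rows, rhs)]
    for col in range(n):
        # Use the row with the largest entry in this column as the pivot row.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Eliminate the entries below the pivot.
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # The system is now upper triangular; finish with backward substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - known) / aug[i][i]
    return x


labels = ['radio', 'sensor', 'memory', 'CPU']
durations = [
    {'radio': .1, 'CPU': .3},
    {'sensor': .2, 'CPU': .4},
    {'memory': .3, 'CPU': .1},
    {'memory': .5, 'CPU': .4},
]
b = [140, 170, 60, 170]

# Row i of A is duration vector i; a component that was off contributes 0.
A = [[d.get(label, 0.0) for label in labels] for d in durations]
u = dict(zip(labels, solve_square(A, b)))
print(u)
```

Up to round-off, the computed currents are 500 for the radio, 250 for the sensor, 100 for the memory, and 300 for the CPU.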
SLIDE 3
The solver module, and floating-point arithmetic
For arithmetic over R, Python uses floats, so round-off errors occur:
    >>> 10.0**16 + 1 == 10.0**16
    True
Consequently algorithms such as that used in solve(A, b) do not find exactly correct solutions.
To see if solution u obtained is a reasonable solution to $A * x = b$, see if the vector $b - A * u$ has entries that are close to zero:
    >>> A = listlist2mat([[1,3],[5,7]])
    >>> u = solve(A, b)
    >>> b - A*u
    Vec({0, 1},{0: -4.440892098500626e-16, 1: -8.881784197001252e-16})
The vector $b - A * u$ is called the residual.
Easy way to test if entries of the residual are close to zero: compute the dot-product of the residual with itself:
    >>> res = b - A*u
    >>> res * res
    9.860761315262648e-31
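The residual test can be tried without the course's Vec and Mat classes. The sketch below uses plain lists; the right-hand side b, the tolerance 1e-20, and all helper names are assumptions made for illustration, since the slide does not state the value of b.

```python
def mat_vec(A, x):
    """Dot-product definition: entry i of A * x is (row i of A) . x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def residual(A, x, b):
    """Return the vector b - A * x."""
    return [bi - yi for bi, yi in zip(b, mat_vec(A, x))]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def is_reasonable_solution(A, x, b, tolerance=1e-20):
    """Accept x if the residual dotted with itself is below the (assumed) tolerance."""
    res = residual(A, x, b)
    return dot(res, res) < tolerance


A = [[1, 3], [5, 7]]
b = [0.1, 0.2]  # hypothetical right-hand side, not taken from the slide

# Solve the 2x2 system with Cramer's rule (fine for a tiny example).
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
u = [(A[1][1] * b[0] - A[0][1] * b[1]) / det,
     (A[0][0] * b[1] - A[1][0] * b[0]) / det]

good = is_reasonable_solution(A, u, b)            # computed solution
bad = is_reasonable_solution(A, [1.0, 1.0], b)    # an arbitrary wrong guess
print(good, bad)
```

Comparing the squared length of the residual against a threshold avoids checking each entry separately; a suitable threshold depends on the scale of A and b.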
SLIDE 4
Checking the output from solve(A, b)
For some matrix-vector equations $A * x = b$, there is no solution. In this case, the vector returned by solve(A, b) gives rise to a fairly large residual:
    >>> A = listlist2mat([[1,2],[4,5],[-6,1]])
    >>> b = list2vec([1,1,1])
    >>> u = solve(A, b)
    >>> res = b - A*u
    >>> res * res
    0.24287856071964012
Some matrix-vector equations are ill-conditioned, which can prevent an algorithm using floats from getting even approximate solutions, even when solutions exist:
    >>> A = listlist2mat([[1e20,1],[1,0]])
    >>> b = list2vec([1,1])
    >>> u = solve(A, b)
    >>> b - A*u
    Vec({0, 1},{0: 0.0, 1: 1.0})
We will not study conditioning in this course.
SLIDE 5
Triangular matrix
Recall: We considered triangular linear systems, e.g.
    [ 1, 0.5, 2, 4 ] · x = 8
    [ 0, 3, 3, 2 ] · x = 3
    [ 0, 0, 1, 5 ] · x = 4
    [ 0, 0, 0, 2 ] · x = 6
We can rewrite this linear system as a matrix-vector equation:
$$\begin{bmatrix} 1 & 0.5 & 2 & 4 \\ 0 & 3 & 3 & 2 \\ 0 & 0 & 1 & 5 \\ 0 & 0 & 0 & 2 \end{bmatrix} * x = [8, 3, 4, 6]$$
The matrix is a triangular matrix.
Definition: An $n \times n$ upper triangular matrix A is a matrix with the property that $A_{ij} = 0$ for $i > j$. Note that the entries forming the upper triangle can be zero or nonzero.
We can use backward substitution to solve such a matrix-vector equation.
Triangular matrices will play an important role later.
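Backward substitution solves for the last variable first and works upward. The sketch below applies it to the system above, using the coefficients exactly as they appear on this slide; backward_substitution is a hypothetical helper name, and the matrix is stored as a plain list of rows rather than a Mat.

```python
def backward_substitution(U, b):
    """Solve U * x = b for an upper triangular U with nonzero diagonal entries."""
    n = len(U)
    x = [0.0] * n
    for i in reversed(range(n)):
        if U[i][i] == 0:
            raise ValueError("zero on the diagonal: backward substitution fails")
        # Contribution of the variables that have already been solved for.
        known = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - known) / U[i][i]
    return x


U = [[1, 0.5, 2, 4],
     [0, 3,   3, 2],
     [0, 0,   1, 5],
     [0, 0,   0, 2]]
b = [8, 3, 4, 6]

x = backward_substitution(U, b)
print(x)  # [13.0, 10.0, -11.0, 3.0]
```

Each step uses only entries on or above the diagonal, which is why the zeros below the diagonal make the system easy to solve.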
SLIDE 6
Algebraic properties of matrix-vector multiplication
Proposition: Let A be an $R \times C$ matrix.
- For any C-vector v and any scalar $\alpha$,
    $A * (\alpha v) = \alpha (A * v)$
- For any C-vectors u and v,
    $A * (u + v) = A * u + A * v$
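A numerical spot check is not a proof, but it can make the two properties concrete. The sketch below tests them on one small integer example; the matrix, vectors, scalar, and helper names are all hypothetical, and integers are used so that the comparisons are exact.

```python
def mat_vec(A, v):
    """Entry i of A * v is the dot-product of row i of A with v."""
    return [sum(a * vi for a, vi in zip(row, v)) for row in A]


def scale(alpha, v):
    return [alpha * vi for vi in v]


def add(u, v):
    return [ui + vi for ui, vi in zip(u, v)]


A = [[1, 2, 3],
     [10, 20, 30]]
u = [4, -1, 2]
v = [2, 2, -2]
alpha = 5

# A * (alpha v) == alpha (A * v)
homogeneity_holds = mat_vec(A, scale(alpha, v)) == scale(alpha, mat_vec(A, v))
# A * (u + v) == A * u + A * v
additivity_holds = mat_vec(A, add(u, v)) == add(mat_vec(A, u), mat_vec(A, v))
print(homogeneity_holds, additivity_holds)
```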
SLIDE 7
Algebraic properties of matrix-vector multiplication
To prove $A * (\alpha v) = \alpha (A * v)$ we need to show corresponding entries are equal:
Need to show: entry i of $A * (\alpha v)$ = entry i of $\alpha (A * v)$
Proof: Write $A = \begin{bmatrix} a_1 \\ \vdots \\ a_m \end{bmatrix}$.
By dot-product def. of matrix-vector mult,
    entry i of $A * (\alpha v)$ = $a_i \cdot (\alpha v)$
        = $\alpha (a_i \cdot v)$    by homogeneity of dot-product
By definition of scalar-vector multiply,
    entry i of $\alpha (A * v)$ = $\alpha$ (entry i of $A * v$)
        = $\alpha (a_i \cdot v)$    by dot-product definition of matrix-vector multiply
QED
SLIDE 8
Algebraic properties of matrix-vector multiplication
To prove $A * (u + v) = A * u + A * v$ we need to show corresponding entries are equal:
Need to show: entry i of $A * (u + v)$ = entry i of $A * u + A * v$
Proof: Write $A = \begin{bmatrix} a_1 \\ \vdots \\ a_m \end{bmatrix}$.
By dot-product def. of matrix-vector mult,
    entry i of $A * (u + v)$ = $a_i \cdot (u + v)$
        = $a_i \cdot u + a_i \cdot v$    by distributive property of dot-product
By dot-product def. of matrix-vector mult,
    entry i of $A * u$ = $a_i \cdot u$
    entry i of $A * v$ = $a_i \cdot v$
so entry i of $A * u + A * v$ = $a_i \cdot u + a_i \cdot v$
QED
SLIDE 9
Matrix-matrix multiplication and function composition
Corresponding to an $R \times C$ matrix A over a field F, there is a function $f : F^C \to F^R$, namely the function defined by $f(y) = A * y$.
SLIDE 10
Matrix-matrix multiplication and function composition
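Matrices A and B ⇒ functions $f(y) = A * y$ and $g(x) = B * x$ and $h(x) = (AB) * x$
Matrix-Multiplication Lemma: $f \circ g = h$
Example:
$A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$ ⇒ $f\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right) = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} x_1 + x_2 \\ x_2 \end{bmatrix}$
$B = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}$ ⇒ $g\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right) = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} x_1 \\ x_1 + x_2 \end{bmatrix}$
The product $AB = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix}$ corresponds to the function
$h\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right) = \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 2x_1 + x_2 \\ x_1 + x_2 \end{bmatrix}$
$(f \circ g)\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right) = f\left(\begin{bmatrix} x_1 \\ x_1 + x_2 \end{bmatrix}\right) = \begin{bmatrix} 2x_1 + x_2 \\ x_1 + x_2 \end{bmatrix}$
so $f \circ g = h$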
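The example can be replayed numerically. This sketch assumes A = [[1, 1], [0, 1]] and B = [[1, 0], [1, 1]], the matrices consistent with the formulas for f and g in the example, and compares f(g(x)) with h(x) on a few integer vectors; the helper names and the sample vectors are hypothetical.

```python
def mat_vec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def mat_mul(A, B):
    """Matrix-vector definition: column j of AB is A * (column j of B)."""
    cols = [mat_vec(A, [row[j] for row in B]) for j in range(len(B[0]))]
    # Turn the list of columns back into a list of rows.
    return [[col[i] for col in cols] for i in range(len(A))]


A = [[1, 1], [0, 1]]
B = [[1, 0], [1, 1]]
AB = mat_mul(A, B)


def f(y):
    return mat_vec(A, y)


def g(x):
    return mat_vec(B, x)


def h(x):
    return mat_vec(AB, x)


samples = [[1, 0], [0, 1], [3, -4], [7, 2]]
composition_matches = all(f(g(x)) == h(x) for x in samples)
print(AB, composition_matches)
```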
SLIDE 11
Matrix-matrix multiplication and function composition
Matrices A and B ⇒ functions $f(y) = A * y$ and $g(x) = B * x$ and $h(x) = (AB) * x$
Matrix-Multiplication Lemma: $f \circ g = h$
Proof: Let columns of B be $b_1, \ldots, b_n$. By the matrix-vector definition of matrix-matrix multiplication, column j of AB is $A *$ (column j of B).
For any n-vector $x = [x_1, \ldots, x_n]$,
    $g(x) = B * x$    by definition of g
        $= x_1 b_1 + \cdots + x_n b_n$    by linear combinations definition
Therefore
    $f(g(x)) = f(x_1 b_1 + \cdots + x_n b_n)$
        $= x_1 (f(b_1)) + \cdots + x_n (f(b_n))$    by algebraic properties
        $= x_1 (A * b_1) + \cdots + x_n (A * b_n)$    by definition of f
        $= x_1$ (column 1 of AB) $+ \cdots + x_n$ (column n of AB)    by matrix-vector def.
        $= (AB) * x$    by linear-combinations def.
        $= h(x)$    by definition of h
SLIDE 12
Associativity of matrix-matrix multiplication
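Matrices A and B ⇒ functions $f(y) = A * y$ and $g(x) = B * x$ and $h(x) = (AB) * x$
Matrix-Multiplication Lemma: $f \circ g = h$
Matrix-matrix multiplication corresponds to function composition.
Corollary: Matrix-matrix multiplication is associative: $(AB)C = A(BC)$
Proof: Function composition is associative. QED
Example:
$$\begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} \left( \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} -1 & 3 \\ 1 & 2 \end{bmatrix} \right) = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 0 & 5 \\ 1 & 2 \end{bmatrix} = \begin{bmatrix} 0 & 5 \\ 1 & 7 \end{bmatrix}$$
$$\left( \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \right) \begin{bmatrix} -1 & 3 \\ 1 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} -1 & 3 \\ 1 & 2 \end{bmatrix} = \begin{bmatrix} 0 & 5 \\ 1 & 7 \end{bmatrix}$$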
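Associativity can be spot-checked on this example. In the sketch below the entries of A, B, and C are an assumption: they are chosen so that the intermediate products BC = [[0, 5], [1, 2]] and AB = [[1, 1], [1, 2]] and the final result [[0, 5], [1, 7]] match the numbers shown in the example. The helper mat_mul is hypothetical.

```python
def mat_mul(A, B):
    """Entry (i, j) of AB is the dot-product of row i of A with column j of B."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


# Assumed entries, consistent with the products shown in the example.
A = [[1, 0], [1, 1]]
B = [[1, 1], [0, 1]]
C = [[-1, 3], [1, 2]]

right_first = mat_mul(A, mat_mul(B, C))   # A(BC)
left_first = mat_mul(mat_mul(A, B), C)    # (AB)C
print(right_first, left_first)
```

One matching example does not establish associativity; the corollary's argument via function composition covers all matrices of compatible shapes.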
SLIDE 13
Matrices and their functions
Now we study the relationship between a matrix M and the function $x \mapsto M * x$.
- Easy: Going from a matrix M to the function $x \mapsto M * x$
- A little harder: Going from the function $x \mapsto M * x$ to the matrix M.
In studying this relationship, we come up with the fundamental notion of a linear transformation.
SLIDE 14
From matrix to function
Starting with a matrix M, define the function $f(x) = M * x$.
Domain and co-domain? If M is an $R \times C$ matrix over F then
- domain of f is $F^C$
- co-domain of f is $F^R$
Example: Let M be the matrix
          #    @    ?
    a     1    2    3
    b    10   20   30
and define $f(x) = M * x$.
- Domain of f is $\mathbb{R}^{\{\#, @, ?\}}$.
- Co-domain of f is $\mathbb{R}^{\{a, b\}}$.
f maps
    #    @    ?
    2    2   -2
to
    a    b
    0    0
Example: Define $f(x) = \begin{bmatrix} 1 & 2 & 3 \\ 10 & 20 & 30 \end{bmatrix} * x$.
- Domain of f is $\mathbb{R}^3$
- Co-domain of f is $\mathbb{R}^2$
f maps [2, 2, -2] to [0, 0]
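The labelled example can be sketched with dictionaries in place of the course's Mat and Vec classes. The representation below (a dict of rows, each row a dict keyed by column label) is an assumption made for illustration, and the input vector uses -2 for the '?' entry, which is the value that makes both output entries zero.

```python
# Row labels a, b; column labels #, @, ?.
M = {
    'a': {'#': 1, '@': 2, '?': 3},
    'b': {'#': 10, '@': 20, '?': 30},
}
column_labels = {'#', '@', '?'}


def f(x):
    """The function x -> M * x; x must be a vector over the column-label set of M."""
    if set(x) != column_labels:
        raise ValueError("x is not in the domain of f")
    # The output is a vector over the row-label set of M.
    return {r: sum(M[r][c] * x[c] for c in column_labels) for r in M}


image = f({'#': 2, '@': 2, '?': -2})
print(image)  # {'a': 0, 'b': 0}
```

The domain check mirrors the rule that the product M * x is legal only when the label set of x equals the column-label set of M.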
SLIDE 15
From function to matrix
We have a function $f : F^A \to F^B$. We want to compute a matrix M such that $f(x) = M * x$.
- Since the domain is $F^A$, we know that the input x is an A-vector.
- For the product $M * x$ to be legal, we need the column-label set of M to be A.
- Since the co-domain is $F^B$, we know that the output $f(x) = M * x$ is a B-vector.
- To achieve that, we need the row-label set of M to be B.
Now we know that M must be a $B \times A$ matrix... but what about its entries?
SLIDE 16
From function to matrix
- We have a function $f : F^n \to F^m$
- We think there is an $m \times n$ matrix M such that $f(x) = M * x$
How to go from the function f to the entries of M?
- Write mystery matrix in terms of its columns: $M = \begin{bmatrix} v_1 & \cdots & v_n \end{bmatrix}$
- Use standard generators $e_1 = [1, 0, \ldots, 0, 0], \ldots, e_n = [0, \ldots, 0, 1]$ with linear-combinations definition of matrix-vector multiplication:
    $f(e_1) = \begin{bmatrix} v_1 & \cdots & v_n \end{bmatrix} * [1, 0, \ldots, 0, 0] = v_1$
    $\vdots$
    $f(e_n) = \begin{bmatrix} v_1 & \cdots & v_n \end{bmatrix} * [0, 0, \ldots, 0, 1] = v_n$
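The recipe "column j of M is f(e_j)" translates directly into code. The sketch below is one possible rendering with hypothetical helper names; as a test function it uses a horizontal stretch by two, s([x, y]) = [2x, y], and it presupposes that the function really can be written as a matrix-vector product.

```python
def standard_generator(j, n):
    """Return e_j as a list: 1 in position j (0-based), 0 elsewhere."""
    return [1 if i == j else 0 for i in range(n)]


def matrix_from_function(f, n):
    """Build M (as a list of rows) whose column j is f(e_j).

    Only meaningful if f can be written as x -> M * x.
    """
    columns = [f(standard_generator(j, n)) for j in range(n)]
    m = len(columns[0])
    # Transpose the list of columns into a list of rows.
    return [[columns[j][i] for j in range(n)] for i in range(m)]


def s(v):
    """Stretch by two in the horizontal direction."""
    x, y = v
    return [2 * x, y]


M = matrix_from_function(s, 2)
print(M)  # [[2, 0], [0, 1]]
```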
SLIDE 17
From function to matrix: horizontal scaling
Define s([x, y]) = stretching by two in horizontal direction.
Assume $s([x, y]) = M * [x, y]$ for some matrix M.
- We know s([1, 0]) = [2, 0] because we are stretching by two in horizontal direction.
- We know s([0, 1]) = [0, 1] because no change in vertical direction.
Therefore $M = \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}$
SLIDE 18
From function to matrix: horizontal scaling
[Figure: s maps (1,0) to (2,0).]
SLIDE 19
From function to matrix: horizontal scaling
[Figure: s maps (0,1) to (0,1).]
SLIDE 20
From function to matrix: rotation by 90 degrees
Define r([x, y]) = rotation by 90 degrees.
Assume $r([x, y]) = M * [x, y]$ for some matrix M.
- We know rotating [1, 0] should give [0, 1], so r([1, 0]) = [0, 1]
- We know rotating [0, 1] should give [-1, 0], so r([0, 1]) = [-1, 0]
Therefore $M = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$
SLIDE 21
From function to matrix: rotation by 90 degrees
[Figure: r maps (1,0) to (0,1) and maps (0,1) to (-1,0).]
SLIDE 22
From function to matrix: rotation by θ degrees
Define r([x, y]) = rotation by θ.
Assume $r([x, y]) = M * [x, y]$ for some matrix M.
- We know r([1, 0]) = [cos θ, sin θ], so column 1 is [cos θ, sin θ]
- We know r([0, 1]) = [-sin θ, cos θ], so column 2 is [-sin θ, cos θ]
Therefore $M = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$
[Figure: r maps (1,0) to (cos θ, sin θ).]
SLIDE 23
From function to matrix: rotation by θ degrees
[Figure: r maps (0,1) to (-sin θ, cos θ).]
SLIDE 24
From function to matrix: rotation by θ degrees
For clockwise rotation by 90 degrees, plug in θ = -90 degrees...
Matrix Transform (http://xkcd.com/824)
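The rotation matrix with columns [cos θ, sin θ] and [-sin θ, cos θ] can be built and tried on the standard generators. The sketch below uses hypothetical helper names, takes the angle in degrees, and compares results up to an assumed tolerance of 1e-12 because cos and sin are computed in floating point.

```python
import math


def rotation_matrix(theta_degrees):
    """Matrix (list of rows) for rotation by theta; positive theta is counterclockwise."""
    t = math.radians(theta_degrees)
    return [[math.cos(t), -math.sin(t)],
            [math.sin(t),  math.cos(t)]]


def mat_vec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def close(u, v, tol=1e-12):
    """Entrywise comparison up to an assumed tolerance."""
    return all(abs(a - b) <= tol for a, b in zip(u, v))


# Rotation by 90 degrees sends [1, 0] to [0, 1] and [0, 1] to [-1, 0].
quarter_turn = rotation_matrix(90)
image_e1 = mat_vec(quarter_turn, [1, 0])
image_e2 = mat_vec(quarter_turn, [0, 1])

# Clockwise rotation by 90 degrees: plug in theta = -90.
clockwise_e1 = mat_vec(rotation_matrix(-90), [1, 0])
print(image_e1, image_e2, clockwise_e1)
```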
SLIDE 25
From function to matrix: translation
t([x, y]) = translation by [1, 2].
Assume $t([x, y]) = M * [x, y]$ for some matrix M.
- We know t([1, 0]) = [2, 2], so column 1 is [2, 2].
- We know t([0, 1]) = [1, 3], so column 2 is [1, 3].
Therefore $M = \begin{bmatrix} 2 & 1 \\ 2 & 3 \end{bmatrix}$
SLIDE 26
From function to matrix: translation
[Figure: t maps (1,0) to (2,2).]
SLIDE 27
From function to matrix: translation
[Figure: t maps (0,1) to (1,3).]
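The derivation for translation rests on the assumption that t can be written as a matrix-vector product, and that assumption can be tested. The sketch below compares t with the candidate matrix M = [[2, 1], [2, 3]] on a few points; the helper names and the choice of test points are hypothetical. The two functions agree on the standard generators by construction but disagree elsewhere, which suggests the assumption does not hold for translation.

```python
def t(v):
    """Translation by [1, 2]."""
    x, y = v
    return [x + 1, y + 2]


def mat_vec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


# Candidate matrix whose columns are t([1, 0]) and t([0, 1]).
M = [[2, 1],
     [2, 3]]

points = [[1, 0], [0, 1], [0, 0], [1, 1]]
agreement = {tuple(p): t(p) == mat_vec(M, p) for p in points}
print(agreement)
# {(1, 0): True, (0, 1): True, (0, 0): False, (1, 1): False}
```

In particular M * [0, 0] is [0, 0] for every matrix M, while t([0, 0]) = [1, 2].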
SLIDE 28
From function to matrix: identity function
Consider the function $f : \mathbb{R}^4 \to \mathbb{R}^4$ defined by f(x) = x.
This is the identity function on $\mathbb{R}^4$.
Assume $f(x) = M * x$ for some matrix M.
Plug in the standard generators e1 = [1, 0, 0, 0], e2 = [0, 1, 0, 0], e3 = [0, 0, 1, 0], e4 = [0, 0, 0, 1]
- f(e1) = e1, so first column is e1
- f(e2) = e2, so second column is e2
- f(e3) = e3, so third column is e3
- f(e4) = e4, so fourth column is e4
So $M = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$
Identity function f(x) corresponds to identity matrix $\mathbf{1}$.
SLIDE 29
Diagonal matrices
Let $d_1, \ldots, d_n$ be real numbers. Let $f : \mathbb{R}^n \to \mathbb{R}^n$ be the function such that $f([x_1, \ldots, x_n]) = [d_1 x_1, \ldots, d_n x_n]$. The matrix corresponding to this function is
$$\begin{bmatrix} d_1 & & \\ & \ddots & \\ & & d_n \end{bmatrix}$$
Such a matrix is called a diagonal matrix because the only entries allowed to be nonzero form a diagonal.
Definition: For a domain D, a $D \times D$ matrix M is a diagonal matrix if M[r, c] = 0 for every pair $r, c \in D$ such that $r \neq c$.
Special case: $d_1 = \cdots = d_n = 1$. In this case, f(x) = x (identity function).
The matrix
$$\begin{bmatrix} 1 & & \\ & \ddots & \\ & & 1 \end{bmatrix}$$
is an identity matrix.
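A short sketch can confirm that the diagonal matrix built from d_1, ..., d_n scales entry i of x by d_i, and that the all-ones case gives the identity function. The helper names and the sample values of d and x are hypothetical, and the matrix is indexed by 0, ..., n-1 rather than by a general domain D.

```python
def diagonal_matrix(d):
    """Return the n x n matrix (list of rows) with d on the diagonal and 0 elsewhere."""
    n = len(d)
    return [[d[i] if i == j else 0 for j in range(n)] for i in range(n)]


def is_diagonal(M):
    """M[r][c] must be 0 whenever r != c."""
    return all(M[r][c] == 0
               for r in range(len(M))
               for c in range(len(M[r]))
               if r != c)


def mat_vec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


d = [2, -3, 5]   # hypothetical diagonal entries
x = [7, 1, 4]    # hypothetical input vector

M = diagonal_matrix(d)
scaled = mat_vec(M, x)                       # [d_i * x_i for each i]
identity = diagonal_matrix([1, 1, 1])
unchanged = mat_vec(identity, x)
print(scaled, unchanged)  # [14, -3, 20] [7, 1, 4]
```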
SLIDE 30
Which functions can be expressed as matrix-vector products?
In each example, we assumed the function could be expressed as a matrix-vector product. How can we verify that assumption? We'll state two algebraic properties.
- If a function can be expressed as a matrix-vector product $x \mapsto M * x$, it has these properties.
- If the function from $F^C$ to $F^R$ has these properties, it can be expressed as a matrix-vector product.
SLIDE 31