NLA Reading Group Spring'13 by İsmail Arı (PowerPoint PPT Presentation)




SLIDE 1

NLA Reading Group Spring’13

by İsmail Arı

SLIDE 2

In the product 𝐴𝑥 = 𝑏, the vector 𝑏 is a linear combination of the columns of 𝐴.

SLIDE 3

Let us re-write the matrix-vector multiplication. “As mathematicians, we are used to viewing the formula 𝐴𝑥 = 𝑏 as a statement that 𝐴 acts on 𝑥 to produce 𝑏. The new formula, by contrast, suggests the interpretation that 𝑥 acts on 𝐴 to produce 𝑏.”
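The two readings can be sketched numerically. This is only an illustration; the matrix 𝐴 and vector 𝑥 below are made-up data, not taken from the slides:

```python
import numpy as np

# Invented 3x2 matrix A and 2-vector x, for illustration only.
A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
x = np.array([10.0, -1.0])

# View 1: A acts on x (m distinct scalar summations, one per row).
b_rows = np.array([A[i, :] @ x for i in range(A.shape[0])])

# View 2: x acts on A (a single vector summation of scaled columns).
b_cols = x[0] * A[:, 0] + x[1] * A[:, 1]

assert np.allclose(b_rows, A @ x)
assert np.allclose(b_cols, A @ x)
```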

SLIDE 4

The map from vectors of coefficients of polynomials 𝑝 of degree < 𝑛 to vectors (𝑝(𝑥1), 𝑝(𝑥2), …, 𝑝(𝑥𝑚)) of sampled polynomial values is linear. The product 𝐴𝑐 gives the sampled polynomial values:

SLIDE 5

Do not see 𝐴𝑐 as 𝑚 distinct scalar summations. Instead, see 𝐴 as a matrix of columns, each giving the sampled values of a monomial*. Thus, 𝐴𝑐 is a single vector summation that at once gives a linear combination of these monomials.

*In mathematics, a monomial is, roughly speaking, a polynomial that has only one term.
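A short sketch of this viewpoint using a Vandermonde matrix, whose 𝑗th column samples the monomial 𝑡^𝑗 (the sample points and coefficients are invented for illustration):

```python
import numpy as np

# Invented sample points x_1..x_m and polynomial coefficients c.
x_pts = np.array([0.0, 1.0, 2.0, 3.0])
c = np.array([1.0, -2.0, 0.5])          # p(t) = 1 - 2t + 0.5 t^2

# Vandermonde matrix: column j holds the sampled values of t^j.
A = np.vander(x_pts, N=len(c), increasing=True)

# A @ c is one vector summation: a linear combination of sampled monomials.
sampled = A @ c
expected = np.array([1 - 2*t + 0.5*t**2 for t in x_pts])
assert np.allclose(sampled, expected)
```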

SLIDE 6

In the matrix-matrix product 𝐵 = 𝐴𝐶, each column of 𝐵 is a linear combination of the columns of 𝐴. Thus 𝑏𝑗 is a linear combination of the columns 𝑎𝑘 with coefficients 𝑐𝑘𝑗.

SLIDE 7

SLIDE 8

The matrix 𝑅 is a discrete analogue of an indefinite integral operator.
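A minimal sketch of this idea, assuming a triangular matrix of ones (lower-triangular here; the slide's exact convention is not shown): applied on the left, it computes running sums, the discrete counterpart of an indefinite integral.

```python
import numpy as np

m = 5
# Lower-triangular matrix of ones: (R v)_i = v_1 + ... + v_i,
# a discrete analogue of F(t) = integral of f from 0 to t.
R = np.tril(np.ones((m, m)))

v = np.array([3.0, 1.0, 4.0, 1.0, 5.0])
assert np.allclose(R @ v, np.cumsum(v))
```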

SLIDE 9

null(𝐴) is the set of vectors 𝑥 that satisfy 𝐴𝑥 = 0, where 0 is the zero vector in ℂ𝑚. range(𝐴) is the space spanned by the columns of 𝐴. The column/row rank of a matrix is the dimension of its column/row space. Column rank always equals row rank, so we simply call this the rank of the matrix. An 𝑚 × 𝑛 matrix 𝐴 with 𝑚 ≥ 𝑛 has full rank iff it maps no two distinct vectors to the same vector.

SLIDE 10

A nonsingular or invertible matrix is a square matrix of full rank. 𝐼 is the 𝑚 × 𝑚 identity. The matrix 𝑍 satisfying 𝐴𝑍 = 𝐼 is the inverse of 𝐴, written 𝐴−1.

SLIDE 11

For an 𝑚 × 𝑚 matrix 𝐴, the following conditions are equivalent: 𝐴 has an inverse; rank(𝐴) = 𝑚; range(𝐴) = ℂ𝑚; null(𝐴) = {0}; 0 is not an eigenvalue of 𝐴; det(𝐴) ≠ 0. We mention the determinant because, though a convenient notion theoretically, it rarely finds a useful role in numerical algorithms.

SLIDE 12

Do not think of 𝑥 as the result of applying 𝐴−1 to 𝑏. Instead, think of it as the unique vector that satisfies the equation 𝐴𝑥 = 𝑏. 𝐴−1𝑏 is the vector of coefficients of the expansion of 𝑏 in the basis of the columns of 𝐴. Multiplication by 𝐴−1 is a change-of-basis operation.
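A small sketch of the change-of-basis reading, on an invented (almost surely nonsingular) random matrix; `np.linalg.solve` finds the unique 𝑥 with 𝐴𝑥 = 𝑏 without ever forming 𝐴−1:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))   # assumed nonsingular, for illustration
b = rng.standard_normal(4)

# x is the unique vector with A x = b: the coefficients of b
# in the basis formed by the columns of A.
x = np.linalg.solve(A, b)         # preferred over forming inv(A)

# Expanding b in the column basis of A reproduces b.
assert np.allclose(A @ x, b)
# Same coefficients as applying the inverse explicitly.
assert np.allclose(x, np.linalg.inv(A) @ b)
```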

SLIDE 13

NLA Reading Group Spring’13

by İsmail Arı

SLIDE 14

The complex conjugate of a scalar 𝑧, written 𝑧̄ or 𝑧∗, is obtained by negating its imaginary part. The hermitian conjugate or adjoint of an 𝑚 × 𝑛 matrix 𝐴, written 𝐴∗, is the 𝑛 × 𝑚 matrix whose 𝑖, 𝑗 entry is the complex conjugate of the 𝑗, 𝑖 entry of 𝐴. If 𝐴 = 𝐴∗, 𝐴 is hermitian. For a real matrix 𝐴, the adjoint is known as the transpose and written 𝐴𝑇. If 𝐴 = 𝐴𝑇, then 𝐴 is symmetric.

SLIDE 15

The Euclidean length of 𝑥 is ‖𝑥‖ = √(𝑥∗𝑥). The inner product is bilinear, i.e., linear in each vector separately:

SLIDE 16

A pair of vectors 𝑥 and 𝑦 are orthogonal if 𝑥∗𝑦 = 0. Two sets of vectors 𝑋 and 𝑌 are orthogonal if every 𝑥 ∈ 𝑋 is orthogonal to every 𝑦 ∈ 𝑌. A set of nonzero vectors 𝑆 is orthogonal if its elements are pairwise orthogonal; it is orthonormal if, in addition, every 𝑥 ∈ 𝑆 has ‖𝑥‖ = 1.

SLIDE 17

The vectors in an orthogonal set 𝑆 are linearly independent. Sketch of the proof:  Assume they were not independent and express some nonzero vector as a linear combination of the members of 𝑆  Observe that its length must be larger than 0  Use the bilinearity of inner products and the orthogonality of 𝑆 to contradict the assumption. If an orthogonal set 𝑆 ⊆ ℂ𝑚 contains 𝑚 vectors, then it is a basis for ℂ𝑚.

SLIDE 18

Inner products can be used to decompose arbitrary vectors into orthogonal components. Assume 𝑞1, 𝑞2, …, 𝑞𝑛 is an orthonormal set and 𝑣 is an arbitrary vector. Utilizing the scalars 𝑞𝑖∗𝑣 as coordinates in an expansion, we find that the remainder 𝑟 = 𝑣 − (𝑞1∗𝑣)𝑞1 − (𝑞2∗𝑣)𝑞2 − ⋯ − (𝑞𝑛∗𝑣)𝑞𝑛 is orthogonal to 𝑞1, 𝑞2, …, 𝑞𝑛. Thus we see that 𝑣 can be decomposed into 𝑛 + 1 orthogonal components: 𝑣 = 𝑟 + Σ𝑖 (𝑞𝑖∗𝑣)𝑞𝑖.
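The decomposition can be sketched as follows; the orthonormal set is obtained here via a QR factorization of random data, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
v = rng.standard_normal(5)

# Orthonormal set {q_1, q_2, q_3}: columns of Q from a QR factorization
# of invented data (any orthonormal columns would do).
Q, _ = np.linalg.qr(rng.standard_normal((5, 3)))

# Coefficients q_i* v, and the remainder r orthogonal to every q_i.
coeffs = Q.T @ v
r = v - Q @ coeffs

assert np.allclose(Q.T @ r, 0)          # r is orthogonal to each q_i
assert np.allclose(r + Q @ coeffs, v)   # the n + 1 components sum to v
```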

SLIDE 19

In this equation we can view 𝑣 in two ways. First, as a sum of coefficients 𝑞𝑖∗𝑣 times vectors 𝑞𝑖. Second, as a sum of orthogonal projections of 𝑣 onto the various directions 𝑞𝑖. The 𝑖th projection operation is achieved by the very special rank-one matrix 𝑞𝑖𝑞𝑖∗.

SLIDE 20

If 𝑄∗ = 𝑄−1, 𝑄 is unitary.

SLIDE 21

𝑄∗𝑏 is the vector of coefficients of the expansion of 𝑏 in the basis of the columns of 𝑄.

SLIDE 22

Multiplication by a unitary matrix or its adjoint preserves geometric structure in the Euclidean sense, because inner products are preserved. The invariance of inner products means that angles between vectors are preserved, and so are their lengths. In the real case, multiplication by an orthogonal matrix 𝑄 corresponds to a rigid rotation (if det 𝑄 = 1) or reflection (if det 𝑄 = −1) of the vector space.
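A quick numerical check of these invariances; the orthogonal matrix is generated from random data via QR, an assumption of this sketch:

```python
import numpy as np

rng = np.random.default_rng(2)
# A standard way to obtain an orthogonal matrix: QR of random data.
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))

x = rng.standard_normal(4)
y = rng.standard_normal(4)

# Inner products, hence both lengths and angles, are preserved.
assert np.isclose((Q @ x) @ (Q @ y), x @ y)
assert np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x))
```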

SLIDE 23

NLA Reading Group Spring’13

by İsmail Arı

SLIDE 24

The essential notions of size and distance in a vector space are captured by norms. In order to provide a reasonable notion of length, a norm must satisfy three conditions. For all vectors 𝑥 and 𝑦 and for all scalars 𝛼 ∈ ℂ: (1) ‖𝑥‖ ≥ 0, and ‖𝑥‖ = 0 only if 𝑥 = 0; (2) ‖𝑥 + 𝑦‖ ≤ ‖𝑥‖ + ‖𝑦‖; (3) ‖𝛼𝑥‖ = |𝛼| ‖𝑥‖.

SLIDE 25

The closed unit ball {𝑥 ∈ ℂ𝑚 : ‖𝑥‖ ≤ 1} corresponding to each norm is illustrated to the right for the case 𝑚 = 2.

SLIDE 26

Example: a weighted 2-norm. Introduce the diagonal matrix 𝑊 whose 𝑖th diagonal entry is the weight 𝑤𝑖 ≠ 0; then ‖𝑥‖𝑊 = ‖𝑊𝑥‖2. The most important norms in this book are the unweighted 2-norm and its induced matrix form.

SLIDE 27

An 𝑚 × 𝑛 matrix can be viewed as a vector in an 𝑚𝑛-dimensional space: each of the 𝑚𝑛 entries of the matrix is an independent coordinate. ⇒ Any 𝑚𝑛-dimensional norm can be used for measuring the “size” of such a matrix. However, certain special matrix norms are more useful than the vector norms: the induced matrix norms, defined in terms of the behavior of a matrix as an operator between its normed domain and range spaces.

SLIDE 28

Given vector norms ‖⋅‖(𝑛) and ‖⋅‖(𝑚) on the domain and range of 𝐴 ∈ ℂ𝑚×𝑛, respectively, the induced matrix norm ‖𝐴‖(𝑚,𝑛) is the smallest number 𝐶 for which ‖𝐴𝑥‖(𝑚) ≤ 𝐶 ‖𝑥‖(𝑛) for all 𝑥. In other words, it is the maximum factor by which 𝐴 can stretch a vector 𝑥.

SLIDE 29

SLIDE 30

SLIDE 31

For any 𝑚 × 𝑛 matrix 𝐴, ‖𝐴‖1 is equal to the maximum column sum of 𝐴. Writing 𝐴𝑥 as a linear combination of the columns 𝑎𝑗 and applying the triangle inequality gives ‖𝐴𝑥‖1 ≤ max𝑗 ‖𝑎𝑗‖1 for any 𝑥 with ‖𝑥‖1 = 1. By choosing 𝑥 = 𝑒𝑗, where 𝑗 maximizes ‖𝑎𝑗‖1, we attain this bound.
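This identity is easy to check numerically; the matrix below is invented for illustration:

```python
import numpy as np

A = np.array([[ 1.0, -7.0,  2.0],
              [-3.0,  4.0,  0.5]])

# Maximum column sum of absolute values (here: max of 4, 11, 2.5).
max_col_sum = max(np.abs(A[:, j]).sum() for j in range(A.shape[1]))
assert np.isclose(np.linalg.norm(A, 1), max_col_sum)

# The bound is attained at the standard basis vector e_j for the
# column with the largest 1-norm (here j = 1).
e = np.zeros(3)
e[1] = 1.0
assert np.isclose(np.abs(A @ e).sum(), max_col_sum)
```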

SLIDE 32

For any 𝑚 × 𝑛 matrix 𝐴, ‖𝐴‖∞ is equal to the maximum row sum of 𝐴.

SLIDE 33

Let 𝑝 and 𝑞 satisfy 1/𝑝 + 1/𝑞 = 1, with 1 ≤ 𝑝, 𝑞 ≤ ∞. Then the Hölder inequality states that, for any vectors 𝑥 and 𝑦, |𝑥∗𝑦| ≤ ‖𝑥‖𝑝 ‖𝑦‖𝑞. The Cauchy–Schwarz inequality is the special case 𝑝 = 𝑞 = 2: |𝑥∗𝑦| ≤ ‖𝑥‖2 ‖𝑦‖2.

SLIDE 34

Consider 𝐴 = 𝑎∗, a matrix with a single row, where 𝑎 is a column vector. For any 𝑥, the Cauchy–Schwarz inequality gives ‖𝐴𝑥‖2 = |𝑎∗𝑥| ≤ ‖𝑎‖2 ‖𝑥‖2. This bound is tight: observe that for 𝑥 = 𝑎 we get |𝑎∗𝑎| = ‖𝑎‖2². Therefore, we have ‖𝐴‖2 = ‖𝑎‖2.

SLIDE 35

Consider the rank-one outer product 𝐴 = 𝑢𝑣∗, where 𝑢 is an 𝑚-vector and 𝑣 is an 𝑛-vector. For any 𝑛-vector 𝑥, we can bound ‖𝐴𝑥‖2 = ‖𝑢(𝑣∗𝑥)‖2 = |𝑣∗𝑥| ‖𝑢‖2 ≤ ‖𝑢‖2 ‖𝑣‖2 ‖𝑥‖2. Therefore, we have ‖𝐴‖2 ≤ ‖𝑢‖2 ‖𝑣‖2. This inequality is an equality for the case 𝑥 = 𝑣.
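A sketch verifying the rank-one result on invented data, using the fact that the induced 2-norm equals the largest singular value:

```python
import numpy as np

rng = np.random.default_rng(3)
u = rng.standard_normal(4)   # m-vector
v = rng.standard_normal(3)   # n-vector

A = np.outer(u, v)           # rank-one matrix u v*

# The induced 2-norm equals ||u||_2 ||v||_2, attained at x = v.
nrm = np.linalg.norm(A, 2)
assert np.isclose(nrm, np.linalg.norm(u) * np.linalg.norm(v))
assert np.isclose(np.linalg.norm(A @ v) / np.linalg.norm(v), nrm)
```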

SLIDE 36

Therefore, the induced norm of 𝐴𝐵 must satisfy ‖𝐴𝐵‖ ≤ ‖𝐴‖ ‖𝐵‖.

SLIDE 37

SLIDE 38

The most important matrix norm which is not induced by a vector norm is the Hilbert–Schmidt or Frobenius norm, defined by ‖𝐴‖𝐹 = (Σ𝑖 Σ𝑗 |𝑎𝑖𝑗|²)^(1/2). Observe that this is the same as the 2-norm of the matrix when viewed as an 𝑚𝑛-dimensional vector. Alternatively, we can write ‖𝐴‖𝐹 = √(tr(𝐴∗𝐴)) = √(tr(𝐴𝐴∗)).
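The equivalent expressions can be sketched numerically on a small invented matrix:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

entrywise = np.sqrt((np.abs(A) ** 2).sum())       # the definition
as_vector = np.linalg.norm(A.ravel())             # 2-norm of the mn-vector
via_trace = np.sqrt(np.trace(A.conj().T @ A))     # sqrt(tr(A* A))

assert np.isclose(entrywise, np.linalg.norm(A, 'fro'))
assert np.isclose(as_vector, entrywise)
assert np.isclose(via_trace, entrywise)
```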

SLIDE 39

Let 𝐶 = 𝐴𝐵; then ‖𝐶‖𝐹 ≤ ‖𝐴‖𝐹 ‖𝐵‖𝐹.

SLIDE 40

The matrix 2-norm and Frobenius norm are invariant under multiplication by unitary matrices: ‖𝑄𝐴‖2 = ‖𝐴‖2 and ‖𝑄𝐴‖𝐹 = ‖𝐴‖𝐹. This fact is still valid if 𝑄 is generalized to a rectangular matrix with orthonormal columns. Recall the transformation used in PCA.