

SLIDE 1

CSE 446: Linear Algebra Review

Sachin Mehta

University of Washington, Seattle Email: sacmehta@uw.edu


SLIDE 2

Things to get from Today

- Basics of vector and matrix operations
- Matrix differentiation
- Eigenvalues and eigenvectors


SLIDE 3

Vectors

A vector $v \in \mathbb{R}^n$ is an n-tuple of real numbers:

$$v = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}, \quad \text{e.g. } v = \begin{bmatrix} 2 \\ 3 \\ 5 \end{bmatrix}$$

The length of $v$ is $\|v\| = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2}$.
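A quick check in NumPy, using the vector from the slide (NumPy itself is introduced on the last slide):

```python
import numpy as np

v = np.array([2, 3, 5])
print(np.linalg.norm(v))   # sqrt(2^2 + 3^2 + 5^2) = sqrt(38) ≈ 6.1644
```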


SLIDE 4

Vector Operations

Addition and Subtraction: to add or subtract two vectors, add or subtract them component-wise:

$$v \pm u = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} \pm \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix} = \begin{bmatrix} v_1 \pm u_1 \\ v_2 \pm u_2 \\ \vdots \\ v_n \pm u_n \end{bmatrix}$$

Scaling: this is just like expanding or shrinking the vector. Let $\alpha$ be a scalar; then the vector $v$ after scaling with $\alpha$ is:

$$v' = \alpha v = \begin{bmatrix} \alpha v_1 \\ \alpha v_2 \\ \vdots \\ \alpha v_n \end{bmatrix}$$
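Both operations are element-wise on NumPy arrays; a minimal sketch with made-up vectors:

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])
u = np.array([4.0, 5.0, 6.0])

print(v + u)      # component-wise addition: [5. 7. 9.]
print(v - u)      # component-wise subtraction: [-3. -3. -3.]
print(2.0 * v)    # scaling by alpha = 2: [2. 4. 6.]
```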


SLIDE 5

Vector Operations

Inner (or dot) product of two vectors: let $u$ and $v$ be two vectors; then their dot product is defined as:

$$u \cdot v = u^T v = \begin{bmatrix} u_1 & u_2 & \cdots & u_n \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n$$
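In NumPy this is numpy.dot (or the @ operator); toy vectors for illustration:

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])

print(np.dot(u, v))   # u1*v1 + u2*v2 + u3*v3 = 32.0
print(u @ v)          # equivalent operator form
```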


SLIDE 6

Matrices

A matrix $A_{N \times M}$ is written as:

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1M} \\ a_{21} & a_{22} & \cdots & a_{2M} \\ \vdots & \vdots & \ddots & \vdots \\ a_{N1} & a_{N2} & \cdots & a_{NM} \end{bmatrix}$$

Addition and Subtraction: $C = A \pm B \iff c_{ij} = a_{ij} \pm b_{ij}$

Matrix Product: the product of matrix $A_{n \times m}$ and $B_{m \times p}$ is another matrix $C_{n \times p}$ given by the formula:

$$C = AB \iff c_{ij} = \sum_{k=1}^{m} a_{ik} b_{kj}$$

Note: matrix multiplication is not commutative ($AB \neq BA$ in general).
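A sketch of the product and of non-commutativity, with two small made-up matrices:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])

print(np.matmul(A, B))            # [[2 1] [4 3]]
print(np.allclose(A @ B, B @ A))  # False: AB != BA in general
```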


SLIDE 7

Matrices

Inverse of a matrix

Matrices can be divided by a scalar, but how can we divide a matrix by a matrix? Take the inverse. The inverse of a matrix $A$ is denoted by $A^{-1}$. If $A$ is a square matrix, then $AA^{-1} = I$. Not all matrices are invertible (refer to a linear algebra course for more details). In general:

- Invertible matrices are square.
- Invertible matrices have LINEARLY INDEPENDENT COLUMNS (i.e. no column of the matrix can be written as a combination of the others).
- The DETERMINANT of the matrix is nonzero: $\det(A) \neq 0$.

This is not a linear algebra class. We will not ask you to solve large matrix inverses by hand. Instead, use software! Linear algebra package for Python: numpy.linalg (https://docs.scipy.org/doc/numpy/reference/routines.linalg.html)
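For instance, with an invertible 2x2 matrix of our choosing:

```python
import numpy as np

A = np.array([[4.0, 7.0],
              [2.0, 6.0]])         # det(A) = 10, so A is invertible
A_inv = np.linalg.inv(A)

print(np.allclose(A @ A_inv, np.eye(2)))   # True: A A^{-1} = I
```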


SLIDE 8

Matrices

Trace of a matrix: $\mathrm{Tr}(A)$ is the sum of the diagonal elements of the matrix:

$$\mathrm{Tr}(A) = \sum_{i=1}^{n} a_{ii}$$
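In NumPy this is numpy.trace; a tiny example:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
print(np.trace(A))   # 1 + 4 = 5
```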


SLIDE 9

Matrix-Vector Product

Multiplying a vector $v$ by a matrix $A$ transforms the vector $v$ into a new vector $w$. Note that $w$ is not always the same dimension as $v$:

$$w = Av = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \begin{bmatrix} a_{11}v_1 + a_{12}v_2 \\ a_{21}v_1 + a_{22}v_2 \\ a_{31}v_1 + a_{32}v_2 \end{bmatrix}$$

Example: rotating a vector by 30°.
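The rotation example in code; the 2-D rotation matrix below is the standard textbook form (the slide only names the example):

```python
import numpy as np

theta = np.deg2rad(30)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

v = np.array([1.0, 0.0])
print(R @ v)   # [0.866 0.5]: v rotated 30 degrees counterclockwise
```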


SLIDE 10

Matrix Differentiation

Differentiation of a matrix with respect to a scalar

- Very useful in Machine Learning: finding gradients and maxima/minima for optimization.
- Not as different from regular calculus as you may think.

If each entry $a_{ij}$ of matrix $A$ is some function of $x$, then the result is a matrix of the form:

$$\frac{\delta A}{\delta x} = \begin{bmatrix} \frac{\delta a_{11}}{\delta x} & \frac{\delta a_{12}}{\delta x} & \cdots & \frac{\delta a_{1M}}{\delta x} \\ \frac{\delta a_{21}}{\delta x} & \frac{\delta a_{22}}{\delta x} & \cdots & \frac{\delta a_{2M}}{\delta x} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\delta a_{N1}}{\delta x} & \frac{\delta a_{N2}}{\delta x} & \cdots & \frac{\delta a_{NM}}{\delta x} \end{bmatrix}$$

Example: $A = \begin{bmatrix} x & x^2 \\ 1 & x \end{bmatrix}$, then $\frac{\delta A}{\delta x} = \begin{bmatrix} 1 & 2x \\ 0 & 1 \end{bmatrix}$
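The same example checked symbolically with SymPy (a library the slides do not mention; a sketch, not part of the course tooling):

```python
import sympy as sp

x = sp.symbols('x')
A = sp.Matrix([[x, x**2],
               [1, x]])

print(A.diff(x))   # Matrix([[1, 2*x], [0, 1]]): element-wise derivative
```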


SLIDE 11

Matrix Differentiation

Differentiation of a scalar function with respect to a matrix

Also known as the Gradient matrix. Given a scalar function of a matrix, $y = f(X)$, the derivative $\frac{\delta y}{\delta X}$ is:

$$\frac{\delta y}{\delta X} = \begin{bmatrix} \frac{\delta y}{\delta x_{11}} & \frac{\delta y}{\delta x_{12}} & \cdots & \frac{\delta y}{\delta x_{1M}} \\ \frac{\delta y}{\delta x_{21}} & \frac{\delta y}{\delta x_{22}} & \cdots & \frac{\delta y}{\delta x_{2M}} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\delta y}{\delta x_{N1}} & \frac{\delta y}{\delta x_{N2}} & \cdots & \frac{\delta y}{\delta x_{NM}} \end{bmatrix}$$

Example: Linear Regression

$$\hat{w} = \arg\min_w \sum_{i=1}^{N} (x_i \cdot w - y_i)^2$$

How to find the arg min? Take the derivative, of course... but with respect to $w$, which is a weight vector, not a single variable.
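Before differentiating by hand, note that NumPy can already minimize this objective; a sketch on synthetic data (all values here are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))            # rows are the examples x_i
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=100)

# np.linalg.lstsq minimizes sum_i (x_i . w - y_i)^2 over w
w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(w_hat)   # close to w_true
```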


SLIDE 12

Matrix Differentiation

Differentiation of a scalar function with respect to a matrix

$$\frac{\delta}{\delta w} \sum_{i=1}^{N} (x_i \cdot w - y_i)^2 = \begin{bmatrix} \frac{\delta}{\delta w_1} \sum_{i=1}^{N} (x_i \cdot w - y_i)^2 \\ \vdots \\ \frac{\delta}{\delta w_n} \sum_{i=1}^{N} (x_i \cdot w - y_i)^2 \end{bmatrix} = \begin{bmatrix} \frac{\delta}{\delta w_1} \sum_{i=1}^{N} (x_{i1} w_1 + \ldots + x_{in} w_n - y_i)^2 \\ \vdots \\ \frac{\delta}{\delta w_n} \sum_{i=1}^{N} (x_{i1} w_1 + \ldots + x_{in} w_n - y_i)^2 \end{bmatrix} = \begin{bmatrix} \sum_{i=1}^{N} 2 x_{i1} (x_i \cdot w - y_i) \\ \vdots \\ \sum_{i=1}^{N} 2 x_{in} (x_i \cdot w - y_i) \end{bmatrix}$$

Collecting the components, this is $2 \sum_{i=1}^{N} x_i (x_i \cdot w - y_i)$, i.e. $2 X^T (Xw - y)$ in matrix form.

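A sanity check of the derived gradient against a finite-difference approximation (synthetic data; the helper names are ours):

```python
import numpy as np

def grad(w, X, y):
    # 2 * sum_i x_i (x_i . w - y_i), i.e. 2 X^T (Xw - y)
    return 2 * X.T @ (X @ w - y)

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
y = rng.normal(size=50)
w = rng.normal(size=3)

loss = lambda w: np.sum((X @ w - y) ** 2)
eps = 1e-6
e1 = np.array([eps, 0.0, 0.0])
approx = (loss(w + e1) - loss(w - e1)) / (2 * eps)   # d(loss)/dw_1
print(np.isclose(approx, grad(w, X, y)[0]))          # True
```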

SLIDE 13

Eigenvectors and Eigenvalues

When a vector $x$ is multiplied by a matrix $A$ (so the resultant vector is $Ax$), the direction of the resulting vector generally changes; for example, rotating a vector. There are certain vectors $x$, however, whose direction $Ax$ leaves unchanged. Such vectors are called eigenvectors.

When we multiply $A$ with such an $x$, the resultant vector is $x$ scaled by $\lambda$. This $\lambda$ is called an eigenvalue and determines whether the vector $x$ is stretched, shrunk, reversed, or left unchanged when multiplied by $A$.

Matrix $A$ has eigenvector $x$ and eigenvalue $\lambda$ if for some nonzero $x$ we have $Ax = \lambda x$, i.e. $(A - \lambda I)x = 0$, where the $n$ solutions $\lambda$ are given by the characteristic equation $\det(A - \lambda I) = 0$. Determinants are tedious to compute by hand; use mathematical libraries to compute them.
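numpy.linalg.eig returns both at once; a sketch on a small symmetric matrix of our choosing:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
vals, vecs = np.linalg.eig(A)   # eigenvalues 3.0 and 1.0

# each column of vecs is an eigenvector: A x = lambda x
x, lam = vecs[:, 0], vals[0]
print(np.allclose(A @ x, lam * x))   # True
```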


SLIDE 14

Example

Prove that $\mathrm{Tr}(AC) = \mathrm{Tr}(CA)$:

$$\mathrm{Tr}(AC) = \sum_{i=1}^{n} (AC)_{ii} = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} c_{ji} = \sum_{j=1}^{n} \sum_{i=1}^{n} c_{ji} a_{ij} = \sum_{j=1}^{n} (CA)_{jj} = \mathrm{Tr}(CA)$$
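A quick numerical spot-check of the identity on random matrices:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(4, 4))
C = rng.normal(size=(4, 4))

print(np.isclose(np.trace(A @ C), np.trace(C @ A)))   # True
```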


SLIDE 15

Useful Python Libraries/Functions

Linear algebra library (numpy.linalg): this library has functions that are useful for the programming assignments (each is demonstrated briefly below):

- Matrix inverse: numpy.linalg.inv
- Eigenvalues: numpy.linalg.eig
- Dot product of two arrays: numpy.dot
- Matrix multiplication: numpy.matmul
- Dot product of two vectors: numpy.vdot

Link: https://docs.scipy.org/doc/numpy/reference/routines.linalg.html
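A minimal tour of those functions (matrices and vectors made up for illustration):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[0.0, 1.0],
              [1.0, 0.0]])
u = np.array([1.0, 2.0])
v = np.array([3.0, 4.0])

print(np.linalg.inv(A))    # matrix inverse
print(np.linalg.eig(A))    # eigenvalues and eigenvectors
print(np.dot(A, B))        # dot product of two arrays (matrix product for 2-D)
print(np.matmul(A, B))     # matrix multiplication
print(np.vdot(u, v))       # dot product of two vectors: 1*3 + 2*4 = 11.0
```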
