
Numerical Linear Algebra. EECS 442, David Fouhey, Fall 2019, University of Michigan. http://web.eecs.umich.edu/~fouhey/teaching/EECS442_W19/ Administrivia: HW 1 is out and due in two weeks. Follow the submission format (wrong format = 0).


  1. Matrices. Horizontally concatenate n m-dimensional column vectors and you get an m×n matrix A (here 2×3): A = [v_1, ..., v_n] = [v_1^1 v_2^1 v_3^1; v_1^2 v_2^2 v_3^2], where v_i^j is the j-th entry of v_i. Notation: a scalar is an undecorated lowercase letter (a), a vector is bold or arrowed lowercase (a), and a matrix is bold uppercase (A).

  2. Matrices. Transpose: flip rows and columns (a 3×1 column vector transposed is 1×3), e.g. [a; b; c]^T = [a b c]. Vertically concatenate m n-dimensional row vectors and you get an m×n matrix A (here 2×3): A = [u_1^T; ...; u_m^T] = [u_1^1 u_1^2 u_1^3; u_2^1 u_2^2 u_2^3].

  3. Matrix-Vector Product. y_(2×1) = A_(2×3) x_(3×1): [y_1; y_2] = [c_1 c_2 c_3] [x_1; x_2; x_3], so y = x_1 c_1 + x_2 c_2 + x_3 c_3, a linear combination of the columns of A.

  4. Matrix-Vector Product. y_(2×1) = A_(2×3) x_(3×1): [y_1; y_2] = [r_1^T; r_2^T] x, so y_1 = r_1^T x and y_2 = r_2^T x, dot products between the rows of A and x.
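As a quick check (not from the slides; the matrix and vector are made up), both views of the product in NumPy:

import numpy as np
A = np.array([[1., 2., 3.],
              [4., 5., 6.]])                           # 2x3
x = np.array([1., 0., 2.])                             # 3-vector
y = A @ x                                              # matrix-vector product
y_cols = x[0]*A[:, 0] + x[1]*A[:, 1] + x[2]*A[:, 2]    # linear combination of columns
y_rows = np.array([A[0] @ x, A[1] @ x])                # dot products with the rows
assert np.allclose(y, y_cols) and np.allclose(y, y_rows)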

  5. Matrix Multiplication. Generally: A (m×n) and B (n×p) yield a product (AB) that is m×p. Write A in terms of its rows a_1^T, ..., a_m^T and B in terms of its columns b_1, ..., b_p: AB = [a_1^T; ...; a_m^T] [b_1 ... b_p]. Yes: in A, I'm referring to the rows, and in B, I'm referring to the columns.

  6. Matrix Multiplication. Generally: A (m×n) and B (n×p) yield a product (AB) that is m×p, whose entries are all the row-column dot products: (AB)_ij = a_i^T b_j.

  7. Matrix Multiplication. • Dimensions must match • Dimensions must match • Dimensions must match • (Yes, it's associative): ABx = (A)(Bx) = (AB)x • (No, it's not commutative): ABx ≠ (BA)x ≠ (BxA)
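A small NumPy sanity check of the associativity/commutativity claims (random matrices, sizes chosen by me):

import numpy as np
rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3))
B = rng.standard_normal((3, 3))
C = rng.standard_normal((3, 3))
x = rng.standard_normal(3)
assert np.allclose(A @ (B @ x), (A @ B) @ x)   # associative
print(np.allclose(B @ C, C @ B))               # almost surely False: not commutative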

  8. Operations They Don't Teach You. You probably saw matrix addition: [a b; c d] + [e f; g h] = [a+e b+f; c+g d+h]. What is this? [a b; c d] + e, where (FYI) e is a scalar: [a b; c d] + e = [a+e b+e; c+e d+e].

  9. Broadcasting. If you want to be pedantic and proper, you expand e by multiplying by a matrix of 1s (denoted 1): [a b; c d] + e = [a b; c d] + 1_(2×2) e = [a b; c d] + [e e; e e]. Many smart matrix libraries do this automatically. This is the source of many bugs.
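A minimal NumPy illustration (values are mine): the scalar version and the explicit matrix-of-1s version agree, because the library broadcasts for you:

import numpy as np
M = np.array([[1., 2.],
              [3., 4.]])
e = 10.0
explicit = M + e * np.ones((2, 2))   # the pedantic version with a matrix of 1s
broadcast = M + e                    # what NumPy does automatically
assert np.allclose(explicit, broadcast)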

  10. Broadcasting Example. Given an n×2 matrix P and a 2D column vector v, we want the n×2 difference matrix D. With P = [x_1 y_1; ...; x_n y_n] and v = [a; b]: D = [x_1−a y_1−b; ...; x_n−a y_n−b]. Written as P − v^T, the blue stuff (v^T copied into every row) is assumed / broadcast.
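The same computation as a NumPy sketch (point values made up); v is laid out as a 1×2 row so the broadcast subtracts it from every row of P:

import numpy as np
P = np.array([[0., 1.],
              [2., 3.],
              [4., 5.]])        # n x 2 points
v = np.array([[1., 2.]])        # 1 x 2, i.e. v transposed
D = P - v                       # broadcast down the rows: n x 2 difference matrix
print(D)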

  11. Two Uses for Matrices. 1. Storing things in a rectangular array (images, maps). Typical operations: element-wise operations, convolution (which we'll cover next). Atypical operations: almost anything you learned in a math linear algebra class. 2. A linear operator that maps vectors to another space (Ax). Typical/atypical: the reverse of the above.

  12. Images as Matrices. Suppose someone hands you this matrix. What's wrong with it?

  13. Contrast – Gamma Curve. The typical way to change the contrast is to apply a nonlinear correction: pixelvalue^γ. The quantity γ controls how much contrast gets added.

  14. Contrast – Gamma Curve. Now the darkest regions (10th percentile) are much darker than the moderately dark regions (50th percentile). [Plot: the gamma curve, showing where the old 10th, 50th, and 90th percentile pixel values land after the correction.]

  15. Implementation. Python+NumPy (right way): imNew = im**0.25. Python+NumPy (slow way – why?):
imNew = np.zeros(im.shape)
for y in range(im.shape[0]):
    for x in range(im.shape[1]):
        imNew[y, x] = im[y, x]**0.25

  16. Results Phew! Much Better.

  17. Element-wise Operations. Element-wise power (beware the notation): (A^p)_ij = A_ij^p. "Hadamard product" / element-wise multiplication: (A ⊙ B)_ij = A_ij · B_ij. Element-wise division: (A/B)_ij = A_ij / B_ij.
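In NumPy these are the default operators (example values are mine); note that * is element-wise, not the matrix product:

import numpy as np
A = np.array([[1., 2.], [3., 4.]])
B = np.array([[5., 6.], [7., 8.]])
print(A**2)    # element-wise power: (A**2)[i, j] == A[i, j]**2
print(A * B)   # Hadamard / element-wise product (NOT the matrix product A @ B)
print(A / B)   # element-wise division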

  18. Sums Across Axes. Suppose we have an N×2 matrix A whose rows are [x_i, y_i]. Σ(A, 1) = [x_1+y_1; ...; x_N+y_N], an N-dimensional column vector. Σ(A, 0) = [Σ_i x_i, Σ_i y_i], a 2D row vector. Note: libraries distinguish between an N-dimensional vector and an N×1 matrix.
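A short NumPy sketch of the two axis sums and of the N-vector vs. N×1 distinction (matrix values are mine):

import numpy as np
A = np.array([[1., 2.],
              [3., 4.],
              [5., 6.]])                               # N x 2
per_row = np.sum(A, axis=1)                            # shape (N,): x_i + y_i
per_col = np.sum(A, axis=0)                            # shape (2,): sum of x's, sum of y's
per_row_2d = np.sum(A, axis=1, keepdims=True)          # shape (N, 1): an Nx1 matrix instead
print(per_row.shape, per_col.shape, per_row_2d.shape)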

  19. Vectorizing Example. Suppose I represent each image as a 128-dimensional vector. I want to compute all the pairwise distances between {x_1, ..., x_N} and {y_1, ..., y_M} so I can find, for every x_i, the nearest y_j. Identity: ||x − y||^2 = ||x||^2 + ||y||^2 − 2 x^T y, or ||x − y|| = (||x||^2 + ||y||^2 − 2 x^T y)^(1/2).

  20. Vectorizing Example. Stack the vectors as rows: X = [x_1^T; ...; x_N^T] and Y = [y_1^T; ...; y_M^T], so Y^T = [y_1 ... y_M]. Compute an N×1 vector of squared norms Σ(X^2, 1) = [||x_1||^2; ...; ||x_N||^2] (can also do M×1 for Y). Compute an N×M matrix of dot products: (XY^T)_ij = x_i^T y_j.

  21. Vectorizing Example. D = (Σ(X^2, 1) + Σ(Y^2, 1)^T − 2XY^T)^(1/2). Why? Broadcasting the N×1 column [||x_1||^2; ...; ||x_N||^2] against the 1×M row [||y_1||^2 ... ||y_M||^2] gives an N×M matrix with (Σ(X^2, 1) + Σ(Y^2, 1)^T)_ij = ||x_i||^2 + ||y_j||^2.

  22. Vectorizing Example. D = (Σ(X^2, 1) + Σ(Y^2, 1)^T − 2XY^T)^(1/2), i.e. D_ij = (||x_i||^2 + ||y_j||^2 − 2 x_i^T y_j)^(1/2). Numpy code:
XNorm = np.sum(X**2, axis=1, keepdims=True)
YNorm = np.sum(Y**2, axis=1, keepdims=True)
D = (XNorm + YNorm.T - 2*np.dot(X, Y.T))**0.5
*May have to make sure this is at least 0 (sometimes roundoff issues happen).

  23. Does it Make a Difference? Computing pairwise distances between 300 and 400 128-dimensional vectors: 1. for x in X, for y in Y, using native Python: 9s. 2. for x in X, for y in Y, using numpy to compute the distance: 0.8s. 3. vectorized: 0.0045s (~2000x faster than 1, 175x faster than 2). Expressing things in primitives that are optimized is usually faster.
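A rough sketch of how the loop vs. vectorized comparison could be run (random data, sizes as on the slide; exact timings will differ by machine):

import time
import numpy as np
rng = np.random.default_rng(0)
X = rng.standard_normal((300, 128))
Y = rng.standard_normal((400, 128))

t0 = time.time()
D_loop = np.zeros((300, 400))
for i in range(300):                                   # one numpy distance per pair
    for j in range(400):
        D_loop[i, j] = np.linalg.norm(X[i] - Y[j])
t1 = time.time()

XNorm = np.sum(X**2, axis=1, keepdims=True)
YNorm = np.sum(Y**2, axis=1, keepdims=True)
D_vec = np.maximum(XNorm + YNorm.T - 2*np.dot(X, Y.T), 0)**0.5   # clamp roundoff below 0
t2 = time.time()

print("loop: %.4fs  vectorized: %.4fs" % (t1 - t0, t2 - t1))
print(np.allclose(D_loop, D_vec))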

  24. Linear Independence. A set of vectors is linearly independent if you can't write one as a linear combination of the others. Suppose: a = [0; 0; 2], b = [0; 6; 0], c = [5; 0; 0]. Also x = [0; 0; 4] = 2a and y = [0; 3; −2] = (1/2)b − a. Is the set {a, b, c} linearly independent? Is the set {a, b, x} linearly independent? What is the max # of independent 3D vectors?
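Not on the slide, but you can check these questions numerically: the rank of the stacked matrix is the number of linearly independent columns:

import numpy as np
a = np.array([0., 0., 2.])
b = np.array([0., 6., 0.])
c = np.array([5., 0., 0.])
x = 2*a                                                     # [0, 0, 4]
print(np.linalg.matrix_rank(np.column_stack([a, b, c])))    # 3: independent
print(np.linalg.matrix_rank(np.column_stack([a, b, x])))    # 2: x is a multiple of a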

  25. Span. Span: all linear combinations of a set of vectors. Span({[0, 1]}) = ? All vertical lines through the origin = {λ[0, 1] : λ ∈ ℝ}. Is the blue vector in the red vector's span?

  26. Span. Span: all linear combinations of a set of vectors. Span of the two vectors shown in the figure = ?

  27. Span. Span: all linear combinations of a set of vectors. Span of another pair of vectors shown in the figure = ?

  28. Matrix-Vector Product. Right-multiplying A by x mixes the columns of A according to the entries of x: Ax = [c_1 ... c_n] x. The output space of f(x) = Ax is constrained to be the span of the columns of A: you can't output things you can't construct out of your columns.

  29. An Intuition. y = Ax = [c_1 c_2 c_3] [x_1; x_2; x_3]. Think of x as the knobs on a machine (e.g., fuel, brakes), y as the state of the world (e.g., where you are), and A as the machine (e.g., your car).

  30. Linear Independence. Suppose the columns of a 3×3 matrix A are not linearly independent (c_1, αc_1, c_2 for instance): y = Ax = x_1 c_1 + αx_2 c_1 + x_3 c_2 = (x_1 + αx_2) c_1 + x_3 c_2.

  31. Linear Independence Intuition. The knobs of x are redundant: y = (x_1 + αx_2) c_1 + x_3 c_2, so even if y has 3 outputs, you can only control it in two directions.

  32. Linear Independence. Recall: Ax = (x_1 + αx_2) c_1 + x_3 c_2. Then y = A [x_1 + γ; x_2 − γ/α; x_3] = (x_1 + γ + αx_2 − γ) c_1 + x_3 c_2 = Ax. So you can write y an infinite number of ways by adding γ to x_1 and subtracting γ/α from x_2. Or: given a vector y, there's not a unique vector x s.t. y = Ax. Also, not all y have a corresponding x s.t. y = Ax.

  33. Linear Independence. Ax = (x_1 + αx_2) c_1 + x_3 c_2, so y = A [γ; −γ/α; 0] = (γ − γ) c_1 + 0 c_2 = 0. What else can we cancel out? An infinite number of non-zero vectors x can map to a zero vector y. This set is called the right null-space of A.

  34. Rank. The rank of an n×n matrix A is the number of linearly independent columns (or rows) of A / the dimension of the span of the columns. Matrices with full rank (n×n, rank n) behave nicely: they can be inverted, span the full output space, and are one-to-one. Matrices with full rank are machines where every knob is useful and every output state can be made by the machine.

  35. Inverses. Given y = Ax, y is a linear combination of the columns of A with coefficients from x. If A is full-rank, we should be able to invert this mapping: given some y (output) and A, what x (inputs) produced it? x = A^(-1) y. Note: if you don't absolutely need A^(-1) itself, never ever compute it. Solving for x directly is much faster and more stable than obtaining A^(-1).
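A minimal sketch of that advice in NumPy, assuming a full-rank square A (random here): prefer np.linalg.solve over forming the inverse:

import numpy as np
rng = np.random.default_rng(0)
A = rng.standard_normal((100, 100))      # assumed full rank
y = rng.standard_normal(100)
x_solve = np.linalg.solve(A, y)          # preferred: solve the system directly
x_inv = np.linalg.inv(A) @ y             # works, but slower and less numerically stable
assert np.allclose(x_solve, x_inv)
assert np.allclose(A @ x_solve, y)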

  36. Symmetric Matrices. Symmetric: A^T = A, or A_ij = A_ji. They have lots of special properties. Any matrix of the form A = X^T X is symmetric. Quick check: A^T = (X^T X)^T = X^T (X^T)^T = X^T X = A.

  37. Special Matrices – Rotations. A rotation matrix R (e.g., 3×3 with entries r_11 ... r_33) rotates vectors and does not change vector L2 norms (||Rx||_2 = ||x||_2). Every row/column is unit norm. Every row is linearly independent. The transpose is the inverse: RR^T = R^T R = I. The determinant is 1 (otherwise it's also a coordinate flip/reflection), and the eigenvalues all have magnitude 1.
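A quick 2D check of those properties (the angle is made up):

import numpy as np
t = 0.3                                                        # some rotation angle
R = np.array([[np.cos(t), -np.sin(t)],
              [np.sin(t),  np.cos(t)]])
x = np.array([1., 2.])
assert np.isclose(np.linalg.norm(R @ x), np.linalg.norm(x))    # norm preserved
assert np.allclose(R @ R.T, np.eye(2))                         # transpose is the inverse
assert np.isclose(np.linalg.det(R), 1.0)                       # determinant 1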

  38. Eigensystems. An eigenvector v_i and eigenvalue λ_i of a matrix A satisfy Av_i = λ_i v_i (Av_i is v_i scaled by λ_i). Vectors and values are always paired, and typically you assume ||v_i||_2 = 1. The biggest eigenvalue of A gives bounds on how much f(x) = Ax stretches a vector x. Hints of what people really mean: "largest eigenvector" = the eigenvector with the largest eigenvalue; "spectral" just means eigenvectors are involved.
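A minimal NumPy sketch (matrix chosen by me) of the defining property:

import numpy as np
A = np.array([[2., 1.],
              [1., 3.]])
vals, vecs = np.linalg.eig(A)            # columns of vecs are unit eigenvectors
i = np.argmax(np.abs(vals))              # "largest eigenvector" = the largest eigenvalue's vector
v = vecs[:, i]
assert np.allclose(A @ v, vals[i] * v)   # A v = lambda v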

  39. Suppose I have points in a grid

  40. Now I apply f(x) = Ax to these points. Pointy end: Ax. Non-pointy end: x.

  41. A = [1.1 0; 0 1.1]. Red box: the unit square; blue box: after f(x) = Ax. What are the yellow lines and why?

  42. A = [0.8 0; 0 1.25]. Now I apply f(x) = Ax to these points. Pointy end: Ax. Non-pointy end: x.

  43. A = [0.8 0; 0 1.25]. Red box: the unit square; blue box: after f(x) = Ax. What are the yellow lines and why?

  44. A = [cos(t) −sin(t); sin(t) cos(t)]. Red box: the unit square; blue box: after f(x) = Ax. Can we draw any yellow lines?

  45. Eigenvectors of Symmetric Matrices. A symmetric matrix always has n mutually orthogonal eigenvectors with n (not necessarily distinct) eigenvalues. For symmetric A, the eigenvector with the largest eigenvalue maximizes the ratio x^T A x / x^T x (and the one with the smallest eigenvalue minimizes it). So for unit vectors (where x^T x = 1), that eigenvector maximizes x^T A x. A surprisingly large number of optimization problems rely on (max/min)imizing this.
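A small numerical sketch of that claim (the matrix is a made-up X^T X, so it is symmetric): no random unit vector beats the top eigenvector on x^T A x:

import numpy as np
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 3))
A = X.T @ X                                       # symmetric 3x3
vals, vecs = np.linalg.eigh(A)                    # eigh: ascending eigenvalues, orthonormal vectors
v_top = vecs[:, -1]                               # eigenvector with the largest eigenvalue
samples = rng.standard_normal((1000, 3))
samples /= np.linalg.norm(samples, axis=1, keepdims=True)        # random unit vectors
quad = np.einsum('ij,jk,ik->i', samples, A, samples)             # x^T A x for each sample
assert (quad <= v_top @ A @ v_top + 1e-9).all()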

  46. The Singular Value Decomposition. You can always write an m×n matrix A as A = UΣV^T. So far: U is a rotation (the eigenvectors of AA^T), and Σ = diag(σ_1, σ_2, σ_3) is a scale (the square roots of the eigenvalues of A^T A).

  47. The Singular Value Decomposition. You can always write an m×n matrix A as A = UΣV^T: U is a rotation (eigenvectors of AA^T), Σ is a scale (square roots of the eigenvalues of A^T A), and V^T is a rotation (eigenvectors of A^T A).

  48. Singular Value Decomposition. Every matrix is a rotation, a scaling, and a rotation. The number of non-zero singular values = the rank / the number of linearly independent columns. There is a "closest" matrix to A with a lower rank: A = U diag(σ_1, σ_2, σ_3) V^T ...

  49. Singular Value Decomposition. Every matrix is a rotation, a scaling, and a rotation. The number of non-zero singular values = the rank / the number of linearly independent columns. The "closest" matrix to A with a lower rank: Â = U diag(σ_1, σ_2, 0) V^T (zero out the smallest singular value).

  50. Singular Value Decomposition. Every matrix is a rotation, a scaling, and a rotation. The number of non-zero singular values = the rank / the number of linearly independent columns. There is a "closest" matrix to A with a lower rank. The SVD is secretly behind many of the things you do with matrices.
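A minimal NumPy sketch of the rank and low-rank-truncation claims (random A; that the truncation is the closest lower-rank matrix is the Eckart-Young theorem):

import numpy as np
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))
U, s, Vt = np.linalg.svd(A, full_matrices=False)
print(np.sum(s > 1e-10), np.linalg.matrix_rank(A))   # non-zero singular values = rank
s_trunc = s.copy()
s_trunc[-1] = 0.0                                    # zero out the smallest singular value
A_hat = U @ np.diag(s_trunc) @ Vt                    # closest rank-2 matrix to A
print(np.linalg.matrix_rank(A_hat))                  # 2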

  51. Solving Least-Squares. Start with two points (x_i, y_i). y = Av: [y_1; y_2] = [x_1 1; x_2 1] [m; b], i.e. [y_1; y_2] = [m x_1 + b; m x_2 + b]. We know how to solve this: invert A and find v (i.e., the (m, b) that fits the points).

  52. Solving Least-Squares. Start with two points (x_i, y_i). y = Av: [y_1; y_2] = [x_1 1; x_2 1] [m; b]. ||y − Av||^2 = (y_1 − (m x_1 + b))^2 + (y_2 − (m x_2 + b))^2: the sum of squared differences between the actual value of y and what the model says y should be.

  53. Solving Least-Squares. Suppose there are n > 2 points. y = Av: [y_1; ...; y_n] = [x_1 1; ...; x_n 1] [m; b]. Compute ||y − Av||^2 again: ||y − Av||^2 = Σ_{i=1}^{n} (y_i − (m x_i + b))^2.

  54. Solving Least-Squares. Given y, A, and v with y = Av overdetermined (A tall / more equations than unknowns), we want to minimize ||y − Av||^2, i.e. find arg min_v ||y − Av||^2 (the value of v that makes the expression smallest). The solution satisfies (A^T A) v* = A^T y, or v* = (A^T A)^(-1) A^T y. (Don't actually compute the inverse!)
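A minimal line-fitting sketch (the points are made up), solving the normal equations without forming an inverse; np.linalg.lstsq gives the same answer more robustly:

import numpy as np
x = np.array([0., 1., 2., 3.])
y = np.array([0.1, 1.9, 4.1, 5.9])                   # roughly y = 2x
A = np.column_stack([x, np.ones_like(x)])            # tall n x 2 matrix with rows [x_i, 1]
v = np.linalg.solve(A.T @ A, A.T @ y)                # solves (A^T A) v = A^T y
m, b = v
print(m, b)
v_lstsq, *_ = np.linalg.lstsq(A, y, rcond=None)      # library least-squares, same answer
assert np.allclose(v, v_lstsq)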

  55. When is Least-Squares Possible? Given y, A, and v, we want y = Av. If A is square and full rank: we want n outputs and have n knobs to fiddle with, and every knob is useful. If A is tall (more rows/outputs than columns/knobs): we can't get exactly the output we want (not enough knobs), so we settle for the "closest" knob setting.

  56. When is Least-Squares Possible? Given y, A, and v, we want y = Av. If A is square and full rank: we want n outputs and have n knobs to fiddle with, and every knob is useful. If A is wide (more columns/knobs than rows/outputs): any output can be expressed in infinitely many ways.

  57. Homogeneous Least-Squares. Given a set of unit vectors (aka directions) x_1, ..., x_n, I want the vector v that is as orthogonal to all the x_i as possible (for some definition of orthogonal). Stack the x_i into A and compute Av: Av = [x_1^T; ...; x_n^T] v = [x_1^T v; ...; x_n^T v] (each entry is 0 if v is orthogonal to that x_i). Compute ||Av||^2 = Σ_i (x_i^T v)^2: a sum of how orthogonal v is to each x.

  58. Homogeneous Least-Squares. A lot of the time, given a matrix A, we want to find the v that minimizes ||Av||^2, i.e. we want v* = arg min_v ||Av||_2^2. What's a trivial solution? Set v = 0, so Av = 0. Exclude this by forcing v to have unit norm.

  59. Homogeneous Least-Squares. Let's look at ||Av||_2^2. Rewrite as a dot product: ||Av||_2^2 = (Av)^T (Av). Distribute the transpose: ||Av||_2^2 = v^T A^T A v. We want the unit vector minimizing this quadratic form. Where have we seen this?
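Where we have seen it: the symmetric-eigenvector slide. The unit vector minimizing v^T (A^T A) v is the eigenvector of A^T A with the smallest eigenvalue, equivalently the last right singular vector of A. A small sketch with a made-up A:

import numpy as np
rng = np.random.default_rng(0)
A = rng.standard_normal((10, 3))             # rows are the x_i directions
w, V = np.linalg.eigh(A.T @ A)               # ascending eigenvalues of A^T A
v_best = V[:, 0]                             # unit eigenvector with the smallest eigenvalue
_, _, Vt = np.linalg.svd(A)
v_svd = Vt[-1]                               # last right singular vector of A
print(np.linalg.norm(A @ v_best)**2, w[0])   # these match
assert np.isclose(abs(v_best @ v_svd), 1.0)  # same direction up to sign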
