
Machine Learning for Signal Processing: Fundamentals of Linear Algebra - 2 (PowerPoint presentation)

Machine Learning for Signal Processing: Fundamentals of Linear Algebra - 2. Class 3, 8 Sep 2016. Instructor: Bhiksha Raj (11-755/18-797). Overview: vectors and matrices, vector spaces, basic vector/matrix operations, and various matrix types.


  1. Machine Learning for Signal Processing: Fundamentals of Linear Algebra - 2. Class 3, 8 Sep 2016. Instructor: Bhiksha Raj. Course 11-755/18-797.

  2. Overview • Vectors and matrices • Vector spaces • Basic vector/matrix operations • Various matrix types • Projections • More on matrix types • Matrix determinants • Matrix inversion • Eigenanalysis • Singular value decomposition • Matrix calculus

  3. The importance of Bases. [Figure: a point (x, y, z) in 3-D space, with unit vectors u1, u2, u3 along the x, y, z axes.] • Conventional 3-D representation: each point (vector) is just a triplet of coordinates. In reality, the coordinates are weights: X = x·u1 + y·u2 + z·u3, where u1 = [1 0 0]ᵀ, u2 = [0 1 0]ᵀ, u3 = [0 0 1]ᵀ are the unit vectors in each of the three directions.

  4. The importance of Bases. With v1 = [1 0 0]ᵀ, v2 = [0 1 0]ᵀ, v3 = [0 0 1]ᵀ, any point Y = [x y z]ᵀ can be written as Y = x·v1 + y·v2 + z·v3. • The specialty of u1, u2, u3: every point in the space can be expressed as some x·u1 + y·u2 + z·u3, and all three "bases" u1, u2, u3 are required.
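A minimal numpy sketch of this point-as-weights view (the values are illustrative, not from the slides):

```python
import numpy as np

# Standard basis vectors of R^3
u1 = np.array([1.0, 0.0, 0.0])
u2 = np.array([0.0, 1.0, 0.0])
u3 = np.array([0.0, 0.0, 1.0])

# The "coordinates" (x, y, z) are really weights on the basis vectors
x, y, z = 2.0, -1.0, 3.0
Y = x * u1 + y * u2 + z * u3
print(Y)  # [ 2. -1.  3.]
```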

  5. The importance of Bases. [Figure: the point (a, b, c) reached as a·v1 + b·v2 + c·v3 along non-axis-aligned vectors v1, v2, v3.] Y = a·v1 + b·v2 + c·v3. • Is there any other set v1, v2, ..., vn which shares this property, i.e., such that any point can be expressed as a·v1 + b·v2 + c·v3 + ...? How many "v"s will we require?

  6. Basis-based representation. [Figure: data on a blue sheet, shown against bases u1, u2, u3 and v1, v2, v3.] • A "good" basis captures data structure. Here u1, u2 and u3 all take large values for data in the set, but in the (v1, v2, v3) set, coordinate values along v3 are always small for data on the blue sheet: v3 likely represents a "noise subspace" for these data.
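One way to see this numerically (a sketch, not from the slides): sample points near a 2-D sheet in 3-D, then obtain a data-adapted basis from the SVD; the spread along the third basis direction is tiny.

```python
import numpy as np

rng = np.random.default_rng(0)
# Points on a tilted 2-D "sheet" in 3-D, plus small off-sheet noise
coeffs = rng.normal(size=(1000, 2))
sheet = coeffs @ np.array([[1.0, 0.5, 0.2],
                           [0.3, 1.0, -0.1]])
data = sheet + 0.01 * rng.normal(size=sheet.shape)

# Data-adapted basis: the rows of Vt play the role of v1, v2, v3
_, s, Vt = np.linalg.svd(data - data.mean(axis=0), full_matrices=False)
print(s)  # third singular value is tiny: v3 spans a near-noise subspace
```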

  7. Basis-based representation. • The most important challenge in ML: find the best set of bases for a given data set.

  8. Matrix as a Basis transform. Y = x·u1 + y·u2 + z·u3 = a·v1 + b·v2 + c·v3, and [a b c]ᵀ = T·[x y z]ᵀ for some matrix T. • A matrix transforms a representation in terms of a standard basis u1, u2, u3 into a representation in terms of a different basis v1, v2, v3. • Finding the best bases: find the matrix that transforms the standard representation to these bases.
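A sketch of that transform in numpy: stack a (hypothetical) new basis as the columns of V; then T = V⁻¹, and solving V·c = Y yields the new coordinates c.

```python
import numpy as np

# Columns are an illustrative new basis v1, v2, v3 (any invertible choice)
V = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])

Y = np.array([2.0, -1.0, 3.0])  # coordinates in the standard basis

c = np.linalg.solve(V, Y)       # coordinates [a, b, c] in the v-basis
print(c)
print(V @ c)                    # reconstructs Y exactly
```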

  9. Going on to more mundane stuff...

  10. Orthogonal/Orthonormal vectors. A = [x y z]ᵀ, B = [u v w]ᵀ; A·B = xu + yv + zw = 0. • Two vectors are orthogonal if they are perpendicular to one another, i.e., A·B = 0. A vector that is perpendicular to a plane is orthogonal to every vector on the plane. • Two vectors are orthonormal if they are orthogonal and the length of each vector is 1.0. Orthogonal vectors can be made orthonormal by scaling their lengths to 1.0.
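A quick numerical check of these definitions (the two vectors are illustrative):

```python
import numpy as np

A = np.array([1.0, 2.0, 2.0])
B = np.array([2.0, 1.0, -2.0])

print(A @ B)  # 0.0: A and B are orthogonal

# Scale each to unit length to make the pair orthonormal
A_hat = A / np.linalg.norm(A)
B_hat = B / np.linalg.norm(B)
print(np.linalg.norm(A_hat), np.linalg.norm(B_hat))  # 1.0 1.0
```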

  11. Orthogonal matrices. Example, with all three rows at 90° to one another (the signs of some entries appear to have been lost in extraction): [0.5 0.125 0.375; 0.5 0.125 0.375; 0 0.75 0.5]. • Orthogonal matrix: AAᵀ = AᵀA = I. The matrix is square and all row vectors are orthonormal to one another; every vector is perpendicular to the hyperplane formed by all the other vectors. All column vectors are also orthonormal to one another. • Observation: in an orthogonal matrix, if the length of the row vectors is 1.0, the length of the column vectors is also 1.0. • Observation: in an orthogonal matrix, no more than one row can have all entries with the same polarity (+ve or -ve).

  12. Orthogonal Matrices. [Figure: a vector x and its image Ax, separated by angle θ.] • Orthogonal matrices retain the lengths of, and relative angles between, transformed vectors. Essentially, they are combinations of rotations, reflections and permutations; rotation matrices and permutation matrices are all orthogonal.
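For instance, with a 2-D rotation matrix (a sketch):

```python
import numpy as np

theta = 0.3
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])  # rotation => orthogonal

print(np.allclose(Q @ Q.T, np.eye(2)))  # True: Q Qᵀ = I

x = np.array([3.0, 4.0])
y = np.array([-1.0, 2.0])
print(np.linalg.norm(Q @ x), np.linalg.norm(x))  # both 5.0: length kept
print((Q @ x) @ (Q @ y), x @ y)                  # equal: angles kept
```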

  13. Orthogonal and Orthonormal Matrices. Example, a matrix whose rows are not all unit length (entries as extracted): [1 0.0675 0.1875; 0.5 0.125 0.375; 0 0.75 0.5]. • If the vectors in the matrix are not unit length, it cannot be orthogonal: AAᵀ ≠ I and AᵀA ≠ I. We can have AAᵀ = diagonal or AᵀA = diagonal, but not both; if all the vectors are the same length, we can get AAᵀ = AᵀA = diagonal, though. • A non-square matrix cannot be orthogonal: it can satisfy AAᵀ = I or AᵀA = I, but not both.
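The non-square case is easy to check: a matrix with orthonormal rows satisfies AAᵀ = I but not AᵀA = I (a sketch):

```python
import numpy as np

# 2x3 matrix with orthonormal rows
A = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])

print(np.allclose(A @ A.T, np.eye(2)))  # True
print(np.allclose(A.T @ A, np.eye(3)))  # False: non-square matrices
                                        # cannot satisfy both
```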

  14. Matrix Rank and Rank-Deficient Matrices. [Figure: P × (3-D cone) = a flattened, 2-D version of the cone.] • Some matrices eliminate one or more dimensions during transformation; these are rank-deficient matrices. The rank of the matrix is the dimensionality of the transformed version of a full-dimensional object.

  15. Matrix Rank and Rank-Deficient Matrices. [Figure: two transformations of a full-dimensional object, of rank 2 and rank 1.] • Some matrices eliminate one or more dimensions during transformation; these are rank-deficient matrices. The rank of the matrix is the dimensionality of the transformed version of a full-dimensional object.
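numpy exposes this directly (the matrix below is illustrative: its third row is the sum of the first two, so one dimension is lost):

```python
import numpy as np

P = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [1.0, 1.0, 0.0]])  # row3 = row1 + row2: flattens 3-D to 2-D

print(np.linalg.matrix_rank(P))  # 2
```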

  16. Projections are often examples of rank-deficient transforms. [Figure: spectrogram M and a basis matrix W of 4 notes.] P = W(WᵀW)⁻¹Wᵀ; projected spectrogram = P·M. • The original spectrogram can never be recovered: P is rank deficient. • P explains all vectors in the new spectrogram as a mixture of only the 4 vectors in W, so there are at most 4 linearly independent bases: the rank of P is 4.
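A sketch of that construction, with a random W standing in for the slide's 4 note bases:

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(100, 4))  # 4 basis vectors in a 100-dim space

# Projection onto the column space of W, as on the slide
P = W @ np.linalg.inv(W.T @ W) @ W.T

print(np.linalg.matrix_rank(P))  # 4: rank deficient in a 100-dim space
print(np.allclose(P @ P, P))     # True: projecting twice changes nothing
```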

  17. Non-square Matrices. X = [x1 x2 ... xN; y1 y2 ... yN] is 2-D data; P = [0.8 0.9; 0.1 0.9; 0.6 0] is the transform; PX = [x̂1 ... x̂N; ŷ1 ... ŷN; ẑ1 ... ẑN] is 3-D, but of rank 2. • Non-square matrices add or subtract axes: more rows than columns add axes, but this does not increase the dimensionality of the data.

  18. Non-square Matrices. X = [x1 x2 ... xN; y1 y2 ... yN; z1 z2 ... zN] is 3-D data of rank 3; P = [0.3 1 0.2; 0.5 1 1] is the transform; PX = [x̂1 ... x̂N; ŷ1 ... ŷN] is 2-D, of rank 2. • Non-square matrices add or subtract axes: more rows than columns add axes but do not increase the dimensionality of the data; fewer rows than columns reduce axes and may reduce the dimensionality of the data.
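Both cases in a numpy sketch, using the two transform matrices from these slides (the data points are random stand-ins):

```python
import numpy as np

rng = np.random.default_rng(2)

X2 = rng.normal(size=(2, 10))  # 2-D data, 10 points
P_up = np.array([[0.8, 0.9],
                 [0.1, 0.9],
                 [0.6, 0.0]])
Y3 = P_up @ X2
print(Y3.shape, np.linalg.matrix_rank(Y3))  # (3, 10) 2: more axes, same rank

X3 = rng.normal(size=(3, 10))  # 3-D data, rank 3
P_down = np.array([[0.3, 1.0, 0.2],
                   [0.5, 1.0, 1.0]])
Y2 = P_down @ X3
print(Y2.shape, np.linalg.matrix_rank(Y2))  # (2, 10) 2: dimensionality cut
```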

  19. The Rank of a Matrix. [The 3×2 and 2×3 transform matrices from the previous two slides.] • The matrix rank is the dimensionality of the transformation of a full-dimensioned object in the original space. • The matrix can never increase dimensions: it cannot convert a circle to a sphere, or a line to a circle. • The rank of a matrix can never be greater than the lower of its two dimensions.

  20. The Rank of a Matrix. [Figure: spectrogram M.] • Projected spectrogram = P·M: every vector in it is a combination of only 4 bases. • The rank of the matrix is the smallest number of bases required to describe the output. E.g., if note no. 4 in P could be expressed as a combination of notes 1, 2 and 3, it would provide no additional information; eliminating note no. 4 would give us the same projection, and the rank of P would be 3!

  21. Matrix rank is unchanged by transposition. Example: A = [0.9 0.5 0.8; 0.1 0.4 0.9; 0.42 0.44 0.86] and its transpose Aᵀ = [0.9 0.1 0.42; 0.5 0.4 0.44; 0.8 0.9 0.86] have the same rank (here 2: the third row of A is 0.4·row1 + 0.6·row2). • If an N-dimensional object is compressed to a K-dimensional object by a matrix, it will also be compressed to a K-dimensional object by the transpose of the matrix.
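Verified directly on the slide's matrix:

```python
import numpy as np

A = np.array([[0.90, 0.50, 0.80],
              [0.10, 0.40, 0.90],
              [0.42, 0.44, 0.86]])  # row3 = 0.4*row1 + 0.6*row2

print(np.linalg.matrix_rank(A))    # 2
print(np.linalg.matrix_rank(A.T))  # 2: transposition preserves rank
```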

  22. Matrix Determinant. [Figure: the parallelogram spanned by row vectors r1 and r2, with far vertex r1 + r2.] • The determinant is the "volume" of a matrix: the volume of the parallelepiped formed from its row vectors (also the volume of the parallelepiped formed from its column vectors). • Standard formula for the determinant: in the textbook.

  23. Matrix Determinant: Another Perspective. [Figure: a sphere of volume V1 mapped to an ellipsoid of volume V2 by A = [0.8 0 0.7; 1.0 0.8 0.8; 0.7 0.9 0.7].] • The determinant is the ratio of N-volumes: if V1 is the volume of an N-dimensional sphere O in N-dimensional space (O is the complete set of points or vertices that specify the object), and V2 is the volume of the N-dimensional ellipsoid specified by A·O, where A is a matrix that transforms the space, then |A| = V2 / V1.
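In numpy (the first matrix is the one from the slide; the second, rank-deficient matrix is an added illustration of the determinant-0 case discussed on the next slide):

```python
import numpy as np

A = np.array([[0.8, 0.0, 0.7],
              [1.0, 0.8, 0.8],
              [0.7, 0.9, 0.7]])
print(np.linalg.det(A))  # ~0.11: volumes scale by |det A| under A

# A rank-deficient matrix flattens volume to zero
B = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [1.0, 1.0, 0.0]])
print(np.linalg.det(B))  # 0.0
```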

  24. Matrix Determinants. • Matrix determinants are only defined for square matrices: they characterize volumes in a linearly transformed space of the same dimensionality as the vectors. • Rank-deficient matrices have determinant 0, since they compress full-volumed N-dimensional objects into zero-volume N-dimensional objects. E.g., a 3-D sphere into a 2-D ellipse: the ellipse has 0 volume (although it does have area). • Conversely, all matrices of determinant 0 are rank deficient, since they compress full-volumed N-dimensional objects into zero-volume objects.
