Slide 1

COL866: Foundations of Data Science

Ragesh Jaiswal, IITD


Slide 2

Best Fit Subspaces and Singular Value Decomposition (SVD)


Slide 3

Best Fit Subspaces and SVD

Best fit line

Problem: Given an n × d matrix A, where we interpret the rows of the matrix as points in R^d, find a best-fit line through the origin for the given n points. Question: How do we define a best-fit line?


Slide 4

Best Fit Subspaces and SVD

Best fit line

Problem: Given an n × d matrix A, where we interpret the rows of the matrix as points in R^d, find a best-fit line through the origin for the given n points. Question: How do we define a best-fit line?

A line that minimises the sum of squared distances of the n points to the line.


Slide 5

Best Fit Subspaces and SVD

Best fit line

Problem: Given an n × d matrix A, where we interpret the rows of the matrix as points in R^d, find a best-fit line through the origin for the given n points. Question: How do we define a best-fit line?

A line that minimises the sum of squared distances of the n points to the line. Claim: The best-fit line equivalently maximises the sum of squared projections of the n points onto the line. (By the Pythagorean theorem, each point's squared distance to the line plus its squared projection onto the line equals its squared length, which is fixed; so minimising one sum maximises the other.)


Slide 6

Best Fit Subspaces and SVD

Best fit line

Problem: Given an n × d matrix A, where we interpret the rows of the matrix as points in R^d, find a best-fit line through the origin for the given n points. The best-fit line through the origin is one that minimises the sum of squared distances of the n points to the line.

Let v denote a unit vector (d × 1 matrix) in the direction of the best-fit line. Claim: The sum of squared lengths of the projections of the points onto v is ||Av||².

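To see why the two objectives coincide, note that for each point aj, the squared distance to the line plus the squared projection onto it equals ||aj||², which does not depend on the line. A minimal NumPy sketch (the matrix and direction are arbitrary illustrative choices) checking both this identity and the claim that the total squared projection equals ||Av||²:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 3))   # rows of A are 100 points in R^3
v = rng.standard_normal(3)
v /= np.linalg.norm(v)              # unit vector along a candidate line through the origin

proj_sq = (A @ v) ** 2                                    # squared projection lengths (a_j . v)^2
dist_sq = np.sum((A - np.outer(A @ v, v)) ** 2, axis=1)   # squared distances to the line
# Pythagoras, per point: ||a_j||^2 = (a_j . v)^2 + dist(a_j, line)^2,
# so minimising total squared distance = maximising total squared projection.
print(np.allclose(np.sum(A ** 2, axis=1), proj_sq + dist_sq))   # True
# And the total squared projection is exactly ||Av||^2:
print(np.isclose(proj_sq.sum(), np.linalg.norm(A @ v) ** 2))    # True
```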

Slide 7

Best Fit Subspaces and SVD

Best fit line

Problem: Given an n × d matrix A, where we interpret the rows of the matrix as points in R^d, find a best-fit line through the origin for the given n points. The best-fit line through the origin is one that minimises the sum of squared distances of the n points to the line.

Let v denote a unit vector (d × 1 matrix) in the direction of the best-fit line. Claim: The sum of squared lengths of the projections of the points onto v is ||Av||².

So, the best-fit line is defined by the unit vector v that maximises ||Av||. This is the first singular vector of the matrix A. So, the first singular vector is defined as: v1 = arg max_{||v||=1} ||Av||.


Slide 8

Best Fit Subspaces and SVD

Best fit line

Problem: Given an n × d matrix A, where we interpret the rows of the matrix as points in R^d, find a best-fit line through the origin for the given n points. The best-fit line through the origin is one that minimises the sum of squared distances of the n points to the line.

Let v denote a unit vector (d × 1 matrix) in the direction of the best-fit line. Claim: The sum of squared lengths of the projections of the points onto v is ||Av||².

So, the best-fit line is defined by the unit vector v that maximises ||Av||. This is the first singular vector of the matrix A. So, the first singular vector is defined as: v1 = arg max_{||v||=1} ||Av||.

The value σ1 = ||Av1|| is called the first singular value of A.

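The definitions of v1 and σ1 can be checked numerically. In the sketch below (random matrix chosen purely for illustration), np.linalg.svd returns the singular values in decreasing order and the right singular vectors as the rows of Vt:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 3))

# Rows of Vt are the right singular vectors v_1, ..., v_r;
# s holds the singular values in decreasing order.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
v1, sigma1 = Vt[0], s[0]
print(np.isclose(np.linalg.norm(A @ v1), sigma1))  # sigma_1 = ||A v_1||

# v1 maximises ||Av|| over unit vectors: random directions never beat it.
vs = rng.standard_normal((1000, 3))
vs /= np.linalg.norm(vs, axis=1, keepdims=True)
print(np.all(np.linalg.norm(A @ vs.T, axis=0) <= sigma1 + 1e-12))  # True
```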

Slide 9

Best Fit Subspaces and SVD

Best fit line

Problem: Given an n × d matrix A, where we interpret the rows of the matrix as points in R^d, find a best-fit line through the origin for the given n points. The first singular vector is defined as: v1 = arg max_{||v||=1} ||Av||.

The value σ1 = ||Av1|| is called the first singular value of A. So, σ1² is equal to the sum of squared lengths of the projections.

Note that if all the data points are “close” to a line through the origin, then the first singular vector gives such a line.

Question: If the data points are close to a plane (and, in general, close to a k-dimensional subspace), how do we find such a plane?


Slide 10

Best Fit Subspaces and SVD

Best fit plane

Problem: Given an n × d matrix A, where we interpret the rows of the matrix as points in R^d, find a best-fit plane through the origin for the given n points.

Let v1 denote the first singular vector of A. Idea: Find a unit vector v perpendicular to v1 that maximises ||Av||, and output the plane through the origin defined by the vectors v1 and v. Claim: The plane defined above indeed maximises the sum of squared projections of all the points (equivalently, minimises the sum of squared distances).

The second singular vector is defined as: v2 = arg max_{||v||=1, v⊥v1} ||Av||. The value σ2 = ||Av2|| is called the second singular value of A.


Slide 11

Best Fit Subspaces and SVD

Best fit plane

Problem: Given an n × d matrix A, where we interpret the rows of the matrix as points in R^d, find a best-fit plane through the origin for the given n points.

Let v1 denote the first singular vector of A. The second singular vector is defined as: v2 = arg max_{||v||=1, v⊥v1} ||Av||. The value σ2 = ||Av2|| is called the second singular value of A.

Theorem: For any matrix A, the plane spanned by v1 and v2 is the best-fit plane.

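Here is a hedged sketch of the greedy step that defines v2: maximise ||Av|| over unit vectors orthogonal to v1 by restricting A to the orthogonal complement of v1. The random matrix is illustrative; in practice one simply reads v2 off as the second row of Vt from np.linalg.svd.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 4))

U, s, Vt = np.linalg.svd(A, full_matrices=False)
v1 = Vt[0]

# Maximise ||Av|| subject to v ⊥ v1: restrict attention to the orthogonal
# complement of v1 and take the top singular vector of A there.
P = np.eye(4) - np.outer(v1, v1)        # projector onto the complement of v1
Q = np.linalg.svd(P)[0][:, :3]          # orthonormal basis of that 3-dim complement
_, s_c, Vt_c = np.linalg.svd(A @ Q, full_matrices=False)
v2_greedy = Q @ Vt_c[0]                 # lift the maximiser back to R^4

print(np.isclose(abs(v2_greedy @ Vt[1]), 1.0))  # matches v2 up to sign
print(np.isclose(s_c[0], s[1]))                 # and attains sigma_2
```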

Slide 12

Best Fit Subspaces and SVD

Best fit plane

The first singular vector is defined as: v1 = arg max_{||v||=1} ||Av||. The second singular vector is defined as: v2 = arg max_{||v||=1, v⊥v1} ||Av||.

Theorem: For any matrix A, the plane spanned by v1 and v2 is the best-fit plane.

Proof sketch: Let W denote the best-fit plane for A.
• Claim 1: There exists an orthonormal basis (w1, w2) of W such that w2 is perpendicular to v1.
• Claim 2: ||Aw1||² ≤ ||Av1||².
• Claim 3: ||Aw2||² ≤ ||Av2||².
This gives ||Aw1||² + ||Aw2||² ≤ ||Av1||² + ||Av2||², i.e., the plane spanned by v1 and v2 captures at least as large a sum of squared projections as the best-fit plane W.


Slide 13

Best Fit Subspaces and SVD

Best fit subspace

The first singular vector and the first singular value are defined as: v1 = arg max_{||v||=1} ||Av|| and σ1 = ||Av1||. The second singular vector and the second singular value are defined as: v2 = arg max_{||v||=1, v⊥v1} ||Av|| and σ2 = ||Av2||. The third singular vector and the third singular value are defined as: v3 = arg max_{||v||=1, v⊥v1,v2} ||Av|| and σ3 = ||Av3||. ...and so on.

Let r be the smallest positive integer such that max_{||v||=1, v⊥v1,...,vr} ||Av|| = 0. Then A has r singular vectors v1, ..., vr.

Theorem: Let A be any n × d matrix with r singular vectors v1, ..., vr. For 1 ≤ k ≤ r, let Vk be the subspace spanned by v1, ..., vk. For each k, Vk is the best-fit k-dimensional subspace for A.

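The r in the definition above is exactly the rank of A: beyond v1, ..., vr, every direction orthogonal to them is annihilated by A. A small check, with a matrix constructed to have rank 2 (the sizes are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
# Build a 100 x 5 matrix of rank 2: every row is a combination of two directions.
A = rng.standard_normal((100, 2)) @ rng.standard_normal((2, 5))

s = np.linalg.svd(A, compute_uv=False)
r = np.sum(s > 1e-10)               # number of (numerically) nonzero singular values
print(r, np.linalg.matrix_rank(A))  # both 2: r singular vectors = rank of A
```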

Slide 14

Best Fit Subspaces and SVD

Best fit subspace

The first singular vector and the first singular value are defined as: v1 = arg max_{||v||=1} ||Av|| and σ1 = ||Av1||. The second singular vector and the second singular value are defined as: v2 = arg max_{||v||=1, v⊥v1} ||Av|| and σ2 = ||Av2||. The third singular vector and the third singular value are defined as: v3 = arg max_{||v||=1, v⊥v1,v2} ||Av|| and σ3 = ||Av3||. ...and so on.

Let r be the smallest positive integer such that max_{||v||=1, v⊥v1,...,vr} ||Av|| = 0. Then A has r singular vectors v1, ..., vr. The vectors v1, ..., vr are more specifically called the right singular vectors.


Slide 15

Best Fit Subspaces and SVD

Best fit subspace

The first singular vector and the first singular value are defined as: v1 = arg max_{||v||=1} ||Av|| and σ1 = ||Av1||. The second singular vector and the second singular value are defined as: v2 = arg max_{||v||=1, v⊥v1} ||Av|| and σ2 = ||Av2||. The third singular vector and the third singular value are defined as: v3 = arg max_{||v||=1, v⊥v1,v2} ||Av|| and σ3 = ||Av3||. ...and so on.

Let r be the smallest positive integer such that max_{||v||=1, v⊥v1,...,vr} ||Av|| = 0. Then A has r singular vectors v1, ..., vr. The vectors v1, ..., vr are more specifically called the right singular vectors.

For any singular vector vi, σi = ||Avi|| may be interpreted as the component of the matrix A along vi. Given this interpretation, “the components should add up to give the whole content of A”.


Slide 16

Best Fit Subspaces and SVD

Frobenius Norm

Let r be the smallest positive integer such that max_{||v||=1, v⊥v1,...,vr} ||Av|| = 0. Then A has r singular vectors v1, ..., vr; these are more specifically called the right singular vectors. For any singular vector vi, σi = ||Avi|| may be interpreted as the component of the matrix A along vi. Given this interpretation, “the components should add up to give the whole content of A”.

For any row aj of the matrix A, we can write ||aj||² = ∑_{i=1}^r (aj · vi)² (since each row of A lies in the span of v1, ..., vr). This further gives:

∑_{j=1}^n ||aj||² = ∑_{j=1}^n ∑_{i=1}^r (aj · vi)² = ∑_{i=1}^r ||Avi||² = ∑_{i=1}^r σi².


Slide 17

Best Fit Subspaces and SVD

Frobenius Norm

Let r be the smallest positive integer such that max_{||v||=1, v⊥v1,...,vr} ||Av|| = 0. Then A has r singular vectors v1, ..., vr; these are more specifically called the right singular vectors. For any singular vector vi, σi = ||Avi|| may be interpreted as the component of the matrix A along vi. Given this interpretation, “the components should add up to give the whole content of A”.

For any row aj of the matrix A, we can write ||aj||² = ∑_{i=1}^r (aj · vi)². This further gives:

∑_{j=1}^n ||aj||² = ∑_{j=1}^n ∑_{i=1}^r (aj · vi)² = ∑_{i=1}^r ||Avi||² = ∑_{i=1}^r σi².

The LHS of the above equation may be interpreted as the “content of the matrix”; it defines the (squared) Frobenius norm of the matrix A.

Definition (Frobenius Norm): The Frobenius norm of a given n × d matrix A, denoted by ||A||F, is defined as: ||A||F = (∑_{i=1}^n ∑_{j=1}^d A_{i,j}²)^{1/2}.

Slide 18

Best Fit Subspaces and SVD

Frobenius Norm

For any row aj of the matrix A, we can write ||aj||² = ∑_{i=1}^r (aj · vi)². This further gives:

∑_{j=1}^n ||aj||² = ∑_{j=1}^n ∑_{i=1}^r (aj · vi)² = ∑_{i=1}^r ||Avi||² = ∑_{i=1}^r σi².

The LHS of the above equation may be interpreted as the “content of the matrix”; it defines the (squared) Frobenius norm of the matrix A.

Definition (Frobenius Norm): The Frobenius norm of a given n × d matrix A, denoted by ||A||F, is defined as: ||A||F = (∑_{i=1}^n ∑_{j=1}^d A_{i,j}²)^{1/2}.

Theorem: For any matrix A, the sum of squares of the singular values equals the square of the Frobenius norm of the matrix.

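This theorem is easy to verify numerically; a minimal sketch with an arbitrary random matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 7))

s = np.linalg.svd(A, compute_uv=False)
# Sum of squared singular values equals the squared Frobenius norm.
print(np.isclose(np.sum(s ** 2), np.linalg.norm(A, 'fro') ** 2))  # True
```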

Slide 19

Singular Value Decomposition (SVD)

Left singular vectors

Let v1, ..., vr be the right singular vectors and σ1, ..., σr be the corresponding singular values of matrix A. The left singular vectors are defined as ui = (1/σi) Avi.

σiui may be interpreted as a vector whose components are the projections of the rows of A onto vi.


Slide 20

Singular Value Decomposition (SVD)

Left singular vectors

Let v1, ..., vr be the right singular vectors and σ1, ..., σr be the corresponding singular values of matrix A. The left singular vectors are defined as ui = (1/σi) Avi.

σiui may be interpreted as a vector whose components are the projections of the rows of A onto vi. σi ui vi^T is a rank-one matrix whose rows can be interpreted as the components of the rows of A along vi. Given this, the following decomposition of A into rank-one matrices should make sense (we will prove this): A = ∑_{i=1}^r σi ui vi^T.

Theorem: Let A be any n × d matrix with right singular vectors v1, ..., vr, left singular vectors u1, ..., ur, and corresponding singular values σ1, ..., σr. Then A = ∑_{i=1}^r σi ui vi^T.

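A minimal NumPy sketch (random matrix for illustration) that forms the ui directly from the definition ui = (1/σi) Avi, checks that they are orthonormal (a fact proved later in these slides), and confirms the rank-one decomposition the theorem asserts:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 4))

_, s, Vt = np.linalg.svd(A, full_matrices=False)
r = np.sum(s > 1e-10)

# Left singular vectors from the definition u_i = (1/sigma_i) A v_i.
U = np.column_stack([(A @ Vt[i]) / s[i] for i in range(r)])
print(np.allclose(U.T @ U, np.eye(r)))  # the u_i are orthonormal

# A equals the sum of the rank-one pieces sigma_i u_i v_i^T.
A_rebuilt = sum(s[i] * np.outer(U[:, i], Vt[i]) for i in range(r))
print(np.allclose(A, A_rebuilt))  # True
```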

Slide 21

Singular Value Decomposition (SVD)

Theorem: Let A be any n × d matrix with right singular vectors v1, ..., vr, left singular vectors u1, ..., ur, and corresponding singular values σ1, ..., σr. Then A = ∑_{i=1}^r σi ui vi^T.

Proof sketch:
• Lemma: Matrices A and B are identical iff Av = Bv for all vectors v.
• Let B = ∑_{i=1}^r σi ui vi^T.
• For any j, Avj = σj uj from the definition of uj, and Bvj = (∑_{i=1}^r σi ui vi^T) vj = σj uj from the orthonormality of v1, ..., vr.
• Fact: Any vector v can be written as a linear combination of the right singular vectors v1, ..., vr and a vector perpendicular to v1, ..., vr. Both A and B map the perpendicular part to 0, so Av = Bv for every v.


Slide 22

Singular Value Decomposition (SVD)

Theorem: Let A be any n × d matrix with right singular vectors v1, ..., vr, left singular vectors u1, ..., ur, and corresponding singular values σ1, ..., σr. Then A = ∑_{i=1}^r σi ui vi^T.

The decomposition A = ∑_{i=1}^r σi ui vi^T is called the Singular Value Decomposition (or SVD in short). In matrix notation, we can write A = U D V^T where:
• U is an n × r matrix whose ith column is ui.
• D is an r × r diagonal matrix whose ith diagonal element is σi.
• V is a d × r matrix whose ith column is vi.

Question: How do we compute the SVD? Question: What are the applications of SVD?

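In NumPy, the matrix form A = U D V^T is exactly what np.linalg.svd computes; with full_matrices=False it returns U, the diagonal of D as a vector s, and V^T as Vt (there r = min(n, d), which coincides with the rank for a generic matrix):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))

# "Thin" SVD: U is n x r, s holds the diagonal of D, Vt is V^T (r x d).
U, s, Vt = np.linalg.svd(A, full_matrices=False)
print(np.allclose(A, U @ np.diag(s) @ Vt))  # A = U D V^T
```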

Slide 23

Singular Value Decomposition (SVD)

Best rank-k approximation

Let A = ∑_{i=1}^r σi ui vi^T be the SVD of an n × d matrix A. For k ∈ {1, ..., r}, let Ak = ∑_{i=1}^k σi ui vi^T (i.e., the sum truncated to the first k terms).
• Claim 1: Ak has rank k.
• Claim 2: The rows of Ak are the projections of the rows of A onto the subspace Vk spanned by the first k singular vectors of A.
We will prove that Ak is the best rank-k approximation to A, where the error is measured in terms of the Frobenius norm.

Theorem: For any matrix B with rank at most k: ||A − Ak||F ≤ ||A − B||F.

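The sketch below (arbitrary random matrix) builds Ak by truncating the SVD, verifies Claim 2, and compares Ak's Frobenius error against one arbitrary rank-k competitor B; the theorem says no such B can do better:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((60, 8))
k = 3

U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k]          # truncated SVD

# Claim 2: rows of A_k are the projections of rows of A onto V_k.
P = Vt[:k].T @ Vt[:k]                             # projector onto span(v_1, ..., v_k)
print(np.allclose(A_k, A @ P))                    # True

# Eckart-Young (Frobenius): A_k beats an arbitrary rank-k matrix B.
B = rng.standard_normal((60, k)) @ rng.standard_normal((k, 8))
print(np.linalg.norm(A - A_k, 'fro') <= np.linalg.norm(A - B, 'fro'))  # True
```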

Slide 24

Singular Value Decomposition (SVD)

Best rank-k approximation

Theorem: For any matrix B with rank at most k: ||A − Ak||F ≤ ||A − B||F. The above theorem tells us that Ak is a good approximation of A (w.r.t. the Frobenius norm). The approximation Ak is also good for computing the product with any vector x with ||x|| ≤ 1.
• Computing Ax would cost O(nd) multiplications. However, computing Ak x = ∑_{i=1}^k σi ui (vi^T x) only costs O(kd + nk) multiplications (sketched below).

Question: Is Ak the best rank-k approximation to A w.r.t. the computation of Ax for an arbitrary x with ||x|| ≤ 1? We want a rank-k matrix B such that max_{||x||≤1} ||(A − B)x|| is minimised.

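A minimal sketch of the cheap matrix–vector product: store only the truncated factors and evaluate ∑_{i=1}^k σi ui (vi^T x) right to left, so the work is O(kd) + O(k) + O(nk) instead of O(nd):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 500, 300, 10
A = rng.standard_normal((n, d))
x = rng.standard_normal(d)

U, s, Vt = np.linalg.svd(A, full_matrices=False)
Uk, sk, Vtk = U[:, :k], s[:k], Vt[:k]

# A_k x computed as sum_i sigma_i u_i (v_i . x):
# O(kd) for Vtk @ x, O(k) for the scaling, O(nk) for Uk @ (...).
y_fast = Uk @ (sk * (Vtk @ x))
y_full = (Uk @ np.diag(sk) @ Vtk) @ x   # same value, O(nd) once A_k is formed
print(np.allclose(y_fast, y_full))      # True
```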

Slide 25

Singular Value Decomposition (SVD)

Best rank-k approximation

Theorem: For any matrix B with rank at most k: ||A − Ak||F ≤ ||A − B||F. The above theorem tells us that Ak is a good approximation of A (w.r.t. the Frobenius norm). The approximation Ak is also good for computing the product with any vector x with ||x|| ≤ 1.
• Computing Ax would cost O(nd) multiplications. However, computing Ak x = ∑_{i=1}^k σi ui (vi^T x) only costs O(kd + nk) multiplications.

Question: Is Ak the best rank-k approximation to A w.r.t. the computation of Ax for an arbitrary x with ||x|| ≤ 1? We want a rank-k matrix B such that max_{||x||≤1} ||(A − B)x|| is minimised.

Definition (Spectral norm): The 2-norm or spectral norm of a matrix A, denoted by ||A||2, is defined as: ||A||2 = max_{||x||≤1} ||Ax||.


Slide 26

Singular Value Decomposition (SVD)

Best rank-k approximation

The approximation Ak is also good for computing the product with any vector x with ||x|| ≤ 1.
• Computing Ax would cost O(nd) multiplications. However, computing Ak x = ∑_{i=1}^k σi ui (vi^T x) only costs O(kd + nk) multiplications.

Question: Is Ak the best rank-k approximation to A w.r.t. the computation of Ax for an arbitrary x with ||x|| ≤ 1? We want a rank-k matrix B such that max_{||x||≤1} ||(A − B)x|| is minimised.

Definition (Spectral norm): The 2-norm or spectral norm of a matrix A, denoted by ||A||2, is defined as: ||A||2 = max_{||x||≤1} ||Ax||. Claim: ||A||2 = σ1.

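The claim ||A||2 = σ1 is immediate from the definition of v1 (the maximiser of ||Ax|| over the unit ball is v1). A quick numerical check, where np.linalg.norm(A, 2) computes the spectral norm:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((30, 5))

s = np.linalg.svd(A, compute_uv=False)
# ord=2 gives the spectral norm max_{||x|| <= 1} ||Ax||.
print(np.isclose(np.linalg.norm(A, 2), s[0]))  # ||A||_2 = sigma_1
```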

Slide 27

Singular Value Decomposition (SVD)

Best rank-k approximation

The approximation Ak is also good for computing the product with any vector x with ||x|| ≤ 1.
• Computing Ax would cost O(nd) multiplications. However, computing Ak x = ∑_{i=1}^k σi ui (vi^T x) only costs O(kd + nk) multiplications.

Question: Is Ak the best rank-k approximation to A w.r.t. the computation of Ax for an arbitrary x with ||x|| ≤ 1? We want a rank-k matrix B such that max_{||x||≤1} ||(A − B)x|| is minimised.

Definition (Spectral norm): The 2-norm or spectral norm of a matrix A, denoted by ||A||2, is defined as: ||A||2 = max_{||x||≤1} ||Ax||. Claim: ||A||2 = σ1. The question can now be rephrased as: Is Ak the best rank-k approximation to A w.r.t. the spectral norm?


Slide 28

Singular Value Decomposition (SVD)

Best rank-k approximation

Definition (Spectral norm): The 2-norm or spectral norm of a matrix A, denoted by ||A||2, is defined as: ||A||2 = max_{||x||≤1} ||Ax||. Question: Is Ak the best rank-k approximation to A w.r.t. the spectral norm?

Theorem: Let A be any n × d matrix. For any matrix B of rank at most k: ||A − Ak||2 ≤ ||A − B||2.

First, we show that the left singular vectors u1, ..., ur are pairwise orthogonal.


Slide 29

Singular Value Decomposition (SVD)

Best rank-k approximation

Theorem The left singular vectors u1, ..., ur are pairwise orthogonal.


Slide 30

Singular Value Decomposition (SVD)

Best rank-k approximation

Theorem: Let A be any n × d matrix. For any matrix B of rank at most k: ||A − Ak||2 ≤ ||A − B||2.

First, we show that the left singular vectors u1, ..., ur are pairwise orthogonal.

Theorem: The left singular vectors u1, ..., ur are pairwise orthogonal.

We will also need the following theorem.

Theorem: ||A − Ak||2² = (σk+1)².

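The theorem ||A − Ak||2² = (σk+1)² can also be checked numerically (random matrix for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((40, 9))
k = 4

U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k]
# ||A - A_k||_2 = sigma_{k+1} (index k in 0-based numbering).
print(np.isclose(np.linalg.norm(A - A_k, 2), s[k]))  # True
```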

Slide 31

Singular Value Decomposition (SVD)

Best rank-k approximation

Theorem: Let A be any n × d matrix. For any matrix B of rank at most k: ||A − Ak||2 ≤ ||A − B||2.

Theorem: The left singular vectors u1, ..., ur are pairwise orthogonal.

Theorem: ||A − Ak||2² = (σk+1)².

Finally, combining these, we show the following: Theorem: Let A be an n × d matrix. For any matrix B of rank at most k: ||A − Ak||2 ≤ ||A − B||2.


Slide 32

Singular Value Decomposition (SVD)

Exercise: Show that the ui’s are the right singular vectors of the matrix A^T.

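As a numerical nudge toward the exercise (not a proof), the right singular vectors of A^T computed by NumPy match the ui of A up to sign:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 6))

U, s, Vt = np.linalg.svd(A, full_matrices=False)
Ut2 = np.linalg.svd(A.T, full_matrices=False)[2]   # right singular vectors of A^T

# Each right singular vector of A^T equals the corresponding u_i up to sign.
print(np.allclose(np.abs(Ut2 @ U), np.eye(6)))  # True
```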

Slide 33

End
