COMPSCI 514: Algorithms for Data Science


  1. compsci 514: algorithms for data science. Cameron Musco, University of Massachusetts Amherst. Fall 2019. Lecture 13.

  2-3. logistics
  • Will release Problem Set 3 next week, due ∼11/11.
  • Pass/Fail deadline is 10/29 for undergraduates and 10/31 for graduates. We will have your Problem Set 2 and midterm grades back before then.
  • MAP Feedback:
    • Going to adjust a bit how I take questions in class.
    • Will try to more clearly identify important information (what will appear on exams or problem sets) vs. motivating examples.
    • Will try to use the iPad more to write out proofs in class.

  4-7. summary
  Last Few Classes: Low-Rank Approximation and PCA
  • Discussed how to compress a dataset that lies close to a k-dimensional subspace.
  • Optimal compression by projecting onto the top k eigenvectors of the covariance matrix X^T X (PCA).
  • Saw how to calculate the error of the approximation – interpret the spectrum of X^T X.
  This Class: Low-rank approximation and connection to singular value decomposition.
  • Show how PCA can be interpreted in terms of the singular value decomposition (SVD) of X.
  • Applications to word embeddings, graph embeddings, document classification, recommendation systems.
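  A minimal numpy sketch of the compression described above (my illustration, not from the slides; the sizes and data are made up): project X onto the top-k eigenvectors of X^T X and measure the squared Frobenius-norm error.

  import numpy as np

  # Illustrative sizes and data, not from the lecture.
  n, d, k = 100, 20, 5
  X = np.random.randn(n, d)

  # Top-k eigenvectors of X^T X (the lecture's covariance matrix; no centering here).
  eigvals, eigvecs = np.linalg.eigh(X.T @ X)   # eigenvalues returned in ascending order
  V = eigvecs[:, -k:]                          # top-k eigenvectors, shape d x k

  X_approx = X @ V @ V.T                       # project each row onto the subspace
  error = np.linalg.norm(X - X_approx, 'fro')**2   # approximation error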

  8-12. review
  Set Up: Assume that data points x_1, . . . , x_n lie close to any k-dimensional subspace V of R^d. Let X ∈ R^{n×d} be the data matrix. Let v_1, . . . , v_k be an orthonormal basis for V and V ∈ R^{d×k} be the matrix with these vectors as its columns.
  • VV^T ∈ R^{d×d} is the projection matrix onto V.
  • X ≈ X(VV^T). Gives the closest approximation to X with rows in V.
  Notation: x_1, . . . , x_n ∈ R^d: data points; X ∈ R^{n×d}: data matrix; v_1, . . . , v_k ∈ R^d: orthogonal basis for subspace V; V ∈ R^{d×k}: matrix with columns v_1, . . . , v_k.
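  A short check of this setup (a sketch I added, assuming an arbitrary orthonormal V): VV^T is symmetric and idempotent, i.e. a projection, and applying it to a vector keeps only the component lying in the subspace.

  import numpy as np

  d, k = 10, 3
  # Arbitrary orthonormal basis V (columns), built with QR for illustration.
  V, _ = np.linalg.qr(np.random.randn(d, k))

  P = V @ V.T                              # projection matrix onto span of V's columns
  assert np.allclose(P, P.T)               # symmetric
  assert np.allclose(P @ P, P)             # idempotent: projecting twice = projecting once

  x = np.random.randn(d)
  x_proj = P @ x                           # component of x in the subspace
  assert np.allclose(P @ x_proj, x_proj)   # already in the subspace, unchanged by P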

  13-15. review of last time
  Low-Rank Approximation: Approximate X ≈ XVV^T.
  • XVV^T is a rank-k matrix – all its rows fall in V.
  • X's rows are approximately spanned by the columns of V.
  • X's columns are approximately spanned by the columns of XV.
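  A small sketch of the two views above (my own illustration, with made-up sizes): XVV^T factors as (XV)(V^T), so it has rank at most k, its rows lie in the span of V's columns, and its columns lie in the span of XV's columns.

  import numpy as np

  n, d, k = 50, 10, 3
  X = np.random.randn(n, d)
  V, _ = np.linalg.qr(np.random.randn(d, k))   # illustrative orthonormal d x k basis

  C = X @ V            # n x k compressed representation of the rows
  X_approx = C @ V.T   # the rank-k approximation XVV^T

  print(np.linalg.matrix_rank(X_approx))       # at most k (here 3)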

  16. dual view of low-rank approximation

  17-19. optimal low-rank approximation
  Given x_1, . . . , x_n (the rows of X) we want to find an orthonormal V ∈ R^{d×k} (spanning a k-dimensional subspace V) minimizing the approximation error:
  arg min_{orthonormal V ∈ R^{d×k}} ∑_{i=1}^n ∥x_i − VV^T x_i∥_2^2 = arg min_{orthonormal V ∈ R^{d×k}} ∥X − XVV^T∥_F^2 = arg max_{orthonormal V ∈ R^{d×k}} ∥XVV^T∥_F^2.
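  A quick numerical check (my addition, not from the slides) of why the minimization and maximization are equivalent: for any orthonormal V, XVV^T and X − XVV^T are orthogonal, so ∥X∥_F^2 = ∥XVV^T∥_F^2 + ∥X − XVV^T∥_F^2; since ∥X∥_F^2 is fixed, minimizing the error term is the same as maximizing the projection term.

  import numpy as np

  n, d, k = 50, 10, 3
  X = np.random.randn(n, d)
  V, _ = np.linalg.qr(np.random.randn(d, k))       # any orthonormal V (illustrative)

  proj_norm = np.linalg.norm(X @ V @ V.T, 'fro')**2
  err_norm  = np.linalg.norm(X - X @ V @ V.T, 'fro')**2
  total     = np.linalg.norm(X, 'fro')**2
  assert np.isclose(proj_norm + err_norm, total)   # Pythagorean identity in Frobenius norm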
