

  1. Spatial Data: Dimensionality Reduction CS444 Techniques, Lecture 3

  2. In this subfield, we think of a data point as a vector in $\mathbb{R}^n$ (what could possibly go wrong?)

  3. “Linear” dimensionality reduction: the reduction is achieved by a single matrix, the same one for every point.

  4. Regular Scatterplots • Every data point is a vector: $v = (v_0, v_1, v_2, v_3)^T$ • Every scatterplot is produced by a very simple matrix, e.g. $\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}$ or $\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}$
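
  As a concrete illustration (not from the original deck), here is a minimal NumPy sketch of the idea on this slide: a 2-D scatterplot of 4-D data is just a 2×4 axis-selection matrix applied to every point. The data and variable names are illustrative.

  ```python
  import numpy as np
  import matplotlib.pyplot as plt

  # Illustrative 4-D data: one column per point, one row per coordinate.
  rng = np.random.default_rng(0)
  X = rng.normal(size=(4, 200))

  # The axis-aligned projection from the slide: keep v0 and v1.
  P = np.array([[1, 0, 0, 0],
                [0, 1, 0, 0]])

  Y = P @ X            # 2 x 200: the scatterplot coordinates
  plt.scatter(Y[0], Y[1], s=5)
  plt.xlabel("v0"); plt.ylabel("v1")
  plt.show()
  ```

  Swapping P for the second matrix on the slide plots v0 against v2; every pairwise scatterplot is one such matrix.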

  5. What about other matrices?

  6. Grand Tour (Asimov, 1985) http://cscheid.github.io/lux/demos/tour/tour.html
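
  The linked demo is interactive; as a rough sketch of the underlying idea (and not Asimov's exact torus method), the following hypothetical generator drifts a 2-D projection frame between randomly chosen planes, re-orthonormalizing at each step. All names and parameters here are illustrative.

  ```python
  import numpy as np

  def random_frame(d, rng):
      """Random d x 2 orthonormal frame (columns span a 2-plane)."""
      A = rng.normal(size=(d, 2))
      Q, _ = np.linalg.qr(A)
      return Q

  def grand_tour_frames(d, n_steps=30, n_targets=10, seed=0):
      """Yield a sequence of 2-D projection frames that drift between
      randomly chosen planes. A rough sketch, not Asimov's algorithm."""
      rng = np.random.default_rng(seed)
      F = random_frame(d, rng)
      for _ in range(n_targets):
          G = random_frame(d, rng)
          for t in np.linspace(0, 1, n_steps):
              M = (1 - t) * F + t * G        # linear blend of frames
              Q, _ = np.linalg.qr(M)         # snap back to orthonormal
              yield Q
          F = G

  # Usage: project data X (d x n) at each step and redraw: Y = Q.T @ X
  ```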

  7. Is there a best matrix? How do we think about that?

  8. Linear Algebra review • Vectors • Inner Products • Lengths • Angles • Bases • Linear Transformations and Eigenvectors
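
  Each reviewed concept has a direct NumPy counterpart; a quick illustrative sketch (values chosen arbitrarily):

  ```python
  import numpy as np

  u = np.array([3.0, 4.0])
  v = np.array([1.0, 0.0])

  ip = u @ v                                  # inner product: 3.0
  length = np.linalg.norm(u)                  # length sqrt(u.u): 5.0
  angle = np.arccos(ip / (np.linalg.norm(u) * np.linalg.norm(v)))

  A = np.array([[2.0, 0.0],
                [0.0, 0.5]])                  # a linear transformation
  evals, evecs = np.linalg.eig(A)             # its eigenvalues/eigenvectors
  print(ip, length, np.degrees(angle), evals)
  ```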

  9. Principal Component Analysis [Figure: PCA biplot of the iris data, PC1 vs. PC2, points colored by Species (setosa, versicolor, virginica), with loading arrows for Sepal.Length, Sepal.Width, Petal.Length, Petal.Width]

  10. Principal Component Analysis • Algorithm: • Given a data set as a matrix $X \in \mathbb{R}^{d \times n}$ (one column per point) • Center the matrix: $\tilde{X} = X(I - \frac{1}{n}\mathbf{1}\mathbf{1}^T) = XH$ • Compute the eigendecomposition $\tilde{X}^T \tilde{X} = U \Sigma U^T$ • The principal components are the first few columns of $U \Sigma^{1/2}$
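
  A minimal NumPy sketch of the algorithm above, assuming the slide's convention that X is d×n with one column per point; `pca_embed` is a hypothetical helper name, not code from the lecture.

  ```python
  import numpy as np

  def pca_embed(X, k=2):
      """PCA via the Gram matrix: X is d x n (one column per point).
      Returns an n x k matrix of principal coordinates."""
      d, n = X.shape
      H = np.eye(n) - np.ones((n, n)) / n      # centering matrix
      Xc = X @ H                               # center the columns
      # Eigendecomposition of the n x n Gram matrix Xc^T Xc.
      evals, U = np.linalg.eigh(Xc.T @ Xc)
      order = np.argsort(evals)[::-1]          # largest eigenvalues first
      evals, U = evals[order], U[:, order]
      # Principal coordinates: first k columns of U * Sigma^(1/2).
      return U[:, :k] * np.sqrt(np.maximum(evals[:k], 0))
  ```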

  11. What if we don’t have coordinates, but distances? “Classical” Multidimensional Scaling

  12. http://www.math.pku.edu.cn/teachers/yaoy/Fall2011/lecture11.pdf

  13. Borg and Groenen, Modern Multidimensional Scaling

  14. Borg and Groenen, Modern Multidimensional Scaling

  15. “Classical” Multidimensional Scaling • Algorithm: • Given squared pairwise distances $D_{ij} = \|X_i - X_j\|_2^2$, create $B = -\frac{1}{2} H D H^T$ • PCA of B is equal to the PCA of X • Huh?! (Double-centering the squared-distance matrix recovers the centered Gram matrix $\tilde{X}^T \tilde{X}$, which is exactly what PCA decomposes.)
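
  A matching NumPy sketch of classical MDS, assuming D already holds *squared* pairwise distances; `classical_mds` is a hypothetical helper name. Note that the only difference from the PCA sketch above is where the Gram matrix comes from.

  ```python
  import numpy as np

  def classical_mds(D, k=2):
      """Classical MDS: D is an n x n matrix of squared pairwise
      distances. Returns n x k embedding coordinates."""
      n = D.shape[0]
      H = np.eye(n) - np.ones((n, n)) / n      # centering matrix
      B = -0.5 * H @ D @ H.T                   # double-centered Gram matrix
      evals, U = np.linalg.eigh(B)
      order = np.argsort(evals)[::-1]          # largest eigenvalues first
      evals, U = evals[order], U[:, order]
      return U[:, :k] * np.sqrt(np.maximum(evals[:k], 0))
  ```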

  16. “Nonlinear” dimensionality reduction (i.e., the projection is not a matrix operation)

  17. Data might have “high-order” structure

  18. http://isomap.stanford.edu/Supplemental_Fig.pdf

  19. We might want to minimize something else besides “difference between squared distances.” t-SNE: difference between neighbor ordering. Why not distances?
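
  For illustration, a minimal usage sketch with scikit-learn's TSNE (assuming scikit-learn is installed; the data here is random placeholder data, and scikit-learn expects one row per point, unlike the d×n convention above):

  ```python
  import numpy as np
  from sklearn.manifold import TSNE

  # Placeholder data: replace with your own n x d matrix.
  rng = np.random.default_rng(0)
  X = rng.normal(size=(500, 10))

  # t-SNE optimizes a neighborhood-based objective (a KL divergence
  # between neighbor distributions), not squared-distance error.
  Y = TSNE(n_components=2, perplexity=30).fit_transform(X)
  print(Y.shape)   # (500, 2)
  ```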

  20. The Curse of Dimensionality • High-dimensional space looks nothing like low-dimensional space • Most distances become meaningless
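
  A small experiment that illustrates the second bullet: for uniform data in the unit cube, the ratio of nearest to farthest distance from a query point approaches 1 as the dimension grows, so “nearest neighbor” loses contrast. Dimensions and sample sizes below are arbitrary.

  ```python
  import numpy as np

  rng = np.random.default_rng(0)
  for d in (2, 10, 100, 1000):
      X = rng.uniform(size=(500, d))           # points in the unit cube
      q = X[0]                                 # distances from one point
      dist = np.linalg.norm(X[1:] - q, axis=1)
      ratio = dist.min() / dist.max()
      print(f"d={d:5d}  nearest/farthest distance ratio = {ratio:.3f}")
  ```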
