Spatial Data: Dimensionality Reduction (CS444 Techniques, Lecture 3) - PowerPoint PPT Presentation



SLIDE 1

Spatial Data: Dimensionality Reduction

CS444 Techniques, Lecture 3

SLIDE 2

In this subfield, we think of a data point as a vector in R^n

(what could possibly go wrong?)

SLIDE 3

“Linear” dimensionality reduction:

Reduction is achieved by multiplying every point by a single, fixed matrix.

SLIDE 4

Regular Scatterplots

  • Every data point is a vector: (v0, v1, v2, v3)
  • Every scatterplot is produced by a very simple matrix, e.g. the selection matrix

        [ 1  0  0  0 ]
        [ 0  1  0  0 ]

    which plots v0 against v1.
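A minimal sketch of this idea in numpy (the data values are made up for illustration): a scatterplot of two chosen coordinates is just the product of a 2 x 4 selection matrix with the data matrix.

```python
import numpy as np

# Hypothetical 4-D data set: each column is one data point (v0, v1, v2, v3).
X = np.array([
    [5.1, 4.9, 4.7],   # v0
    [3.5, 3.0, 3.2],   # v1
    [1.4, 1.4, 1.3],   # v2
    [0.2, 0.2, 0.2],   # v3
])

# A scatterplot of v0 vs. v1 is multiplication by a selection matrix
# with a single 1 per row: it keeps two coordinates and drops the rest.
P = np.array([
    [1, 0, 0, 0],   # x-axis = v0
    [0, 1, 0, 0],   # y-axis = v1
])

Y = P @ X   # 2 x n matrix of plot coordinates
print(Y)
```

Swapping the positions of the 1s gives every other axis-aligned scatterplot of the same data.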

SLIDE 5

What about other matrices?

SLIDE 6

Grand Tour (Asimov, 1985)

http://cscheid.github.io/lux/demos/tour/tour.html
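One rough way to generate a tour-like sequence of projections, sketched below in numpy: keep an orthonormal 2-D frame and perturb it with small random in-plane rotations. This is only an illustration of the idea; Asimov's actual grand tour interpolates along geodesics between frames for smooth, space-filling motion, which this sketch does not do.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4                                   # data dimension

# Start from a random orthonormal 2-D frame; its rows span the projection plane.
F = np.linalg.qr(rng.standard_normal((d, 2)))[0].T   # shape (2, d)

def tour_step(F, angle=0.05, rng=rng):
    """Rotate the frame slightly inside a random coordinate plane,
    then re-orthonormalize, yielding the next projection of the tour."""
    i, j = rng.choice(F.shape[1], size=2, replace=False)
    R = np.eye(F.shape[1])
    R[i, i] = R[j, j] = np.cos(angle)
    R[i, j] = -np.sin(angle)
    R[j, i] = np.sin(angle)
    F = F @ R
    return np.linalg.qr(F.T)[0].T       # keep the rows orthonormal

X = rng.standard_normal((d, 100))       # toy data, one point per column
for _ in range(10):
    F = tour_step(F)
    Y = F @ X                           # 2 x 100: one scatterplot frame of the tour
```

Each `Y` is one frame of the animation; plotting them in sequence gives a (crude) rotating view of the 4-D point cloud.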

SLIDE 7

Is there a best matrix? How do we think about that?

SLIDE 8

Linear Algebra review

  • Vectors
  • Inner Products
  • Lengths
  • Angles
  • Bases
  • Linear Transformations and Eigenvectors

SLIDE 9

Principal Component Analysis

[Figure: PCA biplot of the iris data set (PC1 vs. PC2), with arrows for Sepal.Length, Sepal.Width, Petal.Length, and Petal.Width, and points colored by Species: setosa, versicolor, virginica.]

SLIDE 10

Principal Component Analysis

  • Algorithm:
  • Given the data set as a matrix X in R^(d x n),
  • Center the matrix: X̃ = X(I − (1/n) 1 1^T) = XH
  • Compute the eigendecomposition X̃^T X̃ = U Σ U^T
  • The principal component scores are the first few columns of U Σ^(1/2)
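The steps above can be sketched in numpy (random toy data; `eigh` is used because X̃^T X̃ is symmetric):

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 4, 50
X = rng.standard_normal((d, n))          # data: one point per column

# Center: X̃ = X(I - (1/n) 1 1^T) = XH subtracts the mean point from every column.
H = np.eye(n) - np.ones((n, n)) / n
Xc = X @ H

# Eigendecomposition of the symmetric Gram matrix X̃^T X̃ = U Σ U^T.
evals, U = np.linalg.eigh(Xc.T @ Xc)
order = np.argsort(evals)[::-1]          # largest eigenvalue first
evals, U = evals[order], U[:, order]

# First two columns of U Σ^(1/2): the 2-D principal-component scores of the n points.
scores = U[:, :2] * np.sqrt(evals[:2])   # n x 2, columns = PC1, PC2
```

Equivalently, the scores are what the SVD of the centered matrix gives directly, which is how numerically careful implementations compute PCA.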

SLIDE 11

What if we don’t have coordinates, but distances? “Classical” Multidimensional Scaling

SLIDE 12

http://www.math.pku.edu.cn/teachers/yaoy/Fall2011/lecture11.pdf

SLIDE 13

Borg and Groenen, Modern Multidimensional Scaling

SLIDE 14

Borg and Groenen, Modern Multidimensional Scaling

SLIDE 15

“Classical” Multidimensional Scaling

  • Algorithm:
  • Given the squared distances D_ij = |X_i − X_j|^2, create B = −(1/2) H D H^T
  • The PCA of B is equal to the PCA of X
  • Huh?!
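A small numpy sketch of why this works (the "hidden" coordinates here are synthetic, generated only so we can check the result): double centering turns the squared-distance matrix into the Gram matrix of the centered points, and eigendecomposing that Gram matrix recovers an embedding with the same pairwise distances.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 20, 3
X = rng.standard_normal((n, d))                 # hidden coordinates, one point per row

# All we observe: squared pairwise distances D_ij = |X_i - X_j|^2.
sq = (X ** 2).sum(axis=1)
D = sq[:, None] + sq[None, :] - 2 * X @ X.T

# Double centering: B = -(1/2) H D H^T, with H the centering matrix.
# H annihilates the sq-terms, leaving B = (HX)(HX)^T, a Gram matrix.
H = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * H @ D @ H.T

# Embed: top eigenvectors of B, scaled by the square roots of the eigenvalues.
evals, V = np.linalg.eigh(B)
order = np.argsort(evals)[::-1]
evals, V = evals[order], V[:, order]
Y = V[:, :d] * np.sqrt(np.maximum(evals[:d], 0))   # n x d embedding
```

The embedding `Y` reproduces the input distances exactly (up to rotation and reflection), which is the sense in which classical MDS on B matches PCA on X.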

SLIDE 16

“Nonlinear” dimensionality reduction (i.e., the projection is not a matrix operation)

SLIDE 17

Data might have “high-order” structure

SLIDE 18

http://isomap.stanford.edu/Supplemental_Fig.pdf

SLIDE 19

We might want to minimize something else besides “difference between squared distances”

t-SNE: difference between neighbor orderings. Why not distances?

SLIDE 20

The curse of Dimensionality

  • High-dimensional space looks nothing like low-dimensional space

  • Most distances become meaningless
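A quick numpy experiment makes the second point concrete: for random points in the unit cube, the relative spread of pairwise distances (range divided by mean) collapses as the dimension grows, so "near" and "far" stop being informative.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 2000

# Distances between random point pairs in the unit cube, for growing dimension d.
for d in (2, 10, 100, 1000):
    A = rng.random((n, d))
    B = rng.random((n, d))
    dist = np.linalg.norm(A - B, axis=1)
    spread = (dist.max() - dist.min()) / dist.mean()
    print(f"d={d:5d}  mean distance={dist.mean():7.3f}  relative spread={spread:.3f}")
```

In low dimensions the ratio is large (some pairs are much closer than others); by d = 1000 nearly all pairs sit at almost the same distance from each other.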