Dimensionality Reduction for Visualization, Lecture 13, April 8, 2020


slide-1
SLIDE 1

April 8, 2020

CS53000-Spring 2020

Introduction to Scientific Visualization

Dimensionality Reduction for Visualization

Lecture 13

slide-2
SLIDE 2

CS530 / Spring 2020 : Introduction to Scientific Visualization. Dimensionality Reduction for Visualization 04/02/2020

Outline

  • High-dimensional data
  • Dimensionality reduction
  • Manifold learning
  • Topological data analysis

slide-3
SLIDE 3

Why High-Dimensional?

  • Data samples with many attributes
  • high-throughput imaging, hyperspectral imaging, medical records, …
  • PDE modeling of physical systems
  • example: Navier-Stokes equations

$$\rho \frac{D\mathbf{u}}{Dt} = \rho \left( \frac{\partial \mathbf{u}}{\partial t} + \mathbf{u} \cdot \nabla \mathbf{u} \right) = -\nabla \bar{p} + \mu \nabla^2 \mathbf{u} + \tfrac{1}{3} \mu \nabla (\nabla \cdot \mathbf{u}) + \rho \mathbf{g}$$

slide-4
SLIDE 4

Why High-Dimensional?

  • Amalgamate individual properties into high-dimensional feature vectors
  • e.g., turn images into vectors
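
As a minimal sketch of the "images into vectors" step, assuming NumPy and a hypothetical stack of small grayscale images (the array sizes are made up for illustration): each pixel becomes one coordinate of the feature vector.

```python
import numpy as np

# Hypothetical data: 5 grayscale images of 28 x 28 pixels (random values).
rng = np.random.default_rng(0)
images = rng.random((5, 28, 28))

# Flatten each image row by row into a single vector of 28 * 28 = 784 entries.
vectors = images.reshape(len(images), -1)

print(vectors.shape)  # (5, 784)
```

Even tiny images therefore produce points in a space with hundreds of dimensions.
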

slide-5
SLIDE 5

Challenges

  • “Curse of dimensionality”
  • Sampling becomes exponentially costly
  • Volume concentrates in the “corners” of the hypercube
  • Data processing becomes extremely expensive / intractable
  • How to visualize?
  • Missing intuition for high-dim spatial domain

slide-6
SLIDE 6

Ratio of hypersphere volume / hypercube volume:

$$\frac{\pi^{n/2}}{\Gamma\left(\frac{n}{2} + 1\right) \cdot 2^n}$$
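
The ratio can be evaluated numerically. The sketch below assumes the unit ball inscribed in the cube [-1, 1]^n (side length 2), which matches the factor 2^n in the formula; the function name is illustrative.

```python
import math

def ball_to_cube_ratio(n: int) -> float:
    """Volume of the unit n-ball divided by the volume of the cube [-1, 1]^n."""
    return math.pi ** (n / 2) / (math.gamma(n / 2 + 1) * 2 ** n)

# The ratio shrinks rapidly as the dimension grows.
for n in (1, 2, 3, 5, 10, 20):
    print(n, ball_to_cube_ratio(n))
```

For n = 2 this is π/4 and for n = 3 it is π/6; by n = 20 the inscribed ball occupies a negligible fraction of the cube, so almost all of the volume sits near the corners.
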

slide-7
SLIDE 7

DIMENSION REDUCTION

slide-8
SLIDE 8

Principal Component Analysis

  • Assume a point cloud $(x_i)_{i=1,\dots,n} \in \mathbb{R}^k$
  • Interpret these points as observations of a random variable $X \in \mathbb{R}^k$
  • The empirical mean (centroid) is $c = \frac{1}{n} \sum_i x_i$
  • The covariance matrix is given by $A_{jl} = \frac{1}{n} \sum_i (x_{ij} - c_j)(x_{il} - c_l)$, or in matrix form $A = \bar{X} \bar{X}^T$, where the columns of $\bar{X}$ are the centered points (with the factor $1/n$ absorbed into $\bar{X}$)

slide-9
SLIDE 9

PCA

  • The eigenvectors of the covariance matrix form a data-dependent coordinate system
  • The first m eigenvectors (in decreasing order of associated eigenvalues) span the m principal dimensions of the point cloud.
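
The PCA steps (centroid, covariance matrix, eigenvectors sorted by decreasing eigenvalue) can be sketched as follows. This is a minimal NumPy version under the assumption that points are stored as rows of an (n, k) array; the function name and the synthetic cloud are illustrative.

```python
import numpy as np

def pca(points: np.ndarray, m: int):
    """Project n points (rows of `points`, shape (n, k)) onto m principal directions."""
    c = points.mean(axis=0)                    # empirical mean (centroid)
    centered = points - c
    A = centered.T @ centered / len(points)    # k x k covariance matrix
    eigvals, eigvecs = np.linalg.eigh(A)       # ascending eigenvalues, orthonormal columns
    order = np.argsort(eigvals)[::-1][:m]      # indices of the m largest eigenvalues
    axes = eigvecs[:, order]                   # k x m principal directions
    return centered @ axes, axes, eigvals[order]

# Synthetic cloud (an assumption for the demo), stretched along the first axis.
rng = np.random.default_rng(0)
cloud = rng.normal(size=(500, 3)) * np.array([5.0, 1.0, 0.1])
coords, axes, variances = pca(cloud, m=2)
print(variances)
```

The returned `variances` are the eigenvalues, i.e. the variance of the cloud along each principal direction.
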

slide-10
SLIDE 10

MDS

Multidimensional Scaling

  • Input: dissimilarity matrix $\Delta \in \mathbb{R}^{n \times n}$ with $\Delta_{ij} = \delta_{i,j}$, $\Delta_{ii} = 0$ and $\Delta_{ij} > 0$ for $i \neq j$
  • Goal: find $x_1, x_2, \dots, x_n \in \mathbb{R}^d$ such that $\forall (i, j)\; \|x_i - x_j\| \approx \delta_{i,j}$

slide-11
SLIDE 11

MDS

Method:

  • Center(*): $B = -\frac{1}{2} H \Delta H$ where $H = I - \frac{1}{n} \mathbf{1}\mathbf{1}^T$
  • Spectral decomposition: $B = U \Lambda U^T$
  • Clamp(*): $(\Lambda_+)_{ij} = \max(\Lambda_{ij}, 0)$
  • Solution: first $d$ columns of $X = U \Lambda_+^{1/2}$

This is a PCA!

(*): if $\Delta$ measures Euclidean distances, $B$ is positive semidefinite ($\lambda_i \geq 0$)

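
A minimal NumPy sketch of the method (center, decompose, clamp, scale). One assumption is made explicit: the matrix passed in holds squared dissimilarities, which is the classical MDS convention and also what the Isomap slide uses. The function name and test data are illustrative.

```python
import numpy as np

def classical_mds(sq_dissim: np.ndarray, d: int) -> np.ndarray:
    """Embed n items in R^d from an n x n matrix of squared dissimilarities."""
    n = sq_dissim.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n          # centering matrix H = I - (1/n) 1 1^T
    B = -0.5 * H @ sq_dissim @ H                 # double centering
    eigvals, U = np.linalg.eigh(B)               # B = U Lambda U^T (ascending order)
    order = np.argsort(eigvals)[::-1][:d]        # d largest eigenvalues
    lam_plus = np.maximum(eigvals[order], 0.0)   # clamp negative eigenvalues to 0
    return U[:, order] * np.sqrt(lam_plus)       # first d columns of U Lambda_+^(1/2)

# Demo on synthetic 2-D points: pairwise distances should be reproduced.
rng = np.random.default_rng(1)
truth = rng.normal(size=(20, 2))
sq = ((truth[:, None, :] - truth[None, :, :]) ** 2).sum(axis=-1)
X = classical_mds(sq, d=2)
recovered = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
print(np.abs(recovered - sq).max())
```

The embedding matches the original points only up to rotation, reflection and translation, which is why the check compares distances rather than coordinates.
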
slide-12
SLIDE 12

MANIFOLD LEARNING

slide-13
SLIDE 13

Manifold Learning

  • Move away from linear assumption
  • Assume curved low-dimensional geometry
  • Assume smoothness

➡ MANIFOLD

  • Manifold learning

Problem: Given points $x_1, \dots, x_n \in \mathbb{R}^D$ that lie on a $d$-dimensional manifold $M$ that can be described by a single coordinate chart $f : M \to \mathbb{R}^d$, find $y_1, \dots, y_n \in \mathbb{R}^d$, where $y_i \overset{\text{def}}{=} f(x_i)$.

slide-14
SLIDE 14

Isomap

  • Form k-NN graph of input points
  • Form dissimilarity matrix as square of approximated geodesic distance between points
  • Compute MDS!

slide-15
SLIDE 15

Isomap

  • Computing geodesic distance
  • standard graph problem in CS
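
The Isomap pipeline (k-NN graph, graph shortest paths as approximate geodesics, MDS on the squared distances) can be sketched as below. Floyd-Warshall is used for the all-pairs shortest paths purely for brevity; that choice, the function name and the demo data are assumptions, and Dijkstra from every node is the usual alternative for large sparse graphs.

```python
import numpy as np

def isomap(points: np.ndarray, k: int, d: int) -> np.ndarray:
    n = len(points)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

    # Symmetric k-NN graph; missing edges have infinite length.
    graph = np.full((n, n), np.inf)
    np.fill_diagonal(graph, 0.0)
    nearest = np.argsort(dist, axis=1)[:, 1:k + 1]   # column 0 is the point itself
    rows = np.repeat(np.arange(n), k)
    cols = nearest.ravel()
    graph[rows, cols] = dist[rows, cols]
    graph[cols, rows] = dist[rows, cols]

    # Floyd-Warshall: allow paths through one more intermediate node per step.
    for via in range(n):
        graph = np.minimum(graph, graph[:, [via]] + graph[[via], :])
    if np.isinf(graph).any():
        raise ValueError("k-NN graph is disconnected; increase k")

    # Classical MDS on the squared geodesic distances.
    H = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * H @ (graph ** 2) @ H
    eigvals, U = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1][:d]
    return U[:, order] * np.sqrt(np.maximum(eigvals[order], 0.0))

# Demo: points on a half circle (a 1-D manifold in 2-D) unrolled to a line.
t = np.linspace(0.0, np.pi, 40)
arc = np.column_stack([np.cos(t), np.sin(t)])
embedding = isomap(arc, k=2, d=1)
print(abs(np.corrcoef(embedding[:, 0], t)[0, 1]))
```

The one-dimensional embedding should be almost perfectly correlated with the arc parameter `t`, up to sign.
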

slide-16
SLIDE 16

Isomap

  • Successful in computer vision problems

slide-17
SLIDE 17

Locally Linear Embedding

Intuition

  • A smooth manifold is locally close to linear

Method

  • Characterize local linearity through a weight matrix $W$ such that $\| x_i - \sum_{j \in N(i)} W_{ij} x_j \|^2$ is minimized
  • Find $d$-dimensional $y_i$'s that minimize $\sum_i \| y_i - \sum_j W_{ij} y_j \|^2$
  • Solution is obtained as the first $d$ eigenvectors of $(I - W)^T (I - W)$ (those with the smallest nonzero eigenvalues)
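
A minimal NumPy sketch of both LLE steps. Two details are assumptions taken from the standard formulation rather than from the slide: each row of W is constrained to sum to one, and the local Gram matrix is regularized so the solve stays well posed when k exceeds the input dimension. Function names and the helix data are illustrative.

```python
import numpy as np

def lle_weights(points: np.ndarray, k: int, reg: float = 1e-3) -> np.ndarray:
    """Weights W minimizing ||x_i - sum_j W_ij x_j||^2 over the k nearest neighbors."""
    n = len(points)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    neighbors = np.argsort(dist, axis=1)[:, 1:k + 1]   # column 0 is the point itself
    W = np.zeros((n, n))
    for i in range(n):
        Z = points[neighbors[i]] - points[i]           # neighbors relative to x_i
        G = Z @ Z.T                                    # local Gram matrix (k x k)
        G = G + reg * np.trace(G) * np.eye(k)          # regularization (assumption)
        w = np.linalg.solve(G, np.ones(k))
        W[i, neighbors[i]] = w / w.sum()               # enforce sum-to-one (assumption)
    return W

def lle(points: np.ndarray, k: int, d: int) -> np.ndarray:
    W = lle_weights(points, k)
    I = np.eye(len(points))
    M = (I - W).T @ (I - W)
    _, eigvecs = np.linalg.eigh(M)                     # ascending eigenvalues
    return eigvecs[:, 1:d + 1]                         # skip the constant eigenvector (eigenvalue ~ 0)

# Demo: a helix in 3-D, which is intrinsically one-dimensional.
s = np.linspace(0.0, 3.0 * np.pi, 60)
helix = np.column_stack([np.cos(s), np.sin(s), 0.3 * s])
W = lle_weights(helix, k=6)
Y = lle(helix, k=6, d=1)
print(Y.shape)
```

Because every row of W sums to one, the constant vector is always in the null space of (I − W), which is why the first eigenvector is discarded.
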

slide-20
SLIDE 20

LLE

slide-21
SLIDE 21

Laplacian Eigenmaps

  • Graph Laplacian of $W$ is $L = D - W$ with $D$ diagonal and $D_{ii} = \sum_j W_{ij}$
  • Define $W$ as either binary indicator of connectivity or through heat kernel
  • Compute eigenvectors of the Laplacian associated with non-zero eigenvalues
  • $L f = \lambda D f$

Embedding: $x_i \to y_i = (f_1(i), f_2(i), \dots, f_m(i))$

slide-22
SLIDE 22

Laplacian Eigenmaps

[Figure: grid of embeddings, panels labeled N = 5, 10, 15 and t = 5.0, 25.0, ∞]

[Figure: embedding plots with points labeled by speech symbols (sh, aa, ao, q, l, h#, dcl, kcl, gcl); axis tick values omitted]

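
A minimal NumPy sketch of Laplacian eigenmaps. The heat-kernel form exp(-||x_i - x_j||^2 / t) on a symmetrized k-NN graph is an assumption following the common formulation; with t = ∞ it reduces to the binary connectivity indicator. Function names and the demo data are illustrative, and the graph is assumed to be connected so that only one eigenvalue is zero.

```python
import numpy as np

def heat_kernel_graph(points: np.ndarray, n_neighbors: int, t: float) -> np.ndarray:
    """Symmetric weight matrix W on the k-NN graph; t = inf gives 0/1 weights."""
    sq = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
    nearest = np.argsort(sq, axis=1)[:, 1:n_neighbors + 1]
    adj = np.zeros(sq.shape, dtype=bool)
    adj[np.repeat(np.arange(len(points)), n_neighbors), nearest.ravel()] = True
    adj = adj | adj.T                                  # connect i and j if either is a neighbor
    return np.where(adj, np.exp(-sq / t), 0.0)

def laplacian_eigenmaps(W: np.ndarray, m: int):
    deg = W.sum(axis=1)                                # D_ii = sum_j W_ij
    L = np.diag(deg) - W                               # graph Laplacian L = D - W
    # Solve L f = lambda D f through the symmetric form D^(-1/2) L D^(-1/2) g = lambda g.
    s = 1.0 / np.sqrt(deg)
    eigvals, G = np.linalg.eigh(s[:, None] * L * s[None, :])
    F = s[:, None] * G                                 # f = D^(-1/2) g
    return F[:, 1:m + 1], eigvals[1:m + 1]             # drop the zero eigenvalue

# Demo on a half circle with 4 neighbors and t = 5.0.
u = np.linspace(0.0, np.pi, 40)
curve = np.column_stack([np.cos(u), np.sin(u)])
W = heat_kernel_graph(curve, n_neighbors=4, t=5.0)
F, lam = laplacian_eigenmaps(W, m=2)
print(lam)
```

Row i of `F` is the embedded point (f_1(i), ..., f_m(i)).
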
slide-23
SLIDE 23

Computation

  • Spectral decomposition intractable for large problems
  • Approximate methods can be used
  • Nyström method + column sampling
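
A hedged sketch of the Nyström idea with uniform column sampling: only a few columns C = K[:, idx] of a large symmetric positive semidefinite matrix K are needed, and K is approximated as C W⁺ Cᵀ, where W is the block of C at the sampled rows. The slide does not say which variant is meant, so the function name, the sampling scheme and the synthetic low-rank matrix are all assumptions.

```python
import numpy as np

def nystrom_factor(C: np.ndarray, idx: np.ndarray, r: int) -> np.ndarray:
    """Return Phi (n x r) with K ~= Phi @ Phi.T, given sampled columns C = K[:, idx].

    r must not exceed the numerical rank of the sampled block.
    """
    W = C[idx, :]                              # s x s block at the sampled rows and columns
    sigma, V = np.linalg.eigh(W)               # small eigenproblem instead of an n x n one
    top = np.argsort(sigma)[::-1][:r]          # r largest eigenvalues of W
    return C @ V[:, top] / np.sqrt(sigma[top])

# Demo: a rank-3 PSD matrix is recovered from 20 of its 200 columns.
rng = np.random.default_rng(2)
Z = rng.normal(size=(200, 3))
K = Z @ Z.T
idx = rng.choice(200, size=20, replace=False)
Phi = nystrom_factor(K[:, idx], idx, r=3)
print(np.abs(Phi @ Phi.T - K).max())
```

Approximate leading eigenvectors of K can then be read off from the thin SVD of `Phi`. The reconstruction is exact here only because the demo matrix has low rank; in general it is an approximation.
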

slide-24
SLIDE 24

Topological Data Analysis

  • Use algebraic topology to analyze high-dimensional data
  • Persistent homology
  • Contour tree
  • Morse-Smale complex

All three: discrete data, combinatorial processing

slide-25
SLIDE 25

TDA

  • Persistence diagram
  • Filtered simplicial complex
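
As a small illustration of a persistence diagram computed from a filtered complex, the sketch below handles only dimension 0 (connected components) of a Vietoris-Rips style filtration on a point cloud. Assumptions: every point is born at scale 0, an edge enters the filtration at the distance between its endpoints, and a union-find structure tracks merges; the function name and the example points are illustrative.

```python
import numpy as np

def h0_persistence(points: np.ndarray):
    """(birth, death) pairs of connected components as the scale grows."""
    n = len(points)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    # Filtration order: edges sorted by length.
    edges = sorted((float(dist[i, j]), i, j) for i in range(n) for j in range(i + 1, n))
    parent = list(range(n))

    def find(a: int) -> int:
        while parent[a] != a:
            parent[a] = parent[parent[a]]      # path halving
            a = parent[a]
        return a

    pairs = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:                           # this edge merges two components
            parent[ri] = rj
            pairs.append((0.0, length))        # one component dies at this scale
    pairs.append((0.0, float("inf")))          # the last component never dies
    return pairs

# Three points on a line at positions 0, 1 and 3.
print(h0_persistence(np.array([[0.0], [1.0], [3.0]])))
# [(0.0, 1.0), (0.0, 2.0), (0.0, inf)]
```

Each pair is one point of the dimension-0 persistence diagram; long-lived components (large death values) indicate well-separated clusters.
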

slide-26
SLIDE 26

Persistent Homology
