Topological Data Analysis - II


SLIDE 1

Topological Data Analysis - II

Afra Zomorodian
Department of Computer Science
Dartmouth College

September 4, 2007

SLIDE 2

Plan

☺ Yesterday:

– Motivation
– Topology
– Simplicial Complexes
– Invariants
– Homology
– Algebraic Complexes

  • Today

– Geometric Complexes
– Persistent Homology
– The Persistence Algorithm
– Application to Natural Images

SLIDE 3

Outline

  • Geometric Complexes

– Voronoi Diagram
– Delaunay Triangulation
– Alpha Complex
– Witness Complex
– Summary

  • Persistent Homology
  • The Persistence Algorithm
  • Application to Natural Images
SLIDE 4

Recall

  • Procedure

– Cover points to get approximation of underlying space
– Take nerve to get combinatorial representation

  • Example: ε-balls around points as cover
  • Idea: Use geometry of embedding space to generate cover

SLIDE 5

Voronoi Diagram

SLIDE 6

Voronoi Diagram

  • p ∈ M ⊆ R^2
  • Voronoi cell V(p): points of R^2 that are at least as close to p as to any other point of M
  • Voronoi Diagram: Decomposition of R^2 into Voronoi cells
  • Voronoi (1868 – 1908)
  • Idea: Use Voronoi cells as cover!
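
The cell definition can be checked pointwise: a query point x belongs to V(p) exactly when p is a nearest site. A minimal Python sketch, assuming sites are stored as coordinate tuples; `nearest_site` and the sample coordinates are illustrative names and values, not taken from the slides:

```python
import math

def nearest_site(x, sites):
    """Return the index of the site whose Voronoi cell contains x.

    Points on a cell boundary belong to several cells; this sketch
    breaks such ties by returning the lowest index.
    """
    return min(range(len(sites)), key=lambda i: math.dist(x, sites[i]))

# Hypothetical sample sites in the plane.
sites = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
owner = nearest_site((1.0, 0.5), sites)  # index of the owning cell
```

Evaluating `nearest_site` on a grid of query points rasterizes the Voronoi diagram, which is enough to visualize the cover for small inputs.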
SLIDE 7

Delaunay Triangulation

SLIDE 8

Delaunay Triangulation

  • Delaunay Triangulation: nerve of Voronoi cover
  • Computational Geometry
  • General position assumption

– no events with probability 0
– no k + 1 points on (k – 1)-sphere
– has to be handled in practice

  • Fast algorithms for R^3
  • Delaunay (1890 – 1980)
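
For a small planar point set in general position, the Delaunay triangles can be found by brute force using the empty-circumcircle property, which agrees with the nerve of the Voronoi cover. The sketch below is an illustration only: it tests every triple against every point, and `circumcircle`, `delaunay_triangles`, and the sample points are hypothetical names and values. Practical code would rely on a computational geometry library.

```python
from itertools import combinations

def circumcircle(a, b, c):
    """Return (center, squared radius) of the circle through a, b, c,
    or None if the three points are (nearly) collinear."""
    ax, ay = a
    bx, by = b
    cx, cy = c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return (ux, uy), (ux - ax) ** 2 + (uy - ay) ** 2

def delaunay_triangles(points):
    """Triples of indices whose circumcircle contains no other point."""
    triangles = []
    for i, j, k in combinations(range(len(points)), 3):
        circle = circumcircle(points[i], points[j], points[k])
        if circle is None:
            continue
        (ux, uy), r2 = circle
        others = (p for m, p in enumerate(points) if m not in (i, j, k))
        if all((px - ux) ** 2 + (py - uy) ** 2 >= r2 - 1e-9 for px, py in others):
            triangles.append((i, j, k))
    return triangles

points = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (3.0, 3.0)]
triangles = delaunay_triangles(points)  # [(0, 1, 3), (0, 2, 3)]
```

The tolerance `1e-9` is an arbitrary choice for this sketch; degenerate (cocircular) inputs are exactly the cases the general position assumption rules out.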

SLIDE 9

Restricted Voronoi

SLIDE 10

Alpha Complex

SLIDE 11

Delaunay Subcomplex

SLIDE 12

Alpha Complex

  • Alpha cell: A_ε(p) = B_ε(p) ∩ V(p)
  • Alpha shape: union of alpha cells
  • Alpha complex: nerve of alpha shape
  • Let D be the Delaunay triangulation

– A_0 = ∅
– A_ε ⊆ D
– A_∞ = D

  • A_ε ≃ C_ε
  • [Edelsbrunner, Kirkpatrick, and Seidel ’83], et al.
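
Membership in an alpha cell combines the two conditions in its definition: the point must lie in the ε-ball around p and in the Voronoi cell of p. A small Python sketch under the assumption that sites are coordinate tuples; `in_alpha_cell` and the sample values are illustrative and not from the slides:

```python
import math

def in_alpha_cell(x, p_index, sites, eps):
    """True if x lies in the alpha cell B_eps(p) ∩ V(p) of sites[p_index]."""
    d_p = math.dist(x, sites[p_index])
    in_ball = d_p <= eps                                     # x in B_eps(p)
    in_voronoi = all(d_p <= math.dist(x, q) for q in sites)  # x in V(p)
    return in_ball and in_voronoi

# Hypothetical sites on a line.
sites = [(0.0, 0.0), (4.0, 0.0)]
inside = in_alpha_cell((1.0, 0.0), 0, sites, eps=1.5)
```

Growing `eps` only ever enlarges each cell until it fills the whole Voronoi cell, which is one way to see why the alpha complexes are nested inside the Delaunay triangulation.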
SLIDE 13

Strong Witness

SLIDE 14

Strong Witness

  • Given: Point set M ⊆ R^d
  • Strong witness: x ∈ R^d

– x is equidistant from v_0, . . ., v_k ∈ M
– x has no closer neighbor in M
– x witnesses k-simplex {v_0, . . ., v_k}

  • Idea: Sample for witnesses
  • Problem: Prob(strong witness) = 0 for discrete set M
SLIDE 15

Weak Witness

  • Weak witness: x ∈ R^d

– |x – v_i| ≤ |x – v| for i = 0, . . ., k and v ∈ M \ {v_0, . . ., v_k}
– x’s closest k + 1 neighbors are v_0, . . ., v_k
– x witnesses k-simplex {v_0, . . ., v_k} weakly

  • Strong witness ⇒ weak witness
  • (Theorem [de Silva]) A simplex has a strong witness iff all its faces have weak witnesses.
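
The weak-witness condition is a finite check: the farthest vertex of the simplex must be no farther from x than any point outside the simplex. A minimal Python sketch, assuming points are coordinate tuples and a simplex is a set of indices; `is_weak_witness` and the sample data are hypothetical:

```python
import math

def is_weak_witness(x, simplex, points):
    """True if every vertex of `simplex` is at least as close to x
    as every point of `points` outside the simplex."""
    simplex = set(simplex)
    farthest_vertex = max(math.dist(x, points[i]) for i in simplex)
    outside = (math.dist(x, points[j])
               for j in range(len(points)) if j not in simplex)
    return all(farthest_vertex <= d for d in outside)

# Hypothetical point set M and candidate witness x.
M = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (0.0, 5.0)]
x = (0.4, 0.1)
edge_01 = is_weak_witness(x, {0, 1}, M)
```

Unlike the strong condition, this test holds on an open region of positions for x, so sampled points can satisfy it with positive probability.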

SLIDE 16

Isomap

  • We want to capture the underlying space, not the embedding space
  • Idea: Restrict witnesses to given points M

[Tenenbaum et al. ’00]

SLIDE 17

Witness Complex

  • Given: N points M
  • Choose: n landmarks L
  • M \ L will act as witnesses
  • D: n × N distance matrix
  • Construct ε-graph on L: edge [ab] ∈ W_ε(M) iff there exists a witness i with max(D(a,i), D(b,i)) ≤ ε
  • Do Vietoris-Rips Expansion
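
The two construction steps, the ε-graph on the landmarks followed by Vietoris-Rips expansion, can be sketched in a few lines of Python. This is an illustration under assumptions: D is given as a list of n rows with one column per witness, and `witness_graph`, `rips_expansion`, and the toy matrix are hypothetical names and data, not the Plex implementation.

```python
from itertools import combinations

def witness_graph(D, eps):
    """Edges (a, b) between landmarks that share a witness within eps.

    D[a][i] is the distance from landmark a to witness i.
    """
    edges = set()
    for a, b in combinations(range(len(D)), 2):
        if any(max(D[a][i], D[b][i]) <= eps for i in range(len(D[a]))):
            edges.add((a, b))
    return edges

def rips_expansion(n, edges, max_dim=2):
    """Add every simplex (up to max_dim) whose edges are all present."""
    simplices = [(v,) for v in range(n)] + sorted(edges)
    for size in range(3, max_dim + 2):
        for verts in combinations(range(n), size):
            if all(e in edges for e in combinations(verts, 2)):
                simplices.append(verts)
    return simplices

# Toy example: 3 landmarks, 2 witnesses.
D = [[1.0, 5.0],
     [1.0, 1.0],
     [5.0, 1.0]]
small = rips_expansion(3, witness_graph(D, eps=1.0))
large = rips_expansion(3, witness_graph(D, eps=5.0))
```

At the smaller scale the complex is a path on the three landmarks; at the larger scale all three edges appear and the expansion fills in the triangle.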

SLIDE 18

Complex Summary

Complex        Name  Idea                         Scales?  Extends?
Cech           C_ε   Nerve of ε-balls             1K       ∼
Vietoris-Rips  V_ε   Pairwise dist < ε            1K       Y
Alpha          A_ε   Nerve of restricted Voronoi  500K     d ≤ 3
Witness        W_ε   Landmarks and witnesses      1K       Y

  • Conformal Alpha – no global scale parameter
  • Flow – stable manifolds of distance function
  • Cubical – rasterize, usually interpretation of images

SLIDE 19

Outline

☺ Geometric Complexes

  • Persistent Homology

– Filtrations
– Algebraic Result
– Simple Examples

  • The Persistence Algorithm
  • Application to Natural Images
SLIDE 20

The Question of Scale

[Figure: four complexes with their Betti numbers]

– β0 = 150, β1 = 0, β2 = 0
– β0 = 1, β1 = 37, β2 = 0
– β0 = 1, β1 = 2, β2 = 1
– β0 = 1, β1 = 1, β2 = 22

Combinatorial Topology

[Figure: plot of β1 against ε]

SLIDE 21

Filtration

  • A filtration of a space X is a nested sequence of subspaces: ∅ = X_0 ⊆ X_1 ⊆ . . . ⊆ X_m = X
  • C_ε ⊆ C_ε′ if ε ≤ ε′ (also true for V_ε, A_ε, and W_ε)
  • Simplices are always added, never removed
  • Implies partial order on simplices
  • Full order: sequence of simplices
  • K_i = union of first i simplices in sequence
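
One way to turn the partial order into a full order is to record the scale at which each simplex first appears and sort, breaking ties by dimension so that faces always precede the simplices they bound. The Python sketch below does this for a Vietoris-Rips filtration, where a simplex enters at its largest pairwise distance; `rips_filtration` and the sample points are illustrative assumptions, not code from the talk.

```python
import math
from itertools import combinations

def rips_filtration(points, max_dim=2):
    """Return [(simplex, entry_scale), ...] in a valid filtration order."""
    entries = []
    for size in range(1, max_dim + 2):
        for verts in combinations(range(len(points)), size):
            # A simplex enters when its longest edge does; vertices enter at 0.
            scale = max((math.dist(points[i], points[j])
                         for i, j in combinations(verts, 2)), default=0.0)
            entries.append((scale, size, verts))
    entries.sort()  # by scale, then dimension, then vertex labels
    return [(verts, scale) for scale, _, verts in entries]

points = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]  # a 3-4-5 right triangle
filtration = rips_filtration(points)
order = [simplex for simplex, _ in filtration]
```

Every prefix of `order` is then a simplicial complex, which is the property the sequence K_i needs.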

[Figure: Witness Complex]

SLIDE 22

Inductive Systems

K_250 ⊆ K_500 ⊆ K_994 ⊆ K_1452
H_k(K_250) → H_k(K_500) → H_k(K_994) → H_k(K_1452)

Functoriality
Idea: Follow basis elements from birth to death
Problem: Need a compatible basis!

SLIDE 23

Persistent Homology

  • Persistence barcode: multiset of intervals

[Figure: barcode; each interval runs from Birth to Death]

SLIDE 24

Algebraic Result

1. Correspondence

  • Input: Filtration
  • Structure of homology: graded k[t]-module

2. Classification

  • k, a field ⇒ k[t] is a PID
  • Structure theorem for graded modules over a PID

3. Parameterization

  • n half-infinite intervals
  • m finite intervals
  • Barcode: multiset of n+m intervals (birth, death)
  • Complete discrete invariant!
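
Because the barcode is a complete invariant, the Betti number at any scale can be read off from it by counting the intervals alive at that scale. A short Python sketch, assuming half-open intervals [birth, death) with `math.inf` for the half-infinite ones; `betti_at` and the sample barcode are hypothetical:

```python
import math

def betti_at(barcode, eps):
    """Number of intervals [birth, death) that contain the scale eps."""
    return sum(1 for birth, death in barcode if birth <= eps < death)

# Hypothetical barcode: one half-infinite and two finite intervals.
barcode = [(0.0, math.inf), (0.0, 1.0), (1.0, 2.0)]
counts = [betti_at(barcode, eps) for eps in (0.5, 1.5, 3.0)]  # [2, 2, 1]
```

Only the half-infinite interval survives at large scales, so the count there equals n in the parameterization above.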
SLIDE 25

Deconstructing the Graph (2D)

[Figure: β1 graph (β1 against ε) and β1 barcode. Torus!]

SLIDE 26

Discovering 3D Structure

[Figure: β1 barcode]

SLIDE 27

Outline

☺ Geometric Complexes
☺ Persistent Homology

  • The Persistence Algorithm

– Adding a Simplex
– Example Filtration

  • Application to Natural Images
SLIDE 28

Adding a Simplex

  • Given: Filtered complex K
  • K_i = K_{i-1} ∪ σ, where σ is a k-simplex
  • Let c = ∂σ. c is a (k – 1)-chain.
  • (Lemma) c is a cycle.
  • Proof: ∂c = ∂∂σ = 0.
  • (Lemma) c is in K_{i-1}.
  • Proof: K_i is a simplicial complex.
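
The first lemma can be checked mechanically. The Python sketch below works with coefficients mod 2, where a chain is simply a set of simplices and addition is symmetric difference; that coefficient choice and the names `boundary` and `boundary_of_chain` are assumptions made for illustration.

```python
def boundary(simplex):
    """Boundary of a simplex (tuple of vertices) with coefficients mod 2:
    the set of faces obtained by dropping one vertex."""
    if len(simplex) == 1:
        return set()  # a vertex has empty boundary
    return {simplex[:i] + simplex[i + 1:] for i in range(len(simplex))}

def boundary_of_chain(chain):
    """Boundary of a chain; faces appearing twice cancel mod 2."""
    result = set()
    for simplex in chain:
        result ^= boundary(simplex)
    return result

sigma = ("a", "b", "c")
c = boundary(sigma)            # the three edges of the triangle
dd = boundary_of_chain(c)      # empty: every vertex appears twice
```

Each vertex of the triangle lies on exactly two of its edges, so the contributions cancel in pairs and ∂∂σ = 0.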
SLIDE 29

Gaussian Elimination

  • σ is a k-simplex
  • c = ∂σ is a (k – 1)-cycle in K_{i-1}
  • Two cases: c is a boundary in K_{i-1}, or it is not
  • M_k is matrix for ∂_k
  • c is a boundary iff

– it is in range(M_k)
– we can write it in terms of a basis for range(M_k)

  • Gaussian elimination maintains a basis for range(M_k)
  • Filtration and persistence imply ordering on pivots
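
The range test can be sketched with Gaussian elimination over the two-element field, storing each column as an integer bitmask and indexing the basis by pivot (highest set bit). This is an illustrative Python sketch under those assumptions; `reduce_against`, `build_basis`, `is_boundary`, and the bit encoding of vertices are hypothetical choices, not the talk's implementation.

```python
def reduce_against(basis, v):
    """Cancel the pivot of v against basis columns until it is zero
    or its pivot is not present in the basis."""
    while v:
        pivot = v.bit_length() - 1
        if pivot not in basis:
            return v
        v ^= basis[pivot]  # addition mod 2
    return 0

def build_basis(columns):
    """Basis for the span of `columns`, keyed by pivot position."""
    basis = {}
    for col in columns:
        r = reduce_against(basis, col)
        if r:
            basis[r.bit_length() - 1] = r
    return basis

def is_boundary(columns, c):
    """True if the chain c lies in the span of the boundary columns."""
    return reduce_against(build_basis(columns), c) == 0

# Vertices a, b, c, d are bits 0..3; columns are the boundaries of ab, bc, cd.
A, B, C, D = 1, 2, 4, 8
edge_boundaries = [A | B, B | C, C | D]
closes_cycle = is_boundary(edge_boundaries, A | D)  # boundary of edge ad
```

Here the chain a + d reduces to zero, so it is already a boundary of the path ab + bc + cd, while a single vertex does not reduce to zero.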
SLIDE 30

Case 1: c is a boundary in K_{i-1}

  • If c is a boundary, then ∃d ∈ C_k(K_{i-1}) such that c = ∂d
  • (Lemma) σ + d is a k-cycle in K_i.
  • Proof: ∂(σ + d) = ∂σ + ∂d = c + c = 0 (coefficients mod 2).
  • σ creates a new k-cycle class
  • σ is a creator

[Figure labels: c, σ, d]

SLIDE 31

Case 2: c is not a boundary in K_{i-1}

  • (Lemma) c becomes a boundary in K_i.
  • Proof: c = ∂σ.
  • In K_{i-1}

– c is a cycle
– c is not a boundary
– c is in a non-boundary homology class

  • In K_i: c is a boundary, so its homology class is trivial.
  • σ destroys a (k – 1)-dimensional class
  • σ is a destroyer
  • Suppose τ created the class that σ destroys
  • We pair (τ, σ) to get the lifetime interval

[Figure labels: σ, c]

SLIDE 32

Example

SLIDE 33

Filtration

  • Initially, cascade = σ_i
SLIDE 34

Vertices a, b, c, d

  • ∂σ = 0 for all vertices σ
SLIDE 35

ab

  • We sort ∂ab = b + a by youngest
  • Since b is unpaired, pair with ab
SLIDE 36

bc, cd

  • ∂bc = c + b
  • ∂cd = d + c
SLIDE 37

ad

  • ∂ad = (d + a)
      ∼ (d + a) + (d + c) = c + a
      ∼ (c + a) + (c + b) = b + a
      ∼ (b + a) + (b + a) = 0

SLIDE 38

ac

  • ∂ac = (c + a)
      ∼ (c + a) + (c + b) = b + a
      ∼ (b + a) + (b + a) = 0

SLIDE 39

abc

  • ∂abc = ac + bc + ab
SLIDE 40

acd

  • ∂acd = ac + ad + cd
      ∼ (ac + ad + cd) + (ac + bc + ab) = ad + cd + bc + ab

SLIDE 41

Barcode

  • β0: a is unpaired ⇒ [0, ∞)
  • β0: (b, ab) ⇒ [0, 1)
  • β0: (c, bc) ⇒ ∅
  • β0: (d, cd) ⇒ [1, 2)
  • β1: (ad, acd) ⇒ [2, 5)
  • β1: (ac, abc) ⇒ [4, 5)
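
The pairs listed above can be reproduced with a short reduction over coefficients mod 2. The Python sketch below uses the common column-reduction formulation (repeatedly cancel the youngest boundary simplex against the stored column that already claimed it), offered here as an assumed equivalent of the cascade procedure rather than the talk's own code. The simplex order a, b, c, d, ab, bc, cd, ad, ac, abc, acd is also an assumption, inferred from the reductions in the example; `persistence_pairs` is a hypothetical name.

```python
def persistence_pairs(filtration):
    """filtration: simplices (tuples of vertices) in order of appearance.
    Returns (pairs, unpaired) where pairs are (creator, destroyer)."""
    index = {s: i for i, s in enumerate(filtration)}
    reduced = {}            # youngest row index -> reduced boundary column
    pairs, creators = [], set()
    for j, s in enumerate(filtration):
        # Boundary mod 2 as a set of filtration indices (empty for vertices).
        col = set()
        if len(s) > 1:
            col = {index[s[:i] + s[i + 1:]] for i in range(len(s))}
        # Cancel the youngest simplex while it is already paired.
        while col and max(col) in reduced:
            col ^= reduced[max(col)]
        if col:
            youngest = max(col)
            reduced[youngest] = col
            creators.discard(youngest)
            pairs.append((filtration[youngest], s))   # s is a destroyer
        else:
            creators.add(j)                           # s is a creator
    return pairs, [filtration[i] for i in sorted(creators)]

names = ["a", "b", "c", "d", "ab", "bc", "cd", "ad", "ac", "abc", "acd"]
filtration = [tuple(name) for name in names]
pairs, unpaired = persistence_pairs(filtration)
named_pairs = [("".join(t), "".join(s)) for t, s in pairs]
```

With this order, `named_pairs` comes out as (b, ab), (c, bc), (d, cd), (ac, abc), (ad, acd) and only a stays unpaired. Converting pairs to intervals additionally requires the time at which each simplex enters, which this sketch does not model.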
SLIDE 42

Outline

☺ Geometric Complexes ☺ Persistent Homology ☺ The Persistence Algorithm

  • Application to Natural Images
SLIDE 43

Natural Images

  • J. H. van Hateren, Neurobiophysics, U. Groningen
SLIDE 44

Local Structure: 3 x 3 Patches

(0.81, 0.62, 0.64, 0.82, 0.65, 0.64, 0.83, 0.66, 0.65) ∈ R^9

SLIDE 45

Mumford Dataset

  • David Mumford (Brown)

– 3 × 3 patches (R^9)
– Subtract mean intensity (R^8)
– Remove low contrast patches
– Rescale to unit length (S^7)

  • 2.5 million points on S^7
  • What is its structure?
  • Examine dense areas
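
The preprocessing steps can be sketched for a single patch as follows. This is a simplified Python illustration: it measures contrast by the Euclidean norm of the mean-centered patch and uses an arbitrary threshold, both of which are assumptions and may differ from how the actual dataset was built. `normalize_patch` and `min_contrast` are hypothetical names.

```python
import math

def normalize_patch(patch, min_contrast=1e-3):
    """Map a 3 x 3 patch (9 intensities) to a unit vector with zero mean,
    or return None if its contrast is below the threshold."""
    mean = sum(patch) / len(patch)
    centered = [v - mean for v in patch]             # lies in an 8-dim subspace
    norm = math.sqrt(sum(v * v for v in centered))   # assumed contrast measure
    if norm < min_contrast:
        return None                                  # low contrast: discard
    return [v / norm for v in centered]              # point on the unit sphere

patch = (0.81, 0.62, 0.64, 0.82, 0.65, 0.64, 0.83, 0.66, 0.65)
unit = normalize_patch(patch)
flat = normalize_patch([0.5] * 9)   # constant patch is rejected
```

After centering, the nine coordinates sum to zero, which is why the normalized patches live on a 7-sphere inside an 8-dimensional subspace of R^9.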
SLIDE 46

Space of Idealized Lines

  • Lines in natural images
  • Rasterized in 3 × 3 patches
  • Parameterization

– Distance to center: I
– Angle: S^1
– Space is annulus: I × S^1

  • I × S^1 ≃ S^1

SLIDE 47

Demo

SLIDE 48

Graph Structure

[Figure labels: Linear, Quadratic (Vertical), Quadratic (Horizontal)]

SLIDE 49

2D Structure

[Figure: identification diagram with edge labels a, a, b, b]

SLIDE 50

The Klein Bottle

  • Can we design a compression algorithm that uses the Klein bottle?

SLIDE 51

Software

  • Plex: comptop.stanford.edu/programs/plex

– Cech
– Vietoris-Rips
– Witness
– Persistence

  • Cgal: www.cgal.org

– Alpha
– Persistence (?)

  • CHomP: chomp.rutgers.edu
  • Alpha Shapes: biogeometry.duke.edu/software/alphashapes
  • GGobi: www.ggobi.org
SLIDE 52

Conclusion

  • We are flooded by point set data and need to find structure in them
  • Topology studies connectivity of spaces
  • Topological analysis may be viewed as a generalization of clustering
  • To analyze point sets, we require a combinatorial representation approximating the original space
  • Homology focuses on the structure of cycles
  • Persistent homology analyzes the relationship of structures at multiple scales