Computational Topology - Mapper Jiaqi Ni Eindhoven University of - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Computational Topology - Mapper

Jiaqi Ni

Eindhoven University of Technology

June 14, 2018

slide-2
SLIDE 2

Outline

◮ Introduction
◮ Mapper in the continuous setting
◮ Mapper in practice
◮ Parameters of Mapper in practice
◮ Applications
◮ Summary

slide-4
SLIDE 4

Introduction

◮ Mapper is a computational method for extracting simple descriptions of high-dimensional data sets in the form of simplicial complexes.

slide-5
SLIDE 5

Recap about Reeb Graph

Definition: The Reeb graph R(f) of a function f : X → R is the set of contours of f, that is, the connected components of its level sets f⁻¹(c).

slide-6
SLIDE 6

Recap about Reeb Graph

With Mapper we can obtain a result similar to the Reeb graph.

slide-7
SLIDE 7

Recap about Reeb Graph

With Mapper we can also obtain results that differ from the Reeb graph.

slide-10
SLIDE 10

Cover of space

If X is a topological space, then a cover C of X is a collection of subsets U of X whose union is the whole space X. In this case we say that C covers X, or that the sets U cover X.

[Figure: topological space X; cover of space X]

slide-11
SLIDE 11

Cover of space

If Y is a subset of X, then a cover of Y is a collection of subsets of X whose union contains Y, i.e., C is a cover of Y if $Y \subseteq \bigcup_{\alpha \in C} \alpha$ (the union of all sets α in C).
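For finite sets the definition can be checked mechanically. The following is a minimal sketch, assuming sets are modelled as Python `set` objects; the helper name `is_cover` and the toy sets are illustrative assumptions, not part of the slides.

```python
def is_cover(collection, subset):
    """Return True if the union of the sets in `collection` contains `subset`."""
    union = set()
    for member in collection:
        union |= set(member)
    return set(subset) <= union


# Toy example (hypothetical data): X = {0, ..., 9}, Y = {2, 3, 4, 5}
X = set(range(10))
Y = {2, 3, 4, 5}
C = [{0, 1, 2, 3}, {3, 4, 5, 6}]

print(is_cover(C, Y))  # C is a cover of Y
print(is_cover(C, X))  # C is not a cover of the whole space X
```

Here C covers the subset Y even though it does not cover X, which is exactly the distinction the definition draws.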

slide-15
SLIDE 15

Cover refinement

◮ A refinement of a cover C of a topological space X is a new cover D of X such that every set in D is contained in some set in C.

◮ Formally: $D = \{V_\beta\}_{\beta \in B}$ is a refinement of $C = \{U_\alpha\}_{\alpha \in A}$ when $\forall \beta \in B \;\, \exists \alpha \in A : V_\beta \subseteq U_\alpha$.

[Figure: space X; cover of space X; refinement of cover]
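The refinement condition is easy to test on finite examples. This is a small sketch under the assumption that covers are lists of Python sets; `is_refinement` and the sample covers are hypothetical names and data chosen for illustration.

```python
def is_refinement(fine, coarse, space):
    """True if `fine` covers `space` and each of its sets lies inside some set of `coarse`."""
    covers_space = set().union(*fine) >= set(space)
    contained = all(any(set(v) <= set(u) for u in coarse) for v in fine)
    return covers_space and contained


# Toy example (hypothetical data): X = {0, ..., 5}
X = set(range(6))
C = [{0, 1, 2, 3}, {2, 3, 4, 5}]
D = [{0, 1}, {1, 2, 3}, {3, 4}, {4, 5}]

print(is_refinement(D, C, X))  # D refines C
print(is_refinement(C, D, X))  # C does not refine D
```

The relation is not symmetric: every set of D fits inside a set of C, but {0, 1, 2, 3} fits inside no set of D.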

slide-16
SLIDE 16

Mapper in the continuous setting

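Input:

◮ Continuous function (filter) $f : X \to \mathbb{R}$
◮ Cover C of im(f) by open intervals: $\operatorname{im}(f) \subseteq \bigcup_{c \in C} c$

Method:

◮ Compute the pullback cover $\mathcal{U}$ of X: $\mathcal{U} = \{f^{-1}(c)\}_{c \in C}$
◮ Refine $\mathcal{U}$ by separating each of its elements into its connected components → connected cover $\mathcal{V}$
◮ The Mapper is the nerve of $\mathcal{V}$:
  ◮ 1 vertex per element $V \in \mathcal{V}$
  ◮ 1 edge per non-empty intersection $V \cap V' \neq \emptyset$, $V, V' \in \mathcal{V}$
  ◮ 1 k-simplex per non-empty (k + 1)-fold intersection $\bigcap_{i=0}^{k} V_i \neq \emptyset$, $V_0, V_1, \dots, V_k \in \mathcal{V}$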

slide-17
SLIDE 17

Example of Mapper in the continuous setting

slide-18
SLIDE 18

Example of Mapper in the continuous setting

slide-19
SLIDE 19

Example of Mapper in the continuous setting

slide-21
SLIDE 21

Mapper in practice

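Input:

◮ Point cloud P with distance matrix
◮ Continuous function (filter) $f : P \to \mathbb{R}$
◮ Cover C of im(f) by open intervals: $\operatorname{im}(f) \subseteq \bigcup_{c \in C} c$

Method:

◮ Compute the pullback cover $\mathcal{U}$ of P: $\mathcal{U} = \{f^{-1}(c)\}_{c \in C}$
◮ Refine $\mathcal{U}$ by applying a clustering algorithm (with distance threshold δ) → connected cover $\mathcal{V}$
◮ The Mapper is the nerve of $\mathcal{V}$:
  ◮ 1 vertex per element $V \in \mathcal{V}$
  ◮ 1 edge per non-empty intersection $V \cap V' \neq \emptyset$, $V, V' \in \mathcal{V}$
  ◮ 1 k-simplex per non-empty (k + 1)-fold intersection $\bigcap_{i=0}^{k} V_i \neq \emptyset$, $V_0, V_1, \dots, V_k \in \mathcal{V}$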
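The steps above can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not the reference implementation: it computes only vertices and edges (the 1-skeleton of the nerve), uses Euclidean distance instead of a precomputed distance matrix, and clusters each preimage by taking connected components of the graph joining points at distance at most δ. The function names and the circle data set are assumptions made for the example.

```python
import math
from itertools import combinations


def clusters_at_threshold(indices, points, delta):
    """Connected components of the graph joining points at distance <= delta."""
    unvisited = set(indices)
    components = []
    while unvisited:
        seed = unvisited.pop()
        component, stack = {seed}, [seed]
        while stack:
            i = stack.pop()
            near = {j for j in unvisited
                    if math.dist(points[i], points[j]) <= delta}
            unvisited -= near
            component |= near
            stack.extend(near)
        components.append(frozenset(component))
    return components


def mapper_graph(points, filter_values, intervals, delta):
    """Return (nodes, edges): clusters of point indices and overlapping pairs."""
    nodes = []
    for lo, hi in intervals:
        # Pullback of the open interval (lo, hi) under the filter.
        preimage = [i for i, v in enumerate(filter_values) if lo < v < hi]
        nodes.extend(clusters_at_threshold(preimage, points, delta))
    # One edge per pair of clusters sharing at least one point.
    edges = [(a, b) for a, b in combinations(range(len(nodes)), 2)
             if nodes[a] & nodes[b]]
    return nodes, edges


# Toy input: 24 points on the unit circle, filtered by height (y-coordinate).
points = [(math.cos(2 * math.pi * k / 24), math.sin(2 * math.pi * k / 24))
          for k in range(24)]
heights = [y for _, y in points]
intervals = [(-1.1, -0.2), (-0.6, 0.6), (0.2, 1.1)]

nodes, edges = mapper_graph(points, heights, intervals, delta=0.3)
print(len(nodes), len(edges))  # 4 nodes joined by 4 edges: a single cycle
```

For the sampled circle the middle interval pulls back to two separate arcs, so the output is a 4-cycle, which matches the Reeb graph of the height function on a circle.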

slide-22
SLIDE 22

Example of Mapper in practice

slide-23
SLIDE 23

Example of Mapper in practice

slide-27
SLIDE 27

Parameters of Mapper in practice

◮ Filter f : P → R
◮ Cover C of im(f) by open intervals
◮ Clustering algorithm

slide-28
SLIDE 28

Parameters of Mapper in practice - Filter functions

◮ The outcome of Mapper depends heavily on the function chosen to partition (filter) the data set, and the right choice of function depends mostly on the data set itself.

◮ Possible functions:
  ◮ Density
  ◮ Eccentricity
  ◮ Graph Laplacians
  ◮ sum/average/max/min
  ◮ x/y-axis projection
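Three of the listed filters can be written down directly. The sketch below assumes points are tuples of floats with Euclidean distance, and uses one common form of each filter: eccentricity as the p-th power mean distance to all points, and density as an unnormalised Gaussian kernel sum with bandwidth `eps`. The function names and the sample data are illustrative assumptions.

```python
import math


def eccentricity(points, p=1):
    """p-th power mean of the distances from each point to all points."""
    n = len(points)
    return [
        (sum(math.dist(x, y) ** p for y in points) / n) ** (1 / p)
        for x in points
    ]


def gaussian_density(points, eps):
    """Unnormalised Gaussian kernel density estimate at each point."""
    return [
        sum(math.exp(-math.dist(x, y) ** 2 / eps) for y in points)
        for x in points
    ]


def axis_projection(points, axis=0):
    """Projection onto one coordinate axis (axis=0 is x, axis=1 is y)."""
    return [x[axis] for x in points]


# Hypothetical data: three nearby points and one outlier.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
ecc = eccentricity(points)
dens = gaussian_density(points, eps=1.0)
print(ecc.index(max(ecc)), dens.index(min(dens)))  # the outlier (index 3) in both cases
```

Eccentricity is large for points far from the bulk of the data, while density is large inside the bulk, so the two filters tend to order the points in opposite directions.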

slide-29
SLIDE 29

Filter function examples

slide-30
SLIDE 30

Filter function examples

slide-31
SLIDE 31

Filter function examples

slide-32
SLIDE 32

Filter function examples

slide-35
SLIDE 35

Parameters of Mapper in practice - Cover

◮ Uniform cover I
  ◮ resolution / granularity: r (diameter of intervals)
  ◮ gain: g (percentage of overlap)

◮ Example:
◮ Changing r and g can strongly affect the result.
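A uniform cover is fully determined by r and g. The following sketch assumes g is given as a fraction in (0, 1) and that consecutive intervals overlap by g · r; the function name `uniform_cover` and the choice to start the first interval slightly left of the range (so that both endpoints are interior points of open intervals) are assumptions of this example.

```python
def uniform_cover(lo, hi, r, g):
    """Open intervals of length r covering [lo, hi]; neighbours overlap by g * r."""
    if r <= 0 or not 0 < g < 1:
        raise ValueError("need r > 0 and 0 < g < 1")
    step = r * (1 - g)      # distance between consecutive left endpoints
    first = lo - g * r / 2  # start slightly left so lo is an interior point
    intervals = []
    i = 0
    while True:
        start = first + i * step
        intervals.append((start, start + r))
        if start + r > hi:  # hi now lies inside the last open interval
            break
        i += 1
    return intervals


cover = uniform_cover(0.0, 1.0, r=0.5, g=0.5)
print(cover)
# [(-0.125, 0.375), (0.125, 0.625), (0.375, 0.875), (0.625, 1.125)]
```

Halving r to 0.25 with the same gain doubles the number of intervals from 4 to 8, which gives a finer cover and therefore more Mapper nodes.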

slide-36
SLIDE 36

Cover examples

slide-37
SLIDE 37

Cover examples

slide-38
SLIDE 38

Cover examples

slide-39
SLIDE 39

Cover examples

slide-40
SLIDE 40

Mapper for Y-shape point cloud data

slide-41
SLIDE 41

Mapper for Y-shape point cloud data

slide-42
SLIDE 42

Parameters of uniform Cover

Parameter r:

◮ Small r: fine cover, Mapper close to the Reeb graph, but sensitive to δ.
◮ Large r: rough cover, less sensitive to δ, but Mapper far from the Reeb graph.

Parameter g:

◮ Large g (close to 1): more points inside intersections, less sensitive to δ, but far from the Reeb graph.
◮ Small g (close to 0): controlled Mapper dimension, close to the Reeb graph.

slide-43
SLIDE 43

Parameters of Mapper in practice - Clustering algorithm

Single-linkage clustering is one of several methods of hierarchical clustering.

◮ It groups clusters in a bottom-up fashion (agglomerative clustering).
◮ At each step it combines the two clusters that contain the closest pair of elements not yet belonging to the same cluster as each other.
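
The merge rule can be written as a deliberately naive loop. This sketch assumes Euclidean points and stops merging once the closest pair of clusters is farther apart than a threshold `delta`; the function name and the sample points are illustrative, and a production implementation would use a faster algorithm.

```python
import math
from itertools import combinations


def single_linkage(points, delta):
    """Merge clusters bottom-up while the closest pair of points lying in
    different clusters is at distance <= delta."""
    clusters = [{i} for i in range(len(points))]

    def gap(a, b):
        # Single-linkage distance: closest pair across the two clusters.
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: gap(*pair))
        if gap(a, b) > delta:
            break
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a | b)
    return clusters


# Hypothetical data: two well-separated groups on a line.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
result = single_linkage(points, delta=1.5)
print(sorted(sorted(c) for c in result))  # [[0, 1, 2], [3, 4]]
```

With a very small threshold every point stays a singleton, and with a very large one everything merges into a single cluster.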
slide-44
SLIDE 44

Example of Single-linkage clustering

slide-45
SLIDE 45

Example of Single-linkage clustering

slide-46
SLIDE 46

Example of Single-linkage clustering

slide-47
SLIDE 47

Example of Single-linkage clustering

slide-48
SLIDE 48

Example of Single-linkage clustering

slide-49
SLIDE 49

Example of Single-linkage clustering

slide-50
SLIDE 50

Example of Single-linkage clustering

slide-51
SLIDE 51

Example of Clustering algorithm with different parameters

slide-52
SLIDE 52

Example of Clustering algorithm with different parameters

slide-53
SLIDE 53

Example of Clustering algorithm with different parameters

slide-54
SLIDE 54

Example of Clustering algorithm with different parameters

slide-55
SLIDE 55

Parameters of graph neighborhood size

Parameter δ:

◮ Large δ: fewer nodes, clean Mapper, but far from the Reeb graph (more straight lines).
◮ Small δ: topological structure is present, but with lots of nodes (noisy).

slide-57
SLIDE 57

Higher Dimensional Parameter Spaces

◮ So far we have used one function, with $\mathbb{R}$ as our 1-dimensional parameter space.

◮ We can instead use M functions, with $\mathbb{R}^M$ as our M-dimensional parameter space. It then remains to find a covering of the M-dimensional hypercube defined by the ranges of the M functions.

slide-58
SLIDE 58

Example of parameter space R2

◮ Assume we have a 2-dimensional point cloud data set P as follows.
◮ Assume we have two filter functions $f : P \to \mathbb{R}$ and $g : P \to \mathbb{R}$, with $f = f^{-1}$ and $g = g^{-1}$.

slide-62
SLIDE 62

Example of parameter space R2

◮ Moreover, assume we have the following cover C, which is also a cover of P since $f = f^{-1}$ and $g = g^{-1}$.

◮ Assume the clustering algorithm groups all points in each rectangle into one cluster.

◮ Whenever the clusters corresponding to any n vertices have a non-empty common intersection, add a corresponding (n − 1)-simplex.
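The simplex rule can be sketched by brute force over all groups of clusters. The code below assumes clusters are given as sets of point indices; `nerve`, `max_dim` and the four toy clusters are hypothetical names and data for illustration, and the enumeration is exponential in the number of clusters, so it is only suitable for small examples.

```python
from itertools import combinations


def nerve(clusters, max_dim=3):
    """All simplices (tuples of cluster indices) up to dimension max_dim."""
    simplices = []
    for k in range(max_dim + 1):
        # A k-simplex needs k + 1 clusters with a common point.
        for combo in combinations(range(len(clusters)), k + 1):
            common = set.intersection(*(set(clusters[i]) for i in combo))
            if common:
                simplices.append(combo)
    return simplices


# Hypothetical data: four clusters that all share the point 9.
clusters = [{1, 2, 9}, {3, 4, 9}, {5, 6, 9}, {7, 8, 9}]
simplices = nerve(clusters)
by_dim = [sum(len(s) == d + 1 for s in simplices) for d in range(4)]
print(by_dim)  # [4, 6, 4, 1]: 4 vertices, 6 edges, 4 triangles, 1 tetrahedron
```

Four clusters with a common point therefore contribute a full tetrahedron together with all of its faces.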

slide-63
SLIDE 63

Example of parameter space R2

◮ Intersection of two clusters = 1 edge.

slide-64
SLIDE 64

Example of parameter space R2

◮ Intersection of three clusters = 1 triangle.

slide-65
SLIDE 65

Example of parameter space R2

◮ Intersection of four clusters = 1 tetrahedron.

slide-66
SLIDE 66

Example of parameter space R2

◮ Final simplicial complex.

slide-67
SLIDE 67

Higher Dimensional Parameter Spaces

Mapper can be extended to the parameter space $\mathbb{R}^M$ in a similar fashion (by finding a covering of the M-dimensional hypercube defined by the ranges of the M functions).
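
One simple way to cover the hypercube is to take the Cartesian product of a 1-dimensional interval cover per filter. This is a sketch of that particular choice, which is an assumption of the example rather than the only possible covering; the names `product_cover` and `pullback` and the sample values are hypothetical.

```python
from itertools import product


def product_cover(interval_covers):
    """Boxes formed by choosing one open interval from each axis cover."""
    return list(product(*interval_covers))


def pullback(box, filter_values):
    """Indices of points whose M filter values all lie inside the box."""
    return [
        i for i, values in enumerate(filter_values)
        if all(lo < v < hi for v, (lo, hi) in zip(values, box))
    ]


# Hypothetical example with M = 2 filters, each with range [0, 1].
cover_f = [(-0.1, 0.6), (0.4, 1.1)]
cover_g = [(-0.1, 0.6), (0.4, 1.1)]
boxes = product_cover([cover_f, cover_g])

# (f(p), g(p)) for three sample points.
filter_values = [(0.5, 0.5), (0.2, 0.9), (0.9, 0.1)]
membership = [[i in pullback(box, filter_values) for box in boxes]
              for i in range(len(filter_values))]
print(len(boxes), [sum(row) for row in membership])  # 4 boxes; counts [4, 1, 1]
```

The first sample point lies in all four overlapping boxes, which is the situation that produces a tetrahedron in the 2-dimensional example.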

slide-69
SLIDE 69

Mapper in Applications

Most commonly used in:

◮ Clustering
◮ Feature selection (flares, loops)

slide-70
SLIDE 70

Applications to Medical science data

The data describe 145 patients who had diabetes. For each patient, six quantities were measured:

◮ Age
◮ Relative weight
◮ Fasting plasma glucose
◮ Area under the plasma glucose curve for the three-hour glucose tolerance test (OGTT)
◮ Area under the plasma insulin curve for the OGTT
◮ Steady state plasma glucose response

This creates a 6-dimensional data set.

slide-72
SLIDE 72

Applications to Medical science data

◮ Applying projection pursuit methods to obtain a projection into three-dimensional Euclidean space

We want to use Mapper as an automatic tool for detecting such flares in the data.

slide-73
SLIDE 73

Applications to Medical science data

◮ Left: 3 intervals, 50% overlap.
◮ Right: 4 intervals, 50% overlap.
◮ For each output:
  ◮ Left flare: adult onset; right flare: juvenile onset
  ◮ Distance function: L2-distance
  ◮ Filter function: density kernel with ε = 130,000

slide-74
SLIDE 74

Mapper in Applications

◮ Innate and adaptive T cells in asthmatic patients: Relationship to severity and disease mechanisms, Hinks et al., J. Allergy Clinical Immunology, 2015
◮ Topological Data Analysis for Discovery in Preclinical Spinal Cord Injury and Traumatic Brain Injury, Nielson et al., Nature Communications, 2015
◮ Using Topological Data Analysis for Diagnosis Pulmonary Embolism, Rucco et al., arXiv preprint, 2014
◮ CD8 T-cell reactivity to islet antigens is unique to type 1 while CD4 T-cell reactivity exists in both type 1 and type 2 diabetes, Sarikonda et al., J. Autoimmunity, 2013
◮ Extracting insights from the shape of complex data using topology, Lum et al., Scientific Reports, 2013
◮ Topological Methods for Exploring Low-density States in Biomolecular Folding Pathways, Yao et al., J. Chemical Physics, 2009

slide-76
SLIDE 76

Summary

◮ Mapper: a computational method which retrieves a higher-level understanding of the structure of data.
◮ Mapper in the continuous setting
◮ Mapper in practice
◮ Parameters of Mapper in practice
  ◮ Filter function
  ◮ Covering
  ◮ Clustering algorithm
◮ Applications

slide-77
SLIDE 77

Sources

◮ [SMG07] G. Singh, F. Mémoli, G. Carlsson, Topological Methods for the Analysis of High Dimensional Data Sets and 3D Object Recognition, Eurographics Symposium on Point-Based Graphics 2007.
◮ Examples and images from Tutorial of topological data analysis part 3 (Mapper algorithm): https://www.slideshare.net/Eniod/tutorial-of-topological-data-analysis-part-3mapper-algorithm
◮ Examples and images from Introduction to Topological Data Analysis: https://www.slideshare.net/hendrikarisma/introduction-to-topological-data-analysis-59759836
◮ Examples and images from KeplerMapper: https://mlwave.github.io/kepler-mapper/