Computational Topology - Mapper
Jiaqi Ni
Eindhoven University of Technology
June 14, 2018
Outline
◮ Introduction
◮ Mapper in the continuous setting
◮ Mapper in practice
◮ Parameters of Mapper in practice
◮ Applications
◮ Summary
Introduction
◮ Mapper is a computational method for extracting simple descriptions of high-dimensional data sets in the form of simplicial complexes.
Recap about Reeb Graph
Definition: Given a continuous function f : X → R, the Reeb graph R(f) is the set of contours of f, i.e. the connected components of its level sets f⁻¹(t).
◮ With Mapper we can obtain a result similar to the Reeb graph.
◮ With Mapper we can also obtain results that differ from the Reeb graph.
Cover of space
If the set X is a topological space, then a cover C of X is a collection of subsets U of X whose union is the whole space X. In this case we say that C covers X, or that the sets U cover X.
[Figure: a topological space X and a cover of X]
If Y is a subset of X, then a cover of Y is a collection of subsets of X whose union contains Y, i.e., C is a cover of Y if Y ⊆ ⋃_{α∈C} U_α.
Cover refinement
◮ A refinement of a cover C of a topological space X is a new
cover D of X such that every set in D is contained in some set in C.
◮ Formally: D = {V_β}_{β∈B} is a refinement of C = {U_α}_{α∈A} when ∀β ∈ B ∃α ∈ A such that V_β ⊆ U_α.
[Figure: a space X, a cover of X, and a refinement of that cover]
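The two definitions above (cover and refinement) can be checked mechanically on a finite toy space. The following is a minimal sketch, assuming X is a finite set of integers and covers are lists of Python sets; the helper names `is_cover` and `is_refinement` are hypothetical and not part of the slides.

```python
def is_cover(sets, space):
    """Return True if the union of `sets` contains every point of `space`."""
    return set(space) <= set().union(*sets)


def is_refinement(fine, coarse):
    """Return True if every set in `fine` is contained in some set of `coarse`."""
    return all(any(v <= u for u in coarse) for v in fine)


# Toy example (assumed data): X = {0, ..., 5}
X = set(range(6))
C = [{0, 1, 2, 3}, {2, 3, 4, 5}]          # a cover of X
D = [{0, 1}, {1, 2, 3}, {3, 4}, {4, 5}]   # a finer cover of X

print(is_cover(C, X), is_cover(D, X))     # both lists cover X
print(is_refinement(D, C))                # every set of D sits inside a set of C
print(is_refinement(C, D))                # the converse fails
```

Note that a refinement must itself be a cover, so in practice both `is_cover(D, X)` and `is_refinement(D, C)` need to hold.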
Mapper in the continuous setting
Input:
◮ Continuous function (filter) f : X → R
◮ Cover C of im(f) by open intervals: im(f) ⊆ ⋃_{c∈C} c
Method:
◮ Compute the pullback cover 𝒰 of X: 𝒰 = {f⁻¹(c)}_{c∈C}
◮ Refine 𝒰 by separating each of its elements into its connected components → connected cover 𝒱
◮ The Mapper is the nerve of 𝒱:
  ◮ 1 vertex per element V ∈ 𝒱
  ◮ 1 edge per non-empty intersection V ∩ V′ ≠ ∅, with V, V′ ∈ 𝒱
  ◮ 1 k-simplex per non-empty (k + 1)-fold intersection ⋂_{i=0}^{k} V_i ≠ ∅, with V_0, V_1, ..., V_k ∈ 𝒱
Example of Mapper in the continuous setting
Mapper in practice
Input:
◮ Point cloud P with distance matrix
◮ Continuous function (filter) f : P → R
◮ Cover C of im(f) by open intervals: im(f) ⊆ ⋃_{c∈C} c
Method:
◮ Compute the pullback cover 𝒰 of P: 𝒰 = {f⁻¹(c)}_{c∈C}
◮ Refine 𝒰 by applying a clustering algorithm (with distance threshold δ) → connected cover 𝒱
◮ The Mapper is the nerve of 𝒱:
  ◮ 1 vertex per element V ∈ 𝒱
  ◮ 1 edge per non-empty intersection V ∩ V′ ≠ ∅, with V, V′ ∈ 𝒱
  ◮ 1 k-simplex per non-empty (k + 1)-fold intersection ⋂_{i=0}^{k} V_i ≠ ∅, with V_0, V_1, ..., V_k ∈ 𝒱
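The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it only builds the 1-skeleton (vertices and edges) of the nerve, and it assumes that clustering means taking connected components of the graph that joins points at distance at most δ (which coincides with single-linkage clustering cut at height δ). The function names `threshold_clusters` and `mapper_graph`, the sampled circle, and the chosen intervals are all assumptions made for the example.

```python
from itertools import combinations
from math import cos, dist, pi, sin


def threshold_clusters(indices, points, delta):
    """Connected components of the graph joining points at distance <= delta."""
    indices = list(indices)
    parent = {i: i for i in indices}

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(indices, 2):
        if dist(points[i], points[j]) <= delta:
            parent[find(i)] = find(j)

    clusters = {}
    for i in indices:
        clusters.setdefault(find(i), set()).add(i)
    return list(clusters.values())


def mapper_graph(points, filter_values, intervals, delta):
    """Return (vertices, edges): clusters of the pullback cover and their overlaps."""
    vertices = []
    for lo, hi in intervals:
        # Pullback of the open interval (lo, hi) under the filter.
        preimage = [i for i, v in enumerate(filter_values) if lo < v < hi]
        vertices.extend(threshold_clusters(preimage, points, delta))
    edges = [(a, b) for a, b in combinations(range(len(vertices)), 2)
             if vertices[a] & vertices[b]]
    return vertices, edges


# Example (assumed data): 16 points on the unit circle, filter = height (y-coordinate).
points = [(cos(2 * pi * k / 16), sin(2 * pi * k / 16)) for k in range(16)]
heights = [p[1] for p in points]
intervals = [(-1.1, -0.3), (-0.5, 0.5), (0.3, 1.1)]
vertices, edges = mapper_graph(points, heights, intervals, delta=0.5)
print(len(vertices), "vertices,", len(edges), "edges")
```

On this sample the middle interval splits into a left and a right cluster, so the resulting graph is a cycle on four vertices, matching the Reeb graph of the height function on a circle.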
Example of Mapper in practice
Parameters of Mapper in practice
◮ Filter f : P → R
◮ Cover C of im(f) by open intervals
◮ Clustering algorithm
Parameters of Mapper in practice - Filter functions
◮ The outcome of Mapper depends heavily on the function chosen to partition (filter) the data set, and the choice of function depends mostly on the data set itself.
◮ Possible functions:
  ◮ Density
  ◮ Eccentricity
  ◮ Graph Laplacians
  ◮ sum/average/max/min
  ◮ x/y-axis projection
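Three of the filters listed above are easy to write down explicitly. The sketch below assumes Euclidean distance and uses common textbook forms: eccentricity as the p-th power mean of distances to all other points, and density as an unnormalized Gaussian kernel sum with bandwidth parameter `eps`. The function names and the omission of the normalizing constant are choices made for this illustration.

```python
from math import dist, exp


def eccentricity(points, p=1):
    """E_p(x) = (mean over y of d(x, y)**p) ** (1/p); large for points far from the bulk."""
    n = len(points)
    return [(sum(dist(x, y) ** p for y in points) / n) ** (1 / p) for x in points]


def gaussian_density(points, eps):
    """Unnormalized kernel density: sum over y of exp(-d(x, y)**2 / eps)."""
    return [sum(exp(-dist(x, y) ** 2 / eps) for y in points) for x in points]


def axis_projection(points, axis=0):
    """Coordinate projection, e.g. axis=0 for x and axis=1 for y."""
    return [x[axis] for x in points]


# Example (assumed data): three nearby points and one outlier.
sample = [(0, 0), (1, 0), (2, 0), (10, 0)]
ecc = eccentricity(sample)
dens = gaussian_density(sample, eps=1.0)
print(ecc)
print(dens)
print(axis_projection(sample))
```

The outlier receives the largest eccentricity and the smallest density, which is why these two filters tend to push flares and sparse regions to opposite ends of the filter range.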
Filter function examples
Parameters of Mapper in practice - Cover
◮ Uniform cover I
  ◮ resolution / granularity: r (diameter of intervals)
  ◮ gain: g (percentage of overlap)
◮ Changing r and g can strongly affect the result.
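A uniform cover with resolution r and gain g can be generated as follows. This is a sketch under one specific reading of the parameters, which is an assumption: every interval has length r, and consecutive intervals overlap in a piece of length g·r, so the left endpoints advance by r·(1 − g). The function name `uniform_cover` is hypothetical.

```python
def uniform_cover(lo, hi, r, g):
    """Open intervals of length r covering [lo, hi]; neighbours overlap by g * r."""
    if not (r > 0 and 0 < g < 1):
        raise ValueError("need r > 0 and 0 < g < 1")
    step = r * (1 - g)
    intervals = []
    # Start slightly left of lo so that lo lies strictly inside an open interval.
    start = lo - g * r / 2
    while True:
        intervals.append((start, start + r))
        if start + r > hi:  # hi is now strictly inside the last interval
            break
        start += step
    return intervals


cover = uniform_cover(0.0, 1.0, r=0.5, g=0.5)
for interval in cover:
    print(interval)
```

Decreasing r or increasing g both increase the number of intervals, which is one way to see why the Mapper output is sensitive to these two parameters.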
Cover examples
Mapper for Y-shape point cloud data
Parameters of uniform Cover
Parameter r:
◮ Small r: fine cover, Mapper close to the Reeb graph, but sensitive to δ.
◮ Large r: coarse cover, less sensitive to δ, but Mapper far from the Reeb graph.
Parameter g:
◮ Large g (close to 1): more points inside intersections, less sensitive to δ, but far from the Reeb graph.
◮ Small g (close to 0): controlled Mapper dimension, close to the Reeb graph.
Parameters of Mapper in practice - Clustering algorithm
Single-linkage clustering is one of several methods of hierarchical clustering.
◮ It groups clusters in a bottom-up fashion (agglomerative clustering).
◮ At each step it combines the two clusters that contain the closest pair of elements not yet belonging to the same cluster as each other.
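The merging rule described above can be written as a short, deliberately naive loop. The sketch below assumes Euclidean distance and an optional stopping threshold `delta` (merging stops once the closest pair of clusters is farther apart than `delta`); the function name `single_linkage` and the sample data are assumptions for illustration.

```python
from itertools import combinations
from math import dist


def single_linkage(points, delta=float("inf")):
    """Naive agglomerative single linkage.

    Repeatedly merges the two clusters that contain the closest pair of points
    not yet in the same cluster; stops when that distance exceeds delta.
    Returns (clusters, merge_heights).
    """
    clusters = [{i} for i in range(len(points))]
    heights = []
    while len(clusters) > 1:
        # Single-linkage distance between two clusters = distance of their closest pair.
        d, a, b = min(
            (min(dist(points[i], points[j]) for i in clusters[a] for j in clusters[b]), a, b)
            for a, b in combinations(range(len(clusters)), 2)
        )
        if d > delta:
            break
        heights.append(d)
        clusters[a] |= clusters[b]
        del clusters[b]  # a < b, so index a is unaffected
    return clusters, heights


# Example (assumed data): five points on a line.
pts = [(0,), (1,), (3,), (7,), (8,)]
clusters, heights = single_linkage(pts, delta=2.5)
print(clusters)
print(heights)
```

Running the same function without a threshold records every merge height, which is the information a dendrogram displays; cutting at δ corresponds to keeping only merges with height at most δ.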
Example of Single-linkage clustering
Example of Clustering algorithm with different parameters
Parameters of graph neighborhood size
Parameter δ:
◮ Large δ: fewer nodes, clean Mapper, but far from the Reeb graph (more straight lines).
◮ Small δ: topological structure is present, but there are lots of nodes (noisy).
Higher Dimensional Parameter Spaces
◮ So far we use one function and let R be our 1-dimensional parameter space.
◮ We can also use M functions and let R^M be our M-dimensional parameter space. It then remains to find a covering of the M-dimensional hypercube defined by the ranges of the M functions.
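One simple way to cover the M-dimensional hypercube is to take the product of M one-dimensional interval covers, giving a cover by open boxes. The sketch below assumes this product construction (other covers are possible); the names `product_cover` and `pullback` and the sample values are hypothetical.

```python
from itertools import product


def product_cover(interval_covers):
    """Cartesian product of M one-dimensional interval covers: a list of open boxes."""
    return [tuple(box) for box in product(*interval_covers)]


def pullback(filter_values, box):
    """Indices of points whose M filter values all lie strictly inside the box."""
    return [i for i, vals in enumerate(filter_values)
            if all(lo < v < hi for v, (lo, hi) in zip(vals, box))]


# Example (assumed data): M = 2 filters with ranges in [0, 1], two intervals each.
cover_f = [(-0.1, 0.6), (0.4, 1.1)]
cover_g = [(-0.1, 0.6), (0.4, 1.1)]
boxes = product_cover([cover_f, cover_g])

# (f(p), g(p)) for three points p
filter_values = [(0.5, 0.5), (0.2, 0.9), (0.9, 0.1)]
preimages = [pullback(filter_values, box) for box in boxes]
print(len(boxes), "boxes")
print(preimages)
```

Each non-empty preimage would then be clustered exactly as in the 1-dimensional case; the only change is that more than two boxes can overlap at once, which is what produces higher-dimensional simplices.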
Example of parameter space R2
◮ Assume we have a 2-dimensional point cloud data set P (shown in the figure).
◮ Assume we have two filter functions f : P → R and g : P → R, with f = f⁻¹ and g = g⁻¹.
◮ Moreover, assume we have the following cover C, which is also a cover of P since f = f⁻¹ and g = g⁻¹.
◮ Assume the clustering algorithm groups all points in each rectangle into one cluster.
◮ Whenever the clusters corresponding to any n vertices have a non-empty intersection, add a corresponding (n − 1)-simplex.
◮ Intersection of two clusters = 1 edge.
◮ Intersection of three clusters = 1 triangle.
◮ Intersection of four clusters = 1 tetrahedron.
◮ Final simplicial complex.
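The rule "n clusters with a common point give an (n − 1)-simplex" is the nerve construction, and it can be computed by brute force for small examples. The following sketch assumes clusters are given as Python sets of point indices; the function name `nerve`, the `max_dim` cutoff, and both toy inputs are assumptions for illustration.

```python
from itertools import combinations


def nerve(clusters, max_dim=3):
    """All simplices (tuples of cluster indices) of the nerve up to dimension max_dim."""
    simplices = [(i,) for i in range(len(clusters))]
    for k in range(1, max_dim + 1):
        for combo in combinations(range(len(clusters)), k + 1):
            # A k-simplex is added when the k + 1 clusters share at least one point.
            if set.intersection(*(clusters[i] for i in combo)):
                simplices.append(combo)
    return simplices


# Example 1 (assumed data): three mutually overlapping clusters, one attached, one isolated.
clusters = [{0, 1}, {1, 2, 3}, {2, 3, 4}, {3, 5}, {6}]
complex_1 = nerve(clusters)
print(complex_1)

# Example 2 (assumed data): four clusters sharing the point 9 give a full tetrahedron.
corner = [{0, 9}, {1, 9}, {2, 9}, {3, 9}]
complex_2 = nerve(corner)
print(len(complex_2), "simplices")
```

The brute-force enumeration is exponential in the number of clusters, so it is only meant for small examples such as the rectangle cover above.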
Higher Dimensional Parameter Spaces
Mapper can be extended to the parameter space R^M in a similar fashion (by finding a covering of the M-dimensional hypercube defined by the ranges of the M functions).
Mapper in Applications
Most commonly used in:
◮ Clustering
◮ Feature selection (flares, loops)
Applications to Medical science data
The data set covers 145 patients who had diabetes. For each patient, six quantities were measured:
◮ Age
◮ Relative weight
◮ Fasting plasma glucose
◮ Area under the plasma glucose curve for the three-hour glucose tolerance test (OGTT)
◮ Area under the plasma insulin curve for the OGTT
◮ Steady state plasma glucose response
This creates a 6-dimensional data set.
Applications to Medical science data
◮ Applying projection pursuit methods gives a projection into three-dimensional Euclidean space.
We want to use Mapper as an automatic tool for detecting such flares in the data.
Applications to Medical science data
◮ Left: 3 intervals, 50% overlap.
◮ Right: 4 intervals, 50% overlap.
◮ For each output:
  ◮ Left flare: adult onset. Right flare: juvenile onset.
  ◮ Distance function: L2-distance
  ◮ Filter function: density kernel with ε = 130,000
Mapper in Applications
◮ Innate and adaptive T cells in asthmatic patients: Relationship to severity and disease mechanisms, Hinks et al., J. Allergy Clinical Immunology, 2015
◮ Topological Data Analysis for Discovery in Preclinical Spinal
Cord Injury and Traumatic Brain Injury, Nielson et al., Nature, 2015
◮ Using Topological Data Analysis for Diagnosis Pulmonary
Embolism, Rucco et al., arXiv preprint, 2014
◮ CD8 T-cell reactivity to islet antigens is unique to type 1
while CD4 T-cell reactivity exists in both type 1 and type 2 diabetes, Sarikonda et al., J. Autoimmunity, 2013
◮ Extracting insights from the shape of complex data using
topology, Lum et al., Nature, 2013
◮ Topological Methods for Exploring Low-density States in
Biomolecular Folding Pathways, Yao et al., J. Chemical Physics, 2009
Summary
◮ Mapper: a computational method which retrieves a
higher-level understanding of the structure of data.
◮ Mapper in the continuous setting.
◮ Mapper in practice.
◮ Parameters of Mapper in practice:
  ◮ filter function.
  ◮ covering algorithm.
  ◮ clustering algorithm.