
An Introduction to Topological Data Analysis
Yuan Yao
Department of Mathematics, HKUST
April 22, 2020

Outline
1 Why Topological Methods?
   Methods for Visualizing a Data Geometry
2 Simplicial Complex for Data Representation
   Simplicial Complex
   Nerve, Reeb Graph, and Mapper
   Applications of Mapper Graph
   Čech, Vietoris-Rips, and Witness Complexes
3 Persistent Homology
   Betti Numbers
   Betti Number at Different Scales
   Applications: H1N1 Evolution, Sensor Network Coverage, Natural Image Patches


Methods for Imposing a Geometry
Figure: Define a metric

Methods for Summarizing or Visualizing a Geometry
Figure: Linear projection (PCA, MDS, etc.; Euclidean metric)

Methods for Summarizing or Visualizing a Geometry
Figure: Nonlinear dimensionality reduction (ISOMAP, LLE, etc.; Riemannian metric)

Geometric Data Reduction
The general method of manifold learning takes the following Spectral Kernel Embedding approach:
• construct a neighborhood graph of the data, G
• construct a positive semi-definite kernel on the graph, K
• find global embedding coordinates of the data by eigen-decomposition of K = Y Y^T
Sometimes the ‘distance metric’ is just a similarity measure (nonmetric MDS, ordinal embedding)
Sometimes coordinates are not a good way to organize/visualize the data (e.g. d > 3)
Sometimes all that is required is a qualitative view
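The three steps of the spectral kernel embedding approach can be sketched as follows. This is a minimal ISOMAP-style illustration, not the method from the slides: the function names (knn_graph, graph_kernel, embed), the choice of a symmetric k-nearest-neighbor graph, and the shortest-path kernel are all assumptions made for the example, and the graph is assumed to be connected.

```python
import numpy as np

def knn_graph(X, k):
    """Step 1: symmetric k-nearest-neighbor graph G with Euclidean edge lengths.

    Missing edges are encoded as infinity.
    """
    n = X.shape[0]
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    G = np.full((n, n), np.inf)
    np.fill_diagonal(G, 0.0)
    for i in range(n):
        nbrs = np.argsort(D[i])[1:k + 1]  # index 0 is the point itself
        G[i, nbrs] = D[i, nbrs]
        G[nbrs, i] = D[i, nbrs]
    return G

def graph_kernel(G):
    """Step 2: kernel K built from shortest-path distances on G.

    K = -1/2 H S^2 H, where S holds shortest-path lengths and H centers the data.
    Assumes G is connected, so every entry of S is finite.
    """
    n = G.shape[0]
    S = G.copy()
    for m in range(n):  # Floyd-Warshall, one intermediate vertex at a time
        S = np.minimum(S, S[:, [m]] + S[[m], :])
    H = np.eye(n) - np.ones((n, n)) / n
    return -0.5 * H @ (S ** 2) @ H

def embed(K, dim):
    """Step 3: coordinates Y with K approximately equal to Y Y^T."""
    evals, evecs = np.linalg.eigh(K)
    top = np.argsort(evals)[::-1][:dim]
    lam = np.clip(evals[top], 0.0, None)  # guard against small negative eigenvalues
    return evecs[:, top] * np.sqrt(lam)

# Usage: six points on a line are recovered, up to sign and shift, in one dimension.
X = np.array([[float(t), 0.0] for t in range(6)])
Y = embed(graph_kernel(knn_graph(X, k=2)), dim=1)
```

Other manifold learning methods fit the same template with a different kernel in step 2; only the shortest-path variant is shown here.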

Methods for Summarizing or Visualizing a Geometry
Figure: Clustering the data

Methods for Summarizing or Visualizing a Geometry
Figure: Cluster trees: average, complete, and single linkage. From Introduction to Statistical Learning with Applications in R.

Hierarchical Cluster Trees
1 Start with each data point as its own cluster;
2 Repeatedly merge the two “closest” clusters, where the notion of “distance” between two clusters is given by:
• Single linkage: closest pair of points
• Complete linkage: furthest pair of points
• Average linkage (several variants):
   (i) distance between centroids
   (ii) average pairwise distance
   (iii) Ward’s method: increase in k-means cost due to merger
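The merging procedure above can be sketched in a few lines. This is a naive illustration rather than an efficient implementation: the names linkage_distance and agglomerate are hypothetical, distances are assumed to be Euclidean, and "average" here means variant (ii), the average pairwise distance.

```python
import math
from itertools import combinations

def linkage_distance(A, B, points, method):
    """Distance between clusters A and B (tuples of point indices)."""
    d = [math.dist(points[i], points[j]) for i in A for j in B]
    if method == "single":
        return min(d)            # closest pair of points
    if method == "complete":
        return max(d)            # furthest pair of points
    if method == "average":
        return sum(d) / len(d)   # average pairwise distance, variant (ii)
    raise ValueError(f"unknown linkage method: {method}")

def agglomerate(points, method="single"):
    """Return the list of merges as (cluster_a, cluster_b, height)."""
    clusters = [(i,) for i in range(len(points))]  # each point starts alone
    merges = []
    while len(clusters) > 1:
        # Find the pair of clusters that is closest under the chosen linkage.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ab: linkage_distance(
                clusters[ab[0]], clusters[ab[1]], points, method
            ),
        )
        height = linkage_distance(clusters[a], clusters[b], points, method)
        merges.append((clusters[a], clusters[b], height))
        merged = tuple(sorted(clusters[a] + clusters[b]))
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges

# Usage: the merge heights are the heights of the internal nodes of the cluster tree.
pts = [(0.0,), (1.0,), (3.0,), (7.0,)]
heights = [h for _, _, h in agglomerate(pts, "single")]
```

The same data can give different trees under different linkages, which is what the three panels of the cluster-tree figure compare.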

Methods for Summarizing or Visualizing a Geometry
Figure: Define a graph or network structure

Topology
Origins of Topology in Math
• Leonhard Euler 1736, Seven Bridges of Königsberg
• Johann Benedict Listing 1847, Vorstudien zur Topologie
• J.B. Listing (obituary) Nature 27:316-317, 1883. “qualitative geometry from the ordinary geometry in which quantitative relations chiefly are treated.”

RNA hairpin folding pathways
Figure: Jointly with Xuhui Huang, Jian Sun, Greg Bowman, Gunnar Carlsson, Leo Guibas, and Vijay Pande, JACS’08, JCP’09

Differentiation process from murine embryonic stem cells to motor neurons
Figure: Mapper graph of single cell data, where the different regions in the Mapper graph nicely line up with different points along the differentiation timeline. Region labels in the figure: pluripotent cells, neural precursors, progenitors, neurons. Panels show gene groups 1a, 1b, 2, and 3 on a log2(1+TPM) color scale. Rizvi et al. Nature Biotechnol. 35.6 (2017), 551-560.

Key elements
Coordinate free representation
Invariance under deformations
Compressed qualitative representation

Topology in continuous spaces
Treating points in the same neighborhood as equivalent requires distorting distances, i.e. stretching and shrinking
We do not permit tearing, i.e. distorting distances in a discontinuous way

Continuous Topology
Figure: Homeomorphic


Discrete case?
How does topology make sense in a discrete and noisy setting?

Properties of Data Geometry
Fact: We Don’t Trust Large Distances!
In the life or social sciences, distances (metrics) are constructed from a notion of similarity (proximity), but have no theoretical backing (e.g. distance between faces, gene expression profiles, Jukes-Cantor distance between sequences)
Small distances still represent similarity (proximity), but long-distance comparisons hardly make sense

Properties of Data Geometry
Fact: We Only Trust Small Distances a Bit!
Both pairs are regarded as similar, but the strength of the similarity as encoded by the distance may not be so significant
Similar objects lie in a neighborhood of each other, which suffices to define a topology

Properties of Data Geometry
Fact: Even Local Connections are Noisy, depending on the observer’s scale!
Is it a circle, dots, or a circle of circles?
To see the circle, we ignore variations in small distances (tolerance for proximity)

So we need a topology that is robust against metric distortions
Distance measurements are noisy
Physical devices such as human eyes may ignore differences in proximity (or register them as an average effect)
Topology is the crudest way to capture invariants under distortions of distances
In the presence of noise, one needs a topology that varies with scale
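The simplest topological summary that varies with scale is the number of connected components (the zeroth Betti number) of the graph that joins points closer than a tolerance eps. The sketch below counts components with a union-find structure; the function name count_components and the Euclidean distance are assumptions made for the example.

```python
import math

def count_components(points, eps):
    """Number of connected components when points within eps are joined."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, halving the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= eps:
                parent[find(i)] = find(j)  # merge the two components

    return len({find(i) for i in range(len(points))})

# Usage: two tight groups of three points each, far apart from one another.
pts = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
counts = [count_components(pts, eps) for eps in (0.05, 0.2, 10.0)]
```

At the smallest tolerance every point is its own component, at an intermediate tolerance the two groups appear, and at a large tolerance everything merges. No single scale is "correct", which is why the count is tracked across scales.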

What kind of topology?
Topology studies (global) mappings between spaces
Point-set topology: continuous mappings on open sets
Differential topology: differentiable mappings on smooth manifolds
• Morse theory tells us topology of continuous space can be learned by discrete information on critical points
Algebraic topology: homomorphisms on algebraic structures, the most concise encoder for topology
Combinatorial topology: mappings on simplicial (cell) complexes
• Simplicial complex may be constructed from data
• Algebraic, differential structures can be defined here
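As one illustration of constructing a simplicial complex from data, the sketch below builds a Vietoris-Rips complex, one of the constructions named in the outline. The function name vietoris_rips is hypothetical, and the convention that a simplex is included when all pairwise distances are at most eps is an assumption (some references use 2·eps). The brute-force enumeration is only practical for small point sets.

```python
import math
from itertools import combinations

def vietoris_rips(points, eps, max_dim=2):
    """Simplices (as index tuples) whose vertices are pairwise within eps."""
    n = len(points)

    def close(i, j):
        return math.dist(points[i], points[j]) <= eps

    simplices = [(i,) for i in range(n)]  # every point is a 0-simplex
    for size in range(2, max_dim + 2):    # size = dimension + 1
        for s in combinations(range(n), size):
            if all(close(i, j) for i, j in combinations(s, 2)):
                simplices.append(s)
    return simplices

# Usage: corners of the unit square.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
cycle = vietoris_rips(square, eps=1.0)   # 4 vertices and 4 edges: a hollow loop
filled = vietoris_rips(square, eps=1.5)  # diagonals join in, so triangles appear
```

With eps = 1.0 the complex is a loop with a hole; once eps exceeds the diagonal length the hole is filled in, so the recovered topology again depends on the scale.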
