On Efficient Low Distortion Ultrametric Embedding

Vincent Cohen-Addad -- CNRS & Google Zürich
Karthik C. S. -- Tel Aviv University
Guillaume Lagarde -- LaBRI

“Flat” clustering
- Cluster analysis
- Features for machine learning
- Data compression
- etc.

Hierarchical clustering
● Recursive partitioning of the data
● n points → 2n-1 nested clusters with different granularities

Ultrametric
● Recursive partitioning of the data
● n points → 2n-1 nested clusters with different granularities
● Ultrametric: a metric where the triangle inequality
    d(x,z) ≤ d(x,y) + d(y,z)
  is strengthened to the ultrametric inequality
    d(x,z) ≤ max(d(x,y), d(y,z))
● d(x,y) = value of the lowest common ancestor
[Figure: tree with node values 150, 100, 111, 34, 41, 11, 14]
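The two statements on this slide can be checked on a toy example. The sketch below is only illustrative: the three-leaf tree, its node values, and the helper names (`is_ultrametric`, `tree_distance`) are assumptions made for the example, not taken from the talk. It reads d(x,y) off the lowest common ancestor and verifies the ultrametric inequality by brute force.

```python
from itertools import permutations

def is_ultrametric(items, d, tol=1e-9):
    """Brute-force check of d(x,z) <= max(d(x,y), d(y,z)) over all triples."""
    return all(d(x, z) <= max(d(x, y), d(y, z)) + tol
               for x, y, z in permutations(items, 3))

# Hypothetical tree: root r (value 10) has children u (value 4) and leaf c;
# u has the two leaves a and b as children.
parent = {"a": "u", "b": "u", "u": "r", "c": "r", "r": None}
value = {"u": 4.0, "r": 10.0}

def ancestors(x):
    """Path from x up to the root, x included."""
    chain = []
    while x is not None:
        chain.append(x)
        x = parent[x]
    return chain

def tree_distance(x, y):
    """d(x,y) = value of the lowest common ancestor of leaves x and y."""
    if x == y:
        return 0.0
    seen = set(ancestors(x))
    for node in ancestors(y):
        if node in seen:
            return value[node]

leaves = ["a", "b", "c"]
tree_ok = is_ultrametric(leaves, tree_distance)
# Points 0, 1, 2 on a line form a metric but not an ultrametric: 2 > max(1, 1).
line_ok = is_ultrametric([0, 1, 2], lambda x, y: abs(x - y))
```

Here `tree_ok` holds because node values increase towards the root, while `line_ok` fails on the triple (0, 1, 2).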

Agglomerative algorithms
● average-linkage, single-linkage, Ward’s method, complete-linkage, …
● Produce an embedding of a metric into an ultrametric
● Bottom-up: proceed by agglomerating the pair of clusters of minimum dissimilarity
● Major drawback: quadratic running time
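As a rough illustration of the bottom-up scheme, here is a deliberately naive sketch. The function name `agglomerate` and its `linkage` parameter are assumptions for this example; it rescans all cluster pairs at every step, so it is even slower than the quadratic implementations the slide refers to. Passing `min` gives single-linkage and `max` gives complete-linkage; the height at which two points end up in the same cluster is their ultrametric distance.

```python
def agglomerate(n, w, linkage=min):
    """Naive agglomerative clustering on points 0..n-1 with dissimilarity w(i, j).

    Returns delta, where delta[i][j] is the dissimilarity at which the clusters
    containing i and j were merged.
    """
    clusters = [[i] for i in range(n)]
    delta = [[0.0] * n for _ in range(n)]

    def dissim(a, b):
        # Cluster dissimilarity: linkage applied to all cross pairs.
        return linkage(w(x, y) for x in a for y in b)

    while len(clusters) > 1:
        # Pick the pair of clusters of minimum dissimilarity.
        a, b = min(
            ((i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda p: dissim(clusters[p[0]], clusters[p[1]]),
        )
        height = dissim(clusters[a], clusters[b])
        for x in clusters[a]:
            for y in clusters[b]:
                delta[x][y] = delta[y][x] = height
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe because b > a
    return delta

xs = [0.0, 1.0, 3.0, 7.0]  # four points on a line, chosen for the example
w = lambda i, j: abs(xs[i] - xs[j])
single = agglomerate(len(xs), w, linkage=min)
complete = agglomerate(len(xs), w, linkage=max)
```

On this input, single-linkage never exceeds the input distances while complete-linkage never falls below them, which is why the two behave so differently as embeddings.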

Ultrametric
Goal: given some dataset, efficiently find its best ultrametric representation
Wait… the best?

Problem statement
BEST ULTRAMETRIC FIT (BUF_∞)
INPUT:
● a set $V$ of $n$ elements $v_1, v_2, \dots, v_n$
● a weight function $w : V \times V \to \mathbb{R}$
OUTPUT:
● an ultrametric $\Delta$ such that $w(v_i, v_j) \le \Delta(v_i, v_j) \le \beta \cdot w(v_i, v_j)$ for the minimal value $\beta$.
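To make the objective concrete, the following sketch computes the factor β achieved by a candidate ultrametric. The function name `beta` and the three-point instance are assumptions for the example; the code only evaluates a given Δ and does not search for the best one.

```python
from itertools import combinations

def beta(n, w, delta, tol=1e-9):
    """Smallest b with w(i,j) <= delta(i,j) <= b * w(i,j) for all pairs i < j.

    Raises ValueError if delta does not dominate w, since the problem
    statement requires w <= delta.
    """
    worst = 1.0
    for i, j in combinations(range(n), 2):
        if delta(i, j) < w(i, j) - tol:
            raise ValueError(f"delta is below w on pair ({i}, {j})")
        worst = max(worst, delta(i, j) / w(i, j))
    return worst

# Hypothetical instance: three points on a line at positions 0, 1, 2.
w_table = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 2.0}
# Candidate ultrametric: {0, 1} merged at height 1, then joined with 2 at height 2.
d_table = {(0, 1): 1.0, (1, 2): 2.0, (0, 2): 2.0}

b = beta(3, lambda i, j: w_table[(i, j)], lambda i, j: d_table[(i, j)])
```

In this instance the pair (1, 2) is stretched from 1 to 2, so `b` is 2.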

Main results
Theorem 1 (upper bound)
There are algorithms that produce, for Euclidean instances of BUF_∞ ($V = \mathbb{R}^d$, $w(v_i, v_j) = \|v_i - v_j\|_2$):
● for any $\gamma > 1$, a $5\gamma$-approximation in time $O(nd + n^{1+O(1/\gamma^2)})$
● a $\sqrt{\log n}$-approximation in time $O(nd + n \log^2 n)$
Theorem 2 (lower bounds) -- informal statement
● Assuming the Strong Exponential Time Hypothesis (SETH), there is no algorithm running in subquadratic time that can approximate BUF_∞ within a factor $3/2 - o(1)$ for the $L_\infty$ norm ($w(v_i, v_j) = \|v_i - v_j\|_\infty$)
● + another lower bound for the Euclidean metric under a “Colinearity Hypothesis”
SETH: SAT can’t be solved in time $2^{n(1-o(1))}$

Related work
● [CM10] (Carlsson and Mémoli) → study of linkage algorithms
● [Das15] (Dasgupta) → what is a good hierarchical clustering? (cost functions)
● [MW17] (Moseley and Wang), [CAKMTM18] (Cohen-Addad, Kanade, Mallmann-Trenn, Mathieu) → good approximation guarantees for average-linkage for the (dual of) Dasgupta’s cost function & new algorithms, ‘beyond-worst-case’ scenario
● [CM15] (Cochez and Mou), [ACH19] (Abboud, Cohen-Addad, and Houdrouge) → subquadratic running time implementation of average-linkage and Ward’s method
● + many others [RP16, CC17, CAKMT17, CCN19, CCNY18, ...]

Starting point
● Solves a slightly more general problem
● Provides an algorithm that runs in $O(n^2)$ (given that queries to $w$ are done in constant time)
● This algorithm is optimal

APPROX-BUF: an approximation algorithm for BUF_∞
1. Compute a γ-approximate MST T over the complete graph G
2. Compute an α-estimate of the cut weights of the edges in T
3. Compute the cartesian tree
→ This gives a γ · α-approximation to BUF_∞
[Figure: example tree labeled with the values 10, 43, 13, 117, 80, 23, 8, 50, 61]
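The three steps can be sketched with exact, quadratic-time stand-ins for the approximate subroutines. This is an illustration under stated assumptions, not the talk's implementation: it uses Prim's algorithm for an exact MST and brute-force cut weights, and it assumes the cut weight of a tree edge e is the largest w(x, y) over pairs whose tree path has e as its heaviest edge. The names `prim_mst`, `merge_in_order`, and `fit_ultrametric` are made up for the example. The cartesian tree is left implicit: sweeping the tree edges by increasing cut weight and recording the weight at which two points first meet gives the value of their lowest common ancestor.

```python
import math

def prim_mst(n, w):
    """Exact MST of the complete graph on 0..n-1; returns a list of (weight, u, v)."""
    in_tree = [False] * n
    best = [math.inf] * n
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v] and w(u, v) < best[v]:
                best[v], parent[v] = w(u, v), u
    return edges

def merge_in_order(n, edges, on_merge):
    """Process tree edges by increasing key, calling on_merge(index, side_a, side_b)
    with the two components the edge joins, then merging them."""
    members = {i: [i] for i in range(n)}  # representative -> component members
    rep = list(range(n))
    for idx in sorted(range(len(edges)), key=lambda i: edges[i][0]):
        _, u, v = edges[idx]
        a, b = rep[u], rep[v]
        on_merge(idx, members[a], members[b])
        if len(members[a]) < len(members[b]):
            a, b = b, a  # relabel the smaller side
        for x in members[b]:
            rep[x] = a
        members[a].extend(members.pop(b))

def fit_ultrametric(n, w):
    tree = prim_mst(n, w)                          # step 1 (exact here)
    cut = [0.0] * len(tree)

    def record_cut(idx, side_a, side_b):           # step 2 (brute force here)
        cut[idx] = max(w(x, y) for x in side_a for y in side_b)

    merge_in_order(n, tree, record_cut)

    delta = [[0.0] * n for _ in range(n)]
    by_cut = [(cut[i], u, v) for i, (_, u, v) in enumerate(tree)]

    def record_delta(idx, side_a, side_b):         # step 3: LCA value = cut weight
        for x in side_a:
            for y in side_b:
                delta[x][y] = delta[y][x] = by_cut[idx][0]

    merge_in_order(n, by_cut, record_delta)
    return delta

pts = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.5, 0.0), (10.0, 1.0)]  # example data
dist = lambda i, j: math.dist(pts[i], pts[j])
delta = fit_ultrametric(len(pts), dist)
```

By construction the output dominates the input distances, since every pair is assigned at least the cut weight of the heaviest edge on its tree path.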

APPROX-BUF: fast implementation in Euclidean space of dimension d
Step 1 (γ-approximate MST): based on γ-spanner constructions using Har-Peled, Indyk, Sidiropoulos
● Any γ > 1: locality sensitive hash family (Andoni and Indyk), running time $O(nd + n^{1+O(1/\gamma^2)})$
● γ = √(log n): Lipschitz partitions (Charikar et al.), running time $O(nd + n \log^2 n)$

APPROX-BUF: fast implementation in Euclidean space of dimension d
Step 2 (α-estimate of the cut weights): tweak a union-find data structure and compute the cut weights bottom-up
● α = 5 (triangle inequality)
● Running time: $O(nd + n \log n)$
→ This gives a γ · 5-approximation to BUF_∞
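One way to picture a bottom-up estimate is a union-find that stores, per component, a center point and an upper bound on the distance from that center to any member. The sketch below is a hypothetical estimator written for illustration, not the one from the talk, and it makes no claim to the factor-5 guarantee. It only shows how the triangle inequality yields an upper bound on the largest cross-component distance in constant time per merge (plus the `find` cost). The class name `Components` and its fields are assumptions.

```python
import math

class Components:
    """Union-find over points, with a center and a radius bound per component."""

    def __init__(self, points):
        self.points = points
        self.parent = list(range(len(points)))
        self.center = list(range(len(points)))  # index of a member used as center
        self.radius = [0.0] * len(points)       # bound on dist(center, any member)

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def merge(self, u, v):
        """Join the components of u and v (assumed distinct) and return an upper
        bound on the largest distance between a point of one and a point of the other."""
        a, b = self.find(u), self.find(v)
        gap = math.dist(self.points[self.center[a]], self.points[self.center[b]])
        # Triangle inequality: d(x, y) <= d(x, c_a) + d(c_a, c_b) + d(c_b, y).
        estimate = self.radius[a] + gap + self.radius[b]
        self.parent[b] = a
        # The center of a is kept; members of b are within gap + radius[b] of it.
        self.radius[a] = max(self.radius[a], gap + self.radius[b])
        return estimate

# Example: four points on a line, tree edges processed by increasing length.
comps = Components([(0.0,), (1.0,), (2.0,), (4.0,)])
estimates = [comps.merge(0, 1), comps.merge(1, 2), comps.merge(2, 3)]
```

On this input the true cut weights are 1, 2 and 4, and the estimates are upper bounds on each of them; how tight such bounds can be made is exactly what a guarantee like α = 5 quantifies.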

THEORY REAL LIFE

Experiments: maximum distortion
$\beta = \max_{v_i, v_j} \Delta(v_i, v_j) / w(v_i, v_j)$

Datasets:
● DIABETES -- 768 samples, 8 features
● MICE -- 1080 samples, 77 features
● PENDIGITS -- 10992 samples, 16 features

Method          DIABETES    MICE    PENDIGITS
Average             11.1     9.7         27.5
Complete            18.5    11.8         33.8
Single               6.0     4.9         14
Ward                61.0    59.3        433.8
Approx-BUF          41.0    51.2        109.8
Approx-BUF2          9.6     9.4         37.2
Farach et al.        6.0     4.9         13.9

Approx-BUF: approx MST + approx cut weights
Approx-BUF2: exact MST + approx cut weights

Experiments: running time
[Figure: running times, in seconds]

Conclusion
● Seems promising
● A good MST is crucial → can we compute a better one efficiently?
● Cut weights suffer from an approximation factor of α = 5 → can we do better?

Thanks!
