Nova Fandina Hebrew University, Israel fandina@cs.huji.ac.il Joint - PowerPoint PPT Presentation

8-14 December, Vancouver, Canada Nova Fandina Hebrew University, Israel fandina@cs.huji.ac.il Joint work with Yair Bartal , Hebrew University, Israel, yair@cs.huji.ac.il Ofer Neiman , Ben Gurion University, Israel, neimano@cs.bgu.ac.il 1

• • 2

A basic task in metric embedding theory (informally) is: Given metric spaces and , embed into , with small error on the distances. How well it can be done? How to measure an error? In theory: “well” traditionally means to minimize distortion of the worst pair Definition: worst case distortion For an embedding , for a pair of points � � � � ,� � � � �,� • 𝑓𝑦𝑞𝑏𝑜𝑡 � 𝑣, 𝑤 = , 𝑑𝑝𝑜𝑢𝑠 � 𝑣, 𝑤 = � � �,� � � � � ,� � • 𝑒𝑗𝑡𝑢𝑝𝑠𝑢𝑗𝑝𝑜 𝑔 = 𝑛𝑏𝑦 ��∈� 𝑓𝑦𝑞𝑏𝑜𝑡 � 𝑣, 𝑤 ⋅ 𝑛𝑏𝑦 ��∈� {𝑑𝑝𝑜𝑢𝑠 � (𝑣, 𝑤)} 3

In practice, the demand for the worst-case guarantee is too strong: the quality of a method in practical applications is rather usually measured by its average performance over all pairs. There is a reach body of research literature where the variety of average quality measurement criteria is studded and applied: • Yuval Shavitt and Tomer Tankel. Big-bang simulation for embedding network distances in Euclidean space . IEEE/ACM Trans. Netw., 12(6), 2004. • P. Sharma, Z. Xu, S. Banerjee, and S. Lee . Estimating network proximity and latency. Computer Communication Review, 36(3), 2006. • P. J. F. Groenen, R. Mathar, and W. J. Heiser. The majorization approach to multidimensional scaling for minkowski distances . Journal of Classification, 12(1), 1995. • J. F. Vera, W. J. Heiser, and A. Murillo. Global optimization in any minkowski metric: A permutation- translation simulated annealing algorithm for multidimensional scaling . Journal of Classification, 24(2), 2007. • A. Censi and D. Scaramuzza. Calibration by correlation using metric embedding from nonmetric similarities. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(10), 2013. • C. Lumezanu and N. Spring . Measurement manipulation and space selection in network coordinates . The 28th International Conference on Distributed Computing Systems, 2008. • S. Chatterjee, B. Neff, and P. Kumar. Instant approximate 1-center on road networks via embeddings. In Proceedings of the 19th International Conference on Advances in Geographic Information Systems, GIS ’11, 2011. • S. Lee, Z. Zhang, S. Sahu, and D. Saha. On suitability of Euclidean embedding for host-based network coordinate systems . IEEE/ACM Trans. Netw., 18(1), 2010. • L. Chennuru Vankadara and U. von Luxburg. Measures of distortion for machine learning . Advances in Neural Information Processing Systems, Curran Associates, Inc., 2018. Just a small sample from googolplex number of such studies. 4

For , for a pair � � � 𝒓 -distortion �/� �/� � � � - � � � � � � � Relative Error Measure [commonly used in network applications: CDKLM04, SXBL06, ST04] �/� �/� � � (�) (�) � � � � � � 5

Initiated and studied within the Multi-Dimensional Scaling framework [CC00]. Found an enormous number of applications in visualization, clustering, indexing and many more fields [see a long list of citations in the paper]. We further generalize the basic variants that appear in the literature: For a pair �� /� �/� �/� �/� � �� | � ] � �� | � ] � �� | � ] � �� | � ] 𝐹 � [|𝑒 �� − 𝑒 𝐹 � [|𝑒 �� − 𝑒 𝐹 � [|𝑒 �� − 𝑒 𝐹 � [|𝑒 �� − 𝑒 ∗ 𝑔 = ∗ 𝑔 = 𝑇𝑢𝑠𝑓𝑡𝑡 � 𝑇𝑢𝑠𝑓𝑡𝑡 � 𝑇𝑢𝑠𝑓𝑡𝑡 � 𝑔 = 𝑇𝑢𝑠𝑓𝑡𝑡 � 𝑔 = � ] � ] 𝐹 � [ 𝑒 �� ] 𝐹 � [ 𝑒 �� ] � �� 𝐹 � [ 𝑒 𝐹 � [ 𝑒 �/� �/� �/� �/� � � � � � �� − 𝑒 �� | � � �� − 𝑒 �� | � |𝑒 |𝑒 − 𝑒 | |𝑒 |𝑒 − 𝑒 | 𝐹𝑜𝑓𝑠𝑕𝑧 � 𝑔 = 𝐹𝑜𝑓𝑠𝑕𝑧 � 𝑔 = E � E � 𝑆𝐹𝑁 � 𝑔 = 𝑆𝐹𝑁 � 𝑔 = E � E � � �� } � �� } 𝑒 �� 𝑒 �� min{𝑒 �� , 𝑒 min{𝑒 �� , 𝑒 6

-distortion: defined and studied in VL18 [NeurIPS18] �/� �/� � � 𝑓𝑦𝑏𝑞𝑜𝑡 � 𝑣, 𝑤 𝑓𝑦𝑏𝑞𝑜𝑡 � 𝑣, 𝑤 𝜏 − 𝑒𝑗𝑡𝑢 (�)�,� 𝑔 = 𝜏 − 𝑒𝑗𝑡𝑢 �,� 𝑔 = E � E � � − 𝑓𝑦𝑞𝑏𝑜𝑡 𝑔 − 1 � − 𝑓𝑦𝑞𝑏𝑜𝑡 𝑔 − 1 ℓ � ℓ � (�) • � � � � (�) • � � � � Necessary properties a quality measure has to posses to be valid for the ML applications were defined and studied in [VL18]: • translation invariance • scale invariance • monotonicity • robustness (outliers, noise) • incorporation of probability 7

• We show that all the other average distortion measures considered here can be easily adapted to satisfy similar ML motivated properties, generalizing the results of VL18. • We show deep tight relations between these different objective functions, and further develop properties and tools for analyzing embeddings for these measures. While these measures have been extensively studied from a practical point of view, and many heuristics are known in the literature, almost nothing is known in terms of rigorous analysis and absolute bounds. Moreover, many real-world misconceptions exist about what dimension may be necessary for good embeddings. • We present the first theoretical analysis of all these measures providing absolute bounds that shed light on these questions. We exhibit approximation algorithms for optimizing these measures, and further applications. • We validate our theoretical findings experimentally, by implementing our algorithms and running them on various randomly generated Euclidean and non-Euclidean metric spaces. 8

The main theoretical question we study in the paper is: -Dimension Reduction Given a dimension bound and , what is the least such that every finite subset of Euclidean space embeds into dim. with ? 𝐫 • We answer the question by providing almost tight upper and lower bounds on α ( k; q ), for all the discussed measures. • We prove that the Johnson-Lindenstrauss dimensionality reduction achieves bounds in terms of q and k that dramatically outperform a widely used in practice PCA algorithm. • Moreover, in experiments, we show that the JL outperforms Isomap and PCA methods, on various randomly generated metric spaces. 9

� and Given an -point metric space in � , the JL lemma states: 𝟑 , [JL84] Projection of onto a random subspace of dim. with const. prob. has worst case . There are many implementations of the JL transform (satisfying the JL property): [Achl03] The entries of T are uniform indep. from [DKS10,KN10, AL10] Sparse/Fast: particular distr. from [IM98] is a matrix of size with indep. entries sampled from . 𝒍 is defined by The embedding . 𝟑 10

• The JL transform of IM98 provides constant upper bounds for all � . The bounds are almost tight. All our theorems true for that implementation. • Other mentioned implementations do not work for � -dist and for � : Observation � samples its entries form a discrete set of values of � If a linear transformation � results in � -dist, �/� , then applying it on a standard basis of size � • PCA may produce an embedding of extremely poor quality for all the measures (this does not happen to the JL). In the next slides we give an example of a family of Euclidean metric spaces, on which PCA produces provably large distortions. 11

� and a given integer PCA/c-MDS For a given finite , computes the � best rank � �∈� � has optimal • � � over all projections. � ��∈� �� • Often misused: “minimizing � over all embeddings into -dim”. • In fact, PCA does not minimize any of the mentioned measures. Next, we present a metric space of dimension that can be efficiently embedded into a line (with small � distortions) but such that PCA fails to produce a comparable result. 12

• The metric is in dimensional Euclidean space, for any large enough. • Fix some , and • Consider the standard basis vectors � � . � � • � For each vector � let � be the set of copies of vector � , � � � and let � be the set of the same size of the antipodal vector � . In the paper we show an embedding of this metric space 𝑨 𝟐/𝟑 . j into with: 𝟑 PCA projects this space onto 𝛽 � � � 𝛽 � For we have: • 𝟑 i • 𝑧 PCA is not better than a naïve algo: 𝑦 any non-expansive embedding has const The JL embedding has bounded measures for Stress measure 13 • any space: , as increases. 𝒓 -dist/ 𝒓

Theorem [Moment analysis of JL transform] � s.t. for a given There is a map (JL or normalized JL) with const. prob. � 𝑙/4 ≤ 𝑟 ≤ 𝑙 𝑟 = 𝑙 𝑙 ≤ 𝑟 ≤ ∞ 1 ≤ 𝑟 < 𝑙 𝑙 ≤ 𝑟 ≤ 𝑙/4 �(�/�) 1 𝑟 � � �� 𝑙 �/� 1 + 𝑃 1 + 𝑃 𝑙 − 𝑟 𝒓 -dist(f) 𝑃 log 𝑜 𝑜 � 𝑙 𝑙 − 𝑟 𝒓 � � • � � � � • � � � �� /� 14

Nova Fandina Hebrew University, Israel fandina@cs.huji.ac.il Joint - PowerPoint PPT Presentation

8-14 December, Vancouver, Canada Nova Fandina Hebrew University, Israel fandina@cs.huji.ac.il Joint work with Yair Bartal , Hebrew University, Israel, yair@cs.huji.ac.il Ofer Neiman , Ben Gurion University, Israel, neimano@cs.bgu.ac.il 1

Israel has peace Israel has peace The judges Israel rebels Israel rebels Israel rebels

The Hebrew Bible The Hebrew Bible (Old Testament) The Hebrew Bible (Old Testament) The

Devotionals from the Psalms 2 nd Edition The Hebrew Scriptures The Hebrew Scriptures The Hebrew

D EVOTIO N ALS DEVOTIONALS The Hebrew Bible The Hebrew Bible (Old Testament) The Hebrew

Israel Analyst Presentation 24 th May 2017 Overview Overview Israel economy and demographics

Israel Competitiveness 2000 2010 2010 2000 Israel 2021 Israel 2021 Israel - -

Covering Metric Spaces by Few Trees Yair Bartal Nova Fandina Ofer neiman Tree Covers Let

Hebrew Dependency Parsing: Initial Results Yoav Goldberg Michael Elhadad IWPT 2009, Paris

Dim imensionality ty Redu eduction: Th Theoretic ical Ana nalysis of Pr Practi tical Mea

HR Connection Orientation Welcome to the NOVA Team! Whats on the Agenda? NOVA Overview

The Hebrew-Christian Messiah; Or, the Presentation of the Messiah to the The Hebrew-Christian

A MICROKERNEL-BASED OPERATING SYSTEM FOR EXASCALE COMPUTING Amnon Barak Hebrew University

Israel Water Association Israel Water Association Avraham Israeli Israel Water Association The

TOURISM NOVA SCOTIA INFORMATION & OPPORTUNITIES October 11, 2018 Presented by Tourism Nova

NOVA Wood DESKING SYSTEM NOVA Wood Natures touch in your office! A desking system that

RESULTS PRESENTATION Six-months ended 31 December 2016 Nova Park, Gorzow, Poland Nova Park,

Interlinking: Performance Assessment of User Evaluation vs. Supervised Learning Approaches Mofeed

ECE 5984: Introduction to Machine Learning Topics: Supervised Learning Measuring

Nearest Neighbor Classification Seed classification by area and What should we compactness

Hilberts problems and contemporary mathematical logic Jan Kraj cek MFF UK (KA)

Web Data Representation Web Graph, Text, Images, Metadata, Search spaces Web Search 1 The Web

A Minkowski problem for nonlinear capacity Andrew Vogel April 22, Boston AMS special session

Twistors and Amplitudes Andrew Hodges (Oxford) MHV@30 Fermi National Accelerator Laboratory,

A glimpse into convex geometry 5 \ A glimpse into convex geometry Two

Nova Fandina Hebrew University, Israel fandina@cs.huji.ac.il Joint - PowerPoint PPT Presentation

8-14 December, Vancouver, Canada Nova Fandina Hebrew University, Israel fandina@cs.huji.ac.il Joint work with Yair Bartal , Hebrew University, Israel, yair@cs.huji.ac.il Ofer Neiman , Ben Gurion University, Israel, neimano@cs.bgu.ac.il 1

Israel has peace Israel has peace The judges Israel rebels Israel rebels Israel rebels

The Hebrew Bible The Hebrew Bible (Old Testament) The Hebrew Bible (Old Testament) The

Devotionals from the Psalms 2 nd Edition The Hebrew Scriptures The Hebrew Scriptures The Hebrew

D EVOTIO N ALS DEVOTIONALS The Hebrew Bible The Hebrew Bible (Old Testament) The Hebrew

Israel Analyst Presentation 24 th May 2017 Overview Overview Israel economy and demographics

Israel Competitiveness 2000 2010 2010 2000 Israel 2021 Israel 2021 Israel - -

Covering Metric Spaces by Few Trees Yair Bartal Nova Fandina Ofer neiman Tree Covers Let

Hebrew Dependency Parsing: Initial Results Yoav Goldberg Michael Elhadad IWPT 2009, Paris

Dim imensionality ty Redu eduction: Th Theoretic ical Ana nalysis of Pr Practi tical Mea

HR Connection Orientation Welcome to the NOVA Team! Whats on the Agenda? NOVA Overview

The Hebrew-Christian Messiah; Or, the Presentation of the Messiah to the The Hebrew-Christian

A MICROKERNEL-BASED OPERATING SYSTEM FOR EXASCALE COMPUTING Amnon Barak Hebrew University

Israel Water Association Israel Water Association Avraham Israeli Israel Water Association The

TOURISM NOVA SCOTIA INFORMATION &amp; OPPORTUNITIES October 11, 2018 Presented by Tourism Nova

NOVA Wood DESKING SYSTEM NOVA Wood Natures touch in your office! A desking system that

RESULTS PRESENTATION Six-months ended 31 December 2016 Nova Park, Gorzow, Poland Nova Park,

Interlinking: Performance Assessment of User Evaluation vs. Supervised Learning Approaches Mofeed

ECE 5984: Introduction to Machine Learning Topics: Supervised Learning Measuring

Nearest Neighbor Classification Seed classification by area and What should we compactness

Hilberts problems and contemporary mathematical logic Jan Kraj cek MFF UK (KA)

Web Data Representation Web Graph, Text, Images, Metadata, Search spaces Web Search 1 The Web

A Minkowski problem for nonlinear capacity Andrew Vogel April 22, Boston AMS special session

Twistors and Amplitudes Andrew Hodges (Oxford) MHV@30 Fermi National Accelerator Laboratory,

A glimpse into convex geometry 5 \ A glimpse into convex geometry Two

TOURISM NOVA SCOTIA INFORMATION & OPPORTUNITIES October 11, 2018 Presented by Tourism Nova