
Dimensionality Reduction: Theoretical Analysis of Practical Measures

Nova Fandina, Hebrew University. Joint work with Yair Bartal (Hebrew University) and Ofer Neiman (Ben Gurion University).


  1. Dimensionality Reduction: Theoretical Analysis of Practical Measures
     Nova Fandina, Hebrew University
     Joint work with Yair Bartal, Hebrew University, and Ofer Neiman, Ben Gurion University

  2. Outline
     • Measuring the Quality of Embedding
       - in theory: worst case distortion analysis
       - in practice: average case distortion measures
       - in between: theoretical analysis of practical measures (for dimensionality reduction methods)
     • Our Results
       - upper bounds
       - lower bounds
       - approximating optimal embedding

  3. Measuring the Quality of Embedding: in theory
     Basic question in metric embedding theory (informally): given metric spaces $X$ and $Y$, embed $X$ into $Y$ with small error on the distances. How well can it be done?
     In theory, "well" traditionally means minimizing the distortion of the worst pair.
     Definition. For an embedding $f: X \to Y$ and a pair of points $u \neq v \in X$:
     • $\mathrm{expans}_f(u,v) = \frac{d_Y(f(u), f(v))}{d_X(u,v)}$, $\mathrm{contr}_f(u,v) = \frac{d_X(u,v)}{d_Y(f(u), f(v))}$
     • $\mathrm{distortion}(f) = \max_{u \neq v \in X}\{\mathrm{expans}_f(u,v)\} \cdot \max_{u \neq v \in X}\{\mathrm{contr}_f(u,v)\}$
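The worst case definition above can be checked on a toy example. The following is a minimal sketch, assuming both spaces are Euclidean and that points are given as coordinate tuples; the function name `worst_case_distortion` and the sample data are illustrative and do not come from the slides.

```python
import math
from itertools import combinations


def worst_case_distortion(points, images):
    """Return (max expansion) * (max contraction) over all pairs.

    points[i] is a point of X and images[i] is f(points[i]). Both are
    coordinate tuples compared with the Euclidean distance (an assumption).
    """
    max_expans = 0.0
    max_contr = 0.0
    for i, j in combinations(range(len(points)), 2):
        d_x = math.dist(points[i], points[j])  # original distance, assumed > 0
        d_y = math.dist(images[i], images[j])  # distance after embedding
        if d_y == 0.0:
            return math.inf  # two distinct points collapsed: infinite contraction
        max_expans = max(max_expans, d_y / d_x)
        max_contr = max(max_contr, d_x / d_y)
    return max_expans * max_contr


# Three points in the plane mapped onto a line (hypothetical data).
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
fX = [(0.0,), (1.0,), (2.0,)]
print(worst_case_distortion(X, fX))  # 2 * sqrt(2), about 2.83
```

Because the definition multiplies the largest expansion by the largest contraction, rescaling all images by a constant leaves the value unchanged.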

  4. Measuring the Quality of Embedding: in practice
     Demanding a worst case guarantee is too strong: the quality of a method in practical applications is its average performance over all pairs.
     • Yuval Shavitt and Tomer Tankel. Big-bang simulation for embedding network distances in Euclidean space. IEEE/ACM Trans. Netw., 12(6), 2004.
     • P. Sharma, Z. Xu, S. Banerjee, and S. Lee. Estimating network proximity and latency. Computer Communication Review, 36(3), 2006.
     • P. J. F. Groenen, R. Mathar, and W. J. Heiser. The majorization approach to multidimensional scaling for Minkowski distances. Journal of Classification, 12(1), 1995.
     • J. F. Vera, W. J. Heiser, and A. Murillo. Global optimization in any Minkowski metric: A permutation-translation simulated annealing algorithm for multidimensional scaling. Journal of Classification, 24(2), 2007.
     • A. Censi and D. Scaramuzza. Calibration by correlation using metric embedding from nonmetric similarities. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(10), 2013.
     • C. Lumezanu and N. Spring. Measurement manipulation and space selection in network coordinates. The 28th International Conference on Distributed Computing Systems, 2008.
     • S. Chatterjee, B. Neff, and P. Kumar. Instant approximate 1-center on road networks via embeddings. In Proceedings of the 19th International Conference on Advances in Geographic Information Systems, GIS '11, 2011.
     • S. Lee, Z. Zhang, S. Sahu, and D. Saha. On suitability of Euclidean embedding for host-based network coordinate systems. IEEE/ACM Trans. Netw., 18(1), 2010.
     • L. Chennuru Vankadara and U. von Luxburg. Measures of distortion for machine learning. Advances in Neural Information Processing Systems, Curran Associates, Inc., 2018.
     Just a small sample from a googolplex of such studies.

  5. Measuring the Quality of Embedding: in practice
     Moments of Distortion and Relative Error
     For $f: X \to Y$ and a pair $u \neq v \in X$: $\mathrm{dist}_f(u,v) := \max\{\mathrm{expans}_f(u,v), \mathrm{contr}_f(u,v)\}$
     • $\ell_q$-distortion [ABN06]: for $f: X \to Y$, a distribution $\Pi$ over pairs of $X$, and $q \geq 1$,
       $\ell_q\text{-dist}^{(\Pi)}(f) = \left(\mathbb{E}_\Pi\left[(\mathrm{dist}_f(u,v))^q\right]\right)^{1/q}$
     • Relative Error Measure [in many papers]:
       $\mathrm{REM}_q^{(\Pi)}(f) = \left(\mathbb{E}_\Pi\left[|\mathrm{dist}_f(u,v) - 1|^q\right]\right)^{1/q}$
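As a small illustration of the two moment-based measures above, here is a sketch that takes $\Pi$ to be the uniform distribution over pairs and both spaces to be Euclidean. Both choices are assumptions made for the example, and the names `pair_distortions`, `lq_distortion` and `rem` are hypothetical.

```python
import math
from itertools import combinations


def pair_distortions(points, images):
    """List dist_f(u, v) = max(expans, contr) for every pair (Euclidean distances)."""
    out = []
    for i, j in combinations(range(len(points)), 2):
        d_x = math.dist(points[i], points[j])
        d_y = math.dist(images[i], images[j])
        out.append(math.inf if d_y == 0.0 else max(d_y / d_x, d_x / d_y))
    return out


def lq_distortion(points, images, q):
    """(E[dist_f(u, v)^q])^(1/q), with the uniform distribution over pairs."""
    dists = pair_distortions(points, images)
    return (sum(t ** q for t in dists) / len(dists)) ** (1.0 / q)


def rem(points, images, q):
    """(E[|dist_f(u, v) - 1|^q])^(1/q), with the uniform distribution over pairs."""
    dists = pair_distortions(points, images)
    return (sum(abs(t - 1.0) ** q for t in dists) / len(dists)) ** (1.0 / q)


# Hypothetical data: three points in the plane mapped onto a line.
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
fX = [(0.0,), (1.0,), (2.0,)]
print(lq_distortion(X, fX, 1))  # (1 + 2 + sqrt(2)) / 3, about 1.47
print(rem(X, fX, 1))            # sqrt(2) / 3, about 0.47
```

The pairwise values here are 1, 2 and $\sqrt{2}$, so the average-case numbers sit well below the worst pair, which is the point of moving from worst case to moments.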

  6. Measuring the Quality of Embedding: in practice
     Additive Distortion Measures [MDS: optimally embed a given finite $X$ into a $k$-dim Euclidean space, for a given $k$]
     For a pair $u \neq v \in X$: $d_{uv} = d_X(u,v)$, $\hat{d}_{uv} = d_Y(f(u), f(v))$
     • $\mathrm{Stress}_q(f) = \left(\frac{\mathbb{E}_\Pi[|d_{uv} - \hat{d}_{uv}|^q]}{\mathbb{E}_\Pi[(d_{uv})^q]}\right)^{1/q}$
     • $\mathrm{Stress}^*_q(f) = \left(\frac{\mathbb{E}_\Pi[|d_{uv} - \hat{d}_{uv}|^q]}{\mathbb{E}_\Pi[(\hat{d}_{uv})^q]}\right)^{1/q}$
     • $\mathrm{Energy}_q(f) = \left(\mathbb{E}_\Pi\left[\left(\frac{|\hat{d}_{uv} - d_{uv}|}{d_{uv}}\right)^q\right]\right)^{1/q}$
     • $\mathrm{REM}_q(f) = \left(\mathbb{E}_\Pi\left[\left(\frac{|\hat{d}_{uv} - d_{uv}|}{\min\{d_{uv}, \hat{d}_{uv}\}}\right)^q\right]\right)^{1/q}$
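The additive measures can be computed directly from the two lists of pairwise distances. This is a hedged sketch under the same assumptions as before (Euclidean distances, uniform $\Pi$, distinct points); the function names and the `normalize_by_image` flag are introduced here for illustration only.

```python
import math
from itertools import combinations


def distance_pairs(points, images):
    """Return (d_uv, dhat_uv) for every pair, using Euclidean distances."""
    return [(math.dist(points[i], points[j]), math.dist(images[i], images[j]))
            for i, j in combinations(range(len(points)), 2)]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def stress(points, images, q, normalize_by_image=False):
    """Stress_q (default) or Stress*_q (normalize_by_image=True), uniform pairs."""
    pairs = distance_pairs(points, images)
    numerator = mean(abs(d - dh) ** q for d, dh in pairs)
    denominator = mean((dh if normalize_by_image else d) ** q for d, dh in pairs)
    return (numerator / denominator) ** (1.0 / q)


def energy(points, images, q):
    """Energy_q: q-th moment of |dhat - d| / d, uniform pairs, d assumed > 0."""
    pairs = distance_pairs(points, images)
    return mean((abs(d - dh) / d) ** q for d, dh in pairs) ** (1.0 / q)


# Hypothetical data: three points in the plane mapped onto a line.
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
fX = [(0.0,), (1.0,), (2.0,)]
print(stress(X, fX, 2))                           # sqrt(1 - sqrt(2)/2), about 0.54
print(stress(X, fX, 2, normalize_by_image=True))  # sqrt((2 - sqrt(2))/3), about 0.44
print(energy(X, fX, 2))                           # sqrt((2.5 - sqrt(2))/3), about 0.60
```

The only difference between the two stress variants is the normalizer: original distances for $\mathrm{Stress}_q$, embedded distances for $\mathrm{Stress}^*_q$.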

  7. Measuring the Quality of Embedding: in practice
     $\sigma$-distortion (ML motivated, in [VvL18]):
       $\sigma\text{-dist}^{(\Pi)}_{q,r}(f) = \left(\mathbb{E}_\Pi\left[\left|\frac{\mathrm{expans}_f(u,v)}{\ell_r^{(U)}\text{-expans}(f)} - 1\right|^q\right]\right)^{1/q}$
     • $\ell_r^{(U)}\text{-expans}(f) = \left(\mathbb{E}_U\left[(\mathrm{expans}_f(u,v))^r\right]\right)^{1/r}$
     • $\ell_r^{(U)}\text{-contr}(f) = \left(\mathbb{E}_U\left[(\mathrm{contr}_f(u,v))^r\right]\right)^{1/r}$
     "Necessary properties for ML applications":
     • translation invariance
     • scale invariance
     • monotonicity
     • robustness (outliers, noise)
     • incorporation of probability
     ➢ Many heuristics for optimizing these measures
     ➢ Almost nothing is known in terms of rigorous analysis
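To make the scale-invariance property concrete, the sketch below computes a $\sigma$-distortion in which both $\Pi$ and $U$ are taken to be the uniform distribution over pairs. That reading of $U$, the Euclidean distances, and the name `sigma_distortion` are assumptions for the example rather than details stated on the slide.

```python
import math
from itertools import combinations


def sigma_distortion(points, images, q=2, r=1):
    """sigma-distortion with uniform weights on pairs (an assumption).

    Each pair's expansion is divided by the r-th moment of all expansions,
    so multiplying every image by a constant does not change the result.
    Assumes distinct input points and at least one non-collapsed pair.
    """
    expans = [math.dist(images[i], images[j]) / math.dist(points[i], points[j])
              for i, j in combinations(range(len(points)), 2)]
    norm = (sum(e ** r for e in expans) / len(expans)) ** (1.0 / r)
    return (sum(abs(e / norm - 1.0) ** q for e in expans) / len(expans)) ** (1.0 / q)


# Hypothetical data: three points in the plane mapped onto a line.
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
fX = [(0.0,), (1.0,), (2.0,)]
scaled = [(5.0 * a,) for (a,) in fX]
print(sigma_distortion(X, fX), sigma_distortion(X, scaled))  # the two values agree
```

An exact scaled copy of the input has all normalized expansions equal to 1, so its $\sigma$-distortion is 0 up to rounding.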

  8. Measuring the Quality of Embedding: in between
     Bridging the gap between theory and practice: outlook
     • $\alpha(k,q)$-Dimension Reduction: given a dimension bound $k \geq 1$ and $q \geq 1$, what is the least $\alpha(k,q)$ such that every finite subset of Euclidean space embeds into $k$ dimensions with $\mathrm{Measure}_q \leq \alpha(k,q)$?
     • General Metrics (MDS): for a given finite $X$ and $k \geq 1$, compute the optimal embedding of $X$ into $k$-dim Euclidean space, minimizing a particular $\mathrm{Measure}_q$.
       [CD06] Optimizing is NP-hard for $\mathrm{Stress}_q$ and $k = 1$.
     • General Metrics, Approximating the Optimal Embedding: for a given finite $X$ and $k \geq 1$, compute an embedding of $X$ into $k$-dim Euclidean space that approximates the best possible embedding, for a given $\mathrm{Measure}_q$.

  9. Our Results: upper bounds (previous results)
     $\alpha(k,q)$-Dimension Reduction: given a dimension bound $k \geq 1$ and $q \geq 1$, what is the least $\alpha(k,q)$ such that every finite subset of Euclidean space embeds into $k$ dimensions with $\mathrm{Measure}_q \leq \alpha(k,q)$?
     Previous results: worst case distortion
     • JL[84]: every $n$-point $X \subset \ell_2^d$ embeds into $\ell_2^k$ with distortion $O\left(n^{2/k}\sqrt{(\log n)/k}\right)$
     • Mat[90]: there is an $n$-point $V \subset \ell_2^{k+1}$ such that any $f: V \to \ell_2^k$ must have distortion $n^{\Omega(1/k)}$
     ▪ $\mathrm{distortion}(f) \leq (\ell_\infty\text{-dist}(f))^2$
     ▪ For every $f: X \to Y$ (scalable) there is $g: X \to Y$ with $\ell_\infty\text{-dist}(g) = \sqrt{\mathrm{distortion}(f)}$
     What about the $\mathrm{Measure}_q$ guarantees for $q < \infty$?

  10. Our Results: upper bounds (JL transform: IM implementation)
     The answer to the $\alpha(k,q)$-Dimension Reduction question is, essentially, the JL transform.
     [JL84] Projection onto a random subspace of dimension $k = O(\log n / \varepsilon^2)$ gives, with constant probability, $\mathrm{distortion}(f) = 1 + \varepsilon$ [tight, LN16].
     [IM98] $T$ is a matrix of size $k \times d$ with independent entries sampled from $N(0,1)$. The embedding $f: X \to \ell_2^k$ is $f(x) = \frac{1}{\sqrt{k}} \cdot T(x)$.
     • The JL transform of IM98 provides constant upper bounds for all $\mathrm{Measure}_q$. The bounds are almost optimal.
     • Other popular implementations of JL do not work for $\ell_q$-dist and for $\mathrm{REM}_q$.
     • PCA may produce an embedding of extremely poor quality for all the measures (this does not happen to the JL).
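The Gaussian construction described above is short enough to write out. The following is a minimal standard-library sketch of a map of the form $f(x) = \frac{1}{\sqrt{k}} T x$ with i.i.d. $N(0,1)$ entries; the function name `gaussian_jl`, the `seed` parameter and the sample points are illustrative assumptions, not part of the talk.

```python
import math
import random


def gaussian_jl(points, k, seed=0):
    """Map each d-dimensional point x to (1 / sqrt(k)) * T x.

    T is a k x d matrix with independent N(0, 1) entries. The seed
    parameter is only there to make the sketch reproducible.
    """
    rng = random.Random(seed)
    d = len(points[0])
    T = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(k)]
    scale = 1.0 / math.sqrt(k)
    return [tuple(scale * sum(t * x for t, x in zip(row, p)) for row in T)
            for p in points]


# Hypothetical data: two points in 5 dimensions, embedded into k = 500 dimensions
# only to show concentration. In practice k would be smaller than d.
X = [(1.0, 0.0, 2.0, -1.0, 0.5), (0.0, 3.0, -1.0, 1.0, 2.0)]
Y = gaussian_jl(X, k=500, seed=1)
print(math.dist(Y[0], Y[1]) / math.dist(X[0], X[1]))  # close to 1
```

The $1/\sqrt{k}$ factor makes the expected squared length of $f(x)$ equal to the squared length of $x$, which is why the printed ratio concentrates near 1 as $k$ grows.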

  11. Our Results: upper bounds (other implementations of JL)
     • [Achl03] The entries of $T$ are uniform and independent from $\{\pm 1\}$.
     • [DKS10, KN10, AL10] Sparse/Fast: a particular distribution over $\{\pm 1, 0\}$.
     Constant bounds cannot be achieved using the above implementations.
     Observation. If a linear transformation $T: \mathbb{R}^d \to \mathbb{R}^k$ samples its entries from a discrete set of values of size $s < d^{1/k}$, then applying it to the standard basis of $\mathbb{R}^d$ results in $\ell_q\text{-dist} = \mathrm{REM}_q = \infty$.
     Recall:
     ▪ $\ell_q\text{-dist}^{(\Pi)}(f) = \left(\mathbb{E}_\Pi\left[(\mathrm{dist}_f(u,v))^q\right]\right)^{1/q}$, $\mathrm{REM}_q^{(\Pi)}(f) = \left(\mathbb{E}_\Pi\left[|\mathrm{dist}_f(u,v) - 1|^q\right]\right)^{1/q}$
     ▪ $\mathrm{dist}_f(u,v) = \max\{\mathrm{expans}_f(u,v), \mathrm{contr}_f(u,v)\}$
     ➢ $T(\{e_1, \ldots, e_d\}) = \{\text{columns of } T\}$. The number of different columns is at most $s^k < d$, so two distinct basis vectors are mapped to the same point and their contraction is infinite.
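The counting argument in the Observation can be seen on a tiny instance. This sketch samples a $\pm 1$ matrix with $2^k < d$ and looks for two standard basis vectors with the same image; the function name `sign_matrix_collision` and the parameters are hypothetical, and the $1/\sqrt{k}$ scaling is omitted because it does not affect whether two columns coincide.

```python
import random


def sign_matrix_collision(k, d, seed=0):
    """Sample a k x d matrix with uniform +/-1 entries and find a collision.

    The image of the basis vector e_j is column j. Returns (i, j, column)
    for two different basis vectors with the same image, or None if all
    columns are distinct (impossible when 2**k < d, by pigeonhole).
    """
    rng = random.Random(seed)
    T = [[rng.choice((-1, 1)) for _ in range(d)] for _ in range(k)]
    seen = {}
    for j in range(d):
        column = tuple(T[i][j] for i in range(k))  # T applied to e_j
        if column in seen:
            return seen[column], j, column
        seen[column] = j
    return None


# k = 3 gives only 2**3 = 8 possible columns, fewer than d = 9 basis vectors.
print(sign_matrix_collision(k=3, d=9, seed=0))
```

Any colliding pair has embedded distance 0 while the original distance is $\sqrt{2}$, so every measure that averages $\mathrm{dist}_f$ over all pairs becomes infinite.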

  12. Our Results: upper bounds (limitation of PCA)
     PCA/c-MDS: for a given finite $X \subset \ell_2^d$ and a given integer $k \geq 1$, computes the best rank-$k$ approximation to $X$: a projection $P$ onto the $k$-dim subspace spanned by the eigenvectors of the covariance matrix with the largest eigenvalues, with the smallest $\sum_{u \in X} \|u - P(u)\|^2$.
     ▪ $f: X \to \ell_2^k$ with optimal $\sum_{u \neq v \in X} (d_{uv}^2 - \hat{d}_{uv}^2)$ over all projections
     ▪ Often misused: "minimizing $\mathrm{Stress}_2$ over all embeddings into $k$-dim"
     ▪ Actually, PCA does not minimize any of the mentioned measures
