
Dimensionality Reduction: Metric Space, Isometric Embedding, Distortion, L∞ Norm, Corollaries, Euclidean Norm - PDF document

Dimensionality Reduction. Anil Maheshwari (anil@scs.carleton.ca), School of Computer Science, Carleton University, Canada. Topics: Metric Space, Isometric embedding, Distortion, L∞ Norm, Corollaries, Euclidean Norm.


  1. Dimensionality Reduction
     Anil Maheshwari, anil@scs.carleton.ca, School of Computer Science, Carleton University, Canada
     Outline: Metric Space, Isometric embedding, Distortion, $L_\infty$ Norm, Corollaries, Euclidean Norm

  2. Metric Space $\langle X, d \rangle$
     Let $X$ be a set of $n$ points and let $d$ be a distance measure associated with pairs of elements in $X$. We say that $\langle X, d \rangle$ is a finite metric space if the function $d$ satisfies the metric properties, i.e.
     (a) $\forall x \in X$, $d(x, x) = 0$,
     (b) $\forall x, y \in X$, $x \neq y$, $d(x, y) > 0$,
     (c) $\forall x, y \in X$, $d(x, y) = d(y, x)$ (symmetry), and
     (d) $\forall x, y, z \in X$, $d(x, y) \le d(x, z) + d(z, y)$ (triangle inequality).
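
As an illustration of the four properties, here is a minimal Python sketch that checks them by brute force on a small finite set. The function name `is_metric`, the tolerance `tol`, and the two example distance functions are assumptions made for this example and do not come from the slides; the points are assumed to be distinct.

```python
from itertools import product

def is_metric(points, d, tol=1e-12):
    """Brute-force check of properties (a)-(d) on a finite set of distinct points."""
    for x in points:
        if abs(d(x, x)) > tol:                   # (a) d(x, x) = 0
            return False
    for x, y in product(points, repeat=2):
        if x != y and d(x, y) <= 0:              # (b) positivity for x != y
            return False
        if abs(d(x, y) - d(y, x)) > tol:         # (c) symmetry
            return False
    for x, y, z in product(points, repeat=3):
        if d(x, y) > d(x, z) + d(z, y) + tol:    # (d) triangle inequality
            return False
    return True

# Distances along a line form a metric.
path = lambda x, y: abs(x - y)
# Squared differences break the triangle inequality: d(0, 2) = 4 > 1 + 1.
squared = lambda x, y: (x - y) ** 2

print(is_metric([0, 1, 2], path), is_metric([0, 1, 2], squared))
```

The check takes cubic time in the number of points, so it is only meant for small examples.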

  3. Embeddings
     Let $\langle X, d \rangle$ and $\langle X', d' \rangle$ be two metric spaces.
     Embedding: A map $f : X \to X'$ is called an embedding.
     Isometric embedding (i.e., distance preserving): $f$ is an isometric embedding if for all $x, y \in X$, $d(x, y) = d'(f(x), f(y))$.

  4. Motivating Problem
     Input: $X$ = set of $n$ points in $k$-dimensional space, where $n \gg 2^k$.
     Output: A pair of points that maximize the $L_1$-distance.
     Naive solution: $O(k \binom{n}{2}) = O(kn^2)$ time.
     Better algorithm via isometric embedding $L_1^k \to L_\infty^{2^k}$, running in $O(2^k n)$ time.
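
The slide does not spell out the embedding, so the following Python sketch assumes the standard sign-vector construction: map a point $p$ to the $2^k$ inner products $\langle s, p \rangle$ over all $s \in \{+1, -1\}^k$, so that $\|p - q\|_1 = \max_s \langle s, p - q \rangle$. In $L_\infty$ the farthest pair can then be found one coordinate at a time by taking the maximum and minimum. The function names and the sample points are made up for the example.

```python
from itertools import combinations, product

def l1(p, q):
    """L1 distance between two k-dimensional points."""
    return sum(abs(a - b) for a, b in zip(p, q))

def farthest_pair_naive(points):
    """Try all pairs: O(k n^2) time."""
    p, q = max(combinations(points, 2), key=lambda pq: l1(*pq))
    return l1(p, q), p, q

def farthest_pair_embedded(points):
    """For each of the 2^k sign vectors, take max and min projection: O(k 2^k n) time."""
    k = len(points[0])
    best = (-1, None, None)
    for signs in product((1, -1), repeat=k):
        def proj(p):
            return sum(s * c for s, c in zip(signs, p))
        lo, hi = min(points, key=proj), max(points, key=proj)
        gap = proj(hi) - proj(lo)
        if gap > best[0]:
            best = (gap, lo, hi)
    return best

pts = [(0, 0, 0), (1, 2, 3), (-1, 4, 0), (2, -2, 1)]
print(farthest_pair_naive(pts)[0], farthest_pair_embedded(pts)[0])
```

This sketch spends $O(k)$ time per projection, giving $O(k 2^k n)$ overall; it is meant to show the idea rather than to match the slide's bound exactly.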

  5. Universality of the $L_\infty$-metric
     Let $\langle X, d \rangle$ be any finite metric space, where $n = |X|$. $X$ can be isometrically embedded into an $L_\infty$-metric space of appropriate dimension.
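
One standard way to obtain such an embedding (the Fréchet embedding, which the slide does not name and which is assumed here) maps each point to the vector of its distances to all $n$ points. Every coordinate changes by at most $d(x, y)$ by the triangle inequality, and the coordinate indexed by $y$ changes by exactly $d(x, y)$. The sketch below checks this on a 4-cycle; all names are illustrative.

```python
def frechet_embedding(points, d):
    """Map x to (d(x, z) for every z in X): an n-dimensional vector."""
    return {x: [d(x, z) for z in points] for x in points}

def linf(u, v):
    """L-infinity distance between two vectors of equal length."""
    return max(abs(a - b) for a, b in zip(u, v))

# Unweighted shortest-path distances on the 4-cycle 0 - 1 - 2 - 3 - 0.
cycle = [0, 1, 2, 3]
d_cycle = lambda x, y: min((x - y) % 4, (y - x) % 4)

f = frechet_embedding(cycle, d_cycle)
isometric = all(linf(f[x], f[y]) == d_cycle(x, y) for x in cycle for y in cycle)
print(isometric)
```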

  6. Euclidean Metric
     Input: Metric spaces defined by $K_4$, $C_4$, and the star $Y$ w.r.t. unweighted shortest-path (SP) distances.
     Question: Can one embed these 4 points in Euclidean space isometrically?
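
The slide leaves the question open. One way to explore it, sketched below in Python, is the classical Gram-matrix criterion (attributed to Schoenberg): a finite metric embeds isometrically in some Euclidean space exactly when the matrix $G_{ij} = \tfrac{1}{2}(d_{0i}^2 + d_{0j}^2 - d_{ij}^2)$ is positive semidefinite. This criterion and all names in the code are brought in for illustration and are not part of the slides; the star is taken to be the graph with one centre and three leaves.

```python
from itertools import combinations

def det(m):
    """Determinant by cofactor expansion; adequate for tiny matrices."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def euclidean_embeddable(dist, tol=1e-9):
    """Gram-matrix test relative to point 0, given a full distance matrix."""
    n = len(dist)
    gram = [[(dist[0][i] ** 2 + dist[0][j] ** 2 - dist[i][j] ** 2) / 2
             for j in range(1, n)] for i in range(1, n)]
    # A symmetric matrix is PSD iff every principal minor is nonnegative.
    return all(det([[gram[i][j] for j in rows] for i in rows]) >= -tol
               for r in range(1, n) for rows in combinations(range(n - 1), r))

k4 = [[0, 1, 1, 1], [1, 0, 1, 1], [1, 1, 0, 1], [1, 1, 1, 0]]
c4 = [[0, 1, 2, 1], [1, 0, 1, 2], [2, 1, 0, 1], [1, 2, 1, 0]]
star = [[0, 1, 1, 1], [1, 0, 2, 2], [1, 2, 0, 2], [1, 2, 2, 0]]  # vertex 0 is the centre

print(euclidean_embeddable(k4), euclidean_embeddable(c4), euclidean_embeddable(star))
```

Under this test only $K_4$ passes (it is a regular tetrahedron). In $C_4$ and in the star, a point at distance 1 from two points that are 2 apart would have to be their midpoint, which cannot hold for two distinct points at once.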

  7. Distortion
     Contraction: the maximum factor by which distances shrink; it equals $\max_{x, y \in X} \frac{d(x, y)}{d'(f(x), f(y))}$.
     Expansion: the maximum factor by which distances are stretched; it equals $\max_{x, y \in X} \frac{d'(f(x), f(y))}{d(x, y)}$.
     Distortion: the distortion of an embedding is the product of its expansion and contraction factors.
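
The three quantities can be computed directly from their definitions. The Python sketch below does so for one hypothetical example: the 4-cycle with shortest-path distances, placed on the corners of a unit square in the Euclidean plane. The function name, the example, and the requirement that `f` be injective (so no distance in the target is zero) are assumptions of this illustration.

```python
import math
from itertools import combinations

def distortion(points, d, d_prime, f):
    """Return (contraction, expansion, distortion) of an injective map f."""
    pairs = list(combinations(points, 2))
    contraction = max(d(x, y) / d_prime(f(x), f(y)) for x, y in pairs)
    expansion = max(d_prime(f(x), f(y)) / d(x, y) for x, y in pairs)
    return contraction, expansion, contraction * expansion

# 4-cycle 0 - 1 - 2 - 3 - 0 mapped to the corners of a unit square.
cycle = [0, 1, 2, 3]
d_cycle = lambda x, y: min((x - y) % 4, (y - x) % 4)
corner = {0: (0, 0), 1: (1, 0), 2: (1, 1), 3: (0, 1)}

c, e, dist_factor = distortion(cycle, d_cycle, math.dist, corner.get)
print(c, e, dist_factor)
```

Adjacent vertices keep distance 1, while opposite vertices shrink from 2 to $\sqrt{2}$, so the contraction and the distortion are both $\sqrt{2}$ and the expansion is 1.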

  8. $\langle X, d \rangle \xrightarrow{D} L_\infty^{k = O(D n^{2/D} \log n)}$
     Input: A metric space $\langle X, d \rangle$, where $X$ is a set of $n$ points and $d$ satisfies the metric properties.
     Output: An embedding of $X$ in a $k = O(D n^{2/D} \log n)$ dimensional space such that the distances get distorted (actually contracted) by a factor of at most $D$ under the $L_\infty$ norm.

  9. $\langle X, d \rangle \xrightarrow{D} L_\infty^{k = O(D n^{2/D} \log n)}$ (contd.)
     Let $x, y \in X$ and let $f(x), f(y)$ be their embeddings in the $k$-dimensional space, respectively.
     Property: The distances get contracted by a factor of at most $D \ge 1$. Formally, $\max_{x, y \in X} \frac{d(x, y)}{\|f(x) - f(y)\|_\infty} \le D$.
     Example: If $D = O(\log n)$, then $k = O(\log^2 n)$, i.e. $\langle X, d \rangle \xrightarrow{O(\log n)} L_\infty^{O(\log^2 n)}$.
     Meaning: Any metric space $\langle X, d \rangle$ can be embedded in an $O(\log^2 n)$-dimensional space, and the distances may distort (contract) by a factor of at most $O(\log n)$.
     Applications?
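
The arithmetic behind the example is short. As a worked instance, take $D = \log_2 n$ (the base of the logarithm is an assumption made here for concreteness; any constant multiple of $\log n$ behaves the same way up to constants):

```latex
D = \log_2 n
\;\Longrightarrow\;
n^{2/D} = n^{2/\log_2 n} = \left(2^{\log_2 n}\right)^{2/\log_2 n} = 2^{2} = 4,
\qquad
k = O\!\left(D \, n^{2/D} \log n\right)
  = O\!\left(\log n \cdot 4 \cdot \log n\right)
  = O\!\left(\log^{2} n\right).
```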

  10. Proof of $\langle X, d \rangle \xrightarrow{D} L_\infty^{k = O(D n^{2/D} \log n)}$
     Constructive proof via a randomized algorithm.
     Definition: Let $S \subseteq X$. For $x \in X$, define the distance of $x$ from $S$ as $d(x, S) = \min_{z \in S} d(x, z)$.
     Claim: Let $x, y \in X$. For all $S \subseteq X$, $|d(x, S) - d(y, S)| \le d(x, y)$.
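
To make the definition and the claim concrete, the following Python sketch computes $d(x, S)$ and checks the claim exhaustively on a small example: six points on a line, with every non-empty subset $S$. The helper name `dist_to_set` and the example metric are assumptions of this illustration; the sketch requires $S$ to be non-empty.

```python
from itertools import combinations

def dist_to_set(x, S, d):
    """d(x, S) = min over z in S of d(x, z); S must be non-empty."""
    return min(d(x, z) for z in S)

line = list(range(6))
d_line = lambda x, y: abs(x - y)

# Every non-empty subset of the six points.
subsets = [S for r in range(1, len(line) + 1) for S in combinations(line, r)]

claim_holds = all(
    abs(dist_to_set(x, S, d_line) - dist_to_set(y, S, d_line)) <= d_line(x, y)
    for S in subsets for x in line for y in line
)
print(claim_holds)
```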

  11. Proof Contd.
     Definition (Mapping): Let $x \in X$. Let $S_1, S_2, \cdots, S_k \subseteq X$. The mapping $f$ maps $x$ to the point $f(x) = \{d(x, S_1), d(x, S_2), \cdots, d(x, S_k)\}$.
     Observation: Let $S_1, S_2, \cdots, S_k \subseteq X$. For $x, y \in X$, $\|f(x) - f(y)\|_\infty \le d(x, y)$.

  12. Proof Contd. (2020-10-19)
     Proof of the Observation: Follows from the above claim, as for each $1 \le i \le k$, $|d(x, S_i) - d(y, S_i)| \le d(x, y)$.
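
The mapping and the observation can be sketched in a few lines of Python. The subsets below are picked by hand purely for illustration (the slides choose them at random later), and the names `embed` and `linf` are assumptions of this example.

```python
def embed(x, sets, d):
    """f(x) = (d(x, S_1), ..., d(x, S_k)) for non-empty subsets S_i."""
    return [min(d(x, z) for z in S) for S in sets]

def linf(u, v):
    """L-infinity distance between two vectors of equal length."""
    return max(abs(a - b) for a, b in zip(u, v))

line = list(range(8))
d_line = lambda x, y: abs(x - y)
sets = [{0, 5}, {2}, {1, 6, 7}]          # hand-picked subsets, k = 3

f = {x: embed(x, sets, d_line) for x in line}
never_expands = all(linf(f[x], f[y]) <= d_line(x, y) for x in line for y in line)
print(f[3], f[4], never_expands)
```

The map never stretches a distance, but it may shrink one; how much it shrinks depends entirely on the choice of subsets.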

  13. Randomized Algorithm
     Input: Metric space $X$ and parameter $D$.
     Output: A set of $O(Dm)$ subsets of $X$.
     1. $p \leftarrow \min(\frac{1}{2}, n^{-2/D})$
     2. $m \leftarrow O(n^{2/D} \log n)$
     3. For $j \leftarrow 1$ to $\lceil D/2 \rceil$ and for $i \leftarrow 1$ to $m$: choose set $S_{ij}$ by sampling each element of $X$ independently with probability $p^j$.
     4. For each $x \in X$ return $f(x) = [d(x, S_{11}), \cdots, d(x, S_{m1}), d(x, S_{12}), \cdots, d(x, S_{m2}), \cdots, d(x, S_{1 \lceil D/2 \rceil}), \cdots, d(x, S_{m \lceil D/2 \rceil})]$.
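
The four steps above can be sketched in Python as follows. Two choices are assumptions made for this example and are not fixed by the slides: the hidden constant in $m = O(n^{2/D} \log n)$ is exposed as a parameter `c` (default 1), and an empty sample contributes the constant coordinate 0, which changes no pairwise difference. The function name and the `seed` parameter are also illustrative.

```python
import math
import random

def random_embedding(points, d, D, c=1.0, seed=0):
    """Sketch of the randomized algorithm; returns {x: f(x)}."""
    rng = random.Random(seed)
    n = len(points)                                   # assumes n >= 2
    p = min(0.5, n ** (-2.0 / D))                     # step 1
    m = max(1, math.ceil(c * n ** (2.0 / D) * math.log(n)))  # step 2, constant c assumed
    sets = []
    for j in range(1, math.ceil(D / 2) + 1):          # step 3: j outer, i inner
        for _ in range(m):
            sets.append([z for z in points if rng.random() < p ** j])

    def coord(x, S):
        # Assumption: d(x, S) is taken to be 0 when the sample S is empty.
        return min((d(x, z) for z in S), default=0.0)

    return {x: [coord(x, S) for S in sets] for x in points}   # step 4

line = list(range(16))
d_line = lambda x, y: abs(x - y)
f = random_embedding(line, d_line, D=4)
print(len(f[0]))
```

By the earlier observation, the returned map never expands a distance regardless of the random choices. The contraction bound of $D$ is what the rest of the proof establishes probabilistically, and this sketch does not verify it.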

  14. An Observation
     Let $x, y$ be two distinct points of $X$. Let $B(x, r)$ be the set of points of $X$ that are within a distance of $r$ from $x$ (think of $B(x, r)$ as a ball of radius $r$ centred at $x$). Similarly, let $B(y, r + \Delta)$ be the set of points of $X$ that are within a distance of $r + \Delta$ from $y$. Consider a subset $S \subset X$ such that $S \cap B(x, r) \neq \emptyset$ and $S \cap B(y, r + \Delta) = \emptyset$. Then $|d(x, S) - d(y, S)| \ge \Delta$.
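
The observation follows in two steps from the definition of $d(\cdot, S)$. The derivation below is a sketch that reads "within a distance of $r$" as "at distance at most $r$", which is an assumption about the slide's convention:

```latex
S \cap B(x, r) \neq \emptyset
  \;\Longrightarrow\; d(x, S) \le r,
\qquad
S \cap B(y, r + \Delta) = \emptyset
  \;\Longrightarrow\; d(y, S) > r + \Delta,
\\[4pt]
\text{hence}\quad
|d(x, S) - d(y, S)| \;\ge\; d(y, S) - d(x, S) \;>\; (r + \Delta) - r \;=\; \Delta .
```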

  15. Key Lemma
     Lemma: Let $x, y$ be two distinct points of $X$. There exists an index $j \in \{1, \cdots, \lceil D/2 \rceil\}$ such that if $S_{ij}$ is as chosen in the Algorithm, then $\Pr\left[\|f(x) - f(y)\|_\infty \ge \frac{d(x, y)}{D}\right] \ge \frac{p}{12}$.

  16. Ball Properties
     Set $\Delta = \frac{d(x, y)}{D}$.
     For $i = 0, \cdots, \lceil D/2 \rceil$, define balls of radius $i\Delta$ as follows.
     Let $B_0 = \{x\}$.
     $B_1$ is the ball of radius $\Delta$ centred at $y$.
     $B_2$ is the ball of radius $2\Delta$ centred at $x$.
     $B_3$ is the ball of radius $3\Delta$ centred at $y$, and so on.
     Property I: No even ball overlaps with an odd ball.

  17. Ball Properties (contd.)
     For even (odd) $i$, let $|B_i|$ denote the number of points of $X$ that are within a distance of at most $i\Delta$ from $x$ (respectively, $y$).
     Property II: There is an index $t \in \{0, \cdots, \lceil D/2 \rceil - 1\}$ such that $|B_t| \ge n^{2t/D}$ and $|B_{t+1}| \le n^{2(t+1)/D}$.
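
To see the balls and Property II on a concrete instance, the Python sketch below counts $|B_i|$ for alternating centres and scans for the first index $t$ that satisfies both inequalities. The function names, the small tolerance `eps` guarding floating-point comparisons, and the example (16 points on a line with $D = 4$) are assumptions of this illustration.

```python
import math

def ball_sizes(points, d, x, y, D):
    """|B_i| for i = 0..ceil(D/2); even balls centred at x, odd balls at y."""
    delta = d(x, y) / D
    sizes = []
    for i in range(math.ceil(D / 2) + 1):
        centre = x if i % 2 == 0 else y
        sizes.append(sum(1 for z in points if d(centre, z) <= i * delta))
    return sizes

def find_t(points, d, x, y, D, eps=1e-9):
    """First t with |B_t| >= n^(2t/D) and |B_{t+1}| <= n^(2(t+1)/D)."""
    n = len(points)
    sizes = ball_sizes(points, d, x, y, D)
    for t in range(len(sizes) - 1):
        if (sizes[t] >= n ** (2 * t / D) - eps
                and sizes[t + 1] <= n ** (2 * (t + 1) / D) + eps):
            return t
    return None

line = list(range(16))
d_line = lambda a, b: abs(a - b)
print(ball_sizes(line, d_line, 0, 15, 4), find_t(line, d_line, 0, 15, 4))
```

Here $\Delta = 3.75$, the ball sizes are 1, 4, and 8, and $t = 0$ already works because $|B_0| = 1 = n^0$ and $|B_1| = 4 = 16^{1/2}$.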

  18. Ball Properties (contd.)
     Let $t$ be the index such that $|B_t| \ge n^{2t/D}$ and $|B_{t+1}| \le n^{2(t+1)/D}$.
     Consider when $j = t + 1$ in the Algorithm.
     Property III: The set $S_{ij}$ chosen by the algorithm has non-empty intersection with $B_t$ with probability at least $p/3$, and it will avoid $B_{t+1}$ with probability at least $1/4$.
     Define:
     Event $E_1$: $S_{ij} \cap B_t \neq \emptyset$.
     Event $E_2$: $S_{ij} \cap B_{t+1} = \emptyset$.

  19. Event $E_1$
     $\Pr(S_{ij} \cap B_t \neq \emptyset) \ge p/3$

  20. Event $E_2$
     $\Pr(S_{ij} \cap B_{t+1} = \emptyset) \ge 1/4$
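
The two bounds combine into the $p/12$ of the Key Lemma. The step below is a sketch of how that combination presumably goes and is not taken verbatim from the slides: since $B_t$ and $B_{t+1}$ have different parity they are disjoint by Property I, and since elements are sampled independently, $E_1$ and $E_2$ depend on disjoint sets of coin flips and are therefore independent.

```latex
\Pr[E_1 \wedge E_2] \;=\; \Pr[E_1] \cdot \Pr[E_2]
  \;\ge\; \frac{p}{3} \cdot \frac{1}{4} \;=\; \frac{p}{12},
\\[4pt]
E_1 \wedge E_2
  \;\Longrightarrow\;
  |d(x, S_{ij}) - d(y, S_{ij})| \;\ge\; \Delta \;=\; \frac{d(x, y)}{D}
  \qquad \text{(Observation with } r = t\Delta \text{)}.
```

When $t$ is odd, $B_t$ is centred at $y$ rather than $x$, and the Observation is applied with the roles of $x$ and $y$ swapped.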

  21. Main Theorem
     $\langle X, d \rangle \xrightarrow{D} L_\infty^{k = O(D n^{2/D} \log n)}$

  22. Corollary 1: $\langle X, d \rangle \xrightarrow{\Theta(\log n)} L_\infty^{O(\log^2 n)}$
     Set $D = \Theta(\log n)$ in the Theorem $\langle X, d \rangle \xrightarrow{D} L_\infty^{k = O(D n^{2/D} \log n)}$ and we obtain $\langle X, d \rangle \xrightarrow{\Theta(\log n)} L_\infty^{O(\log^2 n)}$.
