Near Optimal Dimensionality Reductions that Preserve Volumes

  1. Near Optimal Dimensionality Reductions that Preserve Volumes
     RANDOM/APPROX 2008. Avner Magen and Anastasios Zouzias, University of Toronto. August 2008.
     (Slide footer: A. Zouzias (University of Toronto), Dimensionality Reductions for Volumes, RANDOM/APPROX 2008.)

  5. Dimension Reduction
     $P \subseteq \mathbb{R}^t$: a set of $n$ points.
     Goal: find $f : P \to \mathbb{R}^d$ ($d \ll n, t$) such that some property is preserved.
     Measure of quality (Distance): $f$ has distortion $1 + \varepsilon$ if for all $p, q \in P$,
     $$\|p - q\| \le \|f(p) - f(q)\| \le (1 + \varepsilon)\,\|p - q\|.$$
     Measure of quality (Volume) (this talk): $f$ has volume distortion $1 + \varepsilon$ if for all $S \subset P$ with $|S| \le k$,
     $$1 \le \left( \frac{\mathrm{vol}(f(S))}{\mathrm{vol}(S)} \right)^{\frac{1}{|S| - 1}} \le 1 + \varepsilon.$$
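A worked example, not taken from the slides, may help explain the exponent $\frac{1}{|S|-1}$ in the volume definition. Assume the map is a pure scaling $f(x) = c\,x$ with $c > 0$ (a hypothetical choice for illustration). A set of $|S|$ points spans a simplex of dimension $|S| - 1$, so its volume scales by $c^{|S|-1}$:

```latex
\mathrm{vol}(f(S)) = c^{\,|S|-1}\,\mathrm{vol}(S)
\quad\Longrightarrow\quad
\left( \frac{\mathrm{vol}(f(S))}{\mathrm{vol}(S)} \right)^{\frac{1}{|S|-1}} = c .
```

Under this reading the root turns a volume ratio into a per-dimension stretch factor, and for $|S| = 2$ the exponent is 1, so the quantity reduces to the ratio of distances used in the distance definition.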

  7. Johnson-Lindenstrauss Lemma
     Lemma (Distances). Let $P$ be an $n$-point subset of Euclidean space. There exists a mapping $f$ from $P$ into $\mathbb{R}^d$, $d = O(\varepsilon^{-2} \log n)$, such that for all $x, y \in P$,
     $$(1 - \varepsilon)\,\|x - y\| \le \|f(x) - f(y)\| \le (1 + \varepsilon)\,\|x - y\|.$$
     Almost tight: lower bound $\Omega(\varepsilon^{-2} \log n / \log(1/\varepsilon))$ [Alon, 2003].

  9. Random Projections
     Many ways to generate such a (linear) mapping (encoded by $X \in \mathbb{R}^{n \times d}$):
     - $X_{ij} \sim N(0, 1)$ (this talk)
     - $X_{ij} = \pm 1$, each with probability $1/2$
     - Sparse Gaussian matrix (with preprocessing)
     - Entries with subgaussian tails
     - ECC and Rademacher random variables
     - Lean Walsh Transform (next talk, [Liberty et al., 2008])
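The first two constructions on this slide can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the authors' code: the function names, the test point set, the sizes `t`, `d`, `n`, and the $1/\sqrt{d}$ normalization are all assumptions made for the example. The matrix has one row per input coordinate.

```python
import math
import random

def gaussian_matrix(t, d, rng):
    """t x d matrix with independent N(0, 1) entries."""
    return [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(t)]

def sign_matrix(t, d, rng):
    """t x d matrix with independent entries +1 or -1, each with probability 1/2."""
    return [[rng.choice((-1.0, 1.0)) for _ in range(d)] for _ in range(t)]

def project(point, X):
    """Map a point in R^t to R^d as (1/sqrt(d)) * point * X (normalization assumed)."""
    d = len(X[0])
    scale = 1.0 / math.sqrt(d)
    return [scale * sum(p * row[j] for p, row in zip(point, X)) for j in range(d)]

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def distance_ratios(points, X):
    """Ratio of projected to original distance for every pair of points."""
    images = [project(p, X) for p in points]
    ratios = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            ratios.append(distance(images[i], images[j]) / distance(points[i], points[j]))
    return ratios

# Hypothetical test data: n random points in R^t, projected to R^d.
rng = random.Random(0)
t, d, n = 30, 400, 10
points = [[rng.uniform(-1.0, 1.0) for _ in range(t)] for _ in range(n)]
for make in (gaussian_matrix, sign_matrix):
    ratios = distance_ratios(points, make(t, d, rng))
    print(make.__name__, round(min(ratios), 3), round(max(ratios), 3))
```

Both generators produce zero-mean, unit-variance entries, which is why the same scaling works for either; the printed minimum and maximum ratios give a rough empirical view of the distortion on this particular point set.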

  11. Related Work: Extensions of JL to other cases
      - [Magen, 2002] Preserve volumes of subsets of size up to $k$, and affine distances, using $O(k \varepsilon^{-2} \log n)$ dimensions.
      - [Sarlos, 2006] Preserve distances of all points lying in any $k$-dimensional linear subspace by projecting into $O(k \varepsilon^{-2} \log(k/\varepsilon))$ dimensions.
      - [Wakin and Baraniuk, 2006, Agarwal et al., 2007, Clarkson, 2008] Moving points, curves, surfaces, manifolds, etc.
      Our Contribution
      - Improve Magen's result for volumes by showing that $O(\max\{k/\varepsilon,\ \varepsilon^{-2} \log n\})$ dimensions are enough.
      - The JL Lemma preserves more than distances: it preserves volumes of subsets of size up to $\log n / \varepsilon$.
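To see how the two volume bounds compare, the sketch below evaluates them with every hidden constant in the $O(\cdot)$ set to 1. That choice, the function names, and the sample values of `n`, `eps`, and `k` are assumptions for illustration only; the slides do not give constants.

```python
import math

def magen_2002_dims(k, eps, n):
    """k * eps^-2 * log n, with the hidden constant assumed to be 1."""
    return k * math.log(n) / eps ** 2

def magen_zouzias_dims(k, eps, n):
    """max(k / eps, eps^-2 * log n), with the hidden constant assumed to be 1."""
    return max(k / eps, math.log(n) / eps ** 2)

def jl_volume_threshold(eps, n):
    """Value of k at which k / eps equals eps^-2 * log n, i.e. log n / eps."""
    return math.log(n) / eps

n, eps = 10 ** 6, 0.1
print("threshold k:", round(jl_volume_threshold(eps, n), 1))
for k in (2, 10, 100, 1000):
    print(k, round(magen_2002_dims(k, eps, n)), round(magen_zouzias_dims(k, eps, n)))
```

For every `k` below the threshold $\log n / \varepsilon$ the maximum is attained by the $\varepsilon^{-2} \log n$ term, which is the same dimension the JL Lemma uses for distances. That is the arithmetic behind the remark that JL already preserves volumes of subsets of size up to $\log n / \varepsilon$.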

  13. Our Result
      Theorem. Let $P \subset \mathbb{R}^n$. There exists $f : P \to \mathbb{R}^d$, $d = O(\max\{k/\varepsilon,\ \varepsilon^{-2} \log n\})$, such that for every subset $S$ of $P$ with $1 < |S| < k$,
      $$1 - \varepsilon \le \left( \frac{\mathrm{vol}(f(S))}{\mathrm{vol}(S)} \right)^{\frac{1}{|S| - 1}} \le 1 + \varepsilon.$$
      Overview of proof:
      - There are roughly $O(n^s)$ sets of size $s$.
      - It suffices to prove that the failure probability for a subset of size $s$ is roughly $e^{-\Omega(s d \varepsilon^2)}$. (Core of the talk.)
      - A union bound implies that a volume-preserving mapping exists.
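The union-bound step can be written out as a short calculation. This is a sketch under an assumed form of the per-subset bound: the constant $c > 0$ and the exact shape $e^{-c\,s\,d\,\varepsilon^2}$ stand in for the slide's "roughly $e^{-\Omega(s d \varepsilon^2)}$", and the choice $d \ge \frac{2}{c}\,\varepsilon^{-2} \ln n$ is made only to keep the arithmetic clean.

```latex
\Pr[\text{some } S \text{ fails}]
  \;\le\; \sum_{s=2}^{k} \binom{n}{s}\, e^{-c\,s\,d\,\varepsilon^{2}}
  \;\le\; \sum_{s=2}^{k} \left( n\, e^{-c\,d\,\varepsilon^{2}} \right)^{s}
  \;\le\; \sum_{s \ge 2} n^{-s}
  \;=\; \frac{1}{n(n-1)} \;<\; 1
  \qquad \text{for } d \ge \tfrac{2}{c}\,\varepsilon^{-2} \ln n,\ n \ge 2 .
```

Since the failure probability is below 1, some outcome of the random map preserves all the volumes at once. This calculation only accounts for the $\varepsilon^{-2} \log n$ term of the theorem; it does not show where the $k/\varepsilon$ term comes from.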

  14. Proof
      Two steps:
      1. Prove it for the regular $n$-simplex.
      2. Reduce the general case to the above case.

  17. The $n$-simplex
      Assume the input points are $\{e_1, \ldots, e_n\}$. Form a matrix with $e_i$ as its $i$-th row, i.e., the identity matrix.
      Random projection (without normalization):
      $$\begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}_{n \times n} \Big( X_{ij} \sim N(0, 1) \Big)_{n \times d}$$

  20. The $n$-simplex
      Assume the input points are $\{e_1, \ldots, e_n\}$. The projected points are random Gaussian vectors in $\mathbb{R}^d$.
      Pick any subset $S$, $|S| = s$, of rows of $X$:
      $$X_S := \Big( X_{ij} \sim N(0, 1) \Big)_{s \times d}$$
      $$\mathrm{vol}(S \cup \{0\}) = \frac{\sqrt{\det(X_S X_S^\top)}}{s!}$$
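The Gram-determinant formula for the volume of the simplex spanned by the origin and the rows of $X_S$ can be checked numerically. The sketch below uses only the Python standard library; the helper names, the elimination-based determinant, and the sizes `s` and `d` are choices made for this example, not something the slides specify.

```python
import math
import random

def gram_det(rows):
    """det(R R^T) for a list of row vectors R, via Gaussian elimination with pivoting."""
    s = len(rows)
    g = [[sum(a * b for a, b in zip(r1, r2)) for r2 in rows] for r1 in rows]
    det = 1.0
    for col in range(s):
        pivot = max(range(col, s), key=lambda r: abs(g[r][col]))
        if abs(g[pivot][col]) < 1e-12:
            return 0.0  # rows are (numerically) linearly dependent
        if pivot != col:
            g[col], g[pivot] = g[pivot], g[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= g[col][col]
        for r in range(col + 1, s):
            factor = g[r][col] / g[col][col]
            for c in range(col, s):
                g[r][c] -= factor * g[col][c]
    return det

def simplex_volume_with_origin(rows):
    """Volume of the simplex with vertices 0 and the given rows: sqrt(det(R R^T)) / s!."""
    s = len(rows)
    return math.sqrt(max(gram_det(rows), 0.0)) / math.factorial(s)

# Hypothetical sizes: s Gaussian rows in R^d, as for a subset of a projected simplex.
random.seed(0)
d, s = 10, 3
X_S = [[random.gauss(0.0, 1.0) for _ in range(d)] for _ in range(s)]
vol = simplex_volume_with_origin(X_S)
print("volume of projected subset with origin:", vol)
```

Two hand-checkable cases: the rows $e_1, e_2, e_3$ give $1/3! = 1/6$, and the rows $(2, 0)$ and $(0, 3)$ give a right triangle of area 3.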
