Rehashing Kernel Evaluation in High Dimensions (presentation transcript)

Slide 1. Title
Rehashing Kernel Evaluation in High Dimensions
Paris Siminelakis* (Ph.D. Candidate), Kexin Rong*, Peter Bailis, Moses Charikar, Philip Levis (Stanford University)
ICML, Long Beach, California, June 11, 2019. *Equal contribution.
Outline: Intro, Contribution, Sketching, Diagnostics, Evaluation, Conclusion

Slides 2–5. Kernel Density Function
Given a point set P = {x_1, ..., x_n} \subset R^d, a kernel k : R^d \times R^d \to R_+, weights u_i \geq 0, and a query point q, the weighted kernel density function is

    KDF^u_P(q) = \sum_{i=1}^{n} u_i k(x_i, q).

With uniform weights u_i = 1/n this is the usual kernel density estimate

    KDF_P(q) = (1/n) \sum_{i=1}^{n} k(x_i, q).
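The sum above can be evaluated exactly in O(nd) time by a direct pass over the points. A minimal NumPy sketch of this exact evaluation (the function name, defaults, and the Gaussian-kernel choice are ours, not from the slides):

```python
import numpy as np

def kdf(P, q, u=None, bandwidth=1.0):
    """Exact weighted kernel density at query q: sum_i u_i * k(x_i, q).

    P: (n, d) array of points, q: (d,) query, u: (n,) weights
    (defaults to uniform 1/n), using the Gaussian kernel
    k(x, q) = exp(-||x - q||^2 / bandwidth^2) as a concrete example.
    """
    n = len(P)
    if u is None:
        u = np.full(n, 1.0 / n)  # uniform weights -> kernel density estimate
    sq_dists = np.sum((P - q) ** 2, axis=1)  # ||x_i - q||^2 for all i at once
    return float(u @ np.exp(-sq_dists / bandwidth ** 2))
```

With all points equal to q, every kernel value is 1, so the uniform-weight estimate is exactly 1.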

Slides 6–11. Kernel Density Evaluation
Where is KDF^u_P(q) = \sum_{i=1}^{n} u_i k(x_i, q) used?
1. Non-parametric density estimation: KDF_P(q).
2. Kernel methods: f(x) = \sum_i \alpha_i \phi(\|x - x_i\|).
3. Comparing point sets (distributions) with the "kernel distance".
Evaluating KDF exactly at a single query point requires O(n) time. How fast can we approximate KDF?

Slides 12–17. Methods for Fast Kernel Evaluation
Goal: given P \subset R^d and \epsilon > 0, return a (1 \pm \epsilon)-approximation to \mu := KDF_P(q) for any q \in R^d.
- Space partitions, cost log(1/\mu\epsilon)^{O(d)}: FMM [Greengard, Rokhlin '87], Dual-Tree [Lee, Gray, Moore '06], FIGTree [Morariu et al., NeurIPS '09]. Slow in high dimensions.
- Random sampling, cost 1/\mu\epsilon^2: linear in 1/\mu.
- Hashing, cost O(1/\sqrt{\mu}\epsilon^2): Hashing-Based-Estimators [Charikar, S. '17]; similar idea: Locality Sensitive Samplers [Spring, Shrivastava '17]. Sub-linear in 1/\mu.
Key idea behind the hashing approach: importance sampling via randomized space partitions.
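The random-sampling row above is plain Monte Carlo: average the kernel over m points drawn uniformly from P. For a kernel bounded by 1, the relative variance of a single draw is at most 1/\mu, which is where the 1/\mu\epsilon^2 sample count comes from. A hedged sketch (function name and Gaussian kernel are our choices, not the slides'):

```python
import numpy as np

def kdf_sample(P, q, m, bandwidth=1.0, rng=None):
    """Uniform random-sampling estimate of KDF_P(q).

    Averages the Gaussian kernel over m indices drawn uniformly with
    replacement from P. The estimator is unbiased; standard concentration
    bounds need m = O(1/(mu * eps^2)) samples for a (1 +/- eps)-approximation
    of mu = KDF_P(q), i.e. cost grows linearly in 1/mu.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    idx = rng.integers(0, len(P), size=m)   # m uniform draws from P
    sq = np.sum((P[idx] - q) ** 2, axis=1)
    return float(np.mean(np.exp(-sq / bandwidth ** 2)))
```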

Slides 18–19. Randomized Space Partitions
A distribution H over partitions h : R^d \to [M]. [Figure: six independent random partitions h_1, ..., h_6 of the same point set.]

Slide 20. Locality Sensitive Hashing
Partitions H such that P_{h \sim H}[h(x) = h(y)] = p(\|x - y\|), e.g. Euclidean LSH [Datar, Immorlica, Indyk, Mirrokni '04]. Concatenating k hashes gives collision probability p^k(\|x - y\|).
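The Euclidean LSH family cited above hashes by random projection and bucketing: h(x) = floor((a . x + b) / w) with Gaussian a and uniform offset b, so closer points collide more often. A sketch (class name and parameter defaults are ours):

```python
import numpy as np

class EuclideanLSH:
    """Euclidean LSH in the style of Datar et al. '04.

    Each of the k coordinates is h_j(x) = floor((a_j . x + b_j) / w)
    with a_j ~ N(0, I) and b_j ~ U[0, w). Concatenating k hashes makes
    the collision probability p^k(||x - y||), sharpening the partition.
    """

    def __init__(self, d, k=1, w=4.0, rng=None):
        if rng is None:
            rng = np.random.default_rng(0)
        self.a = rng.normal(size=(k, d))          # random projection directions
        self.b = rng.uniform(0.0, w, size=k)      # random offsets
        self.w = w                                # bucket width

    def hash(self, x):
        # Tuple of k bucket indices; serves as a hash-table key.
        return tuple(np.floor((self.a @ x + self.b) / self.w).astype(int))
```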

Slides 21–25. Hashing-Based-Estimators [Charikar, S. FOCS '17]
Preprocess: sample h_1, ..., h_m \sim H and evaluate each on P, building m hash tables.
Query: let H_t(q) denote the hash bucket of q in table t.
Estimator: sample a random point X_t from H_t(q) in each table and return

    Z_m = (1/m) \sum_{t=1}^{m} (1/n) \cdot k(X_t, q) / (p(X_t, q) / |H_t(q)|).

Open questions: how many samples m? Which LSH?
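The preprocess/query/estimate steps above can be sketched as follows. The helper names are ours; the kernel and the collision probability p are passed in as parameters rather than fixed, since the right choices depend on the LSH family:

```python
import numpy as np

def build_tables(P, m, make_lsh):
    """Preprocess: draw m hash functions and bucket all points of P."""
    tables = []
    for _ in range(m):
        lsh = make_lsh()
        buckets = {}
        for i, x in enumerate(P):
            buckets.setdefault(lsh.hash(x), []).append(i)
        tables.append((lsh, buckets))
    return tables

def hbe_estimate(P, q, tables, kernel, collision_prob):
    """HBE query: our rendering of the slide's estimator Z_m.

    For each table t, sample X_t uniformly from q's bucket H_t(q) and
    average k(X_t, q) * |H_t(q)| / (n * p(X_t, q)); an empty bucket
    contributes 0. collision_prob(x, q) must equal P[h(x) = h(q)].
    """
    rng = np.random.default_rng(0)
    n, total = len(P), 0.0
    for lsh, buckets in tables:
        bucket = buckets.get(lsh.hash(q), [])
        if not bucket:
            continue                       # empty bucket: contributes 0
        x = P[rng.choice(bucket)]          # uniform sample from the bucket
        total += kernel(x, q) * len(bucket) / (n * collision_prob(x, q))
    return total / len(tables)
```

Sampling within a bucket up-weights points that collide with q, which is exactly the importance-sampling view of randomized space partitions mentioned earlier.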

Slides 26–28. Hashing-Based-Estimators have Practical Limitations
Theorem [Charikar, S. FOCS '17]: For certain kernels, HBE solves the kernel evaluation problem for \mu \geq \tau using O(1/\sqrt{\mu}\epsilon^2) samples and O(n/\sqrt{\tau}\epsilon^2) space.

Kernel                      | LSH                              | Overhead
e^{-\|x - y\|_2^2}          | Ball Carving [Andoni, Indyk '06] | e^{\tilde{O}(\log^{2/3} n)}
e^{-\|x - y\|_2}            | Euclidean [Datar et al. '04]     | \sqrt{e}
1/(1 + \|x - y\|_2^t)       | Euclidean [Datar et al. '04]     | 3^{t/2}

Practical limitations:
1. Super-linear space ⇒ not practical for massive datasets.
2. Uses an adaptive procedure to estimate the number of samples ⇒ large constants plus stringent requirements on the hash functions.
3. For the Gaussian kernel, Ball-Carving LSH is very slow (e^{\tilde{O}(\log^{2/3} n)} overhead).
