SLIDE 1


Rehashing Kernel Evaluation in High Dimensions

Paris Siminelakis*

Ph.D. Candidate

Kexin Rong*, Peter Bailis, Moses Charikar, Phillip Levis (Stanford University)

ICML @ Long Beach, California

* equal contribution.

June 11, 2019

SLIDES 2–5

Kernel Density Function

Setup: a point set P = {x_1, ..., x_n} ⊂ R^d, a kernel k : R^d × R^d → R_+, weights u_i ≥ 0, and a query point q. The weighted kernel density function is

    KDF_P^u(q) = Σ_{i=1}^n u_i · k(x_i, q),

a sum over the n points of P, each contributing through the kernel k. With uniform weights u_i = 1/n this is the familiar

    KDF_P(q) = (1/n) · Σ_{i=1}^n k(x_i, q).
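To make the definition concrete, here is a minimal NumPy sketch of exact KDF evaluation (the Gaussian kernel and the `bandwidth` parameter are illustrative choices, not part of the slides):

```python
import numpy as np

def gaussian_kernel(x, q, bandwidth=1.0):
    """Illustrative kernel: k(x, q) = exp(-||x - q||^2 / bandwidth^2)."""
    return np.exp(-np.sum((x - q) ** 2) / bandwidth ** 2)

def kdf(P, q, u=None, kernel=gaussian_kernel):
    """Exact weighted density KDF_P^u(q) = sum_i u_i * k(x_i, q).

    Defaults to uniform weights u_i = 1/n. Note that this touches
    every point: O(n) kernel evaluations per query.
    """
    n = len(P)
    if u is None:
        u = np.full(n, 1.0 / n)
    return float(sum(u[i] * kernel(P[i], q) for i in range(n)))
```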

SLIDES 6–11

Kernel Density Evaluation

Same setup: P = {x_1, ..., x_n} ⊂ R^d, kernel k : R^d × R^d → R_+, weights u_i ≥ 0, query point q, and KDF_P^u(q) = Σ_{i=1}^n u_i · k(x_i, q).

Where is it used?

1. Non-parametric density estimation: KDF_P(q).
2. Kernel methods: f(x) = Σ_i α_i · φ(x − x_i).
3. Comparing point sets (distributions) with the "kernel distance".

Evaluating the KDF exactly at a single point requires O(n) time.

How fast can we approximate the KDF?

SLIDES 12–17

Methods for Fast Kernel Evaluation

Goal: given P ⊂ R^d and ε > 0, compute a (1 ± ε)-approximation to µ := KDF_P(q) for any q ∈ R^d. Three families of methods, with their query complexities:

- Space partitions: log(1/µε)^O(d). FMM [Greengard, Rokhlin '87], Dual-Tree [Lee, Gray, Moore '06], FIGTree [Morariu et al., NeurIPS '09]. Slow in high dimensions.
- Random Sampling: 1/(µε²). Linear in 1/µ.
- Hashing: O(1/(√µ · ε²)). Hashing-Based-Estimators [Charikar, S. '17]; a similar idea appears in Locality Sensitive Samplers [Spring, Shrivastava '17]. Importance sampling via randomized space partitions; sub-linear in 1/µ.
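For intuition, a minimal sketch of the Random Sampling baseline (constants and the failure-probability boost are omitted; `mu_lower` is an assumed lower bound on the true density):

```python
import numpy as np

def rs_estimate(P, q, kernel, eps, mu_lower, rng=np.random.default_rng()):
    """Random Sampling: average the kernel over m uniformly chosen points.

    A Chebyshev-style argument gives a (1 +/- eps)-approximation with
    constant probability once m ~ 1/(mu * eps^2) -- linear in 1/mu.
    """
    m = int(np.ceil(1.0 / (mu_lower * eps ** 2)))
    idx = rng.integers(0, len(P), size=m)
    return float(np.mean([kernel(P[i], q) for i in idx]))
```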

SLIDES 18–19

Randomized Space Partitions

A randomized space partition is a distribution H over hash functions h : R^d → [M]; each sampled h partitions R^d into at most M cells. (The slide illustrates six independent draws h_1, ..., h_6 of the same family.)

SLIDE 20

Locality Sensitive Hashing

An LSH family is a distribution H over partitions such that

    P_{h∼H}[h(x) = h(y)] = p(x − y),

e.g. Euclidean LSH [Datar, Immorlica, Indyk, Mirrokni '04]. Concatenating k independent hashes yields collision probability p(x − y)^k.
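A minimal sketch of Euclidean LSH with concatenation (the bucket width `r` and concatenation depth `k` are illustrative tuning knobs):

```python
import numpy as np

class EuclideanLSH:
    """Euclidean LSH [Datar et al. '04]: h(x) = floor((a . x + b) / r).

    Concatenating k independent hashes turns the collision
    probability p(x - y) into p(x - y)^k.
    """
    def __init__(self, d, k=3, r=1.0, rng=np.random.default_rng()):
        self.A = rng.normal(size=(k, d))    # one Gaussian projection per hash
        self.b = rng.uniform(0, r, size=k)  # random offset in [0, r)
        self.r = r

    def hash(self, x):
        return tuple(np.floor((self.A @ x + self.b) / self.r).astype(int))
```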

SLIDES 21–25

Hashing-Based-Estimators [Charikar, S. FOCS'17]

Preprocess: sample h_1, ..., h_m ∼ H and evaluate each on P, building m hash tables.
Query: let H_t(q) denote the hash bucket of q in table t.
Estimator: sample a random point X_t from H_t(q) in each table and return

    Z_m = (1/m) · Σ_{t=1}^m (1/n) · k(X_t, q) / ( p(X_t, q) / |H_t(q)| ).

How many samples m? Which LSH family?
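A minimal sketch of the scheme, assuming the `EuclideanLSH` class above and a helper `collision_prob(x, q)` that returns the family's collision probability p(x, q) (both are assumptions for illustration, not the paper's implementation):

```python
import numpy as np
from collections import defaultdict

def build_tables(P, hashes):
    """Preprocess: evaluate each sampled hash function on P."""
    tables = []
    for h in hashes:
        buckets = defaultdict(list)
        for i, x in enumerate(P):
            buckets[h.hash(x)].append(i)
        tables.append(buckets)
    return tables

def hbe_estimate(P, q, hashes, tables, kernel, collision_prob,
                 rng=np.random.default_rng()):
    """One importance-weighted sample per table, averaged over tables.

    Dividing by p(X_t, q) / |H_t(q)| -- the chance of drawing X_t via
    its bucket -- is what makes each term an unbiased estimate of
    KDF_P(q); an empty bucket contributes 0.
    """
    n, total = len(P), 0.0
    for h, buckets in zip(hashes, tables):
        bucket = buckets.get(h.hash(q), [])
        if not bucket:
            continue
        i = bucket[rng.integers(len(bucket))]
        total += kernel(P[i], q) * len(bucket) / (n * collision_prob(P[i], q))
    return total / len(hashes)
```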

SLIDES 26–30

Hashing-Based-Estimators have Practical Limitations

Theorem [Charikar, S. FOCS'17]: For certain kernels, HBE solves the kernel evaluation problem for µ ≥ τ using O(1/(√µ · ε²)) samples and O(n/(√τ · ε²)) space.

Kernel              LSH                               Overhead
exp(−‖x−y‖²)        Ball Carving [Andoni, Indyk '06]  e^{Õ(log^{2/3} n)}
exp(−‖x−y‖)         Euclidean [Datar et al. '04]      √e
1/(1 + ‖x−y‖^t)     Euclidean [Datar et al. '04]      3t/2

Practical limitations:

1. Super-linear space ⇒ not practical for massive datasets.
2. An adaptive procedure estimates the number of samples ⇒ large constants + stringent requirements on the hash functions.
3. For the Gaussian kernel, Ball-Carving LSH is very slow: e^{Õ(log^{2/3}(n))} overhead.

Q: Can HBE be made practical while preserving its theoretical guarantees?

SLIDES 31–35

Overcoming the Practical Limitations of HBE

Each limitation of [Charikar, S. FOCS'17] is resolved in [this work, ICML'19]:

1. Super-linear space! → Sketching (sub-linear space).
2. The adaptive procedure has a large constant overhead. → Improved adaptive procedure + new analysis.
3. Ball-Carving LSH for the Gaussian kernel is slow. → Practical HBE for the Gaussian kernel via Euclidean LSH.

[S.*, Rong*, Bailis, Charikar, Levis, ICML'19]: the first practical and provably accurate algorithm for the Gaussian kernel in high dimensions.

SLIDES 36–42

Going back a step

Q1: Can HBE be made practical while preserving its theoretical guarantees?
Yes: via sketching, an improved adaptive procedure, and Euclidean LSH.

Q2: Is HBE always the better method to use?
No: worst-case bounds do not always reflect reality, and are insufficient to predict performance on a given dataset. Random Sampling, for instance, can need as few as O(1) samples on easy instances and as many as O(1/µ) in the worst case.

[This work, ICML'19]: diagnostic tools that estimate dataset-specific performance without even running HBE.

SLIDE 43

Outline of the rest of the talk

1. Sketching
2. Diagnostic tools
3. Experimental evaluation

SLIDE 44


Sketching

SLIDES 45–50

How to sketch the KDF?

Recall: HBE samples a single point from each hash table.
Goal: "simulate" HBE on the full point set by applying it to a small "sketch".

Two natural approaches, each with a flaw:

1. Uniformly random points ⇒ some buckets might have 0 points in the sketch.
2. One point from each bucket ⇒ might need a large number of points.

Idea: interpolate between uniform over points and uniform over buckets!
Solution: hashing + non-uniform sampling.

SLIDES 51–55

Sketching the Kernel Density Function

Hashing-Based-Sketch (HBS): hashing + non-uniform sampling. Sample a single hash function h_0, evaluate it on P, then:

    S ← ∅
    for j = 1, ..., SketchSize:
        sample a bucket i with probability ∝ n_i^γ   (n_i = size of bucket i)
        sample a random point J from bucket i; S ← S ∪ {J}
        weight it so that E_J[ŵ_J · k(q, x_J)] ∝ KDF_P(q)
    return (ŵ, S)
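A minimal sketch of HBS under the scheme above (the exponent γ = 0.5 and the exact importance weights are illustrative; the paper's tuned choices may differ):

```python
import numpy as np
from collections import defaultdict

def hbs(P, h0, sketch_size, gamma=0.5, rng=np.random.default_rng()):
    """Hashing-Based-Sketch: hash once, sample buckets non-uniformly.

    gamma interpolates between uniform over buckets (gamma = 0) and
    roughly uniform over points (gamma = 1). The importance weights
    make sum_j w_j * k(q, x_j) over the sketch unbiased for n * KDF_P(q).
    """
    buckets = defaultdict(list)
    for i, x in enumerate(P):
        buckets[h0.hash(x)].append(i)
    keys = list(buckets)
    sizes = np.array([len(buckets[key]) for key in keys])
    probs = sizes.astype(float) ** gamma
    probs /= probs.sum()

    S, w = [], []
    for _ in range(sketch_size):
        b = rng.choice(len(keys), p=probs)            # bucket with prob ~ n_i^gamma
        j = buckets[keys[b]][rng.integers(sizes[b])]  # uniform within bucket
        S.append(j)
        # weight = 1 / (Pr[pick this point in one draw] * number of draws)
        w.append(sizes[b] / (probs[b] * sketch_size))
    return np.array(w), S
```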

SLIDES 56–60

Sketching the Kernel Density Function

Theorem: O(1/τ) points suffice to:

- approximate any density µ ≥ τ;
- reduce the space from O(n/√τ) to O(1/τ^{3/2});
- contain a point from any bucket with ≥ n · τ points.

Sub-linear space: e.g., for τ = 1/√n the space drops from n^{5/4} to n^{3/4}.
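To see where the exponents come from, a quick worked check (suppressing ε and constants; each of the O(1/√τ) hash tables now stores the O(1/τ)-point sketch instead of all n points):

```latex
\tau = n^{-1/2}:\qquad
\frac{n}{\sqrt{\tau}} = n \cdot n^{1/4} = n^{5/4}
\quad\longrightarrow\quad
\frac{1}{\sqrt{\tau^{3}}} = \tau^{-3/2} = \bigl(n^{-1/2}\bigr)^{-3/2} = n^{3/4}.
```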

SLIDE 61


Diagnostic tools

SLIDE 62

Variance of Unbiased Estimators

Both Random Sampling and HBE are unbiased estimators. The metric of interest is the average relative variance,

    E_{q∼P} [ V[Z(q)] / E[Z(q)]² ]  ∝  "sample complexity".

Diagnostic procedure:

1. Sample a number T of random queries from P.
2. For each query, upper-bound the relative variance.
3. Average over the T queries, for each method of interest.

In short: estimate the mean and bound the variance.

SLIDES 63–67

Bounding the variance

The variance is a "quadratic polynomial" in the kernel values w_i = k(q, x_i):

    V[Z] ≤ (1/n²) · Σ_{i,j=1}^n w_i² · V_ij.

Random Sampling: V[Z] ≤ E[k²(q, X)] = (1/n²) · Σ_{i,j=1}^n w_i² · V_ij with V_ij = 1.

HBE with collision probability p(x, y): V[Z] ≤ E[Z²] ≤ (1/n²) · Σ_{i,j=1}^n w_i² · V_ij with

    V_ij = min{p(q, x_i), p(q, x_j)} / p²(q, x_i).

Evaluating these bounds naively requires O(n) or O(n²) per query.
Q: Is there an efficient alternative?
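A minimal sketch of the naive bounds, which the slide-62 diagnostic would average over T random queries (`collision_prob` is an assumed helper for the LSH family in use):

```python
import numpy as np

def relative_variance_bounds(P, q, kernel, collision_prob):
    """Upper-bound V[Z] / E[Z]^2 at q for Random Sampling and HBE.

    Naive evaluation: O(n) for the RS bound, O(n^2) for the HBE bound.
    """
    w = np.array([kernel(x, q) for x in P])          # w_i = k(q, x_i)
    p = np.array([collision_prob(x, q) for x in P])  # p(q, x_i)
    n = len(P)
    mu = w.mean()                                    # E[Z] = KDF_P(q)

    rs = np.mean(w ** 2)                             # V_ij = 1 case
    V = np.minimum.outer(p, p) / (p ** 2)[:, None]   # HBE's V_ij matrix
    hbe = np.sum((w ** 2)[:, None] * V) / n ** 2
    return rs / mu ** 2, hbe / mu ** 2
```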

SLIDE 68

Data-dependent Variance Bounds

The variance is a "quadratic polynomial" in w_i = k(q, x_i):

    V[Z] ≤ (1/n²) · Σ_{i,j=1}^n w_i² · V_ij.

Decompose the points into 4 sets. For any two sets S_ℓ, S_ℓ′, with µ_ℓ := (1/n) · Σ_{i∈S_ℓ} w_i:

    (1/n²) · Σ_{i∈S_ℓ, j∈S_ℓ′} w_i² · V_ij ≤ sup_{i∈S_ℓ, j∈S_ℓ′} { (w_i / w_j) · V_ij } · µ_ℓ · µ_ℓ′    (Hölder)

Diagnostic:

1. Bound each of the 4² terms.
2. Evaluate them on a subsample S_0,
3. produced by Random Sampling and the adaptive algorithm.
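As an illustration, a sketch of the 4²-term bound given some 4-way labeling of the points (how the sets are chosen is not specified on the slide; in practice this runs on the subsample S_0 rather than all of P):

```python
import numpy as np

def holder_bound(w, V, labels, n_sets=4):
    """Bound (1/n^2) sum_ij w_i^2 V_ij by 4^2 = 16 Hölder terms.

    labels[i] in {0, ..., n_sets - 1} assigns point i to a set S_l;
    mu[l] = (1/n) * sum of w_i over S_l. Intended for a small
    subsample, where the pairwise sup is affordable.
    """
    n = len(w)
    mu = np.array([w[labels == l].sum() / n for l in range(n_sets)])
    bound = 0.0
    for l in range(n_sets):
        for m in range(n_sets):
            I, J = np.flatnonzero(labels == l), np.flatnonzero(labels == m)
            if len(I) == 0 or len(J) == 0:
                continue
            sup = max(w[i] / w[j] * V[i, j] for i in I for j in J)
            bound += sup * mu[l] * mu[m]
    return bound
```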

SLIDE 69


Evaluation

SLIDE 70

Algorithms for Kernel Evaluation

- Random Sampling (RS): sensitive to the range of kernel values (distances).
- Hashing-Based-Estimators (HBE): sensitive to "correlations" (dense distant clusters). [Charikar, S. FOCS'17] [This work, ICML'19]
- Fast Improved Gauss Transform (FIGTree): sensitive to the number of "clusters" (directions) at a certain distance. [Morariu, Srinivasan, Raykar, Duraiswami, Davis, NeurIPS'09]
- Approximate Skeletonization via Treecodes (ASKIT): sensitive to medium distance scales / cluster sizes. [March, Xiao, Biros, SIAM JSC'15]

We compare their performance on real-world datasets.

SLIDES 71–76

Comparison on Real-world Datasets

- HBE is consistently the best or second-best method.
- The diagnostic correctly chooses between RS and HBE in 21 of 22 cases.

SLIDE 77

Benchmark Instances

Synthetic benchmarks:

1. Worst-case: no single geometric aspect can be exploited!
2. D-clusters: gauge the impact of different geometric aspects.

SLIDE 78

Worst-case Instances

Union of highly-clustered and uncorrelated points (fixed µ = 10⁻³, dimension d ∈ [10, 500], 100K queries).

[Figure: average query time (s) vs. number of dimensions (10–500) for FIGTree, ASKIT, RS, and HBE; an example instance is shown for d = 2.]

On these "worst-case" datasets HBE is best and ASKIT second best.

SLIDE 79

Instances with D clusters

Fix N = n · D = 500K and vary D ∈ [1, 10⁵].

[Figure: average query time (s) vs. number of clusters D for FIGTree, ASKIT, RS, and HBE.]

On D-structured datasets:
- D ≪ √N: space partitions win.
- D ∼ N^{1−δ}: Random Sampling wins.
- 1 ≪ D ≪ N: HBE wins.

SLIDE 80

Conclusion

Rehashing Kernel Evaluation in High Dimensions. Hashing-Based-Estimators:

1. Made practical + often state-of-the-art + worst-case guarantees.
2. Data-dependent diagnostics: when to use them & how to tune them.

The "Rehashing" methodology: Sketch → Diagnostics → Visualization → Config file (deployment).
Open-source implementation and experiments: https://github.com/kexinrong/rehashing

Thank you! psimin@stanford.edu