Hypercube locality-sensitive hashing for approximate near neighbors
Thijs Laarhoven
MFCS 2017, Aalborg, Denmark (August 23, 2017)
Nearest neighbor searching
- Data set
- Target
- Nearest neighbor (ℓ2-norm)
- Nearest neighbor (ℓ1-norm)
- Nearest neighbor (angular distance)
- Distance guarantee (radius r)
- Approximate nearest neighbor
- Approximation factor c > 1 (radii r and c · r)
- Example: Precompute Voronoi cells
- Given a target, quickly find the right cell
- Works well in low dimensions
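As a concrete reference point for the definitions above, the exact problem can always be solved by a linear scan, and the c-approximate guarantee can be stated as a simple check. A minimal sketch (the data set, dimensions, and c below are illustrative assumptions, not from the talk):

```python
import numpy as np

def nearest_neighbor(data, target):
    """Exact nearest neighbor under the l2-norm, by linear scan."""
    dists = np.linalg.norm(data - target, axis=1)
    return int(np.argmin(dists))

def is_c_approximate(data, target, candidate, c):
    """c-approximate guarantee: the candidate is at most c times
    farther from the target than the true nearest neighbor."""
    best = np.linalg.norm(data[nearest_neighbor(data, target)] - target)
    return np.linalg.norm(data[candidate] - target) <= c * best

rng = np.random.default_rng(0)
data = rng.standard_normal((1000, 16))
target = rng.standard_normal(16)
nn = nearest_neighbor(data, target)
assert is_c_approximate(data, target, nn, c=1.0)  # the exact NN is 1-approximate
```

The linear scan costs O(n · d) per query; the point of the talk is to do better than this in high dimensions.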
Nearest neighbor searching
Problem setting
- High dimensions d
- Large data set of size n = 2^Ω(d/log d)
◮ Smaller n? =⇒ Use the Johnson–Lindenstrauss transform (JLT) to reduce d
- Assumption: Data set lies on the sphere
◮ Equivalent to angular distance/cosine similarity in all of ℝ^d
◮ Reduction from Euclidean NNS in ℝ^d to Euclidean NNS on the sphere [AR’15]
- Goal: Query time O(n^ρ) with ρ < 1
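The JLT step mentioned above can be sketched as a random Gaussian projection, which preserves pairwise distances up to a factor (1 ± ε) w.h.p. when the target dimension is k = O(log n / ε²). The dimensions and tolerance below are illustrative assumptions:

```python
import numpy as np

def jl_transform(points, k, rng):
    """Johnson-Lindenstrauss sketch: project to k dimensions with a
    random Gaussian matrix, scaled so squared norms are preserved
    in expectation."""
    d = points.shape[1]
    proj = rng.standard_normal((d, k)) / np.sqrt(k)
    return points @ proj

rng = np.random.default_rng(1)
points = rng.standard_normal((200, 1000))
reduced = jl_transform(points, k=300, rng=rng)

# The distance between the first two points is roughly preserved.
orig = np.linalg.norm(points[0] - points[1])
new = np.linalg.norm(reduced[0] - reduced[1])
assert abs(new - orig) / orig < 0.3
```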
Hyperplane LSH
[Charikar, STOC’02]
- A random point and its opposite point define two Voronoi cells: one hyperplane
- Another pair of points gives another hyperplane; together they define a partition
- Preprocessing: store each data point in its cell
- Query: look up the target’s cell and check for collisions
- Failure: rerandomize the partition and repeat until the near neighbor collides (success)
Hyperplane LSH
Overview
O
Hyperplane LSH
Overview
- Simple: one hyperplane corresponds to one inner product
O
Hyperplane LSH
Overview
- Simple: one hyperplane corresponds to one inner product
- Easy to analyze: collision probability 1 − θ
π for vectors at angle θ
O
Hyperplane LSH
Overview
- Simple: one hyperplane corresponds to one inner product
- Easy to analyze: collision probability 1 − θ
π for vectors at angle θ
- Can be made very efficient in practice
◮ Sparse hyperplane vectors [Ach’01, LHC’06] ◮ Orthogonal hyperplanes [TT’07]
O
Hyperplane LSH
Overview
- Simple: one hyperplane corresponds to one inner product
- Easy to analyze: collision probability 1 − θ
π for vectors at angle θ
- Can be made very efficient in practice
◮ Sparse hyperplane vectors [Ach’01, LHC’06] ◮ Orthogonal hyperplanes [TT’07]
- Theoretically suboptimal: use “nicer” (lattice-based) partitions
O
Hyperplane LSH
Overview
- Simple: one hyperplane corresponds to one inner product
- Easy to analyze: collision probability 1 − θ
π for vectors at angle θ
- Can be made very efficient in practice
◮ Sparse hyperplane vectors [Ach’01, LHC’06] ◮ Orthogonal hyperplanes [TT’07]
- Theoretically suboptimal: use “nicer” (lattice-based) partitions
◮ Random points [AI’06, AINR’14, ...]
O
Hyperplane LSH
Overview
- Simple: one hyperplane corresponds to one inner product
- Easy to analyze: collision probability 1 − θ
π for vectors at angle θ
- Can be made very efficient in practice
◮ Sparse hyperplane vectors [Ach’01, LHC’06] ◮ Orthogonal hyperplanes [TT’07]
- Theoretically suboptimal: use “nicer” (lattice-based) partitions
◮ Random points [AI’06, AINR’14, ...] ◮ Leech lattice [AI’06]
O
Hyperplane LSH
Overview
- Simple: one hyperplane corresponds to one inner product
- Easy to analyze: collision probability 1 − θ
π for vectors at angle θ
- Can be made very efficient in practice
◮ Sparse hyperplane vectors [Ach’01, LHC’06] ◮ Orthogonal hyperplanes [TT’07]
- Theoretically suboptimal: use “nicer” (lattice-based) partitions
◮ Random points [AI’06, AINR’14, ...] ◮ Leech lattice [AI’06] ◮ Classical root lattices Ad, Dd [JASG’08] ◮ Exceptional root lattices E6,7,8, F4, G2 [JASG’08]
O
Hyperplane LSH
Overview
- Simple: one hyperplane corresponds to one inner product
- Easy to analyze: collision probability 1 − θ
π for vectors at angle θ
- Can be made very efficient in practice
◮ Sparse hyperplane vectors [Ach’01, LHC’06] ◮ Orthogonal hyperplanes [TT’07]
- Theoretically suboptimal: use “nicer” (lattice-based) partitions
◮ Random points [AI’06, AINR’14, ...] ◮ Leech lattice [AI’06] ◮ Classical root lattices Ad, Dd [JASG’08] ◮ Exceptional root lattices E6,7,8, F4, G2 [JASG’08] ◮ Cross-polytopes [TT’07, AILRS’15, KW’17]
O
Hyperplane LSH
Overview
- Simple: one hyperplane corresponds to one inner product
- Easy to analyze: collision probability 1 − θ
π for vectors at angle θ
- Can be made very efficient in practice
◮ Sparse hyperplane vectors [Ach’01, LHC’06] ◮ Orthogonal hyperplanes [TT’07]
- Theoretically suboptimal: use “nicer” (lattice-based) partitions
◮ Random points [AI’06, AINR’14, ...] ◮ Leech lattice [AI’06] ◮ Classical root lattices Ad, Dd [JASG’08] ◮ Exceptional root lattices E6,7,8, F4, G2 [JASG’08] ◮ Cross-polytopes [TT’07, AILRS’15, KW’17] ◮ Hypercubes [TT’07]
O
Hyperplane LSH
Asymptotically “optimal”
- Simple: one hyperplane corresponds to one inner product
- Easy to analyze: collision probability 1 − θ
π for vectors at angle θ
- Can be made very efficient in practice
◮ Sparse hyperplane vectors [Ach’01, LHC’06] ◮ Orthogonal hyperplanes [TT’07]
- Theoretically suboptimal: use “nicer” (lattice-based) partitions
◮ Random points [AI’06, AINR’14, ...] ◮ Leech lattice [AI’06] ◮ Classical root lattices Ad, Dd [JASG’08] ◮ Exceptional root lattices E6,7,8, F4, G2 [JASG’08] ◮ Cross-polytopes [TT’07, AILRS’15, KW’17] ◮ Hypercubes [TT’07]
O
Hyperplane LSH
Topic of this paper
- Simple: one hyperplane corresponds to one inner product
- Easy to analyze: collision probability 1 − θ
π for vectors at angle θ
- Can be made very efficient in practice
◮ Sparse hyperplane vectors [Ach’01, LHC’06] ◮ Orthogonal hyperplanes [TT’07]
- Theoretically suboptimal: use “nicer” (lattice-based) partitions
◮ Random points [AI’06, AINR’14, ...] ◮ Leech lattice [AI’06] ◮ Classical root lattices Ad, Dd [JASG’08] ◮ Exceptional root lattices E6,7,8, F4, G2 [JASG’08] ◮ Cross-polytopes [TT’07, AILRS’15, KW’17] ◮ Hypercubes [TT’07]
O
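A minimal sketch of hyperplane LSH, with an empirical check of the 1 − θ/π collision probability for a single hyperplane (dimension, angle, and trial count are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 50

def simhash(x, planes):
    """Hyperplane LSH [Charikar'02]: one sign bit per random hyperplane."""
    return tuple((planes @ x > 0).astype(int))

# Empirically check the collision probability 1 - theta/pi per hyperplane.
theta = np.pi / 3
x = np.zeros(d); x[0] = 1.0
y = np.zeros(d); y[0] = np.cos(theta); y[1] = np.sin(theta)
planes = rng.standard_normal((20000, d))       # 20000 independent hyperplanes
same = (planes @ x > 0) == (planes @ y > 0)
print(same.mean())  # close to 1 - (pi/3)/pi = 2/3
```

In the actual data structure one would concatenate several such sign bits (e.g. `simhash(x, planes[:8])`) into a bucket key, and use many independent tables to boost the success probability.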
Hypercube LSH
[Terasawa–Tanaka, WADS’07]
- Take the vertices of the hypercube {±1}^d and apply a random rotation
- The vertices’ Voronoi regions (the rotated orthants) define a partition
Hypercube LSH
Collision probabilities
[Plot: base collision probability p(θ)^{1/d} as a function of the angle θ, for hyperplane LSH and hypercube LSH; marked angles arccos(2/π), π/3, π/2 and values √3/π, 1/π]
- Two vectors at angle (π/2)⁻ lie in the same orthant with probability (1/π)^d
- Two vectors at angle π/3 lie in the same orthant with probability (√3/π)^d
Hypercube LSH
Asymptotic performance (random data)
[Plot: query exponent ρ as a function of the approximation factor c, for hyperplane, hypercube, and cross-polytope LSH]
- Hyperplane LSH: ρ = √2/(πc ln 2) + O(1/c²)
- Hypercube LSH: ρ = √2/(πc ln π) + O(1/c²) – saves factor log2(π) ≈ 1.65
- Cross-polytope LSH: ρ = 1/(2c² − 1) + o(1/c²)
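Plugging in the leading terms of these exponents (error terms dropped, so the values are only asymptotic indications) makes the comparison concrete; the choice c = 2 is an illustrative assumption:

```python
import numpy as np

# Leading-order query exponents from the slides (error terms dropped).
def rho_hyperplane(c):
    return np.sqrt(2) / (np.pi * c * np.log(2))

def rho_hypercube(c):
    return np.sqrt(2) / (np.pi * c * np.log(np.pi))

def rho_cross_polytope(c):
    return 1.0 / (2 * c**2 - 1)

c = 2.0
# The hyperplane-to-hypercube ratio is exactly ln(pi)/ln(2) = log2(pi) ~ 1.65.
assert np.isclose(rho_hyperplane(c) / rho_hypercube(c), np.log2(np.pi))
# At this c the cross-polytope exponent is the smallest of the three.
assert rho_cross_polytope(c) < rho_hypercube(c) < rho_hyperplane(c)
```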
Conclusions
Positive results
- Exact asymptotics for full-dimensional hypercube LSH
- Exact asymptotics for partial hypercube LSH when d′ = O(d/ log d)
- Asymptotically superior to hyperplane LSH
- Theoretical justification for using orthogonal hyperplanes
Negative results
- Asymptotically inferior to e.g. cross-polytope LSH
- Need large hypercubes to beat hyperplane LSH
Open problems
- Exact asymptotics for all of partial hypercube LSH
- Other, better partition families?