Hypercube locality-sensitive hashing for approximate near neighbors



SLIDE 1

Hypercube locality-sensitive hashing for approximate near neighbors

Thijs Laarhoven

mail@thijs.com    http://www.thijs.com/

MFCS 2017, Aalborg, Denmark

(August 23, 2017)

SLIDE 2

Nearest neighbor searching

SLIDE 3

Nearest neighbor searching

Data set

SLIDE 4

Nearest neighbor searching

Target

SLIDE 5

Nearest neighbor searching

Nearest neighbor (ℓ2-norm)

SLIDE 6

Nearest neighbor searching

Nearest neighbor (ℓ1-norm)

SLIDE 7

Nearest neighbor searching

Nearest neighbor (angular distance)

SLIDE 8

Nearest neighbor searching

Nearest neighbor (ℓ2-norm)

SLIDE 9

Nearest neighbor searching

Distance guarantee

SLIDE 10

Nearest neighbor searching

Approximate nearest neighbor

SLIDE 11

Nearest neighbor searching

Approximation factor c > 1
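
The (c, r)-guarantee on these slides can be made concrete: any returned point within c times the true nearest distance is an acceptable answer. A minimal sketch (the data and helper below are illustrative, not from the talk):

```python
import numpy as np

def acceptable(data, target, answer, c):
    """c-approximate NN guarantee: the answer lies within c times
    the distance from the target to its true nearest neighbor."""
    d_true = min(np.linalg.norm(p - target) for p in data)
    return np.linalg.norm(answer - target) <= c * d_true

rng = np.random.default_rng(0)
data = rng.standard_normal((100, 5))
target = rng.standard_normal(5)

# The exact nearest neighbor is always acceptable (c = 1);
# larger c admits more candidate answers.
exact = min(data, key=lambda p: np.linalg.norm(p - target))
```

With c > 1 the algorithm is allowed to return any point in the ball of radius c times the nearest distance, which is what makes sublinear query times possible.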

SLIDE 12

Nearest neighbor searching

Example: Precompute Voronoi cells

SLIDE 13

Nearest neighbor searching

Example: Precompute Voronoi cells

SLIDE 14

Nearest neighbor searching

Given a target...

SLIDE 15

Nearest neighbor searching

...quickly find the right cell

SLIDE 16

Nearest neighbor searching

Works well in low dimensions
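
A brute-force baseline makes the setup on slides 2–8 concrete: the nearest neighbor depends on which distance you pick, and a linear scan costs O(n · d) per query, which is what sublinear-time methods aim to beat. A small illustrative sketch:

```python
import numpy as np

def nearest(data, target, dist):
    """Exact nearest neighbor by linear scan: O(n * d) per query."""
    return min(range(len(data)), key=lambda i: dist(data[i], target))

def l2(x, y):
    return float(np.linalg.norm(x - y))

def l1(x, y):
    return float(np.abs(x - y).sum())

def angular(x, y):
    cos = x @ y / (np.linalg.norm(x) * np.linalg.norm(y))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

rng = np.random.default_rng(1)
data = rng.standard_normal((200, 3))
target = rng.standard_normal(3)

# The winning index may differ per metric, as in the slides
winners = {d.__name__: nearest(data, target, d) for d in (l2, l1, angular)}
```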

SLIDE 17–20

Nearest neighbor searching

Problem setting

  • High dimensions d
  • Large data set of size n = 2^Ω(d / log d)

◮ Smaller n? ⟹ Use the Johnson–Lindenstrauss transform (JLT) to reduce d

  • Assumption: the data set lies on the unit sphere

◮ Equivalent to angular distance/cosine similarity in all of ℝ^d
◮ Reduction from Euclidean NNS in ℝ^d to Euclidean NNS on the sphere [AR'15]

  • Goal: Query time O(n^ρ) with ρ < 1
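
The two reductions above can be sketched directly: a scaled Gaussian projection (one standard instantiation of the JLT) shrinks d while roughly preserving pairwise distances, and normalizing puts the data on the unit sphere, where Euclidean and angular distance are interchangeable. The dimensions below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, k = 500, 1000, 64              # k = reduced dimension (illustrative)
X = rng.standard_normal((n, d))

# JLT via a scaled Gaussian projection: pairwise distances are preserved
# up to a small multiplicative distortion with high probability
P = rng.standard_normal((k, d)) / np.sqrt(k)
Y = X @ P.T

# Normalize onto the unit sphere (angular / cosine-similarity setting)
S = Y / np.linalg.norm(Y, axis=1, keepdims=True)

# Distortion of one pairwise distance under the projection
ratio = np.linalg.norm(Y[0] - Y[1]) / np.linalg.norm(X[0] - X[1])
```
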
SLIDE 21

Hyperplane LSH

[Charikar, STOC’02]

SLIDE 22

Hyperplane LSH

Random point

SLIDE 23

Hyperplane LSH

Opposite point

SLIDE 24

Hyperplane LSH

Two Voronoi cells

SLIDE 25

Hyperplane LSH

Another pair of points

SLIDE 26

Hyperplane LSH

Another hyperplane

SLIDE 27

Hyperplane LSH

Defines partition

SLIDE 28

Hyperplane LSH

Preprocessing

SLIDE 29

Hyperplane LSH

Query

SLIDE 30

Hyperplane LSH

Collisions

SLIDE 31

Hyperplane LSH

Failure

SLIDE 32

Hyperplane LSH

Rerandomization

SLIDE 33

Hyperplane LSH

Collisions

SLIDE 34

Hyperplane LSH

Success
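
The walkthrough on slides 22–34 (preprocessing, query, collisions, rerandomization) fits in a short sketch. This is a generic toy implementation of hyperplane LSH, not code from the talk; the parameters (6 hyperplanes per table, 20 tables, the planted near neighbor) are illustrative:

```python
import numpy as np

class HyperplaneTable:
    """One hash table: k random hyperplanes give a k-bit sign hash
    (hyperplane LSH, Charikar STOC'02)."""
    def __init__(self, dim, k, rng):
        self.H = rng.standard_normal((k, dim))   # hyperplane normals
        self.buckets = {}
    def _hash(self, x):
        return tuple((self.H @ x > 0).tolist())
    def insert(self, i, x):
        self.buckets.setdefault(self._hash(x), []).append(i)
    def query(self, q):
        return self.buckets.get(self._hash(q), [])

rng = np.random.default_rng(3)
data = rng.standard_normal((1000, 20))
data /= np.linalg.norm(data, axis=1, keepdims=True)

# Preprocessing: several independent tables ("rerandomization"), so a near
# neighbor missed in one table is likely caught by another
tables = [HyperplaneTable(20, 6, rng) for _ in range(20)]
for t in tables:
    for i, x in enumerate(data):
        t.insert(i, x)

# Query: a small perturbation of data[0], i.e. a planted near neighbor
q = data[0] + 0.02 * rng.standard_normal(20)
q /= np.linalg.norm(q)
candidates = set().union(*(t.query(q) for t in tables))
best = min(candidates, key=lambda i: np.linalg.norm(data[i] - q))
```

Only the candidates that collide with the query in some table are compared exactly, which is where the speedup over a linear scan comes from.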

SLIDE 35–46

Hyperplane LSH

Overview

  • Simple: one hyperplane corresponds to one inner product
  • Easy to analyze: collision probability 1 − θ/π for vectors at angle θ
  • Can be made very efficient in practice

◮ Sparse hyperplane vectors [Ach'01, LHC'06]
◮ Orthogonal hyperplanes [TT'07]

  • Theoretically suboptimal: use "nicer" (lattice-based) partitions

◮ Random points [AI'06, AINR'14, ...]
◮ Leech lattice [AI'06]
◮ Classical root lattices A_d, D_d [JASG'08]
◮ Exceptional root lattices E6, E7, E8, F4, G2 [JASG'08]
◮ Cross-polytopes [TT'07, AILRS'15, KW'17] (asymptotically "optimal")
◮ Hypercubes [TT'07] (the topic of this paper)
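
The collision probability 1 − θ/π for a single random hyperplane is easy to verify numerically. A Monte-Carlo sketch (sample sizes illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
d, theta, trials = 50, np.pi / 3, 200_000

# Two unit vectors at angle exactly theta
x = np.zeros(d); x[0] = 1.0
y = np.zeros(d); y[0] = np.cos(theta); y[1] = np.sin(theta)

# A random hyperplane through the origin separates x and y
# iff its normal puts them on opposite sides
H = rng.standard_normal((trials, d))          # one normal per trial
collide = float(((H @ x > 0) == (H @ y > 0)).mean())

# Theory: Pr[same side] = 1 - theta/pi = 2/3 for theta = pi/3
```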

SLIDE 47

Hypercube LSH

[Terasawa–Tanaka, WADS’07]

SLIDE 48

Hypercube LSH

Vertices of hypercube

SLIDE 49

Hypercube LSH

Random rotation

SLIDE 50

Hypercube LSH

Voronoi regions

SLIDE 51

Hypercube LSH

Defines partition
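
The construction on slides 47–51 amounts to: apply a random rotation, then hash each point to the orthant (hypercube vertex) it lands in, i.e. to its vector of d sign bits. A toy sketch; the QR trick for sampling a random rotation is a standard method, not specific to this paper:

```python
import numpy as np

def random_rotation(d, rng):
    """Haar-random orthogonal matrix via QR of a Gaussian matrix."""
    Q, R = np.linalg.qr(rng.standard_normal((d, d)))
    return Q * np.sign(np.diag(R))   # column sign fix for uniformity

rng = np.random.default_rng(5)
d = 16
A = random_rotation(d, rng)

def cube_hash(x):
    """Orthant of the rotated point: d sign bits, i.e. one of the
    2^d vertices of the hypercube partition."""
    return tuple((A @ x > 0).tolist())

x = rng.standard_normal(d)
y = x + 0.01 * rng.standard_normal(d)   # a very close point
hamming = sum(a != b for a, b in zip(cube_hash(x), cube_hash(y)))
```

Nearby points land in the same orthant for most coordinates, so their hash tuples agree in almost all d sign bits.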

SLIDE 52–53

Hypercube LSH

Collision probabilities

[Plot: p(θ)^{1/d} as a function of the angle θ, for hyperplane LSH and hypercube LSH; x-axis marks arccos(2/π), π/3, π/2, π]

  • Two vectors at angle (π/2)⁻ lie in the same orthant with probability (1/π)^d
  • Two vectors at angle π/3 lie in the same orthant with probability (√3/π)^d

SLIDE 54–57

Hypercube LSH

Asymptotic performance (random data)

[Plot: exponent ρ as a function of the approximation factor c, for hyperplane LSH, hypercube LSH, and cross-polytope LSH]

  • Hyperplane LSH: ρ = 2/(πc ln 2) + O(1/c²)
  • Hypercube LSH: ρ = 2/(πc ln π) + O(1/c²) (saves a factor log2(π) ≈ 1.65)
  • Cross-polytope LSH: ρ = 1/(2c² − 1) + o(1/c²)
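
Plugging numbers into the leading-order formulas above (dropping the O(1/c²) and o(1/c²) error terms, so this is only an asymptotic comparison) shows the ordering of the three exponents and the claimed log2(π) ≈ 1.65 saving of hypercube over hyperplane LSH:

```python
import math

# Leading-order query exponents rho from the slides (error terms dropped)
def rho_hyperplane(c):    return 2 / (math.pi * c * math.log(2))
def rho_hypercube(c):     return 2 / (math.pi * c * math.log(math.pi))
def rho_crosspolytope(c): return 1 / (2 * c * c - 1)

for c in (1.5, 2.0, 4.0):
    print(c, rho_hyperplane(c), rho_hypercube(c), rho_crosspolytope(c))

# Hypercube improves on hyperplane by exactly ln(pi)/ln(2) = log2(pi)
saving = rho_hyperplane(2.0) / rho_hypercube(2.0)
```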

SLIDE 58–61

Conclusions

Positive results

  • Exact asymptotics for full-dimensional hypercube LSH
  • Exact asymptotics for partial hypercube LSH when d′ ≤ O(d/log d)
  • Asymptotically superior to hyperplane LSH
  • Theoretical justification for using orthogonal hyperplanes

Negative results

  • Asymptotically inferior to e.g. cross-polytope LSH
  • Need large hypercubes to beat hyperplane LSH

Open problems

  • Exact asymptotics for all of partial hypercube LSH
  • Other, better partition families?

Thank you for your attention!