
New directions in approximate nearest neighbors for the angular distance

Thijs Laarhoven (mail@thijs.com, http://www.thijs.com/)
Proximity Workshop, College Park (MD), USA (January 13, 2016)


  1. Structured filters: Construct concatenated code

  2. Structured filters: Normalize (only for example)

  5. Structured filters: Construct Voronoi cells

  6. Structured filters: Defines partition

  7. Structured filters: ...with efficient decoding

  10. Structured filters: Techniques
      • Idea 1: Increase the number of regions to $2^{\Theta(d)}$
        ◮ Number of hash tables increases to $2^{\Theta(d)}$, which is acceptable for $n = 2^{\Theta(d)}$
        ◮ Decoding cost potentially too large
      • Idea 2: Use structured codes for random regions
        ◮ Spherical/Voronoi LSH with dependent random points
        ◮ Concatenated code of $\log d$ low-dimensional spherical codes
        ◮ Allows for efficient list-decoding
      • Idea 3: Replace partitions with filters
        ◮ Relaxation: filters need not partition the space
        ◮ Simplified analysis
        ◮ Might not be needed to achieve the improvement
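To illustrate why a concatenated code can be decoded cheaply, here is a minimal sketch, not the construction from the talk. It assumes the dimension is divisible by the number of blocks, and all names (`make_block_codes`, `decode_blockwise`, `decode_bruteforce`) are hypothetical. Because the inner product with a concatenated codeword is a sum over blocks, the best codeword can be found one block at a time. The sketch covers nearest-codeword decoding only; list-decoding (returning every codeword above a threshold) is not shown.

```python
import itertools
import math
import random


def random_unit_vector(dim, rng):
    """Sample a uniformly random direction by normalizing a Gaussian vector."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def make_block_codes(dim, num_blocks, code_size, rng):
    """One small random spherical code per block (assumes num_blocks divides dim)."""
    block_dim = dim // num_blocks
    return [
        [random_unit_vector(block_dim, rng) for _ in range(code_size)]
        for _ in range(num_blocks)
    ]


def decode_blockwise(x, block_codes):
    """Index tuple of the concatenated codeword with the largest inner product.

    The inner product with a concatenated codeword is a sum over blocks,
    so each block can be maximized independently.
    """
    block_dim = len(block_codes[0][0])
    result = []
    for b, code in enumerate(block_codes):
        chunk = x[b * block_dim:(b + 1) * block_dim]
        best = max(range(len(code)), key=lambda i: dot(chunk, code[i]))
        result.append(best)
    return tuple(result)


def decode_bruteforce(x, block_codes):
    """Reference decoder: score every concatenated codeword explicitly."""
    block_dim = len(block_codes[0][0])

    def score(indices):
        return sum(
            dot(x[b * block_dim:(b + 1) * block_dim], block_codes[b][i])
            for b, i in enumerate(indices)
        )

    all_indices = itertools.product(*(range(len(code)) for code in block_codes))
    return max(all_indices, key=score)
```

With m blocks of k sub-codewords each, the concatenated code has k^m codewords, while `decode_blockwise` computes only m·k block inner products.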

  13. Structured filters: Results
      • For random sparse settings ($n = 2^{o(d)}$), query time $O(n^\rho)$ with
        $\rho = \frac{1}{2c^2 - 1} \left(1 + o_d(1)\right)$.
      • For random dense settings ($n = 2^{\kappa d}$ with small $\kappa$), we obtain
        $\rho = \frac{1 - \kappa}{2c^2 - 1} \left(1 + o_{d,\kappa}(1)\right)$.
      • For random dense settings ($n = 2^{\kappa d}$ with large $\kappa$), we obtain
        $\rho = -\frac{1}{2\kappa} \log\left(1 - \frac{1}{2c^2 - 1}\right) \left(1 + o_d(1)\right)$.
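The leading terms of these exponents are easy to evaluate numerically. The sketch below drops the $(1 + o(1))$ factors, and the function names are hypothetical. The slide does not state the base of the logarithm in the large-$\kappa$ formula, so the base is left as an explicit argument rather than assumed.

```python
import math


def _check_c(c):
    # The formulas are only meaningful for an approximation factor c > 1.
    if c <= 1.0:
        raise ValueError("approximation factor c must exceed 1")


def rho_sparse(c):
    """Leading term for n = 2^{o(d)}: 1 / (2c^2 - 1)."""
    _check_c(c)
    return 1.0 / (2.0 * c * c - 1.0)


def rho_dense_small_kappa(c, kappa):
    """Leading term for n = 2^{kappa d}, small kappa: (1 - kappa) / (2c^2 - 1)."""
    _check_c(c)
    return (1.0 - kappa) / (2.0 * c * c - 1.0)


def rho_dense_large_kappa(c, kappa, log_base):
    """Leading term for n = 2^{kappa d}, large kappa.

    log_base is an assumption supplied by the caller; the source slide
    does not specify it.
    """
    _check_c(c)
    inner = 1.0 - 1.0 / (2.0 * c * c - 1.0)
    return -math.log(inner, log_base) / (2.0 * kappa)
```

For example, `rho_sparse(2.0)` evaluates 1/7, and `rho_dense_small_kappa(2.0, 0.1)` evaluates 0.9/7.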

  15. Asymmetric nearest neighbors: Previous results (symmetric NNS)
      • Query time: $O(n^\rho)$
      • Update time: $O(n^\rho)$
      • Preprocessing time: $O(n^{1 + \rho})$
      • Space complexity: $O(n^{1 + \rho})$

      Can we get a tradeoff between these costs?

  16. Asymmetric nearest neighbors: Voronoi regions

  17. Asymmetric nearest neighbors: Spherical cap

  18. Asymmetric nearest neighbors: Cap height $\alpha$

  19. Asymmetric nearest neighbors: Smaller $\alpha$ ⟹ larger caps, more work

  20. Asymmetric nearest neighbors: Larger $\alpha$ ⟹ smaller caps, less work

  21. Asymmetric nearest neighbors: $\alpha_q > \alpha_u$ ⟹ faster queries, slower updates

  22. Asymmetric nearest neighbors: $\alpha_q < \alpha_u$ ⟹ slower queries, faster updates
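The effect of using different cap heights for updates and queries can be sketched as follows. This is an illustrative toy, not the data structure from the talk: the class name `AsymmetricCapFilters` and its methods are hypothetical, the filter centers are independent random unit vectors, and the relevant caps are found by a linear scan over all filters instead of an efficient decoder.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Sample a uniformly random direction by normalizing a Gaussian vector."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


class AsymmetricCapFilters:
    """Random spherical-cap filters with separate update and query cap heights."""

    def __init__(self, dim, num_filters, alpha_u, alpha_q, seed=0):
        rng = random.Random(seed)
        self.centers = [random_unit_vector(dim, rng) for _ in range(num_filters)]
        self.alpha_u = alpha_u  # cap height used when inserting points
        self.alpha_q = alpha_q  # cap height used when answering queries
        self.buckets = [[] for _ in range(num_filters)]

    def _caps_containing(self, v, alpha):
        # Linear scan over all filters (assumption: no structured decoding here).
        return [i for i, f in enumerate(self.centers) if dot(v, f) >= alpha]

    def insert(self, v):
        """Store v in every cap of height alpha_u that contains it."""
        hits = self._caps_containing(v, self.alpha_u)
        for i in hits:
            self.buckets[i].append(v)
        return len(hits)

    def query_candidates(self, q):
        """Collect the contents of every cap of height alpha_q that contains q.

        The candidate list may contain the same point more than once.
        """
        hits = self._caps_containing(q, self.alpha_q)
        candidates = [v for i in hits for v in self.buckets[i]]
        return hits, candidates
```

Choosing `alpha_q` larger than `alpha_u` means a query inspects fewer buckets than an insertion touches, which mirrors the "faster queries, slower updates" direction of the tradeoff.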

  23. Asymmetric nearest neighbors: Results (general expressions)
      • Minimize space ($\alpha_q / \alpha_u = \cos\theta$): $\rho_q = (2c^2 - 1) / c^4$, $\rho_u = 0$
      • Balance costs ($\alpha_q / \alpha_u = 1$): $\rho_q = 1 / (2c^2 - 1)$, $\rho_u = 1 / (2c^2 - 1)$
      • Minimize time ($\alpha_q / \alpha_u = 1 / \cos\theta$): $\rho_q = 0$, $\rho_u = (2c^2 - 1) / (c^2 - 1)^2$

      Query time $O(n^{\rho_q})$, update time $O(n^{\rho_u})$, preprocessing time $O(n^{1 + \rho_u})$, space complexity $O(n^{1 + \rho_u})$.
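The three operating points on this slide can be tabulated for a concrete approximation factor. The helper below is a small sketch with a hypothetical name (`tradeoff_exponents`); it returns the pair (ρ_q, ρ_u) for each regime exactly as written in the general expressions.

```python
def tradeoff_exponents(c):
    """Return {regime: (rho_q, rho_u)} for approximation factor c > 1."""
    if c <= 1.0:
        raise ValueError("approximation factor c must exceed 1")
    c2 = c * c
    balanced = 1.0 / (2.0 * c2 - 1.0)
    return {
        # alpha_q / alpha_u = cos(theta): near-linear space, slowest queries
        "minimize_space": ((2.0 * c2 - 1.0) / (c2 * c2), 0.0),
        # alpha_q / alpha_u = 1: the symmetric case
        "balance": (balanced, balanced),
        # alpha_q / alpha_u = 1 / cos(theta): fastest queries, most space
        "minimize_time": (0.0, (2.0 * c2 - 1.0) / ((c2 - 1.0) ** 2)),
    }
```

For c = 2 this gives ρ_q = 7/16 when minimizing space, ρ_q = ρ_u = 1/7 when balancing, and ρ_u = 7/9 when minimizing query time.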

  24. Asymmetric nearest neighbors: Results (general expressions, and small $c = 1 + \varepsilon$)
      • Minimize space ($\alpha_q / \alpha_u = \cos\theta$): $\rho_q = (2c^2 - 1) / c^4 = 1 - 4\varepsilon^2 + O(\varepsilon^3)$, $\rho_u = 0$
      • Balance costs ($\alpha_q / \alpha_u = 1$): $\rho_q = \rho_u = 1 / (2c^2 - 1) = 1 - 4\varepsilon + O(\varepsilon^2)$
      • Minimize time ($\alpha_q / \alpha_u = 1 / \cos\theta$): $\rho_q = 0$, $\rho_u = (2c^2 - 1) / (c^2 - 1)^2 = 1 / (4\varepsilon^2) + O(1 / \varepsilon)$
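The small-$c$ expansions follow from substituting $c = 1 + \varepsilon$ into the general expressions. A short worked derivation, using only standard Taylor expansions:

```latex
\begin{align*}
c^2 &= 1 + 2\varepsilon + \varepsilon^2, \qquad
2c^2 - 1 = 1 + 4\varepsilon + 2\varepsilon^2, \qquad
c^{-4} = 1 - 4\varepsilon + 10\varepsilon^2 + O(\varepsilon^3), \\
\frac{2c^2 - 1}{c^4}
  &= \left(1 + 4\varepsilon + 2\varepsilon^2\right)
     \left(1 - 4\varepsilon + 10\varepsilon^2 + O(\varepsilon^3)\right)
   = 1 - 4\varepsilon^2 + O(\varepsilon^3), \\
\frac{1}{2c^2 - 1}
  &= \frac{1}{1 + 4\varepsilon + 2\varepsilon^2}
   = 1 - 4\varepsilon + O(\varepsilon^2), \\
\frac{2c^2 - 1}{(c^2 - 1)^2}
  &= \frac{1 + 4\varepsilon + 2\varepsilon^2}{4\varepsilon^2 \left(1 + \varepsilon/2\right)^2}
   = \frac{1}{4\varepsilon^2} + O\!\left(\frac{1}{\varepsilon}\right).
\end{align*}
```

In the space-minimizing regime the first-order terms cancel, which is why its query exponent approaches 1 quadratically in $\varepsilon$ while the balanced exponent approaches 1 only linearly.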

  25. Asymmetric nearest neighbors: Results (general expressions, and large $c \to \infty$)
      • Minimize space ($\alpha_q / \alpha_u = \cos\theta$): $\rho_q = (2c^2 - 1) / c^4 = 2 / c^2 + O(1 / c^4)$, $\rho_u = 0$
      • Balance costs ($\alpha_q / \alpha_u = 1$): $\rho_q = \rho_u = 1 / (2c^2 - 1) = 1 / (2c^2) + O(1 / c^4)$
      • Minimize time ($\alpha_q / \alpha_u = 1 / \cos\theta$): $\rho_q = 0$, $\rho_u = (2c^2 - 1) / (c^2 - 1)^2 = 2 / c^2 + O(1 / c^4)$

  26. Asymmetric nearest neighbors: Tradeoffs (figure with cap heights $\alpha_q$ and $\alpha_u$)

  27. Conclusions
      Main result: Allow using more regions with list-decodable codes
      • For $n = 2^{o(d)}$, non-asymptotic improvement
      • For $n = 2^{\Theta(d)}$, asymptotic improvement
      • Corollary: Lower bounds for $n = 2^{o(d)}$ do not hold for $n = 2^{\Theta(d)}$
      • Improved tradeoffs between query and update complexities
