Learning Nearest Neighbor Graphs from Noisy Distance Samples

SLIDE 1

Blake Mason, Ardhendu Tripathy, & Robert Nowak

Learning Nearest Neighbor Graphs from Noisy Distance Samples

SLIDE 2

Motivation

We wish to learn the ‘most similar’ or ‘closest’ items to a given item from noisy measurements.

SLIDE 3

Motivation

We wish to learn the ‘most similar’ or ‘closest’ items to a given item from noisy measurements.

amazon.com/discover

SLIDE 4

Motivation

We wish to learn the ‘most similar’ or ‘closest’ items to a given item from noisy measurements.

Fujitsu white paper

SLIDE 5

Motivation

We wish to learn the ‘most similar’ or ‘closest’ items to a given item from noisy measurements. We don’t know the given item a priori. We want to answer ‘closest’ queries for any item quickly!

SLIDE 6

The Nearest Neighbor Graph Problem

Sharma et al. (2015)

SLIDE 7

Preliminaries and Notation

SLIDE 8

Outline of ANNTri

SLIDE 9

Elimination via the triangle inequality

[Figure: four points labeled i, j, k, and l]

SLIDE 10

Triangle Inequality Bounds
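
The formulas from this slide did not survive in the transcript, so the following is only a minimal sketch of how triangle inequality bounds can work in general, not the deck's own derivation. It assumes we hold confidence intervals on two distances d(i, j) and d(j, k) and want an interval for the third, d(i, k). The function names `triangle_bounds` and `can_eliminate` are hypothetical.

```python
def triangle_bounds(lo_ij, hi_ij, lo_jk, hi_jk):
    """Interval for d(i, k) implied by intervals on d(i, j) and d(j, k).

    Uses d(i, k) <= d(i, j) + d(j, k) and d(i, k) >= |d(i, j) - d(j, k)|,
    evaluated at the worst case over both input intervals.
    """
    upper = hi_ij + hi_jk
    # The smallest possible |d(i, j) - d(j, k)| given the two intervals.
    lower = max(0.0, lo_ij - hi_jk, lo_jk - hi_ij)
    return lower, upper


def can_eliminate(lower_ik, upper_il):
    """True if k is provably farther from i than l is.

    If even the smallest possible d(i, k) exceeds the largest possible
    d(i, l), then k cannot be the nearest neighbor of i.
    """
    return lower_ik > upper_il
```

For example, if d(i, j) lies in [1.0, 1.2] and d(j, k) lies in [5.0, 5.5], then d(i, k) is at least about 3.8. If some other point l is known to satisfy d(i, l) ≤ 2.0, point k can be ruled out as i's nearest neighbor without ever querying the pair (i, k).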

SLIDE 11

Theoretical Results

  • Worst-case complexity is always O(n²)
  • In general, order matters
SLIDE 12

Theoretical Results

  • Often, we can do better:
SLIDE 13

Theoretical Results

  • An example of separation:
SLIDE 14

Theoretical Results

SLIDE 15

Experimental Results

  • Simulated data
  • 100 points in ℝ²
  • 10 clusters of 10 points
  • Euclidean distance
  • Gaussian noise, σ² = 0.1
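
A setup like the one listed above could be simulated roughly as follows. This is a sketch under stated assumptions: the slide gives the number of points, the cluster count, the metric, and a noise variance of 0.1, but not how the clusters were drawn. The cluster-center range, the `spread` parameter, the seeds, and the function names are all hypothetical.

```python
import math
import random


def make_clustered_points(n_clusters=10, per_cluster=10, spread=0.1, seed=0):
    """Draw n_clusters * per_cluster points in the plane.

    Assumption: centers are uniform on [0, 10]^2 and each point is a
    Gaussian perturbation of its center. The slide does not specify this.
    """
    rng = random.Random(seed)
    points = []
    for _ in range(n_clusters):
        cx, cy = rng.uniform(0, 10), rng.uniform(0, 10)
        for _ in range(per_cluster):
            points.append((rng.gauss(cx, spread), rng.gauss(cy, spread)))
    return points


def make_noisy_oracle(points, noise_var=0.1, seed=1):
    """Return query(i, j): Euclidean distance plus zero-mean Gaussian noise."""
    rng = random.Random(seed)
    sigma = math.sqrt(noise_var)

    def query(i, j):
        return math.dist(points[i], points[j]) + rng.gauss(0.0, sigma)

    return query
```

Calling `oracle = make_noisy_oracle(make_clustered_points())` gives a function whose repeated calls on the same pair return different noisy values, so averaging several calls is needed to estimate a single distance.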

SLIDE 16

Experimental Results

  • Compare against random sampling
  • Test the effect of the triangle inequality

SLIDE 17

Experimental Results

  • The metric is (2D) Euclidean
  • We can compare against (distance) matrix completion
  • With a distance matrix, the graph can be computed easily
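
To illustrate the last point, here is a minimal sketch of reading the nearest neighbor graph off a full distance matrix. The function name `nearest_neighbor_graph` is hypothetical, and ties are broken by the lowest index, which is an arbitrary choice.

```python
def nearest_neighbor_graph(dist):
    """Map each point i to the index of its nearest neighbor.

    dist is a square matrix (list of lists) with dist[i][j] = d(i, j).
    """
    n = len(dist)
    graph = {}
    for i in range(n):
        # Exclude i itself, then take the column with the smallest distance.
        graph[i] = min((j for j in range(n) if j != i),
                       key=lambda j: dist[i][j])
    return graph
```

With `dist = [[0, 1, 4], [1, 0, 2], [4, 2, 0]]`, points 0 and 1 are each other's nearest neighbors and point 2 points to 1.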

SLIDE 18

Experimental Results

  • What shoes are most similar?
SLIDE 19

Experimental Results

  • What shoes are most similar?
  • 85 images from UTZappos50K dataset
SLIDE 20

Experimental Results

  • What shoes are most similar?
  • 85 images from UTZappos50K dataset
  • Human judgements collected by Heim et al. (2015).
SLIDE 21

Experimental Results

SLIDE 22

Experimental Results

  • What shoes are most similar?
  • 85 images from UTZappos50K dataset
  • Human judgements collected by Heim et al. (2015).
SLIDE 23

Experimental Results

  • What shoes are most similar?
  • 85 images from UTZappos50K dataset
  • Human judgements collected by Heim et al. (2015).
SLIDE 24

Main takeaways for ANNTri

  • 1. ANNTri finds the nearest neighbor graph for general metrics using the triangle inequality

  • 2. Only requires access to a noisy oracle
  • 3. In favorable settings, requires O(n log(n) Δ⁻²) queries versus the O(n² Δ⁻²) needed by brute force!
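
For comparison, the brute-force baseline mentioned in the last takeaway might look like the sketch below. It is not the ANNTri algorithm: it simply queries every pair a fixed number of times and averages. The name `brute_force_nn` and the fixed `samples_per_pair` budget are assumptions made for illustration. In practice the number of samples needed per pair would depend on the noise level and on how small the gaps between distances are.

```python
def brute_force_nn(oracle, n, samples_per_pair):
    """Estimate every pairwise distance by averaging noisy queries.

    oracle(i, j) returns one noisy sample of d(i, j).
    Returns (nearest-neighbor map, total number of oracle queries).
    """
    queries = 0
    mean = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            total = sum(oracle(i, j) for _ in range(samples_per_pair))
            queries += samples_per_pair
            mean[i][j] = mean[j][i] = total / samples_per_pair
    # Read the graph off the averaged distance matrix.
    nn = {i: min((j for j in range(n) if j != i),
                 key=lambda j: mean[i][j])
          for i in range(n)}
    return nn, queries
```

The query count is `samples_per_pair * n * (n - 1) / 2`, which grows quadratically in n regardless of the geometry of the points. This is the cost that elimination via the triangle inequality aims to reduce.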