Approximate Nearest Neighbors via Point Location Among Balls



slide-1
SLIDE 1

Approximate Nearest Neighbors via Point Location Among Balls

slide-2
SLIDE 2

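Method of Har-Peled

(improved version from notes)

• Reduce a $(1+\varepsilon)$-ANN query on n points to $O(\log(n/\varepsilon))$ point location in equal balls (PLEB) queries

− Preprocessing space: $O\left(\frac{n}{\varepsilon} \log \frac{tn}{\varepsilon}\right)$
− Preprocessing time: $O\left(\frac{n}{\varepsilon} \log \frac{tn}{\varepsilon}\right)$
− Query time: $O(\log(n/\varepsilon))$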

slide-3
SLIDE 3

Notation

$d_P(q)$
Distance from point q to its nearest neighbor point in set P

$U_{\mathrm{balls}}(P, r)$
Union of balls of radius r about points in P

$\mathrm{NNbr}(P, r)$
“Nearest Neighbor” data structure. Returns TRUE and a witness point if query point q is in $U_{\mathrm{balls}}(P, r)$, and FALSE otherwise

$\hat{I}(P, r, R, \varepsilon)$
“Interval Nearest Neighbor” data structure for points in set P, over range [r, R], with approximation error $\varepsilon$. Indicates if $d_P(q)$ is outside range [r, R], or returns the ball centered at the point that is a $(1+\varepsilon)$-ANN to q

slide-4
SLIDE 4

Reduction from ANN to PLEBs

• Build a tree D

− Each node v has an interval NNbr data structure $\hat{I}_v$
− Use $\hat{I}_v$ to decide how to traverse the tree when search reaches node v

slide-5
SLIDE 5

Constructing D

• Given set P of n points in metric space M

slide-6
SLIDE 6

Constructing D

• Find the ball radius r such that $U_{\mathrm{balls}}(P, r)$ has $\lceil n/2 \rceil$ connected components

Connected Components: 8, r = 0

slide-7
SLIDE 7

Constructing D

• Find the value of r such that $U_{\mathrm{balls}}(P, r)$ has $\lceil n/2 \rceil$ connected components

Connected Components: 8, r = 0.25

slide-8
SLIDE 8

Constructing D

• Find the value of r such that $U_{\mathrm{balls}}(P, r)$ has $\lceil n/2 \rceil$ connected components

Connected Components: 6, r = 0.5

slide-9
SLIDE 9

Constructing D

• Find the value of r such that $U_{\mathrm{balls}}(P, r)$ has $\lceil n/2 \rceil$ connected components

Connected Components: 4, r = 0.65
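The radius search shown across these slides can be sketched with a union-find pass over pairwise distances. This is a quadratic brute-force illustration, not the efficient construction; it assumes a Euclidean metric in which two closed balls of radius r are connected exactly when their centers are within 2r. The function name and the sample points are hypothetical.

```python
import math
from itertools import combinations

def radius_for_half_components(P):
    """Grow r until the union of balls has ceil(n/2) connected components.

    Returns (r, labels), where labels[i] identifies the component of P[i].
    If several pairs tie at the critical distance, the true component
    count at radius r may be smaller than the target.
    """
    n = len(P)
    target = math.ceil(n / 2)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    pairs = sorted(combinations(range(n), 2),
                   key=lambda ij: math.dist(P[ij[0]], P[ij[1]]))
    components, r = n, 0.0
    for i, j in pairs:
        if components <= target:
            break
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj                   # the two balls now overlap
            components -= 1
            r = math.dist(P[i], P[j]) / 2     # radius at which they touch
    return r, [find(i) for i in range(n)]

# Eight points on a line with growing gaps (illustrative data).
P = [(0,), (1,), (3,), (6,), (10,), (15,), (21,), (28,)]
r, labels = radius_for_half_components(P)
print(r, len(set(labels)))
```

On this sample the five leftmost points merge into one component while the three rightmost points stay isolated, giving four components.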

slide-10
SLIDE 10

Constructing D

• Recursively build a subtree for each connected component and add it as a child of root node v

slide-11
SLIDE 11

Outer Child

• Choose one representative from each connected component to be in set Q

slide-12
SLIDE 12

Outer Child

• Recursively build a tree over the points in Q and hang it on node v

• This child of v is the “outer child”

slide-13
SLIDE 13

Constructing D

• Build the interval NNbr data structure for node v

$\hat{I}_v = \hat{I}(P, r, R, \varepsilon/4)$

with point set P, search range [r, R], and approximation error $\varepsilon/4$

Let $R = 2c\mu n r/\varepsilon$

where $c$ and $\mu$ are parameters that will be defined later...
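As a small numeric illustration of the search range, the sketch below evaluates the upper end R for one node. It assumes R = 2cμnr/ε with μ = ⌈log_{3/2} n⌉ (the value set on the approximation-error slide) and an illustrative constant c = 4; the function name and the sample inputs are hypothetical.

```python
import math

def search_range_upper(n, r, eps, c=4):
    """Upper end R of the interval [r, R] searched at a node.

    Assumes R = 2*c*mu*n*r/eps with mu = ceil(log_{3/2} n);
    c = 4 is an illustrative choice, not a value fixed by the slides.
    """
    mu = math.ceil(math.log(n, 1.5))
    return 2 * c * mu * n * r / eps, mu

R, mu = search_range_upper(n=8, r=0.65, eps=0.5)
print(mu, R)
```

The ratio R/r is polynomial in n and 1/ε, which is what keeps the number of NNbr queries inside a single interval structure logarithmic.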

slide-14
SLIDE 14

Answering a query using D

• Given query point q, use $\hat{I}_v$ to decide between three cases

slide-15
SLIDE 15

Answering a query using D

Case 1:

− $\hat{I}_v$ returns a $(1+\varepsilon/4)$-ANN and the search terminates

slide-16
SLIDE 16

Answering a query using D

Case 2: $d_P(q) \le r_v$

− Recurse into the child corresponding to the connected component containing q

slide-17
SLIDE 17

Answering a query using D

Case 3: $d_P(q) > R_v$

− Recurse into the outer child

slide-18
SLIDE 18

Algorithm terminates

• If at step i we consider a set of size $n_i$, then at step i+1 we consider a set of size $n_{i+1} \le n_i/2 + 1$

• Thus the search halts after a number of steps $\le \log_{3/2} n$
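A quick numeric check of the step bound, as a sketch. It assumes the recurrence reads n_{i+1} ≤ n_i/2 + 1 and that the recursion stops once the set has at most 6 points (an assumed constant-size base case, chosen because n/2 + 1 ≤ 2n/3 holds exactly when n ≥ 6).

```python
import math

def worst_case_steps(n, base=6):
    """Count steps of n -> n // 2 + 1 until the set has at most `base` points.

    `base` = 6 is an assumed cutoff for a constant-size leaf.
    """
    steps = 0
    while n > base:
        n = n // 2 + 1
        steps += 1
    return steps

for n in (7, 100, 10**6):
    print(n, worst_case_steps(n), math.log(n, 1.5))
```

Each step shrinks the set by a factor of at least 2/3 while n ≥ 6, which is where the base 3/2 of the logarithm comes from.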

slide-19
SLIDE 19

Algorithm is correct

• Same result as target ball query on all constructed balls

• Approximation error

− From node v to a connected component child: no approximation error
− From node v to the “outer child”: $1 + \varepsilon/(c\mu)$
− From the interval NNbr search: $1 + \varepsilon/4$

slide-20
SLIDE 20

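Approximation error

$$t \le \left(1 + \frac{\varepsilon}{4}\right) \prod_{i=1}^{\log_{3/2} n} \left(1 + \frac{\varepsilon}{c\mu}\right) \le \exp\left(\frac{\varepsilon}{4}\right) \prod_{i=1}^{\log_{3/2} n} \exp\left(\frac{\varepsilon}{c\mu}\right) \le \exp\left(\frac{\varepsilon}{4} + \sum_{i=1}^{\log_{3/2} n} \frac{\varepsilon}{c\mu}\right) \le \exp\left(\frac{\varepsilon}{2}\right) \le 1 + \varepsilon$$

Set $\mu = \lceil \log_{3/2} n \rceil$ and $c$ large enough so that the bound above holds.

Thus the result of a query on D is a $(1+\varepsilon)$-ANN to query point q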
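The chain of inequalities can be checked numerically. The sketch below assumes the per-step factors are 1 + ε/(cμ) for each move to an outer child and 1 + ε/4 for the final interval search, with μ = ⌈log_{3/2} n⌉. The choice c = 4 is illustrative (it makes ε/4 + ε/c equal ε/2), and the last step exp(ε/2) ≤ 1 + ε is only checked for 0 < ε ≤ 1.

```python
import math

def error_bound(n, eps, c=4):
    """Return (t, exp(eps/2), 1 + eps) for the accumulated error factor t.

    Uses mu = ceil(log_{3/2} n) outer-child steps as a worst case;
    c = 4 is an assumed constant, not one fixed by the slides.
    """
    mu = math.ceil(math.log(n, 1.5))
    t = (1 + eps / 4) * (1 + eps / (c * mu)) ** mu
    return t, math.exp(eps / 2), 1 + eps

for n in (2, 8, 1000, 10**6):
    for eps in (0.01, 0.1, 0.5, 1.0):
        t, mid, top = error_bound(n, eps)
        assert t <= mid <= top
print("bound holds on all sampled (n, eps)")
```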

slide-21
SLIDE 21

Query time

• As search proceeds down tree D

− at most two NNbr queries are performed at a node and we traverse O(log n) nodes
− at the last node the data structure $\hat{I}_v$ performs $O(\log(\log(n/\varepsilon)/\varepsilon)) = O(\log(n/\varepsilon))$ NNbr queries
− Query time is $O(\log(n/\varepsilon))$

slide-22
SLIDE 22

Efficient Construction

• Construction space/time is currently $O(n^2)$
• Use HST of P to t-approximate metric M
• Use correspondence between subtrees in HST and connected components to find the ball radius r that gives $\lceil n/2 \rceil$ connected components
• Results in construction space/time $O\left(\frac{n}{\varepsilon} \log \frac{tn}{\varepsilon}\right)$

slide-23
SLIDE 23

What have we done?

• Reduced an ANN query to multiple NNbr queries

• But NNbr queries seem hard to solve efficiently

− Solution: Use deformed “approximate balls”
− Same bounds hold for the extension to “approximate balls”

slide-24
SLIDE 24

Questions