Approximate Nearest Line Search in High Dimensions
Sepideh Mahabadi



SLIDE 1

Approximate Nearest Line Search in High Dimensions

Sepideh Mahabadi

SLIDE 5

The NLS Problem

  • Given: a set of $N$ lines $L$ in $\mathbb{R}^d$
  • Goal: build a data structure s.t.
    – given a query $q$, find the closest line $\ell^*$ to $q$
    – polynomial space
    – sub-linear query time

Approximation

  • Finds an approximate closest line $\ell$:

$\mathrm{dist}(q, \ell) \le \mathrm{dist}(q, \ell^*)(1 + \epsilon)$
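As a reference point for the problem statement, here is a minimal sketch of the linear-scan baseline that the data structure is meant to beat. The representation of a line as a pair (anchor point, direction vector) and all function names are assumptions made for this illustration; they do not come from the slides.

```python
import math

def dist_point_line(q, a, u):
    """Euclidean distance from point q to the line {a + t*u}; u need not be unit length."""
    w = [qi - ai for qi, ai in zip(q, a)]
    uu = sum(ui * ui for ui in u)
    t = sum(wi * ui for wi, ui in zip(w, u)) / uu  # parameter of the projection of q
    return math.sqrt(sum((wi - t * ui) ** 2 for wi, ui in zip(w, u)))

def nearest_line_bruteforce(q, lines):
    """Exact closest line by scanning all N lines: O(N d) time per query."""
    dists = [dist_point_line(q, a, u) for a, u in lines]
    best = min(range(len(lines)), key=dists.__getitem__)
    return best, dists[best]

def is_approx_nearest(q, lines, idx, eps):
    """Check the (1 + eps) guarantee: dist(q, l) <= (1 + eps) * dist(q, l*)."""
    _, d_star = nearest_line_bruteforce(q, lines)
    return dist_point_line(q, *lines[idx]) <= (1 + eps) * d_star
```

The scan costs time linear in $N$ per query, which is exactly what the sub-linear query requirement rules out; it is still useful as a correctness oracle when testing an approximate structure.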

SLIDE 6

BACKGROUND

  • Nearest Neighbor Problems
  • Motivation
  • Previous Work
  • Our Result
  • Notation

SLIDE 9

Nearest Neighbor Problem

NN: Given a set of $N$ points $P$, build a data structure s.t. given a query point $q$, it finds the closest point $p^*$ to $q$.

  • Applications: databases, information retrieval, pattern recognition, computer vision
    – Features: dimensions
    – Objects: points
    – Similarity: distance between points
  • Current solutions suffer from the "curse of dimensionality":
    – Either space or query time is exponential in $d$
    – Little improvement over linear search

SLIDE 11

Approximate Nearest Neighbor (ANN)

  • ANN: Given a set of $N$ points $P$, build a data structure s.t. given a query point $q$, it finds an approximate closest point $p$ to $q$, i.e., $\mathrm{dist}(q, p) \le \mathrm{dist}(q, p^*)(1 + \epsilon)$
  • There exist data structures with different tradeoffs. Example:
    – Space: $(dN)^{O(1/\epsilon^2)}$
    – Query time: $(d \log N / \epsilon)^{O(1)}$

SLIDE 14

Motivation for NLS

One of the simplest generalizations of ANN: data items are represented by $k$-flats (affine subspaces) instead of points

  • Model data under linear variations
  • Unknown or unimportant parameters in database
  • Example:
    – Varying light gain parameter of images
    – Each image/point becomes a line
    – Search for the closest line to the query image

SLIDE 17

Previous and Related Work

  • Magen[02]: Nearest Subspace Search
    – Query time is fast: $(d + \log N + 1/\epsilon)^{O(1)}$
    – Space is super-polynomial: $2^{(\log N)^{O(1)}}$

Dual Problem: Database is a set of points, query is a $k$-flat

  • [AIKN] for 1-flat: for any $t > 0$
    – Query time: $O(d^3 N^{0.5+t})$
    – Space: $d^2 N^{O(1/\epsilon^2 + 1/t^2)}$
  • Very recently [MNSS] extended it for $k$-flats
    – Query time: $O(n^{\frac{k}{k+1-\rho}+t})$
    – Space: $O(n^{1+\frac{\sigma k}{k+1-\rho}} + n \log^{O(1/t)} n)$

SLIDE 21

Our Result

We give a randomized algorithm that for any sufficiently small $\epsilon$ reports a $(1 + \epsilon)$-approximate solution with high probability

  • Space: $(N + d)^{O(1/\epsilon^2)}$
  • Time: $(d + \log N + 1/\epsilon)^{O(1)}$
  • Matches, up to polynomials, the performance of the best algorithm for ANN. No exponential dependence on $d$
  • The first algorithm with poly-logarithmic query time and polynomial space for objects other than points
  • Only uses reductions to ANN

SLIDE 27

Notation

  • $L$: the set of lines, with size $N$
  • $q$: the query point
  • $B(c, r)$: ball of radius $r$ around $c$
  • $\mathrm{dist}$: the Euclidean distance between objects
  • $\mathrm{angle}$: defined between lines
  • $\delta$-close: two lines $\ell, \ell'$ are $\delta$-close if $\sin(\mathrm{angle}(\ell, \ell')) \le \delta$. Similarly we define $\delta$-far / strictly $\delta$-close / strictly $\delta$-far
  • $CP_{\ell_1 \to \ell_2}$: closest point on $\ell_1$ to $\ell_2$
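The two less familiar pieces of notation, $\delta$-closeness and the closest point $CP_{\ell_1 \to \ell_2}$, can be made concrete with a short sketch. It assumes each line is given as (anchor point, direction vector); the helper names are hypothetical and chosen only for this illustration.

```python
import math

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _unit(u):
    n = math.sqrt(_dot(u, u))
    return [x / n for x in u]

def sin_angle(u, v):
    """sin of the angle between two lines with directions u and v (sign of direction ignored)."""
    c = abs(_dot(_unit(u), _unit(v)))
    return math.sqrt(max(0.0, 1.0 - c * c))

def is_delta_close(u, v, delta):
    """Two lines are delta-close if sin(angle) <= delta."""
    return sin_angle(u, v) <= delta

def closest_point(a1, u1, a2, u2):
    """CP_{l1 -> l2}: the point on l1 = {a1 + s*u1} closest to l2 = {a2 + t*u2}.
    Assumes the lines are not parallel (otherwise every point of l1 is equally close)."""
    u1, u2 = _unit(u1), _unit(u2)
    w = [x - y for x, y in zip(a1, a2)]
    b, d, e = _dot(u1, u2), _dot(u1, w), _dot(u2, w)
    denom = 1.0 - b * b
    if denom < 1e-12:
        raise ValueError("lines are (numerically) parallel")
    s = (b * e - d) / denom  # minimizer of |w + s*u1 - t*u2|^2 over s and t
    return [x + s * y for x, y in zip(a1, u1)]
```

Taking the absolute value of the cosine treats a line and its reversed direction as the same object, which matches the idea that the angle is defined between lines rather than between vectors.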

SLIDE 28

MODULES

  • Unbounded Module
  • Net Module
  • Parallel Module

SLIDE 31

Unbounded Module - Intuition

  • All lines in $L$ pass through the origin $o$
  • Data structure:
    – Project all lines onto any sphere $S(o, r)$ to get point set $P$
    – Build ANN data structure $ANN(P, \epsilon)$
  • Query Algorithm:
    – Project the query on $S(o, r)$ to get $q'$
    – Find the approximate closest point to $q'$, i.e., $p = ANN_P(q')$
    – Return the corresponding line of $p$
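The intuition above can be sketched in a few lines. Two things here are assumptions of this illustration, not statements from the slides: an exact brute-force nearest-neighbor scan stands in for the $ANN(P, \epsilon)$ structure, and each line contributes both points where it pierces the sphere, so that a query on either side of the origin finds it.

```python
import math

def build_unbounded(directions, r=1.0):
    """Point set P: for each line through the origin, the two antipodal points
    where it meets the sphere S(o, r), each tagged with the line's index."""
    pts = []
    for idx, u in enumerate(directions):
        n = math.sqrt(sum(x * x for x in u))
        p = [r * x / n for x in u]
        pts.append((p, idx))
        pts.append(([-x for x in p], idx))
    return pts

def query_unbounded(pts, q, r=1.0):
    """Project q radially onto S(o, r), then return the line of the nearest point.
    q must not be the origin. The min() is a stand-in for an ANN query."""
    n = math.sqrt(sum(x * x for x in q))
    q_proj = [r * x / n for x in q]
    _, idx = min(pts, key=lambda item: math.dist(item[0], q_proj))
    return idx
```

With an exact nearest-neighbor search this returns the exact closest line, because the chord length between $q'$ and a sphere point grows monotonically with the angle between the query and the line.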

SLIDE 34

Unbounded Module

  • All lines in $L$ pass through a small ball $B(o, r)$
  • Query is far enough, outside of $B(o, R)$
  • Use the same data structure and query algorithm

Lemma: if $R \ge \frac{r}{\epsilon\delta}$, the returned line $\ell_p$ is

  • Either an approximate closest line
  • Or is $\delta$-close to the closest line $\ell^*$

This helps us further restrict our search to lines that are almost parallel to $\ell_p$

SLIDE 36

Net Module

  • Intuition: sampling points from each line finely enough to get a set of points $P$, and building an $ANN(P, \epsilon)$, should suffice to find the approximate closest line.

Lemma:

  • Let $x$ be the separation parameter: distance between two adjacent samples on a line
  • Then
    – Either the returned line $\ell_p$ is an approximate closest line
    – Or $\mathrm{dist}(q, \ell_p) \le x/\epsilon$
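A minimal sketch of the net module follows, under stated assumptions: lines are (anchor, direction) pairs, sampling is restricted to a ball $B(\text{center}, \text{radius})$ as in the later slides, samples are placed symmetrically around the foot of the perpendicular from the center, and an exact brute-force scan replaces the ANN structure. None of these implementation details are taken from the slides.

```python
import math

def build_net(lines, center, radius, x):
    """Sample every line at spacing x inside B(center, radius).
    Returns a list of (sample_point, line_index)."""
    samples = []
    for idx, (a, u) in enumerate(lines):
        n = math.sqrt(sum(c * c for c in u))
        u = [c / n for c in u]
        # Foot of the perpendicular from the ball center onto the line.
        t0 = sum((c - ai) * ui for c, ai, ui in zip(center, a, u))
        foot = [ai + t0 * ui for ai, ui in zip(a, u)]
        h = math.dist(foot, center)
        if h > radius:
            continue  # the line misses the ball entirely
        half = math.sqrt(radius * radius - h * h)  # half-length of the chord
        k = int(half // x)
        for j in range(-k, k + 1):
            samples.append(([f + j * x * ui for f, ui in zip(foot, u)], idx))
    return samples

def query_net(samples, q):
    """Return the line index of the nearest sample (stand-in for an ANN query)."""
    _, idx = min(samples, key=lambda s: math.dist(s[0], q))
    return idx
```

Each line contributes at most about $2 \cdot \text{radius}/x$ samples, so the size of the point set handed to ANN is controlled by the ratio of the ball radius to the separation.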

SLIDE 39

Parallel Module - Intuition

  • All lines in $L$ are parallel
  • Data structure:
    – Project all lines onto any hyperplane $g$ which is perpendicular to all the lines to get point set $P$
    – Build ANN data structure $ANN(P, \epsilon)$
  • Query algorithm:
    – Project the query on $g$ to get $q'$
    – Find the approximate closest point to $q'$, i.e., $p = ANN_P(q')$
    – Return the corresponding line to $p$
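For exactly parallel lines the reduction to points is straightforward, as this sketch shows. It assumes the hyperplane $g$ passes through the origin (any perpendicular hyperplane gives the same distances), represents each line by one anchor point, and again uses a brute-force scan in place of ANN; the function names are hypothetical.

```python
import math

def project_to_hyperplane(p, u):
    """Orthogonal projection of p onto the hyperplane through the origin
    whose unit normal is u."""
    t = sum(pi * ui for pi, ui in zip(p, u))
    return [pi - t * ui for pi, ui in zip(p, u)]

def build_parallel(anchors, direction):
    """Each line {anchor + t*direction} becomes a single point on g."""
    n = math.sqrt(sum(c * c for c in direction))
    u = [c / n for c in direction]
    pts = [(project_to_hyperplane(a, u), idx) for idx, a in enumerate(anchors)]
    return u, pts

def query_parallel(module, q):
    """Project q onto g and return the line of the nearest projected point."""
    u, pts = module
    q_proj = project_to_hyperplane(q, u)
    _, idx = min(pts, key=lambda s: math.dist(s[0], q_proj))
    return idx
```

Because the projection removes only the component along the common direction, the distance from $q'$ to a projected point equals the distance from $q$ to the corresponding line, so the point problem and the line problem have the same answer.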

SLIDE 42

Parallel Module

  • All lines in $L$ are $\delta$-close to a base line $\ell_b$
  • Project the lines onto a hyperplane $g$ which is perpendicular to $\ell_b$
  • Query is close enough to $g$
  • Use the same data structure and query algorithm

Lemma: if $\mathrm{dist}(q, g) \le \frac{D\epsilon}{\delta}$, then

  • Either the returned line $\ell_p$ is an approximate closest line
  • Or $\mathrm{dist}(q, \ell_p) \le D$

Thus, for a set of almost parallel lines, we can use a set of parallel modules to cover a bounded region.

SLIDE 43

ALGORITHMS

  • General Case
    – Input lines can have any configuration
  • Divergent Case
    – Input lines are $O(\epsilon)$-far from each other
  • Almost Parallel Case
    – Input lines are all $O(\epsilon)$-close to each other

SLIDE 51

Outline of the Algorithms

  • Input: a set of $n$ lines $S$
  • Randomly choose a subset $T$ of $n/2$ lines
  • Solve the problem over $T$ to get a line $\ell_p$
  • For $\log n$ iterations (improvement step)
    – Use $\ell_p$ to find a much closer line $\ell_p'$
    – Update $\ell_p$ with $\ell_p'$

Why do $\log n$ iterations suffice?

Let $\ell_1, \dots, \ell_{\log n}$ be the $\log n$ closest lines to $q$ in the set $S$

With high probability at least one of $\{\ell_1, \dots, \ell_{\log n}\}$ is sampled in $T$

    – $\mathrm{dist}(q, \ell_p) \le \mathrm{dist}(q, \ell_{\log n})(1 + \epsilon)$
    – $\log n$ improvement steps suffice to find an approximate closest line
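The "with high probability" claim can be checked numerically. The sketch below assumes $T$ is a uniformly random subset of exactly $\lfloor n/2 \rfloor$ lines and that the logarithm is base 2; both are assumptions of this illustration. The probability that $T$ misses all $k$ closest lines is $\binom{n-k}{n/2} / \binom{n}{n/2} = \prod_{j<k} \frac{n/2 - j}{n - j} \le 2^{-k}$, which is at most $1/n$ for $k = \log_2 n$.

```python
from math import comb

def miss_probability(n, k):
    """Probability that a uniformly random subset of n // 2 lines contains
    none of k fixed lines (for example, the k closest lines to the query)."""
    m = n // 2
    return comb(n - k, m) / comb(n, m)
```

For example, `miss_probability(1024, 10)` is below `2 ** -10`, so with $n = 1024$ lines the sample contains one of the 10 closest lines except with probability under $1/1024$.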

SLIDE 53

Improvement Step

Given a line $\ell$, how to improve it, i.e., find a closer line?

  • Data structure
  • Query Processing Algorithm

SLIDE 55

General Case

  • Search among all lines that are $\epsilon$-far from the current line using the Divergent Case
  • Search among the lines that are almost parallel to the line found in the previous step using the Almost Parallel Case

SLIDE 60

Divergent Case

Assume any two lines are $\epsilon$-far; they diverge quickly.

  • Let $\ell$ be the current line, and $\ell^*$ be the closest line to $q$
  • Let $x = \mathrm{dist}(q, \ell)$
  • $\mathrm{dist}(q, \ell^*) \le x$
    – All potential $\ell^*$ intersect $B(q, x)$
    – Good news: we can build a net module inside $B(q, x)$ with separation parameter $x\epsilon^2$ to improve over $\ell$
    – Bad news: we don't know this ball in advance

SLIDE 64

Divergent Case contd.

What we know:

  • $\mathrm{dist}(\ell, \ell^*) \le 2x$
  • Let $q'$ be the projection of $q$ on $\ell$
    – $CP_{\ell \to \ell^*}$ is not farther than $x/\epsilon$ from $q'$, since the lines are $\epsilon$-far
    – $B(q', O(x/\epsilon))$ touches all such lines

SLIDE 71

Data Structure

For each $\ell \in S$

  • Sort all lines $\ell'$ according to their distance from $\ell$
  • For all $1 \le i \le n$, let $S_i$ be the set of the $i$ closest lines
    – Sort the lines $\ell'$ in $S_i$ according to the position of $CP_{\ell \to \ell'}$
    – For each interval of lines $A$ in sorted $S_i$
      • Find the smallest ball $B_A(o_A, r_A)$ with its center on $\ell$ which intersects all lines in $A$ ($r_A \le O(x/\epsilon)$)
      • Construct a net module inside the ball $B(o_A, r_A/\epsilon^2)$ with separation $r_A\epsilon^3$ (#samples $= O\!\left(n \cdot \frac{r_A}{\epsilon^2 \cdot r_A\epsilon^3}\right) = O(n/\epsilon^5)$)
      • Construct an unbounded module outside of $B_A(o_A, r_A/\epsilon^2)$
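The slides do not say how the smallest ball centered on $\ell$ is computed. One possible approach, offered purely as an assumption of this sketch, uses the fact that the distance from a point to a line is a convex function of the point, so the maximum distance to the lines in $A$ is convex along $\ell$ and can be minimized by ternary search. The search bracket `lo`, `hi` and all names are hypothetical.

```python
import math

def dist_point_line(p, a, u):
    """Distance from point p to the line {a + t*u}."""
    w = [pi - ai for pi, ai in zip(p, a)]
    t = sum(wi * ui for wi, ui in zip(w, u)) / sum(ui * ui for ui in u)
    return math.sqrt(sum((wi - t * ui) ** 2 for wi, ui in zip(w, u)))

def smallest_ball_on_line(base, lines, lo=-1e3, hi=1e3, iters=200):
    """Approximate the smallest ball centered on the base line that intersects
    every line in `lines`. Returns (center, radius). Assumes the optimal
    center lies within parameter range [lo, hi] along the base line."""
    a0, u0 = base
    n0 = math.sqrt(sum(c * c for c in u0))
    u0 = [c / n0 for c in u0]

    def center(t):
        return [a + t * u for a, u in zip(a0, u0)]

    def radius(t):
        o = center(t)
        return max(dist_point_line(o, a, u) for a, u in lines)

    for _ in range(iters):  # ternary search on a convex function of t
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if radius(m1) <= radius(m2):
            hi = m2
        else:
            lo = m1
    t = (lo + hi) / 2
    return center(t), radius(t)
```

This is a numerical stand-in; an exact combinatorial computation of the ball would serve the same role in the data structure.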

SLIDE 77

Query Processing Algorithm

Given query point $q$

    – Project $q$ on $\ell$ to get $q'$
    – Use binary search to find the set $A$ of all lines $\ell'$ that are within distance $2x$ of $\ell$, and such that $CP_{\ell \to \ell'}$ is within distance $2x/\epsilon$ of $q'$
    – Let $B_A(o_A, r_A)$ be the corresponding ball
    – If $q \in B_A(o_A, r_A/\epsilon^2)$ use the net module:
      • Find an approximate closest line -> done!
      • Or find a line with distance at most $r_A\epsilon^2 \le x\epsilon$ (since $r_A \le x/\epsilon$) -> we improved
    – Otherwise use the unbounded module to find the approximate closest line -> done!
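The improvement bound in the net-module branch can be spelled out by combining the Net Module lemma (a returned line that is not approximately closest is within separation$/\epsilon$ of the query) with the separation $r_A\epsilon^3$ chosen in the data structure. Constant factors are suppressed here, as they are on the slides.

```latex
\[
\mathrm{dist}(q,\ell_p)
  \;\le\; \frac{\text{separation}}{\epsilon}
  \;=\; \frac{r_A\,\epsilon^{3}}{\epsilon}
  \;=\; r_A\,\epsilon^{2}
  \;\le\; \frac{x}{\epsilon}\cdot\epsilon^{2}
  \;=\; x\,\epsilon .
\]
```

So an improvement step that does not terminate with an approximate closest line replaces the current distance $x$ by at most $x\epsilon$, which is the sense in which the new line is "much closer".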

SLIDE 82

Almost Parallel

All lines are $2\epsilon$-close to each other. For each line $\ell$

  • Partition the space into slabs using hyperplanes perpendicular to $\ell$ s.t. for any pair of lines $\ell_1, \ell_2$:
    – In each slab the relative order of $\mathrm{dist}_{H(\ell,o)}(\ell, \ell_1)$ and $\mathrm{dist}_{H(\ell,o)}(\ell, \ell_2)$ on the hyperplane remains the same as we move $o$ on $\ell$ in the slab, where $H(\ell, o)$ is the hyperplane perpendicular to $\ell$ through $o$. Hence there is a unique ordering of the lines.
    – $\mathrm{dist}_{H(\ell,o)}(\ell_1, \ell_2)$ on the hyperplane is monotone. Hence the minimum ball intersecting any prefix of lines has its center on the boundary of the slab.
  • $O(n^2)$ slabs suffice

SLIDE 90

Data Structure in Each Slab

  • For each $i$, let $B(o, r)$ be the smallest ball touching the $i$ closest lines s.t. $o \in \ell$. We know $o$ would be on the boundary of the slab.
  • Let $\delta_0 > \dots > \delta_t$ be all pairwise angles
  • Let $R_0 = \frac{r}{\epsilon\delta_0}, \dots, R_t = \frac{r}{\epsilon\delta_t}$
  • Consider the balls $B(o, R_0), \dots, B(o, R_t)$
  • Build a net module inside $B(o, R_0)$
  • For each ball $B(o, R_i)$
    – Build an unbounded module on it
    – For each line $\ell_b$
      • Build a set of parallel modules with $\ell_b$ as their base line for all the lines that are $\delta_i$-close to $\ell_b$, so that they cover the space between $B(o, R_i)$ and $B(o, R_{i+1})$ with separation $R_{i+1}\epsilon$

SLIDE 95

Query Processing Algorithm

  • Given $q$, find the right slab, and retrieve all candidate lines
  • Using binary search find $r$
  • Find the largest $i$ such that $q \notin B(o, R_i)$
  • Use the unbounded module of $B(o, R_i)$ to find a line $\ell'$; we know
    – Either $\ell'$ is an approximate closest line -> done
    – Or it is $\delta_{i+1}$-close to $\ell^*$
  • Use the parallel modules of $\ell'$ to find an approximate closest line -> done
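The step "find the largest $i$ such that $q \notin B(o, R_i)$" is a lookup in a sorted list of radii, sketched below. It assumes closed balls, angles given as the values $\delta_0 > \dots > \delta_t$ used in $R_i = r/(\epsilon\delta_i)$, and a return value of $-1$ when $q$ lies inside $B(o, R_0)$ (the region handled by the net module); these conventions and the function names are assumptions of the illustration.

```python
import math
from bisect import bisect_left

def radii(r, eps, deltas):
    """R_i = r / (eps * delta_i) for delta_0 > ... > delta_t, so R_0 < ... < R_t."""
    return [r / (eps * d) for d in deltas]

def largest_outside_index(q, o, R):
    """Largest i with q outside the closed ball B(o, R_i), or -1 if q is
    inside B(o, R_0). R must be sorted in increasing order."""
    d = math.dist(q, o)
    # bisect_left counts the radii strictly smaller than d.
    return bisect_left(R, d) - 1
```

Because the radii are nested, a single binary search over $t + 1$ values selects which unbounded module and which ring of parallel modules to consult.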

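SLIDE 100

Summary

  • Nearest Line Search Problem
  • Modules: unbounded, net, parallel
  • Use of random sampling
  • How to improve given a line
  • Bounds of our algorithm
    – Polynomial space: $\left(\frac{dN}{\epsilon}\right)^{O(1)} \times \mathcal{S}\!\left(\left(\frac{N}{\epsilon}\right)^{O(1)}, \epsilon\right) = (N + d)^{O(1/\epsilon^2)}$
    – Poly-logarithmic query time: $(d \log N)^{O(1)} \times \mathcal{T}\!\left(\left(\frac{N}{\epsilon}\right)^{O(1)}, \epsilon\right) = \left(d + \log N + \frac{1}{\epsilon}\right)^{O(1)}$

Here $\mathcal{S}(n, \epsilon)$ and $\mathcal{T}(n, \epsilon)$ denote the space and query time of an ANN data structure over $n$ points.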

SLIDE 103

Future Work

  • The current result is not good in practice
    – Large exponents
    – Algorithm is complicated
    Can we get a simpler algorithm?
  • Generalization to higher-dimensional flats
  • Generalization to other objects, e.g. balls

SLIDE 104

THANK YOU!