Average-case Lower Bounds for Approximate Near-Neighbor from Isoperimetric Inequalities - PowerPoint PPT Presentation

Simple Average-case Lower Bounds for Approximate Near-Neighbor from Isoperimetric Inequalities. Yitong Yin, Nanjing University.


SLIDE 1

Simple Average-case Lower Bounds for Approximate Near-Neighbor from Isoperimetric Inequalities

Yitong Yin, Nanjing University

SLIDE 2

Nearest Neighbor Search (NNS)

metric space (X, dist)
database y = (y1, y2, ..., yn) ∈ X^n, preprocessed into a data structure
query x ∈ X, answered by accessing the data structure

  • Output: the database point yi closest to the query point x

applications: databases, pattern matching, machine learning, ...
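As a baseline for the setup above, the problem can be solved with no preprocessing at all by a linear scan over the database (a minimal sketch; the function names and example points are illustrative, not from the slides):

```python
def dist(x, z):
    # Hamming distance between two equal-length 0/1 tuples
    return sum(a != b for a, b in zip(x, z))

def nearest_neighbor(x, database):
    # Output: a database point y_i closest to the query point x
    return min(database, key=lambda y: dist(x, y))

y = [(0, 0, 1), (1, 1, 1), (0, 1, 0)]
print(nearest_neighbor((1, 1, 0), y))  # → (1, 1, 1), at distance 1
```

The point of the data-structure formulation is to beat this Θ(n)-time scan by spending preprocessing time and space up front.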

SLIDE 3

Near Neighbor Problem (λ-NN)

metric space (X, dist), radius λ
database y = (y1, y2, ..., yn) ∈ X^n, preprocessed into a data structure
query x ∈ X, answered by accessing the data structure

λ-NN: answer "yes" if ∃ yi that is ≤λ-close to x; "no" if all yi are >λ-far from x

SLIDE 4

Approximate Near Neighbor ((γ, λ)-ANN)

metric space (X, dist), radius λ, approximation ratio γ ≥ 1
database y = (y1, y2, ..., yn) ∈ X^n, preprocessed into a data structure
query x ∈ X, answered by accessing the data structure

(γ, λ)-ANN: answer "yes" if ∃ yi that is ≤λ-close to x; "no" if all yi are >γλ-far from x; arbitrary if otherwise
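The definition above is a promise problem: for each query it fixes which answers a correct data structure may give, and in the gap between λ and γλ either answer is fine. A small sketch of exactly that semantics (the name `ann_valid_answers` is invented here for illustration):

```python
def dist(x, z):
    # Hamming distance between two equal-length 0/1 tuples
    return sum(a != b for a, b in zip(x, z))

def ann_valid_answers(x, database, lam, gamma):
    # (γ, λ)-ANN: return the set of answers a correct scheme may give for x
    d_min = min(dist(x, yi) for yi in database)
    if d_min <= lam:
        return {"yes"}          # some y_i is ≤λ-close
    if d_min > gamma * lam:
        return {"no"}           # all y_i are >γλ-far
    return {"yes", "no"}        # promise gap: answer is arbitrary
```

For example, with database {(0,0,0,0)}, λ = 1, γ = 3: the query (1,0,0,0) forces "yes", (1,1,1,1) forces "no", and (1,1,0,0) lies in the gap.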

SLIDE 5

Approximate Near Neighbor (ANN) in Hamming space

Hamming space X = {0, 1}^d with Hamming distance dist(x, z) = ‖x − z‖₁
radius λ, approximation ratio γ ≥ 1, dimension 100 log n < d < n^o(1)

Curse of dimensionality!

SLIDE 6

Cell-Probe Model

data structure problem: f : X × Y → Z
table: the database y ∈ Y is encoded by a code T : Y → Σ^s into s cells (words), each of w bits, where Σ = {0, 1}^w
algorithm A (a decision tree): given query x ∈ X, makes t adaptive cell-probes into the table
protocol: the pair (A, T); an (s, w, t)-cell-probing scheme computes f(x, y)
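To make the (s, w, t) parameters concrete, here is a toy cell-probing scheme, not from the slides: membership in a static set stored as a sorted table (one element per w-bit cell), with every cell read counted as a probe. The adaptivity is visible: each probe address depends on what earlier probes returned.

```python
class Table:
    # s cells (words) of w bits each; every read counts as one cell-probe
    def __init__(self, cells):
        self.cells = list(cells)
        self.probes = 0

    def read(self, i):
        self.probes += 1
        return self.cells[i]

def member(table, s, x):
    # adaptive decision tree: binary search over the sorted table
    lo, hi = 0, s - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        v = table.read(mid)
        if v == x:
            return True
        elif v < x:
            lo = mid + 1
        else:
            hi = mid - 1
    return False

T = Table(sorted([3, 1, 4, 1, 5, 9, 2, 6]))   # the code T : Y -> Σ^s, here s = 8
print(member(T, 8, 5), T.probes)               # True, with at most 4 probes
```

This is an (s, w, t)-scheme with t = O(log s); the lower bounds on the previous slides ask how small t can be for near-neighbor problems.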

SLIDE 7

Near-Neighbor Lower Bounds

Hamming space X = {0, 1}^d; database size: n; time: t cell-probes; space: s cells, each of w bits

Approximate Near-Neighbor (ANN), deterministic:
  • t = Ω(d / log s)  [Miltersen et al. 1995] [Liu 2004]
  • t = Ω(d / log(sw/n))  [Pătraşcu Thorup 2006]
  • t = Ω(d / log(sw/(nd)))  [Wang Y. 2014]

ANN, randomized:
  • t = Ω(log n / log(sw/n))  [Panigrahy Talwar Wieder 2008, 2010]
  • t = O(1) for s = poly(n)  [Chakrabarti Regev 2004]

Randomized Exact Near-Neighbor (RENN):
  • t = Ω(d / log(sw/n))  [Pătraşcu Thorup 2006]
  • t = Ω(d / log s)  [Borodin Ostrovsky Rabani 1999] [Barkol Rabani 2000]

With linear space (s = Θ(n), w = Θ(d), d = Θ(log n)), these bounds instantiate to t = Ω(1), t = Ω(log n / log log n), and t = Ω(log n).

  • t = Ω(log n) matches the highest known lower bounds for any data structure problems: Polynomial Evaluation [Larsen '12], ball-inheritance (range reporting) [Grønlund, Larsen '16]

SLIDE 8

Why are data structure lower bounds so difficult?

  • (Observed by [Miltersen et al. 1995]) An ω(log n) cell-probe lower bound on polynomial space for any function in P would prove P ⊈ linear-time poly-size Boolean branching programs. (Solved in [Ajtai 1999])

  • (Observed by [Brody, Larsen 2012]) Even non-adaptive data structures are circuits with arbitrary gates of depth 2: for f : X × Y → Z, the database bits y1, ..., yn feed a middle layer of s table cells with arbitrary fan-in and fan-out, and each output gate f(x, y), f(x', y), ... reads t cells (fan-in t).

SLIDE 9

Near-Neighbor Lower Bounds (the same table as Slide 7: deterministic and randomized bounds for ANN and RENN in Hamming space X = {0, 1}^d, database size n, t cell-probes, s cells of w bits)

SLIDE 10

Average-Case Lower Bounds

  • Hard distribution [Barkol Rabani 2000] [Liu 2004] [PTW '08 '10]:
    • database: y1, ..., yn ∈ {0, 1}^d i.i.d. uniform
    • query: x ∈ {0, 1}^d uniform and independent
  • Expected cell-probe complexity: E(x,y)[# of cell-probes to resolve query x on database y]
  • "Curse of dimensionality" should hold on average.
  • In data-dependent LSH [Andoni Razenshteyn 2015]: a key step is to solve the problem on random input.

SLIDE 11

Average-Case Lower Bounds

(the lower-bound table of Slide 7, repeated: which of these bounds hold on average?)

SLIDE 12

Average-Case Lower Bounds (Hamming space X = {0, 1}^d; database size: n; time: t cell-probes; space: s cells, each of w bits)

  • t = Ω(d / log s)  [Miltersen et al. 1995] [Liu 2004]
  • t = Ω(log n / log(sw/n))  [Panigrahy Talwar Wieder 2008, 2010]
  • t = Ω(d / log s)  [Borodin Ostrovsky Rabani 1999] [Barkol Rabani 2000]
  • Our result: t = Ω(d / log(sw/(nd)))
SLIDE 13

Metric Expansion

metric space (X, dist), probability distribution μ over X
λ-neighborhood: ∀x ∈ X, Nλ(x) = {z ∈ X | dist(x, z) ≤ λ}; ∀A ⊆ X, Nλ(A) = {z ∈ X | ∃x ∈ A s.t. dist(x, z) ≤ λ}

  • λ-neighborhoods are weakly independent under μ: ∀x ∈ X, μ(Nλ(x)) < 0.99/n
  • λ-neighborhoods are (Φ, Ψ)-expanding under μ: ∀A ⊆ X, μ(A) ≥ 1/Φ ⇒ μ(Nλ(A)) ≥ 1 − 1/Ψ  [Panigrahy Talwar Wieder 2010]

SLIDE 14

Metric Expansion

metric space (X, dist), probability distribution μ over X

  • λ-neighborhoods are (Φ, Ψ)-expanding under μ: ∀A ⊆ X, μ(A) ≥ 1/Φ ⇒ μ(Nλ(A)) ≥ 1 − 1/Ψ  [Panigrahy Talwar Wieder 2010]

vertex expansion, the "blow-up" effect: any set of measure ≥ 1/Φ blows up to a neighborhood of measure ≥ 1 − 1/Ψ
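The blow-up effect can be checked directly on a small Hamming cube (an illustrative computation; d = 10 and the radius-2 ball A are arbitrary choices made here). By Harper's inequality, quoted later in the deck, balls are the extremal case, and indeed the 1-neighborhood of a radius-2 ball is exactly the radius-3 ball:

```python
from itertools import product

d = 10
cube = list(product([0, 1], repeat=d))   # X = {0,1}^d, uniform μ

def hamming(x, z):
    return sum(a != b for a, b in zip(x, z))

def N(A, lam):
    # λ-neighborhood of a set: N_λ(A) = {z | ∃x∈A with dist(x,z) ≤ λ}
    return {z for z in cube if any(hamming(x, z) <= lam for x in A)}

A = {pt for pt in cube if sum(pt) <= 2}  # Hamming ball of radius 2 around 0
print(len(A) / len(cube), len(N(A, 1)) / len(cube))
```

Here |A| = 1 + 10 + 45 = 56 and |N(A, 1)| = 56 + 120 = 176, so a set of measure ≈ 0.055 expands to measure ≈ 0.172 in a single step.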

SLIDE 15

Main Theorem: For (γ, λ)-ANN in metric space (X, dist) where
  • γλ-neighborhoods are weakly independent under μ: μ(Nγλ(x)) < 0.99/n for ∀x ∈ X
  • λ-neighborhoods are (Φ, Ψ)-expanding under μ: ∀A ⊆ X, μ(A) ≥ 1/Φ ⇒ μ(Nλ(A)) ≥ 1 − 1/Ψ

every deterministic algorithm that makes t cell-probes in expectation on a table of s cells, each of w bits (assuming w + log s < n / log Φ), under the input distribution "database y = (y1, y2, ..., yn) where y1, ..., yn ∼ μ i.i.d.; query x ∼ μ, independently", satisfies

t = Ω( log Φ / log(sw / (n log Ψ)) )

SLIDE 16

Applying the Main Theorem in Hamming space X = {0, 1}^d, with μ the uniform distribution over X:
  • γλ-neighborhoods are weakly independent under μ: μ(Nγλ(x)) < 0.99/n for ∀x ∈ X, by choosing γλ = d/2 − √(2d ln(2n))
  • λ-neighborhoods are (2^Θ(d), 2^Θ(d))-expanding under μ: ∀A ⊆ X, μ(A) ≥ 2^−Θ(d) ⇒ μ(Nλ(A)) ≥ 1 − 2^−Θ(d)

Harper's isoperimetric inequality: ∀A ⊆ X, μ(A) ≥ μ(Nr(0)) ⇒ μ(Nλ(A)) ≥ μ(Nr+λ(0)); "Hamming balls have the smallest vertex-expansion."

Substituting Φ = Ψ = 2^Θ(d) into the Main Theorem bound t = Ω(log Φ / log(sw / (n log Ψ))) gives

t = Ω( d / log(sw / (nd)) )
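The weak-independence condition for the chosen radius can be sanity-checked numerically: under the uniform μ, μ(Nγλ(x)) is a binomial tail Pr[Bin(d, 1/2) ≤ γλ], and with γλ = d/2 − √(2d ln(2n)) it falls below 0.99/n. A toy computation (n = 4, d = 2000 are illustrative values satisfying 100 log n < d):

```python
import math

def ball_measure(d, r):
    # μ(N_r(x)) under uniform μ on {0,1}^d: Pr[Bin(d, 1/2) ≤ r]
    return sum(math.comb(d, k) for k in range(math.floor(r) + 1)) / 2**d

n, d = 4, 2000
gamma_lam = d / 2 - math.sqrt(2 * d * math.log(2 * n))
print(ball_measure(d, gamma_lam) < 0.99 / n)   # prints True
```

This matches the Hoeffding estimate Pr[Bin(d, 1/2) ≤ d/2 − t] ≤ exp(−2t²/d), which with t = √(2d ln(2n)) is (2n)^−4, far below 0.99/n.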

SLIDE 17

The Richness Lemma

f : X × Y → {0, 1}; distributions μ over X, ν over Y
α-dense: density of 1s is ≥ α under μ×ν
monochromatic 1-rectangle: A×B with A ⊆ X, B ⊆ Y s.t. ∀(x, y) ∈ A×B, f(x, y) = 1

  • Richness lemma (Miltersen, Nisan, Safra, Wigderson 1995): if f is 0.01-dense under μ×ν and f has an (s, w, t)-cell-probing scheme (table of s cells, each of w bits; the cell-probing algorithm specifies t log s bits of addresses and reads tw bits of contents), then f has a 1-rectangle A×B with μ(A) ≥ 2^−O(t log s) and ν(B) ≥ 2^−O(t log s + tw).

SLIDE 18

A New Richness Lemma

f : X × Y → {0, 1}; distributions μ over X, ν over Y

  • Richness lemma (Miltersen, Nisan, Safra, Wigderson 1995): if f is 0.01-dense under μ×ν and f has an (s, w, t)-cell-probing scheme, then f has a 1-rectangle A×B with μ(A) ≥ 2^−O(t log s) and ν(B) ≥ 2^−O(t log s + tw).

  • New Richness lemma: if f is 0.01-dense under μ×ν and f has an average-case (s, w, t)-cell-probing scheme under μ×ν, then ∀Δ ∈ [320000t, s], f has a 1-rectangle A×B with μ(A) ≥ 2^−O(t log(s/Δ)) and ν(B) ≥ 2^−O(Δ log(s/Δ) + Δw).

When Δ = O(t), the new lemma becomes the richness lemma (with slightly better bounds).

SLIDE 19

  • New Richness lemma: if f is 0.01-dense under μ×ν and f has an average-case (s, w, t)-cell-probing scheme under μ×ν, then ∀Δ ∈ [320000t, s], f has a 1-rectangle A×B with μ(A) ≥ 2^−O(t log(s/Δ)) and ν(B) ≥ 2^−O(Δ log(s/Δ) + Δw).

Apply it to ¬(γ, λ)-ANN: in metric space (X, dist), with query x ∈ X and database y = (y1, ..., yn) ∈ X^n,

f(x, y) = ∧_{i=1..n} g(x, yi), where g(x, yi) = 1 if dist(x, yi) > γλ; 0 if dist(x, yi) ≤ λ; ∗ otherwise

Other examples: partial match, membership, range query, ...
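The three-valued g and its AND-composition can be written out directly (a sketch; `None` stands for the ∗ promise gap, where the value of g, and hence of f, is unconstrained):

```python
def hamming(x, z):
    return sum(a != b for a, b in zip(x, z))

def g(x, yi, lam, gamma):
    # 1 if y_i is >γλ-far, 0 if ≤λ-close, None for the "∗" promise gap
    dxy = hamming(x, yi)
    if dxy > gamma * lam:
        return 1
    if dxy <= lam:
        return 0
    return None

def f(x, y, lam, gamma):
    # ¬(γ, λ)-ANN as f = ∧_i g(x, y_i): any 0 forces 0; any ∗ leaves f undetermined
    vals = [g(x, yi, lam, gamma) for yi in y]
    if 0 in vals:
        return 0
    if None in vals:
        return None
    return 1
```

So f(x, y) = 1 certifies a valid "no" answer to (γ, λ)-ANN, f(x, y) = 0 a valid "yes", and the lower bound works with the 1-rectangles of this f.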

SLIDE 20

¬(γ, λ)-ANN: f(x, y) = ∧_{i=1..n} g(x, yi), where g(x, yi) = 1 if dist(x, yi) > γλ; 0 if dist(x, yi) ≤ λ; ∗ otherwise

  • γλ-neighborhoods are weakly independent under μ (μ(Nγλ(x)) < 0.99/n for ∀x ∈ X) ⇒ density of 0s in g is ≤ 0.99/n under μ×μ ⇒ f is 0.01-dense under μ×μ^n

  • λ-neighborhoods are (Φ, Ψ)-expanding under μ (∀A ⊆ X, μ(A) ≥ 1/Φ ⇒ μ(Nλ(A)) ≥ 1 − 1/Ψ) ⇒ g has no 1-rectangle A×C with μ(A) > 1/Φ and μ(C) > 1/Ψ ⇒ f has no 1-rectangle A×B with μ(A) > 1/Φ and μ^n(B) > 1/Ψ^n

Apply the New Richness lemma: choose Δ = O(n log Ψ / w) so that ν(B) ≥ 2^−O(Δ log(s/Δ) + Δw) > 1/Ψ^n; then 1/Φ ≥ μ(A) ≥ 2^−O(t log(s/Δ)), which gives

t = Ω( log Φ / log(sw / (n log Ψ)) )

SLIDE 21

Proving the New Richness lemma: suppose f is 0.01-dense under μ×ν and has an average-case (s, w, t)-cell-probing scheme under μ×ν. By averaging, a ≥ 0.0025-fraction (under ν) of databases y ∈ Y are "good": for every good database y (with table Ty),
  • a ≥ 0.005-fraction of queries x ∈ X are positive
  • the average number of cell-probes over positive queries is ≤ 80000t

For each good database y, ∃ Δ cells resolving a 2^−O(t log(s/Δ)) fraction (under μ) of the positive queries.

SLIDE 22

For each good database y, ∃ Δ cells resolving a 2^−O(t log(s/Δ)) fraction (under μ) of the positive queries. Let ω record the positions & contents of these Δ cells (out of s cells, w bits each).

Cell-probe model: once ω is fixed, the set of positive queries resolved by ω is fixed.

The number of possibilities for ω is at most (s choose Δ) · 2^Δw = 2^O(Δ log(s/Δ) + Δw), so a ≥ 2^−O(Δ log(s/Δ) + Δw) fraction (under ν) of the good databases y map to the same ω.

B: these good databases; A: the positive queries resolved by ω. Then A×B is the desired 1-rectangle, with μ(A) ≥ 2^−O(t log(s/Δ)) and ν(B) ≥ 2^−O(Δ log(s/Δ) + Δw).

SLIDE 23

  • New Richness lemma: for f : X × Y → {0, 1} and distributions μ over X, ν over Y: if f is 0.01-dense under μ×ν and f has an average-case (s, w, t)-cell-probing scheme under μ×ν, then ∀Δ ∈ [320000t, s], f has a 1-rectangle A×B with μ(A) ≥ 2^−O(t log(s/Δ)) and ν(B) ≥ 2^−O(Δ log(s/Δ) + Δw).

SLIDE 24

Main Theorem (restated): For (γ, λ)-ANN in metric space (X, dist) where
  • γλ-neighborhoods are weakly independent under μ: μ(Nγλ(x)) < 0.99/n for ∀x ∈ X
  • λ-neighborhoods are (Φ, Ψ)-expanding under μ: ∀A ⊆ X, μ(A) ≥ 1/Φ ⇒ μ(Nλ(A)) ≥ 1 − 1/Ψ

every deterministic algorithm that makes t cell-probes in expectation on a table of s cells, each of w bits (assuming w + log s < n / log Φ), under the input distribution "database y = (y1, y2, ..., yn) where y1, ..., yn ∼ μ i.i.d.; query x ∼ μ, independently", satisfies

t = Ω( log Φ / log(sw / (n log Ψ)) )

SLIDE 25

Average-Case Lower Bounds (Hamming space X = {0, 1}^d; database size: n; time: t cell-probes; space: s cells, each of w bits)

  • t = Ω(d / log s)  [Miltersen et al. 1995] [Liu 2004]
  • t = Ω(log n / log(sw/n))  [Panigrahy Talwar Wieder 2008, 2010]
  • t = Ω(d / log s)  [Borodin Ostrovsky Rabani 1999] [Barkol Rabani 2000]
  • Our result: t = Ω(d / log(sw/(nd))), under the input distribution:
    • database: y1, ..., yn ∈ {0, 1}^d i.i.d. uniform
    • query: x ∈ {0, 1}^d uniform and independent
SLIDE 26

Thank you!