

slide-7
SLIDE 7
  • Compress a massive object to a small sketch
  • Rich theories: high-dimensional vectors, matrices, graphs
  • Similarity search, compressed sensing, numerical linear algebra
  • Dimension reduction (Johnson, Lindenstrauss 1984): random projection onto a low-dimensional subspace preserves distances

When is sketching possible?
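The dimension-reduction bullet can be checked numerically. A minimal sketch (assuming NumPy; the dimensions are demo values, not from the talk) projects two high-dimensional points through a random Gaussian matrix and compares distances before and after:

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 10_000, 500                      # ambient and target dimensions (demo values)
x, y = rng.standard_normal(d), rng.standard_normal(d)

# Random projection: a k x d Gaussian matrix, scaled so that
# squared lengths are preserved in expectation
G = rng.standard_normal((k, d)) / np.sqrt(k)

orig = np.linalg.norm(x - y)
proj = np.linalg.norm(G @ (x - y))
print(proj / orig)                      # concentrates near 1 as k grows
```

The ratio concentrates around 1 with fluctuations on the order of 1/√k, which is the quantitative content of the Johnson-Lindenstrauss lemma.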


slide-17
SLIDE 17
  • Motivation: similarity search
  • Model dissimilarity as a metric
  • Sketching may speed up computation and allow indexing
  • Interesting metrics:
  • Euclidean ℓ2: d(x, y) = (∑i |xi – yi|^2)^(1/2)
  • Manhattan, Hamming ℓ1: d(x, y) = ∑i |xi – yi|
  • ℓp distances: d(x, y) = (∑i |xi – yi|^p)^(1/p) for p ≥ 1
  • Edit Distance, Earth Mover’s Distance, etc.
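The three distance formulas above translate directly into code. A small illustration (assuming NumPy; the helper name is mine):

```python
import numpy as np

def lp_distance(x, y, p):
    # lp distance d(x, y) = (sum_i |x_i - y_i|^p)^(1/p), for p >= 1
    return float(np.sum(np.abs(np.asarray(x) - np.asarray(y)) ** p) ** (1 / p))

x, y = [1.0, 2.0, 3.0], [4.0, 6.0, 3.0]
print(lp_distance(x, y, 1))   # Manhattan: 3 + 4 + 0 = 7.0
print(lp_distance(x, y, 2))   # Euclidean: sqrt(9 + 16) = 5.0
print(lp_distance(x, y, 3))   # l3 distance
```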


slide-23
SLIDE 23
  • Alice and Bob each hold a point from a metric space (say x and y)
  • Both send s-bit sketches, sketch(x) and sketch(y), to Charlie
  • For r > 0 and D > 1, Charlie must distinguish
  • d(x, y) ≤ r
  • d(x, y) ≥ Dr
  • Shared randomness, allow 1% probability of error
  • Trade-off between s and D

[Figure: Alice and Bob send sketch(x) and sketch(y) to Charlie, who must decide “d(x, y) ≤ r or d(x, y) ≥ Dr?”]
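One concrete instance of this game is Hamming distance. The simulation below is a KOR-style illustration (not the talk's construction verbatim): each sketch bit is the XOR of the input over a random coordinate subset drawn from shared randomness, and close pairs disagree on markedly fewer sketch bits than far pairs. All names and parameters are mine, chosen for the demo:

```python
import random

def sketch(x, r, s, seed=0):
    # s-bit sketch of a bit string x for distance threshold r: bit j is the
    # XOR of x over a random coordinate subset, where each coordinate is
    # kept with probability 1/(2r); the subsets come from shared randomness,
    # so Alice and Bob generate identical ones
    rng = random.Random(seed)
    bits = []
    for _ in range(s):
        keep = [rng.random() < 1 / (2 * r) for _ in range(len(x))]
        bits.append(sum(b for b, k in zip(x, keep) if k) % 2)
    return bits

def differing_fraction(sx, sy):
    return sum(a != b for a, b in zip(sx, sy)) / len(sx)

data_rng = random.Random(1)
d, r, s = 1000, 10, 400
x = [data_rng.randint(0, 1) for _ in range(d)]
close, far = x[:], x[:]
for i in data_rng.sample(range(d), 5):
    close[i] ^= 1                      # Hamming distance 5 <= r
for i in data_rng.sample(range(d), 40):
    far[i] ^= 1                        # Hamming distance 40 = 4r

frac_close = differing_fraction(sketch(x, r, s), sketch(close, r, s))
frac_far = differing_fraction(sketch(x, r, s), sketch(far, r, s))
print(frac_close, frac_far)            # close pair differs on far fewer sketch bits
```

Charlie can therefore decide by thresholding the fraction of differing sketch bits.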

slide-24
SLIDE 24

Which metrics can we sketch efficiently?

(Kanpur 2006)


slide-29
SLIDE 29
  • Near Neighbor Search (NNS):
  • Given n-point dataset P
  • A query q within r from some data point
  • Return any data point within Dr from q
  • Sketches of size s imply NNS with space n^O(s) and a 1-probe query
  • Proof idea: amplify the probability of error to 1/n by increasing the sketch size to O(s log n); the sketch of q then determines the answer
  • For many metrics: the only approach
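The amplification step can be simulated with a toy model (mine, not the talk's): treat each independent sketch comparison as a coin that errs with probability 1%, and take a majority vote over O(log n) copies; the combined error rate collapses, as the Chernoff bound predicts:

```python
import random

def amplified_error(single_error=0.01, reps=25, trials=20000, seed=0):
    # Majority vote over `reps` independent copies of a test that errs with
    # probability `single_error`; returns the observed error rate of the vote
    # (a biased-coin stand-in for repeated sketch comparisons)
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        wrong = sum(rng.random() < single_error for _ in range(reps))
        bad += wrong > reps / 2
    return bad / trials

print(amplified_error())  # essentially zero across 20000 trials
```

With reps proportional to log n the error drops below 1/n, so a union bound over the n data points goes through.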

slide-30
SLIDE 30

Which metrics can we sketch efficiently?

(Kanpur 2006)


slide-33
SLIDE 33
  • (Indyk 2000): can sketch ℓp for 0 < p ≤ 2 via random projections using p-stable distributions
  • For D = 1 + ε one gets s = O(1/ε^2)
  • Tight by (Woodruff 2004)
  • For p > 2 sketching ℓp is somewhat hard (Alon, Matias, Szegedy 1995), (Bar-Yossef, Jayram, Kumar, Sivakumar 2002), (Indyk, Woodruff 2005)
  • To achieve D = O(1) one needs sketch size s = Θ̃(d^(1 – 2/p))


slide-38
SLIDE 38
  • Distinguish |x – y| ≤ 1 vs. |x – y| ≥ 1 + ε
  • Randomly shifted pieces of length 1 + ε/2
  • Repeat O(1/ε^2) times
  • Overall:
  • D = 1 + ε
  • s = O(1/ε^2)

[Figure: the real line cut into randomly shifted pieces of length 1 + ε/2, with x and y at distance about 1]
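A minimal simulation of this scheme (my code; keyed hashing stands in for the shared randomness): each repetition draws a random shift, cuts the line into pieces of length 1 + ε/2, and assigns each piece a shared random bit; a close pair often lands in the same piece and so agrees on the bit, while a far pair never shares a piece:

```python
import random

def piece_bit(t, width, shift, rep, seed=0):
    # Random bit of the piece containing t: the line is cut into intervals
    # of the given width at a random shift, and each piece gets a shared
    # random bit keyed by (rep, piece index)
    idx = int((t + shift) // width)
    return random.Random(hash((seed, rep, idx))).getrandbits(1)

def differ_fraction(x, y, eps, reps=20000, seed=0):
    # Fraction of repetitions on which the one-bit sketches of x and y differ
    rng = random.Random(seed)
    width = 1 + eps / 2
    differ = 0
    for rep in range(reps):
        shift = rng.uniform(0, width)
        differ += piece_bit(x, width, shift, rep) != piece_bit(y, width, shift, rep)
    return differ / reps

eps = 0.5
close_frac = differ_fraction(0.0, 1.0, eps)    # |x - y| <= 1
far_frac = differ_fraction(0.0, 1.5, eps)      # |x - y| >= 1 + eps
print(close_frac, far_frac)
```

For these parameters the far pair always straddles a piece boundary, so its bits disagree half the time, while the close pair disagrees noticeably less often; the O(1/ε^2) repetitions separate the two cases.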


slide-44
SLIDE 44
  • (Indyk 2000): can reduce sketching of ℓp with 0 < p ≤ 2 to sketching reals via random projections
  • If (G1, G2, …, Gd) are i.i.d. N(0, 1)’s, then ∑i xiGi – ∑i yiGi is distributed as ‖x – y‖2 · N(0, 1)
  • For 0 < p < 2 use p-stable distributions instead
  • Again, get D = 1 + ε with s = O(1/ε^2)
  • (1 + ε)-NNS: space n^O(1/ε^2), query time poly((log n) / ε)
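A sketch of the reduction (assuming NumPy; dimensions are demo values): Gaussian (2-stable) projections recover ℓ2 distances, and Cauchy (1-stable) projections recover ℓ1 distances, in both cases via the median of absolute projections:

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 400, 2000                      # dimension and number of projections (demo values)
x, y = rng.standard_normal(d), rng.standard_normal(d)

# 2-stable: each Gaussian projection of x - y is ||x - y||_2 * N(0, 1),
# so the median of absolute projections rescales to the l2 distance
G = rng.standard_normal((k, d))
est2 = np.median(np.abs(G @ (x - y))) / 0.6745   # 0.6745 ~ median of |N(0, 1)|

# 1-stable: Cauchy projections play the same role for the l1 distance
# (the median of |Cauchy| is exactly 1)
C = rng.standard_cauchy((k, d))
est1 = np.median(np.abs(C @ (x - y)))

print(est2, np.linalg.norm(x - y, 2))
print(est1, np.linalg.norm(x - y, 1))
```

Medians are used rather than means because Cauchy variables have no finite mean; k = O(1/ε^2) projections give a (1 + ε)-factor estimate.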

slide-45
SLIDE 45

Which metrics can we sketch with constant sketch size and approximation?


slide-55
SLIDE 55
  • A map f: X → Y is an embedding with distortion C, if for all a, b from X:

dX(a, b) / C ≤ dY(f(a), f(b)) ≤ dX(a, b)

  • Reductions for geometric problems:

Sketches of size s and approximation D for Y ⇒ sketches of size s and approximation CD for X
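The definition invites a direct numerical check. The helper below (an illustration of mine, not from the talk) computes the best achievable distortion of a map over a finite point set, allowing an optimal global rescaling:

```python
import numpy as np
from itertools import combinations, product

def distortion(points, f, metric):
    # Best distortion of f over all pairs, after an optimal global rescaling:
    # the max/min ratio of embedded (l2) distance to original distance
    ratios = [np.linalg.norm(f(a) - f(b)) / metric(a, b)
              for a, b in combinations(points, 2)]
    return max(ratios) / min(ratios)

# Illustration: on the hypercube {0, 1}^d, the identity map from l1 into l2
# has distortion sqrt(d)
d = 4
cube = [np.array(v, dtype=float) for v in product([0, 1], repeat=d)]
l1 = lambda a, b: np.sum(np.abs(a - b))
dist = distortion(cube, lambda v: v, l1)
print(dist)  # 2.0 = sqrt(4)
```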
slide-56
SLIDE 56

13

slide-57
SLIDE 57
  • A metric X admits sketches with s, D = O(1), if:
  • X = ℓp for p ≤ 2
  • X embeds into ℓp for p ≤ 2 with distortion O(1)

13

slide-58
SLIDE 58
  • A metric X admits sketches with s, D = O(1), if:
  • X = ℓp for p ≤ 2
  • X embeds into ℓp for p ≤ 2 with distortion O(1)
  • Are there any other metrics with efficient sketches (D and s are O(1))?

13

slide-59
SLIDE 59
  • A metric X admits sketches with s, D = O(1), if:
  • X = ℓp for p ≤ 2, or
  • X embeds into ℓp for p ≤ 2 with distortion O(1)
  • Are there any other metrics with efficient sketches (D and s are O(1))?
  • We don’t know!
  • Some new techniques are waiting to be discovered?
  • No new techniques?!


slide-61
SLIDE 61

If a normed space X admits sketches of size s and approximation D, then for every ε > 0 the space X embeds (linearly) into ℓ1–ε with distortion O(sD / ε)

slide-62
SLIDE 62

Embedding into ℓp, p ≤ 2 ⇒ efficient sketches (Kushilevitz, Ostrovsky, Rabani 1998), (Indyk 2000); for norms, the theorem gives the converse direction

slide-63
SLIDE 63
  • A vector space X with ‖.‖: X → R≥0 is a normed space, if
  • ‖x‖ = 0 iff x = 0
  • ‖αx‖ = |α|‖x‖
  • ‖x + y‖ ≤ ‖x‖ + ‖y‖
  • Every norm gives rise to a metric: define d(x, y) = ‖x – y‖


slide-66
SLIDE 66
  • [Li, Nguyen, Woodruff 2014]: streaming algorithms for any function are equivalent to linear sketches
  • [Braverman, Chestnut, Krauthgamer, Yang 2015]: streaming symmetric norms


slide-70
SLIDE 70
  • Convert non-embeddability into lower bounds for sketches in a black-box way

No embeddings with distortion O(1) into ℓ1–ε ⇒ no sketches* of size and approximation O(1)

*in fact, this rules out any communication protocols


slide-74
SLIDE 74
  • ℓp spaces: p > 2 is hard, 1 ≤ p ≤ 2 is easy, p < 1 is not a norm
  • Can classify mixed norms ℓp(ℓq): in particular, ℓ1(ℓ2) is easy, while ℓ2(ℓ1) is hard! (Jayram, Woodruff 2009), (Kalton 1985)
  • A non-example: edit distance is not a norm, and its sketchability is largely open (Ostrovsky, Rabani 2005), (Andoni, Jayram, Pătraşcu 2010)


slide-79
SLIDE 79
  • For x: [Δ]×[Δ] → R with ∑i,j xi,j = 0, define the Earth Mover’s Distance ‖x‖EMD as the cost of the best transportation of the positive part of x to the negative part (Monge–Kantorovich norm)
  • Original motivation of this work!
  • Best upper bounds:
  • D = O(1/ε) and s = Δ^ε (Andoni, Do Ba, Indyk, Woodruff 2009)
  • D = O(log Δ) and s = O(1) (Charikar 2002), (Indyk, Thaper 2003), (Naor, Schechtman 2005)

No embedding into ℓ1–ε with distortion O(1) (Naor, Schechtman 2005) ⇒ no sketches with D = O(1) and s = O(1)
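When both the positive and the negative part consist of unit point masses, the best transportation is simply a minimum-cost perfect matching. The brute-force helper below (my illustration for tiny instances, not the talk's method) makes the definition concrete:

```python
from itertools import permutations
import math

def emd_unit(pos, neg):
    # EMD restricted to k unit point masses on each side: the optimal
    # transportation is a minimum-cost perfect matching, found here by
    # brute force over all assignments (tiny instances only)
    assert len(pos) == len(neg)
    return min(sum(math.dist(p, q) for p, q in zip(pos, perm))
               for perm in permutations(neg))

# one unit of mass at two corners of a 4 x 4 grid, holes at the other two
pos = [(0, 0), (0, 3)]
neg = [(3, 0), (3, 3)]
best = emd_unit(pos, neg)
print(best)  # 6.0: match each source straight across, cost 3 + 3
```

The crossing matching would cost 2 · 3√2 ≈ 8.49, so the matcher correctly prefers the parallel one.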


slide-84
SLIDE 84
  • For an n × n matrix A define the Trace Norm (the Nuclear Norm) ‖A‖ to be the sum of the singular values
  • Previously: lower bounds only for certain restricted classes of sketches (Li, Nguyen, Woodruff 2014)

Any embedding into ℓ1 requires distortion Ω(n^(1/2)) (Pisier 1978) ⇒ any sketch must satisfy sD = Ω(n^(1/2) / log n)

  • Subsequent work (Li, Woodruff 2016): for D = 1 + ε, s ≥ n^(1 – f(ε))
  • One-way communication complexity
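The trace norm is one line given an SVD (assuming NumPy; the helper name is mine):

```python
import numpy as np

def trace_norm(A):
    # Trace (nuclear) norm: the sum of the singular values of A
    return np.linalg.svd(A, compute_uv=False).sum()

A = np.diag([3.0, 2.0, 1.0])
tn = trace_norm(A)
print(tn)  # 6.0: for a nonnegative diagonal matrix the singular values are its entries

# sanity check that it is a norm: triangle inequality on a random pair
B = np.random.default_rng(0).standard_normal((3, 3))
assert trace_norm(A + B) <= trace_norm(A) + trace_norm(B) + 1e-9
```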

slide-91
SLIDE 91

If a normed space X admits sketches of size s and approximation D, then for every ε > 0 the space X embeds (linearly) into ℓ1–ε with distortion O(sD / ε)

Proof plan: sketches ⇒ weak embedding into ℓ2 (information theory) ⇒ linear embedding into ℓ1–ε (nonlinear functional analysis)

A map f: X → Y is (s1, s2, τ1, τ2)-threshold, if
  • dX(x1, x2) ≤ s1 implies dY(f(x1), f(x2)) ≤ τ1
  • dX(x1, x2) ≥ s2 implies dY(f(x1), f(x2)) ≥ τ2

“Weak embedding into ℓ2” means a (1, O(sD), 1, 10)-threshold map from X to ℓ2


slide-98
SLIDE 98

X has a sketch of size s and approximation D ⇒ there is a (1, O(sD), 1, 10)-threshold map from X to ℓ2

Proof by contrapositive: no (1, O(sD), 1, 10)-threshold map from X to ℓ2
⇒ Poincaré-type inequalities on X (convex duality)
⇒ any sketch of ℓ∞^k(X) with approximation Θ(sD) has size Ω(k) (Andoni, Jayram, Pătraşcu 2010) (direct sum theorem for information complexity)
⇒ X has no sketches of size s and approximation D

Here ℓ∞^k(X) carries the norm ‖(x1, …, xk)‖ = maxi ‖xi‖


slide-107
SLIDE 107

X has sketches of size s and approximation D ⇒ ℓ∞^k(X) has sketches of size O(s) and approximation Dk

Protocol: Alice holds (a1, a2, …, ak) and Bob holds (b1, b2, …, bk); from shared randomness they draw random signs (σ1, σ2, …, σk), each ±1 with probability 1/2, and send sketch(∑i σi ai) and sketch(∑i σi bi)

maxi ‖ai – bi‖ ≤ ‖∑i σi(ai – bi)‖ ≤ ∑i ‖ai – bi‖ ≤ k maxi ‖ai – bi‖

(the leftmost inequality holds with probability at least 1/2 over the signs; the others hold always, by the triangle inequality)

Crucially uses the linear structure of X (not enough to be merely a metric!)
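The two-sided inequality can be checked empirically (assuming NumPy; the random Gaussian differences standing in for ai − bi are demo data of mine):

```python
import numpy as np

rng = np.random.default_rng(0)
k, d = 8, 50
diffs = rng.standard_normal((k, d))              # stand-ins for d_i = a_i - b_i

upper = sum(np.linalg.norm(v) for v in diffs)    # sum_i ||a_i - b_i||
lower = max(np.linalg.norm(v) for v in diffs)    # max_i ||a_i - b_i||

trials, hits = 2000, 0
for _ in range(trials):
    sigma = rng.choice([-1.0, 1.0], size=k)      # random signs
    n = np.linalg.norm(sigma @ diffs)            # ||sum_i sigma_i (a_i - b_i)||
    assert n <= upper + 1e-9                     # triangle inequality, every time
    hits += n >= lower
frac = hits / trials
print(frac)  # the lower bound holds on at least half of the sign patterns
```

The upper bound never fails, and the lower bound holds on well over half of the sign draws, matching the probability-1/2 guarantee.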


slide-112
SLIDE 112

(1, O(sD), 1, 10)-threshold map from X to ℓ2 ⇒ uniform embedding into ℓ2 ⇒ linear embedding into ℓ1–ε with distortion O(sD / ε) (Aharoni, Maurey, Mityagin 1985), (Nikishin 1973)

A uniform embedding is g: X → ℓ2 s.t. L(‖x1 – x2‖) ≤ ‖g(x1) – g(x2)‖ ≤ U(‖x1 – x2‖) where
  • L and U are non-decreasing,
  • L(t) > 0 for t > 0
  • U(t) → 0 as t → 0


slide-117
SLIDE 117
  • A map f: X → ℓ2 such that
  • ‖x1 – x2‖ ≤ 1 implies ‖f(x1) – f(x2)‖ ≤ 1
  • ‖x1 – x2‖ ≥ Θ(sD) implies ‖f(x1) – f(x2)‖ ≥ 10
  • Building on (Johnson, Randrianarivony 2006)
  • Take a 1-net N of X; construct f Lipschitz on N
  • Extend f from N to a Lipschitz function on the whole of X


slide-124
SLIDE 124
  • Extend to as general a class of metrics as possible (Edit Distance?)
  • Strengthen to “sketches with O(1) size and approximation imply embedding into ℓ1 with distortion O(1)”?
  • Equivalent to an old open problem from Functional Analysis (Kwapien 1969)
  • Keep in mind negative-type metrics that do not embed into ℓ1 (Khot, Vishnoi 2005), (Cheeger, Kleiner, Naor 2009)
  • Spaces that require s = Ω(d) for D = O(1) besides ℓ∞?
  • Linear sketches with f(s) measurements and g(D) approximation?