slide-1
SLIDE 1

Sketching as a tool for Algorithmic Design

Alex Andoni

(Columbia University)

slide-2
SLIDE 2

Find similar pairs

slide-3
SLIDE 3

Methodology?

[Figure: a collection of bit strings (000000, 011100, 010100, ...) and its sketch, related by ≈. Diagram labels: Sketching, dimension reduction, Fast algorithms, Small space algorithms.]

Sketching:

  • compression
  • good for specific task
  • lossy

Dimension reduction: linear map $S: \mathbb{R}^n \to \mathbb{R}^k$ s.t.:

  • for any points $p, q \in \mathbb{R}^n$:

$$\Pr_S\left[ \frac{\|S(p) - S(q)\|}{\|p - q\|} \in 1 \pm \epsilon \right] \ge 1 - \delta$$

[Johnson-Lindenstrauss’84]: $S$ = Gaussian matrix, $k = O\left(\frac{1}{\epsilon^2} \log \frac{1}{\delta}\right)$
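The dimension reduction guarantee can be checked numerically. The following is a minimal sketch, assuming a dense Gaussian matrix with entries scaled by 1/sqrt(k); the sizes `n` and `k`, the seed, and all function names are illustrative choices, not values from the slides.

```python
import math
import random


def gaussian_sketch_matrix(k, n, rng):
    """Return a k x n matrix with i.i.d. N(0, 1/k) entries."""
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n)] for _ in range(k)]


def apply_sketch(S, v):
    """Compute the matrix-vector product S v."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]


def l2_distance(u, v):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


rng = random.Random(0)          # fixed seed, only for reproducibility
n, k = 1000, 400                # illustrative sizes, not tuned constants
S = gaussian_sketch_matrix(k, n, rng)

p = [rng.uniform(-1.0, 1.0) for _ in range(n)]
q = [rng.uniform(-1.0, 1.0) for _ in range(n)]

# Ratio of sketched distance to true distance; JL says it lies in 1 +/- eps
# with probability at least 1 - delta over the choice of S.
ratio = l2_distance(apply_sketch(S, p), apply_sketch(S, q)) / l2_distance(p, q)
print(f"distortion ratio: {ratio:.3f}")
```

Since S is linear, S(p) - S(q) = S(p - q), so preserving norms of single vectors is enough to preserve pairwise distances.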

slide-4
SLIDE 4

Plan

  • Numerical Linear Algebra
  • Nearest Neighbor Search
  • Min-cost matching in plane

a sketch of sketching applications…

slide-5
SLIDE 5

Plan

  • Numerical Linear Algebra
      • the power of linear sketches
  • Nearest Neighbor Search
  • Min-cost matching in plane

slide-6
SLIDE 6

Numerical Linear Algebra

  • Problem: Least Squares Regression
      • $x^* = \arg\min_x \|Ax - b\|$
      • where $A$ is an $n \times d$ matrix
      • $n \gg d$
      • $1 + \epsilon$ approximation
  • Idea: Sketch-And-Solve
      • solve $x' = \arg\min_x \|S \cdot (Ax - b)\| = \arg\min_x \|SAx - Sb\|$
      • where $S: \mathbb{R}^n \to \mathbb{R}^k$ is a dimension-reducing matrix
      • reduces to a much smaller $k \times d$ problem
      • Hope: $\|Ax' - b\| \le (1 + \epsilon) \|Ax^* - b\|$

[Figure: the tall system $Ax - b$ is multiplied by $S$ to give the short system $SAx - Sb$.]

slide-7
SLIDE 7

Sketch-And-Solve

[S’06, CW’13, NN’13, MM’13, C’16]

  • Issue: time to compute sketch
      • When $S$ = Gaussian ([JL]) ⇒ computing $SA$ takes $O(n \cdot d^2)$ time (slower than the original problem!)
      • Idea: structured $S$ s.t. $SA$ can be computed faster
  • +structured $S$: $O(\mathrm{nnz}(A)) + (d/\epsilon)^{O(1)}$ time
  • +Preconditioner: $O\left(\left(\mathrm{nnz}(A) + d^{O(1)}\right) \cdot \log \frac{1}{\epsilon}\right)$

Oblivious Subspace Embedding: linear map $S: \mathbb{R}^n \to \mathbb{R}^k$ s.t.

  • for any linear subspace $P \subset \mathbb{R}^n$ of dimension $d$:

$$\Pr_S\left[ \forall p \in P : \frac{\|S(p)\|}{\|p\|} \in 1 \pm \epsilon \right] \ge 1 - \delta$$

$k \sim d$
slide-8
SLIDE 8

$\ell_1$ regression

  • No similar dimension reduction in $\ell_1$ [BC’04, JN’09]
  • +structured $S$, +preconditioner: $O(\mathrm{nnz}(A) \cdot \log n) + (d/\epsilon)^{O(1)}$
  • More: other norms ($\ell_p$, M-estimator, Orlicz norms), low-rank approximation & optimization, matrix multiplication, see [Woodruff, FnTTCS’14, …]

Weak DR: linear map $S: \mathbb{R}^n \to \mathbb{R}^k$, s.t.

  • for any $p \in \mathbb{R}^n$:

$$\Pr_S\left[ 1 \le \frac{\|S(p)\|_1}{\|p\|_1} \le \frac{1}{\delta} \right] \ge 1 - O(\delta)$$

Weak(er) OSE: linear map $S: \mathbb{R}^n \to \mathbb{R}^k$ s.t.

  • for any linear subspace $P \subset \mathbb{R}^n$ of dimension $d$:

$$\Pr_S\left[ \forall p \in P : 1 \le \frac{\|S(p)\|_1}{\|p\|_1} \le d^{O(1)} \right] \ge 0.9$$

$S_{ij} \sim$ Cauchy distribution, or 1/Exponential; $k = O(d \cdot \log d)$

[I’00] [SW’11, MM’13, WZ’13, WW’18]
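The reason Cauchy entries are natural for ℓ1 is 1-stability: each coordinate of S p is distributed as ‖p‖₁ times a standard Cauchy variable. The code below is a minimal sketch of that property only, using a median estimator from the streaming literature. It is an assumption-laden illustration, not the subspace embedding used for regression, and the sizes and names are made up.

```python
import math
import random
import statistics


def cauchy(rng):
    """Sample a standard Cauchy variable by inverse transform sampling."""
    return math.tan(math.pi * (rng.random() - 0.5))


def cauchy_sketch(p, k, rng):
    """Return S p, where S is a k x n matrix of i.i.d. standard Cauchy entries.

    S is generated on the fly here; a real sketch would fix S once and reuse
    it for every vector.
    """
    return [sum(cauchy(rng) * pj for pj in p) for _ in range(k)]


rng = random.Random(2)
n, k = 200, 2001                # illustrative sizes
p = [rng.uniform(-1.0, 1.0) for _ in range(n)]

l1_norm = sum(abs(pj) for pj in p)
sketch = cauchy_sketch(p, k, rng)

# Each coordinate of S p is ||p||_1 times a standard Cauchy variable, and the
# median of |Cauchy| is 1, so the median of |S p| estimates ||p||_1.
estimate = statistics.median(abs(s) for s in sketch)
print(f"true l1 norm: {l1_norm:.2f}, median estimate: {estimate:.2f}")
```

The median is not a norm of S p, which is consistent with the first bullet: the ratio of ℓ1 norms is only controlled within the weak bounds stated above.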

slide-9
SLIDE 9

Plan

  • Numerical Linear Algebra
  • Nearest Neighbor Search
      • ultra-small sketches
  • Min-cost matching in plane

slide-10
SLIDE 10

Approximate Near Neighbor Search

  • Preprocess: a set $P$ of $N$ points
      • approximation $c > 1$
  • Query: given a query point $q$, report a point $p^* \in P$ with the smallest distance to $q$
      • up to factor $c$
  • Near neighbor: threshold $r$
  • Parameters: space & query time

[Figure: query point $q$ with radii $r$ and $cr$; labeled points $p^*$ and $p'$.]

slide-11
SLIDE 11

Ultra-small sketches

  • [KOR’98, IM’98]: $\ell_2$, $\ell_1$ have $(1 + \epsilon, 0.1, O(1/\epsilon^2))$-DE sketches (const # of bits!)
      • Via: bit sampling (Hamming),
      • or discretizing dimension reduction

Distance Estimation Sketch: for approx $c$, & all thresholds $r$: map $S: \mathbb{R}^d \to \{0,1\}^k$, estimator $E(\cdot,\cdot)$, s.t. for any $p, q \in \mathbb{R}^d$:

  • if $\|p - q\| \le r$, then $\Pr_S[E(S(p), S(q)) = \text{“close”}] \ge 1 - \delta$
  • if $\|p - q\| > cr$, then $\Pr_S[E(S(p), S(q)) = \text{“close”}] \le \delta$

This is a $(c, \delta, k)$-DE sketch.
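A distance estimation sketch for Hamming space can be built from random parities. The code below is a minimal sketch, assuming a parity-based construction in the spirit of [KOR’98]. Each sketch bit is the parity of the input on a random subset that keeps each coordinate with probability 1/(2r), so a bit differs for two points at distance D with probability (1 - (1 - 1/r)^D)/2. The parameter values and function names are illustrative assumptions.

```python
import random


def make_sketch(d, k, r, rng):
    """Draw k random coordinate subsets; each coordinate is kept w.p. 1/(2r)."""
    beta = 1.0 / (2.0 * r)
    return [[j for j in range(d) if rng.random() < beta] for _ in range(k)]


def sketch(subsets, x):
    """Map a 0/1 vector x to k bits: the parity of x on each subset."""
    return [sum(x[j] for j in subset) % 2 for subset in subsets]


def differ_probability(dist, r):
    """Probability that one sketch bit differs at Hamming distance dist."""
    return (1.0 - (1.0 - 1.0 / r) ** dist) / 2.0


def estimator(sp, sq, r, c):
    """Decide "close" or "far" from the two k-bit sketches alone."""
    frac = sum(a != b for a, b in zip(sp, sq)) / len(sp)
    threshold = (differ_probability(r, r) + differ_probability(c * r, r)) / 2.0
    return "close" if frac <= threshold else "far"


def flip(x, positions):
    """Return a copy of x with the given coordinates flipped."""
    y = list(x)
    for j in positions:
        y[j] ^= 1
    return y


rng = random.Random(3)
d, r, c, k = 200, 10, 4, 1500   # k is chosen generously for a reliable demo
subsets = make_sketch(d, k, r, rng)

p = [rng.randint(0, 1) for _ in range(d)]
q_close = flip(p, range(r))             # Hamming distance exactly r
q_far = flip(p, range(c * r + 1))       # Hamming distance cr + 1 > cr

print(estimator(sketch(subsets, p), sketch(subsets, q_close), r, c))
print(estimator(sketch(subsets, p), sketch(subsets, q_far), r, c))
```

The number of bits k needed depends only on the gap between the two differ probabilities and on the target failure probability, not on the dimension d.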

slide-12
SLIDE 12

DE Sketch => NNS

[KOR’98, IM’98]: $(c, 1/3, k)$-DES imply $c$-approx NNS with space $N^{O(k)}$ and 1 memory probe per query

Proof: construct a sketch with failure probability $1/N$

  • by concatenating $O(\log N)$ i.i.d. copies of the sketch, and taking majority vote
  • Data structure: a look-up table for all possible sketches of a query: $2^{O(k \cdot \log N)} = N^{O(k)}$ possibilities only
  • Query time: computing the sketch, typically $\sim O(kd \log N)$

[see also AC’06]

Const size DES => NNS with polynomial space!

[AK+ANNRW’18]: $(c, 0.1, k)$-DES implies NNS with $O(ck)$-approximation and $O(N^{1.1})$ space, $O(N^{0.1})$ memory probes per query
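The amplification step in the proof (majority vote over i.i.d. copies) can be made concrete with an exact binomial tail. The code below is a minimal sketch, assuming each copy errs independently with probability 1/3; the function names are hypothetical.

```python
import math


def majority_failure(delta, m):
    """Probability that more than half of m independent trials fail,
    when each trial fails with probability delta (m odd)."""
    return sum(math.comb(m, j) * delta ** j * (1.0 - delta) ** (m - j)
               for j in range(m // 2 + 1, m + 1))


def copies_needed(delta, N):
    """Smallest odd m whose majority vote fails with probability <= 1/N."""
    m = 1
    while majority_failure(delta, m) > 1.0 / N:
        m += 2
    return m


for N in (10 ** 3, 10 ** 6, 10 ** 9):
    m = copies_needed(1.0 / 3.0, N)
    print(f"N = {N:>10}: m = {m:3d} copies, m / log2(N) = {m / math.log2(N):.1f}")
```

By Hoeffding's inequality the majority of m copies fails with probability at most exp(-m/18) when each copy fails with probability 1/3, so m = O(log N) copies suffice. The concatenated sketch then has O(k log N) bits, which gives the table size on the slide.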

slide-13
SLIDE 13

Beyond $\ell_1$ and $\ell_2$

$\alpha$-embedding of metric $X$ into $\ell_1$: for distortion $D$, power $\alpha \ge 1$: map $f: X \to \ell_1$, s.t. for any $p, q \in X$:

  • $\|f(p) - f(q)\|^{\alpha} \le \mathrm{dist}_X(p, q) \le D \cdot \|f(p) - f(q)\|^{\alpha}$

[Diagram of implications:]

  • Embedding with $D = c$ ⇒ $(O(c), 0.1, O(1))$-DES ⇒ NNS
  • [AKR’15]: when $X$ is a norm: $(O(c), 0.1, k)$-DES ⇒ embedding with $D = O(ck)$
  • Not true for general $X$ [KN]
  • OPEN: if $\alpha = 1$ achievable

slide-14
SLIDE 14

NNS with smaller space?

  • Space closer to linear in $N$?

LSH Sketch: for approx $c$, & $\forall$ thresholds $r$: map $S: \mathbb{R}^d \to \{0,1\}^k$, estimator $E(\cdot,\cdot)$, s.t. for any $p, q \in \mathbb{R}^d$:

  • if $\|p - q\| \le r$, then $\Pr_S[E(S(p), S(q)) = \text{“close”}] \ge 2^{-\rho k}$
  • if $\|p - q\| > cr$, then $\Pr_S[E(S(p), S(q)) = \text{“close”}] \le 2^{-k+1}$
  • $E(\sigma, \tau) = \text{“close”}$ iff $\sigma = \tau$

This is a $(c, \rho, k)$-LSH.

[IM’98]: $(c, \rho, k)$-LSH imply $c$-approx NNS with $O(N^{1+\rho})$ space and $O(N^{\rho})$ memory probes per query

[IM’98]: $\rho = 1/c$ for $\ell_1$
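The exponent ρ can be computed directly for bit sampling in Hamming space. The code below is a minimal sketch, assuming k coordinates sampled with replacement, so two points at Hamming distance D collide with probability (1 - D/d)^k; the values of `d`, `r`, `c`, and `N` are illustrative.

```python
import math


def collision_probability(dist, d, k):
    """Pr[S(p) = S(q)] for bit sampling: k coordinates sampled with
    replacement out of d, for two points at Hamming distance dist."""
    return (1.0 - dist / d) ** k


def rho(r, c, d):
    """Exponent rho = log(1/p1) / log(1/p2) for a single sampled bit."""
    p1 = 1.0 - r / d          # collision probability at distance r
    p2 = 1.0 - c * r / d      # collision probability at distance c * r
    return math.log(1.0 / p1) / math.log(1.0 / p2)


d, r, c = 1000, 10, 2.0
print(f"rho = {rho(r, c, d):.4f}, 1/c = {1.0 / c:.4f}")

# Pick k so that far points collide with probability at most 1/N; near points
# then collide with probability about N ** (-rho).
N = 10 ** 6
k = math.ceil(math.log(N) / math.log(1.0 / (1.0 - c * r / d)))
far = collision_probability(c * r, d, k)
near = collision_probability(r, d, k)
print(f"k = {k}, far collision = {far:.2e}, near collision = {near:.2e}")
```

By Bernoulli's inequality (1 - x)^c ≥ 1 - cx, this ρ is at most 1/c, consistent with the last line of the slide.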

slide-15
SLIDE 15

Plan

  • Numerical Linear Algebra
  • Nearest Neighbor Search
  • Min-cost matching in plane
      • specialized sketches
  • Exploit sketches for:
      • input
      • internal state / partial computations

[Figure label: Computation]

slide-16
SLIDE 16

LP for Geometric Matching

  • Problem:
      • Given two sets $A, B$ of points in $\mathbb{R}^2$,
      • Find min-cost matching ($1 + \epsilon$ approx.)
      • a.k.a. Earth-Mover Distance, optimal transport, Wasserstein metric, etc.
  • Classically: LP with $n^2$ variables
      • General: $\tilde{O}(n^2/\epsilon^4)$ time [AWR’17]
      • In 2D: hope for $\approx n$ time [SA’12]

$$\min_{\pi \in \mathbb{R}_+^{n^2}} \sum_{i,j} \|p_i - q_j\| \cdot \pi_{ij} \quad \text{s.t.} \quad \pi \mathbf{1} = \frac{1}{n} \mathbf{1} \ \text{ and } \ \pi^{\top} \mathbf{1} = \frac{1}{n} \mathbf{1}$$

[ANOY’14]: Solve-And-Sketch framework. Solves in $n^{1+o(1)}$ time (for fixed $\epsilon$)
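For intuition about the objective, a tiny instance can be solved by brute force. The code below is a minimal sketch, assuming equal-size point sets with uniform weights 1/n. In that case the LP has an optimal solution supported on a perfect matching (Birkhoff-von Neumann), so enumerating permutations gives the LP value. The example points are made up, and the method takes n! time, so it is unrelated to the fast algorithms cited on the slide.

```python
import itertools
import math


def matching_cost(A, B):
    """Brute-force min-cost perfect matching between equal-size point sets.

    Returns the LP objective value, i.e. the average matched distance.
    """
    n = len(A)
    best = min(
        sum(math.dist(A[i], B[perm[i]]) for i in range(n))
        for perm in itertools.permutations(range(n))
    )
    return best / n


A = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
B = [(0.0, 1.0), (1.0, 1.0), (3.0, 2.0)]
print(matching_cost(A, B))
```

Here the best matching pairs each point of A with the point of B at the same index, with distances 1, 1, and 3.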

slide-17
SLIDE 17

Solve-And-Sketch (=Divide & Conquer)

  • Partition the space hierarchically in a “nice way”
  • In each part
      • Compute a “solution” for the local view
      • Sketch the solution using small space
      • Combine local sketches into (more) global solution

slide-18
SLIDE 18

Solve-And-Sketch for 2D Matching

  • Partition the space hierarchically in a “nice way”: quad-tree
  • In each part
      • Compute a “solution” for the local view
      • Sketch the solution using small space
      • Combine local sketches into (more) global solution

Difficulty: after committing to a wrong alternation, cannot get <2 approximation! So we cannot precompute any single “local solution”; instead, sketch all potential local solutions.

Sketch of all potential local solutions: small-space sketch of the “solution” function $F: \mathbb{R}^k \to \mathbb{R}_+$

  • input $x \in \mathbb{R}^k$ defines the flow (matching) at the “interface” to the rest
  • $F(x)$ is the min-cost matching assuming flow $x$ at interface

Exists with polylog(n) space
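The hierarchical partition step can be illustrated with a plain quad-tree. The code below is a minimal sketch of the partition only, assuming points in the unit square and a hypothetical `capacity` stopping rule; it does not implement the interface-flow sketches of the solution function.

```python
def quadtree_leaves(points, x0, y0, size, capacity=2, max_depth=16):
    """Recursively split the square [x0, x0+size) x [y0, y0+size) into four
    quadrants until each cell holds at most `capacity` points."""
    if len(points) <= capacity or max_depth == 0:
        return [((x0, y0, size), points)]
    half = size / 2.0
    leaves = []
    for dx in (0, 1):
        for dy in (0, 1):
            cx, cy = x0 + dx * half, y0 + dy * half
            inside = [(x, y) for (x, y) in points
                      if cx <= x < cx + half and cy <= y < cy + half]
            leaves.extend(
                quadtree_leaves(inside, cx, cy, half, capacity, max_depth - 1))
    return leaves


# Made-up points in the unit square [0, 1) x [0, 1).
points = [(0.1, 0.1), (0.15, 0.12), (0.2, 0.3), (0.8, 0.9),
          (0.6, 0.4), (0.7, 0.2), (0.9, 0.1)]
leaves = quadtree_leaves(points, 0.0, 0.0, 1.0)

for (cx, cy, size), cell_points in leaves:
    if cell_points:
        print(f"cell at ({cx:.3f}, {cy:.3f}) of size {size:.3f}: {cell_points}")
```

In a solve-and-sketch scheme, each cell would compute and sketch its local information, and a parent would combine the sketches of its four children.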

slide-19
SLIDE 19

A sketch of the rest

  • Numerical Linear Algebra
      • linear sketching
  • Nearest Neighbor Search
      • ultra-small sketches
  • Min-cost matching in plane
      • specialized sketching
  • Graph sketching
      • Linear sketch for graph => data structures for dynamic connectivity [AGM’12, KKM’13]
  • Characterization of DE-sketch size for metrics:
      • For symmetric norms [BBCKY’17]
  • Adaptive sketching: when we know we sketch set $A \subset \mathbb{R}^d$
      • Then $S(\cdot)$ may depend (weakly) on $A$
      • Non-oblivious subspace embeddings [DMM’06, …, Woodruff’14]
      • Data-dependent LSH [AINR’14, AR’15]

slide-20
SLIDE 20

Bibliography 1

  • Sarlos’06
  • Clarkson-Woodruff’13
  • Nguyen-Nelson’13
  • Mahoney-Meng’13
  • Cohen’16
  • Indyk’00
  • Sohler-Woodruff’11
  • Woodruff-Zhang’13
  • Wang-Woodruff’18 (arxiv)

slide-21
SLIDE 21

Bibliography 2

  • Kushilevitz-Ostrovsky-Rabani’98
  • Indyk-Motwani’98
  • Ailon-Chazelle’06
  • Khot-Naor (unpublished)
  • A-Krauthgamer (unpublished)
  • A-Naor-Nikolov-Razenshteyn-Weingarten’18
  • Altschuler-Weed-Rigolet’17
  • Sharathkumar-Agarwal’12
  • A.-Nikolov-Onak-Yaroslavtsev’14
  • Ahn-Guha-McGregor’12
  • Kapron-King-Mountjoy’13
  • Blasiok-Braverman-Chestnut-Krauthgamer-Yang’17
  • Drineas-Mahoney-Muthukrishnan’06
  • A-Indyk-Nguyen-Razenshteyn’14
  • A-Razenshteyn’15