Communication and Memory Efficient Testing of Discrete Distributions - PowerPoint PPT Presentation



SLIDE 1

Communication and Memory Efficient Testing of Discrete Distributions Themis Gouleakis

USC→MPI

July 21, 2019 Joint work with: Ilias Diakonikolas (USC), Daniel Kane (UCSD) and Sankeerth Rao (UCSD)

SLIDE 2

MOTIVATION

◮ Datasets are growing → too many samples needed!
◮ Can we do property testing in a distributed way?
◮ Insufficient memory!
◮ Design low-memory algorithms!

2 / 15

SLIDE 3

Is the lottery fair?

◮ We can learn the distribution: Ω(n) samples.
◮ Centralized sampling / unbounded memory: we can test (uniform vs. ε-far) with Θ(√n/ε²) samples.
◮ What if we have memory constraints, or centralized sampling is unavailable?


SLIDE 4

DEFINITION AND (CENTRALIZED) PRIOR WORK


Uniformity testing problem

Given samples from a probability distribution p over [n], distinguish p = U_n from ‖p − U_n‖₁ > ε with success probability at least 2/3.

◮ Sample complexity: Θ(√n/ε²) [Goldreich, Ron 00], [Batu, Fischer, Fortnow, Kumar, Rubinfeld, White 01], [Paninski 08], [Chan, Diakonikolas, Valiant, Valiant 14], [Diakonikolas, G., Peebles, Price 17]


SLIDE 5

PRIOR/RELATED WORK

Distributed learning

◮ Parameter estimation: [ZDJW13], [GMN14], [BGMNW16], [JLY16], [HOW18]
◮ Non-parametric: [DGLNOS17], [HMOW18]

Distributed testing

◮ Single sample per machine with sublogarithmic-size messages: [Acharya, Canonne, Tyagi 18]
◮ Two-party setting: [Andoni, Malkin, Nosatzki 18]
◮ LOCAL and CONGEST models: [Fischer, Meir, Oshman 18]


SLIDE 6

CENTRALIZED COLLISION-BASED ALGORITHM

[GOLDREICH, RON 00], [BATU, FISCHER, FORTNOW, KUMAR, RUBINFELD, WHITE 01]

Problem: Given distribution p over [n], distinguish p = U_n from ‖p − U_n‖₁ ≥ ε.

◮ m samples.
◮ Node labels: i.i.d. samples from p.
◮ Edges: {i, j} ∈ E iff L(i) = L(j)
◮ Define statistic Z = #edges ⇒ E[Z] = (m choose 2) · ‖p‖₂²
◮ Minimized for p = U_n
◮ Idea: Draw enough samples and compare Z to some threshold.
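To make the statistic concrete, here is a minimal Python sketch of a collision-based tester; the threshold constant and the sample size chosen below are illustrative assumptions, not the tuned values from these papers.

```python
import random
from collections import Counter

def collision_tester(samples, n, eps):
    """Collision-based uniformity tester (sketch).

    Z = number of colliding pairs among the m samples, so
    E[Z] = C(m, 2) * ||p||_2^2, which is minimized (= C(m, 2)/n) at p = U_n.
    The acceptance threshold below is illustrative, not the tuned constant.
    """
    m = len(samples)
    counts = Counter(samples)
    # Z = sum over labels of C(count, 2) colliding pairs
    z = sum(c * (c - 1) // 2 for c in counts.values())
    threshold = (1 + eps**2 / 2) * m * (m - 1) / (2 * n)
    return z <= threshold  # True -> "looks uniform"

random.seed(0)
n, eps, m = 1000, 0.5, 4000   # m = O(sqrt(n)/eps^2) suffices in theory
uniform = [random.randrange(n) for _ in range(m)]
biased = [random.randrange(n // 4) for _ in range(m)]  # L1-distance 1.5 >= eps
print(collision_tester(uniform, n, eps))  # True
print(collision_tester(biased, n, eps))   # False
```

The key point the code makes visible: Z is a single integer computed from label multiplicities, so no per-sample bookkeeping beyond a histogram is needed.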



SLIDE 8

GENERIC BIPARTITE TESTING ALGORITHM

ℓ SAMPLES PER MACHINE

Problem: Given distribution p over [n], distinguish p = U_n from ‖p − U_n‖₁ ≥ ε.

◮ ℓ samples per machine; the samples are split into two sides S₁ and S₂.
◮ Node labels: i.i.d. samples from p.
◮ Edges: {i, j} ∈ E iff (i ∈ S₁) ∧ (j ∈ S₂) ∧ (L(i) = L(j))
◮ Define statistic Z = #edges ⇒ E[Z] = |S₁| · |S₂| · ‖p‖₂²
◮ Minimized for p = U_n
◮ Remark: Suboptimal sample complexity, but can lead to optimal communication complexity in certain cases.
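A short Python sketch of the cross-collision statistic (the sizes used are illustrative):

```python
import random
from collections import Counter

def cross_collisions(s1, s2):
    """Z = #{(i, j) : i in S1, j in S2, L(i) = L(j)} (cross collisions only).

    E[Z] = |S1| * |S2| * ||p||_2^2, minimized (= |S1||S2|/n) at p = U_n.
    """
    c1, c2 = Counter(s1), Counter(s2)
    return sum(c1[label] * c2[label] for label in c1)

random.seed(1)
n = 500
s1 = [random.randrange(n) for _ in range(2000)]
s2 = [random.randrange(n) for _ in range(2000)]
# For uniform p, E[Z] = 2000 * 2000 / 500 = 8000
print(cross_collisions(s1, s2))
```

Compared with counting all pairs, only pairs straddling the bipartition are counted, which is what makes the distributed implementation on the next slides cheap.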


SLIDE 9

COMMUNICATION MODEL

◮ Unbounded number of players.
◮ Players can broadcast on the blackboard.
◮ The referee asks questions to players and receives replies.
◮ Goal: Minimize the total number of bits of communication.


SLIDE 10

A COMMUNICATION EFFICIENT ALGORITHM

◮ Idea: Statistic Z = sum of degrees on one side.
◮ Only the opposite side needs to reveal its samples exactly.
◮ Broadcast samples: ℓ · |S₁| = √(n/ℓ)/(ε² √log n)
◮ Not enough for testing on their own.
◮ And the samples on the right?
◮ Only the degrees d_k are sent to the referee.
◮ O(1) bits/message w.l.o.g.
◮ Communication complexity: O(√(n/ℓ) · √(log n)/ε²) bits.
◮ Matching lower bound of Ω(√(n/ℓ) · √(log n)/ε²) bits for small ℓ.
◮ Better than the naive O(√n · log n/ε²) bits.
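A toy simulation of this communication pattern (machine counts and sizes are illustrative assumptions, not the parameters from the analysis):

```python
import random
from collections import Counter

def referee_statistic(left_machines, right_machines):
    """Simulate the communication pattern (sketch, not the authors' pseudocode).

    Machines on the left side S1 broadcast their samples exactly (log n bits
    each); each machine k on the right side S2 sends only its degree
    d_k = number of collisions between its own samples and the broadcast
    multiset. The referee outputs Z = sum_k d_k, the sum of degrees.
    """
    broadcast = Counter(s for machine in left_machines for s in machine)
    degrees = [sum(broadcast[s] for s in machine) for machine in right_machines]
    return sum(degrees)

random.seed(2)
n, ell = 200, 50
left = [[random.randrange(n) for _ in range(ell)] for _ in range(8)]
right = [[random.randrange(n) for _ in range(ell)] for _ in range(40)]
# Uniform p: E[Z] = (8*50) * (40*50) / 200 = 4000 cross collisions
print(referee_statistic(left, right))
```

Note that the sum of right-side degrees equals the cross-collision statistic Z exactly, while the right side communicates only one small number per machine.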


SLIDE 11

COMMUNICATION EFFICIENT IMPLEMENTATION

TWO ALGORITHMS

Case I: ℓ = Õ(n^{1/3}/ε^{4/3}) samples/machine

◮ Use cross collisions (bipartite graph).
◮ Communication complexity: O(√(n/ℓ) · √(log n)/ε²) bits.

Case II: ℓ = Ω̃(n^{1/3}/ε^{4/3}) samples/machine

◮ Each machine sends its number of local collisions to the referee.
◮ The referee computes the total sum Z of the collisions.
◮ E[Z] = (ℓ choose 2) · ‖p‖₂² collisions expected per machine.
◮ Threshold: (1 + ε²) · E[Z] (expectation under the uniform distribution).
◮ Communication complexity: O(n log n/(ℓ² ε⁴)) bits.
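A Python sketch of Case II (the threshold constant and parameters are illustrative assumptions):

```python
import random
from collections import Counter

def local_collisions(samples):
    """Number of colliding pairs among one machine's own samples."""
    counts = Counter(samples)
    return sum(c * (c - 1) // 2 for c in counts.values())

def case_two_test(machines, n, eps):
    """Case II sketch: each machine sends only its local collision count
    (an O(log)-bit number); the referee sums them into Z and compares
    against a threshold near (1 + eps^2) * E_uniform[Z], where
    E_uniform[Z] = #machines * C(ell, 2) / n.
    """
    z = sum(local_collisions(m) for m in machines)
    num, ell = len(machines), len(machines[0])
    threshold = (1 + eps**2) * num * ell * (ell - 1) / (2 * n)
    return z <= threshold  # True -> "looks uniform"

random.seed(3)
n, ell, num, eps = 400, 100, 200, 0.8
uniform = [[random.randrange(n) for _ in range(ell)] for _ in range(num)]
biased = [[random.randrange(n // 4) for _ in range(ell)] for _ in range(num)]
print(case_two_test(uniform, n, eps))  # True
print(case_two_test(biased, n, eps))   # False
```

Here no raw samples cross the network at all: once ℓ is large, each machine has enough internal collisions for its local count to carry signal.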


SLIDE 12

MEMORY EFFICIENT IMPLEMENTATION

IN THE ONE-PASS STREAMING MODEL

Model:

One-pass streaming algorithm: The samples arrive in a stream and the algorithm can access them only once.

Memory constraint: At most m bits, for some m ≥ log n/ε⁶.

◮ Use N₁ = m/(2 log n) samples to get the multiset of labels S₁.
◮ Use collision information from N₂ = Θ(n log n/(m ε⁴)) further samples (i.e., the multiset of labels S₂).

Remarks:
◮ We can store the partial sums Σ_{k=1}^{r} d_k, 1 ≤ r ≤ N₂, in a single pass.
◮ For m = Ω(√n log n/ε²), we simply run the classical collision-based tester using the first O(√n/ε²) samples.
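The one-pass pattern can be sketched as follows (phase sizes are illustrative; a real implementation would also compare the final sum against a threshold):

```python
import random
from collections import Counter

def one_pass_degree_sum(stream, n1):
    """One-pass streaming sketch.

    Phase 1: store the first n1 labels (the multiset S1), costing about
    n1 * log n bits. Phase 2: for each later sample, add its degree
    against S1 to a running sum -- only O(log)-many extra bits of state,
    so the partial sums sum_{k<=r} d_k are available in a single pass.
    """
    s1 = Counter()
    z = n2 = 0
    for t, x in enumerate(stream):
        if t < n1:
            s1[x] += 1      # phase 1: memorize S1
        else:
            z += s1[x]      # phase 2: keep only the running degree sum
            n2 += 1
    return z, n2

random.seed(4)
n, n1 = 300, 150
stream = (random.randrange(n) for _ in range(1000))  # each sample seen once
z, n2 = one_pass_degree_sum(stream, n1)
# Uniform p: E[z] = n1 * n2 / n = 150 * 850 / 300 = 425
print(z, n2)
```

The stream is a generator, so each sample really is accessed exactly once, matching the one-pass model.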


SLIDE 13

SUMMARY OF RESULTS

Sample Complexity Bounds with Memory Constraints

Uniformity:
◮ Upper bound: O(n log n/(m ε⁴)), for n^{0.9} ≫ m ≫ log(n)/ε²
◮ Lower bound 1: Ω(n log n/(m ε⁴)), for m = Ω̃(n^{0.34}/ε^{8/3} + n^{0.1}/ε⁴)
◮ Lower bound 2: Ω(n/(m ε²)), unconditional

Closeness:
◮ Upper bound: O(n √(log n)/(√m ε²)), for Θ̃(min(n, n^{2/3}/ε^{4/3})) ≫ m ≫ log(n)

Communication Complexity Bounds

Uniformity:
◮ Upper bound 1: O(√(n log(n)/ℓ)/ε²), for ℓ = Õ(n^{1/3}/ε^{4/3})
◮ Upper bound 2: O(n log(n)/(ℓ² ε⁴)), for ℓ = Ω̃(n^{1/3}/ε^{4/3})
◮ Lower bound 1: Ω(√(n log(n)/ℓ)/ε²), for ε⁸ n log n ≫ ℓ ≫ ε⁻⁴ n^{0.9}
◮ Lower bound 2: Ω(√(n/ℓ)/ε), for ℓ ≪ ε^{4/3} n^{0.3}
◮ Lower bound 3: Ω(n/(ℓ² ε² log n)), for ℓ ≪ √n/ε²

Closeness:
◮ Upper bound: O(n^{2/3} log^{1/3}(n)/(ℓ^{2/3} ε^{4/3})), for n ε⁴/log(n) ≫ ℓ
SLIDE 14

LOWER BOUNDS (ONE PASS)

k SAMPLES, m BITS OF MEMORY, ℓ SAMPLES PER MACHINE

1. Memory:

◮ k · m = Ω(n/ε²)
◮ Under technical assumptions: k · m = Ω(n log n/ε⁴)

Reduction (low communication ⇒ low memory):
◮ samples/machine: ℓ
◮ bits of communication: t
◮ Store the samples of the next player only ⇒ (t + ℓ log n)-bit memory.

2. Communication, ℓ = O(n^{1/3}/(ε^{4/3} (log n)^{1/3})), one pass:

◮ Ω(√(n/ℓ)/ε) samples.
◮ Under assumptions: Ω(√(n log n/ℓ)/ε²)

3. Communication, ℓ = Ω(n^{1/3}/(ε^{4/3} (log n)^{1/3})), one pass:

◮ Ω(n/(ℓ² ε² log n)) samples.


SLIDE 15

SUMMARY-OPEN PROBLEMS

◮ We described a bipartite collision-based algorithm for uniformity.

◮ Then applied it to memory constrained and distributed settings.

◮ Showed matching lower bounds for certain parameter regimes.

◮ An asymptotically optimal algorithm becomes (provably) suboptimal as ℓ grows.

Open Problems:
◮ Do the lower bounds still hold if multiple passes are allowed?
◮ Is there an algorithm with a better communication-sample complexity trade-off?
