
fifty shades of adaptivity (in property testing)

An Adaptivity Hierarchy Theorem for Property Testing

Clément Canonne (Columbia University) July 9, 2017

Joint work with Tom Gur (Weizmann Institute → UC Berkeley)


“property testing?”


why?

Sublinear, approximate, randomized decision algorithms that make queries:

∙ Big object: too big
∙ Expensive access: pricey data
∙ “Model selection”: many options
∙ Good enough: a priori knowledge

Need to infer information – one bit – from the data: quickly, or with very few lookups.

how?

∙ Known space (say, {0, 1}^N)
∙ Property P ⊆ {0, 1}^N
∙ Query (oracle) access to unknown x ∈ {0, 1}^N
∙ Proximity parameter ε ∈ (0, 1]

Must decide: x ∈ P, or d(x, P) > ε? (and be correct on any x with probability at least 2/3)

Property Testing: in an (egg)shell.


it’s complicated

Many flavors… one-sided vs. two-sided, query-based vs. sample-based, uniform vs. distribution-free, adaptive vs. non-adaptive.

adaptivity

our focus: adaptivity

Non-adaptive algorithm: makes all its queries upfront, Q = Q(ε, r) = {i_1, . . . , i_q} ⊆ [N], a function of ε and its internal randomness r alone.

Adaptive algorithm: each query can depend arbitrarily on the previous answers, i.e., the (j+1)-st query i_{j+1} is a function of ε, r, and x_{i_1}, . . . , x_{i_j}.
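A small illustration of the two models (illustrative code, not from the talk): the non-adaptive algorithm commits to its query set from ε and r alone, while an adaptive one, here binary search for the 0→1 boundary of a monotone bit string, derives each index from the earlier answers.

```python
import math
import random

def nonadaptive_queries(eps, r, N):
    """Non-adaptive: the whole query set is a function of eps and r alone."""
    rng = random.Random(r)
    return [rng.randrange(N) for _ in range(math.ceil(2 / eps))]

def adaptive_boundary(query, N):
    """Adaptive: binary search for the first 1 in a monotone 0/1 string
    (assumed to contain at least one 1); each query depends on all
    previous answers, using about log2(N) queries instead of N."""
    lo, hi = 0, N - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if query(mid) == 1:
            hi = mid          # boundary is at mid or to its left
        else:
            lo = mid + 1      # boundary is strictly to the right
    return lo
```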


some observations

∙ Dense graph model: at most a quadratic gap between adaptive and non-adaptive algorithms, q vs. 2q² [AFKS00, GT03], [GR11].
∙ Boolean functions: at most an exponential gap between adaptive and non-adaptive algorithms, q vs. 2^q.
∙ Bounded-degree graph model: everything is possible, O(1) vs. Ω(√n) [RS06].
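The exponential simulation behind “q vs. 2^q” is mechanical: a depth-q decision tree mentions at most 2^q − 1 distinct coordinates, so a non-adaptive algorithm can query them all upfront and replay the tree offline. A sketch, with a hypothetical tuple-based tree encoding (not from the talk):

```python
# A decision tree is either a leaf string ("accept"/"reject") or a tuple
# (i, left, right): query coordinate i, go right on answer 1, left on 0.

def adaptive_eval(tree, query):
    """Adaptive evaluation: walks one root-to-leaf path, <= depth queries."""
    while not isinstance(tree, str):
        i, left, right = tree
        tree = right if query(i) else left
    return tree

def nonadaptive_eval(tree, query):
    """Non-adaptive simulation: query every coordinate appearing anywhere
    in the tree (<= 2^q - 1 of them for depth q), then replay offline."""
    def coords(t):
        if isinstance(t, str):
            return set()
        i, left, right = t
        return {i} | coords(left) | coords(right)
    answers = {i: query(i) for i in coords(tree)}   # a single batch
    while not isinstance(tree, str):
        i, left, right = tree
        tree = right if answers[i] else left
    return tree
```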


why should we care?

Of course, fewer queries is always better. But many parallel queries can beat few sequential ones: understanding the benefits and tradeoffs of adaptivity is crucial.


this work

A closer look: does the power of testing algorithms smoothly grow with the “amount of adaptivity”?

(and what does “amount of adaptivity” even mean?)


coming up with a definition

Definition (Round-Adaptive Testing Algorithms)

Let Ω be a domain of size n, and k, q ≤ n. A randomized algorithm is said to be a (k, q)-round-adaptive tester for P ⊆ 2^Ω if, on input ε ∈ (0, 1] and granted query access to f: Ω → {0, 1}:

(i) Query Generation: the algorithm proceeds in k + 1 rounds, such that at round ℓ ≥ 0 it produces a set of queries Q_ℓ := {x^(ℓ)_1, . . . , x^(ℓ)_{|Q_ℓ|}} ⊆ Ω, based on its own internal randomness and the answers to the previous sets of queries Q_0, . . . , Q_{ℓ−1}, and receives f(Q_ℓ) = {f(x^(ℓ)_1), . . . , f(x^(ℓ)_{|Q_ℓ|})};
(ii) Completeness: if f ∈ P, then it outputs accept with probability at least 2/3;
(iii) Soundness: if dist(f, P) > ε, then it outputs reject with probability at least 2/3.

The query complexity q of the tester is the total number of queries made to f, i.e., q = ∑_{ℓ=0}^{k} |Q_ℓ|.
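The round structure of this definition can be captured in a short skeleton (a sketch only; `plan_round` and `decide` are hypothetical callbacks standing in for the tester's strategy):

```python
import random

def round_adaptive_test(f, eps, k, plan_round, decide):
    """(k, q)-round-adaptive tester skeleton: k + 1 batches of queries,
    where each batch may depend on eps, the internal randomness, and every
    answer from earlier rounds, but not on answers within its own round."""
    r = random.random()              # internal randomness
    answers = {}                     # point -> f(point), accumulated over rounds
    for rnd in range(k + 1):
        batch = plan_round(rnd, eps, r, dict(answers))
        for x in batch:              # the whole batch is answered at once
            answers[x] = f(x)
    return decide(eps, r, answers)   # "accept"/"reject"; q = total batch size
```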


that was a mouthful, but… (i can’t draw)


some remarks

∙ Other possible choices: e.g., tail-adaptive
∙ Probability amplification
∙ Similar in spirit to…


we have a definition…

… now, what do we do with it? Does the power of testing algorithms smoothly grow with the number of rounds of adaptivity?

our results

we have a question…

… and we have an answer. Yes, the power of testing algorithms smoothly grows with the number of rounds of adaptivity.

Theorem (Hierarchy Theorem I). For every n ∈ N and 0 ≤ k ≤ n^0.33, there is a property P_{n,k} of strings over F^n such that:
(i) there exists a k-round-adaptive tester for P_{n,k} with query complexity Õ(k), yet
(ii) any (k − 1)-round-adaptive tester for P_{n,k} must make Ω̃(n/k²) queries.


can we have something a bit less contrived?

It’s only natural. Yes, that also happens for actual things people care about.

Theorem (Hierarchy Theorem II). Let k ∈ N be a constant. Then,
(i) there exists a k-round-adaptive tester with query complexity O(1/ε) for (2k + 1)-cycle freeness in the bounded-degree graph model; yet
(ii) any (k − 1)-round-adaptive tester for (2k + 1)-cycle freeness in the bounded-degree graph model must make Ω(√n) queries, where n is the number of vertices in the graph.
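The structural reason rounds help in (i) can be sketched (this is not the paper's tester and it ignores the ε-far analysis): a k-round-adaptive algorithm gets k + 1 query batches, enough to learn every edge of a (2k + 1)-cycle through a fixed start vertex, since all its vertices lie within distance k; an odd cycle then appears as a 2-coloring failure. Here `neighbors` is a hypothetical adjacency oracle for the bounded-degree graph:

```python
def ball_in_rounds(neighbors, v0, rounds):
    """Explore around v0 with batched queries: round L queries the entire
    current frontier at once, i.e., one adaptive round per BFS layer."""
    seen, frontier, edges = {v0}, [v0], []
    for _ in range(rounds):
        batch = {v: neighbors(v) for v in frontier}  # one round of queries
        frontier = []
        for v, nbrs in batch.items():
            for u in nbrs:
                edges.append((v, u))
                if u not in seen:
                    seen.add(u)
                    frontier.append(u)
    return seen, edges

def has_odd_cycle(edges):
    """Try to 2-color the explored subgraph; failure certifies an odd cycle."""
    adj = {}
    for v, u in edges:
        adj.setdefault(v, []).append(u)
        adj.setdefault(u, []).append(v)
    color = {}
    for s in adj:
        if s in color:
            continue
        color[s] = 0
        stack = [s]
        while stack:
            v = stack.pop()
            for u in adj[v]:
                if u not in color:
                    color[u] = 1 - color[v]
                    stack.append(u)
                elif color[u] == color[v]:
                    return True   # odd cycle found
    return False
```

For k = 2, the k + 1 = 3 batches starting from a vertex of a 5-cycle already reveal the whole cycle, while a 6-cycle stays 2-colorable.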

18

outline of the proof

Main idea: getting a hierarchy theorem directly for property testing seems hard; but we know how to get one easily in the decision tree complexity model. Can we lift it to property testing?

Function f hard to compute in k rounds (but easy in k + 1)
⇕
Property C_f hard to test in k rounds (but easy in k + 1)

outline of the proof, ct’d

Fix any α > 0. Let C: F_n^n → F_n^m be a code with constant relative distance δ(C) > 0, with
∙ linearity: ∀i ∈ [m], there is a^(i) ∈ F_n^n s.t. C(x)_i = ⟨a^(i), x⟩ for all x;
∙ rate: m ≤ n^{1+α};
∙ testability: C is a one-sided LTC* with a non-adaptive tester;
∙ decodability: C is an LDC.*

Theorem ([GGK15]). These things exist.*
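Only the linearity interface is easy to sketch: a random matrix over a small prime field shows how each codeword symbol is a single inner product, hence locally computable. The distance, LTC, and LDC guarantees of [GGK15] are not modeled here, and `P` and `make_code` are hypothetical stand-ins:

```python
import random

P = 101  # a small prime, standing in for the field size

def make_code(n, m, seed=0):
    """A linear map C: F_P^n -> F_P^m with C(x)_i = <a^(i), x> mod P.
    Returns a coordinate oracle: one codeword symbol per call."""
    rng = random.Random(seed)
    A = [[rng.randrange(P) for _ in range(n)] for _ in range(m)]
    def coord(x, i):
        # the i-th symbol is a single inner product with the row a^(i)
        return sum(a * b for a, b in zip(A[i], x)) % P
    return coord
```

Linearity means C(x + y) = C(x) + C(y) coordinate-wise, which is what local tests exploit.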

outline of the proof, ct’d ct’d


For any f: F_n^n → {0, 1}, consider the subset of codewords

C_f := C(f^{−1}(1)) = { C(x) : x ∈ F_n^n, f(x) = 1 } ⊆ C

Lemma (LDT ⇝ PT). A k-round-adaptive tester for C_f with query complexity q implies a k-round-adaptive LDT* algorithm for f with query complexity q.

Lemma (PT ⇝ DT). A k-round-adaptive DT algorithm for f with query complexity q implies a k-round-adaptive tester for C_f with query complexity Õ(q).

Transference lemmas
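At toy scale, C_f can be built by brute force, which makes the construction tangible (illustrative parameters only; the real construction relies on the code's distance so that testing C_f forces computing f):

```python
from itertools import product

def encode(x, A, p):
    """Linear encoding: the i-th symbol is <a^(i), x> mod p."""
    return tuple(sum(a * b for a, b in zip(row, x)) % p for row in A)

def build_Cf(f, n, A, p):
    """C_f := { C(x) : x in F_p^n, f(x) = 1 }, by exhaustive enumeration."""
    return {encode(x, A, p) for x in product(range(p), repeat=n) if f(x) == 1}
```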


outline of the proof, ct’d ct’d ct’d

Putting it together: apply the above for f being the k-iterated address function f_k: F_n^n → {0, 1}.

Lemma. For every 0 ≤ k ≤ Õ(n^{1/3}), no k-round-adaptive LDT algorithm can compute f_{k+1} with o(n/(k² log n)) queries.

Proof: reduction to communication complexity,* lower bound of [NW93] on the “pointer-following” problem.
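The pointer-following intuition behind the lower bound: each lookup's address is only revealed by the previous answer, so an algorithm with fewer rounds must hedge over many candidate locations per round. A minimal sketch (not the formal definition of f_k from the paper):

```python
def follow_pointers(pointers, start, k):
    """Follow k pointer hops: the (j+1)-st read's address is the j-th
    answer, which is what makes the problem inherently sequential.
    A k-round-adaptive algorithm needs only one read per round."""
    v = start
    for _ in range(k):
        v = pointers[v]   # this address was unknown before the previous read
    return v
```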

other results

the end is nigh

open questions


∙ Can we swap the quantifiers in the theorems? (∀k ∃P_k ⇝ ∃P ∀k)
∙ Can we prove that for t-juntas?
∙ Can we simulate k rounds with ℓ rounds?
∙ Other applications of the transference lemmas?


conclusion

∙ A strong hierarchy theorem for adaptivity in property testing
∙ Also holds for some natural properties
∙ Some debatable choice of title
∙ Codes are great!


Thank you


References

[AFKS00] Noga Alon, Eldar Fischer, Michael Krivelevich, and Mario Szegedy. Efficient testing of large graphs. Combinatorica, 20(4):451–476, 2000.
[GGK15] Oded Goldreich, Tom Gur, and Ilan Komargodski. Strong locally testable codes with relaxed local decoders. In Conference on Computational Complexity, volume 33 of LIPIcs, pages 1–41. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2015.
[GR11] Oded Goldreich and Dana Ron. Algorithmic aspects of property testing in the dense graphs model. SIAM J. Comput., 40(2):376–445, 2011.
[GT03] Oded Goldreich and Luca Trevisan. Three theorems regarding testing graph properties. Random Struct. Algorithms, 23(1):23–57, 2003.
[NW93] Noam Nisan and Avi Wigderson. Rounds in communication complexity revisited. SIAM J. Comput., 22(1):211–219, 1993.
[RS06] Sofya Raskhodnikova and Adam D. Smith. A note on adaptivity in testing properties of bounded degree graphs. Electronic Colloquium on Computational Complexity (ECCC), 13(089), 2006.