Sublinear Algorithms, Lecture 4



slide-1
SLIDE 1

9/15/2020

Sublinear Algorithms

LECTURE 4

Last time

  • Testing if a graph is connected.
  • Estimating the number of connected components.
  • Estimating the weight of an MST

Today

  • Limitations of sublinear-time algorithms
  • Yao’s Minimax Principle

Sofya Raskhodnikova; Boston University

slide-2
SLIDE 2

Query Complexity

  • Query complexity of an algorithm is the maximum number of queries the algorithm makes.

– Usually expressed as a function of the input length (and other parameters).
– Example: the test for sortedness (from Lecture 2) had query complexity O(log n) for constant ε; more precisely, O((log n)/ε).
– Running time ≥ query complexity.

  • Query complexity of a problem P, denoted q(P), is the query complexity of the best algorithm for the problem.

– What is q(testing sortedness)? How do we know that there is no better algorithm?

Today: Techniques for proving lower bounds on q(P).

slide-3
SLIDE 3

Yao’s Principle

A Method for Proving Lower Bounds

slide-4
SLIDE 4

Yao’s Minimax Principle

Consider a computational problem on a finite domain.

  • The following statements are equivalent.

Statement 1: For any probabilistic algorithm A of complexity q, there exists an input x s.t. Pr over the coin tosses of A [A(x) is wrong] > 1/3.

Statement 2: There is a distribution D on the inputs s.t. for every deterministic algorithm A of complexity q, Pr_{x←D}[A(x) is wrong] > 1/3.

  • Needed for lower bounds: Yao’s Minimax Principle (easy direction): Statement 2 ⇒ Statement 1.

slide-5
SLIDE 5

Proof of Easy Direction of Yao’s Principle

  • Consider a finite set of inputs X (e.g., all inputs of length n).
  • Consider a randomized algorithm that takes an input x ∈ X, makes ≤ q queries to x and outputs accept or reject.
  • Every randomized algorithm can be viewed as a distribution μ on deterministic algorithms (which are decision trees).
  • Let Y be the set of all q-query deterministic algorithms that run on inputs in X.
slide-6
SLIDE 6

Proof of Easy Direction of Yao’s Principle

  • Consider a matrix M with

– rows indexed by inputs x from X,
– columns indexed by algorithms y from Y,
– entry M(x, y) = 1 if algorithm y is correct on input x, and M(x, y) = 0 if algorithm y is wrong on input x.

  • Then an algorithm A is a distribution μ over columns Y with probabilities satisfying Σ_{y∈Y} μ(y) = 1.

[Figure: the 0/1 matrix M, with rows x1, x2, … and columns y1, y2, …]

slide-7
SLIDE 7

Rephrasing Statements 1 and 2 in Terms of M

Statement 1: For any probabilistic algorithm A of complexity q, there exists an input x s.t. Pr over the coin tosses of A [A(x) is wrong] > 1/3.

  • In terms of M: for all distributions μ over columns Y, there exists a row x s.t. Pr_{y←μ}[M(x, y) = 0] > 1/3.

Statement 2: There is a distribution D on the inputs s.t. for every deterministic algorithm A of complexity q, Pr_{x←D}[A(x) is wrong] > 1/3.

  • In terms of M: there is a distribution D over rows X s.t. for all columns y, Pr_{x←D}[M(x, y) = 0] > 1/3.

slide-8
SLIDE 8

Statement 2 ⇒ Statement 1

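  • Suppose there is a distribution D over X s.t. for all columns y, Pr_{x←D}[M(x, y) = 0] > 1/3.
  • Then for all distributions μ over Y, Pr_{x←D, y←μ}[M(x, y) = 0] > 1/3, because this probability is an average (weighted by μ) of the column probabilities above, each of which exceeds 1/3.
  • Then for all distributions μ over Y, there exists a row x with Pr_{y←μ}[M(x, y) = 0] > 1/3, because the average over rows x ← D exceeds 1/3, so some row must exceed 1/3.

[Figure: the 0/1 matrix M, with rows x1, x2, … and columns y1, y2, …]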
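The averaging argument can be checked numerically on a tiny matrix. The sketch below is only an illustration: the 3×3 matrix `M`, the distributions `D` and `mu`, and all function names are hypothetical choices, not part of the lecture. It computes the error of each column under D, the joint error, and the error of each row under μ, using exact fractions.

```python
from fractions import Fraction

# Hypothetical 3x3 example: M[x][y] = 1 if deterministic algorithm y
# is correct on input x, and 0 if it is wrong.
M = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]
ROWS, COLS = range(len(M)), range(len(M[0]))

def column_error(D, y):
    """Pr over x <- D that deterministic algorithm y is wrong."""
    return sum(D[x] for x in ROWS if M[x][y] == 0)

def row_error(mu, x):
    """Pr over y <- mu that the randomized algorithm mu is wrong on input x."""
    return sum(mu[y] for y in COLS if M[x][y] == 0)

def joint_error(D, mu):
    """Pr over x <- D and y <- mu that M(x, y) = 0."""
    return sum(D[x] * mu[y] for x in ROWS for y in COLS if M[x][y] == 0)

D = [Fraction(1, 3)] * 3                                # Lola's distribution on inputs
mu = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]  # Al's randomized algorithm

best_column_error = min(column_error(D, y) for y in COLS)  # best deterministic algorithm against D
average_error = joint_error(D, mu)
worst_row_error = max(row_error(mu, x) for x in ROWS)      # worst input for mu

print(best_column_error, average_error, worst_row_error)
```

In this example every column errs with probability 2/3 under D, so the joint error is at least 2/3 for any μ, and some row must err with probability at least 2/3 as well.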

slide-9
SLIDE 9

Yao’s Principle (Easy Direction)

Statement 1: For any probabilistic algorithm A of complexity q, there exists an input x s.t. Pr over the coin tosses of A [A(x) is wrong] > 1/3.

Statement 2: There is a distribution D on the inputs s.t. for every deterministic algorithm A of complexity q, Pr_{x←D}[A(x) is wrong] > 1/3.

  • Needed for lower bounds: Yao’s Minimax Principle (easy direction): Statement 2 ⇒ Statement 1.

NOTE: Also applies to restricted algorithms

  • 1-sided error tests
  • nonadaptive tests

slide-10
SLIDE 10

Yao’s Minimax Principle as a game

Players: Evil algorithm designer Al and poor lower bound prover Lola.

Game 1
Move 1. Al selects a q-query randomized algorithm A for the problem.
Move 2. Lola selects an input on which A errs with largest probability.

Game 2
Move 1. Lola selects a distribution on inputs.
Move 2. Al selects a q-query deterministic algorithm with as large a probability of success on Lola’s distribution as possible.

slide-11
SLIDE 11

Toy Example: a Lower Bound for Testing 0*

Input: string of n bits
Question: Does the string contain only 0’s or is it ε-far from the all-0 string?

  • Claim. Any algorithm needs Ω(1/ε) queries to answer this question w.p. ≥ 2/3.

Proof: By Yao’s Minimax Principle, enough to prove Statement 2.

Distribution D on n-bit strings

  • Divide the input string into 1/ε blocks of size εn.
  • Let y_i be the string where the ith block is 1s and the remaining bits are 0.
  • Distribution D gives the all-0 string w.p. 1/2 and y_i w.p. 1/2, where i is chosen uniformly at random from 1, …, 1/ε.

[Figure: an n-bit string divided into blocks of size εn; one block is all 1s and the remaining blocks are all 0s]

slide-12
SLIDE 12

A Lower Bound for Testing 0*

  • Claim. Any ε-test for 0* needs Ω(1/ε) queries.

Proof (continued): Now fix a deterministic tester A making q < 1/(3ε) queries.

1. A must accept if all answers are 0. Otherwise, it would be wrong on the all-0 string, that is, with probability 1/2 with respect to D.
2. Let i1, …, iq be the positions A queries when it sees only 0s. The test can choose its queries based on previous answers. However, since all these answers are 0 and since A is deterministic, the query positions are fixed.

  • At least 1/ε − q > 2/(3ε) of the blocks do not hold any queried indices.
  • Therefore, A accepts > 2/3 of the inputs y_i. Thus, it is wrong with probability > (2/(3ε)) ⋅ (ε/2) = 1/3.

Context: [Alon Krivelevich Newman Szegedy 99]

Every regular language can be tested in O(1/ε polylog 1/ε) time.

[Figure: an n-bit string divided into blocks of size εn; one block is all 1s and the remaining blocks are all 0s]
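The counting in this proof can be reproduced exactly for concrete parameters. The following is a minimal sketch under stated assumptions: n is divisible by 1/ε, the tester is deterministic, and (as step 1 forces) it accepts whenever all answers are 0. The function name `error_on_D` and the parameter `inv_eps` (standing for 1/ε) are hypothetical.

```python
from fractions import Fraction

def error_on_D(n, inv_eps, queried_positions):
    """Exact error probability, under distribution D, of a deterministic
    tester that queries the given fixed positions (0-based) and accepts
    iff every answer is 0.

    Assumes n is divisible by inv_eps (= 1/eps), so blocks have size n/inv_eps.
    """
    block_size = n // inv_eps
    hit_blocks = {p // block_size for p in queried_positions}
    # The all-0 string (probability 1/2) is accepted, which is correct.
    # Each y_i (probability 1/2 * 1/inv_eps) is wrongly accepted exactly
    # when block i contains no queried position.
    missed_blocks = inv_eps - len(hit_blocks)
    return Fraction(1, 2) * Fraction(missed_blocks, inv_eps)

# 1/eps = 30 blocks of size 10; q = 9 < 1/(3 eps) = 10 queries,
# each landing in a different block.
err = error_on_D(n=300, inv_eps=30, queried_positions=range(0, 90, 10))
print(err)
```

With 9 queries at most 9 of the 30 blocks are hit, so at least 21 blocks are missed and the error is at least (1/2)(21/30) = 7/20, which exceeds 1/3.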

slide-13
SLIDE 13

A Lower Bound for Testing Sortedness

Input: a list of n numbers x1, x2, ..., xn
Question: Is the list sorted or ε-far from sorted?
Already saw: two different O((log n)/ε) time testers.
Known [Ergün Kannan Kumar Rubinfeld Viswanathan 98, Fischer 01]: Ω(log n) queries are required for all constant ε ≤ 1/2.
Today: Ω(log n) queries are required for all constant ε ≤ 1/2 for every 1-sided error nonadaptive test.

  • A test has 1-sided error if it always accepts all YES instances.
  • A test is nonadaptive if its queries do not depend on answers to previous queries.

[Figure: property tester diagram. YES instances: accept with probability ≥ 2/3. Instances ε-far from YES: reject with probability ≥ 2/3. Instances in between: don’t care.]

slide-14
SLIDE 14

1-Sided Error Tests Must Catch “Mistakes”

  • A pair (i, j) is violated if i < j but x_i > x_j.
  • Claim. A 1-sided error test can reject only if it finds a violated pair.

Proof: Every sorted partial list can be extended to a sorted list.

Example of a sorted partial list: 1 ? ? 4 … 7 ? ? 9

slide-15
SLIDE 15

Yao’s Principle Game [Jha]

Lola’s distribution is uniform over the following log n lists:

[Figure: the lists ℓ1, ℓ2, ℓ3, …, ℓ_{log n}]

Claim 1. All lists above are 1/2-far from sorted.

Claim 2. Every pair (i, j) is violated in exactly one list above.
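The exact lists from the slide figure are not recoverable here, so the sketch below uses one standard construction that satisfies both claims; it is an assumption, not necessarily the lists shown in the lecture. List ℓ_k splits positions into blocks of size n/2^(k−1); inside each block the first half holds a larger value than the second half, and values increase from block to block. The names `lola_list` and `longest_nondecreasing` are hypothetical.

```python
from bisect import bisect_right
from itertools import combinations

def lola_list(n, k):
    """Assumed construction of list l_k for n a power of 2 and 1 <= k <= log2(n).

    Blocks have size n / 2^(k-1). In block b the first half holds 2b + 1
    and the second half holds 2b, so only pairs that straddle the two
    halves of the same block are violated.
    """
    block = n >> (k - 1)
    half = block // 2
    return [2 * (p // block) + (1 if p % block < half else 0) for p in range(n)]

def longest_nondecreasing(seq):
    """Length of a longest non-decreasing subsequence (patience sorting)."""
    tails = []
    for v in seq:
        pos = bisect_right(tails, v)
        if pos == len(tails):
            tails.append(v)
        else:
            tails[pos] = v
    return len(tails)

n, log_n = 16, 4
lists = [lola_list(n, k) for k in range(1, log_n + 1)]

# Claim 1: the number of entries that must change equals n minus the
# longest non-decreasing subsequence, which is n/2 for every list.
distances = [n - longest_nondecreasing(l) for l in lists]

# Claim 2: count, for each pair i < j, the lists in which it is violated.
violation_counts = {
    (i, j): sum(l[i] > l[j] for l in lists)
    for i, j in combinations(range(n), 2)
}
print(distances, set(violation_counts.values()))
```

For n = 8 this construction gives ℓ1 = 1 1 1 1 0 0 0 0, ℓ2 = 1 1 0 0 3 3 2 2 and ℓ3 = 1 0 3 2 5 4 7 6.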

slide-16
SLIDE 16

Yao’s Principle Game: Al’s Move

Al picks a set Q = {a1, a2, …, a_|Q|} of positions to query, listed in increasing order.

  • His test must be correct, i.e., must find a violated pair with probability ≥ 2/3 when the input is picked according to Lola’s distribution.
  • Q contains a violated pair ⇔ (a_i, a_{i+1}) is violated for some i.
  • By the Union Bound, Pr_{ℓ←Lola’s distribution}[(a_i, a_{i+1}) is violated in list ℓ for some i] ≤ (|Q| − 1)/log n.
  • If |Q| ≤ (2/3) log n then this probability is < 2/3.
  • So, |Q| = Ω(log n).
  • By Yao’s Minimax Principle, every randomized 1-sided error nonadaptive test for sortedness must make Ω(log n) queries.

[Figure: the queried positions a1, a2, a3, …, a_|Q| marked on the list]
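The union bound can be made concrete under an assumed block construction for Lola’s lists (list k uses blocks of size n/2^(k−1) and violates exactly the pairs that straddle the two halves of a block). This is an assumption for illustration, not necessarily the lecture’s exact lists. Under it, a pair of 0-based positions is violated in the single list determined by the highest bit where the positions differ. The function names are hypothetical.

```python
from fractions import Fraction

def violating_level(i, j, log_n):
    """Index k (1-based) of the unique list in which the pair (i, j) of
    0-based positions is violated, under the assumed block construction.
    The highest differing bit of i and j decides where they are split."""
    return log_n - (i ^ j).bit_length() + 1

def detection_probability(queries, log_n):
    """Probability, over a uniformly random list, that the query set
    contains a violated pair. Only adjacent queried positions matter."""
    a = sorted(queries)
    levels = {violating_level(a[t], a[t + 1], log_n) for t in range(len(a) - 1)}
    return Fraction(len(levels), log_n)

log_n = 10                      # n = 1024
Q = [0, 1, 2, 512, 1023]
p = detection_probability(Q, log_n)
print(p, Fraction(len(Q) - 1, log_n))
```

Each adjacent pair contributes at most one list, so the probability is at most (|Q| − 1)/log n; in this example the four adjacent pairs hit four different lists and the bound is tight.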

slide-17
SLIDE 17

Testing Monotonicity of functions on Hypercube

Non-adaptive 1-sided error Lower Bound

slide-18
SLIDE 18

Boolean Functions f : {0,1}^n → {0,1}

Graph representation: n-dimensional hypercube

  • 2^n vertices: bit strings of length n
  • 2^(n−1)·n edges: (x, y) is an edge if y can be obtained from x by increasing one bit from 0 to 1
  • each vertex x is labeled with f(x)

Example of an edge: x = 001001, y = 011001.

[Figure: the 3-dimensional hypercube with vertices labeled f(000), f(001), f(010), f(011), f(100), f(101), f(110), f(111)]

slide-19
SLIDE 19

Boolean Functions f : {0,1}^n → {0,1}

[Figure: the n-dimensional hypercube drawn with vertices arranged by increasing weight, from f(00⋯00) to f(11⋯11)]

slide-20
SLIDE 20

Monotonicity of Functions

[Goldreich Goldwasser Lehman Ron Samorodnitsky, Dodis Goldreich Lehman Raskhodnikova Ron Samorodnitsky, Fischer Lehman Newman Raskhodnikova Rubinfeld Samorodnitsky]

  • A function f : {0,1}^n → {0,1} is monotone if increasing a bit of x does not decrease f(x).
  • Is f monotone or ε-far from monotone (f has to change on many points to become monotone)?

– Edge xy is violated by f if f(x) > f(y).

Time:

– O(n/ε), logarithmic in the size of the input, 2^n
– Ω(√n/ε) for 1-sided error, nonadaptive tests
– Advanced techniques: Θ(√n/ε²) for nonadaptive tests, Ω(∛n) [Khot Minzer Safra 15, Chen De Servedio Tang 15, Chen Waingarten Xie 17]

[Figure: an example of a monotone function and an example of a function that is 1/2-far from monotone]

slide-21
SLIDE 21

Hypercube 1-sided Error Lower Bound

Lemma [Fischer Lehman Newman Raskhodnikova Rubinfeld Samorodnitsky]

Every 1-sided error nonadaptive test for monotonicity of functions f : {0,1}^n → {0,1} requires Ω(√n) queries.

  • A 1-sided error test must accept if no violated pair is uncovered.

– A distribution on far from monotone functions suffices.

[Figure: a violated pair]

slide-22
SLIDE 22

Hypercube 1-sided Error Lower Bound

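  • Hard distribution: pick coordinate i at random and output f_i.

f_i(x) = 1        if |x| > n/2 + √n
f_i(x) = 1 − x_i  if n/2 − √n ≤ |x| ≤ n/2 + √n
f_i(x) = 0        if |x| < n/2 − √n

Here |x| denotes the weight of x, i.e., the number of 1s in x.

[Figure: f_i on the layers of the hypercube: 1 above the middle band, 1 − coordinate i inside the middle band of width 2√n, 0 below it]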

slide-23
SLIDE 23

The Fraction of Nodes in Middle Layers

Hoeffding Bound

Let Y1, …, Ys be independently distributed random variables in [0,1]. Let Y = (1/s) ⋅ Σ_{i=1}^{s} Yi (called the sample mean). Then Pr[|Y − E[Y]| ≥ ε] ≤ 2e^(−2sε²).

[Figure: f_i on the layers of the hypercube, with the middle band of width 2√n marked; annotations for E[Y] and ε]
slide-24
SLIDE 24

Hypercube 1-sided Error Lower Bound

  • Hard distribution: pick coordinate i at random and output f_i.

f_i(x) = 1        if |x| > n/2 + √n
f_i(x) = 1 − x_i  if n/2 − √n ≤ |x| ≤ n/2 + √n
f_i(x) = 0        if |x| < n/2 − √n

Analysis

  • Edges from (x_1, …, x_{i−1}, 0, x_{i+1}, …, x_n) to (x_1, …, x_{i−1}, 1, x_{i+1}, …, x_n) are violated if both endpoints are in the middle.
  • The middle contains a constant fraction of vertices.
  • All n functions are ε-far from monotone for some constant ε.
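For small n the analysis can be verified by brute force. The sketch below enumerates all hypercube edges for n = 10 and collects those violated by f_i. It assumes that |x| is the number of 1s in x, that the middle band includes its endpoints, and that f_i is 0 below the band; the names `f` and `violated_edges` are hypothetical.

```python
from itertools import product
from math import sqrt

def f(i, x):
    """The function f_i on a vertex x given as a tuple of bits (0-based i)."""
    n, w = len(x), sum(x)
    r = sqrt(n)
    if w > n / 2 + r:
        return 1
    if w < n / 2 - r:
        return 0
    return 1 - x[i]

def violated_edges(i, n):
    """All hypercube edges (x, y), with y obtained from x by raising bit j,
    such that f_i(x) > f_i(y). Returns triples (j, x, y)."""
    out = []
    for x in product((0, 1), repeat=n):
        for j in range(n):
            if x[j] == 0:
                y = x[:j] + (1,) + x[j + 1:]
                if f(i, x) > f(i, y):
                    out.append((j, x, y))
    return out

n, i = 10, 3
edges = violated_edges(i, n)
directions = {j for j, _, _ in edges}
print(len(edges), directions, len(edges) / 2 ** n)
```

All violated edges lie in direction i and are vertex-disjoint, so at least one endpoint of each must change to make f_i monotone. For n = 10 this already forces changes on 492 of the 1024 vertices.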

slide-25
SLIDE 25

Hypercube 1-sided Error Lower Bound

  • How many functions does a set of q queries expose?

# functions that a query pair (x, y) exposes ≤ # coordinates on which x and y differ ≤ 2√n

Example: the pair 001001, 111011 differs on three coordinates i, j, k, so it can expose only functions f_i, f_j and f_k.

Only pairs of queries in the Green Band (the middle band of width 2√n) can be violated ⇒ disagreements ≤ 2√n

Naive Analysis

# functions exposed by q queries ≤ q² ⋅ 2√n

slide-26
SLIDE 26

Hypercube 1-sided Error Lower Bound

  • How many functions does a set of q queries expose?

# functions that a query pair (x, y) exposes ≤ # coordinates on which x and y differ ≤ 2√n

Claim

# functions exposed by q queries ≤ (q − 1) ⋅ 2√n

slide-27
SLIDE 27

Hypercube 1-sided Error Lower Bound

  • How many functions does a set of q queries expose?

Claim

# functions exposed by q queries ≤ (q − 1) ⋅ 2√n

(x, y) is a violated pair
⇓
Some adjacent pair of vertices in a minimum spanning forest on the query set is also violated.

So it is sufficient to consider adjacent vertices in a minimum spanning forest on the query set.
slide-28
SLIDE 28

Hypercube 1-sided Error Lower Bound

  • How many functions does a set of q queries expose?

Claim

# functions exposed by q queries ≤ (q − 1) ⋅ 2√n

⇓

Claim

Every deterministic test that makes a set Q of q queries (in the middle) succeeds with probability O(q/√n) on our distribution.
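The two claims can be checked on a concrete query set. The sketch below computes, by brute force over query pairs, the set of coordinates i for which some pair is violated by f_i. It assumes the middle band is n/2 − √n ≤ |x| ≤ n/2 + √n and that a pair is violated by f_i exactly when it is comparable, lies in the middle, and has bit i equal to 0 in the lower vertex and 1 in the upper one. The names `in_middle` and `exposed` are hypothetical.

```python
from itertools import combinations
from math import sqrt

def in_middle(x):
    """True if the weight of x lies within n/2 +- sqrt(n)."""
    n = len(x)
    return abs(sum(x) - n / 2) <= sqrt(n)

def exposed(queries):
    """Set of coordinates i such that some pair of queries is violated by f_i."""
    out = set()
    for a, b in combinations(queries, 2):
        lo, hi = (a, b) if sum(a) <= sum(b) else (b, a)
        comparable = all(u <= v for u, v in zip(lo, hi))
        if comparable and in_middle(lo) and in_middle(hi):
            out |= {i for i, (u, v) in enumerate(zip(lo, hi)) if u < v}
    return out

n = 16
# A chain of 9 queries in the middle band: weights 4, 5, ..., 12.
chain = [(1,) * w + (0,) * (n - w) for w in range(4, 13)]
queries = chain + [(0,) * n]          # one extra query below the band
q = len(queries)

coords = exposed(queries)
success_probability = len(coords) / n          # Pr over uniform i that f_i is exposed
bound = (q - 1) * 2 * sqrt(n)
print(sorted(coords), success_probability, bound)
```

In this example the chain exposes 8 of the 16 functions, well below the (q − 1) ⋅ 2√n bound, and the query below the band exposes nothing because f_i is 0 there.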