Sublinear Algorithms
Lecture 5
Sofya Raskhodnikova
Penn State University
Thanks to Madhav Jha (Penn State) for help with creating these slides.
Today
Limitations of sublinear algorithms. Yao’s Minimax Principle.
Query Complexity
– Query complexity of an algorithm: the number of queries the algorithm makes.
– Usually expressed as a function of input length (and other parameters).
– Example: the test for sortedness (from Lecture 2) had query complexity O(log n) for constant ε.
– Running time ≥ query complexity.
– What is q(testing sortedness)? How do we know that there is no better algorithm?
The following statements are equivalent.
Yao’s Minimax Principle (easy direction): Statement 2 ⇒ Statement 1. Prove it.
Statement 1: For any probabilistic algorithm A of complexity q, there exists an input x s.t. Pr_{coin tosses of A}[A(x) is wrong] > 1/3.
Statement 2: There is a distribution D on the inputs s.t. for every deterministic algorithm A of complexity q, Pr_{x←D}[A(x) is wrong] > 1/3.
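One way to see the easy direction (Statement 2 ⇒ Statement 1) is an averaging argument, sketched here in LaTeX. The notation A_r, meaning algorithm A with its coin tosses fixed to r, is introduced for this sketch only. Each A_r is a deterministic algorithm of complexity q, so Statement 2 applies to it.

```latex
% A randomized algorithm A is a distribution over deterministic algorithms A_r.
\begin{align*}
\max_{x}\ \Pr_{r}\bigl[A_r(x)\text{ is wrong}\bigr]
  &\ge \mathbb{E}_{x \leftarrow D}\Bigl[\Pr_{r}\bigl[A_r(x)\text{ is wrong}\bigr]\Bigr]
     && \text{(max is at least the average)} \\
  &=   \mathbb{E}_{r}\Bigl[\Pr_{x \leftarrow D}\bigl[A_r(x)\text{ is wrong}\bigr]\Bigr]
     && \text{(swap the order of averaging)} \\
  &>   \tfrac{1}{3}
     && \text{(Statement 2 for every fixed } r\text{)}.
\end{align*}
```

So some input x makes A err with probability greater than 1/3, which is Statement 1.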
Players: Evil algorithms designer Al and poor lower bound prover Lola.
Game 1
Move 1. Al selects a q-query randomized algorithm A for the problem.
Move 2. Lola selects an input on which A errs with largest probability.
Game 2
Move 1. Lola selects a distribution on inputs.
Move 2. Al selects a q-query deterministic algorithm with as large a probability of success on Lola’s distribution as possible.
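The two games can be compared numerically on a toy instance. The sketch below is only an illustration: the 0/1 error table `err` is made-up data, and the helper names are hypothetical. It checks that Lola's payoff in Game 1 (worst input against a mixture of deterministic algorithms) is never smaller than Al's best deterministic error in Game 2, whatever mixture and distribution are chosen.

```python
import random
from fractions import Fraction

def random_distribution(k, rng):
    """Return a random probability vector of length k with exact rational entries."""
    weights = [rng.randint(1, 10) for _ in range(k)]
    total = sum(weights)
    return [Fraction(w, total) for w in weights]

def worst_input_error(err, mix):
    """Game 1: Al commits to a mixture over deterministic algorithms; Lola picks the worst input."""
    n_inputs = len(err[0])
    return max(sum(p * err[a][x] for a, p in enumerate(mix)) for x in range(n_inputs))

def best_deterministic_error(err, dist):
    """Game 2: Lola commits to an input distribution; Al picks the best deterministic algorithm."""
    return min(sum(d * row[x] for x, d in enumerate(dist)) for row in err)

rng = random.Random(0)
# err[a][x] = 1 if deterministic algorithm a is wrong on input x (toy data, purely illustrative).
err = [[rng.randint(0, 1) for _ in range(5)] for _ in range(4)]

gaps = []
for _ in range(200):
    mix = random_distribution(len(err), rng)       # a randomized algorithm
    dist = random_distribution(len(err[0]), rng)   # a distribution on inputs
    gaps.append(worst_input_error(err, mix) - best_deterministic_error(err, dist))
```

Every entry of `gaps` is nonnegative, which is the easy direction of the principle: a distribution that is hard for all deterministic algorithms yields a hard input for every randomized one.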
Input: a string of n bits.
Question: Does the string contain only 1s, or is it ε-far from the all-1 string?
Proof: By Yao’s Minimax Principle, it is enough to prove Statement 2. Distribution on n-bit strings:
chosen uniformly at random from 1, …, 1/ε.
Proof (continued): Now fix a deterministic tester A making q < 1/(3ε) queries.
1. A must accept if all answers are 1. Otherwise, it would be wrong on the all-1 string, that is, with probability 1/2 with respect to D.
2. Let i_1, …, i_q be the positions A queries when it sees only 1s. The test can choose its queries based on previous answers. However, since all these answers are 1 and since A is deterministic, the query positions are fixed.
> (2/(3ε)) ⋅ (ε/2) = 1/3
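The arithmetic in this proof can be checked exactly on a small instance. The sketch below assumes a particular hard distribution, since its description is incomplete above: the all-1 string with probability 1/2, and otherwise the string whose i-th block of εn bits is all 0s, with i uniform in 1, …, 1/ε. That choice, and the name `error_probability`, are assumptions for illustration.

```python
from fractions import Fraction

def error_probability(num_blocks, block_size, queried):
    """Exact error of a deterministic tester that accepts iff every queried bit is 1.

    Assumed hard distribution D: the all-1 string with probability 1/2;
    otherwise block i is zeroed out, with i uniform over the num_blocks = 1/eps blocks.
    """
    hit_blocks = {pos // block_size for pos in queried}
    missed = num_blocks - len(hit_blocks)
    # The tester wrongly accepts exactly the far strings whose zero block it never queries.
    return Fraction(1, 2) * Fraction(missed, num_blocks)

num_blocks, block_size = 30, 10   # eps = 1/30, n = 300
q = 9                             # q < 1/(3*eps) = 10

# q queries spread over q distinct blocks (the best case for the tester).
errors = [
    error_probability(num_blocks, block_size,
                      range(start, start + q * block_size, block_size))
    for start in range(block_size)
]
```

With q = 9 queries in 9 distinct blocks, 21 of the 30 blocks are missed, so the error is (1/2)(21/30) = 7/20, which exceeds 1/3. Queries that share a block only miss more blocks.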
Context: [Alon Krivelevich Newman Szegedy 99]
Every regular language can be tested in O((1/ε) polylog(1/ε)) time
Input: a list of n numbers x_1, x_2, …, x_n.
Question: Is the list sorted or ε-far from sorted?
Already saw: two different O((log n)/ε) time testers.
Known [Ergün Kannan Kumar Rubinfeld Viswanathan 98, Fischer 01]: Ω(log n) queries are required for all constant ε ≤ 1/2.
Today: Ω(log n) queries are required for all constant ε ≤ 1/2 for every 1-sided error nonadaptive test.
– A test has 1-sided error if it always accepts all YES instances.
– A test is nonadaptive if its queries do not depend on answers to previous queries.
1-sided Error Property Tester
– YES: accept with probability ≥ 2/3.
– ε-far from YES: reject with probability ≥ 2/3.
– Otherwise: don’t care.
1 ? ? 4 … 7 ? ? 9
Lola’s distribution is uniform over the following log n lists:
[Figure: the log n lists.]
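Al picks a set Q = {a_1, a_2, …, a_|Q|} of positions to query.
[Figure: the queried positions a_1, a_2, a_3, …, a_|Q| in the list.]
– His test must be correct with probability ≥ 2/3 when the input is picked according to Lola’s distribution.
– A 1-sided error test can reject only if it sees a violated pair among its queries.
– By the Union Bound,
  Pr_{ℓ ← Lola’s distribution}[(a_j, a_{j+1}) is violated in list ℓ for some j] ≤ (|Q| − 1)/log n.
– If |Q| ≤ (2/3) log n, then this probability is < 2/3.
– So every 1-sided error nonadaptive test for sortedness must make Ω(log n) queries.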
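The counting behind the union bound can be checked by brute force. The lists themselves are not reproduced above, so the sketch below uses one construction with the property the argument needs: every pair of positions is out of order in exactly one list. The construction and the names `make_lists` and `lists_exposed_by` are assumptions for illustration and may differ from the lists on the slide.

```python
from itertools import combinations

def make_lists(n):
    """Return log2(n) lists of length n (n must be a power of two).

    Each list cuts the positions into blocks of a fixed size; inside block b the
    first half holds the value 2b + 1 and the second half holds 2b.
    """
    lists = []
    size = n
    while size >= 2:
        half = size // 2
        lists.append([2 * (p // size) + (1 if p % size < half else 0) for p in range(n)])
        size = half
    return lists

def lists_exposed_by(queries, lists):
    """Indices of the lists in which some pair of queried positions is out of order."""
    qs = sorted(queries)
    return {k for k, lst in enumerate(lists)
            if any(lst[a] > lst[b] for a, b in combinations(qs, 2))}

n = 16
lists = make_lists(n)

# Largest value of (#lists exposed by Q) - (|Q| - 1) over all small query sets Q.
worst = max(len(lists_exposed_by(Q, lists)) - (len(Q) - 1)
            for size in (2, 3, 4) for Q in combinations(range(n), size))
```

Here `worst` never exceeds 0: a query set Q exposes a violation in at most |Q| − 1 of the log n lists, so a test that rejects only on seeing a violation succeeds with probability at most (|Q| − 1)/log n.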
[Figure: the Boolean hypercube, with vertices labelled f(000), f(001), …, f(111) and ordered by increasing weight from f(00⋯00) to f(11⋯11); an edge joins x = 001001 and y = 011001.]
[Goldreich Goldwasser Lehman Ron Samorodnitsky, Dodis Goldreich Lehman Raskhodnikova Ron Samorodnitsky]
– Edge xy is violated by f if f(x) > f(y).
Time:
– O(n/ε), logarithmic in the size of the input, 2^n
– Ω(√n/ε) for restricted class of tests
[Figure: two example functions on the hypercube, one monotone and one 1/2-far from monotone.]
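The definition of a violated edge is easy to make concrete on a small hypercube. The following is a minimal sketch; the function name `violated_edges` and the two example functions are chosen here for illustration. It enumerates every edge (x, y), where y is x with one 0 flipped to 1, and counts those with f(x) > f(y).

```python
from itertools import product

def violated_edges(f, n):
    """Count hypercube edges (x, y), y = x with one 0 flipped to 1, such that f(x) > f(y)."""
    count = 0
    for x in product((0, 1), repeat=n):
        for i in range(n):
            if x[i] == 0:
                y = x[:i] + (1,) + x[i + 1:]
                if f(x) > f(y):
                    count += 1
    return count

n = 4

def majority(x):
    """A monotone function: 1 iff more than half of the bits are 1."""
    return int(sum(x) > n // 2)

def anti_dictator(x):
    """Drops from 1 to 0 whenever coordinate 0 is raised."""
    return 1 - x[0]

monotone_count = violated_edges(majority, n)
anti_count = violated_edges(anti_dictator, n)
```

The monotone example has no violated edges. The anti-dictator violates all 2^(n−1) edges in direction 0, and since those edges are disjoint, at least half of its values must change to make it monotone.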
Violated pair:
– Only a distribution on far from monotone values suffices.
Every 1-sided error non-adaptive test for monotonicity of functions f : {0,1}^n → {0,1} requires Ω(√n) queries.
[Fischer Lehman Newman Raskhodnikova Rubinfeld Samorodnitsky]
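[Figure: the function f_i. It equals 1 − (coordinate i) on the middle band of 2√n layers and 1 above the band.]
An edge in direction i is violated if both endpoints are in the middle.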
# functions that a query pair (x, y) exposes ≤ # coordinates on which x and y differ ≤ 2√n
Example: x = 001001 and y = 111011 differ in coordinates i, j and k, so the pair (x, y) can expose only functions f_i, f_j and f_k.
[Figure: queries x and y inside the band of width 2√n.]
Only queries in the Green Band can be violated ⇒ disagreements ≤ 2√n
# functions exposed by q queries ≤ (q choose 2) ⋅ 2√n
# functions that a query pair exposes ≤ # disagreements between vertices of the pair ≤ 2√n
# functions exposed by q queries ≤ (q − 1) ⋅ 2√n
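The per-pair bound can be verified exhaustively for small n. The sketch below assumes a concrete form for the hard functions, since their definition is only partly legible above: f_i is 1 above the band, 0 below it, and 1 − x_i on weights n/2 − w through n/2 + w, where w stands in for √n. The value 0 below the band and the names `make_f` and `exposed` are assumptions for illustration.

```python
from itertools import product

def make_f(i, n, w):
    """Hypothetical f_i: 1 above the band, 0 below it, 1 - x[i] on weights n/2 - w .. n/2 + w."""
    lo, hi = n // 2 - w, n // 2 + w
    def f(x):
        weight = sum(x)
        if weight > hi:
            return 1
        if weight < lo:
            return 0
        return 1 - x[i]
    return f

def exposed(x, y, fs):
    """Indices i such that the pair x <= y is violated by f_i, i.e. f_i(x) > f_i(y)."""
    return {i for i, f in enumerate(fs) if f(x) > f(y)}

n, w = 6, 1
fs = [make_f(i, n, w) for i in range(n)]
points = list(product((0, 1), repeat=n))

max_exposed = 0
all_within_differences = True
for x in points:
    for y in points:
        if all(a <= b for a, b in zip(x, y)):   # only comparable pairs x <= y can be violated
            differing = {i for i in range(n) if x[i] != y[i]}
            hit = exposed(x, y, fs)
            all_within_differences &= hit <= differing
            max_exposed = max(max_exposed, len(hit))
```

In this toy instance a pair exposes f_i only for coordinates i on which its endpoints differ, and no pair exposes more than 2w functions, matching the bound of 2√n when w = √n.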
# functions exposed by q queries ≤ (q − 1) ⋅ 2√n
(x, y) a violation pair
⇓
Some adjacent pair of vertices in a minimum spanning forest on the query set is also violated
⇒ sufficient to consider adjacent vertices in a minimum spanning forest
# functions exposed by q queries ≤ (q − 1) ⋅ 2√n
⇓
Every deterministic test that makes a set Q of q queries (in the middle) succeeds with probability O(q/√n) on our distribution.
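The last implication is a one-line calculation, written out here in LaTeX. It assumes the hard distribution is uniform over the n functions f_1, …, f_n, which is the natural reading of "our distribution" but is not spelled out above.

```latex
% Assumption: i is uniform over [n], so each f_i has probability 1/n.
\[
\Pr_{i \in [n]}\bigl[\text{the } q \text{ queries expose } f_i\bigr]
  \;\le\; \frac{(q-1)\cdot 2\sqrt{n}}{n}
  \;<\; \frac{2q}{\sqrt{n}} .
\]
% A 1-sided error test rejects only when a violation is exposed, so success
% probability >= 2/3 forces 2q/\sqrt{n} > 2/3, i.e. q > \sqrt{n}/3 = \Omega(\sqrt{n}).
```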
Hard distribution: randomly pick a subset B of coordinates from [n] by independently choosing each coordinate to lie in B with probability 1/(10√n).
Uniformly choose good_B or bad_B.
– good_B: majority of the coordinates in B
– bad_B: minority of the coordinates in B
Every test for monotonicity of functions f : {0,1}^n → {0,1} requires Ω(log n) queries.
[Fischer Lehman Newman Raskhodnikova Rubinfeld Samorodnitsky]