Sublinear Algorithms: Lectures 1 and 2, Sofya Raskhodnikova, Penn State University


  1. Sublinear Algorithms, Lectures 1 and 2. Sofya Raskhodnikova, Penn State University

  2. Tentative Topics. Introduction, examples and general techniques. Sublinear-time algorithms for • graphs • strings • basic properties of functions • algebraic properties and codes • metric spaces • distributions. Tools: probability, Fourier analysis, combinatorics, codes, … Sublinear-space algorithms: streaming.

  3. Tentative Plan. Introduction, examples and general techniques. Lecture 1: Background. Testing properties of images and lists. Lecture 2: Properties of functions and graphs. Sublinear approximation. Lectures 3-5: Background in probability. Techniques for proving hardness. Other models for sublinear computation.

  4. Motivation for Sublinear-Time Algorithms
     Massive datasets • world-wide web • online social networks • genome project • sales logs • census data • high-resolution images • scientific measurements
     Long access time • communication bottleneck (dial-up connection) • implicit data (an experiment per data point)

  5. What Can We Hope For?
     • What can an algorithm compute if it
       – reads only a sublinear portion of the data?
       – runs in sublinear time?
     • Some problems have exact deterministic solutions.
     • For most interesting problems, algorithms must be
       – approximate
       – randomized

  6. A Sublinear-Time Algorithm. [Figure: a long input (B L A - B L A - …) of which only a few positions are read (? L ? B ? L ? A); a sublinear-time algorithm produces an approximate answer.] Trade-off: resources (number of samples, running time) vs. quality of approximation.

  7. Types of Approximation
     Classical approximation
     • need to compute a value: output is close to the desired value (examples: average, median values)
     • need to compute the best structure: output is a structure with "cost" close to optimal (examples: furthest pair of points, minimum spanning tree)
     Property testing
     • need to answer YES or NO: output is a correct answer for the given input, or at least for some input close to it

  8. Classical Approximation: A Simple Example

  9. Approximate Diameter of a Point Set [Indyk]
     Input: $m$ points, described by a distance matrix $D$
     – $D_{ij}$ is the distance between points $i$ and $j$
     – $D$ satisfies triangle inequality and symmetry
     (Note: input size is $n = m^2$)
     Let $i, j$ be indices that maximize $D_{ij}$. The maximum $D_{ij}$ is the diameter.
     • Output: $(k, \ell)$ such that $D_{k\ell} \ge D_{ij}/2$

  10. Algorithm and Analysis
      Algorithm $(m, D)$
      1. Pick $k$ arbitrarily.
      2. Pick $\ell$ to maximize $D_{k\ell}$.
      3. Output $(k, \ell)$.
      • Approximation guarantee:
        $D_{ij} \le D_{ik} + D_{kj}$ (triangle inequality)
        $\le D_{k\ell} + D_{k\ell}$ (choice of $\ell$ + symmetry of $D$)
        $= 2 D_{k\ell}$
      • Running time: $O(m) = O(\sqrt{n})$
      A rare example of a deterministic sublinear-time algorithm.
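A minimal Python sketch of the three-step algorithm above, assuming the distance matrix is given as a list of lists; the helper `demo_distance_matrix` and the default choice `k=0` are illustrative assumptions, not part of the slides. Only one row of the matrix is read, which is where the O(m) = O(√n) running time comes from.

```python
import math
import random


def approx_diameter(D, k=0):
    """Return (k, l) where l maximizes D[k][l].

    Reads only row k of the m x m matrix D, i.e. m entries out of m^2.
    """
    m = len(D)
    l = max(range(m), key=lambda j: D[k][j])
    return k, l


def demo_distance_matrix(m, seed=0):
    """Hypothetical helper: Euclidean distances between m random points in the unit square."""
    rng = random.Random(seed)
    pts = [(rng.random(), rng.random()) for _ in range(m)]
    return [[math.dist(p, q) for q in pts] for p in pts]


if __name__ == "__main__":
    D = demo_distance_matrix(50, seed=1)
    k, l = approx_diameter(D)
    true_diameter = max(max(row) for row in D)  # reads the whole matrix, for comparison only
    print(D[k][l], true_diameter)
```

The guarantee is that the printed first value is at least half of the second, by the triangle-inequality argument on the slide.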

  11. Property Testing

  12. Property Testing: YES/NO Questions
      Does the input satisfy some property? (YES/NO)
      "In the ballpark" vs. "out of the ballpark": does the input satisfy the property, or is it far from satisfying it?
      • sometimes it is the right question (probabilistically checkable proofs (PCPs))
      • as good when the data is constantly changing (WWW)
      • fast sanity check to rule out inappropriate inputs (airport security questioning)

  13. Property Tester Definition
      Probabilistic algorithm:
      • YES: accept with probability ≥ 2/3
      • NO: reject with probability ≥ 2/3
      Property tester:
      • YES: accept with probability ≥ 2/3
      • $\varepsilon$-close to YES: don't care
      • $\varepsilon$-far from YES: reject with probability ≥ 2/3
      $\varepsilon$-far = differs in many places (≥ $\varepsilon$ fraction of places)

  14. Randomized Sublinear Algorithms: Toy Examples

  15. Property Testing: a Toy Example
      Input: a string $w \in \{0,1\}^n$, e.g. 0 0 0 1 … 0 1 0 0
      Question: Is $w = 00\ldots0$? Requires reading the entire input.
      Approximate version: Is $w = 00\ldots0$, or does it have ≥ $\varepsilon n$ 1's ("errors")?
      Test $(n, w)$
      1. Sample $s = 2/\varepsilon$ positions uniformly and independently at random.
      2. If a 1 is found, reject; otherwise, accept.
      Analysis: If $w = 00\ldots0$, it is always accepted. If $w$ is $\varepsilon$-far,
        $\Pr[\text{error}] = \Pr[\text{no 1's in the sample}] \le (1-\varepsilon)^s \le e^{-\varepsilon s} = e^{-2} < 1/3$.
      Used: $1 - x \le e^{-x}$.
      Witness Lemma: If a test catches a witness with probability ≥ $p$, then $s = 2/p$ iterations of the test catch a witness with probability ≥ 2/3.

  16. Randomized Approximation: a Toy Example
      Input: a string $w \in \{0,1\}^n$, e.g. 0 0 0 1 … 0 1 0 0
      Goal: Estimate the fraction of 1's in $w$ (like in polls).
      It suffices to sample $s = 1/\varepsilon^2$ positions and output the average to get the fraction of 1's $\pm\varepsilon$ (i.e., additive error $\varepsilon$) with probability ≥ 2/3.
      Hoeffding Bound: Let $Y_1, \ldots, Y_s$ be independently distributed random variables in $[0,1]$ and let $Y = \sum_{i=1}^{s} Y_i$ (sample sum). Then $\Pr[|Y - E[Y]| \ge \delta] \le 2e^{-2\delta^2/s}$.
      Let $Y_i$ = value of sample $i$. Then $E[Y] = \sum_{i=1}^{s} E[Y_i] = s \cdot (\text{fraction of 1's in } w)$.
      Apply the Hoeffding Bound with $\delta = \varepsilon s$ and substitute $s = 1/\varepsilon^2$:
        $\Pr[|(\text{sample average}) - (\text{fraction of 1's in } w)| \ge \varepsilon] = \Pr[|Y - E[Y]| \ge \varepsilon s] \le 2e^{-2\delta^2/s} = 2e^{-2} < 1/3$.
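As a rough illustration of the sampling estimator analyzed on this slide, the sketch below draws about 1/ε² uniform samples and returns their average. The function name `estimate_fraction_of_ones`, the `rng` parameter, and rounding the sample size up with `ceil` are assumptions made for this example.

```python
import math
import random


def estimate_fraction_of_ones(w, eps, rng=random):
    """Estimate the fraction of 1's in the 0/1 list w from ceil(1/eps^2) uniform samples.

    By the Hoeffding bound, the estimate is within +/- eps of the true
    fraction with probability at least 2/3.
    """
    s = math.ceil(1 / eps ** 2)
    n = len(w)
    return sum(w[rng.randrange(n)] for _ in range(s)) / s


if __name__ == "__main__":
    rng = random.Random(0)
    w = [1] * 300 + [0] * 700  # true fraction of ones is 0.3
    trials = 1000
    good = sum(abs(estimate_fraction_of_ones(w, 0.1, rng) - 0.3) < 0.1
               for _ in range(trials))
    print("fraction of runs within +/- 0.1:", good / trials)
```

The bound 2e^{-2} is not tight for this input, so the empirical success rate is typically well above 2/3.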

  17. Property Testing: Simple Examples

  18. Testing Properties of Images

  19. Pixel Model. Input: $n \times n$ matrix of pixels (0/1 values for black-and-white pictures). Query: point $(i_1, i_2)$. Answer: color of $(i_1, i_2)$.

  20. Testing if an Image is a Half-plane [R03]. A half-plane or $\varepsilon$-far from a half-plane? $O(1/\varepsilon)$ time.

  21. Half-plane Instances. [Figure: example images, each labeled either "a half-plane" or "1/4-far from a half-plane".]

  28. Strategy: the "testing by implicit learning" paradigm
      • Learn the outline of the image by querying a few pixels.
      • Test if the image conforms to the outline by random sampling, and reject if something is wrong.

  29. Half-plane Test
      Claim. The number of sides with different corners is 0, 2, or 4.
      Algorithm
      1. Query the corners.
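The claim about the corners can be checked by brute force, since there are only 16 ways to color four corners with two colors. The sketch below is an illustration only; the function name `bicolored_sides` and the convention of listing corners in cyclic order are assumptions of this example. The underlying reason is parity: walking around the square and returning to the start, the color must change an even number of times.

```python
from itertools import product


def bicolored_sides(corners):
    """Count sides whose two endpoints differ in color.

    corners lists the four corner colors (0 or 1) in cyclic order,
    e.g. (top_left, top_right, bottom_right, bottom_left).
    """
    return sum(corners[i] != corners[(i + 1) % 4] for i in range(4))


# Enumerate all 16 corner colorings and collect the possible counts.
possible_counts = {bicolored_sides(c) for c in product((0, 1), repeat=4)}

if __name__ == "__main__":
    print(sorted(possible_counts))
```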

  30. Half-plane Test: 4 Bi-colored Sides
      Claim. The number of sides with different corners is 0, 2, or 4.
      Analysis
      • If it is 4, the image cannot be a half-plane.
      Algorithm
      1. Query the corners.
      2. If the number of sides with different corners is 4, reject.

  31. Half-plane Test: 0 Bi-colored Sides
      Claim. The number of sides with different corners is 0, 2, or 4.
      Analysis
      • If all corners have the same color, the image is a half-plane if and only if it is unicolored.
      Algorithm
      1. Query the corners.
      2. If all corners have the same color $c$, test if all pixels have color $c$ (as in Toy Example 1).

  32. Half-plane Test: 2 Bi-colored Sides
      Claim. The number of sides with different corners is 0, 2, or 4.
      [Figure: an image with regions $W$ and $B$ separated by a strip whose endpoints on the two bi-colored sides span distance $\varepsilon n/2$.]
      Analysis
      • The area outside of $W \cup B$ has ≤ $\varepsilon n^2/2$ pixels.
      • If the image is a half-plane, $W$ contains only white pixels and $B$ contains only black pixels.
      • If the image is $\varepsilon$-far from half-planes, it has ≥ $\varepsilon n^2/2$ wrong pixels in $W \cup B$.
      • By the Witness Lemma, $4/\varepsilon$ samples suffice to catch a wrong pixel.
      Algorithm
      1. Query the corners.
      2. If the number of sides with different corners is 2, on both sides find 2 different pixels within distance $\varepsilon n/2$ by binary search.
      3. Query $4/\varepsilon$ pixels from $W \cup B$.
      4. Accept iff all $W$ pixels are white and all $B$ pixels are black.

  33. Testing if an Image is a Half-plane [R03]. A half-plane or $\varepsilon$-far from a half-plane? $O(1/\varepsilon)$ time.

  34. Other Results on Properties of Images
      • Pixel Model
        – Convexity [R03]: Convex or $\varepsilon$-far from convex? $O(1/\varepsilon^2)$ time
        – Connectedness [R03]: Connected or $\varepsilon$-far from connected? $O(1/\varepsilon^4)$ time
        – Partitioning [Kleiner Keren Newman 10]: Can be partitioned according to a template or is $\varepsilon$-far? Time independent of image size
      • Properties of sparse images [Ron Tsur 10]

  35. Testing if a List is Sorted
      Input: a list of $n$ numbers $x_1, x_2, \ldots, x_n$
      • Question: Is the list sorted? Requires reading the entire list: $\Omega(n)$ time.
      • Approximate version: Is the list sorted or $\varepsilon$-far from sorted? (An $\varepsilon$ fraction of the $x_i$'s have to be changed to make it sorted.) [Ergün Kannan Kumar Rubinfeld Viswanathan 98, Fischer 01]: $O((\log n)/\varepsilon)$ time, $\Omega(\log n)$ queries.
      • Attempts:
        1. Test: Pick a random $i$ and reject if $x_i > x_{i+1}$. Fails on: 1 1 1 1 1 1 1 0 0 0 0 0 0 0 (1/2-far from sorted)
        2. Test: Pick random $i < j$ and reject if $x_i > x_j$. Fails on: 1 0 2 1 3 2 4 3 5 4 6 5 7 6 (1/2-far from sorted)
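To see why the two attempted tests fail, the sketch below computes, for the two example lists, the exact probability that a single run of each test rejects, together with the distance to sortedness. It is an illustration under stated assumptions: all function names are made up for this example, and the distance is computed as one minus the relative length of a longest non-decreasing subsequence, which matches the "fraction of entries that must change" definition for numeric lists.

```python
from bisect import bisect_right
from fractions import Fraction
from itertools import combinations


def distance_to_sorted(xs):
    """Fraction of entries that must change: 1 - (longest non-decreasing subsequence) / n."""
    tails = []  # tails[t] = smallest possible last value of such a subsequence of length t + 1
    for x in xs:
        pos = bisect_right(tails, x)
        if pos == len(tails):
            tails.append(x)
        else:
            tails[pos] = x
    return Fraction(len(xs) - len(tails), len(xs))


def reject_prob_adjacent(xs):
    """Attempt 1: probability that a uniformly random adjacent pair is out of order."""
    n = len(xs)
    return Fraction(sum(xs[i] > xs[i + 1] for i in range(n - 1)), n - 1)


def reject_prob_random_pair(xs):
    """Attempt 2: probability that a uniformly random pair i < j is out of order."""
    pairs = list(combinations(range(len(xs)), 2))
    return Fraction(sum(xs[i] > xs[j] for i, j in pairs), len(pairs))


blocks = [1] * 7 + [0] * 7
zigzag = [1, 0, 2, 1, 3, 2, 4, 3, 5, 4, 6, 5, 7, 6]

if __name__ == "__main__":
    for name, xs in (("blocks", blocks), ("zigzag", zigzag)):
        print(name, distance_to_sorted(xs),
              reject_prob_adjacent(xs), reject_prob_random_pair(xs))
```

On each bad input the failing test rejects with probability 1/(n-1), so a constant number of repetitions cannot reach rejection probability 2/3 as n grows, even though both lists are 1/2-far from sorted.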
