Randomness in Computing: Lecture 27

  1. Randomness in Computing, Lecture 27
     Last time
     • Stationary distributions
     • Random walks on graphs
     • Algorithm for $s$-$t$-PATH
     Today
     • Sublinear algorithms
     • Differential privacy
     4/29/2020. Sofya Raskhodnikova; Randomness in Computing; based on slides by Baranasuriya et al.

  2. A Sublinear-Time Algorithm
     [Figure: a long input (B L A - B L A - ...) of which a randomized algorithm reads only a few positions (? L ? B ? L ? A) and outputs an approximate answer.]
     Resources
     • number of queries
     • running time
     Quality of approximation

  3. Goal: Fundamental Understanding of Sublinear Computation
     • What computational tasks?
     • How to measure quality of approximation?
     • What type of access to the input?
     • Can we make our computations robust (e.g., to noise or erased data)?

  4. Fundamental Computational Tasks
     • Property testing: need to answer YES or NO
       • intuition: only require correct answers on two sets of instances that are very different from each other
     • Learning: need an approximate representation of an object
       • input is from a given class (or is close to it)
     • Classical approximation: need to compute a value
       • output should be close to the desired value

  5. Property Testing: Definition [Rubinfeld Sudan, Goldreich Goldwasser Ron]
     Randomized algorithm
     • YES instances: accept with probability $\ge 2/3$
     • NO instances: reject with probability $\ge 2/3$
     Property tester
     • YES instances: accept with probability $\ge 2/3$
     • Instances close to YES: don't care
     • Instances $\varepsilon$-far from YES: reject with probability $\ge 2/3$
     $\varepsilon$-far = differs in many places ($\ge \varepsilon$ fraction of places)

  6. Example: Lipschitz Testing [Jha R]
     Input: a list of $n$ numbers $x_1, x_2, \dots, x_n$
     • A list of numbers is Lipschitz if $|x_{i+1} - x_i| \le 1$ for all $i$.
     • Question: Is the list Lipschitz? Requires reading the entire list: $\Omega(n)$ time.
     • Approximate version: Is the list Lipschitz or $\varepsilon$-far from Lipschitz? (At least an $\varepsilon$ fraction of the $x_i$'s have to be changed to make it Lipschitz.)
     Our result: $O((\log n)/\varepsilon)$ time
     [Figure: plot of $x_i$ against $i$ for the example list 5, 6, 5, 4, 5, 4, 3, 2, 2, 1.]

  7. Lipschitz Testing: Attempts
     1. Test: Pick a random $i$ and reject if $|x_{i+1} - x_i| > 1$.
        Fails on: 0, 1, 2, 3, 5, 6, 7, 8 (1/2-far from Lipschitz)
        [Figure: plot of $x_i$ against $i$ for this list.]
     2. Test: Pick random $i < j$ and reject if $|x_j - x_i| > j - i$.
        Fails on: 0, 2, 1, 3, 2, 4, 3, 5, 4, 6 (1/2-far from Lipschitz)
        [Figure: plot of $x_i$ against $i$ for this list.]
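
To see why both attempts fail, one can count exactly how many of each test's possible choices would lead to rejection on its counterexample. The sketch below does that; the helper names are hypothetical, and the tests are read with absolute values, $|x_j - x_i|$, which is an assumption about the original slide.

```python
from fractions import Fraction
from itertools import combinations


def adjacent_violation_rate(xs):
    """Fraction of adjacent positions i with |x[i+1] - x[i]| > 1 (Test 1)."""
    n = len(xs)
    bad = sum(1 for i in range(n - 1) if abs(xs[i + 1] - xs[i]) > 1)
    return Fraction(bad, n - 1)


def pair_violation_rate(xs):
    """Fraction of pairs i < j with |x[j] - x[i]| > j - i (Test 2)."""
    pairs = list(combinations(range(len(xs)), 2))
    bad = sum(1 for i, j in pairs if abs(xs[j] - xs[i]) > j - i)
    return Fraction(bad, len(pairs))


jump = [0, 1, 2, 3, 5, 6, 7, 8]            # counterexample for Test 1
zigzag = [0, 2, 1, 3, 2, 4, 3, 5, 4, 6]    # counterexample for Test 2

print(adjacent_violation_rate(jump))   # 1/7
print(pair_violation_rate(zigzag))     # 1/9
```

Both lists are 1/2-far from Lipschitz, yet a single run of the corresponding test rejects with probability well below 1/2. If the same two patterns are stretched to length $n$, the count above suggests a rejection probability of about $1/n$, so a constant number of repetitions would not be enough.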

  8. Is a list Lipschitz or $\varepsilon$-far from Lipschitz?
     Idea: Associate positions in the list with vertices of the directed line $1 \to 2 \to 3 \to \dots \to n-1 \to n$.
     Construct a graph (2-spanner) with $\le n \log n$ edges [Bhattacharyya Grigorescu Jung R Woodruff]
     • by adding a few "shortcut" edges $(i, j)$ for $i < j$
     • where each pair of vertices is connected by a path of length at most 2
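
The slide does not say how the 2-spanner is built. One standard construction, sketched here as an assumption rather than as the construction from the cited paper, picks the middle vertex of an interval as a hub, connects every other vertex of the interval to the hub, and recurses on the two halves. The function names are hypothetical.

```python
def two_spanner_edges(n):
    """Edges (i, j), i < j, of a 2-spanner of the line 1 -> 2 -> ... -> n."""
    edges = set()
    stack = [(1, n)]
    while stack:
        lo, hi = stack.pop()
        if lo >= hi:
            continue
        mid = (lo + hi) // 2                      # hub vertex of this interval
        edges.update((i, mid) for i in range(lo, mid))
        edges.update((mid, j) for j in range(mid + 1, hi + 1))
        stack.append((lo, mid - 1))               # recurse on the left part
        stack.append((mid + 1, hi))               # recurse on the right part
    return edges


def has_short_path(edges, i, j):
    """True if i < j are joined by a path of at most 2 edges."""
    return (i, j) in edges or any(
        (i, k) in edges and (k, j) in edges for k in range(i + 1, j)
    )


edges = two_spanner_edges(16)
print(all(has_short_path(edges, i, j)
          for i in range(1, 17) for j in range(i + 1, 17)))
```

Any pair $i < j$ is split apart at the first interval whose hub $k$ satisfies $i \le k \le j$, and both $i$ and $j$ are connected to that hub, which gives the path of length at most 2. Each level of the recursion adds fewer than $n$ edges and there are about $\log_2 n$ levels, which matches the $n \log n$ bound on the slide.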

  9. Is a list Lipschitz or $\varepsilon$-far from Lipschitz?
     Test: Pick a random edge $(i, j)$ from the 2-spanner and reject if $|x_j - x_i| > j - i$.
     [Figure: example list with entries $x_i$, $x_k$, $x_j$ marked.]
     Analysis:
     • Call a pair $(i, j)$ violated if $|x_j - x_i| > j - i$, and satisfied otherwise.
     • If $i$ is an endpoint of a violated edge, call $x_i$ bad. Otherwise, call it good.
     Claim 1. All pairs of good numbers are satisfied.
     Proof: Consider any two good numbers, $x_i$ and $x_j$. They are connected by a path of (at most) two edges $(i, k), (k, j)$, and both edges are satisfied because every edge touching a good number is satisfied.
     $\Rightarrow |x_k - x_i| \le k - i$ and $|x_j - x_k| \le j - k$
     $\Rightarrow |x_j - x_i| \le |x_j - x_k| + |x_k - x_i| \le (j - k) + (k - i) = j - i$

  10. Is a list Lipschitz or $\varepsilon$-far from Lipschitz?
      Test: Pick a random edge $(i, j)$ from the 2-spanner and reject if $|x_j - x_i| > j - i$.
      [Figure: example list with entries $x_i$, $x_k$, $x_j$ marked.]
      Analysis:
      • Call a pair $(i, j)$ violated if $|x_j - x_i| > j - i$, and satisfied otherwise.
      • If $i$ is an endpoint of a violated edge, call $x_i$ bad. Otherwise, call it good.
      Claim 1. All pairs of good numbers are satisfied.
      Claim 2. An $\varepsilon$-far list violates $\ge \varepsilon/(2 \log n)$ fraction of edges in the 2-spanner.
      Proof: If a list is $\varepsilon$-far from Lipschitz, it has $\ge \varepsilon n$ bad numbers (by Claim 1: otherwise, changing only the bad numbers would make the list Lipschitz).
      • Each violated edge contributes 2 bad numbers.
      • So the 2-spanner has $\ge \varepsilon n / 2$ violated edges out of $\le n \log n$.

  11. Is a list Lipschitz or $\varepsilon$-far from Lipschitz?
      Test: Pick a random edge $(i, j)$ from the 2-spanner and reject if $|x_j - x_i| > j - i$.
      [Figure: example list with entries $x_i$, $x_k$, $x_j$ marked.]
      Analysis:
      • Call a pair $(i, j)$ violated if $|x_j - x_i| > j - i$, and satisfied otherwise.
      Claim 2. An $\varepsilon$-far list violates $\ge \varepsilon/(2 \log n)$ fraction of edges in the 2-spanner.
      Algorithm: Sample $\frac{4 \log n}{\varepsilon}$ edges $(i, j)$ from the 2-spanner and reject if $|x_j - x_i| > j - i$ for any of them.
      Guarantee: All Lipschitz lists are accepted. All lists that are $\varepsilon$-far from Lipschitz are rejected with probability $\ge 2/3$.
      Time: $O((\log n)/\varepsilon)$
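
The algorithm can be sketched as follows. Several things here are assumptions, not taken from the slides: the hub-based spanner construction, the function names, and reading $\log$ as $\log_2$. For simplicity the sketch also materializes the whole spanner, which takes more than linear time; a genuinely sublinear implementation would have to sample a spanner edge without building the graph.

```python
import math
import random


def two_spanner_edges(n):
    """Edges (i, j), i < j, of a 2-spanner of the line 1 -> 2 -> ... -> n."""
    edges = set()
    stack = [(1, n)]
    while stack:
        lo, hi = stack.pop()
        if lo >= hi:
            continue
        mid = (lo + hi) // 2                      # hub vertex of this interval
        edges.update((i, mid) for i in range(lo, mid))
        edges.update((mid, j) for j in range(mid + 1, hi + 1))
        stack.append((lo, mid - 1))
        stack.append((mid + 1, hi))
    return edges


def lipschitz_tester(xs, eps, rng=random):
    """Return True (accept) or False (reject); positions are 1-indexed."""
    n = len(xs)
    if n < 2:
        return True                               # nothing to violate
    edges = sorted(two_spanner_edges(n))
    num_samples = math.ceil(4 * math.log2(n) / eps)
    for _ in range(num_samples):
        i, j = rng.choice(edges)                  # uniformly random spanner edge
        if abs(xs[j - 1] - xs[i - 1]) > j - i:    # violated edge found
            return False
    return True


# A Lipschitz list has no violated edge, so it is always accepted.
print(lipschitz_tester([5, 6, 5, 4, 5, 4, 3, 2, 2, 1], eps=0.5))  # True
```

The sample size matches the guarantee on the slide: if at least an $\varepsilon/(2 \log n)$ fraction of edges is violated, all $(4 \log n)/\varepsilon$ samples miss them with probability at most $(1 - \varepsilon/(2\log n))^{(4 \log n)/\varepsilon} \le e^{-2} < 1/3$.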

  12. Testing if a List is Lipschitz: Summary
      • [Jha R]: We can determine if a list of $n$ numbers is Lipschitz or $\varepsilon$-far from Lipschitz in $O\left(\frac{\log n}{\varepsilon}\right)$ time.
      • [Jha R, Blais R Yaroslavtsev, Chakrabarty Dixit Jha Seshadhri]: This cannot be improved.

  13. Testing Properties of High-Dimensional Functions
      In polylogarithmic time, we can test a large class of properties of functions $f: \{1, \dots, n\}^d \to \mathbb{R}$, including:
      • Lipschitz property [Jha R]
      • Monotonicity [Goldreich Goldwasser Lehman Ron, Dodis Goldreich Lehman R Ron Samorodnitsky]
      • Bounded-derivative properties [Chakrabarty Dixit Jha Seshadhri]
      • Unateness [Baleshzar Chakrabarty Pallavoor R Seshadhri]

  14. Sublinear Algorithms: Summary
      • Many problems admit sublinear-time algorithms
      • Algorithms are often simple
      • Analysis requires creation of interesting combinatorial, geometric and algebraic tools
      • Unexpected connections to other areas
      • Many open questions

  15. Private Data Analysis
      [Figure: individuals contribute data $x = (x_1, x_2, x_3, \dots, x_{d-1}, x_d)$ to a curator; data analysts send queries to the curator and receive answers.]
      Typical examples: census, medical studies, what big companies want to publish about our data …
      Two conflicting goals
      • Protect privacy of individuals
        • Differential privacy [Dwork McSherry Nissim Smith 06]
      • Give accurate answers

  16. Neighboring Datasets
      Two datasets $x, x'$ are neighbors if they differ in one person's data.
      [Figure: $x = (x_1, x_2, x_3, \dots, x_{d-1}, x_d)$ and $x' = (x_1, x_2, x'_3, \dots, x_{d-1}, x_d)$, which differ only in the third entry.]

  17. Differential Privacy [Dwork McSherry Nissim Smith]
      Privacy Definition
      An algorithm $A$ is $\varepsilon$-differentially private if for all pairs of neighbors $x, x'$ and all sets of answers $S$:
      $\Pr[A(x) \in S] \le e^{\varepsilon} \cdot \Pr[A(x') \in S]$
      [Figure: neighboring datasets $x$ and $x'$, which differ only in the third entry.]

  18. Properties of Differential Privacy
      • Composition: If algorithms $A_1$ and $A_2$ are $\varepsilon$-differentially private, then the algorithm that outputs $(A_1(x), A_2(x))$ is $2\varepsilon$-differentially private.
      • Meaningful in the presence of arbitrary external information
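
One way to see the composition property is the following sketch. It rests on two assumptions that the slide does not state: $A_1$ and $A_2$ use independent randomness, and their outputs are discrete (for continuous outputs the same argument goes through with densities in place of probabilities). For neighbors $x, x'$ and any single outcome $(a, b)$:

```latex
\begin{align*}
\Pr[(A_1(x), A_2(x)) = (a, b)]
  &= \Pr[A_1(x) = a] \cdot \Pr[A_2(x) = b] \\
  &\le e^{\varepsilon} \Pr[A_1(x') = a] \cdot e^{\varepsilon} \Pr[A_2(x') = b] \\
  &= e^{2\varepsilon} \Pr[(A_1(x'), A_2(x')) = (a, b)].
\end{align*}
```

Summing over all outcomes $(a, b)$ in a set $S$ gives $\Pr[(A_1(x), A_2(x)) \in S] \le e^{2\varepsilon} \Pr[(A_1(x'), A_2(x')) \in S]$, which is the definition of $2\varepsilon$-differential privacy.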

  19. Output Perturbation
      Frameworks for designing differentially private algorithms

  20. Output Perturbation
      [Figure: individuals contribute data $x = (x_1, x_2, x_3, \dots, x_{d-1}, x_d)$ to the curator, who evaluates $f(x)$ and releases $A(x) = f(x) + \text{noise}$ to the data analysts.]

  21. Global Sensitivity Framework
      Global sensitivity of a function $f$ is $GS_f = \max_{\text{neighbors } x, x'} |f(x) - f(x')|$.
      Example: $x_1, \dots, x_n \in [0,1]$, $\mathrm{ave}(x) = \frac{x_1 + \dots + x_n}{n}$
      • $GS_{\mathrm{ave}} = {?}$

  22. Global Sensitivity Framework
      Global sensitivity of a function $f$ is $GS_f = \max_{\text{neighbors } x, x'} |f(x) - f(x')|$.
      Example: $x_1, \dots, x_n \in [0,1]$, $\mathrm{ave}(x) = \frac{x_1 + \dots + x_n}{n}$
      • $GS_{\mathrm{ave}} = 1/n$
      Theorem [Dwork McSherry Nissim Smith]
      If $A(x) = f(x) + \mathrm{Lap}\left(\frac{GS_f}{\varepsilon}\right)$ then $A$ is $\varepsilon$-differentially private.
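
The value $GS_{\mathrm{ave}} = 1/n$ can be checked directly. Suppose neighbors $x$ and $x'$ differ only in position $i$, and every entry lies between 0 and 1, as in the example:

```latex
\begin{align*}
|\mathrm{ave}(x) - \mathrm{ave}(x')|
  &= \left| \frac{x_1 + \dots + x_n}{n} - \frac{x'_1 + \dots + x'_n}{n} \right| \\
  &= \frac{|x_i - x'_i|}{n} \\
  &\le \frac{1}{n}.
\end{align*}
```

The bound is attained when $x_i = 0$ and $x'_i = 1$, so the maximum over all neighboring pairs is exactly $1/n$.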

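  23. Global Sensitivity: Noise Distribution
      Laplace Mechanism
      Theorem [Dwork McSherry Nissim Smith]
      If $A(x) = f(x) + \mathrm{Lap}\left(\frac{GS_f}{\varepsilon}\right)$ then $A$ is $\varepsilon$-differentially private.
      Laplace distribution $\mathrm{Lap}(\lambda)$ has density $h(y) = \frac{1}{2\lambda} \cdot e^{-|y|/\lambda}$ (mean 0, standard deviation $\sqrt{2} \cdot \lambda$).
      Sliding property of $\mathrm{Lap}\left(\frac{GS_f}{\varepsilon}\right)$: for all $y, \delta$: $\frac{h(y)}{h(y+\delta)} \le e^{\varepsilon \cdot |\delta| / GS_f}$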
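
The sliding property is what makes the theorem work. A sketch of the argument, assuming $f$ is real-valued: for neighbors $x, x'$, the output $A(x)$ has density $h(a - f(x))$ at the point $a$. Setting $y = a - f(x)$ and $\delta = f(x) - f(x')$, so that $y + \delta = a - f(x')$:

```latex
\begin{align*}
\frac{h(a - f(x))}{h(a - f(x'))}
  = \frac{h(y)}{h(y + \delta)}
  \le e^{\varepsilon \cdot |f(x) - f(x')| / GS_f}
  \le e^{\varepsilon},
\end{align*}
```

where the last step uses $|f(x) - f(x')| \le GS_f$ for neighbors. Integrating the density bound over any set of answers $S$ gives $\Pr[A(x) \in S] \le e^{\varepsilon} \Pr[A(x') \in S]$.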

  24. When is Laplace Mechanism Useful?
      • Laplace mechanism is always private.
      • When is it accurate?
      Example: $x_1, \dots, x_n \in [0,1]$, $\mathrm{ave}(x) = \frac{x_1 + \dots + x_n}{n}$
      • $GS_{\mathrm{ave}} = 1/n$
      • Noise $= \mathrm{Lap}\left(\frac{1}{\varepsilon n}\right)$
      Accurate when GS is low (and $n$, the size of the database, is sufficiently large)
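
As an illustration, the average example can be sketched in a few lines. The function names are hypothetical, and the Laplace sample is drawn as the difference of two independent exponential variables, which is a standard identity rather than something from the slides. This is a teaching sketch only: floating-point Laplace sampling like this is not considered safe for real privacy-critical deployments.

```python
import random


def laplace_noise(scale, rng=random):
    """Sample from Lap(scale): the difference of two independent
    exponential variables with mean `scale` is Laplace(0, scale)."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_average(xs, eps, rng=random):
    """Release ave(xs) + Lap(GS_ave / eps) for entries in [0, 1]."""
    n = len(xs)
    if n == 0 or not all(0.0 <= x <= 1.0 for x in xs):
        raise ValueError("expected a non-empty list of values in [0, 1]")
    global_sensitivity = 1.0 / n                  # GS_ave = 1/n
    return sum(xs) / n + laplace_noise(global_sensitivity / eps, rng)


rng = random.Random(0)                            # fixed seed for repeatability
data = [0.3] * 10_000
print(private_average(data, eps=1.0, rng=rng))    # close to 0.3
```

With $n = 10{,}000$ and $\varepsilon = 1$ the noise has scale $1/(\varepsilon n) = 10^{-4}$, so the released value is very close to the true average. With $n = 10$ the scale would be $0.1$, which is large relative to values in $[0, 1]$; this is the sense in which the mechanism needs a sufficiently large database.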
