Randomness in Computing, Lecture 27

Last time
• Stationary distributions
• Random walks on graphs
• Algorithm for $s$-$t$-PATH
Today
• Sublinear algorithms
• Differential privacy
4/29/2020. Sofya Raskhodnikova; Randomness in Computing; based on slides by Baranasuriya et al.

A Sublinear-Time Algorithm
[Figure: a long input (B L A - B L A - B L A - ...) of which a randomized algorithm reads only a few positions (? L ? B ? L ? A) and then outputs an approximate answer.]
Resources
• number of queries
• running time
Quality of approximation

Goal: Fundamental Understanding of Sublinear Computation
• What computational tasks?
• How to measure quality of approximation?
• What type of access to the input?
• Can we make our computations robust (e.g., to noise or erased data)?

Fundamental Computational Tasks
• Property testing
  - need to answer YES or NO
  - intuition: only require correct answers on two sets of instances that are very different from each other
• Learning
  - need an approximate representation of an object
  - input is from a given class (or is close to it)
• Classical approximation
  - need to compute a value
  - output should be close to the desired value

Property Testing: Definition [Rubinfeld Sudan, Goldreich Goldwasser Ron]
Randomized Algorithm
• YES: accept with probability $\ge 2/3$
• NO: reject with probability $\ge 2/3$
Property Tester
• YES: accept with probability $\ge 2/3$
• Close to YES: don't care
• $\varepsilon$-far from YES: reject with probability $\ge 2/3$
$\varepsilon$-far = differs in many places ($\ge \varepsilon$ fraction of places)

Example: Lipschitz Testing [Jha R]
Input: a list of $n$ numbers $x_1, x_2, \dots, x_n$
• A list of numbers is Lipschitz if $|x_{i+1} - x_i| \le 1$ for all $i$.
• Question: Is the list Lipschitz? Requires reading the entire list: $\Omega(n)$ time.
• Approximate version: Is the list Lipschitz or $\varepsilon$-far from Lipschitz? (At least an $\varepsilon$ fraction of the $x_i$'s have to be changed to make it Lipschitz.)
Our result: $O((\log n)/\varepsilon)$ time
[Figure: plot of $x_i$ against $i$ for the example list 5 6 5 4 5 4 3 2 2 1.]
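The exact version of the question can be sketched as a single linear scan, which is the $\Omega(n)$ baseline that the tester improves on. The function name `is_lipschitz` is an illustrative choice, and the example values are the ones read off the slide's plot.

```python
def is_lipschitz(xs):
    """Exact check: every adjacent pair differs by at most 1 (reads all n entries)."""
    return all(abs(b - a) <= 1 for a, b in zip(xs, xs[1:]))


example = [5, 6, 5, 4, 5, 4, 3, 2, 2, 1]  # values from the slide's example plot
print(is_lipschitz(example))   # True
print(is_lipschitz([1, 3]))    # False: the single step has size 2
```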

Lipschitz Testing: Attempts
1. Test: Pick a random $i$ and reject if $|x_{i+1} - x_i| > 1$.
   Fails on: 0 1 2 3 5 6 7 8 (1/2-far from Lipschitz, yet only one adjacent pair is violated)
2. Test: Pick random $i < j$ and reject if $|x_j - x_i| > j - i$.
   Fails on: 0 2 1 3 2 4 3 5 4 6 (1/2-far from Lipschitz, yet only a few pairs are violated)
[Figures: plots of $x_i$ against $i$ for the two lists.]
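To see why both attempts fail, it helps to count how many of the candidate pairs each test could actually catch on its bad example. The sketch below does this by brute force; the helper names are illustrative, not from the slides.

```python
from itertools import combinations


def adjacent_violations(xs):
    """(violated, total) over adjacent pairs, i.e. what Test 1 samples from."""
    total = len(xs) - 1
    bad = sum(1 for i in range(total) if abs(xs[i + 1] - xs[i]) > 1)
    return bad, total


def pair_violations(xs):
    """(violated, total) over all pairs i < j, i.e. what Test 2 samples from."""
    pairs = list(combinations(range(len(xs)), 2))
    bad = sum(1 for i, j in pairs if abs(xs[j] - xs[i]) > j - i)
    return bad, len(pairs)


list1 = [0, 1, 2, 3, 5, 6, 7, 8]
list2 = [0, 2, 1, 3, 2, 4, 3, 5, 4, 6]
print(adjacent_violations(list1))  # (1, 7)
print(pair_violations(list2))      # (5, 45)
```

In both cases a single sample hits a violation with probability roughly $1/n$, so a constant detection probability would need about $n$ samples.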

Is a list Lipschitz or $\varepsilon$-far from Lipschitz?
Idea: Associate positions in the list with vertices of the directed line $1 \to 2 \to 3 \to \dots \to n-1 \to n$.
Construct a graph (2-spanner) with $\le n \log n$ edges [Bhattacharyya Grigorescu Jung R Woodruff]
• by adding a few "shortcut" edges $(i, j)$ for $i < j$
• where each pair of vertices is connected by a path of length at most 2
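The slide does not spell out how the 2-spanner is built. One standard construction, assumed here for illustration, is a midpoint recursion: connect every vertex of an interval to the interval's midpoint, then recurse on the two halves. Each recursion level adds fewer than $n$ edges and there are at most $\log_2 n$ levels. The function names are hypothetical.

```python
import math
from itertools import combinations


def two_spanner_edges(n):
    """Edges (i, j) with i < j of a 2-spanner on positions 1..n (midpoint recursion)."""
    edges = set()

    def build(lo, hi):
        if lo >= hi:
            return
        mid = (lo + hi) // 2
        for i in range(lo, mid):           # everything left of mid points to mid
            edges.add((i, mid))
        for j in range(mid + 1, hi + 1):   # mid points to everything right of it
            edges.add((mid, j))
        build(lo, mid - 1)
        build(mid + 1, hi)

    build(1, n)
    return edges


def has_short_path(edges, i, j):
    """True if i reaches j by one edge or by two edges i -> k -> j."""
    if (i, j) in edges:
        return True
    return any((i, k) in edges and (k, j) in edges for k in range(i + 1, j))


n = 16
edges = two_spanner_edges(n)
print(len(edges), "edges; bound n*log2(n) =", n * math.log2(n))
print(all(has_short_path(edges, i, j) for i, j in combinations(range(1, n + 1), 2)))
```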

Is a list Lipschitz or $\varepsilon$-far from Lipschitz?
Test: Pick a random edge $(i, j)$ from the 2-spanner and reject if $|x_j - x_i| > j - i$.
[Figure: an example list with entries $x_i$, $x_k$, $x_j$ marked.]
Analysis:
• Call a pair $(i, j)$ violated if $|x_j - x_i| > j - i$, and satisfied otherwise.
• If $i$ is an endpoint of a violated edge, call $x_i$ bad. Otherwise, call it good.
Claim 1. All pairs of good numbers are satisfied.
Proof: Consider any two good numbers, $x_i$ and $x_j$ with $i < j$. They are connected by a path of (at most) two edges $(i, k)$, $(k, j)$, and both edges are satisfied because each has a good endpoint. Hence $|x_k - x_i| \le k - i$ and $|x_j - x_k| \le j - k$, so
$|x_j - x_i| \le |x_j - x_k| + |x_k - x_i| \le (j - k) + (k - i) = j - i$.

Is a list Lipschitz or $\varepsilon$-far from Lipschitz?
Test: Pick a random edge $(i, j)$ from the 2-spanner and reject if $|x_j - x_i| > j - i$.
[Figure: an example list with entries $x_i$, $x_k$, $x_j$ marked.]
Analysis:
• Call a pair $(i, j)$ violated if $|x_j - x_i| > j - i$, and satisfied otherwise.
• If $i$ is an endpoint of a violated edge, call $x_i$ bad. Otherwise, call it good.
Claim 1. All pairs of good numbers are satisfied.
Claim 2. An $\varepsilon$-far list violates $\ge \varepsilon/(2 \log n)$ fraction of edges in the 2-spanner.
Proof: If a list is $\varepsilon$-far from Lipschitz, it has $\ge \varepsilon n$ bad numbers. (Otherwise, by Claim 1, the good numbers are pairwise satisfied, and changing only the fewer than $\varepsilon n$ bad numbers would make the list Lipschitz.)
• Each violated edge contributes at most 2 bad numbers.
• So the 2-spanner has $\ge \varepsilon n / 2$ violated edges out of at most $n \log n$.

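Is a list Lipschitz or $\varepsilon$-far from Lipschitz?
Test: Pick a random edge $(i, j)$ from the 2-spanner and reject if $|x_j - x_i| > j - i$.
[Figure: an example list with entries $x_i$, $x_k$, $x_j$ marked.]
Analysis:
• Call a pair $(i, j)$ violated if $|x_j - x_i| > j - i$, and satisfied otherwise.
Claim 2. An $\varepsilon$-far list violates $\ge \varepsilon/(2 \log n)$ fraction of edges in the 2-spanner.
Algorithm: Sample $(4 \log n)/\varepsilon$ edges $(i, j)$ from the 2-spanner and reject if $|x_j - x_i| > j - i$ for any sampled edge.
Guarantee: All Lipschitz lists are accepted. All lists that are $\varepsilon$-far from Lipschitz are rejected with probability $\ge 2/3$: by Claim 2, all samples miss the violated edges with probability at most $(1 - \varepsilon/(2 \log n))^{(4 \log n)/\varepsilon} \le e^{-2} < 1/3$.
Time: $O((\log n)/\varepsilon)$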
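The algorithm above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it uses a midpoint-recursion 2-spanner (one standard construction, not specified on the slide), base-2 logarithms, and 0-based positions. It also materializes the full edge list, which costs about $n \log n$ time; only the number of list entries read is sublinear here, and a genuinely sublinear-time version would sample a spanner edge without building the graph.

```python
import math
import random


def two_spanner_edges(n):
    """Edges (i, j), i < j, of a 2-spanner on positions 0..n-1 (assumed construction)."""
    edges = set()

    def build(lo, hi):
        if lo >= hi:
            return
        mid = (lo + hi) // 2
        edges.update((i, mid) for i in range(lo, mid))
        edges.update((mid, j) for j in range(mid + 1, hi + 1))
        build(lo, mid - 1)
        build(mid + 1, hi)

    build(0, n - 1)
    return edges


def lipschitz_tester(xs, eps, rng=random):
    """Accept (True) or reject (False) after reading O(log(n)/eps) list entries."""
    n = len(xs)
    if n < 2:
        return True
    edges = sorted(two_spanner_edges(n))          # explicit edge list, for clarity only
    samples = math.ceil(4 * math.log2(n) / eps)   # sample count from the slide
    for _ in range(samples):
        i, j = rng.choice(edges)
        if abs(xs[j] - xs[i]) > j - i:            # found a violated edge
            return False
    return True


rng = random.Random(0)
print(lipschitz_tester(list(range(64)), 0.5, rng))               # Lipschitz: always True
print(lipschitz_tester([100 * i for i in range(64)], 0.5, rng))  # every edge violated: False
```

The test has one-sided error: a Lipschitz list has no violated edge, so it is never rejected.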

Testing if a List is Lipschitz: Summary
• [Jha R]: We can determine if a list of $n$ numbers is Lipschitz or $\varepsilon$-far from Lipschitz in $O((\log n)/\varepsilon)$ time.
• [Jha R, Blais R Yaroslavtsev, Chakrabarty Dixit Jha Seshadhri]: This cannot be improved.

Testing Properties of High-Dimensional Functions
In polylogarithmic time, we can test a large class of properties of functions $f: \{1, \dots, n\}^d \to \mathbb{R}$, including:
• Lipschitz property [Jha R]
• Monotonicity [Goldreich Goldwasser Lehman Ron, Dodis Goldreich Lehman R Ron Samorodnitsky]
• Bounded-derivative properties [Chakrabarty Dixit Jha Seshadhri]
• Unateness [Baleshzar Chakrabarty Pallavoor R Seshadhri]

Sublinear Algorithms: Summary
• Many problems admit sublinear-time algorithms
• Algorithms are often simple
• Analysis requires creation of interesting combinatorial, geometric and algebraic tools
• Unexpected connections to other areas
• Many open questions

Private Data Analysis
[Figure: the individuals' data $x = (x_1, x_2, x_3, \dots, x_{d-1}, x_d)$ is held by a curator; data analysts send queries to the curator and receive answers.]
Typical examples: census, medical studies, what big companies want to publish about our data, …
Two conflicting goals
• Protect privacy of individuals
  - Differential privacy [Dwork McSherry Nissim Smith 06]
• Give accurate answers

Neighboring Datasets
Two datasets $x, x'$ are neighbors if they differ in one person's data.
[Figure: $x = (x_1, x_2, x_3, \dots, x_{d-1}, x_d)$ and $x' = (x_1, x_2, x'_3, \dots, x_{d-1}, x_d)$, which differ only in the third entry.]

Differential Privacy [Dwork McSherry Nissim Smith]
Privacy Definition: An algorithm $A$ is $\varepsilon$-differentially private if for all pairs of neighbors $x, x'$ and all sets of answers $S$:
$\Pr[A(x) \in S] \le e^{\varepsilon} \Pr[A(x') \in S]$
[Figure: neighboring datasets $x$ and $x'$, which differ only in the third entry.]

Properties of Differential Privacy
• Composition: If algorithms $A_1$ and $A_2$ are $\varepsilon$-differentially private, then the algorithm that outputs $(A_1(x), A_2(x))$ is $2\varepsilon$-differentially private
• Meaningful in the presence of arbitrary external information
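The composition bullet can be checked with a short calculation. The sketch below assumes, for simplicity, that $A_1$ and $A_2$ use independent randomness and have discrete output sets, so the definition can be applied to single outputs $a_1, a_2$; for neighbors $x, x'$:

```latex
\begin{align*}
\Pr[(A_1(x), A_2(x)) = (a_1, a_2)]
  &= \Pr[A_1(x) = a_1] \cdot \Pr[A_2(x) = a_2] \\
  &\le e^{\varepsilon} \Pr[A_1(x') = a_1] \cdot e^{\varepsilon} \Pr[A_2(x') = a_2] \\
  &= e^{2\varepsilon} \Pr[(A_1(x'), A_2(x')) = (a_1, a_2)].
\end{align*}
```

Summing both sides over all $(a_1, a_2)$ in a set of answers $S$ gives the $2\varepsilon$ bound for $S$.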

Output Perturbation
Frameworks for designing differentially private algorithms

Output Perturbation
[Figure: the individuals' data $x = (x_1, x_2, x_3, \dots, x_{d-1}, x_d)$ is held by a curator, who evaluates $f(x)$ and releases $A(x) = f(x) + \text{noise}$ to the data analysts.]

Global Sensitivity Framework
Global sensitivity of a function $f$ is $GS_f = \max_{\text{neighbors } x, x'} |f(x) - f(x')|$.
Example: $x_1, \dots, x_n \in [0,1]$, $\mathrm{ave}(x) = \frac{x_1 + \dots + x_n}{n}$
• $GS_{\mathrm{ave}} = {?}$

Global Sensitivity Framework
Global sensitivity of a function $f$ is $GS_f = \max_{\text{neighbors } x, x'} |f(x) - f(x')|$.
Example: $x_1, \dots, x_n \in [0,1]$, $\mathrm{ave}(x) = \frac{x_1 + \dots + x_n}{n}$
• $GS_{\mathrm{ave}} = 1/n$
Theorem [Dwork McSherry Nissim Smith]: If $A(x) = f(x) + \mathrm{Lap}\left(\frac{GS_f}{\varepsilon}\right)$ then $A$ is $\varepsilon$-differentially private.
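The value $GS_{\mathrm{ave}} = 1/n$ can be sanity-checked by brute force on a tiny instance. The sketch below assumes entries restricted to a small grid of values in $[0,1]$ (here 0, 0.5, 1), enumerates every dataset and every neighbor obtained by changing one entry, and reports the largest change in the average. The helper name `global_sensitivity_on_grid` is illustrative.

```python
from itertools import product


def ave(x):
    return sum(x) / len(x)


def global_sensitivity_on_grid(f, n, values):
    """Max |f(x) - f(x')| over neighbors x, x' with entries drawn from `values`."""
    best = 0.0
    for x in product(values, repeat=n):
        for i in range(n):                # the one position where x' differs
            for v in values:
                if v == x[i]:
                    continue
                x2 = x[:i] + (v,) + x[i + 1:]
                best = max(best, abs(f(x) - f(x2)))
    return best


n = 4
print(global_sensitivity_on_grid(ave, n, (0.0, 0.5, 1.0)))  # 0.25, i.e. 1/n
```

The maximum is attained by flipping one entry between 0 and 1, which moves the sum by 1 and the average by $1/n$.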

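Global Sensitivity: Noise Distribution
Laplace Mechanism
Theorem [Dwork McSherry Nissim Smith]: If $A(x) = f(x) + \mathrm{Lap}\left(\frac{GS_f}{\varepsilon}\right)$ then $A$ is $\varepsilon$-differentially private.
Laplace distribution $\mathrm{Lap}(\lambda)$ has density $h(y) = \frac{1}{2\lambda} e^{-|y|/\lambda}$ (mean 0, standard deviation $\sqrt{2} \cdot \lambda$).
Sliding Property of $\mathrm{Lap}\left(\frac{GS_f}{\varepsilon}\right)$: for all $y, \delta$: $\frac{h(y)}{h(y+\delta)} \le e^{\varepsilon \cdot |\delta| / GS_f}$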
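The sliding property follows from $|y + \delta| - |y| \le |\delta|$, and it can also be checked numerically. The sketch below evaluates the density $h$ for $\lambda = GS_f/\varepsilon$ on a small grid of $y$ and $\delta$; the function names and the particular values of `gs` and `eps` are illustrative assumptions.

```python
import math


def laplace_density(y, lam):
    """Density of Lap(lam): (1 / (2*lam)) * exp(-|y| / lam)."""
    return math.exp(-abs(y) / lam) / (2 * lam)


def sliding_property_holds(y, delta, gs, eps):
    """Check h(y) / h(y + delta) <= exp(eps * |delta| / gs) for Lap(gs / eps)."""
    lam = gs / eps
    ratio = laplace_density(y, lam) / laplace_density(y + delta, lam)
    bound = math.exp(eps * abs(delta) / gs)
    return ratio <= bound * (1 + 1e-12)   # small slack for floating-point rounding


gs, eps = 1.0, 0.5
grid = [k / 2 for k in range(-10, 11)]    # -5.0, -4.5, ..., 5.0
print(all(sliding_property_holds(y, d, gs, eps) for y in grid for d in grid))  # True
```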

When is Laplace Mechanism Useful?
• Laplace mechanism is always private.
• When is it accurate?
Example: $x_1, \dots, x_n \in [0,1]$, $\mathrm{ave}(x) = \frac{x_1 + \dots + x_n}{n}$
• $GS_{\mathrm{ave}} = 1/n$, so Noise $= \mathrm{Lap}\left(\frac{1}{\varepsilon n}\right)$
Accurate when GS is low (and $n$, the size of the database, is sufficiently large)
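A minimal sketch of the Laplace mechanism for the average is given below, assuming entries in $[0,1]$ so that $GS_{\mathrm{ave}} = 1/n$. Python's standard library has no Laplace sampler, so the sketch uses the fact that the difference of two independent exponential variables with mean $\lambda$ is distributed as $\mathrm{Lap}(\lambda)$. The function names are illustrative, and `random` is not a cryptographically secure source, so this is for intuition only.

```python
import random


def laplace_noise(scale, rng):
    """Sample Lap(scale) as the difference of two exponentials with mean `scale`."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_average(xs, eps, rng):
    """Average of xs (entries assumed in [0, 1]) plus Lap(1 / (eps * n)) noise."""
    n = len(xs)
    return sum(xs) / n + laplace_noise(1 / (eps * n), rng)


rng = random.Random(0)
eps = 0.5
for n in (100, 10_000):
    data = [rng.randint(0, 1) for _ in range(n)]
    error = abs(private_average(data, eps, rng) - sum(data) / n)
    print(f"n={n}: noise scale {1 / (eps * n):.5f}, observed error {error:.5f}")
```

The noise scale $1/(\varepsilon n)$ shrinks as the database grows, which is the sense in which the mechanism is accurate for large $n$.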
