
SLIDE 1

Building complex DP algorithms using composition

Privacy & Fairness in Data Science CS848 Fall 2019

SLIDE 2

Outline

  • Recap
    – Laplace Mechanism
  • Composition Theorems
  • Optimizing accuracy of DP algorithms
    – Utilizing Parallel Composition
    – Postprocessing & Inference
    – Strategy Selection
    – Data dependent noise

SLIDE 3

Differential Privacy

For every pair of inputs D1 and D2 that differ in one row, and for every output O, an adversary should not be able to distinguish between D1 and D2 based on O:

∀ O ⊆ range(A):  | ln ( Pr[A(D1) ∈ O] / Pr[A(D2) ∈ O] ) | ≤ ε,  ε > 0

[Dwork ICALP 2006]

SLIDE 4

Laplace mechanism

The analyst poses an aggregate query q (e.g., COUNT) against the private database D and receives a noisy answer:

q̂(D) = q(D) + Lap( S(q) / ε )

where S(q) is the sensitivity of q. For COUNT, S(q) = 1.
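The mechanism above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the names `lap_sample` and `laplace_mechanism` are ours, not from the slides:

```python
import math
import random

def lap_sample(b):
    # Draw one sample from Lap(b) by inverting the CDF.
    u = random.random() - 0.5
    return -b * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def laplace_mechanism(true_answer, sensitivity, eps):
    # q̂(D) = q(D) + Lap(S(q)/ε)
    return true_answer + lap_sample(sensitivity / eps)

# Example: a COUNT query (sensitivity 1) over a small table.
random.seed(0)
true_count = 5
noisy_count = laplace_mechanism(true_count, sensitivity=1, eps=0.5)
```

Smaller ε means a larger noise scale S(q)/ε, i.e., more privacy and less accuracy.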

SLIDE 5

Outline

  • Recap
    – Laplace Mechanism
  • Composition Theorems
  • Optimizing accuracy of DP algorithms
    – Utilizing Parallel Composition
    – Postprocessing & Inference
    – Strategy Selection
    – Data dependent noise

SLIDE 6

Sequential Composition

  • If M1, M2, ..., Mk are algorithms that access a private database D such that each Mi satisfies εi-differential privacy, then the combination of their outputs satisfies ε-differential privacy with

ε = ε1 + ... + εk

[Diagram: private database D; M1 (ε1) releases M1(D); M2 (ε2) releases M2(D, M1(D))]
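Sequential composition is what makes privacy-budget accounting possible: each mechanism run on the same database spends part of the budget, and the costs add. A minimal sketch of such an accountant (the class `BudgetAccountant` and its method names are our own invention):

```python
import math
import random

def lap_sample(b):
    # One draw from Lap(b) via inverse-CDF sampling.
    u = random.random() - 0.5
    return -b * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

class BudgetAccountant:
    """Tracks the total ε spent by a sequence of mechanisms run on the
    same database (sequential composition: ε = ε1 + ... + εk)."""

    def __init__(self):
        self.spent = 0.0

    def run_count(self, true_count, eps_i):
        self.spent += eps_i  # budgets add under sequential composition
        # COUNT has sensitivity 1, so the noise scale is 1/εi.
        return true_count + lap_sample(1.0 / eps_i)

acct = BudgetAccountant()
a1 = acct.run_count(10, eps_i=0.1)
a2 = acct.run_count(25, eps_i=0.2)
# acct.spent is now 0.1 + 0.2 = 0.3
```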

SLIDE 7

Parallel Composition

  • If M1, M2, ..., Mk are algorithms that access disjoint databases D1, D2, …, Dk such that each Mi satisfies εi-differential privacy, then the combination of their outputs satisfies ε-differential privacy with

ε = max(ε1, ..., εk)

[Diagram: disjoint private databases D1 and D2; M1 (ε1) releases M1(D1); M2 (ε2) releases M2(D2)]
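A histogram is the canonical use of parallel composition: each bucket is a count over a disjoint slice of the database, so all buckets together cost only ε, not (number of buckets)·ε. A minimal sketch (the function `noisy_histogram` is our own name):

```python
import math
import random

def lap_sample(b):
    # One draw from Lap(b) via inverse-CDF sampling.
    u = random.random() - 0.5
    return -b * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def noisy_histogram(values, domain, eps):
    # Each bucket is a count over a *disjoint* portion of the data, so
    # by parallel composition releasing every bucket with Lap(1/ε)
    # noise satisfies max(ε, ..., ε) = ε-DP in total.
    counts = {v: 0 for v in domain}
    for v in values:
        counts[v] += 1
    return {v: c + lap_sample(1.0 / eps) for v, c in counts.items()}

hist = noisy_histogram(["M", "F", "F", "M", "M"], domain=["M", "F"], eps=0.5)
```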

SLIDE 8

Postprocessing

  • If M is an ε-differentially private algorithm, any additional post-processing A ∘ M also satisfies ε-differential privacy.

[Diagram: private database D; M (ε) releases M(D); A computes A(M(D))]
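A common post-processing step A is cleaning up a noisy count, e.g., clamping it to a non-negative integer. Because it only touches the DP output, the result stays ε-DP. A tiny sketch (`postprocess_count` is our own name):

```python
def postprocess_count(noisy_count):
    # Post-processing: map the noisy count to a plausible value
    # (a non-negative integer). This touches only the released output,
    # so by the postprocessing theorem the result is still ε-DP.
    return max(0, round(noisy_count))

print(postprocess_count(-1.2))  # a negative noisy count clamps to 0
print(postprocess_count(3.7))   # rounds to the nearest integer
```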

SLIDE 9

Transformations & Stability

  • τV: Stability of the transformation V
    – Maximum number of rows in V(D) that can change due to changing a single row in D

[Diagram: private database D → transformation V (need not satisfy DP) → transformed database V(D); M (ε) releases M(V(D))]

SLIDE 10

Transformations & Stability

  • Executing an ε-differentially private algorithm M on a transformation of a database V(D) satisfies (ε · τV)-differential privacy.
  • τV: Stability of the transformation V
    – Maximum number of rows in V(D) that can change due to changing a single row in D

[Diagram: private database D → transformation V → transformed database V(D); M (ε) releases M(V(D))]

SLIDE 11

Transformations & Stability

  • V1: For each row (x1, x2, x3) → (x1, x2+x3).  Stability = 1
  • V2: Each row in D is a tweet (id, {words}). For each row in D, generate k rows with the first k words {(id, word1), …, (id, wordk)}.  Stability = k
  • V3: Sample each row with probability p.  Stability = 1 … but one can prove 2pε-differential privacy*

*Adam Smith, Differential Privacy and Secrecy of the Sample

SLIDE 12

Outline

  • Recap
    – Laplace Mechanism
  • Composition Theorems
  • Optimizing accuracy of DP algorithms
    – Utilizing Parallel Composition
    – Postprocessing & Inference
    – Strategy Selection
    – Data dependent noise

SLIDE 13

Problem

  • Design an ε-differentially private algorithm that can answer all these questions.
  • What is the total error?

Sex  Height  Weight
M    6'2"    210
F    5'3"    190
F    5'9"    160
M    5'3"    180
M    6'7"    250

Queries:

  • # Males with BMI < 25
  • # Males
  • # Females with BMI < 25
  • # Females
SLIDE 14

Algorithm 1

Return:

  • (# Males with BMI < 25) + Lap(4/ε)
  • (# Males) + Lap(4/ε)
  • (# Females with BMI < 25) + Lap(4/ε)
  • (# Females) + Lap(4/ε)

SLIDE 15

Privacy

  • BMI can be computed by transforming each row (s, h, w) → (s, bmi). This transformation has stability 1.
  • Sensitivity of count = 1. So each query is answered using an (ε/4)-DP algorithm.
  • By sequential composition, we get ε-DP.

SLIDE 16

Utility

Error per query: E[ ( q̂(D) − q(D) )² ] = 2(4/ε)²

Total Error: 2(4/ε)² × 4 = 128/ε²

SLIDE 17

Algorithm 2

Compute:

  • r̂1 = (# Males with BMI < 25) + Lap(1/ε)
  • r̂2 = (# Males with BMI > 25) + Lap(1/ε)
  • r̂3 = (# Females with BMI < 25) + Lap(1/ε)
  • r̂4 = (# Females with BMI > 25) + Lap(1/ε)

Return:

  • r̂1, r̂1+r̂2, r̂3, r̂3+r̂4

SLIDE 18

Privacy

  • Sensitivity of count = 1. So each query is answered using an ε-DP algorithm.
  • r1, r2, r3, r4 are counts on disjoint portions of the database. Thus by parallel composition, releasing r̂1, r̂2, r̂3, r̂4 satisfies ε-DP.
  • By the postprocessing theorem, releasing r̂1, r̂1+r̂2, r̂3, r̂3+r̂4 also satisfies ε-DP.

SLIDE 19

Utility

Error per query: E[ ( q̂(D) − q(D) )² ]

Total Error: 2(1/ε)² + 2·2(1/ε)² + 2(1/ε)² + 2·2(1/ε)² = 12/ε²

(the four terms are the errors of r̂1, r̂1+r̂2, r̂3, and r̂3+r̂4 respectively)

SLIDE 20

Utility

Total Error: 2(1/ε)² + 2·2(1/ε)² + 2(1/ε)² + 2·2(1/ε)² = 12/ε²

(the four terms are the errors of r̂1, r̂1+r̂2, r̂3, and r̂3+r̂4 respectively)

Tighter privacy analysis gives better accuracy for the same level of privacy.
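The two totals follow from plugging Var[Lap(b)] = 2b² into each released quantity. A quick arithmetic check (function names are ours):

```python
def lap_var(b):
    # Variance of the Laplace distribution with scale b.
    return 2.0 * b * b

def alg1_total_error(eps):
    # Algorithm 1: four queries, each with Lap(4/ε) noise.
    return 4 * lap_var(4.0 / eps)

def alg2_total_error(eps):
    # Algorithm 2: r̂1 and r̂3 carry one Lap(1/ε) draw each, while
    # r̂1+r̂2 and r̂3+r̂4 each sum two independent Lap(1/ε) draws.
    single = lap_var(1.0 / eps)
    return single + 2 * single + single + 2 * single

# With ε = 1: Algorithm 1 totals 128, Algorithm 2 totals 12.
```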

SLIDE 21

Generalized Sensitivity

  • Let f: 𝒟 → ℝ^d be a function that outputs a vector of d real numbers. The sensitivity of f is given by:

S(f) = max over D, D' with |D Δ D'| = 1 of ‖ f(D) − f(D') ‖₁

where ‖y − z‖₁ = Σi | yi − zi |

SLIDE 22

Generalized Sensitivity

  • r1 = # Males with BMI < 25
  • r2 = # Males with BMI > 25
  • r = # Males
  • Let f1 be a function that answers both r1, r2
  • Let f2 be a function that answers both r1, r
  • Sensitivity of f1 = 1
  • Sensitivity of f2 = 2
  • An alternate privacy proof for Alg 2 is to show that the generalized sensitivity of (r1, r2, r3, r4) is 1.
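The two sensitivities above can be checked by brute force on a toy database: add one row and measure the L1 change of each query vector. A small sketch (the row encoding (sex, bmi) and the names `f1`, `f2`, `l1_diff` are our illustration, not the slides'):

```python
def f1(db):
    # Answers (r1, r2): males with BMI < 25 and males with BMI > 25.
    return (sum(1 for s, bmi in db if s == "M" and bmi < 25),
            sum(1 for s, bmi in db if s == "M" and bmi > 25))

def f2(db):
    # Answers (r1, r): males with BMI < 25 and all males.
    return (sum(1 for s, bmi in db if s == "M" and bmi < 25),
            sum(1 for s, bmi in db if s == "M"))

def l1_diff(a, b):
    # ‖a − b‖1 between two query-answer vectors.
    return sum(abs(x - y) for x, y in zip(a, b))

db = [("M", 22), ("F", 30), ("M", 28)]
neighbor = db + [("M", 20)]  # databases differing in one row
# The new row changes f1 by 1 in L1 (it hits only one coordinate),
# but changes f2 by 2 (it hits both coordinates): sensitivities 1 and 2.
```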

SLIDE 23

Outline

  • Recap
    – Laplace Mechanism
  • Composition Theorems
  • Optimizing accuracy of DP algorithms
    – Utilizing Parallel Composition
    – Postprocessing & Inference
    – Strategy Selection
    – Data dependent noise

SLIDE 24

Improving utility of Alg 2

Compute:

  • r̂1 = (# Males with BMI < 25) + Lap(1/ε)
  • r̂2 = (# Males with BMI > 25) + Lap(1/ε)

Return:

  • r̂1, r̂1+r̂2

We know r1 ≤ r1 + r2, but P[ r̂1 > r̂1+r̂2 ] > 0
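For this two-query case the inconsistency can be repaired by a least-squares projection onto the constraint r̄1 ≤ r̄1 + r̄2, i.e., onto the half-plane {(a, s): a ≤ s}. A minimal sketch (`project_pair` is our own name):

```python
def project_pair(a_hat, s_hat):
    # Least-squares projection of (â, ŝ) onto {(a, s): a ≤ s}.
    # If the noisy answers already satisfy r̂1 ≤ r̂1 + r̂2, keep them;
    # otherwise the closest feasible point sets both to their average.
    if a_hat <= s_hat:
        return a_hat, s_hat
    mid = (a_hat + s_hat) / 2.0
    return mid, mid

# project_pair(5.0, 3.0) -> (4.0, 4.0); project_pair(1.0, 2.0) is unchanged.
```

Since this is post-processing of DP outputs, it costs no extra privacy budget.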

SLIDE 25

Constrained Inference

[Diagram — Step 1: the analyst asks queries Q over the private data I (held by the data owner) through a differentially private interface; Step 2: the interface releases noisy answers q̃ for the true answers Q(I) = q; Step 3: constrained inference post-processes q̃ into consistent final answers.]

SLIDE 26

Constrained Inference

  • Let r1, r2, …, rk be a set of queries
  • Let r̂1, r̂2, …, r̂k be the noisy answers
  • Constraint C(r1, r2, …, rk) = 1 holds on the true answers (for all typical databases), but does not hold on the noisy answers.
  • Goal: Find r̄1, r̄2, …, r̄k that are:
    – Close to r̂1, r̂2, …, r̂k
    – Satisfy the constraint C(r̄1, r̄2, …, r̄k)

SLIDE 27

Least Squares Optimization

min Σi ( r̂i − r̄i )²   s.t.   C(r̄1, r̄2, …, r̄k)

SLIDE 28

Geometric Interpretation

min Σi ( r̂i − r̄i )²   s.t.   C(r̄1, r̄2, …, r̄k)

[Diagram: the true answers r = (r1, …, rk); noise moves them to r̂ = (r̂1, …, r̂k); projection maps r̂ onto the space of outputs satisfying the constraint, giving r̄ = (r̄1, …, r̄k).]

SLIDE 29

Geometric Interpretation

Theorem: ‖ r − r̄ ‖₂ ≤ ‖ r − r̂ ‖₂ when the constraints form a convex space

min Σi ( r̂i − r̄i )²   s.t.   C(r̄1, r̄2, …, r̄k)

[Diagram: the true answers r = (r1, …, rk); noise gives r̂ = (r̂1, …, r̂k); projection onto the constraint space gives r̄ = (r̄1, …, r̄k).]

SLIDE 30

Ordering Constraint

Isotonic Regression:

min Σi ( r̂i − r̄i )²   s.t.   r̄1 ≤ r̄2 ≤ … ≤ r̄k
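The ordering-constrained least-squares problem has a classic exact solution, the Pool Adjacent Violators algorithm. A compact sketch (our own implementation, not code from the slides):

```python
def isotonic_regression(y):
    # Pool Adjacent Violators: least-squares fit of y subject to a
    # nondecreasing ordering constraint. Each block stores (mean, width);
    # whenever two adjacent blocks violate the ordering, they are merged
    # into a single block holding their weighted mean.
    blocks = []
    for v in y:
        blocks.append((float(v), 1))
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2 = blocks.pop()
            m1, w1 = blocks.pop()
            blocks.append(((m1 * w1 + m2 * w2) / (w1 + w2), w1 + w2))
    fit = []
    for m, w in blocks:
        fit.extend([m] * w)
    return fit

# isotonic_regression([3, 1, 2]) -> [2.0, 2.0, 2.0]
```

Applied to noisy DP answers this is again pure post-processing, so the ε guarantee is unaffected.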

SLIDE 31

Outline

  • Recap
    – Laplace Mechanism
  • Composition Theorems
  • Optimizing accuracy of DP algorithms
    – Utilizing Parallel Composition
    – Postprocessing & Inference
    – Strategy Selection
    – Data dependent noise

SLIDE 32

Problem

  • Design an ε-differentially private algorithm that can answer all range queries.
  • What is the total error?

Sex  Height  Weight
M    6'2"    210
F    5'3"    190
F    5'9"    160
M    5'3"    180
M    6'7"    250

Queries:

  • # people with height in [5'1", 6'2"]
  • # people with height in [2'0", 4'0"]
  • # people with height in [3'3", 7'0"]
SLIDE 33

Problem

  • Let {v1, …, vk} be the domain of an attribute
  • Let {x1, …, xk} be the number of rows with values v1, …, vk
  • Range Query: qij = xi + xi+1 + … + xj
  • Goal: Answer all range queries

SLIDE 34

Strategy 1:

  • Answer all O(k²) range queries directly using the Laplace mechanism
  • Sensitivity: O(k²) — a single row can appear in O(k²) of the ranges
  • Error per query: O(k⁴/ε²), so the total error over all O(k²) queries is O(k⁶/ε²)

SLIDE 35

Strategy 2:

  • Estimate each individual xi using the Laplace mechanism
  • Answer: q̂ij = x̂i + x̂i+1 + … + x̂j
  • Error in each x̂i: O(1/ε²)
  • Error in q̂1k: O(k/ε²)
  • Total Error: O(k³/ε²)

SLIDE 36

Strategy 3: Hierarchy

  • Estimate all the counts in the tree below using the Laplace mechanism

[Tree over k = 8 counts: leaves x1 … x8; internal nodes x12, x34, x56, x78, x1234, x5678; root x1-8. E.g., x5678 = x5 + x6 + x7 + x8.]

SLIDE 37

Strategy 3: Hierarchy

  • Sensitivity: log k — a single row affects one node per level of the tree
  • Every range query can be answered by summing up at most 2 log k nodes in the tree.
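The hierarchy strategy can be sketched as follows: build the dyadic tree bottom-up, add Laplace noise scaled to the number of levels, and answer a range by greedily taking any node whose interval fits inside the query. This is a minimal illustration assuming k is a power of two; all function names are ours:

```python
import math
import random

def lap_sample(b):
    # One draw from Lap(b) via inverse-CDF sampling.
    u = random.random() - 0.5
    return -b * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def noisy_tree(x, eps):
    # Build the dyadic tree bottom-up; one row touches one node per
    # level, so the tree's sensitivity is its number of levels and
    # every node gets Lap(levels/ε) noise.
    levels = [list(x)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([prev[i] + prev[i + 1] for i in range(0, len(prev), 2)])
    b = len(levels) / eps
    return [[c + lap_sample(b) for c in lvl] for lvl in levels]

def range_query(tree, i, j):
    # Decompose [i, j] into O(log k) nodes: take a node whenever its
    # interval lies entirely inside the query range.
    total = 0.0

    def visit(lvl, idx, lo, hi):
        nonlocal total
        if hi < i or lo > j:
            return
        if i <= lo and hi <= j:
            total += tree[lvl][idx]
            return
        mid = (lo + hi) // 2
        visit(lvl - 1, 2 * idx, lo, mid)
        visit(lvl - 1, 2 * idx + 1, mid + 1, hi)

    visit(len(tree) - 1, 0, 0, len(tree[0]) - 1)
    return total

random.seed(0)
tree = noisy_tree([1, 2, 3, 4, 5, 6, 7, 8], eps=1.0)
estimate = range_query(tree, 1, 6)  # estimates x2 + ... + x7 = 27
```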

SLIDE 38

Strategy 3: Hierarchy

  • Error in each node: O((log k)²/ε²)
  • Max error on a range query: O((log k)³/ε²)
  • Total Error: O(k²(log k)³/ε²)

SLIDE 39

Strategy 3: Hierarchy

  • Error in each node: O((log k)²/ε²)
  • Max error on a range query: O((log k)³/ε²)
  • Total Error: O(k²(log k)³/ε²)
  • Error can be further reduced using constrained inference
    – Here the constraint is that parent counts should not be smaller than child counts.

SLIDE 40

Strategy based mechanisms

  • Can think of nodes in the tree as coefficients.
  • Other algorithms use other transformations
    – Wavelets, Fourier coefficients
  • Should be able to losslessly reconstruct the original data/query answers.
  • General Idea:
    – Apply transform
    – Add noise to the transformed space (based on sensitivity)
    – Reconstruct original data/query answers from noisy coefficients

[Pipeline: Original Data → Transform → Coefficients → (+ Noise) → Noisy Coefficients → Reconstruct → Private Data]

SLIDE 41

Outline

  • Recap
    – Laplace Mechanism
  • Composition Theorems
  • Optimizing accuracy of DP algorithms
    – Utilizing Parallel Composition
    – Postprocessing & Inference
    – Strategy Selection
    – Data dependent noise

SLIDE 42

Data dependent noise mechanisms

[Pipeline: Original Data → Transform → Coefficients → (+ Noise) → Noisy Coefficients → Reconstruct → Private Data]

Here the transformation can be lossy, and reconstruction is non-unique.

[LHMY14] Li et al. A data- and workload-aware algorithm for range queries under differential privacy. In PVLDB, 2014.

SLIDE 43

Data dependent noise mechanisms

  • Use a data dependent sensitivity measure called Smooth Sensitivity.

K. Nissim, S. Raskhodnikova, A. Smith, "Smooth Sensitivity and Sampling in Private Data Analysis", STOC 2007

SLIDE 44

Summary

  • Composition theorems help build complex algorithms using simple building blocks
    – Sequential composition
    – Parallel composition
    – Postprocessing
    – There are more advanced forms of composition.

SLIDE 45

Summary

  • For the same privacy budget, a better designed algorithm can extract more utility
    – When possible, use parallel composition
    – Inference on constraints between queries can reduce error
    – Answering a different strategy of queries can help reduce error