
slide-1
SLIDE 1

Differential Privacy for Relational Algebra: improving the sensitivity bounds via constraint systems

Marco Stronati

Università di Pisa, Italy marco@stronati.org

Catuscia Palamidessi

INRIA and LIX, Ecole Polytechnique, France catuscia@lix.polytechnique.fr

Marco Stronati (APVP’12) Differential Privacy for Relational Algebra 1 / 30

slide-4
SLIDE 4

Introduction

Statistical Disclosure Control

Revealing accurate statistics vs Preserving the privacy of individuals.

How many people have cancer?

Does John Doe have cancer?

slide-7
SLIDE 7

Background Quantitative approach

Information Hiding

Dalenius’ ad omnia privacy desideratum (’77): nothing about an individual should be learnable from the database that could not be learned without access to the database.

Trade-off between privacy and utility

Quantitative approach

slide-8
SLIDE 8

Differential Privacy

Differential Privacy - Dwork, McSherry, Smith, Nissim

A randomized function $H : \mathcal{R} \to \mathbb{R}$ satisfies $\epsilon$-differential privacy if, for all pairs $R, R' \in \mathcal{R}$ with $R \sim R'$ and all $X \subseteq \mathbb{R}$:

$$\Pr[H(R) \in X] \le e^{\epsilon} \cdot \Pr[H(R') \in X]$$

slide-10
SLIDE 10

Differential Privacy

Differential Privacy - Dwork, McSherry, Smith, Nissim

A randomized function $H : \mathcal{R} \to \mathbb{R}$ satisfies $\epsilon$-differential privacy if, for all pairs $R, R' \in \mathcal{R}$ with $R \sim R'$ and all $X \subseteq \mathbb{R}$:

$$e^{-\epsilon} \le \frac{\Pr[H(R) \in X]}{\Pr[H(R') \in X]} \le e^{\epsilon}$$

($\epsilon$-indistinguishability)

slide-11
SLIDE 11

Differential Privacy

Overview

(oblivious case)

slide-14
SLIDE 14

Differential Privacy Noise addition

Noise addition

Laplacian distribution:

$$\mathrm{Lap}(x \mid b) = \frac{1}{2b} \exp\left(-\frac{|x|}{b}\right)$$

Theorem (Dwork06)

For $Q : \mathcal{R} \to \mathbb{R}$, the randomized mechanism $H$ that adds noise with distribution $\mathrm{Lap}(\Delta_Q / \epsilon)$ enjoys $\epsilon$-differential privacy.

The noise scale $\Delta_Q / \epsilon$ grows as $\Delta_Q$ increases (↑) or as $\epsilon$ decreases (↓).
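The theorem above can be illustrated with a minimal Python sketch. The function names (`laplace_pdf`, `sample_laplace`, `laplace_mechanism`) and the example count of 42 are illustrative assumptions, not part of the talk; the sampler relies on the standard fact that the difference of two independent exponentials with mean b is Laplace-distributed with scale b.

```python
import math
import random


def laplace_pdf(x: float, b: float) -> float:
    """Density Lap(x | b) = exp(-|x| / b) / (2b)."""
    return math.exp(-abs(x) / b) / (2.0 * b)


def sample_laplace(b: float, rng: random.Random) -> float:
    """Draw one Laplace(0, b) sample.

    The difference of two independent exponentials with mean b
    is Laplace-distributed with scale b.
    """
    return rng.expovariate(1.0 / b) - rng.expovariate(1.0 / b)


def laplace_mechanism(true_answer: float, sensitivity: float,
                      epsilon: float, rng: random.Random) -> float:
    """Return the true answer plus Lap(sensitivity / epsilon) noise."""
    scale = sensitivity / epsilon
    return true_answer + sample_laplace(scale, rng)


# Hypothetical usage: a count query (sensitivity 1) released with epsilon = 0.5.
rng = random.Random(0)  # fixed seed only to make the sketch repeatable
noisy_count = laplace_mechanism(true_answer=42, sensitivity=1, epsilon=0.5, rng=rng)
```

With sensitivity 1 and epsilon 0.5 the noise scale is 2, so for two true answers that differ by at most 1 the ratio of output densities stays within $e^{0.5}$ at every point.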

slide-15
SLIDE 15

Differential Privacy Sensitivity

Sensitivity

Definition (Sensitivity Dwork06)

Given a query $Q : \mathcal{R} \to \mathbb{R}$, the sensitivity of $Q$, denoted by $\Delta_Q$, is defined as:

$$\Delta_Q = \sup_{R \sim R'} |Q(R) - Q(R')|$$
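For a very small universe of tuples, the supremum in this definition can be checked by exhaustive enumeration. The sketch below assumes that two relations are adjacent when one is obtained from the other by adding a single tuple; the helper names and the three-tuple universe are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def powerset(universe):
    """Yield every subset of the universe as a frozenset."""
    items = list(universe)
    for k in range(len(items) + 1):
        for subset in combinations(items, k):
            yield frozenset(subset)


def brute_force_sensitivity(query, universe):
    """Largest |Q(R) - Q(R')| over relations that differ in exactly one tuple."""
    worst = 0
    for r in powerset(universe):
        for t in universe:
            if t not in r:
                r_prime = r | {t}  # adjacent relation: one extra tuple
                worst = max(worst, abs(query(r) - query(r_prime)))
    return worst


# Hypothetical universe of (item, price) tuples.
universe = {("oil", 1), ("salt", 3), ("rice", 5)}


def count(r):
    return len(r)


def total_price(r):
    return sum(price for _, price in r)


count_sensitivity = brute_force_sensitivity(count, universe)
sum_sensitivity = brute_force_sensitivity(total_price, universe)
```

Enumeration is exponential in the size of the universe, so this is only a way to sanity-check a bound on toy inputs, not a practical analysis.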

slide-17
SLIDE 17

Contribution

Contribution

◮ a compositional method to compute a bound on the sensitivity of a query expressed in relational algebra

◮ constraints used to obtain the exact sensitivity

slide-18
SLIDE 18

Differential Privacy for Relational Algebra

slide-20
SLIDE 20

Differential Privacy for Relational Algebra Relational Algebra

Relational Algebra - A Formal SQL

◮ $\mathcal{T}$: universe of tuples
◮ Relation $R$: a set of tuples
◮ $\mathcal{R}$: universe of relations

Definition (Relation Schema)

$name(a_1 : D_1, a_2 : D_2, \dots, a_n : D_n)$

Example

Items { Item : String, Price : Int, Cost : Int }

Item   Price   Cost
Oil    100     10
Salt   50      11

slide-21
SLIDE 21

Differential Privacy for Relational Algebra Constraints

Constraints

CREATE TABLE products (
    product_no integer,
    name text,
    price numeric CHECK (price > 0)
);

slide-23
SLIDE 23

Differential Privacy for Relational Algebra Constraints

Constrained Schema

$\mathcal{T}(C)$, $\mathcal{R}(C) = 2^{\mathcal{T}(C)}$

Items { Item : String, Price : Int, Cost : Int }
{ < Cost ≤ 1000, Cost ≤ Price ≤ 1000 }

c-schema: schema + set of constraints $C$

Transformation from c-schema to c-schema

slide-24
SLIDE 24

Differential Privacy for Relational Algebra Sensitivity Constrained

Sensitivity Constrained

Definition (Sensitivity constrained)

Given $f : (X, d_X) \to (Y, d_Y)$ and a set of constraints $C$ on $X$:

$$\Delta_f(C) = \sup_{\substack{x, x' \in sol(C) \\ x \ne x'}} \frac{d_Y(f(x), f(x'))}{d_X(x, x')}$$

slide-26
SLIDE 26

Differential Privacy for Relational Algebra Metric Spaces

Metric Spaces

Adjacency relation $(\mathcal{R}, \sim)$

[Figure: Hamming graph, with adjacent relations joined by ∼]

Definition (Hamming distance $d_H$)

Given $R, R' \in \mathcal{R}$:

$$d_H(R, R') = |R \ominus R'| = |(R \setminus R') \cup (R' \setminus R)|$$

slide-28
SLIDE 28

Differential Privacy for Relational Algebra Metric Spaces

Metric Spaces

Definition (n-Hamming Distance dnH)

Given $R, R' \in \mathcal{R}^n$:

$$d_{nH}(R, R') = \max(d_H(R_1, R'_1), \dots, d_H(R_n, R'_n))$$

Definition (Euclidean Distance $d_E$)

Given $x, x' \in \mathbb{R}$:

$$d_E(x, x') = |x - x'|$$
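The three distances defined on these slides translate directly into set operations. The following Python sketch assumes relations are modelled as sets of tuples; the function names and sample relations are illustrative only.

```python
def hamming(r, r_prime):
    """d_H: size of the symmetric difference of two relations."""
    return len(r ^ r_prime)


def n_hamming(rs, rs_prime):
    """d_nH: largest component-wise Hamming distance between two n-tuples of relations."""
    return max(hamming(a, b) for a, b in zip(rs, rs_prime))


def euclidean(x, x_prime):
    """d_E on the reals: absolute difference."""
    return abs(x - x_prime)


# Hypothetical relations over (Name, Age) tuples.
r1 = {("John", 30), ("Alice", 45)}
r1_prime = {("John", 30), ("Eve", 20)}
r2 = {("Fiat",), ("Ford",)}
r2_prime = {("Fiat",), ("Ford",), ("Bmw",)}

d_single = hamming(r1, r1_prime)                # Alice removed, Eve added
d_pair = n_hamming((r1, r2), (r1_prime, r2_prime))
```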

slide-31
SLIDE 31

Operators

Structure of a Query

$$(\mathcal{R}^n, d_{nH}) \xrightarrow{Op} \cdots \xrightarrow{Op} (\mathcal{R}^n, d_{nH}) \xrightarrow{{}_{A}\gamma_{F}} (\mathbb{R}, d_E)$$

$Op \in \{\cup, \cap, \setminus, \sigma_\varphi, \pi, \times, \bowtie\}$

slide-35
SLIDE 35

Operators Operators Sensitivity

Operators Sensitivity

$$\Delta_{op}(C) = \sup_{\substack{R, R' \in \mathcal{R}(C) \\ R \ne R'}} \frac{d_H(op(R), op(R'))}{d_H(R, R')} = \min\big(\Delta_{op}(\emptyset),\ diam(C \otimes C_{op})\big)$$

slide-36
SLIDE 36

Operators Operators Sensitivity

Other operators

op      Δ    Schema
∪       2    (A, C1 ∨ C2)
∩       2    (A, C1 ∧ C2)
\       2    (A, C1 ∧ (¬C2))
σ_φ     1    (A, C ∧ φ)
π_A′    1    (A′, C)

slide-39
SLIDE 39

Operators Cartesian product

Cartesian product

$\times : (\mathcal{R}^2, d_{2H}) \to (\mathcal{R}, d_H)$

Name    Age   Height
John    30    180
Alice   45    160

×

Car    Owner
Fiat   Alice
Ford   Alice

=

Name    Age   Height   Car    Owner
John    30    180      Fiat   Alice
John    30    180      Ford   Alice
Alice   45    160      Fiat   Alice
Alice   45    160      Ford   Alice

The (unrestricted) cartesian product has unbounded sensitivity.

Join ⋈

$$R \bowtie_{R.a_i = T.a_i} T = \sigma_{R.a_i = T.a_i}(R \times T)$$
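A small experiment makes the unbounded-sensitivity claim concrete: adding one tuple to the left relation changes the product by as many tuples as the right relation contains, so no constant bounds the ratio. The Python sketch below reuses the tables from this slide; the extra person and the 50-row car relation are hypothetical additions for illustration.

```python
from itertools import product


def cartesian(r, t):
    """Cartesian product of two relations, concatenating the paired tuples."""
    return {a + b for a, b in product(r, t)}


def hamming(r, r_prime):
    """d_H: size of the symmetric difference."""
    return len(r ^ r_prime)


people = {("John", 30, 180), ("Alice", 45, 160)}
cars = {("Fiat", "Alice"), ("Ford", "Alice")}

# Adjacent input: one extra (hypothetical) person, cars unchanged,
# so the input distance under d_2H is max(1, 0) = 1.
people_prime = people | {("Eve", 20, 170)}
change_small = hamming(cartesian(people, cars), cartesian(people_prime, cars))

# The output change scales with the size of the other relation.
many_cars = {("Car%d" % i, "Alice") for i in range(50)}
change_large = hamming(cartesian(people, many_cars), cartesian(people_prime, many_cars))
```

Here the output distance equals the number of rows in the car relation, which is why the slides treat restricted forms of the product separately.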

slide-41
SLIDE 41

Operators Query Sensitivity

Intermediate query sensitivity

◮ $op$ any of $\cup, \cap, \setminus, \sigma, \pi, \times, \times_1$
◮ $op : (\mathcal{R}^n, d_{nH}) \to (\mathcal{R}, d_H)$
◮ $C_{op}$: constraint obtained after $op$

$$S(Id) = \min(1,\ diam(C_{Id})) \quad \text{(base case)}$$

$$S(op \circ Q) = \min\big(\Delta_{op} \cdot S(Q),\ diam(C_{op \circ Q})\big) \quad \text{if } n = 1$$

$$S(op \circ (Q_1, Q_2)) = \min\big(\Delta_{op} \cdot \max(S(Q_1), S(Q_2)),\ diam(C_{op \circ (Q_1, Q_2)})\big) \quad \text{if } n = 2$$
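The recursion for S can be sketched over a tiny query tree. In this Python sketch the classes `Id` and `Op` are hypothetical, the operator sensitivities are taken from the "Other operators" table, and each `diam` value is assumed to be supplied by an external constraint solver (it defaults to infinity, meaning no useful bound). The product operators are left out because the unrestricted product has unbounded sensitivity.

```python
import math
from dataclasses import dataclass
from typing import Tuple

# Operator sensitivities as listed in the "Other operators" table.
DELTA = {"union": 2, "intersect": 2, "difference": 2, "select": 1, "project": 1}


@dataclass(frozen=True)
class Id:
    """Base relation; diam is diam(C_Id), assumed to be given."""
    diam: float = math.inf


@dataclass(frozen=True)
class Op:
    """Operator applied to one or two sub-queries; diam is diam(C_{op o Q})."""
    name: str
    args: Tuple
    diam: float = math.inf


def S(q):
    """Intermediate sensitivity, following the three cases of the definition."""
    if isinstance(q, Id):
        return min(1, q.diam)
    # For n = 1 the max ranges over a single sub-query, so it is just S(Q).
    inner = max(S(arg) for arg in q.args)
    return min(DELTA[q.name] * inner, q.diam)


selection = Op("select", (Id(),))
union = Op("union", (Id(), Id()))
nested = Op("intersect", (union, Id()))
bounded = Op("intersect", (union, Id()), diam=3)
```

Without constraints the bound doubles at every binary operator; a finite diameter caps it, which is the role the constraint system plays in the talk.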

slide-42
SLIDE 42

Aggregation γ

Aggregation γ

$_{\{a_1, \dots, a_m\}}\gamma_{\{f_1, \dots, f_k\}} : (\mathcal{R}, d_H) \to (\mathcal{R}, d_H)$

◮ groups tuples with the same values of $a_i$
◮ computes $f_j$ for each group (count, max, min, avg, sum)
◮ returns a single tuple for each group, with $a_i$ and $f_j$

SELECT Car, Count(*), Avg(Height) FROM R GROUP BY Car

$_{\{Car\}}\gamma_{\{Count, Avg(Height)\}}$ applied to

Name    Age   Height   Car
Alice   45    160      Ford
John    30    180      Fiat
Frank   45    165      Bmw
Eve     20    170      Ford

=

Car    Count   Avg(Height)
Ford   2       165
Fiat   1       180
Bmw    1       165
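The grouping and aggregation steps listed above can be reproduced on the example table with a few lines of Python. The function name `group_count_avg_height` is a hypothetical helper written for this one query, not a general γ operator.

```python
from collections import defaultdict

# The example relation R with attributes (Name, Age, Height, Car).
rows = [
    ("Alice", 45, 160, "Ford"),
    ("John", 30, 180, "Fiat"),
    ("Frank", 45, 165, "Bmw"),
    ("Eve", 20, 170, "Ford"),
]


def group_count_avg_height(relation):
    """Group by Car, then compute Count(*) and Avg(Height) for each group."""
    heights_by_car = defaultdict(list)
    for _name, _age, height, car in relation:
        heights_by_car[car].append(height)
    # One output tuple per group: Car -> (count, average height).
    return {
        car: (len(heights), sum(heights) / len(heights))
        for car, heights in heights_by_car.items()
    }


result = group_count_avg_height(rows)
```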

slide-44
SLIDE 44

Aggregation γ Functions

Functions

Assume $_{\emptyset}\gamma_f$ and $R \sim R'$:

$\Delta_{count}(C) = 1$

$\Delta_{sum_{a_i}}(C) = \max\{|\sup(C, a_i)|, |\inf(C, a_i)|\}$

$\Delta_{avg_{a_i}}(C) = |\sup(C, a_i) - \inf(C, a_i)| \times \frac{1}{2}$

$\Delta_{max_{a_i}}(C) = |\sup(C, a_i) - \inf(C, a_i)|$

$\Delta_{min_{a_i}}(C) = |\sup(C, a_i) - \inf(C, a_i)|$

slide-45
SLIDE 45

Aggregation γ Functions

Functions

Assume $_{\emptyset}\gamma_f$ and $d_H(R, R') = n$:

$\Delta_{count}(C) = 1 \times n$

$\Delta_{sum_{a_i}}(C) = \max\{|\sup(C, a_i)|, |\inf(C, a_i)|\} \times n$

$\Delta_{avg_{a_i}}(C) = |\sup(C, a_i) - \inf(C, a_i)| \times \frac{n}{1+n}$

$\Delta_{max_{a_i}}(C) = |\sup(C, a_i) - \inf(C, a_i)|$

$\Delta_{min_{a_i}}(C) = |\sup(C, a_i) - \inf(C, a_i)|$
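These formulas are easy to encode once the bounds sup(C, a_i) and inf(C, a_i) are known. The Python sketch below assumes those bounds are passed in as plain numbers (how they are derived from C is outside its scope) and uses hypothetical function names; n defaults to 1, the adjacent-relations case.

```python
def delta_count(n=1):
    """Sensitivity of count at Hamming distance n."""
    return 1 * n


def delta_sum(sup, inf, n=1):
    """Sensitivity of sum over an attribute with values in [inf, sup]."""
    return max(abs(sup), abs(inf)) * n


def delta_avg(sup, inf, n=1):
    """Sensitivity of avg: the attribute range scaled by n / (1 + n)."""
    return abs(sup - inf) * n / (1 + n)


def delta_max(sup, inf):
    """Sensitivity of max: the whole attribute range, independent of n."""
    return abs(sup - inf)


def delta_min(sup, inf):
    """Sensitivity of min: the whole attribute range, independent of n."""
    return abs(sup - inf)


# Hypothetical attribute constrained to [0, 150].
avg_adjacent = delta_avg(150, 0)        # n = 1 gives the factor 1/2
sum_three = delta_sum(150, 0, n=3)
```

For n = 1 the factor n / (1 + n) is 1/2, which matches the adjacent-relations form of the avg formula.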

slide-47
SLIDE 47

Global Sensitivity

Global Sensitivity

Definition (global sensitivity)

The global sensitivity $GS$ of a query $\gamma_f(Q)$ is defined as:

$$GS(\gamma_f(Q)) = \begin{cases} \Delta_f(C_Q) \cdot S(Q) & \text{if } f = count, sum, avg \\ \Delta_f(C_Q) & \text{if } f = max, min \end{cases}$$

Theorem (Soundness and strictness)

The sensitivity bound computed by $GS(\cdot)$ is sound and strict. Namely:

$$GS(\gamma_f(Q)) = \Delta_{\gamma_f(Q)}$$

slide-51
SLIDE 51

Global Sensitivity Example

Example

c-schema $(\{Weight, Height\}, C_I)$

$C_I = \{Weight \in [0, 150] \land Height \in [0, 200]\}$

Query: $\gamma_{avg(Weight)}(\sigma_{Weight \le Height - 100}(R))$

$C_Q = \{Weight \in [0, 150] \land Height \in [0, 200] \land Weight \le Height - 100\}$

$$\Delta(C_I, \gamma_{avg(Weight)}) = \frac{|\max(C_I, Weight) - \min(C_I, Weight)|}{2} = 75$$

$$\Delta(C_Q, \gamma_{avg(Weight)}) = \frac{|\max(C_Q, Weight) - \min(C_Q, Weight)|}{2} = 50$$
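The two values in this example can be re-derived by searching the constrained domain. The Python sketch below assumes integer-valued attributes and replaces a real constraint solver with a plain grid search, which suffices here because the bounds are attained at integer points; the names `weight_range` and `avg_sensitivity` are hypothetical.

```python
def weight_range(constraint):
    """Smallest and largest Weight among integer (Weight, Height) points satisfying the constraint."""
    weights = [
        w
        for w in range(0, 151)      # Weight in [0, 150]
        for h in range(0, 201)      # Height in [0, 200]
        if constraint(w, h)
    ]
    return min(weights), max(weights)


def avg_sensitivity(constraint):
    """|max - min| / 2 for Weight, the avg sensitivity for adjacent relations."""
    low, high = weight_range(constraint)
    return abs(high - low) / 2


def c_i(w, h):
    """C_I adds nothing beyond the attribute ranges."""
    return True


def c_q(w, h):
    """C_Q adds the selection predicate Weight <= Height - 100."""
    return w <= h - 100


delta_ci = avg_sensitivity(c_i)
delta_cq = avg_sensitivity(c_q)
```

The selection predicate together with Height ≤ 200 forces Weight ≤ 100, which is where the tighter bound comes from.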

slide-52
SLIDE 52

Future work

Future Work


slide-53
SLIDE 53

Future work Join datasets

Join datasets

Join ⊲ ⊳

◮ $\times_n$: product with blocks of a fixed size $n$, to obtain sensitivity $n$; policies are needed to pick these representative elements

◮ $\times_\gamma$: a single record is built as an aggregation of the relation, thus falling into the case of $\times_1$ sensitivity

◮ a mix of the two approaches could be considered, building $n$ aggregations, possibly using the operator $_{\{a_i\}}\gamma_f$

slide-54
SLIDE 54

Future work Compositional ǫ analysis

Compositional ǫ analysis

Sequential composition: given queries $Q_i$, each providing $\epsilon_i$-differential privacy, $Q_n \circ \dots \circ Q_1$ provides $(\sum_i \epsilon_i)$-differential privacy.

Parallel composition: given queries $Q_i$, each providing $\epsilon$-differential privacy, their parallel application to disjoint subsets of the input again provides $\epsilon$-differential privacy.

Extend static analysis to compute $\epsilon$ and optimize to parallelize
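The two composition rules amount to simple budget arithmetic. In the Python sketch below the function names are hypothetical; the parallel rule is written as a maximum, the standard generalization to unequal budgets, which reduces to ε when every query uses the same ε as stated on the slide.

```python
def sequential_epsilon(epsilons):
    """Budget spent when the queries are applied one after another to the same data."""
    return sum(epsilons)


def parallel_epsilon(epsilons):
    """Budget spent when each query runs on its own disjoint part of the input."""
    return max(epsilons)


# Hypothetical budgets for three queries.
total_sequential = sequential_epsilon([0.1, 0.2, 0.3])
total_parallel = parallel_epsilon([0.5, 0.5, 0.5])
```

An analysis that can prove two sub-queries touch disjoint parts of the input may charge the maximum instead of the sum, which is the optimization the slide points to.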

slide-55
SLIDE 55

Future work Comparison of metric spaces

Comparison of metric spaces

Distance $d_{nH}$:

$$d_{nH}((R_1, \dots, R_n), (R'_1, \dots, R'_n)) = \max(d_H(R_1, R'_1), \dots, d_H(R_n, R'_n))$$

Initially we used the Manhattan distance:

$$d_{2H}((R_1, R_2), (R_3, R_4)) = d_H(R_1, R_3) + d_H(R_2, R_4)$$

◮ lower sensitivities on many operators: ∪, ∩, \ all had sensitivity 1
◮ did not allow us to compute the real sensitivity, only a bound

Explore the comparison further
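The effect of the metric on operator sensitivity can be seen on a single pair of inputs to union. The Python sketch below assumes the Manhattan variant sums the component-wise Hamming distances; the function names and the one-tuple relations are illustrative.

```python
def hamming(r, r_prime):
    """d_H: size of the symmetric difference."""
    return len(r ^ r_prime)


def d_max(pair, pair_prime):
    """Max-based distance on pairs of relations (the d_2H used in the talk)."""
    return max(hamming(pair[0], pair_prime[0]), hamming(pair[1], pair_prime[1]))


def d_sum(pair, pair_prime):
    """Manhattan-style distance on pairs of relations (the earlier choice)."""
    return hamming(pair[0], pair_prime[0]) + hamming(pair[1], pair_prime[1])


# One tuple is dropped from each input relation.
before = ({("a",)}, {("b",)})
after = (set(), set())

output_change = hamming(before[0] | before[1], after[0] | after[1])
ratio_under_max = output_change / d_max(before, after)
ratio_under_sum = output_change / d_sum(before, after)
```

On this pair the union changes by two tuples: that is a ratio of 2 under the max-based distance and 1 under the Manhattan one, in line with the sensitivities quoted for ∪ under each metric.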

slide-56
SLIDE 56

Future work Comparison of metric spaces

Thanks
