Differential Privacy: Basics (CompSci 590.03)



SLIDE 1

Differential Privacy: Basics

CompSci 590.03
Instructor: Ashwin Machanavajjhala

Lecture 2: 590.03 Fall 16

SLIDE 2

Outline of lecture

  • Differential Privacy
  • Basic Algorithms
    – Laplace Mechanism
  • Composition Theorems
  • Exercise

SLIDE 3

Differential Privacy

For every pair of inputs D1, D2 that differ in one row, and for every output O, the adversary should not be able to distinguish between any D1 and D2 based on any O:

$$\log \frac{\Pr[A(D_1) = O]}{\Pr[A(D_2) = O]} \le \varepsilon \qquad (\varepsilon > 0)$$

[Dwork, ICALP 2006]

SLIDE 4

Why pairs of datasets that differ in one row?

[Diagram: for every pair of inputs D1, D2 that differ in one row, and for every output O]

Simulate the presence or absence of a single record.

SLIDE 5

Why all pairs of datasets …?

[Diagram: for every pair of inputs D1, D2 that differ in one row, and for every output O]

Guarantee holds no matter what the other records are.

SLIDE 6

Why all outputs?

[Diagram: D1 and D2 are each mapped to the set of all outputs, e.g. A(D1) = O1, with probabilities P[A(D1) = O1], …, P[A(D2) = Ok]]

SLIDE 7

Should not be able to distinguish whether input was D1 or D2 no matter what the output.

[Diagram: D1 and D2 mapped to the set of outputs, highlighting the worst discrepancy in probabilities]

SLIDE 8

Privacy Parameter ε

For every pair of inputs D1, D2 that differ in one row, and for every output O:

$$\Pr[A(D_1) = O] \le e^{\varepsilon} \, \Pr[A(D_2) = O]$$

ε controls the degree to which D1 and D2 can be distinguished. The smaller the ε, the more the privacy (and the worse the utility, since more noise is needed).
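To make the role of ε concrete, the following is a minimal sketch that computes the smallest ε for which the inequality holds between two discrete output distributions. The function name `privacy_loss` and the two example distributions are hypothetical, chosen only for illustration.

```python
import math


def privacy_loss(p1, p2):
    """Smallest eps with p1[o] <= e^eps * p2[o] and p2[o] <= e^eps * p1[o] for all o."""
    worst = 0.0
    for o in set(p1) | set(p2):
        a, b = p1.get(o, 0.0), p2.get(o, 0.0)
        if a == 0.0 and b == 0.0:
            continue  # output impossible under both inputs: no constraint
        if a == 0.0 or b == 0.0:
            return math.inf  # one input rules this output out entirely
        worst = max(worst, abs(math.log(a / b)))
    return worst


# Hypothetical output distributions of A on two neighboring inputs.
dist_d1 = {"yes": 0.75, "no": 0.25}
dist_d2 = {"yes": 0.25, "no": 0.75}
eps = privacy_loss(dist_d1, dist_d2)
print(eps)  # log(3)
```

The check is symmetric because the definition quantifies over every ordered pair of neighboring inputs.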

SLIDE 9

Outline of Module 2

  • Differential Privacy
  • Basic Algorithms
    – Laplace Mechanism
  • Composition Theorems

SLIDE 10

Can deterministic algorithms satisfy differential privacy?

SLIDE 11

Non-trivial deterministic algorithms do not satisfy differential privacy

[Diagram: space of all inputs mapped to space of all outputs (at least 2 distinct outputs)]

SLIDE 12

Non-trivial deterministic algorithms do not satisfy differential privacy

Each input mapped to a distinct output.

SLIDE 13

There exist two inputs that differ in one entry mapped to different outputs.

[Diagram labels: Pr > 0, Pr = 0]
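A small sketch of this argument, assuming a hypothetical deterministic query `count_sick` and a made-up pair of neighboring databases: the output produced on one input has probability 1 there and probability 0 on the neighbor, so the log-ratio is unbounded.

```python
import math


def count_sick(db):
    """Deterministic query: exact number of 'Y' rows."""
    return sum(1 for row in db if row == "Y")


d1 = ["Y", "Y", "N", "Y"]
d2 = ["Y", "Y", "N"]  # neighbor of d1: one row removed

o = count_sick(d1)  # the output observed on d1
pr_d1 = 1.0 if count_sick(d1) == o else 0.0  # a deterministic A puts all mass on one output
pr_d2 = 1.0 if count_sick(d2) == o else 0.0

# log(Pr[A(d1) = o] / Pr[A(d2) = o]); infinite when the denominator is zero
loss = math.inf if pr_d2 == 0.0 else math.log(pr_d1 / pr_d2)
print(pr_d1, pr_d2, loss)
```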

SLIDE 14

Random Sampling …

… also does not satisfy differential privacy

[Diagram: inputs D1 and D2, output O]

Pr[D2 → O] = 0 implies

$$\log \frac{\Pr[D_1 \to O]}{\Pr[D_2 \to O]} = \infty$$
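As an illustration, the sketch below assumes one particular sampling scheme (publish each row independently with probability p) and enumerates its exact output distribution. The record names and the helper `sample_output_probs` are hypothetical. An output containing a row that exists only in D1 has positive probability under D1 and zero probability under D2.

```python
from itertools import product


def sample_output_probs(db, p):
    """Exact distribution of 'publish each row independently with probability p'."""
    probs = {}
    for keep in product([False, True], repeat=len(db)):
        out = tuple(row for row, k in zip(db, keep) if k)
        weight = 1.0
        for k in keep:
            weight *= p if k else (1.0 - p)
        probs[out] = probs.get(out, 0.0) + weight
    return probs


d1 = ["alice", "bob", "carol"]
d2 = ["alice", "bob"]  # neighbor of d1 without carol's row
o = ("carol",)         # an output that reveals carol's row

pr1 = sample_output_probs(d1, 0.5).get(o, 0.0)
pr2 = sample_output_probs(d2, 0.5).get(o, 0.0)
print(pr1, pr2)  # pr2 is 0, so the log-ratio is infinite
```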

SLIDE 15

Output Randomization

  • Add noise to answers such that:
    – Each answer does not leak too much information about the database.
    – Noisy answers are close to the original answers.

[Diagram: the researcher sends a query to the database; noise is added to the true answer]

SLIDE 16

Laplace Mechanism

[Diagram: the researcher sends query q to the database; the true answer q(D) is returned as q(D) + η]

[Figure: Laplace distribution Lap(λ), density plotted over η from −10 to 10]

$$h(\eta) \propto \exp(-|\eta| / \lambda)$$

Privacy depends on the λ parameter.
Mean: 0, Variance: $2\lambda^2$

SLIDE 17

How much noise for privacy?

Sensitivity: Consider a query q: I → R. S(q) is the smallest number s.t. for any neighboring tables D, D',

$$|q(D) - q(D')| \le S(q)$$

Thm: If the sensitivity of the query is S, then adding noise drawn from Lap(λ) with the following λ guarantees ε-differential privacy:

$$\lambda = S / \varepsilon$$

[Dwork et al., TCC 2006]
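A minimal sketch of the mechanism using only the Python standard library. The function names are hypothetical. The sampler relies on the fact that the difference of two independent exponential variables with mean λ is distributed as Lap(λ).

```python
import random


def laplace_noise(scale, rng=random):
    """Draw eta ~ Lap(scale) as the difference of two i.i.d. exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_mechanism(true_answer, sensitivity, epsilon, rng=random):
    """Return q(D) + eta with eta ~ Lap(S / epsilon)."""
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    return true_answer + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)  # seeded only so the sketch is repeatable
noisy = laplace_mechanism(true_answer=3, sensitivity=1, epsilon=0.5, rng=rng)
print(noisy)
```

Floating-point Laplace samplers are known to have subtle weaknesses, so this sketch is meant for illustration rather than for protecting real data.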

SLIDE 18

Sensitivity: COUNT query

D:

Disease (Y/N)
Y
Y
N
Y
N
N

  • Number of people having disease
  • Sensitivity = 1
  • Solution: 3 + η, where η is drawn from Lap(1/ε)
    – Mean = 0
    – Variance = $2/\varepsilon^2$
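The COUNT example can be sketched as follows, using the six Disease values from the table. The helper `noisy_count` and the ε values tried are assumptions made for illustration; the loop shows how the noise standard deviation √2/ε grows as ε shrinks.

```python
import math
import random

rows = ["Y", "Y", "N", "Y", "N", "N"]  # Disease column of D from the slide


def noisy_count(rows, epsilon, rng):
    """COUNT of 'Y' rows plus Lap(1/epsilon) noise (the sensitivity of COUNT is 1)."""
    true_count = sum(1 for r in rows if r == "Y")
    scale = 1.0 / epsilon
    eta = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + eta


rng = random.Random(7)  # arbitrary seed, for repeatability only
for epsilon in (0.1, 1.0, 10.0):
    std = math.sqrt(2.0) / epsilon  # square root of the variance 2 / eps^2
    answer = noisy_count(rows, epsilon, rng)
    print(f"eps={epsilon}: noisy count={answer:.2f}, noise std={std:.2f}")
```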

SLIDE 19

Sensitivity: SUM query

  • Suppose all values x are in [a,b]
  • Sensitivity = b (taking 0 ≤ a ≤ b, with neighbors differing by the presence or absence of one row)
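The sketch below checks the SUM sensitivity by brute force, under the assumption that neighboring databases differ by adding or removing one row. In that case the sensitivity is the largest magnitude a single value can take, which equals b when 0 ≤ a ≤ b. All function names and the sample data are hypothetical.

```python
def sum_sensitivity(a, b):
    """Sensitivity of SUM over values in [a, b] under add/remove-one-row neighbors."""
    if a > b:
        raise ValueError("need a <= b")
    return max(abs(a), abs(b))


def clamped_sum(values, a, b):
    """SUM after forcing every value into [a, b], so the bound above applies."""
    return sum(min(b, max(a, x)) for x in values)


def worst_removal_change(values, a, b):
    """Largest change in the clamped SUM caused by removing any single row."""
    full = clamped_sum(values, a, b)
    return max(
        abs(full - clamped_sum(values[:i] + values[i + 1:], a, b))
        for i in range(len(values))
    )


values = [2, 7, 10, 0, 5]  # hypothetical data in [0, 10]
print(worst_removal_change(values, 0, 10), sum_sensitivity(0, 10))
```

If neighbors are instead defined by replacing one row, the corresponding bound is b − a.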

SLIDE 20

Privacy of Laplace Mechanism

  • Consider neighboring databases D and D’
  • Consider some output O
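A standard argument for the two bullets above, sketched under the assumptions that A(D) = q(D) + η with η drawn from Lap(λ), that λ = S(q)/ε, and that Pr is read as a probability density. The middle inequality is the triangle inequality.

```latex
\frac{\Pr[A(D) = O]}{\Pr[A(D') = O]}
  = \frac{\exp(-|O - q(D)| / \lambda)}{\exp(-|O - q(D')| / \lambda)}
  = \exp\!\left( \frac{|O - q(D')| - |O - q(D)|}{\lambda} \right)
  \le \exp\!\left( \frac{|q(D) - q(D')|}{\lambda} \right)
  \le \exp\!\left( \frac{S(q)}{\lambda} \right)
  = e^{\varepsilon}
```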

SLIDE 21

Utility of Laplace Mechanism

  • Laplace mechanism works for any function that returns a real number
  • Error:

$$\mathbb{E}\big[(\text{true answer} - \text{noisy answer})^2\big] = \mathrm{Var}\big(\mathrm{Lap}(S(q)/\varepsilon)\big) = 2\,S(q)^2 / \varepsilon^2$$
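The error formula can be checked empirically. This is a rough simulation sketch: the function name, seed, trial count, and the choice S = 1, ε = 0.5 are all arbitrary assumptions.

```python
import random


def empirical_squared_error(sensitivity, epsilon, trials, rng):
    """Average of (true answer - noisy answer)^2, which is just eta^2."""
    scale = sensitivity / epsilon
    total = 0.0
    for _ in range(trials):
        eta = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
        total += eta * eta
    return total / trials


rng = random.Random(42)  # arbitrary seed, for repeatability only
S, eps = 1.0, 0.5
measured = empirical_squared_error(S, eps, 200_000, rng)
predicted = 2 * S**2 / eps**2  # the formula 2 S(q)^2 / eps^2
print(measured, predicted)
```

The measured value should land close to the predicted one, with the gap shrinking as the number of trials grows.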

SLIDE 22

Outline of Module 2

  • Differential Privacy
  • Basic Algorithms
    – Laplace & Exponential Mechanism
    – Randomized Response
  • Composition Theorems

SLIDE 23

Why Composition?

  • Reasoning about privacy of a complex algorithm is hard.
  • Helps software design
    – If building blocks are proven to be private, it would be easy to reason about privacy of a complex algorithm built entirely using these building blocks.

SLIDE 24

A bound on the number of queries

  • In order to ensure utility, a statistical database must leak some information about each individual
  • We can only hope to bound the amount of disclosure
  • Hence, there is a limit on the number of queries that can be answered

SLIDE 25

Dinur Nissim Result

  • A vast majority of records in a database of size n can be reconstructed when $n \log(n)^2$ queries are answered by a statistical database …

    … even if each answer has been arbitrarily altered to have up to $o(\sqrt{n})$ error.

[Dinur-Nissim, PODS 2003]

SLIDE 26

Sequential Composition

  • If $M_1, M_2, \ldots, M_k$ are algorithms that access a private database D such that each $M_i$ satisfies $\varepsilon_i$-differential privacy,

    then the combination of their outputs satisfies ε-differential privacy with $\varepsilon = \varepsilon_1 + \ldots + \varepsilon_k$
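One way to use sequential composition in software is a budget tracker that adds up the ε charged by each query on the same database. The class `PrivacyBudget` below is a hypothetical sketch, not an API from any particular library.

```python
class PrivacyBudget:
    """Tracks the total epsilon spent on one database under sequential composition."""

    def __init__(self, total_epsilon):
        self.total = total_epsilon
        self.spent = 0.0

    def charge(self, epsilon):
        """Record one eps_i-DP query; refuse it if the sum would exceed the budget."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if self.spent + epsilon > self.total + 1e-12:  # small slack for float rounding
            raise RuntimeError("privacy budget exceeded")
        self.spent += epsilon


budget = PrivacyBudget(total_epsilon=1.0)
for eps_i in (0.5, 0.3, 0.2):  # three queries M1, M2, M3 on the same database
    budget.charge(eps_i)
print(budget.spent)  # eps1 + eps2 + eps3
```

Once the budget is exhausted, any further query on the same database is refused.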

SLIDE 27

Privacy as Constrained Optimization

  • Three axes
    – Privacy
    – Error
    – Queries that can be answered
  • E.g.: Given a fixed set of queries and privacy budget ε, what is the minimum error that can be achieved?

SLIDE 28

Parallel Composition

  • If $M_1, M_2, \ldots, M_k$ are algorithms that access disjoint databases $D_1, D_2, \ldots, D_k$ such that each $M_i$ satisfies $\varepsilon_i$-differential privacy,

    then the combination of their outputs satisfies ε-differential privacy with $\varepsilon = \max\{\varepsilon_1, \ldots, \varepsilon_k\}$
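A histogram is a common illustration of parallel composition: each bin counts a disjoint subset of the rows, so releasing every bin with an ε-private count costs max{ε, …, ε} = ε rather than k·ε. The sketch below assumes neighbors differ by the presence or absence of one row. The function name and data are hypothetical.

```python
import random


def noisy_histogram(rows, bins, epsilon, rng):
    """Release one Lap(1/epsilon)-noised count per bin; bins partition the rows."""
    scale = 1.0 / epsilon
    result = {}
    for b in bins:
        true_count = sum(1 for r in rows if r == b)  # touches only rows in bin b
        eta = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
        result[b] = true_count + eta
    return result


rows = ["Y", "Y", "N", "Y", "N", "N"]
per_bin_eps = [0.5, 0.5]          # epsilon used for the "Y" bin and the "N" bin
total_eps = max(per_bin_eps)      # parallel composition: max, not sum
hist = noisy_histogram(rows, ["Y", "N"], 0.5, random.Random(11))
print(hist, total_eps)
```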

SLIDE 29

Postprocessing

  • If $M_1$ is an ε-differentially private algorithm that accesses a private database D,

    then outputting $M_2(M_1(D))$ also satisfies ε-differential privacy.
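A small sketch of postprocessing, with hypothetical function names: M1 is a noisy count, and M2 rounds and clamps M1's output without ever looking at the database, so the released value keeps the same ε.

```python
import random


def noisy_count(true_count, epsilon, rng):
    """M1: eps-DP count via Laplace noise with scale 1/epsilon."""
    scale = 1.0 / epsilon
    return true_count + rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def to_nonnegative_integer(noisy):
    """M2: sees only M1's output, never the database D."""
    return max(0, round(noisy))


rng = random.Random(3)  # arbitrary seed, for repeatability only
released = to_nonnegative_integer(noisy_count(3, epsilon=1.0, rng=rng))
print(released)
```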

SLIDE 30

Case Study: K-means Clustering

SLIDE 31

K-means

  • Partition a set of points $x_1, x_2, \ldots, x_n$ into k clusters $S_1, S_2, \ldots, S_k$ such that the following is minimized:

$$\sum_{i=1}^{k} \sum_{x_j \in S_i} \lVert x_j - \mu_i \rVert^2$$

    where $\mu_i$ is the mean of the cluster $S_i$.

SLIDE 32

K-means

Algorithm:

  • Initialize a set of k centers
  • Repeat
      Assign each point to its nearest center
      Recompute the set of centers
    Until convergence …
  • Output the final k centers
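The steps above can be sketched as a plain, non-private implementation. The function name, the iteration cap, the handling of empty clusters, and the sample points are assumptions made for illustration.

```python
def kmeans(points, centers, max_iters=100):
    """Non-private k-means on points given as equal-length tuples of floats."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(
                range(len(centers)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])),
            )
            clusters[nearest].append(p)
        # Recompute each center as the mean of its assigned points.
        new_centers = []
        for c, members in zip(centers, clusters):
            if members:
                new_centers.append(
                    tuple(sum(m[d] for m in members) / len(members) for d in range(len(c)))
                )
            else:
                new_centers.append(c)  # keep the center of an empty cluster unchanged
        if new_centers == centers:  # convergence: no center moved
            break
        centers = new_centers
    return centers


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
final = kmeans(points, centers=[(0.0, 0.0), (10.0, 10.0)])
print(final)  # [(0.0, 0.5), (10.0, 10.5)]
```

Both the assignment step and the recomputation step read the raw points directly, so this version offers no privacy guarantee.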

SLIDE 33

Exercise

  • What is a differentially private algorithm for releasing a k-means clustering (i.e., outputting the final set of k centers)?

SLIDE 34

Summary

  • Differentially private algorithms ensure an attacker can’t infer the presence or absence of a single record in the input based on any output.
  • Building blocks
    – Laplace mechanism
  • Composition rules help build complex algorithms using building blocks