What is (and is not) Privacy?
CompSci 590.03
Instructor: Ashwin Machanavajjhala
Lecture 3 : 590.03 Fall 16

Outline of lecture
• Recap: Differential Privacy
• Exercise: Differentially Private K-means Clustering
  – Consistency
• Privacy Problem Statement
• What privacy is not …
• What is privacy?

Differential Privacy [Dwork ICALP 2006]
For every pair of inputs $D_1$, $D_2$ that differ in one row, and for every output $O$, the adversary should not be able to distinguish between $D_1$ and $D_2$ based on $O$:
$$\log \frac{\Pr[A(D_1) = O]}{\Pr[A(D_2) = O]} \le \varepsilon \qquad (\varepsilon > 0)$$

Privacy Parameter ε
For every pair of inputs $D_1$, $D_2$ that differ in one row, and for every output $O$:
$$\Pr[A(D_1) = O] \le e^{\varepsilon} \, \Pr[A(D_2) = O]$$
ε controls the degree to which $D_1$ and $D_2$ can be distinguished. The smaller the ε, the stronger the privacy (and the worse the utility).

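As a small illustration of the inequality above, the following sketch checks it for a mechanism with finitely many outputs. The two output distributions and the names `satisfies_dp` and `smallest_epsilon` are made up for this example; they are not from the lecture.

```python
import math
from typing import Dict

def satisfies_dp(p1: Dict[str, float], p2: Dict[str, float], eps: float) -> bool:
    """Check Pr[A(D1)=O] <= e^eps * Pr[A(D2)=O] in both directions, for every output O."""
    bound = math.exp(eps)
    return all(p1[o] <= bound * p2[o] and p2[o] <= bound * p1[o] for o in p1)

def smallest_epsilon(p1: Dict[str, float], p2: Dict[str, float]) -> float:
    """Smallest eps for which the pair of distributions passes the check above."""
    return max(abs(math.log(p1[o] / p2[o])) for o in p1)

# Hypothetical output distributions of A on two neighboring inputs D1 and D2.
on_d1 = {"a": 0.6, "b": 0.4}
on_d2 = {"a": 0.4, "b": 0.6}

ok_at_half = satisfies_dp(on_d1, on_d2, 0.5)    # e^0.5 * 0.4 is about 0.66 >= 0.6
ok_at_third = satisfies_dp(on_d1, on_d2, 0.3)   # e^0.3 * 0.4 is about 0.54 <  0.6
tightest = smallest_epsilon(on_d1, on_d2)       # log(0.6 / 0.4)
```

A real differential privacy guarantee has to hold for every pair of neighboring inputs, not just the single pair checked here.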
Laplace Mechanism
• The researcher sends a query q to the database.
• The true answer is q(D); the researcher receives q(D) + η, where η is random noise.
• η is drawn from the Laplace distribution Lap(λ), with density $h(\eta) \propto \exp(-|\eta| / \lambda)$, mean 0, and variance $2\lambda^2$.
• Privacy depends on the λ parameter.
[Figure: density of the Laplace distribution Lap(λ), plotted for η from −10 to 10]

How much noise for privacy? [Dwork et al., TCC 2006]
Sensitivity: Consider a query $q: I \to \mathbb{R}$. $S(q)$ is the smallest number such that for any neighboring tables $D$, $D'$,
$$|q(D) - q(D')| \le S(q)$$
Thm: If the sensitivity of the query is $S$, then adding Laplace noise with the following scale guarantees ε-differential privacy:
$$\lambda = S / \varepsilon$$

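The theorem above can be sketched in a few lines of standard-library Python. The function names are hypothetical, and the sampler uses the fact that the difference of two independent exponential variables with mean λ is distributed as Lap(λ); treat this as an illustration rather than a hardened implementation.

```python
import random

def sample_laplace(scale: float, rng: random.Random) -> float:
    """Draw from Lap(scale) as the difference of two exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def laplace_mechanism(true_answer: float, sensitivity: float,
                      eps: float, rng: random.Random) -> float:
    """Return q(D) + eta with eta ~ Lap(S / eps)."""
    if eps <= 0 or sensitivity <= 0:
        raise ValueError("eps and sensitivity must be positive")
    return true_answer + sample_laplace(sensitivity / eps, rng)

# Example: a counting query has sensitivity 1, so eps = 0.5 gives scale lambda = 2.
rng = random.Random(0)
noisy_count = laplace_mechanism(true_answer=120, sensitivity=1.0, eps=0.5, rng=rng)
```

Halving ε doubles the noise scale, which is the privacy versus utility trade-off controlled by ε.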
Sequential Composition
• If $M_1, M_2, \ldots, M_k$ are algorithms that access a private database $D$ such that each $M_i$ satisfies $\varepsilon_i$-differential privacy,
  then the combination of their outputs satisfies ε-differential privacy with $\varepsilon = \varepsilon_1 + \ldots + \varepsilon_k$.

Parallel Composition
• If $M_1, M_2, \ldots, M_k$ are algorithms that access disjoint databases $D_1, D_2, \ldots, D_k$ such that each $M_i$ satisfies $\varepsilon_i$-differential privacy,
  then the combination of their outputs satisfies ε-differential privacy with $\varepsilon = \max\{\varepsilon_1, \ldots, \varepsilon_k\}$.

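The two composition rules amount to simple budget arithmetic. The sketch below uses hypothetical helper names and a made-up scenario: a histogram over three disjoint groups of rows (parallel composition), followed by one more query on the whole database (sequential composition).

```python
from typing import Sequence

def _check(epsilons: Sequence[float]) -> None:
    if not epsilons or any(e <= 0 for e in epsilons):
        raise ValueError("need at least one epsilon, all positive")

def sequential_epsilon(epsilons: Sequence[float]) -> float:
    """Total privacy loss when all mechanisms access the same database."""
    _check(epsilons)
    return sum(epsilons)

def parallel_epsilon(epsilons: Sequence[float]) -> float:
    """Total privacy loss when each mechanism accesses a disjoint part of the data."""
    _check(epsilons)
    return max(epsilons)

# Three noisy counts over disjoint groups, each 0.5-differentially private.
histogram_eps = parallel_epsilon([0.5, 0.5, 0.5])
# One further 0.25-differentially private query over the full database.
total_eps = sequential_epsilon([histogram_eps, 0.25])
```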
Postprocessing
• If $M_1$ is an ε-differentially private algorithm that accesses a private database $D$,
  then outputting $M_2(M_1(D))$ also satisfies ε-differential privacy, for any $M_2$ that does not otherwise access $D$.

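A common use of this property is cleaning up a noisy answer. In the hypothetical sketch below, `postprocess_count` plays the role of M_2: it sees only the already-noisy output of M_1, never the database, so it costs no additional privacy budget.

```python
def postprocess_count(noisy_count: float) -> int:
    """Round a noisy count to an integer and clamp it to be non-negative.

    The input is assumed to be the output of an eps-differentially private
    mechanism; this function never touches the private data itself.
    """
    return max(0, round(noisy_count))

cleaned_negative = postprocess_count(-3.2)
cleaned_positive = postprocess_count(4.6)
```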
Case Study: K-means Clustering

K-means
• Partition a set of points $x_1, x_2, \ldots, x_n$ into k clusters $S_1, S_2, \ldots, S_k$ such that the following is minimized:
$$\sum_{i=1}^{k} \sum_{x_j \in S_i} \lVert x_j - \mu_i \rVert^2$$
  where $\mu_i$ is the mean of the cluster $S_i$.

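To make the objective concrete, this sketch evaluates the within-cluster sum of squared distances for a given partition. The function names and the toy data are illustrative only.

```python
from typing import List, Sequence

Point = Sequence[float]

def cluster_mean(cluster: Sequence[Point]) -> List[float]:
    """Coordinate-wise mean of a non-empty cluster."""
    dim = len(cluster[0])
    return [sum(p[d] for p in cluster) / len(cluster) for d in range(dim)]

def kmeans_objective(clusters: Sequence[Sequence[Point]]) -> float:
    """Sum over clusters of squared Euclidean distances to the cluster mean."""
    total = 0.0
    for cluster in clusters:
        if not cluster:
            continue  # an empty cluster contributes nothing
        mu = cluster_mean(cluster)
        for p in cluster:
            total += sum((x - m) ** 2 for x, m in zip(p, mu))
    return total

# Two clusters: the first has mean (1, 0), the second is a single point.
cost = kmeans_objective([[(0.0, 0.0), (2.0, 0.0)], [(5.0, 5.0)]])
```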
K-means Algorithm:
• Initialize a set of k centers
• Repeat
    Assign each point to its nearest center
    Recompute the set of centers
  until convergence …
• Output final set of k centers
Tutorial: Differential Privacy in the Wild, Module 2

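The loop above can be sketched as follows. This is a plain, non-private version; the choice to keep an empty cluster's previous center and the `max_iters` cap are assumptions of this example, not part of the slide.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]

def nearest(point: Point, centers: Sequence[Point]) -> int:
    """Index of the center with the smallest squared Euclidean distance to `point`."""
    return min(range(len(centers)),
               key=lambda i: sum((x - c) ** 2 for x, c in zip(point, centers[i])))

def kmeans(points: Sequence[Point], centers: Sequence[Point],
           max_iters: int = 100) -> List[Point]:
    centers = [tuple(float(v) for v in c) for c in centers]
    for _ in range(max_iters):
        # Assign each point to its nearest center.
        clusters: List[List[Point]] = [[] for _ in centers]
        for p in points:
            clusters[nearest(p, centers)].append(p)
        # Recompute the centers; an empty cluster keeps its previous center.
        new_centers = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else old
            for cl, old in zip(clusters, centers)
        ]
        if new_centers == centers:  # converged: centers did not move
            break
        centers = new_centers
    return centers

final_centers = kmeans([(0, 0), (0, 2), (10, 0), (10, 2)], [(0, 0), (10, 0)])
```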
Differentially Private K-means [BDMN 05]
• Suppose we fix the number of iterations to T
• In each iteration (given a set of centers):
    1. Assign each point to its nearest center to form clusters
    2. Noisily compute the size of each cluster
    3. Compute noisy sums of points in each cluster

Differentially Private K-means
• Suppose we fix the number of iterations to T
  Each iteration uses ε/T privacy budget; total privacy loss is ε
• In each iteration (given a set of centers):
    1. Assign each point to its nearest center to form clusters
    2. Noisily compute the size of each cluster
    3. Compute noisy sums of points in each cluster

Differentially Private K-means
• Suppose we fix the number of iterations to T
  Exercise: Which of these steps expends privacy budget?
• In each iteration (given a set of centers):
    1. Assign each point to its nearest center to form clusters
    2. Noisily compute the size of each cluster
    3. Compute noisy sums of points in each cluster

Differentially Private K-means
• Suppose we fix the number of iterations to T
  Exercise: Which of these steps expends privacy budget?
• In each iteration (given a set of centers):
    1. Assign each point to its nearest center to form clusters: NO
    2. Noisily compute the size of each cluster: YES
    3. Compute noisy sums of points in each cluster: YES

Differentially Private K-means
• Suppose we fix the number of iterations to T
  What is the sensitivity?
• In each iteration (given a set of centers):
    1. Assign each point to its nearest center to form clusters
    2. Noisily compute the size of each cluster: sensitivity 1
    3. Compute noisy sums of points in each cluster: sensitivity is the domain size

Differentially Private K-means
• Suppose we fix the number of iterations to T
  Each iteration uses ε/T privacy budget; total privacy loss is ε
• In each iteration (given a set of centers):
    1. Assign each point to its nearest center to form clusters
    2. Noisily compute the size of each cluster: add Laplace(2T/ε) noise
    3. Compute noisy sums of points in each cluster: add Laplace(2T |dom|/ε) noise
  The factor 2 arises because the per-iteration budget ε/T is split evenly between steps 2 and 3, so each step runs with ε/(2T) and noise scale λ = S · 2T/ε. Within a step the clusters are disjoint, so by parallel composition the k noisy answers together cost only ε/(2T).

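The iteration above can be sketched for one-dimensional data. This is an illustration under stated assumptions, not the [BDMN 05] implementation: points are assumed to lie in [0, dom_size] so that a cluster sum has sensitivity dom_size, and clamping the new center to the domain and keeping the old center when the noisy count is below 1 are post-processing choices made for this example. All names are hypothetical.

```python
import random
from typing import List, Sequence

def lap(scale: float, rng: random.Random) -> float:
    """Lap(scale) sample: difference of two exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def dp_kmeans_1d(points: Sequence[float], centers: Sequence[float], eps: float,
                 T: int, dom_size: float, rng: random.Random) -> List[float]:
    count_scale = 2.0 * T / eps              # sensitivity 1, budget eps/(2T)
    sum_scale = 2.0 * T * dom_size / eps     # sensitivity dom_size, budget eps/(2T)
    centers = list(centers)
    k = len(centers)
    for _ in range(T):
        # Step 1: assignment uses no budget; nothing is released from it directly.
        counts = [0] * k
        sums = [0.0] * k
        for x in points:
            i = min(range(k), key=lambda j: abs(x - centers[j]))
            counts[i] += 1
            sums[i] += x
        for i in range(k):
            noisy_count = counts[i] + lap(count_scale, rng)   # step 2
            noisy_sum = sums[i] + lap(sum_scale, rng)         # step 3
            # Post-processing of noisy values only: no extra privacy cost.
            if noisy_count >= 1.0:
                centers[i] = min(max(noisy_sum / noisy_count, 0.0), dom_size)
    return centers

data = [0.1] * 500 + [0.9] * 500
private_centers = dp_kmeans_1d(data, [0.3, 0.7], eps=10.0, T=5,
                               dom_size=1.0, rng=random.Random(0))
```

With large, well-separated clusters the noise is small relative to the counts and sums, so the noisy centers stay close to the true means; with small clusters the same noise can dominate.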
Results (T = 10 iterations, random initialization)
[Figure: two panels, "Original K-means algorithm" and "Laplace K-means algorithm"]
• Even though we noisily compute centers, Laplace K-means can distinguish clusters that are far apart.
• Since we add noise to the sums with sensitivity proportional to |dom|, Laplace K-means can't distinguish small clusters that are close by.

Consistency

Statistical Databases
• Individuals with sensitive data: Person 1, Person 2, Person 3, …, Person N, with records $r_1, r_2, r_3, \ldots, r_N$
• Data collectors (each holding a DB): Census, Google, Hospital
• Data analysts: Economists, Information Retrieval Researchers, Medical Researchers, Doctors, Recommendation Algorithms

Statistical Database Privacy
• Individuals: Person 1, Person 2, Person 3, …, Person N contribute records $r_1, r_2, r_3, \ldots, r_N$ to the DB held by the server.
• The server evaluates a function provided by the analyst.
• The output can disclose sensitive information about individuals.

Statistical Database Privacy
• Individuals: Person 1, Person 2, Person 3, …, Person N contribute records $r_1, r_2, r_3, \ldots, r_N$ to the DB held by the server.
• The server releases $f_{\text{private}}(DB, \varepsilon)$.
• Privacy for individuals (controlled by a parameter ε)

Statistical Database Privacy
• Individuals: Person 1, Person 2, Person 3, …, Person N contribute records $r_1, r_2, r_3, \ldots, r_N$ to the DB held by the server.
• The server releases $f_{\text{private}}(DB, \varepsilon)$.
• Utility for analyst

Statistical Database Privacy (untrusted collector)
• Individuals: Person 1, Person 2, Person 3, …, Person N hold records $r_1, r_2, r_3, \ldots, r_N$.
• The server wants to compute $f(DB)$.
• Individuals do not want the server to infer their records.

Statistical Database Privacy (untrusted collector)
• Individuals: Person 1, Person 2, Person 3, …, Person N hold records $r_1, r_2, r_3, \ldots, r_N$.
• Perturb records to ensure privacy for individuals and utility for the server.
• The server computes $f(DB^*)$ on the perturbed database $DB^*$.

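The slide does not say how records are perturbed. One classical option, swapped in here purely as an illustration, is randomized response on a single private bit per person: each individual reports the true bit with probability e^ε / (1 + e^ε) and the flipped bit otherwise, and the server debiases the aggregate. The function names and the simulated population are assumptions of this sketch.

```python
import math
import random
from typing import Sequence

def randomize_bit(bit: int, eps: float, rng: random.Random) -> int:
    """Run by each individual: keep the bit w.p. e^eps / (1 + e^eps), else flip it."""
    p_truth = math.exp(eps) / (1.0 + math.exp(eps))
    return bit if rng.random() < p_truth else 1 - bit

def estimate_fraction(reports: Sequence[int], eps: float) -> float:
    """Run by the server on the perturbed records: unbiased estimate of the fraction of 1s."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    p = math.exp(eps) / (1.0 + math.exp(eps))
    observed = sum(reports) / len(reports)
    # E[observed] = (1 - p) + true_fraction * (2p - 1); solve for true_fraction.
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)

rng = random.Random(0)
true_bits = [1] * 6000 + [0] * 14000            # simulated population, 30% ones
perturbed = [randomize_bit(b, 1.0, rng) for b in true_bits]
estimate = estimate_fraction(perturbed, 1.0)
```

The server never sees a true record, yet the aggregate estimate is close to the true fraction when the population is large.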