SLIDE 1

No Free Lunch in Data Privacy

CompSci 590.03 Instructor: Ashwin Machanavajjhala

SLIDE 2

Outline

  • Background: Domain-independent privacy definitions
  • No Free Lunch in Data Privacy [Kifer-M SIGMOD ‘11]
  • Correlations: A case for domain-specific privacy definitions [Kifer-M SIGMOD ‘11]
  • Pufferfish Privacy Framework [Kifer-M PODS ‘12]
  • Defining Privacy for Correlated Data [Kifer-M PODS ‘12 & Ding-M ‘13] – Next class
SLIDE 3

Data Privacy Problem

[Figure: Individuals 1, 2, 3, …, N each contribute a record r1, r2, r3, …, rN to a server, which stores them in a database DB.]

Utility: the server should be able to compute useful functions (e.g., aggregate statistics) over DB.
Privacy: no breach about any individual.
SLIDE 4

Data Privacy in the real world

  • Medical: data collector = Hospital; third party (adversary) = Epidemiologist; private information = Disease; function (utility) = correlation between disease and geography
  • Genome analysis: data collector = Hospital; third party = Statistician/Researcher; private information = Genome; function = correlation between genome and disease
  • Advertising: data collector = Google/FB/Y!; third party = Advertiser; private information = Clicks/Browsing; function = number of clicks on an ad by age/region/gender
  • Social recommendations: data collector = Facebook; third party = Another user; private information = Friend links/profile; function = recommend other users or ads to users based on the social network
SLIDE 5

Semantic Privacy

... nothing about an individual should be learnable from the database that cannot be learned without access to the database.

– T. Dalenius, 1977
SLIDE 6

Can we achieve semantic privacy?

  • … or is there one (“precious…”) privacy definition to rule them all?

SLIDE 7

Defining Privacy

  • In order to allow utility, a non-negligible amount of information about an individual must be disclosed to the adversary.
  • Measuring the information disclosed to an adversary requires carefully modeling the background knowledge already available to the adversary.
  • … but we do not know what information is available to the adversary.
SLIDE 8

Many definitions & several attacks

Definitions:
  • K-Anonymity [Sweeney et al. IJUFKS ‘02]
  • L-diversity [Machanavajjhala et al. TKDD ‘07]
  • T-closeness [Li et al. ICDE ‘07]
  • E-Privacy [Machanavajjhala et al. VLDB ‘09]
  • Differential Privacy [Dwork et al. ICALP ‘06]

Attacks:
  • Linkage attack
  • Background knowledge attack
  • Minimality / Reconstruction attack
  • de Finetti attack
  • Composition attack
SLIDE 9

Composability [Dwork et al., TCC ‘06]

Theorem (Composability): If algorithms A1, A2, …, Ak use independent randomness and each Ai satisfies εi-differential privacy, then outputting all the answers together satisfies differential privacy with ε = ε1 + ε2 + … + εk.
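To make the composition bookkeeping concrete, here is a minimal sketch (not from the slides; it assumes Python with numpy, and the helper name laplace_mechanism is illustrative) that releases two noisy statistics about the same database and adds up the privacy budget:

```python
# Sketch: two Laplace mechanisms answering two queries on the same database;
# by the composition theorem, releasing both answers costs eps1 + eps2 total.
import numpy as np

def laplace_mechanism(true_answer, sensitivity, eps, rng):
    """Release a single numeric query answer with eps-differential privacy."""
    return true_answer + rng.laplace(loc=0.0, scale=sensitivity / eps)

rng = np.random.default_rng(0)
db = np.array([1, 0, 1, 1, 0, 1])            # toy database: 1 = has cancer

eps1, eps2 = 0.5, 0.3
count_noisy = laplace_mechanism(db.sum(), sensitivity=1, eps=eps1, rng=rng)
frac_noisy = laplace_mechanism(db.mean(), sensitivity=1 / len(db), eps=eps2, rng=rng)

# Releasing both answers together satisfies (eps1 + eps2)-differential privacy.
print(count_noisy, frac_noisy, "total epsilon:", eps1 + eps2)
```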

SLIDE 10

Differential Privacy

  • Domain-independent privacy definition that is independent of the attacker.
  • Tolerates many attacks that other definitions are susceptible to.
    – Avoids composition attacks
    – Claimed to be tolerant against adversaries with arbitrary background knowledge.
  • Allows simple, efficient and useful privacy mechanisms.
    – Used in a live US Census product [M et al ICDE ‘08]
SLIDE 11

Outline

  • Background: Domain-independent privacy definitions
  • No Free Lunch in Data Privacy [Kifer-M SIGMOD ‘11]
  • Correlations: A case for domain-specific privacy definitions [Kifer-M SIGMOD ‘11]
  • Pufferfish Privacy Framework [Kifer-M PODS ‘12]
  • Defining Privacy for Correlated Data [Kifer-M PODS ‘12 & Ding-M ‘13] – Current research
SLIDE 12

No Free Lunch Theorem

It is not possible to guarantee any utility in addition to privacy, without making assumptions about
  • the data generating distribution
  • the background knowledge available to an adversary

[Kifer-Machanavajjhala SIGMOD ‘11] [Dwork-Naor JPC ‘10]
SLIDE 13

Discriminant: Sliver of Utility

  • Does an algorithm A provide any utility?

w(k, A) > c if there are k inputs {D1, …, Dk} such that the outputs A(D1), …, A(Dk) can be distinguished from each other with probability > c.

  • Example:

If A can distinguish between tables of size <100 and size >1000000000, then w(2,A) = 1.

SLIDE 14

Discriminant: Sliver of Utility

Theorem: The discriminant of the Laplace mechanism is 1.

Proof:
  • Let Di = a database with n records and n∙i/k cancer patients.
  • Let Si = the range [n∙i/k – n/(3k), n∙i/k + n/(3k)]. All the Si are disjoint.
  • Let M be the Laplace mechanism on the query “how many cancer patients are there?”.
  • Pr(M(Di) ∈ Si) = Pr(|Noise| < n/(3k)) ≥ 1 – e^(–nε/(3k)) = 1 – δ
  • Hence, the discriminant w(k,M) ≥ 1 – δ.
  • As n tends to infinity, the discriminant tends to 1. (See the simulation sketch below.)
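A small simulation of the argument above (illustrative only; it assumes Python with numpy, and the constants n, k, ε are arbitrary toy values): it checks empirically that the Laplace mechanism’s output on Di stays inside the disjoint range Si with probability roughly 1 – e^(–nε/(3k)).

```python
# Sketch: empirically estimate Pr(M(D_i) in S_i) for the Laplace mechanism
# on the "how many cancer patients" query, and compare with the bound.
import numpy as np

n, k, eps, trials = 3000, 5, 0.01, 20_000
rng = np.random.default_rng(1)

hits = 0
for _ in range(trials):
    i = rng.integers(1, k + 1)                     # pick a database D_i
    true_count = n * i / k                         # cancer patients in D_i
    noisy = true_count + rng.laplace(scale=1 / eps)
    if abs(noisy - true_count) < n / (3 * k):      # noisy answer stays inside S_i
        hits += 1

print("empirical:", hits / trials, "bound:", 1 - np.exp(-n * eps / (3 * k)))
```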

SLIDE 15

Discriminant: Sliver of Utility

  • Does an algorithm A provide any utility?
    w(k, A) > c if there are k inputs {D1, …, Dk} such that the outputs A(D1), …, A(Dk) can be distinguished from each other with probability > c.
  • If w(k, A) is close to 1, we may get some utility out of A.
  • If w(k, A) is close to 0, we cannot distinguish any k inputs – no utility.
SLIDE 16

Non-privacy

  • D is randomly drawn from Pdata.
  • q is a sensitive query with k possible answers, such that the adversary knows Pdata but cannot guess the value of q(D) a priori.
  • A is not private if the adversary can guess q(D) correctly based on Pdata and the output of A.
SLIDE 17

No Free Lunch Theorem

  • Let A be a privacy mechanism with w(k,A) > 1 – ε.
  • Let q be a sensitive query with k possible outcomes.
  • There exists a data generating distribution Pdata such that
    – q(D) is uniformly distributed, but
    – the adversary wins (guesses q(D) correctly) with probability greater than 1 – ε. (Illustrated in the sketch below.)
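The construction can be illustrated with a toy simulation (a sketch under the assumption that A is the Laplace count mechanism from the previous slide; this is not the paper’s formal proof): Pdata is uniform over the k databases that witness the high discriminant, so q(D) is uniform, yet the adversary recovers it from A’s output almost always.

```python
# Sketch: P_data is uniform over D_1..D_k; q(D_i) = i; the adversary guesses
# q(D) from the output of the eps-DP Laplace mechanism by nearest center.
import numpy as np

n, k, eps, trials = 3000, 5, 0.01, 20_000
rng = np.random.default_rng(2)
centers = np.array([n * i / k for i in range(1, k + 1)])   # counts in D_1..D_k

wins = 0
for _ in range(trials):
    secret = rng.integers(k)                     # q(D) drawn uniformly via P_data
    output = centers[secret] + rng.laplace(scale=1 / eps)   # eps-DP mechanism A
    guess = np.argmin(np.abs(centers - output))  # adversary's maximum-likelihood guess
    wins += (guess == secret)

print("adversary wins with probability ~", wins / trials)
```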

SLIDE 18

Outline

  • Background: Domain-independent privacy definitions
  • No Free Lunch in Data Privacy [Kifer-M SIGMOD ‘11]
  • Correlations: A case for domain-specific privacy definitions [Kifer-M SIGMOD ‘11]
  • Pufferfish Privacy Framework [Kifer-M PODS ‘12]
  • Defining Privacy for Correlated Data [Kifer-M PODS ‘12 & Ding-M ‘13] – Current research
SLIDE 19

Correlations & Differential Privacy

  • When an adversary knows that individuals in a table are correlated, then (s)he can learn sensitive information about individuals even from the output of a differentially private mechanism.
  • Example 1: Contingency tables with pre-released exact counts
  • Example 2: Social networks
SLIDE 20

Contingency tables

[Figure: a contingency table D over two binary attributes; the four cells of Count(∙, ∙) are 2, 2, 2, 8. Each tuple takes one of k = 4 different values.]
SLIDE 21

Contingency tables

[Figure: the same table D with all four Count(∙, ∙) cells hidden; we want to release the counts privately.]
SLIDE 22

Laplace Mechanism

[Figure: each cell is released as its true count plus Laplace noise: 2 + Lap(1/ε), 2 + Lap(1/ε), 2 + Lap(1/ε), 8 + Lap(1/ε). For the cell with true count 8, the released value has mean 8 and variance 2/ε².]

Guarantees differential privacy.
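A minimal sketch of this release step (assuming Python with numpy; the cell values are the toy numbers from the slide):

```python
# Sketch: release each cell of the 2x2 contingency table with independent
# Laplace(1/eps) noise; each cell has mean = true count, variance 2/eps^2.
import numpy as np

eps = 0.5
rng = np.random.default_rng(3)
true_cells = np.array([[2, 2], [2, 8]])          # Count(., .) from the slide

noisy_cells = true_cells + rng.laplace(scale=1 / eps, size=true_cells.shape)
print(noisy_cells)
```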

SLIDE 23

Marginal counts

[Figure: the noisy cells 2 + Lap(1/ε), 2 + Lap(1/ε), 2 + Lap(1/ε), 8 + Lap(1/ε) are released along with the exact marginal counts 4, 10 (rows) and 4, 10 (columns).]

Does the Laplace mechanism still guarantee privacy?

Auxiliary marginals are published for the following reasons:
1. Legal: 2002 Supreme Court case Utah v. Evans
2. Contractual: Advertisers must know exact demographics at coarse granularities
SLIDE 24

Marginal counts

[Figure: combining the exact marginals with the noisy release, the adversary obtains several independent estimates of the same cell: Count(∙, ∙) = 8 + Lap(1/ε), 8 – Lap(1/ε), 8 – Lap(1/ε), 8 + Lap(1/ε).]
SLIDE 25

Marginal counts

[Figure: averaging the k independent estimates of the cell gives an estimate with mean 8 and variance 2/(kε²).]

The adversary can reconstruct the table with high precision for large k. (See the sketch below.)
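The variance-reduction argument can be checked numerically with the sketch below (an illustration assuming the k estimates are independent, as on the slide; Python with numpy assumed):

```python
# Sketch: k independent noisy estimates of the same cell (true count 8);
# averaging shrinks the variance from 2/eps^2 to 2/(k * eps^2).
import numpy as np

eps, k, trials = 0.5, 4, 100_000
rng = np.random.default_rng(4)

estimates = 8 + rng.laplace(scale=1 / eps, size=(trials, k))
averaged = estimates.mean(axis=1)

print("single-estimate variance ~", 2 / eps**2, "empirical:", estimates[:, 0].var())
print("averaged variance ~", 2 / (k * eps**2), "empirical:", averaged.var())
```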

SLIDE 26

Reason for Privacy Breach

  • Differential privacy only guarantees that the adversary cannot distinguish pairs of tables that differ in one tuple.
  • [Figure: the space of all possible tables; the background knowledge (the exact marginals) rules out the tables that do not satisfy it.]
SLIDE 27

Reason for Privacy Breach

[Figure: space of all possible tables.] Among the tables that remain consistent with the background knowledge, the adversary can distinguish between every pair based on the output.
SLIDE 28

Correlations & Differential Privacy

  • When an adversary knows that individuals in a table are correlated, then (s)he can learn sensitive information about individuals even from the output of a differentially private mechanism.
  • Example 1: Contingency tables with pre-released exact counts
  • Example 2: Social networks
SLIDE 29

A count query in a social network

  • Want to release the number of edges between the blue and green communities.
  • Should not disclose the presence/absence of the Bob-Alice edge.

[Figure: a social network with blue and green communities containing Bob and Alice.]
SLIDE 30

Adversary knows how social networks evolve

  • Depending on the social network evolution model, (d2 – d1) is linear or even super-linear in the size of the network, where d1 and d2 denote the number of edges between the two communities in the worlds without and with the Bob-Alice edge.
SLIDE 31

Differential privacy fails to avoid breach

In the world without the Bob-Alice edge the mechanism outputs d1 + δ; in the world with the edge it outputs d2 + δ, where δ ~ Laplace(1/ε).

The adversary can distinguish between the two worlds if d2 – d1 is large. (See the sketch below.)
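A short sketch of this distinguishing attack (illustrative; it assumes Python with numpy, and the values of d1, the gap d2 – d1, and ε are made up): the adversary applies a simple threshold test at the midpoint between d1 and d2.

```python
# Sketch: when (d2 - d1) >> 1/eps, a midpoint threshold test tells the two
# worlds apart almost perfectly, even though the release adds Laplace(1/eps)
# noise to the edge count.
import numpy as np

eps, d1, gap, trials = 0.1, 1000, 500, 10_000    # gap = d2 - d1, large vs 1/eps
d2 = d1 + gap
rng = np.random.default_rng(5)

world = rng.integers(2, size=trials)             # 0: edge absent, 1: edge present
outputs = np.where(world == 0, d1, d2) + rng.laplace(scale=1 / eps, size=trials)
guesses = (outputs > (d1 + d2) / 2).astype(int)  # threshold test at the midpoint

print("adversary's accuracy:", (guesses == world).mean())
```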

SLIDE 32

Outline

  • Background: Domain-independent privacy definitions
  • No Free Lunch in Data Privacy [Kifer-M SIGMOD ‘11]
  • Correlations: A case for domain-specific privacy definitions [Kifer-M SIGMOD ‘11]
  • Pufferfish Privacy Framework [Kifer-M PODS ‘12]
  • Defining Privacy for Correlated Data [Kifer-M PODS ‘12 & Ding-M ‘13] – Current research
SLIDE 33

Why do we need domain-specific privacy?

  • For handling correlations
    – Pre-released marginals & social networks [Kifer-M SIGMOD ‘11]
  • Utility-driven applications
    – For some applications, existing privacy definitions do not provide sufficient utility [M et al PVLDB ‘11]
  • Personalized privacy & aggregate secrets [Kifer-M PODS ‘12]

Question: How do we design principled privacy definitions customized to such scenarios?
SLIDE 34

Pufferfish Framework

SLIDE 35

Pufferfish Semantics

  • What is being kept secret?
  • Who are the adversaries?
  • How is information disclosure bounded?

SLIDE 36

Sensitive Information

  • Secrets: S is a set of potentially sensitive statements.
    – “individual j’s record is in the data, and j has Cancer”
    – “individual j’s record is not in the data”
  • Discriminative Pairs: Spairs is a subset of S × S consisting of mutually exclusive pairs of secrets.
    – (“Bob is in the table”, “Bob is not in the table”)
    – (“Bob has cancer”, “Bob has diabetes”)
SLIDE 37

Adversaries

  • An adversary can be completely characterized by his/her prior information about the data.
    – We do not assume computational limits.
  • Data Evolution Scenarios: the set of all probability distributions that could have generated the data.
    – No assumptions: all probability distributions over data instances are possible.
    – I.I.D.: the set of all f such that P(data = {r1, r2, …, rk}) = f(r1) × f(r2) × … × f(rk)
SLIDE 38

Information Disclosure

  • Mechanism M satisfies ε-Pufferfish(S, Spairs, D) if for every
    – w ∈ Range(M),
    – (si, sj) ∈ Spairs,
    – θ ∈ D such that P(si | θ) ≠ 0 and P(sj | θ) ≠ 0:

P(M(data) = w | si, θ) ≤ e^ε P(M(data) = w | sj, θ)
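The inequality can be checked numerically for a toy instantiation (a sketch, not an example from the paper: one potential record for Bob, si = “Bob is in the table and has cancer”, sj = “Bob is not in the table”, and M = noisy cancer count; Python with numpy assumed).

```python
# Sketch: for a one-record table, the cancer count is 1 given s_i and 0 given
# s_j; check that the Laplace-noised count keeps the likelihood ratio within
# [e^-eps, e^eps] for every output w, as eps-Pufferfish requires.
import numpy as np

eps = 0.5

def laplace_pdf(x, scale):
    return np.exp(-np.abs(x) / scale) / (2 * scale)

w = np.linspace(-10, 10, 2001)                     # grid of possible outputs
p_w_given_si = laplace_pdf(w - 1, scale=1 / eps)   # P(M(data) = w | s_i, theta)
p_w_given_sj = laplace_pdf(w - 0, scale=1 / eps)   # P(M(data) = w | s_j, theta)

ratio = p_w_given_si / p_w_given_sj
print(ratio.max() <= np.exp(eps) + 1e-9, ratio.min() >= np.exp(-eps) - 1e-9)
```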

SLIDE 39

Pufferfish Semantic Guarantee

For every adversary θ ∈ D, every discriminative pair (si, sj), and every output w, the posterior odds of si vs sj are within a factor e^ε of the prior odds:

e^(–ε) ≤ [ P(si | M(data) = w, θ) / P(sj | M(data) = w, θ) ] ÷ [ P(si | θ) / P(sj | θ) ] ≤ e^ε
SLIDE 40

Assumptionless Privacy

  • Suppose we want to protect against any adversary.
    – No assumptions about the adversary’s background knowledge.
  • Spairs:
    – “record j is in the table with value x” vs “record j is not in the table”
  • Data Evolution: all probability distributions over data instances are possible.

A mechanism M satisfies ε-Assumptionless Privacy if and only if for every pair of databases D1, D2, and every output w: P(M(D1) = w) ≤ e^ε P(M(D2) = w)
SLIDE 41

Assumptionless Privacy

A mechanism M satisfies ε-Assumptionless Privacy if and only if for every pair of databases D1, D2, and every output w: P(M(D1) = w) ≤ e^ε P(M(D2) = w)

  • Suppose we want to compute the number of individuals having cancer.
    – D1: all individuals have cancer
    – D2: no individual has cancer
    – For assumptionless privacy, the output w should be almost equally likely whether the input was D1 or D2.
    – Therefore, we need O(N) noise, where N = size of the input database. (See the sketch below.)
    – Hence, not much utility.
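The O(N) noise claim follows from a one-line calculation on the Laplace likelihood ratio, sketched below (assumptions: Laplace noise is used, and the values of N and ε are arbitrary toy numbers).

```python
# Sketch: for Laplace(b) noise, sup_w P(M(D1)=w) / P(M(D2)=w) = exp(N / b)
# when the true counts differ by N, so exp(N / b) <= exp(eps) forces b >= N/eps.
N, eps = 10_000, 0.5

b_required = N / eps          # minimum noise scale for assumptionless privacy
typical_error = b_required    # Laplace(b) has mean absolute deviation b

print("required noise scale:", b_required, "~ O(N); typical error:", typical_error)
```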

SLIDE 42

Applying Pufferfish to Differential Privacy

  • Spairs:
    – “record j is in the table” vs “record j is not in the table”
    – “record j is in the table with value x” vs “record j is not in the table”
  • Data evolution:
    – Probability that record j is in the table: πj
    – Probability distribution over values of record j: fj
    – For all θ = [f1, f2, f3, …, fk, π1, π2, …, πk]:
      P[Data = D | θ] = Π_{rj not in D} (1 – πj) × Π_{rj in D} πj ∙ fj(rj)
    (A small computational sketch follows.)
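A small sketch of this prior (hypothetical encoding of D, πj and fj; plain Python): each record j appears with probability πj and, if present, takes its value with probability fj(value).

```python
# Sketch: compute P[Data = D | theta] for the record-independent data
# evolution scenario above.
def prior_probability(D, pi, f):
    """D maps record id -> value for the records present in the table."""
    prob = 1.0
    for j, pi_j in enumerate(pi):
        if j in D:
            prob *= pi_j * f[j](D[j])      # record j is in D with value D[j]
        else:
            prob *= (1 - pi_j)             # record j is absent from D
    return prob

# Toy example: 3 potential records, each present with probability 0.5 and
# uniform over {"cancer", "healthy"} when present.
pi = [0.5, 0.5, 0.5]
f = [lambda v: 0.5 for _ in range(3)]
print(prior_probability({0: "cancer", 2: "healthy"}, pi, f))  # 0.25 * 0.5 * 0.25
```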

SLIDE 43

Applying Pufferfish to Differential Privacy

  • Spairs:
    – “record j is in the table” vs “record j is not in the table”
    – “record j is in the table with value x” vs “record j is not in the table”
  • Data evolution:
    – For all θ = [f1, f2, f3, …, fk, π1, π2, …, πk]:
      P[Data = D | θ] = Π_{rj not in D} (1 – πj) × Π_{rj in D} πj ∙ fj(rj)

A mechanism M satisfies differential privacy if and only if it satisfies Pufferfish instantiated using Spairs and {θ} (as defined above).
SLIDE 44

Differential Privacy

  • Sensitive information: all pairs of secrets “individual j is in the table with value x” vs “individual j is not in the table”.
  • Adversary: adversaries who believe the data is generated using any probability distribution that is independent across individuals.
  • Disclosure: the ratio of the adversary’s prior and posterior odds is bounded by e^ε.
SLIDE 45

Characterizing a “good” privacy definition

  • We can derive conditions under which a privacy definition resists attacks.
  • For instance, any privacy definition that can be phrased as a certain set of constraints over I, the set of all tables, composes with itself [Kifer-M PODS’12].
SLIDE 46

Summary of Pufferfish

  • A semantic approach to defining privacy
    – Enumerates the information that is secret and the set of adversaries.
    – Bounds the odds ratio of pairs of mutually exclusive secrets.
  • Helps understand the assumptions under which privacy is guaranteed
  • Provides a common framework to develop a theory of privacy definitions
    – General sufficient conditions for composition of privacy (see paper)
SLIDE 47

Next Class

  • Application of Pufferfish to Correlated Data
  • Relaxations of differential privacy

    – E-Privacy
    – Crowd-blending privacy
SLIDE 48

References

[M et al PVLDB’11] A. Machanavajjhala, A. Korolova, A. Das Sarma, “Personalized Social Recommendations – Accurate or Private?”, PVLDB 4(7), 2011.

[Kifer-M SIGMOD’11] D. Kifer, A. Machanavajjhala, “No Free Lunch in Data Privacy”, SIGMOD 2011.

[Kifer-M PODS’12] D. Kifer, A. Machanavajjhala, “A Rigorous and Customizable Framework for Privacy”, PODS 2012.

[Ding-M ’13] B. Ding, A. Machanavajjhala, “Induced Neighbors Privacy” (work in progress), 2012.