

SLIDE 1

Simulatability

“The enemy knows the system”, Claude Shannon

CompSci 590.03 Instructor: Ashwin Machanavajjhala

Lecture 6 : 590.03 Fall 12

SLIDE 2

Announcements

  • Please meet with me at least 2 times before you finalize your project (deadline Sep 28).

SLIDE 3

Recap – L-Diversity

  • The link between identity and attribute value is the sensitive information: “Does Bob have Cancer? Heart disease? Flu?” “Does Umeko have Cancer? Heart disease? Flu?”

  • Adversary knows ≤ L-2 negation statements.

“Umeko does not have Heart Disease.”

– Data Publisher may not know exact adversarial knowledge

  • Privacy is breached when identity can be linked to an attribute value with high probability:

    Pr[ “Bob has Cancer” | published table, adv. knowledge] > t

SLIDE 4

Zip    Age   Nat.  Disease
1306*  <=40  *     Heart
1306*  <=40  *     Flu
1306*  <=40  *     Cancer
1306*  <=40  *     Cancer
1485*  >40   *     Cancer
1485*  >40   *     Heart
1485*  >40   *     Flu
1485*  >40   *     Flu
1305*  <=40  *     Heart
1305*  <=40  *     Flu
1305*  <=40  *     Cancer
1305*  <=40  *     Cancer

Recap – 3-Diverse Table


L-Diversity Principle: Every group of tuples with the same Q-ID values has ≥ L distinct sensitive values of roughly equal proportions.
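The principle can be checked mechanically. A minimal sketch in Python, using only the “≥ L distinct sensitive values per Q-ID group” reading of the principle (the “roughly equal proportions” clause is deliberately left out); the function and column names are illustrative:

```python
def is_l_diverse(rows, l, qid_cols, sens_col):
    """Simplest reading of the principle: every group of tuples sharing
    the same Q-ID values must contain at least l distinct sensitive values."""
    groups = {}
    for row in rows:
        key = tuple(row[c] for c in qid_cols)
        groups.setdefault(key, set()).add(row[sens_col])
    return all(len(vals) >= l for vals in groups.values())

# The 3-diverse table from this slide (Nat. is always *)
table = (
    [{"zip": "1306*", "age": "<=40", "disease": d}
     for d in ("Heart", "Flu", "Cancer", "Cancer")]
    + [{"zip": "1485*", "age": ">40", "disease": d}
       for d in ("Cancer", "Heart", "Flu", "Flu")]
    + [{"zip": "1305*", "age": "<=40", "disease": d}
       for d in ("Heart", "Flu", "Cancer", "Cancer")]
)
print(is_l_diverse(table, 3, ["zip", "age"], "disease"))   # True
```

Each of the three Q-ID groups contains {Heart, Flu, Cancer}, so the table passes for L = 3 but would fail for L = 4.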


SLIDE 5

Outline

  • Simulatable Auditing
  • Minimality Attack in anonymization
  • Simulatable algorithms for anonymization


SLIDE 6

Query Auditing

The database holds numeric values (say, salaries of employees). It either truthfully answers a query or denies answering. Queries are MIN, MAX, and SUM over subsets of the database. Question: when should queries be allowed or denied?

[Diagram: a Researcher poses a query; the Database decides “Safe to publish? Yes/No” before answering.]

SLIDE 7

Why should we deny queries?

  • Q1: Ben’s sensitive value? – DENY
  • Q2: Max sensitive value of males? – ANSWER: 2
  • Q3: Max sensitive value of 1st-year PhD students? – ANSWER: 3
  • But Q3 + Q2 => Xi = 3

Name  1st year PhD  Gender  Sensitive value
Ben   Y             M       1
Bha   N             M       1
Ios   Y             M       1
Jan   N             M       2
Jian  Y             M       2
Jie   N             M       1
Joe   N             M       2
Moh   N             M       1
Son   N             F       1
Xi    Y             F       3
Yao   N             M       2
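The inference on this slide can be replayed directly on the table. A small sketch (variable names are illustrative): the answer to Q3 exceeds every male’s possible value from Q2, so it must come from a 1st-year non-male, and Xi is the only candidate:

```python
# The table from this slide as (name, first_year, gender, sensitive_value)
people = [
    ("Ben", "Y", "M", 1), ("Bha", "N", "M", 1), ("Ios", "Y", "M", 1),
    ("Jan", "N", "M", 2), ("Jian", "Y", "M", 2), ("Jie", "N", "M", 1),
    ("Joe", "N", "M", 2), ("Moh", "N", "M", 1), ("Son", "N", "F", 1),
    ("Xi", "Y", "F", 3), ("Yao", "N", "M", 2),
]

q2 = max(v for _, _, g, v in people if g == "M")      # answered: 2
q3 = max(v for _, fy, _, v in people if fy == "Y")    # answered: 3

# Every 1st-year male's value is <= q2 < q3, so the maximum must come from
# a 1st-year non-male; Xi is the only such person, hence Xi's value is q3.
non_male_first_years = [n for n, fy, g, _ in people if fy == "Y" and g != "M"]
if q3 > q2 and len(non_male_first_years) == 1:
    print(f"{non_male_first_years[0]} has sensitive value {q3}")
    # Xi has sensitive value 3
```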

SLIDE 8

Value-Based Auditing

  • Let a1, a2, …, ak be the answers to previous queries Q1, Q2, …, Qk.
  • Let ak+1 be the answer to Qk+1.

  ai = f(ci1 x1, ci2 x2, …, cin xn),  i = 1 … k+1
  where cim = 1 if Qi depends on xm, and 0 otherwise.
Check whether any xj has a unique solution.
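For MAX queries, one simple realization of this check is interval propagation: each answer caps the variables it covers, and a variable is pinned whenever it is the only one in a query that can still attain that query’s answer. A sketch (sound but deliberately incomplete; names are illustrative):

```python
import math

def compromised(queries, answers):
    """Value-based audit for MAX queries (a sketch): return the indices of
    variables whose values are uniquely determined by the answers so far.
    Queries are sets of 0-based variable indices."""
    n = max(max(q) for q in queries) + 1
    ub = [math.inf] * n                       # tightest upper bound per x_i
    for q, a in zip(queries, answers):
        for i in q:
            ub[i] = min(ub[i], a)
    leaked = set()
    for q, a in zip(queries, answers):
        # The answer a must be attained by some variable of q whose upper
        # bound still allows it; a single candidate means that x_i = a.
        candidates = [i for i in q if ub[i] >= a]
        if len(candidates) == 1:
            leaked.add(candidates[0])
    return leaked

# The running example of the next slides:
# max(x1..x5) = 10, then max(x1..x4) = 8 pins x5 = 10.
print(compromised([{0, 1, 2, 3, 4}, {0, 1, 2, 3}], [10, 8]))   # {4}
```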


SLIDE 9

Value-based Auditing

  • Data Values: {x1, x2 , x3 , x4 , x5}, Queries: MAX.
  • Allow query if value of xi can’t be inferred.


SLIDE 10

Value-based Auditing

  • Data Values: {x1, x2 , x3 , x4 , x5}, Queries: MAX.
  • Allow query if value of xi can’t be inferred.

max(x1, x2, x3, x4, x5)   Ans: 10

  • -∞ ≤ x1 … x5 ≤ 10

SLIDE 11

Value-based Auditing

  • Data Values: {x1, x2 , x3 , x4 , x5}, Queries: MAX.
  • Allow query if value of xi can’t be inferred.

max(x1, x2, x3, x4, x5)   Ans: 10
max(x1, x2, x3, x4)       Ans: 8 → DENY

  • -∞ ≤ x1 … x4 ≤ 8  =>  x5 = 10

SLIDE 12

Value-based Auditing

  • Data Values: {x1, x2 , x3 , x4 , x5}, Queries: MAX.
  • Allow query if value of xi can’t be inferred.

max(x1, x2, x3, x4, x5)   Ans: 10
max(x1, x2, x3, x4)       Ans: 8 → DENY

Denial means some value can be compromised!

SLIDE 13

Value-based Auditing

  • Data Values: {x1, x2 , x3 , x4 , x5}, Queries: MAX.
  • Allow query if value of xi can’t be inferred.

max(x1, x2, x3, x4, x5)   Ans: 10
max(x1, x2, x3, x4)       Ans: 8 → DENY

What could max(x1, x2, x3, x4) be?

SLIDE 14

Value-based Auditing

  • Data Values: {x1, x2 , x3 , x4 , x5}, Queries: MAX.
  • Allow query if value of xi can’t be inferred.

max(x1, x2, x3, x4, x5)   Ans: 10
max(x1, x2, x3, x4)       Ans: 8 → DENY

From the first answer, max(x1, x2, x3, x4) ≤ 10.

SLIDE 15

Value-based Auditing

  • Data Values: {x1, x2 , x3 , x4 , x5}, Queries: MAX.
  • Allow query if value of xi can’t be inferred.

max(x1, x2, x3, x4, x5)   Ans: 10
max(x1, x2, x3, x4)       Ans: 8 → DENY

If max(x1, x2, x3, x4) = 10, then there is no privacy breach.

SLIDE 16

Value-based Auditing

  • Data Values: {x1, x2 , x3 , x4 , x5}, Queries: MAX.
  • Allow query if value of xi can’t be inferred.

max(x1, x2, x3, x4, x5)   Ans: 10
max(x1, x2, x3, x4)       Ans: 8 → DENY

Hence, max(x1, x2, x3, x4) < 10  =>  x5 = 10!

SLIDE 17

Value-based Auditing

  • Data Values: {x1, x2 , x3 , x4 , x5}, Queries: MAX.
  • Allow query if value of xi can’t be inferred.

max(x1, x2, x3, x4, x5)   Ans: 10
max(x1, x2, x3, x4)       Ans: 8 → DENY

Hence, max(x1, x2, x3, x4) < 10  =>  x5 = 10!

Denials leak information. The attack occurred because the privacy analysis did not assume that the attacker knows the algorithm.

SLIDE 18

Simulatable Auditing [Kenthapadi et al PODS ‘05]

  • An auditor is simulatable if the decision to deny a query Qk is made based only on information already available to the attacker.
    – It can use queries Q1, Q2, …, Qk and answers a1, a2, …, ak-1.
    – It cannot use ak or the actual data to make the decision.

  • Denials provably do not leak information.
    – The attacker could equivalently determine whether the query would be denied.
    – The attacker can mimic, or simulate, the auditor.

SLIDE 19

Simulatable Auditing Algorithm

  • Data Values: {x1, x2 , x3 , x4 , x5}, Queries: MAX.
  • Allow query if value of xi can’t be inferred.

max(x1, x2, x3, x4, x5)   Ans: 10
max(x1, x2, x3, x4)       Before computing the answer:

  Ans > 10  =>  not possible
  Ans = 10  =>  -∞ ≤ x1 … x4 ≤ 10   SAFE
  Ans < 10  =>  x5 = 10             UNSAFE

=> DENY
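The case analysis above can be sketched in code. This is a partial sketch, not the full auditor of Kenthapadi et al.: it checks one sufficient condition for denial (some feasible answer would pin a value), using only past queries and answers, never the new answer or the data:

```python
def simulatable_deny(past_queries, past_answers, new_query):
    """Simulatable MAX auditor (a partial sketch): decide whether to deny
    using only past queries and answers. Queries are sets of variable ids."""
    for q, a in zip(past_queries, past_answers):
        outside = q - new_query
        # If exactly one variable of a past MAX query lies outside the new
        # query, an answer below a would force that variable to equal a,
        # so *some* feasible answer is unsafe -> deny without looking.
        if len(outside) == 1:
            return True
    return False

# Running example: after max(x1..x5) = 10, the auditor denies max(x1..x4)
# before computing its answer -- exactly the case analysis on the slide.
print(simulatable_deny([{1, 2, 3, 4, 5}], [10], {1, 2, 3, 4}))   # True
```

Because the decision reads nothing the attacker does not already have, the attacker can simulate it, so the denial itself reveals nothing.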

SLIDE 20

Summary of Simulatable Auditing

  • In some (many!) cases, the decision to deny must be based only on past queries and answers.

  • Denials can leak information if the adversary does not know all the information used to decide whether to deny the query.

SLIDE 21

Outline

  • Simulatable Auditing
  • Minimality Attack in anonymization
  • Simulatable algorithms for anonymization


SLIDE 22

Minimality attack on Generalization algorithms

  • Algorithms for K-anonymity, L-diversity, T-closeness, etc. try to maximize utility.
    – Find a minimally generalized table in the lattice that satisfies privacy and maximizes utility.

  • But … the attacker also knows this algorithm!

SLIDE 23

Example Minimality attack [Wong et al VLDB07]

  • Dataset with one quasi-identifier and 2 values, q1 and q2.
  • q1, q2 generalize to Q.
  • Sensitive attribute: Cancer – yes/no.
  • We want to ensure P[Cancer = yes] < ½.
    – It is OK to learn that an individual does not have Cancer.

  • Published Table:

    QID  Cancer
    Q    Yes
    Q    Yes
    Q    No
    Q    No
    q2   No
    q2   No

SLIDE 24

Which input datasets could have led to the published table?

Output dataset ({q1, q2} → Q, “2-diverse”):

    QID  Cancer
    Q    Yes
    Q    Yes
    Q    No
    Q    No
    q2   No
    q2   No

Possible input datasets with 3 occurrences of q1:

    QID  Cancer        QID  Cancer
    q1   Yes           q1   Yes
    q1   Yes           q1   No
    q1   No            q1   No
    q2   No            q2   Yes
    q2   No            q2   No
    q2   No            q2   No

SLIDE 25

Which input datasets could have led to the published table?

Output dataset ({q1, q2} → Q, “2-diverse”):

    QID  Cancer
    Q    Yes
    Q    Yes
    Q    No
    Q    No
    q2   No
    q2   No

Possible input dataset with 3 occurrences of q1:

    QID  Cancer
    q1   Yes
    Q    No
    Q    No
    q2   Yes
    q2   No
    q2   No

This is a better generalization!

SLIDE 26

Which input datasets could have led to the published table?

Output dataset ({q1, q2} → Q, “2-diverse”):

    QID  Cancer
    Q    Yes
    Q    Yes
    Q    No
    Q    No
    q2   No
    q2   No

Possible input datasets with 1 occurrence of q1:

    QID  Cancer        QID  Cancer
    q2   Yes           q2   Yes
    q1   Yes           q2   Yes
    q2   No            q1   No
    q2   No            q2   No
    q2   No            q2   No
    q2   No            q2   No

SLIDE 27

Which input datasets could have led to the published table?

Output dataset ({q1, q2} → Q, “2-diverse”):

    QID  Cancer
    Q    Yes
    Q    Yes
    Q    No
    Q    No
    q2   No
    q2   No

Possible input dataset with 3 occurrences of q1:

    QID  Cancer
    q2   Yes
    Q    No
    Q    No
    q2   Yes
    q2   No
    q2   No

This is a better generalization!

SLIDE 28

Which input datasets could have led to the published table?

Output dataset ({q1, q2} → Q, “2-diverse”):

    QID  Cancer
    Q    Yes
    Q    Yes
    Q    No
    Q    No
    q2   No
    q2   No

    QID  Cancer
    q2   Yes
    Q    No
    Q    No
    q2   Yes
    q2   No
    q2   No

There must be exactly two tuples with q1.

SLIDE 29

Which input datasets could have led to the published table?

Output dataset ({q1, q2} → Q, “2-diverse”):

    QID  Cancer
    Q    Yes
    Q    Yes
    Q    No
    Q    No
    q2   No
    q2   No

Possible input datasets with 2 occurrences of q1:

    QID  Cancer        QID  Cancer        QID  Cancer
    q1   Yes           q2   Yes           q1   Yes
    q1   Yes           q2   Yes           q2   Yes
    q2   No            q1   No            q1   No
    q2   No            q1   No            q2   No
    q2   No            q2   No            q2   No
    q2   No            q2   No            q2   No

Already satisfies privacy.

SLIDE 30

Which input datasets could have led to the published table?

Output dataset ({q1, q2} → Q, “2-diverse”):

    QID  Cancer
    Q    Yes
    Q    Yes
    Q    No
    Q    No
    q2   No
    q2   No

Possible input datasets with 2 occurrences of q1:

    QID  Cancer        QID  Cancer
    q1   Yes           q2   Yes
    q1   Yes           q2   Yes
    q2   No            q1   No
    q2   No            q1   No
    q2   No            q2   No
    q2   No            q2   No

Learning Cancer = No is OK. Hence, this is private.

SLIDE 31

Which input datasets could have led to the published table?

Output dataset ({q1, q2} → Q, “2-diverse”):

    QID  Cancer
    Q    Yes
    Q    Yes
    Q    No
    Q    No
    q2   No
    q2   No

Possible input dataset with 2 occurrences of q1:

    QID  Cancer
    q1   Yes
    q1   Yes
    q2   No
    q2   No
    q2   No
    q2   No

This is the ONLY input that results in the output!

P[Cancer = yes | q1] = 1
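The case analysis of the preceding slides can be replayed by brute force. A sketch, with loud assumptions: the publisher is modeled as generalizing the fewest tuples to Q such that no QID group has P[Cancer = yes] above 1/2 (the published Q group sits at exactly 1/2, so ≤ 1/2 is treated as acceptable here); `minimal_publish`, `breaches`, and the tie-breaking order are all illustrative simplifications, not the exact algorithm of Wong et al.:

```python
from itertools import product, combinations

YES, NO = "Yes", "No"
DISEASES = [YES, YES, NO, NO, NO, NO]        # sensitive values of the 6 tuples

def breaches(rows):
    """True if some QID group has P[Cancer = yes] > 1/2."""
    groups = {}
    for q, d in rows:
        groups.setdefault(q, []).append(d)
    return any(g.count(YES) / len(g) > 0.5 for g in groups.values())

def minimal_publish(qids):
    """Generalize the fewest tuples to 'Q' so that no group breaches."""
    rows = list(zip(qids, DISEASES))
    for k in range(len(rows) + 1):           # fewest generalized tuples first
        for idxs in combinations(range(len(rows)), k):
            out = [("Q" if i in idxs else q, d) for i, (q, d) in enumerate(rows)]
            if not breaches(out):
                return sorted(out)

observed = sorted([("Q", YES), ("Q", YES), ("Q", NO), ("Q", NO),
                   ("q2", NO), ("q2", NO)])
inputs = [q for q in product(["q1", "q2"], repeat=6)
          if minimal_publish(q) == observed]
print(inputs)   # [('q1', 'q1', 'q2', 'q2', 'q2', 'q2')]
```

Exactly one input survives the filter, and in it both q1 tuples have Cancer = Yes, reproducing P[Cancer = yes | q1] = 1.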

SLIDE 32

Outline

  • Simulatable Auditing
  • Minimality Attack in anonymization
  • Transparent Anonymization: simulatable algorithms for anonymization

SLIDE 33

Transparent Anonymization

  • Assume that the adversary knows the algorithm that is being used.

O: the output table
I(O, A): the input tables that result in O under algorithm A
I: all possible input tables

SLIDE 34

Transparent Anonymization

  • Privacy must be guaranteed with respect to I(O, A).
    – Probabilities must be computed assuming I(O, A) is the actual set of all possible input tables.

  • What is an efficient algorithm for Transparent Anonymization?
    – For L-diversity?

SLIDE 35

Ace Algorithm [Xiao et al TODS’10]

Step 1: Assign – Based only on the sensitive values, construct (in a randomized fashion) an intermediate L-diverse generalization.

Step 2: Split – Based only on the quasi-identifier values (without looking at the sensitive values), deterministically refine the intermediate solution to maximize utility.

SLIDE 36

Step 1: Assign

  • Input Table


SLIDE 37

Step 1: Assign

  • St is the set of all tuples (grouped by sensitive value)
  • Iteratively,

– Remove α tuples each from the β (≥L) most frequent sensitive values


SLIDE 38

Step 1: Assign

  • St is the set of all tuples (grouped by sensitive value)
  • Iteratively,

– Remove α tuples each from the β (≥L) most frequent sensitive values

– 1st iteration β=2, α=2


SLIDE 39

Step 1: Assign

  • St is the set of all tuples (grouped by sensitive value)
  • Iteratively,

– Remove α tuples each from the β (≥L) most frequent sensitive values

– 2nd iteration β=2, α=1


SLIDE 40

Step 1: Assign

  • St is the set of all tuples (grouped by sensitive value)
  • Iteratively,

– Remove α tuples each from the β (≥L) most frequent sensitive values

– 3rd iteration β=2, α=1

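The iterations above can be sketched in code. Note the simplifications: Ace's Assign is randomized and chooses β ≥ L values and α tuples per value; this deterministic sketch fixes α = 1 and β = L, and the function names are illustrative:

```python
from collections import defaultdict

def assign(tuples_, sens, l):
    """Ace 'Assign' step (a simplified sketch): while at least l sensitive
    values still have tuples left, move one tuple of each of the l most
    frequent values into a fresh bucket."""
    groups = defaultdict(list)
    for t in tuples_:
        groups[sens(t)].append(t)
    buckets = []
    while sum(1 for g in groups.values() if g) >= l:
        top = sorted((v for v in groups if groups[v]),
                     key=lambda v: -len(groups[v]))[:l]
        buckets.append([groups[v].pop() for v in top])
    leftover = [t for g in groups.values() for t in g]   # Ace folds these
    return buckets, leftover                             # into earlier buckets

# The table from the intermediate-generalization slide
data = [("Ann", "Dyspepsia"), ("Bob", "Dyspepsia"), ("Gill", "Flu"),
        ("Ed", "Flu"), ("Don", "Bronchitis"), ("Fred", "Gastritis"),
        ("Hera", "Diabetes"), ("Cate", "Gastritis")]
buckets, rest = assign(data, lambda t: t[1], 2)
# Every bucket contains 2 distinct diseases, i.e. is 2-diverse.
```

Taking from the most frequent values first keeps any single value from dominating, so every bucket ends up with L distinct sensitive values.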

SLIDE 41

Intermediate Generalization

Name  Age  Zip    Disease
Ann   21   10000  Dyspepsia
Bob   27   18000  Dyspepsia
Gill  60   63000  Flu
Ed    54   60000  Flu
Don   32   35000  Bronchitis
Fred  60   63000  Gastritis
Hera  60   63000  Diabetes
Cate  32   35000  Gastritis

SLIDE 42

Step 2: Split

  • If a bucket contains α > 1 tuples of each sensitive value, split it into two buckets, Ba and Bb, such that:
    – Pick 1 ≤ αa < α tuples from each sensitive value in bucket B and put them in bucket Ba; the remaining tuples go to Bb.
    – The division (Ba, Bb) is optimal in terms of utility.

Name  Age  Zip
Ann   21   10000
Bob   27   18000
Gill  60   63000
Ed    54   60000
Don   32   35000
Fred  60   63000
Hera  60   63000
Cate  32   35000
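A sketch of the Split rule above. Assumptions: the utility metric (sum of QID spreads of the two halves) and the brute-force enumeration are illustrative choices, not the paper's; crucially, the cost depends only on quasi-identifier values, never on which tuple carries which sensitive value:

```python
from collections import defaultdict
from itertools import combinations, product

def split(bucket, sens, qid):
    """Ace 'Split' step (sketch): if every sensitive value occurs alpha > 1
    times, enumerate the divisions that put 1 <= alpha_a < alpha tuples of
    each value into B_a (rest into B_b) and keep the cheapest division."""
    by_val = defaultdict(list)
    for t in bucket:
        by_val[sens(t)].append(t)
    alpha = min(len(ts) for ts in by_val.values())
    if alpha < 2:
        return [bucket]                      # nothing to split

    def spread(ts):
        qs = [qid(t) for t in ts]
        return max(qs) - min(qs)

    best, best_cost = None, float("inf")
    vals = list(by_val)
    for aa in range(1, alpha):
        per_val = [combinations(by_val[v], aa) for v in vals]
        for pick in product(*per_val):
            ba = [t for chosen in pick for t in chosen]
            bb = [t for t in bucket if t not in ba]
            cost = spread(ba) + spread(bb)   # QID-only utility
            if cost < best_cost:
                best, best_cost = [ba, bb], cost
    return best

bucket = [("Ann", 21, "Dyspepsia"), ("Bob", 27, "Dyspepsia"),
          ("Gill", 60, "Flu"), ("Ed", 54, "Flu")]
ba, bb = split(bucket, sens=lambda t: t[2], qid=lambda t: t[1])
# Each half keeps at least one tuple of every sensitive value,
# so both halves remain 2-diverse.
```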

SLIDE 43

Why does the Ace algorithm satisfy Transparent L-Diversity?

  • Privacy must be guaranteed with respect to I(O, A).
    – Probabilities must be computed assuming I(O, A) is the actual set of all possible input tables.

O: the output table
I(O, A): the input tables that result in O under algorithm A
I: all possible input tables

SLIDE 44

Ace algorithm analysis

Lemma 1: The assign step satisfies transparent L-diversity. Proof (sketch):

  • Consider an intermediate output Int.
  • Suppose there is some input table T such that Assign(T) = Int.
  • Any other table T’, where the sensitive values of 2 individuals in the same group are swapped, also leads to the same intermediate output Int.

SLIDE 45

Ace algorithm analysis


Both tables result in the same intermediate output.

SLIDE 46

Ace algorithm analysis

Lemma 1: The assign step satisfies transparent L-diversity. Proof (sketch):

  • Consider an intermediate output Int.
  • Suppose there is some input table T such that Assign(T) = Int.
  • Any other table T’, where the sensitive values of 2 individuals in the same group are swapped, also leads to the same intermediate output.
  • The set of input tables I(Int, A) contains all possible assignments of diseases to individuals within each group of Int.

SLIDE 47

Ace algorithm analysis

Lemma 1: The assign step satisfies transparent L-diversity. Proof (sketch):

  • The set of input tables I(Int, A) contains all possible assignments of diseases to individuals in each group of Int.
  • P[Ann has dyspepsia | I(Int, A) and Int] = 1/2

Name  Age  Zip    Disease
Ann   21   10000  Dyspepsia
Bob   27   18000  Dyspepsia
Gill  60   63000  Flu
Ed    54   60000  Flu

SLIDE 48

Ace algorithm analysis

Lemma 2: The split phase also satisfies transparent L-diversity. Proof (sketch):

  • I(Int, Assign) contains all tables where an individual is assigned an arbitrary sensitive value within the same group in Int.
  • Suppose some input table T ∈ I(Int, Assign) results in the final output O after Split.

SLIDE 49

Ace algorithm analysis

  • Split does not depend on the sensitive values.

[Figure: the group {Ann, Gill, Bob, Ed} with sensitive values {dyspepsia, flu} splits into buckets {Ann, Bob} and {Gill, Ed}; after swapping sensitive values between individuals, the split produces the same buckets.]

SLIDE 50

Ace algorithm analysis

If T ∈ I(Int, Assign) and it results in O after Split, then T’ ∈ I(Int, Assign) and it also results in O after Split.

SLIDE 51

Ace algorithm analysis

  • Lemma 2: The Split phase also satisfies transparent L-diversity.

Proof (sketch):

  • Let T’ be generated by “swapping diseases” in some bucket.
  • If T ∈ I(Int, Assign) and it results in O after Split, then T’ ∈ I(Int, Assign) and it also results in O after Split.
  • For any individual, it is equally likely that the sensitive value is one of ≥ L choices.
  • Therefore, P[individual has disease | I(O, Ace)] < 1/L

SLIDE 52

Summary

  • Many systems assume privacy/security is guaranteed because the adversary does not know the algorithm.
    – This is bad …

  • Simulatable algorithms avoid this problem.
    – Ideally, choices made by the algorithm should be simulatable by the adversary.

  • Anonymization algorithms are also susceptible to adversaries who know the algorithm or the objective function.

  • Transparent anonymization limits the inference an attacker (who knows the algorithm) can make about sensitive values.

SLIDE 53

Next Class

  • Composition of privacy
  • Differential Privacy


SLIDE 54

References

  • A. Machanavajjhala, J. Gehrke, D. Kifer, M. Venkitasubramaniam, “L-Diversity: Privacy beyond k-anonymity”, ICDE 2006
  • K. Kenthapadi, N. Mishra, K. Nissim, “Simulatable Auditing”, PODS 2005
  • R. Wong, A. Fu, K. Wang, J. Pei, “Minimality attack in privacy preserving data publishing”, VLDB 2007
  • X. Xiao, Y. Tao, N. Koudas, “Transparent Anonymization: Thwarting adversaries who know the algorithm”, TODS 2010