Privacy Skyline: Privacy with Multidimensional Adversarial Knowledge (PowerPoint PPT Presentation)

SLIDE 1

Privacy Skyline: Privacy with Multidimensional Adversarial Knowledge

Bee-Chung Chen, Kristen LeFevre (University of Wisconsin – Madison)
Raghu Ramakrishnan (Yahoo! Research)

SLIDE 2

Example: Medical Record Dataset

  • A data owner wants to release data for medical research
  • An adversary wants to discover individuals’ sensitive info

  Name   Age  Gender  Zipcode  Disease
  Ann    20   F       12345    AIDS
  Bob    24   M       12342    Flu
  Cary   23   F       12344    Flu
  Dick   27   M       12343    AIDS
  Ed     35   M       12412    Flu
  Frank  34   M       12433    Cancer
  Gary   31   M       12453    Cancer
  Tom    38   M       12455    AIDS

SLIDE 3

What If the Adversary Knows …

  Release candidate D* (records grouped into QI-groups; names in parentheses are for reference only):

  Group  Age  Gender  Zipcode  (Name)
  1      20   F       12345    (Ann)
  1      24   M       12342    (Bob)
  1      23   F       12344    (Cary)
  1      27   M       12343    (Dick)
  2      35   M       12412    (Ed)
  2      34   M       12433    (Frank)
  2      31   M       12453    (Gary)
  2      38   M       12455    (Tom)

  Group  Disease
  1      AIDS, AIDS, Flu, Flu
  2      AIDS, Cancer, Cancer, Flu

  • Without any additional knowledge, Pr(Tom has AIDS) = 1/4
  • What if the adversary knows “Tom does not have Cancer and Ed has Flu”?
    Pr(Tom has AIDS | above data and above knowledge) = 1
    (a small sketch that reproduces these two numbers follows below)
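The two probabilities above can be checked mechanically. Below is a minimal Python sketch (not part of the original talk; all names and helpers are illustrative) that enumerates the possible assignments of Group 2's disease values, i.e., the reconstructions introduced later in the talk, and counts how many are consistent with the adversary's knowledge.

```python
from itertools import permutations

# Group 2 of the release candidate D*: four people and the multiset of
# disease values published for that group.
people = ["Ed", "Frank", "Gary", "Tom"]
diseases = ["AIDS", "Cancer", "Cancer", "Flu"]

# Each distinct assignment of the disease multiset to the people is one
# possible world (a "reconstruction" in the possible-world semantics).
worlds = set(permutations(diseases))          # 4!/2! = 12 distinct worlds

def prob(event, knowledge=lambda w: True):
    """Pr(event | D*, knowledge): fraction of knowledge-consistent worlds
    in which the event holds."""
    consistent = [w for w in worlds if knowledge(w)]
    return sum(event(w) for w in consistent) / len(consistent)

tom, ed = people.index("Tom"), people.index("Ed")

# No additional knowledge: Pr(Tom has AIDS) = 3/12 = 1/4
print(prob(lambda w: w[tom] == "AIDS"))

# K: "Tom does not have Cancer and Ed has Flu"  ->  probability becomes 1
K = lambda w: w[tom] != "Cancer" and w[ed] == "Flu"
print(prob(lambda w: w[tom] == "AIDS", K))
```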

SLIDE 4

  • Privacy with Adversarial Knowledge
  • Bayesian privacy definition: A released dataset D* is safe if, for any person t and any sensitive value s,
    Pr( t has s | D*, Adversarial Knowledge ) < c
    – This probability is the adversary’s confidence that person t has sensitive value s, after he sees the released dataset
    – Equivalent definition: D* is safe if max_{t,s} Pr( t has s | D*, Adversarial Knowledge ) < c; this maximum is the maximum breach probability
    – Prior work following this intuition: [Machanavajjhala et al., 2006; Martin et al., 2007; Xiao and Tao, 2006]

SLIDE 5

  • Questions to be Addressed
  • Bayesian privacy criterion:
    max Pr( t has s | D*, Adversarial Knowledge ) < c
  • How to describe various kinds of adversarial knowledge?
    – We provide intuitive knowledge expressions that cover three kinds of common adversarial knowledge
  • How to analyze data safety in the presence of various kinds of possible adversarial knowledge?
    – We propose a skyline tool for what-if analysis in the “knowledge space”
  • How to efficiently generate a safe dataset to release?
    – We develop algorithms (based on a “congregation” property) orders of magnitude faster than the best known dynamic programming technique [Martin et al., 2007]

SLIDE 6

  • Outline
  • Theoretical framework (possible-world semantics)

– How the privacy breach is defined

  • Three-dimensional knowledge expression
  • Privacy Skyline
  • Efficient and scalable algorithms
  • Experimental results
  • Conclusion and future work
SLIDE 7

  • Theoretical Framework

  [Tables: the original dataset D from slide 2 and the release candidate D* from slide 3]

  • Assume each person has only one sensitive value (in the talk)
  • The sensitive attribute can be set-valued (in the paper)
  • Each group is called a QI-group
  • This abstraction includes
    – Generalization-based methods
    – Bucketization
SLIDE 8

  • Theoretical Framework: Reconstruction
  • A reconstruction of D* is intuitively a possible original dataset (possible world) that would generate D* by using the grouping mechanism: fix the quasi-identifier values and permute the sensitive values within each QI-group
  • Example: two reconstructions of Group 2 of the release candidate D*
    – Ed has AIDS, Frank has Cancer, Gary has Cancer, Tom has Flu
    – Ed has Flu, Frank has Cancer, Gary has Cancer, Tom has AIDS
  • Assumption: Without any additional knowledge, every reconstruction is equally likely

SLIDE 9

  • Probability Definition
  • Knowledge expression K: Logic sentence [Martin et al., 2007]
    E.g., K = (Tom[S] ≠ Cancer) ∧ (Ed[S] = Flu)
    Pr( Tom[S] = AIDS | K, D* ) ≡ (# of reconstructions of D* that satisfy K ∧ (Tom[S] = AIDS)) / (# of reconstructions of D* that satisfy K)
  • Worst-case disclosure
    – Knowledge expressions may also include variables
      E.g., K = (Tom[S] ≠ x) ∧ (u[S] ≠ y) ∧ (v[S] = s → Tom[S] = s)
    – Maximum breach probability: max Pr( t[S] = s | D*, K )
      The maximization is over the variables t, u, v, s, x, y, substituting them with constants in the dataset
    (a brute-force sketch of this maximization follows below)
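As an illustration of the worst-case maximization, here is a brute-force Python sketch (not from the talk, and far too slow for real data, which is exactly the problem the later algorithms address): it enumerates the reconstructions of the small running example and tries every substitution of the variables in an expression of the form (t[S] ≠ x) ∧ (u[S] = y) ∧ (v[S] = s → t[S] = s). All function names are illustrative.

```python
from itertools import permutations, product

# Release candidate D*: each QI-group lists its people and the multiset of
# sensitive values published for the group.
groups = [
    (["Ann", "Bob", "Cary", "Dick"], ["AIDS", "AIDS", "Flu", "Flu"]),
    (["Ed", "Frank", "Gary", "Tom"], ["AIDS", "Cancer", "Cancer", "Flu"]),
]

def reconstructions():
    """Yield every reconstruction of D* as a dict person -> sensitive value."""
    per_group = [set(permutations(vals)) for _, vals in groups]
    for combo in product(*per_group):
        world = {}
        for (people, _), assignment in zip(groups, combo):
            world.update(zip(people, assignment))
        yield world

def breach_prob(knowledge, target, value):
    """Pr(target[S] = value | D*, knowledge) by counting reconstructions."""
    consistent = [w for w in reconstructions() if knowledge(w)]
    return sum(w[target] == value for w in consistent) / len(consistent)

people = [p for ps, _ in groups for p in ps]
values = sorted({v for _, vs in groups for v in vs})

# Worst case over all substitutions of t, u, v, x, y, s with constants in D*.
best = 0.0
for t, u, v, x, y, s in product(people, people, people, values, values, values):
    if len({t, u, v}) < 3:
        continue                      # t, u, v should be three different people
    K = lambda w: w[t] != x and w[u] == y and (w[v] != s or w[t] == s)
    try:
        best = max(best, breach_prob(K, t, s))
    except ZeroDivisionError:
        pass                          # knowledge inconsistent with D*; skip it
print(best)                           # maximum breach probability for the example
```

Even on this toy table the search touches thousands of substitutions, which is why the congregation property and the closed-form statistics later in the talk matter.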

SLIDE 10

  • What Kinds of Expressions
  • Privacy criterion: Release candidate D* is safe if
    max Pr( t[S] = s | D*, K ) < c
  • Prior work by Martin et al., 2007
    – K is a conjunction of m implications
      E.g., K = (u1[S] = x1 → v1[S] = y1) ∧ … ∧ (um[S] = xm → vm[S] = ym)
    – Not intuitive: What is the practical meaning of m implications?
    – Some limitations: Some simple knowledge cannot be expressed
  • Complexity for general logic sentences
    – Computing the breach probability is NP-hard
  • Goal: Identify classes of expressions that are
    – Useful (intuitive & cover common adversarial knowledge)
    – Computationally feasible

SLIDE 11

  • Outline
  • Theoretical framework
  • Three-dimensional knowledge expression

– Tradeoff between expressiveness and feasibility

  • Privacy Skyline
  • Efficient and scalable algorithms
  • Experimental results
  • Conclusion and future work
SLIDE 12

  • Kinds of Adversarial Knowledge

Assume a person has only one record in the dataset in this talk (multiple sensitive values per person are covered in the paper)

  • Adversary’s target: Whether Tom has AIDS
  • Knowledge about the target: Tom does not have Cancer
  • Knowledge about other people: Ed has Flu
  • Knowledge about relationships: Ann has the same sensitive value as Tom

  [Table: the release candidate D* from slide 3]

SLIDE 13

  • 3D Knowledge Expression
  • Adversary’s target: Whether person t has sensitive value s
  • Adversary’s knowledge ℒt,s(ℓ,k,m):
    – Knowledge about the target: ℓ sensitive values that t does not have
      t[S] ≠ x1 ∧ … ∧ t[S] ≠ xℓ
    – Knowledge about others: The sensitive values of k other people
      u1[S] = y1 ∧ … ∧ uk[S] = yk
    – Knowledge about relationships: A group of m people who have the same sensitive value as t
      (v1[S] = s → t[S] = s) ∧ … ∧ (vm[S] = s → t[S] = s)
  • Worst-case guarantee: max Pr( t[S] = s | D*, ℒt,s(ℓ,k,m) ) < c
    – No matter what those ℓ sensitive values, those k people, and those m people are, the adversary should not be able to predict any person t to have any sensitive value s with confidence ≥ c
    (a sketch of ℒt,s as an executable predicate follows below)
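For concreteness, here is a small Python sketch (illustrative only; the helper name make_L is not from the talk) that builds one instance of ℒt,s(ℓ,k,m) as a predicate over a reconstruction, mirroring the three conjuncts above.

```python
def make_L(t, s, xs, others, same_as_t):
    """Build one instance of the 3D expression L_{t,s}(l, k, m) as a predicate
    on a reconstruction (a dict mapping person -> sensitive value).

      xs        : the l sensitive values the adversary knows t does NOT have
      others    : dict of k other people -> their known sensitive values
      same_as_t : the m people known to have the same sensitive value as t
    """
    def L(world):
        if any(world[t] == x for x in xs):                  # t[S] != x_1 ... x_l
            return False
        if any(world[u] != y for u, y in others.items()):   # u_i[S] = y_i
            return False
        # (v_i[S] = s -> t[S] = s) for each of the m people v_i
        return all(world[v] != s or world[t] == s for v in same_as_t)
    return L

# L_{Tom,AIDS}(1, 1, 1) for the running example:
L = make_L("Tom", "AIDS", xs=["Cancer"], others={"Ed": "Flu"}, same_as_t=["Frank"])
print(L({"Ed": "Flu", "Frank": "Cancer", "Gary": "Cancer", "Tom": "AIDS"}))  # True
```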

SLIDE 14

  • Outline
  • Theoretical framework
  • Three-dimensional knowledge expression
  • Privacy Skyline

– Skyline privacy criterion
– Skyline exploratory tool

  • Efficient and scalable algorithms
  • Experimental results
  • Conclusion and future work
SLIDE 15

  • Basic 3D Privacy Criterion
  • Given a knowledge threshold (ℓ, k, m) and a confidence threshold c, release candidate D* is safe if
    max Pr( t[S] = s | D*, ℒt,s(ℓ,k,m) ) < c
  • Example: (ℓ, k, m) = (3, 4, 2) and c = 0.5. A release candidate is safe if no adversary with the following knowledge can predict any person t to have any sensitive value s with confidence ≥ 0.5:
    – Any 3 sensitive values that t does not have
    – The sensitive values of any 4 people
    – Any 2 people having the same sensitive value as t
  • k-anonymity and ℓ-diversity are two special cases of this criterion

  [Figure: the point (3, 4, 2) in the (ℓ, k, m) knowledge space, drawn on the m = 2 plane]

SLIDE 16

  • Skyline Privacy Criterion
  • Given a set of skyline points (ℓ1, k1, m1, c1), …, (ℓr, kr, mr, cr), release candidate D* is safe if it is safe with respect to every point (a sketch follows below)

  [Figure: skyline points (3, 4, 2), (2, 7, 2), (5, 2, 2) in the (ℓ, k, m) knowledge space, drawn on the m = 2 plane]
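Stated as code, the skyline criterion is just a conjunction of the basic 3D criterion over all skyline points. The sketch below is illustrative; max_breach_prob stands for whatever routine computes max Pr( t[S] = s | D*, ℒt,s(ℓ,k,m) ), e.g., the SkylineCheck computation described later.

```python
def safe_for_point(D_star, point, max_breach_prob):
    """Basic 3D criterion: D* is safe for (l, k, m, c) if the worst-case breach
    probability under L_{t,s}(l, k, m) stays below the confidence threshold c."""
    l, k, m, c = point
    return max_breach_prob(D_star, l, k, m) < c

def safe_for_skyline(D_star, skyline, max_breach_prob):
    """Skyline criterion: safe iff safe with respect to every skyline point."""
    return all(safe_for_point(D_star, p, max_breach_prob) for p in skyline)

# e.g. safe_for_skyline(D_star, [(3, 4, 2, 0.5), (2, 7, 2, 0.5), (5, 2, 2, 0.5)], f)
```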

SLIDE 17

  • Skyline Exploratory Tool
  • In the skyline privacy criterion
    – The data owner specifies a set of skyline points
    – The system checks whether a release candidate is safe
  • Skyline exploratory tool
    – Given a release candidate
    – Find the set of skyline points such that
      • The release candidate is safe w.r.t. any point beneath the skyline, and
      • The release candidate is unsafe w.r.t. any point above the skyline
    (a grid-sweep sketch follows below)

  [Figure: the knowledge skyline separating the safe region (beneath) from the unsafe region (above) in the (ℓ, k, m) space, drawn on the m = 2 plane]
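One simple way to realize such a tool, sketched below, is to sweep a grid of (ℓ, k) values at a fixed m and keep the maximal points that are still safe. This is an illustrative sketch, not the paper's SkylineFind algorithm; it assumes an is_safe oracle for single points and the monotonicity suggested by the figure (enlarging ℓ, k, or m never makes a release candidate safer).

```python
def knowledge_skyline(is_safe, max_l, max_k, m, c):
    """Sweep the (l, k) grid at a fixed m and confidence c, and return the
    maximal (l, k, m) points for which the release candidate is still safe."""
    frontier = []
    for l in range(max_l + 1):
        k = max_k
        while k >= 0 and not is_safe(l, k, m, c):   # safety is monotone in k
            k -= 1
        if k >= 0:
            frontier.append((l, k, m))
    # keep only points not dominated by another frontier point
    return [p for p in frontier
            if not any(q != p and q[0] >= p[0] and q[1] >= p[1] for q in frontier)]
```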

SLIDE 18

  • Outline
  • Theoretical framework
  • Three-dimensional knowledge expression
  • Privacy Skyline
  • Efficient and scalable algorithms

– SkylineCheck (in this talk)

  • Check whether a given release candidate is safe w.r.t. a skyline

– SkylineAnonymize (in the paper)

  • Generate a safe release candidate that maximizes a utility function

– SkylineFind (in the technical report)

  • Find the skyline of a given release candidate
  • Experimental results
  • Conclusion and future work
SLIDE 19

  • Check Safety for a Single Point
  • Given (ℓ, k, m, c), check

max Pr( t[S] = s | D*, ℒt,s(ℓ,k,m) ) < c

– ℒt,s(ℓ,k,m) = Kt(ℓ) ∧ Ku(k) ∧ Kv,t(m)

  • Kt(ℓ) = t[S] ≠ x1 ∧ … ∧ t[S] ≠ xℓ
  • Ku(k) = u1[S] = y1 ∧ … ∧ uk[S] = yk
  • Kv,t(m) = (v1[S] = s → t[S] = s) ∧ … ∧ (vm[S] = s → t[S] = s)

– Variables:

  • People: t, u1, …, uk, v1, …, vm
  • Sensitive values: x1, …, xℓ, y1, …, yk
  • Technical challenge:

– How to find the variable assignment that maximizes the breach probability

SLIDE 20

  • Check Safety for a Single Point
  • max Pr( t[S] = s | D*, ℒt,s(ℓ,k,m) )

– Variables:

  • People: t, u1, …, uk, v1, …, vm
  • Sensitive values: x1, …, xℓ, y1, …, yk
  • In principle, we need to
    – Consider all possible ways of assigning person variables into QI-groups
    – For each assignment of person variables, find the assignment of sensitive-value variables that maximizes the breach probability (this step has a closed-form solution)

  [Table: an example release candidate D* with four QI-groups of four records each]

  Example assignment of person variables:
    Group 1: t, u1
    Group 2: u2, v1, v2
    Group 3: u3, u4
    Group 4: v3, v4

SLIDE 21

  • “Congregation” Property
  • max Pr( t[S] = s | D*, ℒt,s(ℓ,k,m) )
    – Variables:
      • People: t, u1, …, uk, v1, …, vm
      • Sensitive values: x1, …, xℓ, y1, …, yk
  • When the breach probability is maximized,
    – All of u1, …, uk would congregate in one QI-group
    – All of v1, …, vm would congregate in one QI-group
    – t would be in one of the above two groups

  [Table: the same four-QI-group release candidate D* as on the previous slide]

  Example assignment of person variables at the maximum:
    Group 2: t, u1, …, uk
    Group 4: v1, …, vm

SLIDE 22

  • Five Sufficient Statistics
  • Three possible cases at the maximum
    – Case 1: All person variables are in one QI-group (A)
      max Pr(…) = 1 / [ (min_A CF1(A)) + 1 ]
    – Case 2: t and u1, …, uk are in one QI-group (B); v1, …, vm are in one QI-group (C)
      max Pr(…) = 1 / [ (min_B CF2(B)) ⋅ (min_C CF3(C)) + 1 ]
    – Case 3: t and v1, …, vm are in one QI-group (D); u1, …, uk are in one QI-group (E)
      max Pr(…) = 1 / [ (min_D CF4(D)) ⋅ (min_E CF5(E)) + 1 ]
  • For a fixed QI-group, CF1, …, CF5 are closed-form formulas

SLIDE 23

  • SkylineCheck Algorithm
  • Keep 5 sufficient statistics (5 floating-point variables) for each skyline point
  • Single-scan algorithm (a skeleton follows below)
    – Scan the dataset once
    – During the scan, update the 5 sufficient statistics for each skyline point
    – Compute the maximum breach probability based on these statistics
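The overall shape of the algorithm can be sketched as follows (illustrative Python, not the authors' code). The closed-form quantities CF1, …, CF5 are given in the paper but not on the slides, so they appear here as an assumed callback cf(i, group, point); details such as requiring distinct QI-groups in cases 2 and 3 are also omitted.

```python
def skyline_check(qi_groups, skyline, cf):
    """Single-scan SkylineCheck skeleton: maintain the minima of CF1..CF5 for
    every skyline point, then combine them with the three-case formulas of the
    'Five Sufficient Statistics' slide.  `cf(i, group, point)` is assumed to
    return the closed-form quantity CF_i of a QI-group for a skyline point."""
    INF = float("inf")
    stats = {point: [INF] * 5 for point in skyline}     # min CF1..CF5 per point

    for group in qi_groups:                             # one scan of the dataset
        for point in skyline:
            s = stats[point]
            for i in range(5):
                s[i] = min(s[i], cf(i + 1, group, point))

    for point in skyline:
        cf1, cf2, cf3, cf4, cf5 = stats[point]
        max_prob = max(1 / (cf1 + 1),            # case 1: all variables together
                       1 / (cf2 * cf3 + 1),      # case 2: t with the u's
                       1 / (cf4 * cf5 + 1))      # case 3: t with the v's
        c = point[3]                             # point = (l, k, m, c)
        if max_prob >= c:
            return False                         # unsafe w.r.t. this skyline point
    return True
```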

SLIDE 24

  • SkylineAnonymize Algorithm
  • Goal: Generate a safe release candidate that maximizes a utility function
  • Partition records into QI-groups by a tree structure (a rough sketch follows below)
    – Adaptation of the Mondrian algorithm by LeFevre et al.
    – The congregation property makes the adaptation easy

  [Figure: a Mondrian-style partitioning tree that splits the records on Gender (M / F), Zipcode (53***, 54***, 55***, 56***), and Age (< 40 / ≥ 40) to form the QI-groups of the release candidate]
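The sketch below conveys only the top-down flavor of such an adaptation; it is not the paper's algorithm. It assumes candidate_splits(part) proposes binary splits of a partition along a quasi-identifier (and returns nothing for partitions that cannot be split further), and skyline_check is the safety test from the previous slide; splitting as finely as safety allows stands in for real utility maximization.

```python
def skyline_anonymize(records, skyline_check, candidate_splits):
    """Greedy, Mondrian-flavored sketch: keep refining the partitioning into
    QI-groups as long as the resulting release candidate stays safe."""
    partitions = [records]                      # start with one big QI-group
    progress = True
    while progress:
        progress = False
        for i, part in enumerate(partitions):
            for left, right in candidate_splits(part):
                trial = partitions[:i] + [left, right] + partitions[i + 1:]
                if skyline_check(trial):        # accept only safety-preserving splits
                    partitions = trial
                    progress = True
                    break
            if progress:
                break
    return partitions                           # final QI-groups of the release
```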

SLIDE 25

  • Outline
  • Theoretical framework
  • Three-dimensional knowledge expression
  • Privacy Skyline
  • Efficient and scalable algorithms
  • Experimental results
  • Conclusion and future work
SLIDE 26

  • Experimental Results
  • Our SkylineCheck algorithm (based on the congregation property) is orders of magnitude faster than the best-known dynamic-programming technique [Martin et al., 2007]
  • Our SkylineAnonymize algorithm scales nicely to datasets substantially larger than main memory
  • A case study shows the usefulness of the skyline exploratory tool

SLIDE 27

  • Efficiency of SkylineCheck

  [Figure: improvement ratio (y-axis, roughly 50–200) vs. number of records (x-axis, 1M–5M), for (ℓ=10, k=10, m=10)]

  Improvement ratio = (Execution time of DP) / (Execution time of ours)

SLIDE 28

  • Scalability of SkylineAnonymize

  [Figure: elapsed time (y-axis, roughly 2000–10000 sec) vs. dataset size (x-axis, 10–100 million records), for knowledge thresholds (ℓ,k,m) = (0,1000,0) and (3,1000,10)]

  Confidence threshold: 1
  Knowledge thresholds: (ℓ,k,m) = (0,1000,0) and (3,1000,10)
  Main memory size: 512 MB
  Record size: 44 bytes per record (100 million records ≈ 4.4 GB)

SLIDE 29

  • Conclusion and Future Work
  • It is important to consider adversarial knowledge in data privacy
  • Tradeoff between expressiveness and feasibility
    – Useful expressions that satisfy the congregation property
  • Future directions:
    – Other kinds of adversarial knowledge
      • Probabilistic knowledge expressions
      • Knowledge about various kinds of social relationships
    – Other kinds of data
      • Search logs
      • Social networks
SLIDE 30

Thank You!

SLIDE 31

Supplementary Slides

SLIDE 32

  • Efficiency of SkylineCheck

  [Figures: improvement ratio of SkylineCheck over DP, four panels:
    – vs. number of records (1M–5M) at (ℓ=10, k=10, m=10), y-axis roughly 50–200
    – vs. ℓ (4–16) at (k=10, m=10), y-axis roughly 50–200
    – vs. k (8–32) at (ℓ=10, m=10), y-axis roughly 200–1000
    – vs. m (4–16) at (ℓ=10, k=10), y-axis roughly 50–250]

  Improvement ratio = (Execution time of DP) / (Execution time of ours)

SLIDE 33

  • Case Study: ℓ-Diverse Dataset
  • Dataset: UCI adult dataset
    – Size: 45,222 records
    – Sensitive attribute: Occupation
  • Create a (c=3, ℓ=6)-diverse release candidate D*
  • How safe is D* at confidence 0.95?
    – D* is only safe for an adversary with knowledge beneath the knowledge skyline
    – E.g., if the adversary knows 5 people’s occupations, then he can predict somebody t’s occupation with confidence ≥ 0.95

  [Figure: knowledge skyline of D* in the (ℓ, k, m) space, with skyline points (0, 4, 0), (1, 3, 1), (2, 2, 2), (3, 1, 2), (2, 1, 3), (4, 0, 3), (3, 0, 4)]

SLIDE 34

  • Related Work
  • k-Anonymity (by Sweeney)
    – Each QI-group has at least k people
    – k-Anonymity is a special case of our 3D privacy criterion with knowledge (0, k−2, 0) and confidence 1, where each person is given a unique sensitive value
  • ℓ-Diversity (by Machanavajjhala et al.)
    – Each QI-group has ℓ well-represented sensitive values
    – (c,ℓ)-Diversity is a special case of our 3D privacy criterion with knowledge (ℓ−2, 0, 0) and confidence c/(c+1)

SLIDE 35

  • Related Work
  • Differential privacy & indistinguishability (Dwork et al.)

– Add noise to query outputs so that no one can tell whether a record is in the original dataset with a high probability

  • Probabilistic disclosure without adversarial knowledge

– Xiao and Tao (SIGMOD’06 and VLDB’06)
– Li et al. (ICDE’07)

SLIDE 36

  • Related Work
  • Query-view privacy

– Require complete independence between sensitive information and the released dataset

  • Deutsch et al. (ICDT’05), Miklau and Suciu (SIGMOD’04), and Machanavajjhala and Gehrke (PODS’06)

– Bound the asymptotic probability of the answer of a Boolean query given views when the domain size → ∞

  • Dalvi et al. (ICDT’05)
SLIDE 37

  • NP-Hardness
  • Checking max Pr( t[S] = s | D*, K ) < c is NP-hard when
    – K = (A1[S] = C1 ↔ B1[S] = D1) ∧ … ∧ (Am[S] = Cm ↔ Bm[S] = Dm)
    – A1, …, Am, B1, …, Bm, C1, …, Cm, D1, …, Dm are constants