kb-Anonymity: A Model for Anonymized
Behavior-Preserving Test and Debugging Data
Aditya Budi, David Lo, Lingxiao Jiang, Lucia
Behavior Preservation, Privacy Preservation
Where is the best place to stay?
PLDI, San Jose Convention Center, June 7th, 2011 kb-Anonymity
In-house during development process
Post-deployment in user fields
Testing & Debugging
Gender  Zipcode  DOB      Disease
Male    95110    6/7/72   Heart Disease
Female  95110    1/31/80  Hepatitis
…       …        …        …

Name  DOB      Gender  Zipcode
Bob   6/7/72   Male    95110
Beth  1/31/80  Female  95110
…     …        …       …
Gender  Zipcode  DOB  Disease
Male    *        *    Heart Disease
Female  *        *    Hepatitis
…       …        …    …
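The suppression step shown in the table above can be sketched in a few lines. This is only an illustration: the function name `suppress` and the choice of `Zipcode` and `DOB` as the suppressed columns are assumptions made to match the starred columns, not something the slides define.

```python
# Assumption: the quasi-identifiers to suppress are the two starred columns.
QUASI_IDENTIFIERS = ("Zipcode", "DOB")


def suppress(record, quasi_identifiers=QUASI_IDENTIFIERS):
    """Return a copy of `record` with each quasi-identifier replaced by '*'."""
    return {
        field: ("*" if field in quasi_identifiers else value)
        for field, value in record.items()
    }


patients = [
    {"Gender": "Male", "Zipcode": "95110", "DOB": "6/7/72", "Disease": "Heart Disease"},
    {"Gender": "Female", "Zipcode": "95110", "DOB": "1/31/80", "Disease": "Hepatitis"},
]

# The originals are left untouched; only the released copies are masked.
released = [suppress(p) for p in patients]
```

After suppression the released rows can no longer be joined with the name table on Zipcode and DOB, which is the linkage the two raw tables allow.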
Sex     Zipcode  DOB      Disease
Male    95110    6/7/72   Heart Disease
Female  95110    1/31/80  Hepatitis
…       …        …        …
USA
CA, USA
San Jose
95***
1972
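Generalization replaces a value with a coarser one instead of removing it entirely. The sketch below shows two such steps, masking trailing zipcode digits and reducing a date of birth to its year. The helper names and the assumption that two-digit years belong to the 1900s are illustrative only.

```python
def generalize_zipcode(zipcode, keep=2):
    """Keep the first `keep` digits and mask the rest with '*'."""
    return zipcode[:keep] + "*" * (len(zipcode) - keep)


def generalize_dob_to_year(dob, century="19"):
    """Reduce an m/d/yy date string to its year.

    Assumption: two-digit years are in the 1900s, as in the example table.
    """
    year = dob.split("/")[-1]
    return century + year if len(year) == 2 else year


masked_zip = generalize_zipcode("95110")
birth_year = generalize_dob_to_year("6/7/72")
```

Each further level of generalization (city, state, country) hides more about the individual but also keeps less of the original value.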
How to anonymize?
Follow guidance provided by the k-anonymity privacy model
Each tuple has at least k-1 indistinguishable peers
Always generate concrete values
Remove indistinguishable tuples
How useful is the anonymized data?
Preserve utility for testing and debugging
Each anonymized tuple exhibits certain kinds of behavior
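The peer condition from the k-anonymity model can be checked mechanically. The following is a minimal sketch of that check only; it does not model behavior preservation, and the function name, the sample rows, and the choice of quasi-identifiers are hypothetical.

```python
from collections import Counter


def is_k_anonymous(tuples, quasi_identifiers, k):
    """True if every tuple shares its quasi-identifier values with at
    least k-1 other tuples (i.e. each group has size >= k)."""
    keys = [tuple(t[q] for q in quasi_identifiers) for t in tuples]
    group_sizes = Counter(keys)
    return all(group_sizes[key] >= k for key in keys)


# Hypothetical released rows, already generalized.
rows = [
    {"Sex": "Male", "Zipcode": "95***", "DOB": "1972", "Disease": "Heart Disease"},
    {"Sex": "Female", "Zipcode": "95***", "DOB": "1972", "Disease": "Hepatitis"},
]

peers_ok = is_k_anonymous(rows, ("Zipcode", "DOB"), k=2)
peers_with_sex = is_k_anonymous(rows, ("Sex", "Zipcode", "DOB"), k=2)
```

With `Zipcode` and `DOB` as quasi-identifiers the two rows are peers of each other, so the check passes for k=2. Adding `Sex` makes each row unique and the check fails, which shows how much the outcome depends on the chosen quasi-identifiers.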
$r = \langle f_1, \ldots, f_i^r, \ldots, f_n \rangle$
OpenHospital, iTrust, PDManager
From SourceForge
Modified to deal with integers only
Randomly generated test data for anonymization
x-axis: different configurations; y-axis: running time in seconds; Different colors represent the sizes of different original data sets
Rely on data owners to choose appropriate QIs
Do not maintain data statistics, and thus not suitable
May handle string constraints based on JPF + jFuzz
[ISSRE 2010] considers the same statement coverage; focuses on choosing better QIs, then uses a standard k-anonymity algorithm
[USENIX Security 2008, ASPLOS 2008, ICSE 2011] consider path conditions; focus on anonymizing a single tuple
[USENIX Security 2003] focuses on anonymizing a single tuple only
These studies complement ours in cases when only a limited number of failed test inputs are considered.