CS573 Data Privacy and Security
Li Xiong
Department of Mathematics and Computer Science Emory University
Statistical Databases
Statistical databases: Definitions, Early query restriction methods, Output perturbation and differential ...
– pure statistical database:
– ordinary database with statistical access
Slide credit: Dr Lawrie Brown (UNSW@ADFA) for “Computer Security: Principles and Practice”, 1/e, by William Stallings and Lawrie Brown, Chapter 5 “Database Security”.
– user may infer confidential information about individual entities represented in the SDB
– such an inference is called a compromise
Partial slide credit: Computer Security and Statistical Databases By William Stallings (http://www.informit.com/articles/article.aspx?p=782117)
Slide credit: Computer Security and Statistical Databases By William Stallings (http://www.informit.com/articles/article.aspx?p=782117)
Figure: data perturbation. Noise is added to the original database to produce a perturbed database; Query 1 (User 1) and Query 2 (User 2) are answered from the perturbed database.
Figure: output perturbation. Query 1 (User 1) and Query 2 (User 2) are evaluated on the original database, and noise is added to the results.
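The two perturbation approaches pictured above can be contrasted with a minimal sketch. Everything here is an assumption made for illustration: the `salaries` data, the Gaussian noise, the `scale` value, and the function names do not come from the slides.

```python
import random

# Hypothetical sensitive values; not taken from the slides.
salaries = [52_000, 61_000, 58_500, 75_000, 49_000]


def data_perturbation(values, scale, rng):
    """Add noise once to every stored value and return the perturbed database."""
    return [v + rng.gauss(0, scale) for v in values]


def output_perturbation(values, scale, rng):
    """Keep the original values and add fresh noise to each query result."""
    def answer_sum():
        return sum(values) + rng.gauss(0, scale)
    return answer_sum


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Data perturbation: both users query the same perturbed database.
perturbed_db = data_perturbation(salaries, 1000, rng)
data_user1 = sum(perturbed_db)
data_user2 = sum(perturbed_db)

# Output perturbation: each answer is computed on the original data
# and receives its own noise draw.
noisy_sum = output_perturbation(salaries, 1000, rng)
out_user1 = noisy_sum()
out_user2 = noisy_sum()
```

In this sketch, repeating a query against the perturbed database returns the same answer every time, whereas with output perturbation the noise is drawn again for each answer.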
(Age = 42 & Sex = Male & Employer = ABC) ) = B What if B = A+1?
(Age = 42 & Sex = Male & Employer = ABC) ) = B If B = A+1
(Age = 42 & Sex = Male & Employer = ABC) & Diagnosis = Schizophrenia)
Positively or negatively compromised!
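The inference above can be illustrated with a small sketch of a difference attack on count queries. The slide's full query text is not shown, so this sketch assumes one common form: the attacker picks a broad "padding" condition, counts it (A), then counts the padding condition OR the target's identifying attributes (B). The records, the padding condition, and all function names are hypothetical.

```python
# Hypothetical records; not taken from the slides.
records = [
    {"age": 42, "sex": "M", "employer": "ABC", "diagnosis": "schizophrenia"},
    {"age": 35, "sex": "F", "employer": "ABC", "diagnosis": "none"},
    {"age": 51, "sex": "F", "employer": "XYZ", "diagnosis": "none"},
    {"age": 29, "sex": "F", "employer": "XYZ", "diagnosis": "schizophrenia"},
    {"age": 42, "sex": "M", "employer": "XYZ", "diagnosis": "none"},
]


def count(pred):
    """The only interface the attacker has: a count over matching records."""
    return sum(1 for r in records if pred(r))


def padding(r):
    # Assumed padding condition chosen by the attacker to enlarge the query set.
    return r["sex"] == "F"


def target(r):
    # Attributes the attacker already knows about the victim.
    return r["age"] == 42 and r["sex"] == "M" and r["employer"] == "ABC"


def difference_attack():
    a = count(padding)
    b = count(lambda r: padding(r) or target(r))
    if b != a + 1:
        return None  # the known attributes do not isolate one individual
    c = count(lambda r: padding(r)
              or (target(r) and r["diagnosis"] == "schizophrenia"))
    # c - a is 1 (positive compromise) or 0 (negative compromise).
    return {"A": a, "B": b, "has_diagnosis": c - a == 1}


result = difference_attack()
```

Every individual query here covers at least three records, so a simple minimum query-set-size rule would not block it; the leak comes from the difference between overlapping queries.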
can be modeled as an equation $q = b_1 y_1 + b_2 y_2 + \dots + b_M y_M$
equations $BY = E$, where $B$ is an $n \times M$ binary matrix, $Y$ is the vector of sensitive values, and $E$ is the vector of query results
queries and update it when a new query is issued
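The linear-system view above suggests a simple audit, sketched here under stated assumptions: queries are sums over real-valued sensitive values, and a value counts as compromised when it is uniquely determined by the answered queries, which happens exactly when the corresponding unit vector lies in the row space of the binary matrix B. The function names and the rank routine are illustrative, not the slides' algorithm.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by exact Gauss-Jordan elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [x - factor * p for x, p in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


def compromised_indices(B):
    """Indices i for which y_i is uniquely determined by the rows of B."""
    M = len(B[0])
    base = rank(B)
    hits = []
    for i in range(M):
        unit = [1 if j == i else 0 for j in range(M)]
        # Appending the unit vector leaves the rank unchanged
        # exactly when it is already in the row space of B.
        if rank(B + [unit]) == base:
            hits.append(i)
    return hits


def safe_to_answer(answered, new_query):
    """Audit step: would answering new_query expose any single value?"""
    return not compromised_indices(answered + [new_query])


# Two sum queries over four individuals: y1+y2+y3 and y1+y2.
exposed = compromised_indices([[1, 1, 1, 0], [1, 1, 0, 0]])
```

Here `exposed` contains index 2, because subtracting the second query from the first isolates the third value. An auditor could call `safe_to_answer` before each new query and refuse any query that returns `False`.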