
A Cryptography-Flavored Approach to Privacy in Public Databases



  1. A Cryptography-Flavored Approach to Privacy in Public Databases
     Drineas, Dwork, Goldberg, Isard, Redz, Smith, Stockmeyer

     Think “Census”
     • Method for sanitizing a database
     • Meaningful statistical analysis
     • Preservation of individuals’ privacy
     • What do we mean?

     “Privacy” in English
     • Protection from being brought to the attention of others [Gavison]
       - inherently valuable
       - attention invites further privacy loss, e.g., info
     • One’s privacy is maintained to the extent that one blends in with the crowd.
     • Crowd size exceeds threshold T

     Focus on Geometric Data
     • Real database (RDB) consists of n points in d-dimensional space (say, unit ball)
       - points are unlabeled
     • Publish sanitized database (SDB)
       - candidate sanitization procedure (later)

  2. Relative Notion of Isolation
     [Figure: labels T, x, δ, q, cδ]

     Adversary: The Isolator
     • Inputs to a c-isolator:
       - SDB
       - auxiliary information z
     • Output
     • Success occurs if

     Cryptographic Flavoring
     • SDB shouldn’t help the isolator “too much”
     • Definition of “not too much” should be fairly forgiving, e.g., advantage obtained from seeing the SDB may be, say, n^(1+ε)

     Isolation Does Not Imply Failure of Sanitization
     • Cynthia publishes her point p on web
     • I(SDB, Cynthia’s web site) = p
     • δ = 0 and ball of radius cδ contains only one RDB point
     • Not the fault of the sanitization procedure!
     • I’(Cynthia’s web site) = p
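The isolation test the slides rely on can be sketched in a few lines of Python. The slide text does not spell out the success condition, so this sketch assumes the reading suggested by the figure labels and the Cynthia example: q c-isolates x when the ball of radius cδ around q, with δ = D(q, x), contains fewer than T points of the RDB. The function names `distance` and `c_isolates` and the sample data are illustrative only.

```python
import math

def distance(a, b):
    """Euclidean distance between two points given as equal-length sequences."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def c_isolates(q, x, rdb, c, T):
    """Assumed definition: q c-isolates x if the ball of radius c * delta
    around q, where delta = D(q, x), holds fewer than T points of the RDB."""
    delta = distance(q, x)
    radius = c * delta
    crowd = sum(1 for p in rdb if distance(q, p) <= radius)
    return crowd < T

# Hypothetical toy database: three clustered points and one outlier.
rdb = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.9, 0.9)]

# A guess very close to the outlier isolates it: only one RDB point in the ball.
print(c_isolates((0.9, 0.89), (0.9, 0.9), rdb, c=4, T=2))   # True

# A guess near a clustered point does not: the ball catches the whole cluster.
print(c_isolates((0.05, 0.05), (0.0, 0.0), rdb, c=4, T=2))  # False
```

With q equal to a published point p, δ is 0 and the ball contains only p itself, which matches the Cynthia case: isolation succeeds without any help from the SDB.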

  3. Candidate: Effective Sanitization

     Distribution on Databases?
     • Don’t want to deal with crypto-like definitions, in which, say, the sum of every 7th element is congruent to 23 mod 51
     • Take statistician’s approach: each point in the RDB is an independent sample from a single fixed distribution

     Meaningful Statistical Analysis
     • Dream: find a large class of algorithms that “perform well” on sanitized data
     • Start with clustering
       - clusterings have measures of quality (diameter, conductance, etc.)
     • See how measures are preserved
       - under sanitization
       - under de-sanitization

     Candidate Sanitization Procedure
     • For each x ∈ RDB
       - Find T_x = distance to Tth nearest neighbor
       - Choose x’ ∈_R B(x, T_x)
     • Complements definition of c-isolation
       - if q c-isolates x then D(q,x) ≤ T_x/(c-1)
       - consequence: high dimensionality is our friend
     • Intuition:
       - perturb minimally to prevent isolation
       - outliers randomized to oblivion
       - kills isolated anomalies, maintains group anomalies
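The candidate sanitization procedure can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes that "∈_R" means a uniformly random choice, that the T-th nearest neighbor excludes x itself, and that distances are Euclidean. The names `radius_to_tth_neighbor`, `sample_in_ball`, and `sanitize` are hypothetical.

```python
import math
import random

def distance(a, b):
    """Euclidean distance between two points given as equal-length sequences."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def radius_to_tth_neighbor(i, rdb, T):
    """T_x for x = rdb[i]: distance to its T-th nearest neighbor.
    Assumption: x itself is not counted among its neighbors."""
    if not 1 <= T <= len(rdb) - 1:
        raise ValueError("T must be between 1 and len(rdb) - 1")
    dists = sorted(distance(rdb[i], p) for j, p in enumerate(rdb) if j != i)
    return dists[T - 1]

def sample_in_ball(center, radius, rng):
    """Draw a point uniformly at random from the ball B(center, radius)."""
    d = len(center)
    norm = 0.0
    while norm == 0.0:  # a zero vector has no direction; redraw (very unlikely)
        direction = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(g * g for g in direction))
    # Scaling the radius by U^(1/d) makes the sample uniform in volume.
    r = radius * rng.random() ** (1.0 / d)
    return tuple(ci + r * g / norm for ci, g in zip(center, direction))

def sanitize(rdb, T, rng=None):
    """Replace each x with a random x' in B(x, T_x)."""
    rng = rng or random.Random()
    return [sample_in_ball(x, radius_to_tth_neighbor(i, rdb, T), rng)
            for i, x in enumerate(rdb)]

# Hypothetical toy database: three clustered points and one outlier.
rdb = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.9, 0.9)]
sdb = sanitize(rdb, T=2, rng=random.Random(0))
for x, x_prime in zip(rdb, sdb):
    print(x, "->", x_prime, "moved", round(distance(x, x_prime), 3))
```

In this toy data the clustered points have T_x = 0.1 and move very little, while the outlier has T_x of about 1.2 and can land almost anywhere, which is the "outliers randomized to oblivion" intuition.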
