Privacy
Christos Dimitrakakis
September 17, 2019
▶ Introduction
▶ Database access models
▶ Privacy in databases
▶ k-anonymity
▶ Differential privacy
Introduction

Privacy in statistical disclosure.
▶ Public analysis of sensitive data.
▶ Publication of “anonymised” data.

Not about cryptography
▶ Secure communication and computation.
▶ Authentication and verification.

An issue of trust
▶ Who to trust and how much.
▶ With what data to trust them.
▶ What you want out of the service.
Database access models

Databases

Example 1 (Typical relational database in a tax office)

ID         | Name          | Salary  | Deposits | Age | Postcode | Profession
1959060783 | Li Pu         | 150,000 | 1e6      | 60  | 1001     | Politician
1946061408 | Sara Lee      | 300,000 | -1e9     | 72  | 1001     | Rentier
2100010101 | A. B. Student | 10,000  | 100,000  | 40  | 1001     | Time Traveller

Database access
▶ When owning the database: Direct look-up.
▶ When accessing a server etc: Query model.
Figure: Database access model. A Python program sends a query to the database system and receives a response.
Queries in SQL

The SELECT statement
▶ SELECT column1, column2 FROM table;
▶ SELECT * FROM table;

Selecting rows
SELECT * FROM table WHERE column = value;

Arithmetic queries
▶ SELECT COUNT(column) FROM table WHERE condition;
▶ SELECT AVG(column) FROM table WHERE condition;
▶ SELECT SUM(column) FROM table WHERE condition;
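The query forms above can be tried from Python with the standard-library sqlite3 module. This is a minimal sketch, not the lecture's own code: the table name taxpayers, the column names and the chosen conditions are assumptions made for illustration, and the rows are those of Example 1.

```python
import sqlite3

# In-memory database holding the tax-office rows of Example 1.
# The table name "taxpayers" and the column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE taxpayers ("
    "id INTEGER, name TEXT, salary REAL, deposits REAL, "
    "age INTEGER, postcode TEXT, profession TEXT)"
)
rows = [
    (1959060783, "Li Pu", 150_000, 1e6, 60, "1001", "Politician"),
    (1946061408, "Sara Lee", 300_000, -1e9, 72, "1001", "Rentier"),
    (2100010101, "A. B. Student", 10_000, 100_000, 40, "1001", "Time Traveller"),
]
conn.executemany("INSERT INTO taxpayers VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

# The SELECT statement: chosen columns only.
names_and_ages = conn.execute("SELECT name, age FROM taxpayers").fetchall()

# Selecting rows: the value is bound with "?" instead of pasted into the string.
rentiers = conn.execute(
    "SELECT * FROM taxpayers WHERE profession = ?", ("Rentier",)
).fetchall()

# Arithmetic queries over the rows that satisfy a condition.
n_over_50 = conn.execute(
    "SELECT COUNT(id) FROM taxpayers WHERE age > 50"
).fetchone()[0]
avg_salary = conn.execute(
    "SELECT AVG(salary) FROM taxpayers WHERE age > 50"
).fetchone()[0]
sum_deposits = conn.execute(
    "SELECT SUM(deposits) FROM taxpayers WHERE postcode = ?", ("1001",)
).fetchone()[0]

# An aggregate over a condition that matches one person is that person's value.
single_avg = conn.execute(
    "SELECT AVG(salary) FROM taxpayers WHERE profession = ?", ("Rentier",)
).fetchone()[0]
```

The last query shows why the query model alone does not give privacy: an average over a single matching row returns that person's salary.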
Privacy in databases

Anonymisation

Example 2 (Typical relational database in Tinder)

Birthday | Name          | Height | Weight | Age   | Postcode | Profession
06/07    | Li Pu         | 190    | 80     | 60-70 | 1001     | Politician
06/14    | Sara Lee      | 185    | 110    | 70+   | 1001     | Rentier
01/01    | A. B. Student | 170    | 70     | 40-60 | 6732     | Time Traveller
The same table with the names hidden:

Birthday | Name | Height | Weight | Age   | Postcode | Profession
06/07    |      | 190    | 80     | 60-70 | 1001     | Politician
06/14    |      | 185    | 110    | 70+   | 1001     | Rentier
01/01    |      | 170    | 70     | 40-60 | 6732     | Time Traveller

The simple act of hiding or using random identifiers is called anonymisation.
Record linkage

Figure: two overlapping sets of attributes.
▶ Medical data: Ethnicity, Date, Diagnosis, Procedure, Medication, Charge.
▶ Voter list: Name, Address, Registration, Party, Lastvote.
▶ In both: Postcode, Birthdate, Sex.

87% of Americans identifiable
Bill Weld, R-MA
Example 3 (Typical relational database in a tax office)

ID         | Name          | Salary  | Deposits | Age | Postcode | Profession
1959060783 | Li Pu         | 150,000 | 1e6      | 60  | 1001     | Politician
1946061408 | Sara Lee      | 300,000 | -1e9     | 72  | 1001     | Rentier
2100010101 | A. B. Student | 10,000  | 100,000  | 40  | 6732     | Time Traveller

Example 4 (Typical relational database in a tax office)

Birthday | Name | Height | Weight | Age   | Postcode | Profession
06/07    |      | 190    | 80     | 60-70 | 1001     | Politician
06/14    |      | 185    | 110    | 70+   | 1001     | Rentier
01/01    |      | 170    | 70     | 40-60 | 6732     | Time Traveller
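The two tables above share Age, Postcode and Profession, which is enough to put names back on the anonymised rows. The following is a rough sketch of such a linkage under stated assumptions: the helper names age_in_range and link are hypothetical, and only the three shared columns are used for matching.

```python
# Records with names (Example 3), restricted to the shared attributes.
tax_records = [
    {"name": "Li Pu", "age": 60, "postcode": "1001", "profession": "Politician"},
    {"name": "Sara Lee", "age": 72, "postcode": "1001", "profession": "Rentier"},
    {"name": "A. B. Student", "age": 40, "postcode": "6732", "profession": "Time Traveller"},
]

# "Anonymised" records (Example 4): no names, age given as a range.
anonymised = [
    {"birthday": "06/07", "height": 190, "weight": 80,
     "age": "60-70", "postcode": "1001", "profession": "Politician"},
    {"birthday": "06/14", "height": 185, "weight": 110,
     "age": "70+", "postcode": "1001", "profession": "Rentier"},
    {"birthday": "01/01", "height": 170, "weight": 70,
     "age": "40-60", "postcode": "6732", "profession": "Time Traveller"},
]


def age_in_range(age, age_range):
    """True if an exact age falls inside a range such as '60-70' or '70+'."""
    if age_range.endswith("+"):
        return age >= int(age_range[:-1])
    low, high = age_range.split("-")
    return int(low) <= age <= int(high)


def link(anon_row, named_rows):
    """Names of all named rows that agree with anon_row on the shared attributes."""
    return [
        r["name"]
        for r in named_rows
        if r["postcode"] == anon_row["postcode"]
        and r["profession"] == anon_row["profession"]
        and age_in_range(r["age"], anon_row["age"])
    ]


for row in anonymised:
    print(row["birthday"], row["height"], row["weight"], "->", link(row, tax_records))
```

Each anonymised row matches exactly one named record here, so hiding the Name column protected nobody in this small example.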
k-anonymity

Figure: (a) Samarati, (b) Sweeney

Definition 5 (k-anonymity)
A database provides k-anonymity if every person in the database is indistinguishable from at least k − 1 other persons with respect to quasi-identifiers.

It’s the analyst’s job to define quasi-identifiers.
Birthday | Name               | Height | Weight | Age   | Postcode | Profession
06/07    | Li Pu              | 190    | 80     | 60+   | 1001     | Politician
06/14    | Sara Lee           | 185    | 110    | 60+   | 1001     | Rentier
06/12    | Nikos Papadopoulos | 170    | 82     | 60+   | 1243     | Politician
01/01    | A. B. Student      | 170    | 70     | 40-60 | 6732     | Time Traveller
05/08    | Li Yang            | 175    | 72     | 30-40 | 6910     | Time Traveller

Table: 1-anonymity.
Birthday | Name | Height | Weight | Age   | Postcode | Profession
06/07    |      | 190    | 80     | 60+   | 1001     | Politician
06/14    |      | 185    | 110    | 60+   | 1001     | Rentier
06/12    |      | 170    | 82     | 60+   | 1243     | Politician
01/01    |      | 170    | 70     | 40-60 | 6732     | Time Traveller
05/08    |      | 175    | 72     | 30-40 | 6910     | Policeman

Table: 1-anonymity.
Birthday | Name | Height  | Weight | Age   | Postcode | Profession
06/07    |      | 180-190 | 80+    | 60+   | 1*       |
06/14    |      | 180-190 | 80+    | 60+   | 1*       |
06/12    |      | 170-180 | 60+    | 60+   | 1*       |
01/01    |      | 170-180 | 60-80  | 20-60 | 6*       |
05/08    |      | 170-180 | 60-80  | 20-60 | 6*       |

Table: 1-anonymity.
Birthday | Name | Height  | Weight | Age   | Postcode | Profession
         |      | 180-190 | 80+    | 60+   | 1*       |
         |      | 180-190 | 80+    | 60+   | 1*       |
         |      | 170-180 | 60-80  | 60+   | 1*       |
         |      | 170-180 | 60-80  | 20-60 | 6*       |
         |      | 170-180 | 60-80  | 20-60 | 6*       |

Table: 2-anonymity: the database can be partitioned into sets of at least 2 records.
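The level of anonymity of a table can be checked by grouping records on the quasi-identifiers and taking the size of the smallest group. The sketch below does this for the five records of the example. The function names anonymity_level and generalise are hypothetical, the coarsening rules are assumptions chosen to resemble the tables above, and treating only age and postcode as quasi-identifiers in the second call is an analyst's choice made for illustration.

```python
from collections import Counter


def anonymity_level(rows, quasi_identifiers):
    """Largest k such that the rows are k-anonymous for the given columns.

    This is the size of the smallest group of rows that share the same
    values on every quasi-identifier. Raises ValueError for an empty table.
    """
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values())


def generalise(row):
    """Coarsen one record (hypothetical rules resembling the slides)."""
    return {
        "age": "60+" if row["age"] == "60+" else "20-60",
        "postcode": row["postcode"][0] + "*",
    }


# The five records of the example, without names or professions.
records = [
    {"birthday": "06/07", "height": 190, "weight": 80, "age": "60+", "postcode": "1001"},
    {"birthday": "06/14", "height": 185, "weight": 110, "age": "60+", "postcode": "1001"},
    {"birthday": "06/12", "height": 170, "weight": 82, "age": "60+", "postcode": "1243"},
    {"birthday": "01/01", "height": 170, "weight": 70, "age": "40-60", "postcode": "6732"},
    {"birthday": "05/08", "height": 175, "weight": 72, "age": "30-40", "postcode": "6910"},
]

# Every column as a quasi-identifier: each record is unique.
k_raw = anonymity_level(
    records, ["birthday", "height", "weight", "age", "postcode"]
)

# Only coarsened age and postcode as quasi-identifiers.
k_coarse = anonymity_level([generalise(r) for r in records], ["age", "postcode"])

print(k_raw, k_coarse)
```

The result depends entirely on which columns are declared quasi-identifiers: adding a column to the list can only keep the level the same or lower it.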
Differential privacy

Figure: If two people contribute their data x = (x1, x2) to a medical database, and an algorithm π computes some public output a from x, then it should be hard to infer anything about the data from the public output.
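One standard way to make the output a reveal little about any single xi is to add random noise to a query answer. The sketch below uses the Laplace mechanism on a counting query. It is an illustration under assumptions, not necessarily the construction used later in the lecture: the function names, the binary encoding of the data and the value of epsilon are all hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """One zero-mean Laplace sample, as the difference of two exponentials."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def noisy_count(x, epsilon, rng):
    """pi(x): the number of 1s in x plus Laplace noise of scale 1/epsilon.

    Changing one person's entry changes the true count by at most 1,
    which is why the noise scale is 1/epsilon for this query.
    """
    return sum(x) + laplace_noise(1.0 / epsilon, rng)


rng = random.Random(0)   # fixed seed so the sketch is reproducible

x = (1, 0)       # x1 has the condition, x2 does not (hypothetical encoding)
x_alt = (1, 1)   # the same database with only x2's entry changed

a = noisy_count(x, epsilon=0.5, rng=rng)
a_alt = noisy_count(x_alt, epsilon=0.5, rng=rng)
print(a, a_alt)
```

Because the noise is of the same order as the effect of one person's entry, a single published value of a gives only weak evidence about whether the data was x or x_alt. Smaller values of epsilon mean more noise and a less accurate count.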