δ-Presence: M. Ercan Nergiz, Maurizio Atzori, Chris Clifton (Pisa KDD Lab)



SLIDE 1

Consiglio Nazionale delle Ricerche

Pisa KDD Lab

Hiding the Presence of Individuals from Shared Databases:

δ-Presence

M. Ercan Nergiz
Maurizio Atzori
Chris Clifton

SLIDE 2

Outline

  • Adversary Models

– Existential Uncertainty Model

  • δ-Presence

– Checking for δ-Presence Property
– Providing δ-Presence

  • Future Work
SLIDE 3

Adversary Models

Original Dataset

Disease   Address        Sex   Age
Flu       Indianapolis   F     25
Tetanus   Lafayette      F     23
Obesity   Lafayette      M     16
Obesity   W. Lafayette   M     17

k-Anonymity

Disease   Address        Sex   Age
Flu       Indiana        F     22-26
Tetanus   Indiana        F     22-26
Obesity   G. Lafayette   M     15-18
Obesity   G. Lafayette   M     15-18

Adversary: “I know that Chris is ‘Male’, from ‘W. Lafayette’, and 17 years old. What is his disease?”
“Chris is definitely obese.”
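The adversary's reasoning on this slide can be sketched in Python. This is only an illustration, not code from the talk: the `COVERS` mapping (which original addresses each generalized address stands for) and all function names are assumptions made for the example.

```python
from collections import Counter

# The k-anonymized table from the slide: (disease, address, sex, age range).
K_ANON = [
    ("Flu", "Indiana", "F", (22, 26)),
    ("Tetanus", "Indiana", "F", (22, 26)),
    ("Obesity", "G. Lafayette", "M", (15, 18)),
    ("Obesity", "G. Lafayette", "M", (15, 18)),
]

# Assumed address hierarchy: generalized address -> original addresses it covers.
COVERS = {
    "Indiana": {"Indianapolis", "Lafayette", "W. Lafayette"},
    "G. Lafayette": {"Lafayette", "W. Lafayette"},
}


def is_k_anonymous(table, k):
    """Every combination of quasi-identifiers must occur at least k times."""
    groups = Counter(row[1:] for row in table)
    return all(count >= k for count in groups.values())


def possible_diseases(table, address, sex, age):
    """Diseases of all rows consistent with what the adversary knows."""
    found = set()
    for disease, gen_address, gen_sex, (low, high) in table:
        if address in COVERS[gen_address] and gen_sex == sex and low <= age <= high:
            found.add(disease)
    return found


print(is_k_anonymous(K_ANON, 2))                           # True
print(possible_diseases(K_ANON, "W. Lafayette", "M", 17))  # {'Obesity'}
```

The table is 2-anonymous, yet every row that could be Chris carries the same disease, so the adversary still learns it.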

SLIDE 4

Adversary Models

l-Diversity, t-Closeness

Disease   Address     Sex   Age
Flu       Indiana     *     15-26
Tetanus   Lafayette   *     15-26
Obesity   Lafayette   *     15-26
Obesity   Indiana     *     15-26

Adversary: “Chris is not necessarily obese.”

Anatomization

Disease    Address        Sex   Age
{Ob,Flu}   Indianapolis   F     25
{Ob,Te}    Lafayette      F     23
{Ob,Te}    Lafayette      M     16
{Ob,Flu}   W. Lafayette   M     17

Adversary: “Chris is still not necessarily obese.”
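Why the adversary can no longer be certain can be checked mechanically. The sketch below tests distinct l-diversity, the simplest variant (each equivalence class must contain at least l different sensitive values); the function names are made up for this example, and the slide does not say which l-diversity variant was intended.

```python
from collections import defaultdict

# The generalized table from the slide: (disease, address, sex, age).
GENERALIZED = [
    ("Flu", "Indiana", "*", "15-26"),
    ("Tetanus", "Lafayette", "*", "15-26"),
    ("Obesity", "Lafayette", "*", "15-26"),
    ("Obesity", "Indiana", "*", "15-26"),
]


def sensitive_values_by_class(table):
    """Group the sensitive values by their quasi-identifier combination."""
    classes = defaultdict(set)
    for disease, *quasi_identifiers in table:
        classes[tuple(quasi_identifiers)].add(disease)
    return dict(classes)


def is_distinct_l_diverse(table, l):
    """True if every equivalence class holds at least l distinct diseases."""
    return all(len(v) >= l for v in sensitive_values_by_class(table).values())


print(is_distinct_l_diverse(GENERALIZED, 2))  # True
print(is_distinct_l_diverse(GENERALIZED, 3))  # False
```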
SLIDE 5

Adversary Models and Possible Threats

  • Existential Certainty: Adversary knows that the individual is in the private dataset and tries to learn the sensitive information about the individual in the private dataset.
– Linking Attacks: Linking identities with sensitive attributes
  • Existential Uncertainty: Adversary doesn’t know whether the individual is or is not in the private dataset.
– Linking Attacks: Existential disclosure is not considered a privacy violation, given that sensitive information is protected according to the given privacy constraints.
– Presence Hiding: Disclosure of the existence or absence of an individual in the private dataset is a privacy violation.

SLIDE 6

k-Anonymity

  • Provides some protection for all of the adversary models.
– Sensitive info protection
– Identity protection by QI anonymization
  • BUT is not perfect for any of the models
SLIDE 7

k-Anonymity Extensions

[Diagram: extensions of k-Anonymity, grouped by adversary model. Adversary models: Existential Certainty, Existential Uncertainty. Threats: Linking Attacks, Presence Hiding. Methods: l-Diversity, t-Closeness, Anatomization, Weak k-Anon., δ-Presence.]

SLIDE 8

δ-Presence

  • The risk is simply from identifying that an individual is (or is not) in an anonymized dataset.
  • Can be interpreted in terms of increased risk of disclosure.
  • A meaningful bridge between human-understandable policy and mathematically sound standards for anonymity.
– E.g., can we speak of privacy in terms of risk/cost/benefit?
– Can convert $ to δ (see paper).

SLIDE 9

δ-Presence

Given external (public) background knowledge P and a private table T, δ = (δmin, δmax)-presence holds for a generalization T* of T if

δmin ≤ Pr(t ∈ T | T*, P) ≤ δmax for every t ∈ P

SLIDE 10

Presence Challenge

[Figure: private table T and public table P]

How do we find a δ-present generalization of T?

SLIDE 11

Checking for Presence Property: Non-overlapping Generalization

  • A generalization T* of T is a non-overlapping generalization w.r.t. P if
– every tuple in P can be mapped onto at most one equivalence class in T*.
  • Checking the presence property for non-overlapping generalizations is easy.
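The non-overlapping condition can be tested directly. In the sketch below each generalized value is modeled as the set of raw values it covers; that representation, the sample data, and the function names are all assumptions for illustration.

```python
def covers(eq_class, row):
    """True if every raw value in `row` falls inside the class's generalized value."""
    return all(value in allowed for value, allowed in zip(row, eq_class))


def is_non_overlapping(eq_classes, public_rows):
    """True if each public row maps onto at most one equivalence class."""
    return all(
        sum(covers(ec, row) for ec in eq_classes) <= 1 for row in public_rows
    )


# Hypothetical generalized values, written as sets of the raw values they cover.
TEENS, TWENTIES = frozenset(range(15, 19)), frozenset(range(22, 27))
MALE, FEMALE = frozenset({"M"}), frozenset({"F"})

# Hypothetical public table P: (age, sex).
P = [(17, "M"), (16, "M"), (23, "F"), (25, "F")]

# Distinct equivalence classes of two candidate generalizations T*.
disjoint = [(TEENS, MALE), (TWENTIES, FEMALE)]
overlapping = [(TEENS, MALE), (TEENS | TWENTIES, MALE | FEMALE)]

print(is_non_overlapping(disjoint, P))     # True
print(is_non_overlapping(overlapping, P))  # False
```

In the second case the tuple (17, "M") fits both classes, so the generalization overlaps w.r.t. P.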
SLIDE 12

Checking for Presence Property: Non-overlapping Generalization Ex.

[Figure: example tables T* and P]

SLIDE 13

Checking for Presence Property: Non-overlapping Generalization Ex.

[Figure: example tables T* and P*]

SLIDE 14

Checking for Presence Property

  • Let T* be a non-overlapping generalization of T w.r.t. P. Then T* is δ-present if, for each equivalence class ec of the corresponding P*:

δmin ≤ (# of 1s in Sen.) / |ec| ≤ δmax
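A minimal sketch of this check, assuming that Sen. is a 0/1 column on P* marking whether the tuple also appears in T. The data and names are invented for the example.

```python
from collections import Counter
from fractions import Fraction


def presence_probabilities(public_star):
    """Map each equivalence class of P* to (# of 1s in Sen.) / |ec|."""
    size, ones = Counter(), Counter()
    for eq_class, sen in public_star:
        size[eq_class] += 1
        ones[eq_class] += sen
    return {ec: Fraction(ones[ec], size[ec]) for ec in size}


def is_delta_present(public_star, delta_min, delta_max):
    """True if every class ratio lies within [delta_min, delta_max]."""
    ratios = presence_probabilities(public_star).values()
    return all(delta_min <= r <= delta_max for r in ratios)


# Hypothetical P*: (equivalence class label, Sen.), where Sen. = 1 if the tuple is in T.
P_STAR = [
    ("A", 1), ("A", 0), ("A", 1), ("A", 0),  # 2 of 4 are in T
    ("B", 1), ("B", 1), ("B", 0),            # 2 of 3 are in T
]

print(presence_probabilities(P_STAR)["A"])                       # 1/2
print(presence_probabilities(P_STAR)["B"])                       # 2/3
print(is_delta_present(P_STAR, Fraction(1, 2), Fraction(2, 3)))  # True
print(is_delta_present(P_STAR, Fraction(1, 2), Fraction(3, 5)))  # False
```

Exact fractions avoid rounding trouble when a ratio sits right on a bound.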

SLIDE 15

(.5-.66)-Presence

[Figure: example tables T* and P*]

Pr(ta ∈ T | T*) = 0.5
Pr(tg ∈ T | T*) = 0.66

SLIDE 16

k-Anonymity Fails

[Figure: a 5-anonymous T* and the corresponding P*]

Pr(ta ∈ T | T*) = 0
Pr(tb ∈ T | T*) = 1
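The failure can be reproduced with made-up data (the slide's own tables did not survive extraction). In this sketch, five of eight public individuals are in T and all share one equivalence class, so T* is 5-anonymous while presence is fully disclosed. All names and values are hypothetical.

```python
from collections import Counter

# Hypothetical P*: (person, equivalence class, 1 if the person is in T else 0).
P_STAR = [
    ("a", "Y", 0), ("g", "Y", 0), ("h", "Y", 0),
    ("b", "X", 1), ("c", "X", 1), ("d", "X", 1), ("e", "X", 1), ("f", "X", 1),
]


def anonymity_level(public_star):
    """Smallest equivalence class size in T*, i.e. the largest k it satisfies."""
    sizes = Counter(ec for _, ec, in_t in public_star if in_t)
    return min(sizes.values())


def presence_by_person(public_star):
    """Pr(t in T | T*) for each person, taken as the ratio of their class in P*."""
    size = Counter(ec for _, ec, _ in public_star)
    ones = Counter(ec for _, ec, in_t in public_star if in_t)
    return {person: ones[ec] / size[ec] for person, ec, _ in public_star}


presence = presence_by_person(P_STAR)
print(anonymity_level(P_STAR))  # 5
print(presence["a"])            # 0.0
print(presence["b"])            # 1.0
```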

SLIDE 17

How to Provide Presence?: Anti-monotonicity

  • Given a public table P, a private table T, a non-overlapping generalization T1* of T, and a non-overlapping generalization T2* of T1*:

If T2* is not δ-present w.r.t. P and T, then neither is T1*.
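One way to see why this holds: generalizing further merges equivalence classes, and the merged ratio (# in T) / |ec| always lies between the smallest and largest of the ratios being merged. So if the merged ratio violates a bound, at least one of the finer classes already did. The sketch below illustrates this with invented numbers; it is an illustration, not the paper's proof.

```python
from fractions import Fraction


def merged_ratio(classes):
    """Presence ratio after merging classes given as (members in T, class size)."""
    total_members = sum(members for members, _ in classes)
    total_size = sum(size for _, size in classes)
    return Fraction(total_members, total_size)


# Two hypothetical classes of the finer generalization T1*.
fine = [(1, 4), (3, 4)]
fine_ratios = [Fraction(members, size) for members, size in fine]

# The single class they form in the coarser generalization T2*.
coarse = merged_ratio(fine)

print(fine_ratios)                                     # [Fraction(1, 4), Fraction(3, 4)]
print(coarse)                                          # 1/2
print(min(fine_ratios) <= coarse <= max(fine_ratios))  # True
```

With δ = (0.4, 0.6) the coarse class passes while both fine classes fail, so the converse direction does not hold: a δ-present T2* says nothing about T1*.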

SLIDE 18

How to Provide Presence?: SPALM, MPALM

  • SPALM: Optimum Single Dim. Presence Alg.
– Analogous to Incognito [LDR SIGMOD05]
– Top-down pruning approach
  • MPALM: Multi Dim. Presence Alg.
– Analogous to Mondrian [LDR ICDE06]
– With different attribute selection heuristics
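The top-down pruning idea can be sketched as a walk over a generalization lattice. This is not SPALM itself (see the paper for the actual algorithm); it is a generic search that relies on anti-monotonicity, with assumed hierarchies, invented data, and the assumption that the same full-domain generalization is applied to P and T.

```python
from collections import Counter
from fractions import Fraction

# Assumed hierarchies: HIERARCHIES[i][level] generalizes attribute i.
# A higher level is more general; a lattice node holds one level per attribute.
HIERARCHIES = [
    [lambda age: age, lambda age: age // 10 * 10, lambda age: "*"],  # age
    [lambda sex: sex, lambda sex: "*"],                              # sex
]


def generalize(row, node):
    return tuple(HIERARCHIES[i][level](row[i]) for i, level in enumerate(node))


def is_delta_present(public, node, dmin, dmax):
    size, ones = Counter(), Counter()
    for row, in_private in public:
        eq_class = generalize(row, node)
        size[eq_class] += 1
        ones[eq_class] += in_private
    return all(dmin <= Fraction(ones[ec], size[ec]) <= dmax for ec in size)


def top_down_search(public, dmin, dmax):
    """Return the least-generalized nodes that are still delta-present."""
    top = tuple(len(levels) - 1 for levels in HIERARCHIES)
    if not is_delta_present(public, top, dmin, dmax):
        return []  # by anti-monotonicity nothing below the top can pass
    minimal, seen, stack = [], {top}, [top]
    while stack:
        node = stack.pop()
        children = [node[:i] + (level - 1,) + node[i + 1:]
                    for i, level in enumerate(node) if level > 0]
        passing = [c for c in children if is_delta_present(public, c, dmin, dmax)]
        if not passing:
            minimal.append(node)  # no finer node passes
        for child in passing:     # failing children are pruned, never expanded
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return sorted(minimal)


# Hypothetical public table: ((age, sex), 1 if the tuple is also in T else 0).
PUBLIC = [((23, "F"), 1), ((25, "F"), 0), ((27, "M"), 1), ((29, "M"), 0),
          ((31, "F"), 1), ((33, "F"), 0), ((35, "M"), 0), ((37, "M"), 1)]

print(top_down_search(PUBLIC, Fraction(1, 4), Fraction(3, 4)))  # [(1, 0)]
```

Here node (1, 0) means ages generalized to decades and sex left as is; the fully specific node (0, 0) fails because every class would contain a single person.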

SLIDE 19

Experiments

SLIDE 20

Experiments

SLIDE 21

Future Work

  • Assume a distribution of attributes instead of a public table.
  • Apply randomization on the private table T to satisfy presence.
  • Design a clustering-based presence algorithm with overlapping equivalence classes.
  • Assume sensitive attributes exist in T.
  • Perform risk analysis on the selection of δ parameters w.r.t. real-world scenarios.
  • Personalize privacy based on attributes of the individuals.

SLIDE 22

Hiding the Presence of Individuals from Shared Databases: δ-Presence

Thanks for listening
atzori@di.unipi.it
Questions?