Towards Plausible Graph Anonymization

slide-1
SLIDE 1

Yang Zhang, Mathias Humbert, Bartlomiej Surma, Praveen Manoharan, Jilles Vreeken, Michael Backes

Towards Plausible Graph Anonymization

slide-2
SLIDE 2

Graph sharing

slide-3
SLIDE 3

Graph anonymization

slide-4
SLIDE 4

Graph anonymization

[Figure: example graph with nodes id 1 to id 8]

slide-11
SLIDE 11

Our work

▪ Find a fundamental flaw in graph anonymization designs
▪ Exploit it to recover the original graph
▪ Use our findings to enhance anonymization designs
▪ Evaluate privacy and usability of the enhanced techniques on 3 real-life datasets:
  ▪ Enron, NO, SNAP

slide-12
SLIDE 12

Graph anonymization methods

▪ ’08 Liu et al. - k-anonymity (k-DA)
▪ ’08 Zhou et al. - k-anonymity (k-NA)
▪ ’10 Cheng et al. - k-anonymity (k-iso)
▪ ’11 Sala et al. - differential privacy
▪ ’12 Mittal et al. - random walk privacy
▪ ’14 Xiao et al. - differential privacy

slide-16
SLIDE 16

k-DA algorithm

[Figure: example graph (nodes id 1 to id 8) before and after 2-DA, each with its degree histogram (x-axis: node degree, y-axis: # nodes)]
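The degree histograms on this slide relate to the k-degree anonymity property: every degree value that occurs in the graph should be shared by at least k nodes. The following is a minimal sketch of that check only, not of the k-DA algorithm itself. The function names and the toy graphs are assumptions made for illustration and do not come from the slides.

```python
from collections import Counter

def degree_sequence(edges):
    """Return {node: degree} for an undirected edge list (isolated nodes are not listed)."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)

def is_k_degree_anonymous(edges, k):
    """True if every degree value that occurs is shared by at least k nodes."""
    nodes_per_degree = Counter(degree_sequence(edges).values())
    return all(count >= k for count in nodes_per_degree.values())

# Hypothetical toy graphs, not the ones drawn on the slide.
path = [(1, 2), (2, 3), (3, 4)]   # degrees 1, 2, 2, 1
star = [(1, 2), (1, 3), (1, 4)]   # degrees 3, 1, 1, 1

print(is_k_degree_anonymous(path, 2))
print(is_k_degree_anonymous(star, 2))
```

The star fails for k = 2 because its hub is the only node of degree 3. An anonymizer in the spirit of k-DA would have to change edges until that degree value is no longer unique.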

slide-17
SLIDE 17

SalaDP algorithm

[Figure: SalaDP on an example graph (nodes id 1 to id 8): the graph's dK-2 series is perturbed under ε-DP, and a graph is shown for the perturbed dK-2 series]
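As a rough illustration of the two labels on this slide, the sketch below computes a dK-2 series (the number of edges for each pair of endpoint degrees) and adds Laplace noise to each entry. It is a sketch under assumptions: the `scale` argument is a free parameter here, whereas a real ε-DP mechanism calibrates it to ε and to the sensitivity of the series, and the step that builds a graph from the perturbed series is not shown. All function names are hypothetical.

```python
import random
from collections import Counter

def dk2_series(edges):
    """Count edges per unordered pair of endpoint degrees, keyed as (d1, d2) with d1 <= d2."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    series = Counter()
    for u, v in edges:
        series[tuple(sorted((deg[u], deg[v])))] += 1
    return dict(series)

def laplace_noise(scale, rng):
    """Laplace(0, scale) sample, drawn as the difference of two exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def perturb(series, scale, rng=None):
    """Add independent Laplace noise to every entry (scale is an assumed free parameter)."""
    rng = rng or random.Random()
    return {key: count + laplace_noise(scale, rng) for key, count in series.items()}

# Hypothetical toy graph: a path on four nodes.
path = [(1, 2), (2, 3), (3, 4)]
series = dk2_series(path)
noisy = perturb(series, scale=1.0, rng=random.Random(0))
print(series)
```

This sketch only perturbs degree pairs that occur in the graph; a complete mechanism also has to account for pairs with a count of zero.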

slide-18
SLIDE 18

Social network graph properties

[Figure: example graph with nodes id 1 to id 8]

slide-22
SLIDE 22

Graph recovery attack - overview

slide-23
SLIDE 23

Graph recovery attack - graph embedding

▪ Node embeddings with node2vec (’16 Grover and Leskovec)
▪ Mapping users into a continuous vector space
▪ A user’s vector reflects structural properties
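To give a feel for how node2vec ties a user's vector to graph structure, here is a minimal sketch of its second-order biased random walk, with return parameter `p` and in-out parameter `q`. Only the walk is shown; node2vec then trains a skip-gram model on many such walks to obtain the vectors, and that step is omitted. The adjacency structure and function name are assumptions for illustration.

```python
import random

def biased_walk(adj, start, length, p=1.0, q=1.0, rng=None):
    """One node2vec-style walk over `adj`, a dict mapping each node to a set of neighbors."""
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        neighbors = sorted(adj[cur])
        if not neighbors:
            break  # dead end: stop early
        if len(walk) == 1:
            walk.append(rng.choice(neighbors))  # first step is uniform
            continue
        prev = walk[-2]
        weights = []
        for nxt in neighbors:
            if nxt == prev:
                weights.append(1.0 / p)   # step back to the previous node
            elif nxt in adj[prev]:
                weights.append(1.0)       # stay within distance 1 of the previous node
            else:
                weights.append(1.0 / q)   # move further away from the previous node
        walk.append(rng.choices(neighbors, weights=weights)[0])
    return walk

# Hypothetical toy graph: a triangle (1, 2, 3) with a tail node 4 attached to 3.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
walk = biased_walk(adj, start=1, length=6, p=1.0, q=2.0, rng=random.Random(0))
print(walk)
```

A small `q` pushes walks outward and a large `q` keeps them local, which is how the resulting vectors come to reflect different structural properties.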

slide-25
SLIDE 25

Graph recovery attack - graph embedding

▪ Plausibility is cosine similarity between embeddings

[Figure: histogram of edge plausibility (−0.2 to 1.0) against number of edges (×10^4), for original edges and fake edges]

[Figure: AUC (0.0 to 1.0) on Enron, NO and SNAP for Cosine, Euclidean, Bray-Curtis, Embeddedness, Jaccard and Adamic-Adar]
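The plausibility score named on this slide can be sketched as follows: each edge is scored by the cosine similarity of its endpoints' embedding vectors. The AUC helper assumes that the AUC in the figure measures how well a score separates original from fake edges; that reading, the function names and the made-up vectors are assumptions for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def edge_plausibility(edges, embedding):
    """Score every edge (u, v) by the cosine similarity of the embeddings of u and v."""
    return {(u, v): cosine_similarity(embedding[u], embedding[v]) for u, v in edges}

def auc(original_scores, fake_scores):
    """Chance that a random original edge outscores a random fake edge (ties count half)."""
    wins = 0.0
    for o in original_scores:
        for f in fake_scores:
            wins += 1.0 if o > f else 0.5 if o == f else 0.0
    return wins / (len(original_scores) * len(fake_scores))

# Made-up two-dimensional embeddings, not real node2vec output.
embedding = {1: [1.0, 0.0], 2: [1.0, 0.0], 3: [0.0, 1.0]}
scores = edge_plausibility([(1, 2), (1, 3)], embedding)
print(scores)
print(auc([0.9, 0.8], [0.1, 0.85]))
```

An AUC near 1.0 means original edges almost always score higher than fake ones, which is what makes the recovery attack possible.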

slide-26
SLIDE 26

Graph recovery attack - graph embedding

▪ Find a cutoff point and remove non-plausible edges

[Figure: histogram of edge plausibility (−0.2 to 1.0) against number of edges (×10^4), for original edges and fake edges, with an F1 score curve]
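The cutoff step can be sketched as a threshold on edge plausibility. The slide does not say how the cutoff is chosen, so the sweep below is an assumption: it uses ground-truth labels to find the cutoff with the best F1 score for detecting fake edges, which is only possible in an evaluation setting and not for a real attacker. All names and scores are hypothetical.

```python
def f1_for_cutoff(scores, is_fake, cutoff):
    """F1 of predicting 'fake' for every edge whose plausibility is below `cutoff`."""
    tp = sum(1 for e, s in scores.items() if s < cutoff and is_fake[e])
    fp = sum(1 for e, s in scores.items() if s < cutoff and not is_fake[e])
    fn = sum(1 for e, s in scores.items() if s >= cutoff and is_fake[e])
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def best_cutoff(scores, is_fake):
    """Try every observed score as a cutoff and keep the one with the highest F1."""
    candidates = sorted(set(scores.values()))
    return max(candidates, key=lambda c: f1_for_cutoff(scores, is_fake, c))

def recover(scores, cutoff):
    """Keep only the edges whose plausibility reaches the cutoff."""
    return [e for e, s in scores.items() if s >= cutoff]

# Made-up plausibility scores and labels for four edges.
scores = {("a", "b"): 0.9, ("b", "c"): 0.8, ("a", "c"): 0.1, ("c", "d"): 0.2}
is_fake = {("a", "b"): False, ("b", "c"): False, ("a", "c"): True, ("c", "d"): True}

cutoff = best_cutoff(scores, is_fake)
print(cutoff, recover(scores, cutoff))
```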

slide-29
SLIDE 29

Enhancing anonymization

▪ Choose fake edges with the highest plausibility?
  ▪ The distribution will look unnatural
▪ Draw fake edges from the same plausibility distribution?

[Figure: panels for k-DA (k=100) and Enhanced k-DA (k=100)]
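One way to read "draw fake edges from the same plausibility distribution" is histogram matching: bin the plausibility of the original edges, then pick fake edges so that their bins are drawn in proportion to the original histogram. The sketch below follows that reading. It is an assumption, not the authors' procedure; the binning, the function names and the candidate pool are all hypothetical.

```python
import random
from collections import defaultdict

def bin_index(score, n_bins, lo=-1.0, hi=1.0):
    """Map a cosine score in [lo, hi] to one of n_bins equal-width bins."""
    idx = int((score - lo) / (hi - lo) * n_bins)
    return min(max(idx, 0), n_bins - 1)

def sample_fake_edges(original_scores, candidate_scores, n_fake, n_bins=20, rng=None):
    """Pick up to n_fake candidates, drawing bins in proportion to the original edges' histogram.

    original_scores: list of plausibility values of the real edges
    candidate_scores: {candidate_edge: plausibility} for node pairs that are not edges
    """
    rng = rng or random.Random()
    target = [0] * n_bins                      # histogram of the original edges
    for s in original_scores:
        target[bin_index(s, n_bins)] += 1
    pool = defaultdict(list)                   # candidates grouped by bin
    for edge, s in candidate_scores.items():
        pool[bin_index(s, n_bins)].append(edge)
    chosen = []
    while len(chosen) < n_fake:
        bins = [b for b in range(n_bins) if target[b] > 0 and pool[b]]
        if not bins:
            break                              # no candidates left in any occupied bin
        b = rng.choices(bins, weights=[target[i] for i in bins])[0]
        chosen.append(pool[b].pop(rng.randrange(len(pool[b]))))
    return chosen

# Made-up scores: the low-plausibility candidate ("c", "d") should never be picked.
originals = [0.93, 0.87, 0.83]
candidates = {("a", "b"): 0.88, ("c", "d"): -0.45, ("e", "f"): 0.82}
fake = sample_fake_edges(originals, candidates, n_fake=2, rng=random.Random(0))
print(sorted(fake))
```

When a bin runs out of candidates the sketch falls back to the remaining occupied bins, so the match to the original histogram is only approximate.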

slide-30
SLIDE 30

Resilience to graph recovery attack

▪ F1 score for original anonymizations
▪ F1 score for enhanced anonymizations

k-DA drops by: 26~51%
SalaDP drops by: 37~48%

slide-31
SLIDE 31

Utility of Enhanced anonymization

[Figure: utility of GA versus utility of GF (both axes 0.6 to 1.0) for eigencentrality, degree distribution and triangle count on Enron, NO and SNAP]

slide-32
SLIDE 32

Resilience to deanonymization attack

[Figure: anonymity gain (%), axis 5 to 30, on Enron, NO and SNAP for k-DA (k=50, 75, 100) and SalaDP (ε=100, 50, 10)]

slide-36
SLIDE 36

Conclusion

We find flaws in current graph anonymizations
We recover the original, pre-anonymized graph
We enhance the anonymization techniques
We evaluate privacy and utility of enhanced anonymization