An Evaluation of Edge Modification Techniques for Privacy-Preserving on Graphs


  1. An Evaluation of Edge Modification Techniques for Privacy-Preserving on Graphs. Jordi Casas-Roma, Universitat Oberta de Catalunya, Barcelona, Spain, jcasasr@uoc.edu. MDAI 2015, Skövde, Sweden, September 21-23, 2015.

  2. Overview: (1) Introduction, (2) Edge Modification Techniques, (3) Experimental Set Up, (4) Information loss, (5) Conclusions.

  3. Introduction. Scenario: release data to third parties; preserve the privacy of users.

  4. Simple Anonymization. Simple anonymization does not work! User Dan can be re-identified using his structural properties. [Figure 1: Original network, with users Amy, Tim, Bob, Lis, Ann, Dan, Tom, Eva and Joe] [Figure 2: Simple anonymization, with the users relabelled 1 to 9] [Figure 3: Dan's 1-neighbourhood] [Figure 4: Dan is re-identified]

  5. Anonymization methods. Goals: introduce noise to hinder the re-identification processes (adding/removing edges, adding fake nodes, grouping nodes into clusters, ...). Trade-off: preserve users' privacy vs. maximize data utility (minimize information loss). [Figure 5: Dan's 1-neighbourhood] [Figure 6: Noise added]

  6. Definitions. Network: let $G = (V, E)$ be a simple, unweighted and undirected network, where $V$ is the set of nodes and $E$ the set of edges. We define $n = |V|$ to denote the number of nodes and $m = |E|$ to denote the number of edges. Perturbed graphs: we designate $G = (V, E)$ and $\tilde{G} = (\tilde{V}, \tilde{E})$ to refer to the original and the anonymous graphs, respectively.

  7. Graph modification techniques. Random perturbation: adding/removing/switching edges; trying to preserve some features (average distance, spectral properties, etc). Constrained perturbation: sequential edge modifications in order to fulfil some desired constraints; also adding new fake vertices; example, the $k$-anonymity model. Our approach: Edge add, Edge del, Edge add/del, Edge switch.

  8. Edge add. Properties: create a new edge $\{v_i, v_j\} \notin E$, so $\tilde{m} > m$. True relationships will be preserved in perturbed data.

  9. Edge del. Properties: remove an existing edge $\{v_i, v_j\} \in E$, so $\tilde{m} < m$. No fake relationships are included in the anonymous data, but several true relations are deleted from the original data.

  10. Edge add/del. Properties: it combines the two previous methods. Delete an existing edge $\{v_i, v_j\} \in E$ and add a new one $\{v_k, v_p\} \notin E$, so $\tilde{m} = m$. Some true relations are deleted and some fake ones are created. All vertices involved in this operation change their degree.

  11. Edge switch. Properties: delete an edge $\{v_i, v_j\} \in E$ and create a new edge $\{v_i, v_p\} \notin E$, so $\tilde{m} = m$. Some true relations are removed and some fake ones are created. Two vertices change their degree ($v_j$ and $v_p$) while the third one ($v_i$) does not.
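
The four operations on slides 8 to 11 can be sketched in a few lines of Python. This is only an illustration, not the author's implementation: the function names, the representation of edges as `frozenset` pairs, and the uniform random choice of edges are all assumptions made here. For Edge add/del, the new edge is chosen disjoint from the deleted one so that all four vertices change their degree, as the slide states.

```python
import random
from itertools import combinations

def edge(u, v):
    """Undirected edge as an unordered pair."""
    return frozenset((u, v))

def non_edges(nodes, edges):
    """All vertex pairs that are not in E."""
    return [edge(u, v) for u, v in combinations(nodes, 2) if edge(u, v) not in edges]

def pick_edge(edges, rng):
    """Pick an existing edge uniformly at random (sorted for reproducibility)."""
    return rng.choice(sorted(edges, key=sorted))

def edge_add(nodes, edges, rng):
    """Edge add: create one new edge {vi, vj} not in E."""
    return set(edges) | {rng.choice(non_edges(nodes, edges))}

def edge_del(nodes, edges, rng):
    """Edge del: remove one existing edge {vi, vj} in E."""
    return set(edges) - {pick_edge(edges, rng)}

def edge_add_del(nodes, edges, rng):
    """Edge add/del: delete {vi, vj} in E and add a disjoint {vk, vp} not in E."""
    old = pick_edge(edges, rng)
    candidates = [e for e in non_edges(nodes, edges) if not (e & old)]
    if not candidates:
        raise ValueError("no disjoint non-edge available")
    return (set(edges) - {old}) | {rng.choice(candidates)}

def edge_switch(nodes, edges, rng):
    """Edge switch: replace {vi, vj} in E by {vi, vp} not in E, keeping vi fixed."""
    old = pick_edge(edges, rng)
    vi, vj = rng.sample(sorted(old), 2)
    targets = [vp for vp in nodes if vp not in old and edge(vi, vp) not in edges]
    if not targets:
        raise ValueError("no switch target available")
    return (set(edges) - {old}) | {edge(vi, rng.choice(targets))}
```

Each function returns a new edge set and leaves the input untouched, so the original and the perturbed graph can be compared afterwards.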

  12. Experimental framework. The original graph $G$ is passed through a perturbation process $p$ to obtain $\tilde{G}$; the metric $m$ is computed on both graphs, and the two values are compared as $\epsilon_m(G, \tilde{G})$.

  13. Network metrics. Structural network metrics: average distance ($dist$) and transitivity ($T$). Spectral network metrics: the largest eigenvalue of the adjacency matrix $A$ ($\lambda_1$). We compute the error on these network metrics as follows: $\epsilon_m(G, \tilde{G}) = |m(G) - m(\tilde{G}_p)|$ (1)
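
As a rough illustration of equation (1), the sketch below computes two of the network metrics with the standard library and takes the absolute difference between the original and the perturbed graph. The helper names are hypothetical, vertices are assumed to be labelled 0 to n-1, and the average distance is taken over reachable pairs only, which is an assumption for disconnected graphs. The largest eigenvalue $\lambda_1$ is left out to keep the example short.

```python
from collections import deque
from itertools import combinations

def adjacency(n, edges):
    """Adjacency sets for an undirected graph on vertices 0..n-1."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def average_distance(n, edges):
    """Mean shortest-path length over reachable vertex pairs (BFS from each vertex)."""
    adj = adjacency(n, edges)
    total = pairs = 0
    for source in range(n):
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0

def transitivity(n, edges):
    """Fraction of connected triples that are closed (3 * triangles / triples)."""
    adj = adjacency(n, edges)
    closed = triples = 0
    for v in range(n):
        for a, b in combinations(adj[v], 2):
            triples += 1
            if b in adj[a]:
                closed += 1
    return closed / triples if triples else 0.0

def metric_error(metric, n, edges, edges_perturbed):
    """Equation (1): absolute difference of a network metric on G and on the perturbed graph."""
    return abs(metric(n, edges) - metric(n, edges_perturbed))
```

For example, `metric_error(transitivity, n, E, E_perturbed)` gives the transitivity error for one perturbed copy of a graph.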

  14. Vertex metrics. Structural vertex metrics: betweenness centrality ($C_B$), closeness centrality ($C_C$) and degree centrality ($C_D$). We compute the error on vertex metrics by: $\epsilon_m(G, \tilde{G}) = \sqrt{\frac{1}{n}\left((g_1 - \tilde{g}_1)^2 + \ldots + (g_n - \tilde{g}_n)^2\right)}$ (2), where $g_i$ and $\tilde{g}_i$ are the values of the metric for vertex $v_i$ in $G$ and $\tilde{G}$.
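
Equation (2) is a root-mean-square difference over the per-vertex values. A minimal sketch follows, using degree centrality as the vertex metric; the function names are hypothetical, and the normalisation of degree by n-1 is a common convention assumed here rather than something the slides specify.

```python
import math

def degree_centrality(n, edges):
    """Degree of each vertex divided by n - 1 (assumed normalisation)."""
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return [d / (n - 1) for d in degree]

def vertex_metric_error(g, g_tilde):
    """Equation (2): root-mean-square difference between two per-vertex score lists."""
    if len(g) != len(g_tilde):
        raise ValueError("both graphs must have the same vertex set")
    n = len(g)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(g, g_tilde)) / n)
```

The same `vertex_metric_error` applies unchanged to betweenness or closeness scores, as long as both lists are ordered by the same vertex labels.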

  15. Synthetic networks. The Erdős-Rényi model defines a random graph as $n$ vertices connected by $m$ edges that are chosen randomly from the $n(n-1)/2$ possible edges. In our experiments, we set $n$ = 1,000 and $m$ = 5,000. This dataset is denoted as "ER-1000". The Barabási-Albert model, also called the scale-free model, generates a network whose degree distribution follows a power law. That is, for degree $d$, its probability density function is $P(d) = d^{-\gamma}$. In our experiments, we set the number of vertices to 1,000 and $\gamma$ = 1, i.e. linear preferential attachment. This dataset is denoted as "BA-1000". Columns: number of nodes (n), number of edges (m), average degree (deg), average distance (dist) and diameter (D).
      Dataset    n       m       deg     dist    D
      ER-1000    1,000   4,969   9.938   3.263   5
      BA-1000    1,000   4,985   9.970   2.481   4
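
The Erdős-Rényi construction described above (choose $m$ of the $n(n-1)/2$ possible edges at random) can be sketched directly with the standard library. The function name and the `seed` parameter are assumptions for illustration; the slides do not say which generator was actually used.

```python
import random
from itertools import combinations

def erdos_renyi_gnm(n, m, seed=None):
    """Sample m distinct edges uniformly from the n(n-1)/2 possible vertex pairs."""
    rng = random.Random(seed)
    possible = list(combinations(range(n), 2))
    return rng.sample(possible, m)

# Parameters quoted on the slide: n = 1,000 and m = 5,000.
edges = erdos_renyi_gnm(1000, 5000, seed=1)
average_degree = 2 * len(edges) / 1000
```

Listing all pairs is affordable at this size (499,500 pairs for n = 1,000); a much larger graph would need a sampler that avoids materialising the full list.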

  16. Real networks. Zachary's Karate Club is a network widely used in the literature; the graph shows the relationships among 34 members of a karate club. Jazz musicians is a collaboration graph of jazz musicians. URV email is the email communication network at the University Rovira i Virgili in Tarragona (Spain); nodes are users and an edge indicates that at least one email has been sent between them. Political blogosphere data (polblogs) compiles the data on the links among US political blogs. Columns: number of nodes (n), number of edges (m), average degree (deg), average distance (dist) and diameter (D).
      Dataset                  n       m        deg      dist    D
      Zachary's Karate Club    34      78       4.588    2.408   5
      Jazz musicians           198     2,742    27.697   2.235   6
      URV email                1,133   5,451    9.622    3.606   8
      Polblogs                 1,222   16,714   27.31    2.737   8

  17. Information loss, empirical results: some examples. [Figure: six plots, each comparing Add, Del, AddDel and Switch: (e) Deg. dist. ER-1000, (f) $\lambda_1$ on Karate, (g) $dist$ on Jazz, (h) Deg. dist. BA-1000, (i) $C_C$ on URV email, (j) $T$ on Polblogs]
