1. Quantifying Privacy Loss of Human Mobility Graph Topology
The 18th Privacy Enhancing Technologies Symposium, July 24–27, 2018
Dionysis Manousakas∗, Cecilia Mascolo∗,†, Alastair R. Beresford∗, Dennis Chan∗, Nikhil Sharma‡
∗University of Cambridge  †The Alan Turing Institute  ‡UCL

2. Mobility data: privacy vs. utility in analytics
• Information sharing enables data-driven customization and large-scale context-awareness: transportation management, health studies, urban development
• Utility-preserving anonymized data representations: timestamped GPS and CDR measurements, histograms, heatmaps, graphs
• How privacy-preserving are these representations? Often poorly understood, leading to privacy breaches

6. Deanonymizing mobility: inference on individual traces
1. Sparsity- and regularity-based attacks [Zang and Bolot, 2011; de Montjoye et al., 2013; Naini et al., 2016]:
• "top-N" location attacks
• unicity of spatio-temporal points
• matching of individual mobility histograms

7. Deanonymizing mobility: inference on individual traces
2. Probabilistic models:
• Markovian mobility models [De Mulder et al., 2008]
• Mobility Markov chains [Gambs et al., 2014]

8. Deanonymizing mobility: inference on population statistics
3. On aggregate information:
• individual trajectory recovery from aggregated mobility data [Xu et al., 2017]
• probabilistic inference on aggregated location time-series [Pyrgelis et al., 2017]

9. Mobility representations
Raw mobility data vs. sequences of pseudonymised regions of interest (e.g. MDC research track, Device Analyzer)
How do these representations compare on storage cost, utility, inference difficulty, and privacy loss?

14. Motivation
Let's remove:
• temporal information (except for the ordering of states)
• geographic information
• cross-referencing information
Then:
– What is the privacy leakage of this representation?
– Does topology still bear identifiable information?
– Can an adversary exploit it in a deanonymization attack?

16. Mobility information flow
(diagram) Mobility Data → Graph Topology, via removal of geographic-temporal information; Sparsity, Recurrence → Privacy Loss

20. Differences of our approach
vs. mobility deanonymization:
• Each user's information is an entire graph (as opposed to locations)
• No cross-referencing between datasets
• No fine-grained temporal information
vs. privacy on graphs [Narayanan and Shmatikov, 2008; Sharad and Danezis, 2014; Lin et al., 2015]:
• No need for node matching
• No social network information

21. Data
Device Analyzer: a global dataset from mobile devices containing system information and cellular and wireless location
• 1500 users with the most cid location datapoints
• 430 days of observation on average
• 200 regions of interest
• cids pseudonymized per handset

22. Mobility networks
Graphs with nodes corresponding to ROIs and edges to recorded transitions between ROIs
• Network order selection via Markov chain modeling of sequential data [Scholtes, 2017]
• Node attributes carry no temporal or geographic information
• Edge weights correspond to the frequency of transitions
• Location pruning to top-N networks by keeping the most frequently visited regions in a user's routine (see the sketch after this list)
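As a rough illustration of this construction (not the authors' code), the sketch below builds a weighted directed top-N mobility network from a pseudonymised ROI sequence. The helper name `build_top_n_network`, the use of networkx, and the exact pruning behaviour are our assumptions.

```python
# Minimal sketch, assuming networkx; `build_top_n_network` is a
# hypothetical helper, not the paper's implementation.
from collections import Counter

import networkx as nx

def build_top_n_network(roi_sequence, n=20):
    """Weighted directed graph over the n most frequently visited ROIs."""
    # Location pruning: keep only the n most frequently visited regions.
    top = {roi for roi, _ in Counter(roi_sequence).most_common(n)}
    pruned = [roi for roi in roi_sequence if roi in top]

    # Edge weights count observed transitions between consecutive ROIs.
    # Note: consecutive survivors are linked even if rarer ROIs occurred
    # between them; the paper's exact pruning procedure may differ.
    g = nx.DiGraph()
    for src, dst in zip(pruned, pruned[1:]):
        if src != dst:  # self-transitions add no topology information
            w = g.get_edge_data(src, dst, default={"weight": 0})["weight"]
            g.add_edge(src, dst, weight=w + 1)
    return g
```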

23. Empirical statistics
Graphs with:
• heavy-tailed degree distributions
• a large number of rarely repeated transitions
• a small number of frequent transitions
• a high recurrence rate

24. Privacy framework: k-anonymity via graph isomorphism
Graph k-anonymity is the minimum cardinality of the isomorphism classes within a population of graphs [Sweeney, 2002]
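One way to make this definition concrete is the brute-force sketch below (our own, not the paper's algorithm), which partitions a population of graphs into isomorphism classes with networkx and reports the smallest class size as k.

```python
# Brute-force graph k-anonymity sketch: k is the size of the smallest
# isomorphism class in the population. Quadratic in the number of
# graphs; for illustration only.
import networkx as nx

def graph_k_anonymity(graphs):
    classes = []  # one list of graphs per isomorphism class
    for g in graphs:
        for cls in classes:
            if nx.is_isomorphic(cls[0], g):
                cls.append(g)
                break
        else:
            classes.append([g])
    return min(len(cls) for cls in classes)
```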

25. Identifiability of top-N mobility networks
• 15 and 19 locations suffice to form uniquely identifiable directed and undirected networks, respectively
• 5 and 8 are the corresponding theoretical upper bounds

26. Anonymity size of top-N mobility networks
• small isomorphism clusters for even very few locations
• median anonymity becomes one for network sizes of 5 and 8 in directed and undirected networks, respectively

27. Recurring patterns in a typical user's mobility
(figure: 1st vs. 2nd half of the observation period; shown edges correspond to the 10% most frequent transitions in the respective observation window)

29. Threat Model
$\mathcal{G}_{\mathrm{train}}$: disclosed IDs; $\mathcal{G}_{\mathrm{test}}$: undisclosed IDs
• closed-world setting
• partition point for each user drawn randomly in $(0.3, 0.7)$ of the total observation period
• state frequency information
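A minimal sketch of the train/test partition described above, assuming each user's trace is a time-ordered list of observations; `split_trace` is our own illustrative name.

```python
# Cut each user's trace at a random point in (0.3, 0.7) of its length:
# the earlier part builds the adversary's labelled G_train graph, the
# later part the unlabelled G_test target.
import random

def split_trace(events):
    cut = int(len(events) * random.uniform(0.3, 0.7))
    return events[:cut], events[cut:]
```

Each half would then be turned into its own mobility network, e.g. with the `build_top_n_network` sketch above.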

31. Attacks: Uninformed Adversary
$P(l_{G'} = l_{G_i}) = \frac{1}{|\mathcal{L}|}$ for every $G_i \in \mathcal{G}_{\mathrm{train}}$
expected rank of the true identity $= |\mathcal{L}|/2$

32. Attacks: Informed Adversary
$P(l_{G'} = l_{G_i} \mid \mathcal{G}_{\mathrm{train}}, K) \propto f(K(G_i, G'))$ for every $G_i \in \mathcal{G}_{\mathrm{train}}$
where $K$ is a graph similarity metric and $f$ is non-decreasing

33. Attacks: Informed Adversary
• Posterior probability: $P(l_{G'} = l_{G_i} \mid \mathcal{G}_{\mathrm{train}}, K) \propto f(K(G_i, G'))$ for every $G_i \in \mathcal{G}_{\mathrm{train}}$
• Privacy loss: $\mathrm{PL}(G'; \mathcal{G}_{\mathrm{train}}, K) = \dfrac{P(l_{G'} = l_{G'_{\mathrm{true}}} \mid \mathcal{G}_{\mathrm{train}}, K)}{P(l_{G'} = l_{G'_{\mathrm{true}}})} - 1$
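A sketch of how these two quantities might be computed, taking $f$ as the identity and using `kernel` as a stand-in for any graph similarity function $K$ (one toy kernel is sketched after the next slide); all names here are our own, not the paper's code.

```python
# Informed attack sketch: normalise kernel scores over the training
# population into a posterior, then compare the true label's posterior
# with the uninformed prior 1/|L|.
def informed_posterior(target, train):
    # train: dict mapping identity label -> training graph
    scores = {lbl: kernel(g, target) for lbl, g in train.items()}
    total = sum(scores.values())
    return {lbl: s / total for lbl, s in scores.items()}

def privacy_loss(target, true_label, train):
    posterior = informed_posterior(target, train)
    prior = 1.0 / len(train)  # uninformed adversary's uniform guess
    return posterior[true_label] / prior - 1.0
```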

34. Graph Similarity Functions: Graph Kernels
Express similarity as inner products of vectors of graph statistics [Vishwanathan et al., 2010]
• Kernels on atomic substructures (e.g. shortest paths, Weisfeiler-Lehman subtrees):
$K(G, G') = \left\langle \frac{\phi(G)}{\lVert\phi(G)\rVert}, \frac{\phi(G')}{\lVert\phi(G')\rVert} \right\rangle$
• Deep kernels [Yanardag and Vishwanathan, 2015]:
$K(G, G') = \phi(G)^{\top} M \, \phi(G')$, where $M$ encodes similarities between substructures
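To make the normalised substructure kernel concrete, here is a toy instantiation (ours, not one of the kernels evaluated in the paper) that uses a node-degree histogram as the feature map $\phi$ and computes the cosine form above; it can fill the `kernel` slot in the attack sketch.

```python
# Toy normalised substructure kernel: phi counts node degrees (a
# stand-in for shortest-path or Weisfeiler-Lehman features), and the
# kernel is the cosine between the two feature vectors.
import numpy as np

def phi(g, max_degree=50):
    hist = np.zeros(max_degree + 1)
    for _, deg in g.degree():
        hist[min(deg, max_degree)] += 1
    return hist

def kernel(g1, g2):
    v1, v2 = phi(g1), phi(g2)
    denom = np.linalg.norm(v1) * np.linalg.norm(v2)
    return float(v1 @ v2 / denom) if denom else 0.0
```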
