Time Distortion Anonymization for the Publication of Mobility Data with High Utility
Vincent Primault, Sonia Ben Mokhtar, Cédric Lauradoux and Lionel Brunie
Mobility data usefulness…
- Real-time traffic, traffic prediction
- Long-term place prediction
- Companies collecting data
… and threats
Gambs et al. Show Me How You Move and I Will Tell You Who You Are. Transactions on Data Privacy, 2011.
Privacy-preserving data publication
[Figure: raw mobility traces go through a protection mechanism and are published as anonymized mobility traces. A researcher applies data mining, machine learning and simulations to obtain useful analysis results. An attacker uses clustering, white pages and public maps to obtain partially re-identified mobility traces.]
Outline
- Introduction
- State of the art
- Making a PROMESSE
- Experimental evaluation
- Conclusion
A mobility trace
A trace is a temporally ordered list of records belonging to the same user. A record is a triplet (user, location, timestamp).
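The definition above maps directly onto a small data structure. The following is a minimal sketch, assuming a location is a latitude/longitude pair and a timestamp is a number of seconds; the names `Record` and `build_trace` are hypothetical and do not come from the presentation.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Record:
    """One record: the triplet (user, location, timestamp)."""
    user: str
    lat: float        # location, latitude in degrees (assumed representation)
    lon: float        # location, longitude in degrees (assumed representation)
    timestamp: float  # seconds since an arbitrary origin (assumed unit)


def build_trace(records: Iterable[Record], user: str) -> List[Record]:
    """Return the records of `user`, ordered by increasing timestamp."""
    return sorted((r for r in records if r.user == user),
                  key=lambda r: r.timestamp)
```

Keeping the trace as a sorted list makes every later step (POI extraction, smoothing) a single pass over consecutive records.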
Extraction of points of interest (POIs)
Done by using, e.g., an appropriate clustering algorithm. Points of interest convey semantic information about habits and can lead to user re-identification.
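As an illustration of what such a clustering step may look like, here is a minimal stay-point sketch. It is not the algorithm used by the authors: it assumes a trace given as (lat, lon, timestamp) tuples sorted by time, it uses the distance to the first record of a stay as a proxy for the stay's diameter, and the default thresholds (200 meters, 15 minutes) are borrowed from the evaluation setup of this presentation. The function names are hypothetical.

```python
import math


def haversine_m(a, b):
    """Great-circle distance in meters between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000.0 * math.asin(math.sqrt(h))


def extract_pois(trace, max_diameter_m=200.0, min_duration_s=900.0):
    """Return POI centroids (lat, lon) found in a time-sorted trace."""
    pois = []
    i, n = 0, len(trace)
    while i < n:
        j = i + 1
        # Grow the candidate stay while records remain close to record i.
        while j < n and haversine_m(trace[i][:2], trace[j][:2]) <= max_diameter_m:
            j += 1
        if trace[j - 1][2] - trace[i][2] >= min_duration_s:
            stay = trace[i:j]
            pois.append((sum(p[0] for p in stay) / len(stay),
                         sum(p[1] for p in stay) / len(stay)))
            i = j       # continue after the stay
        else:
            i += 1      # too short to be a POI, slide the window
    return pois
```

A user who remains within a small area long enough produces one centroid; records collected while moving produce none.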
Location privacy protection mechanisms for data publication
- k-anonymity: Wait For Me [Abul et al., 2010]
- Differential privacy: Geo-Indistinguishability [Andrés et al., 2013]
Abul et al. Anonymization of moving objects databases by clustering and perturbation. Information Systems, 2010.
Andrés et al. Geo-indistinguishability: Differential privacy for Location-based Systems. CCS, 2013.
Wait For Me
δ represents the uncertainty that comes from GPS measurements. Wait For Me enforces (k,δ)-anonymity, i.e., there are always at least k users in a cylinder of radius δ/2.
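To make the cylinder condition concrete, the sketch below checks it for a group of users. It is not the Wait For Me algorithm, which builds such groups by clustering and perturbation; it only tests the property. It assumes planar coordinates in meters, trajectories sampled at the same time steps, and a cylinder centered on the group's centroid at each step, so it is a sufficient check rather than an exact one. The name `is_k_delta_anonymous` and the parameter `delta` (the uncertainty parameter) are hypothetical.

```python
import math


def is_k_delta_anonymous(group, k, delta):
    """group: dict mapping user -> list of (x, y) positions in meters,
    one position per shared time step (equal-length lists assumed)."""
    if len(group) < k:
        return False
    # Iterate over time steps; each step holds one position per user.
    for positions in zip(*group.values()):
        cx = sum(p[0] for p in positions) / len(positions)
        cy = sum(p[1] for p in positions) / len(positions)
        # Every user must lie within delta/2 of the centroid at this step.
        if any(math.hypot(p[0] - cx, p[1] - cy) > delta / 2 for p in positions):
            return False
    return True
```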
Geo-Indistinguishability
Level of privacy l_i within radius r_i, proportional to a parameter ε.
[Figure: a real location and its protected location, surrounded by nested radii r1 to r4 with privacy levels l1 to l4.]
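Geo-indistinguishability is commonly achieved by adding planar Laplace noise to the real location. The sketch below illustrates that idea under stated assumptions: planar coordinates in meters, `epsilon` expressed in 1/meters, and a radius drawn from a Gamma distribution with shape 2 and scale 1/epsilon, which has the radial density of a planar Laplace distribution. The cited paper samples the radius differently; the function names are hypothetical.

```python
import math
import random


def planar_laplace_noise(epsilon, rng=random):
    """Draw a 2D offset (dx, dy) in meters from a planar Laplace distribution."""
    theta = rng.uniform(0.0, 2.0 * math.pi)   # direction, uniform on the circle
    # Radial density is proportional to r * exp(-epsilon * r),
    # i.e. a Gamma distribution with shape 2 and scale 1 / epsilon.
    r = rng.gammavariate(2.0, 1.0 / epsilon)
    return r * math.cos(theta), r * math.sin(theta)


def protect(x, y, epsilon, rng=random):
    """Return a protected location obtained by perturbing the real one."""
    dx, dy = planar_laplace_noise(epsilon, rng)
    return x + dx, y + dy
```

With this parameterization the expected displacement is 2 / epsilon, so a smaller `epsilon` gives more privacy and a larger spatial error.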
Intuition behind our work
No state-of-the-art mechanism is both privacy-preserving and useful for data scientists. Almost all of them alter the geographical information in some way.
We believe geographical information is the most important, so we propose a new mechanism that minimally distorts the location.
Hiding POIs with speed smoothing
The idea
- Guarantee a constant speed along a trace.
- This makes it more challenging to identify where a user stops, and therefore her POIs.
How?
- Divide traces into smaller trajectories, typically one day long.
- Enforce an equal duration and an equal length between two consecutive records.
Speed smoothing
[Figure: records timestamped from 10:05 to 10:08 around a point of interest, shown before and after speed smoothing with a spacing of epsilon between consecutive records.]
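The speed smoothing idea can be sketched in a few lines. This is a minimal sketch under stated assumptions, not necessarily the authors' implementation: it works on one trajectory given as (x, y, t) tuples in planar meters and seconds, it uses a spacing parameter named `epsilon` after the label in the figure, and the function name `smooth_speed` is hypothetical. Splitting traces into day-long trajectories is left out.

```python
import math


def smooth_speed(trace, epsilon):
    """Return a trace whose consecutive records are epsilon meters apart
    and separated by a constant duration."""
    if not trace:
        return []
    points = [trace[0][:2]]
    for x, y, _ in trace[1:]:
        px, py = points[-1]
        d = math.hypot(x - px, y - py)
        # Emit a point every epsilon meters toward the current record.
        # Records closer than epsilon to the last emitted point are skipped,
        # which removes the cloud of records left by a stop.
        while d >= epsilon:
            px += (x - px) * epsilon / d
            py += (y - py) * epsilon / d
            points.append((px, py))
            d = math.hypot(x - px, y - py)
    if len(points) == 1:
        return [(points[0][0], points[0][1], trace[0][2])]
    # Spread timestamps uniformly between the first and last original ones.
    t0, t1 = trace[0][2], trace[-1][2]
    step = (t1 - t0) / (len(points) - 1)
    return [(px, py, t0 + i * step) for i, (px, py) in enumerate(points)]
```

Locations stay on the original path; only the timestamps are rewritten, which is why a long stop no longer shows up as a cluster of records.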
Experimenting with three real-life datasets
                     Cabspotting   Geolife   MDC
Records              8.9M          3.8M      1.1M
Traces               5.5k          2.4k      4.6k
Avg trace duration   32 h          3 h       3 h
Avg sampling rate    72 s          7 s       32 s
POIs retrieval
POIs are extracted with a maximum diameter of 200 meters and a minimum duration of 15 minutes. Two POIs match if their centroids are within 100 meters of each other.
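The matching rule lends itself to an F-score computation. The sketch below is one plausible reading, not the authors' evaluation code: it assumes POI centroids in planar meters, defines precision as the share of retrieved POIs that match a real one and recall as the share of real POIs that are matched, and uses the hypothetical name `f_score`.

```python
import math


def f_score(real_pois, retrieved_pois, threshold_m=100.0):
    """F-score of POIs retrieved from protected data against the real POIs.
    POIs are (x, y) centroids in meters."""
    def near(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1]) <= threshold_m

    if not real_pois or not retrieved_pois:
        return 0.0
    matched_retrieved = sum(any(near(p, q) for q in real_pois)
                            for p in retrieved_pois)
    matched_real = sum(any(near(q, p) for p in retrieved_pois)
                       for q in real_pois)
    precision = matched_retrieved / len(retrieved_pois)
    recall = matched_real / len(real_pois)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

From a privacy standpoint a low score is the goal, since it means an attacker recovers few of the real POIs from the protected trace.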
POIs retrieval (F-score)
[Bar chart: F-score of POIs retrieval for Cabspotting, Geolife and MDC, on an axis from 0% to 60%. Lower is better.]
Average spatial error
[Figure: a real trace and the corresponding protected trace.]
Average spatial error
[Chart: spatial error in meters for Cabspotting, Geolife and MDC, on a log scale from 0.1 to 100,000. Lower is better.]
Range queries distortion
- From 2 to 8 hours
- 1,000 different queries
Distortion is |Q(D) - Q(D')| / Q(D)
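The distortion formula can be evaluated as sketched below. Several points are assumptions made for illustration: D is taken to be the raw dataset and D' the protected one, a query Q counts the distinct users with at least one record inside a spatial box during a time window, and records are (user, x, y, t) tuples. The function names are hypothetical.

```python
def count_in_range(dataset, x_min, x_max, y_min, y_max, t_min, t_max):
    """Q(D): number of distinct users with a record inside the spatial box
    and the time window (assumed query semantics)."""
    return len({user for (user, x, y, t) in dataset
                if x_min <= x <= x_max
                and y_min <= y <= y_max
                and t_min <= t <= t_max})


def distortion(raw, protected, query):
    """|Q(D) - Q(D')| / Q(D) for one query, given as a 6-tuple of bounds."""
    q_raw = count_in_range(raw, *query)
    q_protected = count_in_range(protected, *query)
    if q_raw == 0:
        raise ValueError("distortion is undefined when Q(D) = 0")
    return abs(q_raw - q_protected) / q_raw
```

Averaging this value over many randomly drawn queries gives a single utility figure per dataset.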
Range queries distortion
[Bar chart: range queries distortion for Cabspotting, Geolife and MDC, on an axis from 0% to 120%. Lower is better.]
Outline
- Introduction
- State of the art
- Making a PROMESSE
- Experimental evaluation
- Future work
- Conclusion
Summary
- Introduced time distortion, opened a new research direction.
- Implemented a new protection mechanism for data publishing, addressing a severe threat while maintaining high utility.
- Evaluated against three real-life datasets.
Questions